Machine learning in Alzheimer’s disease genetics / M. Bracher-Smith, F. Melograna, B. Ulm, C. Bellenguez, B. Grenier-Boley, D. Duroux, A.J. Nevado, P. Holmans, B.M. Tijms, M. Hulsman, I. De Rojas, R. Campos-Martin, S.V. Der Lee, A. Castillo, F. Küçükali, O. Peters, A. Schneider, M. Dichgans, D. Rujescu, N. Scherbaum, J. Deckert, S. Riedel-Heller, L. Hausner, L. Molina-Porcel, E. Düzel, T. Grimmer, J. Wiltfang, S. Heilmann-Heimbach, S. Moebus, T. Tegos, N. Scarmeas, O. Dols-Icardo, F. Moreno, J. Pérez-Tur, M.J. Bullido, P. Pastor, R. Sánchez-Valle, V. Álvarez, M. Boada, P. García-González, R. Puerta, P. Mir, L.M. Real, G. Piñol-Ripoll, J.M. García-Alberca, E. Rodriguez-Rodriguez, H. Soininen, S. Heikkinen, A. De Mendonça, S. Mehrabian, L. Traykov, J. Hort, M. Vyhnalek, N. Sandau, J.Q. Thomassen, Y.A.L. Pijnenburg, H. Holstege, J. Van Swieten, I. Ramakers, F. Verhey, P. Scheltens, C. Graff, G. Papenberg, V. Giedraitis, J. Williams, P. Amouyel, A. Boland, J. Deleuze, G. Nicolas, C. Dufouil, F. Pasquier, O. Hanon, S. Debette, E. Grünblatt, J. Popp, R. Ghidoni, D. Galimberti, B. Arosio, P. Mecocci, V. Solfrizzi, L. Parnetti, A. Squassina, L. Tremolizzo, B. Borroni, M. Wagner, B. Nacmias, M. Spallazzi, D. Seripa, I. Rainero, A. Daniele, F. Piras, C. Masullo, G. Rossi, F. Jessen, P. Kehoe, T. Magda, P. Sánchez-Juan, K. Sleegers, M. Ingelsson, M. Hiltunen, R. Sims, W. Van Der Flier, O.A. Andreassen, A. Ruiz, A. Ramirez, N. Null, I. Jansen, S. Van Der Lee, V. Andrade, V. Fernández, M. Dalmasso, L. Kleineidam, S. Ahmad, D. Aarsland, A. Cano, C. Abdelnour, E. Alarcón-Martín, D. Alcolea, M. Alegret, I. Alvarez, N.J. Armstrong, T. Anthoula, I. Appollonio, M. Arcaro, S. Archetti, A.A. Pastor, L. Athanasiu, H. Bailly, N. Banaj, M. Baquero, A.B. Pastor, C. Berr, C. Besse, V. Bessi, G. Binetti, S. Fostinelli, S. Bellini, A. Bizarro, R. Blesa, M. Boada, S. Boschi, P. Bossù, G. Bråthen, C. Bresner, H. Brodaty, K.J. Brookes, D. Buiza-Rueda, K. Bûrger, V. Burholt, M. Calero, G. Chene, Á. Carracedo, R. Cecchetti, L. Cervera-Carles, C. Charbonnier, C. Chillotti, S. Ciccone, J.A.H.R. Claassen, J. Clarimon, C. Clark, E. Conti, A. Corma-Gómez, G.M. Giuffrè, C. Custodero, D. Daian, E. Dardiotis, J. Dartigues, P.P. De Deyn, T. Del Ser, N. Denning, J. Diehl-Schmid, M. Diez-Fairen, P.D. Rossi, S. Djurovic, E. Duron, S. Engelborghs, J. Blázquez, M. Ewers, T. Fabrizio, S.F. Nielsen, L. Farotti, C. Fenoglio, M. Fernández-Fuertes, C.B. Ferreira, E. Ferri, B. Fin, P. Fischer, T. Fladby, K. Fließbach, J. Fortea, T.M. Foroud, S. Fostinelli, N.C. Fox, E. Franco-Macías, A. Frank-García, L. Froelich, J.M. García-Alberca, P. García-González, S. Garcia-Madrona, G. Garcia-Ribas, I. Giegling, G. Giorgio, O. Goldhardt, A. González-Pérez, G. Grande, E. Green, T. Guetta-Baranes, A. Haapasalo, G. Hadjigeorgiou, H. Hampel, J. Hardy, A.M. Hartmann, G. Leonenko, J. Harwood, S. Helisalmi, M.T. Heneka, I. Hernández, M.J. Herrmann, P. Hoffmann, C. Holmes, R.H. Vilas, M. Hulsman, G. Jan Biessels, C. Johansson, L. Kilander, A.K. Ståhlbom, M. Kivipelto, A. Koivisto, J. Kornhuber, M.H. Kosmidis, C. Lage, E.J. Laukka, A. Lauria, J. Lehtisalo, O. Lerch, A. Lleó, A.L. De Munain, S. Love, M. Löwemark, L. Luckcuck, J. Macías, C.A. Macleod, W. Maier, F. Mangialasche, S. Marco, M. Marquié, R. Marshall, A.M. Montes, C.M. Rodríguez, S. Mead, M. Medina, A. Meggy, S. Mendoza, M. Menéndez-González, M. Mol, L. Montrreal, K. Morgan, M.M. Nöthen, T. Ngandu, B.G. Nordestgaard, R. Olaso, A. Orellana, M. Orsini, M. Capdevila, A. Padovani, C. Paolo, M. Martinez-Lucas, P. Pericard, J.A. Pineda, C. Pisanu, T. Polak, D. Posthuma, J. Priller, R. Puerta, O. Quenez, I. Quintela, A. Rábano, M.J.T. Reinders, P. Riederer, C. Olivé, A. Rongve, I.R. Allende, M. Rosende-Roca, J.L. Royo, E. Rubino, M.E. Sáez, P. Sakka, I. Saltvedt, F. García-Gutierrez, M.B. Sánchez-Arjona, F. Sanchez-Garcia, P. Sánchez-Juan, R. Sánchez-Valle, S.B. Sando, M. Scamosci, E. Scarpini, M. Scherer, M. Schmid, J.M. Schott, G. Selbæk, A.A. Shadrin, O. Skrobot, A. Solomon, S. Sorbi, O. Sotolongo-Grau, A. Spottke, E. Stordal, A. Miguel, L. Tárraga, N. Tesí, A. Thalamuthu, T. Thomas, L. Traykov, A. Tybjærg-Hansen, A. Uitterlinden, A. Ullgren, I. Ulstein, S. Valero, C. Van Broeckhoven, J. Van Dongen, J. Van Rooij, R. Vandenberghe, J. Vidal, M.G. Vita, J. Vogelgsang, M. Wagner, D. Wallon, L. Weinhold, G. Windle, B. Woods, M. Yannakoulia, M. Zulaica, M. Ghanbari, P. Sachdev, K. Mather, M.A. Ikram, R. Frikke-Schmidt, N. Amin, G. Roshchupkin, J. Lambert, K. Van Steen, C. Van Duijn, V. Escott-Price. - In: NATURE COMMUNICATIONS. - ISSN 2041-1723. - 16:1(2025), pp. 6726.1-6726.16. [10.1038/s41467-025-61650-z]

Machine learning in Alzheimer’s disease genetics

D. Galimberti; B. Arosio; M. Arcaro; C. Fenoglio; E. Ferri; G. Grande; E. Scarpini
2025

Abstract

Traditional statistical approaches have advanced our understanding of the genetics of complex diseases, yet they are limited to linear additive models. Here we applied machine learning (ML) to genome-wide data from 41,686 individuals in the largest European consortium on Alzheimer's disease (AD) to investigate the effectiveness of various ML algorithms in replicating known findings, discovering novel loci, and predicting individuals at risk. We utilised Gradient Boosting Machines (GBMs), biological pathway-informed Neural Networks (NNs), and Model-based Multifactor Dimensionality Reduction (MB-MDR) models. The ML approaches captured all genome-wide significant genetic variants identified in the training set and 22% of associations from larger meta-analyses. They highlighted six novel loci that replicate in an external dataset, including variants that map to ARHGAP25, LY6H, COG7, SOD1 and ZNF597. They further identified a novel association in AP4E1, refining the genetic landscape of the known SPPL2A locus. Our results demonstrate that machine learning methods can achieve predictive performance comparable to classical approaches in genetic epidemiology and have the potential to uncover novel loci that remain undetected by traditional GWAS. These insights provide a complementary avenue for advancing the understanding of AD genetics.
Scientific sector BIOS-10/A - Cellular and applied biology
22 Jul 2025
Article (author)
Files in this record:
Bracher Smith.pdf (open access; type: Publisher's version/PDF; licence: Creative Commons; format: Adobe PDF; size: 3.65 MB)
Use this identifier to cite or link to this document: https://hdl.handle.net/2434/1178419