Machine learning-based reclassification of germline variants of unknown significance: The RENOVO algorithm

Favalli, V.; Tini, G.; Bonetti, E.; Vozza, G.; Guida, A.; Gandini, S.; Pelicci, P.G.; Mazzarella, L.

doi:10.1016/j.ajhg.2021.03.010

The expanding scope of genetic testing enabled by next-generation sequencing (NGS) has dramatically increased the number of genetic variants that must be interpreted as pathogenic or benign for adequate patient management. Still, the interpretation process often fails to deliver a clear classification, yielding either variants of unknown significance (VUSs) or variants with conflicting interpretations of pathogenicity (CIP). These represent a major clinical problem: they provide no useful information for decision-making, so a large fraction of genetically determined disease remains undertreated. We developed a machine learning (random forest)-based tool, RENOVO, that classifies variants as pathogenic or benign on the basis of publicly available information and provides a pathogenicity likelihood score (PLS). Using the same feature classes recommended by guidelines, we trained RENOVO on established pathogenic/benign variants in ClinVar (training set accuracy = 99%) and tested its performance on variants whose interpretation has changed over time (test set accuracy = 95%). We further validated the algorithm on additional datasets, including unreported variants validated either through expert consensus (ENIGMA) or through laboratory-based functional techniques (on BRCA1/2 and SCN5A). On all datasets, RENOVO outperformed existing automated interpretation tools. On the basis of these validation metrics, we assigned a defined PLS to all existing ClinVar VUSs, proposing a reclassification for 67% of them with >90% estimated precision. RENOVO provides a validated tool to reduce the fraction of uninterpreted or misinterpreted variants, tackling an area of unmet need in modern clinical genetics.
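As a rough illustration of how an ensemble score can drive reclassification, the sketch below averages per-tree votes into a score between 0 and 1, maps it to a call with two cut-offs, and reports what fraction of variants receives a definite call and how precise those calls are. It is not the RENOVO implementation: it assumes the score is simply the fraction of trees voting pathogenic, and the cut-off values, function names, and toy votes are all hypothetical, since the abstract specifies none of them.

```python
# Hypothetical cut-offs; the abstract does not state the values RENOVO uses.
BENIGN_MAX = 0.2
PATHOGENIC_MIN = 0.8


def likelihood_score(tree_votes):
    """Fraction of trees voting 'pathogenic' (1) rather than 'benign' (0)."""
    return sum(tree_votes) / len(tree_votes)


def classify(score, benign_max=BENIGN_MAX, pathogenic_min=PATHOGENIC_MIN):
    """Map a score in [0, 1] to a call; intermediate scores stay uncertain."""
    if score >= pathogenic_min:
        return "pathogenic"
    if score <= benign_max:
        return "benign"
    return "uncertain"


def summarize(scored_variants):
    """Take (score, true_label) pairs for variants with known labels.

    Returns (fraction of variants given a definite call,
             precision of those definite calls).
    """
    calls = [(classify(score), truth) for score, truth in scored_variants]
    definite = [(call, truth) for call, truth in calls if call != "uncertain"]
    if not definite:
        return 0.0, 0.0
    correct = sum(1 for call, truth in definite if call == truth)
    return len(definite) / len(calls), correct / len(definite)


# Toy data: votes from a 10-tree ensemble for four made-up variants.
toy = [
    ([1] * 9 + [0], "pathogenic"),      # high score, called pathogenic
    ([0] * 10, "benign"),               # score 0, called benign
    ([1] * 5 + [0] * 5, "pathogenic"),  # mid score, left uncertain
    ([1] * 8 + [0] * 2, "benign"),      # called pathogenic, a wrong call
]
scored = [(likelihood_score(votes), truth) for votes, truth in toy]
fraction_called, precision = summarize(scored)
```

Tightening the two cut-offs trades coverage for precision: fewer variants receive a definite call, but the calls that remain are more reliable. The abstract's figures (a proposed reclassification for 67% of VUSs at >90% estimated precision) describe one point on that trade-off.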

Machine learning-based reclassification of germline variants of unknown significance: The RENOVO algorithm / V. Favalli, G. Tini, E. Bonetti, G. Vozza, A. Guida, S. Gandini, P.G. Pelicci, L. Mazzarella. - In: AMERICAN JOURNAL OF HUMAN GENETICS. - ISSN 1537-6605. - 108:4(2021 Apr 01), pp. 682-695. [10.1016/j.ajhg.2021.03.010]




Keywords: ClinVar; VUS; machine learning; reclassification; variant interpretation; Computer User Training; Datasets as Topic; Genes, BRCA1; Germ-Line Mutation; Humans; Reproducibility of Results; Machine Learning
Scientific-disciplinary sectors of the article (display only): Sector MED/04 - General Pathology
Publication date: 1 Apr 2021
Ahead-of-print or print date: 23 Mar 2021
Journal in ANCE: AMERICAN JOURNAL OF HUMAN GENETICS
DOI: https://dx.doi.org/10.1016/j.ajhg.2021.03.010
Type: Article (author)
Appears in types: 01 - Journal article

Files in this item:

File	Size	Format
1-s2.0-S000292972100094X-main.pdf (Publisher's version/PDF; authorized users only)	2.42 MB	Adobe PDF


Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/930362
