Training Neural Networks with Balanced Mini-batch to Improve the Prediction of Pathogenic Genomic Variants in Mendelian Diseases / L. Cappelletti, J. Gliozzo, A. Petrini, G. Valentini. - In: SENSORS & TRANSDUCERS. - ISSN 2306-8515. - 234:6(2019), pp. 16-21.

Training Neural Networks with Balanced Mini-batch to Improve the Prediction of Pathogenic Genomic Variants in Mendelian Diseases

L. Cappelletti (first author); J. Gliozzo (second author); A. Petrini; G. Valentini (last author)
2019

Abstract

Known pathogenic variants associated with Mendelian genetic diseases represent a tiny minority of the overall genetic variation that characterizes the human genome. In this context, classical imbalance-unaware machine learning methods are unable to distinguish pathogenic from benign variants, since they are severely biased toward the majority (benign) class. Recent work based on ensemble and hyper-ensemble methods showed that adopting sampling techniques can significantly improve performance on this challenging task. Inspired by these findings and by recent successful applications of deep learning to precision medicine, we propose two learning techniques for neural networks: one designed to ensure a degree of balance between pathogenic and benign variants during the training phase, and one designed to ensure that, with high probability, at least one pathogenic variant is included in each training mini-batch. Experimental results on the prediction of non-coding mutations associated with Mendelian diseases show the effectiveness of the proposed neural network training approaches.
Neural Networks; Imbalance-aware Neural Networks; Deep Learning; Prediction of pathogenic genomic variants; Mendelian diseases
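
The two ideas named in the abstract can be illustrated with a minimal sketch. This record does not describe the authors' actual procedures, so everything below is an assumption made for illustration: the function names `min_batch_size` and `balanced_batches`, their parameters, the choice to sample positives with replacement, and the simplification that examples are drawn independently with a fixed positive rate. Under that simplification, a batch of size b contains at least one positive with probability 1 - (1 - p)^b, where p is the fraction of positive examples.

```python
import math
import random


def min_batch_size(positive_rate, confidence):
    """Smallest batch size b with 1 - (1 - positive_rate) ** b >= confidence.

    Assumes (hypothetically) that examples are drawn independently, each
    being positive with probability positive_rate.
    """
    if not (0.0 < positive_rate < 1.0 and 0.0 < confidence < 1.0):
        raise ValueError("positive_rate and confidence must lie in (0, 1)")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - positive_rate))


def balanced_batches(labels, batch_size, positive_fraction, n_batches, seed=0):
    """Yield index lists in which a fixed share of each batch is positive.

    labels: sequence of 0 (benign) / 1 (pathogenic) values.
    positive_fraction: desired share of positives per batch (illustrative).
    """
    rng = random.Random(seed)
    positives = [i for i, y in enumerate(labels) if y == 1]
    negatives = [i for i, y in enumerate(labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    n_pos = max(1, round(batch_size * positive_fraction))
    n_neg = batch_size - n_pos
    if n_neg > len(negatives):
        raise ValueError("not enough negative examples for one batch")
    for _ in range(n_batches):
        # Positives are scarce, so they are drawn with replacement;
        # negatives are drawn without replacement within a batch.
        batch = rng.choices(positives, k=n_pos) + rng.sample(negatives, n_neg)
        rng.shuffle(batch)
        yield batch


if __name__ == "__main__":
    # Toy data set: 5 positives among 1000 examples.
    toy_labels = [1] * 5 + [0] * 995
    print(min_batch_size(positive_rate=0.005, confidence=0.95))
    for batch in balanced_batches(toy_labels, batch_size=32,
                                  positive_fraction=0.25, n_batches=2):
        print(sum(toy_labels[i] for i in batch), "positives in a batch of", len(batch))
```

The index lists produced by `balanced_batches` could be used to select rows for each gradient step in any training loop. Oversampling positives with replacement is only one possible design; undersampling the majority class or weighting the loss are alternatives that this sketch does not cover.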
Academic discipline: INF/01 - Computer Science
https://www.sensorsportal.com/HTML/DIGEST/P_3087.htm
Article (author)
Use this identifier to cite or link to this document: https://hdl.handle.net/2434/1022610