Training Neural Networks with Balanced Mini-batch to Improve the Prediction of Pathogenic Genomic Variants in Mendelian Diseases / L. Cappelletti, J. Gliozzo, A. Petrini, G. Valentini. - In: SENSORS & TRANSDUCERS. - ISSN 2306-8515. - 234:6(2019), pp. 16-21.
Training Neural Networks with Balanced Mini-batch to Improve the Prediction of Pathogenic Genomic Variants in Mendelian Diseases
L. Cappelletti; J. Gliozzo; A. Petrini; G. Valentini
2019
Abstract
Known pathogenic variants associated with Mendelian genetic diseases represent a tiny minority of the overall genetic variation that characterizes the human genome. In this context, classical imbalance-unaware machine learning methods are unable to distinguish pathogenic from benign variants, since they are severely biased toward the majority (benign) class. Recent works based on ensemble and hyper-ensemble methods showed that adopting sampling techniques can significantly improve performance on this challenging task. Inspired by these findings and by recent successful applications of deep learning to precision medicine, we propose two learning techniques for neural networks: the first ensures a degree of balance between pathogenic and benign variants during the training phase, and the second ensures that, with high probability, at least one pathogenic variant is included in each training mini-batch. Experimental results on the prediction of non-coding mutations associated with Mendelian diseases show the effectiveness of the proposed neural network training approaches.
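The abstract names the two mini-batch strategies without giving their procedures, so the following is only a minimal sketch of how such strategies could look, not the authors' implementation. It assumes binary labels (1 = pathogenic, 0 = benign) and, for the probabilistic guarantee, independent uniform draws with replacement, in which case a batch of size b contains at least one pathogenic variant with probability 1 - (1 - p)^b, where p is the fraction of pathogenic examples. The function names `min_batch_size` and `balanced_batches` and all parameters are hypothetical.

```python
import math
import random


def min_batch_size(pos_fraction, confidence=0.99):
    """Smallest batch size b with 1 - (1 - p)**b >= confidence.

    Assumption: examples are drawn uniformly and independently
    (with replacement), and p = pos_fraction is the share of positives.
    """
    if not 0.0 < pos_fraction < 1.0:
        raise ValueError("pos_fraction must be in (0, 1)")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be in (0, 1)")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - pos_fraction))


def balanced_batches(labels, batch_size, pos_per_batch, n_batches, seed=0):
    """Yield index lists with a fixed number of positives per batch.

    Assumption: labels are 1 (pathogenic) or 0 (benign). Positives are
    scarce, so they are oversampled with replacement; negatives are
    sampled without replacement within a batch when enough are available.
    """
    if not 0 < pos_per_batch < batch_size:
        raise ValueError("need 0 < pos_per_batch < batch_size")
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")

    rng = random.Random(seed)
    n_neg = batch_size - pos_per_batch
    for _ in range(n_batches):
        batch = rng.choices(pos, k=pos_per_batch)
        if n_neg <= len(neg):
            batch += rng.sample(neg, n_neg)
        else:
            batch += rng.choices(neg, k=n_neg)
        rng.shuffle(batch)  # avoid a fixed positive/negative ordering
        yield batch


if __name__ == "__main__":
    # Toy data: 5 pathogenic examples among 1000 (0.5% positives).
    toy_labels = [1] * 5 + [0] * 995
    print("batch size for 99% confidence:", min_batch_size(0.005, 0.99))
    for batch in balanced_batches(toy_labels, batch_size=32, pos_per_batch=8, n_batches=2):
        print(sum(toy_labels[i] for i in batch), "positives in a batch of", len(batch))
```

Under these assumptions, the first function shows why very rare positives force large batches when sampling is uniform, while the second sidesteps the issue by fixing the class composition of every batch, at the cost of showing the same few pathogenic examples to the network many times.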