Predicting pathogenic single nucleotide variants (SNVs) in non-coding regions of the human genome is a significant challenge because of the extreme class imbalance between pathogenic “positive” variants and physiological “negative” ones: most machine learning methods are biased toward predicting negative examples. We designed two “block-shaped” tabular-DNN architectures, a Modular Block-Deep Neural Network (MoB-DNN) and a tabular Residual Network (T-ResNet), that address the class imbalance problem through a mini-batch balancing strategy. We employed a hierarchical optimization approach to efficiently tune hyper-parameters related to the training procedure, architecture, batch size, and mini-batch balancing ratio. Our experimental results show that T-ResNet outperforms a state-of-the-art hyper-ensemble method and that MoB-DNN is competitive with it, suggesting that residual connections provide significant advantages for capturing complex patterns in non-coding regions of the human genome.
Modular Deep Neural Networks with Residual Connections for Predicting the Pathogenicity of Genetic Variants in Non Coding Genomic Regions / F. Stacchietti, M. Nicolini, L. Chimirri, P.N. Robinson, E. Casiraghi, G. Valentini (LECTURE NOTES IN COMPUTER SCIENCE). - In: Advances in Computational Intelligence / edited by Ignacio Rojas, Gonzalo Joya, Andreu Catala. - [s.l] : Springer, 2026. - ISBN 9783032027245. - pp. 398-410 (Paper presented at the 18th IWANN International Work-Conference on Artificial Neural Networks, Part I, held June 16–18, 2025 in Coruña (Spain)) [10.1007/978-3-032-02725-2_31].
Modular Deep Neural Networks with Residual Connections for Predicting the Pathogenicity of Genetic Variants in Non Coding Genomic Regions
F. Stacchietti (first author); M. Nicolini (second author); E. Casiraghi (penultimate author); G. Valentini (last author)
2026
| File | Type | License | Access | Size | Format |
|---|---|---|---|---|---|
| stacchietti_et_al.pdf | Publisher's version/PDF | No license | Restricted access | 492.97 kB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.




