The proper integration of multiple data sources and the unbalance between annotated and unannotated proteins are two of the main issues in the automated function prediction (AFP) problem. Most supervised and semisupervised learning algorithms for AFP proposed in the literature do not consider these two issues jointly, which harms both sensitivity and precision, because this unbalance characterizes the majority of functional classes and because each available data source carries specific, complementary information. We propose UNIPred (unbalance-aware network integration and prediction of protein functions), an algorithm that combines different biomolecular networks and predicts protein functions using parametric semisupervised neural models. The algorithm explicitly takes into account the unbalance between unannotated and annotated proteins both when constructing the integrated network and when predicting protein annotations for each functional class. Full-genome and ontology-wide experiments with three eukaryotic model organisms show that the proposed method compares favorably with state-of-the-art learning algorithms for AFP.
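To make the idea of unbalance-aware network integration more concrete, the following is a minimal sketch and not the UNIPred procedure itself, whose details are not given in this record. It assumes that each network is a symmetric weighted adjacency matrix, that each network is scored for one functional class with an F-measure (which ignores the abundant true negatives), and that the integrated network is the score-weighted sum of the inputs. The neighbour-vote rule and all function names (`network_score`, `integrate`, `symmetric`) are hypothetical choices made for illustration.

```python
def f_measure(tp, fp, fn):
    """Harmonic mean of precision and recall; 0.0 when there are no true positives."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def mean_weight(adj, i, members):
    """Average edge weight from protein i to the given proteins, excluding i itself."""
    others = [j for j in members if j != i]
    if not others:
        return 0.0
    return sum(adj[i][j] for j in others) / len(others)


def network_score(adj, labels):
    """Score one network for one class (illustrative rule, an assumption).

    A protein is predicted positive when its mean edge weight to positives
    exceeds its mean edge weight to negatives. Averaging within each group
    keeps the many negatives from dominating the vote.
    """
    positives = [i for i, y in enumerate(labels) if y == 1]
    negatives = [i for i, y in enumerate(labels) if y == 0]
    tp = fp = fn = 0
    for i, y in enumerate(labels):
        predicted = mean_weight(adj, i, positives) > mean_weight(adj, i, negatives)
        if predicted and y == 1:
            tp += 1
        elif predicted:
            fp += 1
        elif y == 1:
            fn += 1
    return f_measure(tp, fp, fn)


def integrate(networks, labels):
    """Combine networks as a weighted sum, with weights proportional to their scores."""
    scores = [network_score(adj, labels) for adj in networks]
    total = sum(scores)
    if total > 0:
        weights = [s / total for s in scores]
    else:
        # No network is informative for this class: fall back to uniform weights.
        weights = [1 / len(networks)] * len(networks)
    n = len(labels)
    combined = [
        [sum(w * adj[i][j] for w, adj in zip(weights, networks)) for j in range(n)]
        for i in range(n)
    ]
    return weights, combined


def symmetric(n, edges):
    """Build a symmetric n x n adjacency matrix from (i, j, weight) triples."""
    adj = [[0.0] * n for _ in range(n)]
    for i, j, w in edges:
        adj[i][j] = adj[j][i] = w
    return adj


# Toy data: 6 proteins, only proteins 0 and 1 are annotated with the class.
labels = [1, 1, 0, 0, 0, 0]
net_a = symmetric(6, [(0, 1, 1.0), (2, 3, 1.0), (3, 4, 1.0), (4, 5, 1.0)])
net_b = symmetric(6, [(0, 2, 1.0), (1, 3, 1.0)])  # links positives to negatives only
weights, combined = integrate([net_a, net_b], labels)
```

In this toy case the first network links the two annotated proteins to each other, while the second links them only to unannotated ones, so the second network receives no weight in the combination. A scoring function based on plain accuracy would be a poor substitute here, since predicting every protein as negative already yields high accuracy on an unbalanced class.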
UNIPred : Unbalance-aware network integration and prediction of protein functions / M. Frasca, A. Bertoni, G. Valentini. - In: JOURNAL OF COMPUTATIONAL BIOLOGY. - ISSN 1066-5277. - 22:12(2015 Nov), pp. 1057-1074. [10.1089/cmb.2014.0110]
UNIPred : Unbalance-aware network integration and prediction of protein functions
M. Frasca; A. Bertoni; G. Valentini
2015
File | Access | Type | Size | Format |
---|---|---|---|---|
UNIPred_rev1.pdf | Open access | Post-print, accepted manuscript, etc. (version accepted by the publisher) | 450.08 kB | Adobe PDF |
cmb.2014.0110.pdf | Restricted access | Publisher's version/PDF | 1.15 MB | Adobe PDF |