Negative examples in automated protein function prediction (AFP), that is, proteins known not to possess a given function, are usually not stored directly in public proteome and genome databases, such as the Gene Ontology database. Nevertheless, most computational methods need negative examples to infer new predictions. A variety of algorithms have been proposed in AFP for negative selection, ranging from network- and feature-based heuristics to hierarchy-based and hierarchy-less strategies. Moreover, several sources of bio-molecular information about proteins, such as gene co-expression and genetic and protein-protein interaction data, are naturally encoded in protein networks, where nodes are proteins and edges connect proteins sharing common characteristics. Although selecting negatives in biological networks is therefore a central and challenging problem in computational biology, identifying the characteristics a protein should have to be considered a negative remains difficult. In this work, we show that a few protein features extracted from the network help in detecting reliable negatives. We tested these features in two real-world experiments: predicting unreliable negatives with an SVM classifier through temporal holdout on model organisms for AFP, and selecting reliable negatives with a state-of-the-art clustering-based negative selection procedure.
Analysis of Informative Features for Negative Selection in Protein Function Prediction / M. Frasca, F. Lipreri, D. Malchiodi - In: Bioinformatics and Biomedical Engineering / [edited by] I. Rojas, F. Ortuño. - Switzerland : Springer, 2017. - ISBN 9783319561530. - pp. 267-276 (Paper presented at the 5th IWBBIO conference, held in Granada in 2017.)
Title: | Analysis of Informative Features for Negative Selection in Protein Function Prediction |
Authors: | FRASCA, MARCO (First); MALCHIODI, DARIO (Corresponding) |
Keywords: | negative example selection; protein function prediction; biological networks; fuzzy clustering; protein features |
Scientific Disciplinary Sector: | Sector INF/01 - Computer Science |
Publication date: | 2017 |
Digital Object Identifier (DOI): | http://dx.doi.org/10.1007/978-3-319-56154-7_25 |
Type: | Book Part (author) |
Appears in types: | 03 - Book contribution |
Files in this item:
File | Description | Type | License | |
---|---|---|---|---|
NegSel_iwbbio_v2.pdf | Post-print, accepted manuscript, etc. (version accepted by the publisher) | Administrator |