IRIS Institutional Research Information System - AIR Archivio Istituzionale della Ricerca

Analysis of Informative Features for Negative Selection in Protein Function Prediction / M. Frasca, F. Lipreri, D. Malchiodi (LECTURE NOTES IN COMPUTER SCIENCE). - In: Bioinformatics and Biomedical Engineering / [edited by] I. Rojas Ignacio, F. Ortuño. - Switzerland : Springer, 2017. - ISBN 9783319561530. - pp. 267-276 (Paper presented at the 5th IWBBIO conference, held in Granada in 2017) [10.1007/978-3-319-56154-7_25].

Analysis of Informative Features for Negative Selection in Protein Function Prediction

M. Frasca; F. Lipreri; D. Malchiodi

2017

Abstract

Negative examples in automated protein function prediction (AFP), that is, proteins known not to possess a given protein function, are usually not directly stored in public proteome and genome databases, such as the Gene Ontology database. Nevertheless, most computational methods need negative examples to infer new predictions. A variety of algorithms have been proposed in AFP for negative selection, ranging from network- and feature-based heuristics to hierarchy-based and hierarchy-less strategies. Moreover, several bio-molecular information sources about proteins, such as gene co-expression, genetic, and protein-protein interaction data, are naturally encoded in protein networks, where nodes are proteins and edges connect proteins sharing common characteristics. Although selecting negatives in biological networks is thereby a central and challenging problem in computational biology, detecting the characteristics proteins should have to be considered negative is still a difficult task. In this work, we show that a few protein features extracted from the network help in detecting reliable negatives. We tested such features in two real-world experiments: predicting unreliable negatives with an SVM classifier through temporal holdout on model organisms for AFP, and selecting reliable negatives with a clustering-based state-of-the-art negative selection procedure.


	Keywords
	
			negative example selection; protein function prediction; biological networks; fuzzy clustering; protein features
		
	Scientific-disciplinary sectors of the contribution
	
			Sector INF/01 - Computer Science
		
	Publication date
	
			2017
		
	DOI
	
			https://dx.doi.org/10.1007/978-3-319-56154-7_25
		
	Type
	
			Book Part (author)
		
	Appears in types:
	
			03 - Book contribution

Files in this item:

File	Size	Format
NegSel_iwbbio_v2.pdf (restricted access). Type: Post-print, accepted manuscript, etc. (version accepted by the publisher)	150.57 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/473619
