IRIS Institutional Research Information System - AIR Archivio Istituzionale della Ricerca

Hierarchical cost-sensitive algorithms for genome-wide gene function prediction / N. Cesa Bianchi, G. Valentini - In: Machine learning in systems biology : proceedings of the third international workshop : September 5-6, 2009, Ljubljana, Slovenia / [edited by] S. Dzeroski, P. Geurts, J. Rousu. - Helsinki : University of Helsinki, Department of Computer Science, 2009. - ISBN 9789521056994. - pp. 25-34 (Paper presented at the 3rd International Workshop on Machine Learning in Systems Biology, held in Ljubljana, Slovenia, in 2009).

Hierarchical cost-sensitive algorithms for genome-wide gene function prediction

N. Cesa Bianchi (first author); G. Valentini (last author)

2009

Abstract

In this work we propose new ensemble methods for the hierarchical classification of gene functions. Our methods exploit the hierarchical relationships between the classes in two ways: each ensemble node is trained “locally”, according to its position in the hierarchy; moreover, in the evaluation phase the set of predicted annotations is built so as to minimize a global loss function defined over the hierarchy. We also address the sparsity of annotations by introducing a cost-sensitive parameter that controls the precision-recall trade-off. Experiments with the model organism S. cerevisiae, using the FunCat taxonomy and 7 biomolecular data sets, reveal a significant advantage of our techniques over “flat” and cost-insensitive hierarchical ensembles.

Scientific-disciplinary sectors of the contribution (display only): Sector INF/01 - Computer Science
Project title: Pattern Analysis, Statistical Modelling and Computational Learning 2
Acronym: PASCAL2
Funder: EUROPEAN COMMISSION
Funding programme: FP7
Contract no.: 216886
Publication date: 2009
URL: http://mlsb09.ijs.si/files/MLSB09-Proceedings.pdf
Type: Book Part (author)
Appears in types: 03 - Contribution in volume

Files in this record:

File	Size	Format
workshop 2009.pdf (open access; type: Publisher's version/PDF)	859.87 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/178725
