In this work we propose new ensemble methods for the hierarchical classification of gene functions. Our methods exploit the hierarchical relationships between the classes in two ways: each ensemble node is trained “locally”, according to its position in the hierarchy; moreover, in the evaluation phase the set of predicted annotations is built so as to minimize a global loss function defined over the hierarchy. We also address the sparsity of annotations by introducing a cost-sensitive parameter that controls the precision-recall trade-off. Experiments with the model organism S. cerevisiae, using the FunCat taxonomy and 7 biomolecular data sets, reveal a significant advantage of our techniques over “flat” and cost-insensitive hierarchical ensembles.
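To make the idea of hierarchy-consistent, cost-sensitive prediction concrete, the following is a minimal sketch and not the authors' algorithm. It assumes each node already provides an estimate of the probability that a gene belongs to the class given that it belongs to the parent class, and it uses the textbook cost-sensitive threshold `fp_cost / (fp_cost + fn_cost)`. The function name `predict_top_down`, its parameters, and the toy class codes and probabilities are all hypothetical.

```python
from typing import Dict, List, Optional, Set


def predict_top_down(
    parent: Dict[str, Optional[str]],
    cond_prob: Dict[str, float],
    fn_cost: float = 1.0,
    fp_cost: float = 1.0,
) -> Set[str]:
    """Return a set of predicted classes that is consistent with the hierarchy.

    parent maps each class to its parent class (None for a root class).
    cond_prob maps each class to an assumed estimate of
    P(class is positive | parent class is positive).
    """
    # Textbook cost-sensitive threshold (an assumption, not the paper's rule):
    # raising fn_cost lowers the threshold and therefore favours recall.
    threshold = fp_cost / (fp_cost + fn_cost)

    # Build the child lists once so the hierarchy can be walked top-down.
    children: Dict[Optional[str], List[str]] = {}
    for node, par in parent.items():
        children.setdefault(par, []).append(node)

    predicted: Set[str] = set()
    stack = list(children.get(None, []))  # start from the root classes
    while stack:
        node = stack.pop()
        if cond_prob[node] >= threshold:
            predicted.add(node)
            # Descend only below positive nodes, so every predicted class
            # has all of its ancestors predicted as well.
            stack.extend(children.get(node, []))
    return predicted


# Toy FunCat-style hierarchy; codes and probabilities are made up.
parent = {"01": None, "01.01": "01", "01.02": "01", "02": None, "02.01": "02"}
cond_prob = {"01": 0.8, "01.01": 0.35, "01.02": 0.1, "02": 0.2, "02.01": 0.9}

balanced = predict_top_down(parent, cond_prob)
recall_oriented = predict_top_down(parent, cond_prob, fn_cost=3.0)
```

In this toy run, equal costs give a threshold of 0.5 and only class `01` is predicted; tripling the false-negative cost lowers the threshold to 0.25 and adds `01.01`. Class `02.01` is never predicted despite its high conditional probability, because its parent `02` is rejected first.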

Hierarchical cost-sensitive algorithms for genome-wide gene function prediction / N. Cesa Bianchi, G. Valentini. - In: Machine learning in systems biology : proceedings of the third international workshop : September 5-6, 2009, Ljubljana, Slovenia / edited by S. Dzeroski, P. Geurts, J. Rousu. - Helsinki : University of Helsinki, Department of Computer Science, 2009. - ISBN 9789521056994. - pp. 25-34. (Paper presented at the 3rd International Workshop on Machine Learning in Systems Biology, held in Ljubljana, Slovenia, in 2009.)

Hierarchical cost-sensitive algorithms for genome-wide gene function prediction

N. Cesa Bianchi (first author); G. Valentini (last author)
2009

Sector INF/01 - Computer Science
   Pattern Analysis, Statistical Modelling and Computational Learning 2
   PASCAL2
   EUROPEAN COMMISSION
   FP7
   216886
2009
http://mlsb09.ijs.si/files/MLSB09-Proceedings.pdf
Book Part (author)
Files in this item:
workshop 2009.pdf (open access)
Type: Publisher's version/PDF
Size: 859.87 kB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/178725
Citations
  • PubMed Central (PMC): not available
  • Scopus: not available
  • ISI Web of Science: 21