IRIS Institutional Research Information System - AIR Archivio Istituzionale della Ricerca

Comparing early and late data fusion methods for gene function prediction / M. Re, G. Valentini - In: Neural nets WIRN09 : proceedings of the 19th Italian workshop on neural nets, Vietri sul Mare, Salerno, Italy, May 28-30 2009 / [edited by] B. Apolloni, S. Bassis, F.C. Morabito. - Amsterdam : IOS Press, 2009. - ISBN 9781607500728. - pp. 197-207 (Paper presented at the 19th Italian Workshop on Neural Nets, held in Vietri sul Mare, Salerno, Italy, in 2009.)

Comparing early and late data fusion methods for gene function prediction

M. Re (first author); G. Valentini (last author)

2009

Abstract

High-throughput biotechnologies are playing an increasingly important role in biomolecular research. Their ability to provide genome-wide views of the molecular mechanisms occurring in living cells could play a crucial role in elucidating biomolecular processes at the system level. However, datasets produced with these techniques are often high-dimensional and very noisy, which makes their analysis challenging because relevant information must be extracted from a sea of noise. Gene function prediction is a central problem in modern bioinformatics, and recent works pointed out that gene function prediction performance can be improved by integrating heterogeneous biomolecular data sources. In this contribution we compare the performance achievable in gene function prediction by early and late data fusion methods. Because, among the available late fusion methods, ensemble systems have not to date been extensively investigated, all the late fusion experiments were performed using multiple classifier systems. Experimental results show that late fusion of heterogeneous datasets by means of ensemble systems outperformed both early fusion approaches and base learners trained on single types of biomolecular data.
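As an illustration only, the difference between the two strategies named in the abstract can be sketched as follows. This is a minimal sketch under stated assumptions: the nearest-centroid base learner, the synthetic two-source data, the function names and the combination weights are all hypothetical and are not taken from the paper, and the sketch does not reproduce its experiments or results. Early fusion is shown as vector space integration (feature concatenation before training a single learner); late fusion is shown as weighted averaging of the scores of one learner per data source.

```python
import numpy as np


def centroid_scores(X_train, y_train, X_test):
    """Toy base learner (an assumption, not the paper's): score in [0, 1)
    derived from the distances to the two class centroids."""
    c_pos = X_train[y_train == 1].mean(axis=0)
    c_neg = X_train[y_train == 0].mean(axis=0)
    d_pos = np.linalg.norm(X_test - c_pos, axis=1)
    d_neg = np.linalg.norm(X_test - c_neg, axis=1)
    # The closer a sample is to the positive centroid, the closer its score is to 1.
    return d_neg / (d_pos + d_neg + 1e-12)


def early_fusion(sources_train, y_train, sources_test):
    """Vector space integration: concatenate the features of all sources,
    then train a single learner on the joined representation."""
    X_train = np.hstack(sources_train)
    X_test = np.hstack(sources_test)
    return centroid_scores(X_train, y_train, X_test)


def late_fusion(sources_train, y_train, sources_test, weights=None):
    """Weighted averaging: train one learner per source, then combine
    the per-source scores with normalized weights."""
    scores = np.column_stack([
        centroid_scores(X_tr, y_train, X_te)
        for X_tr, X_te in zip(sources_train, sources_test)
    ])
    if weights is None:
        weights = np.ones(scores.shape[1])  # plain (unweighted) average
    weights = np.asarray(weights, dtype=float)
    return scores @ (weights / weights.sum())


# Synthetic stand-ins for two heterogeneous data sources describing the same genes.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 50)                                  # hypothetical binary labels
src_a = rng.normal(size=(100, 5)) + y[:, None] * 1.0       # informative source
src_b = rng.normal(size=(100, 20)) + y[:, None] * 0.2      # noisier source
train = np.arange(100) % 2 == 0
test = ~train

early = early_fusion([src_a[train], src_b[train]], y[train],
                     [src_a[test], src_b[test]])
late = late_fusion([src_a[train], src_b[train]], y[train],
                   [src_a[test], src_b[test]], weights=[0.8, 0.2])
```

In this sketch the weights are fixed by hand; in practice they would typically be estimated from each base learner's performance on held-out data. Other combiners listed in the keywords, such as decision templates or the naive Bayes combiner, would replace only the final combination step of `late_fusion`.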

Keywords: Data integration; Decision fusion; Decision templates; Early fusion; Gene function prediction; Late fusion; Naive Bayes combiner; Vector space integration; Weighted averaging

Scientific-disciplinary sector of the contribution: Settore INF/01 - Informatica (Computer Science)

Publication date: 2009

DOI: https://dx.doi.org/10.3233/978-1-60750-072-8-197

Type: Book Part (author)

Appears in types: 03 - Contributo in volume (contribution in a volume)

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/147645
