IRIS Institutional Research Information System - AIR Archivio Istituzionale della Ricerca

Predicting gene expression from heterogeneous data / M. Re, G. Valentini - In: CIBB 2009, the sixth International conference on bioinformatics and biostatistics : 15-17 Oct. 2009, Genova, Italy : proceedings / edited by F. Masulli, L. Peterson, R. Tagliaferri. - [s.l.] : Università degli Studi di Salerno, DMI, 2009. - ISBN 9788890353727. (Paper presented at the 6th International Meeting on Computational Intelligence Methods for Bioinformatics and Biostatistics (CIBB), held in Genoa in 2009.)

Predicting gene expression from heterogeneous data

M. Re (first author); G. Valentini (last author)

2009

Abstract

The complexity of gene expression, and the elucidation of the mechanisms that regulate it, remain an extremely difficult challenge in modern bioinformatics, despite the amount of information recently made available by high-throughput biotechnologies and genome-wide investigations. In this contribution we investigate the effectiveness of ensemble systems for gene expression prediction. Because ensemble systems can integrate heterogeneous datasets, they make it possible to exploit not only promoter sequence-based datasets but also other sources of information, such as phylogenetic patterns of regulatory motifs and covalent histone modifications. To this end we collected data from the literature and predicted the expression class of 2490 S. cerevisiae genes using an ensemble of Support Vector Machines trained on four different sources of data. The experimental results show that ensemble systems can improve gene expression prediction performance. Nevertheless, further investigation is required to find the best combination of datasets and data fusion methods for gene expression class prediction.
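
The abstract does not state how the outputs of the per-source classifiers are combined. As a minimal sketch of one common decision-level fusion rule (weighted averaging of per-source scores), the following assumes that each source-specific SVM has already produced a score in [0, 1] for a gene. The source names, scores, weights, and the 0.5 threshold are hypothetical and are not taken from the paper.

```python
from typing import Dict, Optional


def fuse_scores(scores: Dict[str, float],
                weights: Optional[Dict[str, float]] = None) -> float:
    """Return the weighted average of per-source classifier scores.

    Each score is assumed to lie in [0, 1]; with no weights given,
    every source contributes equally (plain averaging).
    """
    if not scores:
        raise ValueError("at least one source score is required")
    if weights is None:
        weights = {name: 1.0 for name in scores}
    total = sum(weights[name] for name in scores)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(weights[name] * s for name, s in scores.items()) / total


def predict_class(scores: Dict[str, float],
                  weights: Optional[Dict[str, float]] = None,
                  threshold: float = 0.5) -> int:
    """Map the fused score to a binary expression class (1 or 0)."""
    return 1 if fuse_scores(scores, weights) >= threshold else 0


# Hypothetical per-source scores for a single gene. The abstract names
# three kinds of data explicitly; "fourth_source" is a placeholder.
example_scores = {
    "promoter_sequence": 0.75,
    "phylogenetic_motifs": 0.5,
    "histone_modifications": 0.25,
    "fourth_source": 0.5,
}

# Hypothetical weights that favour the promoter-sequence classifier.
example_weights = {
    "promoter_sequence": 2.0,
    "phylogenetic_motifs": 1.0,
    "histone_modifications": 1.0,
    "fourth_source": 1.0,
}

if __name__ == "__main__":
    print(fuse_scores(example_scores))
    print(fuse_scores(example_scores, example_weights))
    print(predict_class(example_scores, example_weights))
```

Averaging is only one possible choice; majority voting or a trained meta-classifier would slot into `fuse_scores` in the same way, which is the kind of comparison the abstract leaves to further investigation.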

	Scientific-disciplinary sectors of the contribution (display only)
	
				Sector INF/01 - Computer Science
			
	Publication date
	
				2009
			
	Type
	
				Book Part (author)
			
	Appears in types:
	
				03 - Contribution in a volume

Files in this item:

File	Access	Type	Size	Format
re-vale-cibb09.5-2.pdf	open access	Pre-print (manuscript submitted to the publisher)	185.1 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/178278
