IRIS Institutional Research Information System - AIR Archivio Istituzionale della Ricerca

Searching for structures in complex bio-molecular data is a central issue in several branches of bioinformatics. In particular, the reliability of clusters discovered by a given clustering algorithm has recently been assessed through methods based on the concept of stability with respect to random perturbations of the data. In this context, a major problem is to assess the confidence of the measures of reliability. We discuss a partially "distribution-independent" method based on the classical Bernstein inequality to assess the statistical significance of the discovered clusterings. Experimental results with gene expression data show the effectiveness of the proposed approach.

Discovering Significant Structures in Clustered Bio-molecular Data Through the Bernstein Inequality / A. Bertoni, G. Valentini (LECTURE NOTES IN ARTIFICIAL INTELLIGENCE). - In: Knowledge-Based Intelligent Information and Engineering Systems : KES 2007 - WIRN 2007 / [edited by] B. Apolloni, R. J. Howlett, L. Jain. - Berlin : Springer, 2007 Sep 14. - ISBN 9783540748281. - pp. 886-891 (Paper presented at the 11th KES International Conference, KES 2007, XVII Italian Workshop on Neural Networks, held in Vietri sul Mare, September 12-14, 2007) [10.1007/978-3-540-74829-8_108].

Discovering Significant Structures in Clustered Bio-molecular Data Through the Bernstein Inequality

A. Bertoni (first author); G. Valentini (last author)

2007

	Scientific-disciplinary sectors of the contribution
	
			Sector INF/01 - Computer Science
		
	Publication date
	
			14 Sep 2007
		
	DOI
	
			https://dx.doi.org/10.1007/978-3-540-74829-8_108
		
	URL
	
			http://www.springerlink.com/content/6w830t33u2854146/?p=b50a1b5ba96246e7b55e32e8a5826086&pi=1
		
	Type
	
			Book Part (author)
		
	Appears in types:
	
			03 - Contribution in a volume

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/44124
