An algorithm to assess the reliability of hierarchical clusters in gene expression data / R. Avogadri, M. Brioschi, F. Ruffino, F. Ferrazzi, A. Beghini, G. Valentini - In: Knowledge-Based Intelligent Information and Engineering Systems : 12th International Conference, KES 2008 : Pt. 2 / edited by I. Lovrek, R.J. Howlett, L.C. Jain. - Berlin : Springer, 2008. - ISBN 978-3-540-85566-8. - pp. 764-770 (Paper presented at the 12th Knowledge-Based Intelligent Information and Engineering Systems International Conference (KES), held in Zagreb (Croatia) in 2008) [10.1007/978-3-540-85567-5_95].

An algorithm to assess the reliability of hierarchical clusters in gene expression data

R. Avogadri (first author); F. Ruffino; A. Beghini (second-to-last author); G. Valentini (last author)
2008

Abstract

The validation of clusters discovered in bio-molecular data is a central issue in bioinformatics. Recently, stability-based methods have been successfully applied to assess the reliability of clusterings with a relatively small number of examples and clusters. However, several problems in functional genomics involve a very large number of examples and clusters. We present a stability-based algorithm to discover significant clusters in hierarchical clusterings with a large number of examples and clusters. Preliminary results on gene expression data from patients affected by Human Myeloid Leukemia show how to apply the proposed method when thousands of gene clusters are involved.
bioinformatics ; genomics ; pattern clustering ; hierarchical cluster reliability ; gene expression data ; bio-molecular data ; stability-based methods ; functional genomics ; human myeloid leukemia ; gene clusters
Sector MED/03 - Medical Genetics
Sector INF/01 - Computer Science
2008
Book Part (author)

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/59210
Citations
  • PMC: not available
  • Scopus: 1
  • ISI: 1