An unsupervised fuzzy ensemble algorithmic scheme for gene expression data analysis / R. Avogadri, G. Valentini. (Paper presented at the NETTAB 2007 workshop on a Semantic Web for Bioinformatics, held in Pisa, Italy, in 2007.)

An unsupervised fuzzy ensemble algorithmic scheme for gene expression data analysis

R. Avogadri (first author); G. Valentini (last author)
2007

Abstract

Background: In recent years, unsupervised ensemble clustering methods have been successfully applied to DNA microarray data analysis to improve the accuracy and reliability of clustering results. Nevertheless, a major problem is that classes of functionally correlated examples (e.g. subclasses of diseases characterized at the bio-molecular level) are in general not clearly separable, and in many cases the same gene may belong to different functional classes (e.g. it may participate in different biological processes). Results: We propose an ensemble clustering algorithmic scheme, based on a fuzzy approach, that deals directly with overlapping classes and with genes or samples that may belong to more than one cluster at the same time. Several fuzzy ensemble clustering algorithms may be derived from this scheme, according to the way the multiple clusterings are combined and the consensus clustering is generated. We test some of the proposed ensemble algorithms on two DNA microarray data sets available on the web, comparing the results with other single and ensemble clustering methods. Conclusions: The proposed fuzzy ensemble approach may be applied to discover classes of co-expressed genes or subclasses of functionally related examples, and in principle to the unsupervised analysis of different types of complex bio-molecular data. Fuzzy ensemble algorithms can assign each gene or sample to multiple classes and can estimate and improve the accuracy and reliability of the discovered clusterings, as shown by our experimental results.
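
The general idea of combining multiple fuzzy clusterings into a consensus can be illustrated with a minimal sketch. The abstract does not specify the base clusterer, the perturbation used to generate diversity, or the combination rule, so everything below is an assumption chosen for illustration: a plain fuzzy c-means base learner, random feature subsets as the perturbation, a product-based fuzzy co-association matrix, and an alpha-cut for multiple assignment. The names `fuzzy_cmeans`, `fuzzy_ensemble` and `alpha_cut` are hypothetical and are not taken from the paper.

```python
import numpy as np


def fuzzy_cmeans(X, k, m=2.0, n_iter=100, rng=None):
    """Minimal fuzzy c-means; returns an (n_samples, k) membership matrix."""
    rng = np.random.default_rng(rng)
    n = X.shape[0]
    # Random initial memberships, each row normalized to sum to 1.
    U = rng.random((n, k))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        # Cluster centers are membership-weighted means of the data.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.maximum(d, 1e-12)  # avoid division by zero
        inv = d ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)
    return U


def fuzzy_ensemble(X, k, n_runs=20, feature_frac=0.8, seed=0):
    """Assumed combination scheme: average fuzzy co-association, then re-cluster."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    n_feat = max(1, int(feature_frac * d))
    S = np.zeros((n, n))
    for _ in range(n_runs):
        # Assumption: diversity comes from random feature subsets.
        cols = rng.choice(d, size=n_feat, replace=False)
        U = fuzzy_cmeans(X[:, cols], k, rng=rng)
        # S[i, j] accumulates sum_c U[i, c] * U[j, c], a fuzzy "same cluster" degree.
        S += U @ U.T
    S /= n_runs
    # Assumption: the consensus is obtained by clustering the rows of S.
    consensus = fuzzy_cmeans(S, k, rng=rng)
    return S, consensus


def alpha_cut(U, alpha=0.3):
    """Boolean matrix: item i belongs to cluster c when U[i, c] >= alpha."""
    return U >= alpha


# Toy data: two synthetic groups of 20 "samples" with 10 "genes" each.
data_rng = np.random.default_rng(1)
X = np.vstack([data_rng.normal(0, 1, (20, 10)), data_rng.normal(5, 1, (20, 10))])
S, consensus = fuzzy_ensemble(X, k=2)
members = alpha_cut(consensus, alpha=0.3)
```

Because `alpha_cut` thresholds memberships instead of taking the maximum, a row of `members` may contain more than one `True`, which is how a sample or gene can end up in several clusters at once in this sketch.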
Sector INF/01 - Informatica (Computer Science)
Conference Object
Files in this item:
avo-vale-nettab07-final.pdf (open access)
Type: Pre-print (manuscript submitted to the publisher)
Size: 158.3 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: http://hdl.handle.net/2434/44210