Randomized Embedding Cluster Ensembles for gene expression data analysis / A. Bertoni, G. Valentini. (Paper presented at the SETIT 2007 - IEEE International Conf. on Sciences of Electronic, Technologies of Information and Telecommunications conference, held in Hammamet, Tunisia, in 2007.)
Randomized Embedding Cluster Ensembles for gene expression data analysis
A. Bertoni (first author); G. Valentini (last author)
2007
Abstract
In the framework of unsupervised pattern analysis of gene expression, the high dimensionality of the data as well as the accuracy of clustering algorithms and the reliability of the discovered clusters are critical problems. We propose and analyze an algorithmic scheme for unsupervised cluster ensembles, where the dimensionality reduction is obtained by means of randomized embeddings with low distortion. Multiple "base" clusterings are performed on random subspaces, approximately preserving the distances between the projected examples. In this way the accuracy of each "base" clustering is maintained, and the diversity between them is improved. By combining the multiple clusterings, we can enhance the overall accuracy and the reliability of the discovered clusters, as shown by our experimental results with high-dimensional gene expression data.

File | Access | Type | Size | Format
---|---|---|---|---
bertoni-vale-SETIT07.pdf | Open access | Pre-print (manuscript submitted to the publisher) | 570.17 kB | Adobe PDF
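
The scheme summarized in the abstract (random low-distortion embeddings, one "base" clustering per embedding, then a combination step) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name the embedding, the base clustering algorithm, or the combination rule, so the Gaussian random projection, the plain k-means, and the thresholded co-association matrix used here are all assumptions chosen for brevity. The function names (`random_projection`, `kmeans`, `coassociation`, `consensus_labels`) and all parameters are hypothetical.

```python
import numpy as np

def random_projection(X, d_out, rng):
    """Project the rows of X to d_out dimensions with a Gaussian random matrix.

    Assumption: a Johnson-Lindenstrauss style projection stands in for the
    "randomized embeddings with low distortion" mentioned in the abstract.
    """
    d_in = X.shape[1]
    R = rng.normal(size=(d_in, d_out)) / np.sqrt(d_out)
    return X @ R

def kmeans(X, k, rng, n_iter=50):
    """Plain Lloyd's k-means (an assumed base clusterer); returns one label per row."""
    centers = X[rng.choice(len(X), size=k, replace=False)]  # copy of k random rows
    for _ in range(n_iter):
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # leave an empty cluster's center where it is
                centers[j] = members.mean(axis=0)
    return labels

def coassociation(X, k, d_out, n_base, seed=0):
    """Fraction of base clusterings in which each pair of examples shares a cluster."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    C = np.zeros((n, n))
    for _ in range(n_base):
        labels = kmeans(random_projection(X, d_out, rng), k, rng)
        C += labels[:, None] == labels[None, :]
    return C / n_base

def consensus_labels(C, threshold=0.5):
    """Combine the base clusterings: connected components of the graph that
    links two examples when their co-association exceeds the threshold."""
    n = C.shape[0]
    labels = -np.ones(n, dtype=int)
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in np.flatnonzero((C[i] > threshold) & (labels == -1)):
                labels[j] = current
                stack.append(j)
        current += 1
    return labels

if __name__ == "__main__":
    # Synthetic stand-in for expression data: 40 examples, 1000 features.
    rng = np.random.default_rng(42)
    X = np.vstack([rng.normal(size=(20, 1000)),
                   rng.normal(size=(20, 1000)) + 10.0])
    C = coassociation(X, k=2, d_out=50, n_base=10, seed=1)
    print(consensus_labels(C))
```

Because each base clustering sees a different random projection, the partitions may differ from run to run; the co-association matrix averages over them, and pairs of examples that are grouped together in most projections end up in the same consensus cluster. The threshold of 0.5 and the connected-components rule are only one possible combination step.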