We present an algorithmic scheme for unsupervised cluster ensembles, based on randomized projections between metric spaces, by which a substantial dimensionality reduction is obtained. Multiple clusterings are performed on random subspaces, approximately preserving the distances between the projected data, and then they are combined using a pairwise similarity matrix; in this way the accuracy of each "base" clustering is maintained, and the diversity between them is improved. The proposed approach is effective for clustering problems characterized by high-dimensional data, as shown by our preliminary experimental results.
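The scheme summarized above (project, cluster each projection, combine through a pairwise similarity matrix) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the Gaussian projection matrix, the plain k-means base learner, the thresholded connected-components consensus step, and all function and parameter names (`random_projection`, `kmeans`, `coassociation_matrix`, `consensus_labels`, `d_out`, `n_base`, `threshold`) are hypothetical choices made here for illustration. The only fact relied on is that a Gaussian matrix scaled by 1/sqrt(d_out) preserves squared Euclidean distances in expectation.

```python
import numpy as np


def random_projection(X, d_out, rng):
    """Project the rows of X onto d_out random directions.

    Assumption: Gaussian entries scaled by 1/sqrt(d_out), so squared
    distances between rows are preserved in expectation.
    """
    R = rng.normal(size=(X.shape[1], d_out)) / np.sqrt(d_out)
    return X @ R


def kmeans(X, k, rng, n_iter=50):
    """Plain Lloyd iterations, used here as a stand-in base clustering."""
    # Fancy indexing returns a copy, so updating centers leaves X untouched.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    return labels


def coassociation_matrix(X, k, d_out, n_base, seed=0):
    """Pairwise similarity: fraction of base clusterings that put
    points i and j in the same cluster."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    M = np.zeros((n, n))
    for _ in range(n_base):
        labels = kmeans(random_projection(X, d_out, rng), k, rng)
        M += labels[:, None] == labels[None, :]
    return M / n_base


def consensus_labels(M, threshold=0.5):
    """Assumed consensus step: connected components of the graph that
    links i and j whenever M[i, j] > threshold."""
    n = M.shape[0]
    labels = -np.ones(n, dtype=int)
    current = 0
    for i in range(n):
        if labels[i] >= 0:
            continue
        labels[i] = current
        stack = [i]
        while stack:
            u = stack.pop()
            for v in np.flatnonzero((M[u] > threshold) & (labels < 0)):
                labels[v] = current
                stack.append(v)
        current += 1
    return labels
```

A typical call would be `M = coassociation_matrix(X, k=2, d_out=20, n_base=10)` followed by `consensus_labels(M)`, where `X` is an `(n_samples, n_features)` array. Each base clustering sees a different random subspace, which is where the diversity among ensemble members comes from in this sketch.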

Ensembles based on random projections to improve the accuracy of clustering algorithms / A. Bertoni, G. Valentini (Lecture Notes in Computer Science). - In: Neural Nets: 16th Italian Workshop on Neural Nets, WIRN 2005, and International Workshop on Natural and Artificial Immune Systems, NAIS 2005: Vietri sul Mare, Italy, June 8-11, 2005: Revised Selected Papers / edited by B. Apolloni, M. Marinaro, G. Nicosia, R. Tagliaferri. - Berlin: Springer, 2006. - ISBN 3540331832. - pp. 31-37. Paper presented at the 16th Italian Workshop on Neural Nets (WIRN) and International Workshop on Natural and Artificial Immune Systems (NAIS), held in Vietri sul Mare in 2005 [10.1007/11731177_5].

Ensembles based on random projections to improve the accuracy of clustering algorithms

A. Bertoni (first author); G. Valentini (last author)
2006

Sector INF/01 - Computer Science
2006
SIREN
Book Part (author)

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/20345