A novel intrinsic dimensionality estimator based on rank-order statistics / S. Bassis, A. Rozza, C. Ceruti, G. Lombardi, E. Casiraghi, P. Campadelli - In: Clustering high-dimensional data : first International workshop, CHDD 2012, Naples, Italy, May 15, 2012 : revised selected papers / edited by F. Masulli, A. Petrosino, S. Rovetta. - First edition. - Berlin : Springer, 2015. - ISBN 9783662485774. - pp. 102-117 (Paper presented at the 1st International workshop on Clustering high-dimensional data, CHDD, held in Naples (Italy) in 2012).

A novel intrinsic dimensionality estimator based on rank-order statistics

S. Bassis;C. Ceruti;E. Casiraghi;P. Campadelli
2015

Abstract

Over the past two decades, estimating the intrinsic dimensionality of a dataset has gained considerable importance, since it provides relevant information for several real-life applications. Although a great deal of research effort has been devoted to developing effective intrinsic dimensionality estimators, the problem is still open. In this paper we therefore propose a novel, robust intrinsic dimensionality estimator that exploits the information conveyed by the normalized nearest-neighbor distances through a technique based on rank-order statistics, which limits the common underestimation issues related to the edge effect. Experiments on both synthetic and real datasets highlight the robustness and effectiveness of the proposed algorithm compared with state-of-the-art methods.
intrinsic dimensionality estimation; manifold learning; rank-order statistics
Sector INF/01 - Computer Science
Book Part (author)


Use this identifier to cite or link to this document: https://hdl.handle.net/2434/342037