DANCo: Dimensionality from Angle and Norm Concentration / C. Ceruti, S. Bassis, A. Rozza, G. Lombardi, E. Casiraghi, P. Campadelli. - (2012 Jun 18).

DANCo: Dimensionality from Angle and Norm Concentration

C. Ceruti; S. Bassis; A. Rozza; G. Lombardi; E. Casiraghi; P. Campadelli
2012

Abstract

In recent decades the estimation of the intrinsic dimensionality of a dataset has gained considerable importance. Despite the great deal of research devoted to this task, most of the proposed solutions prove unreliable when the intrinsic dimensionality of the input dataset is high and the manifold where the points lie is nonlinearly embedded in a higher-dimensional space. In this paper we propose a novel, robust intrinsic dimensionality estimator that exploits the twofold complementary information conveyed by the normalized nearest-neighbor distances and by the angles computed on pairs of neighboring points, and we also provide closed forms for the Kullback-Leibler divergences of the respective distributions. Experiments performed on both synthetic and real datasets highlight the robustness and effectiveness of the proposed algorithm when compared with state-of-the-art methodologies.
intrinsic dimensionality estimation; manifold learning; von Mises distribution; Kullback-Leibler divergence
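The abstract above outlines the two ingredients of DANCo: a model for the normalized nearest-neighbor distances and a von Mises model for the angles between pairs of neighboring points, combined through closed-form Kullback-Leibler divergences. The following Python sketch illustrates only the distance-based half of that idea, assuming the normalized-ratio density g(rho; k, d) = k d rho^(d-1) (1 - rho^d)^(k-1) commonly used by this family of estimators; the function names (normalized_nn_ratios, estimate_id) and parameter choices are illustrative and do not reproduce the authors' implementation, and the angle/von Mises component and the KL-based calibration are omitted for brevity.

# Minimal, illustrative sketch of a distance-only intrinsic dimensionality
# estimator in the spirit of DANCo's first ingredient. NOT the paper's
# algorithm: the von Mises angle model and the closed-form KL comparison
# against reference statistics are left out.
import numpy as np


def normalized_nn_ratios(X, k):
    """rho_i = (distance to 1st NN) / (distance to (k+1)-th NN) for each point."""
    sq = np.sum(X * X, axis=1)
    D2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)   # squared pairwise distances
    D2 = np.maximum(D2, 0.0)                            # guard against round-off
    np.fill_diagonal(D2, np.inf)                        # exclude the point itself
    D2.sort(axis=1)                                     # ascending per row
    rho = np.sqrt(D2[:, 0] / D2[:, k])                  # d_1 / d_{k+1}
    return np.clip(rho, 1e-12, 1.0 - 1e-12)             # keep the log terms finite


def log_likelihood(rho, k, d):
    """Log-likelihood of the ratios under g(rho; k, d) = k d rho^(d-1) (1 - rho^d)^(k-1)."""
    return np.sum(np.log(k * d) + (d - 1) * np.log(rho)
                  + (k - 1) * np.log1p(-rho ** d))


def estimate_id(X, k=10, d_max=None):
    """Return the integer dimension maximizing the normalized-distance likelihood."""
    d_max = X.shape[1] if d_max is None else d_max
    rho = normalized_nn_ratios(X, k)
    lls = [log_likelihood(rho, k, d) for d in range(1, d_max + 1)]
    return int(np.argmax(lls)) + 1


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Example: a 5-dimensional Gaussian cloud linearly embedded in 20 dimensions.
    Z = rng.standard_normal((1000, 5))
    X = Z @ rng.standard_normal((5, 20))
    print("estimated intrinsic dimension:", estimate_id(X, k=10))

As the abstract notes, purely distance-based estimates of this kind tend to become unreliable when the intrinsic dimensionality is high; the full DANCo procedure addresses this by also fitting the von Mises angle distribution and selecting the dimension whose reference statistics minimize the combined Kullback-Leibler divergence.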
Settore INF/01 - Informatica (Computer Science)
Settore MAT/06 - Probabilità e Statistica Matematica (Probability and Mathematical Statistics)
18 Jun 2012
http://arxiv.org/abs/1206.3881
Files in this item:
DANCO_ARXIV_1206.3881v1.pdf
Open access
Type: Pre-print (manuscript submitted to the publisher)
Size: 2.77 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/451784