
Random geometric graphs in high dimension / V. Erba, S. Ariosto, M. Gherardi, P. Rotondo. In: Physical Review E, ISSN 2470-0045, 102:1 (2020), pp. 012306.1-012306.7. DOI: 10.1103/PhysRevE.102.012306

Random geometric graphs in high dimension

V. Erba; S. Ariosto; M. Gherardi; P. Rotondo
2020

Abstract

Many machine learning algorithms used for dimensionality reduction and manifold learning rely on computing the nearest neighbors of each point in a data set. These proximity relations define a so-called geometric graph, where two nodes are linked if they are sufficiently close to each other. Random geometric graphs, where the positions of the nodes are generated at random in a subset of ℝ^d, offer a null model with which to study typical properties of data sets and of machine learning algorithms. Up to now, most of the literature has focused on the characterization of low-dimensional random geometric graphs, whereas typical data sets of interest in machine learning live in high-dimensional spaces (d ≫ 10²). In this work, we consider the infinite-dimensional limit of hard and soft random geometric graphs and show how to compute the average number of subgraphs of a given finite size k, e.g., the average number of k-cliques. This analysis highlights that local observables display different behaviors depending on the chosen ensemble: soft random geometric graphs with continuous activation functions converge to the naive infinite-dimensional limit provided by Erdős–Rényi graphs, whereas hard random geometric graphs can show systematic deviations from it. We present numerical evidence that our analytical results, exact in infinite dimensions, provide a good approximation also for dimension d ≳ 10.
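
To make the quantities in the abstract concrete, the following is a minimal sketch (not the authors' code) of the two ensembles it contrasts: a hard random geometric graph, where two of n uniform points in [0, 1]^d are linked when their distance falls below a threshold, and a soft one, where the hard step is replaced by a smooth sigmoid activation. It counts 3-cliques (triangles) and compares them with the Erdős–Rényi expectation C(n,3)·p³ at the same edge density. The parameter values n = 300, p = 0.1, beta = 2.0, the unit-hypercube domain, and the quantile-based calibration of the radius are illustrative assumptions, not choices taken from the paper.

    import numpy as np
    from math import comb

    def pairwise_distances(x):
        """Full matrix of Euclidean distances between the rows of x."""
        return np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)

    def hard_rgg(n, d, p, rng):
        """Hard RGG: n points uniform in [0, 1]^d, linked when their distance
        is below a threshold; the threshold is set to the p-quantile of the
        pairwise distances, so the edge density is approximately p."""
        x = rng.random((n, d))
        dist = pairwise_distances(x)
        r = np.quantile(dist[np.triu_indices(n, k=1)], p)
        return (dist < r) & ~np.eye(n, dtype=bool)

    def soft_rgg(n, d, p, beta, rng):
        """Soft RGG: the hard step is replaced by a smooth activation; each
        pair is linked independently with probability
        1 / (1 + exp(beta * (dist - r))). Edge density is only roughly p
        here (illustrative, not a calibrated ensemble)."""
        x = rng.random((n, d))
        dist = pairwise_distances(x)
        r = np.quantile(dist[np.triu_indices(n, k=1)], p)
        prob = 1.0 / (1.0 + np.exp(beta * (dist - r)))
        upper = np.triu(rng.random((n, n)) < prob, k=1)
        return upper | upper.T

    def count_triangles(adj):
        """Number of 3-cliques: trace(A^3) counts each triangle 6 times."""
        a = adj.astype(float)
        return int(round(np.trace(a @ a @ a) / 6))

    rng = np.random.default_rng(0)
    n, p, samples = 300, 0.1, 10
    er = comb(n, 3) * p**3  # Erdos-Renyi expectation at the same density
    for d in (2, 10, 100):
        hard = np.mean([count_triangles(hard_rgg(n, d, p, rng))
                        for _ in range(samples)])
        soft = np.mean([count_triangles(soft_rgg(n, d, p, 2.0, rng))
                        for _ in range(samples)])
        print(f"d={d:3d}  hard={hard:8.1f}  soft={soft:8.1f}  ER={er:8.1f}")

At small d, geometric correlations inflate the triangle count well above the Erdős–Rényi value; running the sketch at increasing d shows the two ensembles approaching that value, with any residual gap for the hard case illustrating the kind of systematic deviation the paper analyzes.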
Academic discipline FIS/02 - Theoretical Physics, Mathematical Models and Methods
Article (author)
File in this record: PhysRevE.102.012306.pdf (publisher's version/PDF, 434.5 kB, restricted access; a copy can be requested)

Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/783618
Citations
  • PMC: 1
  • Scopus: 7
  • Web of Science: 6