We investigate the problem of exact cluster recovery using oracle queries. Previous results show that clusters in Euclidean spaces that are convex and separated with a margin can be reconstructed exactly using only $O(log n)$ same-cluster queries, where $n$ is the number of input points. In this work, we study this problem in the more challenging non-convex setting. We introduce a structural characterization of clusters, called $(eta,gamma)$-convexity, that can be applied to any finite set of points equipped with a metric (or even a semimetric, as the triangle inequality is not needed). Using $(eta,gamma)$-convexity, we can translate natural density properties of clusters (which include, for instance, clusters that are strongly non-convex in $R^d$) into a graph-theoretic notion of convexity. By exploiting this convexity notion, we design a deterministic algorithm that recovers $(eta,gamma)$-convex clusters using $O(k^2 log n + k^2 (rac{6}{etagamma})^{dens(X)})$ same-cluster queries, where $k$ is the number of clusters and $dens(X)$ is the density dimension of the semimetric. We show that an exponential dependence on the density dimension is necessary, and we also show that, if we are allowed to make $O(k^2 + k log n)$ additional queries to a "cluster separation" oracle, then we can recover clusters that have different and arbitrary scales, even when the scale of each cluster is unknown.

Exact Recovery of Clusters in Finite Metric Spaces Using Oracle Queries / M. Bressan, N. Cesa Bianchi, S. Lattanzi, A. Paudice (PROCEEDINGS OF MACHINE LEARNING RESEARCH). - In: Proceedings of Thirty Fourth Conference on Learning Theory / [a cura di] M. Belkin, S. Kpotufe. - [s.l] : PMLR, 2021. - pp. 775-803 (( Intervento presentato al 34. convegno Conference on Learning Theory.

Exact Recovery of Clusters in Finite Metric Spaces Using Oracle Queries

M. Bressan;N. Cesa Bianchi;A. Paudice
2021

Abstract

We investigate the problem of exact cluster recovery using oracle queries. Previous results show that clusters in Euclidean spaces that are convex and separated with a margin can be reconstructed exactly using only $O(log n)$ same-cluster queries, where $n$ is the number of input points. In this work, we study this problem in the more challenging non-convex setting. We introduce a structural characterization of clusters, called $(eta,gamma)$-convexity, that can be applied to any finite set of points equipped with a metric (or even a semimetric, as the triangle inequality is not needed). Using $(eta,gamma)$-convexity, we can translate natural density properties of clusters (which include, for instance, clusters that are strongly non-convex in $R^d$) into a graph-theoretic notion of convexity. By exploiting this convexity notion, we design a deterministic algorithm that recovers $(eta,gamma)$-convex clusters using $O(k^2 log n + k^2 (rac{6}{etagamma})^{dens(X)})$ same-cluster queries, where $k$ is the number of clusters and $dens(X)$ is the density dimension of the semimetric. We show that an exponential dependence on the density dimension is necessary, and we also show that, if we are allowed to make $O(k^2 + k log n)$ additional queries to a "cluster separation" oracle, then we can recover clusters that have different and arbitrary scales, even when the scale of each cluster is unknown.
Non-convex clusters; same-cluster queries; geodesic convexity
Settore INF/01 - Informatica
European Learning and Intelligent Systems Excellence (ELISE)
PRIN201719NCESA_01 - Algorithms, Games, and Digital Markets (ALGADIMAR) - CESA BIANCHI, NICOLO' ANTONIO - PRIN2017 - PRIN bando 2017 - 2019
http://proceedings.mlr.press/v134/bressan21a/bressan21a.pdf
Book Part (author)
File in questo prodotto:
File Dimensione Formato  
bressan21a.pdf

accesso aperto

Tipologia: Publisher's version/PDF
Dimensione 856.9 kB
Formato Adobe PDF
856.9 kB Adobe PDF Visualizza/Apri
Pubblicazioni consigliate

Caricamento pubblicazioni consigliate

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: http://hdl.handle.net/2434/860362
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus ND
  • ???jsp.display-item.citation.isi??? ND
social impact