IRIS Institutional Research Information System - AIR Archivio Istituzionale della Ricerca

In correlation clustering, we are givennobjects together with a binary similarityscore between each pair of them. The goal is to partition the objects into clustersso to minimise the disagreements with the scores. In this work we investigatecorrelation clustering as an active learning problem: each similarity score can belearned by making a query, and the goal is to minimise both the disagreementsand the total number of queries. On the one hand, we describe simple activelearning algorithms, which provably achieve an almost optimal trade-off whilegiving cluster recovery guarantees, and we test them on different datasets. On theother hand, we prove information-theoretical bounds on the number of queriesnecessary to guarantee a prescribed disagreement bound. These results give a richcharacterization of the trade-off between queries and clustering error.

Correlation Clustering with Adaptive Similarity Queries / M. Bressan, N. Cesa-Bianchi, A. Paudice, F. Vitale (ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS). - In: Advances in Neural Information Processing Systems / [a cura di] H. Wallach, H. Larochelle, A. Beygelzime, F. d'Alché-Buc, E. Fox, R. Garnett. - [s.l] : Curran Associates, Inc., 2019. - pp. 12510-12519 (( Intervento presentato al 32. convegno Neural Information Processing Systems nel 2019.

Correlation Clustering with Adaptive Similarity Queries

M. Bressan;N. Cesa-Bianchi;A. Paudice;F. Vitale

2019

Abstract

In correlation clustering, we are givennobjects together with a binary similarityscore between each pair of them. The goal is to partition the objects into clustersso to minimise the disagreements with the scores. In this work we investigatecorrelation clustering as an active learning problem: each similarity score can belearned by making a query, and the goal is to minimise both the disagreementsand the total number of queries. On the one hand, we describe simple activelearning algorithms, which provably achieve an almost optimal trade-off whilegiving cluster recovery guarantees, and we test them on different datasets. On theother hand, we prove information-theoretical bounds on the number of queriesnecessary to guarantee a prescribed disagreement bound. These results give a richcharacterization of the trade-off between queries and clustering error.

Scheda breve

Scheda completa

Scheda completa (DC)

	Settori scientifico-disciplinari del contributo
	
			Settore INF/01 - Informatica
		
	Data di pubblicazione
	
			2019
		
	URL
	
			http://papers.nips.cc/paper/9417-correlation-clustering-with-adaptive-similarity-queries.pdf
		
	Tipologia
	
			Book Part (author)
		
	Appare nelle tipologie:
	
			03 - Contributo in volume

File in questo prodotto:

File	Dimensione	Formato
9417-correlation-clustering-with-adaptive-similarity-queries.pdf accesso aperto Tipologia: Post-print, accepted manuscript ecc. (versione accettata dall'editore) Dimensione 392.79 kB Formato Adobe PDF Visualizza/Apri	392.79 kB	Adobe PDF	Visualizza/Apri

Pubblicazioni consigliate

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/2434/691411

Citazioni

ND

13

1

social impact