We carry out a systematic study of classification problems on networked data, presenting novel techniques with good performance both in theory and in practice. We assess the power of node classification based on class-linkage information only. In particular, we propose four new algorithms that exploit the homiphilic bias (linked entities tend to belong to the same class) in different ways. The set of the algorithms we present covers diverse practical needs: some of them operate in an active transductive setting and others in an on-line transductive setting. A third group works within an explorative protocol, in which the vertices of an unknown graph are progressively revealed to the learner in an on-line fashion. Within the mistake bound learning model, for each of our algorithms we provide a rigorous theoretical analysis, together with an interpretation of the obtained performance bounds. We also design adversarial strategies achieving matching lower bounds. In particular, we prove optimality for all input graphs and for all fixed regularity values of suitable labeling complexity measures. We also analyze the computational requirements of our methods, showing that our algorithms can to handle very large data sets. In the case of the on-line protocol, for which we exhibit an optimal algorithm with constant amortized time per prediction, we validate our theoretical results carrying out experiments on real-world datasets.

FAST LEARNING ON GRAPHS / F. Vitale ; relatore: Nicolo' Cesa-Bianchi ; correlatore: Claudio Gentile ; direttore della scuola di dottorato in informatica: Ernesto Damiani. Universita' degli Studi di Milano, 2011 Mar 25. 23. ciclo, Anno Accademico 2010. [10.13130/vitale-fabio_phd2011-03-25].

FAST LEARNING ON GRAPHS

F. Vitale
2011

Abstract

We carry out a systematic study of classification problems on networked data, presenting novel techniques with good performance both in theory and in practice. We assess the power of node classification based on class-linkage information only. In particular, we propose four new algorithms that exploit the homiphilic bias (linked entities tend to belong to the same class) in different ways. The set of the algorithms we present covers diverse practical needs: some of them operate in an active transductive setting and others in an on-line transductive setting. A third group works within an explorative protocol, in which the vertices of an unknown graph are progressively revealed to the learner in an on-line fashion. Within the mistake bound learning model, for each of our algorithms we provide a rigorous theoretical analysis, together with an interpretation of the obtained performance bounds. We also design adversarial strategies achieving matching lower bounds. In particular, we prove optimality for all input graphs and for all fixed regularity values of suitable labeling complexity measures. We also analyze the computational requirements of our methods, showing that our algorithms can to handle very large data sets. In the case of the on-line protocol, for which we exhibit an optimal algorithm with constant amortized time per prediction, we validate our theoretical results carrying out experiments on real-world datasets.
25-mar-2011
Settore INF/01 - Informatica
graph learning ; graph prediction ; graph theory ; graph clustering ; transductive learning ; online learning ; random spanning trees ; random walks ; node classification ; effective resistance
CESA BIANCHI, NICOLO' ANTONIO
Doctoral Thesis
FAST LEARNING ON GRAPHS / F. Vitale ; relatore: Nicolo' Cesa-Bianchi ; correlatore: Claudio Gentile ; direttore della scuola di dottorato in informatica: Ernesto Damiani. Universita' degli Studi di Milano, 2011 Mar 25. 23. ciclo, Anno Accademico 2010. [10.13130/vitale-fabio_phd2011-03-25].
File in questo prodotto:
File Dimensione Formato  
phd_unimi_R07651.pdf

accesso aperto

Tipologia: Tesi di dottorato completa
Dimensione 3.46 MB
Formato Adobe PDF
3.46 MB Adobe PDF Visualizza/Apri
Pubblicazioni consigliate

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/2434/155500
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus ND
  • ???jsp.display-item.citation.isi??? ND
social impact