The ABACOC algorithm: a novel approach for nonparametric classification of data streams / R. De Rosa, F. Orabona, N. Cesa-Bianchi - In: Data Mining (ICDM), 2015 IEEE International Conference on [s.l] : IEEE, 2016. - ISBN 9781467395045. - pp. 733-738 (IEEE International Conference on Data Mining, held in Atlantic City in 2015).

The ABACOC algorithm: a novel approach for nonparametric classification of data streams

R. De Rosa; N. Cesa-Bianchi
2016

Abstract

Stream mining poses unique challenges to machine learning: predictive models must be scalable and incrementally trainable, must remain bounded in size, and must be nonparametric in order to achieve high accuracy even in complex and dynamic environments. Moreover, the learning system must be parameterless (traditional tuning methods are problematic in streaming settings) and must not require prior knowledge of the number of distinct class labels occurring in the stream. In this paper, we introduce a new algorithmic approach for nonparametric learning in data streams. Our approach addresses all of the above-mentioned challenges by learning a model that covers the input space using simple local classifiers. The distribution of these classifiers dynamically adapts to the local (unknown) complexity of the classification problem, thus achieving a good balance between model complexity and predictive accuracy. By means of an extensive empirical evaluation against standard nonparametric baselines, we show state-of-the-art results in terms of accuracy versus model size. Our empirical analysis is complemented by a theoretical performance guarantee which does not rely on any stochastic assumption on the source generating the stream.
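The general idea of covering the input space with simple local classifiers can be illustrated with a small sketch. This is not the ABACOC algorithm and is not taken from the paper: it is a minimal, hypothetical online classifier in which each local classifier is a ball with a majority-vote label, a ball shrinks when it makes a mistake so that harder regions end up covered by smaller balls, and a fixed budget bounds the model size. The class name `BallCoverClassifier` and the settings `initial_radius`, `shrink`, and `max_balls` are assumptions made for illustration only; unlike the approach described above, this sketch is not parameterless.

```python
import math
from collections import Counter


class BallCoverClassifier:
    """Illustrative online ball-cover classifier (hypothetical, not ABACOC)."""

    def __init__(self, initial_radius=1.0, shrink=0.5, max_balls=100):
        # All three settings are assumed knobs for this sketch only.
        self.initial_radius = initial_radius
        self.shrink = shrink
        self.max_balls = max_balls
        self.balls = []  # each ball: {"center", "radius", "counts"}

    def _nearest(self, x):
        """Return (index, distance) of the closest ball center, or (None, inf)."""
        best_idx, best_dist = None, math.inf
        for idx, ball in enumerate(self.balls):
            dist = math.dist(x, ball["center"])
            if dist < best_dist:
                best_idx, best_dist = idx, dist
        return best_idx, best_dist

    def predict(self, x):
        """Predict the majority label of the nearest ball (None if no balls yet)."""
        idx, _ = self._nearest(x)
        if idx is None:
            return None
        return self.balls[idx]["counts"].most_common(1)[0][0]

    def update(self, x, y):
        """Process one labeled example from the stream."""
        idx, dist = self._nearest(x)
        if idx is None or dist > self.balls[idx]["radius"]:
            # x is not covered by its nearest ball: open a new ball centered at x.
            if len(self.balls) >= self.max_balls:
                # Budget reached: evict the ball that has seen the fewest examples.
                weakest = min(
                    range(len(self.balls)),
                    key=lambda i: sum(self.balls[i]["counts"].values()),
                )
                self.balls.pop(weakest)
            self.balls.append(
                {
                    "center": tuple(x),
                    "radius": self.initial_radius,
                    "counts": Counter({y: 1}),
                }
            )
            return
        ball = self.balls[idx]
        if ball["counts"].most_common(1)[0][0] != y:
            # Mistake inside the ball: shrink it so this region gets a finer cover.
            ball["radius"] *= self.shrink
        # Labels are tracked in a Counter, so new classes can appear at any time.
        ball["counts"][y] += 1


if __name__ == "__main__":
    clf = BallCoverClassifier(max_balls=3)
    for point, label in [((0.0, 0.0), "a"), ((5.0, 5.0), "b"), ((0.2, 0.1), "a")]:
        clf.update(point, label)
    print(clf.predict((0.1, 0.1)), len(clf.balls))
```

In this sketch the number of balls never exceeds `max_balls`, and the set of class labels does not need to be known in advance, which mirrors two of the requirements listed in the abstract in a simplified form.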
Constant Budget Model Size; Data Stream; High-Speed Data; Nonparametric Classification
Academic sector INF/01 - Computer Science
Book Part (author)
Use this identifier to cite or link to this document: https://hdl.handle.net/2434/423476