We study sequential prediction problems in which, at each time instance, the forecaster chooses a binary vector from a certain fixed set S ⊆ {0, 1}d and suffers a loss that is the sum of the losses of those vector components that equal to one. The goal of the forecaster is to achieve that, in the long run, the accumulated loss is not much larger than that of the best possible vector in the class. We consider the “bandit” setting in which the forecaster has only access to the losses of the chosen vectors. We introduce a new general forecaster achieving a regret bound that, for a variety of concrete choices of S, is of order of square root of (nd ln|S|) where n is the time horizon. This is not improvable in general and is better than previously known bounds. We also point out that computationally efficient implementations for various interesting choices of S exist.

Combinatorial bandits / N. Cesa-Bianchi, G. Lugosi - In: COLT 2009 : proceedings of the 22nd Annual Conference on Learning Theory, Montréal, Canada, June 18-21, 2009 / / [a cura di] S. Dasgupta, A. Klivans. - Madison, USA : Omnipress, 2009. - ISBN 9780982252918. - pp. 237-246 (( Intervento presentato al 22. convegno Annual Conference on Learning Theory tenutosi a Montréal, Canada nel 2009.

Combinatorial bandits

N. Cesa-Bianchi
Primo
;
2009

Abstract

We study sequential prediction problems in which, at each time instance, the forecaster chooses a binary vector from a certain fixed set S ⊆ {0, 1}d and suffers a loss that is the sum of the losses of those vector components that equal to one. The goal of the forecaster is to achieve that, in the long run, the accumulated loss is not much larger than that of the best possible vector in the class. We consider the “bandit” setting in which the forecaster has only access to the losses of the chosen vectors. We introduce a new general forecaster achieving a regret bound that, for a variety of concrete choices of S, is of order of square root of (nd ln|S|) where n is the time horizon. This is not improvable in general and is better than previously known bounds. We also point out that computationally efficient implementations for various interesting choices of S exist.
Settore INF/01 - Informatica
2009
http://www.cs.mcgill.ca/~colt2009/papers/024.pdf#page=1
Book Part (author)
File in questo prodotto:
File Dimensione Formato  
comband-colt.pdf

accesso aperto

Tipologia: Publisher's version/PDF
Dimensione 228.07 kB
Formato Adobe PDF
228.07 kB Adobe PDF Visualizza/Apri
Pubblicazioni consigliate

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/2434/140003
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus ND
  • ???jsp.display-item.citation.isi??? ND
social impact