IRIS Institutional Research Information System - AIR Archivio Istituzionale della Ricerca

We study how the regret guarantees of nonstochastic multi-armed bandits can be improved if the effective range of the losses in each round is small (for example, the maximal difference between two losses in a given round). Despite a recent impossibility result, we show how this can be made possible under certain mild additional assumptions, such as availability of rough estimates of the losses, or knowledge of the loss of a single, possibly unspecified arm, at the end of each round. Along the way, we develop a novel technique which might be of independent interest, to convert any multi-armed bandit algorithm with regret depending on the loss range, to an algorithm with regret depending only on the effective range, while attaining better regret bounds than existing approaches.

Bandit Regret Scaling with the Effective Loss Range / N. Cesa-Bianchi, O. Shamir (PROCEEDINGS OF MACHINE LEARNING RESEARCH). - In: Algorithmic Learning Theory / edited by F. Janoos, M. Mohri, K. Sridharan. - [s.l.] : PMLR, 2018. - pp. 128-151 (Paper presented at the 29th International Conference on Algorithmic Learning Theory, 2018).

Bandit Regret Scaling with the Effective Loss Range

N. Cesa-Bianchi;O. Shamir

2018

Scientific-disciplinary sectors of the contribution (display only): Sector INF/01 - Computer Science
Publication date: 2018
URL: http://proceedings.mlr.press/v83/cesa-bianchi18a/cesa-bianchi18a.pdf
Type: Book Part (author)
Appears in types: 03 - Contribution in volume

Files in this item:

There are no files associated with this item.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/570257
