IRIS Institutional Research Information System - AIR Archivio Istituzionale della Ricerca

Sparsity-Agnostic Linear Bandits with Adaptive Adversaries

Tianyuan Jin; K. Jang; N. Cesa Bianchi

2024

Abstract

We study stochastic linear bandits where, in each round, the learner receives a set of actions (i.e., feature vectors), chooses one of them, and obtains a stochastic reward. The expected reward is a fixed but unknown linear function of the chosen action. We study sparse regret bounds, which depend on the number S of nonzero coefficients in the linear reward function. Previous works focused on the case where S is known or the action sets satisfy additional assumptions. In this work, we obtain the first sparse regret bounds that hold when S is unknown and the action sets are adversarially generated. Our techniques combine online-to-confidence-set conversions with a novel randomized model selection approach over a hierarchy of nested confidence sets. When S is known, our analysis recovers state-of-the-art bounds for adversarial action sets. We also show that a variant of our approach, which uses Exp3 to dynamically select the confidence sets, can improve the empirical performance of stochastic linear bandits while enjoying a regret bound with optimal dependence on the time horizon.
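
The abstract mentions Exp3 as the rule for choosing among confidence sets. As a rough illustration only, the sketch below implements the textbook Exp3 update over a small number of arms; it is not the authors' algorithm. The function names (`exp3_probabilities`, `exp3_update`, `run_exp3`), the exploration rate `gamma`, and the reading of each arm as one candidate confidence set are assumptions made for this example, and rewards are assumed to lie in [0, 1].

```python
import math
import random


def exp3_probabilities(weights, gamma):
    """Mix the normalized weights with uniform exploration (textbook Exp3)."""
    total = sum(weights)
    k = len(weights)
    return [(1.0 - gamma) * w / total + gamma / k for w in weights]


def exp3_update(weights, probs, arm, reward, gamma):
    """Importance-weighted exponential update for the arm that was played.

    `reward` is assumed to lie in [0, 1].
    """
    k = len(weights)
    estimate = reward / probs[arm]  # unbiased estimate of the played arm's reward
    new_weights = list(weights)
    new_weights[arm] *= math.exp(gamma * estimate / k)
    return new_weights


def run_exp3(reward_fn, n_arms, horizon, gamma=0.1, seed=0):
    """Run Exp3 for `horizon` rounds and return how often each arm was played.

    Each arm is a hypothetical stand-in for one candidate confidence set;
    `reward_fn(arm, rng)` is a hypothetical callback returning a reward in [0, 1].
    """
    rng = random.Random(seed)
    weights = [1.0] * n_arms
    counts = [0] * n_arms
    for _ in range(horizon):
        probs = exp3_probabilities(weights, gamma)
        arm = rng.choices(range(n_arms), weights=probs)[0]
        reward = reward_fn(arm, rng)
        weights = exp3_update(weights, probs, arm, reward, gamma)
        # Rescale by the largest weight to avoid overflow; the sampling
        # probabilities depend only on weight ratios, so they are unchanged.
        top = max(weights)
        weights = [w / top for w in weights]
        counts[arm] += 1
    return counts
```

For instance, `run_exp3(lambda arm, rng: 1.0 if arm == 2 else 0.0, n_arms=3, horizon=2000)` returns play counts that concentrate on arm 2, while the `gamma / k` term keeps every arm sampled with positive probability.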

International co-authors: Yes
Language of the contribution: English
Scientific-disciplinary sectors of the contribution (valid from 9 May 2024): Sector INFO-01/A - Computer Science
Type: Conference paper
Peer review: Anonymous experts
Classification by research type: Basic research
Publication classification: Scientific publication
Projects:
	Project title: Learning in Markets and Society
	Funder: MINISTERO DELL'UNIVERSITA' E DELLA RICERCA
	Contract no.: 2022EKNE5K_001

	Project title: European Lighthouse of AI for Sustainability (ELIAS)
	Acronym: ELIAS
	Funder: EUROPEAN COMMISSION
	Contract no.: 101120237
Volume title: Advances in Neural Information Processing Systems
Volume editors: A. Globerson, L. Mackey, D. Belgrave, A. Fan, U. Paquet, J. Tomczak, C. Zhang
Publisher: Curran Associates, Inc.
Publication date: 2024
First page: 42015
Last page: 42047
Number of pages: 33
ISBN: 9798331314385
Series: ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS
Volume number: 37
Volume type: Internationally distributed volume
Conference name: Annual Conference on Neural Information Processing Systems
Conference location: Vancouver
Conference year: 2024
Conference number: 38
Conference type: International conference
Section: Contributed paper
URL: https://proceedings.neurips.cc/paper_files/paper/2024/file/4a36c3c51af11ed9f34615b81edb5bbc-Paper-Conference.pdf
Coordinated research center: DSRC - Data science research center
Source database: bibtex
Scopus ID: 2-s2.0-105000531896
Adherence to the University Open Access policy: Yes
All authors: T. Jin, K. Jang, N. Cesa Bianchi
Type: Book Part (author)
Fulltext: open
Faculty-site type: 273
Citation: Sparsity-Agnostic Linear Bandits with Adaptive Adversaries / T. Jin, K. Jang, N. Cesa Bianchi (ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS). - In: Advances in Neural Information Processing Systems / [edited by] A. Globerson, L. Mackey, D. Belgrave, A. Fan, U. Paquet, J. Tomczak, C. Zhang. - [s.l.] : Curran Associates, Inc., 2024. - ISBN 9798331314385. - pp. 42015-42047 (Paper presented at the 38th Annual Conference on Neural Information Processing Systems, held in Vancouver in 2024).
Type: info:eu-repo/semantics/bookPart
Number of authors: 3
Type: Research products::03 - Contribution in volume
Appears in types: 03 - Contribution in volume

Files in this item:

File	Size	Format
NeurIPS-2024-sparsity-agnostic-linear-bandits-with-adaptive-adversaries-Paper-Conference.pdf (open access; type: Publisher's version/PDF)	723.2 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/1157683
