We study stochastic linear bandits where, in each round, the learner receives a set of actions (i.e., feature vectors), from which it chooses an element and obtains a stochastic reward. The expected reward is a fixed but unknown linear function of the chosen action. We study sparse regret bounds, that depend on the number S of non-zero coefficients in the linear reward function. Previous works focused on the case where S is known, or the action sets satisfy additional assumptions. In this work, we obtain the first sparse regret bounds that hold when S is unknown and the action sets are adversarially generated. Our techniques combine online to confidence set conversions with a novel randomized model selection approach over a hierarchy of nested confidence sets. When S is known, our analysis recovers state-of-the-art bounds for adversarial action sets. We also show that a variant of our approach, using Exp3 to dynamically select the confidence sets, can be used to improve the empirical performance of stochastic linear bandits while enjoying a regret bound with optimal dependence on the time horizon.

Sparsity-Agnostic Linear Bandits with Adaptive Adversaries / T. Jin, K. Jang, N. Cesa Bianchi (ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS). - In: Advances in Neural Information Processing Systems / [a cura di] A. Globerson and L. Mackey and D. Belgrave and A. Fan and U. Paquet and J. Tomczak and C. Zhang. - [s.l] : Curran Associates, Inc., 2024. - ISBN 9798331314385. - pp. 42015-42047 (( Intervento presentato al 38. convegno Annual Conference on Neural Information Processing Systems tenutosi a Vancouver nel 2024.

Sparsity-Agnostic Linear Bandits with Adaptive Adversaries

K. Jang;N. Cesa Bianchi
2024

Abstract

We study stochastic linear bandits where, in each round, the learner receives a set of actions (i.e., feature vectors), from which it chooses an element and obtains a stochastic reward. The expected reward is a fixed but unknown linear function of the chosen action. We study sparse regret bounds, that depend on the number S of non-zero coefficients in the linear reward function. Previous works focused on the case where S is known, or the action sets satisfy additional assumptions. In this work, we obtain the first sparse regret bounds that hold when S is unknown and the action sets are adversarially generated. Our techniques combine online to confidence set conversions with a novel randomized model selection approach over a hierarchy of nested confidence sets. When S is known, our analysis recovers state-of-the-art bounds for adversarial action sets. We also show that a variant of our approach, using Exp3 to dynamically select the confidence sets, can be used to improve the empirical performance of stochastic linear bandits while enjoying a regret bound with optimal dependence on the time horizon.
English
Settore INFO-01/A - Informatica
Intervento a convegno
Esperti anonimi
Ricerca di base
Pubblicazione scientifica
   Learning in Markets and Society
   MINISTERO DELL'UNIVERSITA' E DELLA RICERCA
   2022EKNE5K_001

   European Lighthouse of AI for Sustainability (ELIAS)
   ELIAS
   EUROPEAN COMMISSION
   101120237
Advances in Neural Information Processing Systems
A. Globerson and L. Mackey and D. Belgrave and A. Fan and U. Paquet and J. Tomczak and C. Zhang
Curran Associates, Inc.
2024
42015
42047
33
9798331314385
37
Volume a diffusione internazionale
Annual Conference on Neural Information Processing Systems
Vancouver
2024
38
Convegno internazionale
Intervento inviato
https://proceedings.neurips.cc/paper_files/paper/2024/file/4a36c3c51af11ed9f34615b81edb5bbc-Paper-Conference.pdf
DSRC - Data science research center
bibtex
Aderisco
T. Jin, K. Jang, N. Cesa Bianchi
Book Part (author)
open
273
Sparsity-Agnostic Linear Bandits with Adaptive Adversaries / T. Jin, K. Jang, N. Cesa Bianchi (ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS). - In: Advances in Neural Information Processing Systems / [a cura di] A. Globerson and L. Mackey and D. Belgrave and A. Fan and U. Paquet and J. Tomczak and C. Zhang. - [s.l] : Curran Associates, Inc., 2024. - ISBN 9798331314385. - pp. 42015-42047 (( Intervento presentato al 38. convegno Annual Conference on Neural Information Processing Systems tenutosi a Vancouver nel 2024.
info:eu-repo/semantics/bookPart
3
Prodotti della ricerca::03 - Contributo in volume
File in questo prodotto:
File Dimensione Formato  
NeurIPS-2024-sparsity-agnostic-linear-bandits-with-adaptive-adversaries-Paper-Conference.pdf

accesso aperto

Tipologia: Publisher's version/PDF
Dimensione 723.2 kB
Formato Adobe PDF
723.2 kB Adobe PDF Visualizza/Apri
Pubblicazioni consigliate

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/2434/1157683
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 0
  • ???jsp.display-item.citation.isi??? ND
  • OpenAlex ND
social impact