We derive a new analysis of Follow The Regularized Leader (FTRL) for online learning with delayed bandit feedback. By separating the cost of delayed feedback from that of bandit feedback, our analysis allows us to obtain new results in four important settings. We derive the first optimal (up to logarithmic factors) regret bounds for combinatorial semi-bandits with delay and adversarial Markov Decision Processes with delay (both known and unknown transition functions). Furthermore, we use our analysis to develop an efficient algorithm for linear bandits with delay achieving near-optimal regret bounds. In order to derive these results we show that FTRL remains stable across multiple rounds under mild assumptions on the regularizer.

A Unified Analysis of Nonstochastic Delayed Feedback for Combinatorial Semi-Bandits, Linear Bandits, and MDPs / L. Zierahn, D. Van Der Hoeven, T. Lancewicki, A. Rosenberg, N.A. Cesa Bianchi. - In: JOURNAL OF MACHINE LEARNING RESEARCH. - ISSN 1533-7928. - 26:(2025), pp. 104.1-104.60.

A Unified Analysis of Nonstochastic Delayed Feedback for Combinatorial Semi-Bandits, Linear Bandits, and MDPs

L. Zierahn
Primo
;
D. Van Der Hoeven
Secondo
;
N.A. Cesa Bianchi
Ultimo
2025

Abstract

We derive a new analysis of Follow The Regularized Leader (FTRL) for online learning with delayed bandit feedback. By separating the cost of delayed feedback from that of bandit feedback, our analysis allows us to obtain new results in four important settings. We derive the first optimal (up to logarithmic factors) regret bounds for combinatorial semi-bandits with delay and adversarial Markov Decision Processes with delay (both known and unknown transition functions). Furthermore, we use our analysis to develop an efficient algorithm for linear bandits with delay achieving near-optimal regret bounds. In order to derive these results we show that FTRL remains stable across multiple rounds under mild assumptions on the regularizer.
Settore INFO-01/A - Informatica
   European Lighthouse of AI for Sustainability (ELIAS)
   ELIAS
   EUROPEAN COMMISSION
   101120237

   Algorithms, Games, and Digital Markets (ALGADIMAR)
   ALGADIMAR
   MINISTERO DELL'ISTRUZIONE E DEL MERITO
   2017R9FHSR_006
2025
mar-2025
http://jmlr.org/papers/v26/24-0496.html
Article (author)
File in questo prodotto:
File Dimensione Formato  
24-0496.pdf

accesso aperto

Tipologia: Publisher's version/PDF
Licenza: Creative commons
Dimensione 564.04 kB
Formato Adobe PDF
564.04 kB Adobe PDF Visualizza/Apri
Pubblicazioni consigliate

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/2434/1175556
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus ND
  • ???jsp.display-item.citation.isi??? 0
  • OpenAlex ND
social impact