The framework of feedback graphs is a generalization of sequential decisionmaking with bandit or full information feedback. In this work, we study an extension where the directed feedback graph is stochastic, following a distribution similar to the classical Erdős-Rényi model. Specifically, in each round every edge in the graph is either realized or not with a distinct probability for each edge. We prove nearly optimal regret bounds of order min minε p (αε/ε)T, minε(δε/ε)1/3T2/3 (ignoring logarithmic factors), where αε and δε are graph-theoretic quantities measured on the support of the stochastic feedback graph G with edge probabilities thresholded at ε. Our result, which holds without any preliminary knowledge about G, requires the learner to observe only the realized out-neighborhood of the chosen action. When the learner is allowed to observe the realization of the entire graph (but only the losses in the out-neighborhood of the chosen action), we derive a more efficient algorithm featuring a dependence on weighted versions of the independence and weak domination numbers that exhibits improved bounds for some special cases.

Learning on the Edge: Online Learning with Stochastic Feedback Graphs / E. Esposito, F. Fusco, D. van der Hoeven, N. Cesa Bianchi (ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS). - In: NeurIPS / [a cura di] S. Koyejo, S. Mohamed, A. Agarwal, D. Belgrave, K. Cho, A. Oh. - [s.l] : Curran Associates, 2022. - pp. 34776-34788 (( Intervento presentato al 36. convegno Conference on Neural Information Processing Systems : Monday November 28th through Friday December 9th tenutosi a New Orleans nel 2022.

Learning on the Edge: Online Learning with Stochastic Feedback Graphs

E. Esposito
Primo
;
D. van der Hoeven
Penultimo
;
N. Cesa Bianchi
Ultimo
2022

Abstract

The framework of feedback graphs is a generalization of sequential decisionmaking with bandit or full information feedback. In this work, we study an extension where the directed feedback graph is stochastic, following a distribution similar to the classical Erdős-Rényi model. Specifically, in each round every edge in the graph is either realized or not with a distinct probability for each edge. We prove nearly optimal regret bounds of order min minε p (αε/ε)T, minε(δε/ε)1/3T2/3 (ignoring logarithmic factors), where αε and δε are graph-theoretic quantities measured on the support of the stochastic feedback graph G with edge probabilities thresholded at ε. Our result, which holds without any preliminary knowledge about G, requires the learner to observe only the realized out-neighborhood of the chosen action. When the learner is allowed to observe the realization of the entire graph (but only the losses in the out-neighborhood of the chosen action), we derive a more efficient algorithm featuring a dependence on weighted versions of the independence and weak domination numbers that exhibits improved bounds for some special cases.
Settore INF/01 - Informatica
   European Learning and Intelligent Systems Excellence (ELISE)
   ELISE
   EUROPEAN COMMISSION
   H2020
   951847

   One Health Action Hub: task force di Ateneo per la resilienza di ecosistemi territoriali (1H_Hub) Linea Strategica 3, Tema One health, one earth
   1H_Hub
   UNIVERSITA' DEGLI STUDI DI MILANO

   Algorithms, Games, and Digital Markets (ALGADIMAR)
   ALGADIMAR
   MINISTERO DELL'ISTRUZIONE E DEL MERITO
   2017R9FHSR_006
2022
Institute of Electrical and Electronics Engineers (IEEE)
https://proceedings.neurips.cc/paper_files/paper/2022/file/e0e956681b04ac126679e8c7dd706b2e-Paper-Conference.pdf
Book Part (author)
File in questo prodotto:
File Dimensione Formato  
NeurIPS-2022-learning-on-the-edge-online-learning-with-stochastic-feedback-graphs-Paper-Conference.pdf

accesso riservato

Tipologia: Publisher's version/PDF
Dimensione 490.32 kB
Formato Adobe PDF
490.32 kB Adobe PDF   Visualizza/Apri   Richiedi una copia
Pubblicazioni consigliate

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/2434/961320
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 6
  • ???jsp.display-item.citation.isi??? 1
social impact