On the Minimax Regret for Online Learning with Feedback Graphs

Eldowa, K.; Esposito, E.; Cesari, T.; Cesa Bianchi, N.

In this work, we improve on the upper and lower bounds for the regret of online learning with strongly observable undirected feedback graphs. The best known upper bound for this problem is $O(\sqrt{\alpha T \ln K})$, where $K$ is the number of actions, $\alpha$ is the independence number of the graph, and $T$ is the time horizon. The $\sqrt{\ln K}$ factor is known to be necessary when $\alpha = 1$ (the experts case). On the other hand, when $\alpha = K$ (the bandits case), the minimax rate is known to be $\Theta(\sqrt{KT})$, and a lower bound $\Omega(\sqrt{\alpha T})$ is known to hold for any $\alpha$. Our improved upper bound $O(\sqrt{\alpha T (1 + \ln(K/\alpha))})$ holds for any $\alpha$ and matches the lower bounds for bandits and experts, while interpolating intermediate cases. To prove this result, we use FTRL with $q$-Tsallis entropy for a carefully chosen value of $q \in [1/2, 1)$ that varies with $\alpha$. The analysis of this algorithm requires a new bound on the variance term in the regret. We also show how to extend our techniques to time-varying graphs, without requiring prior knowledge of their independence numbers. Our upper bound is complemented by an improved $\Omega(\sqrt{\alpha T (\ln K)/(\ln \alpha)})$ lower bound for all $\alpha > 1$, whose analysis relies on a novel reduction to multitask learning. This shows that a logarithmic factor is necessary as soon as $\alpha < K$.
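The rates quoted in the abstract can be compared numerically. The following is a minimal sketch, not code from the paper: the function names are hypothetical, the constants hidden by the O and Ω notation are dropped, and the values of K and T are arbitrary choices made only for illustration.

```python
import math


def old_upper_rate(alpha: float, K: int, T: int) -> float:
    """Previously best known upper-bound rate: sqrt(alpha * T * ln K)."""
    return math.sqrt(alpha * T * math.log(K))


def new_upper_rate(alpha: float, K: int, T: int) -> float:
    """Improved upper-bound rate: sqrt(alpha * T * (1 + ln(K / alpha)))."""
    return math.sqrt(alpha * T * (1.0 + math.log(K / alpha)))


def new_lower_rate(alpha: float, K: int, T: int) -> float:
    """Improved lower-bound rate: sqrt(alpha * T * ln K / ln alpha), for alpha > 1."""
    if alpha <= 1:
        raise ValueError("the improved lower bound is stated only for alpha > 1")
    return math.sqrt(alpha * T * math.log(K) / math.log(alpha))


# Arbitrary illustrative values (assumptions, not taken from the paper).
K, T = 1024, 10_000

for alpha in (1, 2, 32, 1024):
    lower = new_lower_rate(alpha, K, T) if alpha > 1 else float("nan")
    print(
        f"alpha={alpha:5d}  old_upper={old_upper_rate(alpha, K, T):10.1f}  "
        f"new_upper={new_upper_rate(alpha, K, T):10.1f}  new_lower={lower:10.1f}"
    )
```

At α = K the term ln(K/α) vanishes, so the improved upper rate reduces to √(KT), the bandit rate; at α = 1 it becomes √(T(1 + ln K)), which is of the same order as the experts rate √(T ln K).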

On the Minimax Regret for Online Learning with Feedback Graphs / K. Eldowa, E. Esposito, T. Cesari, N. Cesa Bianchi (ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS). - In: Advances in Neural Information Processing Systems. 36 / edited by A. Oh, T. Neumann, A. Globerson, K. Saenko, M. Hardt, S. Levine. - [s.l.] : Curran Associates, 2023. - pp. 46122-46133 (Paper presented at the 37th Neural Information Processing Systems conference, held in 2023).



International co-authors: Yes
Language of the contribution: English
Scientific-disciplinary sector (display only): INF/01 - Informatica (Computer Science)
Type: Conference contribution
Peer review: Scientific committee
Classification by research type: Basic research
Publication classification: Scientific publication
Project: Learning in Markets and Society (funder: MINISTERO DELL'UNIVERSITA' E DELLA RICERCA; contract no. 2022EKNE5K_001)
Project: European Lighthouse of AI for Sustainability (ELIAS) (acronym: ELIAS; funder: EUROPEAN COMMISSION; contract no. 101120237)
Volume title: Advances in Neural Information Processing Systems. 36
Volume editors: A. Oh, T. Neumann, A. Globerson, K. Saenko, M. Hardt, S. Levine
Publisher: Curran Associates
Publication date: 2023
First page: 46122
Last page: 46133
Number of pages: 12
Series: ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS
Volume number: 36
Volume type: Internationally distributed volume
Conference name: Neural Information Processing Systems
Conference location: 2023
Conference year: 2023
Conference number: 37
Conference type: International conference
Section: Submitted contribution
URL: https://proceedings.neurips.cc/paper_files/paper/2023/file/908f03779b5b063413fbf0247a46a403-Paper-Conference.pdf
Coordinated research centre: DSRC - Data science research center
Source database: bibtex
ISI identifier: WOS:001227224004028
Scopus identifier: 2-s2.0-85188680200
University Open Access policy: Agreed
All authors: K. Eldowa, E. Esposito, T. Cesari, N. Cesa Bianchi
Typology: Book Part (author)
Full text: open
Faculty site typology: 273
Typology: info:eu-repo/semantics/bookPart
Number of authors: 4
Typology: Research products::03 - Contribution in volume
Appears in typologies: 03 - Contribution in volume

Files in this record:

File	Size	Format
NeurIPS-2023-on-the-minimax-regret-for-online-learning-with-feedback-graphs-Paper-Conference.pdf (open access; type: Publisher's version/PDF)	327.62 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/1034112

IRIS Institutional Research Information System - AIR Archivio Istituzionale della Ricerca
