Many online decision-making problems correspond to maximizing a sequence of submodular functions. In this work, we introduce sum-max functions, a subclass of monotone submodular functions capturing several interesting problems, including best-of-$K$-bandits, combinatorial bandits, and the bandit versions on $M$-medians and hitting sets. We show that all functions in this class satisfy a key property that we call pseudo-concavity. This allows us to prove $\big(1 - \frac{1}{e}\big)$-regret bounds for bandit feedback in the nonstochastic setting of the order of $\sqrt{MKT}$ (ignoring log factors), where $T$ is the time horizon and $M$ is a cardinality constraint. This bound, attained by a simple and efficient algorithm, significantly improves on the $\widetilde{\mathcal{O}}\big(T^{2/3}\big)$ regret bound for online monotone submodular maximization with bandit feedback. We also extend our results to a bandit version of the facility location problem.

Sum-max Submodular Bandits / S. Pasteris, A. Rumi, F. Vitale, N. Cesa Bianchi (PROCEEDINGS OF MACHINE LEARNING RESEARCH). - In: Proceedings of The 27th International Conference on Artificial Intelligence and Statistics / [a cura di] S. Dasgupta, S. Mandt, Y. Li. - [s.l] : ML Research Press, 2024. - pp. 2323-2331 (( Intervento presentato al 27. convegno International Conference on Artificial Intelligence and Statistics tenutosi a Valencia nel 2024.

Sum-max Submodular Bandits

A. Rumi;N. Cesa Bianchi
2024

Abstract

Many online decision-making problems correspond to maximizing a sequence of submodular functions. In this work, we introduce sum-max functions, a subclass of monotone submodular functions capturing several interesting problems, including best-of-$K$-bandits, combinatorial bandits, and the bandit versions on $M$-medians and hitting sets. We show that all functions in this class satisfy a key property that we call pseudo-concavity. This allows us to prove $\big(1 - \frac{1}{e}\big)$-regret bounds for bandit feedback in the nonstochastic setting of the order of $\sqrt{MKT}$ (ignoring log factors), where $T$ is the time horizon and $M$ is a cardinality constraint. This bound, attained by a simple and efficient algorithm, significantly improves on the $\widetilde{\mathcal{O}}\big(T^{2/3}\big)$ regret bound for online monotone submodular maximization with bandit feedback. We also extend our results to a bandit version of the facility location problem.
English
Settore INFO-01/A - Informatica
Intervento a convegno
Comitato scientifico
Ricerca di base
Pubblicazione scientifica
   Algorithms, Games, and Digital Markets (ALGADIMAR)
   ALGADIMAR
   MINISTERO DELL'ISTRUZIONE E DEL MERITO
   2017R9FHSR_006

   European Lighthouse of AI for Sustainability (ELIAS)
   ELIAS
   EUROPEAN COMMISSION
   101120237
Proceedings of The 27th International Conference on Artificial Intelligence and Statistics
S. Dasgupta, S. Mandt, Y. Li
ML Research Press
2024
2323
2331
9
238
Volume a diffusione internazionale
Diamond
International Conference on Artificial Intelligence and Statistics
Valencia
2024
27
Convegno internazionale
https://proceedings.mlr.press/v238/pasteris24a.html
bibtex
Aderisco
S. Pasteris, A. Rumi, F. Vitale, N. Cesa Bianchi
Book Part (author)
open
273
Sum-max Submodular Bandits / S. Pasteris, A. Rumi, F. Vitale, N. Cesa Bianchi (PROCEEDINGS OF MACHINE LEARNING RESEARCH). - In: Proceedings of The 27th International Conference on Artificial Intelligence and Statistics / [a cura di] S. Dasgupta, S. Mandt, Y. Li. - [s.l] : ML Research Press, 2024. - pp. 2323-2331 (( Intervento presentato al 27. convegno International Conference on Artificial Intelligence and Statistics tenutosi a Valencia nel 2024.
info:eu-repo/semantics/bookPart
4
Prodotti della ricerca::03 - Contributo in volume
File in questo prodotto:
File Dimensione Formato  
pasteris24a.pdf

accesso aperto

Tipologia: Publisher's version/PDF
Dimensione 19.67 MB
Formato Adobe PDF
19.67 MB Adobe PDF Visualizza/Apri
SumMax-Compressed.pdf

accesso aperto

Tipologia: Publisher's version/PDF
Dimensione 1.61 MB
Formato Adobe PDF
1.61 MB Adobe PDF Visualizza/Apri
Pubblicazioni consigliate

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/2434/1122767
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 2
  • ???jsp.display-item.citation.isi??? 0
  • OpenAlex ND
social impact