A gang of bandits / N. Cesa-Bianchi, C. Gentile, G. Zappella (ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS). - In: Advances in Neural Information Processing Systems / edited by C.J.C. Burges, L. Bottou, M. Welling, Z. Ghahramani, K.Q. Weinberger. - [s.l] : Neural Information Processing Systems Foundation, 2013. - pp. 1-9 (Conference on Neural Information Processing Systems, held in South Lake Tahoe in 2013).

A gang of bandits

N. Cesa-Bianchi; C. Gentile; G. Zappella
2013

Abstract

Multi-armed bandit problems formalize the exploration-exploitation trade-offs arising in several industrially relevant applications, such as online advertisement and, more generally, recommendation systems. In many cases, however, these applications have a strong social component, whose integration in the bandit algorithm could lead to a dramatic performance increase. For instance, content may be served to a group of users by taking advantage of an underlying network of social relationships among them. In this paper, we introduce novel algorithmic approaches to the solution of such networked bandit problems. More specifically, we design and analyze a global recommendation strategy which allocates a bandit algorithm to each network node (user) and allows it to "share" signals (contexts and payoffs) with the neighboring nodes. We then derive two more scalable variants of this strategy based on different ways of clustering the graph nodes. We experimentally compare the algorithm and its variants to state-of-the-art methods for contextual bandits that do not use the relational information. Our experiments, carried out on synthetic and real-world datasets, show a consistent increase in prediction performance obtained by exploiting the network structure.
Scientific sector: INF/01 - Informatica (Computer Science)
http://papers.nips.cc/paper/5006-a-gang-of-bandits
Book Part (author)
Use this identifier to cite or link to this document: https://hdl.handle.net/2434/231403
Citations
  • Scopus: 89