We address the deployment of perceptual attention to social interactions as displayed in conversational clips, when relying on multimodal information (audio and video). A probabilistic modelling framework is proposed that goes beyond the classic saliency paradigm while integrating multiple information cues. Attentional allocation is determined not just by stimulus-driven selection but, importantly, by social value as modulating the selection history of relevant multimodal items. Thus, the construction of attentional priority is the result of a sampling procedure conditioned on the potential value dynamics of socially relevant objects emerging moment to moment within the scene. Preliminary experiments on a publicly available dataset are presented.

Give Ear to My Face: Modelling Multimodal Attention to Social Interactions / G. Boccignone, V. Cuculo, A. D’Amelio, G. Grossi, R. Lanzarotti (LECTURE NOTES IN COMPUTER SCIENCE). - In: Computer Vision : ECCV 2018 Workshops / [edited by] L. Leal-Taixé, S. Roth. - [s.l] : Springer, 2019. - ISBN 9783662539064. - pp. 331-345 (Paper presented at the 9th International Workshop on Human Behavior Understanding, held in Munich in 2018).

Give Ear to My Face: Modelling Multimodal Attention to Social Interactions

G. Boccignone; V. Cuculo; A. D’Amelio; G. Grossi; R. Lanzarotti
2019

audio-visual attention; social interaction; multimodal perception
Sector INF/01 - Computer Science
Sector ING-INF/05 - Information Processing Systems
Book Part (author)
Use this identifier to cite or link to this document: https://hdl.handle.net/2434/616183