Give Ear to My Face: Modelling Multimodal Attention to Social Interactions

Boccignone, G.; Cuculo, V.; D’Amelio, A.; Grossi, G.; Lanzarotti, R.

doi:10.1007/978-3-030-11012-3_27

We address the deployment of perceptual attention to social interactions as displayed in conversational clips, when relying on multimodal information (audio and video). A probabilistic modelling framework is proposed that goes beyond the classic saliency paradigm while integrating multiple information cues. Attentional allocation is determined not just by stimulus-driven selection but, importantly, by social value as modulating the selection history of relevant multimodal items. Thus, the construction of attentional priority is the result of a sampling procedure conditioned on the potential value dynamics of socially relevant objects emerging moment to moment within the scene. Preliminary experiments on a publicly available dataset are presented.

Give Ear to My Face: Modelling Multimodal Attention to Social Interactions / G. Boccignone, V. Cuculo, A. D’Amelio, G. Grossi, R. Lanzarotti (Lecture Notes in Computer Science). - In: Computer Vision : ECCV 2018 Workshops / edited by L. Leal-Taixé, S. Roth. - [s.l.] : Springer, 2019. - ISBN 9783662539064. - pp. 331-345 (Paper presented at the 9th International Workshop on Human Behavior Understanding, held in Munich in 2018).



Keywords: audio-visual attention; social interaction; multimodal perception

Scientific-disciplinary sectors of the contribution (display only):
Sector INF/01 - Computer Science
Sector ING-INF/05 - Information Processing Systems

Publication date: 2019

DOI: https://dx.doi.org/10.1007/978-3-030-11012-3_27

Type: Book Part (author)

Appears in types: 03 - Book contribution

Files in this record:

File	Type	Access	Size	Format
eccvBU20181.pdf	Post-print, accepted manuscript, etc. (version accepted by the publisher)	Restricted access	2.27 MB	Adobe PDF
Boccignone2019_Chapter_GiveEarToMyFaceModellingMultim.pdf	Publisher's version/PDF	Restricted access	1.46 MB	Adobe PDF


Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/616183


IRIS Institutional Research Information System - AIR Archivio Istituzionale della Ricerca