
A STOCHASTIC FORAGING MODEL OF ATTENTIVE EYE GUIDANCE ON DYNAMIC STIMULI / A. D'Amelio ; supervisor: G. Grossi ; co-supervisor: G. Boccignone ; coordinator: P. Boldi. Dipartimento di Informatica Giovanni Degli Antoni, 2021 Mar 22. 33. ciclo, Anno Accademico 2020. [10.13130/d-amelio-alessandro_phd2021-03-22].

A STOCHASTIC FORAGING MODEL OF ATTENTIVE EYE GUIDANCE ON DYNAMIC STIMULI

A. D'Amelio
2021

Abstract

Understanding human behavioural signals is a key ingredient of effective human-human and human-computer interaction (HCI). In this respect, non-verbal communication plays a central role: it comprises a variety of modalities acting jointly to convey a common message, and cues such as gesture, facial expression and prosody carry as much weight as spoken words. Gaze behaviour is no exception, being one of the most common, yet unobtrusive, ways of communicating. Accordingly, many computational models of visual attention allocation have been proposed. Although such models were primarily conceived in the psychological field, in the last couple of decades the problem of predicting attention allocation on a visual stimulus has attracted growing interest from the computer vision and pattern recognition community, pushed by the fast-growing number of possible applications (e.g. autonomous driving, image/video compression, robotics). In this renaissance of attention modelling, some of the key features characterising eye movements have been largely overlooked; in particular, the explicit unrolling of eye movements in time (i.e. their dynamics) has seldom been taken into account. Moreover, the vast majority of the proposed models can only deal with static stimuli (images), with few notable exceptions. The main contribution of this work is a novel computational model of attentive eye guidance that derives gaze dynamics in a principled way by reformulating attention deployment as a stochastic foraging problem. We show that treating a virtual observer attending to a video as a stochastic composite forager, searching for valuable patches in a multi-modal landscape, leads to simulated gaze trajectories that are statistically indistinguishable from those produced by humans free-viewing the same scene.
Model simulation and experiments are carried out on a publicly available dataset of eye-tracking recordings from subjects free-viewing videos of conversations and social interactions between humans.
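To make the foraging metaphor concrete, the following is a minimal toy sketch (not the thesis model: the patch layout, parameters and dynamics are illustrative assumptions) of gaze as a stochastic forager. The simulated observer drifts toward the currently attended "patch" via a mean-reverting stochastic process with diffusion noise, progressively depletes the patch's value, and relocates (a saccade-like jump of the attractor) once the value falls below a giving-up threshold:

```python
# Toy sketch of gaze as a stochastic forager (illustrative, NOT the thesis model).
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical value landscape: a few patch centres in normalised [0,1]^2 coords
patches = np.array([[0.2, 0.3], [0.7, 0.6], [0.5, 0.9]])
value = np.array([1.0, 0.8, 0.6])  # initial "worth" of each patch

def simulate_gaze(n_steps=300, dt=0.05, kappa=4.0, sigma=0.02,
                  depletion=0.01, giving_up=0.2):
    """Mean-reverting (Ornstein-Uhlenbeck-like) drift toward the current
    patch, with relocation once the patch is foraged below a threshold."""
    v = value.copy()
    pos = np.array([0.5, 0.5])                       # start at screen centre
    current = rng.choice(len(patches), p=v / v.sum())
    traj = [pos.copy()]
    for _ in range(n_steps):
        target = patches[current]
        # exploitation: drift toward the patch + diffusion (fixational jitter)
        pos = pos + kappa * (target - pos) * dt \
                  + sigma * np.sqrt(dt) * rng.standard_normal(2)
        v[current] = max(v[current] - depletion, 0.0)
        if v[current] < giving_up:                   # patch depleted: relocate
            p = v / v.sum() if v.sum() > 0 else np.ones(len(v)) / len(v)
            current = rng.choice(len(patches), p=p)
        traj.append(pos.copy())
    return np.array(traj)

traj = simulate_gaze()  # (n_steps + 1, 2) array of gaze positions
```

In this caricature, the value-weighted relocation stands in for patch selection over the multi-modal landscape, and the drift-diffusion step stands in for within-patch gaze dynamics; the actual model in the thesis derives both from the composite-foraging formulation.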
22 Mar 2021
Settore INF/01 - Informatica
Audio-visual attention; gaze models; social interaction; multimodal perception
Supervisor: GROSSI, GIULIANO
Co-supervisor: BOCCIGNONE, GIUSEPPE
Coordinator: BOLDI, PAOLO
Doctoral Thesis
File: phd_unimi_R11866.pdf (open access; full doctoral thesis; 35.71 MB; Adobe PDF)

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/816678