ScanDDM: Generalised Zero-Shot Neuro-Dynamical Modelling of Goal-Directed Attention / A. D'Amelio, M. Lucchi, G. Boccignone (Lecture Notes in Computer Science). - In: Computer Vision – ECCV 2024 Workshops / edited by A. Del Bue, C. Canton, J. Pont-Tuset, T. Tommasi. - [s.l.] : Springer Science and Business Media Deutschland GmbH, 2025. - ISBN 9783031915772. - pp. 234-244 (Paper presented at the 18th ECCV conference, held in Milan in 2024) [10.1007/978-3-031-91578-9_17].
ScanDDM: Generalised Zero-Shot Neuro-Dynamical Modelling of Goal-Directed Attention
A. D'Amelio (first author); G. Boccignone (last author)
2025
Abstract
This paper introduces ScanDDM, a novel scanpath model enabling generalised zero-shot goal-directed attention prediction. Leveraging recent advances in vision-and-language learning, ScanDDM models goal-directed attention by integrating high-level abstract concepts provided through textual prompts. The approach relies on a multialternative Drift Diffusion Model (DDM), which frames gaze dynamics as a decision-making process that accounts for both fixation duration and saccade execution. This makes it possible to implement a value-based evidence accumulation process akin to the neurobiological mechanisms thought to underlie human perceptual decision making. ScanDDM is evaluated quantitatively against the state-of-the-art model on the COCO-Search18 dataset, where it shows excellent capabilities in predicting task-driven scanpaths in a zero-shot setting. Moreover, qualitative results showcase ScanDDM’s ability to generalise to complex and abstract concepts, beyond simple visual search tasks. Source code is available at: https://github.com/phuselab/scanDDM.

| File | Size | Format | Access | Type | License |
|---|---|---|---|---|---|
| 978-3-031-91578-9_17 (3).pdf | 10.24 MB | Adobe PDF | Restricted access | Publisher's version/PDF | No license |
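
As a rough illustration of the multialternative DDM idea mentioned in the abstract, the sketch below simulates a generic race between noisy evidence accumulators, one per candidate gaze target. It is not the authors' implementation (see the linked repository for that). The function names, the parameters (`threshold`, `noise`, `dt`, `max_time`), and the reading of the winning accumulator as the next saccade target and of the elapsed time as the fixation duration are all assumptions made for illustration. The `values` list stands in for hypothetical goal-relevance scores, for example how well each image region matches a textual prompt.

```python
import math
import random


def simulate_race_ddm(values, threshold=1.0, noise=0.1, dt=0.01,
                      max_time=5.0, rng=None):
    """Simulate one decision as a race between noisy accumulators.

    values: hypothetical per-candidate drift rates (goal-relevance scores).
    Returns (winner_index, elapsed_time). Under the assumptions of this
    sketch, the winner is read as the next saccade target and the elapsed
    time as the fixation duration.
    """
    rng = rng or random.Random()
    evidence = [0.0] * len(values)
    steps = int(round(max_time / dt))
    for step in range(1, steps + 1):
        for i, v in enumerate(values):
            # Euler-Maruyama step: drift * dt + noise * sqrt(dt) * N(0, 1)
            evidence[i] += v * dt + noise * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        best = max(range(len(values)), key=lambda i: evidence[i])
        if evidence[best] >= threshold:
            return best, step * dt
    # No accumulator reached the threshold: return the leader at timeout.
    best = max(range(len(values)), key=lambda i: evidence[i])
    return best, max_time


def simulate_scanpath(values, n_fixations=3, seed=0, **kwargs):
    """Chain several independent races into a toy scanpath (assumption:
    values stay fixed between fixations, which a real model would not do)."""
    rng = random.Random(seed)
    return [simulate_race_ddm(values, rng=rng, **kwargs)
            for _ in range(n_fixations)]


if __name__ == "__main__":
    # Three candidate regions; the second has the highest hypothetical value.
    for target, duration in simulate_scanpath([0.5, 2.0, 1.0]):
        print(f"saccade to region {target} after {duration:.2f} s")
```

In this toy setting, candidates with higher values tend to win sooner, so a single accumulation process yields both where to look next and how long the current fixation lasts.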