Unlabeled Multimodal Datasets for Robust Emotion Recognition

Ramos, I.F.; Gianini, G.; Damiani, E.

doi:10.1007/978-3-031-93598-5_10

Despite the vast literature on emotion recognition, intra- and inter-subject variability and cultural differences in emotion remain outstanding challenges that limit the generalization ability of state-of-the-art models and their robustness to out-of-distribution data. We argue that a potential solution to these problems could be based on unlabeled large-scale datasets available online, in particular those providing multimodal streams, whose availability is increasing. This work explores the use of large multimodal datasets, with both EEG and eye-tracking data streams, to increase the robustness of a downstream emotion recognition task. Three datasets of different scales, with data from different numbers of subjects (117, 47, and 16) and for different pretext tasks (gaze estimation, attention type recognition, and emotion recognition), were used for self-supervised pretraining of a deep learning model. The results were compared with the performance obtained under fully supervised training on a small emotion recognition dataset, SEED-IV (15 subjects). The use of unlabeled multimodal datasets showed promising results for improving emotion recognition robustness using eye-related data, although further research is needed to fully benefit from the unprecedented amount of data expected to become available in the near future.

Unlabeled Multimodal Datasets for Robust Emotion Recognition / I.F. Ramos, G. Gianini, E. Damiani (COMMUNICATIONS IN COMPUTER AND INFORMATION SCIENCE). - In: Management of Digital EcoSystems / [edited by] R. Chbeir, E. Damiani, S. Dustdar, Y. Manolopoulos, E. Masciari, E. Pitoura, A. Rinaldi. - [s.l] : Springer, 2026 Jul. - ISBN 9783031935978. - pp. 131-144 (( Paper presented at the 2024. MEDES conference held in Naples in 16 [10.1007/978-3-031-93598-5_10].


Scientific-disciplinary sectors of the contribution (valid from 09/05/2024): Sector INFO-01/A - Computer Science; Sector IINF-05/A - Information Processing Systems
Project title: Collaborative Intelligence for Safety Critical systems (CISC)
Acronym: CISC
Funder: EUROPEAN COMMISSION
Funding programme: H2020
Contract no.: 955901
Publication date: Jul-2026
DOI: https://dx.doi.org/10.1007/978-3-031-93598-5_10
Type: Book Part (author)
Appears in types: 03 - Contribution in volume

Files in this record:

File	Size	Format
Ines_Ramos_MEDES2024.pdf (restricted access; Type: Pre-print (manuscript submitted to the publisher); License: No license)	918.54 kB	Adobe PDF


Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/1177084


IRIS Institutional Research Information System - AIR Archivio Istituzionale della Ricerca
