Gender-Aware Speech Emotion Recognition in Multiple Languages / M. Nicolini, S. Ntalampiras (Lecture Notes in Computer Science). - In: Pattern Recognition Applications and Methods / edited by M. De Marsico, G. Sanniti Di Baja, A. Fred. - [s.l] : Springer, 2024 Feb 22. - ISBN 9783031547256. - pp. 111-123 [10.1007/978-3-031-54726-3_7]
Gender-Aware Speech Emotion Recognition in Multiple Languages
M. Nicolini; S. Ntalampiras
2024
Abstract
This article presents a solution for Speech Emotion Recognition (SER) in a multilingual setting using a hierarchical approach. The approach involves two levels: the first identifies the gender of the speaker, while the second predicts their emotional state. We evaluate the performance of three classifiers of increasing complexity: k-NN, transfer learning based on YAMNet, and Bidirectional Long Short-Term Memory neural networks. The models were trained, validated, and tested on a dataset that includes the big-six emotions and was collected from well-known SER datasets representing six different languages. Our results indicate that there are differences in classification accuracy when considering all data versus only female or male data, across all classifiers. Interestingly, prior knowledge of the speaker’s gender can improve the overall classification performance.
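The two-level idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a plain k-NN at both levels, and the 2-D feature vectors, the two toy emotion labels, and the names `knn_predict` and `HierarchicalSER` are hypothetical choices made only for illustration.

```python
import math
import random
from collections import Counter


def knn_predict(train_x, train_y, x, k=3):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    nearest = sorted(zip(train_x, train_y), key=lambda p: math.dist(p[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


class HierarchicalSER:
    """Level 1 predicts gender; level 2 applies a gender-specific emotion k-NN."""

    def __init__(self, k=3):
        self.k = k

    def fit(self, feats, genders, emotions):
        self.feats = list(feats)
        self.genders = list(genders)
        # Split the training data per gender for the second level.
        self.by_gender = {}
        for f, g, e in zip(feats, genders, emotions):
            xs, ys = self.by_gender.setdefault(g, ([], []))
            xs.append(f)
            ys.append(e)
        return self

    def predict(self, x):
        gender = knn_predict(self.feats, self.genders, x, self.k)
        xs, ys = self.by_gender[gender]
        return gender, knn_predict(xs, ys, x, self.k)


# Toy data (assumption): axis 0 separates gender, axis 1 separates emotion.
rng = random.Random(0)
centres = {
    ("female", "happy"): (0.0, 0.0),
    ("female", "sad"): (0.0, 10.0),
    ("male", "happy"): (10.0, 0.0),
    ("male", "sad"): (10.0, 10.0),
}
feats, genders, emotions = [], [], []
for (g, e), (cx, cy) in centres.items():
    for _ in range(10):
        feats.append((cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5)))
        genders.append(g)
        emotions.append(e)

model = HierarchicalSER(k=3).fit(feats, genders, emotions)
print(model.predict((0.2, 9.8)))   # ('female', 'sad')
print(model.predict((10.1, 0.3)))  # ('male', 'happy')
```

In a real system the features would come from audio (for example spectral descriptors or learned embeddings), and the second-level models could be any of the classifiers the abstract lists; the sketch only shows how the gender prediction routes a sample to a gender-specific emotion model.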