Joint Learning of Emotions in Music and Generalized Sounds / F. Simonetta, F. Certo, S. Ntalampiras - In: AM '24: Proceedings / edited by L. A. Ludovico, D. A. Mauro. - [s.l.]: ACM, 2024. - ISBN 979-8-4007-0968-5. - pp. 302-307 (Paper presented at the 19th Audio Mostly conference, held in Milan in 2024) [10.1145/3678299.3678328].
Joint Learning of Emotions in Music and Generalized Sounds
F. Simonetta;S. Ntalampiras
2024
Abstract
In this study, we aim to determine if generalized sounds and music can share a common emotional space, improving predictions of emotion in terms of arousal and valence. We propose the use of multiple datasets as a multi-domain learning technique. Our approach involves creating a common space encompassing features that characterize both generalized sounds and music, as they can evoke emotions in a similar manner. To achieve this, we utilized two publicly available datasets, namely IADS-E and PMEmo, following a standardized experimental protocol. We employed a wide variety of features that capture diverse aspects of the audio structure, including key parameters of spectrum, energy, and voicing. Subsequently, we performed joint learning on the common feature space, leveraging heterogeneous model architectures. Interestingly, this synergistic scheme outperforms the state of the art in both sound and music emotion prediction. The code enabling full replication of the presented experimental pipeline is available at https://github.com/LIMUNIMI/MusicSoundEmotions
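The abstract only outlines the joint-learning strategy; the minimal sketch below illustrates the core idea of pooling the two domains into one shared feature space and fitting a single arousal/valence regressor on the union. The array names, feature dimensionality, and choice of a random-forest regressor are illustrative assumptions, not the authors' actual pipeline (which evaluates heterogeneous architectures; see the linked repository for the real implementation).

```python
# Sketch of multi-domain joint learning on a common feature space.
# Placeholder data stands in for acoustic descriptors (spectral, energy,
# voicing) extracted identically from both a sound and a music dataset.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Rows are audio clips, columns are the shared acoustic descriptors.
X_sounds = rng.normal(size=(200, 64))        # e.g. IADS-E clips (hypothetical shapes)
X_music = rng.normal(size=(300, 64))         # e.g. PMEmo excerpts
y_sounds = rng.uniform(1, 9, size=(200, 2))  # arousal, valence annotations
y_music = rng.uniform(1, 9, size=(300, 2))

# Joint learning: concatenate both domains so one model is trained on the
# union of generalized sounds and music in the same feature space.
X_joint = np.vstack([X_sounds, X_music])
y_joint = np.vstack([y_sounds, y_music])

model = RandomForestRegressor(n_estimators=200, random_state=0)
scores = cross_val_score(model, X_joint, y_joint, cv=5, scoring="r2")
print("cross-validated R^2:", scores.mean())
```

With real feature matrices, the same pooling step is what lets a single regressor exploit emotional cues common to both domains instead of training one model per dataset.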
File | Access | Type | Size | Format
---|---|---|---|---
2024_Joint_Learning_of_Emotions_in_Music_and_Generalized_Sounds.pdf | Open access | Publisher's version/PDF | 733.76 kB | Adobe PDF