Italian Speech Emotion Recognition / I. Mantegazza, S. Ntalampiras. - In: 2023 24th International Conference on Digital Signal Processing (DSP). - [s.l] : IEEE, 2023. - ISBN 979-8-3503-3959-8. - pp. 1-5 (DSP conference held in Rhodes in 2023) [10.1109/DSP58604.2023.10167766].

Italian Speech Emotion Recognition

S. Ntalampiras
2023

Abstract

Affective computing has attracted growing interest from the scientific community in recent decades, with the acoustic modality playing a central role. This paper presents an extensive computational analysis of emotional speech, focusing on the Italian language. More precisely, we propose a novel classification algorithm based on a suitable data augmentation scheme. The aim is to classify the seven emotions (anger, disgust, fear, joy, neutral, sadness, and surprise) included in EMOVO, the only publicly available database of Italian emotional speech. To this end, we employed two feature sets, Mel-Frequency Cepstral Coefficients (MFCCs) and the log-Mel spectrogram, each combined with a suitable classifier: a multilayer perceptron and a convolutional neural network, respectively. The implementation and evaluation of the proposed speech emotion recognition (SER) pipeline can be accessed at the following link: https://github.com/irenemante/ser_emovo
Affective computing; Convolutional neural network; Multilayer perceptron; data augmentation; MFCCs; log-Mel spectrogram
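
The two feature sets named in the abstract are closely related: MFCCs are commonly obtained by applying a discrete cosine transform to log-Mel energies. The sketch below illustrates that relationship for a single audio frame using only the Python standard library. It is not taken from the paper or its repository; the function names, the frame length (256 samples), the sample rate (8 kHz), the number of Mel filters (8) and the number of coefficients (4) are all illustrative assumptions, and a naive DFT is used purely to keep the example self-contained.

```python
import cmath
import math


def hz_to_mel(f_hz):
    # Common (HTK-style) Hz-to-mel conversion.
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    # One-sided power spectrum of a Hann-windowed frame (naive O(n^2) DFT).
    n = len(frame)
    windowed = [x * (0.5 - 0.5 * math.cos(2.0 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


def mel_filterbank(n_mels, n_fft, sample_rate):
    # Triangular filters whose edges are evenly spaced on the mel scale.
    mel_max = hz_to_mel(sample_rate / 2.0)
    edges_hz = [mel_to_hz(mel_max * i / (n_mels + 1)) for i in range(n_mels + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_fft // 2 + 1)]
    bank = []
    for m in range(1, n_mels + 1):
        lo, mid, hi = edges_hz[m - 1], edges_hz[m], edges_hz[m + 1]
        bank.append([max(0.0, min((f - lo) / (mid - lo), (hi - f) / (hi - mid)))
                     for f in bin_hz])
    return bank


def log_mel_frame(frame, sample_rate, n_mels=8):
    # Log of the mel-filtered power spectrum; the small offset avoids log(0).
    spec = power_spectrum(frame)
    bank = mel_filterbank(n_mels, len(frame), sample_rate)
    return [math.log(sum(w * p for w, p in zip(row, spec)) + 1e-10) for row in bank]


def mfcc_frame(log_mel, n_coeffs=4):
    # Unnormalised DCT-II of the log-mel energies.
    m = len(log_mel)
    return [sum(e * math.cos(math.pi * c * (j + 0.5) / m) for j, e in enumerate(log_mel))
            for c in range(n_coeffs)]


# Usage: a synthetic 440 Hz tone stands in for a speech frame.
sample_rate = 8000
frame = [math.sin(2.0 * math.pi * 440.0 * i / sample_rate) for i in range(256)]
log_mel = log_mel_frame(frame, sample_rate)
mfcc = mfcc_frame(log_mel)
```

Stacking `log_mel` vectors over consecutive frames gives a time-frequency image of the kind a convolutional network can consume, while the shorter `mfcc` vectors suit a fully connected classifier. In practice an audio library with an FFT-based implementation would replace the naive DFT used here.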
Sector INF/01 - Computer Science
Book Part (author)
Use this identifier to cite or link to this document: https://hdl.handle.net/2434/983658