IRIS Institutional Research Information System - AIR Archivio Istituzionale della Ricerca

Language-agnostic speech anger identification / A. Saitta, S. Ntalampiras. - In: 2021 44th International Conference on Telecommunications and Signal Processing (TSP). - [s.l.] : IEEE, 2021. - ISBN 978-1-6654-2933-7. - pp. 249-253. (Paper presented at the 44th International Conference on Telecommunications and Signal Processing (TSP), held in Brno in 2021) [10.1109/TSP52935.2021.9522606].

Language-agnostic speech anger identification

Saitta, Alessandra; S. Ntalampiras

2021

Abstract

Motivated by the growing adoption of solutions based on affective computing, this paper investigates the feasibility of multilingual anger identification. To this end, we formed a multilingual corpus by combining seven datasets covering five languages: English, German, Italian, Urdu, and Persian. After analyzing the diverse characteristics of the datasets, we designed four classifiers: a Support Vector Machine, a Decision Tree-based Bagging scheme, a Convolutional Neural Network (CNN), and a Convolutional Recurrent Neural Network (CRNN). The classifiers are trained on features extracted from the time and/or frequency domains, and the speech data are balanced with respect to every characteristic that varies across the datasets (language, sex, acted or not, etc.). Our findings indicate that multilingual anger identification is feasible: the proposed audio pattern recognition methodology, based on Mel-spectrograms and a CRNN, achieved quite satisfactory identification rates.
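The abstract states that the speech data were balanced across characteristics such as language and sex, but this record does not describe the procedure. As a rough illustration only, the sketch below shows one common way to do it: downsampling every stratum to the size of the smallest one. The function name `balance_by_strata`, the record fields, and the toy corpus are all hypothetical and are not taken from the paper.

```python
import random
from collections import defaultdict


def balance_by_strata(samples, keys, seed=0):
    """Downsample so each combination of `keys` keeps the same number of items.

    Assumption: this is a generic stratified downsampling scheme, not
    necessarily the balancing procedure used by the authors.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for sample in samples:
        # Group samples by the tuple of their stratum values, e.g. ("Urdu", "F").
        strata[tuple(sample[k] for k in keys)].append(sample)
    # Every stratum is reduced to the size of the smallest one.
    target = min(len(group) for group in strata.values())
    balanced = []
    for stratum in sorted(strata):
        balanced.extend(rng.sample(strata[stratum], target))
    return balanced


# Hypothetical metadata records standing in for utterances of the corpus.
corpus = (
    [{"language": "English", "sex": "F", "label": "anger"}] * 5
    + [{"language": "English", "sex": "M", "label": "anger"}] * 3
    + [{"language": "Urdu", "sex": "F", "label": "anger"}] * 2
    + [{"language": "Urdu", "sex": "M", "label": "anger"}] * 4
)

balanced = balance_by_strata(corpus, keys=("language", "sex"))
print(len(balanced))  # 4 strata x 2 samples each = 8
```

Downsampling discards data from the larger strata. Oversampling or class weighting are alternatives when the smallest stratum is very small.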

Keywords
	speech emotion recognition; multilingual emotion recognition; audio pattern recognition; deep learning

Scientific-disciplinary sectors of the contribution (display only)
	Sector INF/01 - Informatica (Computer Science)

Publication date
	2021

DOI
	https://dx.doi.org/10.1109/TSP52935.2021.9522606

Type
	Book Part (author)

Appears in types:
	03 - Contributo in volume (book chapter)

Files in this record:

File	Access	Type	Size	Format
61 Language-agnostic speech anger identification.pdf	open access	Post-print, accepted manuscript, etc. (version accepted by the publisher)	320.9 kB	Adobe PDF
Language-agnostic_speech_anger_identification.pdf	restricted access	Publisher's version/PDF	2.7 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/865518
