Emotion Recognition from Speech: An Unsupervised Learning Approach

Rovetta, S.; Mnasri, Z.; Masulli, F.; Cabri, A.

doi:10.2991/ijcis.d.201019.002

Speech processing is quickly shifting toward affective computing, which requires handling emotions and modeling expressive speech synthesis and recognition. The latter task has so far been achieved by supervised classifiers. This implies prior labeling and data preprocessing, with a cost that increases with the size of the database, in addition to the risk of introducing errors. A typical emotion recognition corpus therefore has a relatively limited number of instances. To avoid the cost of labeling, and at the same time to reduce the risk of overfitting due to lack of data, unsupervised learning seems a suitable alternative for recognizing emotions from speech. Recent advances in clustering techniques make it possible to reach good performance, comparable to that obtained by classifiers, with a much lower preprocessing load and even with generalization guarantees. This paper presents a novel approach to emotion recognition from the speech signal, based on variants of fuzzy clustering, such as probabilistic, possibilistic and graded-possibilistic fuzzy c-means. Experiments indicate that this approach (a) is effective in recognition, with in-corpus performance comparable to other proposals in the literature but with the added value of complexity control and (b) allows an innovative way to analyze emotions conveyed by speech using possibilistic membership degrees.
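The difference between probabilistic and possibilistic membership degrees can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the abstract: the exponential membership form, the width parameter `beta`, the normalization exponent `alpha`, and the function names `graded_memberships`, `update_centers` and `cluster` are all hypothetical. It is not the authors' implementation.

```python
import numpy as np

def graded_memberships(X, centers, beta=1.0, alpha=1.0):
    """Soft memberships of each row of X to each center.

    Assumed scheme (illustrative only):
      alpha = 1.0 -> each row is normalized to sum to 1 (probabilistic)
      alpha = 0.0 -> no normalization (possibilistic)
      0 < alpha < 1 -> partial normalization (a 'graded' in-between case)
    """
    # Squared Euclidean distances, shape (n_samples, n_clusters)
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    v = np.exp(-d2 / beta)  # unnormalized memberships in (0, 1]
    # Guard against a zero row sum for points far from every center
    z = np.maximum(v.sum(axis=1, keepdims=True), 1e-300) ** alpha
    return v / z

def update_centers(X, u):
    """Membership-weighted mean of the data for each cluster."""
    return (u.T @ X) / u.sum(axis=0)[:, None]

def cluster(X, n_clusters, beta=1.0, alpha=1.0, n_iter=50, seed=0):
    """Alternate membership and center updates for a fixed number of steps."""
    rng = np.random.default_rng(seed)
    # Initialize centers on randomly chosen data points
    centers = X[rng.choice(len(X), n_clusters, replace=False)]
    for _ in range(n_iter):
        u = graded_memberships(X, centers, beta, alpha)
        centers = update_centers(X, u)
    return centers, graded_memberships(X, centers, beta, alpha)
```

With `alpha=0.0` the memberships of a sample need not sum to 1, so under these assumptions a sample can be rated as typical of several clusters at once, or of none.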

Emotion Recognition from Speech: An Unsupervised Learning Approach / S. Rovetta, Z. Mnasri, F. Masulli, A. Cabri. - In: INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE SYSTEMS. - ISSN 1875-6883. - 14:1(2020), pp. 23-35. [10.2991/ijcis.d.201019.002]

Rovetta, Stefano; Mnasri, Zied; Masulli, Francesco; Cabri, A.

2020

Keywords: Emotion recognition; Feature extraction; Fuzzy clustering; K-means; Membership function; Speech signal
Scientific-disciplinary sectors of the article (display only): Sector INF/01 - Computer Science
Publication date: 2020
Journal in ANCE: INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE SYSTEMS
DOI: https://dx.doi.org/10.2991/ijcis.d.201019.002
Type: Article (author)
Appears in types: 01 - Journal article

Files in this product:

File	Size	Format
03-125945494.pdf (open access; type: Publisher's version/PDF)	2.16 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/955219
