
Smart recognition and synthesis of emotional speech for embedded systems with natural user interfaces / M. Malcangi - In: IJCNN 2011 conference proceedings : the 2011 International Joint Conference on Neural Networks, San Jose, California, USA, July 31 - August 5, 2011. - Piscataway : IEEE, 2011. - ISBN 9781424496372. - pp. 867-871 ((International Joint Conference on Neural Networks held in San Jose, USA, in 2011 [10.1109/IJCNN.2011.6033312].

Smart recognition and synthesis of emotional speech for embedded systems with natural user interfaces

M. Malcangi
First
2011

Abstract

The importance of emotion information in human speech has been growing in recent years due to the increasing use of natural user interfacing in embedded systems. Speech-based human-machine communication has the advantage of a high degree of usability, but it need not be limited to speech-to-text and text-to-speech capabilities. Emotion recognition in uttered speech has been considered in this research to integrate a speech recognizer/synthesizer with the capacity to recognize and synthesize emotion. This paper describes a complete framework for recognizing and synthesizing emotional speech based on smart logic (fuzzy logic and artificial neural networks). Time-domain signal-processing algorithms have been applied to reduce computational complexity at the feature-extraction level. A fuzzy-logic engine was modeled to make inferences about the emotional content of the uttered speech. An artificial neural network was modeled to synthesize emotive speech. Both were designed to be integrated into an embedded handheld device that implements a speech-based natural user interface (NUI).
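The abstract describes a fuzzy-logic engine that infers emotional content from time-domain speech features. A minimal sketch of that kind of inference stage is shown below; it is NOT the paper's implementation. The specific features (normalized frame energy and pitch variability), the triangular membership functions, and the four-rule base are illustrative assumptions only.

```python
# Toy fuzzy-logic emotion classifier over two normalized time-domain
# speech features. All feature names and rules are assumptions made
# for illustration, not the system described in the paper.

def tri(x, a, b, c):
    """Triangular membership function: 0 outside [a, c], peak 1 at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def fuzzify(energy, pitch_var):
    """Map normalized features (0..1) to fuzzy-set membership degrees."""
    return {
        "energy_low":  tri(energy, -0.5, 0.0, 0.5),
        "energy_high": tri(energy,  0.5, 1.0, 1.5),
        "pitch_flat":  tri(pitch_var, -0.5, 0.0, 0.5),
        "pitch_wide":  tri(pitch_var,  0.5, 1.0, 1.5),
    }

def infer_emotion(energy, pitch_var):
    """Fire a small rule base (min as AND) and return the strongest label."""
    m = fuzzify(energy, pitch_var)
    rules = {
        "angry":   min(m["energy_high"], m["pitch_wide"]),
        "happy":   min(m["energy_high"], m["pitch_flat"]),
        "sad":     min(m["energy_low"],  m["pitch_flat"]),
        "fearful": min(m["energy_low"],  m["pitch_wide"]),
    }
    return max(rules, key=rules.get)

print(infer_emotion(0.9, 0.9))  # loud, highly variable pitch -> "angry"
print(infer_emotion(0.1, 0.1))  # quiet, flat pitch -> "sad"
```

A rule base like this is attractive for embedded targets because it needs only comparisons and a handful of multiplications per frame, in line with the paper's stated goal of low computational complexity.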
Emotional speech recognition ; emotional speech synthesis ; natural user interface ; fuzzy logic ; artificial neural networks ; embedded systems
Settore INF/01 - Informatica (Computer Science)
2011
IEEE Computational Intelligence Society
International Neural Networks Society
Book Part (author)
Files for this record:
No files are associated with this record.

Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/168124
Citations
  • PMC: ND
  • Scopus: 2
  • Web of Science (ISI): 0