Text-driven avatars based on artificial neural networks and fuzzy logic / M.N. Malcangi. - In: INTERNATIONAL JOURNAL OF COMPUTERS. - ISSN 1998-4308. - 4:2(2010), pp. 61-69.

Text-driven avatars based on artificial neural networks and fuzzy logic

M.N. Malcangi
First author
2010

Abstract

We discuss a new approach for driving avatars using synthetic speech generated from plain text. Lip and face muscles are controlled by the information embedded in the utterance and its related expressiveness. Rule-based text-to-speech synthesis is used to generate phonetic and expression transcriptions of the text to be uttered by the avatar. Two artificial neural networks, one for text-to-phone transcription and the other for phone-to-viseme mapping, have been trained on phonetic transcription data. Two fuzzy-logic engines were tuned for smoothed control of lip and face movement. Simulations were run to test the neural-fuzzy controls, using a parametric speech synthesizer to generate voices and a face synthesizer to generate facial movement. Experimental results show that soft computing affords a good solution for the smoothed control of avatars during the expressive utterance of text.
Keywords: speech-driven avatar; phone-to-viseme conversion; text-to-speech synthesis; artificial neural network; fuzzy logic
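
The phone-to-viseme and smoothing stages named in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the lookup table `PHONE_TO_VISEME` stands in for the trained phone-to-viseme network, and the viseme openings, the membership functions, and the two rules in `fuzzy_step` are hypothetical and not taken from the paper.

```python
# Hypothetical sketch: phone-to-viseme lookup followed by fuzzy smoothing
# of a single lip-opening parameter. All tables, membership shapes, and
# rule outputs are illustrative assumptions, not values from the paper.

# Stand-in for the trained phone-to-viseme network: a fixed lookup table.
PHONE_TO_VISEME = {
    "p": "closed", "b": "closed", "m": "closed",
    "f": "narrow", "v": "narrow",
    "a": "open", "o": "rounded", "u": "rounded",
}

# Assumed target lip opening (0 = shut, 1 = fully open) for each viseme.
VISEME_OPENING = {"closed": 0.0, "narrow": 0.2, "rounded": 0.5, "open": 1.0}


def triangular(x, left, peak, right):
    """Triangular fuzzy membership function, returning a degree in [0, 1]."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def fuzzy_step(current, target):
    """Move `current` toward `target` by a fuzzily inferred fraction."""
    gap = abs(target - current)
    # Fuzzify the gap into "small" and "large" (assumed membership shapes).
    small = triangular(gap, -1.0, 0.0, 1.0)
    large = triangular(gap, 0.0, 1.0, 2.0)
    # Rules: small gap -> slow step (0.2); large gap -> fast step (0.6).
    # Defuzzify with a weighted average of the two rule outputs.
    weight = small + large
    rate = (small * 0.2 + large * 0.6) / weight if weight else 0.0
    return current + rate * (target - current)


def lip_track(phones, frames_per_phone=3):
    """Return a smoothed lip-opening value for every animation frame."""
    opening, track = 0.0, []
    for phone in phones:
        target = VISEME_OPENING[PHONE_TO_VISEME[phone]]
        for _ in range(frames_per_phone):
            opening = fuzzy_step(opening, target)
            track.append(opening)
    return track


if __name__ == "__main__":
    # Lips stay shut for "m", then open gradually toward the "a" target.
    print([round(v, 3) for v in lip_track(["m", "a"])])
```

The point of the sketch is only the shape of the pipeline: discrete viseme targets are turned into a gradual trajectory, so the lips approach each target over several frames instead of jumping between poses.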
Academic discipline: INF/01 - Informatica (Computer Science)
http://www.naun.org/journals/computers/19-269.pdf
Article (author)
Use this identifier to cite or link to this document: https://hdl.handle.net/2434/143902