Soft-computing methods for text-to-speech driven avatars / M. Malcangi - In: Mathematical Methods and Applied Computing : proceedings of the Applied Computing Conference 2009 : proceedings of the 11th International Conference on Mathematical Methods and Computer Techniques in Electrical Engineering (MMACTEE) : Vouliagmeni, Athens, September 28-30, 2009 : volume 1 / [edited by] N. Mastorakis [et al.]. - Stevens Point, WI : WSEAS, 2009. - ISBN 9789604741243. - pp. 288-292 (Paper presented at the 1st Applied Computing Conference 2009, held in Vouliagmeni, Athens, Greece, in 2009).

Soft-computing methods for text-to-speech driven avatars

M. Malcangi
First author
2009

Abstract

This paper presents a new approach to driving avatars with text-to-speech synthesis, using plain text as the only information source. The goal is to move the lips and facial muscles according to the phonetic content of the utterance and its associated expression. The solution combines several methods. Rule-based text-to-speech synthesis generates a phonetic and expression transcription of the text to be uttered by the avatar. The phonetic transcription is used to train two artificial neural networks, one for text-to-phone transcription and the other for phone-to-viseme mapping. Two fuzzy-logic engines are then tuned for smooth control of lip and face movements.
Phone-to-viseme conversion ; Text-to-speech synthesis ; Artificial neural networks ; Fuzzy logic
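
The pipeline outlined in the abstract (phones mapped to visemes, then fuzzy-logic smoothing of lip movement) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the `PHONE_TO_VISEME` table is a fixed lookup standing in for the trained neural network, the `VISEME_OPENING` targets and the membership-function breakpoints are invented values, and `fuzzy_step` is a simple two-rule controller with weighted-average defuzzification rather than the tuned engines the author describes.

```python
# Hypothetical phone-to-viseme table. The paper learns this mapping with a
# neural network; a fixed lookup is used here only as a stand-in.
PHONE_TO_VISEME = {
    "p": "closed", "b": "closed", "m": "closed",
    "f": "lip_teeth", "v": "lip_teeth",
    "a": "open", "o": "rounded", "u": "rounded",
    "i": "spread", "e": "spread",
}

# Assumed target lip opening per viseme (0 = closed, 1 = fully open).
VISEME_OPENING = {
    "closed": 0.0, "lip_teeth": 0.2, "spread": 0.4,
    "rounded": 0.5, "open": 1.0, "neutral": 0.3,
}


def phones_to_visemes(phones):
    """Map each phone to a viseme class, falling back to 'neutral'."""
    return [PHONE_TO_VISEME.get(p, "neutral") for p in phones]


def triangular(x, left, peak, right):
    """Triangular fuzzy membership function, valued in [0, 1]."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def fuzzy_step(current, target):
    """Move the lip opening one frame toward the target.

    The error magnitude is fuzzified into 'small' and 'large' sets. Each set
    proposes a step gain (0.3 and 0.8, both assumed), and the gains are
    combined by a membership-weighted average.
    """
    error = target - current
    magnitude = min(abs(error), 1.0)
    small = triangular(magnitude, -1.0, 0.0, 0.6)  # peaks at zero error
    large = triangular(magnitude, 0.2, 1.0, 1.8)   # peaks at full error
    if small + large == 0.0:
        return current
    gain = (small * 0.3 + large * 0.8) / (small + large)
    return current + gain * error


def lip_trajectory(phones, frames_per_phone=3, start=0.3):
    """Return a smoothed per-frame lip-opening trajectory for the phones."""
    opening = start
    trajectory = []
    for viseme in phones_to_visemes(phones):
        target = VISEME_OPENING[viseme]
        for _ in range(frames_per_phone):
            opening = fuzzy_step(opening, target)
            trajectory.append(opening)
    return trajectory


if __name__ == "__main__":
    for value in lip_trajectory(["m", "a"]):
        print(f"{value:.3f}")
```

Because the combined gain always stays below 1, each frame moves the lips only part of the way to the viseme target, so the trajectory approaches each target without jumping or overshooting.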
Academic discipline INF/01 - Informatica (Computer Science)
2009
Technical University of Sofia
University Politehnica of Bucharest
University of Genova
Zhejiang University of Technology
Norwegian University of Science and Technology
University of Algarve, Portugal
Book Part (author)

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/72743