A Framework for Mixed-language Text-to-speech Synthesis / M. Malcangi, P. Grew - In: Recent advances in computational intelligence, man-machine systems and cybernetics : proceedings of the 8th WSEAS International Conference on computational intelligence, man-machine systems and cybernetics (CIMMACS '09): Puerto De La Cruz, Tenerife, Canary Islands, Spain, December 14-16, 2009 / edited by C. A. Bulucea [et al.]. - Stevens Point, WI : WSEAS, 2009. - ISBN 9789604741441. - pp. 151-154 (Paper presented at the 8th conference on Computational Intelligence, Man-machine Systems and Cybernetics (CIMMACS '09), held in Puerto de la Cruz, Tenerife, Spain, in 2009).

A Framework for Mixed-language Text-to-speech Synthesis

M. Malcangi (first author); 2009

Abstract

Text-to-speech (TTS) synthesis usually targets a single language and a single speaker, concatenating short, parametrically controlled speech segments by means of a rule-based algorithm. The main disadvantage of this approach is its strong language and speaker dependency. We propose a framework designed to overcome this limitation by employing a multi-language text-to-speech synthesis system. The framework embeds phonetic and prosodic information in a set of rules, so synthesis of more than one language can be carried out by switching from one rule set to another. The system does not depend on phone sets recorded from a specific human voice. Rather, it relies on a human-like speech-synthesis model that can generate the units needed to produce the desired utterance for a specific test string in any kind of voice (male, female, child).
Keywords: Text-to-speech; Multi-language speech synthesis; Rule-based speech synthesis
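
The abstract's idea of switching from one rule set to another can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' implementation: the names `RULE_SETS`, `to_phonemes`, and `synthesize_mixed` are hypothetical, the rules are toy grapheme-to-phoneme mappings with ASCII phoneme labels, and prosodic information and waveform generation are left out entirely.

```python
from typing import Dict, List, Tuple

# Hypothetical toy rule sets, one per language. Each rule maps a grapheme
# sequence to a phoneme label. These are illustrative only: a real rule set
# would be far larger and would also carry prosodic information.
RULE_SETS: Dict[str, Dict[str, str]] = {
    "it": {"ch": "k", "i": "i", "p": "p"},
    "en": {"ch": "tS", "i": "I", "p": "p"},
}


def to_phonemes(text: str, lang: str) -> List[str]:
    """Convert text to phoneme labels using the rule set selected by `lang`."""
    rules = RULE_SETS[lang]
    longest = max(len(grapheme) for grapheme in rules)
    text = text.lower()
    phonemes: List[str] = []
    pos = 0
    while pos < len(text):
        # Try the longest grapheme sequence first, then shorter ones.
        for size in range(min(longest, len(text) - pos), 0, -1):
            chunk = text[pos:pos + size]
            if chunk in rules:
                phonemes.append(rules[chunk])
                pos += size
                break
        else:
            # No rule covers this character: skip it.
            pos += 1
    return phonemes


def synthesize_mixed(segments: List[Tuple[str, str]]) -> List[str]:
    """Process (language, text) segments, switching rule sets per segment."""
    output: List[str] = []
    for lang, text in segments:
        output.extend(to_phonemes(text, lang))
    return output
```

With these toy rules, the same string "chip" yields `["k", "i", "p"]` under the "it" rule set and `["tS", "I", "p"]` under the "en" rule set. The conversion code is shared and only the rule table changes, which is the language-independence property the abstract describes.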
Sector INF/01 - Computer Science
2009
Book Part (author)

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/72746