Toward Language-Independent Text-to-Speech Synthesis / M. Malcangi, P. Grew. - In: WSEAS TRANSACTIONS ON INFORMATION SCIENCE AND APPLICATIONS. - ISSN 1790-0832. - 7:3(2010 Mar), pp. 411-421.

Toward Language-Independent Text-to-Speech Synthesis

M. Malcangi
2010

Abstract

Text-to-speech (TTS) synthesis is becoming a fundamental part of any embedded system that has to interact with humans. Language independence in speech synthesis is a primary requirement for systems that are impractical to update, as is the case for most embedded systems. Because current text-to-speech synthesis is usually tied to a single language and a single speaker (or at most a limited set of voices), a framework for language-independent text-to-speech synthesis is proposed to overcome these limitations when implementing speech synthesis on embedded systems. The proposed framework is designed to embed phonetic and prosodic information in a set of rules. To complete this language-independent solution, a universal set of phones has been defined so that the appropriate speech sounds for every language are available at run time. More than one language can be synthesized by switching from one rule set to another while keeping a common phone data set. Because it uses a vocal-tract-based speech synthesizer, the system does not depend on phone sets recorded from a specific human voice, so voice types can be chosen at run time.
Keywords: text-to-speech, multi-language speech synthesis, rule-based speech synthesis
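
The rule-switching idea described in the abstract can be illustrated with a small sketch. This is a toy under stated assumptions, not the authors' implementation: the names `UNIVERSAL_PHONES`, `RuleSet`, and `Frontend`, the SAMPA-like phone labels, and the two miniature rule sets are all hypothetical, and prosodic rules and the vocal-tract synthesizer itself are left out. It only shows how per-language letter-to-phone rules might be swapped at run time while the phone inventory stays shared.

```python
from dataclasses import dataclass

# Hypothetical shared phone inventory (labels are illustrative only).
# A real system would attach vocal-tract parameters to each phone.
UNIVERSAL_PHONES = {"a", "o", "k", "tS", "h"}


@dataclass(frozen=True)
class RuleSet:
    """Letter-to-phone rules for one language (assumed format).

    `rules` is an ordered tuple of (grapheme string, phone tuple);
    longer grapheme strings must come first so they match first.
    """
    language: str
    rules: tuple

    def to_phones(self, text):
        phones = []
        text = text.lower()
        i = 0
        while i < len(text):
            for graphemes, output in self.rules:
                if text.startswith(graphemes, i):
                    phones.extend(output)
                    i += len(graphemes)
                    break
            else:
                i += 1  # skip characters that no rule covers
        return phones


class Frontend:
    """Keeps one common phone set and switches between rule sets."""

    def __init__(self, rule_sets):
        self.rule_sets = {rs.language: rs for rs in rule_sets}
        self.active = None

    def switch(self, language):
        self.active = self.rule_sets[language]

    def transcribe(self, text):
        if self.active is None:
            raise RuntimeError("no rule set selected")
        phones = self.active.to_phones(text)
        unknown = [p for p in phones if p not in UNIVERSAL_PHONES]
        if unknown:
            raise ValueError(f"phones outside the shared set: {unknown}")
        return phones


# Two toy rule sets that read the same spelling differently.
ITALIAN_LIKE = RuleSet("it", (("ch", ("k",)), ("a", ("a",)), ("o", ("o",))))
ENGLISH_LIKE = RuleSet("en", (("ch", ("tS",)), ("a", ("a",)), ("o", ("o",))))

frontend = Frontend([ITALIAN_LIKE, ENGLISH_LIKE])
frontend.switch("it")
italian_phones = frontend.transcribe("cha")
frontend.switch("en")
english_phones = frontend.transcribe("cha")
```

In this sketch the same input yields different phone sequences depending on the active rule set, while both sequences draw on the one shared inventory, which is the property the abstract attributes to the framework.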
Academic discipline: INF/01 - Computer Science
March 2010
http://www.wseas.us/e-library/transactions/information/2010/89-374.pdf
Article (author)
Use this identifier to cite or link to this document: https://hdl.handle.net/2434/139785