IRIS Institutional Research Information System - AIR Archivio Istituzionale della Ricerca

Toward Language-Independent Text-to-Speech Synthesis / M. Malcangi, P. Grew. - In: WSEAS TRANSACTIONS ON INFORMATION SCIENCE AND APPLICATIONS. - ISSN 1790-0832. - 7:3(2010 Mar), pp. 411-421.

Toward Language-Independent Text-to-Speech Synthesis

M. Malcangi (first author); P. Grew

2010

Abstract

Text-to-speech (TTS) synthesis is becoming a fundamental part of any embedded system that has to interact with humans. Language independence in speech synthesis is a primary requirement for systems that are impractical to update, as is the case for most embedded systems. Because current text-to-speech synthesis usually targets a single language and a single speaker (or at most a limited set of voices), a framework for language-independent text-to-speech synthesis is proposed to overcome these limitations when implementing speech synthesis on embedded systems. The proposed framework was designed to embed phonetic and prosodic information in a set of rules. To complete this language-independent solution, a universal set of phones has been defined so that the appropriate speech sounds for every language are available at run time. More than one language can be synthesized by switching from one rule set to another while keeping a common phone-data set. Because it uses a vocal-tract-based speech synthesizer, the system does not depend on phone sets recorded from a specific human voice, so voice types can be chosen at run time.
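The separation the abstract describes, per-language rule sets over one shared phone inventory, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the names `UNIVERSAL_PHONES`, `RuleSet`, `LANG_A`, `LANG_B`, and `plan_utterance` are hypothetical, the phone parameters are placeholders, and the two rule sets are toy mappings rather than the phonology of any real language. Prosodic rules are omitted.

```python
from dataclasses import dataclass

# Hypothetical shared phone inventory (assumption): symbol -> placeholder parameters.
# A real system would store vocal-tract synthesizer parameters here.
UNIVERSAL_PHONES = {
    "a": {"voiced": True},
    "i": {"voiced": True},
    "k": {"voiced": False},
    "s": {"voiced": False},
    "tS": {"voiced": False},
}


@dataclass(frozen=True)
class RuleSet:
    """Toy letter-to-phone rules for one language (illustrative only)."""
    language: str
    rules: tuple  # pairs of (grapheme string, tuple of phone symbols)

    def to_phones(self, text):
        # Try longer grapheme sequences first so "ci" wins over "c".
        ordered = sorted(self.rules, key=lambda r: len(r[0]), reverse=True)
        phones, i = [], 0
        text = text.lower()
        while i < len(text):
            for graphemes, out in ordered:
                if text.startswith(graphemes, i):
                    phones.extend(out)
                    i += len(graphemes)
                    break
            else:
                i += 1  # skip characters that no rule covers
        return phones


# Two toy rule sets that differ only in how they read "ci".
LANG_A = RuleSet("lang_a", (("ci", ("tS", "i")), ("c", ("k",)), ("a", ("a",)), ("i", ("i",))))
LANG_B = RuleSet("lang_b", (("ci", ("s", "i")), ("c", ("k",)), ("a", ("a",)), ("i", ("i",))))


def plan_utterance(text, rule_set, voice="default"):
    """Pair each phone with its shared parameters and a run-time voice choice."""
    return [(p, UNIVERSAL_PHONES[p], voice) for p in rule_set.to_phones(text)]


if __name__ == "__main__":
    print(LANG_A.to_phones("cia"))
    print(LANG_B.to_phones("cia"))
```

In this sketch, switching language means passing a different `RuleSet` to `plan_utterance`; the phone inventory and the voice argument stay independent of that choice, which mirrors the run-time switching the abstract describes.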

	Keywords
	
			Text-to-speech, multi-language speech synthesis, rule-based speech synthesis
		
	Scientific-disciplinary sectors of the article
	
			Sector INF/01 - Computer Science
		
	Publication date
	
			Mar-2010
		
	Journal in ANCE
	
			WSEAS TRANSACTIONS ON INFORMATION SCIENCE AND APPLICATIONS
		
	URL
	
			http://www.wseas.us/e-library/transactions/information/2010/89-374.pdf
		
	Type
	
			Article (author)
		
	Appears in types:
	
			01 - Journal article

Files in this item:

There are no files associated with this item.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/139785
