A Language-Independent Neural Network-Based Speech Synthesizer / M. Malcangi, D. Frontini - In: Proceedings of the 10th International Conference on Engineering Applications of Neural Networks / edited by Kostantinos Margaritis, Lazaros Iliadis. - Thessaloniki : Publishing Centre Alexander T.E.I., 2007. - ISBN 978-960-287-093-8. - pp. 395-402 (Paper presented at the 10th International Conference on Engineering Applications of Neural Networks, held in Thessaloniki, Greece, in 2007.)

A Language-Independent Neural Network-Based Speech Synthesizer

M. Malcangi (first author); 2007

Abstract

The applicability of soft computing to implementing text-to-speech conversion is subject to debate. Using neural networks for phoneme-level text-to-speech conversion has several advantages over hard computing. Soft computing’s capacity to generalize makes it possible to map words missing from the database, as well as to reduce contradictions related to different pronunciations for the same word. Neural networks have been shown to optimally solve a large class of applied pattern-matching problems, but very little research has been done to match the requirements of pattern generation in machine-to-human interaction. An artificial speech synthesizer based on neural networks is being developed for deeply embedded systems, to provide language-independent speech commands on hands-free interfaces. A feed-forward, backpropagation artificial neural network has been trained for this purpose, using a custom-developed, regular-expression-based text-to-phones transcription engine to generate training patterns. Initial experimental results show the expected properties of language independence and in-system learning.
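
The abstract mentions a regular expression-based text-to-phones engine that generates training patterns for a feed-forward network. As a rough illustration only, that idea could be sketched as below. The rule set, the phone symbols, the sliding-window format, and the names `RULES`, `transcribe`, `align`, and `training_patterns` are all assumptions made for this example and are not taken from the paper.

```python
import re

# Hypothetical toy rules, tried in order (multi-letter graphemes first).
# These are illustrative only, not the rule set used by the authors.
RULES = [
    (re.compile(r"ch"), "tS"),
    (re.compile(r"sh"), "S"),
    (re.compile(r"[a-z]"), None),  # fallback: a letter maps to itself
]


def transcribe(word):
    """Return (grapheme, phone) pairs using the first rule that matches at each position."""
    word = word.lower()
    pairs, i = [], 0
    while i < len(word):
        for pattern, phone in RULES:
            m = pattern.match(word, i)
            if m:
                grapheme = m.group(0)
                pairs.append((grapheme, phone if phone is not None else grapheme))
                i = m.end()
                break
        else:
            i += 1  # skip characters that no rule covers
    return pairs


def align(word):
    """Give each letter one label: the phone on a grapheme's first letter, '-' on the rest."""
    letters, labels = [], []
    for grapheme, phone in transcribe(word):
        letters.extend(grapheme)
        labels.extend([phone] + ["-"] * (len(grapheme) - 1))
    return letters, labels


def training_patterns(word, half_window=1, pad="_"):
    """Build (letter window, phone label) pairs, one per letter (assumed pattern format)."""
    letters, labels = align(word)
    padded = [pad] * half_window + letters + [pad] * half_window
    width = 2 * half_window + 1
    return [("".join(padded[i:i + width]), labels[i]) for i in range(len(letters))]


if __name__ == "__main__":
    for window, phone in training_patterns("chip"):
        print(window, "->", phone)
```

In a setup like this, each window and label would still need a numeric encoding (for example one-hot vectors) before being presented to a network; that step is omitted here.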
Speech synthesis ; text-to-speech ; artificial neural network ; regular expressions
Sector INF/01 - Informatica (Computer Science)
Book Part (author)
Use this identifier to cite or link to this document: https://hdl.handle.net/2434/48288