A Language-Independent Neural Network-Based Speech Synthesizer / M. Malcangi, D. Frontini. In: Proceedings of the 10th International Conference on Engineering Applications of Neural Networks / edited by Kostantinos Margaritis, Lazaros Iliadis. Thessaloniki: Publishing Centre Alexander T.E.I., 2007. ISBN 978-960-287-093-8, pp. 395-402. Paper presented at the 10th International Conference on Engineering Applications of Neural Networks, held in Thessaloniki, Greece, in 2007.
A Language-Independent Neural Network-Based Speech Synthesizer
M. Malcangi; D. Frontini
2007
Abstract
The applicability of soft computing to implementing text-to-speech conversion is subject to debate. Using neural networks for phoneme-level text-to-speech conversion has several advantages over hard computing. Soft computing's capacity to generalize makes it possible to map words missing from the database, as well as to reduce contradictions arising from different pronunciations of the same word. Neural networks have been shown to optimally solve a large class of applied pattern-matching problems, but very little research has addressed the requirements of pattern generation in machine-to-human interaction. An artificial speech synthesizer based on neural networks is being developed for application to deeply embedded systems providing language-independent speech commands on hands-free interfaces. A feed-forward backpropagation artificial neural network has been trained for this purpose using a custom-developed, regular expression-based, text-to-phones transcription engine to generate training patterns. Initial experimental results show the expected properties of language independence and in-system learning.
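The pipeline the abstract describes, a rule engine generating letter/phone training patterns for a feed-forward backpropagation network, can be sketched as follows. This is a minimal illustration, not the authors' system: the regex rules, phone symbols, window size, word list, and network dimensions are all invented assumptions.

```python
import re
import numpy as np

# Illustrative sketch only: rules, phones, and sizes are invented,
# not taken from the paper's transcription engine or network.

ALPHA = "abcdefghijklmnopqrstuvwxyz_"          # "_" pads word boundaries

# Ordered regex rewrite rules standing in for a regular expression-based
# text-to-phones engine. The first matching rule wins.
RULES = [
    (r"c(?=[ei])", "s"),    # "soft" c before e or i
    (r"c", "k"),            # "hard" c elsewhere
    (r"sh", "S"),
    (r"ee", "i:"),
    (r"[a-z]", None),       # fallback: a letter maps to itself
]

def transcribe(word):
    """Return a list of (letter_index, phone) pairs for a lowercase word."""
    out, i = [], 0
    while i < len(word):
        for pat, phone in RULES:
            m = re.match(pat, word[i:])
            if m:
                out.append((i, phone if phone else m.group(0)))
                i += max(m.end(), 1)   # guard against zero-width matches
                break
    return out

def window(word, i):
    """One-hot encode the 3-letter window centred on word[i]."""
    padded = "_" + word + "_"
    v = np.zeros(3 * len(ALPHA))
    for j, ch in enumerate(padded[i:i + 3]):
        v[j * len(ALPHA) + ALPHA.index(ch)] = 1.0
    return v

# Training patterns generated by the rule engine (word list invented).
words = ["cent", "city", "cat", "cot", "sheep", "teen"]
pairs = [(window(w, i), p) for w in words for i, p in transcribe(w)]
phones = sorted({p for _, p in pairs})
X = np.array([x for x, _ in pairs])
Y = np.zeros((len(pairs), len(phones)))
for r, (_, p) in enumerate(pairs):
    Y[r, phones.index(p)] = 1.0

# Minimal feed-forward network trained with backpropagation:
# one sigmoid hidden layer, softmax output, batch gradient descent.
rng = np.random.default_rng(0)
W1 = rng.normal(0.0, 0.5, (X.shape[1], 16))
W2 = rng.normal(0.0, 0.5, (16, len(phones)))
for _ in range(3000):
    H = 1.0 / (1.0 + np.exp(-X @ W1))           # forward pass
    Z = H @ W2
    P = np.exp(Z - Z.max(axis=1, keepdims=True))
    P /= P.sum(axis=1, keepdims=True)           # softmax probabilities
    dZ = (P - Y) / len(X)                       # cross-entropy gradient
    dW2, dH = H.T @ dZ, dZ @ W2.T * H * (1.0 - H)
    W1 -= 0.5 * (X.T @ dH)                      # backpropagate
    W2 -= 0.5 * dW2

H = 1.0 / (1.0 + np.exp(-X @ W1))
acc = ((H @ W2).argmax(axis=1) == Y.argmax(axis=1)).mean()
print(f"training accuracy: {acc:.2f}")
```

The context windows hint at the generalization property claimed in the abstract: a word never seen in training, such as "cite", reuses letter windows the network has already learned (here, the "_ci" window predicting a soft c), so unseen words can still be mapped to plausible phone sequences.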