
Audio based real-time speech animation of embodied conversational agents / M. Malcangi, R. de Tintis - In: Gesture-based communication in human-computer interaction : 5th International Gesture Workshop, GW 2003, Genova, Italy, April 15-17, 2003 : Selected Revised Papers / edited by A. Camurri, G. Volpe. - Berlin : Springer, 2004. - ISBN 9783540210726. - pp. 429-430 (Paper presented at the 5th International Gesture Workshop, held in Genova, Italy, in 2003.)

Audio based real-time speech animation of embodied conversational agents

M. Malcangi
First author
2004

Abstract

A framework for facial animation of embodied agents based on speech analysis in the presence of background noise is described. Target application areas are entertainment and mobile visual communication. This novel approach derives from the speech signal all the information needed to drive 3-D facial models. Using both digital signal processing and soft computing (fuzzy logic and neural networks) methodologies, a very flexible and low-cost solution for the extraction of lip- and facial-related information has been implemented. The main advantage of the speech-based approach is that it is not invasive: speech is captured by means of a microphone, and there is no physical contact with the subject (no magnetic sensors or optical markers). This gives the application additional flexibility and broader applicability compared to other methodologies. First, a speech-based lip driver system was developed to synchronize speech to lip movements; the methodology was then extended to several important facial movements, so that a face-synching system could be modeled. The developed system is speaker- and language-independent, so no neural network training operations are required.
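The abstract describes a pipeline that maps speech-signal features to lip and facial parameters via DSP and fuzzy logic. As a purely illustrative sketch (not the authors' implementation), one stage of such a pipeline could compute short-time frame energy and map it to a normalized mouth-opening value through a ramp resembling a trapezoidal fuzzy membership; the function names and the `low`/`high` thresholds below are hypothetical placeholders:

```python
import math

def frame_energy(samples, frame_len=160):
    """Short-time RMS energy for each non-overlapping frame."""
    return [
        math.sqrt(sum(s * s for s in samples[i:i + frame_len]) / frame_len)
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]

def lip_opening(energy, low=0.05, high=0.5):
    """Map frame energy to a normalized mouth-opening value in [0, 1].
    The ramp between low and high acts like a trapezoidal fuzzy
    membership; both thresholds are arbitrary illustrative values."""
    if energy <= low:
        return 0.0   # silence / background noise -> mouth closed
    if energy >= high:
        return 1.0   # loud voiced speech -> mouth fully open
    return (energy - low) / (high - low)

# Toy usage: a 100 Hz tone whose amplitude ramps up over one second at 8 kHz,
# producing mouth-opening values that grow with the signal's energy.
sr = 8000
samples = [(n / sr) * math.sin(2 * math.pi * 100 * n / sr) for n in range(sr)]
openings = [lip_opening(e) for e in frame_energy(samples)]
```

A real system like the one described would of course use richer features (e.g. spectral cues for viseme selection) and fuzzy rule sets rather than a single energy ramp.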
Speech-animated Avatars ; Speech processing ; Fuzzy Logic ; Artificial Neural Networks
Settore INF/01 - Informatica
Book Part (author)
Files for this product:
No files are associated with this product.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/142620
Citations
  • Scopus: 8