Exploiting temporal feature integration for generalized sound recognition

Ntalampiras, S.; Potamitis, I.; Fakotakis, N.

doi:10.1155/2009/807162

This paper presents a methodology that incorporates temporal feature integration for automated generalized sound recognition. Such a system can be of great use for scene analysis and understanding based on the acoustic modality. The performance of three feature sets, based on the Mel filterbank, the MPEG-7 audio protocol, and wavelet decomposition, is assessed. Furthermore, we explore the application of temporal integration using three different strategies: (a) short-term statistics, (b) spectral moments, and (c) autoregressive models. The experimental setup is thoroughly explained and is based on the concurrent usage of professional sound effects collections. In this way, we try to form a representative picture of the characteristics of ten sound classes. During the first phase of our implementation, audio classification is achieved through statistical models (HMMs), while a fusion scheme that exploits the models constructed from the various feature sets provides the highest average recognition rate. The proposed system not only uses diverse groups of sound parameters but also exploits the advantages of temporal feature integration.
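As a rough illustration of strategy (a), short-term statistics, the sketch below summarises frame-level feature vectors over sliding windows by their per-dimension mean and variance. It is a minimal sketch and not the authors' implementation: the function name, the window and hop lengths, and the toy input are all assumptions made for the example, since the abstract does not specify them.

```python
from statistics import fmean, pvariance


def integrate_short_term_stats(frames, window=4, hop=2):
    """Summarise frame-level feature vectors over sliding windows.

    frames: list of equal-length feature vectors, one per analysis frame.
    Returns one vector per window: the per-dimension means followed by
    the per-dimension population variances.
    The window and hop values are hypothetical defaults, not from the paper.
    """
    if window <= 0 or hop <= 0:
        raise ValueError("window and hop must be positive")
    integrated = []
    for start in range(0, len(frames) - window + 1, hop):
        block = frames[start:start + window]
        columns = list(zip(*block))  # one tuple per feature dimension
        means = [fmean(col) for col in columns]
        variances = [pvariance(col) for col in columns]
        integrated.append(means + variances)
    return integrated


# Toy input (assumed): six frames with two features each.
toy_frames = [
    [1.0, 10.0], [2.0, 10.0], [3.0, 10.0],
    [4.0, 10.0], [5.0, 10.0], [6.0, 10.0],
]
result = integrate_short_term_stats(toy_frames, window=4, hop=2)
```

Each output vector is twice as long as an input frame, and the output sequence is shorter than the input by a factor set by the hop length. Strategies (b) and (c) would replace the mean and variance with spectral moments or autoregressive coefficients computed over the same windows.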

Exploiting temporal feature integration for generalized sound recognition / S. Ntalampiras, I. Potamitis, N. Fakotakis. - In: EURASIP JOURNAL ON ADVANCES IN SIGNAL PROCESSING. - ISSN 1687-6172. - 2009:1(2009).


Ntalampiras, S.; Potamitis, Ilyas; Fakotakis, Nikos



Keywords: Signal Processing; Hardware and Architecture; Electrical and Electronic Engineering
Scientific-disciplinary sectors of the article: Sector INF/01 - Computer Science
Publication date: 2009
Journal in ANCE: EURASIP JOURNAL ON ADVANCES IN SIGNAL PROCESSING
DOI: https://dx.doi.org/10.1155/2009/807162
Type: Article (author)
Appears in types: 01 - Journal article

Files in this item:

File	Size	Format
807162.pdf (open access; type: Publisher's version/PDF)	366.03 kB	Adobe PDF


Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/615142


IRIS Institutional Research Information System - AIR Archivio Istituzionale della Ricerca