Automatic recognition of urban soundscenes / S. Ntalampiras, I. Potamitis, N. Fakotakis (STUDIES IN COMPUTATIONAL INTELLIGENCE). - In: New Directions in Intelligent Interactive Multimedia / edited by G.A. Tsihrintzis, M. Virvou, R.J. Howlett, L.C. Jain. - [s.l] : Springer, 2008. - ISBN 9783540681267. - pp. 147-153. Paper presented at the 1st International Symposium on Intelligent Interactive Multimedia Systems and Services, held in Piraeus in 2008.

Automatic recognition of urban soundscenes

S. Ntalampiras
2008

Abstract

In this paper we propose a novel architecture for environmental sound classification. The first section introduces the reader to current work in this research field. We then explore the use of Mel-frequency cepstral coefficients (MFCCs) and MPEG-7 audio features in combination with a classification method based on Gaussian mixture models (GMMs). We provide details on the feature extraction process as well as on the recognition stage of the proposed methodology. The performance of the implementation is evaluated through experiments on six categories of environmental sounds (aircraft, motorcycle, car, crowd, thunder, train). The proposed method is fast because it does not require high computational resources, and it therefore meets the needs of a real-time application.
Computer Audition; Automatic audio recognition; MPEG-7 audio; MFCC; Gaussian mixture model (GMM)
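
As an illustration of the GMM-based recognition stage mentioned in the abstract, the following is a minimal sketch of maximum-likelihood classification with one GMM per sound class. It is not the authors' implementation: the abstract does not state the number of mixture components, the covariance type, or the feature dimensionality, so the diagonal covariances, the two-dimensional toy features, and all parameter values below are assumptions. In practice each frame would be a vector of MFCC or MPEG-7 descriptors, and the model parameters would be estimated from training data (for example with the EM algorithm).

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_likelihood(frames, gmm):
    """Total log-likelihood of a sequence of feature frames under one GMM.

    gmm is a list of (weight, mean, var) components. Frames are treated
    as independent, so their per-frame log-likelihoods are summed.
    """
    total = 0.0
    for x in frames:
        # Log-sum-exp over the mixture components for numerical stability.
        terms = [math.log(w) + log_gaussian_diag(x, m, v) for w, m, v in gmm]
        peak = max(terms)
        total += peak + math.log(sum(math.exp(t - peak) for t in terms))
    return total


def classify(frames, models):
    """Return the class label whose GMM gives the highest log-likelihood."""
    return max(models, key=lambda label: gmm_log_likelihood(frames, models[label]))


# Toy 2-D models with made-up parameters (assumed; NOT values from the paper).
models = {
    "car": [(0.6, [0.0, 0.0], [1.0, 1.0]), (0.4, [1.0, 1.0], [0.5, 0.5])],
    "thunder": [(0.5, [5.0, 5.0], [1.0, 1.0]), (0.5, [6.0, 4.0], [2.0, 2.0])],
}

# Hypothetical feature frames extracted from one audio clip.
frames = [[5.2, 4.9], [5.8, 4.3], [4.7, 5.1]]
print(classify(frames, models))  # prints "thunder"
```

Scoring a clip only requires evaluating a small number of Gaussians per frame and per class, which is consistent with the abstract's remark that the approach has low computational requirements.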
Sector INF/01 - Computer Science
Book Part (author)
Use this identifier to cite or link to this document: https://hdl.handle.net/2434/615131
Citations
  • PMC: not available
  • Scopus: 37
  • ISI: 7