IRIS Institutional Research Information System - AIR Archivio Istituzionale della Ricerca

This paper presents an approach to speech/music discrimination. The architecture aims to improve the performance of a speech recognition system by excluding non-speech content from processing. Multiresolution analysis is applied to the input signal, and the most significant statistical features are computed over a predefined texture size. These features are then modeled with Gaussian mixture models (GMM), a state-of-the-art technique for probability density function estimation. The final step is a classification scheme based on a conventional maximum likelihood decision. Although the system relies solely on wavelet signal processing, it achieved a recognition rate of 91.8%.
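The pipeline outlined in the abstract (wavelet decomposition, statistical features per texture window, one GMM per class, maximum likelihood decision) can be illustrated with a minimal sketch. This is not the authors' implementation: the Haar wavelet, the three decomposition levels, the two statistics per subband, the two-component diagonal GMM, and the synthetic "tone_like" and "noise_like" signals (stand-ins for real music and speech recordings) are all assumptions made only for illustration, as are the function and class names.

```python
import math
import random

def haar_dwt(x, levels):
    """Haar DWT: returns [d1, ..., dL, aL] (detail bands, then approximation)."""
    bands, approx = [], list(x)
    for _ in range(levels):
        pairs = list(zip(approx[0::2], approx[1::2]))
        bands.append([(a - b) / math.sqrt(2) for a, b in pairs])
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
    return bands + [approx]

def texture_features(window, levels=3):
    """Mean absolute value and standard deviation of every subband (assumed feature set)."""
    feats = []
    for band in haar_dwt(window, levels):
        mu = sum(band) / len(band)
        feats.append(sum(abs(c) for c in band) / len(band))
        feats.append(math.sqrt(sum((c - mu) ** 2 for c in band) / len(band)))
    return feats

def log_gauss(x, mean, var):
    """Log density of a diagonal-covariance Gaussian."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def logsumexp(vals):
    top = max(vals)
    return top + math.log(sum(math.exp(v - top) for v in vals))

class DiagGMM:
    """Diagonal-covariance Gaussian mixture fitted with a few EM iterations."""

    def __init__(self, data, k=2, iters=15, floor=1e-6):
        n, dim = len(data), len(data[0])
        gmean = [sum(x[d] for x in data) / n for d in range(dim)]
        gvar = [max(floor, sum((x[d] - gmean[d]) ** 2 for x in data) / n)
                for d in range(dim)]
        self.weights = [1.0 / k] * k
        self.means = [list(data[j * n // k]) for j in range(k)]  # spread-out init
        self.vars = [list(gvar) for _ in range(k)]
        for _ in range(iters):
            resp = [self._posteriors(x) for x in data]           # E-step
            for j in range(k):                                   # M-step
                nj = sum(r[j] for r in resp) + 1e-12
                self.weights[j] = nj / n
                self.means[j] = [sum(r[j] * x[d] for r, x in zip(resp, data)) / nj
                                 for d in range(dim)]
                self.vars[j] = [max(floor, sum(r[j] * (x[d] - self.means[j][d]) ** 2
                                               for r, x in zip(resp, data)) / nj)
                                for d in range(dim)]

    def _log_terms(self, x):
        return [math.log(w) + log_gauss(x, m, v)
                for w, m, v in zip(self.weights, self.means, self.vars)]

    def _posteriors(self, x):
        terms = self._log_terms(x)
        norm = logsumexp(terms)
        return [math.exp(t - norm) for t in terms]

    def log_likelihood(self, x):
        return logsumexp(self._log_terms(x))

def classify(feature_vectors, models):
    """Maximum likelihood decision: pick the class whose GMM scores highest."""
    scores = {label: sum(g.log_likelihood(f) for f in feature_vectors)
              for label, g in models.items()}
    return max(scores, key=scores.get)

# Synthetic stand-ins (assumption): a noisy sinusoid and white noise.
def toy_tone(n=256):
    phase = random.uniform(0, 2 * math.pi)
    return [math.sin(2 * math.pi * 0.01 * i + phase) + random.gauss(0, 0.05)
            for i in range(n)]

def toy_noise(n=256):
    return [random.gauss(0, 1) for _ in range(n)]

random.seed(0)
models = {
    "tone_like": DiagGMM([texture_features(toy_tone()) for _ in range(30)]),
    "noise_like": DiagGMM([texture_features(toy_noise()) for _ in range(30)]),
}
```

Calling `classify([texture_features(w) for w in windows], models)` on a list of unseen windows returns the label of the best-scoring model. A practical system would presumably train on labeled speech and music recordings and might use a different wavelet family and feature set than the ones assumed here.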

Speech/music discrimination based on discrete wavelet transform / S. Ntalampiras, N. Fakotakis (LECTURE NOTES IN COMPUTER SCIENCE). - In: Artificial Intelligence: Theories, Models and Applications / edited by J. Darzentas, G.A. Vouros, S. Vosinakis, A. Arnellos. - [s.l.] : Springer, 2008. - ISBN 9783540878803. - pp. 205-211 (Paper presented at the 5th Hellenic Conference on Artificial Intelligence, held in Syros in 2008).

Speech/music discrimination based on discrete wavelet transform

S. Ntalampiras; Nikos Fakotakis

2008

	Keywords
	
			Computer audition; content-based audio classification; discrete wavelet transform; Gaussian mixture model
		
	Scientific-disciplinary sectors of the contribution
	
			Sector INF/01 - Computer Science
		
	Publication date
	
			2008
		
	Organizations associated with the conference
	
			University of the Aegean
Department of Product and Systems Design Engineering
Prefecture of the Cyclades
Holy Metropolis of Syros
		
	DOI
	
			https://dx.doi.org/10.1007/978-3-540-87881-0_19
		
	Type
	
			Book Part (author)
		
	Appears in types:
	
			03 - Book contribution

Files in this record:

File	Size	Format
Ntalampiras-Fakotakis2008_Chapter_SpeechMusicDiscriminationBased.pdf (restricted access; type: Publisher's version/PDF)	220.82 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/615066
