IRIS Institutional Research Information System - AIR Archivio Istituzionale della Ricerca

A comparative study in automatic recognition of broadcast audio / S. Ntalampiras, N. Fakotakis (INTERSPEECH). - In: Proceedings of the Annual Conference of the International Speech Communication Association [s.l] : ISCA, 2008. - ISBN 9781615673780. - pp. 2498-2501 (Paper presented at the 9th INTERSPEECH conference, held in Brisbane in 2008).

A comparative study in automatic recognition of broadcast audio

Ntalampiras, S.; Fakotakis, Nikos

2008

Abstract

This paper describes in detail a methodology that achieves high accuracy in the automatic analysis of broadcast audio. The main objective is to find a feature set that discriminates speech from music efficiently while keeping its dimensionality as low as possible. Three groups of parameters, based on the Mel-scale filterbank, the MPEG-7 standard, and wavelet decomposition, are examined in detail. We annotated highly diverse online radio recordings to build probabilistic models and to test four frameworks. The proposed approach uses wavelets to model speech and the MPEG-7 ASP descriptor to model music, and achieves an average recognition rate of 98.5%.

	Keywords
				content-based audio recognition; speech/music discrimination; mfcc; mpeg-7; wavelets

	Scientific-disciplinary sectors of the contribution (display only)
				Sector INF/01 - Computer Science

	Publication date
				2008

	Type
				Book Part (author)

	Appears in types:
				03 - Book contribution

Files in this item:

There are no files associated with this item.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/615006
