IRIS Institutional Research Information System - AIR Archivio Istituzionale della Ricerca

An automatic speaker recognition system for intelligence applications / E. Marchetto, F. Avanzini, F. Flego (EUROPEAN SIGNAL PROCESSING CONFERENCE). - In: European Signal Processing Conference [s.l.] : EUSIPCO, 2009. - ISBN 9781617388767. - pp. 1612-1616 (Paper presented at the 17th European Signal Processing Conference (EUSIPCO), held in Glasgow in 2009) [10.5281/zenodo.41687].

An automatic speaker recognition system for intelligence applications

E. Marchetto; F. Avanzini; F. Flego

2009

Abstract

This paper presents an automatic speaker recognition system for intelligence applications. The system has to support a speaker-skimming application in which databases of recorded conversations belonging to an ongoing investigation can be annotated and quickly browsed by an operator. The paper discusses the critical issues introduced by the characteristics of the audio signals under consideration, in particular background noise and channel/coding distortions, as well as the requirements and functionalities of the system under development. It is shown that the performance of state-of-the-art approaches degrades significantly in the presence of moderately high background noise. Finally, a novel speaker recognizer based on phonetic features and an ensemble classifier is presented. Results show that the proposed approach improves performance on clean audio, and suggest that it can be employed to improve real-world robustness.

	Keywords
	
				Face recognition; Face detection; Audio de-noising; Speaker recognition; Voice technologies
			
	Scientific-disciplinary sectors of the contribution (display only)
	
				Sector INF/01 - Computer Science
				Sector ING-INF/05 - Information Processing Systems
			
	Publication date
	
				2009
			
	DOI
	
				https://dx.doi.org/10.5281/zenodo.41687
			
	Type
	
				Book Part (author)
			
	Appears in types:
	
				03 - Book contribution

Files in this item:

File	Size	Format
marchetto_eusipco09.pdf (restricted access; type: post-print, accepted manuscript, i.e. the version accepted by the publisher)	260.12 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/655833
