This paper presents an automatic speaker recognition system for intelligence applications. The system must support a speaker skimming application in which databases of recorded conversations belonging to an ongoing investigation can be annotated and quickly browsed by an operator. The paper discusses the critical issues introduced by the characteristics of the audio signals under consideration - in particular background noise and channel/coding distortions - as well as the requirements and functionalities of the system under development. It is shown that the performance of state-of-the-art approaches degrades significantly in the presence of moderately high background noise. Finally, a novel speaker recognizer based on phonetic features and an ensemble classifier is presented. Results show that the proposed approach improves performance on clean audio, and suggest that it can be used to improve real-world robustness.

An automatic speaker recognition system for intelligence applications / E. Marchetto, F. Avanzini, F. Flego (EUROPEAN SIGNAL PROCESSING CONFERENCE). - In: European Signal Processing Conference. - [s.l] : EUSIPCO, 2009. - ISBN 9781617388767. - pp. 1612-1616. (Paper presented at the 17th European Signal Processing Conference (EUSIPCO), held in Glasgow in 2009.) [10.5281/zenodo.41687]

An automatic speaker recognition system for intelligence applications

F. Avanzini
2009

Keywords: Face recognition; Face detection; Audio de-noising; Speaker recognition; Voice technologies
Sector INF/01 - Computer Science
Sector ING-INF/05 - Information Processing Systems
Book Part (author)
Use this identifier to cite or link to this document: https://hdl.handle.net/2434/655833