This paper presents an automatic speaker recognition system for intelligence applications. The system must support a speaker skimming application in which databases of recorded conversations belonging to an ongoing investigation can be annotated and quickly browsed by an operator. The paper discusses the critical issues introduced by the characteristics of the audio signals under consideration, in particular background noise and channel/coding distortions, as well as the requirements and functionalities of the system under development. It is shown that the performance of state-of-the-art approaches degrades significantly in the presence of moderately high background noise. Finally, a novel speaker recognizer based on phonetic features and an ensemble classifier is presented. Results show that the proposed approach improves performance on clean audio, and suggest that it could be used to improve real-world robustness.
An automatic speaker recognition system for intelligence applications / E. Marchetto, F. Avanzini, F. Flego (EUROPEAN SIGNAL PROCESSING CONFERENCE). - In: European Signal Processing Conference. - [s.l] : EUSIPCO, 2009. - ISBN 9781617388767. - pp. 1612-1616. (Paper presented at the 17th European Signal Processing Conference (EUSIPCO), held in Glasgow in 2009.) [10.5281/zenodo.41687]
An automatic speaker recognition system for intelligence applications
F. Avanzini;
2009