Audio database in support of potential threat and crisis situation management / S. Ntalampiras, I. Potamitis, T. Ganchev, N. Fakotakis - In: Proceedings of the 6th International Conference on Language Resources and Evaluation [s.l.] : European Language Resources Association (ELRA), 2008. - ISBN 2951740840. - pp. 1288-1291 (Paper presented at the 6th International Conference on Language Resources and Evaluation, held in Marrakech in 2008.)

Audio database in support of potential threat and crisis situation management

S. Ntalampiras
2008

Abstract

This paper describes a corpus of audio data for automatic space monitoring based solely on the perceived acoustic information. The database was created as part of a project aiming at the detection of abnormal events that lead to life-threatening situations or property damage. The corpus is composed of vocal reactions and environmental sounds that are usually encountered in atypical situations. The audio data comprises three parts: Phase I - professional sound effects collections; Phase II - recordings obtained from action and drama movies; and Phase III - vocal reactions related to real-world emergency events, as retrieved from television, radio broadcast news, documentaries, etc. The annotation methodology is given in detail, along with preliminary classification results and a statistical analysis of the Phase I data. The main objective of such a dataset is to provide training data for automatic recognition machines that detect hazardous situations, and to enhance security in public environments that otherwise require human supervision.
classification; speech; recognition; stress
Sector INF/01 - Computer Science
2008
et al.
European Media Laboratory GmbH (EML)
Instituut voor Nederlandse Lexicologie (INL)
Linguatec
Microsoft
Nuance
http://www.lrec-conf.org/proceedings/lrec2008/summaries/327.html
Book Part (author)
Use this identifier to cite or link to this document: https://hdl.handle.net/2434/615117