Acoustic classification of individual cat vocalizations in evolving environments / S. Ntalampiras, D. Kosmin, J. Sanchez. - In: 2021 44th International Conference on Telecommunications and Signal Processing (TSP). - [s.l.] : IEEE, 2021. - ISBN 978-1-6654-2933-7. - pp. 254-258. Paper presented at the 44th International Conference on Telecommunications and Signal Processing (TSP), held in Brno in 2021 [10.1109/TSP52935.2021.9522660].

Acoustic classification of individual cat vocalizations in evolving environments

S. Ntalampiras; 2021

Abstract

This paper focuses on classifying vocalizations that characterize the intents of individual cats. Cats vocalize to convey different emotions and/or intents, and although their repertoire may not be universal, it exhibits consistent characteristics on an individual basis. In this work, we present a complete pipeline for processing audio streams with a twofold goal: detecting and interpreting cat vocalizations. The proposed system is based on the pre-trained YAMNet deep network, which we modify to address the requirements of the task at hand. The overall system runs in real time on modern smartphones using the developed application. At the same time, we address the non-stationarity problem, meaning that the class dictionary is updated on the fly following user recommendations. After extensive experiments, we show that the proposed pipeline achieves satisfactory detection and recognition accuracy. To the best of our knowledge, this is the first attempt in the related literature to address continuous detection and interpretation of cat vocalizations in real time.
Bioacoustics; cat vocalizations; deep learning; transfer learning; learning in non-stationary environments
Sector INF/01 - Computer Science
Book Part (author)

Use this identifier to cite or link to this document: http://hdl.handle.net/2434/865520