This paper addresses the classification of vocalizations that express the intents of individual cats. Cats vocalize to convey different emotions and/or intents, and although their repertoire may not be universal, it exhibits consistent characteristics on an individual basis. We present a complete pipeline for processing audio streams, with the twofold goal of detecting and interpreting cat vocalizations. The proposed system is based on the pre-trained YAMNet deep network, to which we apply modifications that address the requirements of the task at hand. The overall system runs in real time on modern smartphones using the developed application. We also address the non-stationarity problem: the class dictionary is updated on the fly following user recommendations. Extensive experiments show that the proposed pipeline achieves satisfactory detection and recognition accuracy. To the best of our knowledge, this is the first attempt in the related literature to address continuous, real-time detection and interpretation of cat vocalizations.
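The abstract does not say how the class dictionary is updated, so the following is only a minimal sketch of one way such an open-ended label set could be maintained. It assumes a nearest-centroid rule over fixed-length embedding vectors, which may differ from the classifier used in the paper. The class name `OpenClassDictionary`, the labels, and the toy 2-D vectors are all hypothetical; in practice the vectors would stand in for embeddings from a pre-trained network such as YAMNet.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple


class OpenClassDictionary:
    """Nearest-centroid classifier whose label set can grow at run time.

    Hypothetical illustration only; not the method described in the paper.
    """

    def __init__(self) -> None:
        self._centroids: Dict[str, List[float]] = {}
        self._counts: Dict[str, int] = {}

    def update(self, label: str, embedding: Sequence[float]) -> None:
        """Add a user-labelled example, creating the class if it is new."""
        if label not in self._centroids:
            self._centroids[label] = list(embedding)
            self._counts[label] = 1
            return
        n = self._counts[label] + 1
        centroid = self._centroids[label]
        # Running mean: c_new = c_old + (x - c_old) / n
        for i, x in enumerate(embedding):
            centroid[i] += (x - centroid[i]) / n
        self._counts[label] = n

    def predict(self, embedding: Sequence[float]) -> Optional[Tuple[str, float]]:
        """Return (label, distance) of the closest centroid, or None if empty."""
        if not self._centroids:
            return None
        return min(
            ((label, math.dist(embedding, c)) for label, c in self._centroids.items()),
            key=lambda pair: pair[1],
        )

    @property
    def labels(self) -> List[str]:
        """Currently known class labels, sorted alphabetically."""
        return sorted(self._centroids)


# Toy usage: the labels and vectors are made up for illustration.
classes = OpenClassDictionary()
classes.update("food request", [1.0, 0.0])
classes.update("food request", [0.8, 0.2])
classes.update("greeting", [0.0, 1.0])  # a new class added on the fly
label, distance = classes.predict([0.9, 0.1])
```

A running mean keeps the per-class memory constant, which suits an on-device setting, but it weights old and new examples equally; a drifting repertoire might call for a decayed average instead.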
Acoustic classification of individual cat vocalizations in evolving environments / S. Ntalampiras, D. Kosmin, J. Sanchez. - In: 2021 44th International Conference on Telecommunications and Signal Processing (TSP). - [s.l.] : IEEE, 2021. - ISBN 978-1-6654-2933-7. - pp. 254-258. Paper presented at the 44th International Conference on Telecommunications and Signal Processing (TSP), held in Brno in 2021 [10.1109/TSP52935.2021.9522660].
Acoustic classification of individual cat vocalizations in evolving environments
S. Ntalampiras
2021
| File | Access | Type | Size | Format |
|---|---|---|---|---|
| 60 cats TSP2021.pdf | Open access | Post-print, accepted manuscript, etc. (version accepted by the publisher) | 4.53 MB | Adobe PDF |
| Acoustic_classification_of_individual_cat_vocalizations_in_evolving_environments.pdf | Restricted access | Publisher's version/PDF | 11.06 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.




