Acoustic classification of individual cat vocalizations in evolving environments

Ntalampiras, S.; Kosmin, D.; Sanchez, J.

doi:10.1109/TSP52935.2021.9522660

This paper focuses on classifying vocalizations that characterize the intents of individual cats. Cats vocalize to convey different emotions and/or intents, and although their repertoire may not be universal, it exhibits consistent characteristics on an individual basis. In this work, we present a complete pipeline for processing audio streams with a twofold goal: detecting and interpreting cat vocalizations. The proposed system is based on the pre-trained YAMNet deep network, to which we apply modifications that address the requirements of the task at hand. The overall system runs in real time on modern smartphones using the developed application. At the same time, we address the non-stationarity problem, meaning that the class dictionary is updated on the fly following user recommendations. After extensive experiments, we show that the proposed pipeline achieves satisfactory detection and recognition accuracy. To the best of our knowledge, this is the first attempt in the related literature to address continuous detection and interpretation of cat vocalizations in real time.

Acoustic classification of individual cat vocalizations in evolving environments / S. Ntalampiras, D. Kosmin, J. Sanchez. - In: 2021 44th International Conference on Telecommunications and Signal Processing (TSP). - [s.l.] : IEEE, 2021. - ISBN 978-1-6654-2933-7. - pp. 254-258. (Paper presented at the 44th International Conference on Telecommunications and Signal Processing (TSP), held in Brno in 2021) [10.1109/TSP52935.2021.9522660].

Ntalampiras, S.; Kosmin, Danylo; Sanchez, Javier

2021


Keywords: Bioacoustics; cat vocalizations; deep learning; transfer learning; learning in non-stationary environments

Scientific-disciplinary sectors of the contribution (display only): Sector INF/01 - Informatica (Computer Science)

Publication date: 2021

DOI: https://dx.doi.org/10.1109/TSP52935.2021.9522660

Type: Book Part (author)

Appears in types: 03 - Contribution in volume

Files in this record:

File	Access	Type	Size	Format
60 cats TSP2021.pdf	Open access	Post-print, accepted manuscript, etc. (version accepted by the publisher)	4.53 MB	Adobe PDF
Acoustic_classification_of_individual_cat_vocalizations_in_evolving_environments.pdf	Restricted access	Publisher's version/PDF	11.06 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/865520
