Species-independent analysis and identification of emotional animal vocalizations / S. Ntalampiras. - In: SCIENTIFIC REPORTS. - ISSN 2045-2322. - 15:1(2025 Aug), pp. 28828.1-28828.10. [10.1038/s41598-025-14323-2]

Species-independent analysis and identification of emotional animal vocalizations

S. Ntalampiras
2025

Abstract

Animal vocalizations can differ depending on the context in which they are produced and serve as an instant indicator of an animal's emotional state. Interestingly, from an evolutionary perspective, it should be possible to directly compare different species using the same set of acoustic markers. This paper proposes a deep neural network architecture for analysing and recognizing vocalizations representing positive and negative emotional states. Understanding these vocalizations is critical for advancing animal health and welfare, a subject of growing importance due to its ethical, environmental, economic, and public health implications. To this end, a framework assessing the relationships between vocalizations was developed. To retain all potentially relevant audio content, the framework operates on log-Mel spectrograms. Similarities and dissimilarities are learned by a suitably designed Siamese Neural Network composed of convolutional layers. The resulting latent space is clustered to identify the support set facilitating the emotion classification task. We employed a publicly available dataset and followed a thorough experimental protocol. The efficacy of the scheme is demonstrated through extensive experiments considering both classification and support set selection. Finally, by jointly analysing the network's activations when processing positive and negative vocalizations, important differences in the time-frequency plane are revealed across emotions and species, assisting animal scientists in understanding them.
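As an illustration of the pipeline the abstract describes, the sketch below pairs a log-Mel front end with a convolutional Siamese encoder trained via a contrastive loss, then applies spectral clustering to the latent space to pick support examples. All hyperparameters (sampling rate, Mel bands, embedding size) and helper names (log_mel, Encoder, support_set) are assumptions made for illustration; this is a minimal reading of the abstract and keywords, not the paper's verbatim implementation.

```python
# Minimal sketch of the described pipeline. Hyperparameters and
# function names are illustrative assumptions, not the paper's settings.
import librosa
import numpy as np
import torch
import torch.nn as nn
from sklearn.cluster import SpectralClustering

def log_mel(path, sr=16000, n_mels=64):
    """Log-Mel spectrogram of one vocalization (shapes are assumed)."""
    y, sr = librosa.load(path, sr=sr)
    m = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    return librosa.power_to_db(m, ref=np.max)

class Encoder(nn.Module):
    """Convolutional branch shared by both inputs of the Siamese network."""
    def __init__(self, emb_dim=64):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(32, emb_dim)

    def forward(self, x):  # x: (batch, 1, n_mels, frames)
        return self.fc(self.conv(x).flatten(1))

def contrastive_loss(z1, z2, same, margin=1.0):
    """Pull same-emotion pairs together; push different-emotion
    pairs at least `margin` apart in the latent space."""
    d = torch.nn.functional.pairwise_distance(z1, z2)
    return (same * d.pow(2) +
            (1 - same) * torch.clamp(margin - d, min=0).pow(2)).mean()

def support_set(embeddings, n_clusters=2):
    """Cluster trained embeddings and return, per cluster, the index of
    the point nearest the cluster mean: one plausible way to derive a
    support set via spectral clustering, as the keywords suggest."""
    labels = SpectralClustering(
        n_clusters=n_clusters, affinity="nearest_neighbors"
    ).fit_predict(embeddings)
    idx = []
    for k in range(n_clusters):
        members = np.where(labels == k)[0]
        centre = embeddings[members].mean(axis=0)
        dists = np.linalg.norm(embeddings[members] - centre, axis=1)
        idx.append(int(members[np.argmin(dists)]))
    return idx
```

At inference, a query vocalization would be embedded by the shared encoder and assigned the emotion label of its nearest support example, which is a standard way to use a support set with a Siamese network.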
Animal health and welfare; Audio pattern recognition; Latent representation; Spectral clustering
Academic field: INFO-01/A - Computer Science
Aug 2025
Article (author)
Files in this product:
s41598-025-14323-2.pdf — Publisher's version/PDF, open access, Creative Commons licence, 1.87 MB, Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/1178737