Data augmentation to improve the soundscape ranking index prediction / R. Benocci, A. Potenza, G. Zambon, A. Afify, H.E. Roman. - In: WSEAS TRANSACTIONS ON ENVIRONMENT AND DEVELOPMENT. - ISSN 1790-5079. - 19:(2023 Sep 20), pp. 891-902. [10.37394/232015.2023.19.85]

Data augmentation to improve the soundscape ranking index prediction

A. Afify (penultimate author); 2023

Abstract

Predicting the sound quality of an environment is an important task, especially in urban parks, where the coexistence of anthropogenic and biophonic sources produces complex sound patterns. To this end, we have defined an index, the soundscape ranking index (SRI), which assigns a positive weight to natural sounds (biophony) and a negative one to anthropogenic sounds. We implemented a numerical strategy to optimize the weight values by training two machine learning algorithms, the random forest (RF) and the perceptron (PPN), on an augmented data-set. Because only a relatively small fraction of the recorded sounds was labelled, we employed Monte Carlo simulations to mimic the distribution of the original data-set while keeping the original balance among the classes. The results show an increase in classification performance. We discuss the special care that is needed when the augmented data are based on too small an original data-set.
data augmentation; ecoacoustic indices; machine learning; soundscape; soundscape ranking index; urban parks
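
As a rough illustration of the augmentation idea described in the abstract above, the sketch below draws synthetic samples class by class, so that the class proportions of the original data-set are preserved. It is not the procedure used in the paper: the per-feature Gaussian model, the function name `augment`, the `factor` parameter and the toy feature values are all assumptions made for illustration.

```python
import random
import statistics
from collections import defaultdict


def augment(samples, labels, factor, seed=0):
    """Hypothetical class-balanced Monte Carlo augmentation.

    For each class, fit an independent Gaussian to every feature and draw
    `factor` synthetic samples per original sample, so the class proportions
    of the original data-set are kept. The Gaussian model is an assumption.
    """
    rng = random.Random(seed)

    # Group the original feature vectors by class label.
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)

    new_samples, new_labels = [], []
    for y, rows in by_class.items():
        columns = list(zip(*rows))
        means = [statistics.mean(c) for c in columns]
        # A class with a single sample has no spread estimate; fall back to 0.0.
        stds = [statistics.stdev(c) if len(c) > 1 else 0.0 for c in columns]
        for _ in range(factor * len(rows)):
            new_samples.append([rng.gauss(m, s) for m, s in zip(means, stds)])
            new_labels.append(y)
    return new_samples, new_labels


# Toy, made-up two-feature data: 4 "biophony" and 2 "anthropogenic" samples.
X = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3], [0.85, 0.15], [0.2, 0.8], [0.1, 0.9]]
y = ["biophony"] * 4 + ["anthropogenic"] * 2

X_aug, y_aug = augment(X, y, factor=5)
print(len(X_aug), y_aug.count("biophony"), y_aug.count("anthropogenic"))
```

With only a handful of samples per class, as in this toy example, the fitted means and standard deviations are poorly constrained, which is the kind of situation the abstract cautions about.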
Sector FIS/07 - Applied Physics (Cultural Heritage, Environment, Biology and Medicine)
20 Sep 2023
Article (author)
Files in this item:
File: b725115-026(2023).pdf
Access: open access
Type: Publisher's version/PDF
Size: 6 MB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/1019808