Transfer Learning for Improved Audio-Based Human Activity Recognition / S. Ntalampiras, I. Potamitis. - In: BIOSENSORS. - ISSN 2079-6374. - 8:3(2018 Sep). [10.3390/bios8030060]

Transfer Learning for Improved Audio-Based Human Activity Recognition

S. Ntalampiras (first author)
2018

Abstract

Human activities are accompanied by characteristic sound events, the processing of which might provide valuable information for automated human activity recognition. This paper presents a novel approach addressing the case where one or more human activities are associated with limited audio data, resulting in a potentially highly imbalanced dataset. Data augmentation is based on transfer learning; more specifically, the proposed method: (a) identifies the classes which are statistically close to the ones associated with limited data; (b) learns a multiple-input, multiple-output transformation; and (c) transforms the data of the closest classes so that it can be used for modeling the ones associated with limited data. Furthermore, the proposed framework includes a feature set extracted from signal representations of diverse domains, i.e., temporal, spectral, and wavelet. Extensive experiments demonstrate the relevance of the proposed data augmentation approach under a variety of generative recognition schemes.
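Steps (a) to (c) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each class is summarized by a single Gaussian over its feature vectors, measures statistical closeness with a symmetric Kullback-Leibler divergence, and swaps in a simple affine moment-matching map for the learned multiple-input, multiple-output transformation (the keywords mention an echo state network, which this sketch does not implement). All function names are hypothetical.

```python
import numpy as np

def gaussian_stats(X, reg=1e-6):
    """Mean and regularized covariance of the rows of X (n_samples, n_features)."""
    mu = X.mean(axis=0)
    cov = np.cov(X, rowvar=False) + reg * np.eye(X.shape[1])
    return mu, cov

def symmetric_kl(stats_p, stats_q):
    """Symmetric KL divergence between two Gaussians given as (mean, cov)."""
    def kl(m0, c0, m1, c1):
        c1_inv = np.linalg.inv(c1)
        diff = m1 - m0
        _, logdet0 = np.linalg.slogdet(c0)
        _, logdet1 = np.linalg.slogdet(c1)
        return 0.5 * (np.trace(c1_inv @ c0) + diff @ c1_inv @ diff
                      - m0.size + logdet1 - logdet0)
    (mp, cp), (mq, cq) = stats_p, stats_q
    return kl(mp, cp, mq, cq) + kl(mq, cq, mp, cp)

def closest_class(target_X, candidates):
    """(a) Name of the candidate class statistically closest to the limited class."""
    t = gaussian_stats(target_X)
    return min(candidates,
               key=lambda name: symmetric_kl(t, gaussian_stats(candidates[name])))

def fit_affine_map(source_X, target_X):
    """(b) Assumed stand-in transformation: affine map matching target mean/covariance."""
    mu_s, cov_s = gaussian_stats(source_X)
    mu_t, cov_t = gaussian_stats(target_X)
    L_s = np.linalg.cholesky(cov_s)
    L_t = np.linalg.cholesky(cov_t)
    A = L_t @ np.linalg.inv(L_s)  # whiten with the source factor, color with the target factor
    return A, mu_s, mu_t

def transform(X, A, mu_s, mu_t):
    """(c) Map features of the closest class into the limited class's feature space."""
    return (X - mu_s) @ A.T + mu_t
```

In use, `candidates` would be a dictionary from class name to a feature matrix, and the transformed samples would be appended to the limited class before training its model. The affine map only aligns first- and second-order statistics, so it is a much weaker tool than a learned nonlinear transformation.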
Keywords: echo state network; generalized audio recognition; hidden Markov model; multidomain features; transfer learning
Sector INF/01 - Computer Science
Sep-2018
25-Jun-2018
Article (author)
Files in this item:
File: biosensors-08-00060-v2.pdf
Access: open access
Type: Publisher's version/PDF
Size: 365.96 kB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/580722
Citations
  • PMC 2
  • Scopus 15
  • Web of Science (ISI) 8
  • OpenAlex N/A