IRIS Institutional Research Information System - AIR Archivio Istituzionale della Ricerca

Transfer Learning for Improved Audio-Based Human Activity Recognition / S. Ntalampiras, I. Potamitis. - In: BIOSENSORS. - ISSN 2079-6374. - 8:3(2018 Sep). [10.3390/bios8030060]

Transfer Learning for Improved Audio-Based Human Activity Recognition

S. Ntalampiras (first author); Ilyas Potamitis

2018

Abstract

Human activities are accompanied by characteristic sound events, the processing of which might provide valuable information for automated human activity recognition. This paper presents a novel approach addressing the case where one or more human activities are associated with limited audio data, resulting in a potentially highly imbalanced dataset. Data augmentation is based on transfer learning; more specifically, the proposed method: (a) identifies the classes which are statistically close to the ones associated with limited data; (b) learns a multiple input, multiple output transformation; and (c) transforms the data of the closest classes so that it can be used for modeling the ones associated with limited data. Furthermore, the proposed framework includes a feature set extracted out of signal representations of diverse domains, i.e., temporal, spectral, and wavelet. Extensive experiments demonstrate the relevance of the proposed data augmentation approach under a variety of generative recognition schemes.
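Steps (a) to (c) of the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the authors' method: classes are summarized by diagonal Gaussian statistics, closeness is measured with a symmetric Kullback-Leibler divergence, and the learned multiple input, multiple output transformation is replaced by simple per-dimension moment matching. The function names `gaussian_stats`, `symmetric_kl`, `closest_class`, and `moment_match` are hypothetical.

```python
import numpy as np


def gaussian_stats(x):
    """Per-dimension mean and variance of a (frames, dims) feature matrix."""
    # A small floor keeps the variance strictly positive (assumption).
    return x.mean(axis=0), x.var(axis=0) + 1e-6


def symmetric_kl(stats_p, stats_q):
    """Symmetric KL divergence between two diagonal Gaussians."""
    (mp, vp), (mq, vq) = stats_p, stats_q
    d2 = (mp - mq) ** 2
    kl_pq = 0.5 * np.sum(np.log(vq / vp) + (vp + d2) / vq - 1.0)
    kl_qp = 0.5 * np.sum(np.log(vp / vq) + (vq + d2) / vp - 1.0)
    return kl_pq + kl_qp


def closest_class(target, candidates):
    """Step (a): name of the candidate class statistically closest to target."""
    t = gaussian_stats(target)
    return min(
        candidates,
        key=lambda name: symmetric_kl(t, gaussian_stats(candidates[name])),
    )


def moment_match(source, target):
    """Steps (b) and (c), simplified: shift and scale the source frames so
    their per-dimension mean and variance match those of the target class.
    This stands in for the learned transformation described in the abstract."""
    ms, vs = gaussian_stats(source)
    mt, vt = gaussian_stats(target)
    return (source - ms) / np.sqrt(vs) * np.sqrt(vt) + mt
```

In use, `closest_class(rare_features, {"walking": ..., "typing": ...})` would pick the donor class, and `moment_match(donor_features, rare_features)` would return transformed frames to add to the training data of the under-represented class. The class names here are placeholders.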

	Keywords
	
				echo state network; generalized audio recognition; hidden Markov model; multidomain features; transfer learning
			
	Scientific-disciplinary sectors of the article (display only)
	
				Sector INF/01 - Informatica (Computer Science)
			
	Publication date
	
				Sep 2018
			
	Ahead-of-print or print date
	
				25 Jun 2018
			
	Journal in ANCE
	
				BIOSENSORS
			
	DOI
	
				https://dx.doi.org/10.3390/bios8030060
			
	Type
	
				Article (author)
			
	Appears in types:
	
				01 - Journal article

Files in this item:

File	Type	Access	Size	Format
27 biosensors-08-00060-v2.pdf	Publisher's version/PDF	Open access	365.96 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/580722
