This paper addresses the classification of an infant's state based on the content of the associated vocalizations. The problem belongs to the paralinguistic audio signal processing domain and has not been studied extensively. Because it involves atypical vocal expressions, i.e. sounds produced by the infant under stressful conditions, we employed the Teager Energy Operator combined with Mel Frequency Cepstral Coefficients. However, these two feature sets may contain overlapping information and therefore yield a redundant feature set if used concurrently. To capture the most discriminative information with the minimum number of dimensions, we applied canonical correlation analysis to the extracted feature sets. Canonical correlation analysis searches for the correlations between two sets of multidimensional variables and projects them onto a lower-dimensional space in which they are maximally correlated. Subsequently, we model the feature space using a Support Vector Machine with a linear kernel. We thoroughly evaluated the proposed methodology on a real-world dataset and present the results in the form of a confusion matrix. The dataset includes the following five states: a) hungry, b) uncomfortable (needs a change), c) needs to burp, d) in pain, e) needs to sleep. Ultimately, the goal is for the system to become an automatic and non-invasive framework for monitoring infants and for helping pediatricians better understand their status.
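As an illustration of two steps named in the abstract, the following is a minimal NumPy sketch of the discrete Teager Energy Operator and of canonical correlation analysis between two feature sets. It is not the authors' implementation. The function names, the regularization constant `reg`, and the choice to fuse the two projected sets by concatenation are assumptions made for this example, since the abstract does not specify them. The feature extraction itself (MFCC computation, framing) and the linear SVM stage are omitted.

```python
import numpy as np


def teager_energy(x):
    """Discrete Teager Energy Operator: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    Returns an array two samples shorter than the input, because the
    operator is undefined at the first and last sample.
    """
    x = np.asarray(x, dtype=float)
    return x[1:-1] ** 2 - x[:-2] * x[2:]


def cca(X, Y, n_components, reg=1e-8):
    """Canonical correlation analysis of X (n x p) and Y (n x q).

    Returns projection matrices Wx (p x k), Wy (q x k) and the k largest
    canonical correlations. `reg` is a small ridge term (an assumption of
    this sketch) that keeps the covariance matrices invertible.
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    Sxx = X.T @ X / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Y.T @ Y / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = X.T @ Y / (n - 1)

    # Whiten both sets with Cholesky factors: Sxx = Lx Lx^T, Syy = Ly Ly^T.
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)

    # K = Lx^{-1} Sxy Ly^{-T}; its singular values are the canonical correlations.
    K = np.linalg.solve(Lx, Sxy)
    K = np.linalg.solve(Ly, K.T).T
    U, s, Vt = np.linalg.svd(K, full_matrices=False)

    # Map the singular vectors back to the original feature coordinates.
    Wx = np.linalg.solve(Lx.T, U[:, :n_components])
    Wy = np.linalg.solve(Ly.T, Vt.T[:, :n_components])
    return Wx, Wy, s[:n_components]


def fuse_features(X, Y, n_components):
    """Project both centered sets onto their canonical directions and concatenate.

    Concatenation is one common fusion choice and is an assumption here.
    """
    Wx, Wy, _ = cca(X, Y, n_components)
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    return np.hstack([Xc @ Wx, Yc @ Wy])
```

In this sketch `X` and `Y` would stand for per-recording MFCC and Teager-energy-based feature matrices with one row per recording, and `n_components` must not exceed the smaller of the two feature dimensions. The fused output could then be passed to any linear-kernel SVM implementation for the five-class problem.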
Canonical correlation analysis for classifying baby crying sound events / S. Ntalampiras, I. Potamitis - In: 22nd International Congress on Sound and Vibration, ICSV 2015 / [edited by] M.J. Crocker, M. Pawelczyk, F. Pedrielli, E. Carletti, S. Luzzi. - [s.l.] : International Institute of Acoustics and Vibrations, 2015. - ISBN 9788888942483. - pp. 1-7 (Paper presented at the 22nd International Congress on Sound and Vibration, held in Florence in 2015).
Canonical correlation analysis for classifying baby crying sound events
S. Ntalampiras; I. Potamitis
2015