Sound classification and localization in service robots with attention mechanisms / M. Bodini - In: Computer-Aided Developments: Electronics and Communication / [edited by] A. Kumar Sinha, J. Pradeep Darsy. - First edition. - Boca Raton : CRC Press, 2019 Sep 30. - ISBN 9780429340710. - pp. 69-76 (1st Annual Conference on Computer-Aided Developments in Electronics and Communication (CADEC-2019) : Amaravati, 2-3 March 2019) [10.1201/9780429340710-9].

Sound classification and localization in service robots with attention mechanisms

M. Bodini
First
2019

Abstract

Human-machine interaction calls for a sophisticated understanding of subjects' behavior by smartphones, home automation and entertainment devices, and many service robots. When interacting with human beings in their environment, a service robot must be able to perceive and process the visual and sound information of the scene it observes. To capture the salient elements in such different signals, many semi-supervised deep learning methods have been proposed. This article proposes a new convolutional neural network endowed with an attention mechanism that not only classifies a sound event but also localizes it in time, in a semi-supervised way.
Service robots; Convolutional neural networks; Deep learning; Semi-supervised learning; Audio pattern recognition
Sector INFO-01/A - Computer Science
30-Sep-2019
Vellore Institute of Technology, Amaravati, India
Book Part (author)
Files in this record:
File: 10.1201:9780429340710-9.pdf
Access: restricted
Type: Publisher's version/PDF
License: No license
Size: 1.13 MB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/1216663
Citations
  • Scopus: ND
  • OpenAlex: 3