A perceptual measure for evaluating the resynthesis of automatic music transcriptions / F. Simonetta, F. Avanzini, S. Ntalampiras. - In: MULTIMEDIA TOOLS AND APPLICATIONS. - ISSN 1573-7721. - 81:(2022), pp. 32371-32391. [10.1007/s11042-022-12476-0]

A perceptual measure for evaluating the resynthesis of automatic music transcriptions

F. Simonetta (first author); F. Avanzini (second author); S. Ntalampiras (last author)
2022

Abstract

This study focuses on the perception of music performances when contextual factors, such as room acoustics and instrument, change. We propose to distinguish the concept of “performance” from that of “interpretation”, which expresses the “artistic intention”. To assess this distinction, we carried out an experimental evaluation in which 91 subjects listened to audio recordings created by resynthesizing MIDI data obtained through Automatic Music Transcription (AMT) systems and a sensorized acoustic piano. During the resynthesis, we simulated different contexts and asked listeners to evaluate how much the interpretation changes when the context changes. Results show that: (1) the MIDI format alone cannot fully capture the artistic intention of a music performance; (2) the usual objective evaluation measures based on MIDI data show low correlations with the average subjective evaluation. To bridge this gap, we propose a novel measure that is meaningfully correlated with the outcome of the tests. In addition, we investigate multimodal machine learning by providing a new score-informed AMT method and propose an approximation algorithm for the p-dispersion problem.
Keywords: Audio resynthesis; Automatic music transcription; Music information retrieval; Music perception
Sector INF/01 - Computer Science
Sector ING-INF/06 - Electronic and Information Bioengineering
Sector L-ART/07 - Musicology and History of Music
Article (author)
Files in this record:

2202.12257.pdf
Access: open access
Description: Accepted Version
Type: Post-print, accepted manuscript, etc. (version accepted by the publisher)
Size: 967.64 kB
Format: Adobe PDF

11042_2022_Article_12476.pdf
Access: open access
Type: Publisher's version/PDF
Size: 1.16 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/945977