This study focuses on the perception of music performances when contextual factors, such as room acoustics and instrument, change. We propose to distinguish the concept of “performance” from that of “interpretation”, which expresses the “artistic intention”. To assess this distinction, we carried out an experimental evaluation in which 91 subjects were invited to listen to various audio recordings created by resynthesizing MIDI data obtained through Automatic Music Transcription (AMT) systems and a sensorized acoustic piano. During the resynthesis, we simulated different contexts and asked listeners to evaluate how much the interpretation changes when the context changes. Results show that: (1) the MIDI format alone is not able to completely capture the artistic intention of a music performance; (2) the usual objective evaluation measures based on MIDI data present low correlations with the average subjective evaluation. To bridge this gap, we propose a novel measure that is meaningfully correlated with the outcome of the tests. In addition, we investigate multimodal machine learning by providing a new score-informed AMT method and propose an approximation algorithm for the p-dispersion problem.
A perceptual measure for evaluating the resynthesis of automatic music transcriptions / F. Simonetta, F. Avanzini, S. Ntalampiras. - In: MULTIMEDIA TOOLS AND APPLICATIONS. - ISSN 1573-7721. - 81:(2022), pp. 32371-32391. [10.1007/s11042-022-12476-0]
A perceptual measure for evaluating the resynthesis of automatic music transcriptions
F. Simonetta (first author); F. Avanzini (second author); S. Ntalampiras (last author)
2022
File | Description | Type | Access | Size | Format
---|---|---|---|---|---
2202.12257.pdf | Accepted Version | Post-print, accepted manuscript, etc. (version accepted by the publisher) | Open access | 967.64 kB | Adobe PDF
11042_2022_Article_12476.pdf | | Publisher's version/PDF | Open access | 1.16 MB | Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.