Crowd-assessing quality in uncertain data linking datasets

Faria, D.; Ferrara, A.; Jimenez-Ruiz, E.; Montanelli, S.; Pesquita, C.

doi:10.1017/S0269888920000363

The quality of a dataset used for evaluating data linking methods, techniques, and tools depends on the availability of a set of mappings, called the reference alignment, that is known to be correct. In particular, it is crucial that mappings represent relations between pairs of entities that are indeed similar because they denote the same object. Since the reliability of mappings is decisive for a fair evaluation of automatic linking methods and tools, we call this property of mappings mapping fairness. In this article, we propose a crowd-based approach, called Crowd Quality (CQ), for assessing the quality of data linking datasets by measuring the fairness of the mappings in the reference alignment. Moreover, we present a real experiment in which we evaluate two state-of-the-art data linking tools before and after refining the reference alignment with the CQ approach, in order to show the benefits of crowd assessment of mapping fairness.

Crowd-assessing quality in uncertain data linking datasets / D. Faria, A. Ferrara, E. Jimenez-Ruiz, S. Montanelli, C. Pesquita. - In: KNOWLEDGE ENGINEERING REVIEW. - ISSN 0269-8889. - 35(2020), pp. e33.1-e33.25. [10.1017/S0269888920000363]



Scientific-disciplinary sectors of the article: Sector INF/01 - Computer Science
Publication date: 2020
Journal in ANCE: KNOWLEDGE ENGINEERING REVIEW
DOI: https://dx.doi.org/10.1017/S0269888920000363
Type: Article (author)
Appears in types: 01 - Journal article

Files in this item:

File	Size	Format
crowdassessing_quality_in_uncertain_data_linking_datasets.pdf (restricted access; type: Publisher's version/PDF)	3.01 MB	Adobe PDF


Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/786659
