Below and Above Why-Provenance for Datalog Queries

Calautti, M.; Livshits, E.; Pieris, A.; Schneider, M.

doi:10.1145/3695829

Datalog is a well-established rule-based language that allows us to express complex recursive queries. As for every other query language, explaining why a result to a Datalog query is obtained is an essential task towards explainable and transparent query evaluation. A standard way of explaining a query result is the so-called why-provenance, which provides information about the witnesses to a query result in the form of subsets of the input database that as a whole can be used to derive that result. The data complexity of why-provenance for Datalog queries has been recently studied and it was shown to be intractable, namely NP-complete even if the recursion is linear. An interesting question that comes up is whether we can ensure tractable data complexity by considering a slightly less informative notion of provenance, in particular, whyminimal-provenance that keeps only subset-minimal witnesses to a query result. Another interesting question is whether we can adopt a more informative notion of provenance than why-provenance, in particular, whymultiplicity-provenance that also keeps track of how many times a certain fact occurs in a witness to a query result, without paying a price in data complexity. This work provides definitive answers to the above questions: (i) whyminimal-provenance ensures tractable data complexity, and (ii) whymultiplicity-provenance can be adopted without paying a price in complexity, apart from one surprising case where the data complexity becomes PSPACE-complete.

Below and Above Why-Provenance for Datalog Queries / M. Calautti, E. Livshits, A. Pieris, M. Schneider. - In: PROCEEDINGS OF THE ACM ON MANAGEMENT OF DATA. - ISSN 2836-6573. - 2:5(2024 Nov 07), pp. 211.1-211.21. [10.1145/3695829]

Below and Above Why-Provenance for Datalog Queries

M. Calautti^Primo;Livshits, Ester;Pieris, Andreas;Schneider, Markus

2024

Abstract

Datalog is a well-established rule-based language that allows us to express complex recursive queries. As for every other query language, explaining why a result to a Datalog query is obtained is an essential task towards explainable and transparent query evaluation. A standard way of explaining a query result is the so-called why-provenance, which provides information about the witnesses to a query result in the form of subsets of the input database that as a whole can be used to derive that result. The data complexity of why-provenance for Datalog queries has been recently studied and it was shown to be intractable, namely NP-complete even if the recursion is linear. An interesting question that comes up is whether we can ensure tractable data complexity by considering a slightly less informative notion of provenance, in particular, whyminimal-provenance that keeps only subset-minimal witnesses to a query result. Another interesting question is whether we can adopt a more informative notion of provenance than why-provenance, in particular, whymultiplicity-provenance that also keeps track of how many times a certain fact occurs in a witness to a query result, without paying a price in data complexity. This work provides definitive answers to the above questions: (i) whyminimal-provenance ensures tractable data complexity, and (ii) whymultiplicity-provenance can be adopted without paying a price in complexity, apart from one surprising case where the data complexity becomes PSPACE-complete.

Scheda breve

Scheda completa

Scheda completa (DC)

	Settori scientifico-disciplinari dell'articolo (validi dal 09/05/2024)
	
				Settore INFO-01/A - Informatica
			
	Data di pubblicazione
	
				7-nov-2024
			
	Rivista in ANCE
	
				PROCEEDINGS OF THE ACM ON MANAGEMENT OF DATA
			
	DOI
	
				https://dx.doi.org/10.1145/3695829
			
	Tipologia
	
				Article (author)
			
	Appare nelle tipologie:
	
				01 - Articolo su periodico

File in questo prodotto:

File	Dimensione	Formato
3695829.pdf accesso riservato Tipologia: Publisher's version/PDF Dimensione 929.96 kB Formato Adobe PDF Visualizza/Apri Richiedi una copia	929.96 kB	Adobe PDF	Visualizza/Apri Richiedi una copia

Pubblicazioni consigliate

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/2434/1157382

Citazioni

ND

ND

ND

0

IRIS Institutional Research Information System - AIR Archivio Istituzionale della Ricerca