Datalog is a well-established rule-based language that allows us to express complex recursive queries. As for every other query language, explaining why a result to a Datalog query is obtained is an essential task towards explainable and transparent query evaluation. A standard way of explaining a query result is the so-called why-provenance, which provides information about the witnesses to a query result in the form of subsets of the input database that as a whole can be used to derive that result. The data complexity of why-provenance for Datalog queries has been recently studied and it was shown to be intractable, namely NP-complete even if the recursion is linear. An interesting question that comes up is whether we can ensure tractable data complexity by considering a slightly less informative notion of provenance, in particular, whyminimal-provenance that keeps only subset-minimal witnesses to a query result. Another interesting question is whether we can adopt a more informative notion of provenance than why-provenance, in particular, whymultiplicity-provenance that also keeps track of how many times a certain fact occurs in a witness to a query result, without paying a price in data complexity. This work provides definitive answers to the above questions: (i) whyminimal-provenance ensures tractable data complexity, and (ii) whymultiplicity-provenance can be adopted without paying a price in complexity, apart from one surprising case where the data complexity becomes PSPACE-complete.
Below and Above Why-Provenance for Datalog Queries / M. Calautti, E. Livshits, A. Pieris, M. Schneider. - In: PROCEEDINGS OF THE ACM ON MANAGEMENT OF DATA. - ISSN 2836-6573. - 2:5(2024 Nov 07), pp. 211.1-211.21. [10.1145/3695829]
Below and Above Why-Provenance for Datalog Queries
M. CalauttiPrimo
;
2024
Abstract
Datalog is a well-established rule-based language that allows us to express complex recursive queries. As for every other query language, explaining why a result to a Datalog query is obtained is an essential task towards explainable and transparent query evaluation. A standard way of explaining a query result is the so-called why-provenance, which provides information about the witnesses to a query result in the form of subsets of the input database that as a whole can be used to derive that result. The data complexity of why-provenance for Datalog queries has been recently studied and it was shown to be intractable, namely NP-complete even if the recursion is linear. An interesting question that comes up is whether we can ensure tractable data complexity by considering a slightly less informative notion of provenance, in particular, whyminimal-provenance that keeps only subset-minimal witnesses to a query result. Another interesting question is whether we can adopt a more informative notion of provenance than why-provenance, in particular, whymultiplicity-provenance that also keeps track of how many times a certain fact occurs in a witness to a query result, without paying a price in data complexity. This work provides definitive answers to the above questions: (i) whyminimal-provenance ensures tractable data complexity, and (ii) whymultiplicity-provenance can be adopted without paying a price in complexity, apart from one surprising case where the data complexity becomes PSPACE-complete.| File | Dimensione | Formato | |
|---|---|---|---|
|
3695829.pdf
accesso riservato
Tipologia:
Publisher's version/PDF
Dimensione
929.96 kB
Formato
Adobe PDF
|
929.96 kB | Adobe PDF | Visualizza/Apri Richiedi una copia |
Pubblicazioni consigliate
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.




