In the last few years, several repositories for storing XML documents and several languages for querying XML data have been studied and implemented. All the query languages proposed so far return exact answers, but when applied to large XML repositories or warehouses, such precise queries may incur high response times. To overcome this problem, traditional relational warehouses support fast approximate queries built on concise data statistics, based on histograms or sampling techniques. We believe that current XML trends call for extending such approaches to queries over massive XML data sets. In our work we propose a novel approach that summarizes an XML document collection using concise data statistics (e.g., histograms) and thereby allows approximate queries on such data using the standard XQuery language.
Analysis and design of approximate queries over XML documents using statistical techniques / S. Marrara, L. Tanca - In: Proceedings of the VLDB 2003 PhD Workshop, co-located with the 29th International Conference on Very Large Data Bases (VLDB 2003) / edited by M.H. Scholl, T. Grust. - Berlin : [S.n.], 2003. (VLDB PhD Workshop, held in Berlin in 2003.)
Analysis and design of approximate queries over XML documents using statistical techniques
S. Marrara
2003