In the last few years, several repositories for storing XML documents and several languages for querying XML data have been studied and implemented. All the query languages proposed so far return exact answers, but when applied to large XML repositories or warehouses, such precise queries may incur long response times. To overcome this problem, traditional relational warehouses support fast approximate queries built on concise data statistics derived from histograms or sampling techniques. In this paper we propose a novel approach to summarizing an XML document collection that takes into account the hierarchical structure of XML documents, which makes the summarization process substantially more difficult than in the case of flat, relational data.