

The APPROXML tool demonstration / E. Damiani, N. Lavarini, S. Marrara, B. Oliboni, D. Pasini, L. Tanca, G. Viviani. - In: Advances in database technology, EDBT 2002 : 8. International conference on extending database technology : Prague, Czech Republic, March 25-27, 2002 : proceedings / edited by Christian S. Jensen ... [et al.]. - Berlin : Springer, 2002. - ISBN 3540433244. - pp. 187-200. Paper presented at the 8th International Conference on Extending Database Technology, held in Prague in 2002 [10.1007/3-540-45876-X_52].

The APPROXML tool demonstration

E. Damiani; N. Lavarini; S. Marrara; B. Oliboni; D. Pasini; L. Tanca; G. Viviani

2002

Abstract

XML information items collected from heterogeneous sources often carry similar semantics but turn out to be structured in different ways. These variations in structure make it hard to search effectively for information across multiple data sources. Our approach aims at a flexible search and processing technique, capable of extracting relevant information from a possibly huge set of XML documents. ApproXML is a software tool supporting approximate pattern-based querying, able to locate and extract XML information while dealing flexibly with differences in structure and tag vocabulary. Our method relies on representing XML documents as graphs, through a variant of the DOM model. The relevant information is selected as follows [Dam00a]: first, an XML pattern, i.e. a partially specified subtree, is provided by the user. Then, the XML documents of the target dataset are scanned; XML fragments are located and sorted according to their similarity to the pattern.
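The select-and-rank idea in the abstract can be illustrated with a toy sketch. This is not the ApproXML algorithm: the similarity measure below (the fraction of pattern nodes found in a fragment, where a pattern child may match a descendant at any depth) and the names `match_score`, `rank_fragments`, and `threshold` are assumptions made purely for illustration. Only tag names are compared; text content and tag-vocabulary differences are ignored.

```python
import xml.etree.ElementTree as ET


def _match(pattern, node):
    """Return (matched, total) pattern nodes, assuming pattern root matches node."""
    matched, total = 1, 1
    for p_child in pattern:
        total += sum(1 for _ in p_child.iter())  # size of this pattern subtree
        best = 0
        # Approximate structure: any descendant with the same tag is a candidate,
        # so extra nesting levels in the document do not prevent a match.
        for cand in node.iter(p_child.tag):
            if cand is node:
                continue
            m, _ = _match(p_child, cand)
            best = max(best, m)
        matched += best
    return matched, total


def match_score(pattern, node):
    """Toy similarity in [0, 1]: fraction of pattern nodes found under node."""
    if pattern.tag != node.tag:
        return 0.0
    matched, total = _match(pattern, node)
    return matched / total


def rank_fragments(pattern_xml, documents, threshold=0.5):
    """Scan documents and return (score, fragment) pairs, best match first."""
    pattern = ET.fromstring(pattern_xml)
    results = []
    for doc in documents:
        root = ET.fromstring(doc)
        for node in root.iter(pattern.tag):
            score = match_score(pattern, node)
            if score >= threshold:
                results.append((score, node))
    results.sort(key=lambda pair: pair[0], reverse=True)
    return results


if __name__ == "__main__":
    docs = [
        "<library><book><title>XML</title><author>A</author></book></library>",
        "<library><book><info><title>DB</title></info></book></library>",
        "<library><book><price>10</price></book></library>",
    ]
    for score, fragment in rank_fragments("<book><title/><author/></book>", docs):
        print(round(score, 2), ET.tostring(fragment, encoding="unicode"))
```

In this sketch the second document still matches the pattern partially, because its `title` element is nested one level deeper than the pattern specifies, while the third falls below the assumed threshold and is dropped.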


	Scientific-disciplinary sectors of the contribution (display only)
				Sector INF/01 - Computer Science
	Publication date
				2002
	DOI
				https://dx.doi.org/10.1007/3-540-45876-X_52
	Type
				Book Part (author)
	Appears in types:
				03 - Contribution in a volume



Use this identifier to cite or link to this document: https://hdl.handle.net/2434/49503
