This chapter introduces OntoExtractor, a tool for semi-automatic generation of taxonomy from a set of documents or data sources. The tool generates the taxonomy in a bottom-up fashion: starting from structural analysis of the documents, it generates a set of clusters, which can be refined by a further grouping generated by content analysis. Metadata describing the content of each cluster is automatically generated and analysed by the tool for generating the final taxonomy. A simulation of a tool, based on implicit and explicit voting mechanism, for the maintenance of the taxonomy is also described. The author describes a system that can be used to generate taxonomy from a heterogeneous source of information, using wrappers for converting the original format of the document to a structured one. This way OntoExtractor can virtually generate taxonomy from any source of information just adding the proper wrapper. Moreover, the trust mechanism allows a reliable method for maintaining the taxonomy and for overcoming the unavoidable generation of wrong classes in the taxonomy.

OntoExtractor : a tool for semi-automatic generation and maintenance of taxonomies from semi-structured documents / M. Leida - In: Semantic knowledge management : an ontology-based framework / [a cura di] Antonio Zilli ... [et al.]. - Hershey : Information science reference, 2009. - ISBN 9781605660349.

OntoExtractor : a tool for semi-automatic generation and maintenance of taxonomies from semi-structured documents

M. Leida
Primo
2009

Abstract

This chapter introduces OntoExtractor, a tool for semi-automatic generation of taxonomy from a set of documents or data sources. The tool generates the taxonomy in a bottom-up fashion: starting from structural analysis of the documents, it generates a set of clusters, which can be refined by a further grouping generated by content analysis. Metadata describing the content of each cluster is automatically generated and analysed by the tool for generating the final taxonomy. A simulation of a tool, based on implicit and explicit voting mechanism, for the maintenance of the taxonomy is also described. The author describes a system that can be used to generate taxonomy from a heterogeneous source of information, using wrappers for converting the original format of the document to a structured one. This way OntoExtractor can virtually generate taxonomy from any source of information just adding the proper wrapper. Moreover, the trust mechanism allows a reliable method for maintaining the taxonomy and for overcoming the unavoidable generation of wrong classes in the taxonomy.
Acquisition ; collaboration ; Collaborative ; Knowledge ; Knowledge creation ; Knowledge society ; Knowledge worker ; Management ; Semantic.
Settore INF/01 - Informatica
2009
Book Part (author)
File in questo prodotto:
Non ci sono file associati a questo prodotto.
Pubblicazioni consigliate

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/2434/50284
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus ND
  • ???jsp.display-item.citation.isi??? ND
social impact