This chapter introduces OntoExtractor, a tool for the semi-automatic generation of taxonomies from a set of documents or data sources. The tool builds the taxonomy bottom-up: starting from a structural analysis of the documents, it generates a set of clusters, which can then be refined by a further grouping based on content analysis. Metadata describing the content of each cluster is generated automatically and analysed by the tool to produce the final taxonomy. A simulation of a tool for maintaining the taxonomy, based on implicit and explicit voting mechanisms, is also described. The author describes a system that can generate a taxonomy from heterogeneous sources of information, using wrappers to convert each document from its original format into a structured one. In this way, OntoExtractor can generate a taxonomy from virtually any source of information simply by adding the appropriate wrapper. Moreover, the trust mechanism provides a reliable method for maintaining the taxonomy and for correcting the wrong classes that are unavoidably generated in it.
|Title:||OntoExtractor: a tool for semi-automatic generation and maintenance of taxonomies from semi-structured documents|
LEIDA, MARCELLO (First author)
|Keywords:||Acquisition; Collaboration; Collaborative; Knowledge; Knowledge creation; Knowledge society; Knowledge worker; Management; Semantic|
|Scientific Disciplinary Sector:||Sector INF/01 - Informatica (Computer Science)|
|Publication date:||2009|
|Type:||Book Part (author)|
|Appears in types:||03 - Contribution in volume|