We propose a novel general technique for pruning and cleansing the Wikipedia category hierarchy, with a tunable level of aggregation. Our approach is endogenous: it does not use any information from Wikipedia articles, and is based solely on the user-generated (noisy) Wikipedia category folksonomy itself. We show how the proposed techniques can help reduce the level of noise in the hierarchy, and discuss how alternative centrality measures can affect the result differently.
Cleansing Wikipedia categories using centrality / P. Boldi, C. Monti. - In: WWW '16 Companion : proceedings. - [s.l] : ACM, 2016. - ISBN 9781450341448. - pp. 969-974. (Paper presented at the 25th International Conference Companion on World Wide Web, held in Montréal in 2016) [10.1145/2872518.2891111].
Cleansing Wikipedia categories using centrality
P. Boldi (first author)
C. Monti (last author)
2016
File | Access | Type | Size | Format
---|---|---|---|---
p969-boldi.pdf | Restricted | Publisher's version/PDF | 1.02 MB | Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.