The Open Wikipedia Ranking is an open dataset, published yearly, that ranks Wikipedia pages according to centrality measures computed on the whole Wikipedia graph for that year. In this paper, ten years after the project started, we report some details, results, and anecdotal observations on this dataset. The goal of the Open Wikipedia Ranking is to provide a completely open and reproducible ranking of Wikipedia pages based on indegree, PageRank, harmonic centrality, and page views; the Wikipedia graphs themselves are also made available by the Laboratory of Web Algorithmics. What characterizes the Open Wikipedia Ranking is that the entire process of graph construction and ranking is meticulously documented and reproducible. All computations are based on open-source Java software and algorithms from the literature. Thus, the reason for the centrality score of any page can be traced back exactly to structural properties of the graph.
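As an illustration of the three graph-based measures named in the abstract, the following is a minimal Python sketch on a toy link graph. It is not the project's pipeline, which the abstract says is built on open-source Java software; the function names, the `toy_graph` data, the damping factor of 0.85, and the fixed iteration count are all assumptions made for this example. The graph is assumed to be a dictionary in which every link target also appears as a key.

```python
from collections import deque


def indegree(graph):
    """Count the incoming links of each node."""
    scores = {node: 0 for node in graph}
    for targets in graph.values():
        for target in targets:
            scores[target] += 1
    return scores


def pagerank(graph, damping=0.85, iterations=100):
    """Power iteration with uniform teleportation.

    The damping factor and iteration count are illustrative assumptions.
    The rank of nodes without outgoing links is spread uniformly.
    """
    n = len(graph)
    rank = {node: 1.0 / n for node in graph}
    for _ in range(iterations):
        dangling = sum(rank[node] for node, targets in graph.items() if not targets)
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in graph}
        for node, targets in graph.items():
            for target in targets:
                new_rank[target] += damping * rank[node] / len(targets)
        rank = new_rank
    return rank


def harmonic_centrality(graph):
    """For each node x, sum 1/d(y, x) over all other nodes y that can reach x."""
    # Build the transposed graph so that a BFS from x follows links backwards
    # and therefore yields the distances *towards* x.
    reverse = {node: [] for node in graph}
    for node, targets in graph.items():
        for target in targets:
            reverse[target].append(node)

    scores = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            current = queue.popleft()
            for pred in reverse[current]:
                if pred not in dist:
                    dist[pred] = dist[current] + 1
                    queue.append(pred)
        # Unreachable nodes contribute 0, so they are simply absent from dist.
        scores[source] = sum(1.0 / d for d in dist.values() if d > 0)
    return scores


# Hypothetical four-page link graph: page -> pages it links to.
toy_graph = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}
```

On this toy graph, page `C` comes out on top under all three measures, because it is linked directly by every other page. Each score follows from the link structure alone, which is the kind of traceability the abstract describes.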
Ten Years of Open Wikipedia Ranking / P. Boldi, F. Furia, S. Vigna. - In: WWW '25: Companion / edited by G. Long, M. Blumestein. - [s.l.] : ACM, 2025. - ISBN 979-8-4007-1331-6. - pp. 883-887 (conference: The ACM on Web Conference, held in Sydney in 2025) [10.1145/3701716.3715510].
Ten Years of Open Wikipedia Ranking
P. Boldi; F. Furia; S. Vigna
2025
| File | Access | Type | Size | Format |
|---|---|---|---|---|
| main.pdf | Restricted access | Pre-print (manuscript submitted to the publisher) | 602.72 kB | Adobe PDF |
| 3701716.3715510.pdf | Open access | Publisher's version/PDF | 1.04 MB | Adobe PDF |




