The neighbourhood function NG(t) of a graph G gives, for each t, the number of pairs of nodes <x, y> such that y is reachable from x in less that t hops. The neighbourhood function provides a wealth of information about the graph (e.g., it easily allows one to compute its diameter), but it is very expensive to compute it exactly. Recently, the ANF algorithm (approximate neighbourhood function) has been proposed with the purpose of approximating NG(t) on large graphs. We describe a breakthrough improvement over ANF in terms of speed and scalability. Our algorithm, called HyperANF, uses the new HyperLogLog counters and combines them efficiently through broadword programming; our implementation uses overdecomposition to exploit multi-core parallelism. With HyperANF, for the first time we can compute in a few hours the neighbourhood function of graphs with billions of nodes with a small error and good confidence using a standard workstation. Then, we turn to the study of the distribution of the distances between reachable nodes (that can be efficiently approximated by means of HyperANF), and discover the surprising fact that its index of dispersion provides a clear-cut characterisation of proper social networks vs. web graphs. We thus propose the spid (Shortest-Paths Index of Dispersion) of a graph as a new, informative statistics that is able to discriminate between the above two types of graphs. We believe this is the first proposal of a significant new non-local structural index for complex networks whose computation is highly scalable.

HyperANF : approximating the neighbourhood function of very large graphs on a budget / P. Boldi, M. Rosa, S. Vigna - In: Proceedings of the 20th international conference on World Wide WebNew York : ACM Press, 2011. - ISBN 9781450306324. - pp. 625-634 (( Intervento presentato al 20th. convegno International World Wide Web Conference tenutosi a Hyderabad, India nel 2011 [10.1145/1963405.1963493].

HyperANF : approximating the neighbourhood function of very large graphs on a budget

P. Boldi
Primo
;
M. Rosa
Secondo
;
S. Vigna
Ultimo
2011

Abstract

The neighbourhood function NG(t) of a graph G gives, for each t, the number of pairs of nodes such that y is reachable from x in less that t hops. The neighbourhood function provides a wealth of information about the graph (e.g., it easily allows one to compute its diameter), but it is very expensive to compute it exactly. Recently, the ANF algorithm (approximate neighbourhood function) has been proposed with the purpose of approximating NG(t) on large graphs. We describe a breakthrough improvement over ANF in terms of speed and scalability. Our algorithm, called HyperANF, uses the new HyperLogLog counters and combines them efficiently through broadword programming; our implementation uses overdecomposition to exploit multi-core parallelism. With HyperANF, for the first time we can compute in a few hours the neighbourhood function of graphs with billions of nodes with a small error and good confidence using a standard workstation. Then, we turn to the study of the distribution of the distances between reachable nodes (that can be efficiently approximated by means of HyperANF), and discover the surprising fact that its index of dispersion provides a clear-cut characterisation of proper social networks vs. web graphs. We thus propose the spid (Shortest-Paths Index of Dispersion) of a graph as a new, informative statistics that is able to discriminate between the above two types of graphs. We believe this is the first proposal of a significant new non-local structural index for complex networks whose computation is highly scalable.
Effective diameter; Neighbourhood function; Probabilistic counters; Shortest paths
Settore INF/01 - Informatica
2011
Book Part (author)
File in questo prodotto:
Non ci sono file associati a questo prodotto.
Pubblicazioni consigliate

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/2434/156430
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 91
  • ???jsp.display-item.citation.isi??? ND
social impact