The computation of a peeling order in a randomly generated hypergraph is the most time-consuming step in a number of constructions, such as perfect hashing schemes, random r-SAT solvers, error-correcting codes, and approximate set encodings. While there exists a straightforward linear time algorithm, its poor I/O performance makes it impractical for hypergraphs whose size exceeds the available internal memory. We show how to reduce the computation of a peeling order to a small number of sequential scans and sorts, and analyze its I/O complexity in the cache-oblivious model. The resulting algorithm requires O(sort(n)) I/Os and O(nlogn) time to peel a random hypergraph with n edges. We experimentally evaluate the performance of our implementation of this algorithm in a real-world scenario by using the construction of minimal perfect hash functions (MPHF) as our test case: our algorithm builds a MPHF of 7.6 billion keys in less than 21 hours on a single machine. The resulting data structure is both more space-efficient and faster than that obtained with the current state-of-the-art MPHF construction for large-scale key sets.
Cache-Oblivious Peeling of Random Hypergraphs / D. Belazzougui, P. Boldi, G. Ottaviano, R. Venturini, S. Vigna - In: 2014 Data Compression Conference (DCC 2014)[s.l] : IEEE, 2014. - ISBN 9781479934188. - pp. 352-361 (( convegno 2014 Data Compression Conference (DCC 2014) tenutosi a Snowbird, Utah, USA nel 2014.
|Titolo:||Cache-Oblivious Peeling of Random Hypergraphs|
BOLDI, PAOLO (Secondo)
VIGNA, SEBASTIANO (Ultimo)
|Parole Chiave:||hypergraph peeling; minimal perfect hash function|
|Settore Scientifico Disciplinare:||Settore INF/01 - Informatica|
|Data di pubblicazione:||2014|
|Digital Object Identifier (DOI):||http://dx.doi.org/10.1109/DCC.2014.48|
|Tipologia:||Book Part (author)|
|Appare nelle tipologie:||03 - Contributo in volume|