The computation of a peeling order in a randomly generated hypergraph is the most time-consuming step in a number of constructions, such as perfect hashing schemes, random r-SAT solvers, error-correcting codes, and approximate set encodings. While there exists a straightforward linear time algorithm, its poor I/O performance makes it impractical for hypergraphs whose size exceeds the available internal memory. We show how to reduce the computation of a peeling order to a small number of sequential scans and sorts, and analyze its I/O complexity in the cache-oblivious model. The resulting algorithm requires O(sort(n)) I/Os and O(nlogn) time to peel a random hypergraph with n edges. We experimentally evaluate the performance of our implementation of this algorithm in a real-world scenario by using the construction of minimal perfect hash functions (MPHF) as our test case: our algorithm builds a MPHF of 7.6 billion keys in less than 21 hours on a single machine. The resulting data structure is both more space-efficient and faster than that obtained with the current state-of-the-art MPHF construction for large-scale key sets.
Cache-Oblivious Peeling of Random Hypergraphs / D. Belazzougui, P. Boldi, G. Ottaviano, R. Venturini, S. Vigna - In: 2014 Data Compression Conference (DCC 2014)[s.l] : IEEE, 2014. - ISBN 9781479934188. - pp. 352-361 (( convegno 2014 Data Compression Conference (DCC 2014) tenutosi a Snowbird, Utah, USA nel 2014.
Titolo: | Cache-Oblivious Peeling of Random Hypergraphs |
Autori: | BOLDI, PAOLO (Secondo) VIGNA, SEBASTIANO (Ultimo) |
Parole Chiave: | hypergraph peeling; minimal perfect hash function |
Settore Scientifico Disciplinare: | Settore INF/01 - Informatica |
Data di pubblicazione: | 2014 |
Digital Object Identifier (DOI): | http://dx.doi.org/10.1109/DCC.2014.48 |
Tipologia: | Book Part (author) |
Appare nelle tipologie: | 03 - Contributo in volume |