Cache-Oblivious Peeling of Random Hypergraphs

Belazzougui, D.; Boldi, P.; Ottaviano, G.; Venturini, R.; Vigna, S.

doi:10.1109/DCC.2014.48

The computation of a peeling order in a randomly generated hypergraph is the most time-consuming step in a number of constructions, such as perfect hashing schemes, random r-SAT solvers, error-correcting codes, and approximate set encodings. While there exists a straightforward linear time algorithm, its poor I/O performance makes it impractical for hypergraphs whose size exceeds the available internal memory. We show how to reduce the computation of a peeling order to a small number of sequential scans and sorts, and analyze its I/O complexity in the cache-oblivious model. The resulting algorithm requires O(sort(n)) I/Os and O(nlogn) time to peel a random hypergraph with n edges. We experimentally evaluate the performance of our implementation of this algorithm in a real-world scenario by using the construction of minimal perfect hash functions (MPHF) as our test case: our algorithm builds a MPHF of 7.6 billion keys in less than 21 hours on a single machine. The resulting data structure is both more space-efficient and faster than that obtained with the current state-of-the-art MPHF construction for large-scale key sets.

Cache-Oblivious Peeling of Random Hypergraphs / D. Belazzougui, P. Boldi, G. Ottaviano, R. Venturini, S. Vigna - In: 2014 Data Compression Conference (DCC 2014)[s.l] : IEEE, 2014. - ISBN 9781479934188. - pp. 352-361 (( convegno 2014 Data Compression Conference (DCC 2014) tenutosi a Snowbird, Utah, USA nel 2014 [10.1109/DCC.2014.48].

Cache-Oblivious Peeling of Random Hypergraphs

D. Belazzougui;P. Boldi^Secondo;G. Ottaviano;R. Venturini;S. Vigna^Ultimo

2014

Abstract

The computation of a peeling order in a randomly generated hypergraph is the most time-consuming step in a number of constructions, such as perfect hashing schemes, random r-SAT solvers, error-correcting codes, and approximate set encodings. While there exists a straightforward linear time algorithm, its poor I/O performance makes it impractical for hypergraphs whose size exceeds the available internal memory. We show how to reduce the computation of a peeling order to a small number of sequential scans and sorts, and analyze its I/O complexity in the cache-oblivious model. The resulting algorithm requires O(sort(n)) I/Os and O(nlogn) time to peel a random hypergraph with n edges. We experimentally evaluate the performance of our implementation of this algorithm in a real-world scenario by using the construction of minimal perfect hash functions (MPHF) as our test case: our algorithm builds a MPHF of 7.6 billion keys in less than 21 hours on a single machine. The resulting data structure is both more space-efficient and faster than that obtained with the current state-of-the-art MPHF construction for large-scale key sets.

Scheda breve

Scheda completa

Scheda completa (DC)

	Parole chiave
	
			hypergraph peeling; minimal perfect hash function
		
	Settori scientifico-disciplinari del contributo
	
			Settore INF/01 - Informatica
		
	Data di pubblicazione
	
			2014
		
	DOI
	
			https://dx.doi.org/10.1109/DCC.2014.48
		
	Tipologia
	
			Book Part (author)
		
	Appare nelle tipologie:
	
			03 - Contributo in volume

File in questo prodotto:

Non ci sono file associati a questo prodotto.

Pubblicazioni consigliate

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/2434/243463

Citazioni

ND

19

14

IRIS Institutional Research Information System - AIR Archivio Istituzionale della Ricerca