Recent advances in random linear systems on finite fields have paved the way for the construction of constant-time data structures representing static functions and minimal perfect hash functions using less space with respect to existing techniques. The main obstruction for any practical application of these results is the cubic-time Gaussian elimination required to solve these linear systems: despite they can be made very small, the computation is still too slow to be feasible. In this paper we describe in detail a number of heuristics and programming techniques to speed up the resolution of these systems by several orders of magnitude, making the overall construction competitive with the standard and widely used MWHC technique, which is based on hypergraph peeling. In particular, we introduce broadword programming techniques for fast equation manipulation and a lazy Gaussian elimination algorithm. We also describe a number of technical improvements to the data structure which further reduce space usage and improve lookup speed. Our implementation of these techniques yields a minimal perfect hash function data structure occupying 2.24 bits per element, compared to 2.68 for MWHC-based ones, and a static function data structure which reduces the multiplicative overhead from 1.23 to 1.03

Fast scalable construction of (minimal perfect hash) functions / M. Genuzio, G. Ottaviano, S. Vigna (LECTURE NOTES IN COMPUTER SCIENCE). - In: Experimental Algorithms / [a cura di] A.V. Goldberg, A.S. Kulikov. - Prima edizione. - [s.l] : Springer, 2016. - ISBN 9783319388502. - pp. 339-352 (( Intervento presentato al 15. convegno SEA tenutosi a St. Petersburg nel 2016 [10.1007/978-3-319-38851-9_23].

Fast scalable construction of (minimal perfect hash) functions

M. Genuzio
Primo
;
S. Vigna
Ultimo
2016

Abstract

Recent advances in random linear systems on finite fields have paved the way for the construction of constant-time data structures representing static functions and minimal perfect hash functions using less space with respect to existing techniques. The main obstruction for any practical application of these results is the cubic-time Gaussian elimination required to solve these linear systems: despite they can be made very small, the computation is still too slow to be feasible. In this paper we describe in detail a number of heuristics and programming techniques to speed up the resolution of these systems by several orders of magnitude, making the overall construction competitive with the standard and widely used MWHC technique, which is based on hypergraph peeling. In particular, we introduce broadword programming techniques for fast equation manipulation and a lazy Gaussian elimination algorithm. We also describe a number of technical improvements to the data structure which further reduce space usage and improve lookup speed. Our implementation of these techniques yields a minimal perfect hash function data structure occupying 2.24 bits per element, compared to 2.68 for MWHC-based ones, and a static function data structure which reduces the multiplicative overhead from 1.23 to 1.03
Minimal perfect hash; static functions; Gaussian elimination
Settore INF/01 - Informatica
2016
Book Part (author)
File in questo prodotto:
File Dimensione Formato  
chp%3A10.1007%2F978-3-319-38851-9_23.pdf

accesso riservato

Tipologia: Publisher's version/PDF
Dimensione 337.75 kB
Formato Adobe PDF
337.75 kB Adobe PDF   Visualizza/Apri   Richiedi una copia
Pubblicazioni consigliate

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/2434/423292
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 20
  • ???jsp.display-item.citation.isi??? 14
social impact