IRIS Institutional Research Information System - AIR Archivio Istituzionale della Ricerca

Minimal perfect hash functions have been shown to be useful to compress data in several data management tasks. In particular, order-preserving minimal perfect hash functions (Fox et al. 1991) have been used to retrieve the position of a key in a given list of keys; however, the ability to preserve any given order leads to an unavoidable Ω(n log n) lower bound on the number of bits required to store the function. Recently, it was observed (Belazzougui et al. 2009) that very frequently the keys to be hashed are sorted in their intrinsic (i.e., lexicographical) order. This is typically the case of dictionaries of search engines, list of URLs of Web graphs, and so on. We refer to this restricted version of the problem as monotone minimal perfect hashing. We analyze experimentally the data structures proposed in Belazzougui et al. [2009], and along our way we propose some new methods that, albeit asymptotically equivalent or worse, perform very well in practice and provide a balance between access speed, ease of construction, and space usage.

Theory and practice of monotone minimal perfect hashing / D. Belazzougui, P. Boldi, R. Pagh, S. Vigna. - In: ACM JOURNAL OF EXPERIMENTAL ALGORITHMICS. - ISSN 1084-6654. - 16:(2011), pp. 2025378.3.2:1-2025378.3.2:26. [10.1145/1963190.2025378]

Theory and practice of monotone minimal perfect hashing

D. Belazzougui;P. Boldi;R. Pagh;S. Vigna

2011

Abstract

Minimal perfect hash functions have been shown to be useful to compress data in several data management tasks. In particular, order-preserving minimal perfect hash functions (Fox et al. 1991) have been used to retrieve the position of a key in a given list of keys; however, the ability to preserve any given order leads to an unavoidable Ω(n log n) lower bound on the number of bits required to store the function. Recently, it was observed (Belazzougui et al. 2009) that very frequently the keys to be hashed are sorted in their intrinsic (i.e., lexicographical) order. This is typically the case of dictionaries of search engines, list of URLs of Web graphs, and so on. We refer to this restricted version of the problem as monotone minimal perfect hashing. We analyze experimentally the data structures proposed in Belazzougui et al. [2009], and along our way we propose some new methods that, albeit asymptotically equivalent or worse, perform very well in practice and provide a balance between access speed, ease of construction, and space usage.

Scheda breve

Scheda completa

Scheda completa (DC)

	Parole chiave
	
				Monotone minimal perfect hashing; Succinct data structures; Very large databases
			
	Settori scientifico-disciplinari dell'articolo (sola visualizzazione)
	
				Settore INF/01 - Informatica
			
	Data di pubblicazione
	
				2011
			
	Rivista in ANCE
	
				ACM JOURNAL OF EXPERIMENTAL ALGORITHMICS
			
	DOI
	
				https://dx.doi.org/10.1145/1963190.2025378
			
	Tipologia
	
				Article (author)
			
	Appare nelle tipologie:
	
				01 - Articolo su periodico

File in questo prodotto:

File	Dimensione	Formato
a3_2-belazzougui.pdf accesso riservato Tipologia: Publisher's version/PDF Dimensione 391.35 kB Formato Adobe PDF Visualizza/Apri Richiedi una copia	391.35 kB	Adobe PDF	Visualizza/Apri Richiedi una copia

Pubblicazioni consigliate

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/2434/297865

Citazioni

ND

30

ND

social impact