On the Choice of General Purpose Classifiers in Learned Bloom Filters: An Initial Analysis Within Basic Filters / M. Frasca, D. Malchiodi, R. Giancarlo, D. Raimondi, G. Fumagalli. - In: Proceedings of the 11th International Conference on Pattern Recognition Applications and Methods / edited by M. De Marsico, G. Sanniti di Baja, A. Fred. - First edition. - [s.l.] : SciTePress, 2022. - ISBN 978-989-758-549-4. - pp. 675-682. (Paper presented at the 11th International Conference on Pattern Recognition Applications and Methods, ICPRAM 2022, held online in 2022.) [10.5220/0010889000003122]

On the Choice of General Purpose Classifiers in Learned Bloom Filters: An Initial Analysis Within Basic Filters

M. Frasca (first author); D. Malchiodi (second author)
2022

Abstract

Bloom Filters are a fundamental and pervasive data structure. Within the growing area of Learned Data Structures, several Learned versions of Bloom Filters have been considered, yielding advantages over classic Filters. Each of them uses a classifier, which is the Learned part of the data structure. Although the classifier plays a central role in these new filters, and its space footprint and classification time may affect the performance of the Learned Filter, no systematic study is available of which specific classifier to use in which circumstances. We report progress in this area, also providing initial guidelines on which classifier to choose among five classic classification paradigms.
Keywords: Learned Bloom Filters; Learned Data Structures; Information Retrieval; Classification
Sector INF/01 - Computer Science
   Multi-criteria optimized data structures: from compressed indexes to learned indexes, and beyond
   MINISTERO DELL'ISTRUZIONE E DEL MERITO
   2017WR7SHH_004
2022
Book Part (author)
Files in this item:
preprint_paper_76_cameraReady.pdf (restricted access)
Description: Main article
Type: Pre-print (manuscript submitted to the publisher)
Size: 261.92 kB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/912419
Citations
  • Scopus 6
  • ISI (Web of Science) 5