It is well known that the performance of Bloom Filters is essentially independent of the data used to query them, but this no longer holds for Learned Bloom Filters. In this work we analyze how the performance of such learned data structures is affected by the classifier chosen to build the filter and by the complexity of the dataset used in the training phase. This analysis, which has not been proposed so far in the literature, considers three key performance indicators: space efficiency, false positive rate, and reject time. By screening various implementations of Learned Bloom Filters, our experimental study shows that only one of these implementations exhibits higher robustness to classifier performance and to noisy data, and that only two families of classifiers have desirable properties with respect to these performance indicators.

A Critical Analysis of Classifier Selection in Learned Bloom Filters: The Essentials / D. Malchiodi, D. Raimondi, G. Fumagalli, R. Giancarlo, M. Frasca (COMMUNICATIONS IN COMPUTER AND INFORMATION SCIENCE). - In: Engineering Applications of Neural Networks / edited by L. Iliadis, I. Maglogiannis, S. Alonso, C. Jayne, E. Pimenidis. - [s.l.] : Springer Nature, 2023. - ISBN 978-3-031-34203-5. - pp. 47-61 (Paper presented at the 24th International Conference, EAAAI/EANN, held in León in 2023) [10.1007/978-3-031-34204-2_5].

A Critical Analysis of Classifier Selection in Learned Bloom Filters: The Essentials

D. Malchiodi; M. Frasca
2023

English
Learned Bloom filters; Data complexity; Learned data structures
Sector INF/01 - Computer Science
Conference contribution
Anonymous reviewers
Basic research
Scientific publication
   Multi-criteria optimized data structures: from compressed indexes to learned indexes, and beyond
   MINISTERO DELL'ISTRUZIONE E DEL MERITO
   2017WR7SHH_004
Engineering Applications of Neural Networks
L. Iliadis, I. Maglogiannis, S. Alonso, C. Jayne, E. Pimenidis
Springer Nature
2023
pp. 47-61 (15 pages)
978-3-031-34203-5
978-3-031-34204-2
1826
Volume with international distribution
International Conference, EAAAI/EANN
León
2023
24
International conference
Submitted contribution
D. Malchiodi, D. Raimondi, G. Fumagalli, R. Giancarlo, M. Frasca
Book Part (author)
info:eu-repo/semantics/bookPart
Research products::03 - Book contribution
Files in this item:
EANN-2023-published.pdf (restricted access)

Description: Published work
Type: Publisher's version/PDF
Size: 1.37 MB
Format: Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/981848