We present an approach for enabling a distributed anonymization process over large collections of sensor data. Our approach anonymizes large datasets (which might not fit in main memory) using an arbitrary number of workers within the Spark framework. We describe how to parallelize the anonymization process through a suitable partitioning of the dataset. Our experimental evaluation shows that the proposed approach is scalable and does not affect the quality of the anonymized dataset.
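The abstract does not state which partitioning strategy or anonymization algorithm the paper uses, so the following is only a minimal sketch of the general idea of partitioning a dataset and anonymizing each fragment independently. It assumes quantile-based partitioning on a single numeric quasi-identifier, k-anonymity obtained by generalizing values into ranges, and a thread pool standing in for Spark workers. All function and attribute names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Dict, List

Record = Dict[str, Any]


def partition_by_quantile(records: List[Record], attr: str, n_parts: int) -> List[List[Record]]:
    """Split records into n_parts contiguous, near-equal fragments ordered by attr."""
    ordered = sorted(records, key=lambda r: r[attr])
    size, extra = divmod(len(ordered), n_parts)
    fragments, start = [], 0
    for i in range(n_parts):
        end = start + size + (1 if i < extra else 0)
        fragments.append(ordered[start:end])
        start = end
    return fragments


def anonymize_fragment(fragment: List[Record], attr: str, k: int) -> List[Record]:
    """Replace attr with a (min, max) range shared by groups of at least k records.

    Assumption: the fragment holds at least k records; otherwise the single
    resulting group is smaller than k.
    """
    ordered = sorted(fragment, key=lambda r: r[attr])
    groups = [ordered[i:i + k] for i in range(0, len(ordered), k)]
    # Merge a short trailing group into its predecessor so no group is smaller than k.
    if len(groups) > 1 and len(groups[-1]) < k:
        tail = groups.pop()
        groups[-1].extend(tail)
    result = []
    for group in groups:
        low, high = group[0][attr], group[-1][attr]
        result.extend({**record, attr: (low, high)} for record in group)
    return result


def anonymize_distributed(records: List[Record], attr: str, k: int, n_workers: int) -> List[Record]:
    """Partition the dataset, anonymize each fragment in its own worker, and merge."""
    fragments = partition_by_quantile(records, attr, n_workers)
    # Threads stand in for Spark workers here (illustrative assumption).
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        parts = pool.map(lambda f: anonymize_fragment(f, attr, k), fragments)
        return [record for part in parts for record in part]
```

Because the fragments cover disjoint value ranges, each worker can generalize its own fragment without coordinating with the others. In this sketch, that independence is what lets the work be spread over an arbitrary number of workers.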

Scalable Distributed Data Anonymization / S. De Capitani di Vimercati, D. Facchinetti, S. Foresti, G. Oldani, S. Paraboschi, M. Rossi, P. Samarati. - In: 2021 IEEE International Conference on Pervasive Computing and Communications Workshops and other Affiliated Events (PerCom Workshops). - [s.l.] : IEEE, 2021. - ISBN 978-1-6654-4724-9. - pp. 401-403 (PerCom conference, held in Kassel in 2021) [10.1109/PerComWorkshops51409.2021.9431063].

Scalable Distributed Data Anonymization

S. De Capitani di Vimercati; S. Foresti; P. Samarati
2021

Scientific sector: INF/01 - Informatica (Computer Science)
   Multi-Owner data Sharing for Analytics and Integration respecting Confidentiality and Owner control (MOSAICrOWN)
   MOSAICrOWN
   EUROPEAN COMMISSION
   H2020
   825333

   Machine Learning-based, Networking and Computing Infrastructure Resource Management of 5G and beyond Intelligent Networks (MARSAL)
   MARSAL
   EUROPEAN COMMISSION
   H2020
   101017171

   High quality Open data Publishing and Enrichment (HOPE)
   HOPE
   MINISTERO DELL'ISTRUZIONE E DEL MERITO
   2017MMJJRE_003
Book Part (author)
Files in this item:
dffoprs-percom2021.pdf
   Access: open access
   Type: Pre-print (manuscript submitted to the publisher)
   Size: 361.51 kB
   Format: Adobe PDF

Scalable_Distributed_Data_Anonymization.pdf
   Access: restricted access
   Type: Publisher's version/PDF
   Size: 217.86 kB
   Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/869830
Citations
  • PMC: ND
  • Scopus: 4
  • Web of Science (ISI): 3