DT-Anon: Decision Tree Target-Driven Anonymization / S. De Capitani di Vimercati, S. Foresti, V. Ghirimoldi, P. Samarati (Lecture Notes in Computer Science). - In: Data and Applications Security and Privacy XXXVIII / edited by A.L. Ferrara, R. Krishnan. - [s.l.] : Springer, 2024. - ISBN 978-3-031-65171-7. - pp. 111-130 (Paper presented at the 38th Conference on Data and Applications Security and Privacy, held in San Jose in 2024) [10.1007/978-3-031-65172-4_8].

DT-Anon: Decision Tree Target-Driven Anonymization

S. De Capitani di Vimercati; S. Foresti; P. Samarati
2024

Abstract

More and more scenarios rely today on data analysis of massive amounts of data, possibly contributed from multiple parties (data controllers). Data may, however, contain information that is sensitive or that should be protected (e.g., since it exposes identities of the data subjects) and cannot simply be freely shared and used for analysis. Business rules, restrictions from individuals (data subjects to which data refer), as well as privacy regulations demand data to be sanitized before being released or shared with others. Unfortunately, such protection typically comes with a loss of utility of the released data, impacting the performance of the analytics tasks to be executed. In this paper, we present DT-Anon, a target-driven anonymization approach that aims at protecting (anonymizing) data while preserving as much as possible the capability of a classification task operating downstream to learn from the anonymized data. The basic idea of our approach is to perform the anonymization process on partitions produced by a decision tree driven by the target of the classification task. Each partition is then independently anonymized, to limit the impact of anonymization on the attributes and values that work as predictors for the target of the classification task. Our experimental evaluation confirms the effectiveness of the approach.
data anonymization; machine learning classifier; target-driven anonymization; decision tree
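The two-step idea in the abstract (partition the data with a decision tree driven by the classification target, then anonymize each partition independently) can be illustrated with a minimal sketch. The abstract does not specify the privacy model, the splitting criterion, or the per-partition anonymizer, so everything concrete here is an assumption made for illustration only: k-anonymity as the privacy requirement, Gini impurity for the tree splits, a Mondrian-style median cut with range generalization inside each partition, and all function names, attribute names, and data values. It is not the authors' implementation.

```python
from collections import Counter

def gini(rows, target):
    """Gini impurity of the target values in rows."""
    n = len(rows)
    return 1.0 - sum((c / n) ** 2 for c in Counter(r[target] for r in rows).values())

def best_split(rows, attrs, target, k):
    """Lowest weighted-Gini binary split that keeps at least k rows per side."""
    best = None
    for a in attrs:
        for t in sorted({r[a] for r in rows}):
            left = [r for r in rows if r[a] <= t]
            right = [r for r in rows if r[a] > t]
            if len(left) < k or len(right) < k:
                continue
            score = (len(left) * gini(left, target)
                     + len(right) * gini(right, target)) / len(rows)
            if best is None or score < best[0]:
                best = (score, left, right)
    return best

def tree_partitions(rows, attrs, target, k, depth=2):
    """Leaves of a small target-driven decision tree, returned as partitions."""
    split = best_split(rows, attrs, target, k) if depth > 0 else None
    if split is None or split[0] >= gini(rows, target):
        return [rows]  # no allowed split improves purity
    _, left, right = split
    return (tree_partitions(left, attrs, target, k, depth - 1)
            + tree_partitions(right, attrs, target, k, depth - 1))

def anonymize_partition(rows, attrs, target, k):
    """Mondrian-style k-anonymization of one partition (assumed technique)."""
    def width(a):
        return max(r[a] for r in rows) - min(r[a] for r in rows)
    for a in sorted(attrs, key=width, reverse=True):  # widest attribute first
        median = sorted(r[a] for r in rows)[len(rows) // 2]
        left = [r for r in rows if r[a] < median]
        right = [r for r in rows if r[a] >= median]
        if len(left) >= k and len(right) >= k:
            return (anonymize_partition(left, attrs, target, k)
                    + anonymize_partition(right, attrs, target, k))
    # No allowed cut: generalize each attribute to its (min, max) range.
    ranges = {a: (min(r[a] for r in rows), max(r[a] for r in rows)) for a in attrs}
    return [{**ranges, target: r[target]} for r in rows]

def dt_anon_sketch(rows, attrs, target, k):
    """Partition by the target-driven tree, then anonymize each partition alone."""
    released = []
    for part in tree_partitions(rows, attrs, target, k):
        released.extend(anonymize_partition(part, attrs, target, k))
    return released

# Hypothetical toy data: "age" and "hours" are quasi-identifiers, "income" the target.
data = [
    {"age": 23, "hours": 20, "income": "low"},
    {"age": 25, "hours": 25, "income": "low"},
    {"age": 29, "hours": 30, "income": "low"},
    {"age": 31, "hours": 22, "income": "low"},
    {"age": 38, "hours": 45, "income": "high"},
    {"age": 42, "hours": 50, "income": "high"},
    {"age": 47, "hours": 40, "income": "high"},
    {"age": 55, "hours": 48, "income": "high"},
]
released = dt_anon_sketch(data, ["age", "hours"], "income", k=2)
```

In this toy run the tree first separates the records by the value of the target, so the generalization performed inside each partition never merges records with different `income` values. That is the intuition behind limiting the impact of anonymization on the predictors of the target.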
Sector INF/01 - Computer Science
   Edge AI Technologies for Optimised Performance Embedded Processing (EdgeAI)
   EdgeAI
   MINISTERO DELLO SVILUPPO ECONOMICO
   101097300

   Green responsibLe privACy preservIng dAta operaTIONs
   GLACIATION
   EUROPEAN COMMISSION

   POLAR: POLicy specificAtion and enfoRcement for privacy-enhanced data management
   POLAR
   MINISTERO DELL'UNIVERSITA' E DELLA RICERCA
   2022LA8XBH_001

   SEcurity and RIghts in the CyberSpace (SERICS)
   SERICS
   MINISTERO DELL'UNIVERSITA' E DELLA RICERCA
   identification code PE00000014
2024
Book Part (author)
Use this identifier to cite or link to this document: https://hdl.handle.net/2434/1077568