DT-Anon: Decision Tree Target-Driven Anonymization / S. De Capitani di Vimercati, S. Foresti, V. Ghirimoldi, P. Samarati (Lecture Notes in Computer Science). - In: Data and Applications Security and Privacy XXXVIII / edited by A.L. Ferrara, R. Krishnan. - [s.l.] : Springer, 2024. - ISBN 978-3-031-65171-7. - pp. 111-130. Paper presented at the 38th Conference on Data and Applications Security and Privacy, held in San Jose in 2024 [10.1007/978-3-031-65172-4_8].
DT-Anon: Decision Tree Target-Driven Anonymization
S. De Capitani di Vimercati; S. Foresti; P. Samarati
2024
Abstract
A growing number of scenarios rely on the analysis of massive amounts of data, possibly contributed by multiple parties (data controllers). Data may, however, contain information that is sensitive or that should be protected (e.g., because it exposes the identities of the data subjects) and therefore cannot simply be freely shared and used for analysis. Business rules, restrictions from individuals (the data subjects to whom the data refer), and privacy regulations require data to be sanitized before being released or shared with others. Unfortunately, such protection typically comes with a loss of utility of the released data, affecting the performance of the analytics tasks to be executed. In this paper, we present DT-Anon, a target-driven anonymization approach that aims at protecting (anonymizing) data while preserving as much as possible the capability of a downstream classification task to learn from the anonymized data. The basic idea of our approach is to perform the anonymization process on partitions produced by a decision tree driven by the target of the classification task. Each partition is then independently anonymized, to limit the impact of anonymization on the attributes and values that work as predictors for the target of the classification task. Our experimental evaluation confirms the effectiveness of the approach.
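The partition-then-anonymize idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' DT-Anon algorithm: the abstract does not specify the privacy model, the tree-induction criterion, or the per-partition anonymizer, so the sketch assumes k-anonymity over numeric quasi-identifiers, a small Gini-based tree, and a Mondrian-style median split inside each leaf. All function names, parameters, and the sample records are hypothetical.

```python
from collections import Counter

def gini(rows, target):
    """Gini impurity of the target attribute over a non-empty list of rows."""
    n = len(rows)
    return 1.0 - sum((c / n) ** 2 for c in Counter(r[target] for r in rows).values())

def best_split(rows, qis, target, k):
    """Pick the (attribute, threshold) that lowers weighted Gini most,
    keeping at least k rows on each side; None if no split improves it."""
    best, best_score = None, gini(rows, target)
    for a in qis:
        for t in sorted({r[a] for r in rows})[:-1]:
            left = [r for r in rows if r[a] <= t]
            right = [r for r in rows if r[a] > t]
            if len(left) < k or len(right) < k:
                continue
            score = (len(left) * gini(left, target)
                     + len(right) * gini(right, target)) / len(rows)
            if score < best_score:
                best, best_score = (a, t), score
    return best

def tree_partitions(rows, qis, target, k, depth=2):
    """Leaves of a shallow, target-driven decision tree (assumed criterion: Gini)."""
    split = best_split(rows, qis, target, k) if depth > 0 else None
    if split is None:
        return [rows]
    a, t = split
    left = [r for r in rows if r[a] <= t]
    right = [r for r in rows if r[a] > t]
    return (tree_partitions(left, qis, target, k, depth - 1)
            + tree_partitions(right, qis, target, k, depth - 1))

def anonymize_partition(rows, qis, k):
    """Assumed per-partition anonymizer: Mondrian-style median splits while both
    halves keep >= k rows, then generalize each quasi-identifier to (min, max)."""
    by_width = sorted(
        qis,
        key=lambda a: max(r[a] for r in rows) - min(r[a] for r in rows),
        reverse=True,
    )
    for a in by_width:
        vals = sorted(r[a] for r in rows)
        median = vals[(len(vals) - 1) // 2]
        left = [r for r in rows if r[a] <= median]
        right = [r for r in rows if r[a] > median]
        if len(left) >= k and len(right) >= k:
            return anonymize_partition(left, qis, k) + anonymize_partition(right, qis, k)
    ranges = {a: (min(r[a] for r in rows), max(r[a] for r in rows)) for a in qis}
    return [{**r, **ranges} for r in rows]

def dt_partition_anonymize(rows, qis, target, k):
    """Partition by the target-driven tree, then anonymize each leaf on its own.
    Assumes len(rows) >= k."""
    out = []
    for part in tree_partitions(rows, qis, target, k):
        out.extend(anonymize_partition(part, qis, k))
    return out

# Hypothetical sample data: 'age' and 'zip' are quasi-identifiers, 'income' is the target.
records = [
    {"age": 25, "zip": 20100, "income": "low"},
    {"age": 27, "zip": 20100, "income": "low"},
    {"age": 31, "zip": 20121, "income": "low"},
    {"age": 33, "zip": 20121, "income": "low"},
    {"age": 45, "zip": 20133, "income": "high"},
    {"age": 48, "zip": 20133, "income": "high"},
    {"age": 52, "zip": 20145, "income": "high"},
    {"age": 58, "zip": 20145, "income": "high"},
]
anonymized = dt_partition_anonymize(records, ["age", "zip"], "income", k=2)
for row in anonymized:
    print(row)
```

Because the tree splits are chosen to separate target values, generalization inside a leaf never merges records across a boundary that the tree found predictive. This is the intuition the abstract gives for limiting the impact of anonymization on the predictors of the classification target.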
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.