On the Quality of Classification Models for Inferring ABAC Policies from Access Logs / L. Cappelletti, S. Valtolina, G. Valentini, M. Mesiti, E. Bertino. - In: 2019 IEEE International Conference on Big Data (Big Data). - [s.l.] : IEEE, 2019. - ISBN 9781728108582. - pp. 4000-4007. (Conference: International Conference on Big Data (Big Data), held in Los Angeles in 2019.) [10.1109/BigData47090.2019.9005959].
On the Quality of Classification Models for Inferring ABAC Policies from Access Logs
L. Cappelletti; S. Valtolina; G. Valentini; M. Mesiti
2019
Abstract
The attribute-based access control (ABAC) model has been gaining popularity in recent years because of its advantages in granularity, flexibility, and usability. Few approaches based on association rule mining have been proposed for the automatic generation of ABAC policies from access logs. Their aim is to identify policies that do not overfit the training data, are not too general and thus do not disclose sensitive resources to everyone, and are interpretable by humans. The large ABAC privilege space, along with the sparsity and unbalanced distribution of the available logs, makes this task particularly complex, and current approaches have different limitations. In this paper we compare different symbolic and non-symbolic machine learning (ML) techniques for inferring ABAC policies and discuss their pros and cons. Based on experimental results on a toy dataset and on a real dataset, we argue that the best technique depends on the characteristics of the considered data. When the data are highly separable according to PCA and t-SNE decomposition, the quality of the obtained ABAC policies is higher and the policies are easily interpretable. By contrast, when this property does not hold, the quality of the obtained policies is low; in this case, non-symbolic ML techniques show better results than the symbolic ones.