On the Quality of Classification Models for Inferring ABAC Policies from Access Logs / L. Cappelletti, S. Valtolina, G. Valentini, M. Mesiti, E. Bertino. - In: 2019 IEEE International Conference on Big Data (Big Data). - [s.l.] : IEEE, 2019. - ISBN 9781728108582. - pp. 4000-4007 (International Conference on Big Data, Big Data, held in Los Angeles in 2019) [10.1109/BigData47090.2019.9005959].

On the Quality of Classification Models for Inferring ABAC Policies from Access Logs

L. Cappelletti; S. Valtolina; G. Valentini; M. Mesiti
2019

Abstract

The attribute-based access control (ABAC) model has been gaining popularity in recent years because of its advantages in granularity, flexibility, and usability. A few approaches based on association rule mining have been proposed for the automatic generation of ABAC policies from access logs. Their aim is to identify policies that do not overfit the training data, are not so general that they disclose sensitive resources to everyone, and are interpretable by humans. The large ABAC privilege space, along with the sparsity and unbalanced distribution of the available logs, makes this task particularly complex, and current approaches have different limitations. In this paper we compare different symbolic and non-symbolic machine learning (ML) techniques for inferring ABAC policies and discuss their pros and cons. Based on experimental results on a toy dataset and on a real dataset, we argue that the best technique depends on the characteristics of the considered data. When the data are highly separable according to PCA and t-SNE decomposition, the quality of the obtained ABAC policies is higher and the policies are easily interpretable. By contrast, when this property does not hold, the quality of the obtained policies is low; in this case, non-symbolic ML techniques show better results than symbolic ones.
ABAC policies; inference of policies from logs; machine learning
Academic discipline: INF/01 - Informatica (Computer Science)
Ankura
Baidu
IEEE
IEEE Computer Society
Very
Book Part (author)
Use this identifier to cite or link to this document: https://hdl.handle.net/2434/726877