Towards Efficient and Secure Analysis of Large Datasets / S. Cimato, S. Nicolo. - In: 2020 IEEE 44th Annual Computers, Software, and Applications Conference (COMPSAC). - [s.l.] : IEEE, 2020. - ISBN 9781728173030. - pp. 1351-1356. (Paper presented at the 44th Annual Computers, Software, and Applications Conference (COMPSAC), 5th IEEE International Workshop on Distributed Big Data Management, held in Madrid in 2020) [10.1109/COMPSAC48688.2020.00-68].
Towards Efficient and Secure Analysis of Large Datasets
S. Cimato
2020
Abstract
One of the promises of the "big data" revolution is that, through the analysis of large datasets, people will benefit from solutions to many different problems obtained by deploying advanced machine learning models. One of the challenges of this standard approach is that information needs to be centralized in the data center or on the machine where the training phase is performed, raising many privacy concerns. In this paper we take a step towards secure and efficient processing of large distributed datasets, where the original data reside at different locations and are processed in a privacy-preserving way. In particular, we rely on available technologies to achieve the secure design of a machine learning model by performing the training phase on encrypted data. The case study we examine focuses on forecasting the energy production of wind farms situated in different locations. We show in detail how the machine learning model is created from the available datasets, compare the results with those produced by previous models, and discuss their performance.

File | Access | Type | Size | Format
---|---|---|---|---
paperBDDM-437.pdf | Open access | Post-print, accepted manuscript, etc. (version accepted by the publisher) | 517.62 kB | Adobe PDF
09202485.pdf | Restricted access | Publisher's version/PDF | 628.68 kB | Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.