CHAMALEON: Framework to improve Data Wrangling with Complex Data / A. Valencia Parra, A.J. Varela Vaca, M.T. Gómez López, P. Ceravolo. - In: ICIS 2019. - [s.l] : Association for Information Systems (AIS) Electronic Library, 2019. - ISBN 978-0-9966831-9-7. - pp. 1-17 (Paper presented at the 40th International Conference on Information Systems, held in Munich (Germany), December 15th through 18th, 2019).
CHAMALEON: Framework to improve Data Wrangling with Complex Data
P. Ceravolo (last author)
2019
Abstract
Data transformation and schema conciliation are relevant topics in industry due to the incorporation of data-intensive business processes in organizations. As the number of data sources grows, so does the complexity of the data, leading to complex, nested data schemata. Novel approaches are now being employed in academia and industry to assist non-expert users in transforming, integrating, and improving the quality of datasets (i.e., data wrangling). However, support for transforming semi-structured complex data is lacking. This article surveys the state of the art by identifying and analyzing the most relevant solutions in academia and industry for transforming this type of data. In addition, we propose a Domain-Specific Language (DSL) to support the transformation of complex data as a first approach to enhancing data wrangling processes. We also develop a framework that implements the DSL and evaluate it in a real-world case study.

File | Size | Format | |
---|---|---|---|
CHAMALEON_FrameworktoimproveDataWranglingwithComplexData.pdf (restricted access; type: Publisher's version/PDF) | 663.9 kB | Adobe PDF | |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.