Geo-enrichment in a Data Lakehouse: Exploring Challenges and Opportunities / V.S.R. Siddabattula, F. Hachem, M. Leo, G. Rosa, M.L. Damiani. - In: Proceedings of the 4th International Workshop on Spatial Big Data and AI for Industrial Applications (GeoIndustry25). - [s.l] : ACM, 2025. - ISBN 979-8-4007-2182-3. - pp. 1-9. (4th International Workshop on Spatial Big Data and AI for Industrial Applications, Minneapolis, 2025) [10.1145/3764919.3770886].
Geo-enrichment in a Data Lakehouse: Exploring Challenges and Opportunities
V.S.R. Siddabattula;F. Hachem;M.L. Damiani
2025
Abstract
Data lakehouses are modern data architectures designed to support the integrated management and analysis of large volumes of heterogeneous, business-oriented data on distributed platforms. When such data include spatial information, a key question is how to semantically integrate it with other sources, an operation referred to as geo-enrichment, thereby creating new opportunities for more effective and insightful data analysis. Yet, the notion of geo-enrichment has received limited attention in the academic literature and is often associated with commercial information services. In this paper, we present our vision and discuss key challenges, particularly those related to defining a data exploration environment that provides geo-enrichment operators and tools for both discovering relevant data sources and interacting with geo-enrichable data. Our discussion is grounded in a use case involving the integration of spatial datasets provided by Eurostat, the statistical office of the European Union (EU), within Apache Sedona on a Spark cluster, as adopted in the context of the EU-funded GRINS project.

| File | Access | Type | License | Size | Format |
|---|---|---|---|---|---|
| GeoEnrichment paper-compresso.pdf | Open access | Publisher's version/PDF | Creative Commons | 296.84 kB | Adobe PDF |