Geo-enrichment in a Data Lakehouse: Exploring Challenges and Opportunities / V.S.R. Siddabattula, F. Hachem, M. Leo, G. Rosa, M.L. Damiani. - In: Proceedings of the 4th International Workshop on Spatial Big Data and AI for Industrial Applications (GeoIndustry25). - [s.l.] : ACM, 2025. - ISBN 979-8-4007-2182-3. - pp. 1-9. (4th International Workshop on Spatial Big Data and AI for Industrial Applications, Minneapolis, 2025) [10.1145/3764919.3770886].

Geo-enrichment in a Data Lakehouse: Exploring Challenges and Opportunities

V.S.R. Siddabattula; F. Hachem; M.L. Damiani
2025

Abstract

Data lakehouses are modern data architectures designed to support the integrated management and analysis of large volumes of heterogeneous, business-oriented data on distributed platforms. When such data include spatial information, a key question is how to semantically integrate it with other sources, an operation referred to as geo-enrichment, thereby creating new opportunities for more effective and insightful data analysis. Yet, the notion of geo-enrichment has received limited attention in the academic literature and is often associated with commercial information services. In this paper, we present our vision and discuss key challenges, particularly those related to defining a data exploration environment that provides geo-enrichment operators and tools for both discovering relevant data sources and interacting with geo-enrichable data. Our discussion is grounded in a use case involving the integration of spatial datasets provided by Eurostat, the statistical office of the European Union (EU), within Apache Sedona on a Spark cluster, as adopted in the context of the EU-funded GRINS project.
Keywords: Big Data architectures; spatial data integration; geo-enrichment
Sector INFO-01/A - Computer Science
   Arricchimento di Dati Geospaziali tramite Linking e Analytics (ADaLinA)
   ADaLinA
   ALMA MATER STUDIORUM - UNIVERSITA' DI BOLOGNA
Book Part (author)
Files in this item:
GeoEnrichment paper-compresso.pdf (open access; type: Publisher's version/PDF; license: Creative Commons; size: 296.84 kB; format: Adobe PDF)
Use this identifier to cite or link to this document: https://hdl.handle.net/2434/1202950