A Schema-Driven Prompt Engineering Approach for Data-Centric LLM Tasks / E. Cavalleri, M. Mesiti (... IEEE ... INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND KNOWLEDGE ENGINEERING (AIKE ...) (ONLINE)). - In: AIxDKE. - [s.l] : Institute of Electrical and Electronics Engineers (IEEE), 2026 Apr. - ISBN 979-8-3315-4750-9. - pp. 36-43. (International Conference on AI x Data and Knowledge Engineering: February 2-4, 2026, Laguna Hills (CA, USA)) [10.1109/aixdke67294.2026.00014].

A Schema-Driven Prompt Engineering Approach for Data-Centric LLM Tasks

E. Cavalleri (first author); M. Mesiti (last author)
2026

Abstract

General-purpose LLMs are increasingly employed in a variety of data-centric tasks, ranging from entity and relationship extraction from plain text to schema-aware data integration and analysis. Their large number of parameters and extensive training corpora enable strong generalization across application domains. However, LLMs lack access to up-to-date knowledge and are prone to hallucinations, limiting their reliability in data management scenarios. To address these issues, prompt engineering techniques have been proposed to specify the context in which a task should be performed. In this paper, we explore the use of conceptual schemas as a foundation for schema-driven prompt engineering, providing structured and reusable contexts for grounding LLM behavior in data-centric applications. We present SchemaLink, an intelligent web-based system for the graphical design and enhancement of conceptual schemas and their exploitation for LLM-based tasks. We demonstrate the applicability of our approach to knowledge extraction from plain text and to the discovery of joinable columns in data lakes, showing how schema-driven prompting improves grounding and consistency across heterogeneous data sources.
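
To make the idea of schema-driven prompting concrete, the following is a minimal sketch, not SchemaLink's actual implementation. It assumes a conceptual schema can be reduced to named entities with attributes and directed relationships, and it serializes that schema into a reusable textual context placed before the task. The names `Entity`, `Relationship`, and `build_prompt`, the prompt wording, and the gene/disease example are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    # Hypothetical stand-in for an entity of a conceptual schema.
    name: str
    attributes: list[str] = field(default_factory=list)


@dataclass
class Relationship:
    # Hypothetical stand-in for a directed relationship between two entities.
    name: str
    source: str
    target: str


def build_prompt(entities, relationships, task, text):
    """Serialize the schema into a textual context and append the task and input."""
    lines = ["Use only the following conceptual schema.", "Entities:"]
    for e in entities:
        lines.append(f"- {e.name}({', '.join(e.attributes)})")
    lines.append("Relationships:")
    for r in relationships:
        lines.append(f"- {r.name}: {r.source} -> {r.target}")
    lines.append(f"Task: {task}")
    lines.append(f"Text: {text}")
    return "\n".join(lines)


# Illustrative schema (an assumption, not taken from the paper).
entities = [Entity("Gene", ["symbol"]), Entity("Disease", ["name"])]
relationships = [Relationship("associated_with", "Gene", "Disease")]

prompt = build_prompt(
    entities,
    relationships,
    task="Extract entities and relationships that conform to the schema.",
    text="<plain text to analyze>",
)
print(prompt)
```

Because the schema context is built independently of the task string, the same serialized schema could in principle be reused for a different task, such as asking which columns of two tables correspond to the same schema attribute.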
Sector INFO-01/A - Computer Science
Apr 2026
Institute of Electrical and Electronics Engineers (IEEE)
Book Part (author)

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/1238317