On the extraction of meaningful RNA interactions from Scientific Publications through LLMs and SPIRES / E. Cavalleri, M. Mesiti (CEUR WORKSHOP PROCEEDINGS). - In: EDBT/ICDT-WS 2024 : EDBT/ICDT 2024 Workshops / edited by T. Palpanas, H.V. Jagadish. - [s.l] : CEUR-WS, 2024. - pp. 1-6 (EDBT/ICDT 2024 Joint Conference, held in Paestum in 2024).
On the extraction of meaningful RNA interactions from Scientific Publications through LLMs and SPIRES
E. Cavalleri (first author)
M. Mesiti (last author)
2024
Abstract
Knowledge graphs (KGs) are useful tools for uniformly representing and integrating heterogeneous information about a domain of interest. However, they are inherently incomplete, so new facts must be added by extracting them from structured and unstructured data sources. Starting from RNA-KG, the first KG tailored for representing different kinds of RNA molecules, which we recently developed, this paper evaluates the use of SPIRES for extracting interactions among bio-entities involving RNA molecules from scientific papers, guided by the RNA-KG schema. SPIRES is a general-purpose knowledge extraction system for mining information that conforms to a specified schema. A customized prompt is generated and submitted to a Large Language Model (LLM) along with a text, in order to extract a set of RDF triples adhering to the schema constraints. The experiments show high accuracy in extracting interactions from the scientific literature.

| File | Size | Format |
|---|---|---|
| DARLI-AP-6.pdf (open access; type: Publisher's version/PDF; license: Creative Commons) | 524.34 kB | Adobe PDF |




