Knowledge graphs (KGs) are useful tools for uniformly representing and integrating heterogeneous information about a domain of interest. However, they are inherently incomplete, so new facts must be added by extracting them from structured and unstructured data sources. Starting from RNA-KG, the first KG tailored to representing different kinds of RNA molecules, which we recently developed, this paper evaluates the use of SPIRES for extracting interactions among bio-entities involving RNA molecules from scientific papers, guided by the RNA-KG schema. SPIRES is a general-purpose knowledge extraction system for mining information that conforms to a specified schema: a customized prompt is generated and submitted to a Large Language Model (LLM) together with a text, in order to extract a set of RDF triples that adhere to the schema constraints. The experiments show high accuracy in extracting interactions from the scientific literature.
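The schema-guided extraction idea described in the abstract can be illustrated with a minimal sketch. It does not use SPIRES, its prompt format, or the RNA-KG schema: the `SCHEMA` dictionary, the `subject | predicate | object` line format, the function names, and the sample LLM response are all hypothetical, chosen only to show how a prompt can be derived from a schema and how returned triples can be filtered against it.

```python
# Hypothetical, heavily simplified stand-in for a schema (NOT the RNA-KG schema):
# each allowed predicate maps to the (subject type, object type) it may connect.
SCHEMA = {
    "interacts_with": ("RNA", "Protein"),
    "regulates": ("RNA", "Gene"),
}


def build_prompt(schema, text):
    """Build a prompt that lists the allowed predicates, then appends the text."""
    lines = [
        "Extract interactions from the text below.",
        "Answer with one line per interaction, formatted as:",
        "subject | predicate | object",
        "Allowed predicates:",
    ]
    for predicate, (subject_type, object_type) in schema.items():
        lines.append(f"- {predicate}: {subject_type} -> {object_type}")
    lines.append("Text:")
    lines.append(text)
    return "\n".join(lines)


def parse_triples(response, schema):
    """Keep only well-formed lines whose predicate is allowed by the schema."""
    triples = []
    for line in response.splitlines():
        parts = [part.strip() for part in line.split("|")]
        if len(parts) == 3 and all(parts) and parts[1] in schema:
            triples.append((parts[0], parts[1], parts[2]))
    return triples


# Illustrative input sentence and a made-up LLM answer; no model is called here.
prompt = build_prompt(SCHEMA, "miR-21 regulates PTEN expression.")
fake_response = "miR-21 | regulates | PTEN\nmiR-21 | cures | cancer\nnot a triple"
triples = parse_triples(fake_response, SCHEMA)
```

In this sketch the second line of the made-up response is discarded because its predicate is not in the schema, and the third because it is not a triple. A real pipeline would also need to check entity types and ground entity names to identifiers, which is omitted here.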

On the extraction of meaningful RNA interactions from Scientific Publications through LLMs and SPIRES / E. Cavalleri, M. Mesiti (CEUR WORKSHOP PROCEEDINGS). - In: EDBT/ICDT-WS 2024 : EDBT/ICDT 2024 Workshops / [edited by] T. Palpanas, H.V. Jagadish. - [s.l] : CEUR-WS, 2024. - pp. 1-6 (EDBT/ICDT 2024 Joint Conference, held in Paestum in 2024).

On the extraction of meaningful RNA interactions from Scientific Publications through LLMs and SPIRES

E. Cavalleri (first author); M. Mesiti (last author)
2024

Keywords: RNA-based technologies; Knowledge Graphs; RNA-drug discovery; Large Language Models
Sector INFO-01/A - Computer Science
2024
https://ceur-ws.org/Vol-3651/DARLI-AP-6.pdf
Book Part (author)
Files in this record:
DARLI-AP-6.pdf (open access)
Type: Publisher's version/PDF
License: Creative Commons
Size: 524.34 kB
Format: Adobe PDF


Use this identifier to cite or link to this document: https://hdl.handle.net/2434/1172545