Mind Your Steps in Biomedical Named Entity Recognition: First Extract, Tag Afterwards / D. Shlyk, S. Montanelli, M. Mesiti, L. Hunter. - In: HeaLing 2026 / edited by V. Danilova, M. Kurfalı, Y. Söderfeldt, J. Reed, A. Burchell. - [s.l.] : Association for Computational Linguistics, 28 March 2026. - ISBN 979-8-89176-367-8. - pp. 127-141 (1. Proceedings of the Workshop on Linguistic Analysis for Health, Rabat (Morocco), 2026) [DOI: 10.18653/v1/2026.healing-1.11].

Mind Your Steps in Biomedical Named Entity Recognition: First Extract, Tag Afterwards

D. Shlyk (first author); S. Montanelli (second author); M. Mesiti (penultimate author)
2026

Abstract

Few-shot prompting with Large Language Models (LLMs) has emerged as a promising paradigm for advancing information extraction, particularly in data-scarce domains like biomedicine, where high annotation costs constrain the availability of training data. However, challenges persist in biomedical Named Entity Recognition (NER), where LLMs fail to achieve the necessary accuracy and lag behind supervised fine-tuned models. In this study, we introduce FETA (First Extract, Tag Afterwards), a two-stage approach for entity recognition that combines instruction-guided prompting and a novel self-verification strategy to improve the accuracy and reliability of LLM predictions in domain-specific NER tasks. FETA achieves state-of-the-art results on multiple established biomedical datasets. Our experiments demonstrate that carefully designed prompts, using self-verification and instruction guidance, can steer general-purpose LLMs to outperform fine-tuned models in knowledge-intensive NER tasks, unlocking their potential for more reliable and accurate information extraction in resource-constrained settings.
Sector INFO-01/A - Informatics
28 March 2026
Book Part (author)
Files in this item:
File: unpaywall-bitstream-657001812.pdf
Access: open access
Type: Publisher's version/PDF
License: Creative Commons
Size: 1.59 MB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/1234078