Retrieval-augmented generative AI enhances clinical reasoning in odontogenic sinusitis versus maxillary sinus mucositis / S. Hack, J.R. Craig, C. Lin, C. Fu, M.A. Kwiatkowska, P. Kocum, F. Allevi, A.M. Saibene. - In: EUROPEAN ARCHIVES OF OTO-RHINO-LARYNGOLOGY. - ISSN 0937-4477. - (2026). [Epub ahead of print] [10.1007/s00405-026-10040-2]

Retrieval-augmented generative AI enhances clinical reasoning in odontogenic sinusitis versus maxillary sinus mucositis

F. Allevi (penultimate author); A.M. Saibene (last author)
2026

Abstract

Purpose: This exploratory pilot study evaluated whether combining structured clinical information, medical imaging, and literature-derived knowledge could enhance the diagnostic reasoning and output quality of a large language model in distinguishing odontogenic sinusitis from maxillary sinus mucositis.

Methods: Six complex clinical cases were constructed with nasal endoscopy, computed tomography findings, and clinical vignettes. ChatGPT-4.0 was prompted using four strategies: (1) clinical text only, (2) text with medical imaging, (3) text with structured literature excerpts, and (4) text with both imaging and literature input. Seven blinded expert reviewers rated 168 responses across diagnostic accuracy, clinical reasoning, safety, and overall usefulness using a five-point scale. Statistical comparisons and inter-rater reliability were assessed.

Results: All prompting strategies produced clinically safe outputs with minimal hallucinations or unsafe recommendations. Combining structured literature and imaging with clinical text significantly improved clinical reasoning scores (F(3,123) = 4.32, p = 0.0058), ranking first or second in six of seven evaluation domains. No significant differences were observed in diagnostic accuracy or safety across strategies. Inter-rater reliability was substantial (kappa = 0.7).

Conclusion: Providing structured evidence and imaging appeared to enhance the clinical reasoning quality of large language models in this small, simulated dataset, without compromising diagnostic accuracy or safety. These preliminary findings suggest that structured multimodal prompting may help improve the interpretability and reliability of artificial intelligence tools for supporting diagnostic reasoning in sinus-related diseases, though larger prospective validation studies are needed before clinical implementation.

Allergology; Dental patient assessment; Gerodontics; Maxillofacial surgery; Oral and Maxillofacial Surgery; Otorhinolaryngology
Sector MEDS-18/A - Otorhinolaryngology
Sector MEDS-15/B - Maxillofacial Surgery
2026
15 March 2026
Article (author)
Files in this item:
ODS LLM (2026).pdf (restricted access)
Type: Publisher's version/PDF
License: No license
Size: 1.26 MB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/1227175