Surgeon, Trainee, or GPT? A Blinded Multicentric Study of AI-Augmented Operative Notes / S. Hack, R. Attal, G. Locatelli, G. Scotta, A. Maniaci, F.M. Parisi, N. Van Der Poel, M. Van Daele, A. Garcia-lliberos, C. Rodriguez-prado, C.M. Chiesa-estomba, M. Andueza-guembe, P. Cobb, H.G. Zalzal, A.M. Saibene. In: Laryngoscope. ISSN 0023-852X. (2025), pp. 1-11. [Epub ahead of print] doi: 10.1002/lary.70063
Surgeon, Trainee, or GPT? A Blinded Multicentric Study of AI‐Augmented Operative Notes
G. Locatelli; A.M. Saibene
2025
Abstract
Objectives: Clear, complete operative documentation is essential for surgical safety, continuity of care, and medico-legal standards. Large language models such as ChatGPT offer promise for automating clinical documentation; however, their performance in operative note generation, particularly in surgical subspecialties, remains underexplored. This study aimed to compare the quality, accuracy, and efficiency of operative notes authored by a surgical resident, an attending surgeon, GPT alone, and an attending surgeon using GPT as a writing aid.

Methods: Five publicly available otolaryngologic procedures were selected. For each procedure, four operative notes were generated: one by a resident, one by an attending, one by GPT alone, and one by a hybrid of attending plus GPT. Ten blinded otolaryngologists (five residents, five attendings) independently reviewed all 20 notes. Reviewers scored each note across eight domains using a five-point scale, assigned a final approval rating, and provided qualitative feedback. Writing time was recorded to assess documentation efficiency.

Results: Hybrid notes written by an attending surgeon with GPT assistance received the highest average domain scores and the highest "as is" approval rate (79%), outperforming all other groups. GPT-only notes were the fastest to generate but had the lowest approval rate (23%) and the highest incidence of both omissions and overdocumentation. Writing time was significantly reduced in both AI-assisted groups compared with human-only authorship. Inter-rater reliability among reviewers was moderate to high across most domains.

Conclusion: In this limited dataset, hybrid human-AI collaboration outperformed both human-only and AI-only authorship in operative documentation. These findings support GPT-assisted documentation as a means to improve operative note efficiency and consistency.

Level of evidence: N/A.