AI systems are widely proposed as second-opinion advisors in clinical diagnosis, offering the promise of enhancing decision accuracy and clinician confidence while preserving human oversight. However, successful deployment in real-world practice faces a critical barrier: clinicians' reliance on AI is often miscalibrated, manifesting as misuse (over-reliance driven by automation bias) and disuse (under-utilization driven by self-anchoring bias). This paper addresses these deployment challenges by systematically analyzing how such reliance patterns affect diagnostic accuracy, confidence, and decision-making across diverse medical specialties. We report results from controlled simulations involving over 300 medical professionals across six diagnostic settings—including knee MRI analysis, spinal X-rays, cardiac ECG evaluation, and gastrointestinal endoscopy—using a human-first, AI-second workflow. Although AI advice improved average diagnostic accuracy (+2 percentage points) and clinician confidence (+3 points on a normalized scale), overall levels of appropriate reliance remained well below 50%, with disuse emerging as the more prevalent and consequential barrier. We introduce and validate Appropriate Reliance as an actionable metric for assessing and improving human-AI collaboration, providing practical guidance for developers, healthcare institutions, and policymakers seeking to deploy second-opinion AI systems safely and effectively. By identifying the sociotechnical barriers and offering evidence-based design insights, this work supports the emerging application of AI as a collaborative advisor in clinical workflows, charting a clear path toward deployment that enhances diagnostic safety, accountability, and patient care. Specifically, we propose integrating the Appropriate Reliance metric into system development workflows, clinician training, and regulatory evaluations to enable safe and effective deployment of second-opinion AI systems.

Calibrating Reliance: Addressing Misuse and Disuse in AI-Based Second-Opinion Systems for Medical Diagnosis / F. Cabitza, A. Campagner, G.E. Tontini (PROCEEDINGS OF THE ... AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE). - In: New Faculty Highlights, Journal Track, IAAI-26 and EAAI-26 Main Track / [a cura di] S. Koenig, C. Jenkins, M. E. Taylor. - [s.l] : AAAI Press : Association for the Advancement of Artificial Intelligence, 2026 Mar 14. - ISBN 1-57735-906-2. - pp. 40210-40216 (( 40. Fortieth AAAI Conference on Artificial Intelligence : Thirty-Eighth Conference on Innovative Applications of Artificial Intelligence : Sixteenth Symposium on Educational Advances in Artificial Intelligence : January 20–27 Singapore 2026 [10.1609/aaai.v40i47.41457].

Calibrating Reliance: Addressing Misuse and Disuse in AI-Based Second-Opinion Systems for Medical Diagnosis

G.E. Tontini
Ultimo
2026

Abstract

AI systems are widely proposed as second-opinion advisors in clinical diagnosis, offering the promise of enhancing decision accuracy and clinician confidence while preserving human oversight. However, successful deployment in real-world practice faces a critical barrier: clinicians' reliance on AI is often miscalibrated, manifesting as misuse (over-reliance driven by automation bias) and disuse (under-utilization driven by self-anchoring bias). This paper addresses these deployment challenges by systematically analyzing how such reliance patterns affect diagnostic accuracy, confidence, and decision-making across diverse medical specialties. We report results from controlled simulations involving over 300 medical professionals across six diagnostic settings—including knee MRI analysis, spinal X-rays, cardiac ECG evaluation, and gastrointestinal endoscopy—using a human-first, AI-second workflow. Although AI advice improved average diagnostic accuracy (+2 percentage points) and clinician confidence (+3 points on a normalized scale), overall levels of appropriate reliance remained well below 50%, with disuse emerging as the more prevalent and consequential barrier. We introduce and validate Appropriate Reliance as an actionable metric for assessing and improving human-AI collaboration, providing practical guidance for developers, healthcare institutions, and policymakers seeking to deploy second-opinion AI systems safely and effectively. By identifying the sociotechnical barriers and offering evidence-based design insights, this work supports the emerging application of AI as a collaborative advisor in clinical workflows, charting a clear path toward deployment that enhances diagnostic safety, accountability, and patient care. Specifically, we propose integrating the Appropriate Reliance metric into system development workflows, clinician training, and regulatory evaluations to enable safe and effective deployment of second-opinion AI systems.
Settore MEDS-10/A - Gastroenterologia
14-mar-2026
Association for the Advancement of Artificial Intelligence (AAAI)
Book Part (author)
File in questo prodotto:
File Dimensione Formato  
Cabitza. Calibrating Reliance Addressing Misuse and Disuse. Proceedings of the AAAI Conference on Artificial Intelligence 2026.pdf

accesso aperto

Tipologia: Publisher's version/PDF
Licenza: Creative commons
Dimensione 2.11 MB
Formato Adobe PDF
2.11 MB Adobe PDF Visualizza/Apri
Pubblicazioni consigliate

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/2434/1239196
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 0
  • ???jsp.display-item.citation.isi??? ND
  • OpenAlex ND
social impact