Uncertainty estimation of deep learning models for atrial fibrillation detection from Holter recordings: A benchmark study / M.M. Rahman, M.W. Rivolta, F. Badilini, R. Sassi. - In: BIOMEDICAL SIGNAL PROCESSING AND CONTROL. - ISSN 1746-8094. - 113:Part C(2026 Mar), pp. 109032.1-109032.10. [10.1016/j.bspc.2025.109032]

Uncertainty estimation of deep learning models for atrial fibrillation detection from Holter recordings: A benchmark study

M.M. Rahman (first author); M.W. Rivolta (second author); R. Sassi (last author)
2026

Abstract

With the development of deep learning (DL)-based methods, automated atrial fibrillation (AF) detection from electrocardiograms (ECGs) has recently gained much attention. Although the performance of DL has been encouraging, DL models are susceptible to overfitting, which makes uncertainty quantification (UQ) worth exploring to support their safe integration into clinical practice. However, UQ methods have seen limited exploration in DL models for AF detection from Holter ECG recordings, and a comprehensive comparison of UQ techniques is still missing. This study addresses that gap with a benchmark in which 11 distinct UQ methods were evaluated and compared on three public Holter datasets: IRIDIA-AF, Long-Term AF, and MIT-BIH AF. All UQ methods were applied to a residual DL model, one of the most common architectures in this domain because of its ability to capture complex patterns in ECG data. Batch-ensemble (BE) and packed-ensemble (PE) outperformed the other UQ methods both in performance, as quantified by sensitivity, specificity, and expected calibration error, and in computational efficiency. In addition, when reject inference was applied to discard ECG segments for which the model confidence was not sufficiently high, BE and PE rejected the fewest samples while retaining the highest detection performance.
Keywords: Atrial fibrillation; Bayesian deep learning; Deep learning; Uncertainty quantification
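
The abstract relies on two quantities that are easy to misread without a concrete definition: expected calibration error (ECE) and the rejection of low-confidence ECG segments. The following is a minimal sketch of both, assuming each segment comes with a single predicted AF probability. The function names, the 10 equal-width bins, and the 0.9 threshold are illustrative assumptions and are not taken from the study, whose exact settings are not given here.

```python
import numpy as np


def expected_calibration_error(probs, labels, n_bins=10):
    """Binary ECE: bin-weighted mean of |accuracy - mean confidence|.

    probs  : predicted probability of the positive (AF) class per segment
    labels : ground-truth labels (1 = AF, 0 = non-AF)
    n_bins : number of equal-width confidence bins (assumed value)
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=int)

    preds = (probs >= 0.5).astype(int)
    # Confidence in the predicted class, so it always lies in [0.5, 1].
    conf = np.where(preds == 1, probs, 1.0 - probs)
    correct = (preds == labels).astype(float)

    # Assign each segment to a bin (lower edge exclusive, upper edge inclusive).
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    bin_ids = np.clip(np.digitize(conf, edges[1:-1], right=True), 0, n_bins - 1)

    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            # mask.mean() is the fraction of all segments that fall in this bin.
            ece += mask.mean() * abs(correct[mask].mean() - conf[mask].mean())
    return float(ece)


def reject_low_confidence(probs, threshold=0.9):
    """Return a boolean mask: True = keep the segment, False = reject it.

    threshold is an assumed confidence cut-off, not a value from the study.
    """
    probs = np.asarray(probs, dtype=float)
    conf = np.maximum(probs, 1.0 - probs)
    return conf >= threshold


if __name__ == "__main__":
    # Hypothetical probabilities and labels, for illustration only.
    p = [0.95, 0.6, 0.05, 0.45]
    y = [1, 0, 0, 1]
    keep = reject_low_confidence(p, threshold=0.8)
    print("ECE:", expected_calibration_error(p, y))
    print("rejected fraction:", 1.0 - keep.mean())
```

Under this reading, a method that rejects fewer segments at a given threshold while keeping sensitivity and specificity high on the retained ones is preferable, which is the sense in which the abstract compares BE and PE with the other UQ methods.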
Sector INFO-01/A - Computer Science
Sector IBIO-01/A - Bioengineering
Adaptive AI methods for Digital Health (AIDH), POLITECNICO DI MILANO
March 2026
Article (author)
Use this identifier to cite or link to this document: https://hdl.handle.net/2434/1201595