To gain a comprehensive understanding of a patient’s health, advanced analytics must be applied to the data collected by electronic health record (EHR) systems. However, managing and curating this data requires carefully designed workflows. While digitalization and standardization enable continuous health monitoring, missing data values and technical issues can compromise the consistency and timeliness of the data. In this paper, we propose a workflow for developing prognostic models that leverages the SMART BEAR infrastructure and the capabilities of the Big Data Analytics (BDA) engine to homogenize and harmonize data points. Our workflow improves the quality of the data by evaluating different imputation algorithms and selecting one that maintains the distribution and correlation of features similar to the raw data. We applied this workflow to a subset of the data stored in the SMART BEAR repository and examined its impact on the prediction of emerging health states such as cardiovascular disease and mild depression. We also discussed the possibility of model validation by clinicians in the SMART BEAR project, the transmission of subsequent actions in the decision support system, and the estimation of the required number of data points.

Data management for continuous learning in EHR systems / V. Bellandi, P. Ceravolo, J. Maggesi, S. Maghool. - In: ACM TRANSACTIONS ON INTERNET TECHNOLOGY. - ISSN 1533-5399. - (2024), pp. 1-23. [Epub ahead of print] [10.1145/3660634]

Data management for continuous learning in EHR systems

V. Bellandi
Primo
;
P. Ceravolo
Secondo
;
J. Maggesi;S. Maghool
Ultimo
2024

Abstract

To gain a comprehensive understanding of a patient’s health, advanced analytics must be applied to the data collected by electronic health record (EHR) systems. However, managing and curating this data requires carefully designed workflows. While digitalization and standardization enable continuous health monitoring, missing data values and technical issues can compromise the consistency and timeliness of the data. In this paper, we propose a workflow for developing prognostic models that leverages the SMART BEAR infrastructure and the capabilities of the Big Data Analytics (BDA) engine to homogenize and harmonize data points. Our workflow improves the quality of the data by evaluating different imputation algorithms and selecting one that maintains the distribution and correlation of features similar to the raw data. We applied this workflow to a subset of the data stored in the SMART BEAR repository and examined its impact on the prediction of emerging health states such as cardiovascular disease and mild depression. We also discussed the possibility of model validation by clinicians in the SMART BEAR project, the transmission of subsequent actions in the decision support system, and the estimation of the required number of data points.
Internet of Things; Electronic Health Records; Data Management; Continuous Learning
Settore INF/01 - Informatica
2024
7-mag-2024
https://dl.acm.org/doi/pdf/10.1145/3660634
Article (author)
File in questo prodotto:
File Dimensione Formato  
ACM.pdf

accesso aperto

Tipologia: Post-print, accepted manuscript ecc. (versione accettata dall'editore)
Dimensione 1.85 MB
Formato Adobe PDF
1.85 MB Adobe PDF Visualizza/Apri
Pubblicazioni consigliate

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/2434/1048814
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus ND
  • ???jsp.display-item.citation.isi??? ND
social impact