To gain a comprehensive understanding of a patient’s health, advanced analytics must be applied to the data collected by electronic health record (EHR) systems. However, managing and curating this data requires carefully designed workflows. While digitalization and standardization enable continuous health monitoring, missing data values and technical issues can compromise the consistency and timeliness of the data. In this paper, we propose a workflow for developing prognostic models that leverages the SMART BEAR infrastructure and the capabilities of the Big Data Analytics (BDA) engine to homogenize and harmonize data points. Our workflow improves the quality of the data by evaluating different imputation algorithms and selecting one that maintains the distribution and correlation of features similar to the raw data. We applied this workflow to a subset of the data stored in the SMART BEAR repository and examined its impact on the prediction of emerging health states such as cardiovascular disease and mild depression. We also discussed the possibility of model validation by clinicians in the SMART BEAR project, the transmission of subsequent actions in the decision support system, and the estimation of the required number of data points.
Data management for continuous learning in EHR systems / V. Bellandi, P. Ceravolo, J. Maggesi, S. Maghool. - In: ACM TRANSACTIONS ON INTERNET TECHNOLOGY. - ISSN 1533-5399. - (2024), pp. 1-23. [Epub ahead of print] [10.1145/3660634]
Data management for continuous learning in EHR systems
V. BellandiPrimo
;P. CeravoloSecondo
;J. Maggesi;S. Maghool
Ultimo
2024
Abstract
To gain a comprehensive understanding of a patient’s health, advanced analytics must be applied to the data collected by electronic health record (EHR) systems. However, managing and curating this data requires carefully designed workflows. While digitalization and standardization enable continuous health monitoring, missing data values and technical issues can compromise the consistency and timeliness of the data. In this paper, we propose a workflow for developing prognostic models that leverages the SMART BEAR infrastructure and the capabilities of the Big Data Analytics (BDA) engine to homogenize and harmonize data points. Our workflow improves the quality of the data by evaluating different imputation algorithms and selecting one that maintains the distribution and correlation of features similar to the raw data. We applied this workflow to a subset of the data stored in the SMART BEAR repository and examined its impact on the prediction of emerging health states such as cardiovascular disease and mild depression. We also discussed the possibility of model validation by clinicians in the SMART BEAR project, the transmission of subsequent actions in the decision support system, and the estimation of the required number of data points.File | Dimensione | Formato | |
---|---|---|---|
ACM.pdf
accesso aperto
Tipologia:
Post-print, accepted manuscript ecc. (versione accettata dall'editore)
Dimensione
1.85 MB
Formato
Adobe PDF
|
1.85 MB | Adobe PDF | Visualizza/Apri |
Pubblicazioni consigliate
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.