Explainable machine learning for early assessment of COVID-19 risk prediction in emergency departments / E. Casiraghi, D. Malchiodi, G. Trucco, M. Frasca, L. Cappelletti, F. Tommaso, A.A. Esposito, E. Avola, A. Jachetti, J. Reese, A. Rizzi, P.N. Robinson, G. Valentini. - In: IEEE ACCESS. - ISSN 2169-3536. - 8(2020), pp. 196299-196325.

Explainable machine learning for early assessment of COVID-19 risk prediction in emergency departments

E. Casiraghi (first author); D. Malchiodi (second author); G. Trucco; M. Frasca; L. Cappelletti; F. Tommaso; A.A. Esposito; E. Avola; A. Rizzi; G. Valentini (last author)
2020

Abstract

Between January and October of 2020, the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) virus infected more than 34 million persons in a worldwide pandemic, leading to over one million deaths (data from the Johns Hopkins University). Since the virus began to spread, emergency departments were busy with COVID-19 patients for whom a quick decision regarding in- or outpatient care was required. The virus can cause characteristic abnormalities in chest radiographs (CXR), but, due to the low sensitivity of CXR, additional variables and criteria are needed to accurately predict risk. Here, we describe a computerized system primarily aimed at extracting the most relevant radiological, clinical, and laboratory variables for improving patient risk prediction, and secondarily at presenting an explainable machine learning system, which may provide simple decision criteria to be used by clinicians as a support for assessing patient risk. To achieve robust and reliable variable selection, Boruta and Random Forest (RF) are combined in a 10-fold cross-validation scheme to produce a variable importance estimate not biased by the presence of surrogates. The most important variables are then selected to train an RF classifier, whose rules may be extracted, simplified, and pruned to finally build an associative tree, particularly appealing for its simplicity. Results show that the radiological score automatically computed through a neural network is highly correlated with the score computed by radiologists, and that laboratory variables, together with the number of comorbidities, aid risk prediction. The prediction performance of our approach was compared to that of generalized linear models and shown to be effective and robust. The proposed machine learning-based computational system can be easily deployed and used in emergency departments for rapid and accurate risk prediction in COVID-19 patients.
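The variable-selection idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it shows only the shadow-feature principle behind Boruta, repeated over 10 folds, and it assumes a simple stand-in importance score (absolute Pearson correlation with the label) in place of Random Forest importance. The function names, the feature names, and the synthetic data are all hypothetical.

```python
import random
from statistics import mean, pstdev


def abs_corr(xs, ys):
    """Absolute Pearson correlation, used here as a stand-in importance score
    (assumption: the paper uses Random Forest importance instead)."""
    mx, my = mean(xs), mean(ys)
    sx, sy = pstdev(xs), pstdev(ys)
    if sx == 0 or sy == 0:
        return 0.0
    cov = mean((x - mx) * (y - my) for x, y in zip(xs, ys))
    return abs(cov / (sx * sy))


def shadow_selection(X, y, names, n_folds=10, seed=0):
    """For each feature, count the folds in which its score on the training
    part exceeds the best score among shuffled ("shadow") copies."""
    rng = random.Random(seed)
    idx = list(range(len(y)))
    rng.shuffle(idx)
    folds = [idx[k::n_folds] for k in range(n_folds)]
    hits = {name: 0 for name in names}
    for k in range(n_folds):
        held_out = set(folds[k])
        # Only the training part of each fold is scored in this sketch.
        train = [i for i in idx if i not in held_out]
        y_tr = [y[i] for i in train]
        cols = [[X[i][j] for i in train] for j in range(len(names))]
        # Shadow features: shuffling a column breaks any link to the label.
        shadow_max = 0.0
        for col in cols:
            shadow = col[:]
            rng.shuffle(shadow)
            shadow_max = max(shadow_max, abs_corr(shadow, y_tr))
        for name, col in zip(names, cols):
            if abs_corr(col, y_tr) > shadow_max:
                hits[name] += 1
    return hits


# Synthetic toy data (hypothetical): one informative column, one pure noise.
data_rng = random.Random(42)
y = [data_rng.randint(0, 1) for _ in range(300)]
X = [[label + data_rng.gauss(0, 0.5), data_rng.gauss(0, 1)] for label in y]
hits = shadow_selection(X, y, ["informative", "noise"])
print(hits)
```

A feature that beats the shadow features in most folds would be a candidate for the final classifier. Counting hits across folds is one simple way to make the selection less dependent on a single data split.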
COVID-19; Predictive models; Machine learning; Radio frequency; Computational modeling; Feature extraction; Data models; Associative tree; Boruta feature selection; clinical data analysis; generalized linear models; missing data imputation; random forest classifier; risk prediction
Academic field: INF/01 - Informatica (Computer Science)
   PIANO DI SOSTEGNO ALLA RICERCA 2015-2017 - LINEA 2 "DOTAZIONE ANNUALE PER ATTIVITA' ISTITUZIONALE"
2020
Article (author)
Files in this record:
IEEE-Access-published.pdf (open access)
Description: Published article
Type: Publisher's version/PDF
Size: 2.49 MB
Format: Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/780630