Machine learning for prediction of in-hospital mortality in coronavirus disease 2019 patients: results from an Italian multicenter study

Vezzoli, M.; Inciardi, R.M.; Oriecuia, C.; Paris, S.; Murillo, N.H.; Agostoni, P.; Ameri, P.; Bellasi, A.; Camporotondo, R.; Canale, C.; Carubelli, V.; Carugo, S.; Catagnano, F.; Danzi, G.; Dalla Vecchia, L.; Giovinazzo, S.; Gnecchi, M.; Guazzi, M.; Iorio, A.; La Rovere, M.T.; Leonardi, S.; Maccagni, G.; Mapelli, M.; Margonato, D.; Merlo, M.; Monzo, L.; Mortara, A.; Nuzzi, V.; Pagnesi, M.; Piepoli, M.; Porto, I.; Pozzi, A.; Provenzale, G.; Sarullo, F.; Senni, M.; Sinagra, G.; Tomasoni, D.; Adamo, M.; Volterrani, M.; Maroldi, R.; Metra, M.; Lombardi, C.M.; Specchia, C.

doi:10.2459/JCM.0000000000001329

Background: Several risk factors have been identified as predictors of worse outcomes in patients with SARS-CoV-2 infection. Machine learning algorithms offer a novel approach to building a prediction model with good discriminatory capacity that can be used easily in clinical practice. The aim of this study was to derive a risk score for in-hospital mortality in patients with coronavirus disease 2019 (COVID-19) based on a limited number of features collected at hospital admission.

Methods and results: We studied an Italian cohort of consecutive adult Caucasian patients with laboratory-confirmed COVID-19 who were hospitalized in 13 cardiology units during spring 2020. The Lasso procedure was used to select the most relevant covariates. The dataset was randomly divided into a training set containing 80% of the data, used to estimate the model, and a test set containing the remaining 20%. A Random Forest modeled in-hospital mortality using the selected covariates; its accuracy was assessed with the ROC curve, yielding the AUC, sensitivity and specificity with their 95% confidence intervals (CI). This model was then compared with a Gradient Boosting Machine (GBM) and with logistic regression. Finally, to assess whether each model performed similarly in the training and test sets, the two AUCs were compared using DeLong's test. Among 701 patients enrolled (mean age 67.2 ± 13.2 years, 69.5% male), 165 (23.5%) died during a median hospitalization of 15 days (IQR, 9-24). The variables selected by the Lasso procedure were age, oxygen saturation, PaO2/FiO2, creatinine clearance and elevated troponin. Compared with those who survived, deceased patients were older and had lower blood oxygenation, lower creatinine clearance and a higher prevalence of elevated troponin (all P < 0.001). The best out-of-sample performance was provided by Random Forest, with an AUC of 0.78 (95% CI: 0.68-0.88) and a sensitivity of 0.88 (95% CI: 0.58-1.00). Moreover, Random Forest was the only model with similar in-sample and out-of-sample performance (DeLong test P = 0.78).

Conclusion: In a large COVID-19 population, we showed that a customizable machine learning-based score derived from clinical variables is feasible and effective for the prediction of in-hospital mortality.
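The evaluation step described in the abstract (AUC, sensitivity, specificity, and a DeLong comparison of training versus test AUC) can be illustrated with a minimal pure-Python sketch. It is not the authors' code. It assumes the unpaired form of DeLong's test, which is the natural choice when the two AUCs come from disjoint patient sets, although the abstract does not state which variant was used. The function names, the 0.5 threshold, and the synthetic scores are all hypothetical and exist only for illustration.

```python
import math
import random


def _psi(pos_score, neg_score):
    """Pairwise kernel: 1 if the positive outranks the negative, 0.5 on ties."""
    if pos_score > neg_score:
        return 1.0
    if pos_score == neg_score:
        return 0.5
    return 0.0


def auc_with_delong_variance(labels, scores):
    """Return (AUC, DeLong variance estimate) for binary labels (1 = died)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    m, n = len(pos), len(neg)
    if m < 2 or n < 2:
        raise ValueError("need at least two positives and two negatives")
    # Structural components: per-positive and per-negative placement values.
    v10 = [sum(_psi(x, y) for y in neg) / n for x in pos]
    v01 = [sum(_psi(x, y) for x in pos) / m for y in neg]
    auc = sum(v10) / m
    s10 = sum((v - auc) ** 2 for v in v10) / (m - 1)
    s01 = sum((v - auc) ** 2 for v in v01) / (n - 1)
    return auc, s10 / m + s01 / n


def unpaired_delong_test(labels_a, scores_a, labels_b, scores_b):
    """Two-sided z-test for equal AUCs on two independent samples."""
    auc_a, var_a = auc_with_delong_variance(labels_a, scores_a)
    auc_b, var_b = auc_with_delong_variance(labels_b, scores_b)
    se = math.sqrt(var_a + var_b)
    if se == 0.0:
        raise ValueError("zero variance: the test statistic is undefined")
    z = (auc_a - auc_b) / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return auc_a, auc_b, z, p_value


def sensitivity_specificity(labels, scores, threshold):
    """Classify score >= threshold as predicted death (threshold is assumed)."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def synthetic_cohort(size, rng):
    """Synthetic labels and risk scores; NOT data from the study."""
    labels, scores = [], []
    for _ in range(size):
        died = 1 if rng.random() < 0.235 else 0
        mean = 0.55 if died else 0.35
        labels.append(died)
        scores.append(min(1.0, max(0.0, rng.gauss(mean, 0.15))))
    return labels, scores


if __name__ == "__main__":
    rng = random.Random(0)
    # Roughly an 80/20 split of 701 patients, filled with synthetic values.
    train_y, train_s = synthetic_cohort(561, rng)
    test_y, test_s = synthetic_cohort(140, rng)
    auc_tr, auc_te, z, p = unpaired_delong_test(train_y, train_s, test_y, test_s)
    sens, spec = sensitivity_specificity(test_y, test_s, threshold=0.5)
    print(f"train AUC={auc_tr:.3f} test AUC={auc_te:.3f} z={z:.2f} p={p:.3f}")
    print(f"test sensitivity={sens:.2f} specificity={spec:.2f}")
```

In this sketch a large p-value means the training and test AUCs are statistically indistinguishable, which is the sense in which the abstract describes Random Forest as performing similarly in sample and out of sample.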

Machine learning for prediction of in-hospital mortality in coronavirus disease 2019 patients: results from an Italian multicenter study / M. Vezzoli, R.M. Inciardi, C. Oriecuia, S. Paris, N.H. Murillo, P. Agostoni, P. Ameri, A. Bellasi, R. Camporotondo, C. Canale, V. Carubelli, S. Carugo, F. Catagnano, G. Danzi, L. Dalla Vecchia, S. Giovinazzo, M. Gnecchi, M. Guazzi, A. Iorio, M.T. La Rovere, S. Leonardi, G. Maccagni, M. Mapelli, D. Margonato, M. Merlo, L. Monzo, A. Mortara, V. Nuzzi, M. Pagnesi, M. Piepoli, I. Porto, A. Pozzi, G. Provenzale, F. Sarullo, M. Senni, G. Sinagra, D. Tomasoni, M. Adamo, M. Volterrani, R. Maroldi, M. Metra, C.M. Lombardi, C. Specchia. - In: JOURNAL OF CARDIOVASCULAR MEDICINE. - ISSN 1558-2027. - 23:7(2022 Jul 01), pp. 439-446. [10.2459/JCM.0000000000001329]

Vezzoli, Marika;Inciardi, Riccardo Maria;Oriecuia, Chiara;Paris, Sara;Murillo, Natalia Herrera;P. Agostoni;Ameri, Pietro;A. Bellasi;Camporotondo, Rita;Canale, Claudia;Carubelli, Valentina;S. Carugo;Catagnano, Francesco;Danzi, Giambattista;Dalla Vecchia, Laura;Giovinazzo, Stefano;Gnecchi, Massimiliano;M. Guazzi;Iorio, Anita;La Rovere, Maria Teresa;Leonardi, Sergio;Maccagni, Gloria;M. Mapelli;Margonato, Davide;Merlo, Marco;Monzo, Luca;A. Mortara;Nuzzi, Vincenzo;Pagnesi, Matteo;M. Piepoli;Porto, Italo;Pozzi, Andrea;G. Provenzale;Sarullo, Filippo;Senni, Michele;Sinagra, Gianfranco;Tomasoni, Daniela;Adamo, Marianna;Volterrani, Maurizio;Maroldi, Roberto;Metra, Marco;Lombardi, Carlo Mario;Specchia, Claudia

2022



Keywords: coronavirus disease 2019; inflammation; machine learning methods; mortality score
Scientific-disciplinary sector of the article: MED/11 - Cardiovascular Diseases
Publication date: 1 July 2022
Journal (ANCE catalogue): JOURNAL OF CARDIOVASCULAR MEDICINE
DOI: https://dx.doi.org/10.2459/JCM.0000000000001329
Type: Article (author)
Appears in types: 01 - Journal article

Files in this item:

File	Size	Format
machine_learning_for_prediction_of_in_hospital.4.pdf (restricted access; description: Original Article; type: Publisher's version/PDF)	749.39 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/1007889

IRIS Institutional Research Information System - AIR Archivio Istituzionale della Ricerca