Robust Deep Learning-based Segmentation of Glioblastoma on Routine Clinical MRI Scans Using Sparsified Training

Eijgelaar, R.; Visser, M.; Müller, D.M.J.; Barkhof, F.; Vrenken, H.; Van Herk, M.; Bello, L.; Conti Nibali, M.; Rossi, M.; Sciortino, T.; Berger, M.; Hervey-Jumper, S.; Witte, M.

doi:10.1148/ryai.2020190103

Purpose: To improve the robustness of deep learning-based glioblastoma segmentation in a clinical setting with sparsified datasets.

Materials and Methods: In this retrospective study, preoperative T1-weighted, T2-weighted, T2-weighted fluid-attenuated inversion recovery, and postcontrast T1-weighted MRI from 117 patients (median age, 64 years; interquartile range [IQR], 55-73 years; 76 men) included within the Multimodal Brain Tumor Image Segmentation (BraTS) dataset plus a clinical dataset (2012-2013) with similar imaging modalities of 634 patients (median age, 59 years; IQR, 49-69 years; 382 men) with glioblastoma from six hospitals were used. Expert tumor delineations on the postcontrast images were available, but for various clinical datasets, one or more sequences were missing. The convolutional neural network, DeepMedic, was trained on combinations of complete and incomplete data with and without site-specific data. Sparsified training was introduced, which randomly simulated missing sequences during training. The effects of sparsified training and center-specific training were tested using Wilcoxon signed rank tests for paired measurements.

Results: A model trained exclusively on BraTS data reached a median Dice score of 0.81 for segmentation on BraTS test data but only 0.49 on the clinical data. Sparsified training improved performance (adjusted P < .05), even when excluding test data with missing sequences, to a median Dice score of 0.67. Inclusion of site-specific data during sparsified training led to higher model performance, with Dice scores greater than 0.8, on par with a model based on all complete and incomplete data. For the model using BraTS and clinical training data, inclusion of site-specific data or sparsified training was of no consequence.

Conclusion: Accurate and automatic segmentation of glioblastoma on clinical scans is feasible using a model based on large, heterogeneous, and partially incomplete datasets. Sparsified training may boost the performance of a smaller model based on public and site-specific data.
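As an illustration of the two technical ideas in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes four input channels in the order T1, T2, FLAIR, postcontrast T1, that a "missing" sequence is simulated by zero-filling its channel, and that the postcontrast channel is always kept. The function names, the drop probability, and the zero-fill choice are hypothetical and are not taken from the paper or from DeepMedic.

```python
import numpy as np

# Assumed channel order; the abstract lists these four sequences,
# but the ordering here is an illustrative choice.
SEQUENCES = ("T1", "T2", "FLAIR", "T1c")


def sparsify(volume, rng, drop_prob=0.25, keep_channel=3):
    """Randomly blank input sequences to simulate missing scans.

    volume: array of shape (channels, x, y, z).
    drop_prob and zero-filling are assumptions, not values from the paper.
    keep_channel (postcontrast T1 here) is never dropped.
    """
    out = volume.copy()
    for c in range(out.shape[0]):
        if c != keep_channel and rng.random() < drop_prob:
            out[c] = 0.0
    return out


def dice_score(pred, truth):
    """Dice overlap 2|A∩B| / (|A| + |B|) between two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    denom = pred.sum() + truth.sum()
    if denom == 0:
        return 1.0  # convention chosen here: two empty masks agree fully
    return 2.0 * np.logical_and(pred, truth).sum() / denom
```

In a training loop, `sparsify` would be applied to each sampled case before it is passed to the network, so that the model sees both complete and incomplete inputs. `dice_score` can then be computed per patient on the held-out data to obtain the paired measurements that the study compares.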

Robust Deep Learning-based Segmentation of Glioblastoma on Routine Clinical MRI Scans Using Sparsified Training / R. Eijgelaar, M. Visser, D. Müller, F. Barkhof, H. Vrenken, M. van Herk, L. Bello, M. Conti Nibali, M. Rossi, T. Sciortino, M. Berger, S. Hervey-Jumper, M. Witte. - In: RADIOLOGY. ARTIFICIAL INTELLIGENCE. - ISSN 2638-6100. - 2:5(2020 Sep), pp. e190103.1-e190103.9. [10.1148/ryai.2020190103]



Scientific-disciplinary sectors of the article (view only): Sector MED/27 - Neurosurgery
Publication date: Sep 2020
Journal in ANCE: RADIOLOGY. ARTIFICIAL INTELLIGENCE
DOI: https://dx.doi.org/10.1148/ryai.2020190103
Type: Article (author)
Appears in types: 01 - Journal article

Files in this item:

File	Size	Format
ryai.2020190103.pdf (open access; type: Publisher's version/PDF)	1.08 MB	Adobe PDF


Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/902535


IRIS Institutional Research Information System - AIR Archivio Istituzionale della Ricerca