Objective: HIV treatment failure is commonly associated with drug resistance, and the selection of a new regimen is often guided by genotypic resistance testing. The interpretation of complex genotypic data poses a major challenge. We have developed artificial neural network (ANN) models that predict virological response to therapy from HIV genotype and other clinical information. Here we compare the accuracy of ANN with two alternative modelling methodologies: random forests (RF) and support vector machines (SVM).
Methods: Data from 1204 treatment change episodes (TCEs) were identified from the HIV Resistance Response Database Initiative (RDI) database and partitioned at random into a training set of 1154 and a test set of 50. The training set was then partitioned using an L-fold cross-validation scheme (L = 10 in this study) for training individual computational models. Seventy-six input variables were used for training the models: 55 baseline genotype mutations; the 14 potential drugs in the new treatment regimen; four treatment history variables; baseline viral load; CD4 count; and time to follow-up viral load. The output variable was follow-up viral load. Performance was evaluated in terms of the correlations and absolute differences between the individual models' predictions and the actual change in viral load (ΔVL) values.
Results: The correlations (r²) between predicted and actual ΔVL varied from 0.318 to 0.546 for ANN, 0.590 to 0.751 for RF and 0.300 to 0.720 for SVM. The mean absolute differences varied from 0.677 to 0.903 for ANN, 0.494 to 0.644 for RF and 0.500 to 0.790 for SVM. Individual ANN models were significantly inferior to RF and SVM models. The predictions of the ANN, RF and SVM committees all correlated highly significantly with the actual ΔVL of the independent test TCEs, producing r² values of 0.689, 0.707 and 0.620, respectively. The mean absolute differences were 0.543, 0.600 and 0.607 log10 copies/ml for ANN, RF and SVM, respectively. There were no statistically significant differences between the three committees. Combining the committees' outputs improved correlations between predicted and actual virological responses. The combination of all three committees gave a correlation of r² = 0.728. The mean absolute differences followed a similar pattern.
Conclusions: RF and SVM models can produce predictions of virological response to HIV treatment that are comparable in accuracy to a committee of ANN models. Combining the predictions of different models improves their accuracy somewhat. This approach has potential as a future clinical tool, and a combination of ANN and RF models is being taken forward for clinical evaluation.
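The evaluation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the reported r2 is the squared Pearson correlation, that committee outputs are combined by a simple average, and that the folds come from a shuffled equal split; the abstract confirms none of these details. The function names are hypothetical and all numbers are synthetic stand-ins, not RDI data.

```python
import numpy as np


def k_fold_indices(n_samples, k, rng):
    """Shuffle sample indices and split them into k nearly equal folds."""
    return np.array_split(rng.permutation(n_samples), k)


def committee_average(member_predictions):
    """Average the predictions of several models (one row per model)."""
    return np.mean(np.asarray(member_predictions, dtype=float), axis=0)


def r_squared(predicted, actual):
    """Squared Pearson correlation between predicted and actual values."""
    return float(np.corrcoef(predicted, actual)[0, 1] ** 2)


def mean_abs_diff(predicted, actual):
    """Mean absolute difference between predicted and actual values."""
    diff = np.asarray(predicted, dtype=float) - np.asarray(actual, dtype=float)
    return float(np.mean(np.abs(diff)))


rng = np.random.default_rng(0)

# Partition 1154 training episodes into 10 folds (sizes from the abstract).
folds = k_fold_indices(1154, 10, rng)

# Synthetic viral load changes for a 50-episode test set (NOT real data).
actual = rng.normal(-1.5, 1.0, size=50)

# Hypothetical committee predictions: the truth plus independent noise.
committees = {
    name: actual + rng.normal(0.0, 0.7, size=50)
    for name in ("ANN", "RF", "SVM")
}

# Assumed combination rule: unweighted mean of the three committees.
combined = committee_average(list(committees.values()))

for name, pred in committees.items():
    print(name, round(r_squared(pred, actual), 3),
          round(mean_abs_diff(pred, actual), 3))
print("combined", round(r_squared(combined, actual), 3),
      round(mean_abs_diff(combined, actual), 3))
```

When the committees' errors are largely independent, as in this synthetic setup, averaging tends to cancel part of the noise. That is one plausible reading of why the combined predictions in the abstract correlate somewhat better with the actual responses than any single committee.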
A comparison of three computational modelling methods for the prediction of virological response to combination therapy / D. Wang, B. Larder, A. Revell, J. Montaner, R. Harrigan, F. De Wolf, J. Lange, S. Wegner, L. Ruiz, M.J. Perez-Elias, S. Emery, J. Gatell, A. d’Arminio Monforte, C. Torti, M. Zazzi, C. Lane. - In: ARTIFICIAL INTELLIGENCE IN MEDICINE. - ISSN 0933-3657. - 47:1(2009), pp. 63-74.