
Missing Data Imputation in Longitudinal Cohort Studies: Application of PLANN-ARD in Breast Cancer Survival / A. S. Fernandes, I. H. Jarman, T. A. Etchells, J. M. Fonseca, E. Biganzoli, C. Bajdik, P. J. G. Lisboa. - In: 2008 Seventh International Conference on Machine Learning and Applications. - [s.l] : CSREA Press, 2008. - ISBN 978-0-7695-3495-4. - pp. 644-649

Missing Data Imputation in Longitudinal Cohort Studies: Application of PLANN-ARD in Breast Cancer Survival

E. Biganzoli
2008

Abstract

Missing values are common in medical datasets and may be amenable to data imputation when modelling a given data set or validating on an external cohort. This paper discusses model averaging over samples of the imputed distribution and extends this approach to generic non-linear modelling with the Partial Logistic Artificial Neural Network (PLANN), regularised within the evidence-based framework with Automatic Relevance Determination (ARD). The study then applies the imputation to external validation over new patient cohorts, also considering the case of predictions made for individual patients. A prognostic index is defined for the non-linear model, and validation results show that four risk groups, identified as statistically significant at the 95% level of confidence from the modelling data from Christie Hospital (n=931), retain good separation during external validation with data from the British Columbia Cancer Agency (n=4,083).
Sector MED/01 - Medical Statistics
Book Part (author)
Use this identifier to cite or link to this document: http://hdl.handle.net/2434/191776