The introduction of time series modeling techniques has made it possible to analyze the different factors underlying changes in mortality and incidence rates over time, for both analytic and predictive purposes. Age-period-cohort analyses contribute to the etiologic purpose of descriptive epidemiology by making inference from the group to the individual possible. The term refers to a family of statistical techniques that study temporal trends in outcomes, such as mortality and incidence, in terms of three temporal variables: the subject's age, the calendar period and the subject's birth cohort. Useful as it is, the age-period-cohort model is marred by a structural identifiability problem: age, period and cohort are exactly linearly dependent, i.e. "age = period - cohort". Predicting a future event is a complex and insidious process, but it is a useful endeavor in most human activities. Information on probable future trends, even if unreliable or imprecise, is highly valuable. Predicted future cancer incidence and mortality rates are essential tools for both epidemiology and health planning. Numerous methods for age-period-cohort analysis are described in the literature; three of these are illustrated in detail and compared by applying them to real data (WHO mortality database): a method based on penalized likelihood, one using generalized additive models (GAM) and one based on partial least squares (PLS) techniques. Predictive analysis techniques are also presented and compared using observed mortality data. Short-term age-period prediction methods based on joinpoint analysis and Bayesian modelling, and a long-term technique that uses a Bayesian age-period-cohort model, are reviewed. In detail, predictions with the age-period method based on joinpoint analysis are carried out by applying linear, Poisson and log-linear regression models.
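As a rough illustration of how the three regression models named above extrapolate a trend, the following minimal sketch fits each of them to a single trend segment and projects it five years ahead. It is not the procedure used in the thesis: it assumes the final joinpoint segment has already been identified, it ignores age structure, and the data (`deaths`, `person_years`) as well as the function names are hypothetical, synthetic values chosen only for the example.

```python
import numpy as np

def fit_linear(t, rate):
    """Ordinary least squares on the rate scale: rate = a + b * t."""
    b, a = np.polyfit(t, rate, 1)
    return a, b

def fit_loglinear(t, rate):
    """Ordinary least squares on the log scale: log(rate) = a + b * t."""
    b, a = np.polyfit(t, np.log(rate), 1)
    return a, b

def fit_poisson(t, deaths, person_years, n_iter=50, tol=1e-10):
    """Poisson regression (log link, offset log(person_years)) fitted by
    iteratively reweighted least squares: log(rate) = a + b * t."""
    X = np.column_stack([np.ones_like(t, dtype=float), t])
    offset = np.log(person_years)
    # Start from the overall crude rate and a flat trend.
    beta = np.array([np.log(deaths.sum() / person_years.sum()), 0.0])
    for _ in range(n_iter):
        mu = np.exp(X @ beta + offset)      # expected deaths
        z = X @ beta + (deaths - mu) / mu   # working response (offset removed)
        XtW = X.T * mu                      # Poisson working weights equal mu
        new_beta = np.linalg.solve(XtW @ X, XtW @ z)
        converged = np.max(np.abs(new_beta - beta)) < tol
        beta = new_beta
        if converged:
            break
    return beta[0], beta[1]

# Hypothetical synthetic segment: 12 years, rate falling by about 2% per year.
t = np.arange(12, dtype=float)              # years since start of the segment
person_years = np.full(12, 1_000_000.0)
deaths = np.round(person_years * 20e-5 * np.exp(-0.02 * t))
rate = deaths / person_years

t_future = np.arange(12, 17, dtype=float)   # five years ahead

a_lin, b_lin = fit_linear(t, rate)
pred_linear = a_lin + b_lin * t_future

a_log, b_log = fit_loglinear(t, rate)
pred_loglinear = np.exp(a_log + b_log * t_future)

a_poisson, b_poisson = fit_poisson(t, deaths, person_years)
pred_poisson = np.exp(a_poisson + b_poisson * t_future)

for name, pred in [("linear", pred_linear),
                   ("log-linear", pred_loglinear),
                   ("Poisson", pred_poisson)]:
    print(f"{name:>10}: {np.round(pred * 1e5, 2)} per 100,000")
```

On this synthetic series the underlying trend is exponential, so the log-linear and Poisson projections nearly coincide while the straight-line projection falls below them. That ordering is a property of the made-up data and should not be read as a general result.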
In the comparison of age-period-cohort analyses, the penalized likelihood and GAM methods produce similar results, while effect estimates from the PLS model are noticeably different. These differences can be explained by how the three models resolve the perfect collinearity between the age, period and cohort parameters. The penalized likelihood and GAM methods use different techniques to distribute the linear drift between the period and cohort effects. The PLS method, by contrast, addresses the identifiability problem through a generalized inverse, minimizing the estimated parameter variance-covariance matrix. In the absence of a formal simulation analysis, comments are limited to stating that the two models based on distributing the linear drift are more suitable for epidemiological comparisons in which the effects of age are well defined (as in the case of cancer mortality) and the major difficulty lies in untangling the period and cohort effects. The PLS model, on the other hand, may hypothetically prove to be a useful method for predicting future trends. Age-period-cohort analysis is thus an extremely useful tool in the study of mortality data, particularly for the analysis of cohort effects, but it should be used with due caution, since it is relatively easy to draw erroneous conclusions. The comparison of predictive methods shows that estimates from the different models are similar, especially for the Poisson and log-linear models. However, the linear model tends to underestimate, while the other models considered seem to overestimate, particularly as the forecasting horizon grows longer. Overall, the Bayesian age-period model seems to be less suitable for short- and medium-term mortality predictions, while the other models do not show large differences in performance.
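The collinearity discussed above can be made concrete with a small numerical sketch. It assumes a deliberately simplified model with only linear age, period and cohort terms on a hypothetical grid of ages and periods; the coefficient values are invented for illustration. It shows that the design matrix loses one rank, that shifting a linear drift between the effects leaves the fitted values unchanged, and that a Moore-Penrose generalized inverse picks one particular (minimum-norm) solution. The last step only illustrates the general idea of a generalized inverse and is not the PLS procedure of the thesis.

```python
import numpy as np

# Hypothetical grid: 8 five-year age groups by 8 five-year calendar periods.
ages = np.arange(40, 80, 5)
periods = np.arange(1970, 2010, 5)
A, P = np.meshgrid(ages, periods, indexing="ij")
age = A.ravel().astype(float)
period = P.ravel().astype(float)
cohort = period - age                     # the exact linear dependence

# Design matrix with an intercept and centered linear terms.
X = np.column_stack([
    np.ones_like(age),
    age - age.mean(),
    period - period.mean(),
    cohort - cohort.mean(),
])

# Four columns, but only three are linearly independent.
rank = np.linalg.matrix_rank(X)
print("columns:", X.shape[1], "rank:", rank)

# Invented coefficients on the log-rate scale: intercept, age, period, cohort.
beta_1 = np.array([-8.0, 0.08, 0.01, -0.02])

# Adding any multiple of this vector changes the coefficients but not the
# fit, because age - period + cohort is identically zero.
null_vector = np.array([0.0, 1.0, -1.0, 1.0])
beta_2 = beta_1 + 0.03 * null_vector

fitted_1 = X @ beta_1
fitted_2 = X @ beta_2
print("same fitted values:", np.allclose(fitted_1, fitted_2))

# The Moore-Penrose pseudoinverse returns the minimum-norm solution, which
# is orthogonal to the null vector.
beta_min_norm = np.linalg.pinv(X) @ fitted_1
print("minimum-norm coefficients:", np.round(beta_min_norm, 4))
```

Because `beta_1` and `beta_2` fit the data equally well, the data alone cannot decide how the drift should be split between period and cohort; each method has to add a constraint or criterion of its own to settle on one solution.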
From these limited tests, the linear model and the Bayesian age-period-cohort model seem to provide better estimates when mortality values are low, whereas for larger numbers the Poisson and log-linear models seem to be better choices. Ultimately, which model predicts more successfully depends on the unknown shape of the distribution underlying the analyzed data. However, all the models studied are appropriate for predictions over short periods (up to 5 years), while none of them performs well over the medium term. Prediction of future trends will always be a complex and insidious exercise, albeit an extremely useful one; the estimates obtained should be taken with caution and regarded only as a general indication of potential interest for epidemiology and health planning.
METODI STATISTICI PER L'ANALISI E LA PREVISIONE DELLA MORTALITA' PER TUMORE [Statistical methods for the analysis and prediction of cancer mortality] / T. Rosso ; tutor: A. Decarli ; coordinator: A. Decarli. DIPARTIMENTO DI SCIENZE CLINICHE E DI COMUNITA', 2015 Dec 11. 28th cycle, academic year 2015. [10.13130/rossotiziana_phd20151211].
METODI STATISTICI PER L'ANALISI E LA PREVISIONE DELLA MORTALITA' PER TUMORE
T. Rosso
2015

File: phd_unimi_R10099.pdf (complete doctoral thesis, Adobe PDF, 1.84 MB); open access from 10/06/2017.