Introduction. The attributable fraction (AF), proposed by Levin, quantifies the reduction in disease prevalence that could be achieved by eliminating the exposure (or risk factor) of interest from the population. Disease etiology involves multiple risk factors that may act simultaneously in the occurrence of disease, and finding the optimal approach to quantify the individual and joint effects of different risk factors on the disease burden is one of the goals of epidemiological research. Adjusted AFs quantify the effect of one risk factor after controlling for other factors (i.e., risk factors that may act together to cause disease, adjustment variables, or confounders). Adjusted AFs may sum to more than the joint AF (i.e., the AF for eliminating all risk factors from the population) and in some situations to more than 1, leading to the conclusion that adjusted AFs should not be used to partition the joint effect into individual contributions. Eide and Gefeller proposed a way to accomplish this task. Sequential AFs quantify the additional effect of one risk factor on the disease risk after the preceding risk factors have already been removed from the population in a specified order. However, sequential AFs depend on the order in which risk factors are removed. Average AFs overcome this shortcoming by averaging the sequential AFs for a risk factor over all orders in which risk factors can be removed from the population. Average AFs thus quantify the additional effect of one risk factor on the disease risk after a randomly selected set of preceding factors has already been removed from the population. Objective. This work aims to illustrate the main methodologies for estimating AFs and the corresponding confidence intervals in the presence of multiple risk factors, with a focus on the case-control study design. Moreover, we provide AF estimates for the major risk factors using Italian case-control data on oral cavity and breast cancers.
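The definitions of joint, sequential, and average AFs can be illustrated with a minimal Python sketch. It assumes a toy population with two binary risk factors whose exposure shares and disease risks are invented for illustration (they are not taken from the thesis), and it ignores confounding and estimation error; the function names are likewise hypothetical.

```python
from itertools import permutations

# Hypothetical toy population: two binary risk factors (index 0 and 1).
# exposure_prob[(a, b)] is the population share with exposure pattern (a, b).
exposure_prob = {(0, 0): 0.4, (1, 0): 0.2, (0, 1): 0.2, (1, 1): 0.2}
# Hypothetical disease risk for each exposure pattern.
risk = {(0, 0): 0.01, (1, 0): 0.03, (0, 1): 0.02, (1, 1): 0.08}
factors = (0, 1)


def disease_prob(removed):
    """Disease probability when the factors in `removed` are set to unexposed."""
    total = 0.0
    for pattern, share in exposure_prob.items():
        modified = tuple(0 if j in removed else x for j, x in enumerate(pattern))
        total += share * risk[modified]
    return total


def joint_af(removed):
    """AF for removing all factors in `removed` at once."""
    baseline = disease_prob(())
    return (baseline - disease_prob(tuple(removed))) / baseline


def sequential_af(order):
    """Additional AF gained by each factor when removed in the given order."""
    result = {}
    removed = []
    for j in order:
        before = joint_af(removed)
        removed.append(j)
        result[j] = joint_af(removed) - before
    return result


def average_af():
    """Mean of the sequential AFs of each factor over all removal orders."""
    orders = list(permutations(factors))
    totals = {j: 0.0 for j in factors}
    for order in orders:
        for j, value in sequential_af(order).items():
            totals[j] += value
    return {j: totals[j] / len(orders) for j in factors}
```

With these invented numbers, removing each factor alone gives AFs of about 0.53 and 0.40, which sum to more than the joint AF of about 0.67, whereas the average AFs (about 0.40 and 0.27) sum exactly to the joint AF.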
Modification for the case-control study design. In their original formulation, sequential and average AFs cannot be used in case-control studies, since the ratio of controls to cases in the sample is fixed a priori and the resulting AF estimates would be biased. Ferguson et al. proposed a prevalence-based weighting approach to correct the imbalance between controls and cases. The method consists of weighting, according to the disease prevalence, the likelihood function of the model used to estimate sequential and average AFs. Variance estimation. The main approaches for estimating AF confidence intervals (CIs) are based on asymptotic approximation (Delta method) and on simulation (Monte Carlo method). Ferguson et al. proposed a method based on Monte Carlo simulation for estimating the variance of average AFs. They also developed the “averisk” R package for calculating average AFs and the corresponding CIs in both prospective and case-control studies. In this work, we propose a modification of Ferguson’s method that accounts for the contribution of sequential AF variability to the total variability. Variance comparison. We compared our method and Ferguson’s for estimating the average AF variance using simulated data. We generated two classes of simulated datasets. Each class included four scenarios with different correlation structures, from independence (scenario 1) to strong correlation among risk factors (scenario 4). The two classes differed in the prevalence of the risk factors and in the strength of their association with disease. In particular, the first class had a high prevalence and modest relative risks, whereas the second class had a low prevalence and very large relative risks. For both classes of simulated data, the standard deviation increment (i.e., the relative difference between our method and Ferguson’s) grew gradually as the number of independent risk factors increased (from two to ten). Conversely, the standard deviation increment decreased as the number of correlated risk factors increased.
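The idea behind prevalence-based weighting can be sketched as follows. This is a minimal illustration under the assumption that cases keep weight 1 and controls are rescaled so that the weighted share of cases equals the assumed disease prevalence; it is not necessarily the exact scheme of Ferguson et al. or of the averisk package. The function names and the prevalence value used in the example are hypothetical.

```python
def prevalence_weights(n_cases, n_controls, prevalence):
    """Return (case_weight, control_weight) for a case-control sample.

    Assumption for illustration: cases keep weight 1, and controls are
    rescaled so the weighted case share matches `prevalence`.
    """
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must lie strictly between 0 and 1")
    case_weight = 1.0
    control_weight = (1.0 - prevalence) / prevalence * n_cases / n_controls
    return case_weight, control_weight


def weighted_case_share(n_cases, n_controls, prevalence):
    """Share of cases in the weighted sample (should equal `prevalence`)."""
    case_w, control_w = prevalence_weights(n_cases, n_controls, prevalence)
    weighted_cases = n_cases * case_w
    weighted_controls = n_controls * control_w
    return weighted_cases / (weighted_cases + weighted_controls)


# Sample sizes of the oral cavity study; the prevalence of 0.01 is a
# hypothetical placeholder, not the AIRTUM figure used in the thesis.
case_w, control_w = prevalence_weights(946, 2492, 0.01)
```

Weights of this kind would then be passed to the weighted likelihood of the model (for example a weighted logistic regression) from which the sequential and average AFs are computed.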
Although in some situations (i.e., for correlated risk factors) the contribution of our method could have a substantial relative impact on total AF variability (up to 88%), the absolute differences in standard deviation between the two methods were very small (less than 0.15), indicating a limited contribution of our method relative to Ferguson’s. Application to real data. We estimated average AFs using a case-control study conducted in Italy on 946 oral cavity cancer cases and 2492 controls. The risk factors considered for AF estimation were smoking, alcohol drinking, red meat intake, vegetable intake, fruit intake, and family history of oral cavity cancer. The final model also included terms for sex, age, study centre, years of education, BMI, and non-alcohol energy intake to account for possible confounding. We set the prevalence of oral cavity cancer according to statistics from the consortium of Italian Cancer Registries (AIRTUM) to adjust the average AFs for the case-control data structure. Eighty-eight percent (95% CI: 78%; 98%) of oral cavity cancer cases were attributable to the considered risk factors. In particular, the average AF for smoking was 0.34 (95% CI: 0.27; 0.41), indicating that 34% of oral cavity cancer cases would not have occurred if smoking were removed from the population, averaging over all possible risk factor removal orders. For the remaining risk factors, average AFs were 0.27 (95% CI: 0.17; 0.37) for alcohol drinking, 0.11 (95% CI: 0.06; 0.17) for low vegetable intake, 0.08 (95% CI: 0.02; 0.15) for low fruit intake, 0.06 (95% CI: 0.01; 0.12) for high red meat intake, and 0.009 (95% CI: 0.001; 0.02) for family history. We analyzed a further case-control study on 2569 breast cancer cases and 2588 controls. We set the prevalence of breast cancer to adjust the average AFs for the case-control data structure.
The final model included alcohol drinking, parity, breastfeeding, use of oral contraceptives (OCs), and family history of breast cancer as risk factors, and study centre, age, years of education, smoking, age at menarche, and use of hormone replacement therapy (HRT) as adjustment factors. The joint AF was 0.49 (95% CI: 0.35; 0.63), indicating that approximately half of the breast cancer cases would not have occurred if all risk factors were simultaneously eliminated from the population. In particular, average AFs were 0.27 (95% CI: 0.16; 0.39) for parity, 0.12 (95% CI: 0.06; 0.18) for alcohol drinking, 0.04 (95% CI: 0.02; 0.10) for breastfeeding (no or <4 months), 0.04 (95% CI: 0.03; 0.06) for family history of breast cancer, and 0.01 (95% CI: 0.01; 0.03) for OC use. Conclusions. Sequential and average AFs are useful tools to apportion exposure-specific contributions in a population exposed to multiple risk factors. Sequential and average AFs share some mathematical properties, such as component additivity, symmetry, marginal rationality, and internal marginal rationality. Average AFs, however, do not represent the actual amount of disease ascribable to each risk factor, because they assume that risk factors are removed from the population in a random order. Nevertheless, average AFs could be useful parameters to estimate the average burden of disease for each risk factor across all possible removal orders. In this work, we proposed an alternative approach to estimate the average AF confidence interval that accounts for the contribution of sequential AF variability to the total AF variability. We compared the performance of our method and Ferguson’s in estimating AF variance. Although our method could have a substantial relative impact on total AF variability, the absolute differences in standard deviation suggest a limited contribution of our method. However, this topic warrants further analysis.
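The component additivity property mentioned above implies that the average AFs should sum to the joint AF. As a quick arithmetic check, the short Python sketch below sums the point estimates reported in this abstract; the variable names and the rounding tolerance of 0.02 are illustrative choices, not part of the thesis.

```python
# Average AF point estimates as reported in the abstract.
oral_average_afs = {
    "smoking": 0.34,
    "alcohol drinking": 0.27,
    "low vegetable intake": 0.11,
    "low fruit intake": 0.08,
    "high red meat intake": 0.06,
    "family history": 0.009,
}
breast_average_afs = {
    "parity": 0.27,
    "alcohol drinking": 0.12,
    "breastfeeding (no or <4 months)": 0.04,
    "family history": 0.04,
    "OC use": 0.01,
}

# Joint AFs as reported in the abstract.
oral_joint_af = 0.88
breast_joint_af = 0.49

oral_sum = sum(oral_average_afs.values())
breast_sum = sum(breast_average_afs.values())

# Illustrative tolerance: the published values are rounded, so the sums
# are only expected to match the joint AFs approximately.
tolerance = 0.02
oral_consistent = abs(oral_sum - oral_joint_af) <= tolerance
breast_consistent = abs(breast_sum - breast_joint_af) <= tolerance
```

The sums are 0.869 for oral cavity cancer and 0.48 for breast cancer, which agree with the reported joint AFs of 0.88 and 0.49 up to rounding of the published estimates.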
ESTIMATES OF CANCER POPULATION ATTRIBUTABLE FRACTIONS FOR MULTIPLE RISK FACTORS FROM A NETWORK OF ITALIAN CASE-CONTROL STUDIES / M. Di Maso ; tutor: M. Ferraroni. DIPARTIMENTO DI SCIENZE CLINICHE E DI COMUNITA', 2018 Mar 07. 30th cycle, Academic Year 2017. [10.13130/dimasomatteo_phd20180307].
ESTIMATES OF CANCER POPULATION ATTRIBUTABLE FRACTIONS FOR MULTIPLE RISK FACTORS FROM A NETWORK OF ITALIAN CASE-CONTROL STUDIES
M. DI MASO
2018
