THE WEIGHTED QUANTILE SUM REGRESSION: EXTENSIONS AND APPLICATIONS / S. Renzetti ; supervisors: M. Ferraroni, S. Calza ; coordinator: C. La Vecchia. Dipartimento di Scienze Cliniche e di Comunità, 2021 Mar 10. 33rd cycle, Academic Year 2020.

THE WEIGHTED QUANTILE SUM REGRESSION: EXTENSIONS AND APPLICATIONS

S. Renzetti
2021

Abstract

In recent years, a growing body of scientific evidence has shown that examining exposure to single chemicals without considering the mixture effect can lead to an underestimate of the risk associated with chemical exposures. This also poses statistical challenges in handling more complex datasets. Weighted Quantile Sum (WQS) regression is a recent statistical model designed to address these problems. It tests the association between the overall environmental exposure and an outcome, and identifies the mixture components that contribute most to that association. In this work we show how we adapted the model to fit a WQS regression in the presence of binary, multinomial and count outcomes. Moreover, we implemented two further extensions: the possibility of testing for an interaction between the WQS index (representing the overall exposure) and a continuous or categorical variable; and the possibility of including two indices in the same model, one exploring the positive direction and the other the negative direction, for cases in which the mixture can have a bidirectional effect on the outcome. The first extension addresses a frequent and important question in epidemiologic studies: whether a particular covariate of interest modifies the association between the exposure and the outcome (effect modification, i.e., an interaction). The second extension makes it possible to estimate both the protective and the harmful effect of the mixture within the same regression model. Lastly, we show how to apply this novel method in the genetic context thanks to the inclusion of the double WQS index, and we compare its results with the standard methodology used to test the effect of a gene set on a particular phenotype.
The simulation studies performed to test the new extensions showed that the methods perform well: they reduce the bias and standard error of the estimated effect of the mixture on the outcome and correctly identify the mixture elements that play a major role in the studied association. High specificity was also observed. In the case studies, WQS confirmed previous major findings and provided new insights with respect to the previous literature. When we tested for the interaction between age or sex and exposure to lead (Pb), cadmium (Cd), mercury (Hg), selenium (Se) and manganese (Mn), we found that the association of forced vital capacity (FVC) with Pb and Hg was attenuated among older children, while FVC in females was more susceptible to Cd and Hg than in males. The application of the double index to the association between 43 nutrients and obesity showed a harmful effect of moisture (from all sources), polyunsaturated fatty acids, saturated fatty acids, sodium, caffeine and cholesterol, while a protective effect was found for beta-carotene, vitamin B12, vitamin B6, vitamin D, folic acid, vitamin C, folate DFE and alpha-carotene. Finally, through WQS we observed a significant role of the genes involved in the cell cycle in the risk of death from ovarian cancer, which was not detected by single-sample Gene Set Enrichment Analysis. The advantages of WQS regression and of the extensions described in this work are ease of use and ease of interpretation of the results; moreover, none of the other environmental mixture methods makes it possible to account for effect modification by a covariate or to quantify the positive and negative associations when the elements of the mixture show both effects. This work will be the starting point for future extensions, improvements and applications of the model, and all the extensions described here will be implemented in the gWQS package for the statistical software R.
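As an illustration only, the idea of a WQS index entering a regression together with an interaction term can be sketched in a few lines of Python. This is not code from the thesis and does not use the gWQS package: the function names (`quantile_scores`, `wqs_index`, `fit_interaction_model`, `simulate`), the simulated data and the weights are all hypothetical. In particular, the sketch assumes the weights are already known, whereas an actual WQS regression estimates them from the data under the constraints that they are non-negative and sum to one.

```python
import numpy as np


def quantile_scores(X, n_quantiles=4):
    """Score each column of X as 0..n_quantiles-1 using its empirical quantiles."""
    X = np.asarray(X, dtype=float)
    scores = np.empty(X.shape, dtype=int)
    # Interior cut points, e.g. the 25th, 50th and 75th percentiles for quartiles.
    probs = np.linspace(0, 1, n_quantiles + 1)[1:-1]
    for j in range(X.shape[1]):
        cuts = np.quantile(X[:, j], probs)
        scores[:, j] = np.searchsorted(cuts, X[:, j], side="right")
    return scores


def wqs_index(scores, weights):
    """Weighted sum of quantile scores; weights must be >= 0 and sum to 1."""
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0) or not np.isclose(weights.sum(), 1.0):
        raise ValueError("weights must be non-negative and sum to 1")
    return scores @ weights


def fit_interaction_model(index, modifier, y):
    """Least-squares fit of y ~ 1 + index + modifier + index * modifier."""
    design = np.column_stack(
        [np.ones_like(index), index, modifier, index * modifier]
    )
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def simulate(n=2000, seed=0):
    """Toy data: five exposures, a binary effect modifier, a continuous outcome."""
    rng = np.random.default_rng(seed)
    X = rng.lognormal(size=(n, 5))
    # ASSUMPTION: weights are fixed here; WQS regression would estimate them.
    weights = np.array([0.5, 0.3, 0.2, 0.0, 0.0])
    index = wqs_index(quantile_scores(X), weights)
    modifier = rng.integers(0, 2, size=n).astype(float)
    noise = rng.normal(scale=0.5, size=n)
    y = 1.0 + 0.8 * index + 0.5 * index * modifier + noise
    return index, modifier, y


if __name__ == "__main__":
    index, modifier, y = simulate()
    b0, b_index, b_modifier, b_interaction = fit_interaction_model(index, modifier, y)
    print(f"index effect: {b_index:.2f}, interaction: {b_interaction:.2f}")
```

In this toy setting the interaction coefficient measures how much the association between the index and the outcome changes when the binary modifier equals one. A double-index model could be sketched in the same spirit by building two indices with separate weight vectors and entering both as regressors.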
10 March 2021
Sector MED/01 - Medical Statistics
FERRARONI, MONICA
LA VECCHIA, CARLO VITANTONIO BATTISTA
Doctoral Thesis
Files in this item:
phd_unimi_R11883.pdf (Adobe PDF, 1.26 MB), open access since 27/02/2023

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/818694