Combinatorial mixtures are a flexible class of models for inference on mixture distributions whose components have multidimensional parameters. The underlying idea is to allow each element of a component-specific parameter vector to be shared by a subset of the other components. We develop Bayesian inference and computation approaches for this class of distributions. We define a general prior distribution structure in which a positive probability is put on every possible combination of sharing patterns, whence the name combinatorial mixtures. This partial sharing allows for greater generality and flexibility than traditional approaches to mixture modeling, while still allowing significant prior mass to be assigned to models that are more parsimonious than the general mixture case, in which no sharing takes place. One implication of our setting is that, once a maximum number of components K∗ is specified, inference on the parameters and on the number of components, say K, is subsumed by inference on the combinatorial patterns. We illustrate our combinatorial mixtures in an application based on the normal model. This work was originally motivated by the analysis of cancer subtypes: in terms of biological measures of interest, subtypes may be characterized by differences in location, scale, correlations, or any combination of these. We use data on the molecular classification of lung cancer from the web-based information supporting the published manuscript of Garber et al. (2001). In this context, the main goals of a mixture model analysis are to a) estimate the number of subgroups in a sample; b) make inferences about the assignment of samples to these subgroups; and c) generate hypotheses about which of the mechanisms above is likely to characterize the subgroups. Our paper adds a new tool to Bayesian mixture models that allows all three of these questions to be addressed.
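To make the notion of sharing patterns concrete, the following is a minimal sketch and not the authors' implementation. It assumes that the sharing pattern for each parameter element (for example the mean and the variance of a univariate normal component) can be represented as a set partition of the K∗ component labels, with components in the same block sharing that element. The function names `set_partitions`, `block_labels`, `sharing_patterns` and `n_distinct_components` are hypothetical. Under this assumption, the number of components K implied by a pattern is the number of distinct combinations of blocks across parameter elements.

```python
from itertools import product


def set_partitions(items):
    """Yield every partition of the list `items` as a list of blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Put `first` into each existing block in turn ...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ... or give it a block of its own.
        yield [[first]] + partition


def block_labels(partition, k_max):
    """Map each component index 0..k_max-1 to the index of its block."""
    labels = [None] * k_max
    for b, block in enumerate(partition):
        for comp in block:
            labels[comp] = b
    return tuple(labels)


def sharing_patterns(k_max, n_params):
    """All combinations of per-parameter partitions (assumed representation)."""
    comps = list(range(k_max))
    per_param = [block_labels(p, k_max) for p in set_partitions(comps)]
    return list(product(per_param, repeat=n_params))


def n_distinct_components(pattern):
    """Number of distinct components K implied by a sharing pattern.

    zip(*pattern) gives, for each component, its tuple of block labels
    across parameter elements; equal tuples mean identical components.
    """
    return len(set(zip(*pattern)))


if __name__ == "__main__":
    # K* = 3 components, 2 parameter elements (e.g. mean and variance).
    patterns = sharing_patterns(k_max=3, n_params=2)
    print(len(patterns))  # 5 partitions per element, so 5 * 5 = 25 patterns
```

In this representation a pattern in which all components share both elements collapses to K = 1, while a pattern with a common mean and three distinct variances gives K = 3, which illustrates how inference on K can be read off from inference on the patterns.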

Combinatorial mixtures of multiparameter distributions / V. Edefonti, G. Parmigiani. - In: ISI 2007 : 56th Session of the International Statistical Institute : 22-29 August 2007, Lisboa, Portugal : Book of Abstracts / edited by M.I. Gomez, D. Pestana, P. Silva. - Lisboa : CEAUL, 2007. - ISBN 978-972-8859-71-8. - pp. 279-280. (Paper presented at the 56th Session of the International Statistical Institute, held in Lisboa (Portugal) in 2007.)

Keywords: Bayesian inference; Markov chain Monte Carlo; clustering
2007
International Statistical Institute
Instituto Nacional de Estatistica
Book Part (author)
Use this identifier to cite or link to this document: https://hdl.handle.net/2434/44415