
Integrating supervised and unsupervised learning in genomics applications / V. Edefonti, G. Parmigiani. - In: Proc. Valencia / ISBA 8th World Meeting on Bayesian Statistics. - [s.l.] : José M. Bernardo, 2006 Jun 01. - pp. 54-55. (Paper presented at the 8th Valencia International Meeting on Bayesian Statistics; World Meeting of the International Society for Bayesian Analysis, held in Alicante in 2006.)

Integrating supervised and unsupervised learning in genomics applications

V. Edefonti (first author); G. Parmigiani (second author)

2006

Abstract

Most histologic classifications of major cancers include large heterogeneous classes. Identification of clinically relevant subgroups within these classes is among the most important challenges in cancer genomics. Our approach to this challenge is to seek undiscovered subclasses within broad classes, exploiting a potential biological connection between the unclassified group and known classifications that work for tumors at other organ sites. Statistically, this problem can be thought of as semi-supervised learning, in which a known classification is exported to help the clustering procedure. The known classification is learned in the supervised part of the model and then used as a filter to select a subset of variables able to identify meaningful subgroups of samples in the unsupervised part of the model. From this perspective, the identified subgroups can be thought of as having the same interpretation as the original ones. Our implementation is a Bayesian parametric model based on Normal mixtures and amenable to MCMC computing. Combinatorial mixtures characterize the set of a priori assumptions. The term names a new, more general and flexible class of models for Bayesian parametric inference in which component parameters are allowed to be different or equal, and positive mass is put on every possible combination of equalities and inequalities. This is especially critical in interpreting cancer clusters, as these may arise from changes in location, scale, or correlations, or any combination of them. The solution is illustrated using data on the molecular classification of lung cancer, with molecular classes learned in breast cancer.
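To make the idea of putting positive mass on every combination of equalities and inequalities concrete, the following is a minimal sketch and not the authors' implementation. It assumes a univariate mixture with only two parameter types (location and scale), ignores correlations, and assigns a uniform prior mass to each pattern; the function names `set_partitions` and `equality_patterns` are hypothetical and chosen only for illustration.

```python
from itertools import product


def set_partitions(items):
    """Yield every partition of `items` into non-empty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Put `first` into each existing block in turn...
        for i, block in enumerate(partition):
            yield partition[:i] + [[first] + block] + partition[i + 1:]
        # ...or give it a block of its own.
        yield [[first]] + partition


def equality_patterns(n_components, parameters=("location", "scale")):
    """Enumerate joint equality patterns across mixture components.

    Components in the same block share that parameter's value.
    The uniform prior mass is an illustrative assumption only.
    """
    components = list(range(n_components))
    per_parameter = list(set_partitions(components))
    patterns = [
        dict(zip(parameters, combo))
        for combo in product(per_parameter, repeat=len(parameters))
    ]
    mass = 1.0 / len(patterns)  # every pattern gets positive mass
    return [(pattern, mass) for pattern in patterns]


if __name__ == "__main__":
    # Two components: locations equal or different, scales equal or different.
    for pattern, mass in equality_patterns(2):
        print(pattern, mass)
```

For two components this enumerates four patterns: equal or different locations, crossed with equal or different scales. Any prior that keeps every pattern's mass strictly positive would fit the description in the abstract equally well; the uniform choice here is only the simplest one to write down.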

	Keywords
	
				Bayesian inference ; mixture models ; combinatorial mixtures
			
	Scientific-disciplinary sectors of the contribution
	
				Sector MED/01 - Medical Statistics
			
	Publication date
	
				1 Jun 2006
			
	Organizations associated with the conference
	
				International Society for Bayesian Analysis (ISBA)
				Universitat de Valencia
			
	Type
	
				Book Part (author)
			
	Appears in types:
	
				03 - Contribution in volume

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/62267
