Comparison of early and late omics data integration for cancer modules gene ranking / M. Re, M. Mesiti, G. Valentini. (Paper presented at the NETTAB: Integrated Bio-Search conference, held in Como in 2012.)
Comparison of early and late omics data integration for cancer modules gene ranking
M. Re; M. Mesiti; G. Valentini
2012
Abstract
In a recent work we evaluated the ability of semi-supervised learning methods based on random walks to rank genes with respect to Cancer Modules (CM), using networks constructed from different sources of information (Re and Valentini, 2012). The performance of this approach was evaluated using a relatively simple data integration scheme consisting of the unweighted sum of the adjacency matrices of the biomolecular networks involved in our experiments. Although this scheme performed well, all of our tests applied network integration before the gene prioritization phase (early data integration). Recently published work has shown that good results can also be obtained by performing the integration step after a prioritization ranking has been produced for each available dataset (late data integration), through the integration of the ranking vectors (Kolde et al., 2012). The aim of this contribution is to compare prioritization performance on CM genes using early and late data integration methods, in order to highlight the benefits and potential pitfalls of these approaches when applied to large-scale gene prioritization problems.

File | Size | Format | Type
---|---|---|---
remesitivale_Nettab2012.pdf (open access) | 62.45 kB | Adobe PDF | Pre-print (manuscript submitted to the publisher)
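The difference between the two integration schemes named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy networks, the function names, the plain two-step random walk, and the use of mean-rank aggregation (a simple stand-in, not the rank aggregation method of Kolde et al., 2012) are all assumptions made for illustration.

```python
def early_integration(networks):
    """Unweighted sum of same-sized adjacency matrices (lists of lists)."""
    n = len(networks[0])
    return [[sum(net[i][j] for net in networks) for j in range(n)]
            for i in range(n)]


def random_walk_scores(adj, seeds, steps=2):
    """Probability of being at each gene after `steps` steps from the seeds."""
    n = len(adj)
    p = [1.0 / len(seeds) if i in seeds else 0.0 for i in range(n)]
    for _ in range(steps):
        nxt = [0.0] * n
        for i in range(n):
            degree = sum(adj[i])
            if degree == 0:
                nxt[i] += p[i]  # isolated node: the walker stays where it is
                continue
            for j in range(n):
                nxt[j] += p[i] * adj[i][j] / degree
        p = nxt
    return p


def ranks(scores):
    """Rank 1 = highest score; ties are broken by gene index."""
    order = sorted(range(len(scores)), key=lambda i: (-scores[i], i))
    r = [0] * len(scores)
    for position, gene in enumerate(order, start=1):
        r[gene] = position
    return r


def late_integration(score_vectors):
    """Mean rank of each gene across per-network rankings (lower is better)."""
    per_network = [ranks(s) for s in score_vectors]
    n = len(per_network[0])
    return [sum(r[g] for r in per_network) / len(per_network)
            for g in range(n)]


# Hypothetical toy data: two networks over four genes; gene 0 is the seed
# (a known member of the module being ranked).
net_a = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 0]]
net_b = [[0, 0, 0, 1], [0, 0, 0, 0], [0, 0, 1 - 1, 1], [1, 0, 1, 0]]
seeds = {0}

# Early integration: merge the networks first, then rank once.
early_scores = random_walk_scores(early_integration([net_a, net_b]), seeds)

# Late integration: rank on each network separately, then merge the rankings.
late_mean_ranks = late_integration(
    [random_walk_scores(net, seeds) for net in (net_a, net_b)]
)
```

In the early scheme the walk can follow edges contributed by any source within a single path, whereas in the late scheme each walk is confined to one network and only the resulting rankings are combined.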