Comparison of early and late omics data integration for cancer modules gene ranking / M. Re, M. Mesiti, G. Valentini. (Paper presented at the NETTAB: Integrated Bio-Search conference, held in Como in 2012.)

Comparison of early and late omics data integration for cancer modules gene ranking

M. Re (first author); M. Mesiti (second author); G. Valentini (last author)
2012-11-15

Abstract

In a recent work we evaluated the ability of semi-supervised learning methods based on random walks to rank genes with respect to Cancer Modules (CM) using networks constructed from different sources of information (Re and Valentini, 2012). The performance of this approach was evaluated using a relatively simple data integration scheme consisting of the unweighted sum of the adjacency matrices of the biomolecular networks involved in our experiments. Although this approach performed well, all our tests applied network integration before the gene prioritization phase (early data integration). Recently published work has shown that good results can also be obtained by performing the integration step after a prioritization ranking has been produced for each available dataset (late data integration), by aggregating the resulting ranking vectors (Kolde et al., 2012). The aim of this contribution is to compare prioritization performance on CM genes using early and late data integration methods, in order to highlight the benefits and potential pitfalls of these approaches when applied to large-scale gene prioritization problems.
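The difference between the two integration schemes can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy networks, the function names, the use of a random walk with restart as the random-walk ranker, and the mean-rank aggregation used for the late scheme (a simple stand-in, not the rank aggregation method of Kolde et al., 2012). Only the unweighted sum of adjacency matrices for early integration is taken from the abstract.

```python
import numpy as np

def adjacency(n, edges):
    """Build a symmetric 0/1 adjacency matrix for n genes (hypothetical helper)."""
    a = np.zeros((n, n))
    for i, j in edges:
        a[i, j] = a[j, i] = 1.0
    return a

def random_walk_scores(adj, seeds, restart=0.5, steps=100):
    """Score genes by a random walk with restart from the seed genes.

    Assumption: this is one possible random-walk ranker, chosen for brevity.
    """
    adj = np.asarray(adj, dtype=float)
    col_sums = adj.sum(axis=0)
    col_sums[col_sums == 0] = 1.0          # avoid division by zero for isolated genes
    trans = adj / col_sums                 # column-normalised transition matrix
    p0 = np.zeros(adj.shape[0])
    p0[seeds] = 1.0 / len(seeds)           # start uniformly on the seed genes
    p = p0.copy()
    for _ in range(steps):
        p = (1 - restart) * (trans @ p) + restart * p0
    return p

def ranks_from_scores(scores):
    """Convert scores to ranks, where rank 1 is the highest score."""
    order = np.argsort(-scores, kind="stable")
    ranks = np.empty(len(scores), dtype=int)
    ranks[order] = np.arange(1, len(scores) + 1)
    return ranks

def early_integration(networks, seeds):
    """Integrate first: unweighted sum of adjacency matrices, then one ranking."""
    combined = np.sum(networks, axis=0)
    return random_walk_scores(combined, seeds)

def late_integration(networks, seeds):
    """Rank first: one ranking per network, then aggregate by mean rank.

    Assumption: mean rank is a placeholder aggregation (lower is better).
    """
    per_network = [ranks_from_scores(random_walk_scores(a, seeds)) for a in networks]
    return np.mean(per_network, axis=0)

# Two toy networks over five genes; gene 0 is the known module member (seed).
net_a = adjacency(5, [(0, 1), (1, 2), (3, 4)])
net_b = adjacency(5, [(0, 2), (2, 3)])

early_scores = early_integration([net_a, net_b], seeds=[0])  # higher is better
late_ranks = late_integration([net_a, net_b], seeds=[0])     # lower is better
```

In the early scheme, an edge found in only one network still shapes the single combined walk. In the late scheme, each network produces its own ranking, and any disagreement between networks is resolved only at the aggregation step.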
Sector INF/01 - Informatics
Conference Object
File: remesitivale_Nettab2012.pdf (open access; pre-print, manuscript submitted to the publisher; Adobe PDF, 62.45 kB)
Use this identifier to cite or link to this document: http://hdl.handle.net/2434/213423