The analysis of non-coding DNA regulatory regions is one of the most challenging open problems in computational biology. In this paper we investigate whether functional information about genes can be predicted by combining information extracted from their sequences with expression data. We formalize this as a classification problem and apply Support Vector Machines (SVMs) with nonlinear kernels to predict classes of co-expressed genes obtained from clustering procedures. The SVMs are trained on information about selected motifs extracted from DNA regulatory regions through combinatorial and statistical methods. Our experiments show that functional classes of genes can be predicted from biological sequence data in S. cerevisiae, achieving results competitive with those recently presented in the literature.
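As a rough illustration of the kind of pipeline the abstract describes, the sketch below turns regulatory sequences into motif-count feature vectors and compares them with a Gaussian (RBF) kernel, one common nonlinear kernel. The abstract does not specify the motif extraction method, the feature encoding, or the kernel actually used, so everything here is an assumption: the motifs, the gene names, the sequences, and the helper names `motif_counts` and `rbf_kernel` are all hypothetical.

```python
import math


def motif_counts(sequence, motifs):
    """Count (possibly overlapping) occurrences of each motif in a sequence."""
    sequence = sequence.upper()
    counts = []
    for motif in motifs:
        motif = motif.upper()
        width = len(motif)
        hits = sum(
            1
            for i in range(len(sequence) - width + 1)
            if sequence[i:i + width] == motif
        )
        counts.append(hits)
    return counts


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian kernel exp(-gamma * ||x - y||^2) between two feature vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Hypothetical toy data: two motifs and three made-up upstream regions.
MOTIFS = ["TGACTC", "CACGTG"]
UPSTREAM = {
    "geneA": "AATGACTCTTTGACTCAA",
    "geneB": "CCTGACTCGGTGACTCTT",
    "geneC": "GGCACGTGAACACGTGCC",
}

# One feature vector per gene: the count of each motif in its upstream region.
features = {gene: motif_counts(seq, MOTIFS) for gene, seq in UPSTREAM.items()}

# Genes with the same motif content get a kernel value near 1,
# genes with different motif content get a value near 0.
same_motifs = rbf_kernel(features["geneA"], features["geneB"])
different_motifs = rbf_kernel(features["geneA"], features["geneC"])
```

In a fuller experiment, kernel values like these would fill a Gram matrix that an SVM implementation accepting precomputed kernels could train on, with the expression-derived cluster of each gene as its class label.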
|Title:||Classification of co-expressed genes from DNA regulatory regions|
|Internal authors:||VALENTINI, GIORGIO (Last)|
PAVESI, GIULIO (First)
|Keywords:||Gene classification ; Motif extraction and selection ; Gene expression and bio-sequence data integration ; Combinatorial and machine learning methods integration|
|Scientific Disciplinary Sector:||Sector INF/01 - Computer Science|
|Publication date:||Jul-2009|
|Digital Object Identifier (DOI):||10.1016/j.inffus.2008.11.005|
|Appears in types:||01 - Journal article|