Classification of co-expressed genes from DNA regulatory regions / G. Pavesi, G. Valentini. - In: INFORMATION FUSION. - ISSN 1566-2535. - 10:3(2009 Jul), pp. 233-241. [10.1016/j.inffus.2008.11.005]

Classification of co-expressed genes from DNA regulatory regions

G. Pavesi (first author); G. Valentini (last author)
2009

Abstract

The analysis of non-coding DNA regulatory regions is one of the most challenging open problems in computational biology. In this paper we investigate whether functional information about genes can be predicted by combining information extracted from their sequences with expression data. We formalize this task as a classification problem, and we apply Support Vector Machines (SVMs) with nonlinear kernels to predict classes of co-expressed genes obtained from clustering procedures. The SVMs are trained on information about selected motifs, extracted from DNA regulatory regions through combinatorial and statistical methods. In our experiments, we show that functional classes of genes can be predicted from biological sequence data in S. cerevisiae, achieving results competitive with those recently presented in the literature.
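As a rough illustration of the kind of input such a classifier works on, the sketch below turns upstream sequences into motif-count feature vectors and compares two genes with a Gaussian (RBF) kernel, one common nonlinear kernel. It is a minimal sketch under stated assumptions, not the authors' pipeline: the motif strings, gene names, sequences, the `gamma` value, and the helper names `motif_counts` and `rbf_kernel` are all made up for illustration and are not taken from the paper.

```python
import math


def motif_counts(sequence, motifs):
    """Count (possibly overlapping) occurrences of each motif in one sequence."""
    sequence = sequence.upper()
    counts = []
    for motif in motifs:
        k = len(motif)
        hits = sum(
            1 for i in range(len(sequence) - k + 1) if sequence[i:i + k] == motif
        )
        counts.append(hits)
    return counts


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Hypothetical motifs and toy upstream regions (illustrative only).
MOTIFS = ["TGACTC", "CACGTG"]
upstream = {
    "geneA": "ATTGACTCGGTGACTCAA",
    "geneB": "CCTGACTCATTTGACTCG",
    "geneC": "GGCACGTGTTACACGTGA",
}

# One feature vector per gene: the number of occurrences of each motif.
features = {gene: motif_counts(seq, MOTIFS) for gene, seq in upstream.items()}

# Genes sharing the same motif content get a kernel value close to 1,
# genes with different motif content get a value close to 0.
similar = rbf_kernel(features["geneA"], features["geneB"])
dissimilar = rbf_kernel(features["geneA"], features["geneC"])
```

In a full setting, feature vectors of this kind would be passed, together with cluster labels derived from expression data, to an SVM trainer; that step is omitted here.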
Keywords: Gene classification; Motif extraction and selection; Gene expression and bio-sequence data integration; Combinatorial and machine learning methods integration
Sector INF/01 - Computer Science
Jul 2009
Article (author)
Use this identifier to cite or link to this document: https://hdl.handle.net/2434/54025
Citations
  • PubMed Central: not available
  • Scopus: 12
  • Web of Science (ISI): 8