Detecting conserved coding genomic regions through signal processing of nucleotide substitution patterns / M. Ré, G. Pavesi. - In: ARTIFICIAL INTELLIGENCE IN MEDICINE. - ISSN 0933-3657. - 45:2-3(2009), pp. 117-123. [10.1016/j.artmed.2008.07.015]

Detecting conserved coding genomic regions through signal processing of nucleotide substitution patterns

M. Ré (first author); G. Pavesi
2009

Abstract

Objective: In the last few years, several complete genome sequences have been made available to the research community. Annotating their complete inventory of protein-coding genes, however, has so far remained an elusive goal. Classical ab initio gene prediction methods have been of great help in this task, but they are notably weak at predicting genes with unusual structural features. Annotation based on similarity to known genes in other species, on the other hand, cannot detect genuinely novel genes, and it introduces a potential source of classification error when the reference sequences were themselves erroneously annotated as protein coding. Finally, several methods for the functional classification and assessment of evolutionarily conserved regions have been proposed, but, to our knowledge, signal processing techniques have not yet been applied to this problem, despite their proven usefulness at the single-genome level.

Results: In this article we introduce the use of signal processing in comparative genomics and propose a simple test that evaluates the coding potential of a pairwise genomic sequence alignment according to the pattern and periodicity with which substitutions and gaps appear in the alignment. We assess the feasibility of our approach on an annotated set of human-mouse genomic alignments.

Conclusion: The results show that applying signal processing techniques to sequence alignments can be a useful tool for identifying evolutionarily conserved protein-coding regions.
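
The idea stated under Results can be illustrated with a minimal sketch. It is not the authors' test, which the abstract does not specify. It assumes that in conserved coding regions substitutions tend to fall on third codon positions, so a 0/1 mismatch signal built from the alignment should carry extra Fourier power at period 3. The function names (`mismatch_signal`, `power_at_period`, `period3_score`), the choice to count gaps as mismatches, and the normalization by the signal variance are all assumptions made for this example.

```python
import cmath


def mismatch_signal(seq_a, seq_b):
    """Return 1 for each aligned column that differs (substitution or gap), else 0.

    Assumption: gaps ('-') are treated like any other mismatch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    return [0 if a == b else 1 for a, b in zip(seq_a.upper(), seq_b.upper())]


def power_at_period(signal, period=3):
    """Power of the mean-centred signal at frequency 1/period (one Fourier term)."""
    n = len(signal)
    mean = sum(signal) / n
    coeff = sum(
        (x - mean) * cmath.exp(-2j * cmath.pi * t / period)
        for t, x in enumerate(signal)
    )
    return abs(coeff) ** 2 / n


def period3_score(seq_a, seq_b):
    """Period-3 power divided by the signal variance.

    By Parseval's identity the variance equals the average power per
    frequency, so values well above 1 suggest a period-3 mismatch pattern.
    Returns 0.0 when the signal is constant (no variance to compare against).
    """
    signal = mismatch_signal(seq_a, seq_b)
    n = len(signal)
    mean = sum(signal) / n
    variance = sum((x - mean) ** 2 for x in signal) / n
    if variance == 0:
        return 0.0
    return power_at_period(signal, 3) / variance


if __name__ == "__main__":
    # Toy alignment: every third column differs, mimicking third-position changes.
    print(period3_score("ACG" * 10, "ACT" * 10))
    # Toy alignment: every fourth column differs, so no period-3 component.
    print(period3_score("ACGT" * 9, "ACGA" * 9))
```

For a perfectly periodic toy alignment like the first one, the score works out to half the alignment length (about 15 for 30 columns), while the second toy alignment scores essentially zero. A real test would need a significance threshold calibrated on annotated alignments, which this sketch does not attempt.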
Comparative genomics; Fourier transform; Gene annotation; Gene finding; Animals; Humans; Mice; Nucleotides; Conserved Sequence; Genomics; Artificial Intelligence; Medicine (miscellaneous)
Sector INF/01 - Computer Science
Sector BIO/11 - Molecular Biology
2009
Article (author)
Files in this item:
1-s2.0-S093336570800105X-main.pdf (restricted access)
Type: Publisher's version/PDF
Size: 418.98 kB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/456641
Citations
  • PubMed Central 1
  • Scopus 2
  • Web of Science 2