A new strategy to identify novel genes and gene isoforms: Analysis of human chromosomes 15, 21 and 22

Rè, M.; Mignone, F.; Iacono, M.; Grillo, G.; Liuni, S.; Pesole, G.

doi:10.1016/j.gene.2005.09.041

We present here a novel methodology for the identification of genome regions potentially spanning one or more protein coding genes. It is based on the detection of clusters of conserved sequence tags (CSTs) whose evolutionary dynamics, inferred from an excess of synonymous substitutions at the nucleotide level and of conservative replacements at the protein level, suggest a likely protein coding role. A benchmark test carried out on 236 Mbp of human-mouse syntenic regions from human chromosomes 15, 21 and 22 identified 25 CST clusters potentially containing unannotated genes. A subsequent annotation update of the human genome assembly revealed that 11 of the 25 clusters actually contained a total of 20 validated genes, and that 10 of the remaining 14 clusters had several lines of experimental evidence supporting the presence of protein coding genes. These findings demonstrate the effectiveness and high prediction reliability of the proposed methodology, which could specifically be applied to the annotation of novel genome sequences. Published by Elsevier B.V.
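The synonymous-substitution signal mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' coding potential score, which the abstract does not specify; it is a simplified assumption-based example that, for an ungapped human-mouse alignment, counts differing codon pairs that keep the same amino acid (synonymous) versus those that change it, in each reading frame. The function names `substitution_counts` and `best_frame` and the example sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (first base slowest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def substitution_counts(seq_a, seq_b, frame=0):
    """Count (synonymous, nonsynonymous) codon differences between two
    ungapped, aligned nucleotide sequences read in the given frame (0, 1 or 2).
    Identical codons and codons with unknown symbols are skipped."""
    syn = nonsyn = 0
    usable = min(len(seq_a), len(seq_b))
    for i in range(frame, usable - 2, 3):
        codon_a = seq_a[i:i + 3].upper()
        codon_b = seq_b[i:i + 3].upper()
        if codon_a == codon_b:
            continue
        if codon_a not in CODON_TABLE or codon_b not in CODON_TABLE:
            continue
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            syn += 1
        else:
            nonsyn += 1
    return syn, nonsyn


def best_frame(seq_a, seq_b):
    """Return the frame with the highest fraction of synonymous changes.
    A crude stand-in (assumption) for a real coding potential score."""
    def syn_fraction(frame):
        syn, nonsyn = substitution_counts(seq_a, seq_b, frame)
        total = syn + nonsyn
        return syn / total if total else 0.0
    return max(range(3), key=syn_fraction)


if __name__ == "__main__":
    human = "CTGGCAAAA"  # hypothetical aligned fragments, not real data
    mouse = "CTAGCGAGA"
    for frame in range(3):
        print(frame, substitution_counts(human, mouse, frame))
    print("best frame:", best_frame(human, mouse))
```

In a sketch like this, a conserved tag whose differences are mostly synonymous in one frame but not in the others would be a candidate coding region; a real implementation would also need to handle alignment gaps, both strands, and a statistical threshold.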

A new strategy to identify novel genes and gene isoforms: Analysis of human chromosomes 15, 21 and 22 / Matteo Rè, Flavio Mignone, Michele Iacono, Giorgio Grillo, Sabino Liuni, Graziano Pesole. - In: GENE. - ISSN 0378-1119. - 365:1-2(2006), pp. 35-40.



Keywords: Alternative splicing; Bioinformatics; Coding potential score; Conserved sequence tag; Gene finding; Software
Scientific-disciplinary sectors of the article (display only): Sector INF/01 - Computer Science
Publication date: 2006
Journal in ANCE: GENE
DOI: https://dx.doi.org/10.1016/j.gene.2005.09.041
URL: http://www.sciencedirect.com/science?_ob=MImg&_imagekey=B6T39-4HTCTKX-5-5&_cdi=4941&_user=1080510&_orig=browse&_coverDate=01%2F03%2F2006&_sk=996349999&view=c&wchp=dGLzVlz-zSkWb&md5=afd45ddcc2ac17e9b09fd251485b55aa&ie=/sdarticle.pdf
Type: Article (author)
Appears in types: 01 - Journal article


Use this identifier to cite or link to this document: https://hdl.handle.net/2434/30207
