In silico candidate gene mining in livestock species / A. Stella, A. Montironi, F. Panzitta, G. Gandini, P. Boettcher. - In: ITALIAN JOURNAL OF ANIMAL SCIENCE. - ISSN 1594-4077. - 6:Suppl. 1(2007), pp. 217-217. (Paper presented at the 17th ASPA Congress, held in Alghero in 2007.)

In silico candidate gene mining in livestock species

G. Gandini
2007

Abstract

Identifying genes involved in livestock production and disease is challenging because the traits are multigenic and multifactorial and because information from different studies is difficult to integrate. Genome-wide techniques such as microarray analysis, SAGE, linkage analysis and linkage disequilibrium analysis have been used extensively in livestock and have often identified a large number of candidate genes. Selecting the most probable candidate genes for further empirical analysis remains a challenge. New, extensive biological databases (DB) provide an opportunity for candidate gene mining. Bioinformatic methods and tools to prioritize candidate genes underlying pathways or diseases have been presented, mostly for human disease candidate gene searches. These computational methods use data from a variety of sources to identify the most likely candidates within gene sets. The objectives of the study were: (1) to test a set of existing computational gene prioritization methods on real and simulated livestock traits, namely mastitis susceptibility in cattle, production in cattle, litter size in swine, and tick resistance in cattle; and (2) to develop a novel candidate prioritization method better suited to the characteristics of genomic information in livestock species (lower level of annotation, different experimental designs, etc.). The algorithm performs distinct prioritizations from multiple heterogeneous data sources, which are then integrated into one global ranking using order statistics. First, information about a trait or pathway is recorded, ordered and stored from a set of known genes using multiple data sources. Then, the candidate genes are ranked by similarity to the training properties obtained in the first step, resulting in one prioritized list for each data source.
Data for linkage and association analysis and for expression analysis (2 microarray studies) were simulated for a complex trait, assigning the largest effects to known genes that are well described in the Gene Ontology (GO), InterPro, MEDLINE and KEGG databases. All other simulated major genes were assigned to genes described in one of the databases for livestock species. Real data were selected from the literature, mainly from large QTL studies, or obtained from collaborators. The software used for the analysis was SUSPECTS (http://www.genetics.med.ed.ac.uk/software/prospectr.php), Endeavour and CoXpress (http://coxpress.sf.net). Results from simulated data showed that, where annotation is missing, the accuracy of the algorithms considered decreased drastically. The new prioritization method, applied to simulated data, ranked candidate genes correctly only when the QTL information was highly accurate. Further work is needed to define methods for weighting QTL information.
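
The integration step described in the abstract (one ranked list per data source, merged into a global ranking by order statistics) can be illustrated with a minimal Python sketch. The abstract does not give the formula, so this sketch assumes the recursive order-statistics Q value commonly used for rank aggregation in gene prioritization; it is not necessarily the authors' implementation. The gene names, source names and scores are hypothetical, and every candidate is assumed to be scored in every source.

```python
from math import factorial

def rank_ratios(scores):
    """Convert one source's scores (higher = better) into rank ratios in (0, 1].

    The best-scoring gene gets 1/n. Ties are broken by insertion order,
    which is a simplification made for this sketch.
    """
    ordered = sorted(scores, key=scores.get, reverse=True)
    n = len(ordered)
    return {gene: (i + 1) / n for i, gene in enumerate(ordered)}

def q_statistic(ratios):
    """Order-statistics Q value for one gene's rank ratios across sources.

    Q = N! * V_N, with V_0 = 1 and
    V_k = sum_{i=1..k} (-1)^(i-1) * V_{k-i} / i! * r_{N-k+1}^i,
    where r_1 <= ... <= r_N are the sorted rank ratios.
    Smaller Q means the gene ranks consistently near the top.
    """
    r = sorted(ratios)
    n = len(r)
    v = [1.0]  # V_0
    for k in range(1, n + 1):
        x = r[n - k]  # r_{N-k+1} in 1-based indexing
        v.append(sum((-1) ** (i - 1) * v[k - i] / factorial(i) * x ** i
                     for i in range(1, k + 1)))
    return factorial(n) * v[n]

def integrate(per_source_scores):
    """Merge per-source rankings into one global list, best candidate first.

    Only genes present in every source are ranked (an assumption of this
    sketch; handling missing annotation would need extra care).
    """
    ratios = [rank_ratios(s) for s in per_source_scores.values()]
    genes = set.intersection(*(set(r) for r in ratios))
    q = {g: q_statistic([r[g] for r in ratios]) for g in genes}
    return sorted(q, key=lambda g: (q[g], g))

# Hypothetical similarity scores of three candidates in three data sources.
example_scores = {
    "go":         {"geneA": 0.9, "geneB": 0.4, "geneC": 0.1},
    "expression": {"geneA": 0.7, "geneB": 0.8, "geneC": 0.2},
    "qtl":        {"geneA": 0.6, "geneB": 0.3, "geneC": 0.5},
}
ranking = integrate(example_scores)
print(ranking)
```

Because Q is the probability that uniformly distributed rank ratios would all fall at or below the observed ones, a gene that is merely average in every source scores worse than one that is near the top in most of them. Weighting individual sources, such as the QTL information mentioned above, is not covered by this sketch.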
Scientific sector AGR/17 - Zootecnica Generale e Miglioramento Genetico (General Animal Husbandry and Genetic Improvement)
2007
ASPA
Animal Science and Production Association
Article (author)

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/235434