Motivation: Insertions and deletions (indels) contribute significantly to genomic diversity at both the intra- and inter-species levels. The recent advent of next-generation sequencing (NGS) methods has opened many opportunities for structural variant discovery, but has also required the development of new computational methods. Several bioinformatics tools have been developed for the detection of indels using paired-end (PE) NGS data.
Methods: Existing methods can broadly be grouped into two categories: those that identify genomic clusters of read pairs showing atypical insert sizes in order to detect insertions and deletions with respect to a reference genome, and those that consider the distribution of insert sizes for all read pairs covering a given genomic position. We present a variation on the latter approach which also includes information from read pairs where one member of the pair does not map to the reference genome ("broken pairs") and uses machine learning approaches to differentiate between real indels and possible false positive predictions.
Results: We demonstrate that our approach significantly outperforms other available methods in terms of sensitivity, specificity and computational time/power requirements, both in simulations and using publicly available human genome resequencing data. Our analyses demonstrate that the use of data from "broken pairs" and careful integration of different statistics from mapping patterns can significantly improve the quality of indel predictions.
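As a rough illustration of the per-position idea described in the Methods, the sketch below summarises the insert sizes of all read pairs spanning a position and counts nearby "broken pairs". It is not the authors' implementation: the `ReadPair` type, the `position_stats` function, the z-score statistic and the 200 bp window are all assumptions made for this example, and the machine learning step that would combine such statistics is omitted.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Dict, List, Optional


@dataclass
class ReadPair:
    """Hypothetical, simplified view of a mapped read pair."""
    left_start: int            # leftmost mapped coordinate of the pair
    right_end: Optional[int]   # rightmost mapped coordinate; None if the mate is unmapped

    @property
    def is_broken(self) -> bool:
        # A "broken pair": only one member maps to the reference.
        return self.right_end is None

    @property
    def insert_size(self) -> int:
        # Only meaningful when both mates are mapped.
        return self.right_end - self.left_start


def position_stats(pairs: List[ReadPair], pos: int, lib_mean: float,
                   lib_sd: float, window: int = 200) -> Dict[str, float]:
    """Summarise mapping patterns at one genomic position.

    z > 0 means spanning pairs map further apart than the library expects
    (consistent with a deletion relative to the reference); z < 0 means
    they map closer together (consistent with an insertion).
    """
    spanning = [p.insert_size for p in pairs
                if not p.is_broken and p.left_start <= pos < p.right_end]
    # Broken pairs whose mapped mate lies within `window` bp of the position.
    n_broken = sum(1 for p in pairs
                   if p.is_broken and abs(p.left_start - pos) <= window)
    if not spanning:
        return {"n_spanning": 0, "z": 0.0, "n_broken": n_broken}
    mean = sum(spanning) / len(spanning)
    # z-score of the sample mean against the library insert-size distribution.
    z = (mean - lib_mean) / (lib_sd / sqrt(len(spanning)))
    return {"n_spanning": len(spanning), "z": z, "n_broken": n_broken}
```

In a fuller pipeline, values such as `z` and `n_broken` would be computed along the genome and passed as features to a classifier; which features and which classifier the authors used is not stated in the abstract.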
Accurate detection of genomic structural variations using high throughput resequencing data / M. Chiara, G. Pesole, H.S. Horner. (Paper presented at the 8th BITS Annual Meeting, held in Pisa in 2011.)
|Title:||Accurate detection of genomic structural variations using high throughput resequencing data|
CHIARA, MATTEO (First author)
|Publication date:||21-Jun-2011|
|Scientific Disciplinary Sector:||Sector BIO/11 - Molecular Biology|
|Organizations associated with the conference:||Consiglio Nazionale delle Ricerche|
Bioinformatics Italian Society
|Appears in categories:||14 - Unpublished conference contribution|