IRIS Institutional Research Information System - AIR Archivio Istituzionale della Ricerca

Large-scale initiatives aiming to recover the complete sequence of thousands of human genomes are currently being undertaken worldwide, concurring to the generation of a comprehensive catalog of human genetic variation. The ultimate and most ambitious goal of human population scale genomics is the characterization of the so-called human "variome," through the identification of causal mutations or haplotypes. Several research institutions worldwide currently use genotyping assays based on Next-Generation Sequencing (NGS) for diagnostics and clinical screenings, and the widespread application of such technologies promises major revolutions in medical science. Bioinformatic analysis of human resequencing data is one of the main factors limiting the effectiveness and general applicability of NGS for clinical studies. The requirement for multiple tools, to be combined in dedicated protocols in order to accommodate different types of data (gene panels, exomes, or whole genomes) and the high variability of the data makes difficult the establishment of a ultimate strategy of general use. While there already exist several studies comparing sensitivity and accuracy of bioinformatic pipelines for the identification of single nucleotide variants from resequencing data, little is known about the impact of quality assessment and reads pre-processing strategies. In this work we discuss major strengths and limitations of the various genome resequencing protocols are currently used in molecular diagnostics and for the discovery of novel disease-causing mutations. By taking advantage of publicly available data we devise and suggest a series of best practices for the pre-processing of the data that consistently improve the outcome of genotyping with minimal impacts on computational costs.

Evaluation of quality assessment protocols for high throughput genome resequencing data / M. Chiara, G. Pavesi. - In: FRONTIERS IN GENETICS. - ISSN 1664-8021. - 8:(2017 Jul), pp. 94.1-94.12. [10.3389/fgene.2017.00094]

Evaluation of quality assessment protocols for high throughput genome resequencing data

M. Chiara^Primo;G. Pavesi^Ultimo

2017

Abstract

Large-scale initiatives aiming to recover the complete sequence of thousands of human genomes are currently being undertaken worldwide, concurring to the generation of a comprehensive catalog of human genetic variation. The ultimate and most ambitious goal of human population scale genomics is the characterization of the so-called human "variome," through the identification of causal mutations or haplotypes. Several research institutions worldwide currently use genotyping assays based on Next-Generation Sequencing (NGS) for diagnostics and clinical screenings, and the widespread application of such technologies promises major revolutions in medical science. Bioinformatic analysis of human resequencing data is one of the main factors limiting the effectiveness and general applicability of NGS for clinical studies. The requirement for multiple tools, to be combined in dedicated protocols in order to accommodate different types of data (gene panels, exomes, or whole genomes) and the high variability of the data makes difficult the establishment of a ultimate strategy of general use. While there already exist several studies comparing sensitivity and accuracy of bioinformatic pipelines for the identification of single nucleotide variants from resequencing data, little is known about the impact of quality assessment and reads pre-processing strategies. In this work we discuss major strengths and limitations of the various genome resequencing protocols are currently used in molecular diagnostics and for the discovery of novel disease-causing mutations. By taking advantage of publicly available data we devise and suggest a series of best practices for the pre-processing of the data that consistently improve the outcome of genotyping with minimal impacts on computational costs.

Scheda breve

Scheda completa

Scheda completa (DC)

	Parole chiave
	
			Genome resequencing; Molecular diagnostics; Next-generation sequencing read quality; Precision medicine; Whole exome sequencing; Molecular Medicine; Genetics; Genetics (clinical)
		
	Settori scientifico-disciplinari dell'articolo
	
			Settore BIO/11 - Biologia Molecolare
		
	Data di pubblicazione
	
			lug-2017
		
	Rivista in ANCE
	
			FRONTIERS IN GENETICS
		
	DOI
	
			https://dx.doi.org/10.3389/fgene.2017.00094
		
	Tipologia
	
			Article (author)
		
	Appare nelle tipologie:
	
			01 - Articolo su periodico

File in questo prodotto:

File	Dimensione	Formato
fgene-08-00094.pdf accesso aperto Tipologia: Publisher's version/PDF Dimensione 1.49 MB Formato Adobe PDF Visualizza/Apri	1.49 MB	Adobe PDF	Visualizza/Apri

Pubblicazioni consigliate

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/2434/528072

Citazioni

2

9

ND

social impact