Complete vertebrate mitogenomes reveal widespread repeats and gene duplications / G. Formenti, A. Rhie, J. Balacco, B. Haase, J. Mountcastle, O. Fedrigo, S. Brown, M.R. Capodiferro, F.O. Al-Ajli, R. Ambrosini, P. Houde, S. Koren, K. Oliver, M. Smith, J. Skelton, E. Betteridge, J. Dolucan, C. Corton, I. Bista, J. Torrance, A. Tracey, J. Wood, M. Uliano-Silva, K. Howe, S. McCarthy, S. Winkler, W. Kwak, J. Korlach, A. Fungtammasan, D. Fordham, V. Costa, S. Mayes, M. Chiara, D.S. Horner, E. Myers, R. Durbin, A. Achilli, E.L. Braun, A.M. Phillippy, E.D. Jarvis, A.N.G. Kirschel, A. Digby, A. Veale, A. Bronikowski, B. Murphy, B. Robertson, C. Baker, C. Mazzoni, C. Balakrishnan, C. Lee, D. Mead, E. Teeling, E.L. Aiden, E. Todd, E. Eichler, G.J.P. Naylor, G. Zhang, J. Smith, J. Wolf, J. Touchon, K. Delmore, K. Jakobsen, L. Komoroske, M. Wilkinson, M. Genner, M. Psenicka, M. Fuxjager, M. Stratton, M. Liedvogel, N. Gemmell, P. Minias, P.O. Dunn, P. Sudmant, P. Morin, Q. Ayub, R. Kraus, S. Vernes, S. Smith, T. Lama, T. Edwards, T. Smith, T. Gilbert, T. Marques-Bonet, T. Einfeldt, B. Venkatesh, W. Johnson, W. Warren, Y. Bukhman. - In: GENOME BIOLOGY. - ISSN 1474-760X. - 22:1(2021 Apr 29), pp. 120.1-120.22. [10.1186/s13059-021-02336-9]

Complete vertebrate mitogenomes reveal widespread repeats and gene duplications

R. Ambrosini; S. Winkler; M. Chiara; D.S. Horner
2021

Abstract

Background: Modern sequencing technologies should make the assembly of the relatively small mitochondrial genomes an easy undertaking. However, few tools exist that address mitochondrial assembly directly.

Results: As part of the Vertebrate Genomes Project (VGP) we develop mitoVGP, a fully automated pipeline for similarity-based identification of mitochondrial reads and de novo assembly of mitochondrial genomes that incorporates both long (> 10 kbp, PacBio or Nanopore) and short (100–300 bp, Illumina) reads. Our pipeline leads to successful complete mitogenome assemblies of 100 vertebrate species of the VGP. We observe that tissue type and library size selection have considerable impact on mitogenome sequencing and assembly. Comparing our assemblies to purportedly complete reference mitogenomes based on short-read sequencing, we identify errors, missing sequences, and incomplete genes in those references, particularly in repetitive regions. Our assemblies also identify novel gene region duplications. The presence of repeats and duplications in over half of the species herein assembled indicates that their occurrence is a principle of mitochondrial structure rather than an exception, shedding new light on mitochondrial genome evolution and organization.

Conclusions: Our results indicate that even in the “simple” case of vertebrate mitogenomes the completeness of many currently available reference sequences can be further improved, and caution should be exercised before claiming the complete assembly of a mitogenome, particularly from short reads alone.
Keywords: assembly; duplications; long reads; mitochondrial DNA; repeats; sequencing; vertebrate
Sector BIO/18 - Genetics
Sector BIO/11 - Molecular Biology
Sector BIO/07 - Ecology
Article (author)
Files in this item:
s13059-021-02336-9.pdf (open access; type: Publisher's version/PDF; size: 2.23 MB; format: Adobe PDF)
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/878036
Citations
  • PMC 17
  • Scopus 17
  • ISI 16