
Collaborative Vertebrate Genomes Project Establishes New Standards of Quality and Scale in Genome Sequencing

Published: Wednesday, 28 April, 2021

Today, in a special issue of Nature, the Vertebrate Genomes Project (VGP) announced its flagship study on genome assembly quality and standardization for the field of genomics. UCD Professor of Zoology Emma Teeling, from the School of Biological and Environmental Science, is one of the international scientists involved in the Vertebrate Genomes Project consortium. The study includes 16 diploid, high-quality, near error-free, and near-complete vertebrate reference genome assemblies for species across all taxa with backbones (i.e., mammals, amphibians, birds, reptiles, and fishes), produced during five years of piloting the first phase of the VGP. Based on these 16 genome assemblies, the VGP identifies technological improvements and demonstrates the feasibility of setting and achieving high-quality reference genome metrics using a state-of-the-art automated approach that combines long-read sequencing and long-range chromosome scaffolding with novel algorithms that put the pieces of the genome assembly puzzle together.

Building on the mission of the Genome 10K Community of Scientists (G10K) to sequence the genomes of 10,000 vertebrate species and on other comparative genomics efforts, the VGP uses the dramatic improvements in sequencing technologies of the last few years to begin production of high-quality reference genome assemblies for all ~70,000 living vertebrates. In fact, the current VGP pipelines have led to the submission of 129 diploid assemblies representing the most complete and accurate assemblies for those species to date, and the project is on the path to generating thousands of genome assemblies, demonstrating feasibility not only in quality standardization but also in scale.

The exceptional quality of the 16 genome assemblies enables novel discoveries with implications for characterizing biodiversity, conservation, and human health and disease. As an author of the flagship paper, Professor Teeling commented, “The first high-quality reference genomes of six bat species, generated with the Bat 1K consortium, revealed selection and loss of immunity-related genes that may underlie bats’ unique tolerance to viral infection. This finding provides novel avenues of research to increase survivability, particularly relevant for emerging infectious diseases, such as the current COVID-19 pandemic.”

Richard Durbin, a Professor at the University of Cambridge and lead of the VGP sequencing hub at the Wellcome Sanger Institute in the UK, added, “These studies mark the start of a new era of genome sequencing that will accelerate over the next decade to enable genomic applications across the whole tree of life, changing our scientific interactions with the living world.”  

Since its establishment in 2016, the VGP has involved hundreds of international scientists working together from more than 50 institutions in 12 different countries and is exemplary in its scientific cooperation, extensive infrastructure, and collaborative leadership. Furthermore, as the first large-scale eukaryotic genomes project to produce reference genome assemblies meeting a specific minimum quality standard, the VGP has become a leading example for other large consortia, including Professor Emma Teeling’s consortium Bat 1K, the Pan Human Genome Project, the Earth BioGenome Project, Darwin Tree of Life, and the European Reference Genome Atlas, among others.

The VGP will continue to work collaboratively across the globe and with other consortia to complete Phase 1 of the project, which covers approximately one representative species from each of 260 vertebrate orders, each separated by a minimum of 50 million years from a common ancestor with the other species in Phase 1. The project intends to create comparative genomic resources with these 260 species, including reference-free whole genome alignments, that will provide a means to understand the detailed evolutionary history of these species and create consistent gene annotations. Genome data are primarily generated at three sequencing hubs that have invested in the mission of the VGP: The Rockefeller University’s Vertebrate Genome Lab, New York, USA; the Wellcome Sanger Institute, UK; and the Max Planck Institute, Germany.

Phase 2 will examine representative species from each vertebrate family and is currently in the process of sample identification and fundraising. The VGP welcomes others to join its efforts, ranging from fundraising and sample collection to generating genome assemblies or including their own genome assemblies that meet the VGP metrics as part of the mission.

Links to all of the reports related to this package can be found on Nature’s website.