Wolverine genome assembly sets a standard for conservation geneticists
Chromosome-level assembly of the North American wolverine sets a new standard for the weasel (Mustelidae) family.
Look at any list of the top 10 most aggressive animals, and you will undoubtedly find a mugshot of the North American wolverine. Although much smaller than most of the other animals accompanying it on such lists—such as the hippopotamus and the wild boar—this feisty member of the weasel family has been a protagonist in popular myths, even serving as inspiration for superhero characters and as mascots of sports teams. A wide range of cultural, social, economic, and psychological factors influence human-wolverine relations. In many Indigenous societies, the wolverine is a respected cultural keystone species and is often viewed as a trickster. Unfortunately, these solitary carnivores rely on cold temperatures and snowy environments for their reproduction and survival, so their future remains uncertain in a warming world.
A new study published in the August issue of G3: Genes|Genomes|Genetics provides geneticists and conservation biologists with a high-quality, chromosome-level genome assembly of the wolverine, including extensive annotation of genes involved in behavior and immune responses to pathogens. The authors’ goal goes far beyond the wolverine: they seek to provide similar high-quality assemblies for other species predicted to be impacted by increasing global temperatures. This means setting the benchmark for a workflow that offers the best compromise possible when balancing cost, time, simplicity, accuracy, and completeness for long-read assembly and genome annotation.
“Our goal is to replace the existing short-read assemblies and increase the quality standards for new reference genomes in light of current sequencing technologies,” says lead author Si Lok.
Improving the standards of genome assembly in the current era
The DNA sequenced in the report comes from a 30-year-old tissue sample of a male wolverine specimen from the Kugluktuk (Coppermine) region of Nunavut. Lok and colleagues used PacBio continuous long-read (CLR) mode, which typically provides maximal read length at a reasonable cost; however, it is prone to pseudo-random errors at a rate of 15–20%. To mitigate such inaccuracies while retaining the benefits of this approach, the authors used a two-step workflow for genome assembly: 1) the uncorrected CLRs are assembled using the Flye assembler, followed by a polishing regimen with high-quality Illumina short reads, and 2) the polished assembly is scaffolded against assemblies of related family members.
“It took us nearly two years to optimize a workflow that produces a final genome assembly comprising well less than 1000 contigs—about 10–100 times better than those found in most genome reports—at a cost of under $10,000,” says Lok. That cost continues to fall.
This new workflow leads to striking completeness and accuracy: 99.98% of the current BUSCO set of 9,226 genes used to assess assembly quality are complete at the exon level in the wolverine assembly, placing it in the top tier of assemblies produced from long reads. Lok hopes that their report shows how cost-effective, accurate, and complete sequencing and assembly can be nowadays. “No future genome reports should be less than chromosome-level, given the current technologies.”
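To put the completeness figure in concrete terms, here is a rough back-of-the-envelope sketch that converts the quoted percentage into approximate gene counts. It is not a calculation taken from the paper: the function name `implied_counts` is made up for illustration, and because 99.98% is a rounded figure, the resulting counts are only approximate.

```python
# Figures quoted in the article.
BUSCO_TOTAL = 9_226          # genes in the BUSCO set used for assessment
COMPLETE_FRACTION = 0.9998   # 99.98% reported as complete (a rounded value)


def implied_counts(total: int, fraction: float) -> tuple[int, int]:
    """Return (complete, not_complete) counts implied by a rounded fraction.

    Hypothetical helper for illustration only; the counts are estimates
    because the published percentage has already been rounded.
    """
    complete = round(total * fraction)
    return complete, total - complete


complete, not_complete = implied_counts(BUSCO_TOTAL, COMPLETE_FRACTION)
print(f"{complete} complete, {not_complete} not complete")

# Sanity check: the estimated count rounds back to the quoted percentage.
print(f"check: {100 * complete / BUSCO_TOTAL:.2f}%")
```

Under these assumptions, only about two genes out of more than nine thousand in the benchmark set would fall short of complete.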
Conservation genomics meets wolverine behavior
The new article also provides the first full-length mitochondrial genome assembly for the North American wolverine, as well as a tabulation of potential microsatellite markers for the wolverine. Since monitoring population size and distribution, reproductive success, and gene flow in wild populations often relies on analyses of mitochondrial DNA and microsatellites, the authors hope to provide a resource for developing these and other species-specific genomic markers.
In addition, Lok and colleagues annotated genes whose orthologs have been associated with aggressive traits in other organisms (an adaptation that drives competition for food and mates), as well as the key components of innate immune responses. “Environmental disruptions from climate change will increase vulnerabilities to new pathogens,” says Lok.
“We are in the process of reporting genomes for other species predicted to be heavily affected by climate change in efforts to support their conservation and ecological relationships, such as that of the Canada lynx and the snowshoe hare,” says Lok. He hopes this report will set a minimum standard of quality for future genome reports and resources for conservation biologists.
Chromosomal-level reference genome assembly of the North American wolverine (Gulo gulo luscus): a resource for conservation genomics
Si Lok, Timothy N H Lau, Brett Trost, Amy H Y Tong, Richard F Wintle, Mark D Engstrom, Elise Stacy, Lisette P Waits, Matthew Scrafford, and Stephen W Scherer
G3: GENES|GENOMES|GENETICS August 2022, jkac138