Fecal alchemy: Turning poop into genomics gold

When it comes to genotyping technology, poop genetics is stuck in the 1990s. While most geneticists are now awash in genome-scale data from thousands of individuals, those who depend on fecal and other non-invasively collected samples still rely on old-school, boutique panels of a dozen or so genetic markers.

But feces — along with fur, feathers, and urine — is critically important stuff for understanding the population genetics, ecology, evolution, behavior, and conservation of wild animals. Many are too elusive or endangered to allow collection of blood samples, and even for common species it is a logistical nightmare to immobilize and draw blood from large numbers of animals in the field. In the latest issue of GENETICS, Snyder-Mackler et al. describe tools that promise to advance studies of such samples into the genomic era.

Patrick Chiyo collecting noninvasive samples from elephants in Amboseli National Park. Photo courtesy Jenny Tung.

Noninvasively collected samples have the obvious advantage of easy access. “We have freezers and freezers full of baboon poop,” says study co-leader Jenny Tung (Duke University). Tung’s group works on behavior and genetics in a wild baboon population in Kenya. But though abundant, poop also presents serious challenges for standard genetic analysis. The DNA present in noninvasive samples is typically a fragmented mixture of host and contaminant sequence. For example, only around 1% of the DNA in a fecal sample comes from the animal that produced the poop. Most of the rest is microbial.

These limitations were first overcome in the 1980s and 1990s, and the ability to analyze DNA from noninvasive samples revolutionized the field. Using such samples not only allowed geneticists to understand the genetic diversity and viability of endangered animals, it allowed them to empirically test important theories about animal behavior and evolution.

“There are many examples. Noninvasive sampling of chimps, baboons, rhesus macaques and other primates revealed that animals really do bias their behavior towards relatives, even paternal relatives that are likely more difficult for an individual to identify as kin,” says Tung. “And in baboons, it also showed that males provide some paternal care to their offspring, which wasn’t expected for a polygamous primate.”

But the genotyping methods used in such studies have changed surprisingly little over the last twenty years. For the most part, researchers still use small groups of carefully validated markers, usually based on stretches of short tandem repeat sequences (microsatellites). This means the field has mostly missed out on the benefits of genomics that have become routine for medical researchers and those who work with laboratory organisms.

“Microsatellite approaches still work. But over the last 5 or 10 years it has become impossible to ignore the way genome-scale datasets allow you to answer entirely different questions,” says Tung.

For example, data on how a genome varies across a population can provide crucial evidence of the evolutionary and demographic forces that have shaped it. Genomic data can also trace in detail the mergers and separations of mixing populations.

Vet, a female yellow baboon, and her children in Amboseli National Park. Photo courtesy of Susan Alberts.

The good news for poop genomics is that short-read next-generation sequencing methods are well suited to the fragmented DNA found in noninvasive samples. These methods have been famously adapted for analyzing a sample type that also suffers from vanishingly small amounts of target sequence: ancient DNA. The bad news is that the expensive, intensive approaches that work well for a precious sample of Neanderthal bone are not practical for a geneticist facing a freezer full of poop.

About six years ago, Tung’s friend and colleague George (PJ) Perry published a major advance that allowed large-scale resequencing from noninvasive samples. It was based on a method known as sequence capture, which enriches for host sequence by using synthetic RNA “baits” to capture the target DNA. Tung was excited by the method’s possibilities but realized it was still too expensive for most applications. This was partly because the baits had to be custom-designed and synthesized for the species of interest. The method also captured only a tiny fraction of the genome while consuming large amounts of sample.

“Even fecal samples are exhaustible,” says Tung. “We have a lot of irreplaceable samples from dead animals, for instance. If we’re going to use them up, we want to cover all our bases and gather data on a truly genome-wide scale.”

So Tung’s group and their collaborators worked to modify and scale up Perry’s protocol. They also constructed the baits in a considerably cheaper way, using in vitro transcription of RNA from baboon DNA templates, sidestepping the need for custom synthesis. The new protocol had more modest input DNA requirements and could enrich the target DNA by 40-fold.
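
To get a feel for what these numbers mean in practice, here is a rough back-of-the-envelope sketch. It assumes the roughly 1% host DNA quoted earlier as the starting point and treats 40-fold enrichment as a simple multiplication of the host fraction, which is an idealization. The function names are hypothetical and are not taken from the paper.

```python
def host_fraction_after_capture(start_fraction: float, fold_enrichment: float) -> float:
    """Naive host-DNA fraction after capture, capped at 100%.

    Assumption: enrichment multiplies the host fraction directly.
    Real capture efficiency varies from sample to sample.
    """
    if not 0.0 <= start_fraction <= 1.0:
        raise ValueError("start_fraction must be between 0 and 1")
    if fold_enrichment < 0:
        raise ValueError("fold_enrichment must be non-negative")
    return min(1.0, start_fraction * fold_enrichment)


def reads_needed(target_host_reads: int, host_fraction: float) -> float:
    """Total reads to sequence in order to expect `target_host_reads` host reads."""
    if not 0.0 < host_fraction <= 1.0:
        raise ValueError("host_fraction must be in (0, 1]")
    return target_host_reads / host_fraction


before = 0.01                                    # ~1% host DNA in feces (from the article)
after = host_fraction_after_capture(before, 40)  # 40-fold enrichment (from the article)

print(f"host fraction before capture: {before:.0%}, after: {after:.0%}")
print(f"reads per million host reads, before: {reads_needed(1_000_000, before):,.0f}")
print(f"reads per million host reads, after:  {reads_needed(1_000_000, after):,.0f}")
```

Under these assumptions, the same number of host reads requires about 40 times less sequencing, which is where most of the savings would come from.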

But getting enough sequence per sample was just the beginning. Xiang Zhou (University of Michigan) led the group’s efforts to develop tools to analyze data from the new method. Zhou says one of the reasons microsatellites became so popular was the availability of standard and easy-to-use software for assigning paternity from the data. “If people are going to transition to a new method, we thought it would be incredibly important that we package our models into software that will make it as easy as possible,” says Zhou.

But to develop something comparable for low-coverage sequence, the team faced two major challenges: the data is simultaneously much richer (more sequence) and much lower quality (more uncertainty). To deal with the large quantity of data they needed much more computationally efficient algorithms. They also had to factor in the lower data quality, which makes it impossible to use the simpler approaches that work when the genotype at each site is known with certainty. Instead, they incorporated the error rate across all the sites in the genome, generating a sophisticated statistical model.
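
The general idea of working with uncertain genotypes can be sketched in a few lines. This is a textbook-style illustration, not the model implemented by the team. It assumes biallelic sites, a single fixed sequencing error rate, Hardy-Weinberg genotype frequencies, and independent sites. All function names and the input layout are hypothetical.

```python
import math

GENOTYPES = (0, 1, 2)  # number of copies of the alternate allele


def read_likelihoods(n_ref, n_alt, error=0.01):
    """P(observed reads | genotype) for each genotype, up to a constant factor."""
    out = []
    for g in GENOTYPES:
        # Chance that a single read shows the alternate allele.
        p_alt = (g / 2) * (1 - error) + (1 - g / 2) * error
        out.append((p_alt ** n_alt) * ((1 - p_alt) ** n_ref))
    return out


def hwe_prior(p):
    """Hardy-Weinberg genotype frequencies for alternate allele frequency p."""
    return [(1 - p) ** 2, 2 * p * (1 - p), p ** 2]


def offspring_probs(t_mother, t_father):
    """Offspring genotype probabilities given each parent's chance of
    transmitting the alternate allele."""
    return [
        (1 - t_mother) * (1 - t_father),
        t_mother * (1 - t_father) + (1 - t_mother) * t_father,
        t_mother * t_father,
    ]


def site_log_lr(mother, offspring, male, p, error=0.01):
    """Log likelihood ratio at one site: candidate male is the father
    versus candidate male is an unrelated animal.

    mother, offspring, male are (n_ref, n_alt) read counts.
    """
    lm = read_likelihoods(*mother, error=error)
    lo = read_likelihoods(*offspring, error=error)
    lf = read_likelihoods(*male, error=error)
    prior = hwe_prior(p)

    # H1: sum over every genotype the mother and candidate could have.
    h1 = 0.0
    for gm in GENOTYPES:
        for gf in GENOTYPES:
            kid = offspring_probs(gm / 2, gf / 2)
            kid_term = sum(kid[go] * lo[go] for go in GENOTYPES)
            h1 += prior[gm] * lm[gm] * prior[gf] * lf[gf] * kid_term

    # H0: the true father is a random male, so he transmits the
    # alternate allele with probability p.
    male_alone = sum(prior[gf] * lf[gf] for gf in GENOTYPES)
    h0 = 0.0
    for gm in GENOTYPES:
        kid = offspring_probs(gm / 2, p)
        h0 += prior[gm] * lm[gm] * sum(kid[go] * lo[go] for go in GENOTYPES)
    h0 *= male_alone

    return math.log(h1) - math.log(h0)


def paternity_log_lr(sites, error=0.01):
    """Sum per-site evidence; each site is (mother, offspring, male, p)."""
    return sum(site_log_lr(m, o, f, p, error=error) for m, o, f, p in sites)
```

In this sketch, a site where the candidate male has no reads contributes no evidence either way, and a site with a single read contributes only a little. The signal comes from adding up weak evidence over a very large number of sites, rather than from any confident genotype call.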

One of (several) freezers in the Tung lab containing boxes of fecal samples. Photo courtesy Jenny Tung.

Using the new capture method and the paternity assignment software (called WHODAD), the team was able to construct pedigrees from baboon fecal samples that almost perfectly matched those created using traditional analysis of high-quality DNA from blood. In short, despite the low coverage of the genome (typically less than 1x) and the resulting very high uncertainty about the genotype at any one site, the trends in the data were more than enough to reconstruct family relationships.

But what about cost? Lead author Noah Snyder-Mackler gave the project the pet name “fecal alchemy” because it aims to transform poop into a data goldmine. But not every researcher can afford gold — most labs must use the cheapest tool that will get the job done. Tung says they included a cost analysis in the paper because they are regularly asked about the price of making the switch.

“Right now it costs about twice as much to produce 1x coverage of the entire baboon genome as it does to type 14 microsatellites. But the amount of information you get is much greater! So if you’re thinking in terms of cost per genotype, our method is way more cost effective. But in terms of absolute amounts it’s more expensive. In the end the cost-benefit decision depends on what questions you’re trying to answer,” says Tung. “Of course we’d like to get it even cheaper and more efficient and more robust. We’re working on it!”
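
The per-genotype comparison in the quote can be made concrete with a little arithmetic. The sketch below uses only the two figures Tung gives (twice the total cost, 14 microsatellites) and expresses cost in relative units. It treats every marker as equally informative, which is a simplification: a single low-coverage site carries far less information than a well-typed microsatellite. The function names are hypothetical.

```python
def cost_per_marker(total_cost: float, n_markers: int) -> float:
    """Cost of one typed marker, in the same units as total_cost."""
    if n_markers <= 0:
        raise ValueError("n_markers must be positive")
    return total_cost / n_markers


def break_even_sites(cost_ratio: float, n_microsatellites: int) -> float:
    """Number of sequenced sites at which per-marker cost equals
    the per-marker cost of the microsatellite panel."""
    return cost_ratio * n_microsatellites


MICROSATELLITE_COST = 1.0                     # relative units
SEQUENCING_COST = 2.0 * MICROSATELLITE_COST   # "about twice as much"
N_MICROSATELLITES = 14

threshold = break_even_sites(SEQUENCING_COST / MICROSATELLITE_COST, N_MICROSATELLITES)
print(f"sequencing is cheaper per marker above {threshold:.0f} usable sites")
```

On a per-marker basis, the break-even point is tiny compared with the number of variable sites in a genome, which is the sense in which the sequencing approach is more cost effective even though the total bill is higher.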

FUNDING

This work was partly funded by the National Science Foundation DEB through an EAGER grant, with co-funding from NSF Biological Anthropology.

CITATION

Noah Snyder-Mackler, William H. Majoros, Michael L. Yuan, Amanda O. Shaver, Jacob B. Gordon, Gisela H. Kopp, Stephen A. Schlebusch, Jeffrey D. Wall, Susan C. Alberts, Sayan Mukherjee, Xiang Zhou, Jenny Tung (2016). Efficient Genome-Wide Sequencing and Low-Coverage Pedigree Analysis from Noninvasively Collected Samples. Genetics, 203(2), 699-714.

http://www.genetics.org/content/203/2/699

DOI: 10.1534/genetics.116.187492

Cristy Gelling is a science writer, lapsed yeast geneticist, and former Communications Director at the GSA.
