Towering sugar pine trees dominate the mountain forests of California and Oregon. They are the tallest pine trees in the world, with the tallest individuals reaching skyscraper heights of more than 80 meters. But these forest behemoths are under attack from a very tiny foe: an invasive fungus. White pine blister rust was accidentally introduced to western North America nearly a century ago. Since then, blister rust infections have threatened the survival and reproduction of sugar pines, harming the ecosystems and industries that depend on them. Conservation efforts have shown that genetic variation influences whether a given tree succumbs to infection, but efforts to track down the genes involved have been complicated by the tree’s staggeringly large genome and by the slow, arduous infection tests needed to score resistance. The sugar pine genome is ten times the size of the human genome: a whopping 31 billion base pairs. In the December issue of GENETICS, Kristian Stevens and colleagues announced the complete sequence of the sugar pine genome, the largest genome fully sequenced to date. Their work, along with a companion paper on the sugar pine transcriptome published in G3, highlights the evolutionary implications of such a massive genome and reveals candidate genes for blister rust resistance, as well as a promising path to efficient selection of resistant individuals.

Despite its size, the sugar pine genome contains about the same number of protein-coding genes as the human genome. The difference lies in transposable elements, which make up no less than 79% of the sugar pine’s DNA. These genetic parasites are stretches of DNA that exist only to proliferate within a genome. Rather than contributing to the sugar pine’s phenotype, they encode machinery that lets them make copies of themselves at new sites in the genome. Transposable elements are common in eukaryotic genomes, but in conifers, and especially the sugar pine, they have multiplied to extraordinary numbers. In the sugar pine genome, most of them are now non-functional relics. These genomic leftovers can tell researchers about the evolutionary history of the sugar pine and provide insights into how genome size evolves. They also create substantial problems for researchers trying to work with the sugar pine genome.
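To put those proportions in concrete terms, here is a back-of-the-envelope calculation in Python. It uses only the figures quoted in this post (31 billion base pairs, 79% transposable elements, and a human genome one tenth the size); the variable names are illustrative, and the results are rough estimates, not numbers reported in the papers.

```python
# Figures quoted in this post; everything derived from them is an estimate.
SUGAR_PINE_BP = 31_000_000_000   # total sugar pine genome size
TE_PERCENT = 79                  # share made up of transposable elements

# Integer arithmetic keeps the results exact.
te_bp = SUGAR_PINE_BP * TE_PERCENT // 100
non_te_bp = SUGAR_PINE_BP - te_bp

# Assumption: "ten times the size of the human genome" taken literally.
human_bp = SUGAR_PINE_BP // 10

print(f"transposable elements: {te_bp:,} bp")
print(f"everything else:       {non_te_bp:,} bp")
print(f"non-TE portion vs. human genome: {non_te_bp / human_bp:.1f}x")
```

Under these assumptions, the transposable elements alone come to roughly 24.5 billion base pairs, and even the remaining portion is about twice the size of the entire human genome.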

Transposable elements are highly repetitive, and when they are as numerous as they are in the sugar pine, they make a genome extremely difficult to sequence. Whole-genome sequencing generally works by breaking a genome into very small pieces, reading each piece, and then stitching the reads back together wherever they overlap. Repetitive sequences make this process difficult because copies of a repeat look the same to the assembly software and end up incorrectly merged into one sequence. To get around this problem, the researchers assembling the sugar pine genome used several strategies. They obtained most of the sequence data from a single haploid pine nut, avoiding the typical complications of sequencing two parental genomes in a diploid individual. They sequenced the transcriptome to identify the sequences that encode proteins, and then used those sequences to help assemble the corresponding genes. They also used specially prepared sequencing libraries in which pairs of reads are known to lie large distances apart, which helps link larger genomic structures and fill in the big picture. These techniques, along with others, allowed the researchers to build a useful working draft of the massive sugar pine genome.
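A toy example makes the repeat problem easier to see. The Python sketch below is not the assembly pipeline used for the sugar pine; it is a minimal illustration with a made-up 39-letter "genome" in which one invented repeat sits at two sites. The sequence, the piece length `K`, and the helper names are all assumptions chosen for illustration.

```python
from collections import Counter


def kmers(seq, k):
    """Return every overlapping substring of length k, in left-to-right order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def successors(seq, k, kmer):
    """Return the set of k-mers that directly follow any occurrence of kmer."""
    return {seq[i + 1:i + 1 + k] for i in range(len(seq) - k) if seq[i:i + k] == kmer}


# Hypothetical toy sequence: one repeat inserted at two sites with unique flanks.
REPEAT = "GATTACAGGCTA"
genome = "TTGCC" + REPEAT + "CGTAT" + REPEAT + "AACGG"
K = 5

# Shred the genome into short pieces and count how often each piece occurs.
counts = Counter(kmers(genome, K))
collapsed = {kmer: n for kmer, n in counts.items() if n > 1}

# Which pieces can follow the last piece of the repeat?
exits = successors(genome, K, REPEAT[-K:])

print("positions read:", sum(counts.values()))
print("distinct pieces:", len(counts))
print("pieces seen more than once:", sorted(collapsed))
print("possible continuations after the repeat:", sorted(exits))
```

Every piece that falls entirely inside the repeat shows up twice, so an assembler that merges identical pieces sees only one copy. When it reaches the end of the repeat, it finds two equally valid ways to continue and cannot tell which flank belongs to which copy. Long-distance read pairs help because they tie each copy of a repeat to the unique sequence around it.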

A twig infected with white pine blister rust. Photo by <a href="https://commons.wikimedia.org/wiki/File:Cronartium_ribicola_on_Pinus_strobus_abrimaal2013.jpg">Marek Argent via Wikimedia</a>.

Sequencing an entire genome, especially one as large as the sugar pine’s, is an impressive technological achievement. More importantly, it provides a powerful research tool in the fight against white pine blister rust. This fungus has been infecting multiple species of white pines in North America since it was first introduced from Asia around the turn of the 20th century. White pine blister rust is a slow killer, taking years to destroy a large tree. An infection begins when fungal spores land on the surface of the tree and germinate. The fungus grows through openings into the twigs and branches, and very slowly makes its way toward the main trunk. The infected branches swell, and large sacs of rusty orange-red spores burst through the bark. The infection causes cankers, which prevent the tree from sending water and nutrients to its damaged limbs. Eventually, these limbs die. If cankers form on the main trunk, the entire tree may die.

Researchers and forest managers have long been looking for a way to fight the spread of white pine blister rust. Some rare sugar pines carry genetic resistance to the disease and have been used in reforestation efforts. In the 1970s, these rare individuals were used to identify a major resistance locus called Cr1, but the daunting size of the sugar pine genome made further analysis difficult. Using the new genome sequence, Stevens and colleagues were able to make a breakthrough in identifying this gene. They used the small amount of genetic information already known to find large Cr1-associated segments and to identify previously unknown SNPs that are closely associated with resistance. These markers are a powerful tool that can be used to quickly and cheaply identify trees that carry the resistant allele, without waiting for the results of slow and expensive infection assays. Seeds can then be harvested from resistant trees for use in reforestation. Now armed with a roadmap, scientists can search the sugar pine genome for the secrets that may help save these iconic trees and the ecosystems that depend on them.
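In practice, marker-based screening boils down to checking each tree’s genotype at a linked SNP. The Python sketch below shows the idea under loudly stated assumptions: the tree names, genotype calls, and the choice of "T" as the resistance-linked allele are all invented for illustration and do not come from the papers.

```python
# Assumption for illustration only: the allele at one SNP that tends to be
# inherited together with the resistant Cr1 allele.
RESISTANCE_LINKED_ALLELE = "T"

# Hypothetical genotype calls (two alleles per diploid tree) at that SNP.
genotypes = {
    "tree_01": ("C", "C"),
    "tree_02": ("C", "T"),
    "tree_03": ("T", "T"),
    "tree_04": ("C", "C"),
}


def carries_allele(genotype, allele):
    """Return True if at least one of the tree's two alleles matches."""
    return allele in genotype


# Trees flagged as candidates for seed collection.
seed_candidates = sorted(
    tree
    for tree, genotype in genotypes.items()
    if carries_allele(genotype, RESISTANCE_LINKED_ALLELE)
)

print(seed_candidates)
```

A linked marker is a proxy for the resistance allele, not the allele itself, so a real screening program would presumably validate its markers against infection assays before relying on them.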

 

Stevens, K. A., Wegrzyn, J. L., Zimin, A., Puiu, D., Crepeau, M., Cardeno, C., Paul, R., Gonzalez, D., Koriabine, M., Holtz-Morris, A. E., Martínez-García, P. J., Sezen, U. U., Marçais, G., Jermstad, K., McGuire, P. E., Loopstra, C. A., Davis, J. M., Eckert, A., deJong, P., Salzberg, S. L., Neale, D. B., & Langley, C. H. (2016). Sequence of the Sugar Pine Megagenome. Genetics, 204(4), 1613-1626. http://www.genetics.org/content/204/4/1613.abstract

 

Gonzalez-Ibeas, D., Martinez-Garcia, P. J., Famula, R. A., Delfino-Mix, A., Stevens, K. A., Loopstra, C. A., Langley, C. H., Neale, D. B., & Wegrzyn, J. L. (2016). Assessing the gene content of the megagenome: sugar pine (Pinus lambertiana). G3: Genes|Genomes|Genetics, 6(12), 3787-3802. http://www.g3journal.org/content/6/12/3787.short

Katie is a science writer at GSA. She did her PhD work on the evolutionary consequences of genetic conflict in fruit flies at the University of Georgia.
