New & Noteworthy

Network Maintenance at SGD on July 15, 2015

July 07, 2015

The SGD website (www.yeastgenome.org) and all its resources (Download Server, GBrowse, SPELL, YeastMine, Pathway Tools, and Textpresso) will be unavailable on Wednesday, July 15, 2015 from 2:30-4:30 pm PDT (5:30-7:30 pm EDT, 9:30-11:30 pm GMT, 6:30-8:30 am Japan) for network maintenance. We will make every effort to minimize any downtime associated with this maintenance. We apologize for any inconvenience this may cause, and thank you for your patience and understanding.

Categories: Maintenance

SGD Help Video: YeastMine is Awesome!

July 07, 2015

If you’re not already using YeastMine to answer all your questions about the Saccharomyces cerevisiae genome and the gene products it encodes…you should be!

This versatile tool lets you slice and dice data from SGD in any way you choose. You can ask questions like “How many proteins between 25 and 35 kDa in size are integral to the nuclear membrane?” or “Which genes can mutate to confer oxidative stress resistance, and what biological processes are they involved in?”

Start with this video to see a quick sample of three cool features in YeastMine.

Categories: Tutorial

Tags: video, YeastMine

Where’s That Protein?

July 01, 2015

Waldo will always be hard to find, but we now know exactly where to find more than 4,000 S. cerevisiae proteins, thanks to new methods and an analysis pipeline. Image by William Murphy via Wikimedia Commons

You might be familiar with the Where’s Waldo book series, especially (but not necessarily) if you have kids. They challenge the reader to find Waldo within huge, intricately drawn groups of people. Even though Waldo has his distinctive characteristics—glasses and a striped shirt and hat—he can be very hard to find.

Now imagine that the drawings shift under different conditions, so that Waldo could be in any of several places at different times. And imagine that you’re not just looking for Waldo, but also for thousands of other unique individuals—all tagged in the same way. This is the challenge faced by researchers who want to know where each protein in a cell is located and how its location and abundance respond to different environments.

But, as genetic, robotic, microscopic, and computational tools get more and more sophisticated, it’s becoming possible to pinpoint Waldo and his companions even as they move around within the jam-packed yeast cell.

In two new papers, scientists from the University of Toronto describe a huge effort that entailed over 9 billion quantitative measurements to find the location and measure the abundance of more than 4,000 S. cerevisiae proteins. Chong and colleagues wrote in Cell about the approach and experimental methods, while Koh and colleagues published in G3 about the computational methods and the database that houses all the data, called CYCLoPs for Collection of Yeast Cells and Localization Patterns.

This work couldn’t have been done without a valuable resource that was created some years ago: the yeast GFP collection. It’s a set of strains, each with the green fluorescent protein gene fused to the 3’ end of one open reading frame to express a GFP fusion protein from the ORF’s native promoter. Not every yeast protein can be detected this way: some are expressed too weakly, while others may actually be destabilized by their GFP tags. Still, more than 4,100 of these fusion genes—71% of the proteome—give a visible GFP signal in the cell.

The researchers started with these ~4,100 strains and transformed each with a plasmid expressing red fluorescent protein. This allowed them to visualize the boundaries of each cell. Then they got to work, taking pictures of at least 200 cells of each strain and developing an automated pipeline to analyze them. They ended up analyzing 300,000 micrographs of more than 20 million cells, beating the few dozen Where’s Waldo books by a long shot!

The scientists looked at each protein in wild type, in a mutant strain, and in the presence of two drugs. The mutant strain they studied was deleted for RPD3, which encodes a lysine deacetylase that regulates the stability and interactions of histones and other proteins. The drug treatments were done with several different concentrations of rapamycin (an inhibitor of the TORC1 complex, which is an important regulator of cell growth) or hydroxyurea (a DNA replication inhibitor).

The end result was an enormous collection of data, now stored in the CYCLoPs database, that shows the abundance of each protein in each of 16 cellular compartments under all of these different conditions. These data are much more quantitative and consistent than any protein abundance or localization data that had been obtained before. They are stored in such a way that measurements within single cells can be accessed, and the database can be searched by patterns of changes in localization or abundance as well as for data on a particular protein.

The authors came up with some innovative methods for visualizing this immense dataset to get a high-level overview. One of their most surprising findings was just how many proteins localize to multiple places. We tend to think of the cell as a tidy place where each protein has one particular location, but Chong and colleagues found that it’s extremely common for proteins to be in several spots.

Most often, when proteins are present in more than one place, those places are the nucleus and the cytoplasm. Some proteins had already been shown in small-scale studies to be present in both compartments, or to shuttle between them. But the authors saw an astounding 1,029 proteins localizing to both the nucleus and cytoplasm under standard conditions in wild-type cells.

Not counting the proteins in the nucleus and cytoplasm, another 511 proteins localized to more than one place. Some were seen in up to five different subcellular compartments.

The proteins with multiple locations, as a group, were more likely than the average protein to be phosphorylated. This made sense, because phosphorylation of proteins is known to regulate their localization. And many of these proteins themselves had regulatory roles, controlling processes such as cell division.

The fact that data were collected from single cells means that we can use them to uncover the dynamics of protein movement. For example, if a protein was scored as localizing to both the nucleus and the cytoplasm, does that mean there’s a pool of it in both places at all times, or does it move back and forth? The single-cell data for two representative proteins, Mcm2 and Whi5, showed clearly that any one cell has each of these proteins in either the nucleus or cytoplasm, but not both. But some other proteins hang out in both places at once. And the dynamics of still more roving proteins are just waiting to be revealed.

Researchers will be mining the CYCLoPs resource to find detailed information about specific proteins, pathways, and processes for years to come. The data gathered in the rpd3 mutant and under rapamycin and hydroxyurea treatment served as proof of principle that the system can be used to assess the effects of a variety of mutations and drugs.

So this study puts a spotlight on Waldo in each picture and makes it simple to find him and his friends. This mass of data on where proteins are and how they move around has far-reaching implications for yeast systems biology, and the methodology can now be applied to cells of other organisms as well. In the coming weeks, we’ll make it even simpler for you to access these data from SGD, by adding links for individual proteins to the CYCLoPs database.

by Maria Costanzo, Ph.D., Senior Biocuration Scientist, SGD

Categories: Research Spotlight

Tags: protein abundance, protein localization, Saccharomyces cerevisiae

New SGD Help Video: Yeast-Human Functional Complementation Data

June 30, 2015

Yeast and humans diverged about a billion years ago, but there’s still enough functional conservation between some pairs of yeast and human genes that they can be substituted for each other. How cool is that?! Which genes are they? What do they do?

This two-minute video explains how to find, search, and download the yeast-human functional complementation data in SGD. You can find help with many other aspects of SGD in the tutorial videos on our YouTube channel. And as always, please be sure to contact us with any questions or suggestions.

Categories: Homologs, New Data, Tutorial

Tags: video, yeast model for human disease

The Sounds of Silencing

June 17, 2015

For centuries, we thought of the universe as an empty, eerily silent place. Turns out we were dead on when it came to the emptiness, not so much when it came to the silence.

Despite more and more powerful equipment, SETI has yet to find any meaningful radio signals coming from the stars. Yeast research is in a better position: new techniques applied to telomeric gene expression now make sense of the signals. Image by European Southern Observatory (ESO) via Wikimedia Commons

Once we invented devices that could detect electromagnetic radiation—starting with the Tesla coil receiver in the 1890s—we began to realize what a noisy place the universe really is. And now with modern radio telescopes becoming more and more sensitive, we know there is a cacophony of signals out there (although the Search for Extraterrestrial Intelligence has yet to find any non-random patterns).

The ends of chromosomes, telomeres, have also long been thought to be largely silent in terms of gene expression. But a new paper in GENETICS by Ellahi and colleagues challenges that idea.

Much like surveying the universe with a high-powered radio telescope, the researchers used modern techniques to make a comprehensive survey of the telomeric landscape–and saw that the genes were not so silent. Their work revealed that there’s a lot more gene expression going on at telomeres than we thought before.

It also gave us some fascinating insights into the role of the Sir proteins, founding members of the conserved sirtuin family that is implicated in aging and cancer.

Telomeres are special structures that “cap” the ends of linear chromosomes to protect the genes near the ends from being lost during DNA replication, something like aglets, those plastic tips that keep the ends of your shoelaces from fraying. They have characteristic DNA sequence elements that we don’t have space to describe here (but you can find a short summary in SGD).

Classical genetics experiments in Drosophila fruit flies showed that telomeres had a silencing effect on the genes near them, and early work in yeast seemed to confirm this. Reporter genes became transcriptionally silenced when they were placed near artificial constructs that mimicked telomere sequences.

This early work was solid, but had a few limitations. The artificial telomere constructs were, well, artificial; some of the reporter genes encoded enzymes that had an effect on overall cellular metabolism, such as Ura3; and the studies tended to look at just one or a few telomeres.

To get the whole story, Ellahi and colleagues decided to look very carefully at the telomeric universe of S. cerevisiae. First, they used ChIP-seq to look at the physical locations of three proteins, Sir2, Sir3, and Sir4, on chromosomes near the telomeres.

These proteins, first characterized and named Silent Information Regulators for their role in silencing yeast’s mating type cassettes, had been seen to also mediate telomeric silencing. Scientists had hypothesized that they might be present at telomeres in a gradient, strongly repressing genes close to the chromosomal ends and petering out with increasing distance from the telomere.

Ellahi and coworkers re-analyzed recent ChIP-seq data from their group to find where the Sir proteins were binding within the first and last 20 kb regions of every chromosome. These 20 kb regions included the telomere and the so-called subtelomeric region where genes are thought to be silenced. They found all three Sir proteins at all 32 natural telomeres.

However, the Sir proteins were not uniformly distributed across the telomeres, but rather occupied distinct positions. Typically, all three were in the same position, as would be expected since they form a complex. And they were definitely not in a gradient along the telomere.

Next the researchers asked whether gene expression was truly silenced in that subtelomeric region. They used mRNA-seq to measure gene expression from the ends of chromosomes in wild type or sir2, sir3, or sir4 null mutants.

They found that contrary to expectations, there is actually a lot of transcription going on near telomeres, even in the closest 5 kb region. The levels are lower than in other parts of the genome, but that can be partly explained by the fact that open reading frames are less dense in these regions. And only 6% of genes are silenced in a Sir-dependent manner.

The sensitivity of mRNA-seq allowed Ellahi and colleagues to uncover new patterns of gene expression in this work. They were able to detect very low-level transcription from some of the telomeric repetitive elements. Also, because the SIR genes are involved in mating type regulation, the mRNA-seq data from the sir mutants revealed a whole new set of genes that are differentially expressed in different cell types (haploids of mating types a and α, or a/α diploids).

The researchers point out that their work raises the question of why the cell would use the Sir proteins to repress transcription of a few subtelomeric genes. Wouldn’t it be more straightforward if these genes just had weaker promoters to keep their expression low?

They hypothesize that Sir repression could actually be part of a stress response mechanism, allowing a few important genes to be turned on strongly when needed. This idea could have intriguing implications for the role of Sir family proteins in aging and cancer in larger organisms.

So, neither the universe nor the ends of our chromosomes are as silent as we thought. But unlike the disappointed SETI researchers, biologists studying everything from yeast to humans can now build on this large quantity of meaningful data from S. cerevisiae telomeres.

by Maria Costanzo, Ph.D., Senior Biocuration Scientist, SGD

Categories: Research Spotlight

Tags: Saccharomyces cerevisiae, silencing, telomere

Yeast-Human Functional Complementation Data Now in SGD

June 10, 2015

Yeast and humans diverged about a billion years ago. So if there’s still enough functional conservation between a pair of similar yeast and human genes that they can be substituted for each other, we know they must be critically important for life. An added bonus is that if a human protein works in yeast, all of the awesome power of yeast genetics and molecular biology can be used to study it.

To make it easier for researchers to identify these “swappable” yeast and human genes, we’ve started collecting functional complementation data in SGD. The data are all curated from the published literature, via two sources. One set of papers was curated at SGD, including the recent systematic study of functional complementation by Kachroo and colleagues. Another set was curated by Princeton Protein Orthology Database (P-POD) staff and is incorporated into SGD with their generous permission.

As a starting point, we’ve collected a relatively simple set of data: the yeast and human genes involved in a functional complementation relationship, with their respective identifiers; the direction of complementation (human gene complements yeast mutation, or vice versa); the source of curation (SGD or P-POD); the PubMed ID of the reference; and an optional free-text note adding more details. In the future we’ll incorporate more information, such as the disease involvement of the human protein and the sequence differences found in disease-associated alleles that fail to complement the yeast mutation.

You can access these data in two ways: using two new templates in YeastMine, our data warehouse; or via our Download page. Please take a look, let us know what you think, and point us to any published data that’s missing. We always appreciate your feedback!

Using YeastMine to Access Functional Complementation Data

YeastMine is a versatile tool that lets you customize searches and create and manipulate lists of search results. To help you get started with YeastMine we’ve created a series of short video tutorials explaining its features.

Gene –> Functional Complementation template

This template lets you query with a yeast gene or list of genes (either your own custom list, or a pre-made gene list) and retrieve the human gene(s) involved in cross-species complementation along with all of the data listed above.

Human Gene –> Functional Complementation template

This template takes either human gene names (HGNC-approved symbols) or Entrez Gene IDs for human genes and returns the yeast gene(s) involved in cross-species complementation, along with the data listed above. You can run the query using a single human gene as input, or create a custom list of human genes in YeastMine for the query. We’ve created two new pre-made lists of human genes that can also be used with this template. The list “Human genes complementing or complemented by yeast genes” includes only human genes that are currently included in the functional complementation data, while the list “Human genes with yeast homologs” includes all human genes that have a yeast homolog as predicted by any of several methods.

Downloading Functional Complementation Data

If you’d prefer to have all the data in one file, simply visit our Curated Data download page and download the file “functional_complementation.tab”.
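Once downloaded, the file can be read with any tab-delimited parser. The sketch below is a minimal Python illustration, not a description of the actual file layout: it assumes the columns follow the order in which the fields are described above (yeast gene and identifier, human gene and identifier, direction, source, PubMed ID, note). The column names and the two sample rows are placeholders invented for the example, so check the real file before relying on this.

```python
import csv
import io

# Hypothetical column names; the real file's order and headers may differ.
COLUMNS = [
    "yeast_gene", "yeast_id", "human_gene", "human_id",
    "direction", "source", "pubmed_id", "note",
]

# Placeholder rows standing in for the downloaded file (not real data).
SAMPLE = (
    "YFG1\tSGD_ID_1\tHUMAN1\tHUMAN_ID_1\thuman complements yeast\tSGD\tPMID_1\t\n"
    "YFG2\tSGD_ID_2\tHUMAN2\tHUMAN_ID_2\tyeast complements human\tP-POD\tPMID_2\texample note\n"
)


def read_complementation(handle):
    """Return one dict per row of a tab-delimited complementation file."""
    reader = csv.DictReader(handle, fieldnames=COLUMNS, delimiter="\t")
    return [row for row in reader]


def human_partners(rows, yeast_gene):
    """List the human genes recorded for one yeast gene."""
    return [r["human_gene"] for r in rows if r["yeast_gene"] == yeast_gene]


rows = read_complementation(io.StringIO(SAMPLE))
print(human_partners(rows, "YFG1"))
```

To use it on the real download, replace the `io.StringIO(SAMPLE)` handle with an opened file and adjust `COLUMNS` to match what the file actually contains.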

Categories: New Data, Yeast and Human Disease

Tags: yeast model for human disease

Yeast are People Too

June 03, 2015

Cars on the road today all look pretty similar from the outside, whether they’re gasoline-fueled or electric. On the inside, they’re fairly similar too. Even between the two kinds of car, you can probably get away with swapping parts like the air conditioner, the tires, or the seat belts. Although cars have changed over the years, these things haven’t changed all that much.

Just like these cars, yeast and human cells have some big differences under the hood but still share plenty of parts that are interchangeable. Nissan Leaf image via Wikimedia Commons; Ford Mustang image copyright Bill Nicholls via Creative Commons

The engine, though, is a different story. All the working parts of that Nissan Leaf engine have “evolved” together into a very different engine from the one in that Ford Mustang. They both have engines, but the parts aren’t really interchangeable any more.

We can think of yeast and human cells like this too. We’ve known for a while that we humans have quite a bit in common with our favorite little workhorse S. cerevisiae. But until now, no one had any idea how common it was for yeast-human pairs of similar-looking proteins to function so similarly that they are interchangeable between organisms.

In a study published last week in Science, Kachroo and colleagues looked at this question by systematically replacing a large set of essential yeast genes with their human orthologs. Amazingly, they found that almost half of the human proteins could keep the yeast mutants alive.

Also surprising was that the degree of similarity between the yeast and human proteins wasn’t always the most important factor in whether the proteins could be interchanged. Instead, membership in a gene module—a set of genes encoding proteins that act in a group, such as a complex or pathway—was an important predictor.

The authors found that genes within a given module tended to be either mostly interchangeable or mostly not interchangeable, suggesting that if one protein changes during evolution, then the proteins with which it interacts may need to evolve as well. So we can trade air conditioner parts between the Leaf and the Mustang, but the Mustang’s spark plugs won’t do a thing in that newly evolved electric engine!

To begin their systematic survey, Kachroo and colleagues chose a set of 414 yeast genes that are essential for life and have a single human ortholog. They cloned the human cDNAs in plasmids for yeast expression, and transformed them into yeast that were mutant in the orthologous gene to see if the human gene would supply the missing yeast function.

They tested complementation using three different assays. In one, the human ortholog was transformed into a strain where expression of the yeast gene was under control of a tetracycline-repressible promoter. So if the human gene complemented the yeast mutation, it would be able to keep the yeast alive in the presence of tetracycline.

Another assay used temperature-sensitive mutants in the yeast genes and looked to see if the human orthologs could support yeast growth at the restrictive temperature. And the third assay tested whether a yeast haploid null mutant strain carrying the human gene could be recovered after sporulation of the heterozygous null diploid.

Remarkably, 176 human genes could keep the corresponding yeast mutant alive in at least one of these assays. A survey of the literature for additional examples brought the total to 199, or 47% of the tested set. After a billion years of separate evolution, yeast and humans still have hundreds of interchangeable parts!

That was the first big surprise. But the researchers didn’t stop there. They wondered what distinguished the genes that were interchangeable from those that weren’t. The simplest explanation would seem to be that the more similar the two proteins, the more likely they would work the same way.

But biology is never so simple, is it? While it was true that human proteins with greater than 50% amino acid identity to yeast proteins were more likely to be able to replace their yeast equivalents, and that those with less than 20% amino acid identity were least likely to function in yeast, those in between did not follow the same rules. There was no correlation between similarity and interchangeability in ortholog pairs with 20-50% identity.

After comparing 104 different types of quantitative data on each ortholog pair, including codon usage, gene expression levels, and so on, the authors found only one good predictor. If one yeast protein in a protein complex or pathway could be exchanged with its human ortholog, then usually most of the rest of the proteins in that complex or pathway could too.

This budding yeast-human drives home the point that humans and yeast share a lot in common: so much, that yeast continues (and will continue) to be the pre-eminent tool for understanding the fundamental biology of being human. Image courtesy of Stacia Engel

All of the genes that make the proteins in these systems are said to be part of a gene module. Kachroo and colleagues found that most or all of the genes in a particular module were likely to be in the same class, either interchangeable or not. We can trade pretty much all of the parts between the radios of a Leaf and a Mustang, but none of the engine parts.

For example, none of the tested subunits of three different, conserved protein complexes (the TRiC chaperone complex, origin recognition complex, and MCM complex) could complement the equivalent yeast mutations. But in contrast, 17 out of 19 tested genes in the sterol biosynthesis pathway were interchangeable.

Even within a single large complex, the proteasome, the subunits of one sub-complex, the alpha ring, were largely interchangeable while those of another sub-complex, the beta ring, were not. The researchers tested whether this trend was conserved across other species by testing complementation by proteasome subunit genes from Saccharomyces kluyveri, the nematode Caenorhabditis elegans, and the African clawed frog Xenopus laevis. Sure enough, alpha ring subunits from these organisms complemented the S. cerevisiae mutations, while beta ring subunits did not.

These results suggest that selection pressures operate similarly on all the genes in a module. And if proteins continue to interact across evolution, they can diverge widely in some regions while their interaction interfaces stay more conserved, so that orthologs from different species are more likely to be interchangeable.

The finding that interchangeability is so common has huge implications for research on human proteins. It’s now conceivable to “humanize” an entire pathway or complex, replacing the yeast genes with their human equivalents. And that means that all of the versatile tools of yeast genetics and molecular biology can be brought to bear on the human genes and proteins.

At SGD we’ve always known that yeast has a lot to say about human health and disease. With the growing body of work in these areas, we’re expanding our coverage of yeast-human orthology, cross-species functional complementation, and studies of human disease-associated genes in yeast. Watch this space as we announce new data in YeastMine, in download files, and on SGD web pages.

by Maria Costanzo, Ph.D., Senior Biocuration Scientist, SGD

Categories: Research Spotlight, Yeast and Human Disease

Tags: evolution, functional complementation, Saccharomyces cerevisiae, yeast model for human disease

Abstract Deadline Extended for ICYGMB

May 28, 2015

Breaking news: the deadline for submission of abstracts for oral presentations at the 27th International Conference on Yeast Genetics and Molecular Biology (ICYGMB) has been extended. Abstracts may be submitted for talks in the plenary sessions or in parallel workshops until June 7th.

Abstracts for poster presentations may be submitted until July 15th. But be sure to register for the conference by June 30th to get the early registration price!

Submit your abstract

Register for the conference

Conference home page

Categories: Conferences

The Failed Hook Up: A Sitcom Starring S. cerevisiae

May 27, 2015

As anyone who watches a situation comedy knows, long-distance relationships are tricky. The longer the couple is separated, the more they drift apart. Eventually they are just too different, and they break up.

Jerry Seinfeld finds girlfriends incompatible for seemingly minor reasons like eating peas one at a time. Different yeast strains become incompatible over small differences in certain genes as well. Image by Dano Nicholson via Flickr

Of course if this were the end of the story, it would be the plot of the worst comedy ever. What usually happens in the sitcom is that one or both of them find someone more compatible and live happily ever after (with lots of silliness and high jinks).

Turns out that according to a new study by Hou and coworkers, our friend Saccharomyces cerevisiae could star in this sitcom. When different populations live in different environments, they drift apart. Eventually, because they accumulate chromosomal translocations and other serious mutations, they have trouble mating and having healthy offspring.

Now researchers already knew that big changes in yeast, like chromosomal translocations, affect hybrid offspring. But what was controversial before and what this study shows is that, as is known for plants and animals, smaller changes like point mutations can affect the ability of distinct populations of yeast to have healthy progeny. It is like Jerry Seinfeld being incompatible with a girl because she eats her peas one at a time.

The key to finding that yeast can be Seinfeldesque was to grow hybrid offspring in different environments. Hybrids that did great on rich media like YPD sometimes suffered under certain, specific growth conditions. Relying on the standard medium YPD masked mutations that could have heralded the beginnings of a new species of yeast.

See, genetic isolation is a powerful way for speciation to happen. One population generates a mutation in a gene and the second population has a mutation in a second gene. In combination, these two mutations cause a growth defect or even death. Now each population must evolve on its own, eventually separating into two species.

To show that this is a route that yeast can take to new species, Hou and coworkers mated 27 different Saccharomyces cerevisiae isolates with the reference laboratory strain S288C and grew their progeny under 20 different conditions. These strains were chosen because they were all able to produce spores with S288C that were viable on rich medium (YPD).

Once they eliminated the 59 pairings that involved parental strains that could not grow under certain conditions, they found that 117 out of 481 or 24.3% of crosses showed at least some negative effect on the growth of the progeny under at least some environmental conditions. And some of these were pretty bad. In 32 cases, at least 20% of the spores could not survive.
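The numbers in this paragraph can be checked with a few lines of arithmetic. The sketch below assumes that a "pairing" means one cross grown under one condition, which is our reading of the text rather than something the post spells out.

```python
strains = 27        # isolates crossed with S288C
conditions = 20     # growth conditions tested
excluded = 59       # pairings dropped because a parent could not grow

# Cross-by-condition combinations left after the exclusions.
tested = strains * conditions - excluded

affected = 117      # combinations showing a negative effect on progeny growth
percent_affected = round(100 * affected / tested, 1)

print(tested, percent_affected)
```

Under that reading, the count of tested combinations and the percentage both match the figures quoted above.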

The authors decided to focus on crosses between S288C and a clinical isolate, YJM241, where around 25% of spores were inviable under growth conditions that required good respiration, such as the nonfermentable carbon source glycerol. They found that rather than each strain having a variant that affected respiration, the growth defect happened because of two complementary mutations in the clinical isolate.

The first was a nonsense mutation in COX15, a gene whose product is involved in maturation of the mitochondrial cytochrome c oxidase complex, which is essential for respiration. The second was a nonsense suppressor mutation in a tyrosine tRNA gene, SUP7. So YJM241 was fine because it had both the mutation and the mechanism for suppressing the mutation. Its offspring with S288C were not so lucky.

Around 1 in 4 progeny got the mutated COX15 gene without SUP7 and so could not survive under conditions that required respiration. Which of course is why this was missed when the two strains were mated on YPD, where respiration isn’t required for growth.
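The "1 in 4" figure is what simple Mendelian segregation predicts if the two loci are inherited independently. The sketch below enumerates the four possible spore genotypes from the hybrid; independent segregation is an assumption made for the illustration, and the allele labels are ours, not names from the paper.

```python
from fractions import Fraction
from itertools import product

# Each spore inherits one allele per locus from the hybrid diploid.
# First allele in each list comes from YJM241, second from S288C.
cox15_alleles = ["cox15-nonsense", "COX15"]
sup7_alleles = ["SUP7-suppressor", "no-suppressor"]

# With independent segregation, all four combinations are equally likely.
spores = list(product(cox15_alleles, sup7_alleles))


def respires(cox15, sup7):
    """A spore can respire unless it has the nonsense allele without the suppressor."""
    return not (cox15 == "cox15-nonsense" and sup7 == "no-suppressor")


inviable = [s for s in spores if not respires(*s)]
fraction_inviable = Fraction(len(inviable), len(spores))
print(fraction_inviable)
```

Only one of the four genotype classes lacks both a working COX15 and the suppressor, which is where the quarter of spores that fail on glycerol comes from.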

So this is a case where the separated population, the clinical isolate YJM241, changed on its own such that it would have difficulty producing viable progeny with any other yeast strains. Like the narrator in that old Simon and Garfunkel song, it had become an island unto itself.

The researchers wondered whether this kind of change—a nonsense mutation combined with a suppressor—occurs frequently in natural yeast populations. They surveyed 100 different S. cerevisiae genome sequences and found that nonsense mutations are actually pretty common. Nonsense suppressor mutations were another story, though: they found exactly zero.

Apparently nonsense suppressor mutations are really rare in the yeast world, and Hou and colleagues wondered whether this was because they had a negative effect on growth. They added the SUP7 suppressor mutant gene to 23 natural isolates. It had negative effects on most of the isolates during growth on rich media, but it was more of a mixed bag under various stress conditions. Sometimes the mutation had negative effects and sometimes it had positive effects.

The fact that a suppressor mutation can provide a growth advantage under the right circumstances, combined with the fact that such mutations are very rare, suggests that a new suppressor might help a yeast population out of a jam, but that once the environment improves the yeast are free to jettison it. Suppressor mutations may be a transitory phenomenon, a momentary dalliance.

So, separate populations of yeast can change over time in subtle ways that prevent them from mating with one another. This can eventually lead to the formation of new species as the changes cause the two to drift too far apart genetically. It is satisfying to know that yeast drift apart like any other plant, animal, or sitcom character.

by D. Barry Starr, Ph.D., Director of Outreach Activities, Stanford Genetics

Categories: Research Spotlight

Tags: evolution, Saccharomyces cerevisiae

Getting the Big Picture from 100 Genomes

May 20, 2015


Like the Peruvian Hairless dog, in some ways the S288C genome looks quite different from other members of its species. Image via Wikimedia Commons

Imagine if aliens visited the earth to learn about dogs, but they stumbled upon a colony of the very rare Peruvian Hairless. Taking a sample for DNA analysis, they would retreat to their home planet, do their studies, and conclude that all dogs had smooth, mottled skin and a stiff mohawk—as well as whatever crazy mutations the Peruvian Hairless happens to carry.

Until recently, S. cerevisiae researchers have been a bit like those aliens. The genomic sequence of the reference strain S288C was completed in 1996, and for a long time it was the only sequence available. Scientists knew a lot about the S288C genome, but they didn’t have any perspective on the species as a whole.

In the past few years, genomic sequences have become available from a handful of other strains. But now, as described in a new paper in Genome Research, Strope and colleagues have determined the genomic sequences of 93 additional S. cerevisiae strains to make the number an even hundred.

This collection of strains and sequences has already provided new insights into yeast phenotypic and genotypic variation, and represents an incredible resource for future studies. And the comparison with this collection of other strains suggests that in some ways, S288C may be just as unusual as the Peruvian Hairless.

Having all of these strains and their sequences gave the researchers a much broader perspective across the whole S. cerevisiae species. It’s as if the aliens discovered Golden Retrievers, Great Danes, Chihuahuas, and more. We only have space here to touch upon a few of the highlights.

First off, they confirmed what many yeast researchers have suspected for a while: S288C is a bit odd. We already knew that S288C carries polymorphisms in several genes that affect its phenotype. For example, the MIP1 gene in S288C encodes a mitochondrial DNA polymerase that is less efficient than in other strains, making its mitochondrial genome less stable.

Back when fewer strain sequences were available, it wasn’t clear whether the S288C polymorphisms in genes like MKT1, SSD1, MIP1, AMN1, FLO8, HAP1, BUL2, and SAL1 were the exception or the rule. Now that Strope and colleagues had 100 genomes in hand, they could see that these differences are indeed peculiar to S288C and its close relative W303. They might have arisen because of the long genetic isolation of the strains, or because of special selective pressures they faced during growth in the lab.

They also found a lot of variation in how often S. cerevisiae strains have acquired whole chromosomal regions from other Saccharomyces species. This process, known as introgression, happens when related species mate to form hybrids. Stretches of DNA that are transferred in this way are recognizable because gene order is preserved, but all the genes they contain are highly diverged.

The researchers found 141 of these regions containing 401 genes. Many showed similarity to S. paradoxus, which is known to hybridize with S. cerevisiae, but others apparently came from unknown, as yet unsequenced Saccharomyces species. In a couple of cases that the authors looked at closely, the introgressed genes had slightly different functions from their native S. cerevisiae counterparts.

Another notable finding by Strope and colleagues concerned some genes that exist in multiple copies. The ENA genes, encoding an ATP-dependent sodium pump, are present in 3 copies in S288C (ENA1, ENA2, and ENA5), while the CUP1-1 and CUP1-2 genes, encoding metallothionein that binds to copper and mediates copper resistance, are present in 10-15 copies.

To get perspective on a whole species, you need to look at lots of different examples. Image by Sue Clark via Flickr

The sequence coverage in these regions relative to their flanking regions allowed the researchers to see exactly how many repeats are present in each strain. All had between 1 and 14 copies of ENA genes and between 1 and 18 copies of CUP genes. Interestingly, the strains of clinical origin had significantly higher copy numbers of CUP genes than the non-clinical strains, suggesting that copper resistance is an important trait for virulence.
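The idea of reading copy number off relative coverage can be sketched in a few lines. This is only an illustration, not the authors' actual pipeline. It assumes that reads from every repeat copy pile up on a single copy in the assembly, so the depth there scales with the number of copies. The function name and the depth values are made up.

```python
from statistics import median


def estimate_copy_number(region_depths, flank_depths):
    """Estimate repeat copy number as the ratio of median read depths.

    Assumption: all copies of the repeat map onto one copy in the assembly,
    while the flanking sequence is single-copy.
    """
    flank = median(flank_depths)
    if flank <= 0:
        raise ValueError("flanking depth must be positive")
    return round(median(region_depths) / flank)


# Hypothetical per-base read depths, not real data.
repeat_depths = [700, 690, 710, 705]
flank_depths = [100, 98, 102, 101]
print(estimate_copy_number(repeat_depths, flank_depths))  # 7
```

Medians are used here rather than means so that a few unusually deep or shallow positions do not skew the estimate.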

So, instead of being confined to the S288C genome, S. cerevisiae researchers can now get a much fuller idea of the range of genetic and phenotypic variation within the species. The strains (available at the Fungal Genetic Stock Center), along with their genome sequences (available in GenBank), are an amazing resource for classical and quantitative genetics and comparative genomics.

Unlike those aliens, we won’t end up thinking of yeast as a mostly bald dog with a mohawk. No, we will have a fuller picture of S. cerevisiae strains in all their glory.

A few technical details

In selecting the strains to sequence, Strope and colleagues chose from a wide variety of yeast cultures isolated from the environment and from hospital patients with opportunistic S. cerevisiae infections. But they faced a problem: many of the cultures had irregular numbers of chromosomes or genome rearrangements, which would complicate both interpretation of the sequence data and any future genetic analysis.

To avoid this problem, the researchers selected only strains that were able to sporulate and produce four viable spores—showing that their genomes weren’t messed up. They also wanted strains with no auxotrophies (nutritional requirements), since these can negatively affect growth and complicate the comparison of phenotypes. In some cases, they corrected specific mutations in the strains to increase their fitness.

They ended up with 93 homozygous diploid strains to sequence. Producing paired-end reads of 101 bp, they generated genome assemblies that had 22- to 650-fold coverage per strain.
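To put those coverage numbers in context, fold coverage is roughly the total number of sequenced bases divided by the genome size. The sketch below assumes a genome of about 12.1 Mb, which is an approximate figure for S. cerevisiae and not a number from the paper. The function names are hypothetical.

```python
import math

GENOME_SIZE_BP = 12_100_000  # assumption: approximate S. cerevisiae genome size
READ_LENGTH_BP = 101         # read length reported for this study


def fold_coverage(num_reads, read_length=READ_LENGTH_BP, genome_size=GENOME_SIZE_BP):
    """Average fold coverage from a count of individual reads."""
    return num_reads * read_length / genome_size


def reads_needed(target_coverage, read_length=READ_LENGTH_BP, genome_size=GENOME_SIZE_BP):
    """Smallest number of reads that reaches the target average coverage."""
    return math.ceil(target_coverage * genome_size / read_length)


low = reads_needed(22)
high = reads_needed(650)
print(low, high)
print(round(fold_coverage(low), 2))
```

Under these assumptions, the low end of the reported range corresponds to a few million reads per strain and the high end to tens of millions. Each pair in a paired-end run counts as two reads here.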

Because the sequence reads were relatively short, they didn’t provide enough information to assemble the sequence across repetitive regions. So Strope and colleagues used a genetic method to determine gene order. They crossed haploid derivatives of the strains to the reference strain S288C; if their genomes were not colinear with that of S288C, then some of the resulting spores would be inviable.

This analysis showed that 79 of the strains had chromosomes colinear to those of S288C, and allowed assembly of their genomes across multicopy sequences. The remaining strains had chromosomal translocations relative to S288C. Twelve of these carried the same reciprocal translocation between chromosomes 8 and 16.

by Maria Costanzo, Ph.D., Senior Biocuration Scientist, SGD

Categories: Research Spotlight

Tags: genome, Saccharomyces cerevisiae, strains