Story behind the paper guest post by Corey Nislow (w/ Metka Lenassi) on "Genomics w/o Borders"

Below is another in the “Story behind the paper” series of guest posts here.  This one is from Corey Nislow w/ Metka Lenassi.  If anyone else has published an open access paper on anything relating to this blog and would like to write a guest post on the Story behind the paper, please let me know.

Genomics without Borders: Genome Sequence of the Extremely Halotolerant Yeast Hortaea werneckii 

by Corey Nislow (with Metka Lenassi)

In this guest post (thank you Jonathan!) I wanted to tell the story behind a paper that my colleagues and I published two weeks ago in PLoS ONE. The story also offers an opportunity to talk about what role, if any, a middle author can play in a scientific study.

The story is set in Slovenia, a beautiful country that was part of the former Yugoslavia and is home to about 2 million inhabitants, 2400+ fungal species (thanks Wikipedia) and some very interesting environments. One of these environments is the Secovlje Salterns, where one can find the yeast Hortaea werneckii.

A worker harvests sea salt in the Secovlje salterns, July 17, 2010. Some 2600 tons of salt are expected to be produced during the two-and-a-half-month season at the salterns. (Xinhua/Reuters Photo)

I hadn’t heard of Hortaea until I started googling around looking for a yeast extremophile that I could grow in the lab, so that I could dissect out its nucleosomes and ask questions about nucleosome occupancy and transcription in the face of extreme environments. Turns out it was not a crazy idea–

Thirteen years ago a peculiar black yeast, Hortaea werneckii, was isolated from its natural habitat: waters containing so much salt that it would kill most living organisms instantly. Since then, two small (but enthusiastic) Slovenian groups have tried to understand its halotolerance. This demanded field trips to the beautiful Slovenian coast, but also a lot of hard work and inventiveness to optimize protocols used for other organisms – and to do it on a low budget. The first important obstacle was actually cultural – to persuade the scientific community that such an extreme yeast even exists in nature! You can see it below. We now have ample evidence, as Hortaea has been isolated from many seawater-related environments and saline lakes, but also from surface layers of tropical microbial mats in salterns and even from spider webs in Atacama Desert caves. All these different Hortaea strains are now waiting in their freezer (the Ex culture collection) to be analyzed.

Hortaea werneckii growing happily on 2M salt.

The figure below summarizes what was known about halotolerance of Hortaea before the genome sequence was decoded. In brief, high salinity is detected by sensors of the HOG signaling pathway (green arrows), which modulate the expression of salt-responsive genes (underlined green). The expression of other genes also varies (genes with higher expression at high salinity are written in red, repressed genes in blue). The impact of a hyperosmolar environment is countered by increasing the energy supply to drive energy-demanding processes such as export of Na+ and H+, import of glycerol and the synthesis of compatible solutes. Melanization of the cell wall reduces the leaking of solutes from the cells, and restructuring of membrane lipids helps preserve the integrity of the cells. Read this paper if you want to know more: Gostinčar et al, Adv Appl Microbiol. 2011;77:71-96.

Gostinčar et al, Adv Appl Microbiol. 2011;77:71-96

This critter, as our recent paper reports, is as interesting genotypically as it is phenotypically. The full genome sequence reported in the PLoS ONE paper shows that the genome size is 51 Mb, quite a bit larger than that of its closest relatives, and given the number of gene models detected (20,000!), for all intents and purposes it looks like Hortaea underwent whole genome duplication last weekend!

With my interest piqued, I immediately started searching for the genome sequence to have a reference to map nucleosome sequencing reads. Turns out, I had requested the strain from Metka years ago, only to find that one of our lab mates, Uros, with whom I was collaborating at the University of Toronto, had performed some of the groundwork on Hortaea for his PhD. But the network connections don’t stop here. I moved to the University of British Columbia last year, and as it happened Hortaea is popular in Vancouver too! Our new colleagues at UBC were working together with the Slovenian team on sequencing and analyzing the Hortaea genome. In fact, the collaboration started in 2005, catalyzed by a poster at the Budapest FEBS conference where Metka, at that time still a PhD student, and Ivan Sadowski started a discussion about the interesting phenotypic switch that Hortaea undergoes between yeast and filamentous forms. So, by virtue of a convergence of curiosity, good luck and generous collaborators, I had the good fortune of being an active participant in the study.

So how does this have anything to do with what a middle author does or doesn’t do on a manuscript? And why do I care? 

Well, I recently re-read the comments section of a fellowship application, and ginned up the guts to read the “supervisor/training environment” section. The chief criticism was that I have a lot of papers on which I am not the senior author. So to the skeptics, I would say: even middle authors play important roles in bringing a study to an audience. In my case my self-interests guided my actions, but along the way, I had the chance to learn about an extraordinary critter, and an amazing group of Slovenian scientists. Yeah, I needed the genome sequence, but I was also excited to help draft the manuscript, have our sequencing facility prepare additional libraries to close some gaps, and now to bring attention to this extraordinary critter.

The genome sequence offers an exciting new start in studies of Hortaea werneckii. Going forward, the Slovenians want to study its transcriptome and proteome in response to increasing salinity. Preparing knock-out mutants is also a must, to find key genes important for halotolerance. We definitely want to take a closer look at all those cation transporters and their functions. It would also be fun to find its mating partner in one of those frozen Hortaea samples. And now that the genome sequence is available to everybody, the research on this extremely interesting species may start to gain more appeal even to researchers beyond the two stubborn Slovenian groups.

Although I might not get to Slovenia in the foreseeable future, I wouldn’t be surprised if one of my graduate students meets up with the group at an upcoming yeast meeting. This particular student is dragging our lab into evolutionary genomics by trying to see if he can’t get Hortaea to lose some of its genome in long-term culture (I can’t help but think of “Amadeus” where Salieri is telling Mozart that the composition is fine but it has too many notes….). I’m sure the results will be surprising, and I am also eager to see what our future collaboration will bring.

Story behind the paper: Corey Nislow on Haloferax Chromatin and eLife

This is fun.  Today I am posting this guest post from Corey Nislow in my continuing “Story behind the paper” series.  The history of this post is what is most fun for me.  A few weeks ago I received this email from Corey:

Hi Jonathan, I hope this mail finds you well.
I wanted to alert you to a study from our lab that will be coming out in the inaugural issue of eLIFE.
After reading your PLoS ONE paper on the Haloferax volcanii genome (inspiration #1) I ordered the critter, prepared nucleosomes and RNA and we went mapping. Without a student to burden, I actually had to do some work…
Anyhow, we found that the genome-wide pattern of nucleosome occupancy and its relation to gene expression was remarkably yeast like. Unsure of where to send the story, we rolled the dice with the new open access journal eLIFE (inspiration #2) and the experience was awesome. I’m quite keen to pursue generating a barcoded deletion set for Hfx.
here’s the paper (coming out Dec. 10) if you’re curious.

And a PDF of the paper was attached.

And I wrote back quickly in my typically elegant manner:

completely awesome

But then I thought better of it and wrote again

So – can I con you into writing a guest post for my blog about the story behind this paper?  Or if you are writing a description somewhere else I would love to share it

And he said, well, yes.  And with a little back and forth, he wrote up the post that is below.  Go halophiles.  Go Haloferax.  Go open access.  Go science.

Chromatin is an ancient innovation conserved between Archaea and Eukarya  – The story behind the story
By Corey Nislow

My group first became interested in understanding the global organization of chromatin in early 2005 when Lars Steinmetz (now program leader at the EMBL) led a team effort at the Stanford Genome Center to design a state-of-the-art whole genome tiling microarray for Saccharomyces cerevisiae. These were heady times at Ron Davis’ Genome Technology shop and the array was another triumph of technology and teamwork. The array has over 7 million exceedingly small (5 µm²) features. The history of how this microarray transformed our understanding of the transcriptome began in 2006. As Lars’ group dug deeper, the extent of antisense transcription and its role in the regulation of expression became clear.

The availability of this array and its potential for asking interesting questions inspired me to convince William Lee, a new graduate student in my group (now at Memorial Sloan-Kettering), to embark on a seemingly simple experiment. The idea was to ask if we could use the classic micrococcal nuclease assay to define nucleosome positioning on a DNA template. But rather than using a short stretch of DNA that could be assessed by radioactive end-labeling and slab gel analysis, we decided the time was right to go “full-genome”. Accordingly, the template was all ~12.5 Mb of the yeast genome. Will systematically worked out conditions appropriate for hybridization, wrote the software to extract signal off the array (we were flying blind as the array did not come with an instruction manual) and produced an output that was compatible with the genome browsers of the time. Will’s computational background proved critical here (and at several later stages of the project). The result of this experiment was a map of the yeast genome with each of its approximately 70,000 nucleosomes charted with respect to their occupancy (the length of time that the nucleosomes spend in contact with the DNA) and positioning (the location of a particular nucleosome relative to specific sequence coordinates) in a logarithmically growing population of cells (the paper). Both occupancy and positioning regulate access of most trans-acting factors for all DNA transactions. Working with my new colleague Tim Hughes at the University of Toronto, we began to mine this data, focusing first on how the diverse occupancy patterns correlated with aspects of transcription, e.g. the presence of transcription factor binding sites, the level of expression of particular genes, and the like. With this data for the entire genome, we could systematically correlate nucleosome positioning/occupancy with functional elements, sequence logos and structural features.
Des Tillo, a graduate student in Tim’s lab and now a research fellow with Eran Segal, was able to build a model that could predict nucleosome occupancy. The correlation (R=0.45) was not great but it was miles better than anything that existed at the time. Tim and Eran’s labs, working with Jason Lieb and Jonathan Widom, later refined the model to greater accuracy (the 2009 model).

Our original study (essentially a control experiment to define the benchmark nucleosome map in yeast) has been widely cited; many of these citations have come from what were two opposing camps, the sequence advocates and the trans-acting proponents. The sequence folks posited that nucleosome position is directed by the underlying sequence information while the trans-acting folks see chromatin remodelers as having the primary role. Having last worked on chromatin in 1995 as a postdoc in Lorraine Pillus’ lab (cloning yeast SET1), it has been a scientific treat to be both a participant and observer in this most recent renaissance of chromatin glory.

The protocol

As a reminder, the micrococcal nuclease (MNase) assay relies on the preference of this nuclease to digest linker DNA. By chemically crosslinking histones to DNA with formaldehyde, digesting with MNase, then reversing the crosslinks and deproteinizing the DNA, you obtain 2 populations of DNAs: those protected from digestion (and presumably wrapped around nucleosomes in vivo) and a control sample that is crosslinked but not digested (genomic DNA). The former sample becomes the numerator and the latter the denominator, and you take the ratio between the two. Initially we compared the microarray signal intensities; now next-generation sequence counts are used to define nucleosomal DNA. This cartoon depicts the array-based assay, but simply swap in an NGS library step for the arrays to upgrade to the current state-of-the-art.
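For readers who like to see the arithmetic, here is a minimal sketch of the numerator/denominator ratio described above, applied to per-position sequence counts. It is an illustration only and not the pipeline used in the paper: the function name `occupancy_log2_ratio`, the depth normalization, the log2 scale and the `pseudocount` parameter are all assumptions made for this example.

```python
import math


def occupancy_log2_ratio(mnase_counts, control_counts, pseudocount=1.0):
    """Per-position log2(MNase-protected / genomic control).

    Hypothetical helper for illustration. Each track is first scaled by its
    own total so that differences in sequencing depth cancel out, and a
    pseudocount avoids dividing by zero at positions with no control reads.
    """
    if len(mnase_counts) != len(control_counts):
        raise ValueError("both tracks must cover the same positions")

    n = len(mnase_counts)
    mnase_total = sum(mnase_counts) + pseudocount * n
    control_total = sum(control_counts) + pseudocount * n

    ratios = []
    for m, c in zip(mnase_counts, control_counts):
        numerator = (m + pseudocount) / mnase_total      # protected sample
        denominator = (c + pseudocount) / control_total  # undigested control
        ratios.append(math.log2(numerator / denominator))
    return ratios


# Toy example: positions with positive values are enriched in the
# MNase-protected sample (candidate nucleosomal DNA), negative values
# are depleted (candidate linker or nucleosome-free DNA).
toy_mnase = [40, 42, 38, 5, 4, 41, 39]
toy_control = [20, 21, 19, 20, 22, 20, 21]
for position, value in enumerate(occupancy_log2_ratio(toy_mnase, toy_control)):
    print(position, round(value, 2))
```

Working on a log scale is a common convenience here because enrichment and depletion become symmetric around zero, but any monotonic transform of the ratio would carry the same information.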

In 2007 we were restricted to array-based assays (as were most genomic studies) and frankly, the 4bp resolution of the arrays was pretty amazing. But the introduction of next-generation sequencing opened up the possibility of charting nucleosomes in worms or wildebeest or almonds; there was nothing to stop you other than the short read lengths at the time. The read length issue has since disappeared, as the “short-read” platforms can easily cover the length of a nucleosome-protected DNA fragment of ~150 bases.

So that brings me to the paper I’d like to highlight today, which asks whether (and how) chromatin is organized in the archaea, and further, whether there is any correlation between archaeal chromatin architecture and gene expression.

My extreme background
Just like the universal fascination of kids with dinosaurs, I was captivated by the discovery of life in extreme environments like boiling water or in acid that could melt flesh on contact. Teaching intro bio, I would try to provoke the students by claiming that discovering extraterrestrial life will be a letdown compared to what we can find on earth. So while my students were occupied with classifying yeast nucleosome and transcriptome profiles in different mutants and drug conditions, I had the rare opportunity to indulge my curiosity. Jonathan E’s talks on the dearth of information on microbes, combined with my re-discovery of the early papers from Reeve and Sandman (see review), had me hooked. Reading the literature was like discovering the existence of a parallel chromatin universe. Archaeal histone complexes were tetramers (as opposed to the octamers of eukaryotic nucleosome core particles) but most everything else was similar: they wrapped DNA (60-80 bases compared to 147 for yeast) and although archaeal histones did not share primary sequence similarity with eukaryotic histones, at the structural level they resembled histones H3 and H4 in eukaryotes.

Working from ignorance
Choosing the particular archaeon to study was dictated by one criterion, the ability to grow it in the lab easily without resorting to anaerobic conditions or similar calisthenics. Again, I was fortunate in that the halophilic archaeon Haloferax volcanii fit the bill, but more importantly, there was a wealth of literature on this critter, including a well-annotated genome (thanks again Jonathan!) and an impressive armamentarium of genomic tools. Indeed the work of Allers, Mevarech, Lloyd and others has established Hfx. volcanii as a bona fide model organism with excellent transformation, gene deletion, gene tagging and gene expression tools.

Home for Haloferax volcanii

This photograph shows salt pillars that form in the Dead Sea, which borders Jordan to the east and Israel and the West Bank to the west. The salt concentration in the water can exceed 5M!

So cool, now all we had to do was prepare nucleosomal DNA and RNA from Haloferax, sequence the samples, build a map and see where it led us. With everyone in the lab otherwise occupied, I tried to grow these critters. At first I was convinced I’d been out of the lab too long as nothing grew. Actually I just needed to be a little patient. Then the first cell pellets were so snotty that I aspirated them into oblivion. Finally, I had plenty of pellets and my talented yeast nucleosome group adapted their protocols such that we got nice nucleosome ladders.

This was a pleasant surprise and one we did not take for granted given the high GC content of the genome (65%). We then turned to isolating RNA. Without polyA tails for enrichment, our first attempts at RNA-seq were 95% ribosomal. Combining partially successful double-stranded nuclease (DSN) treatment with massive sequencing depth, we were able to get fairly high coverage of the transcriptome. Here’s where Ron Ammar, a graduate student supervised by me, Guri Giaever and Gary Bader, stepped in and turned my laboratory adventures into a wonderful story. Ron mapped the reads from our nucleosome samples to the reference genome and found what to my eyes looked like a yeast nucleosome map, only at half scale.

Here were well-ordered arrays in the gene bodies and nucleosome-depleted regions at the ends of genes. The Haloferax genome is a model of streamlining and as a consequence, intergenic regions are tiny and hard to define. With little published data to guide the definition of archaeal promoters and terminators, the transcriptome map saved us. Ron focused on the primary chromosome in Haloferax and hand-curated each transcription start and stop site based on the RNA-seq data. This is when we realized we had something interesting. Here were nucleosome-depleted promoters and nucleosome-depleted terminators, and when we constructed an average-o-gram of all the nucleosome signatures for each promoter on the main chromosome, it looked like this….
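For the curious, an average-o-gram of this kind can be sketched in a few lines: cut a fixed window of the occupancy track around every transcription start site, flip the windows for genes on the minus strand so that "upstream" always points the same way, and average position by position. This is a simplified illustration and not Ron's actual code; the function name `average_profile`, the `(position, strand)` input format and the `flank` parameter are assumptions, and the sketch treats the chromosome as linear (windows running off either end are skipped) even though the Haloferax main chromosome is circular.

```python
def average_profile(occupancy, start_sites, flank=500):
    """Average an occupancy track over windows centred on start sites.

    occupancy   -- list of per-base occupancy values for one chromosome
    start_sites -- list of (position, strand) tuples, strand is "+" or "-"
    flank       -- number of bases kept on each side of the start site

    Hypothetical helper for illustration only.
    """
    width = 2 * flank + 1
    sums = [0.0] * width
    used = 0

    for position, strand in start_sites:
        start, end = position - flank, position + flank + 1
        if start < 0 or end > len(occupancy):
            continue  # window falls off the end of the (linearised) track

        window = occupancy[start:end]
        if strand == "-":
            window = window[::-1]  # orient so upstream is always on the left

        for offset, value in enumerate(window):
            sums[offset] += value
        used += 1

    if used == 0:
        raise ValueError("no start site had a complete window")
    return [total / used for total in sums]


# Toy usage: index `flank` of the result corresponds to the start site itself.
toy_track = [3, 3, 3, 0, 0, 4, 4, 4, 0, 0, 3, 3, 3]
toy_sites = [(4, "+"), (8, "-")]
print(average_profile(toy_track, toy_sites, flank=3))
```

A dip in the averaged profile just upstream of the centre would correspond to a nucleosome-depleted promoter, flanked by the peaks of the -1 and +1 nucleosomes.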

The take home

The data strongly suggested that archaeal chromatin is organized in a manner very similar to eukaryotes. And further, the correlation between gene expression and nucleosome positioning, particularly with respect to the +1 and -1 nucleosomes, was conserved. This conservation invites some interesting speculation. According to Koonin and colleagues, the common ancestor of eukaryotes and archaea predates the evolutionary split that gave rise to the euryarchaeal and crenarchaeal lineages. Both of these branches have bona fide nucleosomes, therefore it would seem parsimonious to assume that the ancestor of these two branches also organized its genome into chromatin with a nucleosomal scaffold. The similarities between the chromatin in archaea and eukaryotes, and the correlation between nucleosome occupancy and gene expression in archaea, raise the interesting evolutionary possibility that the initial function of nucleosomes and chromatin formation might have been to regulate gene expression rather than to package DNA. This is consistent with two decades of research that has shown that there is an extraordinarily complex relationship between the structure of chromatin and the process of gene expression. It also jibes with in vitro observations that yeast H3/H4 tetramers can support robust transcription, while H2A/H2B tetramers cannot.

It is possible, therefore, that as the first eukaryotes evolved, nucleosomes and chromatin started to further compact their DNA into nuclei, which among other things helped to prevent DNA damage, and that this subsequently enabled early eukaryotes to flourish. This observation is so exciting to me because it brings up so many questions that we can actually address, such as: if there are nucleosomes composed of histones, where are the histone chaperones? And further, despite the conventional wisdom that archaeal nucleosomes are not post-translationally modified, this remains to be confirmed (or denied) experimentally. If conventional wisdom is correct and archaeal histones are not post-translationally modified, then when did this innovation arise? There are more than enough questions to keep the lab buzzing!

Publishing the paper
Because I truly believed that this result “would be of general interest to a broad readership” we prepared a report for Science, which was returned to us within 48 hours. The turnaround from Nature was even faster. I had received emails from eLIFE several months previously, and after reading the promotional materials and the surrounding press, we took our chances at eLIFE and hoped for the best. The best is exactly what we got. Within a few days the editors emailed that the manuscript was out for peer review and four weeks later we received the reviews. They were unique. They outlined required, non-negotiable revisions (including a complete resequencing of the genome after MNase digestion but without prior cross-linking) but contained no gray areas and required no mind-reading. With all hands on deck, we resubmitted the manuscript in four weeks and were overjoyed with its acceptance. Of course with N=1, combined with a positive outcome, it’s hard to be anything but extremely positive about this new journal. But I think the optimism is defensible: the reviews were transparent, and the criticisms made it a better paper. The editorial staff was supportive and gave us the opportunity to take the first stab at drafting the digest which accompanies the manuscript.

NOTE ADDED BY JONATHAN EISEN.  A preprint of the paper is available here.  Thanks to the eLife staff for helping us out with this and encouraging posting prior to formally going live on the eLife site.

What’s next and what’s in the freezer
This work represents the Haloferax reference condition, with asynchronously growing cells in rich, high-salt media. We recently collected samples of log phase cultures exposed to several environmental stresses and samples from lag, log and stationary phases of growth to chart archaeal nucleosome dynamics. We are also refining a home-made ribosomal depletion protocol to make constructing complementary transcriptome maps considerably cheaper. Finally, it is exciting to contemplate a consortium effort to create a systematic, barcoded set of Haloferax deletion (or disruption) mutants for systematic functional studies.

Mille grazie to Jonathan E. for inspiring me to look at understudied microbes and for encouraging me to walk the walk with respect to publishing in open access forums. And for letting me share my thoughts as a guest on his blog.

The tree of life from Haloferax’s perspective. Artwork by Trine Giaever.