Why I am ashamed to have a paper in Science

So I just had a paper published in Science last week. In many ways, it has all the makings of one of those papers I should be really proud of. First, it represents a collaboration with my undergraduate advisor, Colleen Cavanaugh, the person who inspired me to go to graduate school and who got me interested in microorganisms, which I have worked on ever since (I published my first scientific paper on work I did in her lab). The paper is on one of the coolest biological systems on the planet – bacterial symbionts of deep sea animals that allow these animals to function much like plants (they use chemosynthesis in much the same way plants use photosynthesis). Studies of the deep sea and of chemosynthesis are important for understanding the origin and evolution of life, for understanding global carbon cycles, for understanding the rules by which symbioses evolve and much more. And on top of all of this, the paper reports the sequencing and analysis of the complete genome of one of these symbionts (that from the clam Calyptogena magnifica) – and one of my main areas of research is on the evolution of the genomes of symbionts. And, the genome was sequenced at the Joint Genome Institute, where I now have an adjunct position and with which I now work extensively. All sounds good, right? And, I should be happy to get a paper in Science too, right?

Actually, in reality, I am not pleased with how this paper has turned out. This is really due to two things. First, my collaborators failed to keep me in the loop that the paper was accepted in Science. Thus I did not find out about the paper until I did a google search for some other reason and noticed this Deep-Sea News Blog which had a story, well, about the paper in Science. It would of course have been nice to know the paper was accepted and coming out. It would have been even better to have seen the page proofs, which might have given me the chance to catch some little and not so little mistakes (e.g., the paper claims that this species has the largest genome of any intracellular symbiont sequenced to date – which is unfortunately not true). Now, admittedly I was out sick for a while and maybe my collaborators just did not want to bother me with this information. More likely, people were just very busy and this just slipped through the cracks.

But you know – it is a Science paper. I should be happy however it came into being, right? Well, no. Completely and thoroughly wrong. You see, I do not support publishing things in Science. I object because Science is not an Open Access journal. I tried and tried to get Irene Newton, the first author, to submit this to another journal. But in the end, she did the brunt of the work, and thus she and her advisor, Colleen, got to pick the place. And in the time since Irene submitted the paper, I have become even more militant against publishing in such non Open Access journals. Publishing in a non Open Access journal like Science makes me feel icky in every way. In addition, by choosing to publish the paper there rather than elsewhere, the field of deep sea symbionts may have been hurt rather than helped.

How could a Science paper hurt the field? Well, for one, Science with its page length obsession forced Irene to turn her enormous body of work on this genome into a single page paper with most of the detail cut out. I do not think a one page paper does justice to the interesting biology or to her work. A four page paper could have educated people about the ecosystems in the deep sea, about intracellular symbionts in general, and about this symbiosis in particular. The deep sea is wildly interesting, and also at some risk from human activities. This paper could have been used to do more than just promote someone’s resume (which really is the only reason to publish a one page paper in Science).

But of course, even more importantly, anyone without a subscription to Science, well, they can’t even read the paper. And AAAS gets to decide what happens to the text and figures in the future. So – count this as one of my papers I am not really proud of. I love that I helped my undergraduate advisor and one of my favorite people in the world do this work. But by it not being in an Open Access journal, I have unfortunately contributed to a system that I think is bad for the world. And I just feel icky.

Some news stories and blogs are coming out on the paper:

Below I have embedded a video of a dissection of what I think was a deep sea Calyptogena, just for the fun of it.

This was taken during a deep sea cruise I managed to get on. For more detail on this cruise, see the NOAA Ocean Explorers site here.

Genomics Gets Nasty

Just saw an entertaining press release about the publication of the genome of the parasite Trichomonas vaginalis. I find this entertaining because it does a remarkable job of capturing the personality of Jane Carlton, the PI on the project, who I used to work with at TIGR.

I particularly like the end

Viewed under the microscope Trichomonas vaginalis moves quickly; it has four undulating flagella and a tail. “It is a gassy organism,” says Dr. Carlton. It has special power-generating structures called hydrogenosomes. They produce hydrogen. “So it is releasing hydrogen into the liquid media, making it frothy,” she says. “That is why the vaginal discharge is frothy.”

The pathogen grows easily in the lab in test tubes containing some liquid media. And it has, as she says, “a real yuck factor to it.” A good way to know the microbe is growing well is to smell the contents of the test tube. “It smells foul, it has a fishy odor; really nasty,” says Dr. Carlton. “My technician used to get grossed out by that.”

While it is true that Jane has no fear about saying things that make some people uncomfortable, it is entertaining to see it in the NYU press release.

The press release is worth reading for another reason – the history of this genome project is different from that of many other parasites. In this case, the genome was enormously bigger than had been predicted (usually they are smaller than predicted, in part because if you overpredict the genome size, you will have some extra money in your grant to cover other issues). The press release gives a good impression of how much of a pain it is to run a genome project sometimes.

Anyway – back from a little layoff and just wanted to say – good job Jane.

Difficult times in predicting flu evolution suggested by recent paper


There is a potentially controversial and very interesting article in the journal PLoS Pathogens on Flu Evolution. The study was led by Edward Holmes at Penn State, and co-authored by many researchers including colleagues of mine at my former institution TIGR. They performed a detailed evolutionary analysis of the complete genomes of 413 influenza A viruses of the H3N2 type (the H#N# system refers to the subtypes of the Hemagglutinin and Neuraminidase genes).

The virus genomes were sequenced at TIGR using a high throughput flu viral genome sequencing protocol originally developed and described by Elodie Ghedin and colleagues here and here. The viruses they selected were from across New York State as part of a surveillance program.

Using a variety of evolutionary analyses including phylogenetic reconstructions and examination of substitution patterns, they come to a surprising conclusion – that

stochastic processes are more important in influenza virus evolution than previously thought, generating substantial genetic diversity in the short term

This may seem somewhat uninteresting to many out there but if true it is critically important in fighting flu and in understanding viral pathogen evolution. Right now there are substantial efforts to try and predict what future dominant flu strains will look like. These predictions tend to rely on assumptions that positive selection of viruses is critical in generating and maintaining diversity. If stochastic processes are as important as Holmes et al conclude, it would mean that more intensive monitoring of flu is needed in almost real time (since predicting random events tends to be, well, very hard).

I confess I have not tried to evaluate whether or not I think their conclusions are correct, but on first glance they seem sound. This just goes to show that general genomic surveys that try to be relatively unbiased in their sampling can reveal substantial novel patterns not seen before in highly targeted genome sequencing projects.

What is that NEXTgencode Advertising on My Site

So I saw this ad on my site for NextGenCode. It was very cryptic so I went to their site.

They make their site seem like they are a Biotech company promoting genetic engineering as a tool in life enhancement. But looking at their articles and other material it became clear this was a spoof of some sort. The best parts of their web site are the ads, like the one for Anhedonia and the one for Losing Blondes (about blondes going extinct in 200 years). I was guessing that this was some spoof put out by people to make fun of the synthetic biology field, but then I gave in and decided to google the company.

This is when I found the Wall Street Journal article that says they are a marketing ploy for a Michael Crichton novel. (I had not clicked on the link about a book stealing trade secrets from the company, but if you do, you get a story about Crichton’s novel stealing their ideas). While some may find the sleight of hand they are using here to be deceptive or malicious, I think it is pretty funny. It did not take long for me to figure out it was a spoof of some sort and it was kind of fun trying to figure it out.

Good to see that my site is being used for important advertising. Some day I will discuss the ads on my site for Intelligent Design proponents.

Here are some additional stories about Nextgencode

Of course, any blog about Crichton would be incomplete without mentioning his scientific “credibility” or the debate about it. He has clearly written some interesting science-related books over the years and in many of them the science is not completely absurd. But his anti-global warming book, State of Fear, made Crichton seem like an anti-science advocate. Personally, I never read the book so I cannot comment about the issue directly. But here are some stories about the book for people to look at.

Sea Urchin Genome and the Ridiculous Evolutionary Claims of Genome Researchers

All I can say is “AAAAARGH”

A sea urchin genome has been sequenced and there are some really interesting findings that have been reported based on analysis of the genome. For example, there appears to have been a large expansion of genes involved in the innate immune system in the species sequenced, Strongylocentrotus purpuratus.

All the good science aside, what most struck me were some of the ridiculous quotes attributed to some of the researchers in this project in stories and press releases. For example, in an article on MSNBC, George Weinstock says

“The sea urchin is surprisingly similar to humans,”
“Sea urchins don’t look any more like humans than fruit flies, but about 70 percent of sea urchin genes have a human counterpart whereas only about 40 percent of fruit fly genes do.”

Apparently, George was glossing over the reason this organism was chosen for sequencing in the first place. If you go to the NHGRI’s web site you can get the white paper that led to the selection of this species for sequencing. Perhaps the most important reason is that evolutionarily the sea urchin is closer to humans than fruit flies are. Therefore, the similarity should only be a surprise to someone who does not know the evolutionary position of this species.

Perhaps even more appalling is the discussion of the apparent large number of genes for light sensory systems in this species. Again, Weinstock is quoted:

“There is not a lot of light at the bottom of the ocean, so it is not clear what they might be ‘seeing,'” Weinstock said. “This is certainly an area that will be studied intensively as a result of the genome project.”

I can only view this as some sort of joke. First of all, blind cavefish still have the genes for light perception even though they do not see. This is because it takes time for such genes to disappear. Second, apparently George has never really thought about where this species of sea urchin is found. It is found in the intertidal zone — hardly the dark depths of the ocean. I could go on and on but I will just get more annoyed. In this case, Weinstock has proven that many Genome Scientists are almost completely clueless about the organisms they are working on. Which is a shame. Because sea urchins are fascinating creatures and the fact that they are more closely related to humans than are most other invertebrates is one of the main reasons they have been a focus of so much research up until now. Oh well …

Paramecium whining

I just got an announcement from Linda Sperling, announcing the publication of a paper on the Paramecium genome

Dear ciliate researcher,

We are pleased to announce that the Paramecium genome article is now available as an advanced online publication at the following address:

http://www.nature.com/nature/journal/vaop/ncurrent/abs/nature05230.html

We thank all of you for your interest and support.

Jean Cohen and Linda Sperling

Linda Sperling
Centre de Génétique Moléculaire
CNRS
Avenue de la Terrasse
91198 Gif-sur-Yvette CEDEX
FRANCE

sperling@cgm.cnrs-gif.fr
+33 (0)1 69 82 32 09 (telephone)
+33 (0)1 69 82 31 50 (fax)
http://paramecium.cgm.cnrs-gif.fr/

She sent this to an email list for ciliate researchers. I am writing about this in my blog because a blog is where you are supposed to write things these days when you are pissed off. Why am I pissed off about this? Well, the Paramecium paper makes no mention whatsoever of our paper on the genome of a close relative of Paramecium (Tetrahymena thermophila for those interested) which was published in August. And they do not even explicitly mention the Tetrahymena genome project (even though they say they took our data and used it). I guess I am not too surprised since their paper is published in Nature, which recently seems to be taking many liberties with referencing things in Open Access journals (ours was in PLoS Biology).

What is most annoying about this whole thing is that Linda Sperling is on the Scientific Advisory Board of our project, has been privy to all of our work from the inside, and was, I am sure, fully aware of our paper being accepted long before theirs was. Common courtesy in science would have been for them to have made a reference to our paper in press or at least our project. But for whatever reason, they carefully crafted their words to make no mention of our work. Interestingly, here is the email I sent to the same ciliate list on August 29, 2006:

For those interested, our paper on the Tetrahymena MAC genome has been published online at PLoS Biology

http://biology.plosjournals.org/perlserv/?request=get-document&doi=10.1371/journal.pbio.0040286

Jonathan Eisen

Strikingly, their paper was then accepted August 31, 2006. I hate to believe in conspiracies, but it seems just a little too coincidental that theirs was accepted just after ours was published. And yet still no mention of our work in their paper. Hmmm …

Fortunately, since our paper was in PLoS Biology, they cannot say “sorry – we did not have access to it.” Whatever they say, I can say clearly that Linda Sperling will not be invited to our next SAB meeting.

The latest genomics buzz

The latest buzz in genomics is about the honeybee genome. The people working on this genome have really done a good job of organizing themselves (a sort of model social genomics network in a way). They have a veritable slew of papers coming out this week on various things about the genome and about honeybees that were learned by making use of the genome.

There is an entire issue of Genome Research dedicated to studies of the honeybee (see the press release here) including papers on rates of evolution, circadian rhythms, chemical sensing, sex and death (of course), and even the royal jelly. If you don’t know what royal jelly is, do a google search for that. There is also an overview article in Nature and a genome report in Science. In total 170 researchers were involved in these papers.

Mind you, I am disappointed that these were not published in Open Access journals. And this is particularly sad given that the funding came from the NHGRI, the same group of sanctimonious individuals who kept talking about how the “public” human genome project was “open” in every way for the betterment of humanity. Unfortunately, what they mean by “open” even for the human genome project is a bit of a misnomer. They meant that people could look at the data immediately. But they restricted how people could use the data, despite their attempts to pretend otherwise. Consistent with this, the groups funded by the NHGRI generally do not publish their papers in Open Access journals. Shame shame shame.

OK, enough sniping. The honeybee is so fascinating biologically in so many ways that this genome sequence deserves a bit of extra attention. First, honeybees are social creatures. They have in fact been one of the key models in studying both the evolution of social behavior and communication among organisms.

Another aspect of their biology that is very interesting is their genetic structure. Like other hymenoptera they have what is known as a haplodiploid life cycle with males being haploid (the result of unfertilized eggs) and the females being diploid. This unusual genetics is another reason that honeybees and other hymenoptera have been studied extensively by biologists for many years. In fact, a great little bit of history about this is in a book on the history of studies of altruism from Princeton University Press. One of Darwin’s biggest concerns in the Origin of Species related to self-sacrificial behavior, especially that in honeybee colonies. Apparently, honeybees were a topic of conversation among non scientists and the non reproductive worker castes were well known to the public. Darwin struggled quite a bit to come up with a good explanation that was consistent with natural selection for why some individuals would sacrifice their lives for others.

Darwin actually came up with a good logical explanation for this – that some individuals would sacrifice if they were related to others who would benefit. Bees and their relatives played a large part in studies that have revealed in much greater detail how altruism can evolve. They may not be as warm and fuzzy as some other organisms being sequenced, but they certainly were a good pick for a genome sequencing project.
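For those who like to see the arithmetic, here is a back-of-the-envelope sketch (my own illustration, not something taken from the book or the genome papers) of why haplodiploidy gets so much attention in discussions of altruism. It assumes a queen that mated with a single male and no inbreeding; real honeybee queens mate with multiple males, so the average relatedness among workers in a hive is lower than the full-sister figure. The function name and constants are made up for the example.

```python
from fractions import Fraction


def sister_relatedness(paternal_share, maternal_share):
    """Expected relatedness between two full sisters.

    Each female gets half of her genome from her father and half from her
    mother. `paternal_share` is the chance that two sisters carry the same
    paternal allele at a locus; `maternal_share` is the same for the
    maternal allele. (Hypothetical helper, for illustration only.)
    """
    half = Fraction(1, 2)
    return half * paternal_share + half * maternal_share


# Ordinary diploid species: each parent passes on one of two alleles at
# random, so sisters match with probability 1/2 on both sides.
DIPLOID_SISTERS = sister_relatedness(Fraction(1, 2), Fraction(1, 2))

# Haplodiploid species: the haploid father passes his entire genome to
# every daughter, so the paternal halves always match.
HAPLODIPLOID_SISTERS = sister_relatedness(Fraction(1), Fraction(1, 2))

# A diploid mother passes half of her genome to each daughter.
MOTHER_DAUGHTER = Fraction(1, 2)

if __name__ == "__main__":
    print("diploid full sisters:     ", DIPLOID_SISTERS)
    print("haplodiploid full sisters:", HAPLODIPLOID_SISTERS)
    print("mother to daughter:       ", MOTHER_DAUGHTER)
```

Under these simplified assumptions a worker is more closely related to a full sister than she would be to her own daughter, which is one reason bees and their relatives keep turning up in studies of how altruism can evolve.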

Genomics Education Bus

I just got back from the new version of the old GSAC meeting. It is now called GME or Genomes, Medicine and the Environment (or, as we like to call it – stuff Craig Venter is interested in these days). The meeting is organized by the Venter Institute and this year was one of the better versions of this meeting. There were some really interesting talks in a few topic areas (I will try and post some details about these later). But to me, the most interesting part was seeing the Venter genomics education bus (part of their Genomics Discovery program) on tour. They use this bus to go around to high schools and other places to do some genomics education.

Just before coming to the meeting, the bus apparently rolled into New Orleans (see Wired news story here). Lots of people like to complain about Venter and his style, but whatever you may think of him, I think this bus is a great idea. We desperately need more people who do science making an effort to interact with and educate people about scientific research. And since this bus is outfitted with lab equipment and various genome-related toys, it can go into a neighborhood without the best science labs and help introduce students to the fun and excitement of modern science.

Note – the photo was taken by me at the GME meeting in Hilton Head, SC. In the photo are Lisa McDonald, Jennifer Colvin, and (I think) Darryl Bronson.

Metagenomics 2006

Just got back from the “First International Conference on Metagenomics” which was held in San Diego. Despite the fact that this was clearly NOT the first international conference on metagenomics, it was not bad.

For those who do not know, metagenomics is the term used when people do DNA sequencing directly from environmental samples without isolating organisms in the first place. This term was coined by Jo Handelsman et al. in an article in 1998, where they referred to all the DNA and its coding potential in soil as the soil “metagenome.”

The meeting was hosted by UCSD/CalIT2, which are trying to move into the metagenomics field in large part due to the large grant they have from the Moore foundation to build a metagenomics database with the Venter Institute. The database is called CAMERA and it is planning to have its first release shortly.

To be honest, even though I am involved in CAMERA, the UCSD/CAMERA folks would be better off not trying to make it seem like they are the only people organizing meetings in this area. Nevertheless, the meeting was pretty good.

There were talks by people focusing on different aspects of metagenomics, including data collection, databasing, and data analysis as well as some interesting biology. My favorite was one by Jeff Gordon, from Washington University in St. Louis. He is doing some of the most spectacular stuff in studies of the human microbiome and he discussed a few of the studies from his group. Most importantly, he emphasized the use of germ free animals as a model system. Basically, they raise animals in completely sterile conditions and have produced mice and fish and other species that have no microbes associated with them. This allows them to do experimental manipulations to ask controlled questions about host microbe interactions. My other favorite talk was by Ford Doolittle. Even though I disagreed with some of the things he said, he always challenges the audience to rethink their assumptions. In this case, he talked about the species concept in microbes and why he thinks it does not have much use.

Overall, I got the feeling that people were being a little too worried about the difficulties in metagenomics. Yes, analyzing sequence data from environmental samples is complicated. Yes, all the bioinformatics is harder because you are dealing with a mixed sample of DNA fragments and you do not know which fragment comes from which organism in the sample. And yes, the databasing and data analysis can be very complicated because the amount of raw data and metadata can be huge. But in the end, metagenomics has the potential to be an incredibly powerful tool in studies of microorganisms in nature. And the fact that it is somewhat harder than standard genome sequencing does not mean that we are not already learning a lot from it. What we need to keep in mind is that it is simply a tool – and to try and turn it into a field (which is what it seemed like some of the players would like) is a mistake.

If you are interested in the meeting itself, the talks and discussion sessions are available here.

Genomics Education highlighted at 14th Annual International Meeting on Microbial Genomics

Just got back from the 14th Annual International Meeting on Microbial Genomics, where I gave a talk on microbial symbiont genomics. This was one of the best meetings I have been to in a while. It had the right combination of everything including:

  1. Many excellent talks and posters (OK, in the interest of not upsetting people for not saying their talk or poster was great, I will not make a big list of all the ones I thought were good, but I will give a few highlights below).
  2. Excellent location (UCLA’s Lake Arrowhead Conference Center, which is in the mountains east of Los Angeles). This is a place that is very conducive to getting to know colleagues and it almost forces interaction among people. There is one central building where there is a dining hall, a nice deck if you want to eat outside, the conference room, rooms for posters, and a large living room for hanging out. The rooms for sleeping are mostly great (e.g., mine was a split level condo like structure with a living room and a bedroom/bath on floor one and a bedroom/bath on floor 2). And being in the mountains is very pleasant. Plus there is a pool, jacuzzi, and sports facilities that are very nice. The only annoying thing is the lake itself, which is 100 yards away but is really almost private, with most of the shoreline occupied by houses and private docks.
  3. Good food. The food is not spectacular or anything but better than the food at 90% of the conferences I have been at.

In terms of talks, there were quite a few that were both on interesting topics and very well presented. For example, Jessica Green from U. C. Merced gave a great talk about spatial distributions of microorganisms, Julian Parkhill from the Sanger Center put together a really nice story about mechanisms by which microbial pathogens generate phenotypic diversity, and Julie Huber from MBL impressed many with her talk about the “Deep Rare Biosphere.”

But to me, the best two talks were ones on science education reform by two people from UCLA. Erin Sanders-Lorenz presented a summary of a course she has been teaching at UCLA that has students doing “phylogenomic” analysis which takes them from isolating and culturing organisms from environmental samples to building evolutionary trees of genes isolated from these cultured species. This seemed like a very creative, hands-on, novel way to teach students the excitement of science and some things about evolution. It sounded so well thought out that I asked for (and got) a copy of her lab manual.

Much as I liked this class, the one described by Cheryl Kerfeld knocked my socks off. She described a program they have developed at UCLA called the Undergraduate Genomics Research Initiative. This is an interdepartmental multi-course collaboration with the central theme involving the sequencing and analysis of the genome of a bacterium called Ammonifex degensii. The various courses are organized around a central course on genome sequencing. The linked courses include ones in many different departments at UCLA as well as various courses at other universities. They have clearly given enormous thought to how to do a truly project based course which likely will catch students’ attention and interest much more than standard lectures or standard labs.

There have been other successful hands on genome sequencing courses before – perhaps the first being one by Brad Goodner at Hiram College who had students participate in the sequencing and analysis of the genome of Agrobacterium tumefaciens (e.g., see a press release here). The Kerfeld UCLA UGRI program sounds like it has gone to the next level by integrating many courses across departments and by having creative ways to encourage participation of students in multiple aspects of the project. It really is worth giving a look at the UCLA UGRI program’s web site.

Other tidbits about the meeting:

  • Jeffrey H. Miller from UCLA organized it
  • This is the same Jeffrey Miller who identified most of the mutator genes in E. coli with a really creative genetic screen
  • There was another Jeffrey Miller from UCLA at the meeting (will leave this up to google for people to figure out who this other Miller is).