Much ado about plants and blogs in PLoS Biology

Some good new articles in PLoS Biology in the last few weeks worth checking out.  There is definitely a theme there if you want to look for it.  So here are some of the papers connected to that theme and even one that covers both.

Fwd: MG-RAST User Newsletter No.2 — MG-RAST paper available // NEW version released // 2nd User Workshop planned

Out sick so email updates will have to do
Got this by email
Subject: MG-RAST User Newsletter No.2 — MG-RAST paper available // NEW version released // 2nd User Workshop planned



Dear MG-RAST users,

The MG-RAST team has some news we would like to share:

1) MG-RAST manuscript published
*******************************

The manuscript for MG-RAST has been accepted in BMC Bioinformatics and is available under the following URL:

http://www.biomedcentral.com/1471-2105/9/386/abstract

Please cite:
——–
The metagenomics RAST server F. Meyer, et al
BMC Bioinformatics 2008, 9:386


2) New version of the server software available
***********************************************

After a lot of testing (Thanks again to all beta testers), we have released a new version of our MG-RAST server software. You will notice improvements in nearly all aspects of the MG-RAST platform and user experience.

Here is a summary of changes and additions:
– added overview page with statistical summary
– ability to download arbitrary subsets of fragments as fasta
– user can change parameters for metabolic reconstruction and phylogenetic reconstruction on the fly
– ditto comparison/heatmaps
– recruitment plot
– metabolic comparison tool using KEGG pathway maps
– updated databases in background (NR no longer from 2006, Silva RNA database included)
– detailed sequence information and alignments
– support for groups and inviting friends to look at data
– much faster user interface
– ability to see/download fragments and see blast alignments
– added Silva rRNA database (from: http://www.arb-silva.de/)
– many small detailed fixes


3) MG-RAST workshop planned for winter 2008/2009
************************************************
In addition to the tutorial at Metagenomics 2008 in San Diego (see: http://metagenomics.calit2.net/) we are inviting participant registration for our second MG-RAST workshop. Please send an email to mg-rast@mcs.anl.gov if you are interested in participating.


— from the MG-RAST team

Amy Harmon, New York Times, on Open Access publishing

Amy Harmon, who writes for the New York Times and has written some excellent recent pieces on evolution and genomics, is answering some questions on the New York Times website.

And one of them was about communicating science and Amy responded (with other comments):

Of course, the one way scientists do, theoretically, communicate with the public is by publishing their results. Since these papers are written for other scientists, they can be hard to understand. But even for people game to wade through them, they are often hard to obtain. The two leading scientific journals, Science and Nature, and many others, require people to pay for access to papers whose authors have been financed by taxpayers. “Open access” publishers like the Public Library of Science do not, so it would be nice to see scientists choosing — or being required — to publish in journals that are open to the public.

Nothing more for me to add.

Congratulations to Stanley Falkow, Microbiologist Extraordinaire and Lasker Winner

Short one here as I am at a conference. Very happy to see Stan Falkow get a Lasker Award.

“The breadth and depth of Falkow’s career is being recognized with the 2008 Lasker-Koshland Award for Special Achievement in Medical Science.”

He is a great scientist but more importantly to me, he has an infectious (no pun intended) enthusiasm for science, microbiology, and life. For more detail, there is a really nice article in the Stanford Report about him and his career. Every time I interact with him (e.g., when I was a student at Stanford) I feel like I get a microbiology passion boost that lasts for years. And clearly he has had this effect on many many others. Congratulations to him for a well-deserved award.

Lake Arrowhead notes – UPDATED

Well I gave my talk

Seemed to go OK except for getting cut off early because the chair ignored
that the session started late, but that is OK

George Weinstock is now speaking using my laptop so I am trying to
post from my phone

He said one key thing I left out … Big scale microbial sequencing
projects are now possible thanks to next gen sequencing in particular
454-Roche tools

More later
Sent from my iPhone

————————————————-

More now

George Weinstock gave a good overview of the “Human Microbiome Project” which is an NIH Roadmap initiative to catalogue the genomic content of the microbes associated with humans. He described some of the big picture of why to do the project and the different funding initiatives being run through NIH, and he gave some detail on the “jumpstart” project going on at the big genome centers right now. He outlined how the current plan is to select a few hundred people and to survey their microbiomes from multiple sites using rRNA PCR and possibly metagenomics. In addition, he described how there is also an effort to sequence hundreds, if not a thousand, genomes of cultured organisms that have been isolated from human environments. He did say one thing I disagreed with, which is that he thinks it is somewhat reasonable to treat the environment that microbes live in as, in essence, a big bag of genes. In other words, if you sequence from a community, he implied that one can focus just on the genes and their functions and not the organisms that they come from. On this I disagree (and pointed this out after the next talk). But overall George gave a nice overview of the project and its goals.

Eric Wommack gave a good talk about viral metagenomics work he has been doing. He pointed out that a lot of the viral world is “unknown” but that does not mean it is unimportant. And this is consistent with what I and George Weinstock said which is that we need more genome data from viral isolates. Eric presented some very useful results on the challenges of using short read sequence data in metagenomics and he referenced a few papers on this. He also referred to a cool viral genome survey project that I was not aware of by Hatfull which involved undergraduates in sequencing and analyzing the genomes of phage that infect Mycobacterium smegmatis.

Jim Bristow on Biofuels. He is now giving a summary of some of the JGI work on the genomics of cellulolytic organisms and processes. He is focusing on the termite gut community and had some good one liners about this (e.g., he said many people want to kill termites but not JGI. They are our friends; he also said “it takes a village to sequence a termite gut”).

Not sure exactly how to say this, but here goes. There was one talk in the AM I was not overly fond of. This was a talk by Bernard Palsson. Now I confess, I am not overly familiar with much of his work but what I know of it suggests he does some really solid, interesting and important work on metabolic network modeling and analysis. But his talk at this meeting was disappointing. His talk was about his use of genome sequencing to characterize “adaptive evolution” in E. coli. And the results he presented seemed solid enough. The problem I had was that it was a prime example of “overselling genomics”. Why? Here is what they did. They took E. coli mutants. And then they took them through cycles of growth and then dilution. And then they looked at the populations after a certain number of generations and did a variety of analyses. Included in this was some whole genome sequencing that helped identify mutations arising in the cultures. And then they did some characterization of these mutations/mutants including some competition experiments and some pretty interesting gene expression studies of some RNA polymerase mutants. And he made some conclusions based on their results, such as that E. coli in the lab can find new adaptive peaks, that mutations differ in different replicates, that different mutations confer different fitness, that they can monitor the appearance of mutations over time, and so on.

So what is the problem — the problem is that he (1) presented this as though the serial cycling of E. coli was novel when in fact it is not and that (2) he presented the conclusions as though they were novel when they also are not. People have been doing this type of experiment for many decades (in fact, one person, Rich Lenski, has been doing an experiment like this for decades). And they get these exact results. But they have not sequenced genomes as part of their experiment. And thus, at least for this talk, they were not mentioned, and the rediscovery of many truisms in population genetics was presented as novel because it involved genome sequencing.

Trent Northen created a serious buzz during and after his talk with his presentation of some of the things one can do with Nanostructure Initiated Mass Spectrometry (NIMS). I confess – I want his toys.

Lynn Silver is now talking about the challenges in the development of new antibiotics. She argues that the focus by some on trying to find new targets for antibiotics has been a bit misguided.

Julian Parkhill gave a good talk about population genomics of Salmonella. He pointed out a few things people still ignore. For example, if you want to identify polymorphisms in a species to use for population genetics/genomics studies, you really need to do a survey to identify polymorphisms from diverse members of the population. If you do not, and then you use a biased set of polymorphisms, your population inferences will be wrong. He also said, in response to a question of mine, that at least for this species, they see very little variation in copy number in genes, which is different from what people seem to see in humans.

Tiffany Williams from Baylor gave a talk about using high throughput sequencing in collaborations with developing countries. She outlined some of the challenges as well as the benefits from such collaborations.

Kim Lewis gave a very interesting talk on microbial biofilms and persister cells, of which I know vanishingly little. He showed some very cool experiments trying to “complement” unculturable organisms and get them to grow.

Jeffrey F. Miller gave a talk focusing on diversity generating retroelements in bacteria which appear to be a means by which bacteria can target particular regions of the genome for mutagenesis in a comparable way to VDJ mutagenesis in humans. This was perhaps my favorite talk so far at the meeting as it combined microbial genomics, evolvability, mutation processes and other things I tend to focus on.

Steven Benner gave a talk which I had to skip out on early because I was doing a radio interview. Benner said one thing that annoyed me at the beginning – he made a comment that was complaining about prior talks that referred to “Rosetta Stone” methods of predicting function (I was one of the people who mentioned this) because he thought that we were referring to blast searches. He clearly was not paying any attention as the Rosetta Stone method is a method to predict function for genes by finding connections between non homologous proteins based upon having other proteins that have domains found in both of the original proteins of interest. Oh well, glad I had to leave early because I was itching to jump up and correct him.
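For readers who have not met the Rosetta Stone idea, here is a minimal sketch of the logic described above. Everything in it is an assumption for illustration: the protein and domain names are made up, and "non homologous" is simplified to "shares no annotated domain". Real implementations work from sequence similarity searches, not a hand-made domain table.

```python
from itertools import combinations


def rosetta_stone_links(domains_by_protein):
    """Return (protein_a, protein_b, fusion) triples where a and b share no
    domain, but a third protein carries at least one domain from each."""
    links = []
    for a, b in combinations(sorted(domains_by_protein), 2):
        dom_a = domains_by_protein[a]
        dom_b = domains_by_protein[b]
        if dom_a & dom_b:
            continue  # a shared domain means the pair is treated as homologous
        for fusion, dom_f in sorted(domains_by_protein.items()):
            if fusion in (a, b):
                continue
            # The "Rosetta Stone" protein has domains found in both a and b.
            if dom_f & dom_a and dom_f & dom_b:
                links.append((a, b, fusion))
    return links


# Hypothetical toy annotation: protein name -> set of domain names.
toy_domains = {
    "protA": {"dom1"},
    "protB": {"dom2"},
    "fusionAB": {"dom1", "dom2"},
    "protC": {"dom3"},
}

print(rosetta_stone_links(toy_domains))
```

In this toy table, protA and protB are unrelated to each other, yet fusionAB contains a domain from each, so the method predicts a functional connection between protA and protB. That is a different thing from a plain blast search, which only finds homologs.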

Heather Allen, from Jo Handelsman’s lab, gave a very good talk about doing functional metagenomic screens for antibiotic resistance encoding genes. She has been using DNA from multiple soil sites, including a pristine site in Alaska, and screening the DNA for antibiotic resistance genes in E. coli. These screens identify a wide diversity of genes, including some novel forms. This work helps highlight the need to not just sequence the snot out of the world but to also do some functional assays at the same time. In addition she mentioned that she was able to come to the meeting because Jo Handelsman set up a fund for mothers to pay for babysitters to come to a meeting with them. All I can say is Jo Handelsman was already one of my favorite people in science and this is just another brilliant and wonderful thing that she does.

David Relman gave a talk about two studies of the human microbiome that his lab has been doing: (1) studies of marine mammals to compare the microbial diversity on their surfaces with the diversity in the water and the diversity on their insides and (2) studies of the response of the human gut microbial community to antibiotic treatment. I am particularly fond of the antibiotic treatment study because they are treating it as an “ecological disturbance” study and analyzing it much like ecologists would analyze recovery of a forest after fires. I think we definitely need more ecologists to bring their techniques and skills to human microbiome studies and so this was exciting to see.

Ashlee Earl gave a talk about biofilm formation in Bacillus subtilis. Much like Kim Lewis’s talk earlier, this talk was in an area I know little about and I guess you could say it kind of blew my mind. It seems that in B. subtilis, and I guess in many other microbes, biofilms are in essence analogous to multicellular organisms. Within a biofilm there are different types of cells that have different roles and the patterns are highly reproducible and organized. It seems to me that the boundary between multicellular and single-celled organisms is getting blurrier and blurrier. Ashlee reported on some cool experiments where she collected strains from around the world and then did comparative genetics and genomics of their biofilm formation patterns.

Alas I missed Mary Lidstrom’s talk which based upon prior experiences I am sure was fascinating. She has been working in studying processes inside single bacterial cells and has been developing a suite of techniques and tools to carry out such studies. Maybe someone else from the meeting can post details about her talk.

Unfortunately, I had to take a conference call during some of the next talks, so I do not have details for the blog. Then I returned and served as chair for a session. I did take some notes so here goes.

Byung-Kwan Cho gave a tour de force talk about reconstructing the transcriptional regulatory network in E. coli. He presented results from a dazzling and dizzying array of genome-scale methods (e.g., ChIP-chip, tiled arrays, sequencing, etc etc) to characterize transcription regulation. In addition he did some complex and big scale computational work to combine all of the data together to characterize networks. It was quite impressive stuff.

Ginger Armbrust talked about her favorite critters – diatoms and focused on how they used the genome data to characterize silicon deposition processes. She was convincing as to the importance of diatoms and to the value of having the genome sequences from some species. She did discuss some of the challenges of using the genome data including the challenges in gene prediction for microbial eukaryotes. She also discussed her dream of utilizing some of the new genomic information as part of real time sensors in the oceans.

Athanasios Typas discussed work to build tools for carrying out genome-scale analyses of genetic and chemical-genetic interactions. For example they are working on taking two comprehensive gene KO libraries from E. coli and using them to create all possible double mutants and to then screen those mutants for whether they have the same or different phenotypes than the single mutants. This allows them to look for gene-gene interactions. They also are doing this type of analysis with chemical-gene interactions.
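To make the double mutant idea concrete, here is a rough sketch of how one might score such a screen. The talk did not say how interactions are scored, so the multiplicative expectation used here (double mutant fitness expected to equal the product of the two single mutant fitnesses) is only a commonly used convention that I am assuming, and the gene names, fitness values, and threshold are all invented.

```python
from itertools import combinations


def double_mutant_interactions(single_fitness, double_fitness, threshold=0.1):
    """Flag gene pairs whose double mutant fitness departs from the product
    of the single mutant fitnesses (an assumed multiplicative null model)."""
    interactions = {}
    for a, b in combinations(sorted(single_fitness), 2):
        observed = double_fitness.get((a, b))
        if observed is None:
            continue  # this double mutant was not measured
        expected = single_fitness[a] * single_fitness[b]
        epsilon = observed - expected  # negative: worse than expected
        if abs(epsilon) > threshold:
            interactions[(a, b)] = epsilon
    return interactions


# Hypothetical relative fitness values (wild type = 1.0).
singles = {"geneA": 0.9, "geneB": 0.8, "geneC": 1.0}
doubles = {
    ("geneA", "geneB"): 0.72,  # about what the singles predict
    ("geneA", "geneC"): 0.50,  # much sicker than predicted
    ("geneB", "geneC"): 0.80,  # about what the singles predict
}

for pair, eps in double_mutant_interactions(singles, doubles).items():
    print(pair, round(eps, 2))
```

With these made-up numbers only the geneA/geneC pair is flagged, because its double mutant is far less fit than the two single mutants would predict. The same bookkeeping would apply to chemical-gene pairs by swapping one knockout for a drug treatment.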

Devaki Bhaya gave a brief talk on what I think is the single most interesting thing in all of microbiology right now – CRISPRs. These are clustered regularly interspaced short palindromic repeats. She is studying them in cyanobacteria from Yellowstone hot springs.

Good quotes from the meeting:

  • So we simply sequenced the genome of the different variants
  • Antibiotics do not kill things, they corrupt them
  • Dormancy is the default mode of most bacterial life
  • Who knows what a yoctomole is?
  • I am going to defend genomics
  • There comes a point in life when you have to bring chemists into the picture
  • Gosh, was that today or yesterday
  • The rectal swabs are here in tan color
  • I’ll try to let the pictures do the talking and I will get out of the way
  • Our model system du jour
  • And there’s Jeffrey Dahmer
  • And this is my cheesy analogy here
  • He could not be here so I am here. His loss. My gain. Hopefully not your loss.
  • We are the environment. We live the phenotype.
  • If I have time I will tell you about a dream
  • Every fifth breath – thank a diatom
  • While we still have poles
  • A paper came out next year

Darwin on the Wall

Lots of other bloggers posting about this but I got to put it out there too.  Check out the remarkable story of the Darwin Shaped Wall Stain and how it is galvanizing the evolution community – See Evolutionists Flock To Darwin-Shaped Wall Stain.  It is from the Onion.  One of my favorite “news” sources.  Hat tip to many many people for pointing this out.

Predicting the future (for molluscs)

As many of you know, I spend a decent amount of my blogging time trying to come up with funny evolution or genomics related posts. Well, if you like that type of thing, you really have to check out this new site:

The Molluskan Zodiac

The site states

“While most people are familiar with western astrology and with the Chinese zodiac, much less is known about the ‘molluskan zodiac’ (sometimes known as the mariners zodiac). But ask any fisherman, and they will tell you instantly which of the ten signs of the molluskan zodiac they were born under.”

It is very very funny. And real of course. Kudos to Keith Bradnam, who happens to be from the UC Davis Genome Center (where I work) for revealing the inner secrets of these wonderful invertebrates. And while you are checking out the Zodiac, check out Bradnam’s new PLoS One paper on intron length which he authored with Ian Korf. Science humor, invertebrates, and Open Access publishing. Now what could be better than that?

Tracing the evolutionary history of Sarah Palin: links to a parasitic nematode and the pathogenic fungus Botryotinia fuckeliana

You see, as a total sequence analysis dork, when I see names, I frequently ask whether the letters in the name include only letters which are used as amino acid abbreviations. I started this game when the brilliant notes/letters came out in Science in the early 90s about whether ELVIS was overrepresented in protein sequences. Of course, despite being 20 years old, Science still keeps these under wraps requiring registration to see them (see for example the Stevens letter).

Anyway, alas, three of the major candidates for the US election have names that do not use traditional amino acid abbreviations so I am stuck with analyzing Sarah Palin. But that is OK because of her professed aversion to evolution and support for Creationism (and since sequence analysis is inherently an evolutionary study).
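If you want to play the name game yourself, here is a small sketch of the check. It assumes that "traditional amino acid abbreviations" means the 20 standard one-letter codes and that the full name is run together with spaces dropped (as in "SARAHPALIN"). The function name is my own invention.

```python
# The 20 standard one-letter amino acid codes (no B, J, O, U, X or Z).
STANDARD_AMINO_ACIDS = frozenset("ACDEFGHIKLMNPQRSTVWY")


def as_peptide(name):
    """Return the name as an uppercase pretend peptide, or None if any
    letter is not one of the 20 standard amino acid codes."""
    letters = [ch.upper() for ch in name if ch.isalpha()]
    if letters and all(ch in STANDARD_AMINO_ACIDS for ch in letters):
        return "".join(letters)
    return None


for candidate in ["Sarah Palin", "John McCain", "Barack Obama", "Joe Biden"]:
    print(candidate, "->", as_peptide(candidate))
```

Under those assumptions only Sarah Palin comes back as a usable peptide. The other three names each contain a J, a B or an O.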

So – I took her name and went to the NCBI Blast page and did some searches. And what came up? Well, here are some of the top hits from the blastp searches (which I used to compare the pretend peptide “SARAHPALIN” with all the peptides in the non redundant collection at Genbank).

>ref|XP_001545292.1| Gene info hypothetical protein BC1G_16161 [Botryotinia fuckeliana B05.10]
gb|EDN25226.1| Gene info predicted protein [Botryotinia fuckeliana B05.10]
Length=383

GENE ID: 5425746 BC1G_16161 | hypothetical protein
[Botryotinia fuckeliana B05.10]

Score = 26.9 bits (56), Expect = 189
Identities = 8/9 (88%), Positives = 8/9 (88%), Gaps = 0/9 (0%)

Query 1 SARAHPALI 9
SARA PALI
Sbjct 209 SARAQPALI 217


>ref|YP_061725.1| Gene info homoserine dehydrogenase [Leifsonia xyli subsp. xyli str. CTCB07]
gb|AAT88620.1| Gene info homoserine dehydrogenase [Leifsonia xyli subsp. xyli str. CTCB07]
Length=451

GENE ID: 2939000 thrA | homoserine dehydrogenase
[Leifsonia xyli subsp. xyli str. CTCB07] (10 or fewer PubMed links)

Score = 26.9 bits (56), Expect = 189
Identities = 8/9 (88%), Positives = 8/9 (88%), Gaps = 0/9 (0%)

Query 1 SARAHPALI 9
SAR HPALI
Sbjct 267 SARVHPALI 275

>ref|ZP_02031476.1| hypothetical protein PARMER_01474 [Parabacteroides merdae ATCC
43184]
gb|EDN87136.1| hypothetical protein PARMER_01474 [Parabacteroides merdae ATCC
43184]
Length=299

Score = 26.1 bits (54), Expect = 340
Identities = 7/8 (87%), Positives = 8/8 (100%), Gaps = 0/8 (0%)

Query 3 RAHPALIN 10
RAHPAL+N

Sbjct 170 RAHPALVN 177

>ref|XP_567332.1| Gene info hypothetical protein CNJ01520 [Cryptococcus neoformans var. neoformans
JEC21]
ref|XP_773201.1| Gene info hypothetical protein CNBJ1950 [Cryptococcus neoformans var. neoformans
B-3501A]
gb|EAL18554.1| Gene info hypothetical protein CNBJ1950 [Cryptococcus neoformans var. neoformans
B-3501A]
gb|AAW45815.1| Gene info hypothetical protein CNJ01520 [Cryptococcus neoformans var. neoformans
JEC21]
Length=437

GENE ID: 3254188 CNJ01520 | hypothetical protein
[Cryptococcus neoformans var. neoformans JEC21] (10 or fewer PubMed links)

Score = 26.1 bits (54), Expect = 340
Identities = 8/9 (88%), Positives = 8/9 (88%), Gaps = 0/9 (0%)

Query 1 SARAHPALI 9
SAR HPALI
Sbjct 415 SARQHPALI 423


>ref|YP_001626035.1| Gene info citrate synthase [Renibacterium salmoninarum ATCC 33209]
gb|ABY24621.1| Gene info citrate synthase [Renibacterium salmoninarum ATCC 33209]
Length=386

GENE ID: 5822379 RSal33209_2898 | citrate synthase
[Renibacterium salmoninarum ATCC 33209]

Score = 25.7 bits (53), Expect = 456
Identities = 9/11 (81%), Positives = 9/11 (81%), Gaps = 2/11 (18%)

Query 1 SARAHP--ALI 9
SARAHP ALI
Sbjct 218 SARAHPYAALI 228


>ref|YP_001817256.1| Gene info integral membrane sensor hybrid histidine kinase [Opitutus terrae
PB90-1]
gb|ACB73656.1| Gene info integral membrane sensor hybrid histidine kinase [Opitutus terrae
PB90-1]
Length=936

GENE ID: 6208547 Oter_0366 | integral membrane sensor hybrid histidine kinase
[Opitutus terrae PB90-1]

Score = 25.2 bits (52), Expect = 611
Identities = 7/7 (100%), Positives = 7/7 (100%), Gaps = 0/7 (0%)

Query 3 RAHPALI 9
RAHPALI
Sbjct 256 RAHPALI 262


>ref|YP_001757871.1| Gene info putative anti-sigma regulatory factor, serine/threonine protein
kinase [Methylobacterium radiotolerans JCM 2831]
gb|ACB27188.1| Gene info putative anti-sigma regulatory factor, serine/threonine protein
kinase [Methylobacterium radiotolerans JCM 2831]
Length=331

GENE ID: 6141303 Mrad2831_5232 | putative anti-sigma regulatory factor,
serine/threonine protein kinase [Methylobacterium radiotolerans JCM 2831]

Score = 25.2 bits (52), Expect = 611
Identities = 7/8 (87%), Positives = 8/8 (100%), Gaps = 0/8 (0%)

Query 2 ARAHPALI 9
ARAHPAL+
Sbjct 299 ARAHPALV 306

>ref|ZP_01466013.1| hydrolase, TatD family [Stigmatella aurantiaca DW4/3-1]
gb|EAU63211.1| hydrolase, TatD family [Stigmatella aurantiaca DW4/3-1]
Length=209

Score = 25.2 bits (52), Expect = 611
Identities = 7/7 (100%), Positives = 7/7 (100%), Gaps = 0/7 (0%)

Query 3 RAHPALI 9
RAHPALI
Sbjct 79 RAHPALI 85


>ref|YP_001558323.1| Gene info glycosyl transferase group 1 [Clostridium phytofermentans ISDg]
gb|ABX41584.1| Gene info glycosyl transferase group 1 [Clostridium phytofermentans ISDg]
Length=357

GENE ID: 5743305 Cphy_1206 | glycosyl transferase group 1
[Clostridium phytofermentans ISDg]

Score = 25.2 bits (52), Expect = 611
Identities = 8/10 (80%), Positives = 8/10 (80%), Gaps = 0/10 (0%)

Query 1 SARAHPALIN 10
S RAHP LIN

Sbjct 113 SERAHPLLIN 122

There does not appear to be a perfect match in the NCBI NR protein database. But take a close look at the #1 scoring hit. That is right, it is from an organism called Botryotinia fuckeliana. No comment on the appropriateness of this name, but it does contain a term I will probably use a lot if she gets elected.
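For anyone wondering where the "Identities" numbers in those hits come from, here is a small sketch that recounts them from the aligned Query and Sbjct rows. This is my own recount, not NCBI code. Rounding the percentage down is an assumption, though it agrees with the figures printed in the hits above (8/9 shown as 88%, 7/8 as 87%, 9/11 as 81%).

```python
def blast_identities(query_row, subject_row, gap="-"):
    """Count identical columns in a pairwise alignment.

    Returns (matches, columns, percent, gaps), with percent rounded down."""
    if len(query_row) != len(subject_row) or not query_row:
        raise ValueError("aligned rows must be non-empty and of equal length")
    columns = len(query_row)
    pairs = list(zip(query_row, subject_row))
    matches = sum(1 for q, s in pairs if q == s and q != gap)
    gaps = sum(1 for q, s in pairs if gap in (q, s))
    percent = 100 * matches // columns  # integer division rounds down
    return matches, columns, percent, gaps


# Aligned rows copied from the first and fifth hits above.
print(blast_identities("SARAHPALI", "SARAQPALI"))
print(blast_identities("SARAHP--ALI", "SARAHPYAALI"))
```

The gapped citrate synthase hit is the only one that covers all of SARAHPALI, but it needs a two-residue gap to do it, which is why it scores below the top hits.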

Of course, anybody who has heard me blather on and on about evolution knows that I am always talking about how blast top hits are not a good measure of relatedness per se (see my NAR paper where I first talked about this in 1995). So – I decided to build a tree of Sarah Palin. I used the NCBI Distance Tree option which you can do from blast searches.

Since most likely you cannot see that in enough detail – here is a zoom in.

That one did not come through on the Blog so well either so I decided to output the tree in Newick format and then I searched for a program that could draw a better figure on the web (we have tools in my lab to do this but I am trying to do this all on the web as an exercise). And I found a web site that makes drawtree available. And I plugged in the Newick format and it made a nicer one.


Though making trees from really short sequences is not ideal, in this tree, Sarah Palin is shown to be at the root of a branch including a protein from the parasitic nematode Brugia malayi. So if we take an evolutionary interpretation it seems that this causative agent of filariasis (well, a protein from this agent) is descended from SarahPalin. In other words, she seems to be ancestral to this parasite.

So in conclusion – by similarity – SarahPalin is closest to a plant pathogen with an unusual name. And by phylogeny SarahPalin is ancestral to a parasitic nematode. Sounds about right.

Help save the world and get $100,000 seed grant to do it

Just got this email and thought I would share since it does relate to some of the themes of my blog. I note that the Gates Foundation is VERY supportive of Open Access publishing as one of their previous grants helped support the journal “PLoS Neglected Tropical Diseases.” I am hoping that at some point the Gates Foundation will require OA publishing for all of the projects they fund.

The Bill & Melinda Gates Foundation is now accepting grant proposals for Round 2 of Grand Challenges Explorations, a US$100 million initiative to encourage unconventional global health solutions.

Based on your feedback, we have made changes for Round 2 of Grand Challenges Explorations. We modified the topics from Round 1 and added two additional topics. We will no longer require applicants to register for a topic in advance of submitting their proposals. We also updated the application form in response to feedback from the initial round.

Grant proposals are being accepted online at http://www.gcgh.org/explorations until November 2, 2008, on the following topics:

New! — Create new vaccines for diarrhea, HIV, malaria, pneumonia, and tuberculosis
New! — Create new tools to accelerate the eradication of malaria
— Create new ways to protect against infectious diseases
— Create drugs or delivery systems that limit the emergence of resistance
— Create new ways to prevent or cure HIV infection
— Explore the basis for latency in tuberculosis

Initial grants will be $100,000 each, and projects showing promise will have the opportunity to receive additional funding of $1 million or more. Full descriptions of the new topics and application instructions are available at http://www.gcgh.org/explorations.

We are looking forward to receiving innovative ideas from scientists around the world and from all scientific disciplines. Anyone can apply, regardless of education or experience level. If you don’t submit a proposal yourself, we hope you will forward this message to someone else who might be interested.

Thank you for your commitment to solving the world’s greatest health challenges.

Quick Post – Spore Sounds Cool

Nothing conclusive here. I have not tried it yet. But Spore sure sounds like a cool evolution game to play. See Carl Zimmer’s NY Times article and his blog for more.