Crosspost from http://microbe.net: A very misleading “bacteria in buildings” advertisement presented as “news”

Am crossposting this from http://microbe.net where I posted it earlier. See original post here: A very misleading “bacteria in buildings” advertisement presented as “news”
Wow this “story” (which is really an ad) is just so incredibly bad I do not know what to say: Dangerous Bacteria Isolated in Healthcare HVAC Evaporator Coils. I do not even know where to begin with criticism so I will just go step by step through some of the advertisement.
1. Title: Dangerous Bacteria Isolated in Healthcare HVAC Evaporator Coils
There is no evidence that the bacteria being looked at here are dangerous.
2. First sentence ”A recent study suggests that doctors may want to monitor the environmental condition of their air conditioners evaporator coil before surgery to help prevent the spread of bacterial infections”
No evidence is presented anywhere that monitoring AC coils has any even remote potential value here.
3. Second sentence: Dr. Rajiv Sahay, Laboratory Director at Environmental Diagnostics Laboratory (EDLab) and his colleagues sampled evaporator coils in healthcare air handling systems and isolated Pseudomonas aeruginosa a known noscocomial pathogen.
Well, Pseudomonas aeruginosa is indeed a known pathogen.  However, there is no evidence presented that all the things they detect are indeed pathogenic/virulent.  In fact, later in the article they report their results as being for “Pseudomonas sp” which suggests that their typing was very broad.  It is very possible that many of the cells they detected are not pathogenic.
4. Ignore the middle part.  It is just saying that Pseudomonas aeruginosa can be nasty in compromised patients.
5. They then go on to discuss their study more “In the study, over 560,000 colony forming units (CFU)/gram of Pseudomonas sp were isolated from deep within the evaporator coil system.”
What study?  No data is presented.  No methods.  No results.  Nothing.
6. They then say “Potential aerosolization of these micro-organisms from the infested coil is immense due to a discharge of air stream with 6 miles/hours (commonly observed) across the evaporator coils”
Not so sure about that.  Would have been much better to study ACTUAL aerosolization.
7. Then we find out that the person who conducted the study, Dr. Rajiv Sahay, is also the one selling the cleaning service to clean your air coils.  That does not instill confidence in me.
So a person selling HVAC cleaning reports unpublished results that they claim suggest if you do not clean your HVACs in hospitals you put all your patients at risk.  I am on board with the need to study microbes in hospitals more.  I am on board with the potential risks of microbes in AC systems.  I am not on board with not presenting data, and with getting the science wrong.

Special Guest Post & Discussion Invitation from Matthew Hahn on Ortholog Conjecture Paper


I am very excited about today’s post.  It is the first in what I hope will be many – posts from authors of interesting papers describing the “Story behind the paper“.  I write extensive detailed posts about my papers and also have tried to interview others about their papers if they are relevant to this blog.  But Matthew Hahn approached me recently about the possibility of him writing up some details on his recent paper on the functions of orthologs vs. paralogs.  So I said “sure” and set up a guest account for him to write up his comments and details of the paper.  


For those of you who do not know, Matt is on the faculty at U. Indiana.  He was a post doc at UC Davis so I have a particular bias in favor of him.  But his recent paper has generated some controversy (I posted some links about it here).  So it is great to get some more detail from him.  In addition, I note, I am also using this approach to try and teach people how easy it is to write a blog post by getting them guest accounts on Blogger and letting them write up something with links, pictures, etc.  So hopefully we can get more scientists blogging too.


Anyway – without any further ado – here is Matt’s post:

———————————————————————–
Following Jonathan’s excellent example of how explaining the history of a project helps to illuminate how the process of science actually happens, I thought I’d start by giving a bit of history behind our study, and the paper that we recently published in PLoS Computational Biology (http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1002073). And then I’ll address the critics…
How this all got started
It all started a bit more than three years ago, in the summer of 2008. Pedja (as Predrag Radivojac is known to friends) was giving a talk to a group of us on protein function prediction that he also presented as a tutorial at the Automated Function Prediction SIG at ISMB 2008. Pedja and I had already collaborated on a small project involving the evolution of phosphorylation sites, but I really had no idea about his work on function prediction, and little idea in general about how function prediction was done. Reviewing different ways to accomplish transfer-by-similarity, he eventually got around to evolutionary (phylogenomic) approaches. Here is what I remember of this specific exchange during his talk:
Pedja: …and of course these methods only use orthologs for prediction, because orthologs have more similar functions than do paralogs.
me (from audience): Who says?
Pedja: Umm, you say. I mean, the evolutionary biologists say.
me: No, we don’t. I don’t know of any data that says any such thing.
Pedja: Whatever, Matt. We’ll talk about this later.
Well, we did talk about it later, and it turned out that although this claim is made in tons of papers, there is basically no data to support it. In the best cases a real example of one gene family will be cited, but there are very few of these. In the worst cases, the authors will just cite some random paper about gene duplicates (or Fitch’s original paper defining orthologs and paralogs). Of course I agree that patterns of sequence evolution might lead you to conclude this relationship was true, but there was no experimental data.


In fact, as we say in our paper, rarely did anyone recognize that this claim needed to be tested, or even that it was a claim that could be tested. At the time Eugene Koonin was the only person to say this: “The validity of the conjecture on functional equivalency of orthologs is crucial for reliable annotation of newly sequenced genomes and, more generally, for the progress of functional genomics. The huge majority of genes in the sequenced genomes will never be studied experimentally, so for most genomes transfer of functional information between orthologs is the only means of detailed functional characterization” (http://www.ncbi.nlm.nih.gov/pubmed/16285863). I really liked the way that Eugene had said this, and started to refer to the idea that orthologs were more functionally similar than paralogs as the “ortholog conjecture.” So to be clear: I completely made up this phrase, but used the most evocative word from the Koonin paper.
Luckily for Pedja and me we had just gotten a small internal grant to work on genome annotation and we had an incoming master’s student (Nathan Nehrt) who was willing to work on a project intending to test the ortholog conjecture.
Interlude: the crappy state of things in the study of the evolution of function
In order to test anything about how function evolves between orthologs and paralogs—or between any genes—one of course needs some kind of data on gene function in multiple species. And this turns out to be a big problem.
Because, as Koonin says in the earlier quote, the vast majority of experimental data comes from a very few species, and these species are not exactly closely related. Here is an approximate phylogeny of the major eukaryotic model organisms:
It’s obvious from this figure that if you need both 1) lots of functional data from two species, and 2) a pretty good idea of exactly what the homologous relationships are between the genes you’re studying, you’re going to have to study human and mouse.
This is actually a pretty bleak picture for people who study molecular evolution (as I do). While we have tons and tons of sequence data both within and between species, and a very good idea about how these sequences evolve, and fancy models with which to analyze these sequences…we know next to diddly-squat about general patterns relating these sequence differences to functional differences. There are lots of interesting things to be gleaned from studies of sequence evolution, but it really would be nice to know something about the relationship between sequence and function.
What we found
What exactly does the ortholog conjecture predict? In my mind, at least, it predicts something like this:
In this completely fictitious graph the relationship between protein function and sequence similarity is a declining one, only it declines faster for paralogs than it does for orthologs. Also, just possibly, gene duplicates start out with slightly diverged function the minute they appear. Anyway, those were our predictions.
But here is what we found (Figure 1 in Nehrt et al. 2011):

(Panel A uses the Biological Process ontology and panel B uses the Molecular Function ontology.)
There are really two different, equally surprising results here. First, there is no relationship between sequence divergence and functional divergence for orthologs (among 2,579 one-to-one orthologs between human and mouse). Absolutely none—it’s a straight horizontal line. Second, there is a relationship for paralogs (among 21,771 comparisons), exactly as we predicted there would be. So according to our results, paralogs actually have more conserved function than do orthologs. Our interpretation of the data was that the most important determinant of function was the organismal context in which a gene/protein found itself: given the same amount of sequence divergence, two proteins in the same organism would be more functionally similar. For orthologs, this means that the sequence divergence of our target gene was not the most important thing, but rather the sum total of divergence in all of the genes that contribute to its cellular context. Which is why all the orthologs have on average similar functional divergence—they are all exactly the same age and hence have approximately the same levels of divergence in these interactors (in this case sequence divergence for paralogs is a much better indicator of their splitting time).
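One way to make the comparison above concrete is to fit a straight line to functional similarity against sequence identity, separately for ortholog pairs and for paralog pairs, and then compare the two slopes. The following is a minimal Python sketch under that assumption. The helper name `ols_slope` is hypothetical, and ordinary least squares is only a stand-in here, not necessarily the procedure used in Nehrt et al. 2011.

```python
def ols_slope(xs, ys):
    """Least-squares slope of ys regressed on xs.

    xs: sequence identity (or divergence) for each gene pair.
    ys: functional similarity score for the same pairs.
    Both are plain lists of numbers of equal length.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sum of squares around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    return sxy / sxx
```

Run once on the ortholog pairs and once on the paralog pairs, a slope close to zero would correspond to the flat line described for orthologs, and a clearly non-zero slope to the relationship described for paralogs.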
Without going through every result in the paper and our interpretation of every result, suffice it to say that after about a year-and-a-half of working on this (around February 2010), we were satisfied that we had something we were willing to submit. I even seem to remember showing the above figure to Jonathan on a visit to UC-Davis! So we did submit the paper, first to PNAS and then, after rejection, to PLoS Computational Biology, where it was rejected again.
The content of the reviews was approximately the same at both journals. Basically, people were not convinced of our results mostly because the functional relationships were all based on data in the Gene Ontology database. To be specific, the functional data we used came from experiments conducted in 12,204 different papers. We didn’t use any predicted functions, only functions assigned using experimental data. And we did A LOT of work to try to eliminate problems that might have affected our results, including repeating the main analysis using only GO terms common to both the human and mouse datasets. But there can still be bias hidden within these functional assignments because someone always has to interpret the experiment—to say that a yeast two-hybrid experiment means that a gene has function X. And because of these biases, people weren’t buying it.
To get a measure of functional similarity that did not depend on the interpretation of any experiments, we decided to repeat the entire analysis using microarray data, using the correlation in expression levels across 25 tissues as the measure of functional similarity. By this time Nathan was graduating and moving on to Maricel Kann’s lab as a research programmer, so we recruited one of Pedja’s Ph.D. students, Wyatt Clark, to pick up where Nathan had left off. (Wyatt had actually been a student in my undergraduate Evolution course a few years earlier, so we figured he knew something…) After repeating all of the GO-based analyses himself—always better to double-check, right?—Wyatt got all of the microarray data in order and produced this figure (Figure 4 in Nehrt et al. 2011):
So a year after we first submitted a paper, we submitted a new version to PLoS CB including the array analysis, and this was enough to convince the reviewers that our results were not merely due to some strange bias in GO.
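The expression-based measure described above can be sketched in a few lines. This is a minimal illustration that assumes a Pearson correlation over per-tissue expression values. The post says only "correlation", so the choice of Pearson and the function name `expression_similarity` are assumptions made for the example.

```python
from math import sqrt


def expression_similarity(profile_a, profile_b):
    """Pearson correlation between two expression profiles.

    Each profile is a list of expression levels, one value per tissue,
    with tissues in the same order in both lists (25 tissues in the
    study described above; any length >= 2 works here).
    """
    if len(profile_a) != len(profile_b) or len(profile_a) < 2:
        raise ValueError("profiles must have the same length (>= 2)")
    n = len(profile_a)
    mean_a = sum(profile_a) / n
    mean_b = sum(profile_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(profile_a, profile_b))
    var_a = sum((a - mean_a) ** 2 for a in profile_a)
    var_b = sum((b - mean_b) ** 2 for b in profile_b)
    if var_a == 0 or var_b == 0:
        # A gene with identical expression in every tissue has no
        # defined correlation with anything.
        raise ValueError("constant profile; correlation is undefined")
    return cov / sqrt(var_a * var_b)
```

A value near 1 means the two genes rise and fall together across tissues, and a value near 0 means their tissue patterns are unrelated. Because the number is computed directly from measurements, no curator has to interpret an experiment to produce it.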
The fallout, and some responses
First, let me say that I had some idea that this would be a controversial-ish paper, and that we’d get at least some blowback. For about the first 20 versions of the manuscript (including some submitted versions) I put the words “ortholog conjecture” in quotes in the title, never an endearing move. (Pedja finally convinced me to take them out of the latest submissions.) But I also thought people would be happier that an untested assumption had finally been tested—and we have definitely gotten some positive feedback along these lines, including several groups that told us they have data that support our findings. By coincidence my lab had another paper come out the same week as this one (http://www.ncbi.nlm.nih.gov/pubmed/21636278), and I mistakenly thought it would generate much more attention. I still think the biological importance of the results in that one are much greater than the ortholog conjecture results, but either because we didn’t publish in an open-access journal (Jonathan is always right) or simply because the function-prediction community is more active on the interweb tubes, there have been a surprising number of critical responses (partially collected here: http://phylogenomics.blogspot.com/2011/09/some-links-on-ortholog-conjecture-paper.html). So here are some responses to general critiques.
The ortholog conjecture says only that orthologs are similar.
Okay, this one is a bit unfair, as only one person has said this. The real problem here is that Michael Galperin seems to have deeply misunderstood what we mean by the ortholog conjecture. According to him the ortholog conjecture is “the assumption that orthologs (genes with a common origin that were vertically inherited from the same gene in the last common ancestor of the host organisms) typically retain the same function or have closely related ones.” Umm, no. In fact, if you really think this is what the ortholog conjecture says, then our results support it—human and mouse orthologs do typically have closely related functions. But we are explicitly testing for a difference between orthologs and paralogs, not whether or not orthologs retain any functions. At no point did we say (or even hint) that orthologs should not be used for functional prediction. The whole point of our analysis and conclusions is that we should stop ignoring paralogs, which would give us a ton more data to use for the prediction of functions.
The assignments of orthology and paralogy are incorrect.
This is an easy one: we did in fact get the definitions of in- and out-paralogs correct (and laid them out in Figure S1). According to Sonnhammer and Koonin: “Our definition of ‘outparalogs’ is: paralogs in the given lineage that evolved by gene duplications that happened before the radiation (speciation) event” (http://www.ncbi.nlm.nih.gov/pubmed/12446146). For the purposes of our study, this means that outparalogs are defined as any paralogs that diverged before the speciation event between human and mouse and inparalogs diverged after this speciation event. Outparalogs do not indicate only paralogs in two different species, though by necessity in our dataset inparalogs are only found in the same species (all in human or all in mouse). Therefore, with respect to our conclusion that the most important determinant of function is which genome you are found in (i.e. context), it wouldn’t matter if we had incorrect gene trees: we would never confuse two genes in the same species (either inparalogs or some of the outparalogs) with two genes in different species (all orthologs and the remaining outparalogs).
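The definitions above reduce to a small decision rule, sketched here in Python. The function name `classify_pair`, the string labels, and the use of node ages are all assumptions made for illustration; in practice the event type and relative timing would come from a reconciled gene tree.

```python
def classify_pair(event, event_age, speciation_age):
    """Classify a pair of homologous genes.

    event: "speciation" or "duplication", the event at the node where
        the two genes last shared a common ancestor.
    event_age: age of that node (larger means older).
    speciation_age: age of the reference speciation, e.g. the
        human-mouse split, in the same units.
    """
    if event == "speciation":
        return "ortholog"
    if event == "duplication":
        # Duplication before the speciation gives outparalogs;
        # duplication after it gives inparalogs. A tie is treated
        # as "after" in this sketch.
        if event_age > speciation_age:
            return "outparalog"
        return "inparalog"
    raise ValueError("event must be 'speciation' or 'duplication'")
```

Note that the label says nothing by itself about which genomes the two genes sit in: outparalogs can be in the same species or in different species, which is why the same-species versus different-species comparison does not depend on getting every gene tree right.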
You should have inferred functions yourselves
This is a fair suggestion, and not having enough time to annotate functions for 40,000 proteins would be a pretty weak excuse for doing good science. Instead…I’ll just say that it turns out professional curators are much better at assigning functions than even the original study authors (see http://www.ncbi.nlm.nih.gov/pubmed/20829821). Curators have a much broader view of the whole set of terms available in any ontology, and a much more consistent idea of how to apply these terms. My favorite line from the above cited article: “…because of the relatively low accuracy of the authors’ submissions, the use of authors’ annotations did not result in saving of curators’ time…”
GO is not appropriate for this analysis because it is biased.
This is the most frustrating criticism of our study, if only because it’s partly true: GO is biased. In our paper we actually detail several of these biases, including the observation that the same set of authors will give two proteins more-similar functions than will two different sets of authors. We tried very hard to attempt to control for these biases, though of course one cannot account for all of them. The most uncharitable part of this critique, however, has to be the fact that people conveniently forgot to say that our array analysis was completely distinct from the GO-based analysis (though it has its own issues), and that Burkhard Rost’s analysis of protein-protein interaction (http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.0020079) was also completely free of any bias in GO and was consistent with all of our results.
More annoying than this, you’d think from some of the critiques of GO that it was some sort of fly-by-night operation that no one should ever depend on. I mean, c’mon—there are human curators and human experimenters and of course they’re all biased so badly one could never compare functions between proteins much less between species. What were we thinking? (Only that the original GO paper has been cited >7000 times.) Funnily enough, at several points during the course of this work Pedja suggested—only half-jokingly—that we should just assume the ortholog conjecture was correct and write a paper about how GO must be wrong. Seriously, though: one would think from the excuses people came up with for the problems inherent in GO that we should simply stop using it to, you know, predict function in other species. And we were applying it to two relatively closely related mammals, one of which is explicitly a model for the other.
What next?
Our paper laid out several explicit hypotheses about the evolution of function that arose from our findings. Unfortunately, testing any of these hypotheses will require a ton more functional data, in more than one species. I know there are multiple groups working to collect these sorts of labor-intensive datasets, and Pedja and I are thinking about doing it ourselves (with collaborators, of course!). Massive datasets that reveal protein function will always be a lot harder to collect than sequence data, especially ones free from biases.
So let’s get to it…

—————————

Note – Toni Gabaldón was trying to post a detailed response but Blogger kept cutting him off with a character limit.  So I have posted his response below.

I appreciate the effort by Matthew Hahn in explaining the story behind his paper on the so-called “Ortholog conjecture” and in facing some of the criticism. This paper attracted my interest, as it did that of many others who work on or just use orthology. For instance, it was chosen by one of my postdocs for our “Journal Club” meeting, and it was discussed during our last “Quest for Orthologs” meeting in Cambridge. I think it is raising a necessary discussion and therefore I think it is a good paper. This does not mean that I fully agree with the interpretation and conclusions ;-). I hope to modestly contribute to this debate with the following post.

I think one of the reasons this paper has caused so much debate is that the conclusions seem to challenge common practice (inferring function from orthologs), and could be interpreted as implying a need to change the strategies of genome annotation. I think, however, that one should interpret these results carefully before starting to annotate based on paralogous proteins. As I will discuss below, one of the problems is that we need to agree on what the conjecture is in order to agree on how to test it. I see three main points that can be a source of confusion: i) the issue of what is actually stated by this conjecture, ii) the issue of annotation, and iii) the issue of time.

i) What is the “ortholog conjecture”?
Or in other terms, when should we expect orthologs to be more likely to share function than paralogs? Always? Of course not. All of us would agree that two recently duplicated paralogs are likely to be more similar in function than two distant orthologs, so it is obvious that the conjecture is not simply “orthologs are more similar in function than paralogs”. In reality the expectation that orthologs are more likely to be similar in function than paralogs, at least as I interpret it, is directly related to the effect that duplication has on functional divergence. If gene duplication has some effect on functional divergence (even if not in 100% of the cases), then, all other things being equal (divergence time, history of speciation/duplication events, except for the duplication defining the orthologs), one would expect orthologs to be more likely to conserve function.

I think this complexity is not well considered (by many authors, in general). Hahn refers to the famous review of orthology by Koonin (2005) as the source for the term “ortholog conjecture”. However, in that paper this conjecture is always discussed within the context of genes across two particular species, whereas in Hahn’s paper it is extended to other contexts as well. Thus, the proper context in which to test this conjecture is only between orthologs and between-species paralogs. As we can see, the red and purple lines in Figure 2 of Hahn’s paper do not show any clear difference.

Secondly, Koonin was very cautious in his paper, stating that he was referring to “equivalent functions” and not exactly the same “function”, correctly implying that the functional contexts would be different in the two different species. This brings me to the next point.

ii) annotation
If the expectation of functional conservation of orthologs refers to a given pair of species, then it makes no sense to test that expectation between paralogs within the same species and orthologs in different species. We were interested in this issue and it took us some effort to control for this “species” influence on the comparison. If you are interested, you can read our paper on divergence of expression profiles between orthologs and paralogs (http://www.ncbi.nlm.nih.gov/pubmed/21515902).

As Hahn finds, and as was anticipated by Koonin in that review, there is a huge influence of the “species context”, a big constraint on what fraction of the function is shared. Indeed I think it is the dominant signal in Hahn’s paper. Why is that? One possibility is that the functional context determines the function, I agree. However, we should not discard biases in how different communities working around a model species define processes and function, and also in the type of experiments that are usually done. For instance, experimental inference from KO mutants might be common in mouse, but I guess this is not the case in humans (!!). I think this may be having a big influence and might even be the dominant signal in Hahn’s paper.

Finally, function has many levels and I expect subfunctionalization to mostly affect the lower (i.e. more specific) levels. Biases may also exist in the level of annotation between species or between families of different size (contributing more or less to the orthologs/paralogs class).

Microarray data are less likely to be subject to biases (although some may exist); at least they should be expected to be free of “human interpretation biases”, and so Hahn and colleagues did well, in my opinion, to test that dataset. It is important to note that for microarrays, and for orthologs and between-species paralogs (which I think is the right frame for testing the conjecture), orthologs are more likely to share an expression context. This is compatible with what we found in the paper mentioned above, and compatible with the ortholog conjecture as stated by Koonin (across species).

iii) time
Finally, one aspect which I think is fundamental is the notion of “divergence time”. Since paralogs can emerge at different time-scales, they comprise a heterogeneous set of protein pairs. Most comparisons of orthologs and paralogs (Hahn’s as well) use sequence divergence as a proxy for time. However, this is only a poor estimate, especially when duplications (as here) are involved (we explored this issue in the past: http://www.ncbi.nlm.nih.gov/pubmed/21075746). This means that for a given divergence time paralogs may have larger sequence divergence than orthologs at the same divergence time, or the other way around (if gene conversion is playing a role). Is the conjecture based on sequence divergence or on divergence time? I think the initial sense of using orthology to annotate across species is based on the notion of comparing things at the same evolutionary distance. Thus basing our conclusions on divergence times might not be the proper way of doing it.

CONCLUSIONS AND PROPOSAL FOR RE-STATEMENT

To conclude, and with the intention of going beyond this particular paper, I would finish by saying that the key to the problem lies in how we interpret the so-called “ortholog conjecture”, or in what our expectations are about how function evolves. What I get from re-reading Eugene Koonin’s paper, and how I am using that “assumption” in my day-to-day work, is the following:

“Orthologs in two given species are more likely to share equivalent functions than paralogs between these two species”

Therefore the notion of “across the same pair of species” is important and thus only part of the comparisons made by Hahn and colleagues could directly test this. Looking at the microarray and between-species comparisons data, the conjecture may even hold true!!

I, however, do think that the conjecture as stated above is limited and does not capture the complexity of orthology relationships. Indeed we, and many other researchers, tune the confidence of orthology-based annotation based on whether the orthologs are one-to-one, one-to-many or many-to-many, or even on whether the orthologs are “super-orthologs” (with no duplication event in the lineages separating the two orthologs).

Since the underlying assumption of the ortholog conjecture is that duplication may (not necessarily always) promote functional shifts, many-to-many orthology relationships will tend to include orthologous pairs with different functions.

 Thus I would re-state the conjecture (or expectation) as follows:

 “In the absence of additional duplication events in the lineages separating them, two orthologous genes from two given species are more likely to share equivalent functions than two paralogs between these two species”

 This would be a more conservative expectation, which is closer to the current use of orthology-based annotation that tends to identify one-to-one orthologs, rather than any type.

When duplications start appearing in subsequent lineages, thus creating one-to-many or many-to-many orthology relationships, the situation is less clear. Following the assumption that duplications may promote functional divergence, one could expand the conjecture with “the more duplications in the evolutionary history separating two genes, the lower the expectation that these two genes would share equivalent functions”.
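The quantity in that expanded statement, the number of duplications in the history separating two genes, can be counted on a labeled gene tree. The sketch below is only an illustration: the function name `duplications_between`, the `parent` and `event` dictionaries, and the decision to count the common ancestor node itself are all assumptions made for the example.

```python
def duplications_between(gene_a, gene_b, parent, event):
    """Count duplication nodes on the tree path joining two genes.

    parent: dict mapping every node to its parent (root maps to None).
    event: dict mapping internal nodes to "speciation" or "duplication".
    The last common ancestor of the two genes is included in the count.
    """
    def ancestors(node):
        # Internal nodes from the gene's parent up to the root.
        path = []
        node = parent.get(node)
        while node is not None:
            path.append(node)
            node = parent.get(node)
        return path

    path_a = ancestors(gene_a)
    path_b = ancestors(gene_b)
    on_path_b = set(path_b)
    # The first ancestor of gene_a that is also an ancestor of gene_b
    # is their last common ancestor.
    lca = next((n for n in path_a if n in on_path_b), None)
    if lca is None:
        raise ValueError("genes do not share an ancestor in this tree")
    nodes = path_a[: path_a.index(lca) + 1] + path_b[: path_b.index(lca)]
    return sum(1 for n in nodes if event.get(n) == "duplication")
```

Under the expanded conjecture, a pair with a count of zero (one-to-one orthologs with no intervening duplications) would carry the strongest expectation of equivalent function, and that expectation would weaken as the count grows.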

I wrote this contribution on the fly, and surely there are ways of expressing this in more appropriate terms. In any case I hope I made clear the idea that the conjecture emerges from the notion of duplications causing functional shifts and that our expectations will be clearer if expressed in those terms. This goes along the lines of what Jonathan Eisen mentioned about considering the whole phylogenetic story to annotate genes.

Under this perspective, the really important hypothesis is that “duplications tend to promote functional shifts”. I think this is based on solid grounds and has been tested intensively in the past.

 Cheers,

Toni Gabaldón

http://treevolution.blogspot.com

Interested in sex? How about in bacteria? Then these #PLoSGenetics papers are for you

Well I was torn about this. Should I title the post ” ICE, ICE, Bacterial BABIES” or say something about sex? I settled on sex, but not sure if that was wise.

Anyway – quick post to say that there are two papers from PLoS Genetics last month that caught my eye. They are

The latter is a “review” paper linked to the first one which is a research paper. The papers together provide both a good background and a window into modern studies of “ICEs” or integrative conjugative elements in bacteria.

I like the summary from the first paper:

Some mobile genetic elements spread genetic information horizontally between prokaryotes by conjugation, a mechanism by which DNA is transferred directly from one cell to the other. Among the processes allowing genetic transfer between cells, conjugation is the one allowing the simultaneous transfer of larger amounts of DNA and between the least related cells. As such, conjugative systems are key players in horizontal transfer, including the transfer of antibiotic resistance to and between many human pathogens. Conjugative systems are encoded both in plasmids and in chromosomes. The latter are called Integrative Conjugative Elements (ICE); and their number, identity, and mechanism of conjugation were poorly known. We have developed an approach to identify and characterize these elements and found more ICEs than conjugative plasmids in genomes. While both ICEs and plasmids use similar conjugative systems, there are remarkable preferences for some systems in some elements. Our evolutionary analysis shows that plasmid conjugative systems have often given rise to ICEs and vice versa. Therefore, ICEs and conjugative plasmids should be regarded as one and the same, the differences in their means of existence in cells probably the result of different requirements for stabilization and/or transmissibility of the genetic information they contain.

That should be enough to get people started. And that is alas all I have time to write about here.

Put down what you are doing & read this article: Amy Harmon "Autistic & seeking a place in an adult world"

Some people out there complain about the death of great scientific and medical writing. Well, I say to them “What exactly have you been reading?” Sure there is crummy stuff out there. But there are some masterpieces. And last night I found one – Amy Harmon has an article that was released online last night and published in today’s Sunday New York Times: Autistic and Seeking a Place in an Adult World.
It is a spectacular piece of work – captivating, heartbreaking (in ways), inspiring (in others) and just brilliant in many ways. I have just read it for the third or fourth time. And probably about to go back for another look. I recommend everyone and anyone give it a look.

Assemblathon 1 paper out, includes many #UCDavis folks, though @vsbuffalo name backwards

Quick one here. A new paper is out from many folks, including Aaron Darling from my lab as well as a few other UC Davis folks. It is a cool paper: Assemblathon 1: A competitive assessment of de novo short read assembly methods
One minor issue – seems they got Vince Buffalo's name backwards – he is listed as Buffalo Vince on the Genome Research page, though in the PDF they have his name correct. Will have to see what he has to say about that.

Can I just say I love Biomed Central #OpenAccess

Well, I have given Biomed Central a bit of snarky grief the last few days over a few things. First, I posted to my Posterous site (but not here) a little comment about how their web site looks weird in Safari:

Then I posted to this blog a little ditty about how I did not like some parts of a phylogenetic tree they use in marketing: No award to give out but here are some lessons in using Google’s image search to find an image source

My main complaint was the poor treatment of microbes in the tree. In that post I discussed how I used Google image search to trace the tree to a few sites and discovered that they recognized it was a bit of a biased tree.  And I noted that they had fridge magnets with the tree on them and that I wanted one.

And, well, they have responded brilliantly.

Matthew Cockerill posted to my Posterous site about how he was looking into the Safari issue, and then they fixed it (it was a font display issue).

And then today in the mail I received a gift and a note

The note reads “We’ll do justice to the microbial world one day”.

Indeed.

I note, even without their responses, I truly love Biomed Central.  My first open access paper was published in a Biomed Central journal, Genome Biology: http://genomebiology.com/2000/1/6/research/0011 and I have published quite a few articles in their journals including:

Biomed Central was THE pioneer for truly open access publications in biology and they are still doing great things.  I note in addition, they do a very good job covering microbiology not only in their general journals but also with specific microbiology focused journals including:

So it seems – you are already doing some justice to the microbial world.

C-DEBI Research Support > Request for Research Proposals

Katrina Edwards on the Atlantis

I have always been fascinated by life in extreme places on the planet. And somehow I have managed to do projects on microbes from places like Antarctica, boiling hotsprings in Yellowstone and Kamchatka, acid pools, and more. The extremes are fascinating to me because they tell us a lot about the limits of life as well as indirectly about life in “normal” places.

And of course, I am not alone. Many many scientists are fascinated by life’s extremes. But not everyone ends up studying life in extreme environments of course. One reason for this is that many extreme environments that might be of interest are kind of hard to study. Consider the deep sea. Not so easy to do work there and just getting samples can be a massive undertaking.

Just imagine though. What if there were a way to “tag along” on an existing project studying life’s extremes at no cost to you or your grants? Even better what if there were a way to get extra funds to not just tag along on a project but to carry out detailed research at the same time?

Well, amazingly, there is such a chance right now. The C-DEBI (“Center for Dark Energy Biosphere Investigations”) project is calling for proposals: C-DEBI Research Support > Request for Research Proposals

They have money. They have drills. They have been and will continue to be collecting lots of samples from the bottom of the ocean and the crust below.  They are doing a bunch of microbiology (as well as other things). And they are calling for people out there to join them in various ways including:

And if you are interested, they are heading out in a few days on a cruise to study the seafloor at “North Pond”, a site on the bottom of the ocean on the Mid-Atlantic Ridge. For more information about this cruise see

I note – I was a visiting scientist for a few days at one of the C-DEBI meetings about evolution earlier this year. It was a great meeting – on Catalina Island – and I wrote a VERY long blog post about it: The Tree of Life: A “work” trip to Catalina Island: USC, Wrigley, C-DEBI, dark energy biosphere, Virgin Oceanic, Deep Five, & more. You can learn more about the C-DEBI project by reading that post.  And you can look at my pretty pictures below:

I note in addition, I have been forever in debt to Katrina Edwards, the PI of the C-DEBI project, ever since she gave my kids a frigging awesome tour of the Atlantis when it was docked in San Francisco.

But regardless of the personal connections I have to C-DEBI, the project is very interesting and the fact that they are offering up funds to support “outsiders” who want to participate in the project in some way is great.

Great paper showing the potential power of comparative and evolutionary genomics in #PLoS Genetics

There is a wonderful paper that has just appeared in PLoS Genetics I want to call people’s attention to: PLoS Genetics: Emergence and Modular Evolution of a Novel Motility Machinery in Bacteria

In the paper, researchers from CNRS and Aix-Marseille in France used some nice comparative and evolutionary genomics analyses along with experimental work to characterize the function and evolution of gliding motility in bacteria.

Their summary of their work:

Motility over solid surfaces (gliding) is an important bacterial mechanism that allows complex social behaviours and pathogenesis. Conflicting models have been suggested to explain this locomotion in the deltaproteobacterium Myxococcus xanthus: propulsion by polymer secretion at the rear of the cells as opposed to energized nano-machines distributed along the cell body. However, in absence of characterized molecular machinery, the exact mechanism of gliding could not be resolved despite several decades of research. In this study, using a combination of experimental and computational approaches, we showed for the first time that the motility machinery is composed of large macromolecular assemblies periodically distributed along the cell envelope. Furthermore, the data suggest that the motility machinery derived from an ancient gene cluster also found in several non-gliding bacterial lineages. Intriguingly, we find that most of the components of the gliding machinery are closely related to a sporulation system, suggesting unsuspected links between these two apparently distinct biological processes. Our findings now pave the way for the first molecular studies of a long mysterious motility mechanism.

Basically, they started with some genetic and functional studies in Myxococcus xanthus.  They analyzed these in the context of the genome sequence (note – I was a co-author on the original genome paper).  And then they did some extensive comparative and evolutionary analysis of these genes, producing some wonderful figures along the way such as:

Figure 2. Taxonomic distribution of the closest homologues of the 14 genes composing the G1, G2, and M1 clusters, and genetic organization of the core complex. (A) For a given gene, the number of homologues in the corresponding genome is indicated by the numbers within arrows. The relationships between the species carrying the different homologues of the genes are indicated by the phylogeny on the left. Based on their taxonomic distribution, the 14 genes can be divided into Group A (grey background) and Group B (white background). (B) In all non Deltaproteobacteria and in Geobacter, the Group B genes clustered in a single genomic region.  doi:10.1371/journal.pgen.1002268.g002  


Based on their analysis they came up with some hypotheses as to which genes were involved in key parts of gliding motility and what their biochemical functions were. They then went and tested these hypotheses with experiments.  I am not going to go into detail on the functional work they did but you can read their paper for more details.

They wrapped up their paper by proposing a model for the evolutionary history of gliding motility.  I am not sure I buy all components of their model since our sampling of genomes right now is still very poor, but they have a pretty detailed theory captured in part in this figure:

Figure 8. Evolution and structure of the Myxococcus gliding motility machinery. A) Evolutionary scenario describing the emergence and evolution of the gliding motility machinery in M. xanthus. The relationships between organisms carrying close homologues of the 14 genes encoding putative components of the gliding machinery in M. xanthus are represented by the phylogeny. Green and red arrows respectively indicate gene acquisition and gene loss. The number of gene copies that were acquired or lost is indicated within arrows. The purple dotted arrows represent horizontal gene transfer events of one or several components. WGD marks the putative whole genome duplication event that occurred in the ancestor of Myxococcales. For each gene, locus_tag, former (agm/agl/agn) and new (glt and agl) names are provided. The number of complete genomes that contain homologues of glt and agl genes compared to the total number of complete genomes available at the beginning of this study are indicated in brackets. (B) The Myxococcus gliding machinery. The diagram compiles data from this work and published literature. Components were added based on bioinformatic predictions, mutagenesis, interaction and localization studies. Exhaustive information is not available for all proteins and thus the diagram largely is subject to modifications once more data will be available. Known interactions within the complex from experimental evidence are AglR-GltG, AglZ-MglA and interactions within the AglRQS molecular motor [13], [15]. For clarity, the proteins were colour-coded as in the rest of the manuscript 

Anyway – I don’t have much time right now to provide more detail on the paper.  But it is definitely worth checking out.

Storification of my notes/tweets from #UCDavis CLIMB Symposium "The infant gut microbiome: prebiotics, probiotics and establishment"

I made a Storify posting for the CLIMB Symposium I participated in yesterday. First I am reposting my summary of what the symposium was about which I posted the day before the meeting:

There is a symposium tomorrow at UC Davis organized by undergraduates in the CLIMB program.  CLIMB stands for “Collaborative Learning at the Interface of Mathematics and Biology” and is a program that emphasizes hands-on training using mathematics and computation to answer state-of-the-art questions in biology.  A select group of undergraduates participates in the program and this summer the students had to do some sort of modelling project.  Somehow I managed to convince them to do work on human gut microbes.  And they have done a remarkable job.  

As part of their summer work, they organized a symposium on the topic and their symposium takes place tomorrow.  Details are below. 

The Infant Gut Microbiome: Prebiotics, Probiotics, & Establishment 

  • Jonathan Eisen, UC Davis “DNA and the hidden world of microbes”
  • Mark Underwood, UC Davis “Dysbiosis and necrotizing enterocolitis”
  • Ruth Ley, Cornell University “Host-microbial interactions and metabolic syndrome” 
  • CLIMB 2010 cohort “Breast milk metabolism and bacterial coexistence in the infant microbiome”
  • David Relman, Stanford University “Early days: assembly of the human gut microbiome during childhood” 
  • Bruce German, UC Davis

The only major issue for me is I am losing my voice.  So we will see how this goes.  Though I note I have gotten some very sage advice on how to treat my voice problem via the magic of twitter.  If I do not collapse I will also be tweeting/posting about the other talks during the day. 


Anyway – here is the storification:

View “CLIMB Symposium at UC Davis” on Storify: http://storify.com/phylogenomics/climb-symposium-at-uc-davis

Coming Monday at #UCDavis "The Infant Gut Microbiome: Prebiotics, Probiotics, & Establishment"

Just a little announcement here.  There is a symposium tomorrow at UC Davis organized by undergraduates in the CLIMB program.  CLIMB stands for “Collaborative Learning at the Interface of Mathematics and Biology” and is a program that emphasizes hands-on training using mathematics and computation to answer state-of-the-art questions in biology.  A select group of undergraduates participates in the program and this summer the students had to do some sort of modelling project.  Somehow I managed to convince them to do work on human gut microbes.  And they have done a remarkable job.

As part of their summer work, they organized a symposium on the topic and their symposium takes place tomorrow.  Details are below.

The Infant Gut Microbiome: Prebiotics, Probiotics, & Establishment

Monday, 12 September 2011, 9am-4pm

Life Sciences 1022

UC Davis

9:00-9:10 Introduction

9:10-9:40 Jonathan Eisen, UC Davis

“DNA and the hidden world of microbes”

9:40-10:40 Mark Underwood, UC Davis

“Dysbiosis and necrotizing enterocolitis”

10:40-10:50 break

10:50-11:50 Ruth Ley, Cornell University

“Host-microbial interactions and metabolic syndrome”

11:50-12:00 general discussion

12:00-1:00 lunch

1:00-2:00 CLIMB 2010 cohort

“Breast milk metabolism and bacterial coexistence in the infant microbiome”

2:00-2:10 break

2:10-3:10 David Relman, Stanford University

“Early days: assembly of the human gut microbiome during childhood”

3:10-3:40 Bruce German, UC Davis

3:40-4:00 next steps

The only major issue for me is I am losing my voice.  So we will see how this goes.  Though I note I have gotten some very sage advice on how to treat my voice problem via the magic of twitter.  If I do not collapse I will also be tweeting/posting about the other talks during the day.