Does phylogeny matter? (In Eco-Evo meta-analyses) … Apparently, yes, but it depends.

As many may know – I am pretty obsessed with the uses of phylogeny in biological studies.  In fact, one could say this has driven almost all of my work.  Thus when an email went around a little bit ago about an article for a journal club at UC Davis where the title begins with “Does phylogeny matter?”, well, I had to take a look.  Alas, I was a bit worried when I saw the article was in Ecology Letters because I am at home and was not sure about access policies for this journal.

But I was pleasantly surprised to get full access, without any library or VPN login, to the following article: Does phylogeny matter? Assessing the impact of phylogenetic information in ecological meta-analysis – Chamberlain – 2012 – Ecology Letters – Wiley Online Library. I cannot figure out WHY it is freely available right now, nor how long it will be, but I took the chance to look the article over.

And I was even more pleasantly surprised to look over the article.  Many meta-analyses can seem forced – if not almost unbearable to look through.  But this one is very well done.  Basically they did a massive comparison of conclusions that one could reach when one either does or does not take into account the phylogenetic non-independence of taxa when conducting meta-analyses in evo-eco studies.  They searched published literature for meta-analyses and then .. well I will use their words here (from the end of their introduction):

Herein, we re-analyse datasets from previously published meta-analytic studies, comparing results of traditional and phylogenetic meta-analyses. In addition, we attempt to explain variation in the effect of phylogenetic information on meta-analytic outcomes by examining characteristics of phylogenies. We ask: (1) how does accounting for phylogenetic non-independence change results of individual meta-analyses? and (2) across datasets, what characteristics of phylogenies explain changes in effect size for phylogenetic vs. traditional meta-analyses? As a complement to our main questions, in Appendix A, we also ask (3) how does accounting for phylogenetic non-independence affect model fit of individual meta-analyses? and (4) across datasets, what characteristics of phylogenies explain variation in the relative fit of phylogenetic meta-analyses? Despite the many compelling reasons to incorporate phylogenetic information into meta-analyses that involve multiple species, investigators often use model comparison criteria, such as Akaike’s Information Criterion (AIC) to assess fit of phylogenetic vs. traditional meta-analytic models. We found a clear bias in relation to phylogeny size for one of the two methods currently used to quantify relative model fit (Q-based AIC), thus our findings have important implications for meta-analysts using such model comparisons (see Appendix A for details).

And the key conclusions are:

Here, we have shown that incorporating phylogenies influences ecological meta-analysis outcomes, in many cases changing whether the observed effect size differs significantly from zero. We also show that the degree of difference between traditional and phylogenetic meta-analyses depends on key characteristics of phylogenies. Despite this potential complication, we strongly recommend incorporating phylogenetic information into ecological meta-analyses to account for species non-independence.
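To make the contrast concrete, here is a minimal sketch of my own (not the paper's code) comparing a traditional fixed-effect pooled estimate with one that allows related species to be correlated. Everything in it is an assumption for illustration: the effect sizes, variances and correlation matrix are made-up toy numbers, the names `solve` and `pooled_effect` are hypothetical, and the covariance between two species is assumed to be the square root of the product of their sampling variances times their phylogenetic correlation.

```python
from math import sqrt


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        rest = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - rest) / m[r][r]
    return x


def pooled_effect(effects, variances, corr=None):
    """Generalized least squares pooled mean and its standard error.

    corr=None treats species as independent (traditional meta-analysis).
    Otherwise corr is an assumed phylogenetic correlation matrix.
    """
    n = len(effects)
    if corr is None:
        corr = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    # Assumed covariance structure: V[i][j] = sqrt(v_i * v_j) * corr[i][j]
    v = [[sqrt(variances[i] * variances[j]) * corr[i][j] for j in range(n)]
         for i in range(n)]
    weights = solve(v, [1.0] * n)   # V^-1 1 (V is symmetric)
    total = sum(weights)            # 1' V^-1 1
    mean = sum(w * y for w, y in zip(weights, effects)) / total
    return mean, sqrt(1.0 / total)


# Toy data (made up): four species forming two pairs of close relatives.
effects = [0.5, 0.4, 0.0, 0.1]
variances = [0.04, 0.04, 0.04, 0.04]
phylo_corr = [
    [1.0, 0.9, 0.0, 0.0],
    [0.9, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.9],
    [0.0, 0.0, 0.9, 1.0],
]

traditional = pooled_effect(effects, variances)
phylogenetic = pooled_effect(effects, variances, phylo_corr)
for label, (mean, se) in (("traditional", traditional), ("phylogenetic", phylogenetic)):
    print(f"{label}: mean={mean:.3f} se={se:.3f} z={mean / se:.2f}")
```

With these toy numbers the pooled mean stays at 0.25 in both analyses, but the standard error grows from 0.10 to about 0.14 once the close relatives are no longer counted as independent data points, so the z-score drops from 2.5 to roughly 1.8 and slips below the usual 1.96 cutoff. That is the flavor of change the authors describe, although real datasets and real trees will of course behave in more complicated ways.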

They also offer up three main recommendations for consideration:

To conclude, we outline three recommendations for the use of phylogenetic meta-analyses in ecology and evolutionary biology:

  1. Use phylogenetic meta-analysis, but note that some response metrics are less likely to be affected by phylogenetic methods.
  2. Include as many species as possible.
  3. Be aware that phylogeny shape may influence meta-analytic outcomes. 

Definitely worth a look …

Chamberlain, S., Hovick, S., Dibble, C., Rasmussen, N., Van Allen, B., Maitner, B., Ahern, J., Bell-Dereske, L., Roy, C., Meza-Lopez, M., Carrillo, J., Siemann, E., Lajeunesse, M., & Whitney, K. (2012). Does phylogeny matter? Assessing the impact of phylogenetic information in ecological meta-analysis. Ecology Letters, 15 (6), 627-636. DOI: 10.1111/j.1461-0248.2012.01776.x

Cool paper, & winner of "worst new omics word award": Predatosome
And the bad new omics words keep streaming in. Today’s “Worst New Omics Word Award” goes to Carey Lambert, Chien-Yi Chang, Michael J. Capeness and R. Elizabeth Sockett from Nottingham for their use/invention of “Predatosome”. They use this term in the title of their new PLoS One paper: The First Bite— Profiling the Predatosome in the Bacterial Pathogen Bdellovibrio. Here is the very long sentence where they define it:

The gene products required for the initial invasive predatory processes have not been extensively studied but the genome sequencing of B. bacteriovorus HD100 [1] revealed a genome of 3.85Mb, including a core genome similar to that of non-predatory bacteria and some 40% of the genome comprising a potential predicted “predatosome” of genes, encoding both hydrolytic products that may be employed in prey degradation, and genes that may be required specifically for host predation and thus are not conserved across the Proteobacteria.

The paper is actually quite interesting. They use genomic approaches to characterize a fascinating organism – the bacterial species Bdellovibrio bacteriovorus. This bug is a predatory organism – eating other bacteria. Since it eats them from the inside out, some, including these authors, refer to this organism as a pathogen of other bacteria, and there is some discussion here and elsewhere of its potential to serve as a “living antibiotic” in much the same way people are trying to use bacterial viruses (a.k.a. phage).

The paper overall is quite nice on first read. They used microarray studies to characterize gene expression patterns in different phases of the life cycle (see Figure above for the life cycle outline). They backed up this work by quantitative PCR studies and regular RT PCR. And based upon their analysis they found some genes that are “Up-Regulated in Predatory, but Not HI” phase (HI stands for host-independent). And here is where they really tell us what they mean by predatosome:

This category of 240 genes are very interesting as they potentially exclude those genes simply involved with release from attack-phase into growth, namely they should be part of the “predatosome” of predatorily specific genes.

It seems to me this terminology is completely unnecessary. All they need to do is say they are studying the genes related to the predatory phase. To assign these genes to the “predatosome” is a bit much. They continue in the paper to report some really interesting stuff. For example, they also examine another predatory bacterial species, and look at whether there are genes conserved in the process between species. They made some really nice figures, by the way, about the different phases of the life cycle in this organism and which genes are expressed:

Anyway – the science in the paper is nice. However, the invention of yet another omics word is a bit much. And thus Lambert et al. are winners of the highly coveted “Worst New Omics Word Award” for their invention of “predatosome”. Details on the paper are below – and that is where the figures come from too. (Hat tip to Bora for letting me know about the paper, and the word).

Lambert, C., Chang, C., Capeness, M., & Sockett, R. (2010). The First Bite— Profiling the Predatosome in the Bacterial Pathogen Bdellovibrio. PLoS ONE, 5 (1). DOI: 10.1371/journal.pone.0008599


Confronting Intelligent Design arguments directly in the scientific literature
A representative from Wiley publishing sent me a link to an interesting new paper entitled “Using Protistan Examples to Dispel the Myths of Intelligent Design” by Mark Farmer, from the University of Georgia, and Andrea Habura, from the University at Albany, New York. It is from the Journal of Eukaryotic Microbiology and is based upon a presentation they gave at a workshop at a conference.

Basically, the article is a detailed discussion of how examples relating to microbial eukaryotes (I hate the term protist …) that are used by Intelligent Design advocates are, well, BS. And the article discusses the evidence that refutes the ID arguments.

One thing they discuss is the issue of the Cambrian Explosion. ID supporters, such as Stephen Meyer, have made many arguments about why they feel the diversification in the Cambrian is not explainable through evolutionary processes. Farmer and Habura refute this by pointing out that the diversity seen in microbial eukaryotes at the time of the Cambrian was immense and that what came out of the “explosion” was actually not that spectacular relative to what already existed in the microbial eukaryotes:

The extant diversity of the protists should therefore be seen as the “background radiation” of the eukaryotic Big Bang, with the Cambrian radiation of the metazoa being a subsequent event within a specific group.

They go on to discuss examples involving speciation, the fossil record, evolution of drug resistance in Plasmodium, and a few other things. In each case they discuss a claim by ID supporters and then discuss evidence for why this claim is not valid. Overall the paper is worth reading if you are involved in any discussions with ID supporters.

I note that when I finished the above writing, I went to look at Pubmed to find other examples of people taking on ID arguments in the literature with a focus on issues in microbes. Here are two other recent examples:


Farmer, M., & Habura, A. (2010). Using Protistan Examples to Dispel the Myths of Intelligent Design. Journal of Eukaryotic Microbiology, 57 (1), 3-10. DOI: 10.1111/j.1550-7408.2009.00460.x

My first PLoS One paper …. yay: automated phylogenetic tree based rRNA analysis
Well, I have truly entered the modern world. My first PLoS One paper has just come out. It is entitled “An Automated Phylogenetic Tree-Based Small Subunit rRNA Taxonomy and Alignment Pipeline (STAP)” and well, it describes automated software for analyzing rRNA sequences that are generated as part of microbial diversity studies. The main goal behind this was to keep up with the massive amounts of rRNA sequences we and others could generate in the lab and to develop a tool that would remove the need for “manual” work in analyzing rRNAs.

The work was done primarily by Dongying Wu, a Project Scientist in my lab, with assistance from Amber Hartman, who is a PhD student in my lab. Naomi Ward, who was on the faculty at TIGR and is now at Wyoming, and I helped guide the development and testing of the software.

We first developed this pipeline/software in conjunction with analyzing the rRNA sequences that were part of the Sargasso Sea metagenome, and results from the work were in the Venter et al. Sargasso paper. We then used the pipeline and continued to refine it as part of a variety of studies, including a paper by Kevin Penn et al. on coral-associated microbes. Kevin was working as a technician for me and Naomi and is now a PhD student at Scripps Institution of Oceanography. We also had some input from various scientists we were working with on rRNA analyses, especially Jen Hughes Martiny.

We made a series of further refinements and worked with people like Saul Kravitz from the Venter Institute and the CAMERA metagenomics database to make sure that the software could be run outside of my lab. And then we finally got around to writing up a paper …. and now it is out.

You can download the software here. The basics of the software are summarized below: (see flow chart too).

  • Stage 1: Domain Analysis
    • Take a rRNA sequence
    • blast it against a database of representative rRNAs from all lines of life
    • use the blast results to help choose sequences to use to make a multiple sequence alignment
    • infer a phylogenetic tree from the alignment
    • assign the sequence to a domain of life (bacteria, archaea, eukaryotes)

  • Stage 2: First pass alignment and tree within domain
    • take the same rRNA sequence
    • blast against a database of rRNAs from within the domain of interest
    • use the blast results to help choose sequences for a multiple alignment
    • infer a phylogenetic tree from the alignment
    • assign the sequence to a taxonomic group

  • Stage 3: Second pass alignment and tree within domain
    • extract sequences from members of the putative taxonomic group (as well as some others to balance the diversity)
    • make a multiple sequence alignment
    • infer a phylogenetic tree

From the above path, we end up with an alignment, which is useful for things such as counting the number of species in a sample, as well as a tree, which is useful for determining what types of organisms are in the sample.
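For those who think better in code, the control flow of the three stages can be sketched roughly as below. To be clear, this is a toy illustration and not the STAP code: a k-mer overlap score stands in for blast, "take the label of the best hit" stands in for building an alignment and tree, a single made-up reference list stands in for the separate databases, and all names (`similarity`, `classify`, `REFERENCES`) and sequences are hypothetical.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def similarity(a, b, k=3):
    """Jaccard overlap of k-mer sets: a crude stand-in for a blast score."""
    ka, kb = kmers(a, k), kmers(b, k)
    union = ka | kb
    return len(ka & kb) / len(union) if union else 0.0


def classify(query, references, k=3, n_outgroup=1):
    """Toy three-stage flow: domain, then group, then sequences to align."""
    def score(ref):
        return similarity(query, ref["seq"], k)

    # Stage 1: compare against representatives from all domains and
    # assign a domain (here: the domain of the best hit, not a tree).
    domain = max(references, key=score)["domain"]

    # Stage 2: compare only within that domain and assign a taxonomic group.
    in_domain = [r for r in references if r["domain"] == domain]
    group = max(in_domain, key=score)["group"]

    # Stage 3: gather members of the putative group plus a few others
    # from the domain; these would go into the final alignment and tree.
    members = [r for r in in_domain if r["group"] == group]
    others = [r for r in in_domain if r["group"] != group][:n_outgroup]
    return {
        "domain": domain,
        "group": group,
        "to_align": [r["name"] for r in members + others],
    }


# Made-up toy sequences, far shorter than any real rRNA.
REFERENCES = [
    {"name": "bac_A1", "domain": "bacteria", "group": "groupA", "seq": "AAACCCGGGTTT"},
    {"name": "bac_A2", "domain": "bacteria", "group": "groupA", "seq": "AAACCCGGGTTA"},
    {"name": "bac_B1", "domain": "bacteria", "group": "groupB", "seq": "AAACCCTTTGGG"},
    {"name": "arc_C1", "domain": "archaea", "group": "groupC", "seq": "GTGTGTCACACA"},
    {"name": "euk_D1", "domain": "eukaryotes", "group": "groupD", "seq": "TATATAGCGCGC"},
]

result = classify("AAACCCGGGTTC", REFERENCES)
print(result)
```

The point of the sketch is only the narrowing: each stage uses the previous answer to pick a smaller, more relevant set of reference sequences, so the final alignment and tree are built from close relatives plus a little outside diversity.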

I note – the key is that it is completely automated, can be run on a single machine or a cluster, and produces results comparable to manual methods. In the long run we plan to connect this to other software that we and other labs develop, to build a metagenomics and microbial diversity workflow that will help in the processing of massive amounts of sequence data for microbial diversity studies.

I should note this work was supported primarily by a National Science Foundation grant to me and Naomi Ward as part of their “Assembling the Tree of Life” Program (Grant No. 0228651). Some final work on the project was funded by the Gordon and Betty Moore Foundation through grant #1660 to Jonathan Eisen and the CAMERA grant to UCSD.

Wu, D., Hartman, A., Ward, N., & Eisen, J. (2008). An Automated Phylogenetic Tree-Based Small Subunit rRNA Taxonomy and Alignment Pipeline (STAP). PLoS ONE, 3 (7). DOI: 10.1371/journal.pone.0002566