Reading in detail Carl Woese’s 1998 "Manifesto on Microbial Genomics" for the first time …

I am a bit stunned by this 1998 paper from Carl Woese, which I was aware of but had not read in detail until now: ScienceDirect.com – Current Biology – A manifesto for microbial genomics

I re-discovered it because I am making a compilation of papers by Woese in relation to the tribute page I have set up.  And the title (a manifesto about microbial genomics) combined with the date (1998 – early in the genome sequencing era) struck me as something worth looking at.  Plus I knew others (e.g., Phil Hugenholtz, Nikos Kyrpides, …) had mentioned this paper to me so I figured – hey – how about actually reading it in detail.  And fortunately it is freely available at the Current Biology web site (not sure why that is actually).  Anyway – what I found in the paper is basically an argument for much of my career from 1998-2008.

Some choice lines in here but the crux is as follows

The first order of business in microbial genomics should be a phylogenetically representative genomic screen of the microbial world. In other words, all the major microbial taxa and their subdivisions — which are the major source of biological diversity on Earth — should be represented by several genome sequences. There are now more than 30 recognized major eubacterial taxa — each the phylogenetic equivalent of a eukaryotic kingdom — and at least half that number in the (far less well characterized) Archaea; not to mention the yet-to-be-discovered kingdoms among the unicellular eukaryotes.

This basically lays out the Tree of Life project I co-ran at TIGR and the Genomic Encyclopedia of Bacteria and Archaea project I co-ran / run at the DOE JGI.

The ending is perfect

This is not the place to go into the specifics of which microbial genomes would be most useful. I would suggest, however, that a phylogenetic tree hang on the wall of every laboratory in which microbial genomes are being sequenced — for inspiration.

Somehow I had missed the crux of this paper until now.  I think it is worth reading by everyone out there working on microbes and/or their genomes.

Oh – and here is the compilation of Woese’s papers I am making in Mendeley.

http://www.mendeley.com/groups/2940711/papers-by-carl-woese/widget/21/3/

A blast from the past: Plasmodium, plastids, phylogeny, and reproducibility

A few days ago I got an email from a colleague who I had not seen in many years.  It was from Malcolm Gardner who worked at TIGR when I was there and is now at Seattle Biomed.

His email was related to the 2002 publication of the complete genome sequence of Plasmodium falciparum (the causative agent of most human malaria cases), for which he was the lead author.  Someone had emailed Malcolm asking if he could provide details about the settings used in the blast searches that were part of the evolutionary analyses in the paper.  The paper is freely available at Nature – at least for now.  Every once in a while the Nature Publishing Group seems to put it behind a paywall despite their promises not to.

Malcolm was contacting me because I had run / coordinated much of the evolutionary analysis reported in that paper.  I note – as one of the only evolution focused people at TIGR it was pretty common for people to come to me and ask if I could help them with their genome.  I pretty much always said yes since, well, I loved doing that kind of thing and it was really exciting in the early days of genome sequencing to be the first person to ask some evolution related question about the data.


Malcolm included the email he had received (which did not have a lot of detail) and he and I wrote back and forth trying to figure out exactly what this person wanted.  And then I said, well, maybe the person should get in touch with me directly so I can figure out what they really want/need.  It seemed unusual that someone was asking about something like that from a 10 year old paper, but, whatever.  

As I was communicating with this person, I started digging through my files and my brain trying to remember exactly what had been done for this paper more than 10 years ago.  I remember Malcolm and others from the Plasmodium community organizing some “jamborees” looking at the annotation of the genome. At one of those jamborees Malcolm and I met with some of the folks from the Sanger Center (which was one of the big players in the P. falciparum genome sequencing) and, after some discussion, I ended up doing four main things relating to the paper, which I describe below.

Thing 1: Conserved eukaryote genes

One of my analyses was to use the genome to look for genes conserved in eukaryotes but not present in bacteria or archaea.  I did this to try and find genes that could be considered likely to have been invented on the evolutionary branch leading up to the common ancestor of eukaryotes.

As an aside, at about the same time I was asked to write a News and Views for Nature about the publication of the Schizosaccharomyces pombe genome.  In that N&V, “Genome sequencing: Brouhaha over the other yeast”, I noted how the authors had used the genome to do some interesting analysis of conserved eukaryotic genes.  With the help of the Nature staff I had also made a figure which demonstrated (sort of) what they were trying to do in their analysis – which was to find genes that originated on the branch leading up to the common ancestor of the eukaryotes for which genomes were available at the time.  As another aside – the S. pombe genome paper and my News and Views article are freely available …

Figure 1: The tree of life, with the branches labelled according to Wood et al.’s analysis of genes that might be specific to eukaryotes versus prokaryotes, and to multicellular versus single-celled organisms. Bacteria and archaea are prokaryotes (they do not have nuclei). From Nature 415, 845-848 (21 February 2002) | doi:10.1038/nature725. The eukaryotic portion of the tree is based on Baldauf et al. 2000

Anyway, I did a similar analysis to what was in the S. pombe genome paper, found a reasonable number of such genes, and helped write a section for the paper on this.

Comparative genome analysis with other eukaryotes for which the complete genome is available (excluding the parasite E. cuniculi) revealed that, in terms of overall genome content, P. falciparum is slightly more similar to Arabidopsis thaliana than to other taxa. Although this is consistent with phylogenetic studies (64), it could also be due to the presence in the P. falciparum nuclear genome of genes derived from plastids or from the nuclear genome of the secondary endosymbiont. Thus the apparent affinity of Plasmodium and Arabidopsis might not reflect the true phylogenetic history of the P. falciparum lineage. Comparative genomic analysis was also used to identify genes apparently duplicated in the P. falciparum lineage since it split from the lineages represented by the other completed genomes (Supplementary Table B). 

There are 237 P. falciparum proteins with strong matches to proteins in all completed eukaryotic genomes but no matches to proteins, even at low stringency, in any complete prokaryotic proteome (Supplementary Table C). These proteins help to define the differences between eukaryotes and prokaryotes. Proteins in this list include those with roles in cytoskeleton construction and maintenance, chromatin packaging and modification, cell cycle regulation, intracellular signalling, transcription, translation, replication, and many proteins of unknown function. This list overlaps with, but is somewhat larger than, the list generated by an analysis of the S. pombe genome (65). The differences are probably due in part to the different stringencies used to identify the presence or absence of homologues in the two studies.

The list of genes is available as supplemental material on the Nature web site.  Alas it is in MS Word format which is not the most useful thing.  But more on that issue at the end of this post.

Thing 2: Searching for lineage specific duplications

Another aspect of comparative genomic analysis that I used to do for most genomes at TIGR was to look for lineage specific duplications (i.e., genes that have undergone duplications in the lineage of the species being studied to the exclusion of the lineages for which other genomes are available).  The quick and dirty way we used to do this was to simply look for genes that had a better blast match to another gene from their own genome than to genes in any other genome.  The list of genes we identified this way is also provided as a Word document in Supplemental materials.
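For the curious, here is a minimal sketch of that quick and dirty screen. It is not the code that was run at TIGR (which I no longer have); it assumes the blast hits have already been parsed into (query, subject, E-value) tuples, and the function name, the protein IDs, and the species labels are all made up for illustration.  The 1e-15 cutoff is the one reported in the paper's methods.

```python
def lineage_specific_duplicates(hits, species_of, focal, max_evalue=1e-15):
    """Return focal-species proteins whose best hit is a paralog.

    hits: iterable of (query, subject, evalue) tuples, for example parsed
          from tabular BLASTP output (an assumed input format).
    species_of: dict mapping each protein ID to its species.
    focal: the species being screened for recent duplications.
    """
    best_within = {}   # query -> lowest E-value to another focal protein
    best_outside = {}  # query -> lowest E-value to any other species
    for query, subject, evalue in hits:
        # Skip other species' queries, self hits, and weak matches.
        if species_of[query] != focal or query == subject or evalue > max_evalue:
            continue
        table = best_within if species_of[subject] == focal else best_outside
        if evalue < table.get(query, float("inf")):
            table[query] = evalue
    # Keep queries whose best paralog match beats every match elsewhere.
    return sorted(
        q for q, e in best_within.items()
        if e < best_outside.get(q, float("inf"))
    )


# Hypothetical toy data: pf1 matches pf2 better than anything in Arabidopsis.
example_hits = [
    ("pf1", "pf1", 0.0),
    ("pf1", "pf2", 1e-80),
    ("pf1", "at1", 1e-40),
    ("pf2", "pf1", 1e-80),
    ("pf2", "at1", 1e-90),
]
example_species = {"pf1": "Pf", "pf2": "Pf", "at1": "At"}
print(lineage_specific_duplicates(example_hits, example_species, "Pf"))
```

Note that this only flags candidates.  As with the organelle analysis, a best blast match is a rough proxy for evolutionary history, so anything interesting should be checked with a tree.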

Thing 3: Searching for organelle derived genes in the nuclear genome of P. falciparum

The third thing I did for the paper was to search for organelle derived genes in the nuclear genome of Plasmodium.  Specifically I was looking for genes derived from the mitochondrial genome and plastid genome.  For those who do not know, Plasmodium is a member of the Apicomplexa – all organisms in this group have an unusual organelle called the apicoplast.  Though the exact nature of this organelle had been debated, its evolutionary origins were determined by none other than Malcolm Gardner many years earlier (Gardner et al. 1994). They had shown that this organelle was in fact derived from chloroplasts (which themselves are derived from cyanobacteria).  I am ashamed to say that before hanging out with Malcolm and talking about Plasmodium I did not know this.  This finding of a chloroplast in an evolutionary group of eukaryotes that are not particularly closely related to plants is one of the key pieces of evidence in the “secondary endosymbiosis” hypothesis, which proposes that some eukaryotes have brought into themselves as an endosymbiont a single-celled photosynthetic alga which had a chloroplast.

Anyway – here we were – with the first full genome of a member of the Apicomplexa.  And we could use it to discover some new details on plastid evolution and secondary endosymbioses.  So I adapted some methods I had used in analyzing the Arabidopsis genome (see Lin et al. 1999 and AGI 2000), and searched for plastid derived genes in the nuclear genome of Plasmodium.  Why look in the nuclear genome for plastid genes?  Or mitochondrial genes for that matter.  Well, it turns out that genes that were once in the organelle genomes frequently move to the nuclear genome of their “host”.  In fact, a lot of genes move.  So – if you want to study the evolution of an organism’s organelles, it is sometimes more fruitful to look in the nuclear genome than in the actual organelle’s genome.  OK – now back to the Plasmodium genome.  What I was doing was trying to find genes in the nuclear genome that had once been in the plastid genome.  How would you look for these?
To find mitochondrial-derived genes I did blast searches against the same database of genomes used to study the evolution of eukaryotes, but for this I looked for genes in Plasmodium that had decent matches to genes in alpha proteobacteria.  For those I then built phylogenetic trees of each gene and its homologs, then screened through all the trees to look for any in which the gene from Plasmodium grouped inside a clade with sequences from alpha proteobacteria (allowing for mitochondrial genes from other eukaryotes to be in this clade).
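To make the tree screening step concrete, here is a rough sketch of how one might automate it.  This is a simplification and not what I actually ran: it assumes each tree is rooted and already parsed into nested tuples (a leaf is a string, a clade is a tuple), and it only asks whether the sister group of the Plasmodium gene is made up entirely of sequences from the allowed groups.  The function names, leaf names, and group labels are invented for the example.

```python
def leaves(node):
    """Return all leaf labels under a node (leaf = str, clade = tuple)."""
    if isinstance(node, str):
        return [node]
    return [leaf for child in node for leaf in leaves(child)]


def sister_leaves(tree, target):
    """Return the leaves in the sister group(s) of `target`, or None."""
    if isinstance(tree, str):
        return None
    for child in tree:
        if child == target:
            return [leaf for sib in tree if sib != target for leaf in leaves(sib)]
    for child in tree:
        found = sister_leaves(child, target)
        if found is not None:
            return found
    return None


def groups_with(tree, target, group_of, allowed):
    """True if every sequence sister to `target` belongs to an allowed group."""
    sisters = sister_leaves(tree, target)
    return bool(sisters) and all(group_of[name] in allowed for name in sisters)


# Hypothetical gene tree and taxonomic labels.
tree = ((("Pf_gene", ("Rickettsia", "Yeast_mito")), "Caulobacter"),
        ("Ecoli", "Synechocystis"))
group_of = {
    "Rickettsia": "alpha", "Caulobacter": "alpha", "Yeast_mito": "mitochondrion",
    "Ecoli": "gamma", "Synechocystis": "cyanobacteria",
}
print(groups_with(tree, "Pf_gene", group_of, {"alpha", "mitochondrion"}))
```

Real neighbor-joining trees are unrooted and often poorly resolved, so a screen like this is best used to shortlist trees for a human to look at.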
To find plastid derived genes I did a similar screen, except that I searched for genes that grouped in evolutionary trees with genes from cyanobacteria (or eukaryotic genes that were from plastids).  The section of the paper that I helped write is below:

A large number of nuclear-encoded genes in most eukaryotic species trace their evolutionary origins to genes from organelles that have been transferred to the nucleus during the course of eukaryotic evolution. Similarity searches against other complete genomes were used to identify P. falciparum nuclear-encoded genes that may be derived from organellar genomes. Because similarity searches are not an ideal method for inferring evolutionary relatedness (66), phylogenetic analysis was used to gain a more accurate picture of the evolutionary history of these genes. Out of 200 candidates examined, 60 genes were identified as being of probable mitochondrial origin. The proteins encoded by these genes include many with known or expected mitochondrial functions (for example, the tricarboxylic acid (TCA) cycle, protein translation, oxidative damage protection, the synthesis of haem, ubiquinone and pyrimidines), as well as proteins of unknown function. Out of 300 candidates examined, 30 were identified as being of probable plastid origin, including genes with predicted roles in transcription and translation, protein cleavage and degradation, the synthesis of isoprenoids and fatty acids, and those encoding four subunits of the pyruvate dehydrogenase complex. The origin of many candidate organelle-derived genes could not be conclusively determined, in part due to the problems inherent in analysing genes of very high (A + T) content. Nevertheless, it appears likely that the total number of plastid-derived genes in P. falciparum will be significantly lower than that in the plant A. thaliana (estimated to be over 1,000). Phylogenetic analysis reveals that, as with the A. thaliana plastid, many of the genes predicted to be targeted to the apicoplast are apparently not of plastid origin. Of 333 putative apicoplast-targeted genes for which trees were constructed, only 26 could be assigned a probable plastid origin. 
In contrast, 35 were assigned a probable mitochondrial origin and another 85 might be of mitochondrial origin but are probably not of plastid origin (they group with eukaryotes that have not had plastids in their history, such as humans and fungi, but the relationship to mitochondrial ancestors is not clear). The apparent non-plastid origin of these genes could either be due to inaccuracies in the targeting predictions or to the co-option of genes derived from the mitochondria or the nucleus to function in the plastid, as has been shown to occur in some plant species (67).

Thing 4: Analysis of DNA repair genes 

Arnab Pain from the Sanger Center and I analyzed genes predicted to be involved in DNA repair and recombination processes and wrote a section for the paper:

DNA repair processes are involved in maintenance of genomic integrity in response to DNA damaging agents such as irradiation, chemicals and oxygen radicals, as well as errors in DNA metabolism such as misincorporation during DNA replication. The P. falciparum genome encodes at least some components of the major DNA repair processes that have been found in other eukaryotes (111, 112). The core of eukaryotic nucleotide excision repair is present (XPB/Rad25, XPG/Rad2, XPF/Rad1, XPD/Rad3, ERCC1) although some highly conserved proteins with more accessory roles could not be found (for example, XPA/Rad4, XPC). The same is true for homologous recombinational repair with core proteins such as MRE11, DMC1, Rad50 and Rad51 present but accessory proteins such as NBS1 and XRS2 not yet found. These accessory proteins tend to be poorly conserved and have not been found outside of animals or yeast, respectively, and thus may be either absent or difficult to identify in P. falciparum. However, it is interesting that Archaea possess many of the core proteins but not the accessory proteins for these repair processes, suggesting that many of the accessory eukaryotic repair proteins evolved after P. falciparum diverged from other eukaryotes. 

The presence of MutL and MutS homologues including possible orthologues of MSH2, MSH6, MLH1 and PMS1 suggests that P. falciparum can perform post-replication mismatch repair. Orthologues of MSH4 and MSH5, which are involved in meiotic crossing over in other eukaryotes, are apparently absent in P. falciparum. The repair of at least some damaged bases may be performed by the combined action of the four base excision repair glycosylase homologues and one of the apurinic/apyrimidinic (AP) endonucleases (homologues of Xth and Nfo are present). Experimental evidence suggests that this is done by the long-patch pathway (113). 

The presence of a class II photolyase homologue is intriguing, because it is not clear whether P. falciparum is exposed to significant amounts of ultraviolet irradiation during its life cycle. It is possible that this protein functions as a blue-light receptor instead of a photolyase, as do members of this gene family in some organisms such as humans. Perhaps most interesting is the apparent absence of homologues of any of the genes encoding enzymes known to be involved in non-homologous end joining (NHEJ) in eukaryotes (for example, Ku70, Ku86, Ligase IV and XRCC1) (112). NHEJ is involved in the repair of double strand breaks induced by irradiation and chemicals in other eukaryotes (such as yeast and humans), and is also involved in a few cellular processes that create double strand breaks (for example, VDJ recombination in the immune system in humans). The role of NHEJ in repairing radiation-induced double strand breaks varies between species (114). For example, in humans, cells with defects in NHEJ are highly sensitive to γ-irradiation while yeast mutants are not. Double strand breaks in yeast are repaired primarily by homologous recombination. As NHEJ is involved in regulating telomere stability in other organisms, its apparent absence in P. falciparum may explain some of the unusual properties of the telomeres in this species (115).

Back to the story
Anyway … back to the story.  I do not have current access to all of TIGR’s old computer systems which is where my searches for the genome paper reside.  But I figured I might have some notes somewhere on my computer about what blast parameters I used for these searches.  And amazingly I did.  As I was getting ready to write back to Malcolm and to the person who had asked for the information I decided to double check to see what was in the paper.  And amazingly, much of the detail was right there all along.

Plasmodium falciparum proteins were searched against a database of proteins from all complete genomes as well as from a set of organelle, plasmid and viral genomes. Putative recently duplicated genes were identified as those encoding proteins with better BLASTP matches (based on E value with a 10^-15 cutoff) to other proteins in P. falciparum than to proteins in any other species. Proteins of possible organellar descent were identified as those for which one of the top six prokaryotic matches (based on E value) was to either a protein encoded by an organelle genome or by a species related to the organelle ancestors (members of the Rickettsia subgroup of the α-Proteobacteria or cyanobacteria). Because BLAST matches are not an ideal method of inferring evolutionary history, phylogenetic analysis was conducted for all these proteins. For phylogenetic analysis, all homologues of each protein were identified by BLASTP searches of complete genomes and of a non-redundant protein database. Sequences were aligned using CLUSTALW, and phylogenetic trees were inferred using the neighbour-joining algorithms of CLUSTALW and PHYLIP. For comparative analysis of eukaryotes, the proteomes of all eukaryotes for which complete genomes are available (except the highly reduced E. cuniculi) were searched against each other. The proportion of proteins in each eukaryotic species that had a BLASTP match in each of the other eukaryotic species was determined, and used to infer a ‘whole-genome tree’ using the neighbour-joining algorithm. Possible eukaryotic conserved and specific proteins were identified as those with matches to all the complete eukaryotic genomes (10^-30 E-value cutoff) but without matches to any complete prokaryotic genome (10^-15 cutoff).
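The last rule in that methods paragraph (a match in every complete eukaryotic genome, no match in any complete prokaryotic genome) is simple enough to write down as a filter.  The sketch below is my reconstruction from the text, not the original scripts: it reads the two cutoffs as E-values of 1e-30 and 1e-15, and it assumes the best BLASTP E-value per protein per genome has already been tabulated.  The function name and the genome labels are placeholders.

```python
def conserved_eukaryote_specific(best_evalue, eukaryotes, prokaryotes,
                                 euk_cutoff=1e-30, prok_cutoff=1e-15):
    """Return proteins matched in all eukaryotes and in no prokaryote.

    best_evalue: dict mapping protein ID -> {genome name: best E-value}.
                 A genome with no hit at all is simply absent from the
                 inner dict (an assumed convention).
    """
    no_hit = float("inf")
    selected = []
    for protein, by_genome in best_evalue.items():
        in_all_euks = all(
            by_genome.get(genome, no_hit) <= euk_cutoff for genome in eukaryotes
        )
        in_any_prok = any(
            by_genome.get(genome, no_hit) <= prok_cutoff for genome in prokaryotes
        )
        if in_all_euks and not in_any_prok:
            selected.append(protein)
    return sorted(selected)


# Hypothetical table of best hits for three P. falciparum proteins.
example = {
    "pf_tubulin": {"yeast": 1e-120, "human": 1e-130, "plant": 1e-125},
    "pf_ef_tu": {"yeast": 1e-90, "human": 1e-95, "plant": 1e-99, "ecoli": 1e-60},
    "pf_orphan": {"yeast": 1e-35, "human": 1e-10},
}
print(conserved_eukaryote_specific(
    example, ["yeast", "human", "plant"], ["ecoli", "archaeon"]))
```

Using a stricter cutoff for "present in eukaryotes" than for "absent from prokaryotes" makes the list conservative in both directions, which is part of why lists like this differ between studies.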

Alas, I cannot for the life of me find what other parameters I used for the blastp searches.  I am 99.9999% sure I used default settings but alas, I don’t know what default settings for blast were in that era.  And I am not even sure which version of blastp was installed on the TIGR computer systems then.  I certainly need to do a better job of making sure everything I do is truly reproducible.

Reproducibility

This all brings me to the real point of this story.  Reproducibility.  It is a big deal.  Anyone should be able to reproduce what was done in a study.  And alas, it is difficult to do that when not all the methods are fully described.  And one should also provide intermediate results so that people do not have to redo everything you did in a study but can just reproduce part of it.   It would be good to have, for example, released all the phylogenetic trees from the analysis of organellar genes in Plasmodium.  Alas, I do not seem to have all of these files as they were stored in a directory at TIGR dedicated to this genome project and as I am no longer at TIGR I do not have ready access to that material.  It is probably still lounging around somewhere on the JCVI computer systems (TIGR alas, no longer officially exists … it was swallowed by the J. Craig Venter Institute …).  But I will keep digging and I will post them to some place like FigShare if/when I find them.

Perhaps more importantly, I will be working with my lab to make sure that in the future we store/record/make available EVERYTHING that would allow people to reproduce, re-analyze, re-jigger, re-whatever anything from our papers.
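One low-tech way to do this is to write a small record next to every analysis output, saying which program and version was run, with which command and settings, on which inputs.  The sketch below is just one possible shape for such a record, not something my lab currently uses; the function names, field names, and file names are hypothetical.  Note that the version is deliberately left as "unknown" in the example, since that is exactly the detail I cannot recover for the 2002 searches.

```python
import json
import platform
import sys
from datetime import datetime, timezone


def make_run_record(tool, version, command, parameters, inputs):
    """Bundle what someone would need to rerun one analysis step."""
    return {
        "tool": tool,
        "tool_version": version,
        "command": command,          # the exact command line, as a list
        "parameters": parameters,    # settings worth stating explicitly
        "inputs": inputs,            # input files or database names
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }


def save_run_record(record, path):
    """Write the record as JSON so it can sit next to the results."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(record, handle, indent=2, sort_keys=True)


# Hypothetical example; the file and database names are placeholders.
record = make_run_record(
    tool="blastp",
    version="unknown",
    command=["blastp", "-query", "proteins.faa", "-db", "all_genomes",
             "-evalue", "1e-15"],
    parameters={"evalue": 1e-15, "other_settings": "defaults"},
    inputs=["proteins.faa", "all_genomes"],
)
print(json.dumps(record, indent=2, sort_keys=True))
```

Saving something like this alongside the trees and hit tables would have answered the email that started this whole post.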

The key lesson – plan in advance for how you are going to share results, methods, data, etc …

Video, slides & storify of my talk on "#Phylogeny-driven approaches to #genomics and #metagenomics" from #CSMUBC2012

Just got back from the Canadian Society for Microbiology meeting where I gave the keynote talk on the last day of the meeting (Saturday).  Was a very short, but good trip.  Got to see some key collaborators and colleagues and Vancouver was very nice for the few days I was there.

I recorded my talk on my laptop using the Keynote “Record Slideshow” function.  I then exported it to Slideshare (just the slides – no audio) and to Youtube (video of slides with audio).  They are posted below.  I also did a mini storification of my talk which is also below.

View the story “My talk at CSMUBC2012” on Storify: http://storify.com/phylogenomics/my-talk-at-csmubc2012

Does phylogeny matter? (In Eco-Evo meta-analyses) … Apparently, yes, but it depends.

As many may know – I am pretty obsessed with the uses of phylogeny in biological studies.  In fact, one could say this has driven almost all of my work.  Thus when an email went around a little bit ago about an article for a journal club at UC Davis where the title begins with “Does phylogeny matter?”, well, I had to take a look.  Alas, I was a bit worried when I saw the article was in Ecology Letters because I am at home and was not sure about access policies for this journal.

But I was pleasantly surprised to get full access without any library – VPN login to the following article: Does phylogeny matter? Assessing the impact of phylogenetic information in ecological meta-analysis – Chamberlain – 2012 – Ecology Letters – Wiley Online Library.  I cannot figure out WHY it is freely available right now, nor how long it will be, but I took the chance to look the article over.

And I was even more pleasantly surprised to look over the article.  Many meta-analyses can seem forced – if not almost unbearable to look through.  But this one is very well done.  Basically they did a massive comparison of conclusions that one could reach when one either does or does not take into account the phylogenetic non-independence of taxa when conducting meta-analyses in evo-eco studies.  They searched published literature for meta-analyses and then .. well I will use their words here (from the end of their introduction):

Herein, we re-analyse datasets from previously published meta-analytic studies, comparing results of traditional and phylogenetic meta-analyses. In addition, we attempt to explain variation in the effect of phylogenetic information on meta-analytic outcomes by examining characteristics of phylogenies. We ask: (1) how does accounting for phylogenetic non-independence change results of individual meta-analyses? and (2) across datasets, what characteristics of phylogenies explain changes in effect size for phylogenetic vs. traditional meta-analyses? As a complement to our main questions, in Appendix A, we also ask (3) how does accounting for phylogenetic non-independence affect model fit of individual meta-analyses? and (4) across datasets, what characteristics of phylogenies explain variation in the relative fit of phylogenetic meta-analyses? Despite the many compelling reasons to incorporate phylogenetic information into meta-analyses that involve multiple species, investigators often use model comparison criteria, such as Akaike’s Information Criterion (AIC) to assess fit of phylogenetic vs. traditional meta-analytic models. We found a clear bias in relation to phylogeny size for one of the two methods currently used to quantify relative model fit (Q-based AIC), thus our findings have important implications for meta-analysts using such model comparisons (see Appendix A for details).

And the key conclusions are

Here, we have shown that incorporating phylogenies influences ecological meta-analysis outcomes, in many cases changing whether the observed effect size differs significantly from zero. We also show that the degree of difference between traditional and phylogenetic meta-analyses depends on key characteristics of phylogenies. Despite this potential complication, we strongly recommend incorporating phylogenetic information into ecological meta-analyses to account for species non-independence.

They also offer up three main recommendations for consideration

To conclude, we outline three recommendations for the use of phylogenetic meta-analyses in ecology and evolutionary biology:

  1. Use phylogenetic meta-analysis, but note that some response metrics are less likely to be affected by phylogenetic methods.
  2. Include as many species as possible.
  3. Be aware that phylogeny shape may influence meta-analytic outcomes. 
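To get a feel for why phylogenetic non-independence can change a pooled effect size, here is a toy calculation.  It is my own illustration and not the method or software used in the paper: it pools effect sizes by generalized least squares, where the covariance between two species is assumed to be their phylogenetic correlation times the square root of the product of their sampling variances.  All names and numbers are made up.

```python
def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in reversed(range(n)):
        total = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - total) / aug[row][row]
    return x


def pooled_effect(effects, variances, correlation=None):
    """GLS pooled mean; with no correlation matrix, species are independent."""
    n = len(effects)
    if correlation is None:
        correlation = [[1.0 if i == j else 0.0 for j in range(n)]
                       for i in range(n)]
    cov = [[correlation[i][j] * (variances[i] * variances[j]) ** 0.5
            for j in range(n)] for i in range(n)]
    weighted_effects = solve(cov, effects)     # V^-1 y
    weighted_ones = solve(cov, [1.0] * n)      # V^-1 1
    return sum(weighted_effects) / sum(weighted_ones)


# Two close relatives with large effects, one distant species with a small one.
effects = [1.0, 1.2, 0.2]
variances = [0.1, 0.1, 0.1]
phylo_corr = [[1.0, 0.9, 0.0],
              [0.9, 1.0, 0.0],
              [0.0, 0.0, 1.0]]
print("traditional:", pooled_effect(effects, variances))
print("phylogenetic:", pooled_effect(effects, variances, phylo_corr))
```

In this toy case the two close relatives stop counting as two fully independent data points, so the pooled estimate moves toward the distantly related species.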

Definitely worth a look …

Chamberlain, S., Hovick, S., Dibble, C., Rasmussen, N., Van Allen, B., Maitner, B., Ahern, J., Bell-Dereske, L., Roy, C., Meza-Lopez, M., Carrillo, J., Siemann, E., Lajeunesse, M., & Whitney, K. (2012). Does phylogeny matter? Assessing the impact of phylogenetic information in ecological meta-analysis Ecology Letters, 15 (6), 627-636 DOI: 10.1111/j.1461-0248.2012.01776.x


More phylogeny fun from Rod Page: TreeBase -> Genome Browser

More phylogeny fun from Rod Page.  Been reading up on his blog post: iPhylo: Browsing TreeBASE using a genome browser-like interface.  Seems very cool.

This looks useful: Online Phylogeny Course from Rod Page

If you have an interest in phylogeny then this is definitely worth checking out – Rod Page has an online phylogeny course: Phylogeny.  It has some nice links in there to other online resources, some videos of talks, and various phylogeny resources.

Converting repeated emails into FAQs: Today’s = How to Get Figures/Details from 2009 GEBA paper

OK I am now officially completely driven insane by email.  As part of my attempt to reduce email communication with people I am going to start posting some of the emails I get often into FAQs.

Today’s email relates to the 2009 paper on a “Phylogeny driven genomic encyclopedia of bacteria and archaea” for which I was the senior and corresponding author.  The email is asking for higher resolution versions of the figures that were published in the paper.  This person and many others have asked for a higher res. version of our “genome tree” which was Figure 1.  Here is the version from the paper

But alas, as a JPG when you zoom in you can’t see the text very well.  And about 30 or so people, maybe more, have asked for a higher res. version.  Well, the simplest way to get this figure with legible fonts when zoomed in is to get the PDF of the paper and zoom in on it.  But that may not be for everyone – so here is a link to the PDF of the figure that I posted on Posterous (Blogger does not allow PDF uploads).  I also posted PDFs of the other figures.

Many people also ask for the treefile (which is basically a coded version of the phylogenetic tree for viewing and analysis).  I am directly posting the treefile below and have also submitted it to “Treebase” (which we should have done before).  Enjoy … and in the future I will be pointing people to this page when they ask for the figure/treefile. Not sure this will have saved me any time but am sick of writing a lot of this in emails back to people …
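One warning if you copy the treefile from a web page: blog software tends to turn the plain apostrophes around the taxon names into “smart” quotes and to wrap long lines, and most tree viewers will choke on that.  Here is a small sketch, assuming labels are single-quoted and each is followed by a branch length, that cleans up those characters and pulls out the tip labels.  The function name is mine and this is not a full Newick parser.

```python
import re

# Typographic quotes and primes that often replace the plain apostrophe.
SMART_QUOTES = str.maketrans({"\u2018": "'", "\u2019": "'", "\u2032": "'"})


def tip_labels(newick):
    """Return quoted tip labels from a Newick string copied from a web page."""
    cleaned = newick.translate(SMART_QUOTES)
    # Drop whitespace, including line breaks inserted in the middle of names.
    cleaned = "".join(cleaned.split())
    # A tip label is a quoted name followed directly by ':' and its length.
    return re.findall(r"'([^']+)':", cleaned)


# A short made-up fragment with the same kinds of damage.
fragment = (
    "TREE \u2018Tree1\u2019 = ((\u2018GEBA_Kangiella_koreensis\u2019:0.17,"
    "\u2019Escherichia_\ncoli_HS\u2032:0.0001):0.05,"
    "'Vibrio_fischeri_ES114':0.05);"
)
labels = tip_labels(fragment)
print(labels)
print([name for name in labels if name.startswith("GEBA_")])
```

The genomes sequenced as part of the project are the ones whose labels start with GEBA_, so a filter like the last line is a quick way to list them.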

#NEXUS
BEGIN trees;
TREE ‘Tree1’ = (((((‘GEBA_Thermanaerovibrio_acidaminovorans’:0.190689,’GEBA_Dethiosulfovibrio_peptidovorans’:0.263143):0.276658,((((((((((((((((((((((((((((((((((((((((‘Escherichia_coli_O157_H7_str_Sakai’:0.0,’Escherichia_coli_str_K12_substr_MG1655′:0.0):1.51E-4,’Escherichia_coli_str_K12_substr_DH10B’:0.0):2.0E-5,(‘Escherichia_coli_ATCC_8739′:1.7E-4,’Escherichia_coli_HS’:1.71E-4):1.7E-4):0.0,(‘Escherichia_coli_536′:0.001196,’Escherichia_coli_APEC_O1’:0.0):0.0):0.0,((‘Shigella_flexneri_2a_str_301′:0.001883,’Shigella_flexneri_2a_str_2457T’:0.0):1.71E-4,’Shigella_flexneri_5_str_8401′:3.44E-4):5.12E-4):0.0,(‘Shigella_boydii_CDC_3083_94′:5.13E-4,’Shigella_boydii_Sb227′:8.57E-4):3.42E-4):0.0,’Escherichia_coli_SMS_3_5′:0.0):0.0,’Escherichia_coli_O157_H7_EDL933′:0.0):0.0,’Escherichia_coli_CFT073′:0.0):0.0,’Escherichia_coli_UTI89′:0.0):0.0,’Escherichia_coli_E24377A’:0.0):0.0,’Shigella_dysenteriae_Sd197′:0.001025):1.69E-4,’Shigella_sonnei_Ss046′:0.001028):0.012167,(((((((‘Salmonella_enterica_subsp_enterica_serovar_Typhi_str_Ty2′:0.0,’Salmonella_enterica_subsp_enterica_serovar_Typhi_str_CT18′:1.7E-4):3.4E-4,’Salmonella_typhimurium_LT2′:0.0):0.0,’Salmonella_enterica_subsp_enterica_serovar_Paratyphi_A_str_ATCC_9150′:8.5E-4):2.0E-6,’Salmonella_enterica_subsp_enterica_serovar_Paratyphi_B_str_SPB7′:1.8E-4):1.69E-4,’Salmonella_enterica_subsp_enterica_serovar_Choleraesuis_str_SC_B67′:3.48E-4):0.001774,’Salmonella_enterica_subsp_arizonae_serovar_62_z4_z23_’:0.001673):0.008533,’Citrobacter_koseri_ATCC_BAA_895′:0.007106):0.003103):0.00437,’Klebsiella_pneumoniae_subsp_pneumoniae_MGH_78578′:0.012094):0.004392,’Enterobacter_sakazakii_ATCC_BAA_894′:0.019781):0.007021,’Enterobacter_sp_638′:0.019532):0.027075,(‘Erwinia_tasmaniensis’:0.033979,’Serratia_proteamaculans_568′:0.033607):0.013319):0.008604,’Erwinia_carotovora_subsp_atroseptica_SCRI1043′:0.03122):0.012206,((((((‘Yersinia_pestis_KIM’:0.0,’Yersinia_pseudotuberculosis_IP_31758′:0.0):0.0,’Yersinia_pestis_CO92′:0.0):0.0,’Yersinia_pseudotu
berculosis_IP_32953′:0.0):0.0,((((‘Yersinia_pseudotuberculosis_PB1_’:0.0,’Yersinia_pestis_Pestoides_F’:0.0):0.0,(‘Yersinia_pseudotuberculosis_YPIII’:0.0,’Yersinia_pestis_Nepal516′:0.0):0.0):0.0,’Yersinia_pestis_Antiqua’:1.72E-4):0.0,’Yersinia_pestis_biovar_Microtus_str_91001′:0.0):0.0):0.0,’Yersinia_pestis_Angola’:3.44E-4):0.00575,’Yersinia_enterocolitica_subsp_enterocolitica_8081′:0.009468):0.030284):0.008222,((((‘Candidatus_Blochmannia_pennsylvanicus_str_BPEN’:0.111832,’Candidatus_Blochmannia_floridanus’:0.199319):0.14508,(((‘Buchnera_aphidicola_str_APS_Acyrthosiphon_pisum_’:0.081646,’Buchnera_aphidicola_str_Sg_Schizaphis_graminum_’:0.078489):0.090786,’Buchnera_aphidicola_str_Bp_Baizongia_pistaciae_’:0.232468):0.043831,(‘Wigglesworthia_glossinidia_endosymbiont_of_Glossina_brevipalpis’:0.30099,’Buchnera_aphidicola_str_Cc_Cinara_cedri_’:0.303142):0.063766):0.064196):0.054991,’Baumannia_cicadellinicola_str_Hc_Homalodisca_coagulata_’:0.166451):0.081445,’Sodalis_glossinidius_str_morsitans_’:0.026696):0.024889):0.01239,’Photorhabdus_luminescens_subsp_laumondii_TTO1′:0.048195):0.050763,((((‘Haemophilus_somnus_129PT’:9.73E-4,’Haemophilus_somnus_2336′:7.28E-4):0.038446,’Pasteurella_multocida_subsp_multocida_str_Pm70′:0.033371):0.011204,((‘Mannheimia_succiniciproducens_MBEL55E’:0.02899,’Actinobacillus_succinogenes_130Z’:0.035825):0.013874,(((‘Haemophilus_influenzae_Rd_KW20′:0.002043,’Haemophilus_influenzae_PittGG’:0.00105):5.49E-4,’Haemophilus_influenzae_86_028NP’:5.15E-4):2.24E-4,’Haemophilus_influenzae_PittEE’:0.001026):0.040304):0.008478):0.011217,(((‘Actinobacillus_pleuropneumoniae_L20′:0.00137,’Actinobacillus_pleuropneumoniae_serovar_7_str_AP76′:3.39E-4):1.74E-4,’Actinobacillus_pleuropneumoniae_serovar_3_str_JL03′:0.003062):0.011765,’Haemophilus_ducreyi_35000HP’:0.024577):0.038512):0.080775):0.047571,(((((‘Vibrio_cholerae_O1_biovar_eltor_str_N16961′:3.64E-4,’Vibrio_cholerae_O395’:8.38E-4):0.041305,(‘Vibrio_vulnificus_CMCP6′:5.27E-4,’Vibrio_vulnificus_YJ016’:1.72E-4):0.
020829):0.011445,(‘Vibrio_parahaemolyticus_RIMD_2210633′:0.00625,’Vibrio_harveyi_ATCC_BAA_1116′:0.010839):0.011481):0.027253,’Vibrio_fischeri_ES114′:0.05376):0.027779,’Photobacterium_profundum_SS9’:0.072105):0.054993):0.023142,(‘Aeromonas_hydrophila_subsp_hydrophila_ATCC_7966′:0.006284,’Aeromonas_salmonicida_subsp_salmonicida_A449’:0.0108):0.109795):0.020707,((((((‘Shewanella_halifaxensis_HAW_EB4′:0.008905,’Shewanella_pealeana_ATCC_700345’:0.003805):0.025082,(‘Shewanella_sediminis_HAW_EB3′:0.019306,’Shewanella_woodyi_ATCC_51908′:0.012326):0.017175):0.016987,’Shewanella_loihica_PV_4’:0.021173):0.024752,((((((‘Shewanella_sp_ANA_3′:0.002092,’Shewanella_sp_MR_4′:9.8E-4):7.54E-4,’Shewanella_sp_MR_7′:3.48E-4):0.003036,’Shewanella_oneidensis_MR_1’:0.002265):0.010027,(‘Shewanella_sp_W3_18_1′:3.26E-4,’Shewanella_putrefaciens_CN_32’:1.85E-4):0.001996):0.006463,((‘Shewanella_baltica_OS155′:0.001236,’Shewanella_baltica_OS195′:0.0):5.23E-4,’Shewanella_baltica_OS185’:6.84E-4):0.009966):0.015626,(‘Shewanella_denitrificans_OS217′:0.017657,’Shewanella_frigidimarina_NCIMB_400′:0.02512):0.022711):0.020469):0.018183,’Shewanella_amazonensis_SB2B’:0.035265):0.092511,’Psychromonas_ingrahamii_37′:0.170059):0.021406):0.025447,((‘Idiomarina_loihiensis_L2TR’:0.129215,’Pseudoalteromonas_atlantica_T6c’:0.121204):0.026755,(‘Pseudoalteromonas_haloplanktis_TAC125′:0.120366,’Colwellia_psychrerythraea_34H’:0.144993):0.031198):0.015669):0.082415,’GEBA_Kangiella_koreensis’:0.172853):0.0486,((((((‘Acinetobacter_baumannii_ACICU’:8.84E-4,’Acinetobacter_baumannii_ATCC_17978′:0.003413):5.27E-4,(‘Acinetobacter_baumannii_SDF’:8.48E-4,’Acinetobacter_baumannii_AYE’:1.7E-4):0.0):0.020056,’Acinetobacter_sp_ADP1′:0.027905):0.107102,((‘Psychrobacter_arcticus_273_4′:0.006888,’Psychrobacter_cryohalolentis_K5′:0.006086):0.066403,’Psychrobacter_sp_PRwf_1′:0.060503):0.121017):0.15381,’Alcanivorax_borkumensis_SK2’:0.175254):0.040455,(((‘Chromohalobacter_salexigens_DSM_3043′:0.167612,’Marinomonas_sp_MWYL1’:0.175719):0.03
8094,(‘Marinobacter_aquaeolei_VT8′:0.1125,’Hahella_chejuensis_KCTC_2396’:0.130701):0.053467):0.019075,(((((((((‘Pseudomonas_syringae_pv_syringae_B728a’:0.001801,’Pseudomonas_syringae_pv_phaseolicola_1448A’:0.002321):0.002597,’Pseudomonas_syringae_pv_tomato_str_DC3000′:0.003918):0.02921,’Pseudomonas_fluorescens_Pf_5′:0.014609):0.00751,’Pseudomonas_fluorescens_PfO_1′:0.010753):0.026949,((((‘Pseudomonas_putida_KT2440′:1.71E-4,’Pseudomonas_putida_F1′:0.0):0.001622,’Pseudomonas_putida_GB_1′:0.003034):0.003248,’Pseudomonas_putida_W619′:0.007868):0.003675,’Pseudomonas_entomophila_L48′:0.007306):0.02498):0.030505,’Pseudomonas_mendocina_ymp’:0.030084):0.018411,’Pseudomonas_stutzeri_A1501′:0.035067):0.021645,((‘Pseudomonas_aeruginosa_PAO1′:0.0,’Pseudomonas_aeruginosa_UCBPP_PA14′:1.71E-4):0.001113,’Pseudomonas_aeruginosa_PA7’:0.001132):0.032184):0.118537,(‘Saccharophagus_degradans_2_40′:0.107494,’Cellvibrio_japonicus_Ueda107’:0.109149):0.090952):0.024456):0.029903):0.041423):0.029921,(((‘Legionella_pneumophila_str_Corby’:0.001506,’Legionella_pneumophila_subsp_pneumophila_str_Philadelphia_1′:5.43E-4):3.1E-4,(‘Legionella_pneumophila_str_Lens’:0.00222,’Legionella_pneumophila_str_Paris’:0.001831):3.76E-4):0.255271,((‘Coxiella_burnetii_RSA_331′:0.00102,’Coxiella_burnetii_RSA_493′:5.1E-4):0.001141,’Coxiella_burnetii_Dugway_5J108_111’:0.001413):0.286937):0.063734):0.020357,((((((((‘Francisella_tularensis_subsp_tularensis_SCHU_S4′:0.0,’Francisella_tularensis_subsp_tularensis_FSC198′:0.0):0.001507,’Francisella_tularensis_subsp_tularensis_WY96_3418’:3.35E-4):3.34E-4,((‘Francisella_tularensis_subsp_holarctica_OSU18′:0.0,’Francisella_tularensis_subsp_holarctica’:5.02E-4):0.0,’Francisella_tularensis_subsp_holarctica_FTNF002_00′:3.35E-4):0.002852):0.0,’Francisella_tularensis_subsp_mediasiatica_FSC147′:0.002516):0.001333,’Francisella_tularensis_subsp_novicida_U112′:6.81E-4):0.02679,’Francisella_philomiragia_subsp_philomiragia_ATCC_25017′:0.028537):0.343835,’Dichelobacter_nodosus_VCS1703A’:0.
35117):0.058577,((‘Candidatus_Ruthia_magnifica_str_Cm_Calyptogena_magnifica_’:0.048117,’Candidatus_Vesicomyosocius_okutanii_HA’:0.070088):0.299069,’Thiomicrospira_crunogena_XCL_2′:0.218606):0.069384):0.027865):0.019953,(((‘Halorhodospira_halophila_SL1′:0.175731,’Alkalilimnicola_ehrlichei_MLHE_1′:0.10505):0.103584,’Nitrosococcus_oceani_ATCC_19707’:0.225193):0.031953,((((((‘Xanthomonas_campestris_pv_campestris_str_8004′:3.43E-4,’Xanthomonas_campestris_pv_campestris_str_ATCC_33913′:5.15E-4):3.45E-4,’Xanthomonas_campestris_pv_campestris’:1.7E-4):0.006192,(((‘Xanthomonas_oryzae_pv_oryzae_KACC10331′:1.75E-4,’Xanthomonas_oryzae_pv_oryzae_MAFF_311018′:0.0):0.0,’Xanthomonas_oryzae_pv_oryzae_PXO99A’:0.002957):0.008947,(‘Xanthomonas_campestris_pv_vesicatoria_str_85_10′:0.001996,’Xanthomonas_axonopodis_pv_citri_str_306′:0.00127):0.004934):0.005782):0.031343,’Stenotrophomonas_maltophilia_K279a’:0.060056):0.030521,(((‘Xylella_fastidiosa_M23′:0.0,’Xylella_fastidiosa_Temecula1′:1.74E-4):0.005446,’Xylella_fastidiosa_M12′:0.003419):0.005622,’Xylella_fastidiosa_9a5c’:0.003688):0.09616):0.251782,’Methylococcus_capsulatus_str_Bath’:0.208011):0.028921):0.02517):0.056558,((((((‘Nitrosomonas_europaea_ATCC_19718′:0.04084,’Nitrosomonas_eutropha_C91′:0.04959):0.144982,’Nitrosospira_multiformis_ATCC_25196′:0.08516):0.076439,’Thiobacillus_denitrificans_ATCC_25259′:0.145726):0.023371,’Methylobacillus_flagellatus_KT’:0.155554):0.027233,(((‘Azoarcus_sp_BH72′:0.052172,’Azoarcus_sp_EbN1′:0.058776):0.062917,’Dechloromonas_aromatica_RCB’:0.113139):0.05407,((((((‘Polynucleobacter_sp_QLW_P1DMWA_1′:0.011,’Polynucleobacter_necessarius_STIR1’:0.023235):0.136644,((((‘Cupriavidus_taiwanensis’:0.005601,’Ralstonia_eutropha_H16′:0.005264):0.006279,’Ralstonia_eutropha_JMP134′:0.007605):0.010799,’Ralstonia_metallidurans_CH34′:0.013934):0.029247,(‘Ralstonia_solanacearum_GMI1000′:0.02203,’Ralstonia_pickettii_12J’:0.021784):0.025195):0.037629):0.022794,(((‘Burkholderia_phytofirmans_PsJN’:0.004098,’Burkholderia_xenov
orans_LB400′:0.002629):0.017129,’Burkholderia_phymatum_STM815′:0.017185):0.015186,(((((‘Burkholderia_ambifaria_AMMD’:0.001201,’Burkholderia_ambifaria_MC40_6′:1.65E-4):0.00415,(((‘Burkholderia_cenocepacia_AU_1054′:0.0,’Burkholderia_cenocepacia_HI2424′:1.71E-4):1.71E-4,’Burkholderia_cenocepacia_MC0_3′:0.0):0.002317,’Burkholderia_sp_383′:0.007333):0.002889):0.00261,’Burkholderia_vietnamiensis_G4′:0.003346):0.015421,’Burkholderia_multivorans_ATCC_17616’:0.005208):0.005853,((((((((‘Burkholderia_mallei_NCTC_10247′:0.0,’Burkholderia_mallei_NCTC_10229′:1.72E-4):3.43E-4,’Burkholderia_mallei_ATCC_23344′:1.91E-4):0.0,’Burkholderia_mallei_SAVP1′:0.0):1.71E-4,’Burkholderia_pseudomallei_K96243′:5.16E-4):0.0,’Burkholderia_pseudomallei_1710b’:0.0):0.0,’Burkholderia_pseudomallei_668′:0.0):0.0,’Burkholderia_pseudomallei_1106a’:1.71E-4):0.004036,’Burkholderia_thailandensis_E264′:0.005011):0.007936):0.019911):0.063072):0.026718,(‘Herminiimonas_arsenicoxydans’:0.021781,’Janthinobacterium_sp_Marseille’:0.013485):0.104015):0.024622,((‘Methylibium_petroleiphilum_PM1′:0.071447,’Leptothrix_cholodnii_SP_6’:0.085431):0.041536,((((‘Delftia_acidovorans_SPH_1′:0.061373,’Acidovorax_sp_JS42′:0.025396):0.016116,’Acidovorax_avenae_subsp_citrulli_AAC00_1′:0.025052):0.01365,’Verminephrobacter_eiseniae_EF01_2’:0.072504):0.026641,((‘Polaromonas_naphthalenivorans_CJ2′:0.031581,’Polaromonas_sp_JS666′:0.022265):0.044515,’Rhodoferax_ferrireducens_T118’:0.06856):0.022175):0.065841):0.119404):0.025568,((((‘Bordetella_bronchiseptica_RB50′:1.72E-4,’Bordetella_parapertussis_12822′:0.001377):1.18E-4,’Bordetella_pertussis_Tohama_I’:0.001433):0.021534,’Bordetella_avium_197N’:0.031614):0.010355,’Bordetella_petrii_DSM_12804′:0.021324):0.132685):0.071258):0.027059):0.037078,(((‘Neisseria_gonorrhoeae_FA_1090′:5.0E-4,’Neisseria_gonorrhoeae_NCCP11945’:0.001485):0.006103,(((‘Neisseria_meningitidis_MC58′:0.001366,’Neisseria_meningitidis_Z2491′:0.00101):7.87E-4,’Neisseria_meningitidis_FAM18′:0.001575):5.57E-4,’Neisseria_meni
ngitidis_053442′:0.002602):0.001738):0.167934,’Chromobacterium_violaceum_ATCC_12472’:0.09855):0.087498):0.124859):0.205339,(((((((((((((‘Rhizobium_etli_CIAT_652′:0.003437,’Rhizobium_etli_CFN_42′:0.005094):0.006043,’Rhizobium_leguminosarum_bv_viciae_3841′:0.011645):0.036083,’Agrobacterium_tumefaciens_str_C58’:0.045233):0.016839,(‘Sinorhizobium_meliloti_1021′:0.006124,’Sinorhizobium_medicae_WSM419’:0.006258):0.029756):0.070623,(((((‘Bartonella_henselae_str_Houston_1′:0.020444,’Bartonella_quintana_str_Toulouse’:0.029388):0.009808,’Bartonella_tribocorum_CIP_105476′:0.025338):0.031387,’Bartonella_bacilliformis_KC583′:0.065286):0.098029,(((((‘Brucella_canis_ATCC_23365′:6.88E-4,’Brucella_suis_1330′:1.72E-4):6.89E-4,’Brucella_suis_ATCC_23445’:0.001033):0.0,(((‘Brucella_abortus_S19′:1.76E-4,’Brucella_melitensis_biovar_Abortus_2308′:1.72E-4):0.0,’Brucella_abortus_biovar_1_str_9_941′:0.0):6.84E-4,’Brucella_melitensis_16M’:0.002611):1.71E-4):0.0,’Brucella_ovis_ATCC_25840′:0.001206):0.013975,’Ochrobactrum_anthropi_ATCC_49188′:0.01275):0.047742):0.034988,(‘Mesorhizobium_loti_MAFF303099′:0.077568,’Mesorhizobium_sp_BNC1’:0.07419):0.027327):0.021349):0.086016,((((((‘Methylobacterium_extorquens_PA1′:0.011368,’Methylobacterium_populi_BJ001′:0.004799):0.03256,’Methylobacterium_radiotolerans_JCM_2831′:0.045067):0.041026,’Methylobacterium_sp_4_46′:0.058056):0.074338,’Beijerinckia_indica_subsp_indica_ATCC_9039’:0.149061):0.026115,(‘Xanthobacter_autotrophicus_Py2′:0.058512,’Azorhizobium_caulinodans_ORS_571’:0.044976):0.079042):0.02072,((((‘Bradyrhizobium_sp_ORS278′:0.007711,’Bradyrhizobium_sp_BTAi1′:0.006207):0.029183,’Bradyrhizobium_japonicum_USDA_110’:0.033032):0.020122,(((‘Rhodopseudomonas_palustris_TIE_1′:0.0,’Rhodopseudomonas_palustris_CGA009’:3.43E-4):0.030733,(‘Rhodopseudomonas_palustris_BisB5′:0.012815,’Rhodopseudomonas_palustris_HaA2’:0.011805):0.010984):0.018577,(‘Rhodopseudomonas_palustris_BisB18′:0.037297,’Rhodopseudomonas_palustris_BisA53’:0.031869):0.014739):0.023424):0.01122
,(‘Nitrobacter_winogradskyi_Nb_255′:0.018532,’Nitrobacter_hamburgensis_X14′:0.017835):0.041031):0.111343):0.056949):0.041219,’Parvibaculum_lavamentivorans_DS_1’:0.17312):0.029751,((‘Maricaulis_maris_MCS10′:0.176054,’Hyphomonas_neptunium_ATCC_15444’:0.253873):0.037937,(‘Caulobacter_sp_K31′:0.046364,’Caulobacter_crescentus_CB15’:0.03404):0.177012):0.067981):0.024054,(((((‘Silicibacter_pomeroyi_DSS_3′:0.054186,’Silicibacter_sp_TM1040′:0.048369):0.026318,’Roseobacter_denitrificans_OCh_114′:0.076151):0.022514,’Jannaschia_sp_CCS1′:0.099178):0.015502,’Dinoroseobacter_shibae_DFL_12’:0.062324):0.048422,(((‘Rhodobacter_sphaeroides_ATCC_17029′:3.54E-4,’Rhodobacter_sphaeroides_2_4_1′:7.92E-4):0.0133,’Rhodobacter_sphaeroides_ATCC_17025′:0.013986):0.076691,’Paracoccus_denitrificans_PD1222’:0.091793):0.030158):0.202704):0.030795,(((‘Novosphingobium_aromaticivorans_DSM_12444′:0.086704,’Erythrobacter_litoralis_HTCC2594′:0.121244):0.045873,’Sphingopyxis_alaskensis_RB2256’:0.085216):0.03517,(‘Sphingomonas_wittichii_RW1′:0.074921,’Zymomonas_mobilis_subsp_mobilis_ZM4’:0.113715):0.03071):0.187787):0.027917,((‘Rhodospirillum_rubrum_ATCC_11170′:0.169238,’Magnetospirillum_magneticum_AMB_1’:0.125062):0.055149,(((‘Gluconobacter_oxydans_621H’:0.101123,’Gluconacetobacter_diazotrophicus_PAl_5′:0.059111):0.060663,’Granulibacter_bethesdensis_CGDNIH1′:0.085514):0.032014,’Acidiphilium_cryptum_JF_5′:0.120348):0.145099):0.055072):0.09899,(((((‘Rickettsia_bellii_RML369_C’:6.4E-4,’Rickettsia_bellii_OSU_85_389′:6.24E-4):0.053585,(((‘Rickettsia_typhi_str_Wilmington’:0.015422,’Rickettsia_prowazekii_str_Madrid_E’:0.012042):0.039247,((‘Rickettsia_felis_URRWXCal2′:0.00684,’Rickettsia_akari_str_Hartford’:0.024586):0.00502,(((‘Rickettsia_rickettsii_str_Iowa’:0.0,’Rickettsia_rickettsii_str_Sheila_Smith_’:0.0):0.00663,’Rickettsia_conorii_str_Malish_7′:0.002765):0.00386,’Rickettsia_massiliae_MTU5′:0.00794):0.008218):0.002179):0.00446,’Rickettsia_canadensis_str_McKiel’:0.030982):0.03239):0.199956,(‘Orientia_tsutsug
amushi_Boryong’:0.012903,’Orientia_tsutsugamushi_str_Ikeda’:0.003049):0.386268):0.180586,((((‘Anaplasma_phagocytophilum_HZ’:0.123173,’Anaplasma_marginale_str_St_Maries’:0.120008):0.159306,((‘Ehrlichia_ruminantium_str_Gardel’:0.002292,’Ehrlichia_ruminantium_str_Welgevonden’:0.002453):0.053871,(‘Ehrlichia_canis_str_Jake’:0.027749,’Ehrlichia_chaffeensis_str_Arkansas’:0.027236):0.031378):0.134774):0.151991,((‘Wolbachia_pipientis’:0.072061,’Wolbachia_endosymbiont_of_Drosophila_melanogaster’:0.04265):0.018425,’Wolbachia_endosymbiont_strain_TRS_of_Brugia_malayi’:0.077353):0.269684):0.141145,’Neorickettsia_sennetsu_str_Miyayama’:0.697366):0.171539):0.064736,’Candidatus_Pelagibacter_ubique_HTCC1062′:0.549831):0.056614):0.089191,’Magnetococcus_sp_MC_1′:0.339568):0.063213):0.093791,((‘Acidobacteria_bacterium_Ellin345′:0.221486,’Solibacter_usitatus_Ellin6076’:0.235193):0.269435,((((((‘Geobacter_sulfurreducens_PCA’:0.05188,’Geobacter_metallireducens_GS_15′:0.04331):0.045927,’Geobacter_uraniireducens_Rf4′:0.08407):0.035503,(‘Pelobacter_propionicus_DSM_2379′:0.107415,’Geobacter_lovleyi_SZ’:0.104313):0.062144):0.11137,’Pelobacter_carbinolicus_DSM_2380′:0.218315):0.102231,(((((((‘Desulfovibrio_vulgaris_subsp_vulgaris_DP4′:2.75E-4,’Desulfovibrio_vulgaris_subsp_vulgaris_str_Hildenborough’:2.47E-4):0.096556,’Desulfovibrio_desulfuricans_subsp_desulfuricans_str_G20′:0.130391):0.041935,’Lawsonia_intracellularis_PHE_MN1_00′:0.194671):0.108731,(‘GEBA_Desulfohalobium_retbaense’:0.206905,’GEBA_Desulfomicrobium_baculatum’:0.21572):0.039647):0.204436,’Desulfotalea_psychrophila_LSv54′:0.368324):0.05185,(‘Desulfococcus_oleovorans_Hxd3′:0.325853,’Syntrophobacter_fumaroxidans_MPOB’:0.267991):0.045811):0.045242,’Syntrophus_aciditrophicus_SB’:0.335728):0.039815):0.036306,(((‘Sorangium_cellulosum_So_ce_56_’:0.340758,’GEBA_Haliangium_ochraceum’:0.326975):0.069132,((‘Anaeromyxobacter_sp_Fw109_5′:0.05281,’Anaeromyxobacter_dehalogenans_2CP_C’:0.046187):0.14976,’Myxococcus_xanthus_DK_1622′:0.199974):0.1212
35):0.065096,’Bdellovibrio_bacteriovorus_HD100′:0.487944):0.045793):0.04331):0.033782):0.033693,’GEBA_Denitrovibrio_acetiphilus’:0.52279):0.036282,(((((((‘Sulfurimonas_denitrificans_DSM_1251′:0.23347,’Arcobacter_butzleri_RM4018′:0.208618):0.040632,’Sulfurovum_sp_NBC37_1’:0.230443):0.030751,(((((‘Campylobacter_jejuni_subsp_jejuni_NCTC_11168′:0.001263,’Campylobacter_jejuni_RM1221’:9.27E-4):6.61E-4,((‘Campylobacter_jejuni_subsp_doylei_269_97′:0.006051,’Campylobacter_jejuni_subsp_jejuni_81116′:3.16E-4):0.001123,’Campylobacter_jejuni_subsp_jejuni_81_176’:9.26E-4):3.5E-4):0.119606,((‘Campylobacter_curvus_525_92′:0.03145,’Campylobacter_concisus_13826′:0.037212):0.061724,’Campylobacter_fetus_subsp_fetus_82_40′:0.097271):0.023098):0.020182,’Campylobacter_hominis_ATCC_BAA_381′:0.16927):0.075991,’GEBA_Sulfurospirillum_deleyianum’:0.136694):0.07072):0.026032,((((((‘Helicobacter_pylori_HPAG1′:0.003032,’Helicobacter_pylori_26695′:0.004293):9.24E-4,’Helicobacter_pylori_Shi470′:0.004956):0.002035,’Helicobacter_pylori_J99′:0.00625):0.008252,’Helicobacter_acinonychis_str_Sheeba’:0.010673):0.199339,’Helicobacter_hepaticus_ATCC_51449′:0.121696):0.062387,’Wolinella_succinogenes_DSM_1740′:0.107844):0.108511):0.052813,’Nitratiruptor_sp_SB155_2′:0.132215):0.401479,(‘Aquifex_aeolicus_VF5′:0.291771,’Sulfurihydrogenibium_sp_YO3AOP1′:0.270137):0.202377):0.049843,’Elusimicrobium_minutum_Pei191’:0.667875):0.034753):0.023181,(((((((‘GEBA_Dyadobacter_fermentans’:0.115999,’GEBA_Spirosoma_linguale’:0.124662):0.074785,’Cytophaga_hutchinsonii_ATCC_33406′:0.182703):0.054101,’Candidatus_Amoebophilus_asiaticus_5a2′:0.332349):0.041954,((‘GEBA_Pedobacter_heparinus’:0.181859,’GEBA_Chitinophaga_pinensis’:0.30193):0.037197,(((((‘Flavobacterium_psychrophilum_JIP02_86′:0.059237,’Flavobacterium_johnsoniae_UW101′:0.052582):0.072716,’Gramella_forsetii_KT0803′:0.133872):0.034419,’GEBA_Capnocytophaga_ochracea’:0.123672):0.087924,’Candidatus_Sulcia_muelleri_GWSS’:0.665368):0.057908,((((‘Bacteroides_fragilis_YCH46′:0.
0,’Bacteroides_fragilis_NCTC_9343′:0.0):0.024113,’Bacteroides_thetaiotaomicron_VPI_5482′:0.02726):0.02733,’Bacteroides_vulgatus_ATCC_8482’:0.051865):0.063951,((‘Porphyromonas_gingivalis_ATCC_33277′:8.58E-4,’Porphyromonas_gingivalis_W83′:0.001013):0.151787,’Parabacteroides_distasonis_ATCC_8503’:0.06466):0.046245):0.18136):0.057273):0.041539):0.194704,(‘GEBA_Rhodothermus_marinus’:0.154571,’Salinibacter_ruber_DSM_13855′:0.312844):0.159944):0.062353,((((‘Chlorobium_tepidum_TLS’:0.032082,’Chlorobaculum_parvum_NCIB_8327′:0.032712):0.058237,(((‘Prosthecochloris_vibrioformis_DSM_265′:0.059493,’Pelodictyon_luteolum_DSM_273′:0.0478):0.035359,’Chlorobium_chlorochromatii_CaD3’:0.105899):0.014378,(‘Chlorobium_limicola_DSM_245′:0.051253,’Chlorobium_phaeobacteroides_DSM_266′:0.062034):0.016101):0.038256):0.042465,’Chlorobium_phaeobacteroides_BS1′:0.115118):0.129491,’Chloroherpeton_thalassium_ATCC_35110’:0.162073):0.253695):0.148638,((((‘Akkermansia_muciniphila_ATCC_BAA_835′:0.396966,’Opitutus_terrae_PB90_1′:0.451463):0.059359,’Methylacidiphilum_infernorum_V4’:0.427955):0.105972,(((((‘Chlamydia_trachomatis_D_UW_3_CX’:0.002254,’Chlamydia_trachomatis_A_HAR_13′:0.002736):0.004197,(‘Chlamydia_trachomatis_434_Bu’:0.0,’Chlamydia_trachomatis_L2b_UCH_1_proctitis’:3.44E-4):0.002396):0.021326,’Chlamydia_muridarum_Nigg’:0.019605):0.087368,(((‘Chlamydophila_abortus_S26_3′:0.030285,’Chlamydophila_caviae_GPIC’:0.023437):0.007338,’Chlamydophila_felis_Fe_C_56′:0.022986):0.042096,((‘Chlamydophila_pneumoniae_AR39′:0.0,’Chlamydophila_pneumoniae_J138’:1.71E-4):1.71E-4,(‘Chlamydophila_pneumoniae_CWL029′:3.42E-4,’Chlamydophila_pneumoniae_TW_183′:3.44E-4):0.0):0.088464):0.034515):0.269463,’Candidatus_Protochlamydia_amoebophila_UWE25’:0.26373):0.294159):0.072779,(‘Rhodopirellula_baltica_SH_1′:0.328464,’GEBA_Planctomyces_limnophilus’:0.334605):0.359784):0.051776):0.041976):0.020277,((((((‘Borrelia_afzelii_PKo’:0.011852,’Borrelia_garinii_PBi’:0.016443):0.006519,’Borrelia_burgdorferi_B31′:0.013496):0.093741,
’Borrelia_hermsii_DAH’:0.090331):0.391295,((‘Treponema_pallidum_subsp_pallidum_SS14′:0.0,’Treponema_pallidum_subsp_pallidum_str_Nichols’:0.0):0.267368,’Treponema_denticola_ATCC_35405′:0.151221):0.231493):0.155836,’GEBA_Brachyspira_murdochii’:0.52097):0.062841,(((‘Leptospira_borgpetersenii_serovar_Hardjo_bovis_L550′:2.2E-4,’Leptospira_borgpetersenii_serovar_Hardjo_bovis_JB197’:0.0):0.017229,(‘Leptospira_interrogans_serovar_Lai_str_56601′:1.81E-4,’Leptospira_interrogans_serovar_Copenhageni_str_Fiocruz_L1_130’:8.38E-4):0.020842):0.144917,(‘Leptospira_biflexa_serovar_Patoc_strain_Patoc_1_Paris_’:0.0,’Leptospira_biflexa_serovar_Patoc_strain_Patoc_1_Ames_’:1.71E-4):0.165532):0.395767):0.1053):0.036183,((((((((‘Mycoplasma_hyopneumoniae_J’:0.002353,’Mycoplasma_hyopneumoniae_7448′:0.003734):0.002271,’Mycoplasma_hyopneumoniae_232′:0.003699):0.430983,((((‘Mycoplasma_synoviae_53′:0.261036,’Mycoplasma_agalactiae_PG2′:0.263813):0.099129,’Mycoplasma_pulmonis_UAB_CTIP’:0.278073):0.046282,’Mycoplasma_arthritidis_158L3_1′:0.383762):0.035874,’Mycoplasma_mobile_163K’:0.328229):0.046689):0.262618,(((‘Ureaplasma_parvum_serovar_3_str_ATCC_700970′:1.66E-4,’Ureaplasma_parvum_serovar_3_str_ATCC_27815′:0.0):0.453388,’Mycoplasma_penetrans_HF_2’:0.366678):0.056306,((‘Mycoplasma_pneumoniae_M129′:0.106972,’Mycoplasma_genitalium_G37′:0.117431):0.32371,’Mycoplasma_gallisepticum_R’:0.299834):0.137813):0.242873):0.06613,((‘Mycoplasma_mycoides_subsp_mycoides_SC_str_PG1′:0.018411,’Mycoplasma_capricolum_subsp_capricolum_ATCC_27343′:0.013116):0.146931,’Mesoplasma_florum_L1’:0.163211):0.286922):0.11052,(((‘Onion_yellows_phytoplasma_OY_M’:0.016914,’Aster_yellows_witches_broom_phytoplasma_AYWB’:0.018909):0.202409,’Candidatus_Phytoplasma_mali’:0.277597):0.160881,’Acholeplasma_laidlawii_PG_8A’:0.249163):0.180723):0.122978,(((‘GEBA_Streptobacillus_moniliformis’:0.173665,’GEBA_Leptotrichia_buccalis’:0.08471):0.045217,’GEBA_Sebaldella_termitidis’:0.115851):0.114134,’Fusobacterium_nucleatum_subsp_nucleatum_ATCC_2
5586′:0.22874):0.263641):0.093021,((((((((‘Desulfitobacterium_hafniense_Y51′:0.199501,’Heliobacterium_modesticaldum_Ice1′:0.161842):0.040662,’Moorella_thermoacetica_ATCC_39073’:0.189602):0.025151,((((‘Desulfotomaculum_reducens_MI_1′:0.141921,’GEBA_Desulfotomaculum_acetoxidans’:0.173079):0.026808,’Pelotomaculum_thermopropionicum_SI’:0.119406):0.031051,’Candidatus_Desulforudis_audaxviator_MP104C’:0.239396):0.041262,’Carboxydothermus_hydrogenoformans_Z_2901′:0.192197):0.033883):0.022877,’Syntrophomonas_wolfei_subsp_wolfei_str_Goettingen’:0.340927):0.029485,’Symbiobacterium_thermophilum_IAM_14863′:0.302816):0.024169,(‘Natranaerobius_thermophilus_JW_NM_WN_LF’:0.328273,’GEBA_Veillonella_parvula’:0.318108):0.03302):0.020798,((((((((((((((‘Bacillus_thuringiensis_serovar_konkukian_str_97_27′:3.38E-4,’Bacillus_thuringiensis_str_Al_Hakam’:1.69E-4):1.69E-4,((‘Bacillus_anthracis_str_Sterne’:0.0,’Bacillus_anthracis_str_Ames_Ancestor_’:0.0):0.0,’Bacillus_anthracis_str_Ames’:0.0):0.001183):6.51E-4,’Bacillus_cereus_E33L’:5.32E-4):0.001131,(‘Bacillus_cereus_ATCC_14579′:0.002251,’Bacillus_weihenstephanensis_KBAB4′:0.013987):0.004118):0.001515,’Bacillus_cereus_ATCC_10987′:2.57E-4):0.020967,’Bacillus_cereus_subsp_cytotoxis_NVH_391_98’:0.013733):0.085584,(((‘Bacillus_subtilis_subsp_subtilis_str_168′:0.015012,’Bacillus_amyloliquefaciens_FZB42′:0.015183):0.020109,’Bacillus_pumilus_SAFR_032′:0.040049):0.011797,’Bacillus_licheniformis_ATCC_14580’:0.023456):0.065389):0.021459,(‘Geobacillus_thermodenitrificans_NG80_2′:0.011367,’Geobacillus_kaustophilus_HTA426’:0.01873):0.094956):0.020796,((((((((‘Staphylococcus_aureus_subsp_aureus_Mu3′:0.0,’Staphylococcus_aureus_subsp_aureus_Mu50′:1.67E-4):1.67E-4,’Staphylococcus_aureus_subsp_aureus_N315’:1.67E-4):0.0,(‘Staphylococcus_aureus_subsp_aureus_JH9′:0.001001,’Staphylococcus_aureus_subsp_aureus_JH1’:0.0):3.34E-4):5.16E-4,(((‘Staphylococcus_aureus_subsp_aureus_MRSA252′:6.67E-4,’Staphylococcus_aureus_RF122’:3.35E-4):0.0,(((‘Staphylococcus_aureus_subsp_a
ureus_NCTC_8325′:6.77E-4,’Staphylococcus_aureus_subsp_aureus_USA300_TCH1516′:0.0):0.0,’Staphylococcus_aureus_subsp_aureus_USA300’:0.0):0.0,(‘Staphylococcus_aureus_subsp_aureus_COL’:6.68E-4,’Staphylococcus_aureus_subsp_aureus_str_Newman’:3.33E-4):3.34E-4):1.67E-4):0.0,(‘Staphylococcus_aureus_subsp_aureus_MSSA476′:0.0,’Staphylococcus_aureus_subsp_aureus_MW2’:0.0):3.34E-4):1.51E-4):0.032187,((‘Staphylococcus_epidermidis_ATCC_12228′:3.35E-4,’Staphylococcus_epidermidis_RP62A’:1.66E-4):0.023543,’Staphylococcus_haemolyticus_JCSC1435′:0.025193):0.012868):0.014679,’Staphylococcus_saprophyticus_subsp_saprophyticus_ATCC_15305′:0.045225):0.208482,((((((((((‘Streptococcus_pyogenes_SSI_1′:1.69E-4,’Streptococcus_pyogenes_MGAS315’:3.35E-4):3.86E-4,(((((‘Streptococcus_pyogenes_M1_GAS’:3.34E-4,’Streptococcus_pyogenes_MGAS5005′:1.67E-4):3.34E-4,(((‘Streptococcus_pyogenes_MGAS8232′:6.78E-4,’Streptococcus_pyogenes_MGAS10750′:8.62E-4):1.67E-4,’Streptococcus_pyogenes_str_Manfredo’:5.01E-4):3.34E-4,’Streptococcus_pyogenes_MGAS10394′:5.01E-4):0.0):1.67E-4,(‘Streptococcus_pyogenes_MGAS9429′:5.01E-4,’Streptococcus_pyogenes_MGAS2096′:0.0):5.01E-4):0.0,’Streptococcus_pyogenes_MGAS10270′:5.09E-4):3.34E-4,’Streptococcus_pyogenes_MGAS6180’:5.01E-4):2.83E-4):0.032705,((‘Streptococcus_agalactiae_2603V_R’:5.02E-4,’Streptococcus_agalactiae_A909′:0.0):5.02E-4,’Streptococcus_agalactiae_NEM316′:0.0):0.032531):0.009744,((‘Streptococcus_thermophilus_LMG_18311′:4.96E-4,’Streptococcus_thermophilus_CNRZ1066′:6.75E-4):0.001002,’Streptococcus_thermophilus_LMD_9′:8.46E-4):0.042346):0.010989,’Streptococcus_mutans_UA159’:0.053241):0.014154,((‘Streptococcus_suis_98HAH33′:1.41E-4,’Streptococcus_suis_05ZYH33’:0.003069):0.045053,((((‘Streptococcus_pneumoniae_D39′:0.0,’Streptococcus_pneumoniae_R6’:0.0):0.001089,(‘Streptococcus_pneumoniae_TIGR4′:7.57E-4,’Streptococcus_pneumoniae_CGSP14′:0.001274):5.04E-4):1.79E-4,’Streptococcus_pneumoniae_Hungary19A_6’:5.78E-4):0.025464,(‘Streptococcus_gordonii_str_Challis_substr_CH1′:0
.013866,’Streptococcus_sanguinis_SK36’:0.013168):0.014048):0.01685):0.015608):0.079689,((‘Lactococcus_lactis_subsp_cremoris_SK11′:0.001565,’Lactococcus_lactis_subsp_cremoris_MG1363′:8.56E-4):0.005333,’Lactococcus_lactis_subsp_lactis_Il1403′:0.005458):0.157584):0.124114,’Enterococcus_faecalis_V583’:0.096988):0.030082,(((((‘Lactobacillus_acidophilus_NCFM’:0.017053,’Lactobacillus_helveticus_DPC_4571′:0.025039):0.047802,(‘Lactobacillus_gasseri_ATCC_33323′:0.008251,’Lactobacillus_johnsonii_NCC_533’:0.00468):0.081515):0.035749,(‘Lactobacillus_delbrueckii_subsp_bulgaricus_ATCC_BAA_365′:0.001075,’Lactobacillus_delbrueckii_subsp_bulgaricus_ATCC_11842’:9.47E-4):0.095385):0.189762,((‘Lactobacillus_casei_BL23′:1.67E-4,’Lactobacillus_casei_ATCC_334′:1.71E-4):0.13496,’Lactobacillus_sakei_subsp_sakei_23K’:0.116042):0.029638):0.037981,(((((‘Lactobacillus_brevis_ATCC_367′:0.113171,’Lactobacillus_plantarum_WCFS1′:0.09989):0.023598,’Pediococcus_pentosaceus_ATCC_25745’:0.140539):0.018317,(‘Lactobacillus_fermentum_IFO_3956′:0.075472,’Lactobacillus_reuteri_F275’:0.062813):0.114981):0.022679,((‘Leuconostoc_mesenteroides_subsp_mesenteroides_ATCC_8293′:0.031056,’Leuconostoc_citreum_KM20′:0.035589):0.124121,’Oenococcus_oeni_PSU_1′:0.234536):0.131959):0.021585,’Lactobacillus_salivarius_UCC118’:0.124879):0.034651):0.077386):0.111668,(((‘Listeria_monocytogenes_EGD_e’:0.00208,’Listeria_monocytogenes_str_4b_F2365′:7.75E-4):0.001511,’Listeria_welshimeri_serovar_6b_str_SLCC5334′:0.003974):0.001047,’Listeria_innocua_Clip11262′:0.001031):0.107265):0.033808):0.044404,’Lysinibacillus_sphaericus_C3_41′:0.152339):0.036331):0.021337,’Oceanobacillus_iheyensis_HTE831′:0.18896):0.019902,(‘Bacillus_clausii_KSM_K16′:0.093196,’Bacillus_halodurans_C_125′:0.066964):0.053506):0.027554,’Exiguobacterium_sibiricum_255_15′:0.211271):0.115095,’GEBA_Alicyclobacillus_acidocaldarius’:0.223192):0.06559,(((((‘Clostridium_acetobutylicum_ATCC_824′:0.123954,’Clostridium_novyi_NT’:0.114099):0.023517,((((‘Clostridium_perfringens
_ATCC_13124′:5.13E-4,’Clostridium_perfringens_str_13′:0.001376):4.86E-4,’Clostridium_perfringens_SM101’:0.00102):0.09186,((‘Clostridium_botulinum_E3_str_Alaska_E43′:0.003782,’Clostridium_botulinum_B_str_Eklund_17B’:0.003812):0.055674,’Clostridium_beijerinckii_NCIMB_8052′:0.067314):0.054513):0.057627,((((((‘Clostridium_botulinum_A_str_ATCC_3502′:0.0,’Clostridium_botulinum_A_str_ATCC_19397′:0.0):0.0,’Clostridium_botulinum_A_str_Hall’:1.67E-4):0.001673,(‘Clostridium_botulinum_F_str_Langeland’:0.001211,’Clostridium_botulinum_B1_str_Okra’:0.003486):0.001309):0.001791,’Clostridium_botulinum_A3_str_Loch_Maree’:0.007218):0.098641,’Clostridium_kluyveri_DSM_555′:0.121865):0.020585,’Clostridium_tetani_E88′:0.115493):0.020236):0.012084):0.136573,’Clostridium_phytofermentans_ISDg’:0.312208):0.036695,((‘GEBA_Anaerococcus_prevotii’:0.28787,’Finegoldia_magna_ATCC_29328′:0.211317):0.173769,((‘Alkaliphilus_metalliredigens_QYMF’:0.12868,’Alkaliphilus_oremlandii_OhILAs’:0.107044):0.082634,’Clostridium_difficile_630′:0.211875):0.045791):0.027696):0.039355,((((‘Thermoanaerobacter_pseudethanolicus_ATCC_33223′:0.004904,’Thermoanaerobacter_sp_X514′:0.004594):0.034053,’Thermoanaerobacter_tengcongensis_MB4′:0.041493):0.149566,’Caldicellulosiruptor_saccharolyticus_DSM_8903′:0.23409):0.036154,’Clostridium_thermocellum_ATCC_27405’:0.176618):0.036512):0.061826):0.019474):0.069313,((((((((((((((‘GEBA_Sanguibacter_keddieii’:0.064063,’GEBA_Jonesia_denitrificans’:0.099947):0.030002,’GEBA_Xylanimonas_cellulosilytica’:0.094672):0.01975,’GEBA_Cellulomonas_flavigena’:0.082405):0.035617,’GEBA_Beutenbergia_cavernae’:0.115349):0.051352,((‘GEBA_Brachybacterium_faecium’:0.196353,’GEBA_Kytococcus_sedentarius’:0.168724):0.036428,((((‘Arthrobacter_aurescens_TC1′:0.020568,’Arthrobacter_sp_FB24′:0.028887):0.039891,’Renibacterium_salmoninarum_ATCC_33209′:0.068094):0.050028,’Kocuria_rhizophila_DC2201’:0.130484):0.065937,((‘Clavibacter_michiganensis_subsp_michiganensis_NCPPB_382′:0.003996,’Clavibacter_michiganensis_s
ubsp_sepedonicus’:0.003983):0.084663,’Leifsonia_xyli_subsp_xyli_str_CTCB07′:0.089834):0.133992):0.04523):0.013623):0.017241,’Kineococcus_radiotolerans_SRS30216′:0.154573):0.033958,(((((((‘GEBA_Nocardiopsis_dassonvillei’:0.093463,’Thermobifida_fusca_YX’:0.072507):0.074571,’GEBA_Thermomonospora_curvata’:0.099223):0.020272,(‘GEBA_Thermobispora_bispora’:0.072121,’GEBA_Streptosporangium_roseum’:0.083245):0.048392):0.038128,’Acidothermus_cellulolyticus_11B’:0.153492):0.027393,(((‘Frankia_sp_CcI3′:0.019914,’Frankia_alni_ACN14a’:0.017049):0.027318,’Frankia_sp_EAN1pec’:0.039855):0.137522,((((((((((‘Mycobacterium_gilvum_PYR_GCK’:0.025373,’Mycobacterium_vanbaalenii_PYR_1′:0.014398):0.020335,((‘Mycobacterium_sp_KMS’:0.0,’Mycobacterium_sp_MCS’:0.0):2.54E-4,’Mycobacterium_sp_JLS’:4.38E-4):0.029295):0.012695,’Mycobacterium_smegmatis_str_MC2_155′:0.027998):0.015961,((((((((‘Mycobacterium_bovis_AF2122_97′:0.0,’Mycobacterium_bovis_BCG_str_Pasteur_1173P2′:3.48E-4):5.22E-4,’Mycobacterium_tuberculosis_F11′:3.48E-4):0.0,’Mycobacterium_tuberculosis_H37Ra’:0.0):0.0,’Mycobacterium_tuberculosis_H37Rv’:0.0):1.74E-4,’Mycobacterium_tuberculosis_CDC1551′:3.48E-4):0.028503,(‘Mycobacterium_marinum_M’:2.83E-4,’Mycobacterium_ulcerans_Agy99′:0.005107):0.026919):0.008581,’Mycobacterium_leprae_TN’:0.050438):0.008421,(‘Mycobacterium_avium_104′:1.71E-4,’Mycobacterium_avium_subsp_paratuberculosis_K_10′:0.001878):0.024032):0.041807):0.021744,’Mycobacterium_abscessus’:0.054563):0.05338,((‘GEBA_Gordonia_bronchialis’:0.080084,’GEBA_Tsukamurella_paurometabola’:0.095182):0.023186,(‘Nocardia_farcinica_IFM_10152′:0.066164,’Rhodococcus_sp_RHA1’:0.059553):0.015357):0.01873):0.081897,((‘GEBA_Saccharomonospora_viridis’:0.098036,’Saccharopolyspora_erythraea_NRRL_2338′:0.100342):0.021978,’GEBA_Actinosynnema_mirum’:0.097173):0.027185):0.031045,’GEBA_Nakamurella_multipartita’:0.151197):0.032028,’GEBA_Geodermatophilus_obscurus’:0.140115):0.027068,((‘Salinispora_arenicola_CNS_205′:0.012069,’Salinispora_tropica_CNB_440′:0.0
10019):0.097618,’GEBA_Stackebrandtia_nassauensis’:0.170224):0.084563):0.033468):0.03169):0.026445,(((‘Streptomyces_avermitilis_MA_4680′:0.027139,’Streptomyces_coelicolor_A3_2_’:0.027351):0.021484,’Streptomyces_griseus_subsp_griseus_NBRC_13350′:0.037328):0.090414,’GEBA_Catenulispora_acidiphila’:0.137426):0.05113):0.0286,((‘GEBA_Kribbella_flavida’:0.118628,’Nocardioides_sp_JS614′:0.133604):0.026332,’Propionibacterium_acnes_KPA171202′:0.242956):0.044904):0.019561):0.056195,((((‘Corynebacterium_glutamicum_R’:4.77E-4,’Corynebacterium_glutamicum_ATCC_13032′:0.001073):0.033614,’Corynebacterium_efficiens_YS_314′:0.038791):0.051561,’Corynebacterium_diphtheriae_NCTC_13129′:0.068116):0.044622,(‘Corynebacterium_urealyticum_DSM_7109′:0.065695,’Corynebacterium_jeikeium_K411’:0.054499):0.056416):0.221144):0.045301,((‘Bifidobacterium_longum_DJO10A’:1.88E-4,’Bifidobacterium_longum_NCC2705′:3.39E-4):0.040933,’Bifidobacterium_adolescentis_ATCC_15703′:0.037617):0.303161):0.039753,(‘Tropheryma_whipplei_TW08_27′:8.95E-4,’Tropheryma_whipplei_str_Twist’:6.79E-4):0.478476):0.120846,’GEBA_Acidimicrobium_ferrooxidans’:0.43829):0.090081,(((‘GEBA_Cryptobacterium_curtum’:0.142154,’GEBA_Eggerthella_lenta’:0.088587):0.047811,’GEBA_Slackia_heliotrinireducens’:0.117231):0.104268,’GEBA_Atopobium_parvulum’:0.269623):0.198082):0.036936,(‘GEBA_Conexibacter_woesei’:0.383152,’Rubrobacter_xylanophilus_DSM_9941′:0.357989):0.109409):0.087352,(((((((((‘Synechococcus_sp_WH_7803′:0.023543,’Synechococcus_sp_CC9311’:0.044194):0.014221,((‘Synechococcus_sp_CC9605′:0.022222,’Synechococcus_sp_WH_8102′:0.025798):0.006423,’Synechococcus_sp_CC9902’:0.032326):0.022152):0.016325,(((‘Prochlorococcus_marinus_subsp_marinus_str_CCMP1375′:0.058394,’Prochlorococcus_marinus_str_MIT_9211’:0.053438):0.019947,((‘Prochlorococcus_marinus_str_NATL2A’:0.00276,’Prochlorococcus_marinus_str_NATL1A’:0.002109):0.070906,((‘Prochlorococcus_marinus_str_MIT_9515′:0.013908,’Prochlorococcus_marinus_subsp_pastoris_str_CCMP1986’:0.016702):0.023743,
(((‘Prochlorococcus_marinus_str_AS9601′:0.005031,’Prochlorococcus_marinus_str_MIT_9301′:0.005416):0.003975,’Prochlorococcus_marinus_str_MIT_9215′:0.011185):0.004908,’Prochlorococcus_marinus_str_MIT_9312’:0.010415):0.020455):0.117983):0.022565):0.040798,(‘Prochlorococcus_marinus_str_MIT_9313′:0.00429,’Prochlorococcus_marinus_str_MIT_9303′:0.003453):0.043557):0.012995):0.046129,’Synechococcus_sp_RCC307’:0.067465):0.195828,(‘Synechococcus_elongatus_PCC_7942′:3.21E-4,’Synechococcus_elongatus_PCC_6301’:0.00142):0.098101):0.061386,(((((‘Nostoc_sp_PCC_7120′:0.004827,’Anabaena_variabilis_ATCC_29413′:0.003501):0.036361,’Nostoc_punctiforme_PCC_73102′:0.042386):0.086848,’Trichodesmium_erythraeum_IMS101’:0.151445):0.024405,(((‘Cyanothece_sp_ATCC_51142′:0.088208,’Microcystis_aeruginosa_NIES_843′:0.108904):0.018975,’Synechocystis_sp_PCC_6803′:0.111161):0.027275,’Synechococcus_sp_PCC_7002’:0.127272):0.053965):0.023778,(‘Acaryochloris_marina_MBIC11017′:0.127345,’Thermosynechococcus_elongatus_BP_1’:0.138774):0.035212):0.017944):0.063385,(‘Synechococcus_sp_JA_3_3Ab’:0.023735,’Synechococcus_sp_JA_2_3B_a_2_13_’:0.025648):0.172906):0.056598,’Gloeobacter_violaceus_PCC_7421′:0.211642):0.301851,(((((‘Roseiflexus_castenholzii_DSM_13941′:0.02822,’Roseiflexus_sp_RS_1′:0.017828):0.118476,’Chloroflexus_aurantiacus_J_10_fl’:0.154341):0.066504,’Herpetosiphon_aurantiacus_ATCC_23779′:0.219302):0.124086,(‘GEBA_Sphaerobacter_thermophilus’:0.242004,’GEBA_Thermobaculum_terrenum’:0.272557):0.055029):0.055648,((‘Dehalococcoides_sp_CBDB1′:5.54E-4,’Dehalococcoides_sp_BAV1′:8.51E-4):0.017643,’Dehalococcoides_ethenogenes_195’:0.014048):0.50406):0.083393):0.032459):0.039409):0.035188):0.022061):0.039665):0.054226,(((‘Thermosipho_melanesiensis_BI429′:0.127803,’Fervidobacterium_nodosum_Rt17_B1’:0.188887):0.074834,(((‘Thermotoga_maritima_MSB8′:0.007577,’Thermotoga_sp_RQ2′:5.3E-4):0.005057,’Thermotoga_petrophila_RKU_1′:0.006656):0.159923,’Thermotoga_lettingae_TMO’:0.230211):0.037393):0.080613,’Petrotoga_mobilis_S
J95′:0.362128):0.246177):0.287154,((‘Thermus_thermophilus_HB8′:0.001498,’Thermus_thermophilus_HB27’:0.001224):0.144321,(‘GEBA_Meiothermus_ruber’:0.114423,’GEBA_Meiothermus_silvanus’:0.077947):0.086289):0.12529):0.1212895,(‘Deinococcus_geothermalis_DSM_11300′:0.043888,’Deinococcus_radiodurans_R1’:0.07314):0.1212895);
END;

New open access paper from my lab on "Zorro" software for automated masking of sequence alignments

A new Open Access paper from my lab was just published in PLoS One: Accounting For Alignment Uncertainty in Phylogenomics. Wu M, Chatterji S, Eisen JA (2012) Accounting For Alignment Uncertainty in Phylogenomics. PLoS ONE 7(1): e30288. doi:10.1371/journal.pone.0030288

The paper describes the software “Zorro” which is used for automated “masking” of sequence alignments.  Basically, if you have a multiple sequence alignment you would like to use to infer a phylogenetic tree, in some cases it is desirable to block out regions of the alignment that are not reliable.  This blocking is called “masking.”

Masking is thought by many to be important because a sequence alignment is in essence a hypothesis about the common ancestry of specific residues in different genes/proteins/regions of the genome.  This “positional homology” is not always easy to assign, and where it is ambiguous it may be better to ignore those regions when inferring phylogenetic trees from alignments.

Historically, masking has been done by hand/eye: looking for columns in a multiple sequence alignment that seem to have issues and then either eliminating those columns or giving them a lower weight in the phylogenetic analysis.

Zorro removes much of the subjectivity of this process and generates automated masking patterns for sequence alignments.  It does this by assigning a confidence score to each column in a multiple sequence alignment. These scores can then be used to account for alignment accuracy in phylogenetic inference pipelines.
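To make the idea concrete, here is a minimal Python sketch of how per-column confidence scores might be turned into a mask. This is not Zorro's code or its file format: the function name `mask_alignment`, the toy alignment, the score values, and the cutoff of 5.0 are all made up for illustration.

```python
from typing import Dict, List


def mask_alignment(alignment: Dict[str, str],
                   scores: List[float],
                   threshold: float) -> Dict[str, str]:
    """Return a copy of the alignment keeping only columns scoring >= threshold.

    `alignment` maps sequence names to aligned sequences of equal length;
    `scores` holds one confidence value per alignment column.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1 or lengths.pop() != len(scores):
        raise ValueError("all sequences and the score list must have the same length")

    # Indices of the columns considered reliable enough to keep.
    keep = [i for i, score in enumerate(scores) if score >= threshold]
    return {name: "".join(seq[i] for i in keep)
            for name, seq in alignment.items()}


# Toy alignment and hypothetical column scores (not real Zorro output).
alignment = {
    "taxonA": "MK-LVTA",
    "taxonB": "MKQLV-A",
    "taxonC": "MR-LISA",
}
column_scores = [9.5, 8.0, 1.2, 9.0, 7.5, 2.0, 9.9]

masked = mask_alignment(alignment, column_scores, threshold=5.0)
for name, seq in masked.items():
    print(name, seq)
```

A hard cutoff like this is the simplest option. The scores could instead be passed along as column weights to a phylogenetic program that supports weighting, which is closer to the "lower weight" approach described above.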

The software is available at Sourceforge: ZORRO – probabilistic masking for phylogenetics.  It was written primarily by Martin Wu (who is now a Professor at the University of Virginia) and Sourav Chatterji with a little help here and there from Aaron Darling I think.  The development of Zorro was part of my “iSEEM” project that was supported by the Gordon and Betty Moore Foundation.

In the interest of sharing, since the paper is fully open access, I am posting it here below the fold. UPDATE 2/9 – decided to remove this since it got in the way of getting to the comments …

One old, one new – a few phylogeny papers worth checking out

Just a quick one here. A few days ago in my lab we were discussing some challenges with doing phylogenetic diversity (PD) measurements in very very large phylogenetic trees. PD is a measure of total branch length in a phylogenetic tree for a group of taxa … and we use it for many purposes.
For many of our applications we have been using an algorithm described by Mike Steel in “Phylogenetic diversity and the Greedy Algorithm”. But alas, it is not keeping up with the massive tree sets we are dealing with. Fortunately Aaron Darling in my lab found an alternative paper with a perfect-sounding title for us: Phylogenetic Diversity within Seconds from Minh, Klaere, and von Haeseler. This seems like it will do the trick. I note – Kudos to Systematic Biology for making some older papers freely available. Not sure of their general policies on this but good to see.
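For anyone unfamiliar with PD, here is a small Python sketch of the measure and of a naive greedy selection. It is only an illustration under stated assumptions, not the implementation from either paper: it uses the rooted form of PD (branches on the paths from the chosen taxa up to the root), a made-up four-taxon tree, and hypothetical function names. It rescans every candidate at each step, so it would not scale to the very large trees mentioned above.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Tree stored as child -> (parent, branch length); the root has no entry.
Tree = Dict[str, Tuple[str, float]]


def path_to_root(tree: Tree, node: str) -> List[str]:
    """Nodes whose parent branches lie on the path from `node` to the root."""
    path = []
    while node in tree:
        path.append(node)
        node = tree[node][0]
    return path


def phylogenetic_diversity(tree: Tree, taxa: Iterable[str]) -> float:
    """Rooted PD: total length of all branches linking `taxa` to the root."""
    branches: Set[str] = set()
    for taxon in taxa:
        branches.update(path_to_root(tree, taxon))
    return sum(tree[child][1] for child in branches)


def greedy_pd(tree: Tree, candidates: Iterable[str], k: int) -> List[str]:
    """Pick k taxa, each time adding the one that contributes the most new branch length."""
    chosen: List[str] = []
    covered: Set[str] = set()
    remaining = sorted(candidates)  # sorted so ties break deterministically

    def gain(taxon: str) -> float:
        return sum(tree[n][1] for n in path_to_root(tree, taxon)
                   if n not in covered)

    for _ in range(min(k, len(remaining))):
        best = max(remaining, key=gain)
        chosen.append(best)
        covered.update(path_to_root(tree, best))
        remaining.remove(best)
    return chosen


# Toy tree: ((A:1,B:2)X:1,(C:3,D:1)Y:2)root;
toy_tree: Tree = {
    "A": ("X", 1.0), "B": ("X", 2.0), "X": ("root", 1.0),
    "C": ("Y", 3.0), "D": ("Y", 1.0), "Y": ("root", 2.0),
}

picked = greedy_pd(toy_tree, ["A", "B", "C", "D"], k=2)
print(picked, phylogenetic_diversity(toy_tree, picked))
```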
Anyway – back to the grind …

Announcement: Workshop on Multiple Sequence Alignment and Phylogeny Estimation

Posting this for Tandy Warnow

Workshop on Advances in Multiple Sequence Alignment and Phylogeny Estimation

May 20 and 21, 2012, Smithsonian Institution, Washington, DC

The workshop is funded by the National Science Foundation through grant DEB 0733029 to the University of Texas. Registration is required, and attendance is limited to 40 participants. The workshop will include presentations of new methods for multiple sequence alignment and phylogeny estimation, as well as training in the use of these methods and personal assistance in analyzing datasets using the SATé software (see this page). Applications for the workshop (and for travel support) are due by February 15, 2012, and will be responded to by March 1. We expect to be able to provide support to all attendees. Please click here for the application form. For more information, please send an email to Tandy Warnow (see below).

Letter from Tandy explaining workshop:
Dear Colleagues,
We are writing to let you know about a workshop and symposium that we will hold on May 20-22, 2012, at the Smithsonian Institution in Washington, DC. The workshop will provide training in advanced methods for multiple sequence alignment and phylogeny estimation, and will take place on May 20 and 21; the symposium will follow immediately and will feature research presentations on the same topic. This workshop is funded by the National Science Foundation through grant DEB 0733029 to the University of Texas.
The workshop will include presentations of new methods for maximum likelihood phylogeny estimation of large sequence alignments (including GARLI and FastTree), for comparing different alignments of the same dataset, for phylogenetic analyses of datasets that include partial sequences (e.g., short reads generated in a metagenomic analysis), for supertree estimation, and for simulating sequence evolution. However, a main focus is to train participants in both basic and advanced use of the SATé software (Liu et al. 2009, Science, Vol. 324, no. 5934, pp. 1561-1564) for simultaneous estimation of alignments and trees (SATé software available for download at http://phylo.bio.ku.edu/software/sate/sate.html ).
Workshop participants are expected to bring laptops with them to the workshop, so that they can perform alignment and phylogenetic tree estimations. We will provide test datasets for you to learn how to use SATé, but strongly encourage you to bring your own datasets to analyze.
Attendance at the workshop is limited to 40 participants, and registration is required. If you are interested in attending the workshop, whether or not you are requesting travel support, please fill out the Word document available at http://www.cs.utexas.edu/users/tandy/workshop-application.doc, and return it to Laurie Alvarez (lauriea@austin.utexas.edu) by February 15, 2012. We will respond to requests for registration by March 1, 2012.
For more information on the workshop, please contact me (Tandy Warnow), at tandy@cs.utexas.edu. For more information on the Symposium, please contact Mike Braun (braunm@si.edu). We look forward to seeing you at the Smithsonian workshop and symposium!
Regards,
Tandy Warnow and Mike Braun
On behalf of the AToL project team:
  • Michael Braun, The Smithsonian Institution 
  • Mark Holder, The University of Kansas
  • Jim Leebens-Mack, The University of Georgia 
  • Randy Linder, The University of Texas 
  • Etsuko Moriyama, The University of Nebraska 
  • Tandy Warnow, The University of Texas