Mendeley groups on environmental PCR, metagenomics, and microbial eukaryotes

As part of my NSF Research Coordination Network grant (RCN EukHiTS), I am currently managing a number of Mendeley groups that amalgamate relevant journal articles on different topics related to environmental PCR, metagenomics, and microbial eukaryotes. These groups are open (anyone can join with a Mendeley account), and I’m trying to keep them regularly updated with new articles (Mendeley members can also add articles, which I strongly encourage!):

  • Eukaryotic HTP Studies – Publications relevant to high-throughput environmental sequencing approaches focused on microbial eukaryotes. Articles will include any type of -Omic methods (marker gene amplicons, metagenomics, metatranscriptomics, etc.), eukaryote-focused tools/pipelines, and review/opinion pieces.
  • rRNA in Eukaryotes – Literature related to the ribosomal repeat array in eukaryotic genomes – variation in rRNA gene copy number, intragenomic polymorphisms, concerted evolution, transposable elements and their evolutionary and ecological implications.
  • Environmental PCRs – primer sets and bias – Literature related to primer set usage and bias across all taxonomic groups (bacteria, archaea, fungi and microbial eukaryotes) – includes primer sets and methods focused on 16S, 18S, ITS, other rRNA, COI, and other marker genes used for environmental sequencing.
  • eDNA in aquatic ecosystems – This group focuses on environmental DNA (eDNA) applications in aquatic ecosystems, including the use of eDNA in bioassessment and environmental monitoring. The literature collection covers methods, analytical tools, and empirical studies (both basic and applied science).

Twisted Tree of Life Award #16: Nature & Authors doing taxonomic alchemy converting an archaeon to a bacterium

Well, this is one of the bigger screw ups in terms of evolution I have seen at a major journal in a while.  See the following paper in Nature: The catalytic mechanism for aerobic formation of methane by bacteria : Nature. The paper discusses some functions of “the ocean-dwelling bacterium Nitrosopumilus maritimus.” Some of what is reported in the paper is perhaps interesting (alas I do not have access).  But painfully, there is one big big big big mistake – you see Nitrosopumilus maritimus is not a bacterium.  It is an archaeon (see for example this paper on its genome).

I got pointed to this by Uri Gophna (in an email and in a comment on my blog) (also see this on Twitter). Sure – some people debate the structure of the tree of life. But I am pretty certain the authors here (Siddhesh S. Kamat, Howard J. Williams, Lawrence J. Dangott, Mrinmoy Chakrabarti & Frank M. Raushel) are not trying to make a statement about the monophyly of bacteria or just what archaea are. They just made what seems to be a colossal screw-up. And Nature not only let them, but added to it with things like their “Editor's Summary”:

Novel bacterial biosynthesis of methane
Aerobic marine organisms produce significant quantities of the potent greenhouse gas methane, much of it via the cleavage of the highly unreactive carbon–phosphorus bonds of alkylphosphonates. In this study the authors explore the mechanism of PhnJ, an unusual radical S-adenosyl-L-methionine (SAM) enzyme that appears to use a cysteine-based thiyl radical to help catalyse the conversion of the alkylphosphonate substrate to methane and ribose-1,2-cyclic phosphate-5-phosphate. This reaction, not previously encountered in biological chemistry, establishes a novel mechanism for cleaving carbon–phosphorus bonds to form methane and phosphate via a covalent thiophosphate intermediate.

And for this taxonomic alchemy (converting an archaeon to a bacterium) I am awarding them and Nature my coveted “Twisted Tree of Life Award #16”.


I love the ad that came up while I was writing this post and searching for some information.  I think Nature could use the services from this ad:

Seminar “Gene Regulatory Networks in Archaea” Marc Facciotti, 3/1 11 AM Genome Center 4202

The Genome Center Biological Networks Seminars present:

Date: Friday, March 1st, 2013, 11am – 12pm
Location: 4202 GBSF

“Gene Regulatory Networks in Archaea”

Speaker: Marc Facciotti, PhD
Assistant Professor, University of California, Davis

For more information regarding the seminar series, upcoming talks, and how to subscribe to our mailing list, please visit

Guest post from Kimmen Sjölander about FAT-CAT phylogenomics pipeline

Below is a guest post from my friend and colleague Kimmen Sjölander, Prof. at UC Berkeley and phylogenomics guru. 

Announcing the FAT-CAT phylogenomic annotation webserver.

FAT-CAT is a new web server for phylogenomic prediction of function and ortholog identification and for taxonomic origin prediction of metagenome sequences based on HMM-based classification of protein sequences to >93K pre-calculated phylogenetic trees in the PhyloFacts database. PhyloFacts is unique among phylogenomic databases in both having broad taxonomic coverage – more than 7.3M proteins from >99K unique taxa across the Tree of Life, including targeted coverage of genomes from Eukaryotes, Bacteria and Archaea – and integrating functional data on trees for Pfam domains and multi-domain architectures. PhyloFacts trees include functional and annotation data from UniProt (SwissProt and TrEMBL), GO, BioCyc, Pfam, Enzyme Commission and other sources. The FAT-CAT pipeline uses HMMs at all nodes in PhyloFacts trees to classify user sequences to different levels of functional hierarchies, based on the subtree HMM giving the sequence the strongest score. Phylogenetic placements within orthology groups defined on PhyloFacts trees are used to predict function and to predict orthologs. Sequences from metagenome projects can be classified taxonomically based on the MRCA of the sequences descending from the top-scoring subtree node. Because of the broad taxonomic and functional coverage, FAT-CAT can identify orthologs and predict function for most sequence inputs. We’re working to make FAT-CAT less computationally intensive so that users will be able to upload entire genomes for analysis; in the interim, we limit users to 20 sequence inputs per day. Registered users are given a higher quota (see details online). We’d love to hear from you if you have feature requests or bug reports; please send any to Kimmen Sjölander – kimmen at berkeley dot edu (parse appropriately).
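To make the last classification step a bit more concrete, here is a minimal Python sketch of the general idea: pick the subtree node whose HMM gives the query the strongest score, then report the most recent common ancestor (MRCA) of the taxa below that node. This is not FAT-CAT's code. The node names, scores, and lineages are made-up toy values, and the functions `mrca` and `classify` are hypothetical helpers written only for illustration.

```python
from typing import Dict, List, Sequence, Tuple

# Toy root-first lineages (illustrative assumption, not PhyloFacts data).
LINEAGES: Dict[str, List[str]] = {
    "Nitrosopumilus maritimus": ["cellular organisms", "Archaea", "Thaumarchaeota"],
    "Halobacterium salinarum": ["cellular organisms", "Archaea", "Euryarchaeota"],
    "Escherichia coli": ["cellular organisms", "Bacteria", "Proteobacteria"],
}


def mrca(lineages: Sequence[Sequence[str]]) -> str:
    """Return the deepest taxon shared by every root-first lineage."""
    if not lineages:
        raise ValueError("need at least one lineage")
    shared = None
    # Walk down the ranks in parallel until the lineages disagree.
    for ranks in zip(*lineages):
        if all(rank == ranks[0] for rank in ranks):
            shared = ranks[0]
        else:
            break
    if shared is None:
        raise ValueError("lineages share no common root")
    return shared


def classify(
    subtree_scores: Dict[str, float],
    subtree_members: Dict[str, List[str]],
) -> Tuple[str, str]:
    """Pick the top-scoring subtree node and return (node, MRCA of its taxa)."""
    best_node = max(subtree_scores, key=subtree_scores.get)
    member_lineages = [LINEAGES[taxon] for taxon in subtree_members[best_node]]
    return best_node, mrca(member_lineages)


# Hypothetical HMM scores for one query sequence against two subtree nodes.
scores = {"node_A": 120.5, "node_B": 80.0}
members = {
    "node_A": ["Nitrosopumilus maritimus", "Halobacterium salinarum"],
    "node_B": ["Nitrosopumilus maritimus", "Escherichia coli"],
}

node, taxon = classify(scores, members)
print(node, taxon)
```

In this toy case the query would be assigned to Archaea, because both taxa under the top-scoring node share that rank and diverge below it. A real pipeline would presumably also need score thresholds and a full taxonomy, which this sketch leaves out.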

Worth a read: Jim Staley on a "Universal Species Concept" and the history of microbial species concepts

An interesting paper came up in my automated Google searches for “phylogenomics”: Transitioning Toward a Universal Species Concept for the Classification of all Organisms | InTechOpen. It is by Jim Staley, who has been writing a lot about microbial species concepts in the last few years, in addition to trying to bridge the gap between bacteria/archaea and eukaryotes in terms of species concepts. Not sure how I feel about everything in the paper, but it has a really nice history of how species have been defined for bacteria. He breaks down this history into four periods:

  • Discovery of microorganisms,
  • Advent of pure cultures and phenotypic features,
  • Introduction of molecular analyses and
  • Gene sequencing and genomics.
And goes through a bit of detail on each one.  He also discusses what he sees as a need for a universal species concept and even makes some suggestions about how it might be implemented.  Definitely worth a read.  
Some related posts of mine and/or links of potential interest:

RIP Carl Woese: Collecting posts / notes / other information about my main science hero here

My tribute to Carl Woese 12/30/12

Sadly, Carl Woese has passed away.  I am collecting some links and posts about him here in his memory.  He was without a doubt the person who most influenced my career as a scientist.

News stories about Woese’s passing

Some of my posts about Woese

Woese Tree of Life pumpkin (by J. Eisen)

Storification of Tweets and other posts about his passing: View the story “RIP Carl Woese” on Storify

Other posts worth reading about Woese’s passing

Some videos with Woese 


My graduate student Russell Neches used a laser to etch a picture of Carl Woese on a piece of toast.

Welcome to the Microbial Earth Project

Map of type strains.

All interested in microbes and their genomes should check out The Microbial Earth Project.  It “is an international effort to generate a comprehensive catalog from genome sequences of all the archaeal and bacterial type strains. The name of the project comes from the recognition that Earth is a predominantly a microbial planet, and by effect in order to understand life on our planet, we need to understand how microbial life works.”

There are some 10,000 described type strains of bacteria and archaea.  Not really a lot given that there are probably millions upon millions of species of bacteria and archaea.  But it is what we have available to us in terms of the formally described and accepted species for which there is an available cultured strain.

At this site you can do things like “Adopt a Type Strain” or view a cool “Map of the type strains”.

The Steering Committee for the project is

Much of the real work is being done by Nikos Kyrpides, George Garrity, and others, though I am very pleased to be a member of the Steering Committee. One of my key jobs will be to get the word out early and often. Hence this post.

Get the genomes of up to 12 type strains of bacteria and/or archaea sequenced, for free

Barny Whitman asked me to post this announcement and, well, I am.  I made one edit below (see strikethrough) in honor of Norm Pace.

Genomic Sequencing of Prokaryotic Bacterial and Archaeal Type Strains

The Community Sequencing Program (CSP) Quarterly Microbial call of the DOE Joint Genome Institute provides a great opportunity to obtain draft genomic sequences of the type strains of bacterial and archaeal species. The type strains may also include proposed species prior to publication. Type strains must be relevant to DOE mission areas, such as bioenergy, biogeochemistry, bioremediation, carbon cycling, and phylogenetic diversity. However, strains of human pathogens and human-associated species are not eligible. Proposals for genome sequencing of type strains can be submitted through the CSP Quarterly Microbial call, whose deadline is December 17, 2012, with approval usually being completed within one month. Up to 12 strains can be included in each proposal. Proposals for larger numbers of strains need to be submitted to the CSP annual call in the spring. If you cannot make the December call, Quarterly calls are also scheduled for March 25, June 17, and September 23, 2013.

Proposals may be completed on-line at: You will need to register and sign in to this server. Once on the server, follow the links to the “CSP Quarterly Microbial/Metagenome”. All strains will have to have been deposited in a culture collection, including proposed type strains prior to publication. If a culture collection ID is not available, you can attach a copy of the Certification of Availability. Once approved, you will need to provide 5-10 µg of high molecular weight DNA.

For questions, contact Barny Whitman, University of Georgia (

New publication from members of my lab (e.g., @ryneches) & lab of Marc Facciotti on ChIP-seq based mapping of archaeal transcription factors

New publication from members of my lab and the lab of Marc Facciotti on a workflow for ChIP-seq based mapping of archaeal transcription factors. The paper includes a description of new software from Russell Neches in my lab called pique for peak calling.

See: A workflow for genome-wide mapping of archaeal transcription factors with ChIP-seq

Russell’s pique software is available on GitHub here:
The Pique software package processes ChIP-seq coverage data to predict protein-binding sites. Strand-specific coverage data are output as tracks for the Gaggle Genome Browser, and putative binding sites (peaks) are output as ‘bookmark files’. (A) Screenshot of data browsing in the Gaggle Genome Browser. Green box outlines the navigation window for clicking through bookmarks of predicted binding sites. Details of each site can be displayed (inset). The Gaggle toolbar (shown with black arrow) can be used to broadcast selected data to other ‘geese’ in the gaggle package, programs such as R, Cytoscape, BLAST or KEGG. (B) Schematic overview of bioinformatics workflow.

Wilbanks, E., Larsen, D., Neches, R., Yao, A., Wu, C., Kjolby, R., & Facciotti, M. (2012). A workflow for genome-wide mapping of archaeal transcription factors with ChIP-seq. Nucleic Acids Research. DOI: 10.1093/nar/gks063