Guest post from Kimmen Sjölander about the FAT-CAT phylogenomics pipeline

Below is a guest post from my friend and colleague Kimmen Sjölander, Prof. at UC Berkeley and phylogenomics guru. 


Announcing the FAT-CAT phylogenomic annotation webserver.

FAT-CAT is a new web server for phylogenomic prediction of function, ortholog identification, and taxonomic origin prediction of metagenome sequences. It works by HMM-based classification of protein sequences to >93K pre-calculated phylogenetic trees in the PhyloFacts database. PhyloFacts is unique among phylogenomic databases in combining broad taxonomic coverage (more than 7.3M proteins from >99K unique taxa across the Tree of Life, including targeted coverage of genomes from Eukaryotes, Bacteria and Archaea) with functional data integrated on trees for Pfam domains and multi-domain architectures. PhyloFacts trees include functional and annotation data from UniProt (SwissProt and TrEMBL), GO, BioCyc, Pfam, Enzyme Commission and other sources.

The FAT-CAT pipeline uses HMMs at all nodes in PhyloFacts trees to classify user sequences to different levels of functional hierarchies, based on the subtree HMM giving the sequence the strongest score. Phylogenetic placements within orthology groups defined on PhyloFacts trees are used to predict function and to predict orthologs. Sequences from metagenome projects can be classified taxonomically based on the MRCA of the sequences descending from the top-scoring subtree node. Because of the broad taxonomic and functional coverage, FAT-CAT can identify orthologs and predict function for most sequence inputs.

We’re working to make FAT-CAT less computationally intensive so that users will be able to upload entire genomes for analysis; in the interim, we limit users to 20 sequence inputs per day. Registered users are given a higher quota (see details online). We’d love to hear from you if you have feature requests or bug reports; please send any to Kimmen Sjölander (kimmen at berkeley dot edu; parse appropriately).
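To make the taxonomic classification step described above a little more concrete, here is a minimal sketch of the idea: pick the subtree node whose HMM scores the query best, then report the most recent common ancestor (MRCA) of the sequences under that node. This is not FAT-CAT code and does not call any FAT-CAT or PhyloFacts API. The node names, scores, lineages and function names are all hypothetical, and the MRCA is approximated as the longest shared prefix of root-to-leaf taxonomic lineages.

```python
from typing import Dict, List, Sequence


def top_scoring_node(scores: Dict[str, float]) -> str:
    """Return the subtree node whose HMM gave the query the highest score."""
    return max(scores, key=scores.get)


def mrca_lineage(lineages: Sequence[Sequence[str]]) -> List[str]:
    """Approximate the MRCA as the longest prefix shared by all lineages."""
    shared: List[str] = []
    # zip(*lineages) walks the lineages rank by rank, stopping at the shortest.
    for ranks in zip(*lineages):
        if all(rank == ranks[0] for rank in ranks):
            shared.append(ranks[0])
        else:
            break
    return shared


# Hypothetical HMM scores of one query sequence against three subtree nodes.
subtree_scores = {"node_A": 35.2, "node_B": 112.7, "node_C": 88.1}

# Hypothetical taxonomic lineages of the sequences descending from each node.
subtree_leaf_lineages = {
    "node_A": [["Archaea", "Euryarchaeota"]],
    "node_B": [
        ["Bacteria", "Proteobacteria", "Gammaproteobacteria", "Enterobacterales"],
        ["Bacteria", "Proteobacteria", "Gammaproteobacteria", "Vibrionales"],
        ["Bacteria", "Proteobacteria", "Alphaproteobacteria", "Rhizobiales"],
    ],
    "node_C": [["Bacteria", "Firmicutes"], ["Bacteria", "Actinobacteria"]],
}

best_node = top_scoring_node(subtree_scores)
predicted_taxon = mrca_lineage(subtree_leaf_lineages[best_node])
print(best_node, predicted_taxon)
```

In this toy case the query lands in a subtree that mixes Gamma- and Alphaproteobacteria, so the prediction is only resolved to the phylum level. A node deeper in the tree, with more closely related sequences under it, would give a more specific taxonomic call.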

My brother’s dream come true: baseball meets bioinformatics

James Fraser & Michael Eisen: Baseball Meets Biology

Harvard Crimson Editorial Update

OK, so I am biased here, but those interested in Open Access should check out my brother’s letter to the Harvard Crimson, which was published today. He wrote it in response to the lame editorial the Crimson ran about PLoS One. Some of my favorite quotes from his letter:

They did not, however, respond to your repellent effort to rally the forces of elitism to derail a project whose primary aim is to rapidly bring scientific knowledge to everyone.

….

Once they see PLoS One, we are confident that consumers of scientific papers will discover what employers have long ago: If you’re looking for the imprimatur of greatness, try Nature or Harvard—but if you want the real thing, try PLoS One or Berkeley.

Of course, I disagree with the use of Berkeley in this context. Yes, it is a public school. But come on: to use Berkeley as the “anti-elitist” school of the world is a big stretch. So if you want the real thing, try U. C. Davis, not Berkeley.