Just discovered an interesting paper by Leonardo de Oliveira Martins and David Posada. It is titled “Proving universal common ancestry with similar sequences.” It relates to a paper by Douglas Theobald: “A formal test of the theory of universal common ancestry. Nature 2010; 465:219-22.” Although the latter paper is not openly available the more recent one is.
The new paper is worth a look. Not sure about the Theobald one as I do not have access from home.
Am hoping Leonardo writes more about this in his blog: Bayesian Procedures in Biology ….
Tag: Evolution
‘Danger and Evolution in the Twilight Zone’: Guest post by Randen Patterson and Gaurav Bhardwaj
Figure 1. PHYRN concept and work flow.
I have been communicating with Randen Patterson on and off over the last five years or so about his efforts to study the evolution of gene families when the sequence similarity in the gene family is so low that making multiple sequence alignments is very difficult. Recently, Randen moved to UC Davis so I have been talking / emailing with him more and more about this issue. Of note, Randen has a new paper in PLoS One about this topic: Bhardwaj G, Ko KD, Hong Y, Zhang Z, Ho NL, et al. (2012) PHYRN: A Robust Method for Phylogenetic Analysis of Highly Divergent Sequences. PLoS ONE 7(4): e34261. doi:10.1371/journal.pone.0034261.
Figure 8. Model for the Evolution of the DANGER Superfamily.
I invited Randen and the first author Gaurav Bhardwaj to do a guest post here providing some of the story behind their paper for my ongoing series on this topic. I note – if you have published an open access paper on some topic related to this blog I would love to have a guest post from you too. I note – I personally love the fact that they used the “DANGER” family as an example to test their method.
Here is their guest post:
A fundamental problem for phylogenetic inference in the “twilight zone” (<25% pairwise identity), let alone the “midnight zone” (<12% pairwise identity), is the inability to accurately assign evolutionary relationships at these levels of divergence with statistical confidence. This lack of resolution arises from difficulties in separating the phylogenetic signal from the random noise at these levels of divergence. This obviously and ultimately stymies all attempts to truly resolve the Tree of Life. Since most attempts at phylogenetic inference in the twilight/midnight zone have relied on MSA, and with no clear answer on the best phylogenetic methods to resolve protein families in the twilight/midnight zone, we have presented the rest of this blog post as two questions representative of these problems.
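To make the two thresholds concrete, here is a minimal Python sketch that computes percent identity for a pair of already-aligned sequences and labels the result with the zones defined above. The function names are hypothetical, and the choice to count only columns where neither sequence has a gap is an assumption; other conventions for the denominator exist and give different numbers.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumption: `a` and `b` are rows of the same alignment (equal length,
    '-' marks a gap). Gap-containing columns are skipped entirely.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


def divergence_zone(identity: float) -> str:
    """Label a percent identity using the thresholds quoted in the post."""
    if identity < 12.0:
        return "midnight zone"
    if identity < 25.0:
        return "twilight zone"
    return "above twilight zone"
```

For example, `divergence_zone(percent_identity(row_i, row_j))` could be applied to every pair of rows in an alignment to see how much of a gene family sits below each threshold.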
Question 1: Is MSA required for accurate phylogenetic inference?
Our Opinion: MSA is an excellent tool for inference from conserved data sets, but it has been shown by others and by us that the quality of MSA degrades rapidly in the twilight zone. Further, the quest for an optimal MSA becomes increasingly difficult as the number of taxa under study increases. Although the quality of MSA methods has improved in the last two decades, we have not made significant improvements towards overcoming these problems. Multiple groups have also designed alignment-free methods (see Hohl and Ragan, Syst. Biol. 2007), but so far none of these methods has been able to provide better phylogenetic accuracy than MSA+ML methods. We recently published a manuscript in PLoS One entitled “PHYRN: A Robust Method for Phylogenetic Analysis of Highly Divergent Sequences” introducing a hybrid profile-based method. Our approach focuses on measuring phylogenetic signal from homologous biological patterns (functional domains, structural folds, etc.), and their subsequent amplification and encoding as phylogenetic profiles. Further, we adopt a distance estimation algorithm that is alignment-free, and thus bypasses the need for an optimal MSA. Our benchmarking studies with synthetic (from ROSE and Seqgen) and biological datasets show that PHYRN outperforms other traditional methods (distance, parsimony and Maximum Likelihood), and provides significantly accurate phylogenies even in data sets exhibiting ~8% average pairwise identity. While this still needs to be evaluated in other simulations (varying tree shapes, rates, models), we are convinced that these types of methods do work and deserve further exploration.
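As a rough illustration of what a profile-based, alignment-free distance can look like, the Python sketch below treats each sequence as a vector of scores against a shared library of patterns and computes pairwise distances between those vectors, with no multiple sequence alignment involved. This is not the PHYRN implementation: the profile values, the function name, and the use of Euclidean distance are all assumptions chosen for illustration.

```python
import math
from typing import Dict, List


def profile_distance_matrix(
    profiles: Dict[str, List[float]]
) -> Dict[str, Dict[str, float]]:
    """Pairwise Euclidean distances between per-sequence score vectors.

    Assumption: every vector has one score per pattern in the same order,
    so position k means the same pattern for every sequence.
    """
    names = list(profiles)
    matrix: Dict[str, Dict[str, float]] = {n: {} for n in names}
    for i, a in enumerate(names):
        for b in names[i:]:
            d = math.dist(profiles[a], profiles[b])  # Python 3.8+
            matrix[a][b] = d
            matrix[b][a] = d
    return matrix
```

A matrix like this could then be handed to any distance-based tree-building method, such as neighbor joining.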
Question 2: How can we as a field critically and fairly evaluate phylogenetic methods?
Our Opinion: A similar problem plagued the field of structural biology, where there were multiple methods for structural prediction but no clear way of standardizing or evaluating their performance. An additional problem that applies to phylogenetic inference is that, unlike crystal structures of proteins, phylogenies do not have a corresponding “answer” that can be obtained. Synthetic data sets have tried to answer this question to a certain extent by simulating protein evolution and providing true evolutionary histories that can be used for benchmarking. However, these simulations cannot truly replicate biological evolution (e.g. indel distribution, translocations, biologically relevant birth-death models, etc.). In our opinion, we need a CASP-like model (the solution adopted by our friends in computational structural biology), where the same data sets (with the true evolutionary history known only to the organizers) are analyzed by all the research groups, and the inferred trees are then submitted to the organizers for critical evaluation. To convert this thought to reality, we hereby announce CAPE (Critical Assessment of Protein Evolution) for Summer 2013. We are still in pre-production stages, and we welcome any suggestions, comments and input about data sets, scoring and evaluation methods.
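One commonly used way to score an inferred tree against a known true tree is a Robinson-Foulds style count of the groupings that appear in one tree but not the other. The Python sketch below does this for rooted trees written as nested tuples of leaf names. It is only an illustration of the kind of scoring that could be discussed, not a scheme stated in this post; the function names and the tree representation are assumptions.

```python
def clades(tree) -> set:
    """Collect the leaf set under every internal node of a nested-tuple tree.

    Assumption: leaves are strings and internal nodes are tuples of children.
    """
    found = set()

    def walk(node) -> frozenset:
        if isinstance(node, str):
            return frozenset([node])
        leaves = frozenset().union(*(walk(child) for child in node))
        found.add(leaves)
        return leaves

    walk(tree)
    return found


def rooted_rf_distance(true_tree, inferred_tree) -> int:
    """Number of clades present in exactly one of the two trees (0 = identical topology)."""
    return len(clades(true_tree) ^ clades(inferred_tree))
```

With a true tree of `(("A", "B"), ("C", "D"))`, an inferred tree with the same groupings scores 0, and one that pairs A with C and B with D scores 4.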
Bhardwaj, G., Ko, K., Hong, Y., Zhang, Z., Ho, N., Chintapalli, S., Kline, L., Gotlin, M., Hartranft, D., Patterson, M., Dave, F., Smith, E., Holmes, E., Patterson, R., & van Rossum, D. (2012). PHYRN: A Robust Method for Phylogenetic Analysis of Highly Divergent Sequences PLoS ONE, 7 (4) DOI: 10.1371/journal.pone.0034261
Quick post – new paper of interest on "The Infinitely Many Genes Model …"
This paper seems of potential interest: The Infinitely Many Genes Model for the Distributed Genome of Bacteria by Franz Baumdicker, Wolfgang R. Hess, and Peter Pfaffelhuber
Abstract:
The distributed genome hypothesis states that the gene pool of a bacterial taxon is much more complex than that found in a single individual genome. However, the possible fitness advantage, why such genomic diversity is maintained, whether this variation is largely adaptive or neutral, and why these distinct individuals can coexist, remains poorly understood. Here, we present the infinitely many genes (IMG) model, which is a quantitative, evolutionary model for the distributed genome. It is based on a genealogy of individual genomes and the possibility of gene gain (from an unbounded reservoir of novel genes, e.g., by horizontal gene transfer from distant taxa) and gene loss, for example, by pseudogenization and deletion of genes, during reproduction. By implementing these mechanisms, the IMG model differs from existing concepts for the distributed genome, which cannot differentiate between neutral evolution and adaptation as drivers of the observed genomic diversity. Using the IMG model, we tested whether the distributed genome of 22 full genomes of picocyanobacteria (Prochlorococcus and Synechococcus) shows signs of adaptation or neutrality. We calculated the effective population size of Prochlorococcus at 1.01 × 10^11 and predicted 18 distinct clades for this population, only six of which have been isolated and cultured thus far. We predicted that the Prochlorococcus pangenome contains 57,792 genes and found that the evolution of the distributed genome of Prochlorococcus was possibly neutral, whereas that of Synechococcus and the combined sample shows a clear deviation from neutrality.
Wish they had gone beyond these two cyanobacteria … but still seems of possible interest.
Baumdicker, F., Hess, W., & Pfaffelhuber, P. (2012). The Infinitely Many Genes Model for the Distributed Genome of Bacteria Genome Biology and Evolution, 4 (4), 443-456 DOI: 10.1093/gbe/evs016
Twisted tree of Life Award #13: Press release from U. Oslo on new protozoan
Wow. Just got pointed to this press release: “Rare protozoan from sludge in Norwegian lake does not fit on main branches of tree of life” (hat tip to Bill Hooker). It is a long PR. And it is riddled with many examples of evolutionary mumbo jumbo – each of which on its own could win a Twisted Tree of Life Award here. And together, well, I am just going to give it one award – the Twisted Tree of Life Award #14.
Here are some statements that are, well, dubious, and/or painful.
- Biologists all over the world have been eagerly awaiting the results of the genetic analysis of one of the world’s smallest known species, hereafter called the protozoan, from a little lake 30 kilometer south of Oslo in Norway.
  - Wow – really? All over the world?
  - And why not tell us what the F#&$# it is? Where is the name of the organism? WTF?
- When researchers from the University of Oslo, Norway compared its genes with all other known species in the world, they saw that the protozoan did not fit on any of the main branches of the tree of life. The protozoan is not a fungus, alga, parasite, plant or animal.
  - That is right. There are five main branches on the tree of life. Fungi. Alga. Parasites. Plants. And animals. Uggh.
- His research group studies tiny organisms hoping to find answers to large, biological questions within ecology and evolutionary biology, and works across such different fields as biology, genetics, bioinformatics, molecular biology and statistics
  - Yes, and I study tiny organisms to answer small questions.
- Life on Earth can be divided up into two main groups of species, prokaryotes and eukaryotes. The prokaryote species, such as bacteria, are the simplest form of living organisms on Earth.
  - Yup, two main groups. As of 40 f3$*@# years ago.
- The micro-organism is among the oldest, currently living eukaryote organisms we know of. It evolved around one billion years ago, plus or minus a few hundred million years.
  - OMG. This is a MODERN ORGANISM. It did not evolve a billion years ago. It is no older than ANYTHING ELSE ON THE PLANET. AAAAAARRRGH.
- The tree of life can be divided into organisms with one or two flagella
  - What?
  - The tree of life can also be divided into organisms with one or two penises.
- Just like all other mammals, human sperm cells have only one flagellum. Therefore, humankind belongs to the same single flagellum group as fungi and amoebae.
  - I don’t even know what to say here.
- The protozoan from Ås has four flagella. The family it belongs to is somewhere between excavates, the oldest group with two flagella, and some amoebae, which is the oldest group with only one flagellum.
  - Wow – no prior description of the major groups of eukaryotes and now we use excavates (kind of technical) and amoebae (not technical). Translation error?
  - But even w/ translation issues still very strange.
- “Were we to reconstruct the oldest, eukaryote cell in the world, we believe it would resemble our species. To calculate how much our species has changed since primordial times, we have to compare its genes with its nearest relatives, amoebae and excavates,” says Shalchian-Tabrizi.
  - What? Their species has been around since primordial times? What? That is one really old cell.
- The protozoan lives off algae, but the researchers still do not know what eats the protozoan.
  - Why does something have to eat it?
- The protozoan was discovered as early as 1865, but it is only now that, thanks to very advanced genetic analyses, researchers understand how important the species is to the history of life on Earth
  - Very advanced? Like, what? Sequencing?
- The problem is that DNA sequences change a lot over time. Parts of the DNA may have been wiped away during the passing of the years. Since the protozoan is a very old species, an extra large amount of gene information is required
  - What? Since it is old they need more DNA? What?
Guest Post on Viruses from Claudiu Bandea
Guest Post Today from Claudiu Bandea.
Claudiu wrote to me after my paper on “Stalking the Fourth Domain” came out.
He wrote
Jonathan,
I posted a comment on your ‘PLoSOne paper’ blog, but I thought of sending you this mail.
You might be interested in taking a look at the attached paper presenting a fusion model for the origin of ‘ancestral viruses’ from parasitic or symbiotic cellular species, and its implication for the evolution of viruses and cellular domains, which I’m attaching here (you can see the entire series, including comments, at: http://precedings.nature.com/documents/3886/version/1). Possibly, the novel sequences you discovered belong to such ‘transitional forms’ between the cellular domains and the viral domains.
I know it’s a lot of material, but you might want to focus on Fig. 4 and the related discussion about TOLs from the perspective of the current hypotheses on origin and evolution of viruses. Because of your interest in TOL, I want to ask your thoughts on the difference between the concept of TOL based on the line-of-descent, the ways it was historically intended, and the current approaches of using (mostly) sequences which, as you know, due to LGT might not necessarily reflect the line-of-descent relationships.
Best,
Claudiu
After a bit of a back and forth I offered to let him write a guest post on my blog about this. He accepted my offer. I note – I am not endorsing any of his ideas here and to be honest I have not read the papers he refers to – I have skimmed them and they seem interesting but I have not had a chance to read them. I also note – I am a bit uncomfortable with the fact that I cannot seem to find any Web Profile / Web Site / Blog / etc. with more detail about him and his work. On one hand – ideas are ideas and they can and should stand on their own. On the other hand context is useful in many cases and I feel like I am missing some context here. He works at the CDC but I am not sure what he actually does there. But in the interest of open discussion of ideas and since, well, not having a web site is certainly not a crime, his post is below.
The most efficient way of silencing ideas is not by criticizing them but by pretending they don’t exist. The antidote might be the blogging world.
A couple of decades ago, I published a novel model on the evolutionary origin of ancestral viral lineages. Recently, I updated this model and integrated it into an ambitious unifying scenario on the origin and evolution of cellular and viral domains, including the origin of life; well, that might have just buried it so deep that it’s gone for good even for those with an open mind and noble intentions.
So, I would like to ask you the favor of reviewing and criticizing this model. As a primer, you might want to read a comment I posted last summer on a book review by Robin Weiss. The book was Carl Zimmer’s A Planet of Viruses and the review by Dr. Weiss, one of the most distinguished contemporary virologists, was entitled Potent Tiny Packages, which symbolizes our century-long perspective on the nature of viruses as virus particles. If we have reasons to call Earth a planet of viruses, as I think Carl successfully made the point, then viruses require our full attention, including the right to be correctly identified and to be included in the Tree of Life.
I know, this is a lot of material, but I hope you’ll find it interesting, and I would be thrilled to address your questions and listen to your ideas.
12 hours of me: Slideshows w/ audio from "BIS2C: Biodiversity & the Tree of Life" at #UCDavis
Well, it has taken a few months of processing but I have finally gotten my lectures from the introductory biology course I teach uploaded in some way to share. The course is “BIS2C: Biodiversity and the Tree of Life” and it is the third quarter of a three-quarter introductory biology series at UC Davis. Each year some 2300 or so students take this series, which means that we at UC Davis have to offer each of the courses (BIS2A, BIS2B, and BIS2C) each quarter. Every fall I co-teach BIS2C. Alas we do not have a lecture hall big enough for 700 students, so we do the course in two sections. The way we teach it, each of the faculty doubles up and teaches their part of the course to each section. The course also has a weekly lab. It is a machine of sorts.
This fall I taught 13 lectures for the course. I covered basically phylogenetic methods, the big picture of the tree of life, and microbial diversity. I used the Apple presentation program Keynote for slides for my lectures and I used the “Record Slideshow” option to record audio in synch with the slides. After a bit of pain, I managed to convert these recordings into video and then posted them to YouTube. And today I am sharing them with you. There are imperfections of course. But I thought some might find them useful. Plus I have made a YouTube playlist for all the lectures if you want to just sit down and enjoy 12 hours or so of me. Now if only YouTube would allow me to change the thumbnail image for each lecture … Plus I note – next year I will be doing much more interactive learning in class so this may be the last record of some of these lectures …
Lecture 1: Introduction to Course and the Tree of Life
Lecture 2: Trees, Taxa and Groups
Lecture 3: Characters
Lecture 4: Phylogenetic Inference
Lecture 5: Phylogenetic Inference
Lecture 6: The Tree of Life
Lecture 7: The Three Domains
Lecture 8: Three Domains and Microbial Diversity
Lecture 9: Microbial Diversity
Lecture 10: Endosymbioses and Lateral Gene Transfer
Lecture 11: Endosymbioses and Lateral Gene Transfer
Lecture 12: Extremophiles
Lecture 13: Human Associated Microbes
Just grand – Donald Williamson published more crap on larval "evolution" – this time in one of the #OMICS journals
Well this is great. Just great. Donald Williamson has a new paper “The Origins of Chordate Larvae” published in Cell and Developmental Biology – a spammy journal from the OMICS publishing group. Don’t know who Williamson is? Well consider yourself lucky. For more on him and his horrendous history of publishing crap see:
- Another paper on “symbiotic speciation” by Donald Williamson is retracted …
- A flying what? Symbiosis retracts paper claiming … – Retraction Watch
- D’oh! Top Science Journal Retractions of 2011: Scientific American
- RETRACTED ARTICLE: Larval genome transfer: hybridogenesis in …
- Surprise! Sometimes Science Gets It Wrong. | The Mary Sue
The Axis of Evol: Getting to the Root of DNA Repair with Philogeny
| Process | Genes in D. radiodurans | Unusual features |
|---|---|---|
| Nucleotide Excision Repair | UvrABCD, UvrA2 | UvrA2 not found in most species |
| Base Excision Repair | AlkA, Ung, Ung2, GT, MutM, MutY-Nths, MPG | More MutY-Nths than most species |
| AP Endonuclease | Xth | – |
| Mismatch Excision Repair | MutS, MutL | – |
| Recombination: Initiation | RecFJNRQ, SbcCD, RecD | – |
| Recombination: Recombinase | RecA | – |
| Recombination: Migration and resolution | RuvABC, RecG | – |
| Replication | PolA, PolC, PolX, phage Pol | PolX not in many bacteria |
| Ligation | DnlJ | – |
| dNTP pools, cleanup | MutTs, RRase | – |
| Other | LexA, RadA, HepA, UVDE, MutS2 | UvDE not in many bacteria |
Evolution rap: 3.5* til infinity #music
Well, after a rough day I am in need of some lightness. And thanks to Eric Lowe, an undergrad. working in my lab, I got a giggle out of this:
Slideshow w/ audio of my talk on "A Field Guide to the Microbes" from the AAAS Meeting #AAASMtg
I recorded the audio of my talk on “Towards a field guide to the microbes” from the AAAS meeting on Saturday AM. Here is a slideshow of the talk with audio synched to the slides (I did this using Keynote on a Mac with the “record Slideshow” function).
My slides from the talk are available at Slideshare.





