‘Danger and Evolution in the Twilight Zone’: Guest post by Randen Patterson and Gaurav Bhardwaj

Figure 1. PHYRN concept and work flow.

‘Danger and Evolution in the twilight zone’

I have been communicating with Randen Patterson on and off over the last five years or so about his efforts to study the evolution of gene families when the sequence similarity in the gene family is so low that making multiple sequence alignments is very difficult.  Recently, Randen moved to UC Davis so I have been talking / emailing with him more and more about this issue.  Of note, Randen has a new paper in PLoS One about this topic: Bhardwaj G, Ko KD, Hong Y, Zhang Z, Ho NL, et al. (2012) PHYRN: A Robust Method for Phylogenetic Analysis of Highly Divergent Sequences. PLoS ONE 7(4): e34261. doi:10.1371/journal.pone.0034261.

Figure 8. Model for the Evolution of the DANGER Superfamily.

I invited Randen and the first author Gaurav Bhardwaj to do a guest post here providing some of the story behind their paper for my ongoing series on this topic.  I note – if you have published an open access paper on some topic related to this blog I would love to have a guest post from you too.   I also note – I personally love the fact that they used the “DANGER” family as an example to test their method.

Here is their guest post:

A fundamental problem for phylogenetic inference in the “twilight zone” (<25% pairwise identity), let alone the “midnight zone” (<12% pairwise identity), is the inability to accurately assign evolutionary relationships at these levels of divergence with statistical confidence. This lack of resolution arises from the difficulty of separating the phylogenetic signal from the random noise at these levels of divergence. This obviously and ultimately stymies all attempts to truly resolve the Tree of Life. Since most attempts at phylogenetic inference in the twilight/midnight zone have relied on MSA, and with no clear answer on the best phylogenetic methods to resolve protein families in the twilight/midnight zone, we have presented the rest of this blog post as two questions representative of these problems.
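To make the thresholds above concrete, here is a minimal Python sketch that computes percent identity for two already-aligned sequences and labels the result. The function names are made up for illustration, and the choice to ignore columns containing a gap is an assumption; percent identity can be defined with other denominators, which would shift the numbers.

```python
def percent_identity(seq_a, seq_b):
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumes both sequences come from the same alignment (equal length)
    and that '-' marks a gap. Gapped columns are skipped (an assumption).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(seq_a, seq_b) if x != "-" and y != "-"]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


def divergence_zone(identity):
    """Label a percent identity using the cutoffs quoted in the post."""
    if identity < 12:
        return "midnight zone"
    if identity < 25:
        return "twilight zone"
    return "above the twilight zone"


# Toy aligned pair: 2 matches out of 10 ungapped columns.
pid = percent_identity("MKTAYIAKQR", "MLSDEVWKGH")
print(pid, divergence_zone(pid))
```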

Question 1: Is MSA required for accurate phylogenetic inference?

Our Opinion: MSA is an excellent tool for inference from conserved data sets, but it has been shown by others and by us that the quality of MSA degrades rapidly in the twilight zone. Further, the quest for an optimal MSA becomes increasingly difficult as the number of taxa under study increases. Although the quality of MSA methods has improved in the last two decades, we have not made significant improvements towards overcoming these problems. Multiple groups have also designed alignment-free methods (see Hohl and Ragan, Syst. Biol. 2007), but so far none of these methods has been able to provide better phylogenetic accuracy than MSA+ML methods. We recently published a manuscript in PLoS One entitled “PHYRN: A Robust Method for Phylogenetic Analysis of Highly Divergent Sequences” introducing a hybrid profile-based method. Our approach focuses on measuring phylogenetic signal from homologous biological patterns (functional domains, structural folds, etc.), and on their subsequent amplification and encoding as a phylogenetic profile. Further, we adopt a distance estimation algorithm that is alignment-free, and thus bypasses the need for an optimal MSA. Our benchmarking studies with synthetic (from ROSE and Seq-Gen) and biological data sets show that PHYRN outperforms other traditional methods (distance, parsimony and Maximum Likelihood), and provides accurate phylogenies even in data sets exhibiting ~8% average pairwise identity. While this still needs to be evaluated in other simulations (varying tree shapes, rates, models), we are convinced that these types of methods do work and deserve further exploration.
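As a rough illustration of the general idea of computing distances from profile encodings rather than from an MSA, here is a small Python sketch. It assumes each sequence has already been scored against the same ordered set of profiles (the scores below are invented), and it uses plain Euclidean distance between those score vectors. Both the scoring and the metric are assumptions for illustration, not a description of the actual PHYRN algorithm.

```python
import math


def profile_distances(profiles):
    """Pairwise Euclidean distances between per-sequence score vectors.

    `profiles` maps a sequence name to a list of scores, one per profile,
    in the same profile order for every sequence (hypothetical input).
    """
    names = sorted(profiles)
    distances = {}
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            distances[(first, second)] = math.dist(profiles[first], profiles[second])
    return distances


# Invented scores of three sequences against four profiles.
example_profiles = {
    "seq1": [0.9, 0.1, 0.0, 0.4],
    "seq2": [0.8, 0.2, 0.1, 0.4],
    "seq3": [0.0, 0.7, 0.9, 0.1],
}

for pair, dist in profile_distances(example_profiles).items():
    print(pair, round(dist, 3))
```

A distance matrix of this kind could then be handed to any distance-based tree builder, such as neighbor joining.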

Question 2: How can we as a field critically and fairly evaluate phylogenetic methods? 

Our Opinion: A similar problem plagued the field of structural biology, whereby there were multiple methods for structural prediction but no clear way of standardizing or evaluating their performance.  An additional problem that applies to phylogenetic inference is that, unlike crystal structures of proteins, phylogenies do not have a corresponding “answer” that can be obtained.  Synthetic data sets have tried to answer this question to a certain extent by simulating protein evolution and providing true evolutionary histories that can be used for benchmarking.  However, these simulations cannot truly replicate biological evolution (e.g. indel distribution, translocations, biologically relevant birth-death models, etc.). In our opinion, we need a CASP-like model (the solution adopted by our friends in computational structural biology), where the same data sets (with the true evolutionary history known only to the organizers) are analyzed by all the research groups, and the inferred phylogenies are then submitted to the organizers for critical evaluation. To convert this thought to reality, we hereby announce CAPE (Critical Assessment of Protein Evolution) for Summer 20132. We are still in pre-production stages, and we welcome any suggestions, comments and input about data sets, scoring and evaluation methods.
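One common way to score a submitted tree against a known true tree is a Robinson-Foulds style distance, which counts the groupings found in one tree but not the other. The Python sketch below is a simplified rooted version that compares clades; it is only an illustration of how such scoring could work, not a scoring scheme announced for CAPE. Trees are written as nested tuples of leaf names, which is a representation chosen here for brevity.

```python
def clades(tree):
    """Collect the leaf set under every internal node of a rooted tree.

    A tree is a nested tuple; a leaf is a string (hypothetical format).
    """
    found = set()

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        leaves = frozenset().union(*(walk(child) for child in node))
        found.add(leaves)
        return leaves

    walk(tree)
    return found


def rooted_rf_distance(true_tree, inferred_tree):
    """Number of clades present in exactly one of the two trees."""
    return len(clades(true_tree) ^ clades(inferred_tree))


true_tree = (("A", "B"), ("C", "D"))
inferred_tree = (("A", "C"), ("B", "D"))
print(rooted_rf_distance(true_tree, true_tree))      # identical trees
print(rooted_rf_distance(true_tree, inferred_tree))  # conflicting groupings
```

A distance of zero means the two trees contain exactly the same clades; larger values mean more disagreement.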

Bhardwaj, G., Ko, K., Hong, Y., Zhang, Z., Ho, N., Chintapalli, S., Kline, L., Gotlin, M., Hartranft, D., Patterson, M., Dave, F., Smith, E., Holmes, E., Patterson, R., & van Rossum, D. (2012). PHYRN: A Robust Method for Phylogenetic Analysis of Highly Divergent Sequences. PLoS ONE, 7 (4) DOI: 10.1371/journal.pone.0034261

Bioinformatics Workshops at UC Davis Genome Center

From the UC Davis Genome Center

Our RNASeq Bioinformatics Analysis Workshop has reached its capacity. However, we still have a few seats open for two other workshops. We will be offering our 5th annual week-long Next Generation Sequencing Data Analysis Course on Sept. 10 – 14, 2012. More information will come later at our training website http://training.bioinformatics.ucdavis.edu/

The workshops this year will focus on hands-on and problem solving skills in addition to lectures on basic concepts and terminologies.
The topics are:
1. (FULL) RNASeq Bioinformatics Analysis, May 21-22, 2012 (No prior programming experience necessary)
2. (2 Seats) Data Analysis and Visualization using R, May 23-24, 2012 (Basic programming experience required)
3. (6 Seats) Cloud Computing for Bioinformatics, May 29-30, 2012 (Basic programming experience and system administration knowledge required)

More information can be found at http://training.bioinformatics.ucdavis.edu/

Quick post – new paper of interest on "The Infinitely Many Genes Model …"

This paper seems of potential interest: The Infinitely Many Genes Model for the Distributed Genome of Bacteria by Franz Baumdicker, Wolfgang R. Hess, and Peter Pfaffelhuber

Abstract:

The distributed genome hypothesis states that the gene pool of a bacterial taxon is much more complex than that found in a single individual genome. However, the possible fitness advantage, why such genomic diversity is maintained, whether this variation is largely adaptive or neutral, and why these distinct individuals can coexist, remains poorly understood. Here, we present the infinitely many genes (IMG) model, which is a quantitative, evolutionary model for the distributed genome. It is based on a genealogy of individual genomes and the possibility of gene gain (from an unbounded reservoir of novel genes, e.g., by horizontal gene transfer from distant taxa) and gene loss, for example, by pseudogenization and deletion of genes, during reproduction. By implementing these mechanisms, the IMG model differs from existing concepts for the distributed genome, which cannot differentiate between neutral evolution and adaptation as drivers of the observed genomic diversity. Using the IMG model, we tested whether the distributed genome of 22 full genomes of picocyanobacteria (Prochlorococcus and Synechococcus) shows signs of adaptation or neutrality. We calculated the effective population size of Prochlorococcus at 1.01 × 10^11 and predicted 18 distinct clades for this population, only six of which have been isolated and cultured thus far. We predicted that the Prochlorococcus pangenome contains 57,792 genes and found that the evolution of the distributed genome of Prochlorococcus was possibly neutral, whereas that of Synechococcus and the combined sample shows a clear deviation from neutrality.

Wish they had gone beyond these two cyanobacteria … but still seems of possible interest.

Baumdicker, F., Hess, W., & Pfaffelhuber, P. (2012). The Infinitely Many Genes Model for the Distributed Genome of Bacteria. Genome Biology and Evolution, 4 (4), 443-456 DOI: 10.1093/gbe/evs016

iEvoBio Call for Challenge entries for conference on Informatics for Phylogenetics, Evolution, and Biodiversity (iEvoBio)

From: Hilmar Lapp
Date: Thu, May 10, 2012 at 11:27 AM
Subject: [iEvoBio] Call for Challenge entries for conference on Informatics for Phylogenetics, Evolution, and Biodiversity (iEvoBio)
To: iEvoBio Announcements

Many trees enter.
Fresh analysis ensues.
New insights emerge.

As a reminder, the iEvoBio conference is again holding a Challenge competition in 2012, this time on the theme, "Synthesizing Phylogenies." Further information on the nature of challenge entries and how to submit them can be found on the iEvoBio website at http://ievobio.org/challenge.html. Submissions are due by June 25, 2012. Cash prizes will be awarded for first place (USD 1,500) and runner-up entries. The winning entries will be selected by a vote of the iEvoBio meeting participants.

Also, alongside the iEvoBio Challenge, 2012 iEvoBio sponsor Biomatters Ltd is running the Geneious Challenge. The goal of this challenge is to develop a new plugin to Geneious Pro, using the public API, that enables a new and exciting visualization or analysis. The winning entry will receive a $1000 cash prize, and all entrants who submit by the deadline will receive a 12-month subscription license. The deadline for the Geneious Challenge is the same as for the iEvoBio Challenge. See http://ievobio.org/geneious_challenge.html for more information.

More details about the iEvoBio conference and program are available at http://ievobio.org. You can also find continuous updates on the conference’s Twitter feed at http://twitter.com/iEvoBio and Google+ page, or subscribe to the low-traffic iEvoBio announcements mailing list at http://groups.google.com/group/ievobio-announce.

iEvoBio 2012 is sponsored by the US National Evolutionary Synthesis Center (NESCent) and by Biomatters Ltd., in partnership with the Society for the Study of Evolution (SSE) and the Society of Systematic Biologists (SSB).

The iEvoBio 2012 Organizing Committee:
Hilmar Lapp, US National Evolutionary Synthesis Center (chair)
Robert Beiko, Dalhousie University
Nico Cellinese, University of Florida
Robert Guralnick, University of Colorado at Boulder
Rebecca Kao, Denver Botanic Gardens
Ellinor Michel, Natural History Museum, London
Nadia Talent, Royal Ontario Museum
Andrea Thomer, University of Illinois at Urbana-Champaign

New paper: MicrobeDB: a locally maintainable database of microbial genomic sequences

New paper out involving the lab.  The lead author is Morgan Langille, who was a post doc in the lab and is now at Dalhousie University.  The paper describes a tool for creation and maintenance of a local genome sequence database. See MicrobeDB: a locally maintainable database of microbial genomic sequences.

Software is available at http://github.com/mlangill/microbedb/.

 

Any method allowed for presentations at ASM meeting, as long as you use Powerpoint on a PC.

Just got this email from ASM linking to a message about my presentation at the upcoming ASM meeting in San Francisco.

Here is the message.

Ugggh.

I highlight some parts that I find disappointing at best.  Basically, they say “You can do your presentation in any way.  As long as you convert it to PowerPoint for a PC.”  Never mind other tools to do presentations.  Like Keynote.  Or Prezi.  Or, well, anything else.  Never mind people who use Macs.  Or Linux computers.  Or iPads.

I have NEVER had a problem doing a presentation off of my Mac or iPad.  I have had MANY problems when I have converted my Keynote or PDF files or other material to Powerpoint for a PC.

Oh, and forget about modifying your presentation in response to anything going on in the session (which I do frequently).  I try to tune my slides to the actual crowd.  No longer possible.

Maybe I should use no slides, like I did at TEDMED.  Or maybe I should do a Ross Perot and have charts.  Maybe I will bring my own projector and set it up just before my talk …  who knows … but I hate it when meetings say “Trust us – you won’t have any problems with our system”.

May 9, 2012

Dear Jonathan Eisen;
Thank you for participating as a speaker at asm2012, ASM’s 112th General Meeting in San Francisco, June 16-19, 2012. As a speaker, we kindly request that you consider the following guidelines as you finalize your PowerPoint presentation in the session listed below and also take note of some of the new requirements and changes for asm2012.

Session Details
Session Date/Time: 6/17/2012 3:00:00 PM – 6/17/2012 5:30:00 PM
Session Title: The Great Indoors: Recent Advances in the Ecology of Built Environments
Presentation Title: microBEnet: The Microbiology of the Built Environment Network (If your presentation title is not listed or incorrect, please provide this information to xxxxx immediately.)
Length of your Personal Presentation: You are allotted 30 minutes for your presentation or lecture unless otherwise notified by the convener of this session.


New this Year
In order to provide the highest quality experience for our attendees, ASM now requires that all speakers upload their presentations at least four hours before their session begins (if you have a morning session, we recommend you upload your slides at 2:00 p.m. the afternoon before).

Speakers will no longer be permitted to use personal laptops during their presentations. The General Meeting began experiencing a greater number of technical issues as more and more speakers relied on using personal laptops from which to present. This contributed to unique technical situations during individual presentations as well as awkward transitions between speakers. The General Meeting will be utilizing the Presentation Management System for all speakers and our talented and dedicated set of technicians will be well-prepared and equipped to ensure all presentations are presented in the way they are intended and there are smooth transitions between speakers.

asm2012 will feature a networked presentation submission system, called our Presentation Management System. The tips below will help ensure that little, if any, editing will need to be done on-site, allowing you to quickly review your presentation and then attend other sessions in progress. However, ASM strongly recommends that you visit the Speaker Ready Room in Room 120 to test your slides before your session begins to ensure that they run properly on the Presentation Management System. The Presentation Management System can accommodate both Mac and PC based presentations.

The tips below are for both Windows and Mac users. As all the provided computers will be PCs, Mac users should additionally review Considerations for Mac Users at the bottom of this document.

· Building Your Presentation

Movies:
Please take steps to compress your videos. Uncompressed videos will take longer to upload and load within your presentation, and will not be better quality than a modern MPEG-4 codec. We can only accept movies created as MPGs, WMVs, or with the following AVI codecs: MPEG-4 (Divx, Xvid, or WMVs), Cinepak, Techsmith. Flash content (SWF) is fully supported.

Note: It is important your movies do not completely fill the screen. In the meeting room you will only have a mouse to advance your slides. You can only advance your PowerPoint with a mouse by clicking on the slide, not the movie itself.

DVDs:
If you plan to play a DVD as part of your presentation, please notify a technician in the Speaker Ready Room so arrangements can be made for assistance in your meeting room.

Fonts:
We only supply fonts that are included with Office 2010. If you need a specialized font, it should be embedded into your PowerPoint presentation. For instructions on this process, please click on the following link: http://support.microsoft.com/kb/826832/en-us

· Before you Depart

Advance Submission:
You may submit your presentation starting on Friday, May 18. You will receive a notification on May 18 that will provide you with the link to upload your presentation.

Multiple Presenters:
Please do not combine multiple presenters’ PowerPoints into one file and then submit under one name. The Presentation Management System manages presenters individually and any co-presenter will not be able to logon to edit the combined presentation.

Backup:
Please bring a backup copy of your presentation along with you when you depart for your meeting. Copy your PowerPoint and all movies to a folder on a USB drive or CD. PowerPoint does NOT embed movies, and therefore, they must all be placed in the same folder as your PowerPoint.

· At the Meeting

Speaker Ready Room:

Speakers should review their presentation in the Speaker Ready Room prior to their scheduled presentation. The Speaker Ready Room will be staffed with technicians that can assist with any compatibility or formatting issues within your presentation. The computers in the Speaker Ready Room will be configured with hardware and software exactly like the computer in the meeting rooms.

Be sure to use the mouse to advance your slides, not the keyboard, as you will only have a mouse at the podium to advance your presentation. Left click advances the slides; right click goes back. Once you are comfortable that your presentation is complete, confirm the date, time, and room for your session. Be sure to click the green “save/logout” button on the top of the screen.

Speaker Ready Room: Room 120

Hours of Operation:

Friday, 6/15/12: 8:00 a.m. – 5:00 p.m.
Saturday, 6/16/12: 7:00 a.m. – 7:00 p.m.
Sunday, 6/17/12: 6:00 a.m. – 7:00 p.m.
Monday, 6/18/12: 6:00 a.m. – 7:00 p.m.
Tuesday, 6/19/12: 6:00 a.m. – 3:00 p.m.

· Considerations for Mac Users

Pictures:
If you use a version of PowerPoint prior to 2008, please be sure any embedded pictures are not TIFF format. These images will not show up in Windows PowerPoint. With PowerPoint 2010 for the Mac, this is no longer an issue, and any inserted image will be compatible.

Keynote Users:
Please export your presentation as a PowerPoint Presentation.
If you are having any issues please notify Mac support at XXXXX for additional help.

By following the guidelines above, your presentation will go smoothly. Should you have any questions not addressed in this document, please feel free to email XXXX.

If you have any questions regarding the scientific program or session details, please contact Janet Mitchell at XXXXXX.

Sincerely,

Janet M Mitchell, M.S.
Program Manager, General Meeting

Coming to #UCDavis 5/24: Nathan Wolfe on Forecasting Viral Pandemics

Nathan Wolfe flyer.pdf

Scholarly Kitchen – getting more and more rotten as the days go by

In February I wrote about how something smelled funny with the connection between “The Scholarly Kitchen” blog and the Heartland Institute: The Tree of Life: Something rotten in the Scholarly Kitchen? (Climate Change Denialism is Everywhere)

Well, though I thought the Heartland Institute was a bit extreme based on its previous “work”, it seems they have gone even more off the deep end recently with their ad campaign featuring Charles Manson and Ted Kaczynski.

This latest effort has led to an even further reduction in support for the folks at Heartland. See for example:

And more.  
So – amazingly – as Heartland dips more and more into extremism – I have seen no sign from anyone at the Scholarly Kitchen of any concern that one of their co-bloggers – David Wojick – also happens to work for the Heartland Institute:
What does this say about TSK?  Not sure.  But it continues to smell funny to me.  Wojick is using his position at TSK to make him seem like an academic.  Heartland is using his seeming academic status to promote their ideas, which get more and more extreme by the day apparently.  As even very conservative groups disavow any affiliation with Heartland, pulling money and other kinds of support, I still have yet to see any public comments from TSK folks about whether they think Wojick is using his role in their blog to indirectly promote extreme ideas …

Seminar: Translational Genomic Medicine: From the Science of Discovery to the Science of Action, 5/17, 4pm

Seminar announcement (Flier attached):

The Department of Public Health Sciences, School of Medicine, University of California, Davis
and the Graduate Group in Epidemiology present a guest talk on:

Translational Genomic Medicine: From the Science of Discovery to the Science of Action

Muin J. Khoury MD, PhD
Office of Public Health Genomics, CDC
Epidemiology and Genomics Research Program, NCI

Dr. Khoury is the first director of the CDC’s National Office of Public Health Genomics. In 2000, he received the CDC Research Honor Award for outstanding national leadership in genetics and public health. In 2005, he received the National Cancer Institute visiting scholar award for leadership and vision in genetic epidemiology and public health. He has extensive publications in genetic epidemiology and public health genomics with more than 350 peer reviewed articles and 3 books.

Thursday, May 17, 2012
4:00-5:00 pm
Building Location – 1020 Valley Hall
University of California, Davis campus

Videoconference to:
1222 Education Building, Sacramento
School of Medicine

Please RSVP to PHSInstAffairs and indicate Davis or Sacramento in the subject line.

0512KhourySeminarFlyerRevn4 .pdf

Nanovation! (May 10, 2012) — Nanotechnology and Regimes of Innovation

Please join the Center for Science & Innovation Studies, the Humanities Innovation Lab, the Program in Science & Technology Studies, and the School of Law for a one day conference.


Nanovation! — Nanotechnology and Regimes of Innovation

UC Davis Conference Center, Room A

May 10, 2012 (Schedule below)

Please RSVP for Lunch and Reception: https://www.surveymonkey.com/s/T5BSMXW

nanovation_poster.pdf