Fall 2012 Introduction to Evolution (EVE 100) teaching position at #UCDavis

DEPARTMENT OF EVOLUTION AND ECOLOGY

FALL 2012

TEACHING POSITION AVAILABLE

LECTURER

Introduction to Evolution

(EVE 100)

FALL 2012 (September 24-December 14, 2012)

Responsibilities: A 60% position teaching EVE 100 – Introduction to Evolution (4 units). Lecture–3 hours, Discussion–1 hour. The course subject provides a general survey of the origins of biological diversity and evolutionary mechanisms. Estimated enrollment: 200

Requirements: Ph.D. and demonstrated effective teaching in the subject course or equivalent course.

Salary: Commensurate with qualifications.

Please submit a letter of application, including a summary of qualifications, CV, two letters of recommendation, and any applicable teaching evaluation summaries via the link below, which contains additional information about the position.

https://recruitments.ucdavis.edu/PositionDetails.aspx?PositionID=103&Title=Fall-2012-Lecturer—Introduction-to-Evolution-%28EVE-100%29

OPEN UNTIL FILLED. FOR FULL CONSIDERATION APPLICATION MUST BE RECEIVED BY JUNE 22, 2012.

This position may be covered by a collective bargaining unit.

The University of California is an Equal Opportunity/Affirmative Action Employer with a strong institutional commitment to the development of a climate that supports equality of opportunity and respect for diversity.

05/23/12

Something fishy with this story: bacteria in fish pedicures

Well, the title drew me in, without a doubt: Fish Pedicures: Bacteria in Your Foot Soak.

To start with, I guess I have been out of touch, as I had never heard of fish pedicures before.  Sounds lovely, I must say.

Though if you are considering doing this you might be dissuaded by some of the revelations in the article including that “fish are living creatures that deposit their waste products in the very water in which people are soaking” and “the impossibility of disinfecting or sanitizing live fish.”

Amazingly, fish pedicures are in fact apparently quite popular.  So popular that there are multiple investigations relating to this practice, including that “British authorities investigated a reported bacterial outbreak among 6,000 Garra rufa fish” and “Last spring, British fish inspectors went to London’s Heathrow Airport and intercepted Indonesian shipments of the silver, inch-long freshwater carp destined for British ‘fish spas.’”

And now – the reason for this article – there is a new report in the journal Emerging Infectious Diseases on “Zoonotic Disease Pathogens in Fish Used for Pedicure.”  The article is actually somewhat fascinating and thanks to the CDC it is freely available.

Fun reading for the day …

Useful comparative analysis of sequence classification systems w/ a few questionable bits

There is a useful new publication just out: BMC Bioinformatics | Abstract | A comparative evaluation of sequence classification programs by Adam L Bazinet and Michael P Cummings.  In the paper the authors attempt to do a systematic comparison of tools for classifying DNA sequences according to the taxonomy of the organism from which they come.

I have been interested in such activities since, well, since 1989 when I started working in Colleen Cavanaugh’s lab at Harvard sequencing rRNA genes to do classification.  And I have known one of the authors, Michael Cummings for almost as long.

Their abstract does a decent job of summing up what they did:

Background
A fundamental problem in modern genomics is to taxonomically or functionally classify DNA sequence fragments derived from environmental sampling (i.e., metagenomics). Several different methods have been proposed for doing this effectively and efficiently, and many have been implemented in software. In addition to varying their basic algorithmic approach to classification, some methods screen sequence reads for ’barcoding genes’ like 16S rRNA, or various types of protein-coding genes. Due to the sheer number and complexity of methods, it can be difficult for a researcher to choose one that is well-suited for a particular analysis. 

Results
We divided the very large number of programs that have been released in recent years for solving the sequence classification problem into three main categories based on the general algorithm they use to compare a query sequence against a database of sequences. We also evaluated the performance of the leading programs in each category on data sets whose taxonomic and functional composition is known. 

Conclusions
We found significant variability in classification accuracy, precision, and resource consumption of sequence classification programs when used to analyze various metagenomics data sets. However, we observe some general trends and patterns that will be useful to researchers who use sequence classification programs.

The three main categories of methods they identified are

  • Programs that primarily utilize sequence similarity search
  • Programs that primarily utilize sequence composition models (like CompostBin from my lab)
  • Programs that primarily utilize phylogenetic methods (like AMPHORA & STAP from my lab)

The paper has some detailed discussion and comparison of some of the methods in each category.  They even made a tree of the methods:

Figure 1. Program clustering. A neighbor-joining tree that clusters the classification programs based on their similar attributes. From here.

In some ways – I love this figure.  Since, well, I love trees.  But in other ways I really really really do not like it.  I don’t like it because they use an explicitly phylogenetic method (neighbor joining, which is designed to infer phylogenetic trees and not to simply cluster entities by their similarity) to cluster entities that do not have a phylogenetic history.  Why use neighbor-joining here?  What is the basis for using this method to cluster methods?  It is cute, sure.  But I don’t get it.  What do deep branches represent in this case?  It drives me a bit crazy when people throw a method designed to represent branching history at a situation where clustering by similarity is needed.  Similarly it drives me crazy when similarity based clustering methods are used when history is needed.
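
To make the distinction concrete, here is a minimal sketch of clustering purely by similarity, using average-linkage agglomerative clustering on binary attribute vectors. This is not what the paper did; the program names and attributes below are hypothetical placeholders, and the distance measure (fraction of differing attributes) is an assumption chosen for simplicity.

```python
from itertools import combinations

def hamming(a, b):
    """Fraction of attribute positions at which two programs differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def average_linkage(items):
    """Cluster by similarity only; returns (nested grouping, list of merges).

    items: dict mapping a name to a tuple of 0/1 attributes.
    """
    def dist(c1, c2):
        # Mean pairwise distance between the members of two clusters.
        pairs = [(m, n) for m in c1[1] for n in c2[1]]
        return sum(hamming(items[m], items[n]) for m, n in pairs) / len(pairs)

    # Each cluster is (label, list of member names).
    clusters = [(name, [name]) for name in sorted(items)]
    merges = []
    while len(clusters) > 1:
        # Pick the closest pair of clusters and merge them.
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: dist(clusters[ij[0]], clusters[ij[1]]))
        d = dist(clusters[i], clusters[j])
        merged = ((clusters[i][0], clusters[j][0]),
                  clusters[i][1] + clusters[j][1])
        merges.append((merged[0], d))
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0], merges

# Hypothetical programs scored on four made-up yes/no attributes.
programs = {
    "prog_A": (1, 1, 0, 0),
    "prog_B": (1, 1, 0, 1),
    "prog_C": (0, 0, 1, 1),
    "prog_D": (0, 0, 1, 0),
}
grouping, merges = average_linkage(programs)
print(grouping)
```

The merge heights here are just average dissimilarities. Nothing in the output claims to be a branch length or an ancestor, which is the point of using a clustering method when there is no history to infer.
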

Not to take away from the paper too much, since it is definitely worth a read for those working on methods to classify sequences as well as for those using such methods.  They even go so far as to test various web servers (e.g., MG-RAST) and discuss the time it takes to get results.  They also test the methods for their precision and sensitivity.  Very useful bits of information here.

So – overall I like the paper.  But one other thing in here sticks in my craw: the discussion of “marker genes.”  Below is some of the introductory text on the topic.  I have labelled some bits I do not like too much:

It is important to note that some supervised learning methods will only classify sequences that contain “marker genes”. Marker genes are ideally present in all organisms, and have a relatively high mutation rate that produces significant variation between species. The use of marker genes to classify organisms is commonly known as DNA barcoding. The 16S rRNA gene has been used to greatest effect for this purpose in the microbial world (green genes [6], RDP [7]). For animals, the mitochondrial COI gene is popular [8], and for plants the chloroplast genes rbcL and matK have been used [9]. Other strategies have been proposed, such as the use of protein-coding genes that are universal, occur only once per genome (as opposed to 16S rRNA genes that can vary in copy number), and are rarely horizontally transferred [10]. Marker gene databases and their constitutive multiple alignments and phylogenies are usually carefully curated, so taxonomic and functional assignments based on marker genes are likely to show gains in both accuracy and speed over methods that analyze input sequences less discriminately. However, if the sequencing was not specially targeted [11], reads that contain marker genes may only account for a small percentage of a metagenomic sample.  

I think I will just leave these highlighted sections uncommented upon and leave it to people to imagine what I don’t like about them .. for now.

Anyway – again – the paper is worth checking out.  And if you want to know more about methods used for classifying sequences, see this Mendeley collection, which focuses on metagenomic analysis but has many additional papers on top of the ones discussed in this paper.

Interesting new paper: "Proving universal common ancestry with similar sequences"

Just discovered an interesting paper by Leonardo de Oliveira Martins and David Posada.  It is titled “Proving universal common ancestry with similar sequences.”  It relates to a paper by Douglas Theobald: “A formal test of the theory of universal common ancestry. Nature 2010; 465:219-22.” Although the latter paper is not openly available the more recent one is.  


The new paper is worth a look.  Not sure about the Theobald one as I do not have access from home.


Am hoping Leonardo writes more about this in his blog: Bayesian Procedures in Biology ….

Mini post: Microbial forensics

A few months old here but there is a very interesting post from the Science Media Centre in New Zealand: Science Media Centre: Microbes in soil could help fight crime.  The post describes attempts to use microbes in soil as part of forensic activities.  This relates in many ways to my call for a “Field Guide to Microbes”.

I have been interested in microbial forensics for many years since I worked at TIGR on part of the project to study anthrax genomes.  For those interested in microbe-related forensic activities I have created a Mendeley collection of references on the topic.

http://www.mendeley.com/groups/1147121/_/widget/29/5/

‘Danger and Evolution in the Twilight Zone’: Guest post by Randen Patterson and Gaurav Bhardwaj

Figure 1. PHYRN concept and work flow.

‘Danger and Evolution in the twilight zone’

I have been communicating with Randen Patterson on and off over the last five years or so about his efforts to study the evolution of gene families when the sequence similarity in the gene family is so low that making multiple sequence alignments is very difficult.  Recently, Randen moved to UC Davis, so I have been talking / emailing with him more and more about this issue.  Of note, Randen has a new paper in PLoS One about this topic: Bhardwaj G, Ko KD, Hong Y, Zhang Z, Ho NL, et al. (2012) PHYRN: A Robust Method for Phylogenetic Analysis of Highly Divergent Sequences. PLoS ONE 7(4): e34261. doi:10.1371/journal.pone.0034261.

Figure 8. Model for the Evolution of the DANGER Superfamily.

I invited Randen and the first author Gaurav Bhardwaj to do a guest post here providing some of the story behind their paper for my ongoing series on this topic.  If you have published an open access paper on some topic related to this blog, I would love to have a guest post from you too.  I also note that I personally love the fact that they used the “DANGER” family as an example to test their method.

Here is their guest post:

A fundamental problem for phylogenetic inference in the “twilight zone” (<25% pairwise identity), let alone the “midnight zone” (<12% pairwise identity), is the inability to accurately assign evolutionary relationships at these levels of divergence with statistical confidence. This lack of resolution arises from difficulties in separating the phylogenetic signal from the random noise at these levels of divergence. This obviously and ultimately stymies all attempts to truly resolve the Tree of Life. Since most attempts at phylogenetic inference in the twilight/midnight zone have relied on MSA, and with no clear answer on the best phylogenetic methods to resolve protein families in the twilight/midnight zone, we have presented the rest of this blog post as two questions representative of these problems.
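
For readers who want to see where a pair of sequences falls relative to these thresholds, here is a minimal sketch. It assumes the two sequences are already aligned to the same length and computes identity over columns where neither sequence has a gap; other denominator conventions exist and give different numbers. The function names are illustrative and are not part of PHYRN.

```python
def percent_identity(a, b, gap="-"):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    if not columns:
        return 0.0
    matches = sum(x == y for x, y in columns)
    return 100.0 * matches / len(columns)

def divergence_zone(identity):
    """Label a percent identity using the thresholds quoted in the text."""
    if identity < 12:
        return "midnight zone"
    if identity < 25:
        return "twilight zone"
    return "above the twilight zone"

# Toy aligned protein fragments (made up for illustration).
seq1 = "MKT-LLVAGA"
seq2 = "MRTQLWVSGA"
identity = percent_identity(seq1, seq2)
print(round(identity, 1), divergence_zone(identity))
```
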

Question 1: Is MSA required for accurate phylogenetic inference?

Our Opinion: MSA is an excellent tool for inference from conserved data sets, but it has been shown by others and by us that the quality of MSA degrades rapidly in the twilight zone. Further, the quest for an optimal MSA becomes increasingly difficult as the number of taxa under study increases. Although the quality of MSA methods has improved in the last two decades, we have not made significant improvements towards overcoming these problems. Multiple groups have also designed alignment-free methods (see Hohl and Ragan, Syst. Biol. 2007), but so far none of these methods has been able to provide better phylogenetic accuracy than MSA+ML methods. We recently published a manuscript in PLoS One entitled “PHYRN: A Robust Method for Phylogenetic Analysis of Highly Divergent Sequences” introducing a hybrid profile-based method. Our approach focuses on measuring phylogenetic signal from homologous biological patterns (functional domains, structural folds, etc.), and their subsequent amplification and encoding as a phylogenetic profile. Further, we adopt a distance estimation algorithm that is alignment-free, and thus bypasses the need for an optimal MSA. Our benchmarking studies with synthetic (from ROSE and Seqgen) and biological datasets show that PHYRN outperforms other traditional methods (distance, parsimony and Maximum Likelihood), and provides significantly accurate phylogenies even in data sets exhibiting ~8% average pairwise identity. While this still needs to be evaluated in other simulations (varying tree shapes, rates, models), we are convinced that these types of methods do work and deserve further exploration.

Question 2: How can we as a field critically and fairly evaluate phylogenetic methods? 

Our Opinion: A similar problem plagued the field of structural biology whereby there were multiple methods for structural predictions, but no clear way of standardizing or evaluating their performance.  An additional problem that applies to phylogenetic inference is that, unlike crystal structures of proteins, phylogenies do not have a corresponding “answer” that can be obtained.  Synthetic data sets have tried to answer this question to a certain extent by simulating protein evolution and providing true evolutionary histories that can be used for benchmarking.  However, these simulations cannot truly replicate biological evolution (e.g. indel distribution, translocations, biologically relevant birth-death models, etc). In our opinion, we need a CASP-like model (solution adopted by our friends in computational structural biology), where same data sets (with true evolutionary history known only to organizers) are inferred by all the research groups, and then submitted for a critical evaluation to the organizers. To convert this thought to reality, we hereby announce CAPE (Critical Assessment of Protein Evolution) for Summer 20132. We are still in pre-production stages, and we welcome any suggestions, comments and inputs about data sets, scoring and evaluating methods.   
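
As one illustration of what "scoring" an inferred tree against a known true history could look like, here is a minimal sketch of a rooted Robinson-Foulds style distance: the number of clades found in one tree but not the other. The choice of this metric is an assumption made for illustration; the post does not say how CAPE entries would be scored. Trees are represented as nested tuples of leaf names, and the taxa are made up.

```python
def clades(tree):
    """Return (set of non-root clades, set of all leaves) for a rooted tree.

    tree: nested tuples whose leaves are strings, e.g. (("A", "B"), "C").
    Each clade is a frozenset of the leaf names below an internal node.
    """
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(leaves(child) for child in node))
        found.add(members)
        return members

    all_leaves = leaves(tree)
    found.discard(all_leaves)  # the root clade is shared by every tree on these taxa
    return found, all_leaves

def rooted_rf(true_tree, inferred_tree):
    """Count clades present in exactly one of the two trees."""
    true_clades, true_leaves = clades(true_tree)
    inferred_clades, inferred_leaves = clades(inferred_tree)
    if true_leaves != inferred_leaves:
        raise ValueError("trees must contain the same taxa")
    return len(true_clades ^ inferred_clades)

# Made-up example: the inferred tree misplaces B and C.
true_tree = ((("A", "B"), "C"), ("D", "E"))
inferred_tree = ((("A", "C"), "B"), ("D", "E"))
print(rooted_rf(true_tree, inferred_tree))
```

A score of 0 means the inferred topology matches the true one exactly; larger values mean more clades disagree. A real assessment would likely also need to handle unrooted trees and branch lengths, which this sketch ignores.
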

Bhardwaj, G., Ko, K., Hong, Y., Zhang, Z., Ho, N., Chintapalli, S., Kline, L., Gotlin, M., Hartranft, D., Patterson, M., Dave, F., Smith, E., Holmes, E., Patterson, R., & van Rossum, D. (2012). PHYRN: A Robust Method for Phylogenetic Analysis of Highly Divergent Sequences. PLoS ONE, 7 (4). DOI: 10.1371/journal.pone.0034261

Quick post – new paper of interest on "The Infinitely Many Genes Model …"

This paper seems of potential interest: The Infinitely Many Genes Model for the Distributed Genome of Bacteria by Franz Baumdicker, Wolfgang R. Hess, and Peter Pfaffelhuber

Abstract:

The distributed genome hypothesis states that the gene pool of a bacterial taxon is much more complex than that found in a single individual genome. However, the possible fitness advantage, why such genomic diversity is maintained, whether this variation is largely adaptive or neutral, and why these distinct individuals can coexist, remains poorly understood. Here, we present the infinitely many genes (IMG) model, which is a quantitative, evolutionary model for the distributed genome. It is based on a genealogy of individual genomes and the possibility of gene gain (from an unbounded reservoir of novel genes, e.g., by horizontal gene transfer from distant taxa) and gene loss, for example, by pseudogenization and deletion of genes, during reproduction. By implementing these mechanisms, the IMG model differs from existing concepts for the distributed genome, which cannot differentiate between neutral evolution and adaptation as drivers of the observed genomic diversity. Using the IMG model, we tested whether the distributed genome of 22 full genomes of picocyanobacteria (Prochlorococcus and Synechococcus) shows signs of adaptation or neutrality. We calculated the effective population size of Prochlorococcus at 1.01 × 10^11 and predicted 18 distinct clades for this population, only six of which have been isolated and cultured thus far. We predicted that the Prochlorococcus pangenome contains 57,792 genes and found that the evolution of the distributed genome of Prochlorococcus was possibly neutral, whereas that of Synechococcus and the combined sample shows a clear deviation from neutrality.

Wish they had gone beyond these two cyanobacteria … but still seems of possible interest.

Baumdicker, F., Hess, W., & Pfaffelhuber, P. (2012). The Infinitely Many Genes Model for the Distributed Genome of Bacteria. Genome Biology and Evolution, 4 (4), 443-456. DOI: 10.1093/gbe/evs016

iEvoBio Call for Challenge entries for conference on Informatics for Phylogenetics, Evolution, and Biodiversity (iEvoBio)

From: Hilmar Lapp
Date: Thu, May 10, 2012 at 11:27 AM
Subject: [iEvoBio] Call for Challenge entries for conference on Informatics for Phylogenetics, Evolution, and Biodiversity (iEvoBio)
To: iEvoBio Announcements

Many trees enter.
Fresh analysis ensues.
New insights emerge.

As a reminder, the iEvoBio conference is again holding a Challenge competition in 2012, this time on the theme, "Synthesizing Phylogenies." Further information on the nature of challenge entries and how to submit them can be found on the iEvoBio website at http://ievobio.org/challenge.html. Submissions are due by June 25, 2012. Cash prizes will be awarded for first place (USD 1,500) and runner-up entries. The winning entries will be selected by a vote of the iEvoBio meeting participants.

Also, alongside the iEvoBio Challenge, 2012 iEvoBio sponsor Biomatters Ltd is running the Geneious Challenge. The goal of this challenge is to develop a new plugin to Geneious Pro, using the public API, that enables a new and exciting visualization or analysis. The winning entry will receive a $1000 cash prize, and all entrants who submit by the deadline will receive a 12-month subscription license. The deadline for the Geneious Challenge is the same as for the iEvoBio Challenge. See http://ievobio.org/geneious_challenge.html for more information.

More details about the iEvoBio conference and program are available at http://ievobio.org. You can also find continuous updates on the conference’s Twitter feed at http://twitter.com/iEvoBio and Google+ page, or subscribe to the low-traffic iEvoBio announcements mailing list at http://groups.google.com/group/ievobio-announce.

iEvoBio 2012 is sponsored by the US National Evolutionary Synthesis Center (NESCent) and by Biomatters Ltd., in partnership with the Society for the Study of Evolution (SSE) and the Society of Systematic Biologists (SSB).

The iEvoBio 2012 Organizing Committee:
Hilmar Lapp, US National Evolutionary Synthesis Center (chair)
Robert Beiko, Dalhousie University
Nico Cellinese, University of Florida
Robert Guralnick, University of Colorado at Boulder
Rebecca Kao, Denver Botanic Gardens
Ellinor Michel, Natural History Museum, London
Nadia Talent, Royal Ontario Museum
Andrea Thomer, University of Illinois at Urbana-Champaign

Any method allowed for presentations at ASM meeting, as long as you use Powerpoint on a PC.

Just got this email from ASM linking to a message about my presentation at the upcoming ASM meeting in San Francisco.

Here is the message.

Ugggh.

I highlight some parts that I find disappointing at best.  Basically, they say “You can do your presentation in any way.  As long as you convert it to PowerPoint for a PC.”  Never mind other tools to do presentations.  Like Keynote.  Or Prezi.  Or, well, anything else.  Never mind people who use Macs.  Or Linux computers.  Or iPads.

I have NEVER had a problem doing a presentation off of my Mac or iPad.  I have had MANY problems when I have converted my Keynote or PDF files or other material to Powerpoint for a PC.

Oh, and forget about modifying your presentation in response to anything going on in the session (which I do frequently).  I try to tune my slides to the actual crowd.  No longer possible.

Maybe I should use no slides, like I did at TEDMED.  Or maybe I should do a Ross Perot and have charts.  Maybe I will bring my own projector and set it up just before my talk …  who knows … but I hate it when meetings say “Trust us – you won’t have any problems with our system”.

May 9, 2012

Dear Jonathan Eisen;
Thank you for participating as a speaker at asm2012, ASM’s 112th General Meeting in San Francisco, June 16-19, 2012. As a speaker, we kindly request that you consider the following guidelines as you finalize your PowerPoint presentation in the session listed below and also take note of some of the new requirements and changes for asm2012.

Session Details
Session Date/Time: 6/17/2012 3:00:00 PM – 6/17/2012 5:30:00 PM
Session Title: The Great Indoors: Recent Advances in the Ecology of Built Environments
Presentation Title: microBEnet: The Microbiology of the Built Environment Network (If your presentation title is not listed or incorrect, please provide this information to xxxxx immediately.)
Length of your Personal Presentation: You are allotted 30 minutes for your presentation or lecture unless otherwise notified by the convener of this session.


New this Year
In order to provide the highest quality experience for our attendees, ASM now requires that all speakers upload their presentations at least four hours before their session begins (if you have a morning session, we recommend you upload your slides at 2:00 p.m. the afternoon before).

Speakers will no longer be permitted to use personal laptops during their presentations. The General Meeting began experiencing a greater number of technical issues as more and more speakers relied on using personal laptops from which to present. This contributed to unique technical situations during individual presentations as well as awkward transitions between speakers. The General Meeting will be utilizing the Presentation Management System for all speakers and our talented and dedicated set of technicians will be well-prepared and equipped to ensure all presentations are presented in the way they are intended and there are smooth transitions between speakers.

asm2012 will feature a networked presentation submission system, called our Presentation Management System. The tips below will help ensure that little, if any, editing will need to be done on-site, allowing you to quickly review your presentation and then attend other sessions in progress. However, ASM strongly recommends that you visit the Speaker Ready Room in Room 120 to test your slides before your session begins to ensure that they run properly on the Presentation Management System. The Presentation Management System can accommodate both Mac and PC based presentations.

The tips below are for both Windows and Mac users. As all the provided computers will be PCs, Mac users should additionally review Considerations for Mac Users at the bottom of this document.

· Building Your Presentation

Movies:
Please take steps to compress your videos. Uncompressed videos will take longer to upload and load within your presentation, and will not be better quality than a modern MPEG-4 codec. We can only accept movies created as MPGs, WMVs, or with the following AVI codecs: MPEG-4 (Divx, Xvid, or WMVs), Cinepack, Techsmith. Flash content (SWF) is fully supported.

Note: It is important your movies do not completely fill the screen. In the meeting room you will only have a mouse to advance your slides. You can only advance your PowerPoint with a mouse by clicking on the slide, not the movie itself.

DVDs:
If you plan to play a DVD as part of your presentation, please notify a technician in the Speaker Ready Room so arrangements can be made for assistance in your meeting room.

Fonts:
We only supply fonts that are included with Office 2010. If you need a specialized font, it should be embedded into your PowerPoint presentation. For instructions on this process, please click on the following link: http://support.microsoft.com/kb/826832/en-us

· Before you Depart

Advance Submission:
You may submit your presentation starting on Friday, May 18. You will receive a notification on May 18 that will provide you with the link to upload your presentation.

Multiple Presenters:
Please do not combine multiple presenters’ PowerPoints into one file and then submit under one name. The Presentation Management System manages presenters individually and any co-presenter will not be able to logon to edit the combined presentation.

Backup:
Please bring a backup copy of your presentation along with you when you depart for your meeting. Copy your PowerPoint and all movies to a folder on a USB drive or CD. PowerPoint does NOT embed movies, and therefore, they must all be placed in the same folder as your PowerPoint.

· At the Meeting

Speaker Ready Room:

Speakers should review their presentation in the Speaker Ready Room prior to their scheduled presentation. The Speaker Ready Room will be staffed with technicians that can assist with any compatibility or formatting issues within your presentation. The computers in the Speaker Ready Room will be configured with hardware and software exactly like the computer in the meeting rooms.

Be sure to use the mouse to advance your slides, not the keyboard, as you will only have a mouse at the podium to advance your presentation. Left click advances the slides; right click goes back. Once you are comfortable that your presentation is complete, confirm the date, time, and room for your session. Be sure to click the green “save/logout” button on the top of the screen.

Speaker Ready Room: Room 120

Hours of Operation:

Friday, 6/15/12: 8:00 a.m. – 5:00 p.m.
Saturday, 6/16/12: 7:00 a.m. – 7:00 p.m.
Sunday, 6/17/12: 6:00 a.m. – 7:00 p.m.
Monday, 6/18/12: 6:00 a.m. – 7:00 p.m.
Tuesday, 6/19/12: 6:00 a.m. – 3:00 p.m.

· Considerations for Mac Users

Pictures:
If you use a version of PowerPoint prior to 2008, please be sure any embedded pictures are not TIFF format. These images will not show up in Windows PowerPoint. With PowerPoint 2010 for the Mac, this is no longer an issue, and any inserted image will be compatible.

Keynote Users:
Please export your presentation as a PowerPoint Presentation.
If you are having any issues please notify Mac support at XXXXX for additional help.

By following the guidelines above, your presentation will go smoothly. Should you have any questions not addressed in this document, please feel free to email XXXX.

If you have any questions regarding the scientific program or session details, please contact Janet Mitchell at XXXXXX.

Sincerely,

Janet M Mitchell, M.S.
Program Manager, General Meeting

Dueling microbial diversity talks at #UCDavis on May 2 #symbioses #microbiology

Here is a storification of the dueling microbial diversity talks that happened at UC Davis on Wednesday May 2.

View the story “Dueling microbial diversity talks at #UCDavis” on Storify: http://storify.com/phylogenomics/dueling-microbial-diversity-talks-at-ucdavis.js?template=slideshow