Jonathan Eisen

2012 #UCDavis Faculty Research Lecture Award-Michael Turelli 6/6 4PM “How good luck, great collaborators, pretty mathematics and a maternally inherited bacterium (Wolbachia) may stop the spread of dengue fever”

Michael Turelli

Distinguished Professor of Genetics

Department of Evolution and Ecology

and The Center for Population Biology

Recipient of the

2012 UC DAVIS

ACADEMIC SENATE

FACULTY RESEARCH LECTURE AWARD

“How good luck, great collaborators, pretty mathematics and a maternally inherited bacterium (Wolbachia) may stop the spread of dengue fever”

Public Lecture

June 6, 2012

4:10 p.m.

1322 Storer Hall


Sign of the apocalypse? Science conference SPAM hybridizes w/ Nigerian advanced fee SPAM.

Normally I do not share SPAM emails. But I have posted occasionally about Journal SPAM and Conference SPAM. So what do I do here? I just received an email that appears to be a hybrid between Conference SPAM and Nigerian advanced fee SPAM. OMG. The merging of two SPAM systems. Too bad the Conference is not about Viagra – though since it is about Metagenomics perhaps it somehow got flagged due to studies of the penis or vagina microbiome? In this case I just had to post …

Dear Honored Sir or Madam,

I am Prof. Mohammed Kaoje Abubakar minister of Science for the Republic of Nigeria under former President Alhaji Umaru Musa Yar’Adua. In this role I became in control of large sums of money dedicated to scientific research and the exchange of ideas with researchers from across the globe. However, since the time of the unfortunate death of President Yar’Adua, I have been under intense scrutiny of the new director of the ministry and have been unable to complete my mission. Fortunately, I remain in charge of most of the 200 million dollars US, but the current government will only release the funds in conjunction with scientific activities involving prominent foreign national scientists like yourself.

Therefore, on behalf of the 3rd Nigerian Congress of Metagenomics I am pleased to welcome you to propose a speech on your recent discovery about the genomic basis for the origin and evolution of new functions at the congress by submitting your speech title and CV to us. Meanwhile, we hope you can share your stimulating data, valuable scientific information and influential experiences with other industrial leaders, professionals and research pioneers. You are encouraged to network and explore partnering opportunities.

As a branded Conference of Nigeria Congresses LLC, “Your Think Tank”, NCM continues to expand with magnificent scientific and social programs to maximize your network in a free communication meeting environment.

Activities of NCM 2012

l Keynote Forum – Presentations from Nobel Prize Laureate and Senior Leaders of Renowned Company

l Parallel Forum – 200+ Sessions and Symposiums provide 1000+ speech opportunities for experts from all of the world

l Welcome Banquet – All the participants enjoy the formal buffet dinner with wonderful performance show

l Project Matching Activity – Develop effective platform by free booths supply

l Keymakers Summit – Special Forum for Enterprisers to discuss hot issues face to face

l Exhibition and Poster Zone

The 3rd Nigerian Congress of Metagenomics is initiated for filling the gap between Eastern and West World for metagenomic professionals of free information exchange. In the past decade, NCM has attracted more than 5,000 enthusiastic speakers to communicate on the R & D advances in different therapeutic fields, which have generated great impact on the Chinese Bio/pharmaceutical development, enhanced Research and Development outsourcing, helped regional liaison of big pharma seeking partnership and searching talents, created a lot of opportunities for face-to-face network for multilateral collaboration by sharing both scientific and technological breakthroughs and speed up the process of many challenging drug discovery projects.

For more information PS: http:www.ncm.ng

Warri is a major oil city in Delta State, Nigeria, with a population of over 300,000 people. We look forward to seeing you in Warri for a stimulating and enjoyable conference. Kindest regards,

Prof. Mohammed Kaoje Abubakar8 for the organizing committee.

Phyloseminar: David Pollock 5/30 10am PST “Adaptation, coevolution, & convergence in the context of protein thermodynamics”

Next talk at http://phyloseminar.org/

"Adaptation, coevolution, and convergence in the context of protein thermodynamics"

David Pollock (University of Colorado School of Medicine)

Interactions within and between proteins are a fundamentally important part of how they evolve and adapt. We have been considering how and why proteins adapt, coevolve, and converge, and working to understand these concepts in the context of protein thermostability and function.

We will expand from the previous talk of our collaborator, Dr. Goldstein, and discuss how and why coevolution is and should be detected, and how thermostability affects reconstruction of ancestral functions. Further, we will discuss our work on adaptive redesign in mitochondrial proteins, perhaps the largest known case of an adaptive burst in multiple metabolic proteins. The convergence between ancestral snakes and ancestral acrodont lizards is also perhaps the largest known case of adaptive convergence. We will consider what these examples tell us about the theory of how proteins appear to evolve in the context of nearly neutral versus cases of adaptive change. Further, we will discuss the impact on understanding phylogenetic relationships, and we will also discuss a unified theory of nearly neutral and adaptive evolution in the context of structure and function.

West Coast USA: 10:00 (10:00 AM) on Wednesday, May 30
East Coast USA: 13:00 (01:00 PM) on Wednesday, May 30
UK: 18:00 (06:00 PM) on Wednesday, May 30
France: 19:00 (07:00 PM) on Wednesday, May 30
Japan: 02:00 (02:00 AM) on Thursday, May 31
New Zealand: 05:00 (05:00 AM) on Thursday, May 31

Useful comparative analysis of sequence classification systems w/ a few questionable bits

There is a useful new publication just out in BMC Bioinformatics: “A comparative evaluation of sequence classification programs” by Adam L Bazinet and Michael P Cummings. In the paper the authors attempt a systematic comparison of tools for classifying DNA sequences according to the taxonomy of the organism from which they come.

I have been interested in such activities since, well, since 1989 when I started working in Colleen Cavanaugh’s lab at Harvard sequencing rRNA genes to do classification. And I have known one of the authors, Michael Cummings, for almost as long.

Their abstract does a decent job of summing up what they did:

Background
A fundamental problem in modern genomics is to taxonomically or functionally classify DNA sequence fragments derived from environmental sampling (i.e., metagenomics). Several different methods have been proposed for doing this effectively and efficiently, and many have been implemented in software. In addition to varying their basic algorithmic approach to classification, some methods screen sequence reads for ’barcoding genes’ like 16S rRNA, or various types of protein-coding genes. Due to the sheer number and complexity of methods, it can be difficult for a researcher to choose one that is well-suited for a particular analysis.

Results
We divided the very large number of programs that have been released in recent years for solving the sequence classification problem into three main categories based on the general algorithm they use to compare a query sequence against a database of sequences. We also evaluated the performance of the leading programs in each category on data sets whose taxonomic and functional composition is known.

Conclusions
We found significant variability in classification accuracy, precision, and resource consumption of sequence classification programs when used to analyze various metagenomics data sets. However, we observe some general trends and patterns that will be useful to researchers who use sequence classification programs.

The three main categories of methods they identified are:

Programs that primarily utilize sequence similarity search
Programs that primarily utilize sequence composition models (like CompostBin from my lab)
Programs that primarily utilize phylogenetic methods (like AMPHORA & STAP from my lab)

The paper has some detailed discussion and comparison of some of the methods in each category. They even made a tree of the methods:

Figure 1. Program clustering. A neighbor-joining tree that clusters the classification programs based on their similar attributes. From the paper.

In some ways – I love this figure. Since, well, I love trees. But in other ways I really really really do not like it. I don’t like it because they use an explicitly phylogenetic method (neighbor joining, which is designed to infer phylogenetic trees and not to simply cluster entities by their similarity) to cluster entities that do not have a phylogenetic history. Why use neighbor-joining here? What is the basis for using this method to cluster methods? It is cute, sure. But I don’t get it. What do deep branches represent in this case? It drives me a bit crazy when people throw a method designed to represent branching history at a situation where clustering by similarity is needed. Similarly, it drives me crazy when similarity-based clustering methods are used when history is needed.
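To make the distinction concrete, here is a minimal sketch of what plain similarity clustering could look like: average-linkage agglomerative clustering over a table of binary attributes. The program names and attribute values are made up for illustration and are not taken from the paper, and this is not the procedure the authors used. The only point is that the merge heights here mean "average dissimilarity" and nothing more, with no implied branching history.

```python
from itertools import combinations

# Hypothetical attribute table (invented for illustration, NOT from the paper).
# Each tuple records whether a program has some feature (1) or not (0).
attributes = {
    "prog_A": (1, 1, 0, 0),
    "prog_B": (1, 1, 0, 1),
    "prog_C": (0, 0, 1, 1),
    "prog_D": (0, 0, 1, 0),
}


def mismatch_fraction(a, b):
    """Fraction of attributes on which two programs differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def average_linkage(items):
    """Agglomerative clustering by average pairwise dissimilarity.

    Returns the merges in order as (cluster_1, cluster_2, distance).
    The distance is a similarity summary only; it makes no claim
    about ancestry or branching order in time.
    """
    clusters = [(name,) for name in sorted(items)]
    merges = []

    def dist(c1, c2):
        pairs = [(i, j) for i in c1 for j in c2]
        total = sum(mismatch_fraction(items[i], items[j]) for i, j in pairs)
        return total / len(pairs)

    while len(clusters) > 1:
        # Pick the two clusters that are, on average, most alike.
        c1, c2 = min(combinations(clusters, 2), key=lambda p: dist(*p))
        merges.append((c1, c2, dist(c1, c2)))
        merged = tuple(sorted(c1 + c2))
        clusters = [c for c in clusters if c not in (c1, c2)] + [merged]
    return merges


for left, right, d in average_linkage(attributes):
    print(left, "+", right, "at dissimilarity", d)
```

With this toy table, prog_A and prog_B join first (dissimilarity 0.25), then prog_C and prog_D (0.25), and the two pairs join last at 0.875. A "deep" join in this picture just means "not very alike on the attributes I chose to score", which is a perfectly answerable version of the question I asked above.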

Not to take away from the paper too much since this is definitely worth a read for those working on methods to classify sequences as well as for those using such methods. They even go so far as to test various web servers (e.g., MG-RAST) and discuss time to get results. They also test the methods for their precision and sensitivity. Very useful bits of information here.

So – overall I like the paper. But one other thing in here sticks in my craw: the discussion of “marker genes.” Below is some of the introductory text on the topic. I have labelled some bits I do not like too much:

It is important to note that some supervised learning methods will only classify sequences that contain “marker genes”. Marker genes are ideally present in all organisms, and have a relatively high mutation rate that produces significant variation between species. The use of marker genes to classify organisms is commonly known as DNA barcoding. The 16S rRNA gene has been used to greatest effect for this purpose in the microbial world (green genes [6], RDP [7]). For animals, the mitochondrial COI gene is popular [8], and for plants the chloroplast genes rbcL and matK have been used [9]. Other strategies have been proposed, such as the use of protein-coding genes that are universal, occur only once per genome (as opposed to 16S rRNA genes that can vary in copy number), and are rarely horizontally transferred [10]. Marker gene databases and their constitutive multiple alignments and phylogenies are usually carefully curated, so taxonomic and functional assignments based on marker genes are likely to show gains in both accuracy and speed over methods that analyze input sequences less discriminately. However, if the sequencing was not specially targeted [11], reads that contain marker genes may only account for a small percentage of a metagenomic sample.

I think I will just leave these highlighted sections uncommented upon and leave it to people to imagine what I don’t like about them .. for now.

Anyway – again – the paper is worth checking out. And if you want to know more about methods used for classifying sequences see this Mendeley collection which focuses on metagenomic analysis but has many additional papers on top of the ones discussed in this paper.

Interesting new paper: "Proving universal common ancestry with similar sequences"

Just discovered an interesting paper by Leonardo de Oliveira Martins and David Posada. It is titled “Proving universal common ancestry with similar sequences.” It relates to a paper by Douglas Theobald: “A formal test of the theory of universal common ancestry. Nature 2010; 465:219-22.” Although the latter paper is not openly available the more recent one is.

The new paper is worth a look. Not sure about the Theobald one as I do not have access from home.

Am hoping Leonardo writes more about this in his blog: Bayesian Procedures in Biology ….

Today at #UCDavis Luca Comai “Genome-wide discovery of mutations in rice through exome capture & sequencing”

Genetics Seminar

“Genome-wide discovery of mutations in rice through exome capture and sequencing”

Speaker: Dr. Luca Comai

UC Davis | Plant Biology and Genome Center

Monday, May 14th, 2012

4:10 PM

1022 Life Sciences


BAY AREA BIOSYSTEMATISTS (BABS) MEETING 5/22

Bay Area Biosystematists (BABS) Meeting

Tuesday evening, 22 May 2012

at UC Davis, 1022 Life Sciences Building

“PHYLOGENOMICS AND SYSTEMATICS”

The genomics era holds great promise (and challenge) to systematics. There is the prospect of generating sequence data that will provide unprecedented resolution of phylogenetic relationships across the Tree of Life, and a much improved understanding of the tempo and mode of evolution. Join us for two talks on phylogenomics, along with plenty of discussion, leavened by pizza and beer.

Featuring presentations by…

HOLLY BIK, Postdoctoral Researcher, Eisen Lab, UC Davis Genome Center

“Assembling multi-species genomic data”

and…

BASTIEN BOUSSAU, Postdoctoral Fellow, Huelsenbeck Lab, UC Berkeley

“Methods of phylogenetic inference for genome-scale data sets”

Schedule and venue:

5:30 pm: social gathering with beverages (beer and soft drinks) and informal pizza dinner: cost ca. $10, to be collected at door, 1022 Life Sciences, UC Davis campus.

7:00 – 9:00 pm: talks, followed by discussion, in same room.

Reservations required for beverages and dinner (but not the talk). Please email reservations to your host, Phil Ward: psward by Sunday, May 20

For a map of UC Davis campus and Life Sciences Building:

http://campusmap.ucdavis.edu/?b=97

Parking is available in the West Entry Parking Structure, immediately west of Life Sciences. If coming from the Bay Area take the Hwy. 113 exit off I-80, and then the first exit off Hwy. 113, which is Hutchison Drive. This will bring you directly to the parking garage.

All are welcome, members or not. If you want to join the Biosystematists, sign up for our mailing list at:

https://calmail.berkeley.edu/manage/list/listinfo/babs-l@lists.berkeley.edu

Mini post: Microbial forensics

A few months old here but there is a very interesting post from the Science Media Centre in New Zealand: “Microbes in soil could help fight crime.” The post describes attempts to use microbes in soil as part of forensic activities. This relates in many ways to my call for a “Field Guide to Microbes”.

I have been interested in microbial forensics for many years since I worked at TIGR on part of the project to study anthrax genomes. For those interested in microbe-related forensic activities I have created a Mendeley collection of references on the topic.

http://www.mendeley.com/groups/1147121/_/widget/29/5/


Oh the irony – new #OpenAccess #PLoSOne paper on Research Blogs doesn’t share data behind analyses.

Interesting new paper: PLoS ONE: Research Blogs and the Discussion of Scholarly Information. All about the new world of science blogging. Much of the context here relates to openness. Yet as far as I can tell, the data that make up the meat of the analyses in the paper are not shared. Uggh.

Is there something I am missing here? Shouldn’t a prerequisite of publishing this kind of paper be sharing the information / data used in the analyses? Shouldn’t that be released with the paper?

Definitely time to start “Open Data Watch” where people have a place to complain about lack of open availability of data behind papers (I came up with the name as a mimic of Ivan Oransky’s diverse watch sites like Retraction Watch). Originally in thinking about doing this I had been thinking about genomic data. But I am sure this is a problem in other areas. Consider paleontology, where openness to fossils and other samples is, well, not as common as it should be. It is not that hard anymore to find a place to share one’s data. With places like Data Dryad and Biotorrents and FigShare and Merritt and 100s of others it is really inexcusable not to share the data behind a paper in most cases. Certainly, in some cases there may be privacy issues but that is not the case here (I think) and not an issue in most cases.

Come on people. If scientific papers are to be reproducible and testable, you need to give people access to the data you used.

Shema, H., Bar-Ilan, J., & Thelwall, M. (2012). Research Blogs and the Discussion of Scholarly Information. PLoS ONE, 7 (5). DOI: 10.1371/journal.pone.0035869

John Roth seminar “Does RecA activity PREVENT chromosome rearrangements?” 5/14

MIC 275 Rec Repair Club

Monday May 14, 2012
LS 1022
10 AM

John Roth:
Mechanisms of duplication formation:
Does RecA activity PREVENT chromosome rearrangements?
