Eisen Lab Blog

Why I don’t like to pre-submit slides for talks – lessons from #AAASMoBE meeting

So – I gave a talk at a meeting on Thursday.  The meeting was called “Microbiomes of the Built Environment” and it was sponsored by the Alfred P. Sloan Foundation and run by AAAS.

The meeting organizers, as is often the case, wanted me to submit my slides a few days in advance, in theory to make sure they were loaded into their system and that all worked OK.  Well, as usual, I did not do this.  I like to make my talks fresh – just before the meeting so that I can incorporate new ideas into them and so that they do not have that canned feeling that a lot of talks do.

My talk was to be 15 minutes long and was to focus on my Sloan Foundation funded project “microBEnet: the microbiology of the built environment network” (see http://microbe.net for information about the project). I figured I would work on the talk on the plane – five plus hours to edit a talk I had given relatively recently on the topic of this project. And all would be good. Plus, United had told me there would be WiFi on the plane so if I needed any new material I should be able to get it from the web, right? Well, the flight took off on time – 8 AM on Wednesday morning. And I opened my laptop once allowed and paid the $15+ for the WiFi and got to work. Then, about 10 minutes later, the WiFi died and despite heroic efforts by the flight attendants, it never came back. And I plugged away at my slides doing some edits of the following presentation.

I had given this talk for the Annual Sloan Foundation meeting in May of 2013. I had other talks about microBE.net but they were focused on specific aspects and this was my most recent talk on the whole project. And, well, I started doing some minor edits on it, but, well, the slides felt too filled with boring text. And it did not seem to me to capture what I wanted to talk about. So I did the one thing that always helps me in such cases. I shut my computer and got out a notebook and started writing out and drawing out what I really wanted to talk about. And I finally started to have something I liked.

I liked this because what we are trying to do with microBEnet is to create an actual network and this was a visual way of representing our network. So then I got a bit more detailed.

This was better. We were trying to help people in the field and help others who might be interested get connected, stay connected, and get rapid, easy access to information and tools. For example, we have been curating a reference collection for the field. And this reference collection has a lot of inputs and a lot of potential uses. In my previous talk I just listed some of this and had a screenshot of the web site. But it would be better to show this, no?

Now this was feeling even better. I had a visual framework for the talk. Now I could fill in the details of what I wanted to cover in each of the parts of the network diagram.

So now I had some idea as to what I might want to say on these topics. No slides yet, but some idea as to what I might want to cover. And then, still not back on the computer I thought it would be good to write out an outline / the flow of the talk again. So I did.

Still felt good.

Now all I needed was a title ..

And then finally I felt I could go back to the computer. And so I started working on converting this all into slides.

For the remaining 2 hours of the flight I tried but it was slow going. I wanted to make as much as possible be visual and I needed all sorts of new slides and material from the web (no web connection still) and more. We landed. I took a taxi to the hotel. I worked on my talk a bit from my room. I emailed various people asking for certain images and slides. And then I had to go to the speaker’s dinner. And I got back to my room at about 9:30. And then I worked on my talk until about 3. And finally I was close to being done. Got a very brief few hours of sleep. Got up. Went to the meeting. Did a couple of minor modifications of my slides in the back of the room. Posted my slides to Slideshare. And then gave my talk.

Here are the final slides

Not perfect. But much more visual. Much more networky. Much better at showing what we actually do and try to do on my project.  And much fresher to me so it was certainly not a canned talk.  Not a polished talk .. but not a canned one.

For more about the meeting, including videos of talks (including mine) see the Storify I made.

Guest post by Jay Kaufman: A Bad Taste That Keeps Not Getting Any Better….

Guest post by Jay Kaufman.  Jay and I have been having some email discussions about a paper in PLOS One.  I offered to let him write a guest post to my blog about his concerns.

——————————-

Jonathan Eisen already posted on this blog about a PLoS ONE paper by Mason et al. published on 23 October 2013. And he posted related comments on the PLoS ONE website. I also commented at this site, in reference to his comments and the authors’ response. The purpose of my comments here is just to review those concerns and comment additionally on the PLoS ONE response and what this means for the journal’s publication model and the progress of science.

The paper by Mason and colleagues analyzes data on 48 people in each of 4 self-identified ethnic groups (African American, Caucasian, Chinese, and Latino). These study subjects are apparently volunteers, and the paper only states that they are non-smokers over 18 years old who are free of a list of diagnosed diseases and who have not recently had their teeth cleaned. Based on the text of the published paper, there is no consideration of their age, diet, social class, or even gender. The authors culture bacterial species from the study subjects and process the data through an algorithm that maximizes the prediction of racial group membership based on these measured data.

The prediction is moderately successful, but this could result from any number of unsurprising reasons. For example, if alcohol consumption affects some particular bacterial species, and whites drink more than Asians in central Ohio, then whatever species is diminished by alcohol exposure would help predict that a sample was from a white rather than from an Asian volunteer. And likewise for any of a million possible lifestyle, social class and demographic differences.  In fact, this is a general problem with data mining exercises that Lazer et al describe in the current issue of Science.
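To make the alcohol example concrete, here is a minimal simulation sketch in Python. Everything in it is invented for illustration: the group labels "A" and "B", the drinking rates, and the size of the alcohol effect are hypothetical numbers, not values from the Mason et al. paper. The key assumption is that group membership has no direct effect on the bacterial abundance at all; it only changes how likely a person is to drink.

```python
import random


def simulate_abundance(group, rng):
    """Return a hypothetical abundance of one bacterial species for one person."""
    # Assumed drinking rates: group "A" drinks more often than group "B".
    drinks = rng.random() < (0.8 if group == "A" else 0.2)
    # Abundance depends ONLY on drinking, never directly on group membership.
    mean = 2.0 if drinks else 5.0
    return rng.gauss(mean, 1.0)


def classifier_accuracy(n_per_group=48, seed=1):
    """Fraction of people whose group is guessed correctly from abundance alone."""
    rng = random.Random(seed)
    people = [(g, simulate_abundance(g, rng))
              for g in ("A", "B") for _ in range(n_per_group)]
    # A deliberately crude "classifier": low abundance -> guess "A".
    correct = sum(("A" if abundance < 3.5 else "B") == group
                  for group, abundance in people)
    return correct / len(people)


if __name__ == "__main__":
    print("accuracy with 48 per group:", classifier_accuracy())
    print("accuracy with 5000 per group:", classifier_accuracy(5000, seed=2))
```

Under these assumptions the crude threshold rule guesses group membership well above the 50% chance level, even though nothing innate about either group was put into the simulation. Successful prediction alone says nothing about whether the cause is genetic or environmental.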

The authors provide no information about how balanced this sample is with respect to any of these variables. Maybe the 48 Hispanics are younger than the 48 whites on average, or have more tooth decay or eat more refined sugar or any of a million other possibilities. The fact that these countless potentially imbalanced factors get represented in the oral bio-environment hardly seems surprising, and the fact that these behaviors and exposures might be differential by race is an observation that is completely trivial from a sociological perspective.

My concern here, however, is that the authors assert in the published text that these differences do not arise from any of these myriad environmental factors, but from some innate genetic characteristics of the groups. In the Discussion section on page 3 they state that “ethnicity exerts a selection pressure on the oral microbiome, and…this selection pressure is genetic rather than environmental, since the two ethnicities that shared a common food, nutritional and lifestyle heritage (Caucasians and African Americans) demonstrated significant microbial divergence.” Here is a remarkable statement, that Caucasians and African Americans experience no differential dietary or lifestyle factors. It is directly contradicted by thousands of published papers in sociology, epidemiology and anthropology that document these differences for reasons of culture, geographic origin, social class and discrimination.

Jonathan Eisen’s post directly confronted the authors on this point, and they responded with the following explanation:

“Subjects were selected based on extensive questionnaire surveys and clinical examinations to ensure homogeneity. These questionnaires evaluated educational level, socio-economic status, diet and nutritional history, systemic health status, oral hygiene habits and dental visits, among other things.”

This is surely an important statement about the research design, but the problem is that it appears nowhere in the peer-reviewed text of the published paper. What exactly do the authors mean when they insist that the study subjects were perfectly balanced on factors such as socio-economic status and nutritional history? These complex social and lifestyle variables are notoriously difficult to define and measure. While the authors describe the laboratory techniques in baroque detail, they do not even mention in the published paper that they measured these factors, let alone how these variables were defined and considered in the analysis. This represents a profound limitation for the reader in assessing the validity of these measures and adjustments, and therefore the adequacy of the claimed “homogeneity”. The complete omission of these crucial aspects of the analysis in the paper prevents the reader from investing much confidence in the boldly stated claim that observed differences are “genetic rather than environmental” in origin.

I expressed these concerns in my own post at the journal website on 14 November 2013, but the authors did not respond.  Therefore, at Jonathan’s suggestion, I addressed this concern to the PLoS ONE editors in an e-mail on 22 November 2013. I got passed along from one editor to another, and finally I got a very nice response from Elizabeth Silva on 4 December 2013. She wrote:

I wanted to let you know that I am discussing this article and your concerns with both the Academic Editor and the authors, as well as with my colleagues. We take such concerns very seriously and will ensure that appropriate measures are taken to correct any errors or discrepancies. 


Then I waited.  After 2 months I had heard nothing, so I wrote to Dr. Silva again asking for any word on progress, but received no reply.  So I waited another month.

On 4 March it had been 3 months since the note from PLoS ONE promising appropriate measures to correct any errors or discrepancies, so I wrote again, this time a bit more insistently. This message did finally generate a quick and reassuring response from Dr. Silva:

I really am very sorry for the extended delay in replying to you, and for my neglect in providing you with an update. Following your correspondence I contacted the authors to ask them for additional information relating to their statement that they corrected for confounding factors, and details of these methods. They promptly replied with a table of details of the baseline variables that they corrected for, and that they described in the comment on their article (see attached), as well as an additional correction to one of their figure legends. I then contacted the Academic Editor, Dr. Indranil Biswas with the full details of your concerns, as well as the table sent by the authors and the correction they requested for their figure legend. We asked Dr. Biswas to revisit the manuscript, in light of this new information, and he has informed us that he feels the conclusions of the manuscript are sound. We will now work with the authors to draft and issue a formal correction to the published article to update the methods to include the table, and to amend the figure legend in question.

The table that Dr. Silva forwarded displayed a list of variables and a p-value for some kind of test between the values in the 4 race groups. The test is not specified (t-test? chi-square test?) but presumably it is for any difference in means or proportions between the 4 groups. Most of the p-values are large, indicating little evidence for any difference between the groups in income, age, education, or frequency of tooth-brushing, etc. Based on this table, the populations differed only in their diets, which were characterized as “Asian Diet”, “Hispanic Diet” and “American Diet”. Not surprisingly, the Asians were significantly more likely to report an “Asian Diet” and the Hispanics were significantly more likely to report a “Hispanic Diet”. The Blacks and Whites had similar reported consumption of the “American Diet”, which presumably was the basis for the authors’ assertion that these groups have identical social environments.

To date, there has been no correction made to the Mason et al paper at the PLoS ONE website.  Therefore it is perhaps somewhat premature to speculate on how the authors will address the concern voiced by Jonathan Eisen’s posted comment and blog post that balance across a handful of measured covariates does not in any way imply balance across all relevant factors except for genetics. Indeed, it has long been argued in the epidemiology literature that one cannot make indirect inferences about genes by measuring and adjusting for a few environmental exposures and attributing all remaining differences to genes.  The argument that Blacks and Whites in Ohio experience identical environments is clearly false, even if a handful of measured covariates are not significantly different in their small convenience sample, the exact origin of which is still obscure.
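A related point is that a large p-value in a sample of 48 per group is weak evidence of balance even for the covariates that were measured. The following Python sketch illustrates this under stated assumptions: the test in the authors' table is not specified, so this uses a simple two-group comparison of means with a normal approximation, and the true gap of 0.3 standard deviations is a hypothetical number chosen for illustration.

```python
import math
import random
import statistics


def two_sided_p_value(x, y):
    """Approximate two-sided p-value for a difference in means (normal approximation)."""
    se = math.sqrt(statistics.variance(x) / len(x) + statistics.variance(y) / len(y))
    z = (statistics.mean(x) - statistics.mean(y)) / se
    # For a standard normal, P(|Z| > |z|) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


def miss_rate(true_gap_sd=0.3, n=48, trials=2000, seed=1):
    """Fraction of simulated studies in which a real gap yields p >= 0.05."""
    rng = random.Random(seed)
    misses = 0
    for _ in range(trials):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        y = [rng.gauss(true_gap_sd, 1.0) for _ in range(n)]  # groups truly differ
        if two_sided_p_value(x, y) >= 0.05:
            misses += 1
    return misses / trials


if __name__ == "__main__":
    print("share of studies reporting 'no significant difference':", miss_rate())
```

In this setup the two groups really do differ, yet most simulated studies of this size report a non-significant p-value. And this only concerns measured covariates; it says nothing about the many factors that were never measured.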

There are many observations that can be made from this episode. I offer just a few:

  1. These authors are assiduous in describing their lab techniques, but regarding the study design and analysis they are quite cavalier.  Presumably the reviewers were not population scientists, and so they failed to point out these embarrassing flaws. This raises the question of whether a multidisciplinary journal such as PLoS ONE has the relevant expertise to screen out scientifically invalid papers. The fact that Dr. Silva suggested that the authors’ table of covariates and p-values solves the issue demonstrates a wide gap of understanding.  Specialty journals that handle a narrow disciplinary range are not faced with this kind of crisis of competence.
  2. PLoS ONE is such a large operation with so many papers, that quality control seems to suffer. These beleaguered editors are responsible for an enormous publishing volume. Has quantity overwhelmed quality to the extent that gross errors of logic slip through? Months later, the Mason paper has been accessed thousands of times and generated a great deal of media attention, and yet no correction or erratum has appeared, despite the fact that the authors freely admit that the methods in their published paper are not accurate. 
  3. The publishing model gives PLoS ONE a big incentive (almost $2000) to accept a paper, but once it is published, little incentive to correct or withdraw it. 

Sadly, this is not an isolated example.  This week, PLoS ONE published a paper by Wikoff et al which makes a similar logical gaffe about observed racial difference proving a genetic difference.  We could post a comment online, but it seems that nobody (neither the authors nor the editors) has much time to spend monitoring such comments, nor much incentive to care about them.  The authors have their publication, the journal has its $2000, and another tiny piece of horrific misinformation has been released into the world.  The basic philosophy of PLoS ONE is to reduce the gate-keeper role of scientific publication. I am starting to become convinced that a little gate-keeping is not such a bad idea.

Wrap up of Twitter chat on the human microbiome with a high school bio teacher

It started with this

And eventually we worked out a date, which was yesterday. Now I note I had no idea who these people are/were. But it seemed like a good chance to do some outreach. So I said yes. And yesterday it happened. OK it was chaotic. But it was fun. Here is a Storify of the Tweets.

Microbiome topic of the day: radiation therapy and the microbiome

Just saw this interesting story in the Observer: Cancer scientists to classify gut bacteria to prevent the side-effects of radiotherapy | Science | The Observer.  It discusses an effort to give more consideration to protecting and / or repopulating the microbiome in relation to radiation therapy.  I think this is critically important.  I want to note – people should give some credit to DARPA for being ahead of their time on this issue.  I went to a workshop in 2004 organized by Brett Giroir and Manley Heather.  The topic was “Radiation Protection” and one of the points of discussion was the gut microbiome and the effect of radiation on it.

Anyway – since that meeting I have been following this topic on and off.  And I do think thinking about the microbiome in relation to radiation therapy (and any radiation exposure) is critically important.

Simple microbiome quiz and then mapping function from PGED

Just did this: Map-Ed Genetics: Pin Yourself on Our World Map! (the microbiome one – there are two right now – the other is personal genomics – and others coming).  Best part is browsing the map of other participants afterwards.

mBio – home of some really cool, #openaccess microbiology papers

Am really enjoying the suite of papers coming out in mBio – the Open Access, PLOS ONE-like journal from the American Society for Microbiology. Here are some examples of recent papers that caught my eye:

And many many more.  Kudos to ASM and mBio.

Is Sexxing up your scientific journal OK? The Journal of Proteomics seems to think so

I saw a Tweet from a college classmate of mine –  Jillian Buriak – that pointed me to this article from the Journal of Proteomics in January 2012.

Harry Belafonte and the secret proteome of coconut milk

And this is what one sees when one goes there:

A “Graphical Abstract” with the text “Here is your coconut woman, as perhaps envisioned by Harry Belafonte. For its proteome, though, have a look at the report inside!”.  I guess this is an attempt at a joke about breasts and coconuts?  And how is it appropriate for a scientific paper?

Want to guess about the gender balance of the people who run the journal? Here are the pics from the web site of the main executive editors and officials

Thoughts out there?  Seems pretty inappropriate to me …

UPDATE 3/21 8 AM – Storifying Twitter comments

UPDATE 3/21 9:26 AM Elsevier says they will take down image but haven’t yet.  Bonus – you can download a PPT slide of the brilliant image

UPDATE 3/21 9:34 AM

Some links of relevance

UPDATE 3/22 12:34 PM

An important read on “Impact Factor Mania” from Arturo Casadevall & Ferric Fang

Really important article in mBio: Causes for the Persistence of Impact Factor Mania.

Full citation : Casadevall A, Fang FC. 2014. Causes for the persistence of impact factor mania. mBio 5(2):e00064-14. doi:10.1128/mBio.00064-14.

In the article the authors discuss what they call “Impact Factor Mania” and outline what they believe the causes of it are (hyper competition for funding and jobs, paucity of objective measures of the importance of scientific work, hyper specialization of science, benefits to selected journals, benefits to scientists, national endorsements and prestige by association).  They then discuss some of the problems with such mania including distortions in the scientific enterprise, inability to accurately predict impact, ignoring many important studies, limited correlation between IF and article citation, imperfection of citation rate, delay in communications, and creation of perverse incentives.  They discuss some of the existing proposals for reform including DORA and a boycott of high impact journals.  And finally they discuss what scientists can do including: reforming criteria for funding and promotion, use of diverse metrics, increase interdisciplinary interactions, encourage elite journals to become less elite, and a return to essential scientific values.

The article is a perfect follow up on our recent “Publish or Perish” meeting.

June R Workshop from Pat Schloss

From Pat Schloss/ the Mothur-announce mailing list

Hi mothur fans,
I’ll be hosting another Crashcourse in R Workshop for Microbial Ecologists this June. The workshop will run from June 23rd to 25th near the Detroit airport. The workshop is being filled on a first come, first served basis. The workshop is geared towards people with interests in microbial ecology that would like to learn R or to learn it better so you shouldn’t feel like this isn’t for you if you are a beginner. I assume no previous computer programming experience. The workshop is an even blend of lecture/discussion and hands on use of R with real data and typically uses mothur output files as a starting point. This is a new workshop that I am offering, so here is a rough outline of what the workshop schedule will look like (it is subject to some minor changes)…
Monday:

AM Introduction to R – operations, variable types
PM R basics – getting data in and out, packages

Tuesday:

AM Plotting: Core functions
PM Plotting: Lattice / RGL

Wednesday:

AM Programming in R – Functions, loops
PM Programming in R – Controlling flow

Each day there will be a lecture and discussion interspersed with hands-on activities and it will run from 9 to 5. If you would like to meet with me one-on-one to discuss your project, I can do that in the evening and during breaks. Please email me for more details. Thanks,
Pat Schloss

UC Davis Summer Bioinformatics Workshops — Registration is Open!

Registration is open for the 2014 Bioinformatics Summer Workshops!

Now in its 7th year, the UC Davis Bioinformatics Training Program will be holding two week-long workshops this summer:

June 16-20, 2014: Using Galaxy for Analysis of High Throughput Sequence Data

Sept. 15-19, 2014: Using Command Line for Analysis of High Throughput Sequence Data

These workshops will be held on the UC Davis campus and will run from 9am to 5pm on the dates indicated.

Details

Both workshops will cover modern high throughput sequencing technologies, applications, and ancillary topics, including:

· Illumina HiSeq / MiSeq, and PacBio RS technologies

· Read Quality Assessment & Improvement

· Genome assembly

· SNP and indel discovery

· RNA-Seq differential expression analysis

· Experimental design

· Hardware and software considerations

· Cloud Computing

Each workshop will include a rich collection of lectures and hands-on sessions, covering both theory and tools. We will cover the basics of several high throughput sequencing technologies, but will focus on Illumina and PacBio data for hands-on exercises. Participants will explore software and protocols, create and modify workflows, and diagnose/treat problematic data.

In June, workshop exercises will be performed using the popular Galaxy platform (http://usegalaxy.org) on the Amazon Cloud (http://aws.amazon.com/) which allows for powerful web-based data analyses. There are no prerequisites other than basic familiarity with genomic concepts.

In September, exercises will be performed using the Linux command line. Therefore, for this workshop, it is strongly recommended that participants should also have basic familiarity with the Linux/Unix (or Mac) command line.

Who Should Attend

Prior course participants have included faculty, post docs, grad students, staff, and industry researchers. Anyone with an interest in sequence analysis is welcome!

Registration Info

Attendance is limited to 35 participants per workshop in order to foster an effective learning environment and ensure sufficient one-on-one attention. Course tuition is $1,500 for academic or non-profit participants and $1,800 for other participants. Amazon has kindly provided grants of $100 per participant for Amazon Web Services accounts. This will allow you to perform analysis during and after the course using Amazon’s resources, without purchasing your own high performance computing servers!

To register, click on the links above or go to training.bioinformatics.ucdavis.edu/. We now accept credit cards, as well as UC recharge accounts, for payment. Registration fees include light breakfast, lunch, and snacks, but do not include dinner, lodging or parking fees.

Questions

If you have any questions, please don’t hesitate to contact us:

· Core main telephone line: 530-752-2698

· Core email: bioinformatics.core@ucdavis.edu

See you this summer!

The UC Davis Bioinformatics Core Team

http://training.bioinformatics.ucdavis.edu

http://bioinformatics.ucdavis.edu/