Guest post on "The phone microbiome" from Georgia Barguil in Jack Gilbert’s lab

From @Artologica on Etsy.  The Phonome. 

Today we have a very special guest post from Georgia Barguil in Jack Gilbert’s group at University of Chicago / Argonne National Lab.  Georgia has been coordinating analyses of microbial surveys that have been a collaboration between me and Jack (although really driven by Jack and his lab in most ways).  The study subject: cell phones and shoes.  The study locations: conferences and meetings in order to have participation in microbial surveys by “citizen” scientists of one kind or another.  We did this together at the AAAS meeting.  And then Gilbert’s lab did this at ThirstDC.  And then I did this at SciFoo at Google HQ.  We are working on a paper on this and wanted to get some results out to the community so Georgia wrote up this post.

Ever wanted to know what bacteria are on your shoes and phones? Of course you have! Here we explored the bacteria that call shoes and phones home; the shoes and phones belonged to employees at Google’s Headquarters, and to participants at the Thirst DC and AAAS annual meeting conferences in 2012 (Fig. 1). Altogether, 84 phones (34 from GoogleHQ, 23 from ThirstDC and 27 from AAAS) and 68 shoes (15 from SciFoo, 24 from ThirstDC and 29 from AAAS) were sampled. The DNA of these samples was extracted and the bacteria were identified by sequencing and subsequent computational analysis of a key gene (16S rRNA) found in all bacteria. Here we show some of the results.

Fig. 1: Map showing the 3 sampling locations: AAAS in Vancouver, SciFoo in California and ThirstDC in Washington

There are quite a lot of microorganisms found in these environments, as you can see in the graph below (Fig. 2), where each bar represents a sample and each color represents a group of bacteria. Also by looking at the chart you can see that the bacteria that live on phones and shoes are different, and found in different proportions. Actually, by comparing the bacterial profile from an unidentified sample with this collection, we could tell you whether that sample was from a phone or a shoe!
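The idea of telling a phone from a shoe by its bacterial profile can be sketched as a nearest-centroid comparison. This is only an illustration, not the analysis used in the study: the Bray-Curtis dissimilarity, the function names, the toy abundances and the placeholder genera `soil_genus_A` and `soil_genus_B` are all assumptions made for the example.

```python
def bray_curtis(p, q):
    """Bray-Curtis dissimilarity between two profiles (dict: genus -> abundance)."""
    genera = set(p) | set(q)
    diff = sum(abs(p.get(g, 0.0) - q.get(g, 0.0)) for g in genera)
    total = sum(p.get(g, 0.0) + q.get(g, 0.0) for g in genera)
    return diff / total if total else 0.0


def mean_profile(profiles):
    """Average abundance of every genus across a list of profiles."""
    genera = set().union(*profiles)
    return {g: sum(p.get(g, 0.0) for p in profiles) / len(profiles) for g in genera}


def classify(sample, reference):
    """Return the label whose mean profile is closest to the unknown sample."""
    centroids = {label: mean_profile(ps) for label, ps in reference.items()}
    return min(centroids, key=lambda label: bray_curtis(sample, centroids[label]))


# Toy reference collection (made-up numbers, for illustration only).
reference = {
    "phone": [
        {"Streptococcus": 0.5, "Staphylococcus": 0.3, "Rothia": 0.2},
        {"Streptococcus": 0.4, "Staphylococcus": 0.4, "Micrococcus": 0.2},
    ],
    "shoe": [
        {"soil_genus_A": 0.6, "soil_genus_B": 0.3, "Staphylococcus": 0.1},
        {"soil_genus_A": 0.5, "soil_genus_B": 0.4, "Corynebacterium": 0.1},
    ],
}
unknown = {"Streptococcus": 0.6, "Staphylococcus": 0.3, "Rothia": 0.1}
print(classify(unknown, reference))
```

With real data one would more likely train a proper classifier on many samples, but the principle is the same: the unknown sample is assigned to whichever group it resembles most.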

Fig. 2: Genus-level diversity and abundance of bacteria associated with phone and shoe samples.

In the shoe samples you can see a lot more colors, which implies that the shoes are home to more bacterial groups than the phones. Out of 560 groups of bacteria found, there were 90 that favored either shoes or phones; 70 of these groups favored the shoe environment while the other 20 favored the phone. Some of the groups that preferred the phones were:

  • Streptococcus (dark green)- many streptococcal species are nonpathogenic, and form part of the commensal human microbiome of the mouth, skin, intestine, and upper respiratory tract.
  • Staphylococcus (brown)- most species of this genus are harmless and reside normally on the skin and mucous membranes of humans and other organisms.
  • Rothia (gray)- is a common inhabitant of the human oral cavity and respiratory tract. Some species were identified as gluten-degrading natural colonizers of the upper gastro-intestinal tract.
  • Actinomyces (army green)- normally present in the gingival area, they are part of the commensal flora and are the most common cause of infection in dental procedures and oral abscesses. Many Actinomyces species are opportunistic pathogens of humans and other mammals, particularly in the oral cavity. In rare cases, these bacteria can cause actinomycosis, a disease characterized by the formation of abscesses in the mouth, lungs, or the gastrointestinal tract.
  • Prevotella (red)- has been a problem for dentists for years. As a human pathogen known for creating periodontal and tooth problems, Prevotella has long been studied in order to counteract its pathogenesis.
  • Gemella (bright yellow)- group of bacteria primarily found in the mucous membranes of humans and other animals, particularly in the oral cavity and upper digestive tract.
  • Micrococcus (pale green)- have been isolated from human skin.
  • Corynebacterium (yellow)- occurs commonly in nature in the soil, water, plants, and food products. The non-pathogenic Corynebacterium species can even be found in the mucosa and normal skin flora of humans and animals.
  • Propionibacterium (pale blue)- members of this group are primarily facultative parasites and commensals of humans and other animals, living in and around the sweat glands, sebaceous glands, and other areas of the skin. They are virtually ubiquitous and do not cause problems for most people, but some propionibacteria have been implicated in acne and other skin conditions.

It is evident that all of these groups are commonly found in the skin and mucous membranes of humans, so it is expected that these groups occur in phones due to the close contact with the hands, face, mouth and breath.

In the plot below (Fig. 3), phones (blue squares) and shoes (orange triangles) from all sampling locations were analyzed together and you can see that phones harbor a very different community to shoes (in fact this is a statistically significant difference) – but shoes all look quite similar while phone microbiomes are actually quite variable. It may be possible that the microbiome of your phone is reasonably unique to you, and that we could tell whose phone was whose by the microbes that lived on the phone.

Fig. 3: Principal coordinate analysis (PCoA) plot using the UniFrac distance obtained for all phone (blue squares) and shoe (orange triangles) samples.
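For readers curious how a PCoA plot is produced from a table of pairwise distances, here is a minimal sketch in plain Python. It assumes a symmetric distance matrix given as a list of lists, and it swaps in power iteration with deflation where real tools use a library eigensolver, so treat it as a teaching aid rather than what was run for these figures. The function name `pcoa` and the toy distances are assumptions for the example.

```python
import math


def pcoa(dist, n_axes=2, n_iter=200):
    """Classical PCoA: double-center -0.5*D^2, then take the leading eigenvectors."""
    n = len(dist)
    a = [[-0.5 * dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in a]
    grand_mean = sum(row_mean) / n
    # Gower centering (the matrix is symmetric, so column means equal row means).
    b = [[a[i][j] - row_mean[i] - row_mean[j] + grand_mean for j in range(n)]
         for i in range(n)]
    coords = [[0.0] * n_axes for _ in range(n)]
    for axis in range(n_axes):
        v = [math.sin(i + 1.0 + axis) for i in range(n)]  # deterministic start vector
        for _ in range(n_iter):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:  # nothing left to explain
                break
            v = [x / norm for x in w]
        w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        eigenvalue = sum(v[i] * w[i] for i in range(n))
        if eigenvalue < 1e-9:  # remaining axes carry no positive variance
            break
        for i in range(n):
            coords[i][axis] = v[i] * math.sqrt(eigenvalue)
        # Deflate so the next pass finds the next axis.
        b = [[b[i][j] - eigenvalue * v[i] * v[j] for j in range(n)] for i in range(n)]
    return coords


# Toy example: three samples that lie on a line at positions 0, 1 and 3.
toy = [[0.0, 1.0, 3.0],
       [1.0, 0.0, 2.0],
       [3.0, 2.0, 0.0]]
for point in pcoa(toy):
    print(point)
```

In the toy example all of the variation falls on the first axis, and the spacing between the points along that axis reproduces the original distances.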

When dividing the samples according to geographical location instead of phones/shoes (Fig. 4), the three sampling locations do not form discrete clusters, and are not statistically significantly different (p>0.05), which suggests that no matter the geographical location you sample, you will find similar bacterial communities.

Fig. 4: PCoA plot using the UniFrac distance obtained for both phone and shoe samples from the 3 sampling locations. The red squares represent AAAS samples, while the blue circles and orange triangles represent SciFoo and ThirstDC, respectively.

However, if we only consider the bacteria found on shoes (Fig. 5), then GoogleHQ (green circle) is statistically different from both AAAS (red square) and ThirstDC (blue triangle). This difference is mostly due to a higher abundance of Corynebacterium and Kocuria groups found in the GoogleHQ shoe samples.

Fig. 5: PCoA plot using the UniFrac distance obtained for all shoe samples from SciFoo (green circles), AAAS (red squares) and ThirstDC (blue triangles).

The microbiota found on phones was highly similar among the three sampling locations (Fig. 6), indicating that phones tend to harbor the same groups of microorganisms even in different locations, regardless of the phone model and owner microbiota. As can be observed in the plot below, phone samples from AAAS (red squares), ThirstDC (orange triangles) and SciFoo (blue circles) are interspersed.

Fig. 6: PCoA plot using the UniFrac distance obtained for all phone samples in the 3 sampling locations. GoogleHQ is represented by the blue circles, while Thirst DC and AAAS are represented by orange triangles and red squares, respectively.

In conclusion, there were more biological differences between shoes and phones than between the three geographical locations. Phones and shoes harbored microbiomes representing the environments they most often came into contact with. Phones were closely related to the skin and upper respiratory tract, and shoes reflected the bacteria found in soil and the environment.

Although many of the groups found both in shoes and phones have pathogenic representatives, you should not be scared, as it does not mean that you are going to get sick. Most of the isolated, characterized and sequenced bacterial groups available in the sequence databases are the pathogenic ones, precisely because of their importance to human health: studying them aids in the diagnosis and treatment of diseases. Some of the “relatives” of these pathogenic bacteria are actually ‘good guys’ that are usually present in your normal microbiota and do not represent any risk; in fact, they may actually be preventing the ‘bad guys’ from growing on your phone!  On the other hand, it is always a good idea to clean your cell phone screen once in a while, just to be safe.

For some other reading about the phone sampling efforts see

Nice review on HiSeq/MiSeq rRNA sequencing from Caporaso et al #microbes

Quick post — nice review worth checking out: The ISME Journal – Ultra-high-throughput microbial community analysis on the Illumina HiSeq and MiSeq platforms

from Caporaso JG, Lauber CL, Walters WA, Berg-Lyons D, Huntley J, Fierer N, Owens SM, Betley J, Fraser L, Bauer M, Gormley N, Gilbert JA, Smith G, Knight R.

A key part of the paper, with highlighting from me:-

These observations, in agreement with studies that have addressed this question directly (Kuczynski et al., 2010), suggest that increasing the sequencing depth is not likely to provide additional insight into questions of beta diversity, and we therefore argue that (for questions of beta diversity in particular) the decreased cost of sequencing should be applied to study microbial systems using many more samples, for example, in dense temporal or spatial analyses, rather than with many more sequences per sample.  Of course, if the objective is to identify taxa that are very rare in communities, deeper sequencing will be advantageous. Additionally we note that while as few as 10 sequences per sample may be useful for differentiating very different environment types (for example, soil and feces), as environments become more similar (for example, two soil samples of different pH) more sequences will be required to differentiate them.
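One way to explore the depth question on your own data is to subsample every sample down to the same, smaller number of sequences and see whether the between-sample patterns change. The sketch below shows that subsampling step only; the function name `rarefy`, the fixed seed and the toy counts are assumptions for illustration, not anything taken from the paper.

```python
import random
from collections import Counter


def rarefy(counts, depth, seed=0):
    """Randomly keep `depth` reads (without replacement) from a taxon -> count table.

    Returns None when the sample has fewer than `depth` reads.
    """
    if sum(counts.values()) < depth:
        return None
    # Expand the table into one entry per read, then draw without replacement.
    pool = [taxon for taxon, n in counts.items() for _ in range(n)]
    rng = random.Random(seed)
    return Counter(rng.sample(pool, depth))


# Toy sample with 1,000 reads, subsampled to 10 and to 500 reads.
sample = {"Streptococcus": 700, "Staphylococcus": 250, "Rothia": 50}
print(rarefy(sample, 10))
print(rarefy(sample, 500))
```

At very low depths the rare taxa tend to drop out, which is consistent with the point in the passage above that shallow sequencing can separate very different environments but deeper sequencing is needed to find rare members.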

PCR amplification and pyrosequencing of rpoB as complement to rRNA

Figure 1. Number of OTUs as a function of fractional sequence difference (OTU cut-off) for the 16S rRNA marker gene (A) and the rpoB marker gene (B).

Interesting new paper in PLoS One: PLoS ONE: A Comparison of rpoB and 16S rRNA as Markers in Pyrosequencing Studies of Bacterial Diversity

In the paper they test and use PCR amplification and pyrosequencing of the rpoB gene for studies of the diversity of bacteria. Due to the lower level of conservation of rpoB than rRNA genes at the DNA level they focused on proteobacteria. It seems that with a little perseverance one can get PCR for protein coding genes to work reasonably well for even reasonably broad taxonomic groups (not totally new here but I am not aware of too many papers doing this with pyrosequencing). Anyway, the paper is worth a look.

 Vos M, Quince C, Pijl AS, de Hollander M, Kowalchuk GA (2012) A Comparison of rpoB and 16S rRNA as Markers in Pyrosequencing Studies of Bacterial Diversity. PLoS ONE 7(2): e30600. doi:10.1371/journal.pone.0030600

Slideshow w/ audio of my talk on "A Field Guide to the Microbes" from the AAAS Meeting #AAASMtg

I recorded the audio of my talk on “Towards a field guide to the microbes” from the AAAS meeting on Saturday AM. Here is a slideshow of the talk with audio synched to the slides (I did this using Keynote on a Mac with the “record Slideshow” function).

My slides from the talk are available at Slideshare.

First "Guardians of microbial diversity" award to Rob Dunn #microbiology #GMDs

For this I am awarding him the first of what will be many “Guardians of Microbial Diversity” awards here (we can just call these the GMDs). Not only will he get an award – I am going to send him a GMD gift from the various GMD doodads I am putting together.
Congratulations Rob.  Now off to design some more diverse GMD doodads. 

Fun with microbial diversity studies: SitePainter

Nice new tool out there: SitePainter: A tool for exploring biogeographical patterns by Antonio Gonzalez, Jesse Stombaugh, Christian L. Lauber, Noah Fierer and Rob Knight.

I saw the paper and figured I would see if I could get it up and running.

First I downloaded the software from SourceForge.

Then I uncompressed it and tried to run it.  When I opened the index.html file in Chrome I got a message saying it only worked in Firefox.  So then I opened it in Firefox and I got an error saying it only worked in newer versions of Firefox.  So I downloaded the latest version of Firefox and then finally the tool opened.  I then followed the simple tutorial they have provided and voilà, I was up and running in a few minutes.

Completely cool.  Going to definitely have to try this out with our own site-variable microbial data.

Earth Microbiome Project session at AAAS in Vancouver 2/18

This should be fun – session at AAAS meeting in Vancouver.

The Earth Microbiome Project: Modeling the Microbial Planet

Saturday, 18 February 8:30AM-11:30AM

Organized by: Jack A. Gilbert, Argonne National Laboratory, IL


  • Folker Meyer, Argonne National Laboratory, IL 
    • Developing the Metagenome Data Exchange Format
  • Jonathan Eisen, University of California, Davis
    • Toward a Field Guide to the Microbes
  • Rob Knight, University of Colorado, Boulder
    • Uncovering Novel Bioinformatic Techniques for Exploring Microbial Life
  • Rick Stevens, Argonne National Laboratory, IL
    • High-Performance Computing and Modeling the Microbial World

Journal club today on bacteria in toilets – posting some notes here

I am heading a journal club discussion today of the following paper: PLoS ONE: Microbial Biogeography of Public Restroom Surfaces
I am going to use this page/post to put up some notes for the discussion.  Fortunately I have a good guide in this – Rob Dunn wrote a nice commentary/review for Scientific American blogs: Public bathrooms house thousands of kinds of bacteria
Stay tuned/come back to this page as I will be posting some more notes. Any suggestions for other things to look at/discuss would be welcome.

Notes (I note – I am copying much of the text from the paper not rewriting it.)
What was sampled?

Ten surfaces (door handles into and out of the restroom, handles into and out of a restroom stall, faucet handles, soap dispenser, toilet seat, toilet flush handle, floor around the toilet and floor around the sink) in six male and six female restrooms evenly distributed across two buildings on the University of Colorado at Boulder campus were sampled on a single day in November 2010. 

How did they collect samples?

Surfaces were sampled using sterile, cotton-tipped swabs as described previously [14], [15]. As the 12 restrooms were nearly identical in design, we were able to swab the same area at each location between restrooms. In order to characterize tap water communities as a potential source of bacteria, 1 L of faucet water from six of the restrooms (each building having the same water source for each restroom sampled) was collected and filtered through 0.2 µm bottle top filters (Nalgene, Rochester, NY, USA). 

How did they get DNA?

Genomic DNA was extracted from the swabs and filters using the MO BIO PowerSoil DNA isolation kit following the manufacturer’s protocol with the modifications of Fierer et al. [14]. 

How did they get sequence data?

A portion of the 16S rRNA gene spanning the V1–V2 regions was amplified using the primer set (27F/338R), PCR mixture conditions and thermal cycling conditions described in Fierer et al. [15]. PCR amplicons of triplicate reactions for each sample were pooled at approximately equal amounts and pyrosequenced at 454 Life Sciences (Branford, CT, USA) on their GS Junior system. A total of 337,333 high-quality partial 16S rRNA gene sequences were obtained from 101 of the 120 surface samples collected, averaging approximately 3,340 sequences per sample (ranging from 513–6,771) (Table S1) in 4 GS Junior runs, with the best run containing 116,004 high-quality reads. An additional 16,416 sequences (ranging from 2,161–5,084 per sample) were generated for five of the six water samples collected for source tracking analysis. Each sample was amplified with a unique barcode to enable multiplexing in the GS Junior runs. The barcoded sequencing reads can be separated by data analysis software providing high confidence in assigning sequencing read to each sample. Sequence data generated as part of this study is available upon request by contacting the corresponding author.
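The barcode step is easy to picture in code. Below is a minimal sketch of demultiplexing, assuming every read starts with its barcode, that all barcodes have the same length, and that only exact matches are accepted. The barcodes, sample names and reads are hypothetical; the study itself did this step with QIIME.

```python
def demultiplex(reads, barcode_to_sample):
    """Sort reads into samples by exact match of the leading barcode."""
    barcode_len = len(next(iter(barcode_to_sample)))  # assumes equal-length barcodes
    by_sample = {sample: [] for sample in barcode_to_sample.values()}
    unassigned = []
    for read in reads:
        barcode, insert = read[:barcode_len], read[barcode_len:]
        sample = barcode_to_sample.get(barcode)
        if sample is None:
            unassigned.append(read)  # barcode not recognized: set the read aside
        else:
            by_sample[sample].append(insert)  # keep the read with the barcode removed
    return by_sample, unassigned


# Hypothetical 4-base barcodes and reads, for illustration only.
barcodes = {"ACGT": "door_handle", "TGCA": "toilet_seat"}
reads = ["ACGTGGGCCC", "TGCAAAATTT", "NNNNGGGG", "ACGTTTTT"]
by_sample, unassigned = demultiplex(reads, barcodes)
print(by_sample)
print(unassigned)
```

Requiring an exact barcode match is the conservative choice: a read with a sequencing error in its barcode is discarded rather than risk assigning it to the wrong sample.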

How did they analyze the data?

All sequences generated for this study and previously published data sets used for source tracking (see below) were processed and sorted using the default parameters in QIIME [16]. Briefly, high-quality sequences (>200 bp in length, quality score >25, exact match to barcode and primer, and containing no ambiguous characters) were trimmed to 300 bp and clustered into operational taxonomic units (OTUs) at 97% sequence identity using UCLUST [17]. Representative sequences for each OTU were then aligned using PyNAST [18] against the Greengenes core set [19] and assigned taxonomy with the RDP-classifier [20]. Aligned sequences were used to generate a phylogenetic tree with FastTree [21] for both alpha- (phylogenetic diversity, PD)[22] and beta-diversity (unweighted UniFrac) [23] metrics. The unweighted UniFrac metric, which only accounts for the presence/absence of taxa and not abundance, was used to determine the phylogenetic similarity of the bacterial communities associated with the various restroom surfaces. The UniFrac distance matrix was imported into PRIMER v6 where principal coordinate analysis (PCoA) and analysis of similarity (ANOSIM) were conducted to statistically test the relationship between the various communities [24]. In order to eliminate potential biases introduced by sampling depth, all samples (including those used in source tracking) were rarified to 500 sequences per sample for taxonomic, alpha-diversity (PD), beta-diversity (UniFrac) and source tracking comparisons.
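To make the presence/absence idea behind unweighted UniFrac concrete, here is a minimal sketch. It assumes the phylogenetic tree has already been reduced to a list of branches, each with a length and the set of tips (OTUs) below it; the distance is then the branch length leading to only one of the two samples divided by the branch length leading to either. The data layout, function name and toy tree are assumptions for illustration and not how QIIME stores trees.

```python
def unweighted_unifrac(branches, sample_a, sample_b):
    """Fraction of observed branch length that is unique to one of two samples.

    branches: list of (length, set_of_tips_below_branch)
    sample_a, sample_b: sets of tips (OTUs) observed in each sample
    """
    unique = 0.0
    observed = 0.0
    for length, tips in branches:
        in_a = bool(tips & sample_a)
        in_b = bool(tips & sample_b)
        if in_a or in_b:
            observed += length
            if in_a != in_b:  # branch leads to only one of the two samples
                unique += length
    return unique / observed if observed else 0.0


# Toy tree ((A:1,B:1):1,(C:1,D:1):1) written out branch by branch.
tree = [
    (1.0, {"A"}), (1.0, {"B"}), (1.0, {"A", "B"}),
    (1.0, {"C"}), (1.0, {"D"}), (1.0, {"C", "D"}),
]
print(unweighted_unifrac(tree, {"A", "B"}, {"B", "C"}))
print(unweighted_unifrac(tree, {"A"}, {"C"}))
```

Because only presence or absence matters, a taxon seen once counts exactly as much as one seen thousands of times, which is one reason rarefying all samples to the same depth matters for this metric.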


To determine the potential sources of bacteria on restroom surfaces and how the importance of different sources varied across the sampled locations, we used the newly developed SourceTracker software package [25]. The SourceTracker model assumes that each surface community is merely a mixture of communities deposited from other known or unknown source environments and, using a Bayesian approach, the model provides an estimate of the proportion of the surface community originating from each of the different sources. When a community contains a mixture of taxa that do not match any of the source environments, that portion of the community is assigned to an “unknown” source. Potential sources we examined included human skin (n = 194), mouth (n = 46), gut (feces) (n = 45) [26] and urine (n = 50), as well as soil (n = 88) [27] and faucet water (n = 5, see above). For skin communities, sequences collected from eight body habitats (palm, index finger, forearm, forehead, nose, hair, labia minora, glans penis) from seven to nine healthy adults on four occasions were used to determine the average community composition of human skin [26]. The mouth (tongue and cheek swabs), gut and urine communities were determined from the same individuals although the urine-associated communities were not published in the initial report of these data [26]. While urine is generally considered to be sterile, it does pick up bacteria associated with the urethra and genitals [28], [29]. The average soil community was determined from a broad diversity of soil types collected across North and South America [27].
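The mixture idea can be illustrated with a much simpler stand-in for SourceTracker. The sketch below is not the SourceTracker algorithm (which uses Bayesian Gibbs sampling and models uncertainty in the sources); it swaps in a plain expectation-maximization estimate of mixing proportions, with fixed source profiles, and reports reads from taxa absent in every source as "unknown". The function name, source profiles and counts are hypothetical.

```python
def estimate_sources(sink_counts, sources, n_iter=100):
    """EM estimate of how much of a sink sample comes from each source.

    sink_counts: dict taxon -> read count in the surface (sink) sample
    sources: dict source name -> dict taxon -> relative abundance
    """
    names = list(sources)
    mix = {name: 1.0 / len(names) for name in names}  # start from equal proportions
    total = sum(sink_counts.values())
    assigned = {name: 0.0 for name in names}
    for _ in range(n_iter):
        assigned = {name: 0.0 for name in names}
        for taxon, count in sink_counts.items():
            weights = {name: mix[name] * sources[name].get(taxon, 0.0) for name in names}
            norm = sum(weights.values())
            if norm == 0.0:
                continue  # taxon not found in any source: left as "unknown"
            for name in names:
                assigned[name] += count * weights[name] / norm
        explained = sum(assigned.values())
        if explained == 0.0:
            break
        mix = {name: assigned[name] / explained for name in names}
    result = {name: assigned[name] / total for name in names}
    result["unknown"] = 1.0 - sum(result.values())
    return result


# Hypothetical source profiles and sink counts, for illustration only.
toy_sources = {
    "skin": {"Staphylococcus": 0.5, "Propionibacterium": 0.5},
    "gut": {"Bacteroides": 0.5, "Ruminococcus": 0.5},
}
toy_sink = {"Staphylococcus": 30, "Propionibacterium": 30,
            "Bacteroides": 15, "Ruminococcus": 15, "Novel": 10}
print(estimate_sources(toy_sink, toy_sources))
```

In this toy case the sources do not share any taxa, so the answer is simply the fraction of reads matching each source. The real problem is harder precisely because sources overlap and are themselves only estimated from samples.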

Notes on Sourcetracking

Abstract to paper:

Contamination is a critical issue in high-throughput metagenomic studies, yet progress toward a comprehensive solution has been limited. We present SourceTracker, a Bayesian approach to estimate the proportion of contaminants in a given community that come from possible source environments. We applied SourceTracker to microbial surveys from neonatal intensive care units (NICUs), offices and molecular biology laboratories, and provide a database of known contaminants for future testing.

Some lines from paper

We developed SourceTracker, a Bayesian approach to identifying sources and proportions of contamination in marker-gene and functional metagenomics studies. Our approach models contamination as a mixture of entire source communities into a sink community, where the mixing proportions are unknown.

SourceTracker’s distinguishing features are its direct estimation of source proportions and its Bayesian modeling of uncertainty about known and unknown source environments.

SourceTracker outperformed these methods (NAIVE BAYES AND RANDOM FORESTS) because it allows uncertainty in the source and sink distributions, and because it explicitly models a sink sample as a mixture of sources.

SourceTracker also assumes that an environment cannot be both a source and a sink, and we recommend research into bidirectional models.

Based on our results, simple analytical steps can be suggested for tracking sources and assessing contamination in newly acquired datasets. Although source-tracking estimates are limited by the comprehensiveness of the source environments used for training, large-scale projects such as the Earth Microbiome Project will dramatically expand the availability of such resources. SourceTracker is applicable not only to source tracking and forensic analysis in a wide variety of microbial community surveys (where did this biofilm come from?), but also to shotgun metagenomics and other population-genetics data. We made our implementation of SourceTracker available as an R package (, and we advocate automated tests of deposited data to screen samples that may be contaminated before deposition.

Who was there?

A total of 19 phyla were observed across all restroom surfaces with most sequences (≈92%) classified to one of four phyla: Actinobacteria, Bacteroidetes, Firmicutes or Proteobacteria (Figure 1A, Table S2). Previous cultivation-dependent and –independent studies have also frequently identified these as the dominant phyla in a variety of indoor environments [10]–[13]. Within these dominant phyla, taxa typically associated with human skin (e.g. Propionibacteriaceae, Corynebacteriaceae, Staphylococcaceae and Streptococcaceae) [30] were abundant on all surfaces (Figure 1A). The prevalence of skin bacteria on restroom surfaces is not surprising as most of the surfaces sampled come into direct contact with human skin, and previous studies have shown that skin associated bacteria are generally resilient and can survive on surfaces for extended periods of time [31], [32]. Many other human-associated taxa, including several lineages associated with the gut, mouth and urine, were observed on all surfaces (Figure 1A). Overall, these results demonstrate that, like other indoor environments that have been examined, the microbial communities associated with public restroom surfaces are predominantly composed of human-associated bacteria.

Figure 1. Taxonomic composition of bacterial communities associated with public restroom surfaces.
(A) Average composition of bacterial communities associated with restroom surfaces and potential source environments. (B) Taxonomic differences were observed between some surfaces in male and female restrooms. Only the 19 most abundant taxa are shown. For a more detailed taxonomic breakdown by gender including some of the variation see Supplemental Table S2.

Comparative analysis

Comparisons of the bacterial communities on different restroom surfaces revealed that the communities clustered into three general categories: those communities found on toilet surfaces (the seat and flush handle), those communities on the restroom floor, and those communities found on surfaces routinely touched with hands (door in/out, stall in/out, faucet handles and soap dispenser) (Figure 2, Table 1). By examining the relative abundances of bacterial taxa across all of the restroom samples, we can identify taxa driving the overall community differences between these three general categories. Skin-associated bacteria dominate on those surfaces (the circles in Figure 2) that are routinely and exclusively (we hope) touched by hands and unlikely to come into direct contact with other body parts or fluids (Figure 3A). In contrast, toilet flush handles and seats (the asterisk-shaped symbols in Figure 2) were relatively enriched in Firmicutes (e.g. Clostridiales, Ruminococcaceae, Lachnospiraceae, etc.) and Bacteroidetes (e.g. Prevotellaceae and Bacteroidaceae) (Figure 3B). These taxa are generally associated with the human gut [26], [33]–[35] suggesting fecal contamination of these surfaces. Fecal contamination could occur either via direct contact (with feces or unclean hands) or indirectly as a toilet is flushed and water splashes or is aerosolized [36]–[38]. From a public health perspective, the high number of gut-associated taxa throughout the restrooms is concerning because enteropathogenic bacteria could be dispersed in the same way as human commensals. Floor surfaces harbored many low abundance taxa (Table S2) and were the most diverse bacterial communities, with an average of 229 OTUs per sample versus most of the other sampled locations having less than 150 OTUs per sample on average (Table S1). 
The high diversity of floor communities is likely due to the frequency of contact with the bottom of shoes, which would track in a diversity of microorganisms from a variety of sources including soil, which is known to be a highly-diverse microbial habitat [27], [39]. Indeed, bacteria commonly associated with soil (e.g. Rhodobacteraceae, Rhizobiales, Microbacteriaceae and Nocardioidaceae) were, on average, more abundant on floor surfaces (Figure 3C, Table S2). Interestingly, some of the toilet flush handles harbored bacterial communities similar to those found on the floor (Figure 2, Figure 3C), suggesting that some users of these toilets may operate the handle with a foot (a practice well known to germaphobes and those who have had the misfortune of using restrooms that are less than sanitary).

Figure 2. Relationship between bacterial communities associated with ten public restroom surfaces.
Communities were clustered using PCoA of the unweighted UniFrac distance matrix. Each point represents a single sample. Note that the floor (triangles) and toilet (asterisks) surfaces form clusters distinct from surfaces touched with hands.

Table 1. Results of pairwise comparisons for unweighted UniFrac distances of bacterial communities associated with various surfaces of public restrooms on the University of Colorado campus using the ANOSIM test in Primer v6.

Figure 3. Cartoon illustrations of the relative abundance of discriminating taxa on public restroom surfaces.
Light blue indicates low abundance while dark blue indicates high abundance of taxa. (A) Although skin-associated taxa (Propionibacteriaceae, Corynebacteriaceae, Staphylococcaceae and Streptococcaceae) were abundant on all surfaces, they were relatively more abundant on surfaces routinely touched with hands. (B) Gut-associated taxa (Clostridiales, Clostridiales group XI, Ruminococcaceae, Lachnospiraceae, Prevotellaceae and Bacteroidaceae) were most abundant on toilet surfaces. (C) Although soil-associated taxa (Rhodobacteraceae, Rhizobiales, Microbacteriaceae and Nocardioidaceae) were in low abundance on all restroom surfaces, they were relatively more abundant on the floor of the restrooms we surveyed. Figure not drawn to scale.

Comparisons 2 (Gender)

While the overall community level comparisons between the communities found on the surfaces in male and female restrooms were not statistically significant (Table S3), there were gender-related differences in the relative abundances of specific taxa on some surfaces (Figure 1B, Table S2). Most notably, Lactobacillaceae were clearly more abundant on certain surfaces within female restrooms than male restrooms (Figure 1B). Some species of this family are the most common, and often most abundant, bacteria found in the vagina of healthy reproductive age women [40], [41] and are relatively less abundant in male urine [28], [29]. Our analysis of female urine samples collected as part of a previous study [26] (Figure 1A), found that Lactobacillaceae were dominant in urine, therefore implying that surfaces in the restrooms where Lactobacillaceae were observed were contaminated with urine. Other studies have demonstrated a similar phenomenon, with vagina-associated bacteria having also been observed in airplane restrooms [11] and a child day care facility [10]. As we found that Lactobacillaceae were most abundant on toilet surfaces and those touched by hands after using the toilet (with the exception of the stall in), they were likely dispersed manually after women used the toilet. Coupling these observations with those of the distribution of gut-associated bacteria indicate that routine use of toilets results in the dispersal of urine- and fecal-associated bacteria throughout the restroom. While these results are not unexpected, they do highlight the importance of hand-hygiene when using public restrooms since these surfaces could also be potential vehicles for the transmission of human pathogens. Unfortunately, previous studies have documented that college students (who are likely the most frequent users of the studied restrooms) are not always the most diligent of hand-washers [42], [43].

Source Tracking

Human sources:

Results of SourceTracker analysis support the taxonomic patterns highlighted above, indicating that human skin was the primary source of bacteria on all public restroom surfaces examined, while the human gut was an important source on or around the toilet, and urine was an important source in women’s restrooms (Figure 4, Table S4). 

Soil not an apparent source:

Contrary to expectations (see above), soil was not identified by the SourceTracker algorithm as being a major source of bacteria on any of the surfaces, including floors (Figure 4). Although the floor samples contained family-level taxa that are common in soil, the SourceTracker algorithm probably underestimates the relative importance of sources, like soils, that contain highly diverse bacterial communities with no dominant OTUs and minimal overlap between those OTUs in the sources and those found in the surface samples. As soils typically have large numbers of OTUs that are rare (i.e. represented by very few sequences) and the OTU overlap between different soil samples is very low [27], it is difficult to identify specific OTUs indicative of a soil source. 

Other potential sources:

The other potential sources we examined, mouth and faucet water, made only minor bacterial contributions to restroom surface communities either because these potential source environments rarely come into contact with restroom surfaces (the mouth – we hope) or they harbor relatively low concentrations of bacteria (faucet water) (Figure 4). While we were able to identify the primary sources for most of the surfaces sampled, many other sources, such as ventilation systems or mops used by the custodial staff, could also be contributing to the restroom surface bacterial communities. More generally, the SourceTracker results demonstrate how direct comparison of bacterial communities from samples of various environment types to those gathered from other settings can be used to determine the relative contribution of that source across samples. Although many of the source-tracking results evident from the restroom surfaces sampled here are somewhat obvious, this may not always be the case in other environments or locations. We could use the same techniques to identify unexpected sources of bacteria from particular environments as was observed recently for outdoor air [44].

Figure 4. Results of SourceTracker analysis showing the average contributions of different sources to the surface-associated bacterial communities in twelve public restrooms.
The “unknown” source is not shown but would bring the total of each sample up to 100%.

While we have known for some time that human-associated bacteria can be readily cultivated from both domestic and public restroom surfaces, little was known about the overall composition of microbial communities associated with public restrooms or the degree to which microbes can be distributed throughout this environment by human activity. The results presented here demonstrate that human-associated bacteria dominate most public restroom surfaces and that distinct patterns of dispersal and community sources can be recognized for microbes associated with these surfaces. Although the methods used here did not provide the degree of phylogenetic resolution to directly identify likely pathogens, the prevalence of gut and skin-associated bacteria throughout the restrooms we surveyed is concerning since enteropathogens or pathogens commonly found on skin (e.g. Staphylococcus aureus) could readily be transmitted between individuals by the touching of restroom surfaces.

Supporting Information

Public restroom surfaces sampled and comparison of alpha-diversity metrics for each restroom surface. Note that all alpha-diversity values were determined from 500 randomly selected sequences from each sample.

Average taxonomic composition of bacterial communities associated with female (F) and male (M) public restroom surfaces. Numbers in parentheses indicate the standard error of the mean (SEM). Taxonomy was determined using the RDP-classifier for 500 randomly selected sequences from each sample.

Results of ANOSIM test comparing the bacterial communities associated with male and female restroom surfaces.

Results of SourceTracker analysis showing percentage of microbial community contributions of different source environments to restroom surfaces. Values are the average of ten resamplings with the standard error of the mean reported in parentheses.

Guest post from Antarctica: Joe Grzymski (@grzymski) on "The Story Behind Nitrogen Cost-Minimization"

Well, this is getting really fun. I have been doing “The Story Behind the Paper” posts for my own papers for a while and recently opened this up to guest posts. And the one today is coming to us from the true wilds – Antarctica. Joe Grzymski (aka @grzymski on Twitter) is out there doing field work (yes, microbiologists have the best field sites …). For more on the field project see the Desert Research Institute’s “Mission Antarctica” site. Joe responded to my request for more guest posts and wrote up a really nice discussion of a recent open access paper of his from the ISME Journal. If anyone else is interested in writing a guest post on an open access paper or an issue in open access, let me know. Without any further ado, below is Joe’s post.

I thoroughly enjoy reading Jonathan’s posts detailing – far beyond what can possibly be included in published papers – the who, what, where, when, why and how of science. The story behind the potential fourth domain of life article in PLOS ONE provides great detail about how science is done. After reading Matthew Hahn’s insightful history and commentary on his ortholog conjecture paper I was happy to reply to the request for more “stories” and am chiming in from Antarctica (where I am currently doing field research) to discuss the story behind our recent paper in ISME J, “The significance of nitrogen cost minimization in the proteomes of marine microorganisms”. I hope it will provide another example of how a lot of science is lost in final, streamlined, published versions. Also, it is work that was largely done by an undergraduate and was vigorously and carefully reviewed – the improvements and expansion of ideas because of great reviewers highlights the best of the review process. What started out as a short two-page paper morphed into a larger piece of research – not things you can properly detail in a manuscript.

What was the origin of the idea?

The story behind this paper begins in 1997 when I was in graduate school at Rutgers University. Paul Falkowski joined the faculty right around the time when he published a seminal paper, “The evolution of the nitrogen cycle and its influence on the biological sequestration of CO2 in the ocean.” Paul’s office was across from an office I shared with Jay Cullen (who will factor into the story later); Paul was on my committee and influential in how and what I studied in grad school and as a postdoc. He constantly kept us on our toes (to say the least). Many of the implications of our recent paper were guided by his thoughts and original work on evolution of the nitrogen cycle and many papers on the functional and ecological factors that dictate the structure of phytoplankton communities. There are many papers here by Paul and the awesome Oscar Schofield, my primary dissertation adviser. Incidentally, I overlapped with Felisa Wolfe-Simon at Rutgers for a few years; she was in the science news recently [#arseniclife], and we had common advisers.

Paul’s paper was pre-genomics – but its scope and breadth are strengthened by recent work on isolates, environmental genomes and transcriptomes from the ocean. Simple mass balance says that the reason why we have oil buried deep in the earth and oxygen in the atmosphere is because photosynthesis (net carbon fixation and oxygenation of the atmosphere) exceeds respiration. During long periods of time, organisms draw down CO2, and it gets sequestered from the atmosphere. In his paper, Paul details an inextricable link between the ratios of nitrogen fixation and denitrification (across geological periods) to the potential draw down of CO2 by particulate organic carbon (namely, large sinking diatoms). That is, if nitrogen fixation is abundant and denitrification is zero, there is more available inorganic nitrogen (in the form of nitrate) in the surface ocean for phytoplankton to utilize and carbon sequestration increases. His paper further details why fixed nitrogen is limiting in the ocean surface across geological scales. It boils down to iron limitation, the specialization required to harness the beastly, triple-bond cracking but woefully inefficient nitrogenase enzyme (which has a high Fe requirement) and also the easier, multiple evolution of the process of denitrification. All of this is articulately summarized here.

How did this work advance?

Fast forward to 2001 and publication of the paper by Baudouin-Cornu et al. In this paper, links between environmental imprinting from fluctuating nutrient availability and atomic composition of assimilatory proteins are quantified. Using genome sequences from E. coli and S. cerevisiae, the authors show that carbon and sulfur assimilatory proteins have amino acid sequences that are depleted in carbon and sulfur side chains, respectively. This makes sense. Proteins high in carbon or nitrogen would hardly provide added fitness to an organism that often struggles to find enough of the nutrient to satisfy other fundamental cellular processes. Similar logic also explains why organisms tend to utilize smaller amino acids more frequently than larger ones: it takes more ATP to make a tyrosine than an alanine. Conversely, the pressure to “cost minimize” is less in organisms, like gut dwelling microbes, that have easy access to amino acids. It is not a perfect rule, but most of the time thermodynamic arguments explain a lot about why organisms do what they do. Fast forward again to Craig Venter’s genomic survey of select surface ocean sites (GOS). This (and now other) sequence data sets provided access to genomic information on organisms that inhabit various surface ocean biomes and, crucially, are largely difficult to isolate in pure culture.

What motivated the writing of the paper?

Last summer, I was sitting in my office writing a proposal. I can’t remember the specific topic, but I was thinking about cost-minimization mostly from the perspective of building proteins in cold environments and the challenges organisms face when it is cold: there is little access to organic carbon (food), and other environmental conditions hamper optimal living. I was re-reading Baudouin-Cornu, and there is a specific sentence in the paper in which the authors hypothesize that the phenomenon of cost-minimization might be a broader evolutionary strategy in resource-limited environments. I figured that organisms that did well in the oligotrophic parts of the ocean probably had mechanisms to reduce nitrogen usage and an easy place to start reducing nitrogen is by not making so many proteins or at the very least reducing the usage of arginine, histidine, lysine, asparagine, tryptophan and glutamine – amino acids with at least one added nitrogen on their side chains.
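To make the quantity concrete, here is a minimal Python sketch of counting side-chain nitrogen atoms per residue. It is an illustration only, not the code used for the paper: the function name `side_chain_n_per_residue` and the example peptide are made up for the sketch. It relies on standard amino acid chemistry, where only the six amino acids listed above carry side-chain nitrogen (arginine 3, histidine 2, and one each for lysine, asparagine, glutamine and tryptophan).

```python
# Nitrogen atoms in each amino acid side chain (backbone nitrogen excluded).
# Only six of the twenty standard amino acids carry side-chain nitrogen.
SIDE_CHAIN_N = {"R": 3, "H": 2, "K": 1, "N": 1, "Q": 1, "W": 1}


def side_chain_n_per_residue(protein: str) -> float:
    """Average number of side-chain nitrogen atoms per residue."""
    residues = [aa for aa in protein.upper() if aa.isalpha()]
    if not residues:
        raise ValueError("empty protein sequence")
    total_n = sum(SIDE_CHAIN_N.get(aa, 0) for aa in residues)
    return total_n / len(residues)


# Hypothetical peptide, for illustration only: M, K, R, H, A, G.
print(side_chain_n_per_residue("MKRHAG"))  # (0 + 1 + 3 + 2 + 0 + 0) / 6 = 1.0
```

Averaging this value over every predicted protein from a site would give one number per site, which is the kind of figure that can then be compared across environments.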

This is a good spot to introduce my co-author, Alex Dussaq.

Co-author, Alex Dussaq

Alex completed his honors undergraduate work in mathematics and biochemistry and was working with me on some coding and analysis projects. To follow Matthew’s example, the conversation that started this paper went like this:

Joe: Alex, I have an interesting idea I want to discuss in a proposal… do you think you can download all the GOS data and calculate the nitrogen, C, H and S atoms per residue side chain as in this paper (hand him Baudouin-Cornu) and then correlate those values with chlorophyll (a proxy for phytoplankton and thus primary productivity), NO3 and Fe. This would be just one figure in the proposal.

Alex: OK, sure that should be pretty easy.

Joe: My proposal is due next week so I need the numbers quickly.

Alex: Yeah, yeah.

Alex codes more easily than most people write in their native language. By the way, Alex has moved on to a combined Ph.D./M.D. program at UAB through which he hopes to combine genomics research with new approaches to medicine. I have no doubt he will do unbelievably well in science.

I think that downloading organized data was initially more difficult than it should have been – we spend so much money generating data and so little taking care of it – but we had average values after a few days for several oligotrophic GOS sites and some coastal ocean GOS sites that were convincing enough to put in the proposal. Unfortunately, there are no great metadata – especially physical and chemical characterization of the GOS sites – so we used the “distance to continental land mass” as a proxy for nitrate concentration and oligotrophy (this stung us at first in review). After a week, Alex analyzed all the GOS data and a few important isolated, single organism genomes that factor in the story. After a little less than a month, we had a draft of a two-page brevia that we submitted to Science. It was a simple story that showed data from coastal and open-ocean GOS sites. We found a clear relationship between frequency of nitrogen atoms in side chains of proteins and distance from continental land mass (a proxy for nutrient availability as there are lots of nutrients running off our land). The main conclusion of the paper was that organisms living in oligotrophic oceans tend to have reduced nitrogen content of proteins. Kudos to Alex for some great work.
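One simple way to put a number on a relationship like this is a correlation coefficient between the per-site nitrogen frequency and the per-site distance from land. The sketch below is an assumption for illustration: this post does not say which statistic was used in the submission, and `pearson_r` is a hypothetical helper name. It expects one value per site in each list, in the same site order.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation undefined for constant input")
    return cov / sqrt(var_x * var_y)
```

With distance from land as `xs` and side-chain nitrogen per residue as `ys`, a value near -1 would correspond to nitrogen content falling steadily as sites get farther from the coast.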

What was the larger context for the initial findings?

We tried to write the paper from a broader evolutionary and biogeochemical perspective (and used the aforementioned paper by Paul Falkowski as a model). We talked about the implications of organisms in the ocean that are under selective pressure to cost minimize with respect to nitrogen. I’d be happy to share the original submission with anyone who wants to see the evolution of a paper; just contact me. I’d post it here, but Jonathan might charge me for the bytes given how long this is turning out to be. Great reviews make good stories that are decently executed a lot better.

How did the reviewers react?

When reviews of a paper are longer than the original submission, you have an indication that the paper prompted some thought. We received three comprehensive reviews of a two-page paper that contained one main figure and some supplemental material. Given that I didn’t think we could spend time on the subject, we attempted to be brief, too brief especially when compared to the final open access result in ISME. Next, I’ll review some criticisms of the nitrogen cost-minimization hypothesis (having our paper handy will be helpful):

1. Nitrogen cost minimization by simply looking at the predicted proteomes of organisms or environmental genomes assumes that all proteins are made de novo when salvage pathways and dissolved free amino acids (DFAAs) and higher mol. weight/energy compounds are utilized.

Looking at predicted proteomes is indeed a simplification in much the same way that analyzing codon usage frequencies was a simple way to identify with varying degrees of certainty highly expressed genes. No doubt, organisms have multiple methods to acquire the energy they need – especially when under rate-limiting conditions. For example, the pervasive transfer of proteorhodopsin to many different marine microbes presumably helps overcome some nutrient limitation situations by providing added energy from the sun (in the form of a proton gradient), perhaps to aid in transport. The predicted proteome analysis just says that organisms that live in low N waters have lower frequencies of N in their side chains than organisms in the coastal ocean (or in say a sludge metagenome). It doesn’t discount the importance of gene expression, the fact that cells are not “averages” of the genome, etc. None of that really fits into a two-page paper.

2. In our paper, we used the diazotroph Trichodesmium as a model open-ocean organism that was severely N-cost-minimized and compared this to similar success of the SAR11 organism, Pelagibacter ubique. We were criticized because N-fixation should help an organism overcome any N stress.

This was clarified in our next, longer draft. As was shown in the elegant paper by Baudouin-Cornu, assimilatory proteins reflect the “history” of an organism trying to compete for the very atom or molecule they are trying to assimilate. Thus, Trichodesmium would hardly bother to break the triple bond of dinitrogen costing 16 ATP to make ammonia if they were swimming in a vat of inorganic nitrogen. Or put differently, the nitrogenase operon should be nitrogen-cost-minimized reflecting the assimilatory costs of acquiring N. This is, indeed, the case.

3. Why not calculate the bio-energetic costs associated with changes in N content?

We ended up doing this by proxy in the ISME paper. But it raised a far more interesting point that we pursued in further detail and a chicken/egg argument that was pursued subsequently by another reviewer. If you simply plot N atoms per amino acid side chain versus GC, you get a relationship that looks like this:

This is neither surprising nor novel. But it highlights well the “cost” of having a high GC versus low GC genome in terms of added nitrogen atoms in proteins. These data plotted are all marine microbes but the result is universal.

Furthermore, if you plot GC versus median mass of amino acids in the predicted proteome of organisms you get this:

The relationship between GC and the average mass of amino acids is strong. And, this is one of the places where the story gets interesting. Organisms that have low GC genomes have inherently heavier proteins… i.e., all resources being equal and all metabolic pathways being the same (rare, I know), a low GC organism is going to invest more ATP and NADH to make the same protein as a high GC organism. Let’s ignore why this might not matter if you are Helicobacter pylori and quite comfortable acquiring amino acids from your host but focus on ocean microbes. There is a trade-off for all organisms simply based on the GC content of the genome. If you have a low GC genome, you have (on average) larger proteins and less N in your proteins than a high GC genome. Is this trade-off the reason why many of the most successful organisms in the ocean have low GC content? Probably not, but it has to be considered a contributing factor. Constant low nitrogen has to be a major selective pressure given the recent biogeochemical history of the ocean as pointed out in Falkowski (1997). In the final version of the ISME paper, we model differences in the nitrogen budgets of various “model” organisms based on some trade-offs. It was a decent first step, showing that N-cost minimization actually matters.
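The two quantities being compared here, genome GC content and the mass of the amino acids in a predicted proteome, are both straightforward to compute. The Python sketch below is a rough illustration under stated assumptions, not the analysis from the paper: the function names are hypothetical, the masses are standard average residue masses (amino acid minus one water) rounded to one decimal, and the toy sequences are made up.

```python
from statistics import mean, median

# Average residue masses in daltons (amino acid minus one water), rounded.
RESIDUE_MASS = {
    "G": 57.1, "A": 71.1, "S": 87.1, "P": 97.1, "V": 99.1,
    "T": 101.1, "C": 103.1, "L": 113.2, "I": 113.2, "N": 114.1,
    "D": 115.1, "Q": 128.1, "K": 128.2, "E": 129.1, "M": 131.2,
    "H": 137.1, "F": 147.2, "R": 156.2, "Y": 163.2, "W": 186.2,
}


def gc_fraction(dna: str) -> float:
    """Fraction of unambiguous bases (A, C, G, T) that are G or C."""
    bases = [b for b in dna.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("no A/C/G/T bases in sequence")
    return sum(b in "GC" for b in bases) / len(bases)


def residue_mass_summary(proteins):
    """Return (mean, median) residue mass over all residues in the proteins."""
    masses = [
        RESIDUE_MASS[aa]
        for protein in proteins
        for aa in protein.upper()
        if aa in RESIDUE_MASS  # skip stops and ambiguous residue codes
    ]
    if not masses:
        raise ValueError("no standard residues found")
    return mean(masses), median(masses)


# Toy inputs, for illustration only: ATG AAA GGC encodes M, K, G.
print(gc_fraction("ATGAAAGGC"))       # 4 of the 9 bases are G or C
print(residue_mass_summary(["MKG"]))  # (mean, median) in daltons
```

Computing one GC value per genome and one mass summary per predicted proteome gives the pairs of points that such a plot is built from.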

4. How do you make a quantifiable association between organisms that are so diversely located in space/time and environmental forcing like N availability?

This is a fundamental question in microbial ecology (example, and another). How do we tackle why and when organisms are going to be abundant? Here, I think there are two approaches worth taking. First, what specific genome/metabolic characteristics determine success under specific conditions? For example, what are the characteristics of SAR11 that enable them to “thrive” in oligotrophic waters while their alphaproteobacteria neighbors, the Roseobacter, tend to do better in waters that are more hyper-variable (like the coastal ocean)? Lauro et al. define the characteristics that can be found in genomes of oligotrophic versus copiotrophic organisms. Second, given specific global biogeochemical patterns and environmental forcing constraints, how do we predict organisms will respond? Put in the context of nitrogen cost-minimization, we can ask, “Over geological time will low N waters continue to exert pressure on organisms such that either organisms with N-cost-minimized genomes will thrive or will organisms be forced on a downward GC content trajectory to ease some of this burden?” In our paper, we suggest that the evolutionary history of organisms hints at the impacts nutrient limitations are having on organisms. And this, of course, is by no means new. A beautiful example (albeit not open access).

The divergence of the cyanobacteria Synechococcus and Prochlorococcus during the rise of the diatoms – the most important phytoplankton group in the ocean – suggests the impact of biogeochemical changes on marine microbes. The diversification and proliferation of diatoms in the oceans marginalized cyanobacteria. Diatoms are the workhorses of the ocean biogenic carbon cycle – in comparison to cyanobacteria, they grow quickly and sink faster – thus they sequester fixed CO2, N and Fe that all other surface ocean microbes need. The diatoms changed the ocean, thus putting pressure on cyanobacteria. A result (because many other things also happened) was the genome streamlining and niche adaptation of the lineage. The best example is the high-light adapted MED4 strain of Prochlorococcus. This particular strain has a small genome, low GC and is nitrogen-cost-minimized, as detailed in our paper. Diatoms marginalized cyanobacteria forcing them into specific niches (e.g., high-light, low Fe, low N, low P) where they are successful and well adapted (like these clades that live in iron poor water).

Where are we heading?

What are the implications of cost-minimization in the genomes of ocean microbes? Could it alter the overall nutrient pools in the surface ocean (and thus affect the potential CO2 draw down by phytoplankton)? These are questions we are now pursuing using modeling approaches in an attempt to bolster our understanding of biogeochemistry through genomics and microbial ecology. We are teaming up with Jay Cullen, a chemical oceanography professor, good friend and super smart guy to figure out if cost-minimization and other metabolic changes in microbes might be having more of an effect on biogeochemical cycles than we think. Stay tuned.

Coming Monday at #UCDavis "The Infant Gut Microbiome: Prebiotics, Probiotics, & Establishment"

Just a little announcement here.  There is a symposium tomorrow at UC Davis organized by undergraduates in the CLIMB program.  CLIMB stands for “Collaborative Learning at the Interface of Mathematics and Biology” and is a program that emphasizes hands-on training using mathematics and computation to answer state-of-the-art questions in biology.  A select group of undergraduates participate in the program and this summer the students had to do some sort of modelling project.  Somehow I managed to convince them to do work on human gut microbes.  And they have done a remarkable job.

As part of their summer work, they organized a symposium on the topic and their symposium takes place tomorrow.  Details are below.

The Infant Gut Microbiome: Prebiotics, Probiotics, & Establishment

Monday, 12 September 2011, 9am-4pm

Life Sciences 1022

UC Davis

9:00-9:10 Introduction

9:10-9:40 Jonathan Eisen, UC Davis

“DNA and the hidden world of microbes”

9:40-10:40 Mark Underwood, UC Davis

“Dysbiosis and necrotizing enterocolitis”

10:40-10:50 break

10:50-11:50 Ruth Ley, Cornell University

“Host-microbial interactions and metabolic syndrome”

11:50-12:00 general discussion

12:00-1:00 lunch

1:00-2:00 CLIMB 2010 cohort

“Breast milk metabolism and bacterial coexistence in the infant microbiome”

2:00-2:10 break

2:10-3:10 David Relman, Stanford University

“Early days: assembly of the human gut microbiome during childhood”

3:10-3:40 Bruce German, UC Davis

3:40-4:00 next steps

The only major issue for me is I am losing my voice.  So we will see how this goes.  Though I note I have gotten some very sage advice on how to treat my voice problem via the magic of twitter.  If I do not collapse I will also be tweeting/posting about the other talks during the day.