After nearly ten weeks of learning our way around the lab, collecting samples, isolating organisms, and sequencing their 16S ribosomal RNA genes, we are finally ready to choose our first candidate organisms for whole-genome sequencing!
The Plan:
- Choose candidate organisms
- Prepare a DNA library for each organism for Illumina sequencing
- Sequence and analyze genomes
Although we have a couple of pilot samples for which we are already preparing libraries, most of our organisms need to be screened for admittance into the elite group of “good candidate organisms.”
So who are these potential candidates?
Where were they found?
And perhaps most relevant to our project, have they been sequenced before?
The Contestants
AY4-Y: Brachybacterium (Isolation site: towel)
Available genomes: One completed genome, three incomplete (various institutes/projects). Actinobacteria. Nearest sequenced relative is Brachybacterium faecium DSM 4810: completed genome, GC content 72%, total size 3.6 Mb.
Interesting Background Information: B. faecium has been found in built environments including urban aerosols, floor dust, and poultry litter.
AKU: Curtobacterium flaccumfaciens (Isolation site: carpet)
Available genomes: No completed or in progress genomes; one other member of the genus is targeted by the DOE (from DSMZ).
Interesting Background Information: Known plant pathogen.
MAU: Bacillus pumilus (Isolation site: library table)
Available genomes: Five incomplete genomes, one finished and published.
Interesting Background Information: Soil bacterium, some strains salt tolerant, but also found in metagenomic analysis of salmon guts.
MFU: Kocuria rosea (Isolation site: library table)
Available genomes: None for K. rosea itself. Other members of Kocuria are being sequenced: one is completed and published (Kocuria rhizophila), one is in a permanent draft stage (Kocuria rhizophila, P7-4), and three are targeted (K. marinus, K. palustris, and K. rhizophila).
Interesting Background Information: Soil bacterium, some samples isolated from chemically polluted soil and extreme environments such as spacecraft. Also found in indoor environments and deep-sea sediments.
MBU: Bacillus simplex (Isolation site: library book)
Available genomes: Two incomplete genomes.
Interesting Background Information: Found in close association with corn and other plants, in marine sediment, and in “cloud water” collected from the Puy de Dôme summit in France.
MCU: Micrococcus yunnanensis (BLAST also showed Micrococcus luteus as an equally possible result) (Isolation site: library book)
Available Genomes: None, but four other Micrococcus entries are in the database. One is targeted, one is incomplete (Micrococcus luteus), one is in a permanent draft state (Micrococcus luteus), and one is completed and published (Micrococcus luteus).
Interesting Background Information: Has been isolated from coral reef bacterial communities and endophytic bacterial communities, and was used in a study on the application of bacteria in cement material.
MDU: Rhodococcus erythropolis (Isolation site: martial arts mats)
Available Genomes: Two incomplete, one complete and published, one permanent draft.
Interesting Background Information: Found in soil, in bioremediation filters, and in a potassium mine. R. erythropolis was also found in a study of the effects of geochemistry on cave microbial communities.
MEU: Staphylococcus cohnii (Isolation site: martial arts mats)
Available Genomes: None, but many, many other Staphylococcus species are incomplete, targeted or published.
Interesting Background Information: Found in alkaline groundwater, marine samples, animal isolates and dairy products.
Pilot Samples in Library Prep
THU: Leucobacter (Isolation site: toilet):
Available Genomes: One completed genome, no in-progress or targeted projects. Nearest sequenced relative is Leucobacter chromiiresistens. Draft genome, 29 scaffolds, GC content 64%, total size 3.3 Mb.
Interesting Background Information: Leucobacter species have been isolated from numerous outdoor and built environments (e.g. wastewater and barns). Actinobacteria. L. chromiiresistens is highly resistant to chromium.
TEU: Morganella morganii (Isolation site: toilet)
Available Genomes: No completed or in-progress genomes. Three targeted projects as part of the Human Microbiome Project.
Interesting Background Information: Gammaproteobacteria. BL2 pathogen (nosocomial in immune-compromised patients). Found both in the environment and in association with humans.
http://en.wikipedia.org/wiki/Morganella_morganii
Perhaps the most interesting thing about all these organisms is their diversity. We collect samples from the environments we live in and interact with every day, and it’s truly amazing what a mixture of organisms is literally right under our fingertips, just waiting to be explored. We will take our next steps towards this exploration soon: we are hoping to choose and begin preparing libraries for our first set of candidate organisms within the next two weeks.

Thanks for the post.
A quick question (I have many, some of which, including this one, are partly to get everyone to write this down here …). How did you assign the organisms a name? And how confident are you in these names?
We used a BLAST search of the 16S rRNA gene sequence. For most of the organisms, we received a pretty definitive answer. That is to say, all of the hits that had a name showed up as the same species. For one organism, however (Micrococcus yunnanensis, as mentioned above), there was another result: M. luteus. Lacking any more rigorous evidence, I eventually chose to label it as yunnanensis after comparing the descriptions of both organisms to our own cultures. However, environment and morphology are very misleading methods for the characterization of bacteria, so I left the M. luteus option open.
The short answer is we won’t truly be able to say their species with absolute confidence until we can map them onto a phylogenetic tree using more than just their 16S sequences (and maybe not even then, depending on the case). Many of the species we found did not have any genome results in GOLD or even in a general Google Scholar search for published genomes. In these cases, we searched the database for the genus name instead and used those results to get an idea of the genome status of the organism’s close relatives. In general, I’m very confident in the genus names and pretty confident in the species names. Taken together, I believe they are good estimates that we can use to help determine candidates.
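The "all named hits agree" check described above could be sketched roughly as follows. This is only an illustration, not the procedure actually used: the function name `summarize_hits` and the hit lists are made up, and it assumes each hit description starts with a genus and species name.

```python
from collections import Counter

def summarize_hits(hit_names):
    """Summarize the taxon names of the named BLAST hits for one 16S sequence."""
    # Assumption: each description starts with "Genus species"; keep those two words.
    binomials = [" ".join(name.split()[:2]) for name in hit_names if name.strip()]
    species_counts = Counter(binomials)
    genus_counts = Counter(b.split()[0] for b in binomials)
    return {
        "genera": sorted(genus_counts),
        "species": sorted(species_counts),
        "genus_unambiguous": len(genus_counts) == 1,
        "species_unambiguous": len(species_counts) == 1,
    }

# Hypothetical hit lists, invented for illustration only.
clear_case = ["Kocuria rosea strain X", "Kocuria rosea", "Kocuria rosea strain Y"]
split_case = ["Micrococcus yunnanensis", "Micrococcus luteus strain Z", "Micrococcus luteus"]

print(summarize_hits(clear_case))
print(summarize_hits(split_case))
```

A split result like the second case is a cue to keep both names open rather than a way to choose between them.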
Hannah,
Go ahead and add the two organisms that we’ve already started library preps for to this list. It’d be nice to have everything in one place, and it’s certainly not too late to change our minds about those two either.
I think it would be good, here and elsewhere, to give all organisms an Eisen Lab ID. Something like JEXXX. Or microBEnetXXX. Then we will have an ID independent of whatever taxonomic name we might want to give them. Then we can discuss:
JE1 seems to be Yuckococcus yuckococci
JE2 seems to be …
In addition, as someone obsessed with phylogeny, I would very much prefer to see phylogenetic trees of the rRNA sequences, not BLAST hits … it is easy to use NCBI to get a tree. Just do a BLAST search and then, near the alignment part of the results, select “Distance tree of blast results”.
I’d like to post the trees in the post above with their corresponding samples. Do you have a suggested/preferred method for getting the tree from its BLAST form to a blog post?
not easy to do … but the best is
1) do the blast to tree
2) when the tree window pops up, export it to another format (I think a few options are given)
3) this will give you what is known as a tree file with a format like
((a,b),(c,(d,e))); etc
4) then load this into a better tree drawing program than the one at NCBI –
(Ask Jenna which tool is best – some you have to download – some are available online)
The alternative to what is above is to make a screen capture of the tree shown on the BLAST -> tree page. This can be fine. On a Mac you can use the Grab application. I don’t know how this works on non-Macs.
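For steps 3 and 4, the exported tree file is plain text in the parenthesized (Newick) style, so it can be sanity-checked before loading it into a drawing program. The sketch below is a minimal illustration with hypothetical helper names (`is_balanced`, `leaf_names`); it assumes simple unquoted labels, and a dedicated tree library would be the better choice for real work.

```python
import re

def is_balanced(newick):
    """Check that every '(' in the tree string has a matching ')'."""
    depth = 0
    for ch in newick:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # a ')' appeared before its '('
                return False
    return depth == 0

def leaf_names(newick):
    """Return leaf labels from a simple Newick string (no quoted labels)."""
    # A leaf label directly follows '(' or ','; labels after ')' name internal nodes.
    return [m.group(2) for m in re.finditer(r"([(,])\s*([^(),:;\s]+)", newick)]

# Made-up example tree with branch lengths, for illustration only.
tree_text = "((a:0.1,b:0.2):0.05,(c:0.3,(d:0.1,e:0.1):0.2):0.1);"

print(is_balanced(tree_text))
print(leaf_names(tree_text))
```

Checking the leaf list against the expected sample and reference names is a quick way to confirm the export contains what you meant to plot.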
Each organism does have a unique ID of numbers and letters that also says something about the origin. We can certainly add that code to these posts.
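One way to keep the ID as the stable key and treat the taxonomic name as a revisable label is a small lookup table. The sketch below uses the IDs, provisional names, and isolation sites from the post; the `Isolate` class and `describe` helper are hypothetical names for illustration.

```python
from dataclasses import dataclass

@dataclass
class Isolate:
    provisional_name: str  # best current guess from 16S BLAST; may change later
    site: str              # where the sample was collected

# Keyed by lab ID so that renaming an organism never changes its identifier.
ISOLATES = {
    "AY4-Y": Isolate("Brachybacterium", "towel"),
    "AKU": Isolate("Curtobacterium flaccumfaciens", "carpet"),
    "MAU": Isolate("Bacillus pumilus", "library table"),
    "MFU": Isolate("Kocuria rosea", "library table"),
    "MBU": Isolate("Bacillus simplex", "library book"),
    "MCU": Isolate("Micrococcus yunnanensis or M. luteus", "library book"),
    "MDU": Isolate("Rhodococcus erythropolis", "martial arts mats"),
    "MEU": Isolate("Staphylococcus cohnii", "martial arts mats"),
    "THU": Isolate("Leucobacter", "toilet"),
    "TEU": Isolate("Morganella morganii", "toilet"),
}

def describe(lab_id):
    """Format one isolate as 'ID seems to be Name (isolation site: ...)'."""
    iso = ISOLATES[lab_id]
    return f"{lab_id} seems to be {iso.provisional_name} (isolation site: {iso.site})"

for lab_id in ISOLATES:
    print(describe(lab_id))
```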