Metagenomics notes


Summary and introduction

Organized in part around the metagenomic diversity workflow. See here.

A key point is to compare and contrast tools for analyzing rRNA with those for analyzing metagenomic data.

Generation of data


  • Background
    • PCR amplification with conserved primers
    • Sequencing
      • Standard
      • 454

See [1][2]

See “Review and re-analysis of domain-specific 16S primers.”

    • Chimera checking
    • Assembly of sequences from paired ends
  • rRNA databases

PCR of other genes

  • Done sometimes to cover broad diversity
    • RecA/RadA best example
  • Frequently done to focus on narrower groups or functional classes
    • Anammox
    • Nitrogen fixation

Metagenomic Sequencing

  • Background
  • Identifying rRNA genes
  • Identifying other genes
    • Which genes should be used?
  • Assembly
  • Metagenomic databases
    • CAMERA
    • RAST
    • IMG/M

General analysis of gene families (rRNA and protein)

rRNA vs. protein

Sequence quality issues


Dividing into OTUs

Phylogenetic trees

Web servers that do multiple things

Species Diversity Measurements Focused on Single Communities

Richness estimators
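
As one concrete instance of a richness estimator, here is a minimal sketch of the bias-corrected Chao1 estimate, which extrapolates from the number of OTUs seen exactly once (singletons) and exactly twice (doubletons). The choice of Chao1, the function name, and the example counts are assumptions made for illustration; these notes do not single out a particular estimator.

```python
from collections import Counter

def chao1(otu_counts):
    """Bias-corrected Chao1 richness estimate.

    otu_counts: iterable of per-OTU read counts for one sample
    (hypothetical input format chosen for this sketch).
    """
    observed = [n for n in otu_counts if n > 0]
    s_obs = len(observed)              # number of OTUs actually seen
    freq = Counter(observed)
    f1 = freq[1]                       # singletons
    f2 = freq[2]                       # doubletons
    # Bias-corrected form; stays defined when there are no doubletons.
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))

# Made-up counts: 7 observed OTUs, 3 singletons, 2 doubletons.
example_counts = [10, 5, 1, 1, 1, 2, 2]
print(chao1(example_counts))
```

Many singletons relative to doubletons push the estimate well above the observed count, which is why sequencing error (see the sequence quality section) can inflate it.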

Alpha diversity

Issues with different sequencing methods

  • 454 Sequencing
    • See recent paper by R. Knight on methods

Comparing between samples

General approaches and Introduction

  • Multiple ways to compare communities and to think about the comparisons
  • Good discussion of this on p. 559 of the Lozupone and Knight review
  • I (Eisen) view things a little differently
  • Would be useful to at least briefly link the microbial diversity/phylodiversity literature to the non-microbial ecology literature on this subject
    • Note parallel development of methods and tools to answer many of the same questions with different datasets. Emphasize need to unify.
  • First key issue is whether to analyze communities one by one with some metric and then compare the scores, or to develop scores explicitly from comparing communities
  • i.e. alpha vs. beta diversity
    • Within community metric could be
      • Richness
      • Diversity
  • Lozupone and Knight outline a few other issues including
    • Qualitative vs. quantitative
      • Qualitative measures do not take species counts into account, only presence/absence
    • Species based or tree based
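
The distinctions above (within-community vs. between-community scores, qualitative vs. quantitative) can be sketched with a few standard species-based measures. This is only an illustration under stated assumptions: the sample names and counts are made up, and Shannon, Jaccard, and Bray-Curtis are stand-ins chosen here, not measures prescribed by these notes.

```python
import math

def richness(counts):
    """Alpha: number of OTUs present in one community."""
    return sum(1 for n in counts.values() if n > 0)

def shannon(counts):
    """Alpha: Shannon diversity (natural log) of one community."""
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total)
                for n in counts.values() if n > 0)

def jaccard_distance(a, b):
    """Beta, qualitative: uses presence/absence only."""
    present_a = {s for s, n in a.items() if n > 0}
    present_b = {s for s, n in b.items() if n > 0}
    union = present_a | present_b
    if not union:
        return 0.0
    return 1 - len(present_a & present_b) / len(union)

def bray_curtis(a, b):
    """Beta, quantitative: uses the counts themselves."""
    species = set(a) | set(b)
    diff = sum(abs(a.get(s, 0) - b.get(s, 0)) for s in species)
    total = sum(a.get(s, 0) + b.get(s, 0) for s in species)
    return diff / total if total else 0.0

# Hypothetical communities: same OTUs, very different abundances.
sample_a = {"otu1": 90, "otu2": 9, "otu3": 1}
sample_b = {"otu1": 1, "otu2": 9, "otu3": 90}

print(richness(sample_a), richness(sample_b))
print(shannon(sample_a), shannon(sample_b))
print(jaccard_distance(sample_a, sample_b))
print(bray_curtis(sample_a, sample_b))
```

In this made-up case the two communities get identical alpha scores and a qualitative distance of zero, while the quantitative measure separates them. The choice of approach can therefore change the conclusion.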

OTU and phylotype focused

  • Presence/absence of OTUs
  • Presence/absence of phylotypes (at all levels)
  • SONS

Coverage and Statistical Significance

Phylogenetic structure of communities

Metagenomic distances

Statistical issues

  • Sampling
  • Statistical concepts (e.g. the experimental unit: species, community, environment, day, etc.)
  • Assessing reliability and repeatability of results

Case Studies and Science

Reviews on methods

Metagenomics informatics

General diversity things



Data Types

    • rRNA sequence
      • Pros
      • Cons
    • rRNA T-RFLP
      • Pros
      • Cons
    • metagenomics
      • Pros
      • Cons

Diversity Methods

  • alpha diversity
    • Phylogenetic diversity (PD)
      • Pros
      • Cons
      • Applications
      • References
  • beta diversity
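
For the phylogenetic diversity (PD) entry above, a minimal sketch of the usual idea: sum the branch lengths of the part of the tree that connects the taxa found in a sample. The tree encoding (a dict mapping each node to its parent and branch length), the choice to include the path up to the root, and the toy tree are all assumptions made for this example.

```python
def faith_pd(tree, present_taxa):
    """Sum of branch lengths from each present taxon up to the root,
    counting each branch once.

    tree: dict mapping node -> (parent, branch_length); the root has no entry.
    present_taxa: iterable of tip names observed in the sample.
    """
    visited = set()
    total = 0.0
    for taxon in present_taxa:
        node = taxon
        # Walk toward the root, stopping at branches already counted.
        while node in tree and node not in visited:
            visited.add(node)
            parent, length = tree[node]
            total += length
            node = parent
    return total

# Hypothetical tree: ((A:1.0, B:2.0)n1:0.5, C:3.0)root
toy_tree = {
    "A": ("n1", 1.0),
    "B": ("n1", 2.0),
    "n1": ("root", 0.5),
    "C": ("root", 3.0),
}

print(faith_pd(toy_tree, ["A", "B"]))  # two close relatives
print(faith_pd(toy_tree, ["A", "C"]))  # two distant relatives
```

Both toy samples have a richness of two, but the sample spanning more of the tree scores higher, which is the contrast between tree-based and species-based measures.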
