Nice new memory-efficient metagenome assembly method from C. Titus Brown

Interesting new #OpenAccess PNAS paper from C. Titus Brown: Scaling metagenome sequence assembly with probabilistic de Bruijn graphs. Of course, if you follow Titus on Twitter or read his blog, you already know about this: not only has he posted about it, he also put a preprint of the paper on arXiv in December.

Check out the press release from Michigan State. There are some good lines in there, like “Analyzing DNA data using traditional computing methods is like trying to eat a large pizza in a single bite.”

A key point in the paper: “The graph representation is based on a probabilistic data structure, a Bloom filter, that allows us to efficiently store assembly graphs in as little as 4 bits per k-mer, albeit inexactly. We show that this data structure accurately represents DNA assembly graphs in low memory.” This is important because most assemblers for genome data currently use a ton of memory.
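For those who have not run into Bloom filters before, here is a rough sketch in Python of the general idea of storing k-mers in one and then walking the graph by asking which of the four possible next k-mers are present. To be clear, this is my own toy illustration and not the code from the paper: the class and function names, the choice of BLAKE2 hashing, the two hash functions, and the example sequence are all assumptions made for the sake of the example, and it ignores reverse complements entirely.

```python
import hashlib


class KmerBloomFilter:
    """Toy Bloom filter for k-mers (illustrative only, not the paper's code)."""

    def __init__(self, num_bits, num_hashes=2):
        self.num_bits = num_bits
        self.num_hashes = num_hashes  # assumed value, chosen for the example
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, kmer):
        # Derive one bit position per hash function by salting BLAKE2b.
        for seed in range(self.num_hashes):
            digest = hashlib.blake2b(
                kmer.encode("ascii"),
                digest_size=8,
                salt=seed.to_bytes(8, "little"),
            ).digest()
            yield int.from_bytes(digest, "little") % self.num_bits

    def add(self, kmer):
        for pos in self._positions(kmer):
            self.bits[pos >> 3] |= 1 << (pos & 7)

    def __contains__(self, kmer):
        # May return True for a k-mer never added (false positive),
        # but never returns False for one that was added.
        return all(self.bits[pos >> 3] & (1 << (pos & 7))
                   for pos in self._positions(kmer))


def kmers(seq, k):
    """Yield every overlapping k-mer in seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def successors(bf, kmer):
    """Possible next k-mers: shift by one base and query the filter."""
    return [kmer[1:] + base for base in "ACGT" if kmer[1:] + base in bf]


# Hypothetical example sequence, sized at 4 bits per distinct k-mer.
seq = "ACGTACGGTCAGT"
k = 4
distinct = set(kmers(seq, k))
bf = KmerBloomFilter(num_bits=4 * len(distinct))
for km in distinct:
    bf.add(km)

print(successors(bf, "ACGT"))
```

The graph is never stored explicitly here. The only thing in memory is the bit array, and the edges are recovered by querying it. The price is the "albeit inexactly" part of the quote: at so few bits per k-mer, some of the successors reported will be k-mers that were never in the data.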

Anyway, the software behind the paper is available on GitHub here. Assemble away.

Author: Jonathan Eisen

I am an evolutionary biologist and a Professor at U. C. Davis (see my lab site here). My research focuses on the origin of novelty (how new processes and functions originate). To study this, I focus on sequencing and analyzing genomes of organisms, especially microbes, and on using phylogenomic analysis.