High throughput DNA sequencing technologies have spurred the development of numerous novel methods for genome assembly. With few exceptions, these algorithms are heuristic and require one or more parameters to be manually set by the user. One approach to parameter tuning involves assembling data from an organism with an available high quality reference genome, and measuring assembly accuracy using some metrics. We developed a system to measure assembly quality under several scoring metrics, and to compare assembly quality across a variety of assemblers, sequence data types, and parameter choices. When used in conjunction with training data such as a high quality reference genome and sequence reads from the same organism, our program can be used to manually identify an optimal sequencing and assembly strategy for de novo sequencing of related organisms.
Just a quick post here. A new paper from my lab has come out in Bioinformatics. The paper is relatively simple. Titled “Mauve Assembly Metrics” it reports work of Aaron Darling and Andrew Tritt (with some minor contributions from me and Marc Facciotti). Aaron wrote the program Mauve when he was a student in Nicole Perna’s lab at Wisconsin: Mauve: multiple alignment of conserved genomic sequence with rearrangements. Over the years he (and others) have continued to develop the program and written a few papers too including for example, the development of progressiveMauve: multiple genome alignment with gene gain, loss and rearrangement. This new paper reports basically a system/scripts to measure assembly quality. Here is the abstract:
Check out the paper: Mauve Assembly Metrics. Download the scripts/code http://ngopt.googlecode.com and Mauve and play around and let me know what you think.
Note this paper was supported by a grant from the National Science Foundation (ER 0949453). That grant is focused on comparative genomics (sequencing and analysis) of halophlic archaea. Stay tuned for more on that project as we are writing up a series of papers ….
Some related links: