Now that we have chosen our candidates, we are preparing libraries for sequencing. I’ve learned a lot about this process in the past few weeks, so I thought I’d share some of it.
First, what is a “genomic library” anyway?
“Genomic library” is the term used to describe the prepared genomic DNA that is sent to the Illumina sequencer for sequencing. Library preparation is a critical step because the quality of a library preparation often determines the quality of the sequencing and the ease of assembly.[i]
How does one prepare a genomic library?
Although there are many library preparation methods to choose from, all of them share the same two basic goals:
- To cut the DNA into small pieces. The size of the pieces depends on the type of sequencing you are trying to do and the purpose of the sequencing. In our case, we want pieces that average 500 base pairs and are no longer than 800 base pairs. [ii]
- To add adapters to each piece.
The differences between library preparation methods are largely differences in how these two goals are accomplished. For example, the DNA can be chopped enzymatically or mechanically, and the adapters can be added in a single enzymatic step or in several.
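To make the size target concrete, here is a minimal Python sketch that checks a list of fragment lengths against the goal described above (an average of about 500 base pairs, nothing over 800). The function name, the example lengths, and the 50 bp tolerance around the average are my own assumptions for illustration; they are not part of any kit or protocol.

```python
from statistics import mean


def summarize_fragments(lengths, target_mean=500, max_length=800, tolerance=50):
    """Compare fragment lengths (in base pairs) with a size target.

    The tolerance around the target mean is an illustrative assumption.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    average = mean(lengths)
    # Fragments longer than the allowed maximum.
    oversized = [n for n in lengths if n > max_length]
    return {
        "mean": average,
        "mean_on_target": abs(average - target_mean) <= tolerance,
        "oversized_count": len(oversized),
    }


# Hypothetical lengths, e.g. estimated from a gel or a fragment analyzer trace.
example_lengths = [350, 400, 450, 500, 550, 820]
summary = summarize_fragments(example_lengths)
print(summary)
```

In this made-up example the average is close to the target, but one fragment exceeds the 800 bp ceiling, which is the kind of thing a size-selection step would be expected to remove.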
Pros and Cons of Library Preparation Methods:
Each step of each preparation method has its own advantages and disadvantages. The primary factors of concern in library preparation are:
- Amount of genomic DNA required – in general, the more steps involved in a preparation technique, the more genomic DNA will be required because some DNA will be lost at each step.
- Cutting bias – certain cutting techniques may be biased depending on the DNA sequence. This is generally more of a concern in enzymatic cutting than in mechanical cutting.
- G-C content – Amplification steps (i.e. PCR in a thermocycler) tend to change the average G-C content of the DNA sample by preferentially amplifying sequences based on the amount of guanine and cytosine in them. In general, using fewer amplification steps will decrease this bias. [iii]
- Price – the preparation methods vary widely in price, and this can be a limiting factor.
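The G-C content mentioned in the list above is simple to compute, which makes it an easy thing to compare before and after amplification. The sketch below is only an illustration: the function names and the example sequences are hypothetical, and bases other than A, C, G, and T (such as N) are ignored by assumption.

```python
def gc_content(sequence):
    """Return the fraction of G and C among the A/C/G/T bases of a sequence."""
    # Ignore ambiguous bases such as N (an assumption made for this sketch).
    counted = [base for base in sequence.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no A, C, G, or T bases")
    gc = sum(1 for base in counted if base in "GC")
    return gc / len(counted)


def mean_gc_content(sequences):
    """Return the unweighted average G-C fraction over several sequences."""
    if not sequences:
        raise ValueError("sequences must not be empty")
    return sum(gc_content(seq) for seq in sequences) / len(sequences)


# Hypothetical sequences, invented for illustration only.
before_pcr = ["ATATGCGC", "GGCCGGCC", "ATATATAT"]
after_pcr = ["ATATGCGC", "ATATGCGC", "ATATATAT"]
print(mean_gc_content(before_pcr), mean_gc_content(after_pcr))
```

A shift in the average between the two sets would be one rough sign that amplification favored some sequences over others.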
For our libraries, we will be using sonication (sound) to chop up the genomic DNA, followed by a series of enzyme treatments from the Illumina library preparation kit that will first prepare the DNA pieces for annealing the adapters and then carry out the annealing process itself.
The adapters we are using will each contain a “barcode,” a short sequence of bases unique to each sample. Barcoding allows us to pool our samples and run them on a single Illumina well, bringing down the cost of sequencing significantly.
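To show why barcodes make pooling possible, here is a small Python sketch that sorts pooled reads back into samples. It assumes, purely for simplicity, that the barcode is the first six bases of each read and must match exactly; the barcode sequences, sample names, and reads are all made up. In practice this sorting is usually handled by the sequencing software, so treat this as a toy model rather than our actual pipeline.

```python
from collections import defaultdict


def demultiplex(reads, barcode_to_sample):
    """Group reads by their leading barcode and strip the barcode off.

    Assumes the barcode sits at the start of each read and matches exactly.
    """
    lengths = {len(barcode) for barcode in barcode_to_sample}
    if len(lengths) != 1:
        raise ValueError("all barcodes must have the same length")
    barcode_length = lengths.pop()

    bins = defaultdict(list)
    for read in reads:
        # Reads whose prefix matches no known barcode are kept separately.
        sample = barcode_to_sample.get(read[:barcode_length], "unassigned")
        bins[sample].append(read[barcode_length:])
    return dict(bins)


# Hypothetical barcodes and reads, invented for illustration only.
barcodes = {"ACGTAC": "sample_A", "TGCATG": "sample_B"}
reads = [
    "ACGTACGGGTTTAAA",
    "TGCATGCCCAAATTT",
    "ACGTACTTTGGGCCC",
    "NNNNNNAAAAAAAAA",
]
pooled = demultiplex(reads, barcodes)
print(pooled)
```

Because every read carries its sample's barcode, many samples can share one run and still be separated afterwards.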
Once we have the sequences back, we will begin the computationally challenging process of assembling and annotating them.
[i] Monya Baker, “De novo genome assembly: what every biologist should know,” Nature Methods 9.4 (2012): 333-337. http://www.nature.com/nmeth/journal/v9/n4/full/nmeth.1935.html?WT.ec_id=NMETH-201204