doi: 10.1186/gb-2010-11-6-r62. Epub 2010 Jun 17.

Min Wang, Daniel L Burgess, Christie Kovar, Matthew J Rodesch, Mark D'Ascenzo, Jacob Kitzman, Yuan-Qing Wu, Irene Newsham, Todd A Richmond, Jeffrey A Jeddeloh, Donna Muzny, Thomas J Albert, Richard A Gibbs


Whole exome capture in solution with 3 Gbp of data

Matthew N Bainbridge et al. Genome Biol. 2010.


We have developed a solution-based method for targeted DNA capture-sequencing that is directed to the complete human exome. This approach allows the discovery of greater than 95% of all expected heterozygous single base variants, requires as little as 3 Gbp of raw sequence data and constitutes an effective tool for identifying rare coding alleles in large-scale genomic studies.

Normalized coverage of replicate SOLiD libraries 2 to 4 versus normalized coverage of replicate library 1. Average coverage for each target region in library 1 is plotted against library 2 (blue), library 3 (green) and library 4 (red). Coverage for each target is represented as a proportion of the total sequence generated. The line X = Y indicates approximately equal levels of coverage in both libraries.
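The normalization described in this caption can be illustrated with a minimal sketch. It assumes per-target coverage is available as a count of aligned bases per target; the function name, the target names and the counts are hypothetical and are not taken from the paper. Dividing each target by the library total puts libraries of different sizes on one scale, so targets captured with equal efficiency fall on the line X = Y.

```python
def normalize_coverage(target_bases):
    """Return each target's share of the total bases in one library.

    target_bases maps a target name to the number of bases aligned to it
    (an assumed input format, chosen for illustration).
    """
    total = sum(target_bases.values())
    if total == 0:
        raise ValueError("no sequence aligned to any target")
    return {target: bases / total for target, bases in target_bases.items()}


# Hypothetical counts for two libraries of different total size.
library_1 = {"exon_A": 6000, "exon_B": 3000, "exon_C": 1000}
library_2 = {"exon_A": 1800, "exon_B": 900, "exon_C": 300}

norm_1 = normalize_coverage(library_1)
norm_2 = normalize_coverage(library_2)

# Each pair (norm_1[t], norm_2[t]) is one point in the scatter plot.
points = [(norm_1[t], norm_2[t]) for t in sorted(norm_1)]
```

In this made-up example the two libraries differ in size by more than threefold, yet every target receives the same proportion in both, so all points would sit on X = Y.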

Average coverage for each target region as sequenced by Illumina 75-bp frag reads plotted against SOLiD library 1. Coverage for each target is normalized for the total sequence aligned to target. The line X = Y indicates approximately equal levels of coverage in both libraries.

Coverage distribution across target regions of SOLiD libraries 1 (10 Gbp) and 2 (3 Gbp) and Illumina PE and frag libraries. The number of bases at each level of coverage for each library type is shown for approximately 10 Gbp of SOLiD data (green), approximately 3 Gbp of SOLiD data (red), approximately 3 Gbp of Illumina PE data (yellow) and approximately 3 Gbp of Illumina frag data (blue) after duplicate removal.
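A coverage distribution of this kind, the number of target bases at each level of coverage, can be tallied as in the sketch below. It assumes a list of per-base depths over the target regions has already been computed after duplicate removal; the function name and the depth values are hypothetical and serve only as illustration.

```python
from collections import Counter


def coverage_distribution(per_base_depth):
    """Count how many target bases are covered at each depth."""
    return Counter(per_base_depth)


# Hypothetical depths for seven target bases.
depths = [0, 12, 12, 30, 30, 30, 45]
dist = coverage_distribution(depths)

# Sorted (depth, number_of_bases) pairs, ready to plot as a curve.
curve = sorted(dist.items())
```

Plotting one such curve per library would give a figure of the form described in the caption.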

