A set of BAC clones spanning the human genome

Nucleic Acids Res. 2004 Jul 9;32(12):3651-60.

doi: 10.1093/nar/gkh700. Print 2004.

Ian Bosdet, Duane Smailus, Readman Chiu, Carrie Mathewson, Natasja Wye, Sarah Barber, Mabel Brown-John, Susanna Chan, Steve Chand, Alison Cloutier, Noreen Girn, Darlene Lee, Amara Masson, Michael Mayo, Teika Olson, Pawan Pandoh, Anna-Liisa Prabhu, Eric Schoenmakers, Miranda Tsai, Donna Albertson, Wan Lam, Chik-On Choy, Kazutoyo Osoegawa, Shaying Zhao, Pieter J de Jong, Jacqueline Schein, Steven Jones, Marco A Marra

Martin Krzywinski et al. Nucleic Acids Res. 2004.

Abstract

Using the human bacterial artificial chromosome (BAC) fingerprint-based physical map, genome sequence assembly and BAC end sequences, we have generated a fingerprint-validated set of 32 855 BAC clones spanning the human genome. The clone set provides coverage for at least 98% of the human fingerprint map, 99% of the current assembled sequence and has an effective resolving power of 79 kb. We have made the clone set publicly available, anticipating that it will generally facilitate FISH or array-CGH-based identification and characterization of chromosomal alterations relevant to disease.

Figures

Figure 1

Comparisons of the left end, middle and right end genome sequence coordinates derived from BAC end sequence data to the coordinates derived from the fingerprint-based in silico mapping procedure described in the text. Shown are distributions depicting the frequency of occurrence of differences in the coordinates. In general, a narrower distribution corresponds to better correlation between coordinates derived from the BES data and the in silico approach. Shaded regions indicate the proportion of the comparisons falling into indicated intervals. The X-axis scale is in kilobases; the Y-axis depicts number of comparisons. The line indicates the cumulative distribution of the differences in localization.

Figure 2

Coverage of the sequence assembly provided by the clones in the set. For each chromosome, the coverage of adjacent 700 kb regions is plotted as a colour map. Regions in the assembly without sequence information appear as black areas. Bright blue regions correspond to 100% coverage. Distance scale is in megabases.
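
The windowed coverage described in this caption can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the half-open base-pair coordinates and the example clone positions are all assumptions made for the sketch.

```python
def merge_intervals(intervals):
    """Merge overlapping or touching (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(s, e) for s, e in merged]


def window_coverage(clones, chrom_length, window=700_000):
    """Fraction of each consecutive window covered by at least one clone."""
    merged = merge_intervals(clones)
    fractions = []
    for w_start in range(0, chrom_length, window):
        w_end = min(w_start + window, chrom_length)
        # Sum the overlap of every merged interval with this window.
        covered = sum(
            max(0, min(e, w_end) - max(s, w_start)) for s, e in merged
        )
        fractions.append(covered / (w_end - w_start))
    return fractions


# Hypothetical clone coordinates in base pairs (not data from the paper).
clones = [(0, 400_000), (300_000, 700_000), (1_050_000, 1_400_000)]
coverage = window_coverage(clones, chrom_length=1_400_000)
print(coverage)  # [1.0, 0.5]
```

Merging the clones first prevents overlapping clones from being counted twice within a window, so a fraction of 1.0 corresponds to the 100% coverage shown as bright blue in the figure.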

Figure 3

Distribution of gaps in sequence coverage. Locations and sizes of gaps are determined using all available sequence coordinates for clones in the set. Of the gaps, 40% are <10 kb. Many gaps in this distribution may not be real and may instead result from our conservative algorithm for in silico sequence coordinate determination. There are 1169 BACs without sequence coordinates and therefore without explicit localization in the genome. When the genome sequence is complete, many of these BACs may be localized on the sequence.
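
One way to derive such a gap distribution from clone coordinates is sketched below. It assumes clones on one chromosome are given as half-open (start, end) pairs in base pairs; the function name and the example coordinates are hypothetical and do not reproduce the authors' algorithm.

```python
def find_gaps(clones):
    """Return the (start, end) gaps left between clone intervals."""
    gaps = []
    covered_to = None  # right-most position covered so far
    for start, end in sorted(clones):
        if covered_to is not None and start > covered_to:
            gaps.append((covered_to, start))
        covered_to = end if covered_to is None else max(covered_to, end)
    return gaps


# Hypothetical clone coordinates (not data from the paper).
clones = [(0, 150_000), (120_000, 300_000), (305_000, 480_000), (520_000, 700_000)]
gaps = find_gaps(clones)
gap_sizes = [end - start for start, end in gaps]
# Share of gaps shorter than 10 kb, the statistic quoted in the caption.
small_fraction = sum(size < 10_000 for size in gap_sizes) / len(gap_sizes)
print(gaps)            # [(300000, 305000), (480000, 520000)]
print(small_fraction)  # 0.5
```

Clones without sequence coordinates cannot enter this calculation at all, which is why some of the reported gaps may close once such clones are placed on the sequence.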

Figure 4

Resolution provided by the clones in the set. For each chromosome, the average clone cover size is coded by colour. There are a total of 61 656 clone covers with an average cover size of 47 kb and an effective resolution of 79 kb. Regions in the assembly without sequence information appear as black areas. Distance scale is in megabases.

Figure 5

Distribution of clone cover sizes and cumulative distribution of the effective resolution of our clone set. The average effective resolution is 79 kb. This value is obtained by averaging the cover sizes at randomly sampled points of the sequence assembly. The clone set can resolve 95% of the genome at a level of 150 kb or better.
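
The difference between the average cover size (47 kb) and the effective resolution (79 kb) follows from the sampling scheme: a randomly chosen point is more likely to land in a large cover than in a small one. The sketch below illustrates this with hypothetical cover sizes. The function names are assumptions, and the closed-form size-weighted mean is offered as the expected value of the sampling procedure, not as the authors' stated formula.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def mean_cover_size(sizes):
    """Plain average over covers, each cover counted once."""
    return sum(sizes) / len(sizes)


def effective_resolution(sizes):
    """Expected cover size at a uniformly random covered position."""
    return sum(s * s for s in sizes) / sum(sizes)


def sampled_resolution(sizes, n_points=50_000, seed=0):
    """Estimate the same quantity by sampling random positions."""
    rng = random.Random(seed)
    ends = list(accumulate(sizes))  # cumulative right edges of the covers
    total = ends[-1]
    picked = 0
    for _ in range(n_points):
        position = rng.randrange(total)
        picked += sizes[bisect_right(ends, position)]
    return picked / n_points


# Hypothetical cover sizes in base pairs (not data from the paper).
sizes = [20_000, 30_000, 150_000]
print(mean_cover_size(sizes))        # about 66667
print(effective_resolution(sizes))   # 119000.0
print(sampled_resolution(sizes))     # close to 119000
```

Because each cover is weighted by its own length, the effective resolution is never smaller than the plain average, which is consistent with the two values reported for the clone set.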

Figure 6

Distribution of coverage depth of the clone set. The depth is represented by the number of clone set BACs with sequence coordinates that span a given region of sequence.
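
Coverage depth of this kind can be computed with a sweep over clone end points, as in the minimal sketch below. The function name and the example coordinates are hypothetical, and clones are assumed to be half-open (start, end) intervals on a single chromosome.

```python
def depth_histogram(clones):
    """Map each depth to the number of bases covered at that depth."""
    events = []
    for start, end in clones:
        events.append((start, 1))   # a clone begins
        events.append((end, -1))    # a clone ends
    events.sort()

    histogram = {}
    depth = 0
    prev = None
    for pos, delta in events:
        # Credit the stretch since the previous event to the current depth.
        if prev is not None and pos > prev and depth > 0:
            histogram[depth] = histogram.get(depth, 0) + (pos - prev)
        depth += delta
        prev = pos
    return histogram


# Hypothetical clone coordinates (not data from the paper).
clones = [(0, 100), (50, 150), (120, 200)]
print(depth_histogram(clones))  # {1: 120, 2: 80}
```

Only clones with sequence coordinates can contribute to such a histogram, matching the restriction stated in the caption.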
