Partial short-read sequencing of a highly inbred Iberian pig and genomics inference thereof

A Esteve-Codina et al. Heredity (Edinb). 2011 Sep.

Abstract

Despite the dramatic reduction in sequencing costs brought by next-generation sequencing technologies, obtaining a complete mammalian genome sequence at sufficient depth is still costly. An alternative is partial sequencing. Here, we sequenced a reduced representation library of an Iberian sow from the Guadyerbas strain, a highly inbred strain that has been used in numerous QTL studies because of its extreme phenotypic characteristics. Using the Illumina Genome Analyzer II (San Diego, CA, USA), we resequenced ∼1% of the genome at an average depth of 4×, identifying 68,778 polymorphisms. Of these, 55,457 were putative fixed differences with respect to the reference assembly, which is based on the genome of a Duroc pig, and 13,321 were heterozygous positions within Guadyerbas. Although the strain is highly inbred, the estimated heterozygosity within Guadyerbas was ∼0.78 per kb in autosomes after correcting for low depth. Nucleotide variability was consistently higher in the telomeric regions than in the rest of the chromosome, likely as a result of increased recombination rates. Further, variability was 50% lower in the X chromosome than in autosomes, which may be explained by a recent bottleneck or by selection. We divided the whole genome into 500 kb windows and analyzed overrepresented gene ontology terms in regions of low and high variability. Multi-organism process, pigmentation and cell killing were overrepresented in high-variability regions, and metabolic process in low-variability regions. Further, a genome-wide Hudson-Kreitman-Aguadé test was carried out per window; overall, variability was in agreement with neutral expectations.
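The abstract reports heterozygosity per 500 kb window after correcting for low depth but does not spell out the correction. The sketch below shows one simple way such a correction could work; it is not necessarily the authors' procedure. It assumes that a heterozygous site is detected only when both alleles appear at least once among the reads covering it, and that each read samples either allele with probability 0.5. The function names and the (depth, is_het) input format are hypothetical.

```python
def detection_probability(depth: int) -> float:
    """Chance that both alleles of a heterozygous site are seen at least once
    among `depth` reads, each read drawing either allele with probability 0.5."""
    if depth < 2:
        return 0.0  # a single read can never reveal two alleles
    return 1.0 - 0.5 ** (depth - 1)


def corrected_heterozygosity(sites) -> float:
    """Estimate heterozygous sites per base for one window.

    `sites` is an iterable of (depth, is_het) tuples, one per covered base.
    Each base is weighted by its probability of revealing a heterozygote,
    so poorly covered bases count for less in the denominator.
    """
    n_het = 0
    effective_sites = 0.0
    for depth, is_het in sites:
        effective_sites += detection_probability(depth)
        n_het += int(is_het)
    return n_het / effective_sites if effective_sites > 0 else float("nan")


def window_index(position: int, window_size: int = 500_000) -> int:
    """Map a 0-based chromosome position to its 500 kb window number."""
    return position // window_size
```

Under these assumptions, a site covered by 4 reads reveals a true heterozygote with probability 0.875, so raw counts at 4× depth underestimate heterozygosity by roughly one eighth before any filtering for sequencing errors.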

Figures

Figure 1

Simulated isolation-with-migration model representing the Iberian/Duroc history (the public assembly pertains to a Duroc sow). The Duroc and Iberian populations descend from an ancestral population harboring a nucleotide diversity θ = 4N_eμ; after the split τ generations ago, the two breeds, of effective sizes N_DU and N_IB, may have exchanged individuals at rate m. A mixed coalescence and gene dropping procedure was used.
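As a small illustration of the scaled diversity parameter in this caption, the sketch below evaluates θ = 4N_eμ and its inverse. The effective size and the per-site, per-generation mutation rate are arbitrary placeholder values, not parameters taken from the paper, and the function names are hypothetical.

```python
def theta(ne: float, mu: float) -> float:
    """Population-scaled mutation rate per site for a diploid population:
    theta = 4 * Ne * mu."""
    return 4.0 * ne * mu


def implied_ne(theta_per_site: float, mu: float) -> float:
    """Effective size implied by a per-site theta under the same formula."""
    return theta_per_site / (4.0 * mu)


ASSUMED_MU = 1e-8  # placeholder mutation rate per site per generation

print(theta(10_000, ASSUMED_MU))      # theta for an assumed Ne of 10,000
print(implied_ne(1e-3, ASSUMED_MU))   # Ne implied by an assumed theta of 1 per kb
```

The inverse calculation holds only for a population at mutation-drift equilibrium, which a recently bottlenecked or highly inbred strain is unlikely to satisfy.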

Figure 2

Bioinformatics pipeline.

Figure 3

LOWESS-adjusted curves of variability in chromosomes 4, 7, 14 and X. Increased variability is observed towards the telomeres in the metacentric chromosomes 4 and X, whereas the ratio is distorted in SSC7 because of high SLA variability near window 50; SSC14 is acrocentric. Solid red line, Iberian heterozygosity (ĥ); dashed black line, Iberian–Duroc heterozygosity. Position refers to window number. A full color version of this figure is available at the Heredity journal online.

Figure 4

Histograms comparing observed (black bars) and simulated (grey bars) HKA statistics across autosomal and sex chromosome windows. The simulated results correspond to parameter values that minimized the Wilcoxon statistics.

Figure 5

Expected and observed gene ontology counts among genes located in high- and low-variability windows. Bars marked with an asterisk indicate significantly (P<0.001) overrepresented gene ontology terms.
