Allelic variation in gene expression is common in the human genome

H Shuen Lo et al. Genome Res. 2003 Aug.

Abstract

Variations in gene sequence and expression underlie much of human variability. Despite the known biological roles of differential allelic gene expression resulting from X-chromosome inactivation and genomic imprinting, a large-scale analysis of allelic gene expression in human is lacking. We examined allele-specific gene expression of 1063 transcribed single-nucleotide polymorphisms (SNPs) by using Affymetrix HuSNP oligo arrays. Among the 602 genes that were heterozygous and expressed in kidney or liver tissues from seven individuals, 326 (54%) showed preferential expression of one allele in at least one individual, and 170 of those showed greater than fourfold difference between the two alleles. The allelic variation has been confirmed by real-time quantitative PCR experiments. Some of these 170 genes are known to be imprinted, such as SNRPN, IPW, HTR2A, and PEG3. Most of the differentially expressed genes are not in known imprinting domains but instead are distributed throughout the genome. Our studies demonstrate that variation of gene expression between alleles is common, and this variation may contribute to human variability.

Figures

Figure 1

Evaluation of the Affymetrix HuSNP chip for analysis of allelic gene expression. (A) Scatter plots for duplicate experiments. In this example, we performed duplicate experiments by using the same genomic DNA (left) or cDNA made from kidney RNA (right). Each _circle_ represents a pair of intensity values in the two experiments for one SNP. The mean fluorescent intensity of the perfect match probe minus the mismatch probe from experiment 2 is plotted against the mean fluorescent intensity of the perfect match probe minus the mismatch probe from experiment 1. The Pearson correlation coefficients of the duplicate experiments for genomic DNA and cDNA are 0.98 and 0.95, respectively. The P values for both correlation coefficients are <0.0001. In other duplicate sample tests, Pearson correlation coefficients ranged from 0.98 to 0.99 for genomic DNA and from 0.88 to 0.96 for cDNA, and the P values for these correlation coefficients are also <0.0001. (B) Probe images of the PEG3 gene (SNP rs3143). The probe images were generated by using the Affymetrix MAS 4.0 software. The genomic DNA, kidney cDNA, and liver cDNA from the same fetus are each represented by a set of 16 hybridization signals. Within each set, each individual grid corresponds to a probe. The eight probes for allele A are on the left, and the eight probes for allele B are on the right. The top eight probes are perfect match probes, whereas the bottom eight probes are mismatch probes. The genomic DNA hybridized strongly to the perfect match probes for both alleles, whereas the cDNAs hybridized strongly to the perfect match probes of allele B only. (C) SNPs/genes on the HuSNP chip are summarized in this diagram. There are 1494 SNPs on the HuSNP chip. We mapped 1063 SNPs to transcribed regions and 431 SNPs outside of the transcribed regions. Of the 1063 SNPs, 602 SNPs were analyzed (for selection criteria, see Methods). Among these 602 SNPs, 277 showed almost equal expression levels between the two alleles, whereas 156 SNPs had a ratio of gene expression between twofold and fourfold, and 170 SNPs had a ratio exceeding fourfold for at least one individual.
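The duplicate-experiment comparison in panel A can be sketched as follows. This is a minimal illustration rather than the authors' analysis code: the function names and the intensity values are hypothetical, and the significance test behind the reported P values is omitted.

```python
import math

def pm_minus_mm(pm, mm):
    """Mean perfect-match intensity minus mean mismatch intensity for one SNP."""
    return sum(pm) / len(pm) - sum(mm) / len(mm)

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up PM - MM signals for four SNPs in two replicate hybridizations
# (illustrative values only, not data from the study).
experiment_1 = [120.0, 850.0, 430.0, 60.0]
experiment_2 = [135.0, 810.0, 455.0, 75.0]

r = pearson_r(experiment_1, experiment_2)
print(f"Pearson r between duplicates: {r:.3f}")
```

A value of r close to 1 indicates that the two hybridizations gave nearly the same signal for each SNP, which is the sense in which the caption reports reproducibility.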

Figure 2

Distribution of ratios of the fluorescent intensities between the two alleles for genomic DNA and cDNA. The ratios were computed as $(\mathrm{PM}_A - \mathrm{MM}_A)/(\mathrm{PM}_B - \mathrm{MM}_B)$ for each SNP for every sample. From the ratios in genomic DNA samples, the 1-SD interval around the mean is between -1.27 and 1.17 in log scale. The interval in log scale corresponds to the interval between 0.28 and 3.22 for the ratios. We selected 602 SNPs for analysis (for selection criteria, see Methods). To compare the distributions of ratios in genomic DNA and cDNA, we plotted the frequency of samples against the log ratio. Density functions for genomic DNA, kidney cDNA, and liver cDNA are represented by a black line, blue line, and purple line, respectively. Black triangles, from _left_ to right, indicate X coordinates at log(0.25), log(0.5), log(2), and log(4). The coordinates at log(0.5) and log(2) represent twofold ratios, and log(0.25) and log(4) represent fourfold ratios. The density functions for the kidney cDNA ratios and the liver cDNA ratios are similar. Both are more widely spread than the density function for the genomic DNA.
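A small sketch of the ratio calculation and the fold thresholds described in this caption. The function names, the handling of non-positive signals, and the treatment of values exactly on a threshold are assumptions made for illustration. The caption's interval bounds are consistent with a natural logarithm, which is what the sketch uses.

```python
import math

def allelic_ratio(pm_a, mm_a, pm_b, mm_b):
    """(PM_A - MM_A) / (PM_B - MM_B) for one SNP in one sample."""
    signal_a = pm_a - mm_a
    signal_b = pm_b - mm_b
    if signal_a <= 0 or signal_b <= 0:
        # Assumption: a non-positive corrected signal leaves the ratio undefined.
        return None
    return signal_a / signal_b

def fold_class(ratio):
    """Place a ratio relative to the log(0.25), log(0.5), log(2), log(4) marks."""
    x = math.log(ratio)
    if x < math.log(0.25) or x > math.log(4):
        return "greater than fourfold"
    if x <= math.log(0.5) or x >= math.log(2):
        return "twofold to fourfold"
    return "less than twofold"

# Converting the genomic DNA interval from log scale back to ratios.
low, high = math.exp(-1.27), math.exp(1.17)
print(f"{low:.2f} to {high:.2f}")  # 0.28 to 3.22

# Hypothetical probe intensities for one SNP in one cDNA sample.
ratio = allelic_ratio(pm_a=900.0, mm_a=100.0, pm_b=260.0, mm_b=100.0)
print(ratio, fold_class(ratio))
```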

Figure 3

Mapping of allelic gene expression on chromosomes. The 602 SNPs analyzed in this study were mapped to the 22 autosomes and X chromosome (Supplemental Fig. 1). Chromosomes 9, 13, and 15 are shown here. The position of each SNP on the chromosome is based on the annotation in dbSNP. Allelic gene expression from kidney and liver is represented by blue and orange, respectively. Squares indicate the mean of the ratios for each gene, and the thin vertical lines indicate error bars (SD). The values are the ratios (allele A/allele B) between the two alleles. The values were inverted if less than one (allele B/allele A, when allele B was preferentially expressed). The scale marks ratios from one to 10. The chromosomal regions containing known imprinted genes are labeled with the red line on the left.
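The plotted quantities (inverted ratios, their mean, and the SD error bars) can be sketched as below. Two points are assumptions, since the caption does not state them: the inversion is applied to each individual's ratio before averaging, and the SD is the sample standard deviation. The function names and input values are hypothetical.

```python
import statistics

def oriented_ratio(ratio):
    """Return allele A/allele B, or its inverse when the ratio is below one."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return ratio if ratio >= 1 else 1 / ratio

def summarize_gene(ratios):
    """Mean and SD of the oriented ratios across individuals for one gene."""
    oriented = [oriented_ratio(r) for r in ratios]
    mean = statistics.mean(oriented)
    # Assumption: sample SD; a single observation gets no error bar.
    sd = statistics.stdev(oriented) if len(oriented) > 1 else 0.0
    return mean, sd

# Illustrative A/B ratios for one gene in three heterozygous individuals.
mean, sd = summarize_gene([2.0, 0.5, 4.0])
print(f"mean = {mean:.2f}, SD = {sd:.2f}")
```

Inverting values below one places every gene on the same one-to-10 scale regardless of which allele is preferentially expressed, at the cost of hiding the direction of the imbalance.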

Figure 4

Validation of allele-specific gene expression using real-time quantitative PCR. (A) Genotyping of ELAC2 in 23 fetuses. Genomic DNAs from homozygous AA fetuses are at the top left corner (blue), and genomic DNAs from homozygous BB fetuses are at the bottom right corner (red). Genomic DNAs from heterozygous fetuses are located near the diagonal line (green). The black square represents the no-template control (NTC). The _X_-axis is for the allele labeled by the VIC dye, and the _Y_-axis is for the allele labeled by the FAM dye. (B) The log2 of (FAM intensity/VIC intensity) for ELAC2 was plotted against the log2 of (FAM allele/VIC allele) for homozygous DNAs mixed at seven different ratios (8:1, 4:1, 2:1, 1:1, 1:2, 1:4, 1:8; VIC allele/FAM allele). (C) Real-time quantitative PCR amplification of a cDNA sample from liver for ELAC2. The _X_-axis is the number of PCR amplification cycles, and the _Y_-axis is the fluorescence intensity. The red and blue curves represent alleles labeled with FAM and VIC, respectively. (D) Same as C except that the data are from kidney.
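One common way to use a mixing series like the one in panel B is to fit a straight line to it and then read an unknown sample's allele ratio off that line. The sketch below assumes this usage; the caption only says that the values were plotted. The measured intensities, function names, and the least-squares fit are all hypothetical choices made for illustration.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Mixing ratios from the caption, given as (VIC allele, FAM allele).
mixes = [(8, 1), (4, 1), (2, 1), (1, 1), (1, 2), (1, 4), (1, 8)]
log2_allele = [math.log2(fam / vic) for vic, fam in mixes]

# Made-up log2(FAM intensity / VIC intensity) readings for the seven mixes.
log2_intensity = [-2.6, -1.7, -0.8, 0.1, 1.0, 1.9, 2.8]

slope, intercept = fit_line(log2_allele, log2_intensity)

def estimate_allele_ratio(fam_intensity, vic_intensity):
    """Invert the fitted line to estimate FAM allele / VIC allele."""
    y = math.log2(fam_intensity / vic_intensity)
    return 2 ** ((y - intercept) / slope)

print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
print(f"estimated FAM/VIC allele ratio: {estimate_allele_ratio(3.0, 1.0):.2f}")
```

Fitting in log2 space keeps the seven mixes evenly spaced, so that the 8:1 and 1:8 points carry the same weight in the fit.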
