A cross-platform analysis of 14,177 expression quantitative trait loci derived from lymphoblastoid cell lines

Liming Liang et al. Genome Res. 2013 Apr.

Abstract

Gene expression levels can be an important link between DNA variation and phenotypic manifestations. Our previous map of global gene expression, based on ~400K single nucleotide polymorphisms (SNPs) and 50K transcripts in 400 sib pairs from the MRCA family panel, has been widely used to interpret the results of genome-wide association studies (GWASs). Here, we more than double the size of our initial data set with expression data on 550 additional individuals from the MRCE family panel using the Illumina whole-genome expression array. We have used new statistical methods for dimension reduction to account for nongenetic effects in estimates of expression levels, and we have also included SNPs imputed from the 1000 Genomes Project. Our methods reduced false-discovery rates and increased the number of expression quantitative trait loci (eQTLs) mapped either locally or at a distance (i.e., in cis or trans) from 1534 in the MRCA data set to 4452 (with <5% FDR). Imputation of 1000 Genomes SNPs further increased the number of eQTLs to 7302. Using the same methods and imputed SNPs in the newly acquired MRCE data set, we identified eQTLs for 9000 genes. The combined results identify strong local and distant effects for transcripts from 14,177 genes. Our eQTL database based on these results is freely available to help define the function of disease-associated variants.
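The dimension-reduction step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the adjustment amounts to regressing the top principal components out of a samples-by-transcripts expression matrix; the function name `remove_top_pcs`, the simulated data, and the choice of `k` are hypothetical and are not taken from the paper, whose actual procedure (for example, its handling of family structure) may differ.

```python
import numpy as np

def remove_top_pcs(expr, k):
    """Regress the top-k principal components out of an expression matrix.

    expr: array of shape (n_samples, n_transcripts).
    Returns the column-centered residual matrix, same shape as expr.
    """
    centered = expr - expr.mean(axis=0, keepdims=True)
    # Thin SVD: columns of u are the principal component directions in sample space.
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    top = u[:, :k]
    # Subtract the projection of every transcript onto the span of the top-k components.
    return centered - top @ (top.T @ centered)

# Simulated example (hypothetical data): one strong shared nongenetic factor plus noise.
rng = np.random.default_rng(0)
n_samples, n_transcripts = 50, 200
batch = rng.normal(size=(n_samples, 1))
loadings = rng.normal(size=(1, n_transcripts))
noise = rng.normal(size=(n_samples, n_transcripts))
expr = 5.0 * batch @ loadings + noise

adjusted = remove_top_pcs(expr, k=1)
```

The adjusted matrix would then replace the raw expression values in the association tests. How many components to remove is a tuning decision; the Figure 1 caption reports that the top 69 were used.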

Figures

Figure 1.

Empirical estimate of false discovery rate. (noPC) Using original expression value; (PC) adjusting nongenetic effect using the top 69 principal components; (1000G) imputation using SNPs from the 1000 Genomes project; (HapMap) imputation using HapMap2 SNPs; (300K) using autosomal SNPs from the Illumina 300K panel.

Figure 2.

Replication rate by distance from the eQTL to the transcript. (Red lines) Replication of eQTL. (A) Local effect; (B) distant syntenic effect (>1 Mb but gene and SNP on the same chromosome); and (C) distant effect (on a different chromosome). The analysis is based on autosomal SNPs from the Illumina 300K panel.

Figure 3.

Imputation accuracy by minor allele frequency (MAF; MRCA panel). (A) Correlation (R²) between 1000G imputed allele dosage derived from Illumina 300K arrays and true allele counts measured by Illumina 100K arrays, plotted by MAF. (B) Histogram of correlation with true allele counts by MAF in sample (upper panels) and minor allele counts in the 1000G reference haplotype. Note that there are no data in the bottom left panel following removal of singletons from the reference haplotype.
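The accuracy measure in this caption can be sketched as follows, assuming it is the squared Pearson correlation between imputed dosages and true allele counts (coded 0, 1, 2) for each SNP, summarized within MAF bins. The function names, the bin edge, and the toy genotypes are hypothetical illustrations, not the authors' code or data.

```python
import numpy as np

def dosage_r2(dosage, truth):
    """Squared Pearson correlation between imputed dosages and true allele counts for one SNP."""
    if np.std(dosage) == 0 or np.std(truth) == 0:
        return float("nan")  # correlation is undefined for a monomorphic SNP
    r = np.corrcoef(dosage, truth)[0, 1]
    return float(r * r)

def minor_allele_freq(truth):
    """Minor allele frequency from true allele counts coded 0, 1, 2."""
    p = np.mean(truth) / 2.0
    return float(min(p, 1.0 - p))

def mean_r2_by_maf_bin(dosages, truths, edges):
    """Mean per-SNP R^2 within MAF bins.

    dosages, truths: arrays of shape (n_snps, n_samples).
    edges: increasing MAF bin edges, passed to np.digitize.
    Returns {bin_index: mean R^2 of the SNPs in that bin}.
    """
    r2 = np.array([dosage_r2(d, t) for d, t in zip(dosages, truths)])
    maf = np.array([minor_allele_freq(t) for t in truths])
    bins = np.digitize(maf, edges)
    return {int(b): float(np.nanmean(r2[bins == b])) for b in np.unique(bins)}

# Toy data (hypothetical): two SNPs typed in six samples, imputed without error.
truths = np.array([[0, 1, 2, 1, 0, 2],
                   [0, 0, 0, 1, 0, 0]])
dosages = truths.astype(float)
summary = mean_r2_by_maf_bin(dosages, truths, edges=[0.1])
```

With a single edge at 0.1, the rare SNP falls in bin 0 and the common SNP in bin 1. Real imputed dosages are fractional, so R² typically drops below 1, and panel A plots how it varies with MAF.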

Figure 4.

Comparison of number of local eQTLs identified by directly genotyped SNPs, imputed HapMap2 SNPs, and imputed SNPs from the 1000 Genomes Project. (A) Results from Affymetrix expression data in the MRCA panel. (B) Results from Illumina expression data in the MRCE panel. (Blue bars) Original unadjusted expression; (red bars) expression values adjusted by the top principal components.

Figure 5.

Proportion of significant transcripts by overall heritability. (A) Unadjusted expression data and SNPs from the Illumina 300K panel in the MRCA subjects. (B) PC-adjusted expression data and the imputation of 1000G SNPs in the MRCA subjects. (C) Unadjusted expression data and SNPs from the Illumina 300K panel in the MRCE subjects. (D) PC-adjusted expression data and the imputation of 1000G SNPs in the MRCE subjects. The number of transcripts in each heritability category is given at the bottom of each bar.

Figure 6.

cis eQTL of the gene TIMM22. A map of association of SNPs to transcript abundance of the TIMM22 gene exemplifies the progressive increase in information for the experimental genotypes (A), experimental plus HapMap imputed SNPs (B), and experimental, HapMap imputed and 1000G imputed SNPs (C). The gray vertical bar at the gene TIMM22 indicates the position of the Affymetrix probe.
