Bayesian association-based fine mapping in small chromosomal segments
Mikko J Sillanpää et al. Genetics. 2005 Jan.
Abstract
A Bayesian method for fine mapping is presented, which deals with multiallelic markers (with two or more alleles), unknown phase, missing data, multiple causal variants, and both continuous and binary phenotypes. We consider small chromosomal segments spanned by a dense set of closely linked markers and putative genes only at marker points. In the phenotypic model, locus-specific indicator variables are used to control the inclusion or exclusion of marker contributions. To account for covariance between consecutive loci and to control fluctuations in association signals along a candidate region, we introduce a joint prior for the indicators that depends on genetic or physical map distances. The potential of the method, including posterior estimation of trait-associated loci, their effects, linkage disequilibrium pattern due to close linkage of loci, and the age of a causal variant (time to most recent common ancestor), is illustrated with the well-known cystic fibrosis and Friedreich ataxia data sets by assuming that haplotypes were not available. In addition, simulation analysis with large genetic distances is shown. Estimation of model parameters is based on Markov chain Monte Carlo (MCMC) sampling and is implemented using WinBUGS. The model specification code is freely available for research purposes from http://www.rni.helsinki.fi/~mjs/.
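The abstract does not give the functional form of the joint prior, so the following is only an illustrative sketch of how a distance-dependent prior on adjacent indicators could behave. It assumes a first-order Markov construction in which indicator l copies its left neighbour with probability exp(-lam * d) and is otherwise drawn afresh with inclusion probability `pi`. The names `sample_indicators`, `joint_inclusion_prior`, `lam`, and `pi` are hypothetical, and this is not the paper's actual prior.

```python
import numpy as np

def sample_indicators(distances, lam, pi, rng):
    """Draw one 0/1 indicator vector from a hypothetical Markov-type prior.

    distances[l-1] is the map distance between markers l-1 and l.
    Assumed form (not from the paper): dependence decays as exp(-lam * d).
    """
    n = len(distances) + 1
    ind = np.empty(n, dtype=int)
    ind[0] = rng.random() < pi
    for l in range(1, n):
        rho = np.exp(-lam * distances[l - 1])  # dependence on the neighbour
        if rng.random() < rho:
            ind[l] = ind[l - 1]                # copy the neighbouring indicator
        else:
            ind[l] = rng.random() < pi         # independent fresh draw
    return ind

def joint_inclusion_prior(d, lam, pi):
    """Prior P(I_{l-1} = 1, I_l = 1) implied by the construction above."""
    rho = np.exp(-lam * d)
    return pi * (rho + (1.0 - rho) * pi)
```

Under this assumed form, markers at distance zero always share the same indicator value, while distant markers are close to independent, so the joint inclusion probability moves from `pi` toward `pi ** 2` as the distance grows. A larger `lam` weakens the coupling faster.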
Figures
Figure 1.—
Illustration of how dependence between adjacent markers influences the mapping signal (QTL probability on the _y_-axis). The posterior QTL probabilities are drawn as a histogram and the corresponding hypothetical values of locus indicators $\{I_{l-1}, I_l, I_{l+1}\}$ for three markers {l − 1, l, l + 1} are given in a single MCMC iteration. (A) Surrounding indicators smooth the spurious signal at marker l downward. (B) Surrounding indicators strengthen the weak but real signal at marker l upward. A similar phenomenon happens in methods that utilize combined information of linkage and association. In these methods, linkage information does not confirm spurious associations but strengthens the real association signals.
Figure 2.—
QTL probabilities. Locus-specific estimates of QTL probabilities for the CF data set (left) and the corresponding prior probabilities (right) are shown. Marker numbers are shown on the _x_-axis and QTL probabilities on the _y_-axis.
Figure 3.—
QTL allelic effects. Locus-specific point estimates (mean) of allelic effects for the CF data set (left) and the corresponding prior (right) are shown. The first and the second allele at each marker locus are shown as open and solid bars, respectively. These quantities are calculated on the basis of pooling samples from two separate MCMC chains with 30,000 samples (after an initial 5000 burn-in rounds) in each. All sampled values were utilized in these estimates, including the iterations where the marker indicator was zero. Marker numbers are shown on the _x_-axis and the underlying hidden phenotype (liability) scale is on the _y_-axis. Note that the effects are double in size.
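The pooling described in this caption can be sketched as follows: discard the burn-in rounds from each chain, concatenate the remaining samples, and take the column-wise mean over all kept iterations. The function name `pooled_posterior_means` and the array layout (one row per MCMC iteration, one column per marker) are assumptions made for illustration, not the authors' code.

```python
import numpy as np

def pooled_posterior_means(chains, burn_in):
    """Pool several MCMC chains after burn-in and return per-marker means.

    chains  : list of arrays, each of shape (iterations, markers)
    burn_in : number of initial iterations to drop from every chain
    """
    kept = [np.asarray(chain, dtype=float)[burn_in:] for chain in chains]
    pooled = np.concatenate(kept, axis=0)
    # Every kept iteration contributes, including those where a marker's
    # indicator was zero, as the caption describes.
    return pooled.mean(axis=0)
```

Applied to sampled 0/1 indicators, the same function gives the posterior QTL probability at each marker; applied to sampled allelic effects, it gives point estimates of the kind plotted here.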
Figure 4.—
Estimated linkage disequilibrium pattern for the CF data set. The posterior probabilities of jointly selecting two adjacent markers into the model (i.e., their indicators have value one simultaneously) are estimated for each marker pair (diamond, scale on the right _y_-axis). For each marker pair, the corresponding prior (circle, scale on the right _y_-axis) and the physical distance between markers are also shown (box, scale on the left _y_-axis). Markers present in each marker pair are shown on the _x_-axis.
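As a minimal sketch of the quantity plotted here, the joint selection probability for each adjacent pair can be estimated as the fraction of MCMC iterations in which both indicators equal one. The function name `adjacent_joint_inclusion` and the sample layout (rows are iterations, columns are markers in map order) are assumptions made for illustration.

```python
import numpy as np

def adjacent_joint_inclusion(indicators):
    """Estimate P(I_l = 1 and I_{l+1} = 1) for every adjacent marker pair.

    indicators : array of 0/1 values, shape (iterations, markers)
    returns    : array of length markers - 1
    """
    ind = np.asarray(indicators)
    both = ind[:, :-1] * ind[:, 1:]  # 1 only when both neighbours are included
    return both.mean(axis=0)
```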
Figure 5.—
Locus-specific point estimates (mean) of weighted genetic variances for the Friedreich ataxia data. Different types of specifications for smoothing parameter λ are shown: (A) wide Gamma(1, 0.01) prior for λ with mean 100 (posterior mean 62.65); (B) narrow Gamma(100, 1) prior for λ with mean 100 (posterior mean 99.32); (C) λ = 10 (strong smoothing); (D) λ = 250 (weak smoothing); and (E) independent prior for indicators with no λ parameter (no smoothing). Marker numbers are shown on the _x_-axis and weighted genetic variances are on the _y_-axis. Note that the variance parameter is for the model where the effects are double in size.
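The "wide" and "narrow" priors in (A) and (B) share the same mean but differ greatly in spread. The short check below assumes the shape-rate parameterization, Gamma(shape, rate), which matches the stated mean of 100 for both priors and the convention used by WinBUGS; the helper name `gamma_prior_summary` is hypothetical.

```python
import math

def gamma_prior_summary(shape, rate):
    """Mean and standard deviation of a Gamma(shape, rate) distribution."""
    mean = shape / rate
    sd = math.sqrt(shape) / rate
    return mean, sd

wide = gamma_prior_summary(1, 0.01)    # prior in panel (A)
narrow = gamma_prior_summary(100, 1)   # prior in panel (B)
```

Under this parameterization the wide prior has a standard deviation equal to its mean, while the narrow prior has a standard deviation one tenth of its mean. That is consistent with the posterior mean moving further from 100 in (A) than in (B).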
Figure 6.—
Locus-specific estimates of QTL probabilities (left) and weighted genetic variances (right) for the simulated data set, where loci 18 and 22 represent the true gene locations. Marker numbers are shown on the _x_-axis and QTL probabilities/weighted genetic variances are on the _y_-axis.
Similar articles
- Bayesian analysis of multilocus association in quantitative and qualitative traits. Kilpikari R, Sillanpää MJ. Genet Epidemiol. 2003 Sep;25(2):122-35. doi: 10.1002/gepi.10257. PMID: 12916021.
- Bayesian mapping of multiple quantitative trait loci from incomplete outbred offspring data. Sillanpää MJ, Arjas E. Genetics. 1999 Apr;151(4):1605-19. doi: 10.1093/genetics/151.4.1605. PMID: 10101181. Free PMC article.
- Association mapping of complex trait loci with context-dependent effects and unknown context variable. Sillanpää MJ, Bhattacharjee M. Genetics. 2006 Nov;174(3):1597-611. doi: 10.1534/genetics.106.061275. Epub 2006 Oct 8. PMID: 17028339. Free PMC article.
- Bayesian mapping of multiple quantitative trait loci from incomplete inbred line cross data. Sillanpää MJ, Arjas E. Genetics. 1998 Mar;148(3):1373-88. doi: 10.1093/genetics/148.3.1373. PMID: 9539450. Free PMC article.
- Joint tests of linkage and association for quantitative traits. Allison DB, Neale MC. Theor Popul Biol. 2001 Nov;60(3):239-51. doi: 10.1006/tpbi.2001.1544. PMID: 11855958. Review.
Cited by
- A simple new approach to variable selection in regression, with application to genetic fine mapping. Wang G, Sarkar A, Carbonetto P, Stephens M. J R Stat Soc Series B Stat Methodol. 2020 Dec;82(5):1273-1300. doi: 10.1111/rssb.12388. Epub 2020 Jul 10. PMID: 37220626. Free PMC article.
- A novel linkage-disequilibrium corrected genomic relationship matrix for SNP-heritability estimation and genomic prediction. Mathew B, Léon J, Sillanpää MJ. Heredity (Edinb). 2018 Apr;120(4):356-368. doi: 10.1038/s41437-017-0023-4. Epub 2017 Dec 14. PMID: 29238077. Free PMC article.
- Prediction of complex human diseases from pathway-focused candidate markers by joint estimation of marker effects: case of chronic fatigue syndrome. Bhattacharjee M, Rajeevan MS, Sillanpää MJ. Hum Genomics. 2015 Jun 11;9(1):8. doi: 10.1186/s40246-015-0030-6. PMID: 26063326. Free PMC article.
- Estimation of Gene Expression at Isoform Level from mRNA-Seq Data by Bayesian Hierarchical Modeling. Bhattacharjee M, Gupta R, Davuluri RV. Front Genet. 2012 Nov 27;3:239. doi: 10.3389/fgene.2012.00239. eCollection 2012. PMID: 23293650. Free PMC article.
- Swift block-updating EM and pseudo-EM procedures for Bayesian shrinkage analysis of quantitative trait loci. Mutshinda CM, Sillanpää MJ. Theor Appl Genet. 2012 Nov;125(7):1575-87. doi: 10.1007/s00122-012-1936-1. Epub 2012 Jul 24. PMID: 22824967.