Bayesian association-based fine mapping in small chromosomal segments

Mikko J Sillanpää et al. Genetics. 2005 Jan.

Abstract

A Bayesian method for fine mapping is presented, which deals with multiallelic markers (with two or more alleles), unknown phase, missing data, multiple causal variants, and both continuous and binary phenotypes. We consider small chromosomal segments spanned by a dense set of closely linked markers and putative genes only at marker points. In the phenotypic model, locus-specific indicator variables are used to control inclusion in or exclusion from marker contributions. To account for covariance between consecutive loci and to control fluctuations in association signals along a candidate region we introduce a joint prior for the indicators that depends on genetic or physical map distances. The potential of the method, including posterior estimation of trait-associated loci, their effects, linkage disequilibrium pattern due to close linkage of loci, and the age of a causal variant (time to most recent common ancestor), is illustrated with the well-known cystic fibrosis and Friedreich ataxia data sets by assuming that haplotypes were not available. In addition, simulation analysis with large genetic distances is shown. Estimation of model parameters is based on Markov chain Monte Carlo (MCMC) sampling and is implemented using WinBUGS. The model specification code is freely available for research purposes from http://www.rni.helsinki.fi/~mjs/.
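The role of the locus-specific indicators in the phenotypic model can be illustrated with a minimal Python sketch. It is not the authors' WinBUGS specification: the additive allelic coding, the unit residual variance, the zero liability threshold, and the names `linear_predictor` and `simulate_phenotype` are all assumptions made for illustration.

```python
import random

def linear_predictor(genotype, indicators, effects, intercept=0.0):
    """Sum allelic effects over markers whose indicator equals 1.

    genotype:   list of (allele_a, allele_b) index pairs, one per marker
    indicators: list of 0/1 inclusion indicators, one per marker
    effects:    effects[l][k] is the effect of allele k at marker l
    """
    total = intercept
    for (a, b), ind, eff in zip(genotype, indicators, effects):
        # A marker contributes only when its indicator is switched on.
        # Assumption: additive coding, so the sum does not depend on phase.
        total += ind * (eff[a] + eff[b])
    return total

def simulate_phenotype(genotype, indicators, effects, rng, binary=False):
    """Draw a continuous phenotype, or a binary one by thresholding a liability."""
    # Assumption: standard normal residual and a liability threshold at zero.
    value = linear_predictor(genotype, indicators, effects) + rng.gauss(0.0, 1.0)
    return int(value > 0.0) if binary else value

# Hypothetical toy data: three biallelic markers, only the second one included.
genotype = [(0, 1), (1, 1), (0, 0)]
indicators = [0, 1, 0]
effects = [[0.5, -0.5], [-1.0, 1.0], [0.3, -0.3]]

rng = random.Random(0)
eta = linear_predictor(genotype, indicators, effects)
y_continuous = simulate_phenotype(genotype, indicators, effects, rng)
y_binary = simulate_phenotype(genotype, indicators, effects, rng, binary=True)
```

With every indicator set to zero the linear predictor reduces to the intercept, which is how exclusion of a marker removes its contribution in this sketch.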


Figures

Figure 1.—

Illustration of how dependence between adjacent markers influences the mapping signal (QTL probability on the _y_-axis). The posterior QTL probabilities are drawn as a histogram and the corresponding hypothetical values of locus indicators $\{I_{l-1}, I_l, I_{l+1}\}$ for three markers $\{l-1, l, l+1\}$ are given in a single MCMC iteration. (A) Surrounding indicators smooth the spurious signal at marker $l$ downward. (B) Surrounding indicators strengthen the weak but real signal at marker $l$ upward. A similar phenomenon happens in methods that utilize combined information of linkage and association. In these methods, linkage information does not confirm spurious associations but strengthens the real association signals.

Figure 2.—

QTL probabilities. Locus-specific estimates of QTL probabilities for the CF data set (left) and the corresponding prior probabilities (right) are shown. Marker numbers are shown on the _x_-axis and QTL probabilities on the _y_-axis.
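A locus-specific QTL probability of this kind is commonly estimated as the fraction of MCMC iterations in which the marker's indicator equals one. The sketch below assumes that estimator; the function name `qtl_probabilities` and the toy samples are hypothetical and do not come from the paper.

```python
def qtl_probabilities(indicator_samples):
    """Estimate per-marker inclusion probabilities from sampled indicators.

    indicator_samples: list of MCMC iterations, each a list of 0/1
    indicators with one entry per marker.
    """
    n_iter = len(indicator_samples)
    n_markers = len(indicator_samples[0])
    # Posterior mean of each indicator = share of iterations where it is 1.
    return [
        sum(row[l] for row in indicator_samples) / n_iter
        for l in range(n_markers)
    ]

# Hypothetical toy output: four iterations, three markers.
samples = [
    [1, 0, 0],
    [1, 1, 0],
    [0, 1, 0],
    [1, 0, 0],
]
probs = qtl_probabilities(samples)  # [0.75, 0.5, 0.0]
```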

Figure 3.—

QTL allelic effects. Locus-specific point estimates (mean) of allelic effects for the CF data set (left) and the corresponding prior (right) are shown. The first and the second allele at each marker locus are shown as open and solid bars, respectively. These quantities are calculated on the basis of pooling samples from two separate MCMC chains with 30,000 samples (after an initial 5000 burn-in rounds) in each. All sampled values were utilized in these estimates, including the iterations where the marker indicator was zero. Marker numbers are shown on the _x_-axis and the underlying hidden phenotype (liability) scale is on the _y_-axis. Note that the effects are double in size.
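The pooling described in this caption can be sketched as follows: discard the burn-in rounds from each chain, concatenate the rest, and average over every retained iteration regardless of the indicator value. The data layout (a list of `(indicator, effect)` pairs per chain) and the function names are assumptions for illustration; how the effect is recorded in iterations where the indicator is zero is also an assumption, and the sketch simply uses whatever value was stored. The conditional mean is included only for contrast.

```python
from statistics import fmean

def pool_chains(chains, burn_in):
    """Drop the first `burn_in` draws of every chain and concatenate the rest."""
    pooled = []
    for chain in chains:
        pooled.extend(chain[burn_in:])
    return pooled

def mean_over_all(draws):
    """Average the effect over all retained draws, indicator 0 or 1."""
    return fmean(effect for _, effect in draws)

def mean_when_included(draws):
    """Average the effect only over draws where the indicator is 1."""
    return fmean(effect for ind, effect in draws if ind == 1)

# Hypothetical toy chains of (indicator, effect) pairs for one allele.
chain_1 = [(0, 9.0), (1, 2.0), (0, 0.0), (1, 1.0)]
chain_2 = [(1, 5.0), (1, 3.0), (0, 0.0), (1, 2.0)]

pooled = pool_chains([chain_1, chain_2], burn_in=1)
overall = mean_over_all(pooled)
conditional = mean_when_included(pooled)
```

In this toy example the estimate over all draws is smaller in magnitude than the conditional one, because iterations with a zero indicator pull the average toward zero.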

Figure 4.—

Estimated linkage disequilibrium pattern for the CF data set. The posterior probabilities of jointly selecting two adjacent markers into the model (i.e., their indicators have value one simultaneously) are estimated for each marker pair (diamond, scale in the right _y_-axis). For each marker pair, the corresponding prior (circle, scale in the right _y_-axis) and the physical distance between markers are also shown (box, scale in the left _y_-axis). Markers present in each marker pair are shown on the _x_-axis.
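The joint selection probability for an adjacent marker pair can be estimated from sampled indicators as the fraction of iterations in which both indicators equal one. This is a minimal sketch under that reading of the caption; the function name `adjacent_joint_probabilities` and the toy samples are hypothetical.

```python
def adjacent_joint_probabilities(indicator_samples):
    """Estimate P(I_l = 1 and I_{l+1} = 1) for each adjacent marker pair.

    indicator_samples: list of MCMC iterations, each a list of 0/1
    indicators with one entry per marker.
    """
    n_iter = len(indicator_samples)
    n_markers = len(indicator_samples[0])
    # The product of two 0/1 indicators is 1 only when both are 1.
    return [
        sum(row[l] * row[l + 1] for row in indicator_samples) / n_iter
        for l in range(n_markers - 1)
    ]

# Hypothetical toy output: four iterations, three markers.
samples = [
    [1, 1, 0],
    [1, 1, 0],
    [0, 1, 1],
    [1, 1, 0],
]
pair_probs = adjacent_joint_probabilities(samples)  # [0.75, 0.25]
```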

Figure 5.—

Locus-specific point estimates (mean) of weighted genetic variances for the Friedreich ataxia data. Different types of specifications for smoothing parameter λ are shown: (A) wide Gamma(1, 0.01) prior for λ with mean 100 (posterior mean 62.65); (B) narrow Gamma(100, 1) prior for λ with mean 100 (posterior mean 99.32); (C) λ = 10 (strong smoothing); (D) λ = 250 (weak smoothing); and (E) independent prior for indicators with no λ parameter (no smoothing). Marker numbers are shown on the _x_-axis and weighted genetic variances are on the _y_-axis. Note that the variance parameter is for the model where the effects are double in size.
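The two Gamma priors in (A) and (B) share the same mean but differ greatly in spread, which is why one is called wide and the other narrow. The sketch below checks this under the shape and rate parameterization, which is consistent with the means stated in the caption; the function name `gamma_moments` is hypothetical.

```python
import math

def gamma_moments(shape, rate):
    """Return (mean, variance) of a Gamma(shape, rate) distribution."""
    mean = shape / rate
    variance = shape / rate ** 2
    return mean, variance

# Prior (A): Gamma(1, 0.01), prior (B): Gamma(100, 1).
wide_mean, wide_var = gamma_moments(1, 0.01)
narrow_mean, narrow_var = gamma_moments(100, 1)

# Standard deviations: about 100 for the wide prior, 10 for the narrow one.
wide_sd = math.sqrt(wide_var)
narrow_sd = math.sqrt(narrow_var)
```

Both priors are centered at 100, but the wide prior has a standard deviation roughly ten times larger, so it lets the data move λ much further from its prior mean, in line with the posterior means reported in the caption (62.65 versus 99.32).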

Figure 6.—

Locus-specific estimates of QTL probabilities (left) and weighted genetic variances (right) for the simulated data set, where loci 18 and 22 represent the true gene locations. Marker numbers are shown on the _x_-axis and QTL probabilities/weighted genetic variances are on the _y_-axis.

Cited by

A simple new approach to variable selection in regression, with application to genetic fine mapping.
Wang G, Sarkar A, Carbonetto P, Stephens M. J R Stat Soc Series B Stat Methodol. 2020 Dec;82(5):1273-1300. doi: 10.1111/rssb.12388. Epub 2020 Jul 10. PMID: 37220626. Free PMC article.
A novel linkage-disequilibrium corrected genomic relationship matrix for SNP-heritability estimation and genomic prediction.
Mathew B, Léon J, Sillanpää MJ. Heredity (Edinb). 2018 Apr;120(4):356-368. doi: 10.1038/s41437-017-0023-4. Epub 2017 Dec 14. PMID: 29238077. Free PMC article.
Prediction of complex human diseases from pathway-focused candidate markers by joint estimation of marker effects: case of chronic fatigue syndrome.
Bhattacharjee M, Rajeevan MS, Sillanpää MJ. Hum Genomics. 2015 Jun 11;9(1):8. doi: 10.1186/s40246-015-0030-6. PMID: 26063326. Free PMC article.
Estimation of Gene Expression at Isoform Level from mRNA-Seq Data by Bayesian Hierarchical Modeling.
Bhattacharjee M, Gupta R, Davuluri RV. Front Genet. 2012 Nov 27;3:239. doi: 10.3389/fgene.2012.00239. eCollection 2012. PMID: 23293650. Free PMC article.
Swift block-updating EM and pseudo-EM procedures for Bayesian shrinkage analysis of quantitative trait loci.
Mutshinda CM, Sillanpää MJ. Theor Appl Genet. 2012 Nov;125(7):1575-87. doi: 10.1007/s00122-012-1936-1. Epub 2012 Jul 24. PMID: 22824967.

