Adjusting batch effects in microarray expression data using empirical Bayes methods
W Evan Johnson et al. Biostatistics. 2007 Jan.
Abstract
Non-biological experimental variation or "batch effects" are commonly observed across multiple batches of microarray experiments, often rendering the task of combining data from these batches difficult. The ability to combine microarray data sets is advantageous to researchers to increase statistical power to detect biological phenomena from studies where logistical considerations restrict sample size or in studies that require the sequential hybridization of arrays. In general, it is inappropriate to combine data sets without adjusting for batch effects. Methods have been proposed to filter batch effects from data, but these are often complicated and require large batch sizes (> 25) to implement. Because the majority of microarray studies are conducted using much smaller sample sizes, existing methods are not sufficient. We propose parametric and non-parametric empirical Bayes frameworks for adjusting data for batch effects that are robust to outliers in small sample sizes and perform comparably to existing methods for large samples. We illustrate our methods using two example data sets and show that our methods are justifiable, easy to apply, and useful in practice. Software for our method is freely available at: http://biosun1.harvard.edu/complab/batch/.
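The abstract does not spell out the model, so the following is only a minimal sketch of the general idea behind empirical Bayes shrinkage of batch effects, not the authors' software. It assumes a genes-by-samples matrix, a simple normal-normal model for per-gene batch shifts, and no biological covariates. The function name `shrink_batch_shifts` is hypothetical, and the variance (scale) adjustment that a full location/scale method would include is deliberately left out.

```python
import numpy as np


def shrink_batch_shifts(data, batches):
    """Remove per-gene batch shifts after shrinking them toward a shared prior.

    data    : array of shape (n_genes, n_samples), e.g. log expression values
    batches : length n_samples sequence of batch labels
    Assumes at least two genes and at least two samples in every batch.
    Illustrative simplification: location only, no covariates, no scale step.
    """
    data = np.asarray(data, dtype=float)
    batches = np.asarray(batches)

    # Standardize each gene so that shifts are comparable across genes.
    gene_mean = data.mean(axis=1, keepdims=True)
    gene_sd = data.std(axis=1, ddof=1, keepdims=True)
    gene_sd[gene_sd == 0] = 1.0  # constant genes: avoid division by zero
    z = (data - gene_mean) / gene_sd

    adjusted = np.empty_like(z)
    for b in np.unique(batches):
        cols = batches == b
        n = cols.sum()
        shift = z[:, cols].mean(axis=1)             # raw per-gene batch shift
        noise = z[:, cols].var(axis=1, ddof=1) / n  # sampling variance of that shift

        # "Empirical" step: estimate the prior from all genes in this batch.
        prior_mean = shift.mean()
        prior_var = shift.var(ddof=1)

        # Normal-normal posterior mean: noisy shifts are pulled toward the prior.
        weight = prior_var / np.maximum(prior_var + noise, 1e-12)
        shrunk = weight * shift + (1.0 - weight) * prior_mean

        adjusted[:, cols] = z[:, cols] - shrunk[:, None]

    # Return to the original scale of each gene.
    return adjusted * gene_sd + gene_mean
```

Pooling information across genes is what makes this kind of estimate more stable in small batches: a gene whose shift is estimated from only a few arrays borrows strength from the distribution of shifts over all genes. Because this sketch ignores biological covariates, it would also remove any real biological difference that is confounded with batch.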
Similar articles
- Alternative empirical Bayes models for adjusting for batch effects in genomic studies.
  Zhang Y, Jenkins DF, Manimaran S, Johnson WE. BMC Bioinformatics. 2018 Jul 13;19(1):262. doi: 10.1186/s12859-018-2263-6. PMID: 30001694 Free PMC article.
- Empirical Bayes accomodation of batch-effects in microarray data using identical replicate reference samples: application to RNA expression profiling of blood from Duchenne muscular dystrophy patients.
  Walker WL, Liao IH, Gilbert DL, Wong B, Pollard KS, McCulloch CE, Lit L, Sharp FR. BMC Genomics. 2008 Oct 20;9:494. doi: 10.1186/1471-2164-9-494. PMID: 18937867 Free PMC article.
- Batch effect reduction of microarray data with dependent samples using an empirical Bayes approach (BRIDGE).
  Xia Q, Thompson JA, Koestler DC. Stat Appl Genet Mol Biol. 2021 Dec 14;20(4-6):101-119. doi: 10.1515/sagmb-2021-0020. PMID: 34905304 Free PMC article.
- Intensity-based hierarchical Bayes method improves testing for differentially expressed genes in microarray experiments.
  Sartor MA, Tomlinson CR, Wesselkamper SC, Sivaganesan S, Leikauf GD, Medvedovic M. BMC Bioinformatics. 2006 Dec 19;7:538. doi: 10.1186/1471-2105-7-538. PMID: 17177995 Free PMC article.
- A Bayesian method for analysing spotted microarray data.
  Meiklejohn CD, Townsend JP. Brief Bioinform. 2005 Dec;6(4):318-30. doi: 10.1093/bib/6.4.318. PMID: 16420731 Review.
Cited by
- Distributed Harmonization: Federated Clustered Batch Effect Adjustment and Generalization.
  Hoang B, Pang Y, Liang S, Zhan L, Thompson PM, Zhou J. KDD. 2024;2024:5105-5115. doi: 10.1145/3637528.3671590. Epub 2024 Aug 24. PMID: 39493643 Free PMC article.
- Multivariate approach for longitudinal analysis of brain metabolite levels from ages 5-11 years in children with perinatal HIV infection.
  van Biljon N, Robertson F, Holmes M, Cotton MF, Laughton B, van der Kouwe A, Meintjes E, Little F. Neuroimage. 2021 Aug 15;237:118101. doi: 10.1016/j.neuroimage.2021.118101. Epub 2021 May 4. PMID: 33961998 Free PMC article.
- Genome-Wide Gene Expression Analysis Shows AKAP13-Mediated PKD1 Signaling Regulates the Transcriptional Response to Cardiac Hypertrophy.
  Johnson KR, Nicodemus-Johnson J, Spindler MJ, Carnegie GK. PLoS One. 2015 Jul 20;10(7):e0132474. doi: 10.1371/journal.pone.0132474. eCollection 2015. PMID: 26192751 Free PMC article.
- Long non-coding RNA exploration for mesenchymal stem cell characterisation.
  Riquier S, Mathieu M, Bessiere C, Boureux A, Ruffle F, Lemaitre JM, Djouad F, Gilbert N, Commes T. BMC Genomics. 2021 Jun 4;22(1):412. doi: 10.1186/s12864-020-07289-0. PMID: 34088266 Free PMC article.
- Metabolic Profiles of Obesity in American Indians: The Strong Heart Family Study.
  Zhao Q, Zhu Y, Best LG, Umans JG, Uppal K, Tran VT, Jones DP, Lee ET, Howard BV, Zhao J. PLoS One. 2016 Jul 19;11(7):e0159548. doi: 10.1371/journal.pone.0159548. eCollection 2016. PMID: 27434237 Free PMC article.