ReCount: a multi-experiment resource of analysis-ready RNA-seq gene count datasets
Alyssa C Frazee et al. BMC Bioinformatics. 2011.
Abstract
Background: RNA sequencing is a flexible and powerful new approach for measuring gene, exon, or isoform expression. To maximize the utility of RNA sequencing data, new statistical methods are needed for clustering, differential expression, and other analyses. A major barrier to the development of new statistical methods is the lack of RNA sequencing datasets that can be easily obtained and analyzed in common statistical software packages such as R. To speed up the development process, we have created a resource of analysis-ready RNA-sequencing datasets.
Description: ReCount is an online resource of RNA-seq gene count tables and auxiliary data. Tables were built from raw RNA sequencing data from 18 different published studies comprising 475 samples and over 8 billion reads. Using the Myrna package, reads were aligned, overlapped with gene models, and tabulated into gene-by-sample count tables that are ready for statistical analysis. Count tables and phenotype data were combined into Bioconductor ExpressionSet objects for ease of analysis. ReCount also contains the Myrna manifest files and R source code used to process the samples, allowing statistical and computational scientists to consider alternative parameter values.
Conclusions: By combining datasets from many studies and providing data that have already been processed from .fastq format into ready-to-use .RData and .txt files, ReCount facilitates analysis and methods development for RNA-seq count data. We anticipate that ReCount will also be useful for investigators who wish to consider cross-study comparisons and alternative normalization strategies for RNA-seq.
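The tabulation step mentioned in the abstract (reads overlapped with gene models, then counted per gene and per sample) can be illustrated with a minimal Python sketch. This is not the Myrna implementation: the function name `build_count_table` and the input format of (sample, gene) pairs are assumptions made purely for illustration.

```python
from collections import Counter


def build_count_table(assignments):
    """Tabulate read assignments into a gene-by-sample count table.

    `assignments` is an iterable of (sample_id, gene_id) pairs, one per
    aligned read that overlapped a gene model (hypothetical input format).
    Returns (genes, samples, rows), where rows[i][j] is the number of
    reads assigned to genes[i] in samples[j].
    """
    counts = Counter(assignments)
    samples = sorted({sample for sample, _ in counts})
    genes = sorted({gene for _, gene in counts})
    # A Counter returns 0 for pairs that were never observed.
    rows = [[counts[(sample, gene)] for sample in samples] for gene in genes]
    return genes, samples, rows


# Toy input: five reads across two samples and three genes.
reads = [
    ("S1", "geneA"), ("S1", "geneA"), ("S1", "geneB"),
    ("S2", "geneA"), ("S2", "geneC"),
]
genes, samples, rows = build_count_table(reads)
```

Each row of the resulting table corresponds to a gene and each column to a sample, which is the layout the abstract describes as ready for statistical analysis.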
Figures
Figure 1
Histogram of adjusted p-values from differential expression analysis on the 29 samples included in both Cheung and Montgomery. The p-values in the histogram are from paired t-tests on the 25% of genes with nonzero counts in at least one of the two studies. The peak near zero is somewhat indicative of technical variability between the two studies.
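As a rough illustration of the per-gene analysis behind this figure, the sketch below filters genes and computes a paired t-statistic in plain Python. The helper names are hypothetical, and reading the filter as "at least one nonzero count across the two studies" is an assumption about the caption, not the authors' code.

```python
import math
import statistics


def expressed_in_either(counts_a, counts_b):
    """Keep a gene if it has a nonzero count in at least one study."""
    return any(c > 0 for c in counts_a) or any(c > 0 for c in counts_b)


def paired_t_statistic(x, y):
    """Paired t-statistic for matched measurements x[i], y[i].

    Returns None when all paired differences are identical, because the
    statistic is undefined when their standard deviation is zero.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples with >= 2 pairs")
    diffs = [a - b for a, b in zip(x, y)]
    sd = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd == 0:
        return None
    return statistics.mean(diffs) / (sd / math.sqrt(len(diffs)))


# Toy counts for one gene measured on the same four samples in two studies.
study_a = [5, 7, 9, 11]
study_b = [4, 5, 8, 8]
keep = expressed_in_either(study_a, study_b)
t = paired_t_statistic(study_a, study_b)
```

Converting the statistic to a p-value requires a t-distribution with n - 1 degrees of freedom; in practice a library routine such as `scipy.stats.ttest_rel` handles both steps.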
Figure 2
Histogram of adjusted p-values from analysis of differential expression between YRI and CEU populations. The p-values in the histogram are from two-sample t-tests on the 25% of genes with nonzero counts in at least one of the two studies. The peak near zero indicates differential gene expression that may result from either technical or biological variability.
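The captions refer to adjusted p-values without naming the adjustment method. One common choice is the Benjamini-Hochberg procedure; the sketch below assumes that method purely for illustration, and the function name `bh_adjust` is hypothetical.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, in the original input order.

    Assumption: the source does not state which adjustment was used;
    BH is shown here only as a widely used example.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is the
    # minimum of p * m / rank over all ranks at or above the current one.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


raw = [0.01, 0.04, 0.03, 0.5]
adj = bh_adjust(raw)
```

A histogram of such adjusted values with a peak near zero, as in the figures, would indicate many genes with small adjusted p-values.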
Similar articles
- Cloud-scale RNA-sequencing differential expression analysis with Myrna.
Langmead B, Hansen KD, Leek JT. Genome Biol. 2010;11(8):R83. doi: 10.1186/gb-2010-11-8-r83. Epub 2010 Aug 11. PMID: 20701754 Free PMC article.
- JingleBells: A Repository of Immune-Related Single-Cell RNA-Sequencing Datasets.
Ner-Gaon H, Melchior A, Golan N, Ben-Haim Y, Shay T. J Immunol. 2017 May 1;198(9):3375-3379. doi: 10.4049/jimmunol.1700272. PMID: 28416714
- Polyester: simulating RNA-seq datasets with differential transcript expression.
Frazee AC, Jaffe AE, Langmead B, Leek JT. Bioinformatics. 2015 Sep 1;31(17):2778-84. doi: 10.1093/bioinformatics/btv272. Epub 2015 Apr 28. PMID: 25926345 Free PMC article.
- Differential Expression Analysis of RNA-seq Reads: Overview, Taxonomy, and Tools.
Chowdhury HA, Bhattacharyya DK, Kalita JK. IEEE/ACM Trans Comput Biol Bioinform. 2020 Mar-Apr;17(2):566-586. doi: 10.1109/TCBB.2018.2873010. Epub 2018 Oct 1. PMID: 30281477 Review.
- Normalization for Single-Cell RNA-Seq Data Analysis.
Bacher R. Methods Mol Biol. 2019;1935:11-23. doi: 10.1007/978-1-4939-9057-3_2. PMID: 30758817 Review.
Cited by
- An evaluation of RNA-seq differential analysis methods.
Li D, Zand MS, Dye TD, Goniewicz ML, Rahman I, Xie Z. PLoS One. 2022 Sep 16;17(9):e0264246. doi: 10.1371/journal.pone.0264246. eCollection 2022. PMID: 36112652 Free PMC article.
- Sparse sliced inverse regression for high dimensional data analysis.
Hilafu H, Safo SE. BMC Bioinformatics. 2022 May 7;23(1):168. doi: 10.1186/s12859-022-04700-3. PMID: 35525975 Free PMC article.
- Addressing the mean-correlation relationship in co-expression analysis.
Wang Y, Hicks SC, Hansen KD. PLoS Comput Biol. 2022 Mar 30;18(3):e1009954. doi: 10.1371/journal.pcbi.1009954. eCollection 2022 Mar. PMID: 35353807 Free PMC article.
- Comprehensive analysis of an immune infiltrate-related competitive endogenous RNA network reveals potential prognostic biomarkers for non-small cell lung cancer.
Yang CZ, Hu LH, Huang ZY, Deng L, Guo W, Liu S, Xiao X, Yang HX, Lin JT, Sun LL, Lin LZ. PLoS One. 2021 Dec 2;16(12):e0260720. doi: 10.1371/journal.pone.0260720. eCollection 2021. PMID: 34855841 Free PMC article.
- recount3: summaries and queries for large-scale RNA-seq expression and splicing.
Wilks C, Zheng SC, Chen FY, Charles R, Solomon B, Ling JP, Imada EL, Zhang D, Joseph L, Leek JT, Jaffe AE, Nellore A, Collado-Torres L, Hansen KD, Langmead B. Genome Biol. 2021 Nov 29;22(1):323. doi: 10.1186/s13059-021-02533-6. PMID: 34844637 Free PMC article.
Grants and funding
- R01-HG005220/HG/NHGRI NIH HHS/United States
- R01 HG005220-03/HG/NHGRI NIH HHS/United States
- P41-HG004059/HG/NHGRI NIH HHS/United States
- T32GM074906/GM/NIGMS NIH HHS/United States
- R01 HG005220/HG/NHGRI NIH HHS/United States