Most “Dark Matter” Transcripts Are Associated With Known Genes

Figure 2

RNA-Seq read mapping overview.

(A) Proportion of reads with a unique match in the genome mapping to known genes, mRNAs, and spliced ESTs. Reads were pooled across all human or mouse RNA-Seq samples and sequentially matched against a non-redundant set of known genes, mRNAs, and spliced ESTs. Any remaining reads were classified as “other.” (B) Same as in (A) but considering the total amount of transcribed genomic area, rather than read count. (C) The relationship between the RNA-Seq read depth and the transcribed area in the genome for human brain RNA-Seq reads, based on 50.2 million reads pooled from the three independent samples that were assayed separately. The total transcribed area is indicated for all reads, as well as for those that map to known exons, known introns, and intergenic regions. (D) Extrapolation of transcribed genomic area at increasing read depths, based on the distribution of all reads in (C). The model fitted on the uniquely mapped reads is shown in the inset. (E, F) Cumulative fraction of seqfrags as a function of the number of reads mapped to each seqfrag in the combined human (E) and mouse (F) samples.

doi: https://doi.org/10.1371/journal.pbio.1000371.g002
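
The sequential matching described for panel (A) can be illustrated with a minimal sketch. It assumes each read is reduced to a single genomic position and each annotation tier to a list of half-open intervals; the names `TIERS`, `overlaps`, and `classify_reads`, and the toy coordinates, are hypothetical and are not taken from the paper's pipeline.

```python
from collections import Counter

# Hypothetical annotation tiers, checked in priority order as in panel (A).
TIERS = ("known_gene", "mRNA", "spliced_EST")


def overlaps(pos, intervals):
    """Return True if a position falls inside any half-open (start, end) interval."""
    return any(start <= pos < end for start, end in intervals)


def classify_reads(read_positions, annotations):
    """Assign each read to the first tier it matches; unmatched reads become 'other'.

    Returns the proportion of reads per category.
    """
    counts = Counter()
    for pos in read_positions:
        label = next(
            (tier for tier in TIERS if overlaps(pos, annotations.get(tier, []))),
            "other",
        )
        counts[label] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: counts[label] / total for label in (*TIERS, "other")}


# Toy data (assumed for illustration only).
annotations = {
    "known_gene": [(100, 200)],
    "mRNA": [(150, 300)],
    "spliced_EST": [(400, 450)],
}
reads = [120, 160, 250, 410, 900]
proportions = classify_reads(reads, annotations)
print(proportions)
```

Because the tiers are checked in order, a read that overlaps both a known gene and an mRNA (position 160 in the toy data) is counted once, under the known gene. Panel (B) would follow the same ordering but sum covered genomic bases rather than count reads.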