Bayesian integrated modeling of expression data: a case study on RhoG

Rashi Gupta et al. BMC Bioinformatics. 2010.

Abstract

Background: DNA microarrays provide an efficient method for measuring the activity of many genes in parallel, and a single array can even cover all the known transcripts of an organism. This advantage must be balanced against the fact that analyzing microarray data involves several consecutive steps, each of which is a potential source of error. Because the steps are sequential, errors tend to accumulate when moving from lower-level to higher-level analyses. Eliminating such errors does not seem feasible without completely changing the technology, but one should nevertheless aim to assess realistically the degree of uncertainty involved when drawing final conclusions from such analyses.

Results: We present a Bayesian hierarchical model for finding differentially expressed genes between two experimental conditions. It is an integrated statistical approach in which the correction of signal saturation, systematic array effects, and dye effects, and the detection of differentially expressed genes, are all modeled jointly. The integration allows all these components, and the associated errors, to be considered simultaneously. Inference is based on the full posterior distribution of the gene expression indices and of quantities derived from them, rather than on point estimates. The model was applied to and tested on two different datasets.

Conclusions: The method integrates the various steps of microarray analysis into a single joint analysis, and thereby enables extracting information on differential expression in a manner that properly accounts for the various sources of potential error in the process.
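The idea of basing inference on the full posterior rather than on point estimates can be illustrated with a minimal sketch. It assumes that posterior draws of the two expression indices of a gene are already available (for example from an MCMC sampler) and that the tail probabilities are defined as P(D > 0 | data) and P(D < 0 | data) for the difference D = T1 - T2. That definition, the function names `summarize_difference` and `call_gene`, and the synthetic draws are assumptions made for illustration and are not taken from the paper. The 0.99 cutoff is the one used in the caption of Figure 2.

```python
import random
from statistics import fmean


def summarize_difference(t1_draws, t2_draws):
    """Summarize D = T1 - T2 from paired posterior draws for one gene.

    Assumption: p_plus and p_minus are the posterior probabilities that
    D is positive or negative, estimated as fractions of the draws.
    """
    if len(t1_draws) != len(t2_draws) or not t1_draws:
        raise ValueError("need equally many, non-empty draws for both conditions")
    d = [a - b for a, b in zip(t1_draws, t2_draws)]
    n = len(d)
    return {
        "mean_d": fmean(d),                        # posterior mean of D
        "p_plus": sum(x > 0 for x in d) / n,       # estimate of P(D > 0 | data)
        "p_minus": sum(x < 0 for x in d) / n,      # estimate of P(D < 0 | data)
    }


def call_gene(summary, threshold=0.99):
    """Label a gene using a tail-probability cutoff (hypothetical helper)."""
    if summary["p_plus"] >= threshold:
        return "up"
    if summary["p_minus"] >= threshold:
        return "down"
    return "not called"


# Synthetic draws standing in for sampler output; not real data.
rng = random.Random(0)
t1 = [rng.gauss(9.0, 0.2) for _ in range(4000)]
t2 = [rng.gauss(8.0, 0.2) for _ in range(4000)]
summary = summarize_difference(t1, t2)
print(summary, call_gene(summary))
```

Under this cutoff, a gene with a tail probability of 0.91, the value reported for RhoG in Figure 2, would not be called, which shows how the posterior probability carries more information than a yes/no label.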

Figures

Figure 1

Plot of the posterior distribution of $D_i = T_{i1} - T_{i2}$ for three genes. In the upper panel, posterior distributions of the difference $D_i = T_{i1} - T_{i2}$ are shown for three genes of dataset-1: a non-differentially expressed gene (left), an up-regulated gene (center), and a down-regulated gene (right). In the lower panel, the corresponding posterior distributions are shown for the latent variable $T_{i1}$ corresponding to the experimental condition (solid line), and for $T_{i2}$ corresponding to the control (dotted line).

Figure 2

Plot of point estimates (posterior means) of the log-fold change $D_i$ against the overall expression $(T_{i1} + T_{i2})/2$ for dataset-1. Genes with $p_i^{+} \geq 0.99$ are plotted with diamonds and those with $p_i^{-} \geq 0.99$ are plotted with triangles. The gene RhoG, with $p_i^{+} = 0.91$, is plotted with a red circle.

Figure 3

A pictorial representation of the relation of nine genes co-cited with RhoG. The blue boxes (nodes) represent the genes. The "black" edges indicate co-citation of two genes in the PubMed database; the "green" edges indicate a possible regulatory role of JUN and NFKB1 on the expression of RhoG.

Figure 4

Histogram of point estimates (median of the posterior distribution) of $D_i$ for GAPDH. These point estimates are for the 56 replicates (on the same array) of a housekeeping gene (GAPDH) of dataset-2.

Figure 5

Histogram of point estimates (median of the posterior distribution) of $\beta_i$ for GAPDH. These point estimates are for the 56 replicates (on the same array) of a housekeeping gene (GAPDH) of dataset-2.
