Clustering analysis of SAGE data using a Poisson approach

Comparative Study

Li Cai et al. Genome Biol. 2004.

Abstract

Serial analysis of gene expression (SAGE) data have been poorly exploited by clustering analysis, owing to the lack of statistical methods that take their specific properties into account. We modeled SAGE data with Poisson statistics and developed two Poisson-based distances. Their application to simulated data and to experimental mouse retina data shows that the Poisson-based distances are more appropriate and reliable for analyzing SAGE data than other commonly used distance or similarity measures, such as Pearson correlation or Euclidean distance.
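
The abstract does not give the formulas for the two Poisson-based distances, so the following is only a minimal sketch of the general idea, not the paper's definition. It assumes a tag's counts are Poisson-distributed around a cluster profile scaled by the tag's total count, and it measures the deviation with a chi-square-style statistic. The names `poisson_chi2_distance`, `profile`, `low_tag`, and `high_tag` are hypothetical and chosen for illustration.

```python
import math


def poisson_chi2_distance(counts, profile):
    """Chi-square-style deviation of observed counts from Poisson expectations.

    counts  -- observed tag counts, one per library
    profile -- hypothetical cluster profile: expected fraction of the tag's
               total count in each library (assumed to sum to 1)
    """
    total = sum(counts)
    dist = 0.0
    for y, p in zip(counts, profile):
        expected = p * total  # Poisson mean under the assumed model
        if expected > 0:
            dist += (y - expected) ** 2 / expected
        elif y > 0:
            # A nonzero count where the model expects none cannot be explained.
            return math.inf
    return dist


def euclidean_distance(a, b):
    """Plain Euclidean distance between two raw count vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Two made-up tags with the same expression pattern but different abundance.
profile = [0.1, 0.2, 0.3, 0.4]
low_tag = [1, 2, 3, 4]
high_tag = [100, 200, 300, 400]

low_dev = poisson_chi2_distance(low_tag, profile)
high_dev = poisson_chi2_distance(high_tag, profile)
raw_gap = euclidean_distance(low_tag, high_tag)
```

Both made-up tags follow the profile exactly, so their deviation from it is zero up to floating-point rounding, while the Euclidean distance between their raw counts is large purely because of the difference in abundance.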

Figures

Figure 1

Graphs of clustering results for the simulation data. The _x_-axis represents the time points; the _y_-axis represents the expression level scaled as a percentage. Data were normalized before plotting: for each tag, the count vector is rescaled so that its elements sum to 1. For example, b4 = (109, 306, 296, 620, 93) is rescaled to b4' = b4/θ, where θ = 109 + 306 + 296 + 620 + 93 = 1424.
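
The rescaling described in the caption can be sketched as follows. This is an illustrative snippet, not code from the paper; the function name `rescale_counts` is an assumption, and the count vector is the b4 example from the caption.

```python
def rescale_counts(counts):
    """Rescale a tag's count vector so that its elements sum to 1."""
    theta = sum(counts)  # total count of the tag across all time points
    if theta == 0:
        raise ValueError("count vector sums to zero; cannot rescale")
    return [c / theta for c in counts]


# Example count vector from the Figure 1 caption.
b4 = [109, 306, 296, 620, 93]
theta = sum(b4)
b4_scaled = rescale_counts(b4)

# Multiply by 100 to express each element as a percentage for plotting.
b4_percent = [100 * v for v in b4_scaled]
```

After rescaling, tags with very different total counts can be drawn on the same axis, because only the shape of the profile across time points remains.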

Figure 2

Graphs of clustering results for the mouse retinal SAGE data. The _x_-axis represents the time points of the developing mouse retina SAGE libraries; the _y_-axis represents the relative frequency of each tag scaled as a percentage. Data were normalized before plotting: the counts of each tag across the 10 libraries were rescaled so that they sum to 1. Different colors represent different tags. See Additional data file 1 for more details.
