Clustering analysis of SAGE data using a Poisson approach
Comparative Study
Li Cai et al. Genome Biol. 2004.
Abstract
Serial analysis of gene expression (SAGE) data have been poorly exploited by clustering analysis owing to the lack of appropriate statistical methods that consider their specific properties. We modeled SAGE data by Poisson statistics and developed two Poisson-based distances. Their application to simulated and experimental mouse retina data shows that the Poisson-based distances are more appropriate and reliable for analyzing SAGE data than other commonly used distances or similarity measures, such as Pearson correlation or Euclidean distance.
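The abstract does not give the formulas for the two distances, so the following is only an illustrative sketch of how a Poisson-motivated measure can behave differently from Euclidean distance on count data. It assumes a tag's count in each library is Poisson with mean equal to the tag's total count times a cluster profile, and scores the tag with a chi-square-style deviation from those expected counts. The function names, the profile, and the example counts are hypothetical and are not taken from the paper.

```python
import math

def poisson_chi2_deviation(counts, profile):
    """Chi-square-style deviation of observed counts from the counts
    expected under a Poisson model (assumed form, not the paper's).

    counts:  observed tag counts, one per library
    profile: cluster expression profile, positive values summing to 1
    """
    total = sum(counts)
    deviation = 0.0
    for observed, share in zip(counts, profile):
        expected = total * share  # Poisson mean for this library
        deviation += (observed - expected) ** 2 / expected
    return deviation

def euclidean(a, b):
    """Plain Euclidean distance between two raw count vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical cluster profile and two tags with the same expression
# shape but a 100-fold difference in abundance.
profile = [0.1, 0.2, 0.2, 0.4, 0.1]
low_tag = [1, 2, 2, 4, 1]
high_tag = [100, 200, 200, 400, 100]

dev_low = poisson_chi2_deviation(low_tag, profile)
dev_high = poisson_chi2_deviation(high_tag, profile)
raw_distance = euclidean(low_tag, high_tag)
```

In this sketch both tags match the profile, so their deviations are essentially zero, while the Euclidean distance between their raw counts is large purely because of the abundance difference.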
Figures
Figure 1
Graphs of clustering results for simulation data. The _x_-axis represents the different time points; the _y_-axis represents the expression level scaled as a percentage. Data were normalized before plotting: for each tag, the count vector is rescaled so that its elements sum to 1. For example, b4 = (109,306,296,620,93) is rescaled to b4' = b4/θ, where θ = 109 + 306 + 296 + 620 + 93 = 1424.
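The rescaling described in this caption can be sketched in a few lines. This is a minimal illustration using the b4 vector from the caption; the function name `normalize_counts` is hypothetical.

```python
def normalize_counts(counts):
    """Rescale a tag's count vector so that its elements sum to 1."""
    total = sum(counts)
    if total == 0:
        raise ValueError("cannot normalize a tag with zero total count")
    return [c / total for c in counts]

b4 = [109, 306, 296, 620, 93]
theta = sum(b4)                 # 109 + 306 + 296 + 620 + 93
b4_scaled = normalize_counts(b4)

# Multiply by 100 to express each element as a percentage for plotting.
b4_percent = [100 * v for v in b4_scaled]
```

Dividing by the tag's own total removes differences in overall abundance, so the plotted curves compare the shape of expression across time points.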
Figure 2
Graphs of clustering results for mouse retinal SAGE data. The _x_-axis represents the time points of the developing mouse retina SAGE libraries; the _y_-axis represents the relative frequency for each tag scaled as a percentage. Data were normalized before plotting: for each tag, the counts across the 10 libraries were rescaled so that they sum to 1. Different colors represent different tags. See Additional data file 1 for more details.
Grants and funding
- R01 HG02518-01/HG/NHGRI NIH HHS/United States
- P20 CA096470/CA/NCI NIH HHS/United States
- EY08064/EY/NEI NIH HHS/United States
- P20 CA96470/CA/NCI NIH HHS/United States
- R01 HG002518/HG/NHGRI NIH HHS/United States
- R01 EY008064/EY/NEI NIH HHS/United States