A robust neural networks approach for spatial and intensity-dependent normalization of cDNA microarray data (original) (raw)

Abstract

Motivation: Microarray experiments are affected by numerous sources of non-biological variation that contribute systematic bias to the resulting data. In a dual-label (two-color) cDNA or long-oligonucleotide microarray, these systematic biases are often manifested as an imbalance of measured fluorescent intensities corresponding to Sample A versus those corresponding to Sample B. Systematic biases also affect between-slide comparisons. Making effective corrections for these systematic biases is a requisite for detecting the underlying biological variation between samples. Effective data normalization is therefore an essential step in the confident identification of biologically relevant differences in gene expression profiles. Several normalization methods for the correction of systemic bias have been described. While many of these methods have addressed intensity-dependent bias, few have addressed both intensity-dependent and spatiality-dependent bias.

Results: We present a neural network-based normalization method for correcting the intensity- and spatiality-dependent bias in cDNA microarray datasets. In this normalization method, the dependence of the log-intensity ratio (M) on the average log-intensity (A) as well as on the spatial coordinates (X,Y) of spots is approximated with a feed-forward neural network function. Resistance to outliers is provided by assigning weights to each spot based on how distant their M values is from the median over the spots whose A values are similar, as well as by using pseudospatial coordinates instead of spot row and column indices. A comparison of the robust neural network method with other published methods demonstrates its potential in reducing both intensity-dependent bias and spatial-dependent bias, which translates to more reliable identification of truly regulated genes.

Availability: The normalization method described in this paper is available as the library nnNorm in the BioConductor project (http://www.bioconductor.org). Scripts used to load the freely available data and generate some of the figures in this paper are available in the documentation accompanying this library.

Contact: ltarca@rsvs.ulaval.ca

1 INTRODUCTION

Microarray technology has provided researchers with an unprecedented tool to assess the expression of thousands of genes through simultaneous measurements of the relative abundance of mRNA species corresponding to these genes (Spellman et al., 1998; Golub et al., 1999). Analyses of microarray data can be used to make inferences about gene expression on a genomic scale, such as for identifying differentially expressed genes in two or more tissues, discovering co-regulated classes of genes or distinguishing diagnostic gene expression profiles for a given developmental or disease state (Simon and Dobbin, 2003).

In the case of two-color cDNA and long-oligonucleotide microarrays, the most common sample preparation techniques involve reverse transcription of RNA into cDNA, incorporating modified nucleotides coupled to fluorescent dyes that permit the detection of the cDNA. Two samples, each labeled with a different dye, are hybridized simultaneously to a single array, with the differential labeling permitting independent detection of the fluorescent signal associated with each labeled sample for each element (gene) on the array. Cy5 and Cy3 are frequently used dyes which emit in the red and green range of the spectrum respectively, and thus the two dyes are often denoted as R and G. The resulting fluorescent intensities R and G for each element are assumed to be proportional to the abundance of the corresponding mRNA in the two original samples. Values are frequently expressed as ratios (R/G) to infer the relative abundance of a given mRNA species in one sample versus the other (Quackenbush, 2002). The data are often log-transformed for subsequent analyses, in order to stabilize variances and to convert multiplicative error into additive error (Cui et al., 2003http://www.bepress.com/sagmb/vol2/iss1/art4).

In theory, R/G should be close to 1 (and correspondingly log2_R_/G ≅ 0) for a gene that is expressed at equal levels in the two samples represented on the array. In practice, however, this is seldom the case due to non-biological, or technical, error that is introduced during the numerous steps of a microarray experiment, which are described in Yang et al. (2002b) and Chen et al. (2004). Systematic bias that is introduced during a microarray experiment often manifests itself as an overall imbalance in the fluorescent intensities of one sample versus another, referred to as dye-bias. Owing to the vagaries of microarray manufacture and hybridization, there are also other factors that contribute systematic error in a spatially dependent fashion (Smyth and Speed, 2003).

Sample-dependent bias, as well as spatiality-dependent bias, can be well illustrated with minus–add (MA) plots, whereby the log-ratios (M = log2_R_/G) are plotted versus the average of log-intensities [(A = (1/2)0.5log2_R_ · G)]. These plots, also referred to as ratio-to-intensity (RI) plots (Cui et al., 2003), are a useful diagnostic tool for assessing systematic bias in microarray datasets.

Systematic error is likely to increase the incidence of both false positive and false negative results during data analysis (Nadon and Shoemaker, 2002). While it is best to minimize the introduction of systematic bias during the microarray experiment, the effects of systematic bias on the data can also be dealt with on a post hoc basis through data normalization procedures (Quackenbush, 2002). For two-color cDNA arrays, these normalization procedures are applied both within an array and between arrays prior to data analysis. The commonly used normalization methods, which are summarized in Quackenbush (2002), are based on the assumption that a given set of treatments will only affect the expression of a small subset of the genes in a genome. In other words, the expression of the majority of genes is expected to remain the same across treatments, and thus for large scale arrays, log2_R_/G ≅ 0 for the majority of elements on the array. In the case of smaller-scale arrays where this assumption may not be valid, normalization is often carried out using some subset of genes for which there is a priori evidence that they do not respond to the treatments under study, or using a set of exogenous ‘spike-in’ controls.

While the removal of intensity dependent bias has been addressed with both linear and non-linear regression techniques [see Park et al. (2003) for a classification of these techniques], there are far fewer published techniques designed to remove both intensity- and spatiality-dependent bias. Yang et al. (2002b) developed a framework called ‘composite normalization’ in which the bias estimates to be subtracted from the raw log-ratios are a weighted summation of estimates obtained by loess fit of M(A), with the one obtained by loess fit of M on spots row and column indices. In contrast, Wilson et al. (2003) proposed a two-step procedure in which intensity-based normalization is applied as a first step and then a median filter is applied that further polishes the residuals to remove the spatial trends. Another straightforward method to obtain spatial normalization is to perform a local group background correction instead of classic local background correction prior to the intensity normalization. This approach is implemented in some commercial software packages such as GeneSight (Biodiscovery, www.biodiscovery.com).

Faced with a lack of studies addressing how to deal best with both spatial and intensity biases, one of our aims in this paper was to compare several methods on real and artificially corrupted data. We also present an alternative normalization method in which surrogates of the within-print tip spatial coordinates of the spots as well as the intensity level are fed into a neural network model as regressors. The performance of the proposed robust neural network normalization method is benchmarked with respect to existing methods using four different cDNA microarray datasets that permit us to examine and compare different attributes of the normalization methods: the publicly available Apolipoprotein AI (Callow et al., 2000) and swirl zebra fish (Wuennenberg-Stapleton and Ngai, unpublished) datasets, a dataset generated in our own laboratory comparing Populus trichocarpa (Torr. and Gray) × deltoides (Bart. ex. Marsh.) trees treated with low- versus high-nitrogen fertilizer, and an artificially biased version of the Apolipoprotein AI dataset. The comparison of normalization methods is based on their ability to reduce the within-slide variability and among-replicate variability (generally easy to show) without loss of signal (generally difficult to prove).

According to the criteria that we have used to benchmark these normalization techniques, the neural network approach devised in this work is shown to handle both types of biases at the same time and compares favorably with existing methods with a reasonable CPU time. In comparison to the existing methods, the proposed normalization procedure based on neural networks offers a compact, yet robust bias model, i.e. an equation relating the amount of bias estimated for the observed log-ratio of a particular spot to several input variables such as the spot coordinates and its average log-intensity.

2 SYSTEMS AND METHODS

2.1 Intensity and spatial normalization of cDNA microarrays: current methods

We will consider four published normalization methods for comparison to the proposed neural network normalization method. These are summarized in Table 1. In general these normalization methods correct the raw log-ratio of each spot, M, by subtracting from it an estimate of the bias,

\(\widehat{M}\)

⁠:

\[{M}^{*}=M-\widehat{M}\]

(1)

The simplest normalization method is known as global median normalization, which is based on the assumption that the overall number of RNA molecules in the samples to be compared are the same, and therefore the overall intensities of the arrays to be compared should also be the same (Quackenbush, 2002). In this case,

\(\widehat{M}\)

is designated a constant value,

\(\widehat{M}=c\)

⁠, where c is usually taken as the median of M values in the slide. Global median normalization is referred to in this work as gMed.

It has been shown that bias generally has a significant dependence on average log-intensity values (A) as well as on the location of the spots in the array (Yang et al., 2002b; Dudoit et al., 2002). Therefore, this group has proposed a method that accounts for the bias dependence on intensity level and to some extent on the location of the spots in the array. Their popular method, referred to as print tip loess (designated here as pLo), computes the bias estimate as:

\[\widehat{M}={c}_{i}\left(A\right)\]

(2)

i.e. it considers that the bias is a function of A, but this function is different from one print tip group to another. Not only does print tip loess allow for physical differences between the actual tips of the printer head but it also acts as a surrogate for spatial variation across the slide (Smyth et al., 2003).

Methods that account for spatial variation within the print tips may easily be constructed using the composite normalization technique implemented in BioConductor (http://www.bioconductor.org) as a framework (Yang et al., 2002b). Here, a finer spatial dependence can be accounted for at the same time as the average log-intensity by:

\[\widehat{M}=\alpha \cdot {c}_{i}\left(A\right)+\beta \cdot {c}_{i}(\hbox{ SpotRow },\hbox{ SpotCol })\]

(3)

where c i(SpotRow, SpotCol) is the loess estimate of M using as predictors the spot row and spot column coordinates inside the _i_th print tip. The weighting coefficients α and β for both estimates were assigned equal values of 0.5. This third method will be referred to as cPLo2D, being a composite between print tip loess and a pure 2D normalization.

Wilson et al. (2003) suggest a modification of the Yang et al. (2002b) loess normalization method to deal with the intensity and spatial bias. A loess curve c(A) is first computed for the whole slide and subtracted from the raw M values. Then, a median filter is applied on the residuals to estimate the spatial trend. The median filter simply subtracts from each residual the median of residuals over its spatial neighborhood (a 3 × 3 block of spots with the current spot in the center). This method will be referred further to as gLoMedF, as it is a combination of global loess normalization followed by a spatial median filter.

The last method we consider here for spatial bias reduction is that which is implemented in the GeneSight software package (www.biodiscovery.com), and will be referred to as pLoGS. Instead of the classical local background correction of Red and Green channel intensities, which is applied prior to all the normalization methods described above, with pLoGS we apply a ‘local group median’ background correction followed by a print tip loess normalization. With the local group median background correction, the median of background values over a 3 × 3 square region (with the current spot in the center of this square) is computed and subtracted from the respective foreground intensity. Further a print tip loess normalization is applied to remove the intensity bias as presented for the pLo method. The results we present with this method are obtained with GeneSight 4.1 for the background correction step, but the print tip normalization was carried out in exactly the same manner as pLo, i.e. using the marray package of the BioConductor project (http://www.bioconductor.org).

Now that some of the existing methods for both spatial and intensity normalization have been introduced, we will move on to describe the new method based on neural networks.

2.2 Neural networks-based spatial and intensity normalization

In this section we introduce a new normalization method based on robust neural network models that uses A as well as spatial location of spots as predictors. This method will be referred to as print tip robust neural networks 2D and A, abbreviated as pNN2DA. The objective is to find the best fit of M values within a print tip group using as predictor variables the average log-intensity, A, as well as the two-dimensional space coordinates of the spots, X and Y. Instead of using the spot column and row index as in the composite normalization method (cPLo2D) described earlier, we divide the print tip group (subarray) area into square blocks called bins, usually containing nine spots (3 × 3). All the spots within a bin will have the same X and Y coordinates. The motivation for not using the spot row and column indices as spatial variables is to provide robustness of the estimate, as we do not want to model the spatial variation of log-ratios at the spot level which, indeed, are expected due to biological variation between the two samples. Using the binning procedure as a surrogate for the spatial location accounts for the spatial bias at a lower resolution than at the spot level.

Now the bias

\(\widehat{M}\)

is estimated as a function of these three variables, i.e.

\(\widehat{M}={c}_{i}(A,X,Y)\)

where c for every print tip i in a slide is a trained neural network function. The most common artificial neural network type used in function approximation is the multi-layered feed-forward neural network, also known as the multi-layer perceptron (MLP) (Rumelhart et al., 1986). These ‘black-box’ modeling tools have gained popularity in engineering and computer science because of their appealing ‘learning ability’ and their versatility and performance with respect to other classical statistical approaches. Without supposing a particular equational form, MLPs are able to mimic complex non-linear relationships between a multiple feature input vector and a dependent (output) variable using a set of training samples (known as input–output pairs). The parameterized neural network fitting function approximating the bias based on A, X and Y would read

\[f(\hbox{ x },\hbox{ w })={\sigma }^{\left(2\right)}\left({\displaystyle \sum _{j=1}^{J+1}}\left({w}_{j}\cdot {\sigma }_{j}^{\left(1\right)}\left({\displaystyle \sum _{i=1}^{I+1}}({x}_{i}\cdot {w}_{i,j})\right)\right)\right)\]

(4)

The symbols of Equation (4) are defined as follows: x is a vector having as its components the I = 3 features X, Y and A plus a constant value of 1 (accounting for the first layer bias), i.e. x = [X, Y, A, 1]; w are the fitting parameters called weights: w i,j(i = 1,…, I + 1; j = 1,…,J) and w j (j = 1,…,J + 1);

\({\sigma }_{j}^{\left(1\right)}(j=1,\dots ,J)\)

represent the hidden neurons and basically are sigmoid functions; σ(2) is the single neuron in the output layer, and again is a sigmoid function;

\({\sigma }_{J+1}^{\left(1\right)}=1\)

accounts for the second layer bias; J represents the number of neurons in the hidden layer of the network.

The sigmoid function is defined as:

\[\sigma \left(z\right)=\frac{1}{1+{e}^{-z}}\]

(5)

Such a neural network function with sigmoid transfer functions in the hidden layer is capable of universal function approximation if sufficient hidden units (neurons) are available (Hornik, 1990). Once a suitable value for J is selected, training of the neural network resumes, determining the best set of parameters, w, in such a way that the estimates produced by the neural network,

\(f(\hbox{ x },\widehat{\hbox{ w }})\)

⁠, for the N spots within a print tip closely approach the actual M values. Classic training algorithms perform the minimization of the sum of squared errors between the actual and predicted M values in order to identify the weight vector w:

\[{\displaystyle \underset{\hbox{ w }}{min}}\mathrm{SSE}={\displaystyle \sum _{k=1}^{N}}{\left(f\right({\hbox{ x }}_{k},\hbox{ w })-{M}_{k})}^{2}\]

(6)

where k is the index of spots within a print tip.

The neural network function in Equation (4) is susceptible to overfitting problems; i.e. this function may attempt to match M values too closely around the data points used for training but ‘wiggle’ excessively away from them. To minimize such problems, the number of hidden neurons J is set in this work to a reasonably low value, i.e. J = 3. Tests with several microarray datasets showed that three hidden neurons give a good compromise between approximation capabilities and smoothness (data not shown) although higher values (up to ∼7) may be safely used. Too many hidden units give more plasticity to the neural function to approximate higher non-linear relations, but at the same time increase the risk of over-normalization, i.e. removing useful signals.

The classic neural network training [Equation (6)] is not very resistant to outliers, as a few points which are distant from the main data stream may factor heavily in the SSE criterion [Equation (6)], and therefore may shift the estimate toward them (to be illustrated further). Outlier spots may be described with respect to all three independent variables (A, X, Y), i.e. intensity and space. With respect to A, outliers are those points whose M values are atypical for a given level of A, while outliers with respect to the spatial location are those points whose M values are atypical for their neighboring spots. The binning procedure that we use is intrinsically resistant to spatial outliers, so we need to place more emphasis on resistance to outliers with respect to A. The procedure that we use is to assign a weight to each spot in the model's parameter identification process, i.e. training. The weights are computed using the following steps:The five-step procedure above assigns weights in the interval [0,1] to all the spots in a print tip group based on how far their log-ratios M are from the median of M values in a given interval of average log-intensity A. Note that the intervals on the A scale are equal in the number of percentage of points not in range.

  1. Consider the pairs of a print tip, (M, A)k, k = 1,…,N ordered increasingly for A, so that _A_1 ≤ _A_2 ≤ ⋯ ≤ A kA N.
  2. Divide the ensemble of points into, e.g. B = 20 bins such that each bin has approximately the same number of points and the increasing order of A values is preserved inside the bins. Denote with
    \({M}_{k}^{l},l=1,\dots ,B\)
    ⁠; k = 1,…,N l;
    \({\sum }_{l=1}^{B}{N}_{l}=N\)
    ⁠, the new arrangement of M values.
  3. From each element
    \({M}_{k}^{l}\)
    ⁠, k = 1,…,N l, l = 1,…,B subtract the median of values in bin l to obtain the new values
    \({\overleftrightarrow{M}}_{k}^{l}\)
    ⁠. At this point, we have for each l level of intensity the corresponding M values centered on 0.
  4. Linearly1 transform
    \({\overleftrightarrow{M}}_{k}^{l}\)
    values to
    \({\tilde{M}}_{k}^{l}\)
    in such a way the median, med
    \(\left({\tilde{M}}^{l}\right)\)
    remains 0 and
    \[max\left({\tilde{M}}^{l}\right)=1\hbox{ if\; abs }[max({\overleftrightarrow{M}}^{l}\left)\right] > \hbox{ abs }[min({\overleftrightarrow{M}}^{l}\left)\right]\]
    or
    \[min\left({\tilde{M}}^{l}\right)=-1\hbox{ otherwise }.\]
    After this transformation, the obtained
    \({\tilde{M}}_{k}^{l}\)
    values are centered around 0 and lie in the interval [−1, 1].
  5. Compute the weight for the _k_th spot in the bin l, using the tricubic formula
    \[{W}_{k}^{l}={\left(1-|{\tilde{M}}_{k}^{l}{|}^{3}\right)}^{3}\]
    (7)

For the bias estimation within each print tip group, we have N pairs (X,Y,A)k and their corresponding raw M k value. In addition, W k designates the weight we give for each such pair k, k = 1,…,N in the model's parameters identification; i.e. training.

The training of neural network models with weights associated with the training samples was done using the nnet package implemented in R. (Ihaka and Gentleman, 1996) Details about the nnet package may be found in Venables and Ripley (2002). As stated earlier, the number of hidden neurons in the models was set to J = 3, giving a total of 16 parameters to estimate using typically a few hundred samples, corresponding to the usual number of spots in a print tip group. This translates into a high number of degrees of freedom in the estimation of the model's parameters. Training of the neural network models is performed using linearly scaled values for the predictors X, Y and A in a [0,1] interval, while for M the scaling is done in the interval [0.2,0.8], as is common practice in neural networks training having sigmoid output units.

Ideally, the correction

\({\widehat{M}}_{k}\)

to be subtracted from the raw log ratio of the spot k would be obtained from a neural network model trained with all data in the corresponding print tip group, except for the spot k. However, this approach would be computationally infeasible given the number of spots present on a typical microarray. The scheme we adopted in this work is a 4-fold cross-validation scheme which works as follows: The spots in each print tip group are split at random into i = 1…4 mutually exclusive subsets. Then a neural network is trained with data from the union of all subsets except the subset i. The resulting model is used to predict the corrections

\(\widehat{M}\)

of the spots in the subset i. The procedure is repeated for all values of i in order to compute the correction for all the spots in the print tip.

3 RESULTS AND DISCUSSION

The pNN2DA normalization procedure will be illustrated as well as compared with published normalization methods using two publicly available datasets, a dataset produced in our laboratory, and a simulated dataset. The two publicly available datasets are the Apolipoprotein AI experiment (http://www.stat.berkeley.edu/~terry/zarray/Html/apodata.html; Callow et al., 2000), and the swirl zebra fish experiment, which is available through the BioConductor project (http://www.bioconductor.org) (Ihaka and Gentleman, 1996). The dataset produced in our laboratory stems from an experiment comparing wood samples from Populus trichocarpa × deltoides trees treated with low versus high nitrogen fertilizer (Cooke et al., unpublished). The simulated experiment was constructed from the Apo AI dataset by adding several types of spatiality-dependent noise to the raw Apo AI data.

(A) The Apo AI dataset is often used in the literature to illustrate the effectiveness of different methods of normalization and/or identification of differentially expressed genes in cDNA microarray datasets (Smyth and Speed, 2003; Dudoit et al., 2002). In this dataset, eight out of the 6384 spotted cDNAs represent genes that are known a priori to be down-regulated in mice with a non-functional apo AI gene compared to ‘normal’ C57B1/6 mice. The dataset comprises 16 hybridizations (eight replicates in each condition), and will be used in this paper to compare the ability of different methods to distinguish the eight genes known to be differentially expressed from the remaining ones.

(B) The swirl zebra fish dataset will be used only to compare the normalization methods in terms of their power to reduce the variability. There are four replicate slides that include a dye swap, which compare the swirl mutants with wild-type zebra fish.

(C) The poplar experiment compares gene expression in wood (xylem) tissue harvested from poplar trees treated for 7 days with low or high levels of nitrogen fertilizer. Two independent experiments were conducted. In both Experiment 1 and Experiment 2, two plants were harvested separately per condition, giving a total of four biological replicates for the low-nitrogen treatment and four biological replicates for the high-nitrogen treatment. Each hybridization was carried out with labeled cDNA from a low-nitrogen treated plant and a high-nitrogen treated plant from the same experiment, with a technical dye swap performed for each hybridization. Hybridizations corresponding to samples from Experiment 2 were carried out in duplicate, except that for one series of hybridizations, an elevated and spatially uneven background intensity was created by allowing salts to deposit on the slide surface after the final rinse. Thus, the experiment comprises a total of 12 slides. Supplementary details on the experiment can be found at http://www.arborea.ulaval.ca/en/marrays-tools/bioinfo04-1294/. This experiment affords the opportunity to compare the ability of different normalization methods to reduce the between-replicate variability.

(D) The simulated dataset is constructed form the raw Apo AI data. Data from a subset of slides chosen at random from this dataset were perturbed by adding different types of noise to the raw log-ratios. Details on this simulation study will be given later in this section. The purpose of this simulated experiment is to compare the ability of different methods to identify the eight genes known to be differentially expressed in the original Apo AI data set if more noise were present.

Now that the data sets have been introduced, we move on to illustrate the issue of robustness of the neural network algorithm, followed by comparison between the different normalization methods introduced earlier.

3.1 Comparison between robust and classic neural network fitting

Although the normalization method we devise in this work uses spatial coordinates as well as A (intensity) to predict the bias affecting the M (ratio) values, we shall first illustrate the robust neural networks fitting procedure using only the average log-intensity A as predictor for the log-ratios M. This allows us to easily visualize the M versus A scatter together with the neural network curve in a simple two-dimensional plot.

Figure 1 shows the functions for a classically trained neural network model (i.e. no weights assigned to training samples), the robust neural network model and the loess fit for data of a single print tip from one slide in the Apo data set. This figure demonstrates how the classical neural network estimate may be significantly influenced by outliers (in Fig. 1, outliers exhibit high positive M values at lower A or high negative M values at higher A), while the robust neural network and loess curves follow the main stream of data.

Although the loess approximation of the scatter plot M versus A is intuitive and generally robust to outliers, it is not able to produce a regression function that can be easily transferred to another data set or group of researchers (Draghici, 2003). In contrast, the robust neural network procedure produces a compact equation describing the dependence of the bias on the variables used as regressors. The robust neural network model in Figure 1 requires only 10 parameters to describe the bias dependence on the A level representative for the 399 spots of a print tip that was used in this example.

3.2 Comparison between intensity and spatial normalization methods

We compared the pNN2DA normalization procedure with the published normalization methods that were presented in Section 2.1 and summarized in Table 1. We shall use two criteria to compare the normalization methods: (a) the ability to reduce the variability of log-ratios between replicated slides and within slides and (b) the ability to distinguish truly regulated genes from the other genes when a simple two sample _t_-test is used and corresponding adjusted _p_-values computed.

The Apo AI, swirl and poplar data sets will be used to address the first criterion. The Apo AI and the perturbed Apo AI data sets will be used to address the second criterion, since with these data sets we have a priori knowledge of which genes are differentially expressed. A qualitative comparison of the spatial uniformity of log-ratio distributions produced by the different normalization methods for data from one microarray in the Apo AI data set will also be carried out.

3.2.1 Impact on variability

Within-slide variability of normalized log-ratios Figures 24 illustrate the impact of different normalization procedures on the log-ratios distribution for the Apo AI (Fig. 2), Swirl (Fig. 3) and Poplar (Fig. 4) datasets. The density curves are obtained for each dataset by combining the log-ratios from all slides and using the density function in R to estimate the distributional probability. All the normalization methods correctly center the data on 0, but the spread of log-ratios differs markedly among methods. The best normalization method is expected to reduce the variability of log-ratios within the slides. The ranking of the methods according to this criterion is: 1—gLoMedF, 2—pNN2DA, 3—pLo, 4—pLoGS, 5—cPLo2D and 6—gMed. The Apo AI, Swirl and Poplar data sets all resulted in the same rankings for gLoMedF(1), pNN2DA(2) and gMed(6), while the ranks for pLo(3), pLoGS(4) and cPLo2D(5) are consistent for two of the three data sets (Table 2).

Between-replicates variability. A second type of variability reduction that is often used to compare normalization methods is the variability of M values among the replicated slides (Park et al., 2003; Workman et al., 2002). A simple estimate of the pooled variance for the gene g may be computed as:

\[{\widehat{\sigma }}_{g}^{2}=\frac{1}{(r-1)}{\displaystyle \sum _{i=1}^{r}}{({M}_{gi}^{*}-{\overline{M}}_{g\cdot }^{*})}^{2}\]

where r is the number of replicates in the dataset,

\({M}_{gi}^{*}\)

is the normalized M value2 for gene g and slide i, while

\({\overline{M}}_{g\cdot i}^{*}\)

is the average over the slides for the gene g.

Taking the log10 of such variance estimates and computing the absolute value of the mean over all genes3 will give us an idea of the overall variability among replicates. Figure 5 plots the absolute value of the average of log10 of variances, i.e.

\[AV|1/{N}_{g}{\displaystyle \sum _{g=1}^{{N}_{g}}}{log}_{10}{\widehat{\sigma }}_{g}^{2}|\]

In this figure, the AV value is negatively correlated with variance, and therefore a higher AV value is indicative of a more effective normalization procedure. This is because

\({\widehat{\sigma }}_{g}^{2}\)

quantities are bounded in the [0–1] interval and consequently, their log values are higher in absolute value when variances are smaller.

According to this criterion, the ranking of normalization methods is similar to that for within-slide variability with the exception that cPLo2D is better than both pLo and pLoGS. This ranking appears to be less consistent than that for within slides variability (Table 2).

It is generally easy to show that a normalization method reduces data variability, but is this reduction in variability achieved without introducing supplementary bias? This is what we are going to put in evidence in the following analysis.

3.2.2 Impact on differentially expression statistics

The Apo data set and the a priori knowledge of the effect of the apo AI gene on the expression of other genes were used to benchmark these normalization methods in terms of their capability to distinguish between the known differentially expressed genes and the remaining ones. A simple two group _t_-statistics is performed and the Westfall and Yang (1993) permutation adjusted _p_-values are derived. There are other useful statistics with which differential expression may be inferred, such as the _S_-statistic [related to SAM software (Tusher et al., 2001)], and the _B_-statistic. An alternative data analysis technique is to use the intensity-dependent selection method, as is illustrated by Kerr et al. (2002) and Draghici et al. (2003). In the latter, the threshold for selection of differentially expressed genes increases as the intensity of spots decreases.

The recent findings of Qin et al. (2004) show that intensity-based selection methods offered some improvement for the _t_-statistic, but on the data sets they used, this approach could not match the performance of more effective ranking statistics. As suggested an anonymous reviewer, some analysis methods, like the ones in Kerr et al. (2002) and Draghici et al. (2003) which are based on an ANOVA approach, may account for some of the biases unresolved by the normalization step. It is beyond the scope of the present study to address all possible combinations between normalization methods and analysis approaches.

Figure 6 shows the p_-values obtained for the normalization methods listed in Table 1. We might predict that the better the normalization procedure is, the larger the gap will be between the statistics of the last truly down-regulated gene (the eighth gene starting from the bottom for pLo) and the first non-regulated one (the ninth gene starting from the bottom for pLo). Under this criterion, the gMed, cPLo2D and pNN2DA methods appear to perform similarly in distinguishing between the eight regulated genes and the remaining genes, followed by pLo and_pLoGS. The gLoMedF method induces a false negative (at p = 0.01 threshold), and appears to remove useful signals from the measurements. The ranking according to this criterion is provided in Table 2. This analysis illustrates that reduction in variance should not be done at any price; i.e. it is important not to over-normalize microarray data.

The Apo dataset does not exhibit extensive spatial bias, however, and strong spatial bias is not an uncommon occurrence with cDNA microarray data. Three common types of spatial bias that are encountered are stronger signals around the slide edges, top-to-bottom signal gradients and local signal gradients arising from ‘blotches’ and other artifacts. In order to present a stronger test of the ability of these different normalization techniques to remove spatial bias, simulations were carried out by adding these distinct types of spatiality-dependent bias to the raw M values for a subset of slides in the Apo data set, obtaining in this way what we call the perturbed Apo AI data set.

Here is how these simulated data were constructed. Five slides were chosen at random from the eight slides in each of the two treatment groups to create these perturbed data sets, for a total of 10 slides. For six of the 10 slides, an edge effect was simulated by adding a bias which is quadratic with respect to the spot row and spot column index (Fig. 7b). For two slides, a top-to-bottom gradient was simulated by perturbing the data with bias which is linear with respect to the spot row index within the slide (Fig. 7c). For the remaining two slides, a local rather than whole slide spatial bias was added to affect small regions inside print tip groups. This local bias is quadratic with respect to both directions (Fig. 7d). We preferred this type of simulation study instead of creating an artificial data set de novo, to avoid making some distributional assumptions which may not hold in practice.

In applying the different normalization methods to this perturbed data set, the only normalization method that was able to provide adjusted _p_-values < 0.01 for all the eight down-regulated genes was pNN2DA (Fig. 8). The second ranked method was cPLo2D which is the only other method that was able to make confident separation between the truly regulated genes and the bulk of non-regulated ones (see Table 2 for details). This simulation result demonstrates more clearly the potential of the neural network method with noisy datasets. pLoGS was not tested with the simulated data because the artificial spatial bias we added was affecting the log-ratios and not the backgrounds of the channels and the local group background correction of the pLoGS method implicitly assumes that spatial variation in the foreground channels will be reflected also in the background channels. Therefore, the simulated data was not appropriate to test pLoGS.

As the last criterion for comparison, we assessed the ability of the normalization methods to reduce the spatial dependence of the bias. The methods can be compared by visual observation of reconstructed images from the normalized log-ratios. We infer that a more efficient normalization method will produce a more spatially uniform distribution of normalized log-ratios M* within print tips. Figure 7a shows the distribution of M* values within four print tip groups of the 16th slide in the Apo data set. According to this qualitative criterion, the method gLoMedF provides the most uniform distribution followed by pNN2DA. The remaining methods may not be confidently ranked (Table 2).

The outcome of the various tests assessing the performance of the different normalization methods can be summarized as follows (Table 2):

  1. The method that allowed the greatest discrimination between truly down-regulated genes and the remaining genes in the original and artificially biased Apo experiment was the robust neural networks method (pNN2DA). In terms of reducing the within-slide variability, it ranked second after the median filter method (gLoMedF) and systematically better than the remaining methods. With respect to between-slide variability, the rankings were similar, except that for one of the three data sets, cPLo2D ranked above pNN2DA.
  2. The composite normalization method (cPLo2D) ranks second in inducing differential regulation; however, on the perturbed data set it ranked lower than the neural network method. In terms of reducing variability as well as according to the qualitative assessment in Figure 7a, this method ranked below pNN2DA.
  3. The median filter method (gLoMedF) effects the largest reduction in variability within and between slides (Figs 25). It produces also the most uniform spatial distribution of log ratios (Fig. 7a). However, this method ranked last in terms of inducing differential regulation according to the _t_-test.
  4. Print tip loess normalization could not remove spatial effects within the print tips as it was not designed for this purpose. It performed about the same as gMed on the differential expression test, but consistently better in reducing between-replicates variability. The print tip loess used in combination with the GeneSight background correction method (pLoGS) produced almost the same results as simple pLo on the Apo AI and swirl data sets. However this method underperformed on poplar data which exhibited high and variable background.

3.3 Limitations of the robust neural networks normalization

The pNN2DA normalization method, as well as all the normalization methods it was compared with, relies on the assumption that most of the genes used for normalization are not differentially regulated, and that the number of down-regulated genes is counterbalanced by the number of up-regulated genes. Particular to the pNN2DA normalization method is the fact that it may produce slightly different results from one trial to another if the parameters of the neural networks models are differently randomly initialized and data are differently split between training and test sets. However, if everything else is kept unchanged (number of hidden units J, size of the binning window), the different initialization of the weights and split of data is not likely to affect the final results, or at least not more than all the other choices that have to be made to obtain the raw R and G signals.4 As an example, if normalization of the same batch of 16 slides for the Apo experiment is carried out several times, the _p_-adjusted values will always indicate down-regulation for the same eight genes, although the order may differ slightly.

The computation time required for the pNN2DA method is not prohibitive compared with other methods. For example on the same Apo batch of 16 slides of 6384 spots, pNN2DA will take around 2.5 times more CPU time compared with the composite normalization method cPLo2D.

4 CONCLUSIONS

In this paper, we have introduced a new normalization method for cDNA microarray data. The novelty of the approach is two-fold. First, the regression method used to estimate the bias as a function of intensity and spatial coordinates is based on neural networks. Therefore a compact color distortion model may be obtained, unlike methods based on loess fit. Second, spatial bias variation is not accounted for at the level of the spot, as in previous methods (where the spot row and column indices are used as regressors), but at an intermediate resolution by cutting the space of print tips into an adequate number of bins. Resistance to outliers with respect to the intensity variable is further enhanced by assigning a weight to each spot as a function of how far its M value is from the median over the spots at the same level of average log-intensity.

The overall performance of the novel normalization method was comparable to the composite normalization method, but greater than the other tested methods. However these conclusions are based on a limited number of data sets and therefore further validation will be required.

The normalization method described in this paper is available as an R style library called nnNorm as a part of BioConductor project (http://www.bioconductor.org/).

1The coefficients in the equation of the linear transformation

\({\tilde{M}}^{l}=\alpha {\overleftrightarrow{M}}^{l}+\beta \)

are obtained using two points. The first is

\(({\overleftrightarrow{M}}^{l}=0,{\tilde{M}}^{l}=0)\)

and the second is either

\([{\overleftrightarrow{M}}^{l}={\displaystyle \underset{k}{min}}({\overleftrightarrow{M}}_{k}^{l}),{\tilde{M}}^{l}=1]\)

if the most distant value in

\({\overleftrightarrow{M}}^{l}\)

with respect to 0 is positive, or

\([{\overleftrightarrow{M}}^{l}={\displaystyle \underset{k}{min}}({\overleftrightarrow{M}}_{k}^{l}),{\tilde{M}}^{l}=-1]\)

if it is negative.

2For the swirl and poplar datasets we adjust for the dye swap by multiplying the normalized log-ratios in dye swapped slides with ‘−1’.

3For the Apo dataset the eight known down-regulated genes are omitted from the list as the M values are expected to differ between the two treatment groups.

4In order to obtain the R and G intensities from scanned images, one has to choose a method for image segmentation (e.g. spatially based, intensity based, etc.) and spot quantification (e.g. taking the mean, median or mode of the pixels). See Yang et al. (2002a) for details.

Table 1

Inventory of normalization methods compared in this work

Method abbreviation Short description
None No normalization is done, raw M are considered with local background correction
gMed Global median: estimate is the median of M values in the array
pLo Print tip loess: estimate is loess fit M(A) within each print tip group
pLoGS Print tip loess with local group median background correction according to GeneSight: estimate is loess fit M(A) within each print tip group but the background correction is local group (over a 3 × 3 square) instead of local as in pLo
cPLo2D Composite 2D and A: estimate is equally weighted sum of loess fit M(SpotRow, SpotCol) and loess fit M(A) within each print tip group
gLoMedF Global loess and median filter: estimate is loess fit M(A) on the whole array plus the median of a 3 × 3 block of spots with the current spot in the center
pNN2DA Robust neural networks 2D and A: estimate is the robust neural networks fit for M(A,X,Y) within each print tip, where X and Y are obtained by binning the print tip space
Method abbreviation Short description
None No normalization is done, raw M are considered with local background correction
gMed Global median: estimate is the median of M values in the array
pLo Print tip loess: estimate is loess fit M(A) within each print tip group
pLoGS Print tip loess with local group median background correction according to GeneSight: estimate is loess fit M(A) within each print tip group but the background correction is local group (over a 3 × 3 square) instead of local as in pLo
cPLo2D Composite 2D and A: estimate is equally weighted sum of loess fit M(SpotRow, SpotCol) and loess fit M(A) within each print tip group
gLoMedF Global loess and median filter: estimate is loess fit M(A) on the whole array plus the median of a 3 × 3 block of spots with the current spot in the center
pNN2DA Robust neural networks 2D and A: estimate is the robust neural networks fit for M(A,X,Y) within each print tip, where X and Y are obtained by binning the print tip space

Table 1

Inventory of normalization methods compared in this work

Method abbreviation Short description
None No normalization is done, raw M are considered with local background correction
gMed Global median: estimate is the median of M values in the array
pLo Print tip loess: estimate is loess fit M(A) within each print tip group
pLoGS Print tip loess with local group median background correction according to GeneSight: estimate is loess fit M(A) within each print tip group but the background correction is local group (over a 3 × 3 square) instead of local as in pLo
cPLo2D Composite 2D and A: estimate is equally weighted sum of loess fit M(SpotRow, SpotCol) and loess fit M(A) within each print tip group
gLoMedF Global loess and median filter: estimate is loess fit M(A) on the whole array plus the median of a 3 × 3 block of spots with the current spot in the center
pNN2DA Robust neural networks 2D and A: estimate is the robust neural networks fit for M(A,X,Y) within each print tip, where X and Y are obtained by binning the print tip space
Method abbreviation Short description
None No normalization is done, raw M are considered with local background correction
gMed Global median: estimate is the median of M values in the array
pLo Print tip loess: estimate is loess fit M(A) within each print tip group
pLoGS Print tip loess with local group median background correction according to GeneSight: estimate is loess fit M(A) within each print tip group but the background correction is local group (over a 3 × 3 square) instead of local as in pLo
cPLo2D Composite 2D and A: estimate is equally weighted sum of loess fit M(SpotRow, SpotCol) and loess fit M(A) within each print tip group
gLoMedF Global loess and median filter: estimate is loess fit M(A) on the whole array plus the median of a 3 × 3 block of spots with the current spot in the center
pNN2DA Robust neural networks 2D and A: estimate is the robust neural networks fit for M(A,X,Y) within each print tip, where X and Y are obtained by binning the print tip space

Fitting M as a function of A for the slide 13 and print tip 14 of the Apo dataset: dotted line—classic neural network model (NN); continuous thick line—robust neural network model (Robust NN); and continuous thin line—loess fit.

Fig. 1

Fitting M as a function of A for the slide 13 and print tip 14 of the Apo dataset: dotted line—classic neural network model (NN); continuous thick line—robust neural network model (Robust NN); and continuous thin line—loess fit.

Normalized log-ratios density plot for the Apo dataset. The methods in the legend appear in the increasing order of the height of the distribution, so that gLoMF provides the smallest spread of log-ratios followed by pNN2DA.

Fig. 2

Normalized log-ratios density plot for the Apo dataset. The methods in the legend appear in the increasing order of the height of the distribution, so that gLoMF provides the smallest spread of log-ratios followed by pNN2DA.

Normalized log-ratios density plot for the swirl zebra fish dataset. The methods in the legend appear in the increasing order of the height of the distribution, so that gLoMF provides the smallest spread of log-ratios followed by pNN2DA.

Fig. 3

Normalized log-ratios density plot for the swirl zebra fish dataset. The methods in the legend appear in the increasing order of the height of the distribution, so that gLoMF provides the smallest spread of log-ratios followed by pNN2DA.

Normalized log-ratios density plot for the poplar dataset. The methods in the legend appear in the increasing order of the height of the distribution, so that gLoMF provides the smallest spread of log-ratios followed by pNN2DA.

Fig. 4

Normalized log-ratios density plot for the poplar dataset. The methods in the legend appear in the increasing order of the height of the distribution, so that gLoMF provides the smallest spread of log-ratios followed by pNN2DA.

Table 2

Ranking of different normalization methods according to multiple criteria

Within-slide variability (Apo AI, swirl, poplar) Between-replicates variability (Apo AI, swirl, poplar) Gap between log _p_-values of last true positive and first false positive genes Spatial uniformity of M values distributiona [Apo AI (slide 16)]
Apo AI Perturbed Apo AI
Rank
1st gLoMedF (3/3)b gLoMedF (3/3) pNN2DA pNN2DA gLoMedF
2nd pNN2DA (3/3) pNN2DA (2/3) gMed cPLo2D pNN2DA
3rd pLo cPLo2D (2/3) cPLo2D gLoMedF
4th pLoGS pLo pLo {cPLo2D, pLo, pLoGS}c
5th cPLo2D (2/3) pLoGS (2/3) pLoGS gMed, pLo, pLoGS
6th gMed (3/3) gMed (3/3) gLoMedF gMed
Within-slide variability (Apo AI, swirl, poplar) Between-replicates variability (Apo AI, swirl, poplar) Gap between log _p_-values of last true positive and first false positive genes Spatial uniformity of M values distributiona [Apo AI (slide 16)]
Apo AI Perturbed Apo AI
Rank
1st gLoMedF (3/3)b gLoMedF (3/3) pNN2DA pNN2DA gLoMedF
2nd pNN2DA (3/3) pNN2DA (2/3) gMed cPLo2D pNN2DA
3rd pLo cPLo2D (2/3) cPLo2D gLoMedF
4th pLoGS pLo pLo {cPLo2D, pLo, pLoGS}c
5th cPLo2D (2/3) pLoGS (2/3) pLoGS gMed, pLo, pLoGS
6th gMed (3/3) gMed (3/3) gLoMedF gMed

aThis is a qualitative assessment.

bRatio in parentheses designates the number of datasets for which the current method performed better than the next ranked method, divided by the total number of datasets that was used in the test.

cNo meaningful ranking can be stated for the methods included in the brackets.

Table 2

Ranking of different normalization methods according to multiple criteria

Within-slide variability (Apo AI, swirl, poplar) Between-replicates variability (Apo AI, swirl, poplar) Gap between log _p_-values of last true positive and first false positive genes Spatial uniformity of M values distributiona [Apo AI (slide 16)]
Apo AI Perturbed Apo AI
Rank
1st gLoMedF (3/3)b gLoMedF (3/3) pNN2DA pNN2DA gLoMedF
2nd pNN2DA (3/3) pNN2DA (2/3) gMed cPLo2D pNN2DA
3rd pLo cPLo2D (2/3) cPLo2D gLoMedF
4th pLoGS pLo pLo {cPLo2D, pLo, pLoGS}c
5th cPLo2D (2/3) pLoGS (2/3) pLoGS gMed, pLo, pLoGS
6th gMed (3/3) gMed (3/3) gLoMedF gMed
Within-slide variability (Apo AI, swirl, poplar) Between-replicates variability (Apo AI, swirl, poplar) Gap between log _p_-values of last true positive and first false positive genes Spatial uniformity of M values distributiona [Apo AI (slide 16)]
Apo AI Perturbed Apo AI
Rank
1st gLoMedF (3/3)b gLoMedF (3/3) pNN2DA pNN2DA gLoMedF
2nd pNN2DA (3/3) pNN2DA (2/3) gMed cPLo2D pNN2DA
3rd pLo cPLo2D (2/3) cPLo2D gLoMedF
4th pLoGS pLo pLo {cPLo2D, pLo, pLoGS}c
5th cPLo2D (2/3) pLoGS (2/3) pLoGS gMed, pLo, pLoGS
6th gMed (3/3) gMed (3/3) gLoMedF gMed

aThis is a qualitative assessment.

bRatio in parentheses designates the number of datasets for which the current method performed better than the next ranked method, divided by the total number of datasets that was used in the test.

cNo meaningful ranking can be stated for the methods included in the brackets.

Absolute value of mean \batchmode \documentclass[fleqn,10pt,legalpaper]{article} \usepackage{amssymb} \usepackage{amsfonts} \usepackage{amsmath} \pagestyle{empty} \begin{document} \({log}_{10}\left({\widehat{\sigma }}_{g}^{2}\right)\) \end{document} for the (a) Apo, (b) swirl zebra fish and (c) poplar datasets. A higher AV indicates a lower variability between slides.

Fig. 5

Absolute value of mean

\({log}_{10}\left({\widehat{\sigma }}_{g}^{2}\right)\)

for the (a) Apo, (b) swirl zebra fish and (c) poplar datasets. A higher AV indicates a lower variability between slides.

Permutation adjusted p-values for the Apo dataset. The known down-regulated genes are represented as filled circles and the remaining ones with empty circles. The transversal line demarks a p = 0.01 threshold on differential expression between the treatment groups.

Fig. 6

Permutation adjusted _p_-values for the Apo dataset. The known down-regulated genes are represented as filled circles and the remaining ones with empty circles. The transversal line demarks a p = 0.01 threshold on differential expression between the treatment groups.

Reconstructed image of M values for real and artificially perturbed real data: (a) before and after normalization—data from apo AI knock-out mouse #8, print tips 11–14, the left and right columns indicating the name of the method; (b) artificial quadratic bidirectional bias acting at the slide level; (c) artificial unidirectional linear bias acting at the slide level; (d) artificial quadratic bidirectional bias acting locally. Shades of green correspond to negative values and shades of red to positive.

Fig. 7

Reconstructed image of M values for real and artificially perturbed real data: (a) before and after normalization—data from apo AI knock-out mouse #8, print tips 11–14, the left and right columns indicating the name of the method; (b) artificial quadratic bidirectional bias acting at the slide level; (c) artificial unidirectional linear bias acting at the slide level; (d) artificial quadratic bidirectional bias acting locally. Shades of green correspond to negative values and shades of red to positive.

Permutation adjusted p-values for the artificially perturbed Apo dataset. The truly down-regulated genes are represented as filled circles and the remaining ones with empty circles. Ten of the 16 slides were corrupted with various types of spatial bias. Observe how the ability of methods to separate the eight truly down-regulated genes from the bulk of the remaining ones diminishes with respect to Figure 6.

Fig. 8

Permutation adjusted _p_-values for the artificially perturbed Apo dataset. The truly down-regulated genes are represented as filled circles and the remaining ones with empty circles. Ten of the 16 slides were corrupted with various types of spatial bias. Observe how the ability of methods to separate the eight truly down-regulated genes from the bulk of the remaining ones diminishes with respect to Figure 6.

†Present address: Department of Biological Sciences, University of Alberta, Edmonton AB, Canada T6G 2E9

This research was funded by a Genome Quebec grant to J.E.K.C and J.M. The reviewers are gratefully thanked for the useful criticism that allowed us to improve the content and the presentation of this paper. The authors also thank Robert Nadon for his useful comments in the early stages of this work. Mario Ouellet is thanked for carrying out the hybridizations in the poplar experiment.

REFERENCES

Callow, M.J., et al.

2000

Microarray expression profiling identifies genes with altered expression in HDL-deficient mice.

Genome Res.

10

2022

–2029

Chen, J.J., et al.

2004

Analysis of variance components in gene expression data.

Bioinformatics

20

1436

–1446

Cui, X., et al.

2003

Data transformations for cDNA microarray data.

Statist. Appl. Genet. Mol. Biol.

2

Article 4

Draghici, S.

Data Analysis Tools for Microarrays

2003

Chapman and Hall/CRC Press

Draghici, S., et al.

2003

Noise sampling method: an ANOVA approach allowing robust selection of differentially regulated genes measured by DNA microarrays.

Bioinformatics

19

1348

–1359

Dudoit, S., et al.

2002

Statistical methods for identifying expressed genes in replicated cDNA microarray experiments.

Stat. Sin.

12

111

–139

Golub, T.R., et al.

1999

Molecular classification of cancer: class discovery and class prediction by gene expression monitoring.

Science

286

531

–537

Hornik, K.

1990

Approximation capabilities of multilayer feedforward neural networks.

Neural Networks

4

251

–257

Ideker, T., Ybarra, S., Grimmond, S.

2003

Hybridization and posthybridization washing. In Bowtell, D. and Sambrook, J. (Eds.).

DNA Microarrays: A Molecular Cloning Manual

, Cold Spring Harbor, NY Cold Spring Harbor Laboratory Press, pp.

173

–256

Ihaka, R. and Gentleman, R.

1996

R: a language for data analysis and graphics.

J. Comput. Graph. Statist.

5

299

–314

Kerr, M.K., et al.

2002

Statistical analysis of a gene expression microarray experiment with replication.

Stat. Sin.

12

203

–218

Nadon, R. and Shoemaker, J.

2002

Statisical issues with microarrays: processing and analysis.

Trends Genet.

18

265

–271

Park, T., et al.

2003

Evaluation of normalization methods for microarray data.

BMC Bioinform.

4

33

Quackenbush, J.

2002

Microarray data normalization and transformation.

Nat. Genet.

32

S496

–S501

Qin, L.X., et al.

2004

Empirical evaluation of data transformations and ranking statistics for microarray analysis.

Nucleic Acids Res.

32

5471

–5479

Rumelhart, D.E., Hinton, G., Williams, R.

1986

Learning internal representation by error propagation.

Parallel Distributed Processing

MIT Press Vol.

1

, pp.

318

–364

Simon, R.M. and Dobbin, K.

2003

Experimental design of DNA microarray experiments.

BioTechniques

34

S16

–S21

Smyth, G.K. and Speed, T.

2003

Normalization of cDNA microarray data.

Methods

31

265

–273

Smyth, G.K., et al.

2003

Statistical issues in cDNA microarray data analysis.

Methods Mol. Biol.

224

111

–136

Spellman, P.T., et al.

1998

Comprehensive identification of cell cycle-regulated genes of the yeast saccharomyces cerevisiae by microarray hybridization.

Mol. Biol. Cell

9

3273

–3297

Tusher, V.G., et al.

2001

Significance analysis of microarrays applied to the ionizing radiation response.

Proc. Natl Acad. Sci. USA

98

5116

–5121

Venables, W.N. and Ripley, B.D.

Modern Applied Statistics with S, 4th edn

2002

, New York Springer-Verlag

Westfall, P.H. and Yang, S.S.

Resampling-Based Multiple Testing: Examples and Methods for p-Value Adjustment

1993

Wiley Series in Probability and Mathematical Statistics, Wiley

Wilson, D.L., et al.

2003

New normalization methods for cDNA microarray data.

Bioinformatics

19

1325

–1332

Workman, C., et al.

2002

A new non-linear normalization method for reducing variability in DNA microarray experiments.

Genome Biol.

3

1

–16

Yang, Y.H. and Speed, T.

2003

Representing and evaluating slide data. In Bowtell, D. and Sambrook, J. (Eds.).

DNA Microarrays: A Molecular Cloning Manual

, Cold Spring Harbor, NY Cold Spring Harbor Laboratory Press, pp.

544

–551

Yang, Y.H., et al.

2002

Analysis of cDNA microarray images.

Brief. Bioinform.

2

341

–349

Yang, Y.H., et al.

2002

Normalization for cDNA microarray data: a robust composite method addressing single and multiple slide systematic variation.

Nucleic Acids Res.

30

e15

© The Author 2005. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oupjournals.org