Validation of reference genes for quantitative expression analysis by real-time RT-PCR in Saccharomyces cerevisiae

Marie-Ange Teste et al. BMC Mol Biol. 2009.

Abstract

Background: Real-time RT-PCR is the recommended method for quantitative gene expression analysis. A compulsory step is the selection of good reference genes for normalization. A few genes often referred to as HouseKeeping Genes (HSK), such as ACT1, RDN18 or PDA1, are among the most commonly used, as their expression is assumed to remain unchanged over a wide range of conditions. Since this assumption is unlikely to hold, geometric averaging of multiple, carefully selected internal control genes is now strongly recommended for normalization, to avoid the expression variation that affects any single reference gene. The aim of this work was to search for a set of reference genes for reliable gene expression analysis in Saccharomyces cerevisiae.

Results: From public microarray datasets, we selected potential reference genes whose expression remained apparently invariable during long-term growth on glucose. Using the algorithm geNorm, ALG9, TAF10, TFC1 and UBC6 turned out to be genes whose expression remained stable, independent of the growth conditions and the strain backgrounds tested in this study. We then showed that the geometric averaging of any subset of three genes among the six most stable genes resulted in very similar normalized data, which contrasted with inconsistent results among various biological samples when the normalization was performed with ACT1. Normalization with multiple selected genes was therefore applied to transcriptional analysis of genes involved in glycogen metabolism. We determined an induction ratio of 100-fold for GPH1 and 20-fold for GSY2 between the exponential phase and the diauxic shift on glucose. There was no induction of these two genes at this transition phase on galactose, although in both cases, the kinetics of glycogen accumulation was similar. In contrast, SGA1 expression was independent of the carbon source and increased by 3-fold in stationary phase.
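The stability ranking mentioned in the Results relies on the gene-stability measure M used by geNorm. As a rough illustration only (this is not the authors' code, and it omits geNorm's stepwise exclusion of the least stable gene), the sketch below computes M for each candidate as the average standard deviation of its log2 expression ratios against every other candidate. The function name `genorm_m`, the gene labels and the quantities are hypothetical.

```python
import math
from statistics import stdev


def genorm_m(quantities):
    """Return a stability value M per gene (lower means more stable).

    quantities: dict mapping a gene label to a list of relative
    quantities on a linear scale, one value per sample.
    """
    genes = list(quantities)
    m_values = {}
    for gene in genes:
        pairwise_sds = []
        for other in genes:
            if other == gene:
                continue
            # log2 ratio of the two genes in every sample
            log_ratios = [
                math.log2(a / b)
                for a, b in zip(quantities[gene], quantities[other])
            ]
            pairwise_sds.append(stdev(log_ratios))
        m_values[gene] = sum(pairwise_sds) / len(pairwise_sds)
    return m_values


# Hypothetical data: gene_a and gene_b keep a constant ratio,
# gene_c drifts relative to both.
example = {
    "gene_a": [1.0, 2.0, 4.0],
    "gene_b": [2.0, 4.0, 8.0],
    "gene_c": [1.0, 1.0, 8.0],
}
m = genorm_m(example)
```

In this toy dataset the drifting gene receives the highest M, which is the behaviour expected of an unsuitable reference gene.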

Conclusion: In this work, we provide a set of genes that are suitable reference genes for quantitative gene expression analysis by real-time RT-PCR in yeast biological samples covering a large panel of physiological states. In contrast, we invalidate and discourage the use of ACT1 as well as other commonly used reference genes (PDA1, TDH3, RDN18, etc.) as internal controls for quantitative gene expression analysis in yeast.

Figures

Figure 1

Schematic view of growth characteristics of a WT strain and its tps1 derivative. Growth (cells) and glycogen profiles during cultures of CEN.PK strains on galactose. WT (left, set D from Figure 2) and tps1 (right, set H from Figure 2). EP, Exponential Phase; DS, Diauxic Shift; PDS, Post-Diauxic Shift; SP, Stationary Phase. Original data and sampling numbering can be found in the Additional files. Cells (OD600), Glycogen (μg eq.glucose/OD unit).

Figure 2

Sample sets and ranking of candidate reference genes as calculated by geNorm. Left panel: Independent cultures (illustrated by the boxes) were carried out: Wild type KT strain on glucose (sample set B) and galactose (set C); Wild type CEN.PK (sets D and E as independent cultures) and its tps1 derivative strain (sets H and I as independent cultures), on galactose. Sampling [S#] was performed throughout the cultures, with a posteriori selection and analysis of 4 to 7 RNA samples representative of different physiological states (e.g. samples 1 to 5 for growth of the KT strain on glucose; see Additional file 1). Expression data from one culture (e.g. set B) or from several cultures (connector between boxes, e.g. set A, which includes samples from sets B and C together) were then analyzed with geNorm (sample sets A to M). Right panel: Synthetic overview of the ranking of the candidate reference genes according to their expression stability, and determination of the optimal number of genes used for normalization. The 2 most stable genes (black circle), the third (dashed circle) and the following 3 best reference genes (empty circle). Pair-wise variation (Vn/n+1) between NFn and NFn+1 (NF: normalization factor; n: number of genes used for NF calculation). Right column: pair-wise variation value below the threshold 0.15, which means that n genes might be sufficient for NF calculation (i.e. 2 genes for set "A"). See Additional file 3 for overall stability under the standard geNorm output format.
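The pair-wise variation described in this legend can be sketched as follows, assuming the usual geNorm definition: NFn is the geometric mean of the n most stable genes in each sample, and Vn/n+1 is the standard deviation, across samples, of log2(NFn/NFn+1). The function names, gene labels and quantities are hypothetical; only the 0.15 threshold comes from the legend.

```python
import math
from statistics import stdev


def normalization_factors(quantities, genes):
    """Geometric mean of the chosen genes' quantities, per sample."""
    n_samples = len(quantities[genes[0]])
    return [
        math.prod(quantities[g][i] for g in genes) ** (1.0 / len(genes))
        for i in range(n_samples)
    ]


def pairwise_variation(quantities, ranked_genes, n):
    """V(n/n+1): spread of log2(NF_n / NF_n+1) across samples.

    ranked_genes must be ordered from most to least stable.
    """
    nf_n = normalization_factors(quantities, ranked_genes[:n])
    nf_n1 = normalization_factors(quantities, ranked_genes[: n + 1])
    return stdev([math.log2(a / b) for a, b in zip(nf_n, nf_n1)])


# Hypothetical relative quantities for three ranked candidates.
stable = {
    "ref_1": [1.0, 2.0, 4.0],
    "ref_2": [1.0, 2.0, 4.0],
    "ref_3": [1.0, 2.0, 4.0],
}
noisy = dict(stable, ref_3=[1.0, 8.0, 1.0])

v_stable = pairwise_variation(stable, ["ref_1", "ref_2", "ref_3"], 2)
v_noisy = pairwise_variation(noisy, ["ref_1", "ref_2", "ref_3"], 2)
THRESHOLD = 0.15  # cut-off quoted in the legend
```

When the value stays below the threshold, adding the (n+1)th gene barely changes the normalization factor, so n genes may be enough; a value well above it means the extra gene still shifts the factor noticeably.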

Figure 3

Distribution overview of expression levels (Ct) of the different genes. Boxplot representation of raw Ct values obtained from amplification curves. Lower and upper boundaries of the box indicate the 25th and the 75th percentile, respectively, the thin line within the box marks the median, and the whiskers (error bars) below and above the box indicate the 10th and 90th percentiles. Mean (thick line) and outliers (*). Complete RNA sample set from the study (n = 32, grey), sample set "A" (n = 9, yellow; see Figure 2, Glucose + Galactose) and sample set "K" (n = 11, green; see Figure 2, WT + tps1Δ). As stated in methods, the 25 μL reaction mixes contained 5 μL of cDNA preparation diluted 10 times, except for RDN18 where cDNA was diluted 50 times. For an easy, preliminary estimation of the relative expression of a gene between two samples, a difference of 3.33 Ct with 100% PCR efficiency represents a 10-fold over-expression or repression between two conditions (N2/N1 = (1+Eff)^(Ct1-Ct2)). With PCR efficiency correction, the same Ct difference with only 90% efficiency signifies an 8.5-fold variation of transcripts between the two samples.
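The rule of thumb in this legend follows directly from N2/N1 = (1+Eff)^(Ct1-Ct2). A minimal sketch, in which the function name `fold_change` and the example Ct values are hypothetical and the efficiency is expressed as a fraction (1.0 for 100%):

```python
def fold_change(ct1, ct2, efficiency=1.0):
    """Ratio N2/N1 of transcript amounts between two samples.

    ct1, ct2: Ct values of the same gene in sample 1 and sample 2.
    efficiency: PCR efficiency as a fraction (1.0 means 100%).
    """
    return (1.0 + efficiency) ** (ct1 - ct2)


# A Ct difference of 3.33 cycles, as in the legend.
ratio_100 = fold_change(25.00, 21.67, efficiency=1.0)
ratio_90 = fold_change(25.00, 21.67, efficiency=0.9)
```

With these inputs the first ratio comes out close to 10 and the second close to 8.5, in line with the figures quoted in the legend.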

Figure 4

Expression summary as reported in the SGD Expression Connection tool. For each gene (depicted by different color lines), the pattern reported in this figure is a copy of the "expression summary" histogram obtained using the Expression Connection tool from the SGD server. This graph indicates the number of samples (also called 'experiments' on the SGD server), at a given expression ratio value, that could be found in all the microarray datasets stored on the SGD server (i.e. approx. 30 studies). The expression data reported on the axis are in log2 scale.

Figure 5

Effect of normalization strategies on expression ratios. Normalized expression of ACT1, PDA1, IPP1, RDN18, GPH1, GSY2 and SGA1, in 5 characteristic samples during growth on glucose (i.e. set "B" in Figure 2): early exponential phase (respiro-fermentative), entry in (disappearance of glucose) and exit from the diauxic shift, mid of post-diauxic (respiratory) growth, and 3 days stationary phase. The exponential phase sample (S#1) was used as calibrator. Normalization was performed using the three most stable genes (NF(UBC6, TAF10, ALG9), dashed bar), the following 3 best (NF(TFC1, KRE11, FRP2), grey bar), ACT1 alone (NF(ACT1), black diamond) or using ACT1, PDA1 and IPP1 (NF(ACT1, PDA1, IPP1), empty diamond). Normalized expression data and error bars were calculated using the gene expression module of the BIORAD iQ5 software, which follows models and error propagation rules outlined in the geNorm manual. For the sake of clarity, we did not plot standard deviation of ratios obtained from NF(ACT1, PDA1, IPP1).
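The normalization compared in this figure can be illustrated with a short sketch. It is not the iQ5 software's implementation and leaves out error propagation; it only assumes that each gene's quantity relative to the calibrator sample is (1+Eff)^(Ct_calibrator - Ct_sample) and that the normalization factor is the geometric mean of the reference genes' relative quantities. All names and Ct values are hypothetical.

```python
import math


def relative_quantities(cts, calibrator=0, efficiency=1.0):
    """Quantity of one gene in each sample, relative to the calibrator."""
    return [(1.0 + efficiency) ** (cts[calibrator] - ct) for ct in cts]


def normalized_expression(target_cts, reference_cts, calibrator=0,
                          efficiency=1.0):
    """Target expression divided by the geometric-mean reference factor.

    target_cts: list of Ct values for the gene of interest.
    reference_cts: dict mapping reference gene label to its Ct list.
    """
    target_rq = relative_quantities(target_cts, calibrator, efficiency)
    ref_rqs = [
        relative_quantities(cts, calibrator, efficiency)
        for cts in reference_cts.values()
    ]
    normalized = []
    for i, rq in enumerate(target_rq):
        nf = math.prod(r[i] for r in ref_rqs) ** (1.0 / len(ref_rqs))
        normalized.append(rq / nf)
    return normalized


# Hypothetical Ct values: sample 0 is the calibrator; in sample 1 every
# reference gene appears one cycle later (less cDNA input).
references = {
    "ref_1": [20.0, 21.0],
    "ref_2": [21.0, 22.0],
    "ref_3": [19.0, 20.0],
}
ratios = normalized_expression([25.0, 23.0], references)
```

Here the raw target ratio is 4-fold, but dividing by the normalization factor (0.5) corrects for the lower input and gives roughly 8-fold. Replacing the three references with a single unstable gene would pass that gene's own variation straight into the result, which is the effect the figure attributes to ACT1.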

Figure 6

Degree of correlation between normalization strategies in simple datasets. Scatter plot of the data illustrated in Figure 5. X axis: ratios calculated using the three most stable genes (NF(UBC6, TAF10, ALG9)); Y axis: ratios calculated using NF(TFC1, KRE11, FRP2) (green diamond) or NF(ACT1) (purple square). Horizontal and vertical error bars: Standard deviation on X and Y ratio, respectively. Grey dotted line: y = x. The equation and correlation coefficient of the linear regression fit (not reported) are y = 0.9178x, R2 = 0.997 (green diamonds), and y = 0.9327x, R2 = 0.598 (purple squares).
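The fits quoted in this legend have the form y = ax, that is, a line through the origin. The sketch below shows one way to obtain such a fit by least squares. The R2 convention is an assumption: the legend does not say how R2 was computed, and the uncentered form used here (1 - SS_res / sum of y squared) is only one common choice for no-intercept models. The function name and data points are hypothetical.

```python
def fit_through_origin(x, y):
    """Least-squares slope of y = a*x and an uncentered R^2.

    Returns (slope, r_squared). The uncentered R^2 is an assumed
    convention, not necessarily the one used for the figure.
    """
    sum_xx = sum(a * a for a in x)
    slope = sum(a * b for a, b in zip(x, y)) / sum_xx
    ss_res = sum((b - slope * a) ** 2 for a, b in zip(x, y))
    ss_tot = sum(b * b for b in y)
    return slope, 1.0 - ss_res / ss_tot


# Hypothetical ratios from two normalization strategies.
x_ratios = [1.0, 2.0, 4.0, 8.0]
agreeing = [1.0, 2.0, 4.0, 8.0]      # same ratios from both strategies
disagreeing = [4.0, 1.0, 6.0, 2.0]   # inconsistent second strategy

slope_ok, r2_ok = fit_through_origin(x_ratios, agreeing)
slope_bad, r2_bad = fit_through_origin(x_ratios, disagreeing)
```

Two strategies that agree give a slope near 1 and an R2 near 1; a low R2, as reported for NF(ACT1), indicates that the two sets of ratios do not track each other.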

Figure 7

Effect of carbon source on expression profiles of the commonly-used ACT1 and genes of interest. Normalized expression of ACT1 and GPH1, GSY2, SGA1 in sample set "A" (n = 9). This set includes 5 samples selected during growth on glucose (grey, see legend from Figure 5) and 4 characteristic samples from growth on galactose (blue): early exponential phase (respiro-fermentative), entry in the diauxic shift (disappearance of galactose), mid of post-diauxic (respiratory) growth, and 3 days stationary phase. The exponential phase sample on glucose was used as calibrator for this sample set. Normalization was performed using the three most stable genes (NF(UBC6, TAF10, ALG9), dashed bar), the geometric mean of ACT1, PDA1 and IPP1 (NF(ACT1, PDA1, IPP1), empty diamond) or ACT1 alone (NF(ACT1), black diamond). Normalized expression data and error bars calculated as described in Figure 5. For the sake of clarity, we did not plot standard deviation of ratios obtained from NF(ACT1, PDA1, IPP1).

Figure 8

Effect of tps1 mutation on expression profiles of genes of interest. Normalized expression of GPH1, GSY2 and SGA1 in WT and tps1Δ strains grown on galactose (set "K"). This set includes (i) 4 samples selected during growth of the WT strain (blue): early exponential phase [respiro-fermentative], exit from the diauxic shift, mid of post-diauxic [respiratory] growth, and after 3 days in stationary phase; and (ii) 7 samples from growth of the tps1Δ strain (red): early exponential phase [respiro-fermentative], entry in and exit from the diauxic shift, early and mid respiratory growth (i.e. just before and just after glycogen peak), entry in and after 3 days in stationary phase. The exponential phase sample of the WT strain was used as calibrator. Normalization was performed using the three most stable genes in this sample set (NF(UBC6, TFC1, KRE11), dashed bar) or ACT1 alone (NF(ACT1), black diamond). Normalized expression data and error bars calculated as described in Figure 5. For the sake of clarity, we did not plot standard deviation of ratios obtained using ACT1 as reference.

Figure 9

Degree of correlation between normalization strategies in more heterogeneous datasets. Scatter plot of ratio values obtained from sample set "K" for genes which together illustrated a wide range of responses (from strong over-expression to repression, see Figure 8 for some of them). X axis: ratios calculated using the three most stable genes (NF(UBC6, TFC1, KRE11)); Y axis: ratios calculated using NF(TAF10, FRP2, ALG9) (A) and NF(ACT1) (B). Dotted line: y = x. The equation and correlation coefficient of the linear regression fit (not reported) were y = 1.030x, R2 = 0.916 (A) and y = 2.405x, R2 = 0.124 (B).
