Deciphering a transcriptional regulatory code: modeling short-range repression in the Drosophila embryo

Walid D Fakhouri et al. Mol Syst Biol. 2010.

Abstract

Systems biology seeks a genomic-level interpretation of transcriptional regulatory information represented by patterns of protein-binding sites. Obtaining this information without direct experimentation is challenging; minor alterations in binding sites can have profound effects on gene expression, and underlie important aspects of disease and evolution. Quantitative modeling offers an alternative path to develop a global understanding of the transcriptional regulatory code. Recent studies have focused on endogenous regulatory sequences; however, distinct enhancers differ in many features, making it difficult to generalize to other cis-regulatory elements. We applied a systematic approach to simpler elements and present here the first quantitative analysis of short-range transcriptional repressors, which have central functions in metazoan development. Our fractional occupancy-based modeling uncovered unexpected features of these proteins' activity that allow accurate predictions of regulation by the Giant, Knirps, Krüppel, and Snail repressors, including modeling of an endogenous enhancer. This study provides essential elements of a transcriptional regulatory code that will allow extensive analysis of genomic information in Drosophila melanogaster and related organisms.

Conflict of interest statement

The authors declare that they have no conflict of interest.

Figures

Figure 1

Transformation of DNA sequence and protein information by gene modeling. (A) An enhancer with three repressors (red squares) and four activators (green circles) is modeled, to generate the gene expression surface shown in (B). The axes represent normalized activator, repressor, and gene activity levels. (C) A Drosophila embryo with Giant repressor (red stripes) and Dorsal activator (green) staining is shown. Each embryo provides a diversity of potential inputs to the regulatory element: the white arrow points to a region in which activator levels are high and repressor levels are low. The black arrow points to a region in which both activator and repressor levels are low. The white triangle points to a region in which activator and repressor levels are both high, and the black triangle points to a region in which repressor levels are high and activator levels are low. (D) Output of regulatory element shown in (A), which mirrors values from (C) being mapped through surface shown in (B). (E) Formal scheme of data collection, analysis, and modeling.
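
The mapping from input levels to reporter output described in (B) through (D) can be illustrated with a toy surface. This sketch is not the paper's fitted model: it assumes a single-site saturation form for occupancy and a multiplicative quenching term, and the names `occupancy`, `expression`, `s_act`, `s_rep`, and `quench`, along with their values, are hypothetical.

```python
def occupancy(conc, scale):
    """Fractional occupancy of one site, assuming a simple saturation form."""
    weight = scale * conc
    return weight / (1.0 + weight)


def expression(activator, repressor, s_act=5.0, s_rep=5.0, quench=0.9):
    """Toy gene output: activator occupancy reduced by repressor quenching.

    All parameters are illustrative assumptions, not values from the paper.
    """
    act = occupancy(activator, s_act)
    rep = occupancy(repressor, s_rep)
    return act * (1.0 - quench * rep)


# The four kinds of embryo region described in panel (C), as
# (normalized activator, normalized repressor) pairs.
regions = {
    "high activator, low repressor": (1.0, 0.0),
    "low activator, low repressor": (0.0, 0.0),
    "high activator, high repressor": (1.0, 1.0),
    "low activator, high repressor": (0.0, 1.0),
}

for label, (a, r) in regions.items():
    print(f"{label}: {expression(a, r):.3f}")
```

Under these assumptions, output is highest where the activator is high and the repressor is low, and it is zero wherever the activator is absent.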

Figure 2

Structures of genes assayed to determine context dependence of short-range repressor activity, and representative in situ images showing lacZ activity. Mid blastoderm embryos are oriented dorsal up, anterior to the left. Genes 1–8 test activator–repressor spacing, 9–10 and 16–18 activator–repressor stoichiometry and spacing, 12–15 and 19 arrangement and promoter proximity, 11 and 20 activator number and affinity, and 21–27 alternative short-range repressors.

Figure 3

Representative [_lacZ_] versus [Gt] plots. (A) Structures of three genes assayed (1, 9, and 10). (B, C) Representative embryos imaged for Giant protein and lacZ reporter gene activity. (D) The data from multiple confocal embryo images were processed and compiled to provide normalized reporter gene [_lacZ_] versus normalized repressor [Gt].

Figure 4

Parameters found by the ES parameter estimation technique for scheme 2 of the model. (A) Root mean square error, E, is shown on the left, with corresponding scale shown on the left axis. Repressor-scaling factor R (referred to as _S_R in fractional occupancy model in Materials and methods) and cooperativity C are shown in the central and right portions, respectively, with scale shown on the right axis. (B) Quenching efficiency parameters are shown for increasing distances of repressors located 5′ of the activators on the left. Quenching efficiency levels relative to Twist proximal (T) sites and Dorsal proximal (D) sites are shown in the right panel. A non-monotonic decrease in quenching efficiency for increasing distances is observed.
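
The error measure E in panel (A) is a root mean square error. As a minimal sketch of that standard formula (the function name `rmse` and the example values are hypothetical, and this is not the authors' code):

```python
import math


def rmse(predicted, observed):
    """Root mean square error between two equal-length sequences."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical normalized reporter levels: model prediction vs. measurement.
model = [0.9, 0.6, 0.2]
measured = [1.0, 0.5, 0.2]
print(rmse(model, measured))
```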

Figure 5

Parameters for scheme 2 with the constraint that quenching efficiency parameters decrease monotonically. (A) Root mean square error E, repressor-scaling factor R, and cooperativity C labeled as in Figure 4. (B) Quenching efficiency parameters and relative quenching of Dorsal and Twist sites. Under this constraint, the level of quenching efficiency changes very little from 28 to 66 bp, in contrast to observed trends (Figure 2).
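
How the monotonic constraint was imposed during estimation is not described in this caption. As one illustration only, the sketch below checks whether a set of distance-ordered quenching values is non-increasing and clamps it with a running minimum. The helper names and the example values are hypothetical.

```python
def is_non_increasing(values):
    """True if no value exceeds the one before it."""
    return all(a >= b for a, b in zip(values, values[1:]))


def enforce_non_increasing(values):
    """Clamp each value to the running minimum so the sequence never rises."""
    clamped = []
    floor = float("inf")
    for v in values:
        floor = min(floor, v)
        clamped.append(floor)
    return clamped


# Hypothetical quenching efficiencies ordered by increasing
# repressor-activator distance (not values from the paper).
quenching = [0.9, 0.4, 0.6, 0.2]
print(is_non_increasing(quenching))
print(enforce_non_increasing(quenching))
```

A clamp like this flattens any rise into a plateau, which is one way a constrained fit can end up nearly constant over a range of distances.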

Figure 6

Parameters for scheme 2 with cooperativity parameters set to different levels. (A, B) Parameters found in our study (circles) and parameters found by constraint of cooperativity parameters to those from Segal et al (2008) (diamonds). The increased cooperativity value is compensated by a decreased repressor-scaling factor R. (C) Root mean square errors (RMSE) for cooperativity parameters (constrained to values between 0 and 30). Estimated cooperativity values from our model lie near the lowest point in this curve.

Figure 7

Validation of modeling by prediction of subsets of the data from parameters derived from the remainder of the data. (A) Leave-one-out analysis. Root mean square errors are calculated using parameters found from 11 genes (excluding the gene indicated) and from all the genes. The relative RMSE ratios show greater errors for prediction of genes 2, 9, and 16, indicating their greater contribution to the parameter constraints. (B) Leave-sets-out analysis for nine distinct sets of genes defined by their shared properties (Table II). Root mean square errors are calculated using parameters found from the reduced set and the entire set. The relative RMSE ratios show greater errors for prediction of sets 1, 2, and 4, indicating their greater contribution to the parameter constraints. (C) Predictions for leaving out set 8. Genes 1, 10, and 12 are predicted by using parameters found from the other 9 genes. Points represent average values for [_lacZ_] versus [Gt] data, which were divided into 20 bins. (D) Parameter estimation results are shown for different amounts of data (50, 75, and 100%). The data are cut randomly from each gene at the same percentage.
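
The binning in (C) and the leave-one-out comparison in (A) can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the relative ratio is taken here to be the held-out error divided by the error with all genes included, repressor levels are assumed normalized to [0, 1], and `fit` and `predict` are hypothetical callables standing in for the actual parameter estimation and model.

```python
import math


def bin_average(xs, ys, n_bins=20):
    """Average y in equal-width bins of x over [0, 1]; empty bins are skipped."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for x, y in zip(xs, ys):
        i = min(int(x * n_bins), n_bins - 1)  # x == 1.0 goes in the last bin
        sums[i] += y
        counts[i] += 1
    return [((i + 0.5) / n_bins, sums[i] / counts[i])
            for i in range(n_bins) if counts[i]]


def rmse(predicted, observed):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(
        sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)
    )


def leave_one_out_ratios(data, fit, predict):
    """Relative RMSE per gene: held-out error / error with all genes fitted.

    data    -- dict mapping gene id to (xs, ys)
    fit     -- hypothetical callable: dict of genes -> parameters
    predict -- hypothetical callable: (parameters, xs) -> predicted ys
    """
    full_params = fit(data)
    ratios = {}
    for gene, (xs, ys) in data.items():
        reduced = {g: d for g, d in data.items() if g != gene}
        held_out_params = fit(reduced)
        err_out = rmse(predict(held_out_params, xs), ys)
        err_all = rmse(predict(full_params, xs), ys)
        ratios[gene] = err_out / err_all if err_all > 0 else float("inf")
    return ratios
```

A ratio well above 1 for a gene means the remaining genes constrain the parameters poorly for it, which is how the caption interprets the larger errors.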

Figure 8

Extension of the model to endogenous regulatory elements. (A) The rhomboid gene is expressed in the blastoderm embryo in two lateral stripes (one shown in focal plane), under control of the Dorsal and Twist activators. Ventral expression is inhibited by the Snail short-range repressor, which is expressed in the presumptive mesoderm. The _cis_-regulatory modules used for analysis are shown. Different forms of rhomboid NEE enhancer are depicted, with varying number and arrangements of Snail short-range repressor-binding sites. Dorsal and Twist activators are shown by large and small green circles, respectively, and Snail repressors are shown by red squares. On the right are the predicted repression levels caused by Snail-binding sites shown in each module based on parameter estimation using this group of enhancers. (B) Predicted parameters for scaling factors for each transcription factor and cooperativity. Average and standard deviation for 20 estimation runs are shown.
