Conservation of enhancer location in divergent insects

Abstract

Dorsoventral (DV) patterning of the Drosophila embryo is controlled by a concentration gradient of Dorsal, a sequence-specific transcription factor related to mammalian NF-κB. The Dorsal gradient generates at least 3 distinct thresholds of gene activity and tissue specification by the differential regulation of target enhancers containing distinctive combinations of binding sites for Dorsal, Twist, Snail, and other DV determinants. To understand the evolution of DV patterning mechanisms, we identified and characterized Dorsal target enhancers from the mosquito Anopheles gambiae and the flour beetle Tribolium castaneum. Putative orthologous enhancers are located in similar positions relative to the target genes they control, even though they lack sequence conservation and sometimes produce divergent patterns of gene expression. The most dramatic example of this conservation is seen for the “shadow” enhancer regulating brinker: It is conserved within the intron of the neighboring Atg5 locus of both flies and mosquitoes. These results suggest that, like exons, an enhancer position might be subject to constraint. Thus, novel patterns of gene expression might arise from the modification of conserved enhancers rather than the invention of new ones. We propose that this enhancer constancy might be a general property of regulatory evolution, and should facilitate enhancer discovery in nonmodel organisms.

Keywords: Anopheles, dorsoventral patterning, evolution, Tribolium, cis-regulation


The maternal Dorsal nuclear gradient controls the dorsoventral (DV) patterning of the early Drosophila melanogaster embryo through the direct transcriptional regulation of ≈60–70 target genes in a concentration-dependent manner (1). Proteolytic cleavage of the Toll ligand Spatzle on the ventral side of the oocyte leads to the activation of the ubiquitous Toll receptor and the selective degradation of the Cactus inhibitor, resulting in the translocation of the Dorsal protein from the cytoplasm into the nucleus (2, 3). Thus, Toll signaling converts an extracellular gradient of processed Spatzle into a nuclear concentration gradient of the transcription factor Dorsal (4). The Dorsal gradient generates at least 3 primary thresholds of gene activity in the cellularizing embryo, leading to the specification of the mesoderm, neurogenic ectoderm, and dorsal ectoderm (5).

The subdivision of the DV axis into separate embryonic tissues depends on the differential regulation of target enhancers by distinct concentrations of the Dorsal protein. Mesodermal genes such as twist and snail have enhancers that contain low-affinity Dorsal binding sites, restricting the activation of these genes to ventral regions where there are peak levels of nuclear Dorsal (5). Neurogenic genes such as vnd, brinker, and sog possess enhancers with high-affinity Dorsal binding sites, and are consequently activated by lower concentrations of the gradient (6). These enhancers also contain Snail repressor binding sites, which inhibit the expression of neurogenic genes in the mesoderm where there are high levels of Dorsal (7, 8).
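The threshold logic described above can be illustrated with a toy calculation. This is only a minimal sketch under stated assumptions, not a model used in this study: site occupancy is treated as simple equilibrium binding, and the Dorsal levels, dissociation constants, and the 0.5 activation threshold are all hypothetical numbers chosen for illustration. The third threshold (dorsal ectoderm genes) is not modeled.

```python
def occupancy(dorsal, kd):
    """Equilibrium occupancy of a single binding site (assumed model)."""
    return dorsal / (dorsal + kd)


def enhancer_active(dorsal, kd, snail=False, snail_sites=False, threshold=0.5):
    """Enhancer is on when occupancy reaches the threshold and it is not repressed.

    Repression applies only if the enhancer carries Snail sites AND Snail
    is present in that region of the embryo.
    """
    if snail and snail_sites:
        return False
    return occupancy(dorsal, kd) >= threshold


# Hypothetical nuclear Dorsal levels (arbitrary units) and Snail presence
# for three positions along the DV axis.
REGIONS = [
    ("ventral", 1.0, True),    # presumptive mesoderm, Snail expressed
    ("lateral", 0.3, False),   # presumptive neurogenic ectoderm
    ("dorsal", 0.02, False),   # presumptive dorsal ectoderm
]


def expression_pattern(kd, snail_sites):
    """Map each region to True/False for enhancer activity."""
    return {
        name: enhancer_active(dorsal, kd, snail, snail_sites)
        for name, dorsal, snail in REGIONS
    }


# Low-affinity sites (high Kd), no Snail sites: a twist/snail-like enhancer.
mesodermal = expression_pattern(kd=0.8, snail_sites=False)
# High-affinity sites (low Kd) plus Snail sites: a sog/brinker-like enhancer.
neurogenic = expression_pattern(kd=0.1, snail_sites=True)

print(mesodermal)
print(neurogenic)
```

With these assumed values, the low-affinity enhancer is active only ventrally, whereas the high-affinity enhancer is active only laterally because Snail excludes it from the mesoderm.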

Cross-species comparisons of genes involved in insect DV patterning have made it increasingly clear that the broad Dorsal nuclear gradient is a Dipteran-specific innovation (9, 10). For example, the Dorsal gradient is narrowly restricted to ventral tissues and is transiently expressed during early embryogenesis in the flour beetle Tribolium castaneum (10). This spatial and temporal restriction in the Dorsal gradient is thought to reflect an entirely zygotic mode of Toll signaling, along with robust Dorsal feedback loops that limit Toll and Cactus expression to ventral tissues (11–13).

In most insects (but not the Diptera), Dorsal is limited to the mesoderm and perhaps the ventral-most regions of the presumptive neurogenic ectoderm (9, 10). In Tribolium, the patterning of the neurogenic ectoderm and dorsal ectoderm appears to be indirectly regulated by Dorsal, as Dorsal is involved in the formation of a broad Dpp signaling gradient emanating from the dorsal side of the embryo (14). In Drosophila, the Dpp inhibitor Sog is expressed in broad lateral stripes in response to low levels of Dorsal, where it functions to restrict Dpp signaling to the dorsal midline (8, 15). However, sog expression is limited to the ventral mesoderm of Anopheles and Tribolium embryos, and a critical transcriptional inhibitor of Dpp signaling, brinker, is altogether absent in the Tribolium embryo (14, 16). These changes in the deployment of Sog and Brinker are probably responsible for the expanded domains of Dpp signaling seen in Anopheles and Tribolium and the subsequent expansion of the amnion and serosa lineages (14, 17).

EGF signals pattern the ventral neurogenic ectoderm in Drosophila (18, 19). These signals arise from 2 separate sources. First, low levels of the Dorsal gradient directly activate the EGF signaling genes rhomboid and vein in ventral regions of the neurogenic ectoderm (6, 7). Second, the ventral midline is a source of another EGF signaling molecule, Spitz, which is released into lateral regions of the developing ventral nerve cord (19). Dorsal indirectly regulates Spitz through the activation of the midline determinant sim (6, 20). It is possible that Spitz might be sufficient to pattern the CNS in those insects where Dorsal is restricted to the mesoderm (21). Thus, the transcriptional outputs of the broad Dorsal gradient of Drosophila may be superimposed on an ancestral DV patterning system that depends on restricted expression of Dorsal and Dpp/EGF signaling relays mediated by Sog and Sim, respectively.

To explore the evolution of DV patterning mechanisms, we identified and characterized 5 different Dorsal target enhancers in Anopheles and Tribolium. Despite widespread regulatory changes, the positions of the enhancers are constant among these different insect groups. For example, the cactus enhancers of Drosophila and Tribolium are located within 3′ introns, whereas the brinker enhancers of Drosophila and Anopheles are located within the intron of a neighboring gene, Atg5. This invariance of enhancer positions reflects either orthology or functional constraint. We propose that the evolution of novel expression patterns depends on changes in conserved enhancers rather than the invention of new ones.

Results

Dorsal Target Enhancers in Anopheles.

Although the Dorsal gradient has not been visualized directly in Anopheles, it is assumed to function in a manner similar to that seen in Drosophila. To explore this possibility, we examined the regulation of 2 Dorsal target genes in Anopheles, twist and brinker, which are activated by high and low levels of the Dorsal gradient, respectively.

Computational analysis of an extended genomic interval encompassing the Anopheles twist locus identified a single cluster of putative Dorsal binding sites, located immediately upstream of the transcription start site (Fig. 1A). A genomic DNA fragment containing this binding-site cluster was attached to a lacZ reporter gene, and lacZ expression was examined in transgenic Drosophila embryos. Robust staining was detected in the ventral mesoderm, similar to the endogenous twist expression patterns in the early embryos of both Drosophila and Anopheles.

Fig. 1.

The Anopheles twist and brinker enhancers. (A and C) The locations of the Anopheles twist (A) and brinker (brk) (C) enhancers are indicated by pink rectangles. The twist enhancer is located 5′ of the gene; the brinker enhancer is ≈100 kb away in the intron of Atg5. (B and D) The Anopheles twist (twi) enhancer (B) drives lacZ reporter expression in the Drosophila mesoderm, whereas the Anopheles brinker enhancer (D) drives lacZ in the neurogenic ectoderm.

The Anopheles twist enhancer appears to correspond to the distal enhancer of the D. melanogaster twist locus, which contains separate proximal and distal enhancers in the 5′ flanking region (22, 23). The distal enhancer is responsible for sustained expression of twist in the developing mesoderm. In Drosophila virilis, the distal twist enhancer occupies the same position as its D. melanogaster counterpart. However, the proximal enhancer is located within the second intron in D. virilis (24), and it is possible that it is located in a 3′ position in Tribolium (25). Despite this apparent movement of the proximal enhancer, the critical distal enhancer is located in comparable 5′ positions in Drosophila, Anopheles, and Tribolium.

brinker encodes a transcriptional repressor of Dpp signaling that is expressed in the presumptive neurogenic ectoderm of the early Drosophila embryo (26). A similar brinker expression pattern is observed in Anopheles embryos (16). The Drosophila gene is activated by 2 enhancers: one is located ≈10 kb 5′ of the transcription start site, and the other is located within the intron of a neighboring gene, Atg5, ≈13 kb downstream of the brinker start site (6, 27). Computational analyses identified 3 clusters of putative Dorsal binding sites in an extended region encompassing the Anopheles brinker locus. Only the cluster located within the intron of the neighboring Atg5 gene directs broad lateral stripes of lacZ expression in transgenic Drosophila (Fig. 1 C and D). Thus, it would appear that the major enhancer regulating brinker expression in the early Anopheles embryo is located in the same position as the 3′ shadow enhancer in Drosophila. This is a remarkable example of conservation in the position of a developmental enhancer because the Anopheles Atg5 gene is located nearly 100 kb from brinker and is inverted as compared with the organization in Drosophila (Fig. 1 A and C).

Dorsal Target Enhancers in Tribolium.

The preceding analysis identified twist and brinker enhancers in Anopheles that are located in the same relative positions as the corresponding enhancers in Drosophila. To determine whether this trend of fixed enhancer positions extends to even more divergent insects, we identified Dorsal target enhancers in the flour beetle T. castaneum.

Previous studies have shown that the Dorsal gradient is narrow and transient in Tribolium embryos, presumably due to a Dorsal–Cactus negative feedback loop, whereby Dorsal activates the expression of the Cactus inhibitor (10, 13). To determine whether Dorsal directly regulates cactus expression, computational methods were used to identify putative Dorsal binding sites in the vicinity of the cactus locus. The best binding-site cluster is located within the second predicted intron of the cactus transcription unit (Fig. 2C). This genomic DNA fragment mediates robust expression of a lacZ reporter gene in the presumptive mesoderm of transgenic Drosophila embryos (Fig. 2 A and E). The gap in expression in anterior regions could be due to repression by Hunchback because there is greater integration of anterior–posterior and dorsal–ventral patterning in Tribolium, as compared with Drosophila (e.g., ref. 17).

Fig. 2.

The Tribolium cactus and sim enhancers. (A) Tribolium embryo stained for cactus mRNA (red) and Twist protein to mark the mesoderm (green). (B) sim (red) is expressed in the ventral midline in beetles. DNA is stained in blue (A and B), anterior is to the left, and dorsal is up in all cases. (C and D) The locations of the Tribolium cactus (C) and sim (D) enhancers are indicated by pink rectangles. (E and F) The Tribolium cactus enhancer and the Tribolium sim enhancer drive lacZ reporter expression in transgenic Drosophila in the mesoderm (cactus) (E) and ventral midline (sim) (F).

ChIP–chip assays (11) previously identified an intronic enhancer in the Drosophila cactus gene that mediates a similar pattern of expression (summarized in Fig. 3). It is possible that the Dorsal–Cactus negative feedback loop has a stronger impact on the dynamics of the Dorsal gradient in Tribolium than Drosophila because Toll signaling is maternally deployed in Drosophila, but is strictly zygotic in Tribolium (12, 13). In any event, these studies identified comparable cactus enhancers in similar locations in Tribolium and Drosophila, despite extensive divergence in the dorsoventral patterning networks.

Fig. 3.

Summary of enhancer positions. Enhancers are indicated by pink rectangles. In all cases, the Drosophila locus is shown on the left and the orthologous Tribolium or Anopheles gene on the right. Abbreviations: Dm, Drosophila melanogaster; Ag, Anopheles gambiae; Tc, Tribolium castaneum.

Additional DV enhancers were identified in Tribolium to determine whether they are also located in fixed positions relative to their target genes. cactus is activated by the highest levels of the gradient, and the next gene tested, sim, is activated just beyond the limits of the cactus expression pattern where it defines the ventral midline of the developing nerve cord (28, 29). Enhancers controlling midline expression of sim are located in the immediate 5′ flanking region in both honeybees (Apis mellifera) and mosquitoes (Anopheles gambiae) (6, 21). We surveyed the Tribolium sim locus for putative Dorsal, Twist, Snail, and Sim/Tango binding sites, and identified a cluster containing all these sites upstream of the P1 promoter (Fig. 2D). A genomic DNA fragment encompassing this cluster recapitulates the sim expression pattern in the ventral midline of transgenic Drosophila embryos (Fig. 2 B and F). Thus, the sim enhancer is located in the same relative position in Drosophila, Apis, and Tribolium, despite extensive evolutionary changes in the underlying DV patterning mechanisms (21).

The secreted Sog inhibitor is essential for the formation of a Dpp signaling gradient (15). In Drosophila, sog is activated by low levels of the Dorsal gradient throughout the presumptive neurogenic ectoderm (8), but its expression is restricted to ventral regions of Tribolium and Anopheles embryos (14, 16). In Tribolium, sog is initially expressed in a ventral domain extending slightly beyond the mesoderm before becoming restricted to the ventral neurogenic ectoderm (Fig. 4 A and C) (14). We identified a cluster of low-affinity Dorsal sites, as well as Twist and Snail binding sites, ≈10 kb upstream of the Tribolium sog promoter (Fig. 4E). This binding cluster recapitulates the Tribolium sog pattern when expressed in transgenic Drosophila embryos (Fig. 4 A–D). As seen for the endogenous sog expression pattern, the Tribolium enhancer initially directs lacZ expression throughout the presumptive mesoderm before becoming restricted to the ventral neurogenic ectoderm and ultimately the ventral midline (Fig. 4 B and D and Fig. S1).

Fig. 4.

The Tribolium sog enhancer. (A and C) Tribolium sog is expressed first in the mesoderm and ventral neurogenic ectoderm (A) before becoming restricted to the ventral neurogenic ectoderm (C). sog is in red, Twist in green, and DNA in blue. (B and D) The Tribolium sog enhancer has the same expression pattern when driving a reporter in transgenic Drosophila. (E) Location of the Tribolium sog enhancer (pink rectangle), with Pol II ChIP–chip (Upper) and cDNA expression profiles (Lower) of the Tribolium sog locus. White arrow indicates the promoter.

The Drosophila sog expression pattern is controlled by 2 separate enhancers: one located within the first intron, and another (the “shadow” enhancer) located ≈20 kb upstream of the gene (see Fig. 3) (8, 27). The Tribolium enhancer is located in the same general region as the D. melanogaster sog shadow enhancer, although somewhat closer to the transcription start site. There is no evidence for an intronic sog enhancer in Tribolium. Identification of the Tribolium sog 5′ exon (14) is well supported by Pol II ChIP–chip signals at the beetle sog promoter (Fig. 4E). Syntenic relationships between sog and its neighbors are not preserved between Drosophila and Tribolium, including CG8117, which resides between the D. melanogaster sog shadow enhancer and promoter. However, this CG8117–sog linkage is not seen in divergent drosophilids outside the melanogaster group, including Drosophila virilis, which contains an experimentally verified sog enhancer located ≈30 kb upstream of the start site (30).

The previous analysis of sog regulation in Anopheles identified an enhancer located in intron 1 (16). A computational survey of putative Dorsal binding clusters identified a second enhancer in the 5′ flanking region, in a position similar to that of the Drosophila shadow enhancer and the Tribolium 5′ enhancer; this second Anopheles enhancer directs reporter expression in the fly mesoderm (Fig. S2). It is possible that this 5′ enhancer is the major determinant of sog regulation in Drosophila, Anopheles, and Tribolium. The absence of intronic sog enhancers in some species may reflect the incomplete characterization of the locus or, alternatively, the acquisition of new enhancers in addition to the conserved 5′ enhancer.

Discussion

This study identified 5 different DV enhancers in A. gambiae and T. castaneum, representing a broad spectrum of patterning responses to the Dorsal gradient. These enhancers embody the largest collection of functionally defined regulatory DNAs engaged in a common process in divergent insects. Despite extensive differences in the DV regulatory networks of flies, mosquitoes, and beetles, the enhancers are located in similar positions relative to the promoters of the target genes they control. A constrained position is also observed for the previously identified vnd enhancer in Anopheles (31) (summarized in Fig. 3). Altogether, these 6 enhancers are located in all possible orientations, including the immediate 5′ flanking region (twist, sim), remote 5′ flanking region (sog), intron 1 (vnd), 3′ intron (cactus), and within a neighboring gene (brinker).
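The positional categories listed above (immediate or remote 5′ flanking region, intron, neighboring gene) can be assigned mechanically from genome coordinates. The following is a minimal sketch of one way to do so, not a procedure used in this study; the function name, the use of the enhancer midpoint, and all coordinates are hypothetical.

```python
def classify_enhancer_position(enh_start, enh_end, gene_start, gene_end,
                               strand, introns=()):
    """Return a coarse label for an enhancer's position relative to a gene.

    Coordinates are genomic with gene_start < gene_end regardless of strand.
    `introns` is a sequence of (start, end) pairs given in transcript order,
    so that "intron 1" is the 5'-most intron on either strand.
    The enhancer is placed by its midpoint (an assumption of this sketch).
    """
    mid = (enh_start + enh_end) // 2

    # Inside the transcription unit: report the intron if one contains it.
    if gene_start <= mid <= gene_end:
        for number, (start, end) in enumerate(introns, start=1):
            if start <= mid <= end:
                return f"intron {number}"
        return "within gene (exonic or unannotated)"

    # Outside the gene: 5' or 3' depends on the strand of the gene.
    if mid < gene_start:
        distance = gene_start - mid
        side = "5' flanking" if strand == "+" else "3' flanking"
    else:
        distance = mid - gene_end
        side = "5' flanking" if strand == "-" else "3' flanking"
    return f"{side}, {distance} bp from gene"


# Hypothetical gene on the plus strand with two introns.
introns = [(11_000, 12_000), (15_000, 16_000)]
print(classify_enhancer_position(9_000, 9_500, 10_000, 20_000, "+", introns))
print(classify_enhancer_position(15_200, 15_600, 10_000, 20_000, "+", introns))
# Same gene coordinates on the minus strand: higher coordinates are upstream.
print(classify_enhancer_position(21_000, 21_400, 10_000, 20_000, "-"))
```

An enhancer lying within a neighboring gene, as for brinker, would be found by running the same test against the neighbor's coordinates and introns.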

Given the rapid divergence of noncoding sequences due to the constant turnover of individual transcription-factor binding sites within enhancers (32), there are no arrangements of sites or even individual sites that can be thought of as orthologous in any of the pairs of enhancers described here (see Fig. S3 for binding-site arrangements). The most closely related pair of species examined in this study, Anopheles and Drosophila, last shared a common ancestor >200 million years ago. The 2 genomes have been rearranged to such an extent that only an estimated 34% of orthologous genes can be sorted into microsyntenic blocks (33). In light of this, it is somewhat surprising to see such conservation in enhancer positions. There are at least 2 possible explanations. First, the different enhancers identified in Anopheles and Tribolium might be orthologous, that is, they might derive from a common enhancer in the last shared ancestor. The position of the enhancer within the locus is apparently unchanged simply because no genomic events that would perturb its position have occurred. According to this view, selection might depend on the modification of preexisting enhancers rather than the creation of new ones, similar to the evolution of protein coding sequences (34, 35). A nonexclusive alternative explanation is that the enhancer position is functionally constrained within a genomic locus. For example, enhancers might be able to communicate with the target promoter only when located in particular positions within the higher-order structure of a complex genetic locus. Thus, enhancers might be in a constant flux of death and birth, but de novo enhancers work only when located in particular positions. The Hox gene complexes represent extreme examples of constrained enhancer organization (36). Perhaps simpler loci are subject to similar, but somewhat less stringent, organizational constraints.

The conservation of enhancer location observed in this study applies to developmental control genes. It is certainly conceivable that enhancer turnover alters the locations of regulatory DNAs, particularly in the case of housekeeping genes. An interesting example is seen for the spec genes in the sea urchin Strongylocentrotus purpuratus, which have coopted repetitive elements to function as enhancers (37). In contrast, regulatory genes engaged in interlocking network interactions, as seen for the genes considered in this study, might tend to retain old enhancers rather than invent new ones. An implication of this study is that the evolution of novel patterns of gene expression depends on the modification of existing enhancers rather than the invention of new ones. This has already been documented for the evolution of the sim expression pattern and ventral midline of divergent insects (21). We propose that the modification of constrained or orthologous enhancers will prove to be a general mechanism for the evolution of gene expression patterns.

Materials and Methods

Fly Transgenics.

T. castaneum sim, T. castaneum cactus, and A. gambiae twist enhancers were cloned into the P-element vector nE2G (6). A. gambiae brinker and T. castaneum sog enhancers were cloned into the ΦC31 transgenesis equivalent (eve minimal promoter, lacZ reporter) (38) provided by Michael Eisen (University of California, Berkeley). The D. melanogaster strain used for P-element-mediated transgenesis was yw67, as described previously (e.g., ref. 39). All experiments involving P-elements were performed with multiple independent lines. ΦC31-mediated transgenesis was carried out by using either 86Fb (40), or an AttP2 (41) pNos-integrase line, or both. Oligos are listed in Table S1.

Imaging.

Drosophila and Tribolium embryo fixation and in situ hybridizations were carried out according to methods described previously (21, 42). For RNA detection, embryos were hybridized with digoxigenin (Roche) or dinitrophenyl (Perkin–Elmer)-labeled probes and visualized colorimetrically (22) or fluorescently (42) together with the DNA stain Draq5 (Biostatus). Probes were generated with the primers listed in Table S1 and in vitro transcription. In general, probes were 2–3 kb in length, and designed to the 5′ end of the gene. The Tribolium Twist antibody was raised in rat to a nearly full-length N-terminal 6His-conjugated Tribolium Twist fusion protein, and was visualized with donkey anti-rat Alexa 488. Oligos are listed in Table S1.

Microarray Analysis.

Sequence for the Tribolium sog locus was taken from the Tcas_2.0 Baylor HGSC assembly. The tiled region consists of ≈100 kb on LG9 (10930000–11035000). NimbleGen designed 50-bp features covering this region at a density of ≈1 feature per 90 bp. Chromatin for IPs was prepared from ≈15–40-h-old beetle embryos (raised at 30 °C) according to ref. 43, with the exception that beetle embryos were dechorionated for 2 min in 100% bleach and were cross-linked for 10 min in formaldehyde-equilibrated hexanes. Chromatin IPs were performed according to NimbleGen's standard protocol (http://www.nimblegen.com) by using a mixture of Pol II antibodies (H14, H5, and 8WG16; Covance). Final IP samples were amplified by using the WGA-2 kit (Sigma) before being submitted to NimbleGen for labeling and hybridization. Visualization and scaling of tiling data were done with the Integrated Genome Browser (Affymetrix) and are shown as the log2 of the ratio between the IP and the input control samples. Generation of Tribolium cDNA to hybridize to the array has been described previously (44), and was carried out with ≈5–15-h-old embryos raised at 30 °C.
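The arithmetic behind the tiling design and the plotted signal can be sketched as follows. This is an illustration only: the function name and the example intensities are hypothetical, and the optional pseudocount is an assumption of this sketch rather than part of the NimbleGen or Integrated Genome Browser processing.

```python
import math


def log2_ratio(ip_signal, input_signal, pseudocount=0.0):
    """log2 of IP over input for one array feature.

    `pseudocount` is an optional guard against zero intensities
    (an assumption of this sketch, off by default).
    """
    return math.log2((ip_signal + pseudocount) / (input_signal + pseudocount))


# Tiled interval on LG9, taken from the coordinates given in the text.
region_start, region_end = 10_930_000, 11_035_000
region_bp = region_end - region_start

# At roughly 1 feature per 90 bp, the approximate number of features is:
approx_features = region_bp // 90

print(region_bp, approx_features)

# Hypothetical intensities: a 4-fold enrichment gives a log2 ratio of 2.
print(log2_ratio(8.0, 2.0))
```

On this scale a value of 0 means no enrichment over input, and each unit corresponds to a 2-fold change.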

Computational Analysis.

Anopheles and Tribolium orthologs of Drosophila genes were identified by a reciprocal BLAST–BLAST strategy. For each gene, the extended locus was scanned for clusters of binding sites by using ClusterDraw2, and the top clusters were assayed for enhancer activity in transgenic Drosophila. ClusterDraw2 has been described previously (21). The program and the motifs used in this analysis are available online at http://flydev.berkeley.edu/cgi-bin/cld/submit.cgi. Dorsal, Twist, and Snail sites have been described previously in ref. 45, and Zelda sites in refs. 46 and 47. Additionally, a table of binding sites used in this analysis is available as Table S2.
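The idea of scanning an extended locus for clusters of binding sites can be sketched with a simple sliding window. This is not ClusterDraw2 and does not reproduce its scoring. The regular expressions are simplified placeholders loosely modeled on consensus sequences, not the motifs in Table S2; only the forward strand is scanned; and the window and step sizes are arbitrary.

```python
import re

# Placeholder motif patterns (assumptions for illustration only).
MOTIFS = {
    "Dorsal": "GGG[AT]{4}CC[AC]",
    "Twist": "CA[CT]ATG",
    "Snail": "CAGGT[AG]",
}


def find_sites(seq):
    """Return sorted (position, factor) pairs for forward-strand matches.

    A lookahead is used so that overlapping matches are all reported.
    """
    seq = seq.upper()
    hits = []
    for factor, pattern in MOTIFS.items():
        for match in re.finditer(f"(?=({pattern}))", seq):
            hits.append((match.start(), factor))
    return sorted(hits)


def best_window(seq, window=500, step=50):
    """Return (start, site_count) for the first window with the most sites."""
    hits = find_sites(seq)
    best_start, best_count = 0, 0
    last_start = max(1, len(seq) - window + 1)
    for start in range(0, last_start, step):
        count = sum(1 for pos, _ in hits if start <= pos < start + window)
        if count > best_count:  # strict '>' keeps the earliest best window
            best_start, best_count = start, count
    return best_start, best_count


# Synthetic sequence with one site of each type embedded in filler.
demo = ("A" * 100 + "GGGAAAACCA" + "T" * 20 + "CACATG"
        + "T" * 20 + "CAGGTA" + "A" * 400)

print(find_sites(demo))
print(best_window(demo, window=100, step=10))
```

In practice the top-ranked windows would then be tested for enhancer activity in transgenic embryos, as described above; a count of pattern matches is only a crude stand-in for a weighted cluster score.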

Acknowledgments.

We thank Matthew Ronshaugen and Michael Perry for insightful comments and discussions and Dave Hendrix for technical help. J. C. was funded in part by a Chang Ling Tien fellowship for the environmental sciences. This work was funded by a Moore Foundation grant and National Institutes of Health Grant GM46638.

Footnotes

The authors declare no conflict of interest.
