Organization of an echinoderm Hox gene cluster

Abstract

The Strongylocentrotus purpuratus genome contains a single ten-gene Hox complex >0.5 megabase in length. This complex was isolated on overlapping bacterial artificial chromosome and P1 artificial chromosome genomic recombinants by using probes for individual genes and by genomic walking. Echinoderm _Hox_ genes of Paralog Groups (PG) 1 and 2 are reported. The cluster includes genes representing all paralog groups of vertebrate Hox clusters, except that there is a single gene of the PG4–5 types and only three genes of the PG9–12 types. The echinoderm Hox gene cluster is essentially similar to those of the bilaterally organized chordates, despite the radically altered pentameral body plans of these animals.


The linkage of Hox genes in large genomic complexes is a definitive functional character of these genes. Mainly on the basis of evidence from insects and chordates, the spatial order of the domains of Hox gene expression during the developmental formulation of the anterior/posterior axis of the body plan has been seen to reflect their order in the genome (1–3). Hox genes that are apparently orthologous with those expressed developmentally from anterior to progressively posterior locations in chordates and insects are evidently a synapomorphy of the bilaterian metazoans (4, 5). But it is remarkable, considering their fundamental importance for both developmental and evolutionary bioscience, that current structural knowledge of Hox gene complexes is phylogenetically so narrow (see Fig. 1). Outside of arthropods and chordates, the only genomic Hox complex so far known is the highly reduced four-gene cluster of _Caenorhabditis elegans_ (6).

Figure 1.

Phylogenetic tree for Metazoa, including representative protostome phyla and the three phyla that constitute the deuterostomes. Molecular phylogenies divide the protostomes into two great clades, namely the ecdysozoans (here arthropods to priapulids) and the lophotrochozoans (here flatworms to molluscs); deuterostomes consist of hemichordates and echinoderms, which are sister groups, plus chordates (vertebrate and invertebrate) (40–44). The only phyla in which Hox gene clusters have been structurally characterized at the genome level are boxed (see text for references). The echinoderm box refers to the present work.

Among the particular reasons to focus on the Hox gene cluster in echinoderms are the following: (i) Echinoderms plus hemichordates constitute the sister group of the chordates within the deuterostomes (Fig. 1); an echinoderm Hox gene cluster might thus provide outgroup evidence of use in distinguishing trends in _Hox_ gene cluster evolution within the chordates. (ii) Echinoderms have radially organized pentameral body plans that differ in many ways from those of other deuterostomes. Thus the relations between the developmental domains of expression of given echinoderm Hox genes and their respective genomic positions might illuminate the evolutionary transformations that led to the appearance of their unique morphological features. (iii) Though fragments of many individual Hox genes had been isolated from sea urchins (7–11) and starfish (12) by PCR and other methods, their specific classification (and hence the interpretation of their patterns of expression) in many cases required knowledge of their relative genomic positions. Furthermore, (iv) Strongylocentrotus purpuratus, the subject of this work, is a maximal indirect developer (13). The embryo gives rise to a free-living bilaterally organized feeding larva that in its structural features bears essentially no relation to the pentameral adult. The adult body plan develops in an elaborate postembryonic process occurring in special growing tissues within the larva (14, 15). As described elsewhere (16), we have recently shown that most of the genes of the Hox complex of S. purpuratus are not used for the embryonic development of the larva, but all are expressed during development of the adult body plan. In contrast, all chordates, vertebrate and invertebrate, develop directly in the strict sense (17), and they begin to express their Hox genes in midembryogenesis as soon as the primary axis of the adult body plan is laid down. These developmental differences suggest that interesting insights may derive from genomic comparisons of the regulatory apparatus upstream of Hox genes in chordates and in indirectly developing echinoderms. Isolation and enumeration of the _S. purpuratus Hox_ gene complex would constitute a necessary initial step to regulatory analysis.

MATERIALS AND METHODS

Gene-Specific Probes.

The origins of gene-specific probes representing regions outside of the already known homeobox sequences were as follows: a full-length cDNA was cloned for the _SpHox7_ gene by using a fragment generated by 3′ rapid amplification of cDNA ends (18). DNA clones representing SpHox9/10 and SpHox11/13b had been obtained earlier (19, 20). λ genomic clones were obtained for SpHox3, SpHox4/5, SpHox8, and SpHox11/13a. To obtain these clones, we used gene-specific fragments derived from the homologous genes of _Paracentrotus lividus_ [_PlHbox11_, an ortholog of _SpHox3_ (11)], and _Heliocidaris erythrogramma_ [_HeHbox9_, an ortholog of _SpHox4/5_; _HeHbox6_, an ortholog of _SpHox8_; _HeHbox10_, an ortholog of _SpHox11/13a_ (9)]. These fragments were used to screen an S. purpuratus λFIXII genomic library. Probes were labeled by random priming. Hybridization of filters containing a total of 3 × 10⁵ clones was carried out for 16 h at 60°C in 5× SSC (1× SSC = 0.15 M sodium chloride/0.015 M sodium citrate, pH 7)/0.2% SDS/5× Denhardt’s solution (1× Denhardt’s solution = 0.02% polyvinylpyrrolidone/0.02% Ficoll/0.02% BSA)/100 μg/ml denatured salmon sperm DNA.

Sequence Analysis.

The sequences were analyzed by using the BLAST and FASTA algorithms. The relationship of _SpHox9/10_, SpHox11/13a, and _SpHox11/13b_ with other Hox genes was analyzed by using a maximum parsimony method of phylogenetic reconstruction [PAUP (Phylogenetic Analysis Using Parsimony)] (21).
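As a rough illustration of what a maximum parsimony criterion measures, the following sketch scores fixed tree topologies with the Fitch algorithm, counting the minimum number of residue changes each tree requires. It is not the PAUP procedure used here; the taxon labels (`t1` to `t4`), the two-residue "alignment," and the function names are hypothetical.

```python
# Illustrative sketch only: Fitch small-parsimony scoring on a fixed tree.
# Taxon names and residues below are hypothetical, not data from this study.

def fitch(node, leaf_states):
    """Return (candidate ancestral states, minimum changes) for one site."""
    if isinstance(node, str):          # leaf: its observed residue, no changes
        return {leaf_states[node]}, 0
    left, right = node
    left_set, left_cost = fitch(left, leaf_states)
    right_set, right_cost = fitch(right, leaf_states)
    shared = left_set & right_set
    if shared:                         # children can agree: no extra change
        return shared, left_cost + right_cost
    return left_set | right_set, left_cost + right_cost + 1


def tree_length(tree, alignment):
    """Sum the Fitch cost over all columns of an alignment {taxon: sequence}."""
    n_sites = len(next(iter(alignment.values())))
    total = 0
    for i in range(n_sites):
        column = {taxon: seq[i] for taxon, seq in alignment.items()}
        total += fitch(tree, column)[1]
    return total


# Hypothetical two-site alignment and two candidate topologies.
alignment = {"t1": "AK", "t2": "AK", "t3": "PR", "t4": "PK"}
tree_a = (("t1", "t2"), ("t3", "t4"))
tree_b = (("t1", "t3"), ("t2", "t4"))
print(tree_length(tree_a, alignment), tree_length(tree_b, alignment))
```

Under this criterion the topology requiring fewer changes is preferred; a full analysis would also search over topologies rather than compare two chosen by hand.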

Pulsed-Field Gel Electrophoresis.

Methods used in genomic DNA preparation and digestion and parameters used for pulsed-field gel electrophoresis have been described previously (18). To prepare the probes used for the pulsed-field gel blot hybridizations or subsequent mapping studies, fragments outside the homeobox were subcloned in pBluescript. The probes were tested against S. purpuratus genomic DNA to detect any repetitive elements contained in them.

P1 Artificial Chromosome (PAC) and Bacterial Artificial Chromosome (BAC) Library Construction and Screening.

A P1 artificial chromosome library was constructed in the pCyPAC7 vector (a kind gift of Chris Amemiya, Boston University). DNA from the sperm of a single animal was partially digested with _Mbo_I and the library was constructed as described (22). The BAC library was constructed in the pBACe3.6 vector (a kind gift of Pieter de Jong, Roswell Park Cancer Institute, Buffalo, NY; GenBank accession no. U80929). Agarose-embedded DNA from a second animal was partially digested by _Eco_RI/_Eco_RI methylase competition (23). PAC and BAC genomic fragments containing all S. purpuratus Hox genes were also cloned from arrayed libraries by using the following hybridization conditions: Four 22 × 22-cm filters containing a total of about 8 × 10⁴ clones were incubated for 16 h at 65°C in 5× SSPE (1× SSPE = 0.15 M NaCl/10 mM phosphate, pH 7.4/1 mM EDTA)/0.5% SDS/5× Denhardt’s solution/50 μg/ml denatured salmon sperm DNA. The filters were washed to a criterion of 1× SSPE/0.5% SDS at 65°C and exposed for a few hours. Positive clones were analyzed by restriction mapping and their homeoboxes amplified by PCR and then sequenced.

RESULTS

Hox Genes of S. purpuratus Identified with Canonical Probes.

An exhaustive PCR screen of S. purpuratus genomic DNA was carried out by using a pair of degenerate oligonucleotide primers (18) that recognize canonical sequence elements within the homeodomains of many Hox genes. Eight different _Hox_-type sequence fragments were recovered, plus an ortholog of the _Xenopus XlHbox8_ gene (24). These eight _Hox_-type homeobox sequences had all been recovered in various earlier studies on different sea urchin species (7–11) and had been subjected to detailed analysis by using several phylogenetic reconstruction procedures by Popodi et al. (9). At the amino acid level, the S. purpuratus homeodomain sequences are almost identical to those of Heliocidaris erythrogramma, which were used for that analysis. These eight _Hox_-type sequences could be classified as follows, based on comparison with _Drosophila_ and mouse Hox genes (25, 26):

SpHox3 was unambiguously assigned to PG3, based on multiple uniquely shared residues.

A single PG4 or 5 gene was found and named SpHox4/5.

Three genes were found that were related to vertebrate genes of PG6, 7, or 8, but their identity within this subgroup could not be unequivocally determined by sequence alone.

Three other genes were clearly related to the posterior group genes of vertebrates, i.e., PG9–13.

Sequence comparisons between the homeodomains of the S. purpuratus Hox genes and those of their vertebrate and _Drosophila_ orthologs (25) are shown in Fig. 2 (paralog group assignments in Fig. 2 are based on position within the Hox cluster as described below, as well as on the homeodomain sequences per se). As already demonstrated by Popodi et al. (9), the sea urchin Hox gene sequences are obviously more closely related to the deuterostome Hox genes than to those of _Drosophila_.

Figure 2.

Alignment of vertebrate, amphioxus, _Drosophila_, and S. purpuratus homeodomain sequences. The S. purpuratus homeodomain sequences are shown flanked by arrows representing the positions of the PCR fragments used in our screens (18). Homeodomain sequences for some genes were obtained by other methods or were from a combination of sources, e.g., clones isolated by genomic walking or cDNA clones, as described in the text. In vertebrate homeodomain consensus sequences (VERT), uppercase letters indicate a residue conserved in all known vertebrate sequences of that paralog group, e.g., all mouse and human PG1 genes (24). Lowercase letters indicate a residue found in the majority but not all vertebrate sequences of each paralog group, i.e., comparing the multiple vertebrate sequences available for each Paralog Group (there is only a single amphioxus gene from each Paralog Group). Dashes indicate amino acid identity at that position between the S. purpuratus genes and all vertebrate genes as well as _Drosophila_ and amphioxus genes of that paralog group. Amphioxus sequences [AMP, from _Branchiostoma_ (3)] are shown below the vertebrate consensus sequences. _Drosophila_ sequences included in the comparison are _Labial_ (LB), _Proboscipedia_ (PB), _Deformed_ (DFD), and _Abdominal B_ (ABD-B). Sequences are compiled from ref. 25.

“Anterior” Hox Genes of S. purpuratus: SpHox1 and SpHox2.

Previous surveys of _Hox_ genes in sea urchins had failed to detect any “anterior” class genes (7–10), and it was even suggested (8) that their absence might be interpreted in terms of the peculiar modifications of the echinoderm body plan, which lacks obvious head structures. However, a PG3 Hox gene was then recovered from _Paracentrotus lividus_ (11) and, as noted above, a _Strongylocentrotus_ PG3 homeobox also emerged from our PCR screen. We now report the identification of PG1 and 2 Hox genes as well. Invisible to the genomic PCR approaches that we and others had attempted, these genes were recovered only by genomic walking. As shown in Fig. 2, diagnostic residues in their homeodomain sequences identify genes of these paralog groups (e.g., the alanine at position 9 of the homeodomain in PG1 genes and the proline at this position in PG2 and 3 genes) (25). It is clear that sea urchins possess the same complete complement of “anterior” class Hox genes as do other bilaterians.

Two features again relate the SpHox1–3 genes more closely to their chordate than to their Drosophila counterparts. First, the homeodomain sequences are more similar and, in the case of _SpHox3_, this similarity to the vertebrate sequences extends beyond the homeodomain (not shown). Second, whereas the homeodomains of _Drosophila lab_ and _pb_ genes are interrupted by introns, the vertebrate PG1 and 2 genes are not (26) and neither are the SpHox1 and SpHox2 genes.

Single Hox Gene Cluster.

Multiple sequences belonging to given Hox gene paralog groups have not been recovered in any of the PCR screens carried out on echinoderm nucleic acids (refs. 7–11 and present studies), and this plus our initial pulsed-field gel electrophoresis experiments on S. purpuratus Hox genes (18) led to the supposition of a single Hox gene complex per haploid genome in this species. To confirm this, many additional pulsed-field gel hybridizations (27) were performed by using the complete set of Hox gene probes, which this work made available. The three experiments reproduced in Fig. 3 are representative of gel-blot hybridizations carried out with all ten Hox gene probes. The DNA of S. purpuratus displays 4–5% intraspecific sequence polymorphism (28), and thus on random expectation there is a significant probability that a given restriction enzyme target site sequence that is present in one haploid genome of an individual will be missing in the other or that a different site will be present. Thus two bands per single copy sequence sometimes appear; examples can be seen in each of the panels of Fig. 3, though most bands are single. A 450-kb _Not_I band hybridizes with SpHox6 and _SpHox11/13a_ probes (Fig. 3 B and C) and with probes for four other Hox genes as well (not shown), but none of the three “anterior” Hox genes are included in this DNA fragment (e.g., SpHox3; see Fig. 3A), nor is the 5′ terminal gene of the “posterior” type. We show below that the four “anterior”-most Hox genes are included within a span of ≈100 kb. The Hox cluster of S. purpuratus is thus >0.5 megabase in length, larger than either the 300-kb Branchiostoma (amphioxus) cluster (3) or the mouse and human clusters, which are on the order of 100 kb (26, 29). Hox cluster length is not correlated with genome size in any simple way over these great phylogenetic distances, as the S. purpuratus genome (30) is only about one-fourth the size of mammalian genomes.
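The "random expectation" invoked for polymorphic restriction sites can be sketched numerically. The calculation below assumes, purely for illustration, that the 4–5% sequence difference between haplotypes is spread independently and uniformly over bases, and it takes the 8-bp recognition sequence of _Not_I as the example site; the function name is hypothetical.

```python
# Back-of-envelope sketch under simplifying assumptions (independent,
# uniformly distributed single-base differences between the two haplotypes).

def prob_site_differs(site_length, per_base_divergence):
    """Chance that at least one base of a recognition site differs."""
    return 1.0 - (1.0 - per_base_divergence) ** site_length


NOTI_SITE_LENGTH = 8  # NotI recognizes an 8-bp sequence

for divergence in (0.04, 0.05):
    p = prob_site_differs(NOTI_SITE_LENGTH, divergence)
    print(f"divergence {divergence:.2f}: P(site altered) = {p:.3f}")
```

Under these assumptions roughly one site in three to four would be altered on the second haplotype, which is in line with the occasional appearance of two bands for a single-copy sequence.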

Figure 3.

Genome blot hybridizations carried out on pulsed-field electrophoretic displays of S. purpuratus genomic DNA. The DNA was obtained from sperm of a single individual. Seven different restriction enzymes were used for the blots in each panel, as indicated. Arrows indicate common bands revealed by probes for more than one gene.

Order of S. purpuratus Hox Genes in a Genomic Contig.

BAC and PAC genomic libraries were constructed, each from the sperm of a single individual. The PAC library, constructed in the vector pCyPAC7, a modification of the pCyPAC2 vector (22), contained inserts averaging 80 kb and afforded 7-fold genome coverage. The average insert length in the BAC library, constructed in the pBACe3.6 vector, was 140 kb, and it provided about 13-fold genome coverage. The libraries were arrayed by using a Q-Bot robot (Genetix, Christchurch, Dorset, UK) and spotted at high density on 22 × 22-cm nylon membranes (31). The filters were screened with the Hox gene probes and about 70 positive genomic recombinants were recovered from each library, so that each gene was represented on multiple inserts. Subarrays were prepared and, to determine overlaps, each insert was challenged with all relevant _Hox_ gene probes. To confirm the screening results, the homeoboxes were recovered by PCR from each of the key inserts and sequenced. In this way the exact order of the Hox gene cluster was unequivocally determined. We made no attempt to map the positions of the genes within the inserts so as to establish intergenic distances. Even so, it is clear that gene density is highest at the “anterior” end of the cluster. Thus SpHox1, SpHox2, SpHox3, and SpHox4/5 genes were all found within a single PAC recombinant of about 100 kb in length. The “anterior” genes are also more densely packed in the mouse (29), Fugu (32), and amphioxus (3) Hox gene clusters.
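The relation between insert size, clone number, and fold coverage quoted above can be sketched as follows. The haploid genome size of 800 Mb is an assumption introduced for illustration (the text states only that the genome is about one-fourth the size of mammalian genomes), and the function names are hypothetical. The representation probability uses the standard Poisson approximation for randomly sampled clones.

```python
import math

# Assumed value, not given in the text: haploid genome size in base pairs.
ASSUMED_GENOME_BP = 800_000_000


def clones_needed(fold_coverage, genome_bp, mean_insert_bp):
    """Number of clones giving the stated fold coverage."""
    return math.ceil(fold_coverage * genome_bp / mean_insert_bp)


def prob_locus_represented(fold_coverage):
    """Poisson approximation: chance a given locus is in at least one clone."""
    return 1.0 - math.exp(-fold_coverage)


for name, insert_bp, coverage in (("PAC", 80_000, 7), ("BAC", 140_000, 13)):
    n = clones_needed(coverage, ASSUMED_GENOME_BP, insert_bp)
    p = prob_locus_represented(coverage)
    print(f"{name}: ~{n} clones, P(locus represented) = {p:.6f}")
```

With these assumed inputs both libraries come out at roughly 7 × 10⁴ clones, and a single-copy probe would be expected to hit a number of clones close to the fold coverage, which is why each gene appears on multiple inserts.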

The gene order is shown in Fig. 4 in which, for simplicity, only a small minority of the specific genomic fragments included in the analysis is indicated. The order establishes the linear arrangement of those genes that are too similar to be assigned unequivocally on the basis of sequence alone.
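The logic of ordering genes from probe content can be sketched in code: because each insert is one contiguous stretch of genome, the probes that hybridize to it must be adjacent in the true order. The sketch below searches for all probe orders satisfying that constraint. The clone names, the probe labels `A` to `E`, and their assignments are hypothetical and do not reproduce the clones analyzed here.

```python
# Sketch: order probes so that every clone's probe set is contiguous.
# Clone names and probe contents are hypothetical, not data from this study.

def consistent_orders(probes, clones):
    """Return probe orders (one per mirror-image pair) in which the probes
    positive on each clone occupy adjacent positions."""
    clone_sets = [frozenset(content) for content in clones.values()]
    orders = []

    def can_append(prefix, probe):
        # A clone already "left behind" in the prefix cannot be re-entered.
        for members in clone_sets:
            if probe not in members:
                continue
            placed = [i for i, p in enumerate(prefix) if p in members]
            if placed and placed[-1] != len(prefix) - 1:
                return False
        return True

    def extend(prefix, remaining):
        if not remaining:
            order = tuple(prefix)
            if order <= order[::-1]:   # keep one orientation of each order
                orders.append(order)
            return
        for probe in sorted(remaining):
            if can_append(prefix, probe):
                extend(prefix + [probe], remaining - {probe})

    extend([], frozenset(probes))
    return orders


clones = {
    "clone1": {"A", "B", "C"},
    "clone2": {"B", "C", "D"},
    "clone3": {"C", "D"},
    "clone4": {"D", "E"},
    "clone5": {"A", "B"},
}
print(consistent_orders("ABCDE", clones))
```

When the overlaps are dense enough, a single order (up to orientation) survives; with sparse overlaps several orders remain, which is why multiple independent inserts per region are needed.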

Figure 4.

Organization of the S. purpuratus Hox gene cluster. The diagram is not to scale, as the intergenic distances within the cluster have not been determined. The sequence of Hox genes within the cluster was inferred from their locations within PAC and BAC genomic inserts (see text) and the overlaps amongst clones containing each gene. For brevity, only one set of PAC genomic clones is shown, though each genomic region was analyzed on the basis of overlaps of multiple independent PAC and BAC clones. The correct names of the Hox genes with respect to their paralogous affinities with vertebrate Hox genes appear at the top of the diagram, and beneath in parentheses are designations found in earlier literature describing isolations of _Hox_ homeodomains or cDNAs in various laboratories (see text for references). The dashed line indicates the span of the 450-kb fragment indicated in Fig. 3, which includes all the genes from _SpHox4/5_ to SpHox11/13a.

DISCUSSION

PG4–5 Gene.

The mapping data of Fig. 4 confirm that only a single gene of the PG4–5 type exists in the S. purpuratus cluster, thus explaining why only one example of this type of _Hox_ homeodomain has been recovered in any of the sea urchin _Hox_ gene screens cited above. We named this gene _SpHox4/5_ because its homeodomain is not obviously closer to either one of the possible cognates (Fig. 2). A single _Hox_ gene of this subtype may be a synapomorphy of the echinoderms, because a starfish also has only one gene of the PG4–5 type (12), and starfish are a distantly related echinoderm class with respect to echinoids (33). Hemichordates (34) and amphioxus (3) have two genes of this type, i.e., the PG4 and 5 Hox genes, as does Drosophila (i.e., the Dfd and Scr genes), which here serves as a distant outgroup.

The Posterior Group Genes.

The genes at the 5′ end of the _Hox_ clusters, i.e., in vertebrates, the PG9–13 genes, appear to have evolved more rapidly (35). There are three _Hox_ genes of this subgroup in S. purpuratus (Fig. 4). A maximum parsimony analysis indicates that these three genes fall into two subclasses, one containing a single gene, SpHox9/10, which is very similar to the chordate PG9 and 10 genes; the other containing two genes, most closely related to the vertebrate PG11–13 Hox genes. These S. purpuratus genes are hence designated SpHox11/13a and _SpHox11/13b_. The “posterior” Hox genes of vertebrates have similarly been classified into these same two groups in a previous phylogenetic analysis (36). Expansions of the “posterior genes” may have occurred in deuterostomes, since amphioxus has four such genes (37), whereas vertebrate clusters possess five (2) compared with the three in this sea urchin. Because there is no specific orthology relationship between the S. purpuratus SpHox11/13a and SpHox11/13b genes and the chordate PG11–13 genes, some of these gene duplications or paralogous expansions may be phylum specific. The implication is that an ancestral deuterostome form may have possessed one Hox gene of the PG9–10 type and one gene of the PG11–13 type.

Regulation of Hox Gene Expression and the Echinoderm Body Plan.

The pentamerally symmetric adult body plan of echinoderms is obviously very highly modified from those of their bilaterian ancestors, and it differs greatly from the bilaterally organized body plans of hemichordates and chordates, the other living deuterostome phyla. Echinoderms lack obvious head structures, and their radially organized water vascular systems and central nervous systems, as well as their calcite endoskeletons, are all phylum-specific characters. Yet, as this work shows, their Hox gene complex is essentially the same as that of chordates, with the minor exceptions discussed above, namely that there is a single gene of the PG4–5 type and three rather than a larger number of genes of the PG9–13 type. Though their body plans contrast with those of other deuterostomes particularly in the anterior regions, they possess exactly the same “anterior” group Hox genes as do chordates, i.e., the PG1, 2, and 3 genes. We cannot here discuss the evolutionary derivation of the echinoderm body plan, except to emphasize that the presence of the complete Hox gene complex in the sea urchin strongly validates the view (38) that evolutionary changes in morphogenetic _Hox_ gene function have depended primarily on regulatory alterations. Hox genes are called into play downstream of the prior patterning processes that initially define morphological elements of the body plan, and they control the institution of other patterning processes within these morphological domains (e.g., 2, 38–40). Their expression depends specifically on the structures of their own cis-regulatory elements, and their function depends likewise on the structures of the cis-regulatory elements of their target genes. Both of these key sets of genomic regulatory sequences have evidently changed markedly during evolution.

Acknowledgments

It is a pleasure to acknowledge the expert assistance of Dr. Kevin Peterson of this laboratory in the phylogenetic analyses of _Hox_ gene sequences referred to throughout this work and also for helpful critical comments on the manuscript. We are extremely grateful to Drs. Ellen Popodi (Indiana University), Robert Maxson (University of Southern California), Michael Murtha (Yale University), and Giovanni Spinnelli (University of Palermo) for providing us with _Hox_ probes from various sea urchin species. We thank Valeria Mancino for help with pulsed-field electrophoresis. Drs. Chris Amemiya and Pieter de Jong were extremely helpful in providing us with their PAC and BAC vectors, respectively. This work was supported by the Stowers Institute for Medical Research and by a grant from the National Institute of Child Health and Human Development. J.P.R. was supported by a National Institutes of Health Training Grant (HD-07257).

ABBREVIATIONS

PG, Paralog Group

PAC, P1 artificial chromosome

BAC, bacterial artificial chromosome

References