Species-Specific Strategies Underlying Conserved Functions of Metabolic Transcription Factors

Raymond E. Soccio, Geetu Tuteja, Logan J. Everett, Zhaoyu Li, Mitchell A. Lazar, Klaus H. Kaestner

1 Division of Endocrinology, Diabetes, and Metabolism, Department of Medicine (R.E.S., L.J.E., M.A.L.), University of Pennsylvania School of Medicine, Philadelphia, Pennsylvania 19104

2 Department of Genetics (G.T., L.J.E., Z.L., M.A.L., K.H.K.), Institute for Diabetes, Obesity, and Metabolism, University of Pennsylvania School of Medicine, Philadelphia, Pennsylvania 19104

*Address all correspondence and requests for reprints to: Klaus H. Kaestner, University of Pennsylvania Department of Genetics, 752b Clinical Research Building, 415 Curie Boulevard, Philadelphia, Pennsylvania 19104-6145.

Received: 05 November 2010

Accepted: 10 January 2011

Molecular Endocrinology, Volume 25, Issue 4, 1 April 2011, Pages 694–706, https://doi.org/10.1210/me.2010-0454

The winged helix protein FOXA2 and the nuclear receptor peroxisome proliferator-activated receptor-γ (PPARγ) are highly conserved, regionally expressed transcription factors (TFs) that regulate networks of genes controlling complex metabolic functions. Cistrome analysis for Foxa2 in mouse liver and PPARγ in mouse adipocytes has previously produced consensus-binding sites that are nearly identical to those used by the corresponding TFs in human cells. We report here that, despite the conservation of the canonical binding motif, the great majority of binding regions for FOXA2 in human liver and for PPARγ in human adipocytes are not in the orthologous locations corresponding to the mouse genome, and vice versa. Of note, TF binding can be absent in one species despite sequence conservation, including motifs that do support binding in the other species, demonstrating a major limitation of in silico binding site prediction. Whereas only approximately 10% of binding sites are conserved, gene-centric analysis reveals that about 50% of genes with nearby TF occupancy are shared across species for both hepatic FOXA2 and adipocyte PPARγ. Remarkably, for both TFs, many of the shared genes function in tissue-specific metabolic pathways, whereas species-unique genes fail to show enrichment for these pathways. Nonetheless, the species-unique genes, like the shared genes, showed the expected transcriptional regulation by the TFs in loss-of-function experiments. Thus, species-specific strategies underlie the biological functions of metabolic TFs that are highly conserved across mammalian species. Analysis of factor binding in multiple species may be necessary to distinguish apparent species-unique noise and reveal functionally relevant information.

Transcription factors (TFs) regulate gene expression by binding to specific DNA sequences. Identifying genome-wide targets of TFs is important to determine the mechanisms involved in regulating biological processes. Because a particular TF often binds to variations of a consensus sequence, searching for binding sites computationally yields many false positives, that is, predicted sites that are not actually bound in vivo.

Given the similar physiology and function of human and mouse tissues, one would expect that transcriptional regulatory networks between the species would be conserved. Aligning the human genome with rodent genomes demonstrated that not only coding sequences are highly related, but large blocks of noncoding sequences are also conserved (1–3). The strong sequence conservation of these noncoding regions suggested that they functioned in gene regulation, as enhancers or silencers for instance. Sequence conservation has since been incorporated when attempting to identify true TF targets (1, 2, 4–10). This approach, termed “Phylogenetic Footprinting,” decreases the number of false positives when identifying TF-binding sites, but it is also likely to increase the number of false negatives, because it does not take the issue of binding site turnover into account (5, 11).

Binding site turnover occurs because TF-binding sites are short and thus frequently disrupted by random mutation. In a binding site-turnover event, when a point mutation causes a particular binding site to lose its functionality, this loss is compensated by generation of a new binding site nearby that takes over the function of the previous binding site. Because the new binding site exists, gene regulation and function are not altered even though the position of the regulatory module has changed. Binding site turnover can occur through a number of mechanisms in addition to point mutations (12–15). Binding site turnover, where gene function between species remains constant and the regulatory sequences between species are highly divergent, has been well documented in Drosophila (13, 15–21). Studies of binding site turnover in mammalian systems are limited and have previously been investigated for only a small number of regulatory regions (22, 23). However, the availability of next generation sequencing technology has enabled the comparison of TF binding genome wide in different species (24–26).

Here we compare binding-site profiles in mouse and human for two metabolic TFs, FOXA2 and peroxisome proliferator-activated receptor (PPAR)γ, to determine how often binding sites are shared between species and how often the binding site locations have changed while maintaining the same regulatory relationship. FOXA2 is a member of the Foxa subfamily of Forkhead proteins and has a DNA-binding domain that is highly conserved from yeast to humans (27). FOXA2 is critical for initiating liver specification and is also important for bile duct development and regulating the gluconeogenic program in adult mice (28–30). PPARγ is a nuclear receptor, classified as an orphan because its physiological ligand is uncertain, that binds DNA as a heterodimer with the retinoid X receptor. PPARγ is the master regulator of adipogenesis, because it is necessary and sufficient to direct fat cell development from preadipocyte precursor cells, and loss of PPARγ results in loss of adipocytes (31).

To investigate the conservation of binding for FOXA2 and PPARγ, we performed chromatin immunoprecipitation (ChIP)-Seq for FOXA2 in human and mouse liver, and for PPARγ in human and mouse adipocytes. ChIP-Seq technology allowed us to map binding sites genome wide in both species, thus enabling a complete comparison of targets in human and mouse. We determined that for both TFs, approximately 10% of the binding sites identified overlap between mouse and human. When using a nearest gene-mapping approach, more than 50% of the target genes in mouse were also targets in human, and vice versa, indicating likely candidates for binding site turnover. Importantly, when separating genes into those that have shared binding between species vs. species-unique binding, the group with shared binding sites showed significant enrichment of several functional categories specific to liver and adipocyte physiology and metabolism. Thus, the evolutionary conservation of hepatocyte and adipocyte function is reflected in the conservation of their respective transcriptional regulatory networks.

Results

Conservation of TF-binding sites

ChIP-Seq, followed by peak calling using GLobal Identifier of Target Regions (GLITR) (32), was performed to identify genome-wide binding sites for the metabolic TFs FOXA2, in human and mouse liver, and PPARγ, in human and mouse adipocytes (Table 1). To determine whether TF-binding regions in one species were conserved in the other, we converted genomic coordinates between the two species using the University of California Santa Cruz liftOver tool. Approximately 70–80% of regions in one genome converted to the other using default parameters (Table 1). The binding regions that failed to convert did not contain weaker binding sites than the total datasets (Supplemental Fig. 1A published on The Endocrine Society's Journals Online web site at http://mend.endojournals.org) but were more likely located in intergenic regions (Supplemental Fig. 1B).

Table 1. ChIP-Seq and GLITR peak calling

| | Human FoxA2 | Mouse FoxA2 | Human PPARγ | Mouse PPARγ |
|---|---|---|---|---|
| Total tags | 27,636,452 | 30,399,498 | 39,446,253 | 22,833,337 |
| Collapsed tags | 21,852,104 | 18,536,593 | 18,442,260 | 17,222,550 |
| GLITR peaks (FDR 1.5%) | 7,917 | 8,102 | 20,949 | 12,687 |
| Average peak width | 279 bp | 285 bp | 331 bp | 394 bp |
| Convertible | 5,671 | 6,084 | 14,629 | 10,048 |
| % Convertible | 71.6% | 75.1% | 69.8% | 79.2% |
| Nonconvertible | 2,246 | 2,018 | 6,320 | 2,639 |
| Convertible but nonconserved binding | 4,932 | 5,337 | 13,450 | 8,893 |
| Total species unique | 7,178 | 7,355 | 19,770 | 11,532 |
| Conserved | 739 | 747 | 1,179 | 1,155 |
| % Conserved/liftover | 13.0% | 12.3% | 8.1% | 11.5% |
| % Conserved/total | 9.3% | 9.2% | 5.6% | 9.1% |

Summary of ChIP-Seq data for FOXA2 and PPARγ. Total sequence reads, as well as the number of reads after collapsing datasets with GLITR, are shown. Collapsed mouse and human datasets had a similar number of sequence tags for both FOXA2 and PPARγ. A false discovery rate of 1.5% was used for all datasets, and for each species the binding regions were converted to the other species. The percent of regions that were able to convert and the conserved peaks between species are shown for all datasets.

Because the nonconvertible regions only map to one genome, we determined conservation of binding based on convertible regions. We considered a target region conserved if the centers of the human and mouse GLITR regions fell within 500 bp of each other (Supplemental Fig. 2A). The distance of 500 bp was chosen because it reflects the approximate resolution of ChIP-Seq given the average peak widths. Consistent with this notion, distances less than 500 bp gave drastically fewer conserved regions, whereas distances greater than 500 bp yielded only incrementally more conserved regions (Supplemental Fig. 2B). An alternative method of analysis, looking for strict overlap of GLITR regions (Supplemental Fig. 2C), found a similar number of conserved binding regions as the 500 bp distance between centers (Supplemental Fig. 2D).
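The center-distance criterion described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: it assumes the regions from one species have already been converted to the other genome's coordinates, and the function names, the tuple layout `(chrom, start, end)`, and the toy coordinates are all hypothetical.

```python
from bisect import bisect_left
from collections import defaultdict


def center(start, end):
    """Integer midpoint of a region."""
    return (start + end) // 2


def conserved_regions(lifted, native, max_dist=500):
    """Indices of `lifted` regions whose center lies within `max_dist` bp
    of the center of at least one `native` region on the same chromosome.

    Both lists hold (chrom, start, end) tuples in the SAME assembly,
    i.e. `lifted` has already been converted between genomes.
    """
    centers = defaultdict(list)
    for chrom, start, end in native:
        centers[chrom].append(center(start, end))
    for chrom in centers:
        centers[chrom].sort()

    hits = []
    for i, (chrom, start, end) in enumerate(lifted):
        c = center(start, end)
        cs = centers.get(chrom, [])
        # First native center that is >= c - max_dist; it is a hit
        # only if it is also <= c + max_dist.
        j = bisect_left(cs, c - max_dist)
        if j < len(cs) and cs[j] <= c + max_dist:
            hits.append(i)
    return hits


# Toy coordinates (invented for illustration only).
lifted_human = [("chr1", 1000, 1300), ("chr1", 5000, 5300), ("chr2", 100, 400)]
mouse = [("chr1", 1400, 1700), ("chr1", 9000, 9300)]
print(conserved_regions(lifted_human, mouse))
```

Sorting the centers once per chromosome and using a binary search keeps the comparison fast for tens of thousands of regions. Rerunning with other values of `max_dist` corresponds to the threshold sweep described for Supplemental Fig. 2B.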

For FOXA2, the total number of binding regions was similar in human and mouse livers, and approximately 9% of all regions and 12% of convertible regions were conserved between the two species (Fig. 1A and Table 1). For PPARγ, mouse adipocyte-binding regions showed similar conservation, whereas more binding regions were identified in human adipocytes leading to a somewhat lower percentage of conserved human regions (Fig. 1B). The alternative definition of conservation as strict overlap between binding regions gave nearly the same approximately 10% conservation, whereas the overlap between matched control and binding regions was less than 0.5% (Supplemental Fig. 3). Therefore, based on our data for FOXA2 and PPARγ, we estimate that approximately 10% of TF-binding sites are conserved between mice and humans.
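The percentages quoted here follow directly from the counts in Table 1. The short Python check below recomputes them. The counts are copied from Table 1, while the dictionary layout and the helper names `pct` and `summarize` are only illustrative.

```python
# Peak counts copied from Table 1.
table1 = {
    "Human FOXA2": {"peaks": 7917, "convertible": 5671, "conserved": 739},
    "Mouse FOXA2": {"peaks": 8102, "convertible": 6084, "conserved": 747},
    "Human PPARg": {"peaks": 20949, "convertible": 14629, "conserved": 1179},
    "Mouse PPARg": {"peaks": 12687, "convertible": 10048, "conserved": 1155},
}


def pct(part, whole):
    """Percentage rounded to one decimal place, as reported in Table 1."""
    return round(100 * part / whole, 1)


def summarize(row):
    return {
        "pct_convertible": pct(row["convertible"], row["peaks"]),
        "pct_conserved_liftover": pct(row["conserved"], row["convertible"]),
        "pct_conserved_total": pct(row["conserved"], row["peaks"]),
    }


for name, row in table1.items():
    print(name, summarize(row))
```

The two conservation percentages differ only in the denominator: convertible regions for "% Conserved/liftover" and all GLITR peaks for "% Conserved/total".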

Fig. 1.

Conservation of TF-binding regions. The overall conservation of binding regions approximates 10% between mouse and human. All FOXA2 (A) and PPARγ (B) binding regions are represented in Venn diagrams. The darker-colored pie slices of each circle show the regions that fail to convert to the other genome, whereas the intersection of circles shows the conservation of convertible regions based on centers within 500 bp. There are different numbers of conserved regions in each species, reflecting cases in which a single GLITR region in one species overlaps multiple regions in the other species. C, Conserved regions are more likely to be located in gene promoters. Each dataset of binding regions was divided as in the Venn diagrams, and the percentage of regions localized within 5 kb of the transcription start site of the nearest gene is shown. D, The strongest regions are more likely to be conserved. The percent conservation of the 100 strongest convertible binding regions for each dataset is shown. E, Conservation is found even among the weakest binding regions. All convertible binding regions were divided into quintiles based on strength, and the percent conservation is shown. The data are clearly skewed from the null hypothesis of 20% (the P value for the overall skew in each dataset is < 10⁻¹⁰), but even the weakest quintile accounts for more than 10% of all conserved sites. *, P < 0.05; **, P < 0.001; ***, P < 10⁻¹⁰ vs. all sites.

The distribution of FOXA2- and PPARγ-binding regions relative to the nearest gene was similar to that reported previously for PPARγ and many other TFs (33), with only 5–10% in proximal promoters (defined here as within 5 kb upstream of transcription start sites), 40–50% within genes (mostly intronic), and 40–50% in distal intergenic regions (Supplemental Fig. 1B). A potential explanation for the 10% overall conservation of binding regions noted above could be that only proximal promoter sites are conserved in mammalian evolution. Indeed, binding regions that were conserved in the mouse and human genomes showed a 2-fold greater promoter localization compared with all sites: approximately 16% of conserved FOXA2 regions and about 12% of conserved PPARγ regions (Fig. 1C). Nonetheless, the great majority of conserved factor-binding regions are located distal to promoters in intergenic regions and introns (Supplemental Fig. 1B). For binding regions located within genes, overall there was a preference for the first intron (∼40% of intragenic regions were in the first intron), and conserved regions were slightly more likely than nonconserved ones to be in the first intron (Supplemental Fig. 4). Within introns, binding regions were distributed roughly evenly across the intron from 5′ to 3′, regardless of whether the binding region was conserved (Supplemental Fig. 5).

Another potential explanation for the 10% overall conservation could be that only the strongest binding regions were conserved in mammalian evolution. The average binding strength, measured by the GLITR fold change, was indeed significantly higher for conserved regions compared with all regions (Supplemental Fig. 1A). In each of the four datasets, the 100 strongest regions in each species were highly likely to be conserved in the other species (Fig. 1D). For instance, only 8% of all convertible human PPARγ regions were conserved in mouse, whereas 40% of the top 100 regions were conserved across the species. Figure 1E divides each dataset of binding sites into quintiles based on strength. Stronger sites are more likely to be conserved, because the strongest 20% of sites account for 30–35% of all conserved regions. However, conserved binding regions were found even among the weakest regions, because the weakest 20% of regions still accounted for 10–15% of all conserved regions. Therefore, whereas the 10% of regions that are conserved are statistically enriched for promoter location and stronger binding, most conserved regions are clearly weaker and distal to promoters.
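The quintile breakdown can be sketched as follows. This is a simplified illustration under stated assumptions: each region is represented as a hypothetical `(strength, is_conserved)` pair, strength stands in for the GLITR fold change, and the toy data are invented rather than taken from the study.

```python
def conserved_share_by_quintile(regions):
    """Split regions into five equal-sized strength bins (weakest first)
    and return, for each bin, its share of ALL conserved regions.

    Under the null hypothesis that strength is unrelated to conservation,
    every bin would hold about 20% of the conserved regions.
    """
    ranked = sorted(regions, key=lambda r: r[0])
    n = len(ranked)
    total_conserved = sum(1 for _, conserved in ranked if conserved)
    shares = []
    for q in range(5):
        lo, hi = q * n // 5, (q + 1) * n // 5
        in_bin = sum(1 for _, conserved in ranked[lo:hi] if conserved)
        shares.append(in_bin / total_conserved if total_conserved else 0.0)
    return shares


# Ten invented regions; the ones with strength 2, 7, 9 and 10 are "conserved".
toy = [(s, s in {2, 7, 9, 10}) for s in range(1, 11)]
print(conserved_share_by_quintile(toy))
```

Note that the quantity reported is the share of conserved regions that falls in each quintile, not the fraction of each quintile that is conserved, which is why 20% is the null expectation.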

Motifs in TF-binding sites

We determined the top-scoring DNA sequence motif in each complete dataset of binding regions using de novo motif discovery. For each factor, this motif was nearly identical in human and mouse, and both agreed closely with the factor's motif in the TRANSFAC database (Fig. 2A) as well as the previously reported mouse ChIP-Seq consensus motifs (32, 34). This result is not unexpected given the high degree of conservation of DNA-binding domains, and confirms that the limited overall conservation of binding regions is clearly not due to differences in binding motifs between mouse and human factors.

Fig. 2.

Motifs in TF-binding regions. A, The complete datasets of mouse and human binding regions for FOXA2 and PPARγ were searched using de novo motif analysis, and the top scoring motif is shown for each. These motifs were compared with the TRANSFAC database to find the most similar motif. B, Conserved and species-specific binding correlates with the occurrence of motifs in binding regions. The top human de novo motif for FOXA2 (left) and PPARγ (right) was used to interrogate the convertible binding regions as well as control datasets matched for length and guanine and cytosine content. The binding regions were divided into conserved and species-specific regions, and coordinates were mapped on the mouse (mm8) and human (hg18) genomes as indicated. *, P < 0.01; **, P < 10⁻⁸; ***, P < 10⁻⁷⁰ for comparisons as indicated. #, P < 10⁻⁷⁰ vs. mouse-only regions mapped on human genome; ^, P < 10⁻⁷⁰ vs. human-only regions mapped on mouse genome. C, PhastCons scores for conserved and species-specific (nonconserved) binding. PhastCons scores were calculated based on placental mammal sequences and were calculated in human binding regions. Conserved binding regions (yellow) had high conservation scores, whereas nonconserved regions (green) had much lower conservation scores, but were still higher than all regions (red). Nonconvertible (blue) and matched control (purple) regions showed background conservation.

The top-scoring human motifs were used to search for occurrences of the motif within the experimentally determined binding regions. Convertible binding regions were divided into those with conserved binding and those with species-unique binding, and these sites were mapped on either the mouse or human genome. Conserved FOXA2 regions showed similar 60–70% occurrence of the motif regardless of which genome the sites were mapped upon (Fig. 2B, left). However, sites only bound in one genome showed approximately 60% occurrence of motifs in that genome, but only about 30% occurrence of motifs when the region was converted to the other genome. This difference is highly statistically significant (P < 10⁻⁷⁰); therefore, the presence of species-specific binding correlates with species-specific motifs. However, given a background occurrence of motifs in only 10–15% of matched control regions (P < 10⁻⁷⁰ vs. the ∼30% occurrence of motifs in nonbound species-specific regions), there are clearly cases where binding regions were not identified despite the presence of motifs.
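To make the notion of "motif occurrence" concrete, the sketch below computes the fraction of region sequences that contain at least one match to a motif on either strand. It deliberately swaps in a simpler technique than the one in the study: an IUPAC consensus string matched exactly, rather than a de novo position weight matrix. The consensus `TGTTTRC` and the three sequences are placeholders invented for illustration and are not the motif reported in Fig. 2A.

```python
import re

# IUPAC nucleotide codes expanded to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def revcomp(seq):
    """Reverse complement, valid for IUPAC codes as well as plain bases."""
    return seq.translate(COMPLEMENT)[::-1]


def has_motif(seq, consensus):
    """True if `seq` matches `consensus` on the forward or reverse strand."""
    seq = seq.upper()
    fwd = "".join(IUPAC[b] for b in consensus)
    rev = "".join(IUPAC[b] for b in revcomp(consensus))
    return bool(re.search(fwd, seq) or re.search(rev, seq))


def motif_occurrence(seqs, consensus):
    """Fraction of sequences with at least one motif match."""
    return sum(has_motif(s, consensus) for s in seqs) / len(seqs)


# Placeholder consensus and sequences (not from the paper).
toy_regions = ["AATGTTTACGG", "CCGTAAACAGG", "ACGTACGTACGT"]
print(motif_occurrence(toy_regions, "TGTTTRC"))
```

Applying the same function to a set of regions extracted once from the genome where binding was observed and once from the other genome after coordinate conversion would give the paired percentages compared in Fig. 2B.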

The same analysis of PPARγ sites and motifs gave similar results (Fig. 2B, right): conserved binding regions are equally enriched for motifs in both species, whereas species-specific regions are much less enriched for motifs in the nonbinding species, although still more enriched than background. In mouse adipocytes, PPARγ is known to cooperate genome-wide with the TF CCAAT enhancer-binding protein (CEBP)α (34). CEBP motifs are also found in human PPARγ-binding regions (Supplemental Fig. 6A), and a CEBP motif showed the same pattern of decreased occurrence in syntenic regions of the non-PPARγ-binding species (Supplemental Fig. 6B). Furthermore, genome-wide binding of CEBPα in mouse adipocytes has been determined by ChIP-chip (34). We identified the expected CEBPα occupancy in approximately 60% of conserved and about 40% of mouse-unique PPARγ-binding regions, yet the human-unique regions mapped on the mouse genome showed only about 7% CEBPα occupancy (Supplemental Fig. 6C). Therefore, binding of the cooperating factor CEBPα appears to parallel the species-specific and conserved binding of PPARγ.

As a measure of conservation among multiple species, PhastCons scores based on placental mammal genomic sequences were determined around the center of human binding regions (Fig. 2C). As expected, all binding regions gave a peak of conservation, whereas nonconvertible regions and matched control regions gave no such peak. The conserved binding regions gave much higher average PhastCons scores than the nonconserved regions. Notably, however, there was clear conservation even in the regions with experimentally determined factor binding in humans but not in mice. Therefore, even when a binding region falls into a region of conservation among placental mammals, there is not necessarily shared TF binding between species.

Conservation of genes nearest to TF-binding sites

Given the relatively low conservation (∼10%) of TF-binding regions, we next investigated the similarity of genes nearest to the bound regions. Each convertible region was mapped to the nearest gene, and on average 1.5–2 binding regions were found near each gene. For each unique gene, the ortholog was identified in the other species (an ortholog could not be found in only ∼4% of cases; see Fig. 3B), and it was determined whether the ortholog was nearest to at least one ChIP-Seq region in that genome. By this method, for both FOXA2 and PPARγ, approximately 50% of nearest genes were bound in both species (Fig. 3A).
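The gene-centric comparison can be outlined as two steps: assign each region to its nearest gene, then ask how many of those genes have an ortholog that is also a nearest gene in the other species. The sketch below assumes distance is measured from the region center to the transcription start site (the text does not specify the exact distance measure), and all gene names, coordinates, and the ortholog table are hypothetical.

```python
def nearest_gene(chrom, region_center, tss_by_chrom):
    """Name of the gene whose TSS is closest to the region center.

    `tss_by_chrom` maps chromosome -> list of (tss_position, gene_name).
    Returns None if the chromosome has no annotated genes.
    """
    genes = tss_by_chrom.get(chrom, [])
    if not genes:
        return None
    return min(genes, key=lambda g: abs(g[0] - region_center))[1]


def shared_gene_fraction(genes_a, genes_b, orthologs):
    """Fraction of species-A nearest genes (restricted to those with an
    ortholog) whose ortholog is also a nearest gene in species B.

    `orthologs` maps species-A gene name -> species-B gene name.
    """
    with_ortholog = {g for g in genes_a if g in orthologs}
    shared = {g for g in with_ortholog if orthologs[g] in genes_b}
    return len(shared) / len(with_ortholog)


# Invented annotation and ortholog table for illustration only.
human_tss = {"chr1": [(1000, "GENE_A"), (8000, "GENE_B")]}
human_genes = {nearest_gene("chr1", c, human_tss) for c in (3000, 6000)}
human_genes.add("GENE_C")  # a gene with no mouse ortholog in this toy table
orthologs = {"GENE_A": "Gene_a", "GENE_B": "Gene_b"}
mouse_genes = {"Gene_a", "Gene_x"}
print(shared_gene_fraction(human_genes, mouse_genes, orthologs))
```

Because several regions can map to the same gene, the gene lists are sets, which matches the description of filtering to unique genes before comparing species.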

Fig. 3.

Shared genes nearest to TF-binding regions. A, The sharing of nearest genes approximates 50%. The nearest gene was found to each binding region, and this list was filtered to unique genes with orthologs in the other species. The intersection of the Venn diagrams represents orthologous genes with nearest binding regions in both species. B, Most shared genes have only nonconserved nearest regions. For each list of genes nearest to factor-binding regions in one species, about 4% of genes did not have an ortholog in the other species (red). When there was an ortholog, it either had no nearest region (yellow), only nonconserved nearest regions (dark blue), or a conserved nearest region (light blue). C, Shared genes have more nearest binding regions. The genes are divided as in B with the average number of nearest regions shown. *, P < 10⁻⁶ by permutation t test, compared with 10⁶ random permutations of the data.

We recognize three drawbacks in this method of determining shared target genes. First, some conserved sites are assigned to the closest gene in the human genome, yet a different adjacent gene is actually nearest in the mouse genome (an example is shown in Supplemental Fig. 7A). These cases give the paradoxical result that a conserved region does not yield a shared gene, even though the regulatory relationship is likely the same. Overall, the incidence of these events was quite low, such that overall gene sharing in each dataset would only increase by about 1% by including these regions (Supplemental Fig. 7B). Even when the definition of a conserved site was extended from 500 bp to 10 kb, the overall gene sharing increased only approximately 2% (data not shown). Second, some genes may be near sites that fail to convert between genomes, and these genes were left out of the initial analysis. Therefore, we identified the genes nearest to all binding sites in a given genome (rather than just convertible ones), and then converted orthologous gene names between species. This analysis gave a very similar 40–60% overlap of target genes between the two species (Supplemental Fig. 8). Third, we recognize that the gene controlled by a TF is not always nearest to the binding site, such that the actual target may be a different nearby gene. However, a nearby gene analysis for strong PPARγ sites yielded 35–40% shared genes (Supplemental Fig. 9, A and B). Therefore, three variant analyses failed to yield substantially different results from the original, leading to the conclusion that about half of the genes associated with factor-binding regions are shared by both species and half are species unique.

For the shared orthologous genes with nearest factor-binding regions in both species, we next asked whether any of the binding regions associated with the genes were conserved. Among all of the shared genes, only about 30% of these had at least one conserved nearest region (Fig. 3B). In the other approximately 70% of shared genes, none of the nearest regions are conserved; thus there are different species-specific binding regions near the orthologous genes. Even when the analysis was extended to all genes within 50 kb for PPARγ, the majority of shared genes had only nonconserved sites (Supplemental Fig. 9C). Examples of genes with conserved and nonconserved PPARγ-binding regions are shown in Fig. 4B.

Fig. 4.

Transcriptional regulation of shared and species-unique target genes by FOXA2 and PPARγ. A, Microarray experiments were performed comparing liver gene expression in Foxa1/a2 double-mutant mice to control mice (FOXA2) and 3T3-L1 mouse adipocyte gene expression in PPARγ siRNA knockdown cells vs. control siRNA (PPARγ). All nonredundant genes on each microarray were divided into the four classes shown based on the presence or absence of nearest factor-binding regions in the mouse and human genomes. For each class, the percentage of genes with statistically significant more than 1.5-fold microarray down-regulation upon factor knockout/knockdown was determined. *, P < 10⁻⁹; **, P < 10⁻⁴⁰; NS, P > 0.1 vs. negative control (black bar, genes with binding regions in neither species). B, UCSC browser tracks on the mouse genome (mm8) are shown to illustrate several classes of genes, with tracks shown for the mouse and convertible human PPARγ-binding regions. The fold down-regulation upon PPARγ siRNA knockdown from the microarray experiment in panel A is shown for each gene.

It has been observed that important target genes often have multiple TF-binding sites associated with them (33). This is especially well studied in Drosophila, in which seemingly redundant secondary or “shadow” enhancers function to assure phenotypic robustness in the face of environmental stress (35). In our datasets, the shared genes had more binding regions for FOXA2 or PPARγ than the species-unique genes, and the shared genes with conserved regions showed the most peaks (Fig. 3C). For instance, a human-unique PPARγ nearest gene had, on average, 1.9 nearby binding sites, while a shared gene with only species-unique regions had 2.6 nearby peaks and a shared gene with conserved regions had 4.5 nearby PPARγ-bound sites. Overall, this gene-based analysis demonstrated that 1) the approximately 50% interspecies sharing of genes nearest to binding regions is much higher than the about 10% conservation of binding regions; 2) only a minority of shared genes have conserved binding regions; and 3) the shared genes have more target sites nearby.

Species-unique genes are functional targets of the TFs

Because approximately 50% of genes nearest to binding regions were species unique, we next investigated whether these genes show the expected transcriptional regulation by the factors. Both FOXA2 and PPARγ function primarily as transcriptional activators, so loss of function for these TFs should decrease expression of target genes. Gene expression microarray data were analyzed from two experiments: 1) livers in Foxa1/a2 double-null mutant mice vs. wild-type and 2) 3T3-L1 adipocytes with Pparg small interfering RNA (siRNA)-mediated gene suppression vs. siRNA control (previously published in Ref. 43). The negative control set, genes on the array not associated with factor binding in either species, had only a low percentage of genes that showed significant 1.5-fold down-regulation on the microarrays: approximately 6% in Foxa1/a2 knockout and about 7.5% for Pparg knockdown (Fig. 4A, black bars). The mouse genes on the array associated with factor binding were designated as in Fig. 3A as shared (i.e. factor binding in both species), mouse unique, and human unique. For both FOXA2 and PPARγ, the shared genes showed the expected increase in percent regulated genes, in that the shared target genes were about 2- to 3-fold more likely to be affected by the loss of the corresponding TF than the negative control group. Importantly, mouse-specific genes showed significant transcriptional regulation similar to the shared genes, whereas the human-specific genes failed to show any more regulation than the negative control. Examples of PPARγ target genes in each class and their regulation upon knockdown are shown in Fig. 4B. Because mouse-specific but not human-specific targets show expected down-regulation upon gene ablation or knockdown in mouse cells, this suggests that even the species-unique binding regions are likely functional in gene regulation.

Conservation of metabolic pathways regulated by TFs

Because both the shared and species-unique target genes show transcriptional regulation by the factors, we next investigated the biological relevance of these genes based on their functional annotation. It has previously been reported that genes near FOXA2 and PPARγ sites are annotated to function in metabolic processes important in hepatocytes and adipocytes, respectively (34, 36). In the current pathway analysis, the shared genes showed the expected significant enrichment of metabolic (italicized in Fig. 5) and other relevant pathways. Strikingly, the species-unique genes showed no metabolic pathway enrichment; most of the pathways they did return exhibited low statistical significance and/or uncertain biological relevance given the TFs in question.

Fig. 5.

Conservation of metabolic pathways regulated by TFs. A, DAVID/PANTHER biological pathway analysis of FOXA2 nearest genes that are human unique (red), mouse unique (blue), or shared human (purple). B, DAVID/PANTHER biological pathway analysis of PPARγ nearest genes that are human unique (green), mouse unique (yellow), or shared human (light green). All gene lists were generated by the analysis in Fig. 3A, and all pathways with P < 0.01 are shown.

Attempts to extend this pathway analysis from nearest genes to all nearby genes tended to produce large gene lists that overwhelm standard computational tools, which work best with between 100 and 2000 genes (37). One alternative was to restrict the binding sites to the strongest ones. For example, the genes near strong PPARγ sites in both species showed a remarkable enrichment for fat metabolic processes (Supplemental Fig. 9D). Another way to assess all binding regions is the recently developed tool GREAT (Genomic Regions Enrichment of Annotations Tool) (38). Instead of gene lists, this algorithm uses the genomic coordinates of ChIP-Seq peaks to determine enriched functions and pathways. GREAT analysis of the shared binding regions showed significant enrichment of the expected metabolic pathways, whereas the species-unique sites again failed to show significant enrichment (Supplemental Fig. 10 and Supplemental Table 1). Functional analyses overall showed that the enrichment for metabolic processes observed in the entire dataset is almost entirely driven by a core set of genes, which could be identified by nearby TF binding in both mouse and human cells.

Discussion

We performed ChIP-Seq analysis to identify the genome-wide binding regions for two metabolic TFs: FOXA2 in human and mouse liver and PPARγ in human and mouse adipocytes. The mouse cistromes for both factors have already been reported (34, 36, 39, 40), but comparisons with the human cistromes reported here have yielded additional insights. Our analysis first focused on binding regions, then on genes nearest to the bound regions, then on transcriptional regulation of these genes, and finally on functional enrichments and biological pathways of species-specific and conserved binding events.

The binding regions themselves showed a low degree of evolutionary conservation. For any given binding region for FOXA2 or PPARγ in one species, there was only a 10% occurrence of binding at the orthologous genomic location in the other species. This agrees well with the recently reported limited conservation of CEBPα and hepatocyte nuclear factor-4α binding regions in placental mammals by ChIP-Seq in human, murine, and canine livers (26). In addition, during preparation of this manuscript, a report reaching similar conclusions comparing human and mouse adipocyte PPARγ cistromes was published (25).

We show further that strongly bound regions and promoter-proximal binding sites were somewhat more likely to be conserved. The consensus motifs derived from binding regions for either FOXA2 or PPARγ in each species were indistinguishable, confirming that the factors indeed bind the same DNA sequence in both species, as suggested by the amino acid similarity of the DNA-binding domains of both TFs between mouse and human. By searching for the motifs within species-specific binding regions vs. the orthologous regions without binding in the other species, we showed that lack of binding correlates with a large reduction in the occurrence of the motif. This confirms that there was indeed turnover of binding sites in mammalian evolution. However, even in the orthologous regions that lack binding, motif occurrence remained much higher than background, so other reasons must account for lack of binding at these loci. Schmidt et al. (26) similarly showed details of motif turnover events due to mismatches and insertions/deletions but also cases with apparent lack of binding despite an intact motif. Furthermore, even species-specific sites fell in regions of conservation based on PhastCons scores, indicating that such conservation scores alone cannot be used to predict factor binding in other species.

Next, we focused on genes associated with the binding events. Any method of assigning binding sites to genes is fraught with potential drawbacks. Our primary analysis assigned each binding region to the nearest gene, and several refinements and variant analyses failed to substantially change our conclusion. For any given gene associated with one or more binding regions, about 50% of the orthologous genes in the other species were also associated with binding regions. Given only about 10% conservation of binding regions, it is apparent that during evolution there was greater selection at the level of the target genes, whereas the actual binding sites often turned over. This turnover is possible because multiple sites are typically associated with each target gene; thus, mutation of one of these sites and reestablishment elsewhere is tolerated in evolution. Even when a gene is shared, and thus associated with binding regions in both species, in the majority of cases the associated binding regions are all species unique, with none conserved. This is certainly an example of François Jacob's adage that “evolution is a tinkerer” (41).

It remains possible that some of the interspecies differences we observed stem from the individual models used. For PPARγ, the two cell culture models for adipocyte differentiation differ in several ways. Mouse 3T3-L1 cells are immortalized and thus carry genetic changes, whereas human SGBS cells are derived from an individual with a rare genetic disease (Simpson-Golabi-Behmel Syndrome) (42). Furthermore, the mouse 3T3-L1 cells differentiate over about 1 wk with only an initial 2–3 d of a minimal adipogenic cocktail, whereas the human SGBS cells require continuous exposure for about 2 wk to a stronger cocktail that includes a pharmacological PPARγ ligand. For FOXA2, model differences are less of an issue because human and mouse liver tissue samples were used with minimal processing before cross-linking, but theoretical differences in metabolic status (obesity, fasting and feeding, circadian, etc.) could affect the results. It is also potentially confounding that our ChIP-Seq experiments were performed at different times. However, by pooling replicate ChIP experiments and sequencing runs to increase the total number of sequence reads, we were able to accurately capture true TF-binding regions. We would not expect that this methodology contributed to lower interspecies overlap, because the binding site concordance between duplicate experiments in the same species was quite robust (data not shown). Furthermore, although the PPARγ cistrome in mouse 3T3-L1 adipocytes has been determined multiple times in different laboratories, using diverse methods and cell batches, there is substantial agreement regarding binding regions: typically 50–90% depending on the cutoffs employed (Ref. 33 and data not shown). Overall, despite potentially different interspecies models and experimental timing, we saw similar results for both factors, and the high percentage of shared target genes suggests great similarity.

We next asked whether the species-unique genes were transcriptionally regulated by the factors. Given that species-unique genes had fewer associated binding regions than shared genes, it was possible that the binding regions that only targeted a gene in one species were nonfunctional. Interestingly, gene expression microarray experiments in the mouse systems involving Foxa1/Foxa2 gene ablation or PPARγ knockdown revealed that the mouse-unique genes were down-regulated much like the shared target genes, whereas the human-unique genes were not. Therefore, the species-unique genes are indeed transcriptional targets of the factors.

Our final analysis moved from target genes to their biological functions. Taking the entire list of genes with associated binding regions, there was enrichment for expected biological processes, given that FOXA2 and PPARγ are known to regulate liver and adipocyte metabolism, respectively. However, dividing the gene list between shared genes with associated factor binding in both species vs. species-unique genes yielded a very striking result. Only the shared genes showed significant enrichment for the expected metabolic pathways. In stark contrast, the species-unique genes showed much lower statistical enrichment for functional pathways, and these were mostly of uncertain biological relevance to liver and adipocyte metabolism.

Based on these results, the shared genes with associated factor-binding regions in both species can be considered the core genes regulated by each factor. These core genes are enriched for the expected metabolic pathways and tend to have multiple associated factor-binding sites, some conserved but many species unique. Some species-unique binding regions and associated species-unique target genes could reflect evolutionary selection and true functional differences between mice and humans, because species-unique genes are transcriptionally activated by the factors. However, it appears that most species-unique genes represent noise and genetic drift because they do not enrich for expected biological pathways. Therefore, a ChIP-Seq experiment in a single species will fail to distinguish the core genes from others of uncertain relevance, whereas interspecies comparisons of binding regions allow identification of the most biologically relevant targets.

Materials and Methods

Chromatin immunoprecipitation

Foxa2 ChIP experiments were performed as described previously (40) on two biological replicates for human liver and two biological replicates for mouse liver. Chromatin (10 μg) and rabbit anti-Foxa2 serum (2 μg) (provided by J.A. Whitsett) were used for each Foxa2 ChIP experiment. Mouse 3T3-L1 adipocytes were differentiated using a standard protocol (43) and harvested for ChIP on d 10. Human SGBS cells (42) were differentiated as previously described (44) and harvested on d 20. As assessed by lipid droplet accumulation under phase-contrast microscopy, more than 95% of 3T3-L1 and more than 60% of SGBS cells had adipocyte morphology at harvest. For PPARγ ChIP, 100 μg of chromatin and 10 μg of rabbit anti-PPARγ (sc-7196, Santa Cruz Biotechnology, Inc., Santa Cruz, CA) were used as previously described (34). Three to five ChIP samples from independent cell differentiations were pooled for each PPARγ ChIP-Seq library.

ChIP-Seq library preparation

Libraries for each ChIP experiment were prepared as per Illumina's instructions (http://www.illumina.com). Briefly, DNA fragments were blunted, phosphorylated, and ligated to Illumina library adapters. For input DNA preparation, 10 ng of starting material were used. Size selection was performed using gel electrophoresis by excising DNA fragments at 200 ± 25 bp. After gel purification, PCR amplification was performed [30 sec at 98 C; (10 sec at 98 C, 30 sec at 65 C, 30 sec at 72 C) × 18 cycles; 5 min at 72 C]. Amplified material was evaluated on an Agilent 2100 bioanalyzer (Agilent Technologies, Palo Alto, CA) using the DNA 1000 Kit to ensure proper size selection and was subsequently diluted to a concentration of 10 nM. Human FOXA2 libraries were each sequenced twice on an Illumina GAI and once on an Illumina GAII. Mouse Foxa2 libraries were sequenced on the GAII, after which data were combined with previously published Foxa2 ChIP-Seq data (32) so that each data set had roughly the same number of sequence tags for further analysis. Two different human SGBS PPARγ libraries were sequenced on the GAII and the data were pooled. One mouse 3T3-L1 PPARγ library was sequenced on the GAI and previously published (45), whereas a second library was sequenced on the GAII and pooled, again to give similar sequencing depth.

Raw ChIP-Seq data processing

Sequencing output was analyzed using the Illumina Genome Analyzer Pipeline. For mouse data, sequence tags that aligned uniquely to the mouse genome build MM8 with zero, one, or two mismatches, according to the ELAND (Efficient Local Alignment of Nucleotide Data) alignment algorithm, were used for further analysis. For human data, sequence tags that aligned uniquely to the human genome build HG18 with zero, one, or two mismatches, according to the ELAND alignment algorithm, were used for further analysis.

Peak calling

All ChIP-Seq data sets were analyzed using GLITR (32) with default parameters and a 1.5% false discovery rate cutoff. Pseudo-ChIP data sets that had the same number of tags as each filtered data set were sampled from the input sequence tag data available for MM8 and HG18 at http://web.me.com/kaestnerlab1/GLITR/, and the remaining input sequence tags in these sets were used as background for the GLITR sampling procedure. Data were further filtered to remove peaks that resulted from amplification bias. Each GLITR region had an associated width, fold change, and stack height. For the purposes of defining the strength of factor binding to a region, the fold change parameter was used. For each of the four datasets, a matched control was generated with the same number of regions matched for width and guanine and cytosine content (Supplemental Fig. 3).

Comparing binding sites between species

To determine the number of overlapping peaks between species, the UCSC liftOver tool (http://genome.ucsc.edu/cgi-bin/hgLiftOver) was used with default parameters (minimum ratio of bases that must remap = 0.1) to convert MM8 peaks to HG18, and vice versa. This tool relies on alignment by a pairing and netting process to find the best long-range orthology (46). The distances between peaks in the mouse data and human peaks that had been converted to mouse coordinates were then calculated. Similarly, the distances between peaks in the human data and mouse peaks that had been converted to human coordinates were calculated. All peak distances were calculated between peak centers, and peaks less than 500 bp apart were considered conserved between species (Supplemental Fig. 2A). An alternative definition of conservation required strict overlap between convertible GLITR regions (Supplemental Fig. 2C).
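The distance criterion above can be sketched as follows. This is a minimal illustration, assuming the liftOver conversion has already been run so that both peak sets are in the same genome build; the function name and the tuple layout are hypothetical and are not part of the original pipeline.

```python
from bisect import bisect_left
from collections import defaultdict


def conserved_peaks(native_centers, lifted_centers, max_dist=500):
    """Return native peaks whose center lies < max_dist bp from a lifted-over peak center.

    Both inputs are lists of (chrom, center) tuples in the same genome build.
    """
    by_chrom = defaultdict(list)
    for chrom, center in lifted_centers:
        by_chrom[chrom].append(center)
    for centers in by_chrom.values():
        centers.sort()

    conserved = []
    for chrom, center in native_centers:
        centers = by_chrom.get(chrom, [])
        i = bisect_left(centers, center)
        # The nearest lifted-over center is either just left or just right of i.
        neighbours = centers[max(i - 1, 0):i + 1]
        if any(abs(center - c) < max_dist for c in neighbours):
            conserved.append((chrom, center))
    return conserved


# Hypothetical toy coordinates, for illustration only.
mouse_peaks = [("chr1", 1000), ("chr1", 5000), ("chr2", 300)]
human_lifted = [("chr1", 1400), ("chr1", 5500), ("chr3", 300)]
hits = conserved_peaks(mouse_peaks, human_lifted)
```

Sorting the lifted-over centers once per chromosome and using a binary search keeps the comparison fast for large peak sets; a pair exactly 500 bp apart is not counted, following the "less than 500 bp" wording.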

Motif finding and conservation

GLITR regions were input into the Cistrome Analysis Pipeline for analysis (http://cistrome.dfci.harvard.edu/ap/). For each factor in each species (four datasets), the SeqPos tool was used to identify the highest scoring de novo motif (lowest Z-score, less than −95 for each), relying on the MDscan algorithm (47). The position weight matrices for the human de novo motifs for FOXA2 and PPARγ were used to search for sequences that matched the motif (Cistrome's “Screen all motifs” tool) in BED files of binding regions. Binding regions often had multiple instances of the motif but were scored by the simple presence or absence of at least one motif. To generate sequence logos from each de novo motif and compare them to known TRANSFAC motifs, the MEME Suite of motif-based sequence analysis tools was used (http://meme.nbcr.net/), specifically TomTom (48). Average PhastCons scores (49) were calculated around the center of human binding regions on hg18 using Cistrome's Conservation/Aggregate Datapoints tool.

Gene assignment and orthologous gene sharing

Each peak was mapped to the nearest gene, based on University of California Santa Cruz's refFlat.txt files for MM8 and HG18, and location relative to this gene could be determined as either promoter (defined as within 5 kb upstream of transcription start site), within the gene (defined as between the transcription start site and transcription termination site), or intergenic (any other location). Mouse/human orthologous gene mappings were downloaded from the Mouse Genome Informatics database (http://www.informatics.jax.org/). ChIP-Seq regions were considered overlapping from a gene-centric view if a peak that mapped to a human gene had an orthologous mouse gene with a peak mapped to it, and vice versa. An alternative analysis (Supplemental Fig. 5) used the MammalHom website (http://depts.washington.edu/l2l/mammalhom.html) to translate gene symbols between human and mouse. Another alternative analysis (Supplemental Fig. 6) used Cisgenome software (http://www.biostat.jhsph.edu/~hji/cisgenome/) to annotate all nearby genes within 50 kb upstream or downstream of a binding region.
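The nearest-gene assignment and the three location classes can be sketched as below. This is an illustrative sketch only: the text does not state how distance to a gene was measured, so the distance rule used here (zero inside the gene body, otherwise distance to the closer gene end) is an assumption, as are the function names and the toy gene models.

```python
def nearest_gene(peak_center, genes):
    """Pick the nearest gene on the peak's chromosome.

    genes: list of (symbol, tx_start, tx_end, strand) with tx_start < tx_end.
    Assumed distance rule: 0 inside the gene body, else distance to the closer end.
    """
    def dist(gene):
        _, start, end, _ = gene
        if start <= peak_center <= end:
            return 0
        return min(abs(peak_center - start), abs(peak_center - end))
    return min(genes, key=dist)


def classify_location(peak_center, tx_start, tx_end, strand, promoter_bp=5000):
    """Classify a peak as 'within gene', 'promoter' (<= 5 kb upstream of TSS) or 'intergenic'."""
    if tx_start <= peak_center <= tx_end:
        return "within gene"
    if strand == "+":
        upstream = tx_start - peak_center  # positive when the peak is 5' of the TSS
    else:
        upstream = peak_center - tx_end    # on the minus strand the TSS is at tx_end
    if 0 < upstream <= promoter_bp:
        return "promoter"
    return "intergenic"


# Hypothetical gene models, for illustration only.
genes = [("GeneA", 10000, 20000, "+"), ("GeneB", 40000, 50000, "-")]
symbol, start, end, strand = nearest_gene(8000, genes)
location = classify_location(8000, start, end, strand)
```

Here the peak at position 8000 is assigned to the hypothetical GeneA and falls 2 kb upstream of its transcription start site, so it is classed as a promoter peak.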

Functional annotation

Lists of gene symbols were uploaded to the DAVID (Database for Annotation, Visualization and Integrated Discovery) website (http://david.abcc.ncifcrf.gov/) (37), and functional annotation charts were generated using Gene Ontology: All PANTHER (Protein ANalysis THrough Evolutionary Relationships) Biological Processes. All processes with P < 0.01 are shown. All ChIP-Seq peaks were used in GREAT (38) analysis because the tool takes genomic regions as input rather than gene symbols. Because GREAT is currently available for mm9 and hg18, for our mouse data, genomic regions were converted to mm9 using liftOver. Default settings were used, and we reported the top 10 results for the Gene Ontology Biological Process ontology, and top five results for the Mouse Phenotype and Mouse Genome Informatics Expression: Detected ontologies.

Microarray studies

Liver RNA was isolated from four control and four Foxa1/2 mutant mice at three months of age. RNA was reverse transcribed and labeled as described previously (51). Fluorescence-labeled cDNAs were hybridized to the Whole Mouse Genome Oligo Microarray (Agilent). This microarray chip represents more than 41,000 mouse gene transcripts. Genes displaying a fold change over 1.5-fold between mutants and controls and a false discovery rate less than 10%, calculated using significance analysis of microarray analysis (52), were selected. The PPARγ siRNA experiment was previously reported (43), and the raw data were downloaded from Gene Expression Omnibus accession no. GSE14004 and reanalyzed using significance analysis of microarray with the cutoffs above.
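The per-class percentage of regulated genes reported in Fig. 4A can be illustrated with a short sketch. It assumes that mean expression values and a false discovery rate are already available for each gene (the significance analysis of microarray procedure itself is not reimplemented here), and it counts only down-regulation, as in Fig. 4A. The function name and dictionary keys are hypothetical.

```python
def percent_down_regulated(genes, min_fold=1.5, max_fdr=0.10):
    """Percentage of genes with > min_fold down-regulation at FDR < max_fdr.

    genes: list of dicts with positive 'control' and 'knockdown' mean expression
    values and a precomputed 'fdr' (the FDR calculation is assumed, not shown).
    """
    if not genes:
        return 0.0
    hits = sum(
        1 for g in genes
        if g["control"] / g["knockdown"] > min_fold and g["fdr"] < max_fdr
    )
    return 100.0 * hits / len(genes)


# Hypothetical toy class of four genes, for illustration only.
example_class = [
    {"control": 300.0, "knockdown": 100.0, "fdr": 0.01},  # down-regulated, significant
    {"control": 300.0, "knockdown": 100.0, "fdr": 0.50},  # down-regulated, not significant
    {"control": 100.0, "knockdown": 90.0, "fdr": 0.01},   # fold change too small
    {"control": 100.0, "knockdown": 200.0, "fdr": 0.01},  # up-regulated
]
pct = percent_down_regulated(example_class)
```

Applying the same function to each gene class (shared, mouse unique, human unique, and genes with binding in neither species) would give the bar heights compared in the figure.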

Statistics

All categorical enrichment P values (Figs. 1, C–E, 2B, 4A, and Supplemental Fig. 1B) were determined using a 2 × 2 one-tailed Fisher's Exact Test, using R statistical software (http://www.r-project.org/). P values for difference in overall distribution (Supplemental Fig. 1A) were computed using a one-tailed two-sample Wilcoxon test, also using R software. P values for difference in mean (Fig. 3C) were computed using a one-tailed permutation t test (50) with 1,000,000 permutations.
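For readers who wish to see the logic of two of these tests, the sketch below implements a one-tailed Fisher's exact test for a 2 × 2 table and a one-tailed permutation test in plain Python. It is not the R code used in the study. The permutation function compares a simple difference in means rather than a t statistic and uses far fewer permutations than the 1,000,000 reported, both of which are simplifying assumptions; all function names are hypothetical.

```python
import random
from math import comb


def fisher_one_tailed(a, b, c, d):
    """P(X >= a) for the 2 x 2 table [[a, b], [c, d]] under the hypergeometric null."""
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1))
    return tail / comb(n, col1)


def permutation_p(x, y, n_perm=2000, seed=0):
    """One-tailed permutation P value for mean(x) > mean(y).

    Simplification: uses the difference in means, not a t statistic.
    """
    rng = random.Random(seed)
    observed = sum(x) / len(x) - sum(y) / len(y)
    pooled = list(x) + list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        px, py = pooled[:len(x)], pooled[len(x):]
        if sum(px) / len(px) - sum(py) / len(py) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


p_fisher = fisher_one_tailed(3, 1, 1, 3)
p_perm = permutation_p([5, 6, 7, 8], [1, 2, 3, 4])
```

For the small table used here the exact tail probability is 17/70, which can be checked by hand from the hypergeometric formula.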

Data availability

The raw sequencing reads and GLITR regions for all four ChIP-Seq experiments described here are publicly available (Gene Expression Omnibus accession no. GSE25836). The Foxa1/a2 knockout microarray data have been deposited in ArrayExpress under accession number E-MEXP-2106.

Acknowledgments

We thank Martin Wabitsch (University of Ulm, Germany) for providing human SGBS cells. We thank Dr. Joshua Friedman (Children's Hospital of Philadelphia, Pennsylvania) for providing human liver tissue samples. We thank Dr. Jonathan Schug, Alan Fox, Olga Smirnova, and the Functional Genomics Core at the University of Pennsylvania Diabetes and Endocrinology Research Center (DERC) (P30-DK19525) for ChIP-Seq sample sequencing and running the Illumina pipeline. We also thank members of the Kaestner and Lazar laboratories for many helpful discussions.

This work was supported by the Nuclear Receptor Signaling Atlas U19DK/HL/ES 62434 (to M.A.L.) and P01 DK049210 (to K.H.K. and M.A.L.). R.E.S. was supported by training grant T32 DK007314.

Disclosure Summary: The authors have nothing to disclose.

References

Jareborg

N

,

Birney

E

,

Durbin

R

1999

Comparative analysis of noncoding regions of 77 orthologous mouse and human gene pairs.

Genome Res

9

:

815

824

Wasserman

WW

,

Palumbo

M

,

Thompson

W

,

Fickett

JW

,

Lawrence

CE

2000

Human-mouse genome comparisons to locate regulatory sites.

Nat Genet

26

:

225

228

Bejerano

G

,

Pheasant

M

,

Makunin

I

,

Stephen

S

,

Kent

WJ

,

Mattick

JS

,

Haussler

D

2004

Ultraconserved elements in the human genome.

Science

304

:

1321

1325

Blanchette

M

,

Tompa

M

2002

Discovery of regulatory elements by a computational method for phylogenetic footprinting.

Genome Res

12

:

739

748

Huang

W

,

Nevins

JR

,

Ohler

U

2007

Phylogenetic simulation of promoter evolution: estimation and modeling of binding site turnover events and assessment of their impact on alignment tools.

Genome Biol

8

:

R225

Janky

R

,

van Helden

J

2008

Evaluation of phylogenetic footprint discovery for predicting bacterial cis-regulatory elements and revealing their evolution.

BMC Bioinformatics

9

:

37

Vardhanabhuti

S

,

Wang

J

,

Hannenhalli

S

2007

Position and distance specificity are important determinants of cis-regulatory motifs in addition to evolutionary conservation.

Nucleic Acids Res

35

:

3203

3213

Wasserman

WW

,

Fickett

JW

1998

Identification of regulatory regions which confer muscle-specific gene expression.

J Mol Biol

278

:

167

181

Xie

X

,

Lu

J

,

Kulbokas

EJ

,

Golub

TR

,

Mootha

V

,

Lindblad-Toh

K

,

Lander

ES

,

Kellis

M

2005

Systematic discovery of regulatory motifs in human promoters and 3′ UTRs by comparison of several mammals.

Nature

434

:

338

345

Zhang

Z

,

Gerstein

M

2003

Of mice and men: phylogenetic footprinting aids the discovery of regulatory elements.

J Biol

2

:

11

Smith

NG

,

Brandström

M

,

Ellegren

H

2004

Evidence for turnover of functional noncoding DNA in mammalian genome evolution.

Genomics

84

:

806

813

Jiménez-Delgado

S

,

Pascual-Anaya

J

,

Garcia-Fernàndez

J

2009

Implications of duplicated cis-regulatory elements in the evolution of metazoans: the DDI model or how simplicity begets novelty.

Brief Funct Genomic Proteomic

8

:

266

275

Ludwig

MZ

2002

Functional evolution of noncoding DNA.

Curr Opin Genet Dev

12

:

634

639

Stone

JR

,

Wray

GA

2001

Rapid evolution of cis-regulatory sequences via local point mutations.

Mol Biol Evol

18

:

1764

1770

Wray

GA

,

Hahn

MW

,

Abouheif

E

,

Balhoff

JP

,

Pizer

M

,

Rockman

MV

,

Romano

LA

2003

The evolution of transcriptional regulation in eukaryotes.

Mol Biol Evol

20

:

1377

1419

Costas

J

,

Casares

F

,

Vieira

J

2003

Turnover of binding sites for transcription factors involved in early Drosophila development.

Gene

310

:

215

220

Ludwig

MZ

,

Bergman

C

,

Patel

NH

,

Kreitman

M

2000

Evidence for stabilizing selection in a eukaryotic enhancer element.

Nature

403

:

564

567

Ludwig

MZ

,

Palsson

A

,

Alekseeva

E

,

Bergman

CM

,

Nathan

J

,

Kreitman

M

2005

Functional evolution of a cis-regulatory module.

PLoS Biol

3

:

e93

Ludwig

MZ

,

Patel

NH

,

Kreitman

M

1998

Functional analysis of eve stripe 2 enhancer evolution in Drosophila: rules governing conservation and change.

Development

125

:

949

958

Moses

AM

,

Pollard

DA

,

Nix

DA

,

Iyer

VN

,

Li

XY

,

Biggin

MD

,

Eisen

MB

2006

Large-scale turnover of functional transcription factor binding sites in Drosophila.

PLoS Comput Biol

2

:

e130

Tautz

D

2000

Evolution of transcriptional regulation.

Curr Opin Genet Dev

10

:

575

579

Dermitzakis

ET

,

Clark

AG

2002

Evolution of transcription factor binding sites in mammalian gene regulatory regions: conservation and turnover.

Mol Biol Evol

19

:

1114

1121

Odom

DT

,

Dowell

RD

,

Jacobsen

ES

,

Gordon

W

,

Danford

TW

,

MacIsaac

KD

,

Rolfe

PA

,

Conboy

CM

,

Gifford

DK

,

Fraenkel

E

2007

Tissue-specific transcriptional regulation has diverged significantly between human and mouse.

Nat Genet

39

:

730

732

Kunarso

G

,

Chia

NY

,

Jeyakani

J

,

Hwang

C

,

Lu

X

,

Chan

YS

,

Ng

HH

,

Bourque

G

2010

Transposable elements have rewired the core regulatory network of human embryonic stem cells.

Nat Genet

42

:

631

634

Mikkelsen

TS

,

Xu

Z

,

Zhang

X

,

Wang

L

,

Gimble

JM

,

Lander

ES

,

Rosen

ED

2010

Comparative epigenomic analysis of murine and human adipogenesis.

Cell

143

:

156

169

Schmidt

D

,

Wilson

MD

,

Ballester

B

,

Schwalie

PC

,

Brown

GD

,

Marshall

A

,

Kutter

C

,

Watt

S

,

Martinez-Jimenez

CP

,

Mackay

S

,

Talianidis

I

,

Flicek

P

,

Odom

DT

2010

Five-vertebrate ChIP-seq reveals the evolutionary dynamics of transcription factor binding.

Science

328

:

1036

1040

Friedman

JR

,

Kaestner

KH

2006

The Foxa family of transcription factors in development and metabolism.

Cell Mol Life Sci

63

:

2317

2328

Lee

CS

,

Friedman

JR

,

Fulmer

JT

,

Kaestner

KH

2005

The initiation of liver development is dependent on Foxa transcription factors.

Nature

435

:

944

947

Zhang

L

,

Rubins

NE

,

Ahima

RS

,

Greenbaum

LE

,

Kaestner

KH

2005

Foxa2 integrates the transcriptional response of the hepatocyte to fasting.

Cell Metab

2

:

141

148

Li

Z

,

White

P

,

Tuteja

G

,

Rubins

N

,

Sackett

S

,

Kaestner

KH

2009

Foxa1 and Foxa2 regulate bile duct development in mice.

J Clin Invest

119

:

1537

1545

Lefterova

MI

,

Lazar

MA

2009

New developments in adipogenesis.

Trends Endocrinol Metab

20

:

107

114

Tuteja

G

,

White

P

,

Schug

J

,

Kaestner

KH

2009

Extracting transcription factor targets from ChIP-Seq data.

Nucleic Acids Res

37

:

e113

Siersbaek

R

,

Nielsen

R

,

Mandrup

S

2010

PPARγ in adipocyte differentiation and metabolism–novel insights from genome-wide studies.

FEBS Lett

584

:

3242

3249

Lefterova

MI

,

Zhang

Y

,

Steger

DJ

,

Schupp

M

,

Schug

J

,

Cristancho

A

,

Feng

D

,

Zhuo

D

,

Stoeckert

CJ

,

Liu

XS

,

Lazar

MA

2008

PPARγ and C/EBP factors orchestrate adipocyte biology via adjacent binding on a genome-wide scale.

Genes Dev

22

:

2941

2952

Frankel

N

,

Davis

GK

,

Vargas

D

,

Wang

S

,

Payre

F

,

Stern

DL

2010

Phenotypic robustness conferred by apparently redundant transcriptional enhancers.

Nature

466

:

490

493

Wederell

ED

,

Bilenky

M

,

Cullum

R

,

Thiessen

N

,

Dagpinar

M

,

Delaney

A

,

Varhol

R

,

Zhao

Y

,

Zeng

T

,

Bernier

B

,

Ingham

M

,

Hirst

M

,

Robertson

G

,

Marra

MA

,

Jones

S

,

Hoodless

PA

2008

Global analysis of in vivo Foxa2-binding sites in mouse adult liver using massively parallel sequencing.

Nucleic Acids Res

36

:

4549

4564

Huang da W, Sherman BT, Lempicki RA 2009 Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat Protoc 4:44–57

McLean CY, Bristor D, Hiller M, Clarke SL, Schaar BT, Lowe CB, Wenger AM, Bejerano G 2010 GREAT improves functional interpretation of cis-regulatory regions. Nat Biotechnol 28:495–501

Nielsen R, Pedersen TA, Hagenbeek D, Moulos P, Siersbaek R, Megens E, Denissov S, Børgesen M, Francoijs KJ, Mandrup S, Stunnenberg HG 2008 Genome-wide profiling of PPARγ:RXR and RNA polymerase II occupancy reveals temporal activation of distinct metabolic pathways and changes in RXR dimer composition during adipogenesis. Genes Dev 22:2953–2967

Tuteja G, Jensen ST, White P, Kaestner KH 2008 Cis-regulatory modules in the mammalian liver: composition depends on strength of Foxa2 consensus site. Nucleic Acids Res 36:4149–4157

Jacob F 1977 Evolution and tinkering. Science 196:1161–1166

Wabitsch M, Brenner RE, Melzner I, Braun M, Möller P, Heinze E, Debatin KM, Hauner H 2001 Characterization of a human preadipocyte cell strain with high capacity for adipose differentiation. Int J Obes Relat Metab Disord 25:8–15

Schupp M, Cristancho AG, Lefterova MI, Hanniman EA, Briggs ER, Steger DJ, Qatanani M, Curtin JC, Schug J, Ochsner SA, McKenna NJ, Lazar MA 2009 Re-expression of GATA2 cooperates with peroxisome proliferator-activated receptor-γ depletion to revert the adipocyte phenotype. J Biol Chem 284:9458–9464

Kim RJ, Wilson CG, Wabitsch M, Lazar MA, Steppan CM 2006 HIV protease inhibitor-specific alterations in human adipocyte differentiation and metabolism. Obesity 14:994–1002

Lefterova MI, Steger DJ, Zhuo D, Qatanani M, Mullican SE, Tuteja G, Manduchi E, Grant GR, Lazar MA 2010 Cell-specific determinants of peroxisome proliferator-activated receptor γ function in adipocytes and macrophages. Mol Cell Biol 30:2078–2089

Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D 2003 Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci USA 100:11484–11489

Liu XS, Brutlag DL, Liu JS 2002 An algorithm for finding protein-DNA binding sites with applications to chromatin-immunoprecipitation microarray experiments. Nat Biotechnol 20:835–839

Gupta S, Stamatoyannopoulos JA, Bailey TL, Noble WS 2007 Quantifying similarity between motifs. Genome Biol 8:R24

Siepel A, Bejerano G, Pedersen JS, Hinrichs AS, Hou M, Rosenbloom K, Clawson H, Spieth J, Hillier LW, Richards S, Weinstock GM, Wilson RK, Gibbs RA, Kent WJ, Miller W, Haussler D 2005 Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes. Genome Res 15:1034–1050

Ewens WJ, Grant GR 2005 Statistical methods in bioinformatics: an introduction. New York, NY: Springer

Gao N, LeLay J, Vatamaniuk MZ, Rieck S, Friedman JR, Kaestner KH 2008 Dynamic regulation of Pdx1 enhancers by Foxa1 and Foxa2 is essential for pancreas development. Genes Dev 22:3435–3448

Tusher VG, Tibshirani R, Chu G 2001 Significance analysis of microarrays applied to the ionizing radiation response. Proc Natl Acad Sci USA 98:5116–5121

Author notes

* R.E.S. and G.T. contributed equally to this work.

Copyright © 2011 by The Endocrine Society
