|
|
|
|||
| Home Help Feedback Subscriptions Archive Search Table of Contents | ||||
First published online 18 April 2007
doi: 10.1242/dev.001966
| |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

1 Department of Molecular and Cellular Biology, Harvard University, 16 Divinity
Avenue, Cambridge, MA 02138, USA.
2 Department of Statistics, Harvard University, 16 Divinity Avenue, Cambridge,
MA 02138, USA.
3 Department of Statistics, Stanford University, Sequoia Hall, 390 Serra Mall,
Stanford, CA, 94305, USA.
4 Agilent Technologies, 245 First Street, Suite 105, Cambridge, MA 02142,
USA.
5 Institute for Systems Biology, Seattle, WA, 98103, USA.
6 Division of Biology, California Institute of Technology, Pasadena, CA 91125,
USA.
7 Harvard Stem Cell Institute, 16 Divinity Avenue, Cambridge, MA 02138,
USA.
Author for correspondence (e-mail:
mcmahon{at}mcb.harvard.edu)
Accepted 5 March 2007
| SUMMARY |
|---|
|
|
|---|
Key words: Chromatin immunoprecipitation, Gli, Neural patterning, Sonic hedgehog, Mouse
| INTRODUCTION |
|---|
|
|
|---|
One crucial role of sonic hedgehog (Shh) lies in patterning of the
vertebrate neural tube. Here, Shh secreted from the midline notochord acts as
a morphogen to specify five classes of ventral neuronal progenitors in a
concentration-dependent fashion; from dorsal to ventral these are progenitors
for V0, V1, V2 interneuron pools, motoneurons (MN) and V3 interneurons. In
addition, the highest levels of Shh signaling at the ventral midline induce a
second ventral midline domain of Shh-expressing cells, the floor plate (FP),
immediately ventromedial to V3 progenitors (reviewed by
Briscoe and Ericson, 2001
;
Jessell, 2000
;
McMahon et al., 2003
). Genetic
studies have demonstrated that whereas V0, V1, V2 and MN identities can be
specified in the absence of Hh signaling if GliREP is removed, the
most ventral identities, V3 and FP, require Hh signaling and GliACT
forms (Bai et al., 2004
).
Recent work indicates that relatively small differences in the ratio of these
forms dictate distinct transcriptional outputs
(Stamataki et al., 2005
),
suggesting a fine-tuned response to Gli regulators in the cis-regulatory
regions of target genes expressed at distinct Hh thresholds. Clearly, testing
this or any alternative model requires a thorough understanding of the
cis-regulatory mechanisms in direct target genes that dictate Glimediated
regulation.
Here we have initiated a systematic effort to define sets of genes that are directly regulated by Gli activator. Gli1-directed chromatin immunoprecipitation (ChIP) products isolated during the course of Hh-mediated patterning of neural progenitors were used to interrogate custom genomic arrays, successfully identifying and mapping a large number of novel in vivo target sites. Bioinformatic analysis and expression studies in cell culture and transgenic mouse embryos validated these data and provide novel insights into the regulation of both tissue-specific and more general components of a Hh transcriptional response. Further, the analysis of in vivo targets facilitated the development of a novel algorithm for Gli-target site prediction. Based on our data, we present a model for ventral neuronal specification in which Gli-activator and Gli-repressor forms differ substantially in their selection of binding sites. This uneven competition drives the transcriptional interpretation of the morphogen gradient.
| MATERIALS AND METHODS |
|---|
|
|
|---|
The pHSP68lacZ2XINS vector was constructed by PCR amplifying the 2X Chick
ß-globin insulator motif from XB3+Insulator with primers containing
Asc1 sites and inserting this into the Asc1 site of
pHSP68lacZ (Sasaki and Hogan,
1996
). With the exception of the Nkx2.1 (also known as
Titf1 - Mouse Genome Informatics) peak, where only one copy of the
enhancer was used, transgenic constructs were generated by cloning four copies
of test DNA upstream of the minimal promoter in pHSP68lacZ2XINS. The sequence
coordinates of the Ptch1 peak2 enhancer and the Nkx2.2
(Nkx2-2) Gli enhancers are described in the text. In addition, a 575
bp sequence (chr12:53,245,800-53,246,374) encompassing the Nkx2.1 peak and a
291 bp sequence (chr11:77915085-77915375) spanning the Rab34 peak were used to
generate the respective constructs. To construct mutated Gli sites, we changed
the site 5'-TGGGTGGTC-3' to 5'-TCCCACGTC-3'
(NkxM-HSP68lacZ), a mutant form that removes an invariant G
essential for Gli binding and several other residues that contact the zinc
finger motif of Gli1 (Pavletich and Pabo,
1993
).
Embryoid bodies (EBs) were generated using previously described procedures
(Wichterle et al., 2002
).
After 2 days, the media was changed to DFNB containing 500 nM retinoic acid
and, when applicable, 1 µM Hh agonist (HH-Ag)
(Frank-Kamenetsky et al.,
2002
). EBs were grown for an additional 3 days to induce neural
progenitor stages, at which point cultures were harvested for ChIP.
Immunostaining with Pax6, Nkx6.1 (Nkx6-1), Nkx2.2 and FoxA2 was performed
using previously published methods
(Wijgerde et al., 2002
). For
western blots, samples were run out on 4-12% BisTris gradient gels and protein
levels were normalized using levels of ß-gal. The M2 FLAG antibody
(Sigma) or anti-ß-galactosidase (Promega Z3781) were each used at a
1:2000 dilution. For immunostains, YFP was detected using an anti-GFP at
1:2000 (Abcam ab290).
Chromatin immunoprecipitation and hybridization
We modified a published protocol (Odom
et al., 2004
) to reduce the number of input cells to approximately
2x106 cells per precipitation using a FLAG M2 mouse
monoclonal antibody (Sigma). After isolating chromatin, we incubated it with
previously prepared M2-coated magnetic beads. To prepare beads, magnetic sheep
anti-mouse IgG beads (Dynal) were incubated overnight with 0.8 µg M2 Mouse
monoclonal anti-Flag antibody (Sigma) and 10 µl of beads per precipitation.
The following day, the beads were rinsed and added to chromatin and incubated
overnight at 4°C. Samples were then rinsed five times with RIPA buffer,
and the antibody was stripped from the beads by incubating in 1% SDS at
65°C for 15 minutes and cross-linking was reversed by incubating overnight
at 65°C. The next day, samples were sequentially treated with RNAse A and
Proteinase K, phenol:chloroform extracted, ethanol precipitated with 20 µg
glycogen and resuspended in 60 µl 10 mM Tris pH 8.0. ChIP-enriched DNA
samples were amplified before hybridization to the tiling array. For this, DNA
samples were blunted using T4 DNA polymerase (50 µl of each ChIPed sample
with 0.6 units of T4 DNA polymerase in a 110 µl volume at 12°C for 15
minutes), phenol:chloroform extracted with 10 µg glycogen, ethanol
precipitated and resuspended in 25 µl water. This material was then
blunt-ligated to unidirectional linkers
(Ren et al., 2000
) using 15
µM annealed linkers and a Quick Ligation Kit (New England Biolabs) in a 21
µl volume at 12°C overnight. The reaction was then ethanol
precipitated, resuspended in 25 µl water and PCR amplified using 1 µM
primer oJW102 with an annealing temperature of 60°C and a 1 minute
extension at 72°C for 26 or 29 cycles. To ensure that samples were
uniformly amplified, we reassayed the samples by qPCR.
For each ChIP, 2 µg amplified ChIP sample or Input control was labeled with Cy3 or Cy5-dUTP (Perkin Elmer) using the Bioprime Array CGH kit (Invitrogen) and competitively hybridized using 5 µg labeled material per channel with the Oligo CGH Hybridization Kit (Agilent Technologies) on a custom tiled 44K array from Agilent Technologies (see Table S1 in the supplementary material). Complete details for array construction and all raw array data may be obtained from GEO (GEO #GSE5683). Three independent biological samples were employed in the analysis. Within each sample, dye-swap experiments were performed in duplicate for a total of four technical replicates per biological sample.
The hybridizations were done in Agilent SureHyb hybridization chambers, and
in an Agilent rotating hyb oven at a speed of 10 rpm for 72 hours at 65°C.
Each hybridization was performed in duplicate and by exchanging labeled dyes
(dye-swapping) for a total of four technical replicates per biological sample.
Samples were washed sequentially for 5 minutes at room temperature using Oligo
aCHG Wash Buffer 1 (Agilent Technologies), 1 minute at 37°C in Oligo aCGH
wash buffer 2 (Agilent Technologies), 1 minute in Acetonitrile at room
temperature and 30 seconds in Stabilization and Drying Solution (Agilent
Technologies). Finished arrays were scanned using an Agilent scanner at a 10
µm resolution with both channels at a PMT setting of 100. Array images were
extracted using Agilent's Feature Extraction Software (version 8.5.1.1) using
the standard CGH protocol included with the software. The settings included
spatial detrending of the extracted array data and a linear median
normalization. Log fold changes were obtained for each probe, and the four
technical replicates within each biological replicate were averaged. The
correlation coefficient between any two technical replicates using the same
dye labeling was r=0.948 (s.d.=0.020), and between two technical
replicates that use different dye labeling was r=0.773 (s.d.=0.095).
In contrast, the correlation between two biological replicates was
r=0.068 (s.d.=0.075). T-scores were obtained using TileMap in the
moving average mode (Ji and Wong,
2005
). All coordinates in this manuscript refer to the Build 34
assembly of the mouse genome.
Mapping of Gli sites, determination of conserved regions and motif identification
Potential binding regions (see Table S2 in the supplementary material) were
extended 200 bp from both ends and repeats were masked (A. F. A. Smit, R.
Hubley and P. Green, RepeatMasker at
http://repeatmasker.org).
We then mapped the Gli consensus-binding pattern TGGGTGGTC
(Kinzler and Vogelstein,
1990
), allowing only one bp mismatch. Candidate binding regions
were binned into groups of size ten according to their initial rank, and the
number of Gli sites (n1) and the total number of surveyed
non-repeat base pairs (n2) were counted for each tier and reported
in Table S3 in the supplementary material. To determine the background level
of Gli occurrence, the Gli consensus sites were also mapped to all tiled
regions present on the array, yielding 1264 sites in a total of 5,608,267 bp
non-repeat sequences. Gli-binding-site enrichment was then computed for each
tier by comparing its site density to the site density of control regions,
i.e.
r1=(n1/n2)target/(n1/n2)control.
In a separate approach, we examined the degree of cross-species
conservation among binding sites. The multiple species alignment among mouse,
rat, human, dog and zebrafish were used to compute a conservation score for
each position of mouse genome. Sites whose mean conservation score is among
the top 10% of the genome were called conserved sites. The number of conserved
Gli sites (n3) and the total number of surveyed base pairs that are
conserved (n4) were counted for each tier and control regions. We
computed the enrichment of conserved Gli sites in conserved regions as:
![]() |
and
![]() |
r2 measures the extent to which Gli sites are enriched in conserved regions, and r3 measures the combined gain we can obtain using both bindingsequence specificity and cross-species conservation information.
To perform de novo motif discovery, Gibbs motif sampler was applied to the
top 13 tier 1 peaks after extending each defined peak 100 bp from both ends.
The Gli matrix identified by motif analysis were visualized using WebLogo
(Crooks et al., 2004
). To map
Gli consensus sites, the core-binding pattern TGGGTGGTC was used to scan
genomic sequences; up to one mismatch was allowed. We computed cross-species
conservation scores using mouse, rat, human, dog and zebrafish alignments.
Motif sites that reside within the top 10% most conserved regions in the mouse
genome are defined as phylogenetically conserved sites.
In silico predictions of Gli target genes and enhancers
To generate an experimental dataset for predictions, retinoic acid-treated
EBs were grown in the presence or absence of HH-Ag. We did not observe
induction of ventral Hh markers in RA-treated constitutive Gli1FLAG
EBs and used these for the control, baseline set. To pick genes that were
upregulated by Hh signaling, a modified t-statistic
(Ji and Wong, 2005
) was
computed for each gene to test if its expression level is higher in Hh-induced
samples compared with non-induced samples and the top 365 probe sets
corresponding to 249 non-redundant genes were used in further analyses (see
Table S5 in the supplementary material). We then identified their orthologs
from human, dog, cow, chicken and zebrafish. Each gene was extended 50 kb
upstream from transcription start and 50 kb downstream from transcription end,
and predictions were made based on these sequences.
In order to make EEL predictions, EEL was run using the same parameters as
Hallikas et al. (Hallikas et al.,
2006
). In addition to the previous 107 motifposition-specific
weight matrices, we also included the Gli matrix recovered in this current
study (see Fig. S2 in the supplementary material). Pairwise alignments between
mouse-human, mouse-dog, mouse-cow, mouse-chicken and mouse-zebrafish were
generated. As EEL did not provide a way to combine these pairwise alignments
into multiple alignments and to rank predicted enhancers accordingly, we
decided to use mouse-human alignments as the basis for making our predictions.
Each predicted enhancer was then annotated to reflect whether it was also
found in other pairwise alignments. Predictions shorter than 2000 bp, with an
EEL score
100 and a combined Gli score
25 were picked, retained and
ranked by their combined Gli scores.
To perform MCA, we first collected conserved, non-coding genomic segments
from the mouse genome. Cross-species conservation scores were computed based
on percent identities measured in a 50 bp sliding window using multiz mouse,
rat, human, dog and zebrafish alignments (downloaded from
http://genome.ucsc.edu).
The scores ranged from 0 to 255, with 255 corresponding to the most conserved
status. We collected all continuous segments with a score
40 (top 10%),
discarding segments shorter than 50 bp. These first-stage segments were joined
to form second-stage segments if their end-to-end distance (gap) was
50
bp. The second-stage segments were filtered further to remove segments <200
bp or if they did not contain at least one first-stage segment >100 bp. The
remaining segments were termed thirdstage segments, comprising the final
collection of conserved genomic regions.
To measure Gli enrichment, we mapped the Gli matrix recovered from the ChIP
to all third-stage conserved genomic regions. The Gli matrix was compared to a
third-order background Markov model, and a likelihood ratio
500 was used
as a cutoff to define Gli sites. The occurrence of Gli sites in conserved
genomic segments was then modeled as Poisson processes. If a segment is a
Gli-binding region (the alternative hypothesis H1), the occurrence
rate of Gli sites was assumed to be
1 (site/bp) and each
Gli site was assumed to be generated from the Gli-binding matrix. In contrast,
if a segment is not a Gli-binding region (the null hypothesis H0),
the occurrence rate of Gli sites was assumed to be
0 and
all such sites were assumed to be false sites and generated from the
background Markov model. The rate
0 was estimated based on
the mapping results in general conserved regions
(
0=0.0002/bp), and
1 was estimated using
conserved regions covered by the top 25 peaks
(
1=0.0015/bp). For each segment, the log likelihood ratio
between H1 and H0 was computed as:
![]() |
Here, l is the length of a conserved segment, n is the number of Gli sites in the segment obtained by matrix mapping, L1 is the probability to generate the sites according to Gli matrix, and L0 is the probability to generate the sites by background Markov model. Finally, S was used to measure the Gli enrichment and rank conserved segments.
We also mapped Gli consensus TGGGTGGTC (allowing
1 mismatch) to the
third-stage conserved genomic regions. All regions that contain at least one
Gli consensus site were selected as a potential Gli cis-regulatory module
(CRM). The Gli CRMs are then ranked by Gli enrichment score S. For each gene,
all Gli CRMs located within introns, UTRs, 50 kb upstream regions (from
transcription start) and 50 kb downstream regions (from transcription end)
were collected and the enrichment scores were added together to derive the
combined Gli-binding strength for that gene.
Cell culture
Gli luciferase assays in murine NIH3T3 cells used methods described
previously (Nybakken et al.,
2005
). We used 25 ng Gli1 cloned into pCIG
(Megason and McMahon, 2002
) or
empty pCIG vector for controls. Candidate enhancers
(Table 2) were PCR-amplified
from genomic DNA and cloned into the pGL3-Promoter vector (Promega).
|
| RESULTS |
|---|
|
|
|---|
As a strategy for isolating Gli1 targets in the ventral neural tube, we
focused on the demonstrated ability of Shh to recapitulate its ventral
neuralizing activity in the specification of ES cells differentiated to form
neuralized EBs (Wichterle et al.,
2002
). In this assay, addition of Shh, or small molecule agonists
that activate the Shh pathway, led to a quantitative ventralization of neural
progenitors closely mirroring the neural patterning activity of Shh in the
ventral neural tube. In the absence of Hh agonist (Hh-Ag), the
Gli1FLAG line was not sufficient to specify Nkx6.1+-
(V2, MN, V3, FP progenitors), Nkx2.2+- (V3 progenitors) or
FoxA2+-(FP) producing cell types
(Fig. 1B-D,J-L). The only
differential response detected between the induced Gli1FLAG line
and control, parental EBs was a large decrease in the number of
Pax6high cells, although the total number of Pax6-producing cells
was not altered (70%
=10% of cells are Pax6high in control
EBs compared with 2%
=4% for Gli1FLAG EBs;
Fig. 1A,I,Q); Pax6 repression
has been shown to be Shh-dependent
(Ericson et al., 1997
). In
contrast to this weak response, Gli1FLAG resulted in a strong
synergy with Hh-Ag (Frank-Kamenetsky et
al., 2002
). At a dose of 1 µM, EBs were only weakly ventralized
(Nkx6.1+, Nkx2.2-, FoxA2-). In contrast, when
Gli1FLAG was present, an identical dose of Hh-Ag resulted in the
induction of ventralmost neural (Nkx2.2+, V3 progenitor) and FP
(FoxA2+) identities (compare
Fig. 1F-H with N-P), as well as
a substantial reduction in Pax6+ cells consistent with Nkx2.2
repression of Pax6 (Fig. 1E,M)
(Ericson et al., 1997
).
Nkx2.2+ and FoxA2+ cells comprised approximately 10%
each of the EB-derived population. The requirement of Shh for Gli1 activation
- even when under control of an ectopic promoter - was previously noted by Bai
and Joyner (Bai and Joyner,
2001
).
|
To verify Gli1FLAG-specific enrichment of expected
cis-regulatory regions in Hh-Ag-induced EBs, we adapted a standard ChIP
protocol to perform ChIP on 2x106 cells
(Odom et al., 2004
). Chromatin
extracts were incubated with anti-FLAG antibody and following IP, the
enrichment of known Gli targets in IP fractions was examined by quantitative
RT-PCR (qPCR). qPCR confirmed specific enrichment of Gli-binding sites in
several known target regions, including the FoxA2 FP enhancer
(Sasaki et al., 1997
), a
conserved cis-regulatory region upstream of Ptch1 previously
associated with Shh-dependent regulation in human Ptch1
(Agren et al., 2004
), and
Gli1 itself (Dai et al.,
1999
). Gli1 expression is absolutely dependent on prior
Hh signaling (Bai et al.,
2004
).
Having verified the ChIP procedure, we screened the custom tiling array.
The regions significantly enriched in the IP fraction were ranked by a
T-score, identifying 47 peaks having a false discovery rate of
25% (see
Table S2 in the supplementary material). As the top 13 ranked sites contained
all previously known Gli targets known to be expressed in the neural tube
(Table 2, underlined), we used
these regions to determine if we could uncover any enriched motif that might
represent a transcription-factor-binding site without any prior knowledge of
that site. Employing a Gibbs motif sampler
(Lawrence et al., 1993
;
Liu, 1994
), we obtained two
distinct motifs. The first motif (Fig.
2I) was very similar to the Gli-binding motif (TGGGTGGTC;
Fig. 2J; see Fig. S2 in the
supplementary material) determined by oligonucleotide binding to recombinant
proteins (Hallikas et al.,
2006
; Kinzler and Vogelstein,
1990
; Vortkamp et al.,
1995
), a confirmation that our strategy specifically isolates
sequences directly bound by Gli1FLAG. Further, the analysis
predicted nucleotide preferences extending beyond the core Gli consensus
region (Fig. 2I). A second,
lower scoring motif with a GC-rich pattern was the only other significantly
enriched sequence; this is not enriched in conserved regions and is also
present in multiple, unrelated ChIP datasets, suggesting that it is
non-specific (data not shown).
|
1 mismatch) was enriched in peak regions with
high T-scores, when compared with control regions; however, we observed a
distinct drop in enrichment from about 5-fold in the first two bins to
3.5-fold in the third bin (see Table S3 in the supplementary material). Of the
five motif sites detected in the third bin, four of them occurred in regions
ranked 21-25, and only one in regions ranked 26-30. Consequently, we focused
on the top group of 25 predictions in subsequent analyses.
Fig. 2 shows examples of the
data for a subset of genes studied here. These 25 regions contained 28 Gli
consensus motif sites in a total of 23,656 bp of non-repeat sequences. The
width of the peak itself is predicted to depend both on the number of Gli
sites within a particular region and the length of chromatin fragments
generated on DNA shearing. The first, second (median) and third quantiles of
region length were 295, 474 and 707 bp, respectively. Many, but not all, peaks
are centered about a strong Gli consensus sequence. In total, we observed a
5.25-fold enrichment of predicted Gli target sites within peaks when compared
with all regions tiled on the array. If we were to assume that the top 25 ChIP peaks represented all the Gli-binding events on the array, only 28/1264 Gli consensus sites (2.22%) were bound by Gli (some ChIP peaks had multiple consensus sites, while others did not contain any consensus sites). Among the 1264 consensus sites tiled in the array, 299 were phylogenetically conserved, and 20 of these (6.69%) were associated with Gli binding. These regions were associated with 16 genes, indicating that a significant fraction of the targets contained multiple Gli-binding modules. Interestingly, only seven of the regions identified (28%) were within the 5 kb proximal promoter region. Remaining sites were located in the first or second intron at varying distances from the TSS (24%), at a range greater than 5 kb 5' of the TSS (24%), after the end of transcription (TES) (16%) or elsewhere within the transcript (8%) (Table 1). These results highlight the problem of local promoter analyses for large-scale mining of transcriptional regulatory mechanisms in the mammalian genome.
|
Identification of neural Gli enhancers
We next examined this list for the presence of known Gli-binding sites that
mapped within reported cis-regulatory regions in Ptch1
(Agren et al., 2004
;
Hallikas et al., 2006
),
Gli1 (Dai et al.,
1999
), Nkx2-9
(Santagati et al., 2003
) and
FoxA2 (Sasaki et al.,
1997
); each of these predicted target sites was identified (arrows
in Fig. 2A,G,C,H,
respectively). To the best of our knowledge, these represent all published,
mapped Gli-binding sites in targets expected to be expressed within the
EB-generated ventral neural target population. Based on the presence of these
previously validated sites, we further divided the ranked list into a first
tier of 13 regions that included all the previously characterized Gli enhancer
sites and a second tier of 12 regions that represented additional peaks of
high statistical quality (Table
1, Table S2 in the supplementary material, and data below). As
expected from our previous transcriptional profiling data, the transcripts
adjacent to these peaks were upregulated in Ptch1lacZ/lacZ
embryos (Goodrich et al.,
1997
) and downregulated in Smo-/- embryos
(Zhang et al., 2001
), where Hh
signaling was enhanced or absent, respectively (see Fig. S3 in the
supplementary material). In addition to the one previously reported Gli
enhancer in Nkx2.9 (Santagati et
al., 2003
) (Fig.
2C, peak 1), we detected an additional site
(Fig. 2C, peak 2) approximately
8.7 kb 5' to the TSS. In human Ptch1, a single 2981 bp
enhancer/promoter fragment has been reported from studies in NIH3T3 and
HEK293T cells; two conserved, predicted Gli-binding sites are predicted in
this region, one of which has been shown to be functional
(Agren et al., 2004
). The
homologous region was detected in our analysis (peak 2, arrow in
Fig. 2A). We also identified
five additional peaks, including three that were highly enriched
(Fig. 2A). One of these, peak
5, lies immediately upstream of an uncharacterized alternatively spliced
full-length form of Ptch1. While human Ptch1 contains three
alternatively spliced full-length transcripts
(Agren et al., 2004
), there are
only two full-length forms represented among mouse expressed sequence tags
(ESTs). A peak immediately upstream of Gli1 is unusually broad,
possibly reflecting an extended region of Gli1 that also binds Gli3
(Fig. 2G)
(Dai et al., 1999
). We also
detected strong peaks in Ptch2
(Fig. 2E, two peaks),
Hhip1 (data not shown, two peaks), Nkx2.2
(Fig. 2B, one peak),
Nkx2.1 (Fig. 2F, one
peak) and Rab34 (Fig.
2D, one peak); with the exception of Nkx2.2
(Lei et al., 2006
), none of
these putative Gli-regulatory regions have been previously reported
(Fig. 2). Ptch2 and
Hhip1 are members of the Hh pathway that have been reported to show
Hh-dependent regulation (Chuang and
McMahon, 1999
; Motoyama et
al., 1998
); Nkx2.2 encodes a transcriptional determinant
of the Shh-dependent V3 interneurons
(Briscoe et al., 1999
);
Nkx2.1 encodes a Shh-dependent transcriptional regulator within
ventral neural populations of the forebrain
(Pabst et al., 2000
); and
Rab34 encodes a small GTP-binding protein not previously associated
with Hh signaling that is predicted from our transcriptional array data to be
a rather general target of Hh regulation in several tissues.
Differences between generic and tissue-restricted Gli enhancers
The genes identified as targets in our analysis exhibited either a global
(multiple Hh-responsive tissues) or tissue-restricted pattern of expression
(see Fig. S3 in the supplementary material). To test Gli-mediated regulation
in a putative cis-regulatory enhancer region, we identified 12 candidate
enhancers (Table 2) and asked
if these could respond to Gli1 stimulation. These assays were performed in the
presence and absence of Gli1 using murine NIH3T3 cells, a fibroblast line that
has been used to define several cis-regulatory regions, including a
Ptch1 enhancer region (Ptch1, peak 2) identified in this
screen (Agren et al., 2004
). We
focused our analysis of candidate sites among global Hh responders (Ptch1,
Ptch2, Hhip1 and Rab34) and compared it with the putative
regulatory region in Nkx2.2, the expression of which has only been
shown to require Hh signaling in the context of neural progenitor
specification (see Fig. S3B-B in the supplementary material). As expected, we
did not detect any specific Gli-mediated induction for the neural-specific
Nkx2.2 candidate enhancer in the reporter assay
(Table 2). Nor did we observe
reporter activation for either of the peaks in Ptch1
(Fig. 2A, peak 6) or
Hhip1 (peak 2, data not shown) that failed to show a consensus Gli
motif. However, we observed a robust induction with the Gli site in
Ptch1 peak 2 (Fig.
2C), as well as in previously unknown sites in Ptch1
Intron2 (Fig. 2C, peak 1), and
Ptch2 (Fig. 2E, peak
2). A more modest response was detected in Ptch2 peak 1
(Fig. 2E), Hhip1 peak
1 (data not shown) and Rab34 (Fig.
2D). Taken together, these data
(Table 2) indicate that many
Gli-responsive elements are confirmed in this assay; those that are not may
reflect the absence of context-dependent factors (see Discussion).
To further explore the properties of a subset of these putative enhancers
in more detail, we selected four of the regions for more extensive transgenic
analysis. We chose a 180 bp Ptch1 peak 2 enhancer region
(Fig. 2A;
Table 2) representing a global
target gene. As tissue-specific candidates, we chose a 239 bp region (a region
non-responsive in the NIH3T3 assay) representing the single, Gli target
prediction within a peak located approximately 1.9 kb 5' of the TSS in
Nkx2.2, a transcriptional regulator with neurally restricted activity
associated with V3 interneuron progenitor specification
(Fig. 2B;
Table 2). Very recently, this
site was shown to be Gli responsive (Lei
et al., 2006
). In addition, we selected a 575 bp region
encompassing the single peak in Nkx2.1 11.5 kb downstream of the TES
(Fig. 2F) of this forebrain
restricted gene (see Fig. S3F in the supplementary material). Finally, we
selected a 291 bp region around the single, novel peak in Rab34 intron 1 (see
above, Fig. 2D).
|
Nkx2.2-lacZ transgenic embryos were analyzed directly at
E10.5; nine of 15 showed detectable ß-gal activity; of these, seven had
ventral-specific staining in the presumptive brain and spinal cord that
clearly encompassed the normal Nkx2.2 domain
(Fig. 3E,F,K,L,Q,R).
Examination of sections indicated two ectopic regions of transgenic activity
when compared to Nkx2.2 expression in the neural tube at this stage
(Fig. 3E,K,Q). One was in the
floor plate (arrowhead in Fig.
3L), the other dorsal to the normal Nkx2.2 domain (arrow
in Fig. 3L). To determine
whether the observed expression was Gli-dependent, the single predicted Gli
site was mutated, and expression of the resulting Nkx2.2M
lacZ transgene was analyzed in G0 embryos at E10.5. In contrast to
the previous results, only four out of 11 transgenic embryos showed any
ß-gal activity, and activity was generally weaker and more restricted.
Expression in the normal Nkx2.2 domain (in both the brain and spinal cord) and
ectopic expression in the floor plate were entirely dependent upon the
Gli-binding region (Fig.
3G,M,S); the weak ectopic, ventrolateral spinal cord domain,
however, was Gli-site-independent (arrowhead in
Fig. 3M). In summary, this peak
region encodes a CRM that regulates Gli-dependent expression of
Nkx2.2 in a domain that encompasses its native expression domain.
However, Gli-site dependent expression in the FP region indicates that
additional cis-regulatory elements are required to repress Nkx2.2 in
the floor plate; Nkx2.2 is first activated within ventral midline
cells and is co-expressed together with the FP determinant FoxA2 at the outset
of FP formation (Jeong and McMahon,
2005
). In addition, ectopic ventrolateral Gli-independent activity
suggests that additional regulatory regions normally silence non-Hh dependent
activation of Nkx2.2 outside its normal domain. Previous data indicate that
Pax6, and potentially, Tcf4 play an important role in the
ventral restriction of Nkx2.2 (Ericson et
al., 1997
; Lei et al.,
2006
).
|
Rab34 has a broad expression pattern that would not normally be suggestive of a Hh target gene (Fig. 3W). Nonetheless, transgenic embryos expressed a restricted subset of this expression that correlates well with Hh-responsive regions - 11/17 transgenics expressed ß-gal and 10/11 were in a similar, restricted pattern (Fig. 3X). For example, expression of Rab34 in the limb bud is diffuse, but ß-gal driven by the predicted Gli target region was sharply restricted to the posterior, Hh-responsive portion of the limb bud in 7/10 transgenics (Fig. 3A',B'). This expression seemed to commence after the onset of Hh expression, as the three transgenics without limb bud expression were slightly younger and only the oldest transgenics exhibited expression within the later arising hindlimb.
Gli target specificity is determined by the nuclear availability of Gli1
After characterizing these enhancers, we asked if Gli1 could bind to
targets in the absence of gene expression. As shown in
Fig. 1, Nkx2.2, FoxA2 and other
markers of Hh-dependent ventral neuronal cells were not expressed in
RA-treated EBs in the absence of Hh agonist. We examined the degree of Gli1
binding to the enhancer sites of several of these tissue-restricted genes as
well as the degree of Gli1 binding to several enhancers that have more global
response using the standard ChIP assay. As an additional experiment, we also
asked whether we could facilitate Gli1 binding to targets by sequestering it
in the nucleus by treatment with leptomycin B immediately prior to
harvesting.
Our experiments demonstrate that tissue-specific Gli enhancers generally contain little or no Gli1 binding in RA-only treated EBs. For example, the Nkx2.2 enhancer exhibited 6.9-fold enrichment in ChIPs from Hh-stimulated EBs where gene expression occurs, but was enriched only 1.3-fold in RA-only samples; the Nkx2.9 enhancer was enriched 10-fold in Hh-stimulated versus 1.7-fold in the RA-only samples. Similar trends were seen for Nkx2.1 and FoxA2 (data not shown). In contrast, global enhancers were enriched by Gli1 in both the presence and absence of Hh stimulation. Here, Ptch1 peak 2 was enriched in both RA-only samples (14.5-fold) and in Hh-stimulated EBs (25.2-fold). Enrichment of Gli binding in the absence of Hh stimulation was also seen for the other global enhancers Ptch1 peak1, Ptch2peak2, Rab34 and Gli1 (data not shown). Thus, Gli1 is able to bind to global enhancers in the absence of Hh stimulation, but is unable to bind efficiently to tissue-specific enhancers in the absence of corresponding gene expression. Surprisingly, when RA-only EBs were treated with 5 nM leptomycin B (Lepto) for 4 hours before ChIP to cause nuclear accumulation of Gli1 (see Fig. S1C in the supplementary material), they exhibited marked increases in the levels of binding to tissue-specific enhancers. Gli1 enrichment of the Nkx2.2 enhancers was 3.3-fold in RA+Lepto samples compared with 1.3-fold in RA-only samples; similarly, the Nkx2.9 enhancer was enriched 9.4-fold in RA+Lepto-treated samples compared with 1.7 fold in RA-only samples. Leptomycin treatment caused similar increases in enrichment for Nkx2.1 and FoxA2, while maintaining the levels of enrichment seen previously for global enhancers in RA-only samples (data not shown). These data argue that Gli1 nuclear concentration is a major determinant of enhancer-binding specificity.
In silico prediction of Gli targets in the ventral neural tube
Next we sought to incorporate information from the Gli target
identification to develop a general, testable method for predicting Gli
targets in a tissue-specific fashion. We generated a pool of candidate targets
by transcriptional profiling of EBs in the presence and absence of Hh
signaling and selected 249 genes upregulated in response to Hh-Ag treatment, a
group that includes relevant Gli targets in the ventral neural tube (see Table
S5 in the supplementary material).
We then applied two distinct methods: one a recently published algorithm,
Enhancer Element Locater (EEL) (Hallikas
et al., 2006
); and the other a novel method, Module Cluster
Analysis (MCA). Our ChIP-chip data indicated that several Gli targets contain
multiple peaks that presumably correspond to multiple CRMs (e.g. Ptch1,
Nkx2.9). In MCA, Gli-binding affinities from multiple CRMs were combined,
and the combined signal was used as a predictor of Gli target genes (see
Materials and methods).
When EEL was applied to the 249 genes upregulated in the Hh-treated EBs, the top 20 predicted enhancers, ranked by their combined Gli-binding strength, included five regions identified in our ChIP screens. We performed qPCR on the remaining 15 predictions and verified one additional peak corresponding to an EST (see Table S6 in the supplementary material), resulting in an effective prediction rate of 1/15 enhancer sites.
When MCA was applied on a genome-wide basis, Ptch1 and Hhip1 were both present in the top 10 genes (see Table S7 in the supplementary material), suggesting that the combined Gli-binding strength serves as a good predictor of Gli target genes. We intersected this genome-wide dataset with the Hh-EB upregulated gene list to predict Gli target genes specifically within the ventral neural tube. We then ranked all predicted CRMs that were associated with the top 20 neural Gli target genes and tested the top 20 individual CRMs by qPCR on ChIPed material. Within these top-ranking CRMs, 45% (9/20) were indeed Gli-activator targets by this assay (see Table S8 in the supplementary material). This set included six peaks that had been previously identified by our ChIP-chip studies, giving a new discovery rate of 3/14, compared to 1/15 with EEL.
To summarize, 5/22 of the MCA predicted enhancers (or 11/28 including predictions covered by ChIP peaks) were bound by Gli1. At the gene level, at least 7/20 MCA predicted genes were verified to be direct Gli targets. Thus, pooling information from multiple potential CRMs in the MCA approach improved the ability to predict Gli target genes computationally.
| DISCUSSION |
|---|
|
|
|---|
Identification of enhancer elements during neural tube development
Genome-scale transcription factor binding approaches have been used with
great success in yeast, and more recently in ES cells
(Lee et al., 2006
) and
Drosophila embryos (Alekseyenko et
al., 2006
; Sandmann et al.,
2006
). Embryonic analysis in mammalian systems presents several
challenges, including the limited number of cells present in a given target
population at a crucial regulatory stage and the dynamic expression of
transcription factors during development. By modifying existing protocols we
were able to generate robust results with an approximately 50-fold reduction
in input ChIP product (approximately 2x106 cells per ChIP).
The degree of enrichment we obtained with the Gli1FLAG construct is
impressive considering that: (1) Gli1 is predominately cytoplasmic, and Hh
signal-directed nuclear accumulation is transient
(Kogerman et al., 1999
); (2)
the epitope-tagged construct must compete with endogenously transcribed,
untagged Gli-activator; and (3) the target population is a mosaic in which
relatively few cells represent a given ventral progenitor type
(Fig. 1Q). Indeed, we were able
to identify bona-fide Gli-activator target sites within enhancers in
Nkx2.2 and FoxA2, where Nkx2.2+ and
FoxA2+ cells represented only 10% of the input population.
Interestingly, a subset of predicted peaks did not contain any Gli consensus
sites. These regions might be directly Gli-responsive despite the absence of a
consensus Gli DNA-binding site. Alternatively, they might represent binding
sites for other transcriptional regulators that could then make secondary
protein-protein interactions with Gli1. While no specific interactions have
been demonstrated for Gli-activator forms, Gli3 proteins have previously been
shown to interact with members of the histone deacetylase complex
(Cheng and Bishop, 2002
;
Dai et al., 2002
) as well as
with Smad proteins (Liu et al.,
1998
).
In addition to previously characterized FoxA2 and Nkx2.9 enhancers, we identified Nkx2.2, Nkx2.1 and Rab34 as direct targets through transgenic analysis and several new additional targets through validated bioinformatics predictions. However, direct transgenic analysis of Nkx2.2 illustrates one major challenge the researcher faces in constructing regulatory circuitry from this data: that is, the identification of other cis-regulatory regions that act in conjunction with Gli factors to give appropriate temporal and spatial regulation. In the example of the Nkx2.2 element we assayed, sequences that repress Gli-dependent FP expression and silence non-Gli dependent ventrolateral neural expression were not present and remain to be identified.
Surprisingly, our data indicate that a global Hh response represented by Ptch1, the best characterized transcriptional response to Hh signaling, may in fact be controlled by cell-type-specific response elements. Our analysis of one of four strong, independent Gli-binding regions demonstrates that the region selected represents a subset of the full range and activity of the Ptch1 transcriptional response. This leaves open the possibility that other regions may encode limb response elements, or that a specific limb response element that normally acts in conjunction with the Gli site in Ptch1 peak 2 lies outside the regulatory module assayed here. Our assay for putative enhancers using NIH3T3 cells allows us to predict several additional enhancers that are strong candidates for global regulators of a Hh signaling response (Table 2). Consistent with this interpretation, several genes associated with these enhancers are expressed in multiple Hh-responsive tissues (see Fig. S3 in the supplementary material), and all these targets seem to participate in negative feedback loops to regulate expression levels (see model in Fig. 4A).
Among the novel targets, Rab34 is particularly intriguing given
the role of another small GTP-binding protein, Rab23, as a crucial regulator
of Hh responsiveness (Eggenschwiler et
al., 2006
; Eggenschwiler et
al., 2001
) and an independent link between Smo and Ptch1 activity
in endosomal trafficking (Zhu et al.,
2003
). While transcriptional profiling identified Rab34
as a general, positive target of Hh action in many tissues, Rab34 is
itself rather broadly expressed and therefore, based on expression alone,
would not be viewed as a strong candidate for a specific role in the Hh
pathway. However, the ChIP data, coupled with studies of Rab34 expression in
Ptch1 and Smo mutants (see Fig. S3D-D in the supplementary material),
indicate that a component of the Rab34 expression pattern is Hh/Gli-dependent,
an observation supported by our transgenic studies
(Fig.
2W,X,A',B').
|
As noted, we have failed to detect Gli-binding sites in 150 kb of tiled
sequence flanking the Nkx6.1, one of the very earliest responders to
the Hh pathway (see Table S1 in the supplementary material)
(Jeong and McMahon, 2005
).
Whereas the possibility of a more distant Gli1-binding enhancer region remains
open, an alternative explanation is that Gli activator binds only to the
subset of Gli targets that are expressed in the ventralmost FP and pV3
domains. Because Nkx6.1 is still expressed in mice lacking the activity of all
Gli proteins (both activator and repressor) in the neural tube, its expression
depends principally upon loss of Gli repression rather than Gli activation
(Bai et al., 2004
;
Lei et al., 2004
;
Wijgerde et al., 2002
).
Our findings, together with published reports that have documented the
relationship between Gli factors and Shh signaling in specification and the
interactions among specific transcriptional regulators downstream of Shh
input, lead to the model depicted in Fig.
4B (Briscoe et al.,
2000
; Ericson et al.,
1997
; Fogarty et al.,
2005
; Novitch et al.,
2003
; Santagati et al.,
2003
; Sasaki and Hogan,
1996
; Vallstedt et al.,
2001
; Wijgerde et al.,
2002
). Shh signaling within the neural tube is effectively
interpreted by region-specific combinations of Gli activator and Gli repressor
forms. Initially only two cell types are present - one is a V0/V1 bipotential
progenitor that depends only indirectly on a loss of Gli repression, which
would in turn result in the loss of an inhibitory dorsal signal
(Fogarty et al., 2005
;
Wijgerde et al., 2002
). In the
second cell state, there is an initial V2 progenitor, which would then
sequentially give rise to V2-FP progenitor domains. Most dorsally, where Shh
is never received, Gli repressors silence all Hh target genes. At the dorsal
limits of Hh signaling (at the dorsal-ventral intersect), Gli repressor would
be reduced, thereby relieving the inhibitory action of an unknown factor X on
Dbx1 and Dbx2. Dorsal BMP signals have previously been hypothesized to play
such a role (Wijgerde et al.,
2002
); this interaction with Gli repressor is especially
attractive, as Gli3 has previously been shown to bind Smad proteins,
transcriptional mediators of BMP signaling
(Liu et al., 1998
).
Importantly, the regulation of these domains would not depend on any
differential activity of Gli repressor - only on the attenuation of the dorsal
inhibitory input. In domains that receive a higher Hh input, Gli repression
would be progressively attenuated to the pMN domain, and loss of this
repressor is sufficient for activation of targets. Within the pV3 and FP
domains, however, specification requires varying levels of Gli-activator
activity. Recent experiments in chick argue persuasively for a ratiometric
assessment of Gli repressor and activator in dictating the position-specific
Hh response both in the limb and the neural tube
(Davey et al., 2006
) (E.
Dessaud and J. Briscoe, personal communication). Finally, the available data
indicate an important dynamic component to ventral patterning, in which
specific neural progenitor populations may change their identity on
integrating Hh signaling over time (see Movie 1 in the supplementary
material).
In summary, by applying biochemical and bioinformatics approaches, we have a new view of Hh targets, their regulation and the general parameters that characterize a Gli-activator response in patterning the vertebrate neural tube. These data provide a solid grounding for explaining the possible roles of newly defined targets (e.g. Rab34) and for determining the relationships between Gli proteins and other transcriptional regulators in coordinating precise transcriptional outcomes through local cis-acting regulatory modules in central Hh target genes.
Supplementary material
Supplementary material for this article is available at
http://dev.biologists.org/cgi/content/full/134/10/1977/DC1
| ACKNOWLEDGMENTS |
|---|
| Footnotes |
|---|
| REFERENCES |
|---|
|
|
|---|