As one of two Drosophila Hox clusters, the bithorax complex (BX-C) is responsible for determining the posterior thorax and each abdominal segment of the fly. Through the dissection of its large cis-regulatory region, biologists have obtained a wealth of knowledge that has informed our understanding of gene expression, chromatin dynamics and gene evolution. This primer attempts to distill and explain our current knowledge about this classic, complex locus.
In Drosophila, the bithorax complex (BX-C) controls the identity of each of the segments that contributes to the posterior two-thirds of the fly (Fig. 1, also see Box 1). It does this by regulating the expression of the three BX-C homeotic genes: Ultrabithorax (Ubx), abdominal A (abd-A) and Abdominal B (Abd-B). For over ninety years, scientists have been trying to unlock the complexities of the >300 kb cis-regulatory region of the bithorax locus. This work has led to many ground-breaking discoveries, some of which have helped to shape our modern definition of a gene. Yet, throughout its rich history, the BX-C literature has suffered from a reputation for complexity [for a comprehensive review on BX-C molecular genetics, see Duncan (Duncan, 1987)]. In this primer article, we hope to demystify the BX-C by showing that it can be broken down into a modular array of understandable elements. Many of these elements have now been well-characterized molecularly and fall into a few distinct categories, including initiator elements, maintenance elements/Polycomb-response elements, cell-type specific enhancers, chromatin domain boundaries and promoter targeting sequences (see glossary in Box 2). Although our knowledge of the BX-C is still far from complete, we believe a solid understanding of these basic building blocks, when combined with the wealth of genetic data on the BX-C, will lead to a reasonably clear picture of the functioning of this complex locus.
A crash course in BX-C genetics
Much of the mystery surrounding the BX-C comes from the nature of the early data on the complex. These early data were completely genetic in nature, and, therefore, only described the complex phenotypes of BX-C mutations and the interactions between these mutations (Lewis, 1954; Lewis, 1963). However, although these data are a bit abstract, they provided scientists with their first clues to the complexity of the BX-C and of the vast complexities to be discovered through the exploration of eukaryotic genomes. As such, we begin with a basic introduction to the genetics of the BX-C before proceeding to the molecular data.
In 1978, Ed Lewis published a landmark paper in which he summarized nearly 40 years of his work on the BX-C. In this paper, he reviewed a series of Drosophila mutations (called abx/bx, bxd/pbx, and iab-2 through to iab-8; see Fig. 1) that affect the identity of the posterior two-thirds of the fly: the third thoracic segment (T3) and the eight abdominal segments (A1 to A8; see Fig. 2A) (Lewis, 1978). Phenotypic analysis [which was further extended by Karch et al. (Karch et al., 1985)] defined nine classes of mutations and indicated that each mutation class defined an element that was required for the identity of a single segment (see Box 3). Remarkably enough, these classes of mutations mapped to the chromosome in an order that corresponded to the body segments in which they act. This astonishing correspondence between body axis and genomic organization was later found to be evolutionarily conserved in the homeotic clusters of most animals (for a review, see McGinnis and Krumlauf, 1992).
Although embryos deficient for the whole BX-C never hatch, they live long enough to make identifiable markers for each segment. In these embryos, segments posterior to the second thoracic segment (T2) develop as copies of T2 (see Fig. 2). Because of this, Ed Lewis proposed that T2 represents the ground state of development (i.e. the default state) and that each class of mutation represents a segment-specific function that allows a more posterior segment to differentiate away from the ground state. Furthermore, the fact that mutations affecting individual segment-specific functions always cause homeotic transformations towards the last unaffected, more-anterior segment (and not always to T2), meant that everything required for the development of the more-anterior segments had to be present in the more-posterior segments. Therefore, Lewis proposed that segment-specific functions act in an additive fashion. This idea was supported by the fact that some mutations that affected anterior segment-specific functions also caused slight changes in the more-posterior segments. For example, in flies with defective bxd/pbx function, the A1 segment develops as a copy of T3 (see Fig. 2). Thus, the normal role of bxd/pbx must be to assign segmental identity to A1. Likewise, because A1 is transformed into a copy of T3 instead of T2, the abx/bx segment-specific function, which is required for T3 specification, must also be active in the developing A1 segment (Fig. 2). Lewis summarized these findings into two rules: “...a [segment-specific function] derepressed in one segment is derepressed in all segments posterior thereto...” and “...the more posterior a segment... the greater the number of BX-C [segment-specific functions] that are in a derepressed state” (Lewis, 1978). Because of the correlation between chromosomal location and anteroposterior function, Lewis visualized this additive effect as a segmentally sequential opening of genes along the chromosome.
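Lewis's two rules lend themselves to a small worked example. The sketch below is only a toy encoding of the additive model described above, not a published algorithm: the names FUNCTIONS, SEGMENTS, active_functions and identity are hypothetical, and the rule that identity follows the most posterior intact function is our simplification of the phenotypes described in the text.

```python
# Toy encoding of Lewis's additive model (illustrative assumption, not from the source).
# Both lists run from anterior to posterior, as the mutation classes do on the chromosome.
FUNCTIONS = ["abx/bx", "bxd/pbx", "iab-2", "iab-3", "iab-4",
             "iab-5", "iab-6", "iab-7", "iab-8"]
SEGMENTS = ["T2", "T3", "A1", "A2", "A3", "A4", "A5", "A6", "A7", "A8"]


def active_functions(segment, mutated=frozenset()):
    """Return the intact segment-specific functions derepressed in a segment.

    Rule 1: a function derepressed in one segment is derepressed in all
    segments posterior to it, so segment number n carries the first n functions.
    """
    n = SEGMENTS.index(segment)  # T2 (the ground state) carries none
    return [f for f in FUNCTIONS[:n] if f not in mutated]


def identity(segment, mutated=frozenset()):
    """Predict the identity a segment adopts (simplified rule).

    The segment takes the identity specified by its most posterior intact
    function; with no intact function it falls back to the T2 ground state.
    """
    active = active_functions(segment, mutated)
    if not active:
        return "T2"
    return SEGMENTS[FUNCTIONS.index(active[-1]) + 1]


if __name__ == "__main__":
    # Loss of bxd/pbx transforms A1 towards T3, not T2.
    print(identity("A1", mutated={"bxd/pbx"}))
    # Removing every function returns all segments to the ground state.
    print({s: identity(s, mutated=set(FUNCTIONS)) for s in SEGMENTS})
```

In this toy model, a bxd/pbx mutant leaves A2 unchanged because iab-2 is still intact, whereas the real phenotype includes slight changes in more-posterior segments that the sketch does not capture.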
The BX-C goes molecular
The 300 kb of DNA that covers the BX-C was first cloned during the early 1980s (Bender et al., 1983a; Bender et al., 1983b; Karch et al., 1985). All the mutations affecting the segment-specific functions were found to be associated with rearrangement breakpoints (such as translocations, inversions, deficiencies or insertions of transposable elements). The lesions associated with a given class of mutations always clustered in a relatively small part of the BX-C, and different classes never overlapped (see Fig. 1). The observation that all the mutations in each class are associated with rearrangement breakpoints not only helped to map them onto a DNA map (over 100 mutations have been localized), but also suggested that they did not simply inactivate protein-coding regions (in which case, point mutations would also have been recovered during the numerous screens performed). In 1985, Sanchez-Herrero et al. and Tiong et al. independently presented the first true complementation analysis of the BX-C, which suggested that the whole BX-C only contains three homeotic genes: Ultrabithorax (Ubx), abdominal A (abd-A) and Abdominal B (Abd-B) (Sanchez-Herrero et al., 1985; Tiong et al., 1985). These findings were later supported when it was shown that an Ubx abd-A Abd-B triple-mutant embryo harbored the same phenotype as an embryo carrying a complete deletion of the BX-C (Casanova et al., 1987). Publication of the complete sequence of the complex provided the final confirmation; the BX-C contained only three homeotic genes (Martin et al., 1995).
Box 1. Hox clusters in flies and vertebrates
Although the homeotic genes of the BX-C control the identity of the segments of the posterior thorax and the abdominal segments, the segments forming the head and the anterior thorax are determined by the Antennapedia complex (ANT-C) (Kaufman et al., 1990). The ANT-C is named after the spectacular, dominant gain-of-function phenotype of its founding member, Antennapedia (Antp), a mutant in which an extra pair of legs develops on the head at the expense of a normal pair of antennae (Le Calvez, 1948). The ANT-C is the second of the fly's two homeotic clusters. It was when work on the Antp gene collided with the work on the BX-C that a whole new field of developmental and evolutionary biology emerged.
In 1951, in order to explain a peculiar phenomenon called pseudoallelism, Ed Lewis hypothesized that the homeotic genes arose during the course of evolution through tandem duplication events and subsequent divergence of function (Lewis, 1951). It wasn't until the Antp and Ubx genes were cloned and sequenced (Garber et al., 1983; Scott et al., 1983; Bender et al., 1983b) that direct evidence for this hypothesis was provided with the discovery of a highly conserved sequence shared between the Antp and Ubx genes, now known as the homeobox (McGinnis et al., 1984; Scott and Weiner, 1984) (see Box 3). Soon after this discovery, it became clear that all of the BX-C and ANT-C homeotic genes contained this conserved motif.
It turns out that Antp is only the most distal member of the series of homeobox genes that make up the ANT-C, which contains, in order, the labial (lab), proboscipedia (pb), Deformed (Dfd), Sex combs reduced (Scr) and Antp genes (Kaufman et al., 1990). As in the BX-C, these homeotic genes show colinearity in their expression patterns (see Box 3), with the exception of pb. The identification of the homeobox in Drosophila enabled the identification of similar clusters of homeobox-containing genes in other organisms, ranging from mollusks to vertebrates, which generally contain homologues of each of the homeobox genes in the ANT-C and BX-C. However, these genes are generally arranged in single, continuous clusters. Surprisingly, the correlation between the position of the gene on the chromosome and the relative position at which the gene is expressed along the body axis is conserved in most species tested (for a review, see McGinnis and Krumlauf, 1992). It is ironic that although the Hox saga began in Drosophila, Drosophila really represents an exception to the clustering of Hox genes due to its split Hox cluster. It is speculated that the 'unusual' mechanism of segmentation in Drosophila and/or its life cycle peculiarities have allowed for this splitting of a single ancestral Hox cluster (see Duboule, 1992; Von Allmen et al., 1996; Lewis et al., 2003).
But, there was then a contradiction: on the one hand, early genetic analysis revealed the existence of nine classes of mutations that affect segment-specific functions, while, on the other hand, other genetic and molecular analysis indicated that the BX-C only encodes three proteins. The description of the expression patterns of Ubx, abd-A and Abd-B answered this apparent paradox (Beachy et al., 1985; Celniker et al., 1990; Karch et al., 1990; Macias et al., 1990; White and Wilcox, 1985). Fig. 3B shows the central nerve cord of a wild-type embryo stained with an antibody directed against Abd-B. Like Ubx and abd-A, although in different parasegments, Abd-B is expressed in an intricate pattern that is finely tuned from one parasegment to the next. By staining various mutant embryos, it was finally understood that the segment-specific functions corresponded to cis-regulatory regions that regulate the expression of Ubx, abd-A or Abd-B in a segment-specific fashion. A mutation in any of the segment-specific regulatory regions alters the expression of its relevant target gene. For example, flies homozygous for the iab-7Sz mutation have their seventh abdominal segment transformed into a copy of the sixth. Consistent with this, in embryos, the Abd-B expression pattern characteristic for parasegment 12 (PS12, which corresponds to A7, see Box 3) is replaced by the pattern normally present in PS11/A6 (Galloni et al., 1993). The strong correlation between the level of homeotic gene expression and segmental identity also suggested that the level of homeotic gene expression was crucial for determining segmental identity, thus providing a mechanism by which three genes pattern nine segments (Castelli-Gair and Akam, 1995).
Fig. 1 schematically details how the cis-regulatory regions of the BX-C are arranged. The red and orange regions show the regulatory regions that interact with Ubx. They include the abx/bx and bxd/pbx regions that regulate Ubx expression in PS5 and PS6, respectively (Beachy et al., 1985; Little et al., 1990; White and Wilcox, 1985). Similarly, iab-2, iab-3 and iab-4 specify the appropriate abd-A expression patterns in PS7, PS8 and PS9, respectively (Karch et al., 1990; Macias et al., 1990; Sanchez-Herrero, 1991). Shown in shades of green are the segment-specific functions that regulate the Abd-B transcription unit. The regulation of Abd-B expression is more complex than that of the other two BX-C Hox genes; however, for the purposes of this review, we will focus only on the short Abd-B transcript (A in Fig. 1; also referred to as Abd-Bm), which is required for the identities of PS10-PS13 (Casanova et al., 1986; Sanchez-Herrero and Crosby, 1988; Kuziora and McGinnis, 1988; Celniker et al., 1989; Zavortink and Sakonju, 1989; Delorenzi and Bienz, 1990). The iab-5, iab-6, iab-7 and iab-8 regions regulate this short transcript in PS10 to PS13, respectively (Boulet et al., 1991; Celniker et al., 1990; Estrada et al., 2002; Sanchez-Herrero, 1991).
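The assignments listed above (nine regulatory regions, three target genes, one parasegment each) can be summarized as a lookup table. The following is a minimal sketch of such a table; the identifiers REGULATORY_REGIONS, regions_for_gene and region_for_parasegment are hypothetical names chosen for illustration, and only the short Abd-B transcript is represented.

```python
# Lookup table built from the region/gene/parasegment assignments in the text.
# Keys are cis-regulatory regions; values are (target gene, parasegment).
REGULATORY_REGIONS = {
    "abx/bx":  ("Ubx", 5),
    "bxd/pbx": ("Ubx", 6),
    "iab-2":   ("abd-A", 7),
    "iab-3":   ("abd-A", 8),
    "iab-4":   ("abd-A", 9),
    "iab-5":   ("Abd-B", 10),
    "iab-6":   ("Abd-B", 11),
    "iab-7":   ("Abd-B", 12),
    "iab-8":   ("Abd-B", 13),
}


def regions_for_gene(gene):
    """List the regulatory regions acting on one homeotic gene, anterior first."""
    hits = [(ps, name) for name, (g, ps) in REGULATORY_REGIONS.items() if g == gene]
    return [name for ps, name in sorted(hits)]


def region_for_parasegment(ps):
    """Return (region, target gene) for a parasegment, or None outside PS5-PS13."""
    for name, (gene, region_ps) in REGULATORY_REGIONS.items():
        if region_ps == ps:
            return name, gene
    return None


if __name__ == "__main__":
    print(regions_for_gene("abd-A"))
    print(region_for_parasegment(12))
```

Laid out this way, the resolution of the apparent paradox is easy to see: the nine mutation classes are the nine keys, and the three proteins are the three distinct target genes.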
Initiation and maintenance phases in BX-C regulation
The overall determination of the anteroposterior (AP) axis during the initial stages of embryogenesis in Drosophila is under the control of three classes of transcription factors that are deployed in a cascade. These transcription factors, which are encoded by the maternal, gap and pair-rule genes, subdivide the embryo into 14 parasegments (see Box 3) (for reviews, see Ingham, 1988; Hoch and Jackle, 1993; Kornberg and Tabata, 1993; DiNardo et al., 1994). It is now known that these proteins interact with elements in each of the cis-regulatory regions of the BX-C genes to determine their ultimate expression patterns (White and Lehmann, 1986; Irish et al., 1989; Shimell et al., 1994; Casares and Sanchez Herrero, 1995). For example, the combination of the gap and pair-rule gene products that are present in PS12 allows the iab-7 cis-regulatory region, but not the iab-8 cis-regulatory region, to control Abd-B expression in PS12/A7. However, because the gap and pair-rule genes are only transiently expressed in the early embryo, and the activity states of the segment-specific cis-regulatory regions are fixed for the life of the fly, a system to maintain homeotic gene expression is also required in each cis-regulatory domain (Struhl and Akam, 1985). This maintenance system has been shown to require the products of the Polycomb-Group (Pc-G) and trithorax-Group (trx-G) genes (see Box 4). Although the Pc-G products function as negative regulators, maintaining the inactive state of the cis-regulatory regions not in use, the trx-G products function as positive regulators, maintaining the active state of active regulatory regions (Paro, 1990; Kennison, 1993; Simon, 1995; Pirrotta, 1997). Both the Pc-G and trx-G products are thought to maintain the active or inactive state of each parasegment-specific cis-regulatory region by modifying the chromatin structure of each region.
Because of this distinction between the initiation of expression and the maintenance of expression, the elements to which the gap and pair-rule proteins bind have been termed initiators, and the elements to which the Pc-G and trx-G proteins bind have been termed maintenance elements [ME; also known as Pc-G or trx-G response elements (PREs/TREs)].
Box 2. Glossary of specialized terms
Boundary element: A DNA element that separates adjacent chromatin/DNA domains.
Colinearity: The relationship between the position of a Hox gene along a chromosome and the pattern of its expression along the anteroposterior axis.
Gap gene: A class of Drosophila genes that, when mutated, cause embryos to develop with groups of consecutive segments missing.
Gene conversion: The transfer of DNA sequences between two homologous sequences; can be a mechanism for mutation if the transfer of material contains one or more mutations.
Homeobox: A 180-base-pair sequence that is highly conserved among genes encoding Hox proteins. It enables a protein to bind to DNA in a sequence-specific fashion.
Homeotic: Adjective of the term homeosis, which was introduced in 1894 by Bateson (Bateson, 1894) to describe phenotypic variation in which “something is changed into the likeness of something else”.
Initiator element: A DNA fragment that initiates a specific expression pattern of a linked gene.
Maintenance element: A DNA fragment that can maintain the expression pattern of a linked gene established during an earlier stage of embryogenesis (by an initiator).
Pair-rule gene: A class of Drosophila genes that, when mutated, result in the development of embryos with every second parasegment missing.
Initiators, maintenance elements and segment-specific enhancers
Confirmation of the segment-specific and biphasic nature of BX-C gene regulation came from studies using reporter gene constructs. In these experiments, DNA fragments from the various regulatory regions were cloned upstream of a lacZ reporter gene (Lis et al., 1983). By making transgenic flies carrying these reporter constructs and studying their resulting patterns of expression, scientists have been able to identify specific DNA fragments that are required for initiating the segment-specific expression of BX-C genes, maintaining the restricted pattern of their expression, and for producing segment-independent, cell-type specific expression (see Fig. 4 for examples of how initiator and maintenance elements control gene expression patterns during development).
BX-C initiator elements can be defined as being specific types of enhancers that confer a parasegmentally restricted pattern of expression to a reporter gene during early embryogenesis (Simon et al., 1990; Qian et al., 1991; Muller and Bienz, 1992; Busturia and Bienz, 1993; Barges et al., 2000; Zhou et al., 1999; Shimell et al., 2000). For example, Fig. 4A shows the expression pattern of a lacZ reporter gene in an early Drosophila embryo when it is driven by an initiator element derived from the iab-6 regulatory region. As the iab-6 region is responsible for Abd-B expression in PS11, we see that this element can faithfully drive lacZ expression from PS11. Based on these types of assays, we know that initiator elements are able to read an early AP positional address and to transmit this information to a promoter. However, this ability is transient. At later stages of embryogenesis, the strict anterior border of expression derived from this construct is lost and lacZ becomes expressed in all of the parasegments along the AP axis (Fig. 4B). This degeneration of the initial pattern is probably due to the loss of positional information that is provided in the early embryo by the gap and pair-rule gene products. In support of this idea, a few initiator elements have been mapped precisely enough to show a direct correlation with the binding sites for gap and pair-rule gene products (Qian et al., 1991; Zhang et al., 1991; Shimell et al., 1994; Zhou et al., 1999).
In most cases, the anterior border of expression of a reporter gene that is controlled by an initiator element is lost when the products of the gap and pair-rule genes decay (at the end of the initiation phase). However, a few larger fragments are able to maintain the initial anterior border of expression of a lacZ reporter. For example, the construct shown in Fig. 4C,D contains an initiator element from iab-5 that can initiate and maintain the appropriate PS10-specific anterior border of expression (Fig. 4C,D). The ability to maintain the initial expression pattern of a reporter gene has been mapped to a fragment that is distinct from the initiator, called a maintenance element (ME) (Brock and van Lohuizen, 2001). Because the maintenance of the initial expression pattern is lost in Pc-G mutant backgrounds, MEs are often referred to as Polycomb-Response-Elements (PREs). When associated with an initiator element, a maintenance element maintains the anterior limit of expression of a reporter gene throughout late embryogenesis and larval life (Muller and Bienz, 1991; Simon et al., 1993; Chan et al., 1994; Fritsch et al., 1999; Busturia et al., 2001). These maintenance elements do not have an intrinsic segmental address and can maintain different segmental expression patterns when combined with different initiator elements (Chiang et al., 1995).
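The behavior of these reporter constructs can be summarized as a simple decision rule. The sketch below is a deliberately simplified model under stated assumptions: expression is treated as uniform from the anterior border back to PS14, the two stages are labelled "early" and "late", and the function name reporter_expression and its parameters are hypothetical.

```python
# Simplified model of lacZ reporter behavior (illustrative assumptions only).
PARASEGMENTS = range(1, 15)  # PS1 to PS14


def reporter_expression(anterior_border, stage, has_me=False, pcg_mutant=False):
    """Return the set of parasegments in which the reporter is expressed.

    anterior_border: parasegment read out by the initiator (e.g. 11 for iab-6).
    stage: "early" (gap and pair-rule products present) or "late".
    has_me: whether the construct also carries a maintenance element.
    pcg_mutant: whether the embryo lacks Pc-G function.
    """
    if stage not in ("early", "late"):
        raise ValueError("stage must be 'early' or 'late'")
    restricted = {ps for ps in PARASEGMENTS if ps >= anterior_border}
    if stage == "early":
        # The initiator alone sets the anterior border.
        return restricted
    # Late: the border survives only with a maintenance element and Pc-G function.
    if has_me and not pcg_mutant:
        return restricted
    return set(PARASEGMENTS)


if __name__ == "__main__":
    print(sorted(reporter_expression(11, "early")))
    print(sorted(reporter_expression(11, "late")))
    print(sorted(reporter_expression(10, "late", has_me=True)))
```

The model makes explicit that the maintenance element contributes no positional information of its own: the border it preserves is whatever the linked initiator established.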
Cell-type or tissue-specific enhancers are a third type of regulatory element that has been identified within the segment-specific, cis-regulatory regions of the BX-C (Simon et al., 1990; Busturia and Bienz, 1993; Pirrotta et al., 1995). In most cases, these elements confer a cell/tissue-specific expression pattern to a reporter gene that is reiterated in all of the parasegments along the AP axis of the embryo. It must be noted, however, that within the BX-C, these enhancers confer a cell/tissue-specific pattern of homeotic gene expression that is restricted parasegmentally. This apparent discrepancy between the expression pattern of homeotic genes and that of transgenic reporter genes when under the control of these enhancers can easily be explained if the enhancers are coordinately regulated by the initiator and maintenance elements (see below).
Cis-regulatory regions are organized into parasegment-specific chromosomal domains
How can the various enhancers in a cis-regulatory region be coordinately regulated? Two complementary observations (coming from enhancer trap studies and from boundary mutations) have provided compelling evidence that the cis-regulatory regions of the BX-C are organized into parasegment-specific chromosomal domains.
In Drosophila, transgenic animals are generally made using P-element transposons. These transposons insert throughout the genome in a fairly random fashion. If these P-elements contain a basal promoter and a reporter gene, they often respond to nearby enhancer elements. The technique of using P-elements with reporter genes to get a read-out of the enhancers in the vicinity of a P-element insertion is called enhancer trapping (O'Kane and Gehring, 1987). Fig. 5 shows the insertion sites of several enhancer trap transposons that have landed within the BX-C (Galloni et al., 1993; McCall et al., 1994; Bender and Hudson, 2000). The colored line in this figure represents the genomic DNA of the BX-C using the same color coding as that shown in Fig. 1 (see legend for details). If we focus on the three transposons inserted within the ∼75 kb region marked in orange (between map positions 315,379 and 242,806), we find that all three transposons have similar expression patterns. The anterior border of expression of these enhancer traps is PS5. The abx/bx cis-regulatory region that regulates Ubx expression in PS5 lies within this region. Although the promoters of these three P-elements are obviously trapping different enhancer activities in this 75 kb region of DNA, they are all transcribed in PS5 and in the parasegments posterior to PS5, regardless of where exactly they have inserted. Meanwhile, the anterior parasegmental boundary of expression of the three enhancer traps inserted within the region 232,727 to 192,677 is shifted one parasegment posterior, to PS6 (marked in red on Fig. 5). This domain corresponds to the region that contains the bxd/pbx cis-regulatory region that drives Ubx expression in PS6. Once again, although the intensity of expression varies between these three enhancer traps, the anterior border of each one's expression begins at PS6.
By examining the large number of enhancer trap lines isolated in the BX-C (Bender and Hudson, 2000) (some of which are shown in Fig. 5), two striking observations could be made. First, enhancer trap transposons that are spread out over considerable distances often produce the same expression pattern, whereas others located just a few kilobases away produce a different pattern. Second, the anterior border of lacZ expression always progresses towards the posterior by increments of one parasegment. Based on these observations and others, it was proposed that the BX-C enhancers reside in chromosomal domains that are coordinately regulated (Peifer et al., 1987). For example, all elements residing in the ∼75 kb region between map positions 315,379 and 242,806 (Fig. 5) are turned on and off together. This is why enhancer trap lines inserted in this region display similar patterns of expression. Meanwhile, enhancer traps lying very close to this region, but outside of it, display different patterns of expression (for example, compare the transposons at position ∼232,727 and ∼242,806, or the transposons at positions 127,367 and ∼125,489 in Fig. 5). In this model, enhancer trap transposons behave simply as sensors to the state of a domain, the extent of which can be mapped by comparing the various enhancer trap lines.
One prediction made by the domain hypothesis is the existence of boundary elements, which would act to limit the extent of each domain. In Figs 1 and 5, the boundaries are symbolized by the sharp color transition between the adjacent domains symbolized by the colored rectangles. A boundary is postulated to exist between each of the regulatory domains. Thus far, three boundaries, Mcp, Fab-7 and Fab-8, have been identified through mutational analysis (Gyurkovics et al., 1990; Karch et al., 1994; Mihaly et al., 1997; Mihaly et al., 1998; Barges et al., 2000). The best characterized of them is Fab-7, which separates the iab-6 cis-regulatory domain from the iab-7 cis-regulatory domain. When Fab-7 is deleted, iab-6 and iab-7 fuse into a new functional unit. This fusion disrupts Abd-B regulation in PS11, where normally only iab-6 is active. Usually this results in the inappropriate activation of iab-7 enhancers in PS11, which are turned on by the initiator element in the iab-6 domain. As a consequence, Abd-B expression is regulated in a PS12-like pattern, transforming cell identity from PS11 to PS12 (see Fig. 3C).
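The consequence of removing a boundary can be worked through with a toy model of the Abd-B regulatory domains. The sketch below rests on explicit assumptions: each domain is reduced to the parasegment in which its initiator first turns it on, a deleted boundary is written as a pair of fused domain names, and only the usual outcome described above (activation spreading across the fused unit) is modelled. The names ABD_B_DOMAINS, active_domains and abd_b_pattern are hypothetical.

```python
# Toy model of boundary deletion in the Abd-B regulatory region (assumptions only).
# Each domain is represented by the parasegment where its initiator activates it.
ABD_B_DOMAINS = {"iab-5": 10, "iab-6": 11, "iab-7": 12, "iab-8": 13}


def active_domains(ps, deleted_boundaries=frozenset()):
    """Return the domains active in a parasegment.

    deleted_boundaries holds pairs of adjacent domains whose separating
    boundary has been removed, e.g. ("iab-6", "iab-7") for a Fab-7 deletion.
    """
    active = {name for name, start in ABD_B_DOMAINS.items() if ps >= start}
    changed = True
    while changed:
        changed = False
        for a, b in deleted_boundaries:
            # A fused unit is switched on as a whole once either half is active.
            if (a in active) != (b in active):
                active |= {a, b}
                changed = True
    return active


def abd_b_pattern(ps, deleted_boundaries=frozenset()):
    """Return the parasegment whose Abd-B pattern is adopted, or None if no domain is active."""
    active = active_domains(ps, deleted_boundaries)
    if not active:
        return None
    return max(ABD_B_DOMAINS[name] for name in active)


if __name__ == "__main__":
    fab7_deletion = {("iab-6", "iab-7")}
    print(abd_b_pattern(11))                 # wild type
    print(abd_b_pattern(11, fab7_deletion))  # Fab-7 deleted
```

In this model, a Fab-7 deletion leaves PS10 untouched, because neither half of the fused iab-6/iab-7 unit receives an activating signal there.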
Box 3. Segments versus parasegments
During early embryogenesis, a Drosophila embryo is rapidly metamerized into 14 parasegments by the products of the maternal, gap and pair-rule genes. In adult animals, these 14 parasegments will form the three head, the three thoracic and the eight abdominal segments. However, although there are similar numbers of segments and parasegments, they are, for the most part, slightly shifted relative to one another. In the thorax and the abdomen, this shift is approximately half a segment, meaning that a parasegment comprises the posterior half of one segment and the anterior half of the next. For example, PS6 comprises the posterior of segment T3 and the anterior of segment A1. By chance, this shift is less visible in the adult animal because the visible portion of the adult abdominal segments corresponds primarily to the anterior portion of the segment. Ed Lewis described all of the phenotypes he originally studied according to the adult segments affected (Lewis, 1978). However, in 1985, Martinez-Arias and Lawrence observed that BX-C regulation occurs not through segments but instead through parasegments (PS) (Martinez-Arias and Lawrence, 1985). This observation has been confirmed by the expression patterns of the BX-C homeotic genes in embryos and larvae. The accompanying figure shows the correlation between segments and parasegments in the Drosophila embryo (anterior is to the left), and the expression pattern of each BX-C homeotic gene (darker shades of color indicate higher expression levels). Today, most reports on the regulation of the BX-C use both terminologies depending on whether an embryonic or adult phenotype is being described [the drawing of the embryo is reproduced, with permission, from Hartenstein (Hartenstein, 1993)].
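The half-segment shift can be written as a short conversion between the two terminologies. The sketch below extrapolates the PS6 example to the parasegments regulated by the BX-C (PS5 to PS13); this index arithmetic is our assumption, and the names SEGMENTS, parasegment_composition and adult_segment are hypothetical.

```python
# Conversion between parasegments and segments for PS5-PS13 (illustrative sketch).
SEGMENTS = ["T1", "T2", "T3"] + [f"A{i}" for i in range(1, 9)]


def parasegment_composition(ps):
    """Return (segment contributing its posterior half, segment contributing its anterior half)."""
    if not 5 <= ps <= 13:
        raise ValueError("this sketch only covers PS5 to PS13")
    anterior_half_of = SEGMENTS[ps - 3]
    posterior_half_of = SEGMENTS[ps - 4]
    return posterior_half_of, anterior_half_of


def adult_segment(ps):
    """Return the adult segment usually quoted for a parasegment.

    The visible part of an adult abdominal segment is mostly its anterior
    portion, so a parasegment is named after the segment whose anterior
    half it contains.
    """
    return parasegment_composition(ps)[1]


if __name__ == "__main__":
    print(parasegment_composition(6))  # posterior T3 plus anterior A1
    print(adult_segment(12))           # PS12 is usually equated with A7
```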
Because Mcp, Fab-7 and Fab-8 have properties that are reminiscent of other chromatin domain boundaries (such as scs/scs' or gypsy), and because other boundaries seem to behave as chromatin insulators, the identified BX-C boundaries have all been tested for enhancer-blocking activity. In this assay, the boundaries are tested in transgenic animals for their ability to prevent an interaction between an enhancer and a reporter gene promoter when the boundary is placed in between them. If the boundary DNA fragment is able to suppress the reporter gene when placed in between the enhancer and the promoter (but not when placed elsewhere), the fragment is considered to act as an insulator (Kellum and Schedl, 1991). The three known BX-C boundaries are each able to act as insulators when tested in this assay (Hagstrom et al., 1996; Zhou et al., 1996; Zhou et al., 1999; Barges et al., 2000; Gruzdeva et al., 2005). However, this finding leads to another paradox. Many of the BX-C boundaries are positioned in between BX-C enhancers and their target promoter. How then can these enhancers ever reach their target promoter, sometimes over many intervening insulators? The answer to this paradox is still a mystery. However, two sets of experiments have suggested possible mechanisms to achieve this.
Box 4. Polycomb- and trithorax-Group genes
The Polycomb-Group (Pc-G) genes (a group of ∼40 genes) (Jürgens, 1985; Duncan, 1982) keep the homeotic genes of the BX-C repressed in those segments where they have not been activated during early embryogenesis. The Pc-G gene products form large complexes that are thought to package chromatin into a compact, transcriptionally inactive conformation (McCall and Bender, 1996; Boivin and Dura, 1998; Fitzgerald and Bender, 2001). These genes have been named after their founding member, Polycomb (Pc), which was discovered by Ed Lewis as a negative regulator of BX-C activity (Lewis, 1978). Most Pc-G genes have been identified through the mild homeotic transformations that appear when a single Pc-G gene copy is mutated (a haplo-insufficient phenotype). The homeotic transformation that most often occurs in Pc-G mutants is the appearance of extra pairs of sex combs on the legs of the second and third thoracic segments in males, a feature normally found only on the legs of the first thoracic segment. It is for this reason that Pc-G genes often have names like Polycomb, Extra sex combs, Sex combs on midleg and Additional sex combs. So far, two functionally distinct classes of Pc-G protein repressor complexes, PRC1 and PRC2, have been identified. The core PRC1 complex (PCC) contains Polycomb, Polyhomeotic, Posterior sex combs and dRING1 (Sex combs extra – FlyBase) (Francis et al., 2001). The PRC2 complex contains the Enhancer of zeste [E(z)], the Suppressor of zeste 12 [Su(z)12] and the Nurf55/CAF1 proteins. In agreement with a role for Pc-G proteins in mediating chromatin conformational changes, the PRC2 complex has a histone H3 Lys 27 (H3-K27) methyltransferase activity that is mediated by E(z). H3-K27 methylation is an epigenetic chromatin modification that is associated with transcriptionally inactive chromatin (Czermin et al., 2002; Muller et al., 2002; Ng et al., 2000; Tie et al., 2001).
Meanwhile, the trithorax-Group (trx-G) genes appear to act counter to the Pc-G genes by maintaining the homeotic genes and their large cis-regulatory regions in a transcriptionally permissive state. Many trx-G genes have been identified through genetic screens for mutations that can suppress the dominant phenotype of Pc-G genes (Kennison and Tamkun, 1988; Shearn, 1989). Thus far, four complexes containing trx-G proteins have been identified, which all have chromatin modification activities, although their modes of action are quite varied (for a review, see Simon and Tamkun, 2002).
In 1999, Zhou and Levine asked this very question and looked for specific DNA fragments that could aid distal enhancers to bypass intervening boundaries (Zhou and Levine, 1999). The result of these experiments was the identification of an element that they called the promoter-targeting sequence (PTS). This element, normally located in the iab-7 domain just adjacent to the Fab-8 boundary, allows distal enhancers to bypass the Fab-8 boundary in transgenic assays. Later, it was shown that this PTS element can enable an enhancer to bypass even the gypsy insulator, suggesting that PTS function is independent of the insulator itself (Zhou and Levine, 1999). Recently, a new PTS element has been found in the iab-6 domain (Chen et al., 2005). On the basis of these results, it now seems likely that each boundary element may be flanked by a PTS element to aid in insulator bypass.
The second set of data that provides some hints as to how boundary elements are bypassed comes from experiments performed with the gypsy insulator. When inserted between an enhancer and a promoter, the gypsy insulator is able to prevent enhancer-promoter interactions (Geyer and Corces, 1992). However, it was found that if two gypsy insulators were placed between the same enhancer and promoter, the enhancer was able to bypass the intervening insulators. A model was proposed in which insulators would pair with one another to allow bypass (Cai and Shen, 2001; Muravyova et al., 2001). In the BX-C, where there are often many boundary elements between an enhancer and a target promoter, this is a very attractive model. Perhaps boundary elements interact with one another and allow the appropriate enhancers to reach their target promoters. This model is still untested, although it has been shown that the Mcp element can indeed pair with the gypsy insulator and lead to enhancer bypass (Gruzdeva et al., 2005).
Although these two models are quite attractive, there still remains some doubt regarding their validity. This is largely due to experiments in which the Fab-7 boundary was replaced by the gypsy or scs insulators within the BX-C, by a process called gene conversion. When these experiments were performed, it was found that both the gypsy and scs fragments acted as insulators within the BX-C, blocking all distal enhancers from interacting with the Abd-B promoter (Hogga et al., 2001). This happened even though the PTS elements were left intact in all of these experiments. Therefore, although the BX-C boundaries may work as insulators in a transgenic context, their functioning may be more complicated in their endogenous context.
Given our current knowledge of the elements that make up the BX-C and of basic BX-C genetics, we can propose a general model for gene regulation within the BX-C. First, we believe that each regulatory region in the BX-C is a chromosomal domain, made up of a modular array of all of the elements necessary for the expression of a particular Hox gene in all segments posterior to a particular parasegment. Of primary importance to the functioning of each domain is the initiator element, which reads the positional address that is spelled out by the gap and pair-rule gene products, and then signals to either activate or silence the domain. Second, MEs, responding to the state of the initiator element, imprint this decision on the rest of the domain by changing the domain's chromatin structure. Because transgenic reporter constructs that contain initiators but lack MEs become derepressed in the parasegments where the domain should be inactive (Simon et al., 1993; Chan et al., 1994; Fritsch et al., 1999), we envision that most of this imprinting comes in the form of Pc-G-mediated silencing. If the domain is active, then the various enhancers in the domain can act on their appropriate target promoter in that parasegment (and those posterior to it). If the domain is silenced, then all enhancers in the domain are prevented from interacting with the promoter. Finally, throughout this process, boundaries keep each domain separate and autonomous.
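The logic of this model can be written out as a small toy sketch. It is only an illustration under stated assumptions, not a quantitative description of the locus: the class `Domain`, the function names and the `me_intact` flag are hypothetical, the domain-to-parasegment assignments follow the conventional simplified scheme, and boundaries are represented only implicitly, by evaluating each domain independently of its neighbours.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Domain:
    """One cis-regulatory domain of the BX-C (toy representation)."""
    name: str
    first_ps: int  # most anterior parasegment in which the initiator activates the domain
    target: str    # Hox gene whose promoter the domain's enhancers act on


# Assumed, simplified assignment of domains to parasegments (PS) and target genes.
DOMAINS = [
    Domain("abx/bx", 5, "Ubx"),
    Domain("bxd/pbx", 6, "Ubx"),
    Domain("iab-2", 7, "abd-A"),
    Domain("iab-3", 8, "abd-A"),
    Domain("iab-4", 9, "abd-A"),
    Domain("iab-5", 10, "Abd-B"),
    Domain("iab-6", 11, "Abd-B"),
    Domain("iab-7", 12, "Abd-B"),
    Domain("iab-8", 13, "Abd-B"),
]


def initiator_activates(domain: Domain, parasegment: int) -> bool:
    """Early decision: the initiator reads the positional address of the parasegment."""
    return parasegment >= domain.first_ps


def domain_is_open(domain: Domain, parasegment: int, me_intact: bool = True) -> bool:
    """Maintained state: an inactive domain stays silenced only if its ME is intact."""
    if initiator_activates(domain, parasegment):
        return True
    # Without a maintenance element, Pc-G silencing is lost and the domain is derepressed.
    return not me_intact


def open_domains(parasegment: int, me_intact: bool = True) -> list[str]:
    """Names of the domains whose enhancers can reach their promoter in this parasegment."""
    return [d.name for d in DOMAINS if domain_is_open(d, parasegment, me_intact)]


def genes_with_enhancer_input(parasegment: int) -> list[str]:
    """Target genes receiving input from at least one open domain, in chromosomal order."""
    targets = (d.target for d in DOMAINS if domain_is_open(d, parasegment))
    return list(dict.fromkeys(targets))
```

In this sketch, `open_domains(7)` lists the three domains from abx/bx to iab-2, whereas calling it with `me_intact=False` opens every domain, mimicking the derepression seen in reporter constructs that lack MEs.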
Throughout the 90 years of BX-C research, the BX-C has constantly provided the scientific world with exciting new findings. From the striking and conserved correlation between genomic position and segmental function (co-linearity), to the first evidence supporting evolution through gene duplication, studies of the BX-C have always led to the breaking of new ground. And through all this, we have learned much about how this complex locus is controlled. Yet, even now, we have only a rough sketch of the BX-C. Our current understanding leaves many areas about which we know very little or nothing. For example, how do initiator elements communicate with MEs and instruct them as to the fate of the domain, and how do the various enhancers find their appropriate target promoters, often over great distances? The answers to many of these questions will influence the work of scientists across many different fields of biology. Given the wealth of knowledge about the BX-C, we believe that many of these questions will soon be addressed by studies of this complex. Determining the nature of the signal between the initiator and the ME, for example, is currently the topic of active research, with the leading model for this signal being the transcription of the ME itself (Lipshitz et al., 1987; Sanchez-Herrero and Akam, 1989; Cumberledge et al., 1990; Bender and Fitzgerald, 2002; Drewell et al., 2002; Hogga and Karch, 2002; Rank et al., 2002; Schmitt et al., 2005; Sanchez-Elsner et al., 2006). The surprising possibility that transcription through a cis-regulatory region might affect the regulation of a distal coding region is just one hint at the complexities that await us. We remain confident that, in the years to come, the BX-C will continue to provide us with new insights that will change how we think about genes and how they are transcriptionally regulated.
We are indebted to Drs Dan Fitzgerald and Welcome Bender for sharing results and providing unpublished data for Fig. 5. We also thank Dr Henrik Gyurkovics for stimulating discussions. Finally, we acknowledge Dr Annick Mutero and Maria Gambetta for critical reading of the manuscript. R.M. is supported by the Swiss National Foundation and F.K. by the State of Geneva.
- © 2006.