Non-coding RNAs (ncRNAs) that regulate gene expression in cis or in trans are a shared feature of prokaryotic and eukaryotic genomes. In mammals, cis-acting functions are associated with macro ncRNAs, which can be several hundred thousand nucleotides long. Imprinted ncRNAs are well-studied macro ncRNAs that have cis-regulatory effects on multiple flanking genes. Recent advances indicate that they employ different downstream mechanisms to regulate gene expression in embryonic and placental tissues. A better understanding of these downstream mechanisms will help to improve our general understanding of the function of ncRNAs throughout the genome.
In recent years, tiling-array analyses (see Glossary, Box 1) and genome-wide cDNA sequencing have shown not only that most of the mammalian genome is transcribed, but also that the majority of the mammalian transcriptome consists of non-coding (nc) RNAs (Carninci et al., 2005; Engstrom et al., 2006; Kapranov et al., 2002; Katayama et al., 2005; Okazaki et al., 2002). Because a full classification system for ncRNAs is not yet available, they are generally described according to their mature length, location and orientation with respect to the nearest protein-coding gene. For example, a new group called large intervening non-coding (linc) RNAs comprises ncRNAs that lie outside annotated genes in the mouse genome (Guttman et al., 2009). When their function is known, ncRNAs can also be classified by whether they act in cis or trans (see Glossary, Box 1). Trans-acting functions are associated with short ncRNAs, such as short interfering (si) RNAs (21 nt), micro (mi) RNAs (∼22 nt), piwi-interacting RNAs (26-31 nt) and small nucleolar (sno) RNAs (60-300 nt). By contrast, cis-acting functions have so far only been associated with macro ncRNAs (see Glossary, Box 1), which can be up to several hundred thousand nucleotides long. Interestingly, whereas the number of protein-coding genes is no indication of an organism's morphological complexity, macro ncRNA number increases with complexity, indicating a potential functional role in gene regulation (Amaral and Mattick, 2008). In support of this hypothesis, many ncRNAs show distinct cell-type-specific and developmental-stage-specific expression profiles (Dinger et al., 2008; Mercer et al., 2008). To date, however, only a few macro ncRNAs have been analysed in detail and shown to have functional gene-regulatory roles (Yazgan and Krebs, 2007; Prasanth and Spector, 2007).
The best-known functional mammalian macro ncRNAs are the inactive X-specific transcript (Xist) and X (inactive)-specific transcript, antisense (Tsix), which are overlapping transcripts required for X chromosome inactivation in female mammals - an epigenetic dosage-compensation mechanism (see Glossary, Box 1) that equalises X-linked gene expression between the sexes. Xist is expressed from, and localises to, the inactive X chromosome and, by an unknown mechanism, targets repressive chromatin modifications and gene silencing to this chromosome. Tsix overlaps with the entire Xist gene in an antisense orientation and silences Xist on the active X chromosome (reviewed by Wutz and Gribnau, 2007). The next-best-studied mammalian macro ncRNAs are those involved in genomic imprinting (see Glossary, Box 1; Box 2). To date, 90 genes show imprinted expression in the mouse (http://www.har.mrc.ac.uk/research/genomic_imprinting), and their imprinted status is mostly conserved in humans (www.otago.ac.nz/IGC). Imprinted genes mostly occur in clusters that contain 2-12 genes, and in most of these clusters at least one gene is a macro ncRNA. So far, two of the three tested imprinted macro ncRNAs have been shown to be required for the imprinted expression of the whole cluster (Barlow and Bartolomei, 2007). Thus, imprinted macro ncRNAs are able to regulate small clusters of autosomal genes in cis and offer an excellent model system not only to investigate how ncRNAs regulate genes epigenetically, but also to investigate the general biology of ncRNA transcripts.
In this review, we focus on six well-studied mouse imprinted clusters and their associated macro ncRNAs (Fig. 1) and review three main areas: first, how imprinted macro ncRNAs are themselves epigenetically regulated by DNA methylation imprints, and their role in inducing imprinted expression and epigenetic modifications in imprinted clusters; second, what is currently known about the organisation and the transcriptional biology of imprinted macro ncRNAs; and third, why developmental and tissue-specific variation in imprinted expression indicates that multiple mechanisms might operate downstream of imprinted ncRNAs.
Imprinted macro ncRNAs are epigenetically regulated
A key feature of imprinted gene clusters is the presence of an imprint control element (ICE) (see Glossary, Box 1), which has been genetically defined by deletion experiments in mice or through the mapping of minimal naturally occurring deletions in humans (Table 1). The ICE is epigenetically modified on only one parental chromosome by a DNA methylation `imprint', which is acquired during maternal or paternal gametogenesis and is maintained on the same parental chromosome in the diploid embryo. As the other parental chromosome lacks ICE DNA methylation, this region in a diploid cell is also known as a gametic differentially methylated region (gDMR). The unmethylated ICE controls the imprinted expression of the whole cluster; upon its deletion, imprinted genes are no longer expressed in a parental-specific pattern (Bielinska et al., 2000; Fitzpatrick et al., 2002; Lin et al., 2003; Thorvaldsen et al., 1998; Williamson et al., 2006; Wutz et al., 1997). Note that the term `imprinted' refers to the presence of DNA methylation on the ICE and not to gene expression status and that the above-mentioned deletion experiments show that only the unmethylated ICE is active. Four of the six well-studied imprinted clusters in the mouse (Igf2r, Kcnq1, Pws/As, Gnas) are maternally imprinted and thus gain their ICE DNA methylation imprint during oogenesis. This imprint is maintained only on the maternal chromosome in diploid cells (Fig. 1A-D). The remaining two clusters (Igf2, Dlk1) are paternally imprinted and gain their ICE DNA methylation imprint during spermatogenesis. This imprint is maintained only on the paternal chromosome in diploid cells (Fig. 1E,F).
- Box 1. Glossary
- Cis-acting function
- The ability of a DNA sequence or transcript to regulate the expression of one or more genes on the same chromosome. This contrasts with trans-acting function (see below).
- CTCF
- CCCTC-binding factor, an 11-zinc-finger protein that binds insulator elements.
- Differentially methylated region (DMR)
- A CG-dinucleotide-rich genomic region that, in diploid cells, is methylated on one parental chromosome and unmethylated on the other. Gametic DMRs acquire their parental-specific DNA methylation during gametogenesis, either in the developing haploid oocyte or sperm, whereas somatic DMRs acquire their parental-specific DNA methylation in somatic diploid cells.
- DNA/RNA FISH
- A fluorescence in situ hybridisation technique that uses a complementary DNA or RNA strand to determine the localisation of DNA sequences or RNA transcripts in cell nuclei.
- Dosage compensation
- An epigenetic regulatory mechanism present in mammals, flies and worms that equalises the expression of genes on the X chromosome between XY/X0 males and XX females.
- Epigenetic modifications
- Modifications of DNA or chromatin proteins that alter the ability of DNA to respond to external signals.
- Gene regulation in cis
- See cis-acting function.
- Gene regulation in trans
- See trans-acting function.
- Genomic imprinting
- An epigenetic mechanism that induces parental-specific gene expression in diploid mammalian cells (see Box 2).
- Imprint control element (ICE)
- A short DNA sequence [also known as an ICR (imprint control region) or IC (imprinting centre)] that controls the imprinted expression of multiple genes in cis. All known ICEs are also gametic DMRs; however, their identification requires the in vivo analysis of a deletion of the DMR.
- Insulator element
- A genetic boundary element that binds insulator proteins to separate a promoter on one side of the insulator element from the activating effects of an enhancer located on the other side.
- Macro ncRNAs
- ncRNAs that can be as short as a few hundred nucleotides or as long as several hundred thousand nucleotides, the function of which does not depend on processing into short or micro RNAs.
- Tiling-array analysis
- Commercial chips containing 25-60 nt oligonucleotide probes designed to continuously cover a genomic region that are used to produce unbiased maps of histone modifications following chromatin immunoprecipitation (ChIP-chip), or of DNA methylation following methylated DNA immunoprecipitation (MeDIP-chip), or of gene expression following cDNA hybridisation (RNA-chip).
- Trans-acting function
- The ability of a DNA sequence or transcript to regulate the expression of one or more genes on different chromosomes or to regulate mature RNAs in the cytoplasm.
In the Igf2r, Kcnq1 and Gnas clusters, the ICE contains a promoter for a macro ncRNA [Airn (108 kb), Kcnq1ot1 (91 kb) and Nespas (∼30 kb), respectively] that has an overlap in the antisense direction with only one gene in each imprinted cluster (for references, see Table 2). In the Pws/As cluster, the provisionally named Snrpn-long-transcript (Snrpnlt, also known as Lncat) is an unusually long macro ncRNA that could cover 1000 kb of genomic sequence. The Snrpnlt ncRNA overlaps in antisense orientation with the Ube3a gene, which is located 720 kb downstream. The size of the ICE in this cluster has not been precisely determined in the mouse, as the smallest available ICE deletion only shows a partial or mosaic imprinting defect (Bressler et al., 2001). In the paternally imprinted Igf2 and Dlk1 clusters, the ICE is found 5-14 kb upstream of the H19 macro ncRNA (2.2 kb) and the provisionally named Gtl2-long-transcript (Gtl2lt; length unknown), respectively. H19 lacks any known transcriptional overlap with the other genes in the cluster, whereas Gtl2lt overlaps with Rtl1. In short, although the organisation of these six well-studied imprinted clusters appears to be complex, they generally follow two simple rules: (1) an unmethylated ICE is required for macro ncRNA expression; and (2) most imprinted mRNA genes are not expressed from the chromosome from which the macro ncRNA is expressed.
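The two rules above can be read as a small truth table for a single parental chromosome. The following Python sketch is purely illustrative: the names `AlleleState` and `predict_expression` and the boolean encoding are assumptions introduced here, and rule (2) is simplified to apply to every mRNA gene, although the text states only that it holds for most of them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlleleState:
    """Predicted expression on one parental chromosome (illustrative only)."""
    ncrna_expressed: bool
    mrna_expressed: bool


def predict_expression(ice_methylated: bool) -> AlleleState:
    """Apply the two rules to a single parental chromosome.

    Rule 1: the macro ncRNA requires an unmethylated ICE.
    Rule 2 (simplified): mRNA genes are silent where the ncRNA is expressed.
    """
    ncrna = not ice_methylated
    mrna = not ncrna
    return AlleleState(ncrna_expressed=ncrna, mrna_expressed=mrna)


# Maternally imprinted cluster (e.g. Igf2r): the ICE is methylated on the
# maternal chromosome and unmethylated on the paternal chromosome.
maternal = predict_expression(ice_methylated=True)
paternal = predict_expression(ice_methylated=False)
print("maternal:", maternal)
print("paternal:", paternal)
```

Under these assumptions, the methylated chromosome expresses the mRNA genes and the unmethylated chromosome expresses the macro ncRNA. Exceptions, such as tissue-specific loss of imprinted expression, are not captured by this sketch.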
Imprinted macro ncRNAs can host short ncRNAs
The length of the macro ncRNAs of the imprinted clusters, i.e. Airn, Kcnq1ot1, Nespas, Snrpnlt, H19 and Gtl2lt, ranges from 2.2 to 1000 kb (Table 2). Intriguingly, as shown in Fig. 1, the majority of these imprinted macro ncRNAs, with the possible exception of Kcnq1ot1, also serve as host transcripts for trans-acting short RNAs, such as siRNAs, which are involved in gene silencing by the RNA interference pathway (reviewed by Mattick and Makunin, 2006), miRNAs, which function as translational gene repressors (reviewed by Cannell et al., 2008), and snoRNAs, which are involved in rRNA processing (reviewed by Brown et al., 2008). For example, siRNAs encoded by the Au76 pseudogene region lie within the Airn ncRNA in the Igf2r cluster and are found in oocytes (Watanabe et al., 2008). In the Pws/As cluster, two snoRNA clusters (Snord115, Snord116) are located within the Snrpnlt macro ncRNA (Cavaille et al., 2000; Huttenhofer et al., 2001). In the Gnas cluster, two miRNAs are located within the Nespas macro ncRNA (Holmes et al., 2003). In the Igf2 cluster, both the H19 ncRNA and the protein-coding Igf2 gene contain a miRNA within their transcriptional unit (Cai and Cullen, 2007; Landgraf et al., 2007). In the Dlk1 cluster, four separate macro ncRNAs have been described [Gtl2 (Meg3), Rtl1as, Rian, Mirg] that might be contained within the Gtl2lt ncRNA. Gtl2, Rtl1as and Mirg contain multiple miRNAs, whereas Rian contains multiple snoRNAs (Cavaille et al., 2002; Houbaviy et al., 2003; Huttenhofer et al., 2001; Kim et al., 2004; Lagos-Quintana et al., 2002; Seitz et al., 2004; Seitz et al., 2003). Interestingly, miRNAs generated from Rtl1as have been shown to be involved in the trans-silencing of Rtl1 through an siRNA-mediated pathway (Davis et al., 2005).
Few of the short ncRNAs in these clusters have been analysed in detail, but most of them probably have trans-acting functions, which suggests that they are unlikely to be involved in regulating imprinted gene expression, which depends on a cis-acting mechanism. However, their presence suggests a functional link between macro ncRNAs and short ncRNAs.
Imprinted ncRNAs - atypical mammalian transcripts?
An unusual feature of many imprinted ncRNAs is that they are unspliced or spliced with a low intron/exon ratio, in contrast to the majority of mammalian mRNA-encoding genes, which are intron rich. Notably, the export of ncRNAs to the cytoplasm correlates with splicing (Table 2). The H19 ncRNA is fully spliced and exported to the cytoplasm (Brannan et al., 1990; Pachnis et al., 1984), whereas the Kcnq1ot1 ncRNA is unspliced and retained in the nucleus (Pandey et al., 2008). The Airn ncRNA produces unspliced and spliced transcripts at a ratio of 19:1, and only the spliced transcripts are exported to the cytoplasm (Seidl et al., 2006). In the Gnas cluster, both Exon1A and Nespas ncRNAs are also found as spliced and unspliced forms, but their cellular localisation is unknown (Holmes et al., 2003; Li et al., 1993; Liu et al., 2000; Williamson et al., 2002). Both Airn and Kcnq1ot1 were shown by RNA fluorescence in situ hybridisation (RNA FISH; see Glossary, Box 1) to form RNA `clouds' at their site of transcription (Braidotti et al., 2004; Nagano et al., 2008; Terranova et al., 2008). We do not yet know whether these ncRNA `clouds' explain their ability to repress flanking genes or whether this `cloud-like' appearance is a consequence of a lack of splicing, as mRNA genes mutated to inhibit splicing also show nuclear retention and intranuclear RNA focus formation (Custodio et al., 1999; Ryu and Mertz, 1989).
Although not all imprinted macro ncRNAs have been studied in sufficient detail, at least some have been shown to have unusual transcriptional features that are not generally associated with mammalian mRNA genes, such as reduced splicing potential or low intron/exon ratio, nuclear retention and accumulation at the site of transcription. Investigations into the control of these unusual transcriptional features, and into their role in the functional properties of imprinted ncRNAs, are only just beginning. An obvious genetic element that might account for unusual transcriptional features is the promoter. It was recently shown in fission yeast that splicing regulation is promoter driven (Moldon et al., 2008). It was therefore surprising to find that the endogenous Airn promoter does not control the low splicing capacity of the Airn ncRNA (Stricker et al., 2008), and further work is required to determine how these unusual macro ncRNA features arise.
Macro ncRNAs are functional in imprinted expression
The presence of macro ncRNAs in imprinted clusters raises the question of whether they play a functional role in imprinting. In the case of the two paternally expressed imprinted macro ncRNAs Airn and Kcnq1ot1, experiments in which the ncRNA was truncated by the homologous insertion of a polyadenylation cassette have demonstrated that these ncRNAs are indeed necessary for imprinted expression. Paternal chromosomes that carry a truncated Airn or Kcnq1ot1 ncRNA lose the repression of all protein-coding genes in the imprinted cluster in both embryonic and placental tissues, whereas maternal alleles are unaffected (Mancini-Dinardo et al., 2006; Shin et al., 2008; Sleutels et al., 2002). These experiments showed that these macro ncRNAs act by repressing multiple flanking genes in cis in both embryonic and placental tissues. Nespas is similar to Airn and Kcnq1ot1 in that it is transcribed from a promoter contained within the unmethylated ICE on the paternal allele and has an antisense orientation with respect to the imprinted protein-coding Nesp gene. However, it is not yet known whether Nespas has a cis-silencing role similar to Airn and Kcnq1ot1. By contrast, the maternally expressed H19 ncRNA is known to be dispensable for the imprinted expression of Igf2 (Schmidt et al., 1999). Instead, a methylation-sensitive insulator element (see Glossary, Box 1) contained in the ICE regulates the ability of enhancers that lie downstream of H19 to interact physically with the upstream H19 and Igf2 promoters (Fig. 1E). On the unmethylated maternal allele, CTCF (see Glossary, Box 1) binds the ICE and confines enhancer access to the H19 promoter, thereby blocking the enhancers from reaching the Igf2 promoter. On the methylated paternal allele, CTCF cannot bind, and the enhancers interact preferentially with the Igf2 promoter, facilitating its transcription (Bell and Felsenfeld, 2000; Hark et al., 2000).
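The insulator model described for the Igf2 cluster reduces to a single decision based on ICE methylation. The minimal Python sketch below illustrates that logic only; the function name `enhancer_target` is hypothetical, and the model deliberately ignores every factor other than CTCF binding.

```python
def enhancer_target(ice_methylated: bool) -> str:
    """Return the promoter that the downstream enhancers are predicted to activate.

    Assumption: CTCF binds only the unmethylated ICE, and a bound ICE
    insulates the Igf2 promoter from the enhancers.
    """
    ctcf_bound = not ice_methylated
    return "H19" if ctcf_bound else "Igf2"


# In the Igf2 cluster the ICE is unmethylated on the maternal chromosome
# and methylated on the paternal chromosome.
for parent, methylated in (("maternal", False), ("paternal", True)):
    print(parent, "->", enhancer_target(methylated))
```

This reproduces the pattern stated in the text: H19 is activated on the maternal chromosome and Igf2 on the paternal chromosome.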
An additional mechanism to induce imprinted expression is present in the newly described H13 imprinted cluster on mouse chromosome 2 (Wood et al., 2007) (not shown in Fig. 1). The H13 (histocompatibility 13 antigen) (Graff et al., 1978) gene contains an intronic, maternally methylated gDMR. The transcription of full-length functional H13 from the maternal chromosome depends on the methylation of this gDMR. On the paternal allele, the unmethylated gDMR acts as a promoter for the Mcts2 retrogene, and Mcts2 expression correlates with the premature polyadenylation of H13 (Wood et al., 2008). To date, it is unknown whether the Mcts2 retrogene is coding or non-coding, nor whether Mcts2 expression or the unmethylated gDMR is required to block the production of full-length H13 transcripts.
Box 2. Genomic imprinting: basic biology, history and clinical implications
Mammals are diploid organisms whose cells contain two matched sets of chromosomes, one inherited from the mother and one from the father. Thus, mammals have two copies of every gene with the same potential to be expressed in any cell. Genomic imprinting is an epigenetic mechanism that affects ∼1% of genes and restricts their expression early in development to one of the two parental chromosomes. Genes that show parental-specific expression were hypothesised to exist in mammals following a series of landmark observations that began to accumulate thirty years ago. These included the failure of embryos to develop by parthenogenesis in the absence of fertilisation, the phenotype of embryos that had inherited two copies of one parental chromosome in the absence of the other parental copy, and the inability to generate viable embryos that contained two maternal or paternal pronuclei through oocyte nuclear transfer experiments. This hypothesis was corroborated in 1991 by the discovery of three imprinted genes: the maternally expressed Igf2r gene, the paternally expressed Igf2 gene and the maternally expressed H19 ncRNA. Whereas genomic imprinting now offers one of the best models in which to investigate epigenetic gene regulation in mammals, it also has considerable implications for modern molecular medicine in the management of genetic diseases that map to autosomes but are only inherited from one parent, and in the efforts to apply assisted reproductive or cloning technologies to human reproduction.
Thus, two out of the three tested imprinted macro ncRNAs act in cis to induce the imprinted expression of flanking genes. These macro ncRNAs might also possess additional functions. For example, as their promoter is contained in the ICE, they could play a role in acquiring the gametic DNA methylation imprint. However, a recent publication indicates that the transcription of overlapping imprinted protein-coding genes, rather than of ncRNAs, is needed to acquire methylation imprints in the Gnas cluster (Chotalia et al., 2009). This work shows that a truncation of the Nesp mRNA transcript (Fig. 1D) by the insertion of a polyadenylation cassette, which abolishes transcription through the ICE, impairs acquisition of the ICE methylation mark in oocytes. This might represent a common theme for oocyte-specific DNA methylation imprints, given that an overlapping mRNA gene has been reported to be transcribed through five other maternal gDMRs in oocytes, but not in sperm. Perhaps surprisingly, ncRNA transcription might play a role in maintaining the unmethylated state of the ICE in the early embryo. This is indicated by experiments in which the Airn promoter was deleted from the paternal chromosome in embryonic stem (ES) cells, which resulted in the methylation of the normally unmethylated paternal ICE (Stricker et al., 2008).
In summary, macro ncRNAs have been shown to function in inducing parental-specific gene expression in the Igf2r and Kcnq1 imprinted clusters. In the Igf2 imprinted cluster, the H19 ncRNA is expressed from the parental chromosome on which the protein-coding genes are silenced, but it does not itself play a functional role in the silencing. The functions of ncRNAs in the other imprinted clusters shown in Fig. 1 are yet to be tested.
Developmental and tissue-specific imprinted expression
How might developmental or tissue-specific imprinted expression arise? In this section, we discuss mechanisms that might differentially regulate imprinted expression, and describe recently developed in vitro model systems that provide an excellent tool with which to study these mechanisms (Fig. 2). Genomic imprinting consists of a cycle of events that begins when the ICE DNA methylation imprint is established on one parental allele during gametogenesis. After fertilisation, when the embryo is diploid, the ICE methylation imprint is maintained on the same parental allele through the action of the DNA methyltransferase DNMT1 (Li et al., 1993). In subsequent developmental or tissue-specific regulated steps, imprinted expression can be maintained by additional epigenetic modifications or lost in the absence of such factors. In the examples already discussed, temporal- and tissue-specific imprinted expression could be achieved by regulating ncRNA expression and function (for the Igf2r and Kcnq1 clusters) or by regulating insulator formation (for the Igf2 cluster). To complete the genomic imprinting life cycle, the ICE methylation imprint and any secondary epigenetic modifications are erased during early germ cell development to allow the parental gametes to acquire a maternal or paternal DNA methylation imprint ready for the next generation (reviewed by Barlow and Bartolomei, 2007).
There are several examples in which differential imprinted expression correlates with differential ncRNA expression. In the mouse brain, the imprinted expression of the Ube3a gene in the Pws/As cluster is seen in neurons that express the Ube3a-ats transcript, which might be continuous with Snrpnlt (Fig. 1C), but it is not seen in glial cells that lack this antisense RNA (Yamasaki et al., 2003). Glial cells show imprinted expression of Igf2r and express the Airn ncRNA, but neurons lack Airn and show non-imprinted Igf2r expression (Yamasaki et al., 2005). Imprinted expression of Igf2r also correlates with Airn expression in embryonic development. Preimplantation embryos lack Airn and express Igf2r from both parental chromosomes, whereas post-implantation embryos express Airn only from the paternal chromosome and Igf2r only from the maternal chromosome (Sleutels et al., 2002; Szabo and Mann, 1995). In terms of in vitro models for studying imprinting, it was shown recently for the Igf2r cluster that ES cell differentiation mimics the onset of imprinted expression and the gain of epigenetic modifications seen in the developing embryo (Latos et al., 2009). This work establishes the utility of ES cells to study the imprinted expression that is typical for embryonic tissue. However, as the imprinted expression of the Slc22a2 and Slc22a3 genes in the Igf2r cluster is restricted to the placenta labyrinth layer (Fig. 2A), this cannot be analysed in ES cells, which arise from a cell lineage that does not contribute to this tissue. Trophoblast stem (TS) cells are an obvious ES cell analogue for the study of genes that show imprinted expression only in placental tissues (Fig. 2B). However, differentiated TS cells appear to be an unsuitable model for the later stages of placental development, as the expression patterns and histone modifications detected in vivo are not recapitulated in vitro (Lewis et al., 2006).
The placenta provides a good example of tissue-specific variation in imprinted expression, as the majority of imprinted genes in the mouse only show imprinted expression in the placenta (Wagschal and Feil, 2006). This occurs because in many imprinted clusters a small number of centrally positioned genes show `ubiquitous' imprinted expression (i.e. in embryo, placenta and adult), whereas additional genes in the cluster that extend upstream or downstream have imprinted expression only in the placenta (Fig. 1). As experiments that involve either ICE deletion or ncRNA truncation (as described above) show that imprinted expression in the embryo and in the placenta is controlled by the same elements, there are two possible explanations for this phenomenon: either the ICE or the ncRNA acts differently in these two tissues to repress genes; or the placenta allows spreading of the basic mechanism that operates in embryonic tissue. The mouse placenta is a highly invasive organ, and a complete separation of embryonic and maternal tissue is not possible (Fig. 2A). This maternal contamination means that the placenta might not be a reliable tissue for analysing imprinted expression and epigenetic modifications. The placenta is an extra-embryonic tissue, which means that it is derived from the embryo but does not contribute to the embryo itself. If `placental-specific' imprinted expression were a general feature of extra-embryonic tissues, it might be more easily analysed in the extra-embryonic membranes (amnion, parietal and visceral yolk sac), which can be isolated without the presence of contaminating maternal tissue (Fig. 2A).
The imprinted expression of some genes in the Igf2 and Kcnq1 imprinted clusters has been demonstrated in the visceral yolk sac (Davis et al., 1998; Frank et al., 1999); however, as the lineages of the extra-embryonic membranes and placenta differ, it remains to be determined whether the `placental-specific' imprinted expression pattern is conserved in all extra-embryonic tissues.
DNA methylation represses a cis-acting repressor
As described above, deletion experiments show that only the unmethylated ICE is active in inducing the silencing of flanking protein-coding genes, either by activating a ncRNA promoter or by forming an insulator (see also Fig. 1). Thus, the ICE can be viewed as a cis-acting repressor and DNA methylation as a modification to repress this repressor. The analysis of ICE methylation can therefore offer insights into how this epigenetic modification is attracted to specific sequences and how it is used to inhibit ncRNA transcription and insulator function. In the maternal germline, the DNA methyltransferase-like protein DNMT3L, acting in concert with the DNA methyltransferase DNMT3A, is crucial for the establishment of ICE germline DNA methylation (Bourc'his and Proudhon, 2008). The subsequent maintenance of ICE methylation requires the DNMT1 family of DNA methyltransferases (Hirasawa et al., 2008). Additional proteins, such as the Krüppel-associated box zinc-finger protein ZFP57, are also required for acquiring ICE methylation in the Pws/As cluster and for maintaining ICE methylation in Dlk1 (as well as in three other imprinted clusters not shown in Fig. 1), but play no role in the Igf2 and Igf2r clusters (Li et al., 2008). Although the exact mechanism by which ZFP57 acts is unknown, this finding raises the possibility that each ICE requires different additional factors for the acquisition and maintenance of germline DNA methylation. Exactly how de novo DNA methylation enzymes recognise ICE sequences is unclear. Many ICEs contain a run of tandem direct repeats that have been suggested to form a secondary structure that induces DNA methylation (Neumann and Barlow, 1996). Tandem direct repeat sequences from the Igf2r and Kcnq1 cluster ICEs are able to induce maternal germline methylation in a transgenic model (Reinhart et al., 2006).
The role of these repeats in the endogenous Igf2r ICE is not yet known; however, the in vivo deletion of a subset of tandem repeats from the Kcnq1ot1 ICE or of direct repeats that flank the Igf2 ICE did not change ICE DNA methylation (Lewis et al., 2004; Mancini-Dinardo et al., 2006). By contrast, a mouse strain-specific loss of methylation was observed following the deletion of a repeat region in the paternally imprinted Rasgrf1 cluster located on mouse chromosome 9 (not shown in Fig. 1) (Yoon et al., 2002). Thus, further work is needed to determine exactly how the methylation machinery in oocytes targets ICE sequences. Although the analysis of imprinted genes highlights one of the few reported cases of a functional role for DNA methylation in gene silencing, it should be noted that its silencing effect is directed towards the ncRNA. As shown in Fig. 1, imprinted protein-coding genes are expressed from the parental chromosome that carries the methylated ICE. Thus, in imprinted clusters, DNA methylation acts by repressing a repressor of imprinted protein-coding genes.
Histone modifications associated with imprinted gene clusters
The previous section described how DNA methylation directly regulates ICE activity, but does not directly silence imprinted protein-coding genes. Here, we discuss current progress in understanding the potential roles played by histone modifications in restricting expression of macro ncRNAs to one parental allele and imprinted protein-coding genes to the other allele. Recent studies have shown that the ICE carries histone modifications that are specific to the DNA-methylated or the DNA-unmethylated allele (Fig. 3 and Table 1). Genome-wide sequencing and oligonucleotide tiling-array analyses have been used to show that the DNA-methylated ICE is marked by focal repressive histone modifications of the type found in constitutive centromeric and telomeric heterochromatin (Table 3), such as H3K9me3, H4K20me3 and the presence of HP1 (Mikkelsen et al., 2007; Regha et al., 2007). The discovery of focal repressive heterochromatin changed our general understanding of chromatin, which has traditionally been classified as either heterochromatin or euchromatin and considered to represent, respectively, transcriptionally repressed and active regions (Huisinga et al., 2006). As the ICE is often located within transcribed genes (Fig. 3), it is now clear that focal heterochromatin can exist inside actively transcribed regions without blocking RNA polymerase II (RNAPII) elongation. In the Igf2r and Kcnq1 clusters, the repressive H3K27me3 mark is present on the methylated ICE only in undifferentiated ES cells. In the Igf2r cluster, H3K27me3 is absent from embryonic fibroblasts (Fig. 3A), but in the Kcnq1 cluster it is present in both embryo and placenta (Fig. 3B) (Latos et al., 2009; Lewis et al., 2006; Lewis et al., 2004; Umlauf et al., 2004). The unmethylated ICE lacks repressive modifications but carries active histone modifications, such as H3K4me and H3/H4 acetylation. 
In diploid cells, the presence of active and repressive histone modifications on the same DNA sequence, each marking a different parental chromosome, can be used to identify an ICE (Fig. 3). The usefulness of this approach was demonstrated in a genome-wide study of diploid ES cells that identified short regions that carry both repressive H3K9me3 and active H3K4me3 modifications on the ICE of the six imprinted clusters shown in Fig. 1 (Mikkelsen et al., 2007).
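Computationally, searching for regions that carry both marks amounts to intersecting two sets of genomic intervals. The Python sketch below shows one generic way to do this with a two-pointer sweep; it is not the method used in the cited study, the function name `overlapping_regions` is hypothetical, and the peak coordinates are invented for illustration. Each input list is assumed to be sorted, internally non-overlapping and restricted to a single chromosome.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) coordinates on one chromosome


def overlapping_regions(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Return the intersections between two sorted, non-overlapping peak lists."""
    out: List[Interval] = []
    i = j = 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            out.append((start, end))
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


# Hypothetical peak coordinates, invented for illustration only.
h3k9me3_peaks = [(1000, 2500), (8000, 9000)]
h3k4me3_peaks = [(2000, 3000), (5000, 6000)]
candidates = overlapping_regions(h3k9me3_peaks, h3k4me3_peaks)
print(candidates)  # [(2000, 2500)]
```

An overlap found in this way only marks a candidate region. Showing that the two modifications lie on different parental chromosomes, and that the region functions as an ICE, still requires allele-specific analysis and a deletion experiment.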
The histone modification profiles established so far show that repressive marks are associated with the DNA-methylated ICE, whereas active histone marks are associated with the unmethylated ICE. Although it has proven to be relatively straightforward to assign a function to DNA methylation in regulating ICE activity, a general function for histone modifications has not yet been identified. Repressive H3K9me3 modifications are regulated by three known histone methyltransferases, SUV39H1, SUV39H2 (also known as KMT1A and KMT1B) and ESET (KMT1E, SETDB1), whereas the repressive H4K20me3 modification is regulated by SUV4-20H1 and SUV4-20H2 (KMT5B and KMT5C) (Table 3). The repressive H3K9me3 mark is maintained and even enhanced on the ICE in embryonic fibroblasts that lack SUV39H1 and SUV39H2, whereas the repressive H4K20me3 is reduced in embryonic cells that lack SUV4-20H1 and SUV4-20H2 without removing either the repressive H3K9me3 mark or DNA methylation (Pannetier et al., 2008; Regha et al., 2007). The ESET methyltransferase was found to bind to the Igf2r ICE; however, its role could not be tested directly because ESET-deficient cells are not viable at any developmental stage (Dodge et al., 2004; Regha et al., 2007). Thus, suitable genetic systems are not yet available to test the role of the repressive H3K9me3 and H4K20me3 modifications in regulating ICE activity.
In contrast to the lack of a defined role for histone-modifying enzymes in regulating ICE activity, several reports describe a role for these enzymes in regulating placental, but not embryonic, imprinted expression. The Polycomb group protein EED, which is required for repressive H3K27me3 modifications, has been shown to repress the paternal allele of four out of 18 tested imprinted genes in embryos 7.5 days post-coitus (dpc), which mainly consist of extra-embryonic tissue at this stage (Mager et al., 2003). The affected genes were located in three different imprinted clusters, in which the majority of genes maintained correct imprinted expression. This indicates that EED does not play a general role in regulating imprinted expression, but is attracted to specific genes. The G9A (KMT1C, EHMT2) histone lysine methyltransferase, which dimerises with G9A-like protein (GLP; KMT1D, EHMT1) to induce repressive H3K9me2 modifications (Table 3), is necessary for the paternal repression of some genes in the Kcnq1 and Igf2r clusters in the placenta, but not in the embryo (Nagano et al., 2008; Wagschal et al., 2008). As mentioned above, in embryonic tissue, repressive heterochromatin (H3K9me3, H4K20me3, HP1), but not repressive H3K27me3, modifies the DNA-methylated ICE in a focal manner. Not all of these repressive modifications have been mapped throughout the imprinted clusters in the placenta, but repressive H3K9me2/3 and H3K27me3 marks were found at the promoters of silenced mRNA genes in the Igf2r cluster (Nagano et al., 2008) (Fig. 3A). By contrast, these repressive marks were found to be more widespread in the placenta on the chromosome that carries the silenced mRNA genes in the Kcnq1 cluster (Fig. 3B). In one study in the placenta (Nagano et al., 2008), both active and repressive histone modifications were found on genes that showed placental-specific imprinted expression. 
Although this might indicate the existence of 'bivalent' domains (Bernstein et al., 2006), care should be taken in interpreting these results owing to the risk of maternal tissue contamination in placental samples (Fig. 2).
In summary, the analysis of histone modifications shows that the same active and repressive histone modifications that correlate with expressed and silent genes also modify imprinted genes in an allele-specific manner. Further work is needed to determine which modifications reflect the cause as opposed to the consequence of imprinted expression. Although there is currently no indication that histone modifications co-operate with DNA methylation to restrict macro ncRNA expression to one parental allele, there are emerging data indicating that, in the placenta, histone modifications might play a role in repressing imprinted mRNA genes in cis. As discussed in the next section, there might even be a link between macro ncRNA function and the establishment of histone modifications.
In the placenta, ncRNAs might target repressive histone marks
In both placental and embryonic tissue, the repression of multiple genes in the Igf2r and Kcnq1 clusters on the paternal chromosome depends on the Airn and Kcnq1ot1 macro ncRNAs (Mancini-Dinardo et al., 2006; Shin et al., 2008; Sleutels et al., 2002). However, the mechanism by which these ncRNAs induce repression is unknown. One significant open question is whether it is the ncRNA itself, or the act of its transcription, that is required for silencing (Pauler et al., 2007). Two recent studies indicate that the Airn and Kcnq1ot1 ncRNAs are themselves directly involved in silencing genes in the placenta. Kcnq1ot1 was found to localise physically to several silent genes on the paternal allele that lie hundreds of kb away from the Kcnq1ot1 promoter (Pandey et al., 2008). This finding is supported by RNA/DNA FISH, which showed partial overlap between the Kcnq1ot1 RNA and the flanking imprinted genes in the Kcnq1 cluster in the trophectoderm cells of early embryos, which contribute to the placenta (Terranova et al., 2008). Furthermore, Kcnq1ot1 directly interacts with Polycomb group proteins, which are necessary for establishing the repressive H3K27me3 mark, and with G9A, which is involved in setting the repressive H3K9me2 mark (Pandey et al., 2008). Together, these findings indicate that in the placenta the Kcnq1ot1 ncRNA localises to chromatin and targets histone methyltransferases to the whole imprinted cluster. Notably, embryos that are deficient for G9A and for the Polycomb proteins EZH2 and RNF2 show a loss of paternal repression for some of the placental-specific imprinted genes in the Kcnq1 cluster (Terranova et al., 2008; Wagschal et al., 2008). Similarly, in the placenta, the Airn ncRNA in the Igf2r cluster lies in close proximity to the silent Slc22a3 promoter and has been shown to bind G9A. In addition, G9a-null embryos show a loss of placental imprinted expression for Slc22a3, but maintain Igf2r imprinted expression (Nagano et al., 2008).
An RNA FISH study of the Airn and Kcnq1ot1 ncRNAs in TS cells and in preimplantation trophectoderm cells has also shown that both these ncRNAs are located in nuclear domains that are characterised by a high density of repressive H3K27me3 and by a lack of active histone modifications and RNAPII (Terranova et al., 2008). In summary, the evidence so far indicates that the Airn and Kcnq1ot1 ncRNAs induce imprinted expression by an RNA-directed targeting mechanism in the placenta that only affects genes that show placental-specific imprinted expression (Fig. 1). We present a model (Fig. 4A) according to which the ncRNA expressed from the unmethylated ICE is maintained at the site of transcription and associates with chromatin in cis. The ncRNA could localise throughout the cluster (Fig. 4A, Kcnq1ot1) or to specific promoters by looping (Fig. 4A, Airn), and might subsequently attract specific histone modifications that repress the transcription of multiple genes located at some distance from the ncRNA gene itself.
Is ncRNA transcription more important in embryonic cells?
As described above, recent work suggests that in the placenta it is the ncRNA transcript itself that mediates gene silencing. However, imprinted ncRNAs exert different effects in the mouse embryo than in the placenta, as only a subset of the genes in the imprinted gene clusters show imprinted expression in embryonic and adult somatic tissue (Fig. 1, Fig. 4B). In the Igf2r cluster, the Airn ncRNA represses Igf2r in embryos, but represses Igf2r, Slc22a2 and Slc22a3 in the placenta. In the Kcnq1 cluster, the Kcnq1ot1 ncRNA silences Kcnq1, Cdkn1c, Slc22a18 and Phlda2 in the embryo, but an additional six genes in the placenta. A further indication of the differences between the embryo and the placenta is that G9A and EED, which are required for repressive H3K9me2 and H3K27me3 modifications, respectively, and for the imprinted expression of some placental genes, appear to play no role in the imprinted expression in the Igf2r and Kcnq1 clusters in embryonic tissues (Mager et al., 2003; Nagano et al., 2008; Wagschal and Feil, 2006). We have proposed previously that the Airn ncRNA might silence Igf2r because of transcriptional interference, and that Airn might act solely by transcription per se (Pauler et al., 2007). According to this model, ncRNA transcription either interferes directly with transcriptional initiation or with the activity of essential cis-regulatory elements (Fig. 4B). Several lines of evidence support a model in which Airn silences Igf2r by transcriptional interference in embryonic tissue. First, Airn has a short half-life of ∼90 minutes, which argues against a function for the ncRNA in targeting repressive chromatin, as this would require it to be stable for at least one cell cycle (Seidl et al., 2006). Second, Airn does not induce widespread repressive chromatin in embryos (Regha et al., 2007). Third, the ability of Airn to silence Igf2r is dependent on promoter strength, a feature associated with transcriptional interference (Shearwin et al., 2005; Stricker et al., 2008).
The Igf2r and Kcnq1 clusters differ in that the Kcnq1ot1 ncRNA represses multiple genes in the embryo, and as it is contained entirely within Kcnq1 (Pandey et al., 2008), it does not overlap with a promoter (Fig. 4B). However, it is possible to propose a transcriptional interference mode of silencing for this ncRNA by postulating the existence of crucial cis-regulatory elements that are overlapped by Kcnq1ot1. Although there is less evidence to support a transcriptional interference model for Kcnq1ot1, the lack of widespread repressive chromatin marks on genes in this cluster that show imprinted expression in the embryo (Pandey et al., 2008; Umlauf et al., 2004), together with the absence of a role for G9A and EED (Mager et al., 2003; Wagschal et al., 2008), indicates that RNA-mediated targeting does not operate in embryonic tissues.
Mammalian macro ncRNAs, which comprise the majority of the transcriptome, have been suggested to play a role in the epigenetic regulation of gene expression, mainly on the basis of their expression patterns. In contrast to the uncertainty surrounding the function of most mammalian macro ncRNAs, imprinted macro ncRNAs have clearly been shown to regulate flanking genes epigenetically. Thus, imprinted genes offer a valuable in vivo and in vitro model not only to decipher the transcriptional biology of macro ncRNAs themselves and their regulation by DNA methylation, but also to shed light on the epigenetic mechanisms that underlie the macro ncRNA-mediated repression of flanking genes.
We thank Quanah Hudson and Stefan Stricker for comments on the manuscript. The authors are supported by grants from the Sixth European Union Framework Programme: the 'HEROIC' Integrated Project and the 'EPIGENOME' Network of Excellence, and the Austrian Science Fund (FWF).
* These authors contributed equally to this work
© 2009.