Genetics, Vol. 165, 1355-1384, November 2003, Copyright © 2003

A Theoretical Model for the Regulation of Sex-lethal, a Gene That Controls Sex Determination and Dosage Compensation in Drosophila melanogaster

Matthieu Louis (a), Liisa Holm (2, a), Lucas Sánchez (b), and Marcelle Kaufman (c)
(a) The European Bioinformatics Institute, EMBL Outstation, Cambridge CB10 1SD, United Kingdom
(b) Centro de Investigaciones Biologicas, 28006 Madrid, Spain
(c) Université Libre de Bruxelles, Centre for Non-linear Phenomena and Complex Systems, B-1050 Brussels, Belgium

Corresponding author: Matthieu Louis, EMBL Outstation, Cambridge CB10 1SD, United Kingdom. E-mail: mlouis@ebi.ac.uk

Communicating editor: A. J. LOPEZ


*  ABSTRACT

Cell fate commitment relies upon making a choice between different developmental pathways and subsequently remembering that choice. Experimental studies have thoroughly investigated this central theme in biology for sex determination. In the somatic cells of Drosophila melanogaster, Sex-lethal (Sxl) is the master regulatory gene that specifies sexual identity. We have developed a theoretical model for the initial sex-specific regulation of Sxl expression. The model is based on the well-documented molecular details of the system and uses a stochastic formulation of transcription. Numerical simulations allow quantitative assessment of the role of different regulatory mechanisms in achieving a robust switch. We establish on a formal basis that the autoregulatory loop involved in the alternative splicing of Sxl primary transcripts generates an all-or-none bistable behavior and constitutes an efficient stabilization and memorization device. The model indicates that production of a small amount of early Sxl proteins leaves the autoregulatory loop in its off state. Numerical simulations of mutant genotypes enable us to reproduce and explain the phenotypic effects of perturbations induced in the dosage of genes whose products participate in the early Sxl promoter activation.


SOMATIC sex determination is the commitment of an embryo to either the female or the male developmental pathway. In Drosophila melanogaster, flies with the chromosome constitution 2X;2A (X, X chromosome; A, haploid autosomal set) are females and flies with the chromosome constitution XY;2A (Y, Y chromosome) are males. Therefore, the X-linked genes are present in two doses in females and in one dose in males. This imbalance is essential to signal sexual identity and lasts for a short period of time after fertilization, after which the amount of products encoded by the genes located on the X chromosome is equalized in both sexes (dosage compensation). In Drosophila, dosage compensation is achieved through hypertranscription of the single X chromosome in males (reviewed in CLINE and MEYER 1996; LUCCHESI 1996).

In somatic cells of D. melanogaster, the X-linked Sex-lethal (Sxl) gene directs both sex determination and dosage compensation. The instruction for establishing sexual identity and dosage compensation is implemented by the absence or the presence of the Sxl gene product. Over the last three decades, experimental investigations have unraveled the regulatory mechanisms that determine the production state of Sxl protein (diagrammed in Fig 1). The Sxl gene has two promoters (SALZ et al. 1989), the establishment and the maintenance promoter (also called the early and the late promoter and denoted as SxlPe and SxlPm). For convenience, we adopt the following convention: the products of gene Sxl originating at the establishment promoter are referred to as early transcripts and proteins. The products of Sxl originating at the maintenance promoter are referred to as late transcripts and proteins.



Figure 1. The determination of the production state of Sxl protein is a multistep process that converts the X/A ratio in a binary way. The dashed arrow indicates that positive autoregulation of the RNA splicing is launched in females only.

The primary genetic X/A signal (where X/A represents the ratio of X chromosomes to autosomal sets) acts on the establishment promoter and controls Sxl expression at the transcription level (TORRES and SANCHEZ 1991; KEYES et al. 1992). Due to a twofold difference in the number of X chromosomes, Sxl transcription is efficiently initiated in females only. As a result, early Sxl protein is abundantly produced in females whereas it remains undetectable in males. The activation process of Sxl by the X/A signal is cell autonomous. The Y chromosome plays no role in Sxl activation.

Once Sxl transcription has been sex-specifically regulated, an event that occurs around blastoderm stage, the X/A signal is no longer needed and the production state of Sxl protein remains fixed (SANCHEZ and NOTHIGER 1983; BACHILLER and SANCHEZ 1991). The ability of the system to function as a stable genetic switch after the blastoderm stage and during the rest of development and adult life is due to the existence of a positive autoregulatory loop on Sxl expression (CLINE 1984).

After the blastoderm stage, the maintenance promoter SxlPm starts functioning in both sexes, and production of the late transcripts persists throughout the remainder of the fly's life. Male transcripts differ from female ones by the inclusion of a male-specific exon that places stop codons in the open reading frame of mature mRNAs. The inclusion of this exon gives rise to truncated, nonfunctional Sxl proteins. In females, the male-specific exon is spliced out and functional Sxl protein is produced (BELL et al. 1988; BOPP et al. 1991). Therefore, the control of Sxl expression after the blastoderm stage occurs by sex-specific splicing of its transcripts. The autoregulatory function of Sxl takes place at the level of splicing since Sxl protein itself is the driving force for female-specific splicing of its own primary transcript (BELL et al. 1991). The role of the primary genetic X/A signal for sex determination and dosage compensation is thus to provide females with a transient early burst of Sxl protein sufficient to establish female-specific control of Sxl expression, once the maintenance promoter has become functional.

Like any other regulatory process, determination of the production state of Sxl protein can be viewed as a program. This program aims to sense small quantitative differences in the X/A signal and to amplify them into the final all-or-none production of Sxl protein. Experimental works have studied how this program is genetically encoded and a very good picture of the system has emerged. So far, the dynamical aspects of the program—the way the program code is executed—have been tackled through verbal models. The complex nature of the regulatory processes analyzed makes it desirable to unify the present knowledge within a theoretical framework. Quantitative models are often useful to clarify qualitative hypotheses based on intuition.

We present a theoretical model for determination of sex-specific production of Sxl protein. The regulatory process modeled is composed of three steps: the formation of the X/A signal, the activation of the establishment promoter SxlPe by this X/A signal, and the effect of the early production of Sxl protein on the control of Sxl autoregulation (cf. Fig 1). The model focuses on the known molecular mechanisms operating at each step. As we shall see, this model clarifies the role of the system parts and allows working hypotheses to be tested. It emphasizes the importance of the molecular organization of the establishment promoter and shows that the decision-making process does not require all-or-none transcriptional regulation of SxlPe by the X/A signal. Indeed, our simulations are not compatible with the total absence of early Sxl protein in males, and the model suggests that production of small amounts of early Sxl protein in males is not sufficient to switch on the autoregulatory loop on Sxl protein production. Numerical simulations of the model equations allow a thorough analysis of mutant genotypes and display the in silico effects of loss-of-function mutations and/or abnormal dosages of the X/A signal genes. Our results are in good agreement with experimental observations and provide insight into the mechanistic features that enable the system to buffer important variations in gene dosage.


*  MODELS

The regulatory processes controlling the production of Sxl protein are both time and space dependent. Eukaryotic cells are highly organized milieus and major cellular functions have been shown to occur in specific compartments (for an analysis of the functional architecture of the cell nucleus, see DUNDR and MISTELI 2001). In a first attempt to partition the cell milieu, one can roughly distinguish the cytoplasm from the nucleus. While transcriptional regulation and RNA splicing occur in the nucleus, proteins are translated in the cytoplasm. For the sake of simplicity, the model developed herein does not integrate the distinction between nucleus and cytoplasm; diffusion and active transport between these two cell compartments are not considered. This simplification is motivated by the observation that all major regulatory events relevant to the transcriptional and the post-transcriptional regulation of Sxl take place inside the nucleus. In the forthcoming sections, the volume of the system implicitly refers to the volume of typical cell nuclei (for details see Appendix A).

Formation of the X/A signal
The X/A signal is polygenic. Genetic and molecular analyses have identified a set of zygotic and maternal genes that are necessary for activation of the establishment promoter: the zygotic numerators (X-linked), the zygotic denominators (autosomal), and the maternal genes. The numerator genes are scute [sc, also called sisterless-b (sis-b); TORRES and SANCHEZ 1989, 1991; PARKHURST et al. 1990; ERICKSON and CLINE 1991], sisterless-a (sis-a; CLINE 1986), outstretched [os, also called sisterless-c (sis-c); JINKS et al. 2000; SEFTON et al. 2000], and runt (run; DUFFY and GERGEN 1991; TORRES and SANCHEZ 1992). The gene deadpan (dpn) is the only autosomal denominator gene of the X/A signal that has been found so far (YOUNGER-SHEPHERD et al. 1992). The maternal effect genes are daughterless (da; CLINE 1978), hermaphrodite (her; PULTZ and BAKER 1995), and extramacrochaetae (emc; YOUNGER-SHEPHERD et al. 1992).

Model components: For simplicity, only the most representative products of each class are taken into account to model the formation of the X/A signal: the numerator gene products Sc and Sis-a (denoted as SisA for notation clarity), the denominator gene product Dpn, and the maternal gene products Da and Emc (present in the same amount in male and female embryos; see above and a detailed discussion in SANCHEZ et al. 1994).

Among the X-linked genes required for Sxl activation, we retained the two predominant genes, sc and sis-a. This choice is justified by the observation that not all of the genes involved in the activation of SxlPe play the same role. Indeed, despite the fact that increasing dosage of run alone is sufficient for promoting Sxl transcription in males (KRAMER et al. 1999), several lines of evidence point out weaker effects of run with respect to sc and sis-a. First, low levels of Sxl expression are observed in the absence of run activity, indicating that run is not absolutely required for the transcriptional initiation of Sxl (DUFFY and GERGEN 1991). By contrast, the absence of sc and sis-a activity completely eliminates Sxl expression (CLINE 1986, 1988; TORRES and SANCHEZ 1989). Second, simultaneous duplications of run and sc show very low Sxl-dependent male-specific lethality (TORRES and SANCHEZ 1992) in comparison with the lethal effects produced by sis-a and sc duplications (CLINE 1988; TORRES and SANCHEZ 1989).

As far as sis-c is concerned, the following observations suggest that it plays a secondary role with respect to sc and sis-a: (i) while mutations in sis-a and sc strongly downregulate Sxl transcription, removal of sis-c activity has a significantly weaker effect and allows residual expression of the gene (JINKS et al. 2000; SEFTON et al. 2000), and (ii) the female-specific lethal synergistic effect between run and a deficiency of sis-c is weaker than that between run and either sc or sis-a and that between sis-c and either sc or sis-a (SANCHEZ et al. 1994, 1998).

A molecular analysis of the interactions between the products required for early Sxl activation has been performed only for the gene products selected to model the formation of the X/A signal. Sc (VILLARES and CABRERA 1987; MURRE et al. 1989), Da (CRONMILLER et al. 1988; MURRE et al. 1989), and Dpn (BIER et al. 1992) proteins contain a basic-helix-loop-helix (bHLH) domain. The HLH domain allows these proteins to form homodimers or heterodimers, while the basic domain is required for their binding to specific DNA sequences (MURRE et al. 1989). Emc belongs to the HLH family and is devoid of a basic domain (ELLIS et al. 1990; GARRELL and MODOLELL 1990). Finally, SisA is a basic-leucine-zipper (bZIP) protein (ERICKSON and CLINE 1993). Members of the bHLH and bZIP protein families are usually involved in transcriptional regulation (for review see MASSARI and MURRE 2000).

Protein complexes: Interactions between the gene products selected in the model lead to the formation of complexes, which represent the molecular actors of the X/A ratio signal. Protein-protein interactions have been investigated by in vitro methods and yeast two-hybrid assays. Experimental evidence supports the formation of the following homo- and heterodimers: Sc-Da (CABRERA and ALONSO 1991; VAN DOREN et al. 1991; DESHPANDE et al. 1995; LIU and BELOTE 1995), Dpn-Dpn (WINSTON et al. 1999), Emc-Sc, and Emc-Da (VAN DOREN et al. 1991). Substantial levels of interaction were also reported between SisA and Da (LIU and BELOTE 1995) and between SisA and Dpn (LIU and BELOTE 1995). On this basis, we assume in this study that SisA can form stable heterodimers with Da and Dpn. Since very little is known about the formation of higher-order multimers, we restrict the present model of the X/A signal production to the formation of dimers. No interaction has been observed between Sc and Dpn (DESHPANDE et al. 1995; LIU and BELOTE 1995) or between Sc and SisA (LIU and BELOTE 1995).

The molecular characterization of the X/A signal components has clarified how they act on the establishment promoter SxlPe. Sc-Da induces transcription at SxlPe by binding to a set of regulatory sites within the promoter (ESTES et al. 1995; HOSHIJIMA et al. 1995; YANG et al. 2001). Similarly, Dpn-Dpn inhibits Sxl transcription by binding to a pair of adjacent regulatory sites (ESTES et al. 1995; HOSHIJIMA et al. 1995). It has been shown that SisA protein is required to efficiently induce transcription at SxlPe (ESTES et al. 1995). Hence, we assume that SisA-Da is a transcription factor capable of activating transcription at the establishment promoter. In contrast, Emc-Sc and Emc-Da complexes do not interact with DNA (DAVIS et al. 1990; VORONOVA and BALTIMORE 1990). No interaction has been reported between SisA-Dpn and DNA. Therefore, Emc-Sc, Emc-Da, and SisA-Dpn are treated as inactive complexes that titrate Sc, SisA, Da, and Dpn and prevent the formation of the transcription factors mentioned above. Formation of the Emc-Sc complex impedes the activation of SxlPe as it sequesters Sc. In contrast, formation of the SisA-Dpn complex has a dual effect: it simultaneously favors activation and repression of SxlPe since it reduces the amount of both free SisA and free Dpn available for the formation of activator (SisA-Da) and repressor (Dpn-Dpn). The potential formation of Da-Da and Da-Emc complexes is not taken into account as it merely reduces in a non-sex-specific manner the amount of free Da and Emc. Table 1 contains the symbols used to denote single proteins and protein complexes, together with the supposed function of the individual complexes on the activation of SxlPe.


 

 
Table 1. Proteins and protein complexes of the X/A ratio signal that are included in the model

Through the formation of inactive complexes, the primary signal realizes a balance between the gene products encoded by the X chromosome(s) and the autosomes. Because autosomal zygotic gene products and maternal gene products are present in equal amount in the two sexes, they have no discriminative power in sex determination. The sole difference between males (1X;2A) and females (2X;2A) is the number of X chromosomes and thus the dosage of the X-linked gene products. A decade ago, it was hypothesized that the competitive formation of positive and negative regulatory complexes could lead to a higher sequestration of activator molecules in males than in females and could thereby amplify the male-female differences (PARKHURST et al. 1990; PARKHURST and ISH-HOROWICZ 1992). By modeling the X/A signal formation, we were able to quantitatively estimate the amplification resulting from sequestration events.
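The sequestration hypothesis can be illustrated with a deliberately reduced calculation: a single activator that binds reversibly, one to one, to a single inhibitor. This is only a sketch and not the model of this article; the function name `free_activator`, the concentrations, and the dissociation constant are illustrative assumptions in arbitrary units.

```python
import math


def free_activator(total_act, total_inh, kd):
    """Free activator left after reversible 1:1 binding to an inhibitor.

    All arguments are hypothetical values in arbitrary concentration units.
    The complex concentration C solves C^2 - (A + I + Kd) C + A I = 0.
    """
    b = total_act + total_inh + kd
    # The smaller root is the physical one (C cannot exceed A or I).
    complex_conc = (b - math.sqrt(b * b - 4.0 * total_act * total_inh)) / 2.0
    return total_act - complex_conc


# One dose of activator ("male") versus two doses ("female"),
# with the same amount of inhibitor in both cases.
male_free = free_activator(1.0, 1.0, 0.05)
female_free = free_activator(2.0, 1.0, 0.05)
ratio = female_free / male_free
print(f"free activator, male: {male_free:.3f}, female: {female_free:.3f}")
print(f"female/male ratio of free activator: {ratio:.2f}")
```

With tight binding, the inhibitor removes a larger fraction of the activator when the activator is scarce, so the ratio of free activator exceeds the twofold input ratio. How large the effect is depends entirely on the assumed amounts and affinity.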

Model assumptions: To model the formation of the X/A signal with ordinary differential equations, the following assumptions are made:

  1. The concentration of the X/A signal proteins is assumed to be high enough so that a description in terms of concentration can be applied.

  2. Since Sc, SisA, and Dpn are zygotically expressed shortly after fertilization, their production is modeled by constant influxes that are rapidly switched on and off, in agreement with the times of in vivo mRNA and protein appearance (cf. Table A22).

  3. Da and Emc are present in nonlimiting concentrations in the two sexes. Whether or not maternal da and emc mRNAs are still translated at the time early Sxl activation occurs remains unclear. The existence of translation, though, should not affect the system dynamics as long as the proteins are sufficiently abundant. This is supported by numerical tests (data not shown). For simplicity, translation of the maternal mRNAs is ignored and Da and Emc proteins are given as initial conditions.

  4. Individual proteins are degraded following first-order reactions.

  5. The production rate of Sc and SisA is roughly proportional to the number of gene doses. This has been shown experimentally (ERICKSON and CLINE 1993; DESHPANDE et al. 1995).

  6. Interactions between the protein pairs SisA::Dpn and SisA::Da lead to the formation of SisA-Da and SisA-Dpn heterodimers.

  7. The formation of protein complexes is assumed to be reversible. No experimental fact contradicts this hypothesis.

  8. Molecular dimers are assumed not to be degraded directly. This last hypothesis can be justified by the putative hiding of domains favoring degradation within oligomers. For instance, stabilization effects of oligomerization have been observed for spectrin assembly in erythroid development (LAZARIDES and WOODS 1989) and hyperthermophilic protein assemblies (VIEILLE and ZEIKUS 2001).

The reaction scheme for the formation of the X/A ratio signal is given in (1) and discussed in more detail in Appendix A:

(1)

In (1), the three dots symbolize the degradation pathway. In agreement with assumption 5, the influx rate of the X-linked gene products is equal to a production rate per gene dose (denoted as F) multiplied by the number of gene doses (denoted as χ). For wild-type flies, χ_p is equal to the number of autosomal sets while χ_a and χ_s are equal to the number of X chromosomes. As only the number of X-linked genes differs between males (1X;2A) and females (2X;2A), the initial X/A ratio difference corresponds to a twofold difference in the production flux of Sc and SisA.
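The assumptions listed above can be turned into a small mass-action sketch. The code below is not the system (A1) of Appendix A: it simply assumes dose-proportional influxes for Sc, SisA, and Dpn, first-order degradation of monomers, reversible formation of Sc-Da, SisA-Da, Dpn-Dpn, Emc-Sc, and SisA-Dpn, and no degradation of dimers. All rate constants and initial amounts in `DEMO` are placeholders in arbitrary units, one shared association and dissociation constant is used for every dimer, and the forward Euler integrator is chosen only to keep the example dependency-free.

```python
# Placeholder parameters in arbitrary units; NOT the Appendix A values.
DEMO = {"F": 1.0, "deg": 0.01, "k": 0.01, "l": 0.1, "d0": 500.0, "e0": 100.0}


def rhs(y, doses, prm):
    """Time derivatives for 5 monomers and 5 dimers under mass-action kinetics."""
    s, a, p, d, e, sd, ad, p2, es, ap = y
    k, l, deg, F = prm["k"], prm["l"], prm["deg"], prm["F"]
    # Net formation rate of each dimer (association minus dissociation).
    v_sd = k * s * d - l * sd   # Sc + Da     <-> Sc-Da    (activator)
    v_ad = k * a * d - l * ad   # SisA + Da   <-> SisA-Da  (activator)
    v_p2 = k * p * p - l * p2   # Dpn + Dpn   <-> Dpn-Dpn  (repressor)
    v_es = k * e * s - l * es   # Emc + Sc    <-> Emc-Sc   (inactive)
    v_ap = k * a * p - l * ap   # SisA + Dpn  <-> SisA-Dpn (inactive)
    return [
        F * doses["sc"] - deg * s - v_sd - v_es,
        F * doses["sisA"] - deg * a - v_ad - v_ap,
        F * doses["dpn"] - deg * p - 2.0 * v_p2 - v_ap,
        -deg * d - v_sd - v_ad,   # Da: maternal, no production
        -deg * e - v_es,          # Emc: maternal, no production
        v_sd, v_ad, v_p2, v_es, v_ap,   # dimers are not degraded
    ]


def integrate(doses, prm=DEMO, t_end=40.0, dt=0.01):
    """Forward Euler integration from t = 0 to t_end (arbitrary time units)."""
    y = [0.0, 0.0, 0.0, prm["d0"], prm["e0"], 0.0, 0.0, 0.0, 0.0, 0.0]
    for _ in range(int(round(t_end / dt))):
        dy = rhs(y, doses, prm)
        y = [max(yi + dt * dyi, 0.0) for yi, dyi in zip(y, dy)]
    return y


def activators(y):
    return y[5] + y[6]   # Sc-Da + SisA-Da


def repressor(y):
    return y[7]          # Dpn-Dpn


female = integrate({"sc": 2, "sisA": 2, "dpn": 2})   # 2X;2A
male = integrate({"sc": 1, "sisA": 1, "dpn": 2})     # 1X;2A
print("activators female/male:", activators(female) / activators(male))
print("repressor  female/male:", repressor(female) / repressor(male))
```

Only the gene doses differ between the two runs. With these placeholder values the sketch should reproduce the qualitative direction described in the text (more activator in females, more repressor in males); the magnitudes are not meaningful.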

Due to the lack of experimentally measured kinetic data, parameter values are either inferred from known kinetic constants of homologous proteins or deduced theoretically as explained in Appendix A. Although the exact amounts of proteins forming the X/A signal have never been precisely quantified over time, it has been experimentally observed that: (i) sc and sis-a mRNAs start being substantially transcribed during nuclear cycle 11; (ii) substantial amounts of dpn mRNAs are not detectable before cycle 11; and (iii) the production of Sc and SisA proteins correlates with the activation of early Sxl transcription in females (see Table A22 for details). For simplicity, it is assumed that Sc, SisA, and Dpn proteins appear in the nucleus simultaneously. The initial concentration of Da and Emc was chosen so that both proteins remain in nonlimiting concentrations throughout the X/A signal assessment.

Steady-state analysis: The kinetic equation system corresponding to the reaction scheme (1) consists of 10 ordinary differential equations (ODEs) given by (A1) in Appendix A together with their steady-state solutions. Under the aforementioned biochemical assumptions of the model, the following results are derived:

  1. The putative lack of production of Da and Emc implies that both proteins are absent at steady state and therefore the amount of SisA-Da (ad) and Sc-Da (sd) activator complexes is nil at steady state. Consequently, the steady-state composition of the X/A signal hinders the expression of Sxl at the establishment promoter in both males and females.

  2. The steady-state concentration of repressor Dpn-Dpn (p2) complex is not a function of SisA flux (Fa) or SisA degradation (da). As a consequence, the amount of repressor complexes (p2) is not influenced by the formation of the sequestration complex SisA-Dpn (ap) at steady state.

  3. Numerical simulations show, for the realistic parameter set chosen, that the time needed for the system to relax toward steady state largely exceeds the developmental time window during which the X/A signal is formed and activates Sxl. It can be concluded that the X/A signal is not at steady state when it governs the transcriptional control of early Sxl.

Numerical simulations: To analyze the system outside steady state, the time evolution of the protein concentration is computed numerically by integrating the ODE system (A1) with the parameter set and initial conditions presented in Appendix A. Results are shown in Fig 2. Activator SisA-Da (Fig 2A), Sc-Da (Fig 2B), and the repressor Dpn-Dpn (Fig 2C) complexes are present in significant amounts in both males and females. The amounts of activators are higher in females than in males, whereas the amount of repressor is higher in males than in females. The primary signal is sensed through the relative amount of activators (Sc-Da and SisA-Da) vs. repressor (Dpn-Dpn). Interestingly, scanning parameter space suggests that an absence of activator in males and/or an absence of repressor in females are not achievable for parameter sets that are realistic biologically (data not shown).



Figure 2. Time evolution of the number of proteins in females (solid curve) and males (dashed curve) for (A) SisA-Da, (B) Sc-Da, and (C) Dpn-Dpn. Numerical simulations of the ODE system (A1) were performed with the parameter set given in Appendix A and by using the free software XPPAUT (http://www.math.pitt.edu/~bard/xpp/xpp.html).

To quantify to what extent the twofold difference in X chromosome dosage is amplified by the reduction of the effective concentration of numerator factors through the formation of inactive complexes, the female-to-male ratio of the number of activator complexes is computed over time. The same is done for the ratio of repressor complexes. Fig 2 shows that, for both the activators and the repressor, maximum amplification is reached 40 min after the X/A ratio genes are expressed. Around that stage, the activator and repressor ratios are roughly equal to 3 and 1/3, respectively:

$$\frac{A_{\text{female}}}{A_{\text{male}}} \approx 3, \qquad \frac{R_{\text{female}}}{R_{\text{male}}} \approx \frac{1}{3}$$

In conclusion, our model of X/A signal formation shows that the existence of the sequestering complexes SisA-Dpn and Sc-Emc leads to a significant but moderate four- to fivefold amplification of the initial difference in X-linked (numerator) products between males and females. Moreover, the complete absence of activators in males and/or of repressors in females is unlikely to be at the origin of the female-specific early activation of Sxl.
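One way to relate the four- to fivefold figure to the two ratios quoted above is the following arithmetic. It assumes, as stated earlier, that the signal is sensed through the relative amount of activators vs. repressor, and it compares the resulting female-to-male contrast with the initial twofold dosage difference; this reading is an interpretation and is not spelled out in the text.

```latex
\frac{(A/R)_{\text{female}}}{(A/R)_{\text{male}}}
  = \frac{A_{\text{female}}}{A_{\text{male}}} \cdot \frac{R_{\text{male}}}{R_{\text{female}}}
  \approx 3 \times 3 = 9,
\qquad
\frac{9}{2} = 4.5
```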

Activation of Sxl by the X/A signal
The molecular structure of the establishment promoter SxlPe has been the object of different experimental studies (ESTES et al. 1995; HOSHIJIMA et al. 1995; YANG et al. 2001). It has been shown in D. melanogaster that a fragment of 1400 bp upstream of the SxlPe transcription initiation site contains all the cis-acting sequences required for the control of early Sxl expression by the X/A signal (cf. Fig 3A). Several canonical and noncanonical nucleotide sequences, called E-boxes (MASSARI and MURRE 2000), lie within this region and show a wide range of DNA-binding affinities for the Sc-Da complexes (YANG et al. 2001). The E-boxes can be roughly clustered in two regions with distinct average affinities. Deletion experiments suggest that the upstream distal region (-1.4 to -0.8 kbp) is not essential for sex specificity, although it groups high or moderate affinity sites. In contrast, the proximal region (-390 bp to the transcription initiation site) is necessary and sufficient to induce transcription (in the presence of a sufficient amount of activators). The ostensible function of the distal region is to enhance transcription once it has been initiated by the proximal region; control of transcription at SxlPe is exerted mainly through the proximal region (YANG et al. 2001).



Figure 3. (A) Establishment promoter (SxlPe) structure following YANG et al. 2001. The model identifies the establishment promoter with its proximal region. (B) Illustration of the regulatory mechanisms included in the transcription function Φ_rx(a, r). The left bracket (dark gray background) describes the binding of activator complexes to the independent E-boxes located upstream of the D-boxes. The right bracket (light gray background) describes the competitive binding of activator and repressor complexes to E-box 1 and the D-boxes. (C) Following the model assumptions, an example of an active configuration of SxlPe. (D) Example of an inactive configuration of SxlPe.

Binding sites controlled by the repressor are called D-boxes. Two D-boxes lie upstream and close to the transcription initiation site; they bind Dpn dimers (HOSHIJIMA et al. 1995; YANG et al. 2001). Because these sites lie close to the transcription start site, binding of the repressor to the D-boxes is very likely to inhibit the formation of the transcription machinery.

Gene regulation is a process of an intrinsically probabilistic nature (FIERING et al. 2000). Stochastic aspects of gene expression have been recently pointed out (ZLOKARNIK et al. 1998; HUME 2000; ELOWITZ et al. 2002). Several theoretical studies have highlighted important implications of noise and randomness in gene regulatory systems that are not predictable by deterministic models (KO 1991; ARKIN et al. 1998; COOK et al. 1998; MCADAMS and ARKIN 1999; KIERZEK et al. 2001). Transcriptional regulation at the establishment promoter is clearly subject to molecular noise, inasmuch as it takes place at only one promoter composed of a few regulatory sites. Moreover, recent observations suggest that the activation of SxlPe occurs independently on each X chromosome (ERICKSON and CLINE 1998).

Increasing evidence supports the view that enhancers/repressors stochastically regulate the probability that transcription occurs (WALTERS et al. 1995; FIERING et al. 2000; HUME 2000). Accordingly, in our modeling we consider that the activity of a promoter is described in terms of "on" states that allow transcription and "off" states that do not allow transcription. The dynamics of the transitions between the on and off states of the promoter are determined by the binding and unbinding of the transcriptional enhancers/repressors, and the rate of transcription is a function of the fraction of time that the promoter spends in its on states. We have therefore developed a probabilistic model for the control of SxlPe activity by the X/A signal that takes into account the various molecular characteristics of the promoter.

Model assumptions:

  1. Since only the proximal region (-390 bp to the transcription initiation site) of the establishment promoter is necessary and sufficient to induce transcription, the model identifies the establishment promoter with this proximal region, which also contains the D-boxes (see Fig 3B).

  2. Cooperativity does not exist among D-boxes (WINSTON et al. 1999). As no evidence supports the existence of cooperativity among E-boxes, E-boxes are also considered to be independent.

  3. Because the first E-box is located adjacent to the D-boxes and the transcription initiation site (YANG et al. 2001), it is treated differently from the other E-boxes. Steric hindrance is assumed to exist between E-box 1 and the two D-boxes, since the activator and repressor molecules are relatively large (X/A gene products are made up of ~250 amino acids). This would lead to a competition for the binding of activators and repressors to their respective regulatory sites. We thus assume that the binding of any D-box prevents the subsequent binding of E-box 1 and vice versa (see Fig 3, C and D).

  4. It has been demonstrated that mutations in only one of the E-boxes do not prevent expression at the establishment promoter SxlPe. Combinations of several mutated E-boxes, however, can substantially reduce the promoter activity (YANG et al. 2001). These results suggest that a minimum number of E-boxes must be occupied to efficiently activate SxlPe. We assume that transcription is induced if and only if no D-box is occupied and at least six E-boxes (including E-box 1) are occupied (cf. Fig 3, C and D).

  5. Corepression mechanisms are ignored, as they are mediated by a maternal factor present in the same concentration in males and females (e.g., role of Groucho; DAWSON et al. 1995).

  6. The dissociation constant Kr = lr/kr of the reaction between Dpn-Dpn and the D-boxes,

    $$\text{Dpn-Dpn} + \text{D-box} \;\underset{l_r}{\overset{k_r}{\rightleftharpoons}}\; \text{Dpn-Dpn::D-box} \qquad (2)$$

    has been experimentally estimated to be 2.6 nM (WINSTON et al. 1999).

  7. For simplicity, the affinities of the E-boxes are chosen to be equal. The measured dissociation constant Kr for the reaction between D-boxes and repressor complexes is one to two orders of magnitude lower than the values reported for most of the bHLH proteins that bind DNA. The dissociation constant Ka = la/ka for the binding of the E-boxes is thus expected to be higher than Kr:

    $$\text{Sc-Da} + \text{E-box} \;\underset{l_a}{\overset{k_a}{\rightleftharpoons}}\; \text{Sc-Da::E-box} \qquad (3)$$

    We arbitrarily set Ka = 50 nM so that it is larger than the dissociation constants of Dpn-Dpn and of the bHLH protein E47 and smaller than the dissociation constant of MyoD (experimental values of dissociation constants for bHLH proteins are given in SUN and BALTIMORE 1991).

  8. It is assumed that Sc-Da and the putative SisA-Da complexes bind to the same regulatory sites and have similar effects on transcription. From now on, the numbers of activator complexes Sc-Da and SisA-Da are combined and denoted as A (activator acting upon the establishment promoter). The number of repressor complexes is denoted as R (repressor acting upon the establishment promoter).

  9. The number of activator and repressor molecules is assumed to be sufficiently large to systematically neglect the number of molecules bound to the promoter in comparison to the number of molecules in free solution. Furthermore, numerical simulations support the idea that the formation of the X/A signal evolves on a timescale slower than the dynamics of the interactions between the promoter and the regulatory factors (data not shown).

  10. For simplicity, association and dissociation events of the transcription factors are modeled as Poisson processes, so that the times separating successive events are exponentially distributed random variables.
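The stochastic binding assumption can be illustrated with a minimal sketch of a single regulatory site that alternates between free and bound states with exponentially distributed dwell times. The function name `occupied_fraction` and the rate values are hypothetical and chosen only for illustration; they are not parameters of the model. Under these assumptions, the simulated fraction of time spent in the bound state should approach k_on/(k_on + k_off), where k_on stands for the effective association rate (ka times the number of free molecules) and k_off for the dissociation rate.

```python
import random


def occupied_fraction(k_on, k_off, n_events=200_000, seed=0):
    """Estimate the fraction of time a single site is occupied.

    k_on  : effective association rate (e.g. ka * A), hypothetical value
    k_off : dissociation rate (e.g. la), hypothetical value
    Dwell times in each state are drawn from an exponential distribution,
    which is the waiting-time law of a Poisson process.
    """
    rng = random.Random(seed)
    bound = False
    time_bound = 0.0
    time_total = 0.0
    for _ in range(n_events):
        # The rate of leaving the current state sets the mean dwell time.
        rate = k_off if bound else k_on
        dwell = rng.expovariate(rate)
        if bound:
            time_bound += dwell
        time_total += dwell
        bound = not bound  # only one site, so every event flips its state
    return time_bound / time_total


estimate = occupied_fraction(k_on=2.0, k_off=1.0)
expected = 2.0 / (2.0 + 1.0)
print(f"simulated: {estimate:.3f}  expected: {expected:.3f}")
```

The same long-run fraction is what the steady-state probabilities of the Markov chain formalism provide directly, without simulation.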

Probabilistic model for the transcriptional regulation of a single gene: Let us define the configuration of a promoter as the state of occupancy of all its individual binding sites. The effects of site deletions suggest that promoter configurations can be clustered into the active ones able to induce transcription and the inactive ones that show very little transcriptional induction. As a first approximation, it is sound to assume that the transcription rate is proportional to the average fraction of time the promoter spends in its active configurations. We aim to justify this statement and calculate the average fraction of time the promoter is active. Below, we briefly outline the basic concepts underlying the methodology used in the model.

Given the structure of a promoter, configurations can be symbolized by Boolean vectors where each bit encodes the state of occupancy of a particular binding site (by convention, an occupied site is denoted as one and an unoccupied site as zero). If we suppose that the probability that two binding sites undergo a change at the same time is negligible, configuration changes can be viewed as transitions between Boolean vectors where only one bit is allowed to flip at each change. Given that the promoter can be in a finite number of states (at most 2^n, where n is the number of binding sites), and assuming that association and dissociation of activators and repressors occur following a stochastic process, transitions between configurations can be formalized as a continuous-time Markov chain. As we shall see, Markov chain theory provides a useful framework to calculate the probability of being in any promoter configuration at any time. In this formalism, the vector containing all the configuration probabilities obeys a master equation that can be solved analytically for small and rather simple systems (MCQUARRIE 1967) or numerically in every case (GILLESPIE 1977, 1992). Hence, probabilities instead of concentrations are handled, thus avoiding the pitfall of having to define "configuration concentrations" for a single gene system.

On the basis of the above assumptions 2 and 3, we divide the establishment promoter into two functionally independent domains: the upstream domain contains the independent E-boxes 2–7 and the downstream domain contains E-box 1 and the two D-boxes (cf. Fig 3B). Analysis of the regulatory characteristics of the two domains is done separately in the following two paragraphs.

Competitive binding for D-boxes and E-box 1 (downstream domain): Let us represent the state of the domain that contains E-box 1 and the D-boxes by a Boolean vector [ε1, δ1, δ2], where ε1 denotes the state of occupancy of E-box 1 and δi the state of occupancy of the ith D-box. As mentioned above, ε1 and δi ∈ {0, 1}. In theory the system admits 2^3 = 8 different states. Nevertheless, competitive binding (assumption 3) allows only 5 of them (denoted as G, GE, GD1, GD2, and GD1D2). In the current model, inhibition of promoter activation is restricted to short-range effects occurring through steric hindrance between E-box 1 and the D-boxes (assumption 3). Even though the requirement of E-box 1 for the operation of SxlPe remains unclear (YANG et al. 2001), transcription activation is assumed to occur, in our model, if and only if E-box 1 is occupied by the activators, implying that the two D-boxes are empty of repressors.
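As an illustration of the counting argument above, the following sketch enumerates the Boolean occupancy vectors of the downstream domain and discards those excluded by steric hindrance. The function names are hypothetical; the exclusion rule is the one stated in assumption 3.

```python
from itertools import product


def allowed_downstream_states():
    """Return the (e1, d1, d2) occupancy vectors compatible with assumption 3."""
    states = []
    for e1, d1, d2 in product((0, 1), repeat=3):
        # E-box 1 and a D-box cannot be occupied at the same time.
        if e1 and (d1 or d2):
            continue
        states.append((e1, d1, d2))
    return states


def single_flip_neighbors(state, states):
    """Allowed states reachable from `state` by flipping exactly one bit."""
    return [s for s in states if sum(a != b for a, b in zip(state, s)) == 1]


states = allowed_downstream_states()
print(len(states), states)
# Transitions out of the empty promoter G = (0, 0, 0):
print(single_flip_neighbors((0, 0, 0), states))
```

The five surviving vectors correspond to G, GD2, GD1, GD1D2, and GE in the order produced by `itertools.product`, and the neighbor lists give the one-bit transitions of the Markov chain.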

The subsystem states are listed in Table 2 together with their notation and effect on transcription. Let us define the probability distribution vector as

$$\mathbf{P}(t) = \left(P_G,\; P_{GE},\; P_{GD1},\; P_{GD2},\; P_{GD1D2}\right),$$

where PG, PGE, ... denote the probability that the promoter is in configuration "G," "GE," ...


 

 
Table 2. Configurations of the promoter downstream domain

Let At be the number of activator molecules and Rt be the number of repressor molecules at time t. If At and Rt are changing slowly enough, they can be considered as transiently constant. Therefore, the explicit time dependence of At and Rt is suppressed in the rest of this paragraph. The binding of activator and repressor molecules to the promoter is modeled as a Poisson process (cf. assumption 10) with kinetic constants given in reactions (2) and (3). The probability that one activator molecule binds to E-box 1 during the infinitesimal time interval dt is thus equal to ka A dt. Similarly, the probability that the complex activator::E-box 1 dissociates during dt is equal to la dt. The same holds for the repressor with the on- and off-rates kr and lr. The dynamics of the vector P(t) are described by a system of ODEs called the master equation,

$$\frac{d\mathbf{P}(t)}{dt} = \mathbf{M}\,\mathbf{P}(t), \qquad (4)$$

where M represents a (5 × 5) matrix of transition probabilities per unit of time (for the expanded form of M, see Appendix B). Equation 4 can be solved analytically and numerically. The steady-state solution of (4) is given in Appendix B. It can be shown that the relaxation time of the chain is relatively short for the set of parameters chosen (on the order of a few seconds). For simplicity, it is then sufficient to focus on the calculation of the steady-state distribution Pst rather than calculating the time-dependent distribution P(t). As derived in Appendix B, the probability that the promoter domain is in the active state GE is

(5)

Relation (5) gives the average fraction of time that the promoter spends in state GE at steady state.
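A minimal numerical sketch of the master equation for the five downstream configurations is given below. It is not the derivation of Appendix B: the rate values are hypothetical, and the closed form used for comparison, P_GE = a / (a + (1 + r)^2) with a = A/Ka and r = R/Kr, is what detailed balance yields under the stated assumptions (two independent D-boxes of equal affinity, mutual exclusion with E-box 1). The probabilities are propagated by explicit Euler steps until they stop changing.

```python
def downstream_steady_state(act_on, act_off, rep_on, rep_off,
                            dt=0.01, n_steps=20_000):
    """Integrate dP/dt = M P for the states G, GE, GD1, GD2, GD1D2.

    act_on = ka * A, act_off = la, rep_on = kr * R, rep_off = lr
    (all numerical values used below are hypothetical).
    """
    states = ["G", "GE", "GD1", "GD2", "GD1D2"]
    # One-bit transitions allowed by competitive binding (assumption 3).
    rates = {
        ("G", "GE"): act_on, ("GE", "G"): act_off,
        ("G", "GD1"): rep_on, ("GD1", "G"): rep_off,
        ("G", "GD2"): rep_on, ("GD2", "G"): rep_off,
        ("GD1", "GD1D2"): rep_on, ("GD1D2", "GD1"): rep_off,
        ("GD2", "GD1D2"): rep_on, ("GD1D2", "GD2"): rep_off,
    }
    p = {s: 0.0 for s in states}
    p["G"] = 1.0  # start from the empty promoter
    for _ in range(n_steps):
        flow = {s: 0.0 for s in states}
        for (src, dst), rate in rates.items():
            moved = rate * p[src] * dt  # probability flux during dt
            flow[src] -= moved
            flow[dst] += moved
        for s in states:
            p[s] += flow[s]
    return p


def closed_form_p_ge(a, r):
    """Detailed-balance expression for P_GE (assumed form, see lead-in)."""
    return a / (a + (1.0 + r) ** 2)


p = downstream_steady_state(act_on=2.0, act_off=1.0, rep_on=0.5, rep_off=1.0)
print(p["GE"], closed_form_p_ge(a=2.0, r=0.5))
```

With these illustrative rates a = 2 and r = 0.5, so both numbers should agree closely, which is a quick consistency check on the transition list.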

Multiple E-box (upstream) domain: The upstream domain is made up of six independent E-boxes. Their location is assumed to not influence their role in promoting transcription. Since we are dealing with six E-boxes, the number of possible different configurations (2^6 = 64) is too large to enable us to proceed as we did for the downstream domain. This problem can be bypassed by clustering the states into equivalence classes that have the same number of occupied binding sites, whichever they are. Equivalence classes are defined in Table 3.


 

 
Table 3. Equivalence classes of the promoter upstream domain

According to assumption 4, the rate of transcription from SxlPe will depend on the active classes C5 and C6 solely. Following the methodology depicted in Appendix B, the steady-state probability that the promoter domain is active is given by

(6)
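The equivalence-class argument can be sketched numerically as follows. Assuming that each of the six upstream E-boxes is occupied independently with steady-state probability a/(1 + a), where a = A/Ka, the number of occupied sites follows a binomial distribution and the probability of the active classes C5 and C6 is a short sum. The function names are hypothetical, and this is only an illustration consistent with the stated assumptions, not the expression of Appendix B.

```python
from math import comb


def site_occupancy(a):
    """Steady-state occupancy of one independent E-box, with a = A / Ka."""
    return a / (1.0 + a)


def prob_at_least(k_min, n_sites, a):
    """Probability that at least k_min of n_sites independent sites are occupied."""
    p = site_occupancy(a)
    return sum(
        comb(n_sites, k) * p ** k * (1.0 - p) ** (n_sites - k)
        for k in range(k_min, n_sites + 1)
    )


# Classes C5 and C6 together: at least five of the six upstream E-boxes occupied.
print(prob_at_least(5, 6, a=1.0))  # 7/64 = 0.109375 when a = 1
```

For a = 1 each site is occupied half of the time, so the result reduces to (6 + 1)/2^6, which makes the steep dependence on a easy to see by trying larger values.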

Transcription rate as a function of activator and repressor molecules (full promoter analysis): The results of the two previous paragraphs can be combined to compute early Sxl transcription rate as a function of the number of activator and repressor molecules present in the system. A simple model for the transcriptional regulation of the full promoter can be constructed on the basis of relations (5) and (6). Let us denote the class of active configurations of the whole promoter as G* and the class containing the other nonactive configurations as G. To calculate the number of transcripts produced per unit of time, we need to estimate the number of transcription rounds induced per unit of time. Each transcription round starts with the successful engagement of the polymerase machinery. The binding of the polymerase machinery requires the promoter to be in state G*. Once the binding of the polymerase complex has occurred, a transcription round starts and ends up with the synthesis of an early mRNA molecule (denoted as rx). The reaction scheme (7) illustrates the process schematically:

(7)

In the reaction scheme, the production of an mRNA molecule and the activation/deactivation kinetics of the promoter are represented by different arrows to emphasize that the promoter is not transformed into rx but conditions the production of rx. The binding of the polymerase is modeled as a Poisson process of parameter kt. Under this scheme, transcription initiation is supposed to arise from the competition between the binding of the polymerase complex to the active promoter (i.e., when the promoter is in state G*) and the deactivation of the promoter from state G* to state G.

In a first approximation, the average number of early Sxl mRNAs produced per unit of time (Φrx) can be estimated as the fraction of time spent by the promoter in its active state G* divided by the average time required to induce a transcription round. Given the independence of the upstream and downstream domains, this fraction of time can be calculated as the probability of having E-box 1 and at least five other E-boxes occupied simultaneously, i.e.,

$$P_{G^*} = P_{GE}\,\left(P_{C5} + P_{C6}\right).$$

When the promoter is active, the average time separating two bindings of the polymerase is 1/kt (general property of a Poisson process). It then follows that

$$\Phi_{r_x} = k_t\, P_{G^*} = k_t\, P_{GE}\,\left(P_{C5} + P_{C6}\right). \qquad (8)$$

The numerical value of kt is fixed as discussed in Appendix B.
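The combination of the two domains can be sketched as a single function of the scaled variables a = A/Ka and r = R/Kr. The closed forms used for the downstream and upstream probabilities are the detailed-balance and binomial expressions that follow from the model assumptions (they are assumptions of this sketch, not quotations of relations 5 and 6), and the default `k_t = 1.0` is a placeholder rather than the value fixed in Appendix B.

```python
from math import comb


def p_downstream_active(a, r):
    """Probability of state GE: E-box 1 occupied, both D-boxes empty (assumed form)."""
    return a / (a + (1.0 + r) ** 2)


def p_upstream_active(a, n_sites=6, k_min=5):
    """Probability that at least k_min of the upstream E-boxes are occupied."""
    p = a / (1.0 + a)
    return sum(
        comb(n_sites, k) * p ** k * (1.0 - p) ** (n_sites - k)
        for k in range(k_min, n_sites + 1)
    )


def transcription_rate(a, r, k_t=1.0):
    """Average transcripts per unit time: k_t times the active fraction of time.

    k_t is a placeholder polymerase-binding rate (hypothetical value).
    """
    return k_t * p_downstream_active(a, r) * p_upstream_active(a)


for a, r in [(0.0, 0.0), (1.0, 0.0), (1.0, 5.0), (5.0, 0.0)]:
    print(a, r, transcription_rate(a, r))
```

In this sketch the rate vanishes only when a = 0, rises steeply with a, and falls as r increases, which matches the qualitative behavior described for the establishment promoter.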

Sex-specific promoter activation: At this stage, it is useful to introduce the scaled variables a = A/Ka for the activator and r = R/Kr for the repressor [where Ka and Kr represent the dissociation constants of reactions (3) and (2), respectively]. Fig 4 displays two different graphs of Φrx as a function of the variables a and r. For the range of values considered in Fig 4B, we observe that the response of the transcription rate Φ(a, r) to increases in the amount of activators at a fixed amount of repressor is either very weak, when the amount of activators is low, or almost linear, when the amount of activators is sufficiently high (cf. Fig 4B). This behavior resembles a threshold phenomenon where the activator leaves transcription off until it reaches a certain value and then induces transcription, though it is less pronounced. The contour plot of Φ(A, R) in the plane (A, R) is presented in Fig 5A. Fig 5A suggests that two conditions need to be simultaneously fulfilled to induce transcription efficiently: (i) the amount of activators must be sufficiently high and (ii) the amount of repressor must be sufficiently low. On this basis, the plane (A, R) can be separated into four qualitatively distinct quadrants (cf. Fig 5A). The borderline between quadrants is arbitrarily placed, as no obvious threshold values can be defined from (8).

These results suggest that female fate is determined in quadrant IV whereas male fate is totally secured in quadrant I. Numerical simulations confirm this idea.

In Fig 5B, the time trajectories of male and female are plotted in the (A, R) plane after numerical integration of the model equations. The female trajectory (white curve) visits quadrant IV for a relatively long time while the male one (red curve) remains in quadrants I and II, mainly visiting quadrant I. These trajectory differences reflect the fact that the activator/repressor ratio is constantly higher in females than in males. Accordingly, the number of early Sxl proteins produced in males and females differs dramatically. Consider the number of early Sxl proteins present in the system just before the maintenance promoter is constitutively turned on (i.e., at time t = 12,000 sec in our model; see next section). From the simulated trajectories depicted in Fig 5B, the female/male ratio of this number is estimated to be 80. We conclude that the transcriptional control of the establishment promoter by the X/A signal leads to almost a 100-fold difference in the number of early Sxl proteins present in the two sexes. We observe that the amount of early Sxl proteins is not nil in males, even though it is low compared to the amount produced in females. From relation (8) and Fig 5A, we learn that the establishment promoter is fully off only if the amount of activators is nil.

In summary, the net activation of the establishment promoter by the X/A signal depends on the relative amounts of both activator A and repressor R molecules. An amplification effect of Sxl transcription exists in females vs. males that results in the production of substantially more early Sxl proteins in females than in males.

Establishment of Sxl autoregulation
Differences between the transcripts derived from the establishment promoter SxlPe and the maintenance promoter SxlPm are due mainly to usage of different promoters and alternative splicing. The early Sxl primary transcripts originating at SxlPe follow a fixed splicing pattern where the late exon L2 and the male-specific exon L3 are not included and the early specific exon E1 is directly spliced to exon L4. Exon L4 and the exons downstream from it are present in both early and late Sxl mRNAs. Splicing of the early Sxl primary transcripts is constitutive and does not require Sxl protein (HORABIN and SCHEDL 1996), whereas the presence of Sxl protein is necessary to drive the female-specific splicing of the late Sxl primary transcripts originating at SxlPm (i.e., skipping of the male-specific exon L3).

The action of early Sxl proteins on the splicing of the late transcripts is essential for the establishment of Sxl autoregulatory function. The mechanism by which Sxl protein controls the skipping of exon L3 is not totally understood. Notwithstanding, it has been observed that Sxl cooperatively binds to the late transcripts at several poly(U) sequences (WANG and BELL 1994; KANAAR et al. 1995). The present model aims to quantitatively assess the conditions that switch on the state of the positive feedback loop controlling the splicing pattern of the late Sxl primary transcripts.

Model assumptions:

  1. The different subunits of the splicing machinery are not explicitly modeled. Only Sxl protein is taken into account since it is sex-specifically produced before SxlPm becomes active.

  2. Early and late Sxl proteins differ exclusively in their N-terminal exons encoding two dozen amino acids (KEYES et al. 1992). It is assumed that these differences do not alter the RNA-binding properties of the proteins.

  3. Sxl proteins cooperatively bind to the transcript splicing sites and form homodimers (SAKAMOTO et al. 1992; WANG and BELL 1994; WANG et al. 1997; SAMUELS et al. 1998). Since the formation of dimers is important for the stabilization of the splicing factors on the primary transcripts (WANG et al. 1997; SAMUELS et al. 1998), it is assumed that alternative splicing is mediated by dimers. The putative assembly of higher-order Sxl complexes is not considered. In the absence of clear evidence that strong Sxl::Sxl interactions occur in vivo, we furthermore consider that homodimerization of the proteins does not occur in free solution. However, all our results would hold qualitatively when Sxl dimerization in free solution is included in the model.

  4. Splicing is regulated by the synergistic action of Sxl dimers on multiple sites lying upstream and downstream of exon L3 (SAKAMOTO et al. 1992; WANG and BELL 1994; PENALVA et al. 1996). For simplicity, only one splicing site is taken into account and thus synergistic effects are not considered.

  5. In Drosophila, dosage compensation is triggered by the presence or the absence of Sxl protein (reviewed in LUCCHESI 1996). We assume that the initial transcription from the maintenance promoter SxlPm is not dosage compensated. Since the Sxl gene is located on the X chromosome, the production of late Sxl primary transcripts is supposed to be twice as high in females (XX) as in males (XY).

Reaction mechanism: Let us denote Sxl proteins as x, Sxl primary transcripts as h (h for late heterogeneous nuclear RNA), and Sxl mRNA spliced in its female mode as rx. Let us define the state space of the system as Ω = {(x, h, hx, hx2, rx), where x, h, hx, hx2, and rx ∈ ℝ+}. A particular state of the system represents a point in Ω. The reaction scheme below displays the reaction mechanisms of the model:

(9a)


(9b)


(9c)


(9d)

The three dots in (9a) and (9d) symbolize the degradation pathway. On the basis of assumption 5, reaction (9a) represents the production of the primary transcript h following a constant influx χx·Fh, where χx denotes the number of Sxl gene copies and Fh the production rate of transcripts per gene. Reaction (9b) describes the binding of a Sxl monomer to the "naked" primary transcripts. The second binding of Sxl to the primary transcripts is represented by the reversible reaction on the left of (9c). The irreversible reaction on the right of (9c) represents the splicing step where exon L3 is removed from h.Sxl2 so that it becomes a messenger RNA (rx). The splicing step is accompanied by the release of two Sxl monomers. Reaction (9d) describes the production of Sxl protein. The instantaneous translation rate of Sxl mRNA is set equal to a constant rate ρtsl per transcript. Further information about the kinetic scheme is given in Appendix C with the corresponding ODE system.
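For orientation, the verbal description above can be transcribed into a reaction scheme as follows. This is only a sketch assembled from the text: the labels on the binding and splicing arrows are omitted because their names are defined in Appendix C, the degradation rate d_h is the one mentioned later for the primary transcripts, and the placement of the degradation dots in the last line (on both the mRNA and the protein) is an assumption.

```latex
\begin{align}
\varnothing \xrightarrow{\;\chi_x F_h\;} h &\xrightarrow{\;d_h\;} \cdots \tag{9a}\\
x + h &\rightleftharpoons hx \tag{9b}\\
x + hx &\rightleftharpoons hx_2 \longrightarrow r_x + 2x \tag{9c}\\
r_x &\xrightarrow{\;\rho_{tsl}\;} r_x + x,
  \qquad r_x \rightarrow \cdots, \quad x \rightarrow \cdots \tag{9d}
\end{align}
```

Read this way, the only source of new protein is the spliced mRNA, and the spliced mRNA can appear only when two Sxl monomers are already bound to a primary transcript, which is the origin of the positive feedback discussed next.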

Generic properties of the alternative splicing mechanism: As shown in Appendix C, the kinetic equation (C1) describing the reaction scheme (9a–9d) admits two stable steady states (z0 and z+) and one unstable steady state (z-). These steady states are points within the five-dimensional space Ω with their respective number of Sxl protein x such that

(10)

Except for the unique unstable steady state z-, all the points of the state space Ω dynamically evolve toward either z0 or z+. The set of points that tends to one particular steady state constitutes its so-called basin of attraction. Reaction scheme (9a–9d) is characterized by the existence of two antagonistic trends that compete for the control of Sxl production. On the one hand, Sxl protein concentration tends to decrease as the protein and its (functional) mRNA undergo a natural turnover. Once the Sxl protein concentration has fallen to zero, it remains nil as the female-specific splicing cannot be achieved anymore. On the other hand, the concentration of functional protein increases each time a primary transcript is spliced and subsequently translated. The initial abundance of Sxl protein determines which trend is the stronger. The two trends pull the system toward two different steady states, one with Sxl protein concentration equal to zero and another where the Sxl concentration is high and balances production and degradation.
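The competition between the two trends can be illustrated with a small mass-action sketch of the splicing loop, integrated by explicit Euler steps. All rate constants and their names below are hypothetical and chosen only so that the two outcomes are easy to see; they are not the parameters of Appendix C, so the resulting protein levels and threshold do not correspond to the values reported for the model. Degradation is assumed to act on the free primary transcript, the spliced mRNA, and the free protein.

```python
def simulate_sxl(x0, n_steps=30_000, dt=0.01,
                 influx=1.0, d_h=1.0, k1=0.1, l1=1.0, k2=0.1, l2=1.0,
                 k_splice=1.0, d_r=0.1, d_x=0.1, rho=1.0):
    """Integrate a mass-action version of the splicing loop from (x0, 0, 0, 0, 0).

    x: free Sxl protein, h: primary transcript, hx / hx2: transcript with one or
    two bound Sxl monomers, r: female-spliced mRNA. Every rate constant is a
    hypothetical illustration value.
    """
    x, h, hx, hx2, r = x0, 0.0, 0.0, 0.0, 0.0
    for _ in range(n_steps):
        bind1 = k1 * x * h - l1 * hx      # net first binding
        bind2 = k2 * x * hx - l2 * hx2    # net second binding
        splice = k_splice * hx2           # splicing releases two monomers
        dx = rho * r - d_x * x - bind1 - bind2 + 2.0 * splice
        dh = influx - d_h * h - bind1
        dhx = bind1 - bind2
        dhx2 = bind2 - splice
        dr = splice - d_r * r
        x += dt * dx
        h += dt * dh
        hx += dt * dhx
        hx2 += dt * dhx2
        r += dt * dr
    return x


low_start = simulate_sxl(x0=0.1)    # little early protein: loop stays off
high_start = simulate_sxl(x0=50.0)  # ample early protein: loop locks on
print(low_start, high_start)
```

With these illustrative parameters the small initial amount decays toward zero whereas the large one settles at a self-sustained nonzero level, so the final state depends only on which side of a threshold the initial protein amount lies.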

Among the points belonging to Ω, we are interested in the subset that corresponds to the initial conditions of the system just before alternative splicing starts, i.e., all the states of the form (x, 0, 0, 0, 0) ∈ Ω, where x denotes the concentration of early Sxl protein present in the system when the constitutive production of late Sxl primary transcripts is about to be launched. Let us denote the projections of the basins of attraction of z0 and z+ on the subspace (ℝ+, 0, 0, 0, 0) as B(0) and B(x+), respectively. As illustrated in Fig 6A, B(0) and B(x+) are separated by a threshold value ζ (not to be confused with x-). The value of ζ can be accurately computed by numerical simulations. As seen in relation (10), x+ is a function of the late Sxl production flux χx·Fh and differs between sexes since χx is twice as large in females as in males. Similarly, the value of ζ depends on χx·Fh as well; hence it is denoted as ζ♂ in males and ζ♀ in females. Numerical simulations enable us to estimate the values of ζ♂ and ζ♀ as 2700 and 1100 molecules, respectively.



 
Figure 6. (a) Illustration of the projection of the basins of attraction of states z0 and z+ on the subspace (ℝ+, 0, 0, 0, 0). White arrows over the axis indicate the dynamical trends of the states falling in the corresponding regions. (b) Typical dynamics of an initial condition (x, 0, 0, 0, 0) such that x < ζ. (c) Typical dynamics of an initial condition (x, 0, 0, 0, 0) such that x > ζ. Note that the value of ζ is different in males and females (see text).

In males, the system unavoidably evolves toward the complete absence of Sxl protein production when the initial number of early Sxl proteins is smaller than the threshold ζ♂, i.e., if it belongs to B(0) (cf. Fig 6B). In contrast, stable production of Sxl protein is ensured in females if the number of early Sxl proteins is larger than ζ♀, i.e., if it belongs to B(x+) (cf. Fig 6C). It can be hypothesized that fates will be robustly specified when the number of early Sxl proteins is significantly smaller than ζ♂ in males and significantly larger than ζ♀ in females. In this respect, it is interesting to note that the size of B(0) is larger for males than for females. Numerical simulations of the whole system show that the ratio of the number of early Sxl proteins to the threshold is 82/2700 = 0.03 in males and 6700/1100 = 6.1 in females.

The difference in the threshold value ζ in females and in males ensues from the assumption that the initial transcription at SxlPm is not dosage compensated. Although no experimental observation so far suggests that the single gene Sxl is hypertranscribed in males when the maintenance promoter becomes active, the gene run is already dosage compensated when the X/A ratio is measured (GERGEN 1987). The early dosage compensation mechanism of run being independent of the main dosage compensation pathway mediated by the male-specific lethal genes (BERNSTEIN and CLINE 1994), we have assumed that SxlPm transcription remains unaffected by dosage compensation during the establishment of Sxl autoregulation.

If it turns out that SxlPm transcription is dosage compensated from the start, the threshold value ζ, in the model, would become equal for males and females, due to X chromosome hypertranscription in males (χx = 2 for both males and females). The main conclusions of the model would, however, hold since the number of early Sxl proteins in males (82 molecules) would still be far below this threshold value (ζ♂ = ζ♀ = 1100).

In outline, we have formally shown that the positive feedback loop that is involved in the Sxl RNA splicing process can generate a bistable switch for both males and females. The state of this switch is triggered by the residual concentration of early Sxl protein present when SxlPm becomes active. This result is in agreement with the observation (CLINE 1980; KEYES et al. 1992; BERNSTEIN et al. 1995) that activation of the autoregulatory function of Sxl requires that the concentration of early Sxl protein be higher than a threshold value. Our model clarifies the nature of this threshold phenomenon. Indeed, for both males and females, a threshold separates the basin of attraction of a steady state where the concentration of Sxl protein is nil (male phenotype, lethal state for females) from the basin of attraction of a steady state where the concentration of Sxl protein remains nonzero in a self-sustained manner (female phenotype, lethal state for males; Fig 6A). This threshold is of a different nature than the thresholds usually involved in the transcriptional activation of a cooperative promoter (see, for instance, HERSCHLAG and JOHNSON 1993; CAREY 1998). In the absence of a positive feedback circuit, such thresholds do not imply the existence of bistability. Rather, they express that the activity of a given gene is low or high when the concentration of an activator is below or above a threshold value, respectively.

The model also provides insight into the relationship between the kinetic properties of the system (degradation rate of Sxl transcripts and Sxl protein, splicing efficiency, etc.) and the existence of a bistable behavior. Equation 10 or C3 shows, for instance, that when the degradation rate of the primary transcripts (dh) significantly increases, bistability is lost and the only remaining steady state is the zero state.


*  INTEGRATED MODEL