help button home button Biophys. J.
HOME HELP FEEDBACK SUBSCRIPTIONS ARCHIVE SEARCH TABLE OF CONTENTS

Originally published as Biophys J. BioFAST on November 12, 2004.
doi:10.1529/biophysj.104.050666
This Article
Right arrow Abstract Freely available
Right arrow Full Text (PDF)
Right arrow All Versions of this Article:
biophysj.104.050666v1
88/2/828    most recent
Right arrow Alert me when this article is cited
Right arrow Alert me if a correction is posted
Services
Right arrow Similar articles in this journal
Right arrow Similar articles in PubMed
Right arrow Alert me to new issues of the journal
Right arrow Download to citation manager
Right arrow reprints & permissions
Citing Articles
Right arrow Citing Articles via HighWire
Right arrow Citing Articles via Google Scholar
Google Scholar
Right arrow Articles by Walczak, A. M.
Right arrow Articles by Wolynes, P. G.
Right arrow Search for Related Content
PubMed
Right arrow PubMed Citation
Right arrow Articles by Walczak, A. M.
Right arrow Articles by Wolynes, P. G.
Biophysical Journal 88:828-850 (2005)
© 2005 The Biophysical Society

Self-Consistent Proteomic Field Theory of Stochastic Gene Switches

Aleksandra M. Walczak *, Masaki Sasai {dagger} and Peter G. Wolynes * {ddagger}

* Department of Physics, Center for Theoretical Biological Physics, University of California at San Diego, La Jolla, California; {dagger} Department of Complex Systems Science, Graduate School of Information Science, Nagoya University, Nagoya, Japan; and {ddagger} Department of Chemistry and Biochemistry, University of California at San Diego, La Jolla, California

Correspondence: Address reprint requests to Peter Wolynes, University of California at San Diego, Chemistry and Biochemistry, 9500 Gilman Drive, MC 0371, La Jolla, CA 92093-0371. Tel.: 858-822-4825; E-mail: pwolynes{at}ucsd.edu.


    ABSTRACT
 TOP
 ABSTRACT
 INTRODUCTION
 THE SELF-CONSISTENT PROTEOMIC...
 THE TOGGLE SWITCH
 THE SYMMETRIC TOGGLE SWITCH
 THE ASYMMETRIC TOGGLE SWITCH
 THE CASE WHEN PROTEINS...
 THE CASE WHEN PROTEINS...
 THE CASE WHEN PROTEINS...
 CONCLUSIONS
 APPENDIX A
 APPENDIX B
 ACKNOWLEDGEMENTS
 REFERENCES
 
We present a self-consistent field approximation approach to the problem of the genetic switch composed of two mutually repressing/activating genes. The protein and DNA state dynamics are treated stochastically and on an equal footing. In this approach the mean influence of the proteomic cloud created by one gene on the action of another is self-consistently computed. Within this approximation a broad range of stochastic genetic switches may be solved exactly in terms of finding the probability distribution and its moments. A much larger class of problems, such as genetic networks and cascades, also remain exactly solvable with this approximation. We discuss, in depth, certain specific types of basic switches used by biological systems and compare their behavior to the expectation for a deterministic switch.


    INTRODUCTION
 TOP
 ABSTRACT
 INTRODUCTION
 THE SELF-CONSISTENT PROTEOMIC...
 THE TOGGLE SWITCH
 THE SYMMETRIC TOGGLE SWITCH
 THE ASYMMETRIC TOGGLE SWITCH
 THE CASE WHEN PROTEINS...
 THE CASE WHEN PROTEINS...
 THE CASE WHEN PROTEINS...
 CONCLUSIONS
 APPENDIX A
 APPENDIX B
 ACKNOWLEDGEMENTS
 REFERENCES
 
Genetic switch systems are an elementary means of regulatory control present in every living organism. Their complexity and details differ, but the general mechanism, that of the expression of a given gene being regulated by proteins, is believed to be universal (Ptashne and Gann, 2002Go). They are building blocks of larger regulatory elements: genetic networks and signaling cascades. The pathways by which these systems operate are passed on from generation to generation. Understanding their stability and characteristics is therefore fundamental. A lot of previous work has considered a deterministic description of genetic switches (Ackers et al., 1982Go; Hasty et al., 2001). The need for a stochastic treatment of genetic switches, due to the single copy of the DNA molecule and multiple protein molecules in the cell, has been largely recognized (Sneppen and Aurell, 2002Go; Kepler and Elston, 2001Go).

The most general way of accounting for nondeterministic processes is to write down the master equation for a given system. To define the state of the switch, one must specify the DNA binding states of particular genes and the number of proteins of each type. The probability distribution for even a single switch consisting of two genes, the product proteins of which act as regulator proteins for the system, may not be determined exactly and approximations must be considered (Bialek, 2001Go; Hasty et al., 2000Go; Sneppen and Aurell, 2002Go).

Several approaches to account for the probabilistic nature of chemical reactions have been undertaken, ranging from the Langevin description of single genes (Bialek, 2001Go), and two interacting gene switches (Hasty et al., 2000Go), to the master equation reduced to Fokker-Planck equation considerations (Kepler and Elston, 2001Go; Hasty et al., 2001aGo). A dynamical action formulation has also been used (Sneppen and Aurell, 2002Go) to determine the lifetimes of states of the switch. A popular alternative to purely analytical methods, which often need to make approximations or are limited to very simple model systems, has been to conduct stochastic simulations of genetic switches. Two types of simulations are mostly used. In the first, the randomness of the system is introduced by means of a Monte Carlo algorithm with a fixed time step (Paulsson et al., 2000Go). The second is based on the Gillespie algorithm (Gillespie, 1977Go) to predict the probability of a given reaction occurring (Arkin et al., 1998Go). For single-gene systems, stochastic simulations have shown that the stochasticity in the system is responsible for the bimodal probability distributions (Cook et al., 1998Go) that have been experimentally observed. These methods prove very useful, because they allow us to test the theoretical predictions on model systems that might be hard to build experimentally. However, this approach often does not enable us to gain intuition or insight into the mechanisms behind the functioning of the system. The aim of the present work is to gain a better and deeper understanding of the device physics of genetic switches. We therefore, contrary to many important previous discussions (McAdams and Arkin, 1997Go; Aurell et al., 2002Go; Vilar et al., 2003Go), do not present a specific concrete biological system, but discuss generic behavior and try to understand its sources. Our approximation also allows for an exact solution of a broad class of genetic switch systems without any further assumptions and with little computational effort. Hasty et al. (2001b)Go present an overview of the existent theoretical approaches.

A popular approximation assumes that the DNA binding state reaches equilibrium much faster than the protein number state. Therefore the adiabatic approximation is often considered (Ackers et al., 1982Go; Sneppen and Aurell, 2002Go; Darling et al., 2000Go), allowing for a thermodynamic treatment (Ackers et al., 1982Go) of the DNA binding state. The protein number fluctuations are then treated stochastically. Even before the statistical thermodynamics approach of Ackers et al. (1982)Go using partition functions, much previous work assumed the DNA binding and unbinding can simply be accounted for by an equilibrium constant, since the relaxation timescales for equilibration of the DNA state are much larger than those of the protein numbers, which require protein synthesis and degradation to change. The partition function approach has also been successful for looking at logic gates built from switches (Buchler et al., 2003Go). The adiabatic approximation is believed to hold true in many cases, judging by the experimental parameters of biological switches (Darling et al., 2000Go). But as the experiments of, for example, Becskei et al. (2001)Go show, not all switches need to function within the adiabatic limit and the nonadiabatic limit may result in new phenomena. We therefore consider a wide range of parameter ratios in our discussion.

In this article we explore more fully an approximation, previously used by Sasai and Wolynes (2003)Go for the variational treatment of the problem, the self-consistent proteomic field (SCPF) approximation. Within this approximation one assumes that the probability of finding the switch in a given state is a product of the probabilities of states of individual genes. One can then solve the steady-state master equation for the probability distribution of many regulatory systems exactly. We discuss the approximation and present a detailed study of different classes of genetic switches, some of which have never been considered theoretically. We consider separately several particular features of such systems that are found in known switches, to be able to characterize their contributions to the behavior of the whole system. To be specific, starting from a symmetric toggle switch, we go on to compare the effects of multimer binding, and of the production of proteins in bursts, on the stability of the switch.

The stochastic effects prove to be modest for symmetric switches without bursts, especially if the genes have a basal production rate. We find the deterministic and stochastic SCPF solutions to have similar probabilities of particular genes to be on, and similar mean numbers of proteins of a given species in the cell. However, in the nonadiabatic limit, when the unbinding rate from the DNA is smaller than the death rate of proteins, the probability distributions have two well-defined peaks, unlike in the deterministic approximation or adiabatic limit of the stochastic SCPF solution.

We also show that the effect of stochasticity on the observables becomes more apparent when proteins are produced in bursts. In these types of switches, the definition of the adiabatic limit, which was clear for the switches in which proteins are produced separately, is no longer simple. Our discussion shows that the properties of genes often analyzed in the deterministic limit, may be strongly influenced by stochasticity in this case. Randomness in a biological reaction system leads to quantitative and, in many examples, even qualitative changes, from predictions of deterministic models.

We also discuss the differences in the behavior of asymmetric and symmetric switches. We point to the mechanisms resulting in different types of bifurcations and show how they are influenced by noise. Within the SCPF approximation, switches that are regulated by binding and unbinding of monomers do not have regions of bistability. This holds true for both symmetric and asymmetric switches. When proteins are produced individually rather than in bursts, fast unbinding from the DNA can effectively minimize the destructive effect of protein number fluctuations on the stability of the DNA binding state. Furthermore, a detailed analysis of the probability distributions shows that they have long tails, and are far from Poissonian in both the adiabatic and nonadiabatic limits. We discuss the properties of the system in terms of clouds of proteins buffering the DNA. We show how fast or slow DNA binding characteristics and protein number fluctuations influence the stability of the buffering clouds, leading to specific emergent behavior of observables. Throughout the article, a comparison is made between results of the exact stochastic solution, to solutions of the deterministic kinetic equations for the system within the self-consistent proteomic field approximation.

We establish a base of potential building blocks of more complicated switches and systems, such as networks and signaling cascades, for which an exact solution within the present approximation can also be obtained. A detailed discussion of these larger systems will be the topic of another article. We also present limitations of the present style of analysis where exact solutions are not possible.

There are two aims of this article. The first is to discuss the self-consistent field approximation and show that it has an exact solution that could be extended to a large class of systems. This approximation lets one deal in a straightforward and computationally inexpensive manner with the effect of random processes on genetic networks. The second is to discuss the many components of biological switches present in nature and in engineered systems, in the necessary stochastic framework.


    THE SELF-CONSISTENT PROTEOMIC FIELD APPROXIMATION
 TOP
 ABSTRACT
 INTRODUCTION
 THE SELF-CONSISTENT PROTEOMIC...
 THE TOGGLE SWITCH
 THE SYMMETRIC TOGGLE SWITCH
 THE ASYMMETRIC TOGGLE SWITCH
 THE CASE WHEN PROTEINS...
 THE CASE WHEN PROTEINS...
 THE CASE WHEN PROTEINS...
 CONCLUSIONS
 APPENDIX A
 APPENDIX B
 ACKNOWLEDGEMENTS
 REFERENCES
 
The basic mechanism of gene transcription regulation in prokaryotes may be reduced to the binding and unbinding of regulatory proteins, repressors, and activators, to the operator site of the DNA. If we use this simplified treatment, which neglects extra levels of regulation such as the binding of RNA polymerase, effectively each gene can be described as being either in an active (on) state, when the repressor is unbound (activator bound); or in an inactive (off) state, with the repressor bound (activator unbound). The stochastic system of a single gene and its product proteins is described by the joint probability distribution of the number of product proteins in the cell n, and the DNA binding-site state, as on (protein not bound) = 1; and off (protein bound) = 2. To conserve probability,

If one considers two interacting genes, the description in terms of a joint probability vector needs to be extended to four states: both genes may be on, or off; or one of the genes may be on, and the other off. If the two genes do not interact, as would be the case for two self-regulatory proteins, the probability of finding the two-gene system in a given state, defined by both the number of product proteins and the DNA binding-site state, would be the product of the states of particular genes Pjj'(n1,n2;t) = Pj(n1;t)Pj'(n2;t). This is generally not true for two interacting proteins, as is the case in a genetic switch. However, as a first approximation to the problem, one can ignore correlations between the spaces of the two genes and assume the space of the switch is a sum of spaces of the genes that compose it. Since we are looking for solutions in which the symmetry of the system is broken and different behaviors of the on- and off-state of a gene are possible, we must allow for different probability distribution functions for the on- and off-states. This is analogous to the unrestricted Hartree approximation in quantum mechanics, where allowing different spatial functions for spin-up and spin-down states results in breaking of the symmetry of the bound molecular orbital solution to the dissociated solution of two separate hydrogen atoms with opposite spin-states for large internuclear distances. We therefore allow for multiple solutions for a given set of parameters. The total probability of having a given gene state j and ni proteins of that type is simply given by Pj(ni,ni') = Pj,j'=0(ni,ni') + Pj,j'=1(ni,ni').

The self-consistent approximation is a crude one, since in the case of the genetic switch, the state of a given gene is often determined by the number of protein products of the other gene. However, within this approximation, one can solve the master equation for the probability distribution exactly, without any further approximations. This yields a powerful computational tool, which simultaneously gives useful insight.


    THE TOGGLE SWITCH
 TOP
 ABSTRACT
 INTRODUCTION
 THE SELF-CONSISTENT PROTEOMIC...
 THE TOGGLE SWITCH
 THE SYMMETRIC TOGGLE SWITCH
 THE ASYMMETRIC TOGGLE SWITCH
 THE CASE WHEN PROTEINS...
 THE CASE WHEN PROTEINS...
 THE CASE WHEN PROTEINS...
 CONCLUSIONS
 APPENDIX A
 APPENDIX B
 ACKNOWLEDGEMENTS
 REFERENCES
 
For clarity of exposition, we show how the problem may be solved exactly within the self-consistent proteomic field approximation on a well-defined system of the toggle switch. We then expand the method to apply to other systems. The elementary system we use as an example is composed of two genes, labeled 1 and 2, as presented in Fig. 1. Gene 1 produces proteins of type 1, which act as regulatory proteins, i.e., repressors, on gene 2. The product of gene 2, proteins of type 2, in turn repress gene 1. In this simplified model, we assume that protein production occurs instantaneously upon unbinding of the repressor. For now, we assume that repressor proteins bind as dimers, since that is a common scenario in biological systems, but we do not treat dimerization kinetics explicitly. For simplicity, the coupling form between the genes responsible for binding will be taken to be of the form where p is the order of the multimerization of the repressor. This form is a small approximation to the more exact hin3–i(n3–i – 1)...(n3–ip + 1). We have checked that using the simpler monomial does not influence the results in any regime discussed. We also do not account for the existence of mRNA molecules and the consequent time delays owing to their synthesis as intermediates. The extensions of the model are discussed later.



View larger version (18K):
[in this window]
[in a new window]
 
FIGURE 1  A schematic representation of the toggle switch. Gene 1 produces proteins of type 1, which repress gene 2; and gene 2 produces proteins of type 2, which repress gene 1.

 
Within the self-consistent proteomic field approximation, the set of master equations for the corresponding system is of the form


(1)
for n ≥ 1 where the i = 1,2 refers to the gene label. P1(n1) describes the probability of gene 1 being in the on-state and there being n1 protein molecules of type 1 in the cell. The first term on the right-hand side of Eq. 1 describes the production of proteins of type i with a production rate gj(i), where j = 1, 2, depending on whether the gene is in the on- or off-state. The second term accounts for the destruction of proteins with rate ki. The binding of repressor proteins produced by the other gene is proportional to the number of dimer molecules present in the system n3–i with rate hi. We assume unbinding occurs with a constant rate fi. Binding and unbinding contributes to the kinetics of the DNA binding states, as described by the last two terms. This set is supplemented by the Pj(ni = 0) equations to account for boundary conditions.


(2)
For convenience, let us define the probability of finding the DNA binding site in a given state. One can now sum the Pj(1) equations over the number states of the second protein with P1(2) + P2(2), and likewise the Pj(2) equations. Due to the SCPF approximation, the only term affected is the repressor binding term and since the summation results in where is the second moment of the number distributions of type 2 proteins produced when gene 2 is in the jth state. The equations of motion of the moments of the probability distribution are of the form

(3)
The steady-state equations for the moments of the distributions that follow are closed-form; the order moment equation of motion depends only on the lower moments of the ith gene and

To analyze the behavior of switches we introduce the following scaled parameters: the adiabaticity parameter {omega}i = fi/ki, which represents the characteristic rate of change of the DNA state compared to the characteristic rate of change in protein number, measures the tendency for proteins to be unbound from the DNA; the effective production rate; and distinguishes between the two DNA states in terms of protein dynamics. We present a detailed derivation of the moment equations in Appendix A.

The resulting equations for the 0th moments couple to the higher moments by the interaction function F(i). These lower moments can be solved self-consistently. The resulting solution predetermines all the other moments, which completely describe the probability distribution. Each gene therefore couples to the other gene by the influence of the self-consistently generated proteomic field. One could define the generating function and calculate the probabilities of having a given DNA binding state j for the ith gene when there are ni proteins of type i in the cell. In practice, it is easier to go back to the steady-state master equation and solve directly for the probability distributions than sum an infinite number of moments. Rewriting the steady-state master equation (Eq. 1) one gets


(4)
These sets of equations give recursion relations for Pj(ni) that one can use to express Pj(ni) as a function of P1(0) and P2(0). The normalization condition gives Pj(0) in term of constants and the result is the probability function Pj(ni) as a series. The SCPF approximation reduces the two-gene problem to a one-gene problem parameterized by the moments of the second gene, which can be worked out independently, as we have already shown, and these are represented by F(3–i)—which is a constant in terms of this calculation.

To see the effect of the stochastic nature of the system, we compare the exact solutions of the self-consistent field approximation equations to the results that would follow from deterministic kinetic rate equations for the number of proteins of each type and the fraction of on/off DNA binding states for each gene,

(5)
where n(i) is the number of proteins of type i present in the cell. The exact SCPF equations reduce to the deterministic kinetic equations in the limit of large {omega} and Xad for the case discussed above. The F(3–i) term in the stochastic SCPF equations is replaced by the n2(3–i) term in the deterministic kinetic rate equations. For the toggle switch, where repressors bind as dimers, it is easily shown that the interaction functional may be rewritten in the form

(6)
which in the large {omega}-limit reduces to F(i) = <n(i)>2 + <n(i)>. So for large mean numbers of proteins present in the cell, which corresponds to large effective production rates Xad, <n(i)> of the order of hundreds is a small correction to <n(i)>2. We therefore reproduce the deterministic kinetics result.

As shown by Sasai and Wolynes (2003)Go, the difference in the probability that gene 1 is active and that gene 2 is active, {Delta}C = |C1(1)–C1(2)|, plays the role of an order parameter. We can now consider a family of switches and discuss their stability, sensitivity of regions of bistability to control parameters, and types of bifurcations.


    THE SYMMETRIC TOGGLE SWITCH
 TOP
 ABSTRACT
 INTRODUCTION
 THE SELF-CONSISTENT PROTEOMIC...
 THE TOGGLE SWITCH
 THE SYMMETRIC TOGGLE SWITCH
 THE ASYMMETRIC TOGGLE SWITCH
 THE CASE WHEN PROTEINS...
 THE CASE WHEN PROTEINS...
 THE CASE WHEN PROTEINS...
 CONCLUSIONS
 APPENDIX A
 APPENDIX B
 ACKNOWLEDGEMENTS
 REFERENCES
 
For pedagogic purposes we will start by analyzing the single symmetric toggle switch, such as discussed above, in which repressors bind as dimers, with {omega}1 = {omega}2 = {omega}, and as it is the most intuitive and shows the most generic behavior. It is an academic example, as even individual genes in switches engineered in the laboratory mostly have different chemical parameters. Yet a lot can be learned from this simple system.

The general mechanism of the phase transition
Fig. 2 shows the phase diagrams for the system, |{Delta}C|, as a function of reservoir protein number and the adiabaticity parameter for the exact SCPF equations for growing values of the parameter describing the tendency that proteins are unbound from the DNA, Xeq. The deterministic kinetics and exact SCPF approximations give qualitatively similar results. The analogous deterministic kinetic phase diagrams agree with the SCPF solutions in the large {omega}- and Xad-limit, hence they become more similar with growing Xeq, as the bifurcation occurs at larger effective production rates for larger Xeq. For large fluctuations and a small unbinding rate, neither gene 1 nor gene 2 is favored and the probability of a given gene to be on is determined solely by the effective production rate of the other gene and decreases in a quadratic manner as the number of repressor proteins grow (Fig. 3). Since the switch is symmetric, the system has one stable state, {Delta}C = 0, where the probabilities of the genes to be on are equal. As the relative protein number fluctuations get smaller and the DNA unbinding rate grows, a proteomic cloud buffers the repressed gene, keeping it repressed. The symmetry of the system is broken and the solution bifurcates into two separate basins of attraction. For the stochastic SCPF equations the bifurcation takes place for larger effective production rates (larger Xad), than for the deterministic equations, even in the large {omega}-limit, which depicts their sensitivity to fluctuations. The critical number of reservoir proteins necessary for the bifurcation of the solution to take place is the same in both approximations and is determined by <n>c = (Xeq)1/2 (Fig. 3). In the discussed example, <n>c = 32 = 10001/2, for Xeq = 1000. For the deterministic kinetic switch the bifurcation takes place when C1(i) = (1 + <n(3–i)>2/Xeq)–1 = 0.5, due to the simple form of the interaction function equal to <n(3–i)>2 = (2XadC1(3–i))2. So C1(i) = 0.5 is equivalent to the <n(3–i)>2/Xeq = 1. In a noisy system larger effective production rates are needed to achieve the critical value of proteins. The interaction function in this case may be written as and always. So at <n>c, F(3–i)/Xeq > 1 and the probability of the genes to be on is <0.5, therefore The mechanism of the bifurcation requires the two genes to be more likely to be unbound than bound for the phase transition to take place. The curvature of the null clines presented in Fig. 2 can be simply worked out to be of the form with {zeta}i,{xi}i constants determined by the specific value of C1(1), C1(2).



View larger version (18K):
[in this window]
[in a new window]
 
FIGURE 2  Phase diagram obtained as an exact solution within the SCPF approximation for the single symmetric switch when repressors bind as dimers with Xeq = 1 (A), 100 (B), and 1000 (C). Contour lines mark values of {Delta}C.

 


View larger version (14K):
[in this window]
[in a new window]
 
FIGURE 3  Probability that genes are in the active state (A), the mean number of proteins of each type present in the cell <n(i)>(B), and the mean number of proteins of each type present in the cell if gene i is in the on-state <n1(i)> (C) as a function of Xad = {delta}Xsw for a symmetric switch. Exact solutions of the SCPF approximation equations compared with deterministic kinetic rate equations solutions, for a single symmetric switch, Xeq = 1000, {omega} = 0.5.

 
Adiabaticity parameter dependence
As the adiabaticity parameter decreases, the area of phase space corresponding to multiple solutions decreases (Fig. 2). For very small values of the adiabaticity parameter, there exists only one solution that corresponds to a state in which the two genes are off. The value of {omega} below which only one solution exists decreases with the tendency for proteins to be bound, but exists for all values of Xeq. Therefore if the two genes have very high repressor binding affinities, the critical number of proteins necessary for the phase transition to take place cannot be formed, even for very high production rates. This region of parameter space where one solution is possible corresponds to a situation in which a buffering proteomic cloud may not form, due to a very fast destruction rate of proteins or a very small unbinding rate from the DNA. The critical number of proteins necessary for the bifurcation to occur grows with the tendency for proteins to be unbound from the DNA (Xeq), as the cloud buffering the genes needs to be bigger and exhibit smaller relative protein number fluctuations, which effectively decrease with the growth of the adiabaticity parameter. This is further discussed in terms of the probability distributions. Therefore a monostable solution exists at all values of the effective growth rate, Xad, for larger values of {omega} at large Xeq than at smaller Xeq values. The bifurcation point is a result of competition between the number of reservoir repressor proteins and the tendency for proteins to be unbound from the DNA. This is clear from the dependence of the number of proteins present in the cell at the bifurcation point on the relative values of Xad and Xeq, but not the adiabaticity parameter {omega}.

Mean protein numbers
The total number of proteins present in the cell, produced both in the on- and off-states, asymptotically away from the bifurcation points is the same for the deterministic and stochastic approximations, and it is given by <n(i)> = 2Xad, when C1(1) {approx} 1 the probability of the gene to be on is close to unity. The number of proteins of a given type present in the cell, when the gene that produces them is in the on-state, is always considerably smaller in the noisy system than in the deterministic case (Fig. 3 C). Since the production rate in the off-state was assumed zero, in the deterministic case no proteins of a given type are present in the cell if the gene is in the off-state, unlike in the noisy system. Therefore the number of proteins in the deterministic system is nonzero only if the gene is on. But interaction of the DNA binding state with the proteins buffering it results in a residual number of proteins present in the off-state for all values of {omega}. The region of bistability of the switch in parameter space grows as the binding rate increases with respect to the unbinding rate, stabilizing the DNA binding states. As the susceptibility of the system to fluctuations increases, the deterministic equations prove to be a poor approximation to describe the state of the system.

Gene-buffering proteomic cloud interactions
The stochastic nature of the system also manifests itself at the DNA level (Fig. 2). As the tendency for proteins to be unbound from the DNA grows, the area of parameter space wherein multiple solutions are possible decreases—since a larger number of proteins is needed to reach a state in which two genes are more likely to be repressed (protein bound state) than at small Xeq. For small unbinding rates or large binding rates, regardless of the ratio of the rate of unbinding of repressors from the DNA to protein degradation, bistability requires smaller numbers of proteins, which correspond to larger relative fluctuations, than for large Xeq. Therefore a larger unbinding rate relative to the binding rate makes the system more susceptible to protein number noise. Competition between Xeq and <n(i)> results in Xeq, for a given null cline, being a parabolic function of Xad, for the dimer binding case, with coefficients determined by {omega} and C1(i). This is easily generalized to higher order functions for higher order (p) oligomers, and results in p-order dependence. The switching region, by which we mean that the region of parameter space between the bifurcation point and {Delta}C > 0.9 decreases as the binding and unbinding rates become comparable (Xeq decreases). As discussed above, the probability of the genes to be on at the bifurcation point tends to 0.5 as the adiabaticity parameter grows (Fig. 3), therefore the probability to be on has to increase by a smaller {Delta}C to reach C1(i) = 1. Therefore the switching region decreases also as the unbinding rate from the DNA grows, since smaller effective production rates are needed to reach {Delta}C = 1, than for small {omega}. Small values of {omega} correspond to large fluctuations in the DNA binding state as well as the protein number state, and result in destabilizing the gene-buffering protein cloud interactions. Hence very large effective production rates are needed for {Delta}C > 0.9. Therefore the DNA unbinding rate must become considerably faster compared to the protein degradation rate for the switch to have two stable solutions in a large region of parameter space.

The probability distributions
A better understanding of the bifurcation can be gained from examining the probability distributions. Fig. 4, A and B, and Fig. 4, C and D, show the evolution of the probability distributions of gene 1 and gene 2, respectively, to be on and off as functions of Xad. The peak of the distribution decreases and the width spreads out as the control parameter grows, until it reaches the bifurcation point at Xad = 44. Then the value of the probability function corresponding to the most probable number of proteins grows again. The spread of the functions grows as the effective production rate in the on-state increases; it narrows, however, with the increase of the adiabaticity parameter, as would be expected, since the DNA state fluctuations become smaller with {omega}. The average number of proteins in the cell in the on-state ({Delta}C > 0.9) does not show a dependence on {omega}. Yet as the unbinding rate from the DNA becomes very fast compared to the protein number fluctuations, the system switches often between the two states, hence a large number of proteins is present even in the off-state. If the DNA unbinding rate is small, the protein number characteristics follow the DNA state having time to reach a steady state within each well, before the DNA binding site switches into the other state, so the number of proteins in the off-state falls to zero (Fig. 5, A and B). This results in a two-peak, bimodal probability distribution (Fig. 4). If {omega} is large, random fluctuations in the DNA state do not change the effective state of the system, since a residual high mean protein number is present even in the off-state. In such a case, lower effective production rates than for small {omega} result in higher protein yields, and hence smaller switching regions.



View larger version (26K):
[in this window]
[in a new window]
 
FIGURE 4  Evolution of probability distributions for the probability of the gene that will be active (on) after the bifurcation to be on (A) and off (B) and the gene that will be inactive (off) to be on (C) and off (D) as a function of the order parameter Xad for a symmetric switch. The bifurcation occurs at Xad = 44, Xeq = 1000, {omega} = 0.5.

 


View larger version (28K):
[in this window]
[in a new window]
 
FIGURE 5  Probability distributions for the gene to be in the on-state (A) and off-state (B) for a gene in the active state for different values of the adiabaticity parameter {omega} = 0.5, 10, 100. Xeq = 100, Xad = {delta}Xsw = 100. Comparison of probability distributions obtained by exactly solving the steady-state equations in the SCPF approximations with analogous Poissonian distributions (C and D). Symmetric switch, Xad = 44, Xeq = 1000, {omega} = 0.5.

 
For small {omega} one might expect Poisson distributions of proteins in each of the DNA states, since the unbinding rate from the DNA is smaller than the protein degradation rate, so the proteins may reach a steady state without the DNA state changing. Hence, effectively proteins would feel the effect of only one well and be subject to a birth/death process. This is not true, however. The difference between the exact solution and a solution obtained within a Poissonian approximation to the state of the system is surprisingly large, owing to the skewed tails of these distributions. Fig. 5, C and D, compares these probability distributions with distributions for the same system if one assumes a Poissonian probability function. The distributions obtained as an exact solution within the SCPF approximation are clearly not symmetric, but exhibit long tails toward zero. Therefore, although the most probable values of the two types of distributions are similar, noise has a destructive impact on the system, resulting in a larger probability of having a smaller number of proteins in the cell than expected based on a Poissonian distribution, whose higher moments are equal to the mean. Therefore, a larger production rate is needed for one of the states to be favored as a result of noise, than that predicted from a symmetric probability distribution. The most probable number of proteins in the on-state, if the unbinding from the DNA is slow, is zero, unlike the number predicted by Poissonian distributions. The influence of noise on protein number fluctuations brings the protein-number means down, as can also be seen from Fig. 3 C. Overall, the spread of the probability distributions is large, and their characteristics for small values of the control parameters are different from those predicted by Poissonian distributions, let alone by deterministic kinetic equations; therefore the effects of stochasticity may not be neglected.

The nonzero basal effective production rate case
The above analysis concerns a switch with a zero basal production rate, so proteins were not produced in the off-state. In a number of biological systems (Ptashne and Gann, 2002Go) a nonzero basal production rate exists and we now turn to consider the effect of this on a symmetric switch. Fig. 6 B shows the dependence of the bifurcation curves for different values of the effective basal production rate g2/(2k). Values <1, when the death rate is larger than the production rate, show that, for the symmetric switch, assuming the effective production rate to be zero in the off-state is a reasonable approximation. If the on-state has a positive input to the number of reservoir proteins present due to g2/k > 1, the probability of the active gene to be on, even for very large on-state effective production levels Xad, is <1. Hence the off-state contributes considerably to the steady-state number of proteins. The solution that corresponds to the more active of the two states may effectively be an off-state, since it has C1(i) < 0.5, although the effective production rate in the on-state in the bifurcated region of parameter space is much larger than in the off-state (for example, the g2/(2k) = 20 line in Fig. 6 B). As the effective basal production rate increases, a larger production rate in the on-state than for small g2/(2k) > 1 is required to reach the critical number of proteins for the bifurcation to take place, which is given by <n(i)>= 2XadC1(i) – g2/k(2C1(i) – 1). For this reason, even for the deterministic approximation at the bifurcation point, the two genes must be more probable to be off, as can also be seen for the exact SCPF solutions from the probability distributions (Fig. 7, B, C, E, and F). Fig. 6 A shows the dependence of the bifurcation curves on the adiabaticity parameter, which tend to the deterministic case for large {omega}. A closer analysis of the g2/k > 1 case, since the g2/k < 1 is analogous to the zero basal production rate case, which has already been discussed, shows that mean properties of the system are in even better agreement with the deterministic solution than the g2 = 0 case (Fig. 7, A and D). The genes have a nonzero probability of being in the off-state, with the probability distribution of the off-gene having a long tail toward higher protein numbers (Fig. 7, E and F). In the off-state the effective production rate g2/(2k) is small and the noise input is small, relative to the large protein numbers present in the system. The small effect of stochasticity results in the observed similar mean characteristics. Yet the form of the probability distributions for the genes to be on before the transition is especially broad, with a far smaller probability than those of the off-state (Fig. 7, B, C, E, and F). These clearly show that the two genes are more probable to be in the off-state before the bifurcation point. Therefore, although the average observables are similar for the deterministic and SCPF stochastic solutions, the predicted distributions are unusual.



View larger version (33K):
[in this window]
[in a new window]
 
FIGURE 6  Nullclines for a symmetric switch, where proteins bind as dimers, when the effective base production rate is g2/(2k) != 0. (A) Dependence on the adiabaticity parameter {omega} = 0.005, 0.05, 0.5, 5, and 50, compared to the deterministic equations solution, g2/(2k) = 5. (B) Dependence on g2/(2k) = 0.01, 0.1, 0.5, 1.0, 5, 10, and 20, {omega} = 0.5. Xeq = 1000.

 


View larger version (26K):
[in this window]
[in a new window]
 
FIGURE 7  Probability of genes to be on (A) and mean number of proteins of a given type present in the cell (D) for a symmetric switch with an effective base production rate. Evolution of probability distributions for the probability of the gene that will be active after the bifurcation to be on (B) and off (C) and the gene that will be inactive to be on (E) and off (F) as a function of the order parameter Xad for the same system. The bifurcation occurs at Xad = 61, g2/(2k) = 5, {omega} = 0.5, Xeq = 1000.

 
Summary
The symmetric switch is based on a competition between the accessibility of the repressor site and the number of repressor proteins present in the cell. The bifurcation is solely a result of the nonlinearity of the system and introducing noise simply affects the region in parameter space where given states occur. The protein number fluctuations have a destructive role in determining the stability of the bifurcated solution; however, fast DNA unbinding rates can compensate for the destabilizing effect of protein number fluctuations. In this region the stochastic solution predicts similar means to the deterministic case, but the form of the probability distributions which depends on a large number of higher moments is nontrivial. It is a result of the interplay of the DNA binding and protein degradation kinetics.


    THE ASYMMETRIC TOGGLE SWITCH
 TOP
 ABSTRACT
 INTRODUCTION
 THE SELF-CONSISTENT PROTEOMIC...
 THE TOGGLE SWITCH
 THE SYMMETRIC TOGGLE SWITCH
 THE ASYMMETRIC TOGGLE SWITCH
 THE CASE WHEN PROTEINS...
 THE CASE WHEN PROTEINS...
 THE CASE WHEN PROTEINS...
 CONCLUSIONS
 APPENDIX A
 APPENDIX B
 ACKNOWLEDGEMENTS
 REFERENCES
 
Most switches found in nature are not symmetric. For asymmetric switches, when proteins bind as dimers, the two genes interact, resulting in probabilities to be on, different from those imposed purely by the equilibrium between binding and unbinding. The steady-state solution is a compromise between the tendency that repressors are unbound from the initially off-gene ( for the forward transition, for the backward in the following discussion) and the effective production rate of the initially on-gene ( forward; backward transition), at least for the deterministic case. This results in the characteristic S-curve bifurcation diagram, as presented in, for example, Fig. 12, with possible forward and backward transitions, hence hysteresis. We refer to the transition that occurs with increasing as the forward transition, and that with decreasing as the backward transition. Since is a well-defined function of the probabilities that the genes are on, the simplicity of the deterministic equations allows for a completely analytic discussion of the asymmetric switch. The more complicated form of the exact SCPF equations makes this approach impossible. However, the deterministic rate solution offers valuable insight into the basic mechanism behind the transition.



View larger version (28K):
[in this window]
[in a new window]
 
FIGURE 12  Bifurcation diagrams as a function of for C1(1), with g2(1)/(2k) = g2(2)/(2k) = 5 (A) and C1(2) g2(1)/(2k) = g2(1)/(2k) = 0.5 (B) for 50, and 500. Comparison of exact solutions of the SCPF and deterministic kinetic equations for an asymmetric switch. {omega}1 = {omega}2 = 0.5, and

 
The general mechanism
By combining the steady-state equations of motion for the probabilities of the two genes to be on and noting that, with a zero basal production rate one can derive the form of the deterministic bifurcation curves as

(7)
as a function of C1(2), and

(8)
as a function of C1(1). The transition points are determined as the extrema of Eqs. 7 and 8, which are functions solely of the scaled parameter and are plotted on the bifurcation graphs. It is worth noticing that the bifurcation points C1(i) do not depend on the value of the parameter describing the gene binding kinetics of the gene that is on initially. This is not true for the exact SCPF solution, which cannot be solved analytically, but the bifurcation curve has the more complex form of

(9)
where C1(1) is a function of {omega}2, C1(2), and The bifurcation point is therefore determined by the protein () and DNA () characteristics and mutual interactions ({omega}i) of their two genes. The deterministic approximation therefore greatly simplifies the mathematical mechanism of the transition. This may lead to large errors when studying more complicated biologically relevant systems, where one considers asymmetric switches with nonzero basal production rates and proteins are produced in bursts. The case of the nonzero basal production rate within the deterministic approximation also cannot be solved analytically.

The general picture behind the transition is seen from the deterministic approach. The larger the tendency for proteins to be unbound from the DNA, the larger the effective production rate must be for the transition, from one gene to be active, to the other to be active, to take place —inasmuch as repressor proteins are less likely to bind to the on-gene (i) at large than at small However, if one considers a noisy system, it is effectively harder for proteins to stay bound to the initially off-gene due to the destabilizing effect of DNA binding noise (Fig. 8). For the stochastic system, apart from very low values of the adiabaticity parameter ({omega} < 0.1) (Fig. 11), there is a threshold number of reservoir proteins that will cause a rapid transition. If we start with a small effective production rate for one type of protein and increase this rate, keeping the production rate of the other gene fixed at an initially higher value, the proteins produced by the gene with the initially smaller production rate repress it gradually and ineffectively, until they reduce the probability of the gene to be on to one-half, for the exact SCPF solution. The number of proteins present in the on-state decreases much more rapidly with the change of —whether it be an increase for the forward transition or a decrease for the backward transition in the examples presented—than the number of proteins in the off-state grows (Fig. 10). Hence, the probability of the initially active gene to be on shows a larger sensitivity to the change of than does the off-state probability. This leads to a rapid transition of the previously active gene to an inactive state (Fig. 9). Such behavior is described by Ptashne (1992)Go and Ptashne and Gann (2002)Go in the {lambda}-phage switch; they point out its role as a "buffer against ordinary fluctuations in repressor concentration." The observed system switches when the "repression probability" drops to 50%, as in the solutions of this model. Our analysis seconds the hypothesis of Ptashne and Gann, inasmuch as the deterministic system lacks this behavior, the transition is rapid, and for certain values of parameters, takes place when the probability of the initially on-gene drops to 80% (Fig. 8). The buffering capabilities of the stochastic system are clearly seen in the long tails toward n = 0 of the probability distributions of the gene that is switching from the on- to the off-state (Fig. 9, A and B).



View larger version (23K):
[in this window]
[in a new window]
 
FIGURE 8  Dependence of the probability of genes to be on in an asymmetric switch as a function of increasing parameters of one gene in the forward (top) and backward (bottom) transition for different values of : 5, 50, and 500. All other parameters fixed at and Comparison of solutions of deterministic and exact SCPF equations.

 


View larger version (29K):
[in this window]
[in a new window]
 
FIGURE 11  Bifurcation diagrams for an asymmetric switch, presenting as a function of C1(2) (AC), and C1(1) (DF) for different values of the adiabaticity parameter: {omega}1 = {omega}2 (A,D), {omega}2, with {omega}1 = 0.001 = const (B,D), {omega}1, with {omega}2 = 0.001 = const (C,F). and

 


View larger version (16K):
[in this window]
[in a new window]
 
FIGURE 10  Mean number of proteins of each type present in the cell, according to exact solutions of the SCPF approximation and deterministic kinetic rate equations for an asymmetric switch, with {omega}1 = {omega}2 = 0.5, and and 500 during the forward (A) and backward (B) transitions in an asymmetric switch.

 


View larger version (32K):
[in this window]
[in a new window]
 
FIGURE 9  Evolution of the probability distributions for the two genes to be active for the forward transition (A and B) and the backward (C and D) as a function of for with {omega}1 = {omega}2 = 0.5; for an asymmetric switch.

 
The effect of noise on the bifurcation mechanism
The mean number of proteins at the transition point differs for the deterministic and exact SCPF solution (Fig. 10). More repressors are needed to induce the transition in the deterministic approximation than in the stochastic system, since, due to the form of the interaction function for the exact case, F(i) = <n(i)>2({omega} + 1)/({omega} + C1(i)) + <n(i)> > <n(i)>2. A smaller number of proteins is therefore needed for the inactive gene to become competitive with the active gene. The mechanism of the transition is different from the symmetric gene case, where a critical number of proteins needs to be reached. The asymmetric switch is based on the competition between the probability that proteins of one kind will repress the opposing genes and the analogous probability for the other kind of proteins. The repression capability is governed by which might be looked upon as the product of the probability of having a certain number of repressor proteins (3–i) in the cell and the tendency for them to be bound to the opposing gene (i). In fact, the transition point in the deterministic case is purely a function of such ratios, In both the stochastic and deterministic cases, the transition points are set by the interaction function which regulates the on- and off-state probabilities of a given gene Inclusion of noise in the system effectively increases the nonlinearity of the system, which results in the already discussed buffering capabilities of the system. Stochasticity alters the very simple competitive mechanism seen in the deterministic kinetics to allow for more levels of control of the stability of the state of the system against random fluctuations.

Further comparison of solutions of the deterministic and stochastic equations leads to the same conclusions as for a symmetric switch. As the tendency for proteins to be unbound from the DNA grows, the difference in the critical number of reservoir proteins necessary for the transition to take place increases for both approximations. The critical number of proteins produced by a given gene necessary for the transition to take place for both genes is, in most cases (see {omega} dependence discussion), smaller for the exact solutions of the SCPF equations and the difference between the stochastic and deterministic result grows with both and decreases with {omega}i (Fig. 10). It has a value of 15 for {omega}1 = {omega}2 = 0.5 and 2 for {omega}1 = {omega}2 = 10.

Consider the forward transition. The initially inactive gene is buffered by a cloud of repressor proteins. As one increases the effective production rate of the proteins produced by the inactive gene (), the number of proteins that are able to repress gene 2 grow slowly and linearly where C1(1) ~const, and form a buffering proteomic cloud around it. In the results presented in the figures of this article, the tendency that proteins are unbound from gene 2, (), is smaller than so gene 1 is able to produce enough repressors to form a stable buffering cloud around gene 2 and turn it into the inactive state at quite modest values of If gene 1 produces proteins less effectively, as the probability of it being repressed is larger than in the previous case, and larger values of are needed to produce enough repressors to achieve a high effective probability of binding, An example of how grows as is seen by comparing the for (Fig. 8) and for (Fig. 11).

Adiabaticity parameter dependence
The interaction of the buffering proteomic cloud with the DNA can be altered when the ratio of the DNA unbinding rate compared to the protein degradation rate is changed. For small {omega}i values the unbinding rate of repressors from the DNA is slower than the destruction of the produced proteins. Apart from very small {omega}-values, as long as there is a critical number of repressor proteins in the buffering cloud, the off-gene is repressed and it responds by turning on, but only once the initially on-gene is nearly totally repressed. Large adiabaticity parameters result in the efficient formation of the buffering proteomic cloud. For the initially off-gene, a small DNA unbinding rate of the off-gene decreases the effectiveness of the buffering proteomic cloud around it, as the protein number state can reach a steady state before the DNA state does. The hindered DNA reaction to the protein-number state effectively increases the tendency of repressor proteins to be unbound from the DNA for a given This, in turn, decreases the probability of the initially on-gene to be on, leading to rapid switching behavior as can be seen for gene 2 in the forward, or gene 1 in the backward, transition for {omega} > 0.1 in Fig. 11 A. The initially on-gene reacts to the interaction function of the initially off-gene, for which F(i) -> <n(i)>2/C1(i) + <n(i)> in the small {omega}-limit. Therefore, the interaction function is effectively increased for C1(i) {approx} 0, leading to the enhanced buffering. The reaction of the initially off-gene is unaltered, as for C1(i) {approx} 1 F(i) = <n(i)>2 + <n(i)> ~const, if C1(i) remains close to 1. However, if {omega} is very small (black dash-dot curve in Fig. 11, A and D), the buffering proteomic cloud is not given a chance to form due to a very high degradation rate of proteins and gene 2 is simply repressed in a gradual transition. If {omega}1 is extremely small and {omega}2 large, the buffering proteomic cloud around gene 1 cannot form and the probability of it to be off in the forward transition decreases gradually. A buffering proteomic cloud exists around gene 2, hence the backward transition is reminiscent of the deterministic result (Fig. 11, B and E). The most interesting case is shown in Fig. 11, C and F, where a large {omega}1 acts as a buffer against fluctuations in the number of proteins, which repress gene 1. For large production rates of repressors the probability of gene 2 to be on for the forward transition decreases faster than in the deterministic solution; however, the buffering cloud repressing gene 1 allows gene 2 to remain in the on-state. A buffering proteomic cloud does not form around gene 2, and it remains on until the number of proteins produced by gene 1 grows considerably, as the effective production rate, is increased. The effective production rate of gene 1 must be very large to sustain a sufficient steady-state number of proteins to repress gene 2 to the point that C1(1) < 0.5, which leads to switching. For the backward transition, the lack of a buffering proteomic cloud around gene 2 results in destabilizing gene 1 for larger effective production rates than for large {omega}2 values. These examples show how certain combinations of values of adiabaticity parameters can lead to a system with a larger switching region than the deterministic model predicts. This property may be useful when engineering artificial switches. If one has a constraint on the production rates of the genes, one can use repressors with different binding affinities to achieve switching in the desired region of parameter space.

In this simple system slow unbinding from the DNA can compensate for the destabilizing of the DNA state by protein number fluctuations. As the probability of the initially active gene to be on gradually decreases, the initially repressed gene becomes active only once the probability of the other gene to be on has fallen below a certain value, {alpha}. The susceptibility of the system to protein number fluctuations may be estimated by the value of {alpha}. For small {omega}, which is still able to sustain a buffering proteomic cloud, this value tends to be 0.5. The incapability of the system to form a buffering proteomic cloud is much stronger if both adiabaticity parameters are small, since the reaction of both genes to the change in the number of proteins is hindered (Fig. 11, A and D). DNA state fluctuations contribute to effectively faster protein number fluctuations, therefore the exact solution exhibits the very small {omega}-characteristics, where a buffering proteomic cloud cannot form, for a slightly wider range of the adiabaticity parameter than one would expect with a Poissonian distribution (results not shown). Combining these observations, a switch works most effectively if the change of the DNA state compared to the protein number fluctuations of one gene is sufficiently smaller than that of the other gene, to allow for effective buffering.

The nonzero basal production rate
The asymmetric switch, in which both genes have a nonzero basal effective production rate, proves to be susceptible to noise. In Fig. 12, we show the dependence of C1(1), with g2(1)/(2k) = g2(2)/(2k) = 5 and C1(2), with g2(1)/(2k) = g2(2)/(2k) = 0.5 in the small {omega}i limit. The stochastic solutions converge to the deterministic solutions for large {omega}. If gene 2 is initially in the on-state, the majority of proteins are produced with the high fixed rate in the on-state, as g1(2) >> g</