ebook img

Modeling delay in genetic networks: From delay birth-death processes to delay stochastic differential equations PDF

0.64 MB·
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview Modeling delay in genetic networks: From delay birth-death processes to delay stochastic differential equations

Modeling delay in genetic networks: From delay birth-death processes to delay stochastic differential equations Chinmaya Gupta,1 Jos´e Manuel L´opez,1 Robert Azencott,1 Matthew R. Bennett,2,3 Kreˇsimir Josi´c,1,4 and William Ott1 1)Department of Mathematics, University of Houston, Houston, TX 2)Department of Biochemistry & Cell Biology, Rice University, Houston, 4 1 0 TX 2 3)Institute of Biosciences & Bioengineering, Rice University, Houston, n a TX J 2 4)Department of Biology & Biochemistry, University of Houston, Houston, 1 TX ] N M Delay is an important and ubiquitous aspect of many biochemical processes. For . o example, delay plays a central role in the dynamics of genetic regulatory networks as i b - it stems from the sequential assembly of first mRNA and then protein. Genetic regu- q [ latory networks are therefore frequently modeled as stochastic birth-death processes 1 with delay. Here we examine the relationship between delay birth-death processes v 2 8 and their appropriate approximating delay chemical Langevin equations. We prove 6 2 that the distance between these two descriptions, as measured by expectations of . 1 functionals of the processes, converges to zero with increasing system size. Further, 0 4 we prove that the delay birth-death process converges to the thermodynamic limit 1 : v as system size tends to infinity. Our results hold for both fixed delay and distributed i X delay. Simulations demonstrate that the delay chemical Langevin approximation is r a accurate even at moderate system sizes. It captures dynamical features such as the spatial and temporal distributions of transition pathways in metastable systems, os- cillatory behavior in negative feedback circuits, and cross-correlations between nodes in a network. Overall, these results provide a foundation for using delay stochas- tic differential equations to approximate the dynamics of birth-death processes with delay. 1 I. INTRODUCTION Gene regulatory networks play a central role in cellular function by translating genotype into phenotype. By dynamically controlling gene expression, gene regulatory networks provide cells with a mechanism for responding to environmental challenges. Therefore, creating accurate mathematical models of gene regulation is a central goal of mathematical biology. Delay in protein production can significantly affect the dynamics of gene regulatory net- works. For example, delay can induce oscillations in systems with negative feedback1–7, and has been implicated in the production of robust, tunable oscillations in synthetic gene cir- cuits containing linked positive and negative feedback8,9. Indeed, delayed negative feedback is thought to govern the dynamics of circadian oscillators10,11, a hypothesis experimentally verified in mammalian cells12. In genetic regulatory networks, noise and delay interact in subtle and complex ways. Delay can affect the stochastic properties of gene expression and hence the phenotype of the cell2,13–16. It is well known that noise can induce switching in bistable genetic circuits17–26; the infusion of delay dramatically enhances the stability of such circuits27 and can induce an analog of stochastic resonance28,29. Variability in the delay time (distributed delay) can accelerate signaling in transcriptional signaling cascades30. Given the importance of delay in gene regulatory networks, it is necessary to develop methods to simulate and analyze such systems across spatial scales. In the absence of delay, it is well known that chemical reaction networks are accurately modeled by ordinary differential equations (ODEs) in the thermodynamic limit, i.e. when molecule numbers are sufficiently large. When molecule numbers are small, however, stochastic effects can dominate. In this case, the chemical master equation (CME) describes the evolution of the probability density function over all states of the system. Gillespie’s stochastic simulation algorithm (SSA)31 samples trajectories from the probability distribution described by the CME. While exact, the CME is difficult to analyze and the SSA can be computationally expen- sive. To address these issues, a hierarchy of coarse-grained approximations of the SSA has been developed32 (see Figure 1). Spatially discrete approximations, such as τ-leaping33–36 and K-leaping37 trade exactness for efficiency. At the next level are chemical Langevin equa- tions (CLEs), which are stochastic differential equations of dimension equal to the number 2 CME SSSSAA -Leaping 39 FPE CLE N (cid:29) 1 39 LNA 38 RRE N → ∞ FIG. 1. Schematic of the modeling hierarchy for biochemical systems. The black arrows link the variouscomponentsofthetheoryforMarkovsystems(nodelay); redarrowslinkthecorresponding delay components. Numbers attached to arrows refer to papers that establish the corresponding links. Empty arrowheads denote heuristic derivations; in this paper, we rigorously establish the dSSA to dCLE and dCLE to dRRE links. of species in the biochemical system. CLEs offer two advantages. First, unlike the SSA, the well-developed ideas from random dynamical systems and stochastic differential equations apply to CLEs. Second, it is straightforward to simulate large systems using CLEs. Finally, in the thermodynamic limit, one arrives at the end of the Markovian hierarchy: the reaction rate equation (RRE). The Markovian hierarchy above (no delay) is well-understood32,40, but a complete ana- logue of the Markovian theory does not yet exist for systems with delay. The SSA has been generalized to a delay version - the dSSA - to allow for both fixed2,41 and variable30,38 delay. Some analogues of τ-leaping exist for systems with delay; see e.g. D-leaping42. Several methods have been used to formally derive a delay chemical Langevin equation (dCLE) from the delay chemical master equation (dCME); see Section IV for details. Brett andGalla39 usethepathintegralformalismofMartin, Siggia, Rose, Janssen, anddeDomini- cis to derive a dCLE approximation without relying on a master equation. The Brett and Galla derivation produces the ‘correct’ dCLE approximation of the underlying delay birth- death (dBD) process in the sense that the first and second moments of the dCLE match 3 FIG. 2. (A) Typical trace obtained by using the dCLE. (B) Corresponding trace obtained using dSSA. By zooming into a small time segment, we see that the trace obtained from the dCLE (C) consists of equi-spaced time points (corresponding to an Euler discretization) and is continuous. ThecorrespondingsegmentofthetracefromthedSSA(D)consistsofeventsthatoccuratrandom times and has jump discontinuities. The displayed traces were gathered after a long transient. The model simulated is the single-gene positive feedback model with N = 500; see Section IIC1. those of the dBD process. However, their derivation has some limitations (see Section IV). In particular, it gives no rigorous quantitative information about the distance between the dBD process and the dCLE. In this paper, we establish a rigorous link between dBD processes and dCLEs by proving that the distance between the dBD process and the correct approximating dCLE process converges to zero as system size tends to infinity (as measured by expectations of functionals of the processes). In particular, this result applies to all moments. It is natural to express distance in terms of expectations of functionals because the dBD process is spatially discrete while the correct dCLE produces continuous trajectories (see Figure 2). Further, we prove that both processes converge weakly to the thermodynamic limit. Finally, we quantitatively estimate the distance between the dBD process and the correct dCLE approximation as well as the distance of each of these to the thermodynamic limit. All of these results hold for both fixed delay and distributed delay (see Figure 3A). The correct dCLE approximation is distinguished within the class of Gaussian approxi- 4 mations of the dBD process by the fact that it matches both the first and second moments of the dBD process. As we will see, it performs remarkably well at moderate system sizes in a number of dynamical settings: steady state dynamics, oscillatory dynamics, and metastable switches. We will demonstrate via simulation and argue mathematically using character- istic functions that no other Gaussian process with appropriately scaled noise performs as well. In the following, the term ‘dCLE’ shall refer specifically to the dCLE derived by Brett and Galla and expressed by (A1), unless specifically stated otherwise. We prove our mathematical results in the supplement43. II. SIMULATIONS/OUTLINE AND INTERPRETATION OF RESULTS Genetic regulatory networks may be simulated using an exact dSSA to account for tran- scriptional delay2,30,38,41. Here we provide a heuristic derivation of a related dCLE, and show that in a number of concrete examples it provides an excellent approximation of the system (see Figure 3). These simulations raise the following questions: Is the dCLE approximation valid in general? Can the expected quality of the approximation be quantified in general? We answer these questions mathematically in Section III. We will adopt the following notation for reactions with delay, α(X,Y) X +Y (cid:57)(cid:57)(cid:57)(cid:57)(cid:57)(cid:75) Z µ Here α(X,Y) denotes the rate of the reaction, the dashed arrow indicates a reaction with delay, and µ is a probability measure that describes the delay distribution. Solid arrows indicate reactions without delay. A. A transcriptional cascade First we consider a transcriptional cascade with two genes that code for proteins X and Y. Protein X is produced at a basal rate; production of Y is induced by the presence of X. The state of the system is represented by an ordered pair (X,Y). Note that we use X and Y to denote both protein names and protein numbers. The reactions in the network, and 5 FIG. 3. (A) The effect of distributed delay on the protein production process. The “input process” is the first step in the transcription process, while the “output process” is the final mature product that enters the population. The time delay τ accounts for the lag between the initialization of transcription and the production of mature product. In a system with distributed delay, different production events can have different delay times; the order of the output process may therefore not matchthatoftheinputprocess. (B–E)Simulatedgeneregulatorynetworkmotifs: atranscriptional cascade (B), oscillators (C) and metastable systems (D–E). the associated state change vectors v , are given by i ∅ (cid:57)a(cid:57)(cid:75) X v = (1,0) (1) 1 µ X −b−1→X ∅ v = (−1,0) (2) 2 ψ(X) ∅ (cid:57)(cid:57)(cid:57)(cid:57)(cid:75) Y v = (0,1) (3) 3 µ Y −b−2→Y ∅ v = (0,−1) (4) 4 This system can be simulated exactly using the dSSA: Suppose the state of the system and the reactions in the queue are known at time t (the queued reactions can be thought of 0 as the “input process”; see Figure 3A), and that the delay kernel µ is supported on a finite interval [0,τ ]44. 0 (1) Sample a waiting time t from an exponential distribution with parameter A := a + w 0 b X +ψ(X)+b Y. 1 2 6 (2) If there is a reaction in the queue that is scheduled to exit at time t < t +t , advance q 0 w to time t and set t → t and (X,Y) (cid:55)→ (X,Y) + (v(1),v(2)), where v = (v(1),v(2)) is q 0 q i i i i i the change in the system due to the scheduled reaction. Finally, sample a new waiting time for the next reaction. (3) If no reaction exits before t +t , set t (cid:55)→ t +t and sample a reaction type from the 0 w 0 0 w set {1,2,3,4} with probabilities proportional to {a,b X,ψ(X),b Y}, respectively. If the 1 2 reaction chosen is a non-delayed reaction, perform the update (X,Y) (cid:55)→ (X −1,Y) (or (X,Y) (cid:55)→ (X,Y −1)). However, if the reaction chosen is a delayed reaction, the state change vector (1,0) (or (0,1)) is put into the queue along with an exit time t . The q difference τ = t −t between the current and the exit time is sampled from the delay q 0 distribution µ. We now heuristically derive the dCLE for the feed-forward system from this spatially discrete process. Suppose the delay kernel µ is given by a probability density function κ supported on [0,τ ] (dµ(s) = κ(s)ds). We first approximate the number of reactions that produce Y (Eq. 0 (3)) that will be completed within the interval [t ,t + ∆], where t denotes the current 0 0 0 time and ∆ is a small increment. Since the production of Y involves delay, a reaction of this type that is completed within [t ,t +∆] must have been initiated at some time within 0 0 [t −τ ,t ]. Let t > t −∆ > t −2∆ > ··· be the partition of [t −τ ,t ] into intervals 0 0 0 0 0 0 0 0 0 of length ∆. The (random) number of reactions completed within [t ,t +∆] and initiated 0 0 within [t −(i+1)∆,t −i∆] may be approximated by a Poisson random variable with mean 0 0 ψ(X )∆·κ((i+1)∆)∆. t0−(i+1)∆ Summing over i, the (random) number of reactions completed within [t ,t + ∆] may be 0 0 approximated by a Poisson random variable with mean (cid:88)(cid:2) (cid:3) ∆ ψ(X )(κ((i+1)∆)∆) ; t0−(i+1)∆ i this is a Riemann sum that approximates the integral (cid:90) τ0 ∆ ψ(X )κ(s)ds. t0−s 0 7 FIG. 4. Cross correlation functions for the two-species feed-forward architecture at system size N = 1000. TheinsetshowssampletrajectoriesforX (black; top)andY (blue; bottom). Themean field model has a fixed point, and for this reason, stochastic dynamics stay within a neighborhood of this fixed point (see Theorem 6). However, the effect of increasing delay is clearly seen in the cross correlation function. The color bar displays the delay size corresponding to the cross correlation curves. Parameter values are given by ψ˜(X) = a˜ x3/(m3 +x3), a˜ = 0.92, a˜ = 1.39, 1 1 b = b = ln(2), m = 1.33, µ = δ . The cross correlations have been normalized by dividing by the 1 2 τ standard deviations σ and σ of X and Y. Time has been normalized to cell cycle length. X Y Known as τ-leaping, this line of reasoning produces a Poissonian approximation of the dBD process: δX = Poisn(a∆)−Poisn(b X ∆) t 1 t (cid:18) (cid:90) τ0 (cid:19) δY = Poisn ∆ ψ(X )dµ(s) −Poisn(b Y ∆). t t−s 2 t 0 Here Poisn(η) denotes a Poisson random variable with mean η. If these Poisson random variables have large mean, they can be approximated by normal random variables. For example, the Poisson variable representing the number of reactions that produce Y can be approximated by a normal random variable with mean and variance equal to ∆(cid:82)τ0ψ(X )dµ(s). Since each reaction changes the state of either X or Y (but 0 t−s neverboth),itfollowsthattheevolutionofthesystemcanbeapproximatedbythestochastic 8 difference equation (cid:112) δX = ∆(a−b X )+ ∆(a+b X )N(0,1) (5a) t 1 t 1 t (cid:18)(cid:90) τ0 (cid:19) δY = ∆ ψ(X )dµ(s)−b Y (5b) t t−s 2 t 0 (cid:115) (cid:18)(cid:90) τ0 (cid:19) + ∆ ψ(X )dµ(s)+b Y N(0,1), t−s 2 t 0 where N(0,1) is the standard normal random variable. System (5) may be written in terms of concentrations. Let N be a system size parameter. We think of N as a characteristic protein number; X /N and Y /N therefore represent t t ˜ fractions of this characteristic value. Writing x = X /N, y = Y /N, ψ(x) = ψ(Nx)/N, and t t t t assuming that the basal production rate a scales with N as a = a˜N, we obtain 1 (cid:112) δx = ∆(a˜−b x )+ √ ∆(a˜+b x )N(0,1) (6a) t 1 t 1 t N (cid:18)(cid:90) τ0 (cid:19) ˜ δy = ∆ ψ(x )dµ(s)−b y (6b) t t−s 2 t 0 (cid:115) 1 (cid:18)(cid:90) τ0 (cid:19) + √ ∆ ψ˜(x )dµ(s)+b y N(0,1). t−s 2 t N 0 Eq. (6) is the Euler–Maruyama type discretization of a delay stochastic differential equa- √ tion. Replacing ∆ with dt and ∆N(0,1) with dW in (6), we obtain t 1 (cid:112) dx = (a˜−b x )dt+ √ (a˜+b x )dW1 (7a) t 1 t 1 t t N (cid:18)(cid:90) τ0 (cid:19) ˜ dy = ψ(x )dµ(s)−b y dt (7b) t t−s 2 t 0 (cid:115) 1 (cid:18)(cid:90) τ0 (cid:19) + √ ψ˜(x )dµ(s)+b y dW2. t−s 2 t t N 0 This is the dCLE for the transcriptional cascade in this section. Taking the formal thermodynamic limit, N → ∞, in Eq. (7) yields the reaction rate equations derived in38: dx = (a˜−b x )dt (8a) t 1 t (cid:18)(cid:90) τ0 (cid:19) ˜ dy = ψ(x )dµ(s)−b y dt. (8b) t t−s 2 t 0 The dynamics described by Eq. (8) are quite simple; if b > 0 and b > 0, then (8) has a 1 2 globally attracting stable stationary point. 9 To test the validity of the dCLE approximation (7), we examine if it captures the inter- action between the two proteins in our transcriptional cascade network. Figure 4 shows the cross correlation functions obtained by simulating the system with N = 1000 using dSSA. From left to right, the curves correspond to fixed delay τ increasing from 0 to 3. The cor- responding cross correlation curves for the dCLE approximation (7) are indistinguishable from those obtained using dSSA. In the heuristic derivation above, we first fix N and let ∆ → 0 to obtain the dCLE; we then separately let N → ∞ to obtain the thermodynamic limit. Brett and Galla39 also derive the dCLE by first fixing N and then sending ∆ → 0. However, the two limits, ∆ → 0 and N → ∞, cannot be taken independently; this is a common problem with heuristic derivations of stochastic differential equations, even in the absence of delay45. The time discretization, ∆, can be thought of as a sampling frequency, while the system size, N, determines the rate at which reactions fire. If N becomes too large for a given ∆, then the number of reactions that fire within [t,t + ∆] no longer follows a Poisson distribution with mean dependent only on the state of the system at time t. On the other hand, if N is too small for a given ∆, then the Poisson distribution cannot be approximated by a normal distribution. In order to rigorously derive the Langevin approximation and estimate the distance between the dBD and dCLE processes, we will have to take a careful limit by relating ∆ to N (with ∆ → 0 as N → ∞). We describe the proper scaling in Section III. Appliedtothetranscriptionalcascade, Theorem4assertsthatprovided∆scalescorrectly withN,thedistancebetweenthedBDprocessandtheprocessdescribedbyEq.(7)converges to zero as N → ∞ (as measured by expectations of functionals of the processes). Theorem 5 asserts that the dBD process then converges weakly to the thermodynamic limit given by Eq. (8) as N → ∞. Moreover, when ∆ is correctly scaled with respect to N, Theorem 6 provides explicit bounds for the probabilities that the dBD and dCLE processes deviate from a narrow tube around the solution of Eq. (8). In the previous example, the deterministic system has a fixed point. The time series for the stochastic system, therefore, stay within a small neighborhood of this fixed point (see inset, Fig. 4). In the next example, we show that the dCLE approximation remains excellent even when the deterministic dynamics are non-trivial. We consider a degrade- and-fire oscillator for which the deterministic system has a limit cycle. The dCLE correctly capturesthepeakheightandtheinter-peaktimesforthedSSArealizationofthedegradeand 10

See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.