| HOME | HELP | FEEDBACK | SUBSCRIPTIONS | ARCHIVE | SEARCH | TABLE OF CONTENTS |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

* Biophysics Program and
Chemistry Department, Stanford University, Stanford, California 94305
Correspondence: Address reprint requests to Vijay S. Pande, E-mail: pande{at}stanford.edu.
| ABSTRACT |
|---|
|
|
|---|
ß protein fold. To characterize the nature of each room temperature transition, we calculated the probability of transmission for 500 points along each free energy barrier. We introduced a method for determining transition states by employing the transmission probability, Ptrans, and determined which conformations were transition state ensemble members (Ptrans
0.5). The transmission probability may be used to characterize the barrier in several ways. For example, we ran simulations at 82°C, determined the change in Ptrans with temperature for all 2,000 conformations, and quantified Hammond behavior directly using Ptrans correlation. Additionally, we propose that diffusion along Ptrans may provide the configurational diffusion rate at the top of the barrier. Specifically, given a transition state conformation x0 with estimated Ptrans = 0.5, we selected a large set of subsequent conformations from independent trajectories, each exactly a small time
t after x0 (250 ps). Calculating Ptrans for the new trial conformations, we generated the P(Ptrans|
t = 250 ps) distribution that reflected diffusion. This approach provides a novel perspective on the diffusive nature of a protein folding transition and provides a framework for a quantitative study of activated relaxation kinetics. | INTRODUCTION |
|---|
|
|
|---|
Advances in computer power and new path sampling algorithms have allowed progress in this direction (6
9
). It is now tractable to determine if a structure is a member of the TSE without constructing a reaction coordinate (10
12
). For example, after defining folded and unfolded sink states, one can calculate the probability that a given conformation reaches the folded sink before it reaches the unfolded sink. If this commitment probability, Pfold, is
0.5, then the structure lies along the high dimensional hyperplane (separatrix) between the sink states. The TSE is a set of low free energy conformations on the separatrix.
Here, we introduce new methods related to Pfold and use them to probe the activated kinetics of the 39-residue N-terminal domain of L9 (L939). Although L939 is small, it is a challenging system to examine computationally, due to its complex topology and a relatively slow millisecond folding timescale. Although the system appears two state to experimental probes (13
), we found an unfolding mechanism that included parallel unfolding pathways with intermediates. We will describe the detailed L939 folding mechanism elsewhere. Here, we focus on the methodology needed to characterize the observed free energy barriers with extensive kinetic simulations.
| METHODS |
|---|
|
|
|---|
rxn, and it becomes possible to self-consistently divide the conformational space into stable macrostates (minima F and minima U).
The mean first passage time for trajectories released from a U microstate to capture at an F sink should yield the phenomenological rate constant. In comparison to the slow barrier-crossing timescale, relaxation of an activated conformation to a sink is rapid and rarely experimentally resolved (15
). In this study, we are interested not only with the mean time to capture but with the identification of activated conformations. Activated conformations have a significant probability of absorption by sinks in both F and U. In the limit of an infinite number of trajectories of infinite length, the ratio of trajectories captured by the F sink is rigorously converted into a probability.
In practice, we have only a finite number of trajectories, and it is not trivial to precisely calculate mean first passage times or the relative probability of capture at sink F and U. Determining a Pfold value is a large computational expense. Instead, we take the following approach. For each tested conformation, x0, N molecular dynamics (MD) trajectories are started from x0 (with randomized initial velocities drawn from a Maxwell distribution). Since the trajectories are long enough to reach either the folded or unfolded state, every simulation has only two possible outcomes. Therefore, each simulation represents a Bernoulli trial. Given a conformation specific transmission probability, Pfold(x0) = p, we expect a finite sequence of N independent Bernoulli trials to contain n folding events with probabilities determined by the binomial distribution
.
Unfortunately, statistical error is an obstacle to the application of this method. The uncertainty in each Pfold calculation may be estimated by the standard deviation of the binomial distribution mean,
, such that a Pfold value calculated using 80 simulations retains half the statistical error of a Pfold value calculated with 20 simulations. In this work, we employ an alternate treatment of the statistical error using Bayesian inference (Appendix) to estimate each specific transmission probability. Our error bounds are 95% credible intervals for each Pfold obtained by integrating the Bayesian posterior distribution.
The idea of conformation-specific transition probabilities has a long history, and the term Pfold has been used to describe several subtly different calculations. Onsager originally described a splitting probability for the case of ion-pair recombination (16
). With increased computational power it became possible to apply this idea to complex reactions. Du et al. calculated the relative probability of reaching sink F before sink U for a lattice protein, where sink F corresponded to the native microstate and sink U corresponded to an unfolded macrostate with few native contacts (12
). Thanks to the computational tractability of lattice protein models, they were able to run many trial Monte Carlo (MC) trajectories to completion (N = 400) from each initial conformation of interest. A similar approach was applied to all-atom MD simulations of a hairpin, terminating trial trajectories when they reached specified cutoffs (17
). Whereas Du et al. originally termed their calculation a "transmission coefficient" it is now often termed the probability of folding, Pfold. Caflisch and co-workers have applied these methods to atomistic MD simulations of various systems including sizable proteins. For example, Gsponer and Caflisch calculated six approximate Pfold values (N = 10) for an SH3 domain (18
). Pfold calculations are challenging because the relaxation times can be quite slow; 10% of the SH3 simulations had not reached either sink after 200 ns. Such uncommitted trajectories are common and are problematic because excluding uncommitted trajectories from the Pfold ratio decreases the precision and rigor of the calculated probabilities.
Timescale of commitment
Others have explicitly considered the timescale for commitment, adopting a methodology that eliminates uncommitted trajectories by construction. Rather than calculating the probability of reaching sink F before sink U, one calculates the probability of being within macrostate F after some elapsed time,
commit. To understand why a short relaxation time
commit is sufficient, we refer to nonequilibrium statistical mechanics. The Onsager regression hypothesis asserts that relaxation from a prepared nonequilibrium ensemble is related to the equilibrium relaxation of a spontaneous fluctuation (19
,20
). Chandler provides a derivation of this relationship using the fluctuation-dissipation theorem and linear response theory with the assumption that the initial perturbed ensemble is not far from equilibrium (21
). Dellago, Bolhuis, and Geissler derive the Onsager regression hypothesis for arbitrarily large perturbations away from equilibrium (22
).
Here, each initial nonequilibrium ensemble consists of a single activated protein conformation with randomly selected Maxwell-Boltzmann initial velocities. The activated conformation must relax to a nonequilibrium population of macrostates F and U within the transient
commit interval. From this initial nonequilibrium population of F and U, we may calculate the reactive flux across a dividing cutoff between F and U (21
,23
). After
commit, the reactive flux across the cutoff matches the phenomenological rate constant. The
commit time is much shorter than the barrier-crossing time
rxn. This separation of timescale between
commit and
rxn is a natural consequence of the free energy barrier. Long timescale behavior (
t >
rxn) is quite different: all correlation with x0 is lost, the reactive flux goes to zero, and the ensemble fraction of F and U approaches the equilibrium constant. Roughly speaking, we may think of
commit as A), the timescale for relaxation from the transition state to the free energy minima, B), the timescale at which recrossing the dividing surface becomes negligible, or C), the timescale at which the ensemble fraction on either side of the cutoff reaches a plateau (24
).
These properties allow us to calculate transmission probabilities after an elapsed time. For example, Hubner et al. calculated the fraction of MC trials that met various folding cutoffs after
commit = 107 MC steps (14
). In other words, Shakhnovich and co-workers estimated the time necessary for trial trajectories to commit to a stable macrostate and calculated the fraction within macrostate F at this time. They term this fraction Pfold. An alternate terminology was introduced by Bolhuis and co-workers, who define the "committor" PA, the probability that short trajectories initiated from a particular configuration with randomly chosen initial moments will end inside cutoff F after a short interval
commit.
We employ a committor we call the probability of transmission, Ptrans(
t,x0). We define Ptrans(
t,x0) in the following manner: we choose a fixed dividing surface separating product from reactant, start N trajectories from a specific conformation x0, and calculate the fraction of trajectories on the product side of the dividing surface after an elapsed time
t (Fig. 1). We call our committor Ptrans because it is defined over four different transitions rather than a single folding transition. Also, we wish to avoid overloading the term Pfold. We defined Ptrans values for each barrier in such a way that Ptrans = 1 corresponds to commitment to the more folded state and Ptrans = 0 corresponds to the less folded state. Where convenient, we include the arguments
t and x0 to emphasize that each Ptrans(
t,x0) calculation is specific to an individual conformation and depends on the elapsed time. Qualitatively, Ptrans(
t,x0) and Pfold both aim to provide the same splitting probability. Excepting the placement and sensitivity analysis of the cutoff at the barrier top and the explicit consideration of the convergence with time, our Ptrans(
t,x0) calculations are quite similar to previous Pfold calculations (14
).
|
ß topology and folds on the millisecond timescale. The parent molecule NTL9 has been thoroughly studied by the Raleigh laboratory (25
Native state simulations at moderate and high temperature have produced many unfolding events (complete loss of the ß-sheet), and we describe new order parameters to track the loss of each strand pair (1:2 and 1:3). A set of native sheet hydrogen bonds was chosen from inspection of native state simulations (Fig. 2 D). A set of strand
-carbon positions was chosen from inspection of the native NMR model. The two strand pair order parameters (S1:2, S1:3) are the sum of the DSSP (http://swift.cmbi.ru.nl/gv/dssp/) hydrogen bond energy for the set of hydrogen bonds between each strand (E1:2, E1:3) and the RMSD of the inter-
-carbon distance matrix (dRMSD) to the equivalent native model strand pair (D1:2, D1:3). The electrostatic component provides dynamic range to distinguish strand pairs with strong native hydrogen bonds, and the dRMSD component tracks differences between conformations that lack native hydrogen bonds.
|
|
| RESULTS |
|---|
|
|
|---|
We show the same 200°C unfolding ensemble in terms of two other popular order parameters:
-carbon RMSDc
to the native model and the fraction of native contacts retained, Q (Fig. 2 e). The set of native contacts was chosen from inspection of the 25°C native ensemble after 100 ns. The RMSDc
and Q order parameters do not resolve the strand pair intermediates noted in Fig. 2 c. Similarly, we expect that additional distinct kinetic species are unresolved by the S1:2 and S1:3 strand quality order parameters. Unresolved kinetic species are troublesome when they persist for timescales exceeding the simulation length.
S1:2 and S1:3 successfully separate macrostates as shown in Fig. 2, ac, but individual reactive trajectories cross the dividing cutoffs many times. Therefore, S1:2 and S1:3 are effective order parameters but are not optimal reaction coordinates. Interpreting single trajectory motion in terms of strand pair quality is difficult because the dynamics are diffusive, depend on orthogonal degrees of freedom, and are rough at the 250-ps frequency at which we record snapshots. These shortcomings make it difficult to directly extract barrier-crossing dynamics of single trajectories. Furthermore, it is inappropriate to interpret the strand pair order parameters physically because they combine geometric and energetic criteria and thus lack meaningful units.
Such difficulties can be bypassed with ensemble measurements. To characterize the rate of transition from the native basin to an intermediate, we may simply consider the net change in population. We divide the L939 conformational space into four quadrants (Fig. 2, ac) corresponding to the folded state F, the intermediate state I13 that retains the 1:3 strand pair, the intermediate state I12 that retains the 1:2 strand pair, and the unfolded state U. The two dividing cutoffs are quite simple and correspond to S1:2 or S1:3 = 0. Shifting this cutoff ±0.5 along each barrier changed the Ptrans values very little (0.0480.063 RMSD). The transmission probability is robust with respect to the cutoff because the overlap between macrostates is minimized at the dividing surface.
Calculating Ptrans for conformations along specific barriers is slightly complicated by the existence of four macrostates. It is an approximation to treat individual barriers as independent two-state transitions. For example, when measuring Ptrans for an activated F-I13 conformation (loss of the 1:2 strand pair), a small fraction of the trajectories (<5%) appears to cross the other cutoff (loss of 1:3). These cutoff crossing events reflect both noise and true side reaction transitions. A meaningful Ptrans(
t,x0) value should exist as long as
t is short compared to subsequent processes that affect the ratio of product to reactant. When computing each Ptrans value, including or excluding the small side reaction fraction produced similar results (data not shown). We included this fraction. For example, when measuring the Ptrans for the F-I13 transition, the unfolded quadrant is treated as equivalent to the I13 quadrant and the I12 quadrant is treated as equivalent to the F quadrant (Fig. 2).
Kinetic memory, a transition state identification heuristic
We used a heuristic to select putative TSE members from a 200°C unfolding ensemble. The ensemble consisted of thousands of simulations of varying length (e.g., 4,700 simulations reach 10 ns, 1,500 continue to 50 ns) for an aggregate simulation time of more than 200 ms. This ensemble contains more than 500 independent unfolding events through each identified barrier (using S1:2 and S1:3). For each unfolding event, we selected the last conformation before barrier crossing. To filter out transitions that rapidly recross the cutoff, we provided each conformation with a 1-ns kinetic memory and considered only putative transition state conformations from trajectories that dwell in both the prior and posterior quadrant for 1 ns. We collected 500 putative conformations for each identified barrier and calculated the transmission probability for each conformation. Despite our efforts to pick conformations that correspond to pivotal barrier-crossing instances, only a subset (
10%) of the putative TSE members had transmission probability values between 0.4 and 0.6 (Fig. 3 a). This reflects the difficulty of selecting TSE members using a heuristic and underscores the need to validate putative TSE conformations.
|
To most accurately estimate the probability of transmission, we must run simulations long enough that Ptrans(
t,x0) converges to a plateau. Unless otherwise noted, all Ptrans values represent the ensemble fraction calculated at 10 ns, Ptrans(10 ns,x0). To demonstrate that 10 ns is a reasonable commitment time, we show the good correlation between Ptrans(10 ns,x0) values and Ptrans(20 ns,x0) values (Fig. 3 b) with RMSD values for each barrier (500 conformations) ranging from 0.043 to 0.058. Transmission probability values calculated after 5 ns, Ptrans(5 ns,x0), were also quite similar (RMSD = 0.061) to Ptrans(10 ns,x0). In contrast, Ptrans values calculated in the first several nanoseconds had not yet converged. For example, the Ptrans(250 ps,x0) values were quite different (RMSD = 0.219) from the Ptrans(10 ns,x0) values (Fig. 3 c). The rate at which the Ptrans(
t,x0) values converged to a stable value could be observed by plotting the RMSD of all Ptrans(
t,x0) values versus all reference Ptrans(10 ns,x0) values (Fig. 3 d). Given the rapid decay in the RMSD of all Ptrans values, we infer that most individual Ptrans(
t,x0) values quickly converge. This relaxation is imperfectly fit by a single exponential decay with rate constant kdecay = (2.4 ns)1 and is well fit by a stretched exponential
with kdecay = (1.8 ns)1 and the stretching factor ß = 0.51.
These rates provide a rough estimate for the molecular timescale
commit of this barrier. Stretched exponentials have been previously used to model downhill processes or processes with many relevant kinetic states (36
). The rapid relaxation of an activated ensemble of varied initial states fits both criteria. It is difficult to extract additional information from this initial relaxation. At short times (Fig. 3 c), the Ptrans is usually lower than the plateau value, reflecting a transient unfolded excess. This property might reflect the origin of the initial conformations from the high temperature 200°C ensemble or a tendency of the selection heuristic (which selects putative transition states immediately before unfolding events).
We show typical Ptrans(
t,x0) time courses in Fig. 3, e and f. A small fraction of the initial conformations do not reach stable Ptrans(
t,x0) values in the length of our simulations. This is a source of noise in the remaining analysis. We have not culled unconverged Ptrans(
t,x0) values to avoid introducing bias. One direction for future work is the careful time averaging and extraction of Ptrans(
t,x0) plateau values from each Ptrans(
t,x0) trace.
The challenges associated with precisely calculating Ptrans values are significant: both the requirement for hundreds of MD simulations per trial conformation and the requirement that these simulations reach the characteristic timescale
commit. We have demonstrably satisfied these constraints here given four distinct strand pair transitions of a small protein that folds on the millisecond timescale. It is quite likely that longer proteins will exhibit transitions that require MD simulations of greater length. It may also be difficult to map larger protein motions to distinct two-state barrier-crossing events, a necessary precondition for studying the kinetics of activated conformations using these methods.
Ptrans correlation and Hammond effects
Now we turn to the applications of detailed kinetic observations. One question we address is the effect of polymer temperature on the TSE. To consider the effect of temperature on both the protein and solvent, one would ideally employ an explicit solvent model parameterized to reproduce solvent properties and the hydrophobic effect over a large range of temperature values. In the absence of such models, we have employed an implicit solvent model that does not change with temperature. Accordingly, our high temperature simulations only reflect the effect of temperature on the polymer. This perturbation is useful to determine the nature of the folding reaction at room temperature through observable shifts in Ptrans. Also, we probe the shape of the 25°C and 82°C free energy landscapes at points chosen from nonequilibrium high temperature unfolding ensembles. Ideally, such probe points would be selected from an equilibrium ensemble at room temperature to guarantee that the points sample a relevant portion of the landscape.
Caveats aside, we believe that we have sampled the relevant TSEs and can generalize the effect of polymer temperature. A drastic change in the L939 TSE due to polymer temperature seems unlikely given the similar saddle point positions observed in the strand quality projection (Fig. 2, ac). Robust unfolding mechanisms with respect to temperature have been observed in previous simulations including explicit solvent unfolding simulations (37
).
The expected effect of elevated temperature is Hammond behavior: the TSE should become more similar to the native state. For example, a conformation that is a transition state conformation with Ptrans = 0.5 at room temperature would have Ptrans < 0.5 at elevated temperature. Fig. 4 shows the Ptrans correlation between 25°C and 82°C for each trial conformation. As expected, the room temperature Ptrans values were higher than the Ptrans values at the experimental Tm. Qualitatively, the 25°C and 82°C free energy barriers appear quite similar. We then calculated the transition state shifts (
x
) and barrier width changes (
') most consistent with observed Ptrans shifts, assuming a quadratic free energy barrier shape. For a more detailed explanation of the fitting routine and theory, we refer the reader to Rhee and Pande (38
). In all cases, the small observed temperature shifts qualitatively matched expected Hammond effects, moving the TSE toward the more folded minima. Furthermore, the sigmoid shape of Fig. 4 a reveals that the F-I13 free energy barrier both shifts slightly and contracts as the temperature drops to 25°C. The barrier contraction may be rationalized by considering that low temperature is likely to stabilize I13 in addition to F. Since Pfold is exponentially dependent on the free energy barrier (38
), we expect Ptrans correlation to be a sensitive probe of perturbations to the free energy landscape.
|
t,x0) values, but the deviations from the identity line are small. To gauge the certainty of the fitting process and construct confidence intervals for the calculated changes in barrier width and position, we use a bootstrap error analysis. Essentially, we repeated the Ptrans correlation fitting process for 100 bootstrap samples. Each bootstrap sample contains 500 Ptrans values drawn with replacement from the full Ptrans distribution. Using 95% confidence bounds, we found that the most evident differences between each barrier were statistically justified (Fig. 4); namely, the TSE shift for the I12-U transition is larger than the TSE shift for I13-U, which is in turn larger than the TSE shift for the other two transitions. Also, the 25°C barrier width decreases more for the F-I13 transition than the other three transitions. These Ptrans correlation calculations provide a first step toward quantitative Hammond effect analysis, with comparison of the magnitude of barrier shifts and width changes.
Ptrans diffusion rate and transition rate theory
To the extent that the free energy profile along Ptrans is the ideal one-dimensional coordinate for representing reaction kinetics, the intrinsic diffusion constant along this coordinate is the ideal configurational diffusion constant. In the limit of small perturbations around the TSE, we postulate that diffusion along the Ptrans reaction coordinate should reflect flat one-dimensional diffusion (i.e., in the absence of a free energy gradient). If this hypothesis is correct, we would predict a Gaussian distribution of Ptrans values to arise from a transition state after short times.
Calculating a configurational diffusion constant directly from simulation is an avenue for comparison with analytical folding theory. Theoretical work related to diffusion along structural reaction coordinates is quite extensive (39
45
). For example, Wolynes and co-workers relate the configurational diffusion constant on a structural reaction coordinate Q to the autocorrelation time of Q within a neighboring basin of attraction (39
). Baumketner and Hiwatari provide a recent example, calculating the configurational diffusion constant along a physical reaction coordinate for an off-lattice hairpin using an ensemble averaged time correlation (40
). We believe that direct calculation of diffusion at the TSE using a kinetic reaction coordinate will complement theories that have been developed for the diffusion of proteins along structural reaction coordinates.
We calculated diffusion along the Ptrans coordinate for the F-I13 transition. To do so, we constructed the distribution P(Ptrans|
t = 250 ps) using 144 trial daughter conformations, each a snapshot 250 ps after an initial father conformation x0 with Ptrans
0.5. The father conformation is the validated transition state conformation shown in Fig. 5. We expected 250 ps to be a short interval relative to barrier crossing and thus anticipated that most of the 144 daughter conformations would also be transition state conformations. We calculated Ptrans for each conformation. As expected, the daughter conformations were still activated (0.20 < Ptrans < 0.70). The daughter distribution P(Ptrans|
t = 250 ps) is shown in Fig. 6 a. For a finite number of shooting trajectories, N, the width of the daughter distribution is partially due to uncertainty in each Ptrans calculation. To minimize the contribution of the statistical uncertainty, we simulated many (N » 400) trajectories for each daughter Ptrans value.
|
|
0.024) relative to the observed standard deviation (0.097) of the P(Ptrans|
t = 250 ps) distribution. Second, we extracted an estimate for the limiting standard deviation of the P(Ptrans|
t = 250 ps) distribution by plotting the standard deviation for cumulative subsets of the data (Fig. 6 c). The standard deviation of the P(Ptrans|
t = 250 ps) distribution decays with increasing N, converging to a limiting standard deviation. Specifically, the standard deviation trend was least squares fit by A x N0.5 + C with C = 0.093. Third, we directly illustrate the statistical noise (Fig. 6 b) by regrouping
50,000 simulation outcomes to construct 110 uniformly distributed samples (three trajectories per daughter conformation, N
432). In other words, we calculate Ptrans for a cluster of similar conformations rather than one distinct conformation. Each of these 110 samples is an independent estimate of the Ptrans value for the entire cluster of daughter conformations. Accordingly, the width of the resulting distribution, P(Ptrans), reflects statistical noise, and the standard deviation trend was successfully least squares fit by A x N0.5. These three arguments demonstrate that the spread seen in Fig. 6 a is not merely the statistical noise associated with individual Ptrans calculations but is instead related to the properties of the 250-ps daughter ensemble.
The calculation of Ptrans for an entire cluster is not merely a control. One might wish to treat the entire set of daughter conformations as a homogeneous cluster because they are structurally and kinetically related. In fact, Rao et al. have provided a precedent, employing an algorithm that calculates Pfold for a cluster of structurally related conformations given a long equilibrium simulation (46
). Given sufficient trajectories, the Ptrans value of a cluster should converge to a single value corresponding to the ensemble average Ptrans. Indeed, if we estimate Ptrans using all data for the daughter cluster, n = 26,346 and N = 57,510, we obtain the most precise commitment fraction calculated to date (Ptrans = 0.458 ± 0.002).
The P(Ptrans|
t = 250 ps) distribution reflects diffusion along a kinetic reaction coordinate. The Gaussian shape of this distribution (0.01 skewness and 0.04 kurtosis) supports the hypothesis that kinetic diffusion near the TSE is flat. As a proof of concept, we use the P(Ptrans|
t = 250 ps) distribution to obtain a configurational diffusion constant. As described above, the standard deviation of the P(Ptrans|
t = 250 ps) distribution was estimated to be 0.093 in the limit of large N. Therefore, the most rapid motion along the Ptrans coordinate,
Ptrans/
t, was 0.093 per 250 ps, or 0.374 ns1. We take the inverse,
t/
Ptrans = 2.7 ns per event, as an estimate of
commit, the molecular timescale for this transition. This corresponds well with 2.4 ns, the rough estimate of
commit obtained earlier as the timescale for Ptrans(
t) plateau.
However, interpreting
Ptrans/
t as a velocity is incorrect because the standard deviation of a diffusing distribution
is proportional to the square root of the elapsed time
t and the diffusion constant D. For short
t, we extract a proper diffusion constant assuming diffusion along a flat one-dimensional coordinate,
. We calculate D = 0.0175 ns1 at the barrier top. If ideal diffusion holds, a
function at time zero (Ptrans = 0.5) diffusing with D = 0.0175 ns1 will be completely absorbed by boundaries at 0 and 1 within 16 ns. This absorption process may be roughly approximated as an exponential decay (
commit = 5.3 ns). However, the observed ideal Gaussian diffusion will not hold after longer elapsed times. At longer
t the simulations will relax into product and reactant minima and the distribution P(Ptrans|
t) should become bimodal. Also, an asymmetric P(Ptrans|
t) distribution could arise from a free energy gradient or differential diffusion. Future simulations started from daughter conformations with longer
t should allow us to gauge these effects and to deduce the shape of the free energy barrier.
Classical transition state theory treats the folding process with a single constant prefactor, k0, where
. The Ptrans diffusion rate provides a route to test this approximation by determining the configurational diffusion constant for different free energy barriers, altered temperature, mutation, or even for a different member of the same TSE. The transition state prefactor, which relates to the maximum speed of protein folding, depends on the nature of the conformational transition and polymer length. Eaton and co-workers have proposed a Nres/100-µs rule of thumb for the protein folding speed limit where Nres is the number of residues (47
). This would predict a
400-ns speed limit for L939 which actually folds more than three orders of magnitude more slowly. This is unsurprising given the complexity of L939 topology relative to other small proteins. Also, in this work we are examining a single transition within the context of a larger folding process. Furthermore, barrier top diffusion may be rapid in comparison to diffusion along other portions of the free energy landscape. Our most rapid observed relaxation time was
3 ns, which is just faster than the most rapid events characterized by Kiefhaber and co-workers in studies of triplet-triplet energy transfer across various peptide sequences (48
). The speed limit rule of thumb, our Ptrans-diffusion calculations, and the presence of intermediates are all consistent with the premise that L939 could be optimized to fold more quickly. The K12M mutant itself is one such an example, as it folds more quickly (1209 ± 72 s1) than wild-type L939 (778 ± 72 s1) (13
).
Kinetic similarity and structural similarity
Our data are well suited to address the connection between kinetic and structural similarity. As the structural deviation between two conformations goes to zero,
Ptrans must also go to zero. How similar must two conformations become before we may assume that their kinetic properties are similar? With our Ptrans-diffusion data set we can address this question (Fig. 6 D). For each pair of daughter conformations, we calculated the RMSD in the inter-
-carbon distance matrix (C
dRMSD) and found a fairly normal distribution centered around 1.33 Å. In contrast, the absolute change in the transmission probability |
Ptrans| between each pair of conformations was biased toward low changes because the possible values are restricted to the range from 0 to 1.
Plotting the correlation between these properties for each daughter pair, we found that larger structural deviations did not prevent similar Ptrans values. However, conformations within 1 Å C
dRMSD rarely had |
Ptrans| > 0.25. Thus, we pose the hypothesis that a structural cutoff, C
dRMSD < 1 Å may be sufficient to ensure kinetic similarity (|
Ptrans| < 0.25) for sufficiently large protein conformation transitions. The ability to infer kinetic similarity from structural similarity is crucial when clustering conformations to build a Markovian state model, a master equation description of global kinetics (49
51
). Specifically, a principal limitation of Markov state models is that one must only cluster protein conformations with similar kinetic properties. An ideal clustering metric would be sufficiently detailed to capture the precise structure-kinetics correlation yet would be general enough to detect every slow protein transition.
| CONCLUSIONS |
|---|
|
|
|---|
We demonstrated Hammond effects using kinetic measurements and found changes to both the free energy barrier position and width. Transition state shifts due to polymer temperature (25°C vs. 82°C) were small. Finally, we determined that it is possible to estimate a transmission probability diffusion rate. With precise Ptrans values (
400 simulations per conformation), we determined the molecular timescale and the rate of kinetic diffusion at the barrier top. These parameters should prove useful to decouple the kinetic and thermodynamic contributions to the rate theory of L939 folding.
| APPENDIX |
|---|
|
|
|---|
, is
. This expression has three important terms. The posterior distribution,
, is proportional to the data likelihood,
, times the prior distribution,
. Because each simulation has two possible outcomes, the data likelihood is given by the binomial distribution
.
The prior distribution contains any information about the distribution of p that is available before data collection. One may bias the result toward a predicted p value in exchange for more rapid convergence. Here, we used a so-called uninformative prior
that is simply a constant distribution and introduces no bias. The Beta distribution is also a conjugate prior, a convenient choice of prior distribution that simplifies the derivation of the posterior distribution. The form of the Beta distribution
reveals that the Beta distribution is similar to the binomial distribution, with the parameters
and ß replacing n and N n, respectively. The
-functions simply normalize the Beta distribution.
All factors that do not depend on p are constants and can be ignored, leaving a posterior distribution for p proportional to
. This posterior distribution is itself a Beta distribution and equals
for the uniform prior. As additional observations are collected, the posterior distribution depends only slightly on the prior. The peak of the posterior distribution may fall at any value between 0 and 1. The expectation value and variance of
are
and
, respectively. Upper and lower limits for 95% confidence intervals were determined to each exclude 2.5% of the probability distribution. All distribution statistics were calculated using Mathematica 5.0 (52
).
| ACKNOWLEDGEMENTS |
|---|
|
|
|---|
C.D.S. and Y.M.R. were supported by fellowships from the Howard Hughes Medical Institute and Stanford University, respectively.
Submitted on October 7, 2005; accepted for publication February 1, 2006.
| REFERENCES |
|---|
|
|
|---|
2. Onuchic, J. N., and P. G. Wolynes. 2004. Theory of protein folding. Curr. Opin. Struct. Biol. 14:7075.[CrossRef][Medline]
3. Chavez, L. L., J. N. Onuchic, and C. Clementi. 2004. Quantifying the roughness on the free energy landscape. Entropic bottlenecks and protein folding rates. J. Am. Chem. Soc. 126:84268432.[CrossRef][Medline]
4. Sanbonmatsu, K. Y., and A. E. Garcia. 2002. Structure of Met-enkephalin in explicit aqueous solution using replica exchange molecular dynamics. Proteins. 46:225234.[CrossRef][Medline]
5. Best, R. B., and G. Hummer. 2005. Reaction coordinates and rates from transition paths. Proc. Natl. Acad. Sci. USA. 102:67326737.
6. Ding, F., N. V. Dokholyan, S. V. Buldyrev, H. E. Stanley, and E. I. Shakhnovich. 2002. Direct molecular dynamics observation of protein folding transition state ensemble. Biophys. J. 83:35253532.
7. Davis, R., C. M. Dobson, and M. Vendruscolo. 2002. Determination of the structures of distinct transition state ensembles for a ß-sheet peptide with parallel folding pathways. J. Chem. Phys. 117:95109517.[CrossRef]
8. Bolhuis, P. G. 2003. Transition-path sampling of ß-hairpin folding. Proc. Natl. Acad. Sci. USA. 100:1212912134.
9. Bolhuis, P. G., D. Chandler, C. Dellago, and P. L. Geissler. 2002. Transition path sampling: throwing ropes over rough mountain passes, in the dark. Annu. Rev. Phys. Chem. 53:291318.[CrossRef][Medline]
10. Li, L., and E. I. Shakhnovich. 2001. Constructing, verifying, and dissecting the folding transition state of chymotrypsin inhibitor 2 with all-atom simulations. Proc. Natl. Acad. Sci. USA. 98:1301413018.
11. Hummer, G. 2004. From transition paths to transition states and rate coefficients. J. Chem. Phys. 120:516523.[CrossRef][Medline]
12. Du, R., V. S. Pande, A. Y. Grosberg, T. Tanaka, and E. S. Shakhnovich. 1998. On the transition coordinate for protein folding. J. Chem. Phys. 108:334350.[CrossRef]
13. Horng, J. C., V. Moroz, and D. P. Raleigh. 2003. Rapid cooperative two-state folding of a miniature alpha-beta protein and design of a thermostable variant. J. Mol. Biol. 326:12611270.[CrossRef][Medline]
14. Hubner, I. A., J. Shimada, and E. I. Shakhnovich. 2004. Commitment and nucleation in the protein G transition state. J. Mol. Biol. 336:745761.[CrossRef][Medline]
15. Wei, W. Y., and M. Gruebele. 2003. Folding at the speed limit. Nature. 423:193197.[CrossRef][Medline]
16. Onsager, L. 1939. Initial recombination of ions. Phys. Rev. 54:554557.
17. Pande, V. S., and D. S. Rokhsar. 1999. Molecular dynamics simulations of unfolding and refolding of a beta-hairpin fragment of protein G. Proc. Natl. Acad. Sci. USA. 96:90629067.
18. Gsponer, J., and A. Caflisch. 2002. Molecular dynamics simulations of protein folding from the transition state. Proc. Natl. Acad. Sci. USA. 99:67196724.
19. Onsager, L. 1931. Reciprocal relations in irreversible processes. I. Phys. Rev. 37:405426.[CrossRef]
20. Onsager, L. 1931. Reciprocal relations in irreversible processes. II. Phys. Rev. 38:22652279.[CrossRef]
21. Chandler, D. 1987. Introduction to Modern Statistical Mechanics. Oxford University Press, New York.
22. Dellago, C., P. G. Bolhuis, and P. L. Geissler. 2005. Transition path sampling methods. In Lecture Notes for the International School of Solid State Physics. K. Binder and G. Ciccotti, editors. Springer, Erice, Sicily.
23. Chandler, D. 1978. Statistical mechanics of isomerization dynamics in liquids and the transition state approximation. J. Chem. Phys. 68:29592970.[CrossRef]
24. Montgomery, J. A. J., D. Chandler, and B. J. Berne. 1979. Trajectory analysis of a kinetic theory for isomerization dynamics in condensed phases. J. Chem. Phys. 70:40564066.[CrossRef]
25. Horng, J. C., and D. P. Raleigh. 2003. Phi-values beyond the ribosomally encoded amino acids: kinetic and thermodynamic consequences of incorporating trifluoromethyl amino acids in a globular protein. J. Am. Chem. Soc. 125:92869287.[CrossRef][Medline]
26. Horng, J. C., V. Moroz, D. J. Rigotti, R. Fairman, and D. P. Raleigh. 2002. Characterization of large peptide fragments derived from the N-terminal domain of the ribosomal protein L9: definition of the minimum folding motif and characterization of local electrostatic interactions. Biochemistry. 41:1336013369.[CrossRef][Medline]
27. Cho, J. H., S. Sato, and D. P. Raleigh. 2004. Thermodynamics and kinetics of non-native interactions in protein folding: a single point mutant significantly stabilizes the N-terminal domain of L9 by modulating non-native interactions in the denatured state. J. Mol. Biol. 338:827837.[CrossRef][Medline]
28. Luisi, D. L., C. D. Snow, J. J. Lin, Z. S. Hendsch, B. Tidor, and D. P. Raleigh. 2003. Surface salt bridges, double-mutant cycles, and protein stability: an experimental and computational analysis of the interaction of the Asp 23 side chain with the N-terminus of the N-terminal domain of the ribosomal protein l9. Biochemistry. 42:70507060.[CrossRef][Medline]
29. Snow, C. D., E. J. Sorin, Y. M. Rhee, and V. S. Pande. 2005. How well can simulation predict protein folding kinetics and thermodynamics? Annu. Rev. Biophys. Biomol. Struct. 34:4369.[CrossRef][Medline]
30. Snow, C. D., H. Nguyen, V. S. Pande, and M. Gruebele. 2002. Absolute comparison of simulated and experimental protein-folding dynamics. Nature. 420:102106.[CrossRef][Medline]
31. Ponder, J. W., and F. M. Richards. 1987. An efficient Newton-like method for molecular mechanics energy minimization of large molecules. J. Comput. Chem. 8:10161024.[CrossRef]
32. Allen, M. P. 1980. Brownian dynamics simulation of a chemical-reaction in solution. Mol. Phys. 40:10731087.[CrossRef]
33. Jorgensen, W. L., and J. Tirado-Rives. 1988. The OPLS potential functions for proteins. Energy minimizations for crystals of cyclic peptides and crambin. J. Am. Chem. Soc. 110:16571666.[CrossRef]
34. Qiu, D., P. S. Shenkin, F. P. Hollinger, and W. C. Still. 1997. The GB/SA continuum model for solvation. A fast analytical method for the calculation of approximate Born radii. J. Phys. Chem. A. 101:30053014.[CrossRef]
35. Yun-yu, S., L. Wang, and W. F. van Gunsteren. 1988. On the approximation of solvent effects on the conformation and dynamics of cyclosporin A by stochastic dynamics simulation techniques. Mol. Simul. 1:369383.[CrossRef]
36. Gruebele, M. 1999. The fast protein folding problem. Annu. Rev. Phys. Chem. 50:485516.[CrossRef][Medline]
37. Day, R., B. J. Bennion, S. Ham, and V. Daggett. 2002. Increasing temperature accelerates protein unfolding without changing the pathway of unfolding. J. Mol. Biol. 322:189203.[CrossRef][Medline]
38. Rhee, Y. M., and V. S. Pande. 2005. On the role of chemical detail in simulating protein folding kinetics. Chem. Phys. 323:6677.[CrossRef]
39. Socci, N. D., J. N. Onuchic, and P. G. Wolynes. 1996. Diffusive dynamics of the reaction coordinate for protein folding funnels. J. Chem. Phys. 104:58605868.[CrossRef]
40. Baumketner, A., and Y. Hiwatari. 2002. Diffusive dynamics of protein folding studied by molecular dynamics simulations of an off-lattice model. Phys. Rev. E. 66:011905.[CrossRef]
41. Hummer, G. 2005. Position-dependent diffusion coefficients and free energies from Bayesian analysis of equilibrium and replica molecular dynamics simulations. N. J. Phys. 7:34.[CrossRef]
42. Zwanzig, R. 1988. Diffusion in a rough potential. Proc. Natl. Acad. Sci. USA. 85:20292030.
43. Bryngelson, J. D., and P. G. Wolynes. 1989. Intermediates and barrier crossing in a random energy model (with applications to protein folding). J. Phys. Chem. 93:69026915.[CrossRef]
44. Woolf, T. B., and B. Roux. 1994. Conformational flexibility of o-phosphorylcholine and o-phosphorylethanolamine: a molecular dynamics study of solvation effects. J. Am. Chem. Soc. 116:59165926.[CrossRef]
45. Wang, J., K. Zhang, H. Lu, and E. Wang. 2005. Quantifying kinetic paths of protein folding. Biophys. J. 89:16121620.
46. Rao, F., G. Settanni, E. Guarnera, and A. Caflisch. 2005. Estimation of protein folding probability from equilibrium simulations. J. Chem. Phys. 122:184901.[CrossRef][Medline]
47. Kubelka, J., J. Hofrichter, and W. A. Eaton. 2004. The protein folding speed limit. Curr. Opin. Struct. Biol. 14:7688.[CrossRef][Medline]
48. Krieger, F., B. Fierz, O. Bieri, M. Drewello, and T. Kiefhaber. 2003. Dynamics of unfolded polypeptide chains as model for the earliest steps in protein folding. J. Mol. Biol. 332:265274.[CrossRef][Medline]
49. Swope, W. C., J. W. Pitera, and F. J. Suits. 2004. Describing protein folding kinetics by molecular dynamics simulations. 1. Theory. J. Phys. Chem. B. 108:65716581.
50. Swope, W. C., J. W. Pitera, F. J. Suits, M. Pitman, M. Eleftherious, B. G. Fitch, R. S. Germain, A. Rayshubski, T. J. C. Ward, Y. Zhestkov, and others. 2004. Describing protein folding kinetics by molecular dynamics simulations. 2. Example applications to alanine dipeptide and a beta-hairpin peptide. J. Phys. Chem. B. 108:65826594.
51. Singhal, N., C. D. Snow, and V. S. Pande. 2004. Using path sampling to build better Markovian state models: predicting the folding rate and mechanism of a tryptophan zipper beta hairpin. J. Chem. Phys. 121:415425.[CrossRef][Medline]
52. Wolfram Research. 2003. Mathematica. Wolfram Research, Champaign, IL.
This article has been cited by other articles:
![]() |
G. Settanni and A. R. Fersht |