Originally published as Biophys J. BioFAST on October 27, 2006.
doi:10.1529/biophysj.106.087684
Biophysical Journal 92:547-561 (2007)
© 2007 The Biophysical Society
Refolding upon Force Quench and Pathways of Mechanical and Thermal Unfolding of Ubiquitin
Mai Suan Li *,
Maksim Kouza * and
Chin-Kun Hu
* Institute of Physics, Polish Academy of Sciences, Warsaw, Poland; and
Institute of Physics, Academia Sinica, Nankang, Taipei, Taiwan
Correspondence: Address reprint requests to Prof. Mai Suan Li, E-mail: masli{at}ifpan.edu.pl; or Prof. Chin-Kun Hu, E-mail: huck{at}phys.sinica.edu.tw.
 |
ABSTRACT
|
|---|
The refolding from stretched initial conformations of ubiquitin (PDB ID: 1ubq) under the quenched force is studied using the C
-G
model and the Langevin dynamics. It is shown that the refolding decouples the collapse and folding kinetics. The force-quench refolding-times scale as
F
exp(fq
xF/kBT), where fq is the quench force and
xF
0.96 nm is the location of the average transition state along the reaction coordinate given by the end-to-end distance. This value is close to
xF
0.8 nm obtained from the force-clamp experiments. The mechanical and thermal unfolding pathways are studied and compared with the experimental and all-atom simulation results in detail. The sequencing of thermal unfolding was found to be markedly different from the mechanical one. It is found that fixing the N-terminus of ubiquitin changes its mechanical unfolding pathways much more drastically compared to the case when the C-end is anchored. We obtained the distance between the native state and the transition state
xUF
0.24 nm, which is in reasonable agreement with the experimental data.
 |
INTRODUCTION
|
|---|
Deciphering the folding and unfolding pathways and free energy landscape of biomolecules remains a challenge in molecular biology. Traditionally, folding and unfolding are monitored by changing temperature or concentration of chemical denaturants. In these experiments, due to thermal fluctuations of initial unfolded conformations it is difficult to describe the folding mechanisms in an unambiguous way. With the help of the atomic force microscopy, mechanical force has been used to prepare well-defined initial states of proteins (1
,2
). Using the initial force, fI, which is higher than the equilibrium critical force, fc, to unfold the tandem of poly ubiquitin (Ub), Fernandez and Li (2
) have shown that the refolding can be initiated starting from stretched conformations or force-denaturated ensemble (FDE) and quenching the force to a low constant value, fq (fq < fc). Monitoring folding events as a function of the end-to-end distance (R), they have made the following important observations:- Contrary to the standard folding from the thermal denaturated ensemble (TDE), the refolding under the quenched force is a multiple stepwise process.
- The force-quench refolding time obeys the Bell formula (3
),
, where
is the folding time in the absence of the quench force and
xF is the average location of the transition state (TS).
Motivated by the experiments of Fernandez and Li (2
), we have studied (4
) the refolding of the domain I27 of the human muscle protein using the C
-G
model (5
) and the four-strand ß-barrel model sequence S1 (6
) (for this sequence the nonnative interactions are also taken into account). Basically, we have reproduced qualitatively the major experimental findings listed above. In addition, we have shown that the refolding is a two-state process in which the folding to the native basin attractor (NBA) follows the quick collapse from initial stretched conformations with low entropy. The corresponding kinetics can be described by the biexponential time dependence, contrary to the single exponential behavior of the folding from the TDE with high entropy.
To make the direct comparison with the experiments of Fernandez and Li (2
), in this article we performed simulations for a single domain Ub using the C
-G
model (see Materials and Methods for more details). Because the study of refolding of 76-residue Ub (Fig. 1 a) by all-atom simulations is beyond present computational facilities, the G
modeling is an appropriate choice. Most of the simulations have been carried out at T = 0.85 TF = 285 K. Our present results for refolding upon the force quench are in qualitative agreement with the experimental findings of Fernandez and Li, and with those obtained for I27 and S1 theoretically (4
). A number of quantitative differences between I27 and Ub will be also discussed. For Ub, we have found the average location of the transition state
xF
0.96 nm, which is in reasonable agreement with the experimental value 0.8 nm (2
).

View larger version (36K):
[in this window]
[in a new window]
|
FIGURE 1 (a) Native state conformation of ubiquitin taken from the PDB (PDB ID: 1ubq). There are five ß-strands: S1 (2 6 ), S2 (12 16 ), S3 (41 45 ), S4 (48 49 ), and S5 (6571), and one helix A (23 34 ). (b) Structures B, C, D, and E consist of pairs of strands (S1,S2), (S1,S5), (S3,S5), and (S3,S4), respectively. In the text we also refer to helix A as structure A.
|
|
Experimentally, the unfolding of the polyubiquitin has been studied by applying a constant force (7
). The mechanical unfolding of Ub has been investigated previously using G
-like (8
) and all-atom models (8
,9
). In particular, Irbäck et al. (9
) have explored mechanical unfolding pathways of structures A, B, C, D, and E (see the definition of these structures and the ß-strands in the caption to Fig. 1) and the existence of intermediates in detail. We present our results on mechanical unfolding of Ub for the five following reasons:- The barrier to the mechanical unfolding has not been computed.
- Experiments of Schlierf et al. (7
) have suggested that Cluster 1 (i.e., strands S1, S2, and the helix A) unfolds after Cluster 2 (strands S3, S4, and S5). However, this observation has not yet been studied theoretically.
- Since the structure C, which consists of the strands S1 and S5, unzips first, Irbäck et al. (9
) pointed out that strand S5 unfolds before S2 (the terminal strands follows the unfolding pathway S1
S5
S2). This conclusion may be incorrect, because it has been obtained from the breaking of the contacts within the structure C.
- In pulling and force-clamp experiments, the external force is applied to one end of the proteins, while the other end is kept fixed. Therefore, one important question emerges as to how fixing one terminus affects the unfolding sequencing of Ub. This issue has not been addressed by Irbäck et al. (9
).
- Using a simplified all-atom model, it was shown (9
) that mechanical intermediates occur more frequently than in experiments (7
). It is relevant to ask if a C
-G
model can capture similar intermediates as this may shed light on the role of nonnative interactions.
In this article, from the force dependence of mechanical unfolding times we estimated the distance between the native state and the transition state to be
xUF
0.24 nm, which is close to the experimental results of Carrion-Vazquez et al. (10
) and Schlierf et al. (7
). In agreement with the experiments (7
), Cluster 1 was found to unfold after Cluster 2 in our simulations. Applying the force to the both termini, we studied the mechanical unfolding pathways of the terminal strands in detail and obtained the sequencing S1
S2
S5, which is different from the result of Irbäck et al. (9
). When the N-terminus is fixed and the C-terminus is pulled by a constant force, the unfolding sequencing was found to be very different from the previous case. The unzipping initiates, for example, from the C-terminus but not from the N-terminus. Anchoring the C-end is shown to have a little effect on unfolding pathways. We have demonstrated that the present C
-G
model does not capture rare mechanical intermediates, presumably due to the lack of nonnative interactions. Nevertheless, it can correctly describe the two-state unfolding of Ub (7
).
It is well known that thermal unfolding pathways may be very different from the mechanical ones, as has been shown for the domain I27 (11
). This is because the force is applied locally to the termini while thermal fluctuations have the global effect on the entire protein. In the force case, unzipping should propagate from the termini whereas under thermal fluctuations the most unstable part of a polypeptide chain unfolds first.
The unfolding of Ub under thermal fluctuations was investigated experimentally by Cordier and Grzesiek (12
) and by Chung et al. (13
). If one assumes that unfolding is the reverse of the refolding process then one can infer information about the unfolding pathways from the experimentally determined
-values (14
) and
-values (15
,16
). The most comprehensive
-value analysis is that of Went and Jackson. They found that the C-terminal region, which has very low
-values, unfolds first and then the strand S1 breaks before full unfolding of the
-helix fragment A occurs. However, the detailed unfolding sequencing of the other strands remains unknown.
Theoretically, the thermal unfolding of Ub at high temperatures has been studied by all-atom molecular dynamics (MD) simulations by Alonso and Daggett (17
) and Larios et al. (18
). In the latter work, the unfolding pathways were not explored. Alonso and Daggett have found that the
-helix fragment A is the most resilient toward temperature but the structure B breaks as early as the structure C. The fact that B unfolds early contradicts not only the results for the
-values obtained experimentally by Went and Jackson (14
) but also findings from a high resolution NMR (12
). Moreover, the sequencing of unfolding events for the structures D and E was not studied.
What information about the thermal unfolding pathways of Ub can be inferred from the folding simulations of various coarse-grained models? Using a semiempirical approach, Fernandez predicted (19
) that the nucleation site involves the ß-strands S1 and S5. This suggests that thermal fluctuations break these strands last, but what happens to the other parts of the protein remain unknown. Furthermore, the late breaking of S5 contradicts the unfolding (12
) and folding (14
) experiments. From later folding simulations of Fernandez et al. (20
,21
), one can infer that the structures A, B, and C unzip late. Since this information is gained from
-values, it is difficult to determine the sequencing of unfolding events even for these fragments. Using the results of Gilis and Rooman (22
), we can only expect that the structures A and B unfold last. In addition, with the help of a three-bead model it was found (23
) that the C-terminal loop structure is the last to fold in the folding process and most likely plays a spectator role in the folding kinetics. This implies that strands S4, S5, and the second helix (residues 3840) would unzip first but again the full unfolding sequencing cannot be inferred from this study.
Thus, neither the direct MD (17
) nor indirect folding simulations (19
23
) provide a complete picture of the thermal unfolding pathways for Ub. One of our aims is to decipher the complete thermal unfolding sequencing and compare it with the mechanical one. The mechanical and thermal routes to the denaturated states have been found to be very different from each other. Under the force, e.g., the ß-strand S1, unfolds first, while thermal fluctuations detach strand S5 first. The later observation is in good agreement with NMR data of Cordier and Grzesiek (12
). A detailed comparison with available experimental and simulation data on the unfolding sequencing will be presented. The free energy barrier to thermal unfolding was also calculated.
To summarize, in this article we have obtained the following novel results. We have shown that the refolding of Ub is a two-stage process in which the "burst" phase exists on very short timescales. The construction of the T f phase diagram allows us to determine the equilibrium critical force fc separating the folded and unfolded regions. Using the exponential dependence of the refolding and unfolding times on f,
xF and
xUF were computed. Our results for fc,
xF and
xUF are in acceptable agreement with the experiments. It has been demonstrated that fixing the N-terminus of Ub has much stronger effect on mechanical unfolding pathways compared to the case when the C-end is anchored. In comparison with previous studies, we provide a more complete picture for thermal unfolding pathways, which are very different from the mechanical ones.
 |
MATERIALS AND METHODS
|
|---|
C
-G
model for Ub
We use coarse-grained continuum representation for Ub in which only the positions of C
-carbons are retained. The interactions between residues are assumed to be G
-like and the energy of such a model is (5
)
 | (1) |
Here 
i =
i
0i, ri, i+1 is the distance between beads i and i + 1,
i is the bond angle between bonds (i 1) and i, and
i is the dihedral angle around the ith bond and rij is the distance between the ith and jth residues. Subscripts 0, NC, and NNC refer to the native conformation, native contacts, and nonnative contacts, respectively. Residues i and j are in native contact if r0ij is less than a cutoff distance dc taken to be dc = 6.5 Å, where r0ij is the distance between the residues in the native conformation. With this choice of dc and the native conformation from the PDB (Fig. 1 a), we have the total number of native contacts Qmax = 99.
The first harmonic term in Eq. 1 accounts for chain connectivity and the second term represents the bond-angle potential. The potential for the dihedral angle degrees of freedom is given by the third term in Eq. 1. The interaction energy between residues that are separated by at least three beads is given by 1012 Lennard-Jones potential. A soft-sphere repulsive potential (the fourth term in Eq. 1) disfavors the formation of nonnative contacts. The last term accounts for the force applied to C- and N-termini along the end-to-end vector
. We choose Kr = 100
H/Å2, K
= 20
H/rad2,
, and K
(3) = 0.5
H, where
H is the characteristic hydrogen bond energy and C = 4 Å. Since TF = 0.675
H (see below) and TF = 332.5 K (24
), we have
H = 4.1 kJ/mol = 0.98 kcal/mol. Then the force unit [f] =
/Å = 68.0 pN.
We assume the dynamics of the polypeptide chain obeys the Langevin equation. The equations of motion (see (25
) for details) were integrated using the velocity form of the Verlet algorithm (26
) with the time step
t = 0.005
L, where
L = (ma2/
H)1/2
3 ps.
Simulations
To obtain the T f phase diagram we use the fraction of native contacts or the overlap function (27
)
 | (2) |
where
ij is equal to 1 if residues i and j form a native contact and 0 otherwise, and
(x) is the Heaviside function. The argument of this function guarantees that a native contact between i and j is classified as formed when rij is shorter than 1.2 r0ij. The probability of being in the native state, fN, which can be measured by various experimental techniques, is defined as fN = 

, where
...
stands for a thermal average. The T f phase diagram (a plot of 1 fN as a function of f and T) and thermodynamic quantities were obtained by the multiple histogram method (28
), extended to the case when the external force is applied to the termini (29
,30
). In this case, the reweighting is carried out not only for temperature but also for force. We collected data for six values of T at f = 0 and for five values of f at a fixed value of T. The duration of MD runs for each trajectory was chosen to be long enough to get the system fully equilibrated (9 x 105
L from which 1.5 x 105
L were spent on equilibration). For a given value of T and f, we have generated 40 independent trajectories for thermal averaging.
For the mechanical unfolding we have considered two cases. In the first case, the external force is applied via both termini N and C. In the second case it is applied to either N- or C-terminus.
To simulate the mechanical unfolding the computation has been performed at T = 285 K and mainly at the constant force f = 70, 100, 140, and 200 pN. This allows us to compare our results with the mechanical unfolding experiments (7
) and to see if the unfolding pathways change at low forces. Starting from the native conformation but with different random number seeds the unfolding sequencing of helix A and five ß-stands is studied by monitoring fraction of native contacts as a function of the end-to-end extension. In the case of structures A, B, C, D, and E we consider not only the evolution of the number of intrastructure contacts as has been done by Irbäck et al. (9
), but also the evolution of all contacts (intrastructure contacts and the contacts formed by a given structure with the rest of a protein).
In the thermal unfolding case the simulation is also started from the native conformations and it is terminated when all of the native contacts are broken. Due to thermal fluctuations there is no one-to-one correspondence between R and time. Therefore R ceases to be a good reaction coordinate for describing unfolding sequencing. To rescue this, for each ith trajectory we introduce the progressive variable
, where
is the unfolding time. Then we can average the fraction of native contacts over a unique window 0
i
1 and monitor the unfolding sequencing with the help of the progressive variable
.
 |
RESULTS
|
|---|
Temperature-force phase diagram and thermodynamic quantities
The T f phase diagram, obtained by the extended histogram method (see Materials and Methods), is shown in Fig. 2 a. The folding-unfolding transition, defined by the yellow region, is sharp in the low temperature region but it becomes less cooperative (the fuzzy transition region is wider) as T increases. The weak reentrancy (the critical force slightly increases with T) occurs at low temperatures. This seemingly strange phenomenon occurs as a result of competition between the energy gain and the entropy loss upon stretching. The similar cold unzipping transition was also observed in a number of models for heteropolymers (31
) and proteins (29
) including the C
-G
model for I27 (M. S. Li, unpublished results). As follows from the phase diagram, at T = 285 K, the critical force fc
30 pN, which is close to fc
25 pN, is estimated from the experimental pulling data (to estimate fc from experimental pulling data, we use fmax
fcln(v/vmin) (32
), where fmax is the maximal force needed to unfold a protein at the pulling speed v. From the raw data in Fig. 3 b of Carrion-Vasquez et al. (10
), we obtain fc
25 pN). Given the simplicity of the model this agreement can be considered satisfactory and it validates the use of the G
model.
Fig. 2 b shows the temperature dependence of population of the native state fN. Fitting to the standard two-state curve
, one can see that it works pretty well (solid curve) around the transition temperature but it gets worse at high T due to slow decay of fN. Such a behavior is characteristic for almost all of theoretical models (25
) including the all-atom ones (33
). In fitting we have chosen the hydrogen-bond energy
H = 0.98 kcal/mol in Hamiltonian (1
), so that TF = Tm = 0.675
H/kB coincides with the experimental value 332.5 K (24
). From the fit we obtain
Hm = 11.4 kcal/mol, which is smaller than the experimental value 48.96 kcal/mol indicating that the G
model is, as expected, less stable compared to the real Ub. Taking into account nonnative contacts and more realistic interactions between side-chain atoms is expected to increase the stability of the system.
The cooperativity of the denaturation transition may be characterized by the cooperativity index,
c (see (34
) and (35
) for definition). From simulation data for fN presented in Fig. 2 b we have
c
57, which is considerably lower than the experimental value
c
384 obtained with the help of
Hm = 48.96 kcal/mol and Tm = 332.5 K (24
). The underestimation of
c in our simulations is not only a shortcoming of the off-lattice G
model (36
) but also a common problem of much more sophisticated force fields in all-atom models (33
).
Another measure of the cooperativity is the ratio between the van' t Hoff and the calorimetric enthalpy
2 (37
). For the G
Ub we obtained
2
0.19. Applying the base line subtraction (38
) gives
2
0.42, which is still much below
2
1 for the truly one-or-none transition. Since
2 is an extensive parameter, its low value is due to the shortcomings of the off-lattice G
models but not due to the finite size effects. More rigid lattice models give better results for the calorimetric cooperativity
2 (39
).
Fig. 3 a shows the free energy as a function of Q for several values of force at T = TF. Since there are only two minima, our results support the two-state picture of Ub (7
,13
). As expected, the external force increases the folding barrier,
FF (
FF = FTS FD) and it lowers the unfolding barrier,
FUF (
FUF = FTS FN). From the linear fits in Fig. 3 b we obtain the average distance between the TS and D states,
xF =
FF/f
1 nm, and the distance between TS and the native state,
xUF =
FUF/f
0.13 nm. Note that
xF is very close to
xF
0.96 nm obtained from refolding times at a bit lower temperature T = 285 K (see Fig. 6 below). However,
xUF is lower than value 0.24 nm followed from mechanical unfolding data at f > fc (Fig. 8). This difference may be caused by either sensitivity of
xUF to the temperature, or the determination of
xUF from the approximate free energy landscape, since a function of a single coordinate Q is not sufficiently accurate.
We have also studied the free energy landscape using R as a reaction coordinate. The dependence of F on R was found to be smoother (results not shown) compared to that obtained by Kirmizialtin et al. (40
) using a more elaborated model (23
) involving the nonnative interactions.
Refolding under quenched force
Our protocol for studying the refolding of Ub is identical to that used in the experiments of Fernandez and Li (2
). We first apply the force fI
70 pN to prepare initial conformations (the protein is stretched if R
0.8 L, where the contour length L = 28.7 nm). Starting from the FDE we quenched the force to fq < fc and then monitored the refolding process by following the time dependence of the number of native contacts Q(t), R(t), and the radius of gyration Rg(t) for typically 50 independent trajectories.
Fig. 4 shows considerable diversity of refolding pathways. In accord with experiments (2
) and simulations for I27 (4
), the reduction of R occurs in a stepwise manner. In the fq = 0 case (Fig. 4 a), R decreases continuously from
18 nm to 7.5 nm (stage 1) and fluctuates around this value for
3 ns (stage 2). The further reduction to R
4.5 nm (stage 3) until a transition to the NBA. The stepwise nature of variation of Q(t) is also clearly shown up but it is more masked for Rg(t). Although we can interpret another trajectory for fq = 0 (Fig. 4 b) in the same way, the timescales are different. Thus, the refolding routes are highly heterogeneous.

View larger version (43K):
[in this window]
[in a new window]
|
FIGURE 4 (a,b) The time dependence of Q, R, and Rg for two typical trajectories starting from FDE (fq = 0 and T = 285 K). Arrows 1, 2, and 3 in panel a correspond to time 3.1 (R = 10.9 nm), 9.3 (R = 7.9 nm), and 17.5 ns (R = 5 nm). Arrow 4 marks the folding time F = 62 ns (R = 2.87 nm) when all 99 native contacts are formed. Panels c and d are the same as in panels a and b, but for fq = 6.25 pN. The corresponding arrows refer to t = 7.5 (R = 11.2 nm), 32 (R = 9.4 nm), 95 ns (R = 4.8 nm), and F = 175 ns (R = 3.65 nm).
|
|
The pathway diversity is also evident for fq > 0 (Fig. 4, c and d). Although the picture remains qualitatively the same as in the fq = 0 case, the timescales for different steps becomes much larger. The molecule fluctuates around R
7 nm, e.g., for
60 ns (stage 2 in Fig. 4 c), which is considerably longer than
3 ns in Fig. 4 a. The variation of Rg(t) becomes more drastic compared to the fq = 0 case.
Fig. 5 shows the time dependence of
R(t)
,
Q(t)
, and
Rg(t)
, where
...
stands for averaging over 50 trajectories. The left and right panels correspond to the long and short time windows, respectively. For the TDE case (Fig. 5, a and b), the single exponential fit works pretty well for
R(t)
for the whole time interval. A little departure from this behavior is seen for
Q(t)
and
Rg(t)
for t < 2 ns (Fig. 5 b). Contrary to the TDE case, even for fq = 0 (Fig. 5, c and d) the difference between the single and biexponential fits is evident not only for
Q(t)
and
Rg(t)
but also for
R(t)
. The timescales, above which two fits become eventually identical, are slightly different for three quantities (Fig. 5 d). The failure of the single exponential behavior becomes more and more evident with the increase of fq, as demonstrated in Fig. 5, e and f, for the FDE case with fq = 6.25 pN.
Thus, in agreement with our previous results, obtained for I27 and the sequence S1 (4
), starting from FDE the refolding kinetics compiles from the fast and slow phase. The characteristic timescales for these phases may be obtained using a sum of two exponentials,
, where A stands for R, Rg, or Q. Here
characterizes the burst-phase (first stage) while
may be either the collapse time (for R and Rg) or the folding time (for Q)
. As in the case of I27 and S1 (4
),
and
are almost independent on fq (results not shown). We attribute this to the fact that the quench force
is much lower than the entropy force (fe) needed to stretch the protein. At T = 285 K, one has to apply fe
140 pN for stretching Ub to 0.8 L. Since
, the initial compaction of the chain that is driven by fe is not sensitive to the small values of fq. Contrary to
,
was found to increase with fq exponentially. Moreover,
, implying that the chain compaction occurs before the acquisition of the native state.
Fig. 6 shows the dependence of the folding times on fq. Using the Bell-type formula (3
) and the linear fit in Fig. 6, we obtain
xF
0.96 nm, which is in acceptable agreement with the experimental value
xF
0.8 nm (2
). The linear growth of the free energy barrier to folding with fq is due to the stabilization of the random coil states under the force. Our estimate for Ub is higher than
xF
0.6 nm obtained for I27 (4
). One of possible reasons for such a pronounced difference is that we used the cutoff distance dc = 0.65 and 0.6 nm in the G
model (1
) for Ub and I27, respectively. The larger value of dc would make a protein more stable (more native contacts) and it may change the free energy landscape leading to enhancement of
xF. This problem requires further investigation.
Absence of mechanical unfolding intermediates in C
-G
model
To study the unfolding dynamics of Ub, Schlierf et al. (7
) have performed the AFM experiments at a constant force f = 100, 140, and 200 pN. The unfolding intermediates were recorded in
5% of 800 events at different forces. The typical distance between the initial and intermediate states is
R = 8.1 ± 0.7 nm (7
). However, the intermediates do not affect the two-state behavior of the polypeptide chain. Using the all-atom models, Irbäck et al. (9
) have also observed the intermediates in the region 6.7 nm < R < 18.5 nm. Although the percentage of intermediates is higher than in the experiments, the two-state unfolding events remain dominating. To check the existence of force-induced intermediates in our model, we have performed the unfolding simulations for f = 70, 100, 140, and 200 pN. Because the results are qualitatively similar for all values of force, we present the f = 100 pN case only.
Fig. 7 shows the time dependence of R(t) for 15 runs starting from the native value RN
3.9 nm. For all trajectories the plateau occurs at R
4.4 nm. As seen below, passing this plateau corresponds to breaking of intrastructure native contacts of structure C. At this stage, the chain ends are almost stretched out, but the rest of the polypeptide chain remains nativelike. The plateau is washed out when we average over many trajectories and
R(t)
is well fitted by a single exponential (Fig. 7), in accord with the two-state behavior of Ub (7
).
The existence of the plateau observed for individual unfolding events in Fig. 7 agrees with the all-atom simulation results of Irbäck et al. (9
), who have also recorded a similar plateau at R
4.6 nm at short timescales. However, unfolding intermediates at larger extensions do not occur in our simulations. This is probably related to neglect of the nonnative interactions in the C
-G
model. Nevertheless, this simple model provides the correct two-state unfolding picture of Ub in the statistical sense.
Mechanical unfolding barrier
We now try to determine the barrier to the mechanical unfolding from the dependence of the unfolding times
UF on f. It should be noted that this way of determination of the unfolding barrier is exact and it would give a more reliable estimate compared to the free energy landscape approach in which the free energy profile is approximated as a function of only one order parameter.
We first consider the case when the force is applied via both termini N and C. Since the force lowers the unfolding barrier,
UF should decrease as f increases (Fig. 8). The present G
model gives
UF smaller than the experimental values by approximately eight orders of magnitude. E.g., for f = 100 pN,
UF
12 ns whereas the experiments gives
UF
2.77 s (7
). As seen from Fig. 8, for f < 140 pN
UF depends on f exponentially. In this regime,
, where
xUF is the average distance between the N and TS states. From the linear fit in Fig. 8 we obtained
xUF
0.24 nm. Using different fitting procedures, Schlierf et al. (7
) obtained
xUF
0.14 nm and 0.17 nm. The larger value
xUF
0.25 nm was reported in the earlier experiments (10
). Thus, given experimental uncertainty, the C
-G
model provides a reasonable estimate of
xUF for the two-state Ub.
In the high force regime (f > 140 pN), instead of the exponential dependence,
UF scales with f linearly (inset in Fig. 8). The crossover from the exponential to the linear behavior is in full agreement with the earlier theoretical prediction (32
). Similar crossover has been also observed (41
) for the another G
-like model of Ub but
xUF has not been estimated. At very high forces,
UF is expected to be asymptotically independent of f.
One can show that fixing one terminus of a protein has the same effect on unfolding times no matter whether the N- or C-terminus is fixed. Therefore, we show the results obtained for the case when the N-end is anchored. As seen from Fig. 8, the unfolding process is slowed down nearly by a factor of 2. It may imply that diffusion-collision processes (42
) play an important role in the Ub unfolding. Namely, as follows from the diffusion-collision model, the time, required for formation (breaking) contacts, is inversely proportional to the diffusion coefficient, D, of a pair of spherical units. If one of them is idle, D is halved and the time needed to break contacts increases accordingly. Although fixing one end increases the unfolding times, it does not change the distance between the TS and the native state,
xUF (Fig. 8).
Mechanical unfolding pathways: force is applied to both termini
Here we focus on the mechanical unfolding pathways by monitoring the number of native contacts as a function of the end-to-end extension
R
R Req, where Req is the equilibrium value of R. For T = 285 K, Req
3.4 nm. Following Schlierf et al. (7
), we first divide Ub into two clusters. Cluster 1 consists of strands S1, S2, and the helix A (42 native contacts) and cluster 2, strands S3, S4, and S5 (35 native contacts). The dependence of fraction of intracluster native contacts is shown in Fig. 9 for f = 70 and 200 pN (similar results for f = 100 and 140 pN are not shown). In agreement with the experiments (7
), Cluster 2 unfolds first. The unfolding of these clusters becomes more and more synchronous upon decreasing f. At f = 70 pN the competition with thermal fluctuations becomes so important that two clusters may unzip almost simultaneously. Experiments at low forces are needed to verify this observation.

View larger version (16K):
[in this window]
[in a new window]
|
FIGURE 9 The dependence of fraction of the native contacts on the end-to-end extension for Cluster 1 (solid lines) and Cluster 2 (dashed lines) at f = 70 pN and 200 pN. The results are averaged over 200 independent trajectories. The arrow points to the extension R = 8.1 nm.
|
|
The arrow in Fig. 9 marks the position
R = 8.1 nm, where some intermediates were recorded in the experiments (7
). At this point there is intensive loss of native contacts of Cluster 2, suggesting that the intermediates observed on the experiments are conformations in which most of the contacts of this cluster are already broken but Cluster 1 remains relatively structured (
40% contacts). One can expect that Cluster 1 is more ordered in the intermediate conformations if the side chains and realistic interactions between amino acids are taken into account.
To compare the mechanical unfolding pathways of Ub with the all-atom simulation results (9
), we discuss the sequencing of helix A and structures B, C, D, and E in more detail. We monitor the intrastructure native contacts and all contacts separately. The later include not only the contacts within a given structure but also the contacts between it and the rest of the protein. It should be noted that Irbäck et al. have studied the unfolding pathways based on the evolution of the intrastructure contacts. Fig. 10 a shows the dependence of the fraction of intrastructure contacts on
R at f = 100 pN. At
R
1 nm, which corresponds to the plateau in Fig. 7, most of the contacts of C are broken. In agreement with the all-atom simulations (9
), the unzipping follows C
B
D
E
A. Since C consists of the terminal strands S1 and S5, it was suggested that these fragments unfold first. However, this scenario may be no longer valid if one considers not only intrastructure contacts but also other possible ones (Fig. 10 b). In this case the statistically preferred sequencing is B
C
D
E
A, which holds not only for f = 100 pN but also for other values of f. If it is true, then S2 will unfold even before S5. To make this point more transparent, we plot the fraction of contacts for S1, S2, and S5 as a function of
R (Fig. 11 a) for a typical trajectory. Clearly, S5 detaches from the core part of a protein after S2 (see also the snapshot in Fig. 11 b). So, instead of the sequencing S1
S5
S2 proposed by Irbäck et al., we obtain S1
S2
S5.

View larger version (16K):
[in this window]
[in a new window]
|
FIGURE 10 (a) The dependence of fraction of the intrastructure native contacts on R for structures A, B, C, D, and E at f = 100 pN. (b) The same as in panel a, but for all native contacts. The results are averaged over 200 independent trajectories.
|
|
The dependence of the fraction of native contacts on
R for individual strands is shown in Fig. 12 a (f = 70 pN) and Fig. 12 b (f = 200 pN). At
= 8.1 nm contacts of S1, S2, and S5 are already broken, whereas S4 and A remain largely structured. In terms of ß-strands and A we can interpret the intermediates observed in the experiments of Schlierf et al. (7
), as conformations with well-structured S4 and A, and low ordering of S3. This interpretation is more precise compared to the above argument based on unfolding of two clusters because if one considers the average number of native contacts, then Cluster 2 is unstructured in the intermediate state (Fig. 9), but its strand S4 remains highly structured (Fig. 12).
From Fig. 12, we obtain the following mechanical unfolding sequencing:
 | (3) |
It should be noted that the sequencing (3
) is valid in the statistical sense. In some trajectories, S5 unfolds even before S1 and S2 or the native contacts of S1, S2, and S5 may be broken at the same timescale (Table 1). From Table 1, it follows that the probability of having S1 unfolded first decreases with lowering f but the main trend (3
) remains unchanged. One has to stress again that the sequencing of the terminal strands S1, S2, and S5 given by Eq. 3 is different from that proposed by Irbäck et al. (9
), based on the breaking of the intrastructure contacts of C. Unfortunately, there are no experimental data available for comparison with our theoretical prediction.
Mechanical unfolding pathways: one end is fixed
N-terminus is fixed
Here we adopted the same procedure as in the previous section except the N-terminus is held fixed during simulations. As in the process where both of the termini are subjected to force, one can show that Cluster 1 unfolds after Cluster 2 (results not shown).
From Fig. 13, we obtain the unfolding pathways
 | (4a) |
 | (4b) |
which are also valid for the other values of force (f = 70, 100, and 140 pN). Similar to the case when the force is applied to both ends, the structure C unravels first and the helix A remains the most stable. However, the sequencing of B, D, and E changes markedly compared to the result obtained by Irbäck et al. (9
) (Fig. 10 a).
As evident from Eqs. 3 and 5, anchoring the first terminal has a much more pronounced effect on the unfolding pathways of individual strands. In particular, unzipping commences from the C-terminus instead of from the N-terminus. Fig. 13 c shows a typical snapshot where one can see clearly that S5 detaches first. At the first glance, this fact may seem trivial because S5 experiences the external force directly. However, our experience on unfolding pathways of the well-studied domain I27 from the human cardiac titin, e.g., shows that it may be not the case. Namely, as follows from pulling experiments (43
) and simulations (44
), strand A from the N-terminus unravels first, although this terminus is kept fixed. From this point of view, which strand of Ub actually detaches first is, a priori, not clear. In our opinion, it depends on the interplay between the native topology and the speed of tension propagation. The latter factor probably plays a more important role for Ub, while the opposite situation happens with I27. One possible reason for it is related to the high stability of helix A, which does not allow either for the N-terminal to unravel first or for servility in unfolding starting from the C-end.
C-terminus is fixed
One can show that unfolding pathways of structures A, B, C, D, and E remain exactly the same as in the case when Ub has been pulled from both termini (see Fig. 10). Concerning the individual strands, a slight difference is observed for S5 (compare Fig. 13 d and Fig. 12). Most of the native contacts of this domain break before S3 and S4, except the long tail at extension
nm due to high mechanical stability of only one contact between residues 61 and 65 (the highest resistance of this pair is probably due to the fact that among 25 possible contacts of S5 it has the shortest distance |61 65| = 4 in sequence). This scenario holds in
90% of trajectories, whereas S5 unravels completely earlier than S3 and S4 in the remaining trajectories. Thus, anchoring C-terminus has much less effect on unfolding pathways than in the case when the N-terminus is immobile.
It is worthwhile to note that, experimentally, one has studied the effect of extension geometry on the mechanical stability of Ub fixing its C-terminus (10
). The greatest mechanical strength (the longest unfolding time) occurs when the protein is extended between N- and C-termini. This result has been supported by Monte Carlo (10
) as well as MD (8
) simulations. However, the mechanical unfolding sequencing has not been studied yet. It would be interesting to check our results on the effect of fixing one end on Ub mechanical unfolding pathways by experiments.
Thermal unfolding pathways
To study the thermal unfolding we follow the protocol described in Materials and Methods. Two-hundred trajectories were generated starting from the native conformation with different random seed numbers. The fractions of native contacts of helix A and five ß-strands are averaged over all trajectories for the time window 0
1. The unfolding routes are studied by monitoring these fractions as a function of
. Above T
500 K, the strong thermal fluctuations (entropy-driven regime) make all strands and helix A unfold almost simultaneously. Below this temperature, the statistical preference for the unfolding sequencing is observed. We focus on T = 370 and 425 K. As in the case of the mechanical unfolding, Cluster 2 unfolds before Cluster 1 (results not shown). However, the main departure from the mechanical behavior is that the strong resistance to thermal fluctuations of Cluster 1 is mainly due to the stability of strand S2 but not of helix A (compare Fig. 14, c and d, with Fig. 12). The unfolding of Cluster 2 before Cluster 1 is qualitatively consistent with the experimental observation that the C-terminal fragment (residues 3676) is largely unstructured while nativelike structure persists in the N-terminal fragment (residues 135) (45
47
). This is also consistent with the data from the folding simulations (23
) as well as with the experiments of Went and Jackson (14
), who have shown that the
-values
0 in the C-terminal region. However, our finding is at odds with the high
-values obtained for several residues in this region by all-atom simulations (48
) and by a semiempirical approach (19
). One possible reason for high
-values in the C-terminal region is the force fields. For example, Marianayagam and Jackson have employed the GROMOS 96 force field (49
) within the GROMACS software package (50
). It would be useful to check whether the other force fields give the same result.

View larger version (48K):
[in this window]
[in a new window]
|
FIGURE 14 (a) The dependence of fraction of intrastructure native contacts on the progressive variable for all structures at T = 425 K. (b) The same as in panel a, but for T = 370 K. (c) The dependence of the all native contacts of the ß-strands and helix A at T = 425 K. (d) The same as in panel c, but for T = 370 K.
|
|
The evolution of the fraction of intrastructure contacts of A, B, C, D, and E is shown in Fig. 14 a (T = 425 K) and b (T = 370 K). Roughly we have the unfolding sequencing, given by Eq. 5, which strongly differs from the mechanical one. The large stability of the
-helix fragment A against thermal fluctuations is consistent with the all-atom unfolding simulations (17
) and the experiments (14
). The N-terminal structure B unfolds even after the core part E, and at T = 370 K its stability is comparable with helix A. The fact that B can withstand thermal fluctuations at high temperatures agrees with the experimental results of Went and Jackson (14
) and of Cordier and Grzesiek (12
), who used the notation ß1/ß2 instead of B. This also agrees with the results of Gilis and Rooman (22
), who used a coarse-grained model, but disagrees with results from all-atom simulations (17
). This disagreement is probably because Alonso and Daggett studied only two short trajectories and B did not completely unfold (17
). The early unzipping of the structure C (Eq. 5a) is consistent with the MD prediction (17
). Thus our thermal unfolding sequencing (Eq. 5a) is more complete compared to the all-atom simulation, and it gives reasonable agreement with the experiments.
We now consider the thermal unstability of individual ß-strands and helix A. At T = 370 K (Fig. 14 d), the trend that S2 unfolds after S4 is more evident compared to the T = 425 K case (Fig. 14 c). Overall, the simple G
model leads to the sequencing given by:
 | (5a) |
 | (5b) |
From Eqs. 3, 4b, and 5b