

* Center for Polymer Studies and Department of Physics, Boston University, Boston, Massachusetts; and
Department of Biochemistry and Biophysics, School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
Correspondence: Address reprint requests to J. M. Borreguero, E-mail: jmborr{at}bu.edu.
| ABSTRACT |
|---|
| INTRODUCTION |
|---|
Theoretical efforts in the study of protein folding (Bryngelson and Wolynes, 1989; Eaton et al., 2000; Fersht and Daggett, 2002; Karplus and McCammon, 2002; Klimov and Thirumalai, 2002; Ozkan et al., 2002; Pande et al., 2000; Plotkin and Onuchic, 2002; Thirumalai et al., 2002; Tiana and Broglia, 2001) have focused on small, single-domain proteins. Experiments (Jackson, 1998) show that the majority of these proteins undergo the folding transition with no accumulation of kinetic intermediates in the sampled range of experimental conditions. However, kinetic studies of other two-state proteins (Bachmann and Kiefhaber, 2001; Khorasanizadeh et al., 1996) suggest the presence of short-lived intermediates that cannot be directly detected experimentally. In a recent analysis, Sanchez and Kiefhaber (2003) explained the curved chevron plots (the nonlinear dependence of folding and unfolding rates on denaturant concentration; Fersht, 2000; Ikai and Tanford, 1973; Matouschek et al., 1990) of 17 selected proteins by assuming the presence of an intermediate state. In addition, recent molecular dynamics studies of the SH3 domain have suggested the presence of a core-hydrated, native-like intermediate in the latest stages of the folding process (Cheung et al., 2002), as well as an intermediate state in an isolated fragment (Gnanakaran and García, 2003). Led by these studies, we hypothesize that single-domain proteins may exhibit intermediates in the folding transition under suitable environmental conditions.
To test our hypothesis, we perform a molecular dynamics study of the folding pathways of the c-Crk SH3 domain (Berman et al., 2000; Branden and Tooze, 1999; Wu et al., 1995; PDB access code 1cka). The SH3 domain is a family of small globular proteins that has been extensively studied in kinetics and thermodynamics experiments (Filimonov et al., 1999; Grantcharova and Baker, 1997; Grantcharova et al., 1998; Guerois and Serrano, 2000; Knapp et al., 1998; Martinez et al., 1999; Riddle et al., 1999; Viguera et al., 1994; Villegas et al., 1995). We select c-Crk (57 residues, Fig. 1 a) as the SH3 domain representative, and present our results in terms of the following sequence segments: 1), N-terminal (residues 1–7); 2), RT-loop (8–20); 3), diverging turn (21–30); 4), n-Src loop (30–38); 5), distal hairpin (39–50); 6), 3₁₀-helix (51–53); and 7), C-terminus (54–57).
Our previous thermodynamic studies (Borreguero et al., 2002) suggest that the Gō model of interactions, based on the topology of the native state, is a suitable tool to study the folding process of the c-Crk SH3 domain.
The Gō model has been successfully applied to the study of two-state proteins (Clementi et al., 2003; Ding et al., 2002; Zhou and Karplus, 1997), but no studies have assessed the performance of the model for three-state proteins. We test the ability of our model to reproduce intermediate states in a variety of three-state proteins and apparent two-state proteins (Sanchez and Kiefhaber, 2003). We select proteins RNase (Yamasaki et al., 1995), SNAse (Walkenhorst et al., 1997), Barnase (Fersht, 2000), CheY (Lopez-Hernandez and Serrano, 1996), Im7 (Ferguson et al., 1999), and P16 (Tang et al., 1999), for which, at the experimental conditions studied, the existence of intermediates in the folding process has been observed experimentally. We also select proteins Gelsolin-WT (Isaacson et al., 1999) and U1A (Otzen et al., 1999), for which the authors observed a nonlinear dependence of the folding/unfolding rates on urea concentration, although the evidence for the existence of intermediates was not conclusive.
In addition to the folding kinetics investigation, we address the relevance of the initial unfolded state for the subsequent evolution of the folding process. Studies suggest that the protein may retain part of the native structure even under strong denaturing conditions (García et al., 2001; Millet et al., 2002; Shortle and Ackerman, 2001; Zagrovic et al., 2002). A native-like structure of the unfolded state speeds up the conformational search to the native state that the protein must perform. In addition, a native-like structure limits the number of possible folding intermediates, and guarantees that the structure of the intermediates will share some similarities with that of the native state. We perform studies of the initial unfolded state under different temperature conditions, and compare our results in the particular temperature conditions where experiments are available (Kortemme et al., 2000).
We determine the kinetic partition temperature (Thirumalai et al., 2002), TKP, below which the model c-Crk protein exhibits slow folding pathways and above which the protein undergoes a cooperative folding transition with no accumulation of intermediates. Below TKP, we study the presence of one or two intermediates in the slow folding pathways and determine their structures. We find that one of the intermediates populates the folding transition for temperatures as high as TKP, when the intermediate is not stabilized.
| MATERIALS AND METHODS |
|---|
[…] in case of Gly). Details of the model, the surrounding heat bath, and the selection of structural parameters are discussed in a previous study (Borreguero et al., 2002). […] model of interactions (Gō and Abe, 1981) […]
We perform simulations and monitor the time evolution of the protein and the heat bath with the discrete molecular dynamics (DMD) algorithm, which uses step potentials (Alder and Wainwright, 1959; Dokholyan et al., 1998; Rapaport, 1997; Zhou et al., 1997). The earliest molecular dynamics simulations were performed with the discrete algorithm, before the advent of continuous potentials. DMD is faster than conventional molecular dynamics, which makes it a tool of choice for simulating the folding of proteins.
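With step potentials, particles move at constant velocity between events, so a simulation can jump directly from one event to the next. The elementary calculation is the time at which a pair of particles reaches a distance where the potential has a step. The sketch below illustrates only that calculation; it is not the authors' implementation, and the function name `time_to_distance` and its arguments are assumptions made for illustration.

```python
import math


def time_to_distance(r, v, d, approaching=True):
    """Time at which two particles reach separation d, or None if never.

    r and v are the relative position and relative velocity of the pair.
    The function solves |r + v t|^2 = d^2 and returns the smaller root
    (approaching=True: the pair hits the step from outside) or the larger
    root (approaching=False: the pair reaches the step from inside).
    """
    rr = sum(x * x for x in r)
    vv = sum(x * x for x in v)
    rv = sum(x * y for x, y in zip(r, v))
    if vv == 0.0:
        return None  # no relative motion, the separation never changes
    disc = rv * rv - vv * (rr - d * d)
    if disc < 0.0:
        return None  # the pair never reaches separation d
    root = math.sqrt(disc)
    t = (-rv - root) / vv if approaching else (-rv + root) / vv
    return t if t > 0.0 else None


# Head-on approach from separation 3 at unit speed reaches d = 1 at t = 2.
print(time_to_distance((3.0, 0.0, 0.0), (-1.0, 0.0, 0.0), 1.0))
```

An event-driven loop would compute such times for the relevant pairs, advance the system to the earliest one, and update the two velocities there.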
Frequencies and folding simulations
To calculate the frequency map at T = 1.0, we probe the presence of the native contacts in each of the 1100 initially unfolded conformations. Then, we compute the probability that each native contact is present. To calculate the frequency map at Ttarget, we select one particular folding transition and probe the presence of the native contacts during the time interval between the end of the initial relaxation and the folding time tF. To compute tF, we stop the folding simulation when 90% of the native contacts have formed. Then, we trace back the folding trajectory and record tF as the time when the root mean-square deviation (RMSD) with respect to the native state becomes smaller than 3 Å. We consider all protein conformations occurring for t > tF as belonging to the folded state and of no relevance to the folding transition.
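The two quantities described in this subsection can be sketched as follows. This is a minimal illustration under stated assumptions: conformations are given as sets of native-contact index pairs, the trajectory is sampled at discrete frames, and the trace-back step is read as "walk backward from the stopping frame while the RMSD stays below the cutoff". The function names and the data layout are hypothetical.

```python
def contact_frequencies(snapshots):
    """Fraction of snapshots in which each native contact is present.

    `snapshots` is a list of sets; each set holds the (i, j) native
    contacts found in one conformation.
    """
    counts = {}
    for contacts in snapshots:
        for pair in contacts:
            counts[pair] = counts.get(pair, 0) + 1
    n = len(snapshots)
    return {pair: c / n for pair, c in counts.items()}


def folding_time(times, native_fraction, rmsd, q_stop=0.9, rmsd_cut=3.0):
    """Folding time t_F under one reading of the procedure in the text.

    The run stops at the first frame where the fraction of native contacts
    reaches q_stop. From there we walk backward while the RMSD stays below
    rmsd_cut and return the time of the earliest frame of that stretch.
    Returns None if the run never reaches q_stop.
    """
    stop = next((k for k, q in enumerate(native_fraction) if q >= q_stop), None)
    if stop is None:
        return None
    k = stop
    while k > 0 and rmsd[k] < rmsd_cut and rmsd[k - 1] < rmsd_cut:
        k -= 1
    return times[k]
```

Frames after the returned time would then be treated as folded and excluded from the frequency map at Ttarget.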
Contact formation times
Following the kinetics of each particular native contact provides us with a detailed picture of the folding process. We compute the fraction of the 1100 folding simulations for which native contact (i, j) is present at time t and temperature T, $p_{ij}(t, T)$. For this particular contact, we estimate the characteristic contact formation time $t_{ij}(T)$ with the relation $p_{ij}(t_{ij}, T) - p_{ij}(0, T) = e^{-1} \times (p_{ij}(\infty, T) - p_{ij}(0, T))$. When $p_{ij}(t, T)$ is a single exponential distribution, then $t_{ij}(T)$ coincides with the average time of the distribution.
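A formation time defined by such a crossing condition can be read off a sampled frequency curve by interpolation. The sketch below is illustrative only: the function name `formation_time` is hypothetical, the last sample stands in for the long-time limit, and the threshold fraction defaults to the factor appearing in the relation above but is left as a parameter.

```python
import math


def formation_time(times, p, fraction=math.exp(-1)):
    """First time at which p(t) - p(0) reaches fraction * (p_final - p(0)).

    `p` is the contact frequency sampled at `times`. The last sample is
    used as the long-time value (an assumption of this sketch). The
    crossing is located by linear interpolation between samples.
    Returns None if no crossing is found.
    """
    p0, p_final = p[0], p[-1]
    target = p0 + fraction * (p_final - p0)
    for k in range(1, len(p)):
        lo, hi = p[k - 1], p[k]
        if (lo - target) * (hi - target) <= 0 and lo != hi:
            w = (target - lo) / (hi - lo)
            return times[k - 1] + w * (times[k] - times[k - 1])
    return None
```

Applying the function to every native contact gives the set of formation times whose histogram is discussed in Results.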
Similarity score function
We introduce the similarity score function, $S = (a/23) \times (15 - b)/15$, where a is the number of native contacts belonging to the set of contacts C1, and b is the number of native contacts belonging to set C2 (Fig. 5 e). C1 has 23 contacts and C2 has 15 contacts. If the protein is unfolded, then a ≈ b ≈ 0, thus S ≈ 0. Similarly, if the protein is folded, then a ≈ 23 and b ≈ 15, thus S ≈ 0 again. Finally, if the protein adopts the intermediate I1 structure, then a ≈ 23 and b ≈ 0, thus S ≈ 1.
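The score is simple enough to check by hand for the three limiting cases, and the short sketch below does so. The contact sets used here are placeholders with the right sizes (23 and 15), not the actual sets of Fig. 5 e, and the function names are hypothetical.

```python
def similarity_score(contacts, c1, c2):
    """S = (a / |C1|) * (|C2| - b) / |C2| for one conformation.

    a and b count the contacts of the conformation that belong to the
    sets C1 and C2, respectively.
    """
    a = len(contacts & c1)
    b = len(contacts & c2)
    return (a / len(c1)) * (len(c2) - b) / len(c2)


def highest_score(trajectory, c1, c2):
    """Largest similarity score found along one folding trajectory."""
    return max(similarity_score(frame, c1, c2) for frame in trajectory)


# Placeholder contact sets with the sizes quoted in the text.
C1 = {(i, i + 40) for i in range(23)}
C2 = {(i, i + 10) for i in range(15)}

print(similarity_score(set(), C1, C2))    # unfolded: 0.0
print(similarity_score(C1 | C2, C1, C2))  # folded: 0.0
print(similarity_score(C1, C1, C2))       # intermediate I1: 1.0
```

Keeping only the highest score per trajectory, as `highest_score` does, gives one number per folding transition that can be collected into a histogram.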
| RESULTS |
|---|
[…] α-spectrin SH3 domain protein (Kortemme et al., 2000) […] α-spectrin SH3 domain to the sequence of c-Crk SH3 domain that we employ in our studies (sequence identity 34%, RMSD = 2.4 Å). We find that the set of native contacts with a high probability to form (p > 0.75) can predict 37 of the 47 native contacts (79% sensitivity) with a distinctive NMR signal found by Kortemme et al. (2000) in α-spectrin. Alternatively, we find that of the 57 predicted native contacts, 37 are correct (65% specificity). We cannot predict the set of non-native contacts with a distinctive NMR signal, since our model does not allow us to calculate frequencies for non-native contacts.
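The two percentages follow directly from the counts quoted above, as the short check below shows. The contact sets are placeholders built only to reproduce the counts (47 observed, 57 predicted, 37 in common); the function name is hypothetical, and the second ratio is the quantity the text calls specificity.

```python
def overlap_statistics(predicted, observed):
    """Return (hits/observed, hits/predicted) for two sets of contacts."""
    hits = len(predicted & observed)
    return hits / len(observed), hits / len(predicted)


# Placeholder sets: 47 observed contacts, 57 predicted, 37 shared.
observed = set(range(47))
predicted = set(range(10, 67))

sensitivity, specificity = overlap_statistics(predicted, observed)
print(round(100 * sensitivity), round(100 * specificity))  # 79 65
```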
Relaxation of the initial unfolded state
Our initially unfolded state ensemble consists of 1100 protein conformations that we sample from a long equilibrium simulation at a very high temperature, T0 = 1.0, at equal time intervals of $10^4$ t.u. This time separation is long enough to ensure that the sampled conformations have low structural similarity among themselves. We calculate the frequency map of this unfolded state. At T = 1.0, only nearest and next-nearest contacts have high frequency, and the frequency decreases dramatically with the sequence separation between the amino acids.
When we quench the system from T = 1.0 to a target temperature, Ttarget (see Materials and Methods), the system relaxes in approximately 1500 t.u. Due to the finite size of our heat bath, the heat released by the protein upon folding increases the final temperature of the system by 0.03 energy units above Ttarget. After relaxation, the protein stays for a certain time in the unfolded state, then undergoes a folding transition. During this time interval, the protein explores unfolded conformations at equilibrium, and we calculate the frequency map of the unfolded state for different target temperatures.
At Ttarget = 0.64, slightly above TF, the secondary structure is unstable (Fig. 2 a), with average frequency […] (see Materials and Methods). Successful folding requires the cooperative formation of contacts throughout the protein in a nucleation process (Borreguero et al., 2002; Ding et al., 2002). At Ttarget = 0.54, the secondary structure is more stable, although it still retains a degree of flexibility (Fig. 2 b, […]). The conformational search for the native state is then optimized, because it is limited to the formation of a sufficient number of long-range contacts. At Ttarget = 0.33, the lowest temperature studied, secondary structure elements form during the rapid collapse of the model protein in the first 1500 t.u. (Fig. 2 c, […]). During collapse, some tertiary contacts (contacts between secondary structure elements) may also form. The formation of these contacts before the proper arrangement of secondary structure elements may lead the protein model to a kinetic trap. Finally, folding proceeds at this temperature through a thermally activated search for the native state.
Except for Barnase, we observe a variety of intermediate energy and RMSD values in the unfolding process of the selected proteins that do not correspond to the typical values of the native and unfolded states (Fig. 3). These values correspond to intermediate states in the unfolding process of the selected nine proteins. We observe one intermediate in rapid interconversion with the native state for Gelsolin WT. At the opposite extreme, RNase displays a long-lived intermediate before complete unfolding, whereas CheY shows a mixed behavior in which the survival time of the intermediate increases with temperature. In addition, CheY exhibits a second intermediate before complete unfolding. Proteins U1A, SNAse, and P16 show intermediates only before complete unfolding. These are on-pathway kinetic intermediates. The unfolding processes of the homologous proteins Im7 and Im9 are remarkably dissimilar. Whereas Im7 displays an intermediate, Im9 is a two-state protein. A similar scenario was observed in the folding process of these two proteins (Ferguson et al., 1999). Finally, we do not detect any intermediate in the unfolding process of c-Crk SH3, but only folded and unfolded states. Thus our protein model captures the essential properties that distinguish two-state from three-state proteins.
For each target temperature, we compute the average folding time $\langle t_F \rangle$ (Fig. 4 f) and the standard deviation $\sigma_F$ of the folding times. The ratio $r(T) \equiv \langle t_F \rangle / \sigma_F$ measures the average folding time in units of the standard deviation $\sigma_F$. This quantity characterizes the deviation of the distribution of folding times from the single-exponential distribution, for which r = 1. We expect r ≈ 1 for Ttarget > TF, because at these high temperatures the folding transitions become rare events and their times follow a single-exponential distribution. As we decrease Ttarget, we expect r > 1 just below TF, because the folded state becomes more stable than the unfolded state, and the folding transitions are favored. Distributions with r > 1 are narrow and centered on $\langle t_F \rangle$, so that most of the simulations undergo a folding transition at times of the order of the average folding time. However, if we continue decreasing Ttarget, we expect some folding transitions to be kinetically trapped, and the folding time distribution will spread over several orders of magnitude. Such distributions have r < 1. Thus, there is a temperature below TF where the maximum of r(T) occurs, and which signals the onset of slow folding pathways. We use the maximum of r(T) to define TKP.
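The behavior of the ratio in the three regimes can be illustrated with a few lines of code. The sketch below uses only the standard library; the function names are hypothetical and the folding times in the example are toy numbers, not simulation results.

```python
import statistics


def folding_time_ratio(folding_times):
    """r = average folding time divided by its standard deviation."""
    return statistics.fmean(folding_times) / statistics.pstdev(folding_times)


def kinetic_partition_temperature(times_by_temperature):
    """Temperature at which r(T) is largest.

    `times_by_temperature` maps each target temperature to the list of
    folding times measured at that temperature.
    """
    return max(
        times_by_temperature,
        key=lambda T: folding_time_ratio(times_by_temperature[T]),
    )


# Toy inputs: a narrow distribution gives r > 1, while a distribution
# with a few very long (trapped) runs gives r < 1.
narrow = [9.0, 10.0, 11.0, 10.0]
trapped = [1.0, 1.0, 1.0, 1000.0]
print(folding_time_ratio(narrow) > 1.0, folding_time_ratio(trapped) < 1.0)
```

For exponentially distributed times the mean equals the standard deviation, so a large sample drawn with `random.expovariate` gives a ratio close to 1.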
[…] 20°C; see also Fig. 4 d). We find that r approaches 1 as we increase the temperature above TKP, and the distribution of folding times approximates a single-exponential distribution. In particular, the distribution of folding times fits the single-exponential distribution […] for temperatures near and above TF. The ratio r(T) decreases monotonically below TKP, indicating that the distribution of folding times spreads over several orders of magnitude. This is the consequence of an increasing fraction of folding simulations becoming kinetically trapped (Fig. 4, a and b). The average folding time $\langle t_F \rangle$ is minimal not at TKP, but at a lower temperature […] (Fig. 4 f). At this temperature, we find that the protein becomes temporarily trapped in approximately 7% of the folding transitions. On the other hand, the remaining simulations undergo a folding transition much faster, thus minimizing $\langle t_F \rangle$. Interestingly, […] even though the distribution of folding times at this temperature is non-exponential.
Folding pathways
Below TKP, an increasing fraction of the simulations undergo folding transitions that take up to three orders of magnitude longer than the minimal $\langle t_F \rangle$. In addition, $\langle t_F \rangle$ increases dramatically (Fig. 4 f). At the lowest temperatures studied, we distinguish between the majority of simulations that undergo a fast folding transition (the fast pathway) and the rest of the simulations that undergo folding transitions with folding times spanning three orders of magnitude (the slow pathways). At the low temperature T = 0.33, the potential energy of the fast pathway has on average a time evolution similar to that of all the simulations at TKP = 0.54, indicating that there are no kinetic traps in the fast pathway.
For each folding simulation that belongs to the slow pathways, we sample the potential energy at equal time intervals of 100 t.u. until folding is finished (see Materials and Methods). Then, we collect all potential energy values and construct a distribution of potential energies. We find that below T = 0.43, the distribution is markedly bimodal (Fig. 5 a). The positions of the two peaks along the energy coordinate do not correspond to the equilibrium potential energy value of the folded state (Fig. 5 b). Therefore we hypothesize the existence of two intermediates in the slow pathways. We denote the two putative intermediates as I1 and I2 for the high energy and low energy peaks, respectively. As temperature decreases, the peaks shift to lower energies, but the energy difference between the two peaks, approximately six energy units, remains constant (Fig. 5 b). A constant energy difference implies that the two putative intermediates differ by a specific set of native contacts. As temperature decreases, other contacts not belonging to this set become more stable and are responsible for the overall energy decrease. At T = 0.33, we record the distribution of survival times for both intermediates and find that they fit a single-exponential distribution, supporting the hypothesis that each intermediate is a local free energy minimum bounded by one major free energy barrier (Fig. 5 c).
To further test the single free energy barrier hypothesis, we select a typical conformation representing intermediate I2 and perform 200 folding simulations, each with a different set of initial velocities, at a set of temperatures in the range 0.33 ≤ T ≤ 0.52. For each simulation, we record the time that the protein survives in the intermediate state, and find that the average survival time fits the Arrhenius law for temperatures below T = 0.44 (Fig. 5 d). This upper bound temperature roughly coincides with the temperature T = 0.43 below which I2 becomes noticeable in the histogram of potential energies (Fig. 5 a). This result indicates that the free energy barrier to overcome intermediate I2 becomes independent of temperature for low temperatures, or analogously, that the same set of native contacts must form (or break) to overcome the intermediate.
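An Arrhenius dependence means that the logarithm of the average survival time is linear in 1/T, with the barrier as the slope. The sketch below fits that line by ordinary least squares. It assumes reduced units in which the Boltzmann constant equals 1, the function name is hypothetical, and the data in the example are synthetic, generated from a known barrier to check the fit rather than taken from the simulations.

```python
import math


def arrhenius_fit(temperatures, survival_times):
    """Least-squares line through the points (1/T, ln tau).

    Returns (barrier, ln_tau0) for the model tau = tau0 * exp(barrier / T),
    with temperature in energy units (Boltzmann constant set to 1).
    """
    x = [1.0 / t for t in temperatures]
    y = [math.log(s) for s in survival_times]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


# Synthetic check: data generated with barrier 6.0 and tau0 = 2.0.
temps = [0.33, 0.36, 0.40, 0.44]
taus = [2.0 * math.exp(6.0 / t) for t in temps]
barrier, ln_tau0 = arrhenius_fit(temps, taus)
print(round(barrier, 6), round(math.exp(ln_tau0), 6))
```

A departure of the measured points from the fitted line at the higher temperatures would mark the upper bound of the Arrhenius regime.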
Structure of the intermediates
We randomly select three conformations for each intermediate, and find that the conformations within each intermediate are structurally similar. Conformations belonging to intermediate I1 have a set of long-range contacts (C1) and a set of medium-range contacts (C3) with high occupancy (Fig. 5 e). Contacts in C1 represent a ß-sheet made up of three strands: the two termini and the strand belonging to the diverging turn, which we name strand A (residues 24–29, Fig. 5 e; and I1 in Fig. 6). For a folding transition through the slow pathway, this ß-sheet stabilizes in the early events of the folding process, and strand A can no longer move freely. Contacts in C3 are the contacts within the distal hairpin and with the n-Src loop. Contacts in C3 constrain the flexibility of the strand shared by the distal hairpin and the n-Src loop, which we name strand B (residues 36–41, Fig. 5 e; and I2 in Fig. 6). The restricted flexibility of strands A and B prevents the mutual close packing found in the native state. Intermediate I1 features a set of contacts (C2) with no occupancy at all (Fig. 5 e), a result of the restricted flexibility of strands A and B.
Once we identify the structure of the intermediates, we investigate whether intermediate I1 is present at higher temperatures, where no distinction can be made between fast and slow folding pathways. To this end, we sample the protein conformation during the folding transition at equal time intervals of 60 t.u. for each of the 1100 simulations, and compare these conformations to intermediate I1 with a similarity score function (see Materials and Methods). For each folding transition, we record only the highest value of the similarity score, thus obtaining 1100 highest score values. At TKP, the histogram of the highest scores is bimodal, with 25% of the folding simulations passing through intermediate I1 (Fig. 5 f). Surprisingly, we find that at TKP, simulations that undergo the folding transition through I1 show folding kinetics no different from those of the rest of the simulations.
Cooperativity of the folding process
We investigate the cooperativity at TKP with the time evolution of the frequency map, which we obtain with an average over the 1100 folding simulations at each moment of time. We find that different native contacts have different initial and final frequencies, as well as different time evolution. The majority of the contacts have low initial frequencies. As folding progresses, the frequency increases in an exponential-like manner until it finally reaches a value close to 1, when folding is finished. Other contacts, however, do not follow this general trend but present unusual kinetics of formation (Fig. 7 a). In particular, some contacts have low final frequencies. We observe that these contacts are located on the surface of the protein, and can be assigned to three different categories: 1), isolated long-range contacts; 2), contacts in the base of hairpins and loops; and 3), short-range contacts whose native distance is close to the cutoff distance. Isolated long-range contacts can be easily broken by thermal fluctuations, and are difficult to form because of their long-range nature. Contacts in the base of hairpins and loops are the first to break in the transient unzipping of these structures. Finally, short-range contacts whose native distance is close to the cutoff distance can be easily broken because they undergo frequent collisions with the potential energy barrier that binds them. The more collisions a contact undergoes, the higher the probability that this contact breaks.
We also compute the time evolution of the contact frequencies at T = 0.33 for simulations that undergo a folding transition through the fast pathway. The histogram of formation times is bimodal, as for TKP (Fig. 7 b). However, at T = 0.33 the peak corresponding to short formation times is more populated than the peak corresponding to long formation times. In fact, the long-times peak corresponds only to the tertiary long-range contacts, and has a tail for some contacts that take much longer to form. Thus, at low temperatures the secondary structure elements stabilize rapidly, independent of each other, and the folding process finishes when the formed secondary structure elements interact and form the tertiary long-range contacts. This folding mechanism only requires a cooperativity of short-range type, and is prone to be kinetically trapped.
| DISCUSSION |
|---|
[…] Gō model of interactions to distinguish between two-state and three-state kinetics. In addition, our equilibrium studies of the unfolded state at TF show that our modified Gō model has a high sensitivity to detect the important amino acid contacts.
From our relaxation studies of the initial unfolded state, we observe that the structure of the unfolded state is highly sensitive to the target temperature, Ttarget. The role of the unfolded state in determining the folding kinetics has already been pointed out in recent experimental and theoretical studies (García et al., 2001; Plaxco and Gross, 2001; Shortle and Ackerman, 2001). We observe nucleation, folding with minimal kinetic barriers, and thermally activated mechanisms for the different observed unfolded states.
We observe that the typical formation times of secondary and tertiary contacts tend to separate from each other as we decrease Ttarget below TF, suggesting a weakening of cooperativity between both types of contacts. In our model, a decrease in Ttarget is analogous to an increase in the stability of the native state. Thus the degree of cooperativity among the amino acids weakens under increasingly native conditions. A similar loss of cooperativity was found by Freire and co-workers in extensive studies of the equilibrium fluctuations of the native state of several proteins (Luque et al., 2002). They found maximal cooperativity among the amino acids when native and denatured states had equal probability. These conditions correspond to T = TF in our study. Close to TF, secondary structure elements stabilize only when tertiary contacts form soon after the secondary structure forms, because thermal fluctuations can disrupt isolated secondary structure elements. These fluctuations rapidly decrease in magnitude as temperature decreases, and secondary structure becomes stable under such conditions.
In previous studies, various methods have been developed to determine the temperature that signals the onset of multiple folding pathways. Wolynes and Onuchic's groups (Socci et al., 1996) determined a glass transition temperature, Tg, at which the average folding time is halfway between tmin and tmax, where tmin is the minimum average folding time and tmax is the total simulation time. This method is sensitive to the a priori selected tmax, and the authors found a 10% error in the calculation of Tg by changes of tmax. Also, Shakhnovich's group (Gutin et al., 1998) estimated a critical temperature, Tc, at which the temperature dependence of the equilibrium potential energy leveled off. From their results, one can evaluate a 20% error in their calculation of Tc. Both Tg and Tc are temperatures that authors use to characterize the onset of multiple folding pathways. In our study we use TKP, which signals the breaking of time translational invariance of equilibrium measurements for temperatures below this value (Dokholyan et al., 2002). We estimate a 2% error in our calculation of TKP from uncertainties in the location of TKP in Fig. 4 g.
At TKP, secondary structure elements are partially stable, which limits considerably the conformational search for the native state. Furthermore, TKP is a relatively high temperature that prevents the stabilization of improper arrangements of the protein conformation, thus minimizing the occurrence of kinetic traps. Below TKP, the model protein exhibits two intermediates with well-defined structural characteristics. The modest number of misfolded states is a direct consequence of the prevention of non-native contacts. This prevention reduces dramatically the number of protein conformations. Furthermore, since a low energy value implies that most of the native interactions have formed, there are few conformations having both low energy and structural differences with the native state (Plotkin and Onuchic, 2002).
Experiments (Heidary et al., 2000; Juneja and Udgaonkar, 2002; Silverman et al., 2000; Simmons and Konermann, 2002) show that proteins exhibit only a discrete set of intermediates. Even though in real proteins amino acids that do not form a native contact may still attract each other, experimental and theoretical studies confirm that native contacts have a leading role in the folding transition. Protein engineering experiments (Fersht, 1995; Grantcharova et al., 1998; Northey et al., 2002) show that transition states in two-state globular proteins are mostly stabilized by native interactions. To quantitatively determine the importance of native interactions in the folding transition, Paci et al. (2002) studied the transition states of three two-state proteins with a full-atom model. They found that, on average, native interactions accounted for approximately 83% of the total energy of the transition states. Of relevance to our studies of the SH3 domain are the full-atom study (Shea et al., 2002) and the protein engineering experiments (Grantcharova et al., 1998; Riddle et al., 1999) showing that the transition state of the src-SH3 domain protein is largely determined by the native state. On the other hand, evidence exists that in some proteins, non-native contacts are responsible for the presence of intermediates. In a study of the homologous Im7 and Im9 proteins (Capaldi et al., 2002), the authors identified a set of non-native interactions responsible for an intermediate state in the folding transition of the Im7 protein. In another study (Mirny et al., 1996), the authors performed Monte Carlo simulations of two different sequences with the same native state in the 3 x 3 lattice. One sequence presented a series of pathways with misfolded states due to non-native interactions.
At low temperatures, simulations that undergo folding through intermediate I1 reveal that contacts between the two termini form earlier than the contacts belonging to the folding nucleus (Borreguero et al., 2002). This result coincides with an off-lattice study of a 36-monomer protein (Abkevich et al., 1994). In this study, the authors found an intermediate in the folding transition of their model protein. Inspection of the intermediate revealed no nucleus contacts, but a different set of long-range contacts had already formed. In addition, Serrano's group (Viguera and Serrano, 2003) engineered a variant of the α-spectrin SH3 domain, by which they increased the stability of the distal hairpin with new, stable long-range contacts. The authors observed an intermediate in the folding process when these newly introduced long-range contacts formed in the denatured state, preceding the formation of the transition state. Thus, environmental conditions that favor stabilization of long-range contacts other than the nucleus contacts may induce intermediates in the folding transition.
Alternatively, short-range contacts in key positions of the protein structure may also be responsible for slow folding pathways. In a study of the formin binding protein WW domain with the Gō model (Karanicolas and Brooks, 2003), the authors found a slow folding pathway in the model protein, and a cluster of four short-range native contacts that are responsible for this pathway. However, the authors observed that it was the absence, not the presence, of these native contacts in the unfolded state that generated biphasic folding kinetics. Thus, environmental conditions that favor destabilization of short-range contacts may promote the formation of intermediate states in the folding transition.
We also investigate the survival time of intermediate I2, and find that the free energy barrier separating I2 from the native state is independent of temperature. Thus, the average survival time follows Arrhenius kinetics. The value of the free energy barrier is approximately 5.85 energy units, indicating that approximately six native contacts break when the protein conformation reaches the transition state that separates I2 from the native state. At the low temperatures studied, thermal fluctuations are still large enough that the observed survival times of I2 should be much smaller if any six native contacts could break. Thus we hypothesize that it is always the same set of native contacts that must break in the transition I2 → native state. Our observations of the transition I1 → I2 support this hypothesis. In this transition, we find that the set of contacts C4 always breaks.
At TKP, we do not detect any intermediate from kinetics measurements of the average folding time, or analogously, from the folding rate. However, with the similarity score function we detect intermediate I1 in 25% of the folding transitions. The fact that the folding transitions populating intermediate I1 at TKP are kinetically no different than the rest suggests that thermal fluctuations are strong enough to prevent stabilization of this state. Interactions stabilizing intermediate I1 involve but a few amino acids and therefore should not prevail over the thermal fluctuations.
In a study of protein Im9 (Gorski et al., 2001), the authors reported the existence of an intermediate in the folding transition under acidic conditions (pH = 5.5). This finding led the authors to formulate the hypothesis that Im9 has an intermediate at normal conditions (pH = 7.0), but that it is too unstable to be detected with current kinetic experimental techniques. Interestingly, the homologous protein Im7 (60% sequence identity) undergoes the folding transition through an intermediate in all tested experimental conditions (Capaldi et al., 2002), supporting the authors' hypothesis. Similarly, in a recent report (Kamagata et al., 2003), the authors observed two parallel pathways in the folding process of the proline-free staphylococcal nuclease with no accumulation of intermediates below the deadtime (4 ms) of the detection apparatus. It would be interesting to study this protein under stronger stabilizing conditions that may also stabilize any putative intermediate. Changing both the environmental conditions and the amino acid sequence is therefore a general strategy to uncover hidden intermediates in the folding transition of a two-state protein. An alternative approach is an extensive study of the folding trajectories at TKP that may reveal the hidden intermediates. This is particularly useful for computer simulations, because simulations at low temperatures, when intermediates are easily identifiable, may take several orders of magnitude longer than simulations at TKP.
| CONCLUSION |
|---|
|
|
|---|
| ACKNOWLEDGEMENTS |
|---|
|
|
|---|
Submitted on January 2, 2004; accepted for publication March 10, 2004.
| REFERENCES |
|---|
|
|
|---|
Alder, B. J., and T. E. Wainwright. 1959. Studies in molecular dynamics. I. General method. J. Chem. Phys. 31:459–466.
Bachmann, A., and T. Kiefhaber. 2001. Apparent two-state tendamistat folding is a sequential process along a defined route. J. Mol. Biol. 306:375–386.
Berman, H. M., J. Westbrook, Z. Feng, G. Gilliland, T. N. Bhat, H. Weissig, I. N. Shindyalov, and P. E. Bourne. 2000. The Protein Data Bank. Nucleic Acids Res. 28:235–242.
Borreguero, J. M., N. V. Dokholyan, S. V. Buldyrev, E. I. Shakhnovich, and H. E. Stanley. 2002. Thermodynamics and folding kinetics analysis of the SH3 domain from discrete molecular dynamics. J. Mol. Biol. 318:863–876.
Branden, C., and J. Tooze. 1999. Introduction to Protein Structure. Garland Publishing, New York.
Bryngelson, J. D., and P. G. Wolynes. 1989. Intermediates and barrier crossing in a random energy model (with applications to protein folding). J. Phys. Chem. 93:6902–6915.
Capaldi, A. P., C. Kleanthous, and S. E. Radford. 2002. Im7 folding mechanism: misfolding on a path to the native state. Nat. Struct. Biol. 9:209–216.
Cheung, M. S., A. E. García, and J. N. Onuchic. 2002. Protein folding mediated by solvation: water expulsion and formation of the hydrophobic core occur after the structural collapse. Proc. Natl. Acad. Sci. USA. 99:685–690.
Choe, S. E., P. T. Matsudaira, J. Osterhout, G. Wagner, and E. I. Shakhnovich. 1998. Folding kinetics of villin 14T, a protein domain with a central β-sheet and two hydrophobic cores. Biochemistry. 37:14508–14518.
Clementi, C., A. E. García, and J. N. Onuchic. 2003. Interplay among tertiary contacts, secondary structure formation and side-chain packing in the protein folding mechanism: all-atom representation study of protein L. J. Mol. Biol. 326:933–954.
Ding, F., N. V. Dokholyan, S. V. Buldyrev, H. E. Stanley, and E. I. Shakhnovich. 2002. Direct molecular dynamics observation of protein folding transition state ensemble. Biophys. J. 83:3525–3532.
Dokholyan, N. V., S. V. Buldyrev, H. E. Stanley, and E. I. Shakhnovich. 1998. Molecular dynamics studies of folding of a protein-like model. Fold. Des. 3:577–587.
Dokholyan, N. V., E. Pitard, S. V. Buldyrev, and H. E. Stanley. 2002. Glassy behavior of a homopolymer from molecular dynamics simulations. Phys. Rev. E. 65:030801:1–030801:4.
Eaton, W. A., V. Muñoz, J. S. J. Hagen, G. S. Jas, L. J. Lapidus, E. R. Henry, and J. Hofrichter. 2000. Fast kinetics and mechanisms in protein folding. Annu. Rev. Biophys. Biomol. Struct. 29:327–359.
England, J. L., B. E. Shakhnovich, and E. I. Shakhnovich. 2003. Natural selection of more designable folds: a mechanism for thermophilic adaptation. Proc. Natl. Acad. Sci. USA. 100:8727–8731.
Ferguson, N., A. P. Capaldi, R. James, C. Kleanthous, and S. E. Radford. 1999. Rapid folding with and without populated intermediates in the homologous four-helix proteins Im7 and Im9. J. Mol. Biol. 286:1597–1608.
Fersht, A. R. 1995. Characterizing transition states in protein-folding: an essential step in the puzzle. Curr. Opin. Struct. Biol. 5:79–84.
Fersht, A. R. 2000. A kinetically significant intermediate in the folding of barnase. Proc. Natl. Acad. Sci. USA. 97:14121–14126.
Fersht, A. R. 2000b. Transition-state structure as a unifying basis in protein-folding mechanisms: contact order, chain topology, stability, and the extended nucleus mechanism. Proc. Natl. Acad. Sci. USA. 97:1525–1529.
Fersht, A. R., and V. Daggett. 2002. Protein folding and unfolding at atomic resolution. Cell. 108:573–582.
Filimonov, V. V., A. I. Azuaga, A. R. Viguera, L. Serrano, and P. L. Mateo. 1999. A thermodynamic analysis of a family of small globular proteins: SH3 domains. Biophys. Chem. 77:195–208.
Gō, N., and H. Abe. 1981. Non-interacting local-structure model of folding and unfolding transition in globular proteins. I. Formulation. Biopolymers. 20:991–1011.
García, P., L. Serrano, D. Durand, M. Rico, and M. Bruix. 2001. NMR and SAXS characterization of the denatured state of the chemotactic protein CheY: implications for protein folding initiation. Prot. Sci. 10:1100–1112.
Gnanakaran, S., and A. E. García. 2003. Folding of a highly conserved diverging turn motif from the SH3 domain. Biophys. J. 84:1548–1562.
Gorski, S. A., A. P. Capaldi, C. Kleanthous, and S. E. Radford. 2001. Acidic conditions stabilise intermediates populated during the folding of Im7 and Im9. J. Mol. Biol. 312:849–863.
Grantcharova, V. P., and D. Baker. 1997. Folding dynamics of the Src SH3 domain. Biochemistry. 36:15685–15692.
Grantcharova, V. P., D. S. Riddle, J. N. Santiago, and D. Baker. 1998. Important role of hydrogen bonds in the structurally polarized transition state for folding of the src SH3 domain. Nat. Struct. Biol. 8:714–720.
Guerois, R., and L. Serrano. 2000. The SH3-fold family: experimental evidence and prediction of variations in the folding pathways. J. Mol. Biol. 304:967–982.
Gutin, A., A. Sali, V. Abkevich, M. Karplus, and E. I. Shakhnovich. 1998. Temperature dependence of the folding rate in a simple protein model: search for a "glass" transition. J. Chem. Phys. 108:6466–6483.
Heidary, D. K., J. C. O'Neill, M. Roy, and P. A. Jennings. 2000. An essential intermediate in the folding of dihydrofolate reductase. Proc. Natl. Acad. Sci. USA. 97:5866–5870.
Holm, L., and C. Sander. 1996. Mapping the protein universe. Science. 273:595–602.
Ikai, A., and C. Tandford. 1973. Kinetics of unfolding and refolding of proteins. I. Mathematical analysis. J. Mol. Biol. 73:145–163.
Isaacson, R. L., A. G. Weeds, and A. R. Fersht. 1999. Equilibria and kinetics of folding of gelsolin domain 2 and mutants involved in familial amyloidosis-Finnish type. Proc. Natl. Acad. Sci. USA. 96:11247–11252.
Jackson, S. E. 1998. How do small single-domain proteins fold? Fold. Des. 3:R81–R91.
Juneja, J., and J. B. Udgaonkar. 2002. Characterization of the unfolding of ribonuclease A by a pulsed hydrogen exchange study: evidence for competing pathways for unfolding. Biochemistry. 41:2641–2654.
Kamagata, K., Y. Sawano, M. Tanokura, and K. Kuwajima. 2003. Multiple parallel-pathway folding of proline-free staphylococcal nuclease. J. Mol. Biol. 332:1143–1153.
Karanicolas, J., and C. L. Brooks, III. 2003. The structural basis for biphasic kinetics in the folding of the WW domain from a formin-binding protein: lessons for protein design? Proc. Natl. Acad. Sci. USA. 100:3954–3959.
Karplus, M., and J. A. McCammon. 2002. Molecular dynamics simulations of biomolecules. Nat. Struct. Biol. 9:646–652.
Khorasanizadeh, S., I. D. Peters, and H. Roder. 1996. Evidence for a three-state model of protein folding from kinetics analysis of ubiquitin variants with altered core residues. Nature Struct. Biol. 3:193–205.
Kiefhaber, T. 1995. Kinetics traps in lysozyme folding. Proc. Natl. Acad. Sci. USA. 92:9029–9033.
Kitahara, R., and K. Akasaka. 2003. Close identity of a pressure-stabilized intermediate with a kinetic intermediate in protein folding. Proc. Natl. Acad. Sci. USA. 100:3167–3172.
Klimov, D. K., and D. Thirumalai. 2002. Stiffness of the distal loop restricts the structural heterogeneity of the transition state ensemble in SH3 domains. J. Mol. Biol. 317:721–737.
Knapp, S., P. T. Mattson, P. Christova, K. D. Berndt, A. Karshikoff, M. Vihinen, C. I. Smith, and R. Ladenstein. 1998. Thermal unfolding of small proteins with SH3 domain folding pattern. PSFG. 23:309–319.
Kortemme, T., M. J. S. Kelly, L. E. Kay, J. Forman-Kay, and L. Serrano. 2000. Similarities between the spectrin SH3 domain denatured state and its folding transition state. J. Mol. Biol. 297:1217–1229.
Lopez-Hernandez, E., and L. Serrano. 1996. Structure of the transition state for folding of the 129 aa protein CheY resembles that of a smaller protein, CI-2. Fold. Des. 1:43–55.
Luque, I., S. A. Leavitt, and E. Freire. 2002. The linkage between protein folding and functional cooperativity: two sides of the same coin? Annu. Rev. Biophys. Biomol. Struct. 31:235–256.
Martinez, J. C., A. R. Viguera, R. Berisio, M. Wilmanns, P. L. Mateo, V. V. Filimonov, and L. Serrano. 1999. Thermodynamic analysis of α-spectrin SH3 and two of its circular permutants with different loop lengths: discerning the reasons for rapid folding in proteins. B