| HOME | HELP | FEEDBACK | SUBSCRIPTIONS | ARCHIVE | SEARCH | TABLE OF CONTENTS |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Complex Systems Division, Department of Theoretical Physics, Lund University, Lund, Sweden
Correspondence: Address reprint requests to A. Irbäck, Tel.: 46-46-222-3493; Fax: 46-46-222-9686; E-mail: anders{at}thep.lu.se.
| ABSTRACT |
|---|
|
|
|---|
20 residues each. The test set contains both
-helical (Trp cage, Fs) and ß-sheet (GB1p, GB1m2, GB1m3, Betanova, LLM) peptides. The model, which is entirely sequence-based, is able to fold these different peptides for one and the same choice of model parameters. Furthermore, the melting behavior of the peptides is in good quantitative agreement with experimental data. Apparent folded populations obtained using different observables are compared, and are found to be very different for some of the peptides (e.g., Betanova). In other cases (in particular, GB1m2 and GB1m3), the different estimates agree reasonably well, indicating a more two-state-like melting behavior. | INTRODUCTION |
|---|
|
|
|---|
There are, however, questions about the interaction potentials used in the simulations that need further investigation. One difficulty is that different potentials give very different relative weights to the
-helix and ß-strand regions of the Ramachandran space (Zaman et al., 2003
). A potential that successfully folds an
-helical peptide might therefore have problems with ß-sheet peptides, and vice versa. Another difficulty is with the temperature-dependence of observable quantities. As pointed out by Zhou et al. (2001)
, it seems that most current models need further calibration to give a temperature-dependence that is not too weak; as a result, calculated melting temperatures are often unrealistically high. A systematic study of these thermodynamic questions requires extensive conformational sampling and is a challenge, especially in models with explicit water.
Here we study a model that contains all atoms of the polypeptide chains but no explicit solvent molecules. Formally, such a model is obtained by integrating out the solvent degrees of freedom. Finding an accurate and computationally tractable approximation of the resulting effective potential is, however, a highly nontrivial problem. Examples of implicit solvent models that have been used in folding studies with some success include the generalized Born approach (Still et al., 1990
), the method based on screened Coulomb potentials by Hassan et al. (2003)
, and the method based on solvent-accessible surface areas by Ferrara et al. (2002)
. In this article, we study a minimalistic model in which the effects of the solvent are represented by an effective attraction between nonpolar side chains. Our study focuses on the thermodynamic behavior of this model, which we investigate using efficient Monte Carlo methods rather than molecular dynamics. This choice is made for computational convenience; with some minor modifications, it would be possible to study the same model using molecular dynamics. Promising computational techniques have recently been proposed by Hansmann and Wille (2002)
and Schug et al. (2003)
, but these methods are for energy-minimization, which is insufficient for our purposes.
In addition to effective hydrophobic attraction, the interaction potential of our model contains two major terms, representing excluded-volume effects and hydrogen bonding. The potential is deliberately kept simple, partly for the sake of clarity but also for practical reasons; any potential requires careful calibration, and this task is easier with a simple potential like ours with fewer parameters to tune. In the future, the potential may be further developed with the inclusion of new terms such as Coulomb interactions between side-chain charges, but not before it becomes clear that they are needed. The different terms of the potential represent either the interaction between two individual atoms (excluded volume), or two pairs of atoms (e.g., hydrogen bonds), or an effective interaction between a pair of side chains (hydrophobicity). The largest units playing a role in the potential are the amino acids, and no information about the sequence as a whole or its native structure is used in the potential.
Our approach toward the problem of determining the interaction potential is phenomenological. The shape of individual terms is inspired by intuitive notions rather than being rigorously derived from a microscopic picture. Their exact functional forms and relative sizes are constrained by the effectiveness of the model in describing the folding behavior of more and more sequences. When such a potential evolves to a point where it can successfully fold a significant number of peptides of different native geometries, and capture the thermodynamic behavior of all those peptides, it would be useful on its own as a working potential for thermodynamic studies of new sequences, and also provide hints about the relative importance of different physical effects in protein folding.
We have previously shown that earlier versions of this model are able to fold both
-helix and ß-sheet peptides (Irbäck et al., 2003
; Irbäck and Sjunnesson, 2004
). In this article we present a further development of this model. We test the new model on the following set of peptides (see Fig. 1): the
-helical Trp cage (Neidigh et al., 2002
) and Fs (Lockhart and Kim, 1992
, 1993
), and the ß-sheet peptides GB1p (Kobayashi et al., 1993
; Blanco et al., 1994
), GB1m2 and GB1m3 (Fesinmeyer et al., 2004
), Betanova (Kortemme et al., 1998
), and LLM (López de la Paz et al., 2001
). Here GB1p denotes the C-terminal ß-hairpin from the protein G B1 domain, whereas Betanova is a designed three-stranded ß-sheet peptide. GB1m2 and GB1m3 are mutants of GB1p, whereas LLM is a mutant of Betanova, with enhanced stabilities. We find that our model provides a good description of the thermodynamic behavior of all these peptides. The same model was furthermore used in a recent study of the oligomerization properties of the amyloid Aß1622 peptide (Favrin et al., 2004
), with very promising results.
|
| MODEL AND METHODS |
|---|
|
|
|---|
,
and a number of side-chain torsion angles as its degrees of freedom. Numerical values of the geometrical parameters held constant can be found elsewhere (Irbäck et al., 2003
In the simulations we internally use a dimensionless energy scale. The correspondence (constant factor) of this scale to the physical energy scale is determined by using the model prediction of the dimensionless energy value for an observable and the experimental value for the same. We use the melting temperature Tm = 315 K of the Trp cage (Neidigh et al., 2002
) for this purpose (see below), which is found to correspond to a dimensionless energy kTm of 0.470 in the model (k is Boltzmann's constant). Energy parameters of the model (such as the
ev,
loc,
, etc., below) are given in our internal energy scale. It must be emphasized that this energy scale is left unchanged when analyzing the other peptides.
![]() | (1) |
![]() | (2) |
ev = 0.10, and
i = 1.77, 1.75, 1.55, 1.42, and 1.00 Å for S, C, N, O, and H atoms, respectively. The values of the radii
i agree reasonably well with the statistical analysis of Tsai et al. (1999)
i values for C, N, and O strongly influence the shape of the Ramachandran
,
distribution, and must therefore be carefully chosen. The parameter
ij in Eq. 2 has the value 0.75 for all pairs except those connected by three covalent bonds, for which
ij = 1. The reason why we use a reduction factor
ij < 1 for all nonlocal pairs is both computational efficiency and the restricted flexibility of a chain with only torsional degrees of freedom, which could create artificial traps. To speed up the calculations, Eq. 2 is evaluated using a cutoff of
and pairs with fixed separation are omitted.
The second energy term, E loc, has the form
![]() | (3) |
- and
-angles for amino acid I. The partial charges are taken as qi = ±0.20 for H and N and qi = ±0.42 for C' and O (Brändén and Tooze, 1991
loc = 100, corresponding to a dielectric constant of
r
2.5.
The third term of the energy function is the hydrogen-bond energy Ehb, which has the form
![]() | (4) |
, ß) are given by
![]() | (5) |
![]() | (6) |
We consider only hydrogen bonds between NH and CO groups, and rij denotes the HO distance,
ij the NHO angle, and ßij the HOC angle. The parameters
, and
hb are taken as 3.1, 2.0, and 2.0 Å, respectively. The function u(r) is calculated using a cutoff of rc = 4.5 Å. The first sum in Eq. 4 contains backbone-backbone interactions, whereas the second sum contains interactions between charged side chains (Asp, Glu, Lys, and Arg) and the backbone. The latter type of interaction is taken to be effectively weak (
), because there are competing interactions between the side-chain charges and the surrounding water that are omitted in the model. For the same reason, we do not include any term in Ehb corresponding to side chain-side chain interactions. It is possible that the effective strength
should be made stronger in case the side-chain charge gets shielded from the water. This context dependence is ignored in the model, which should be a reasonable approximation for small peptides. Hydrogen bonds between parts that are very close in sequence are rare in protein structures and therefore disregarded in the model; specifically, we disallow backbone NH (C' O) groups to make hydrogen bonds with the two nearest backbone C' O (NH) groups on each side of them, and we also forbid hydrogen bonds between the side chain of one amino acid with the nearest donor or acceptor on either side of its C
. As a simple form of context dependence, we assign a reduced strength to hydrogen bonds involving chain ends, which tend to be exposed to water. A hydrogen bond involving one or two end groups is reduced in strength by factors of 2 and 4, respectively. If there are capping groups, these groups are taken to be the end groups; otherwise, the two end-amino acids take this role.
The fourth energy term, Ehp, represents an effective hydrophobic attraction between nonpolar side chains. It has the pairwise additive form
![]() | (7) |
![]() | (8) |
, and C
atoms. The definition of AI for the other hydrophobic side chains has been given elsewhere (Irbäck et al., 2003
|
, strongly influence the folding properties of the model, and are therefore well determined. Others, such as
, are less important and, as a result of this, quite poorly determined.
The new version of the model differs from earlier versions in the precise form of the simple context-dependence of Eloc and Ehb. Also, the reduction factor for the hydrophobic attraction between next-nearest neighbors along the chain has been changed. Furthermore, we have added Pro, which does not occur in any of our previously studied sequences, to the list of hydrophobic amino acids. All other parameters of the potential are the same as in the last version of the model, except for a slight reduction in strength of the local potential (
loc).
It should be stressed that this potential is not expected to provide a good description of general amino acid sequences. For example, it is likely that the pairwise additive hydrophobicity potential is inadequate for long chains, due to double-counting effects. For long chains, anticooperative multibody effects might play a significant role (Shimizu and Chan, 2001
). By extending the present calculations in the future to new and longer sequences, we hope that it will be possible to refine the potential and thereby make it more general.
Computational methods
To study the thermodynamic behavior of this model, we use simulated tempering (Lyubartsev et al., 1992
; Marinari and Parisi, 1992
; Irbäck and Potthast, 1995
), in which the temperature is a dynamical variable. For a review of simulated tempering and other generalized-ensemble techniques for protein folding, see Hansmann and Okamoto (1999)
. We study eight different temperatures Tk, which range from Tmin = 275 K to Tmax = 369 K and are given by
. The average acceptance rate for the temperature jumps is
70%.
Our simulations are carried out using two different elementary moves for the backbone degrees of freedom: first, the highly nonlocal pivot move in which a single backbone torsion angle is turned; and second, a semilocal method (Favrin et al., 2001
) that works with up to eight adjacent backbone degrees of freedom, which are turned in a coordinated manner. Side-chain angles are updated one by one. Every update involves a Metropolis accept/reject step, thus ensuring detailed balance. All our simulations are started from random configurations. All statistical errors quoted are 1
errors obtained from the variation between independent runs. For each peptide, we performed
10 independent runs. Each run contained 109 elementary Monte Carlo steps (1.5 x 109 steps for GB1p) and required 12 CPU days on a 1.6-GHz computer.
To characterize the folding behavior of the different peptides, we monitor several quantities. For a peptide with N amino acids, we define the
-helix content H as the fraction of the N2 inner amino acids with their Ramachandran (
,
) pair in the region 90° <
< 30°, 77° <
< 17°. We calculate the radius of gyration, Rg, over the backbone atoms, with unit mass for all atoms. We also study root mean-square deviations (RMSD) from folded reference structures, calculated over either the backbone atoms or all heavy atoms. A backbone RMSD is denoted by
b and a heavy-atom RMSD by
. For the ß-sheet peptides, there exist topologically distinct states that the backbone RMSD cannot discriminate between, which makes it necessary to use the heavy-atom RMSD.
In our analysis of the results from the simulations, it turns out that the temperature-dependence of a quantity X in many cases can be well described by the simple two-state expression
![]() | (9) |
E], where Tm is the midpoint temperature and
E = EuEn is the energy difference between the unfolded and native states. With these assumptions, a fit to Eq. 9 has four parameters:
E, Tm, Xu, and Xn. | RESULTS AND DISCUSSION |
|---|
|
|
|---|
Trp cage
The optimized 20-residue Trp cage (NLYIQWLKDGGPSSGRPPPS) is a miniprotein with a compact folded state and a melting temperature of 315 K, as determined by circular dichroism (CD) and nuclear magnetic resonance (NMR) measurements (Neidigh et al., 2002
). The NMR-derived native structure (Neidigh et al., 2002
) contains a short
-helix (residues 28), a single turn of 310-helix (residues 1114), and a hydrophobic core consisting of three proline residues (Pro12, Pro18, Pro19) and two aromatic residues (Tyr3, Trp6). The folding time is a few microseconds at room temperature (Qiu et al., 2002
). Its small size, fast folding, and relative stability makes the Trp cage an ideal testbed for computational methods, and folding simulations of this peptide were reported by several groups (Snow et al., 2002
; Simmerling et al., 2002
; Schug et al., 2003
; Pitera and Swope, 2003
; Zhou, 2003a
). Two of these groups performed thermodynamic studies (Pitera and Swope, 2003
; Zhou, 2003a
). Both groups made detailed comparisons with raw NMR data with very good results, but the calculated melting temperatures were too high (
400 K).
In our model the melting temperature of the Trp cage is, by definition, equal to its experimental value, since we use this quantity to set the energy scale of the model. For this purpose, we consider the helix content H, as defined in the previous section, which should be strongly correlated with the CD signal studied experimentally. Fig. 2 a shows our results for H against temperature. A fit to the data with the two-state expression in Eq. 9 is also shown. As can be seen in the figure, the two-state fit provides an excellent description of the data. The midpoint temperature from this fit, Tm, is set to 315 K, the experimental melting temperature. Having done that, there is no free parameter left in the model. The fitted value of the parameter
E = 11.5 ± 0.2 kcal/mol is, in contrast to that of Tm, not used for calibration, but is rather a prediction of the model.
|
E]}. Fig. 3 shows the native population obtained using the above mentioned
E and Tm, against temperature, along with experimental values based on CD and NMR (Neidigh et al. 2002
5% at the lowest temperatures. With the overall energy scale properly determined, we thus find that the melting behavior of this peptide is well described by the model.
|
30% (see Fig. 2 a). An RMSD analysis confirms that the typical low-temperature structure is similar to the NMR structure (PDB code 1L2Y, first model), as illustrated in Fig. 2 b. This figure shows the free energy F(
b, E) calculated as a function of the backbone RMSD
b (residues 219) and the energy E, at 275 K. We see that F(
b, E) has a simple shape with one dominating minimum, which is located at
b
2.3 Å.
Fs
The designed 21-residue Fs peptide is given by Suc-A5(AAARA)3A-NH2, (where Suc is succinylic acid) and makes an
-helix (Lockhart and Kim, 1992
, 1993
). Other N-capping groups than Suc have also been used in the experiments on this peptide. The melting behavior of Fs was studied using CD as well as infrared (IR) spectroscopy. The melting temperature measured by IR was 334 K (Williams et al., 1996
), whereas the CD-based studies obtained Tm = 308 K (Lockhart and Kim, 1993
) and Tm = 303 K (Thompson et al., 1997
). Computational studies of Fs have also been reported (Vila et al., 2000
; García and Sanbonmatsu, 2002
; Nymeyer and García, 2003
). By explicit water simulations, García and Sanbonmatsu (2002)
obtained a Tm of 345 K, which is in reasonable agreement with the IR-based value. Using an earlier version of our model and ignoring the capping groups, a Tm of 310 K was obtained (Irbäck et al., 2003
). In the present calculations, we include the Suc and NH2 groups.
Fig. 4 a shows the helix content versus temperature as obtained from our Fs calculations. A two-state fit of the data gives Tm = 304 ± 1 K, which is significantly lower than the IR-based result mentioned above but in perfect agreement with the CD studies, especially that of Thompson et al. (1997)
. For the energy difference, we obtain
E = 11.9 ± 0.3 kcal/mol, which also agrees with what Thompson and co-workers found, namely
E = 12 ± 2 kcal/mol. It may be worth noting that the experimental data that we compared with in the Trp cage case were based on CD rather than IR.
|
b, E) at 275 K. In the absence of a precise experimental structure for Fs, we define
b as the (backbone) RMSD from an ideal
-helix (all residues). From the figure we see that the free energy has its global minimum at
b
0.5 Å, which indeed corresponds to the
-helix. There are also two local minima at
b
7 Å and
b
11 Å, both of which correspond to ß-sheet structures. These two minima are very weakly populated compared to the
-helix minimum.
GB1p and GB1m2/GB1m3
Using exactly the same model, we now turn to ß-sheet peptides. That GB1p (GEWTYDDATKTFTVTE), the 4156-residue fragment from the protein G B1 domain, makes a ß-hairpin on its own, was a breakthrough discovery (Blanco et al., 1994
) that has been followed by numerous atomic simulations of this particular sequence (Roccatano et al., 1999
; Pande and Rokhsar, 1999
; Dinner et al., 1999
; García and Sanbonmatsu, 2001
; Zhou et al., 2001
; Zhou, 2003b
; Zagrovic et al., 2001
; Kussell et al., 2002
; Bolhuis, 2003
; Wei et al., 2004
). Recently, two mutants of GB1p with enhanced stability were designed (Fesinmeyer et al., 2004
), GB1m2 and GB1m3, by replacing the turn segment DDATKT by NPATGK. The mutant GB1m2 (GEWTYNPATGKFTVTE) is identical to GB1p except for this change, whereas GB1m3 (KKWTYNPATGKFTVQE) differs from GB1p at the chain ends as well. By CD and NMR, GB1m3 was estimated to be 86 ± 3% folded at 298 K and to have a Tm of 333 ± 2 K, whereas GB1m2 was found to have a slightly lower folded population, 74 ± 5% at 298 K, and a Tm of 320 ± 2 K (Fesinmeyer et al., 2004
). In the same study, GB1p was estimated to be
30% folded at 298 K. An earlier NMR study found GB1p to be 42% folded at 278 K (Blanco et al., 1994
). Both these estimates of native population for GB1p are low compared to the result of a Trp fluorescence study (Muñoz et al., 1997
); a two-state analysis of these data gave Tm = 297 K and
E = 11.6 kcal/mol (Muñoz et al., 1997
).
It turns out that our model fails to reproduce the experimental difference in stability between GB1m2 and GB1m3. In fact, GB1m2 and GB1m3 show nearly identical behavior in our model. For clarity, we therefore show results only for one of these peptides, GB1m3, in the figures below.
Fig. 5 shows the hydrophobicity energy Ehp against temperature for GB1p and GB1m3 in the model. We expect Ehp to be strongly correlated with Trp fluorescence for these peptides, as Trp43 forms a hydrophobic cluster together with Tyr45, Phe52, and Val54. A two-state fit to our data for GB1p gives Tm = 297 ± 1 K and
E = 14.2 ± 0.2 kcal/mol, which indeed is in good agreement with the Trp fluorescence results for this peptide (Tm = 297 K,
E = 11.6 kcal/mol). The same type of fit gives Tm = 321 ± 1 K and
E = 15.0 ± 0.4 kcal/mol for GB1m3, and Tm = 322 ± 2 K and
E = 15.1 ± 0.4 kcal/mol for GB1m2. These two very similar Tm estimates lie close to the experimental result for GB1m2 (320 ± 2 K) and somewhat below that for GB1m3 (333 ± 2 K). Our Ehp data indicate that GB1m2 and GB1m3 indeed are markedly more stable than GB1p in the model, which is confirmed by the results discussed next.
|
,E) for GB1p, at 275 K. On its own the GB1p fragment is believed to adopt a folded structure similar to that it has as part of the native protein G B1 domain, although the NMR restraints were insufficient to determine a unique structure for the excised fragment. As reference structure in the calculation of
, we therefore use the corresponding fragment of the NMR structure for the full protein G B1 domain (PDB code 1GB1, residues 4156, first model; see Gronenborn et al., 1991
is used instead of the backbone RMSD
b, because
b cannot distinguish between the two possible ß-hairpin topologies (with similar backbone folds but oppositely oriented side chains). We find that the two lowest minima of F(
,E), at
2.0 Å and
3.2 Å, both correspond to a ß-hairpin with the same topology and the same set of backbone hydrogen bonds as the reference structure. The main difference between these two minima lies in the shape of the turn region. In addition to these minima, there are two weakly populated local minima at
5.3 Å and
810 Å, which correspond to a ß-hairpin with the opposite topology and
-helix, respectively. The shape of F(
, E) for GB1p was also studied using earlier versions of our model (Irbäck et al., 2003
|
,E) has a simpler shape for GB1m3 than for GB1p. There is only one detectable free-energy minimum for GB1m3, and this minimum corresponds to a structure similar to that favored for GB1p.
Different experiments on GB1p have, as mentioned above, obtained different ß-hairpin populations. One way of estimating folded populations in the model is by two-state fits like those in Fig. 5. An independent and more direct estimate can be obtained by counting native backbone hydrogen bonds. To this end, we consider a hydrogen bond formed if its energy is
. The number of native backbone hydrogen bonds in a given conformation is denoted by
. Fig. 7 shows the probability distribution of
for GB1p and GB1m3 at 299 K, which is very close to the temperature (298 K) at which the folded populations of these two peptides were compared by CD and NMR (Fesinmeyer et al., 2004
). We find that the probability distribution
has a clear bimodal shape for both peptides, with one native and one unfolded peak. The native peak is, as expected from the results above, significantly larger for the mutant GB1m3 than for GB1p. Taking conformations with
as native and those with
as unfolded, we obtain native populations of 82 ± 1% for GB1m3, 84 ± 1% for GB1m2, and 27 ± 2% for GB1p. The overall agreement between these results and the experimental data (86 ± 3% for GB1m3, 74 ± 5% for GB1m2,
30% for GB1p) is very good, although the model slightly overestimates the folded fraction for GB1m2. Note that the native populations estimated from
, thanks to the bimodality, are quite well determined, despite that the precise definition of native in terms of
is somewhat arbitrary.
|
are very rare (see Fig. 7).
Our Ehp- and
-based native populations for GB1p are different; from the Ehp data we obtain a native population of 46% at 299 K, where the
analysis gives 27%. The magnitude of this difference is similar to that between different experiments. The
-based result is in good agreement with CD and NMR data, whereas the Ehp-based result agrees with Trp fluorescence data. For GB1m3 (and GB1m2), we do not know of any Trp fluorescence study. Our model suggests that the difference between different methods would be smaller in this case. Our Ehp-based folded population at 299 K is 85% for GB1m3, which is close to our
-based result of 82%.
Betanova and LLM
Betanova is a designed antiparallel three-stranded ß-sheet peptide with 20 residues (RGWSVQNGKYTNNGKTTEGR) (Kortemme et al., 1998
), which is only marginally stable (López de la Paz et al., 2001
). Recently, Betanova mutants with higher stability were developed (López de la Paz et al., 2001
), such as the triple mutant LLM (Val5Leu, Asn12Leu, Thr17Met). The NMR-based native populations of LLM and Betanova are 36% and 9%, respectively, at 283 K (López de la Paz et al., 2001
). Results in good agreement with these estimates were obtained when testing an earlier version of our model on these two peptides (Irbäck and Sjunnesson, 2004
). Folding simulations of Betanova have also been performed by other groups, using coarse-grained (Kim et al., 2004
) and atomic (Bursulaya and Brooks, 1999
; Colombo et al., 2002
) models.
The folded structure of Betanova and LLM contains eight backbone hydrogen bonds, four in each of the two ß-hairpins. Fig. 8 a shows the probability distribution of the number of native backbone hydrogen bonds,
, in our model for LLM and Betanova, at 287 K. The distributions have three peaks. In addition to the folded and unfolded peaks at high and low
, there is also a peak at
. Visual inspection of snapshots from the simulations reveals that conformations at this peak tend to contain the first (N-terminal) ß-hairpin but not the second (C-terminal) one. This conclusion, which is in agreement with experimental data (López de la Paz et al., 2001
), is confirmed by the frequencies of occurrence of the individual hydrogen bonds, shown in Fig. 8 b. We see that the hydrogen bonds of the first ß-hairpin (14) occur more frequently than those of the second ß-hairpin (58), especially for Betanova. For a conformation to be counted as folded, we require that
. With this definition, we find that Betanova and LLM are 6 ± 1% and 38 ± 2% folded, respectively, at 287 K, which is in good agreement with the experimental results (9% and 36% at 283 K).
|
E = 8.9 ± 0.1 kcal/mol for Betanova, and Tm = 302 ± 1 K and
E = 10.9 ± 0.2 kcal/mol for LLM. These fitted two-state parameters contrast sharply with the results of the
analysis above, especially for Betanova. In fact, for Betanova, the fitted two-state parameters correspond to a native population of 80% at the temperature 287 K, at which Betanova was estimated above to be only 6% folded. This discrepancy between the native populations obtained using Ehp and
data clearly show that, in our model, these two peptides do not behave as ideal two-state systems. It is worth noting that the quality of the two-state fits in Fig. 9 a, nevertheless, is very good, which illustrates that deviations from the simple two-state picture can be very hard to detect from the temperature-dependence of a single quantity (Favrin et al., 2003
|
,E) for Betanova at 275 K. Like for the ß-hairpins, we use all the heavy atoms in the RMSD, but limit the comparison to the residues 318. The residues 1, 2, 19, and 20 do not participate in the ß-sheet structure. There is a local minimum at
3.2 Å representing the state obtained in our model that most resembles the NMR structure. That this state is not the most probable state in the model is consistent with the low native population found experimentally for this peptide. The corresponding graph for LLM shows a much more prominent minimum representing the native conformation.
The character of the melting transition
For GB1p, Betanova, and LLM, we saw above that the apparent native population depends on which quantity we study. This dependence reflects the fact that these peptides do not show ideal two-state behavior in the model. A quantity for which we obtain a relatively high apparent melting temperature not only for these three peptides but for all the peptides studied, is the radius of gyration, Rg. The Tm values obtained from our Rg data for Fs and the Trp cage are 29 K and 9 K higher, respectively, than what we found above using the helix content. For GB1m3, our Rg data gives a Tm that is 6 K higher than that obtained above using the hydrophobicity energy. These comparisons show that none of the peptides studied behaves as a perfect two-state system in our model, although the deviations from this behavior might be relatively small for some of them, such as GB1m3.
One measure of the sharpness of the melting transition is the height of the peak in the specific heat, Cv. In Fig. 10, we show specific heat curves for the different peptides studied. The results for GB1m2 are again very similar to those for GB1m3 and therefore omitted. The specific heat exhibits a clear peak for all the peptides studied, but the height of the peak varies. The peak is highest for GB1m3, indicating that the melting transition is most two-state-like for this peptide. A comparison of the energy distributions of the different peptides (not shown) supports this conclusion. For GB1m3, we find that the energy distribution has a bimodal shape, although not very pronounced. The other peptides all have wide but single-peaked distributions. The distribution is particularly wide, virtually flat, for GB1p, which has the next highest peak in Cv.
|
| CONCLUSION |
|---|
|
|
|---|
20 amino acids each, namely the Trp cage, Fs, GB1p, GB1m2, GB1m3, Betanova, and LLM. First of all, our study shows that the model folds these different sequences to structures similar to their experimental structures, for one and the same choice of model parameters. In addition, we investigated the stability and melting behavior of the peptides. The following list is a brief summary of these calculations, focusing on the observables expected to be correlated with the corresponding experimental probes.
E values that are in good agreement with CD data, whereas the Tm value is somewhat low compared to its IR-based value.
-helical Trp cage.
|
The temperature-dependence of the model is, to us, surprisingly good, for two reasons. First, the temperature-dependence was not considered at all when calibrating the model, except in the determination of the energy scale. A considerable amount of fine-tuning was required to obtain proper folded structures, but no further fine-tuning was performed once that goal had been achieved. Second, our calculations do not involve any reparameterization of the energy function. In other words, the parameters of the energy function are temperature-independent, which is a simplifying assumption rather than a controlled approximation. On the other hand, it should be noted that the melting transition is not triggered by a sudden change in, for example, the strength of the hydrophobic attraction.
In the development of this model, we have taken a purely phenomenological approach. The model will be further developed by studying new amino acid sequences, which will impose new conditions on the interaction potential. As before, the challenge will be to do this in a backward-compatible manner; the model must not lose its ability to fold previously studied sequences. As to limitations of the current version of the model, we know that it is unable to properly fold the so-called Trp-zip ß-hairpins (Cochran et al., 2001
), which make ß-hairpins in the model but with the wrong topology. We also expect that refinement of the model will be needed as the chains get larger. For example, as mentioned earlier, it is likely that our pairwise additive hydrophobicity potential will have to be supplemented with multibody terms for large chains. Finding out how to change the model to make it more general without losing computational efficiency will not be an easy task, but the results obtained so far makes it tempting to try.
| ACKNOWLEDGEMENTS |
|---|
|
|
|---|
This work was in part supported by the Swedish Research Council and the Knut and Alice Wallenberg Foundation through the Swegene consortium.
Submitted on July 25, 2004; accepted for publication December 1, 2004.
| REFERENCES |
|---|
|
|
|---|
Bolhuis, P. G. 2003. Transition-path sampling of ß-hairpin folding. Proc. Natl. Acad. Sci. USA. 14:1212912134.
Brändén, C., and J. Tooze. 1991. Introduction to Protein Structure. Garland Publishing, New York.
Bursulaya, B. D., and C. L. Brooks III. 1999. Folding free energy surface of a three-stranded ß-sheet protein. J. Am. Chem. Soc. 121:99479951.[CrossRef]
Cochran, A. G., N. J. Skelton, and M. A. Starovasnik. 2001. Tryptophan zippers: stable, monomeric ß-hairpins. Proc. Natl. Acad. Sci. USA. 98:55785583.
Colombo, G., D. Roccatano, and A. E. Mark. 2002. Folding and stability of the three-stranded ß-sheet peptide Betanova: insights from molecular dynamics simulations. Proteins. 46:380392.[CrossRef][Medline]
Dinner, A. R., T. Lazaridis, and M. Karplus. 1999. Understanding ß-hairpin formation. Proc. Natl. Acad. Sci. USA. 96:90689073.
Dobson, C. M. 2003. Protein folding and misfolding. Nature. 426:884890.[CrossRef][Medline]
Dyson, H. J., and P. E. Wright. 2002. Coupling of folding and binding for unstructured proteins. Curr. Opin. Struct. Biol. 12:5460.[CrossRef][Medline]
Favrin, G., A. Irbäck, and F. Sjunnesson. 2001. Monte Carlo update for chain molecules: biased Gaussian steps in torsional space. J. Chem. Phys. 114:81548158.[CrossRef]
Favrin, G., A. Irbäck, B. Samuelsson, and S. Wallin. 2003. Two-state folding over a weak free-energy barrier. Biophys. J. 85:14571465.
Favrin, G., A. Irbäck, and S. Mohanty. 2004. Oligomerization of amyloid Aß1622 peptides using hydrogen bonds and hydrophobicity forces. Biophys. J. 87:36573664.
Ferrara, P., J. Apostolakis, and A. Caflisch. 2002. Evaluation of a fast implicit solvent model for molecular dynamics simulations. Proteins. 46:2433.[CrossRef][Medline]
Ferrenberg, A. M., and R. H. Swendsen. 1988. New Monte Carlo technique for studying phase transitions. Phys. Rev. Lett. 61:26352638.[CrossRef][Medline]
Fesinmeyer, R. M., F. M. Hudson, and N. H. Andersen. 2004. Enhanced hairpin stability through loop design: the case of the protein G B1 domain hairpin. J. Am. Chem. Soc. 126:72387243.[CrossRef][Medline]
García, A. E., and K. Y. Sanbonmatsu. 2001. Exploring the energy landscape of a ß-hairpin in explicit solvent. Proteins. 42:345354.[CrossRef][Medline]
García, A. E., and K. Y. Sanbonmatsu. 2002.
-helical stabilization by side chain shielding of backbone hydrogen bonds. Proc. Natl. Acad. Sci. USA. 99:27822787.
Gnanakaran, S., H. Nymeyer, J. Portman, K. Y. Sanbonmatsu, and A. E. García. 2003. Peptide folding simulations. Curr. Opin. Struct. Biol. 13:168174.[CrossRef][Medline]
Gronenborn, A. M., D. R. Filpula, N. Z. Essig, A. Achari, M. Whitlow, P. T. Wingfield, and G. M. Clore. 1991. A novel, highly stable fold of the immunoglobulin-binding domain of streptococcal protein G. Science. 253:657661.
Hansmann, U. H. E., and Y. Okamoto. 1999. New Monte Carlo algorithms for protein folding. Curr. Opin. Struct. Biol. 9:177183.[CrossRef][Medline]
Hansmann, U. H. E., and L. T. Wille. 2002. Global optimization by energy landscape paving. Phys. Rev. Lett. 88:068105.[CrossRef][Medline]
Hassan, S. A., E. L. Mehler, D. Zhang, and H. Weinstein. 2003. Molecular dynamics simulations of peptides and proteins with a continuum electrostatic model based on screened Coulomb potentials. Proteins. 51:109125.[CrossRef][Medline]
Irbäck, A., and F. Potthast. 1995. Studies of an off-lattice model for protein folding: sequence dependence and improved sampling at finite temperature. J. Chem. Phys. 103:1029810305.[CrossRef]
Irbäck, A., B. Samuelsson, F. Sjunnesson, and S. Wallin. 2003. Thermodynamics of
- and ß-structure formation in proteins. Biophys. J. 85:14661473.
Irbäck, A., and F. Sjunnesson. 2004. Folding thermodynamics of three ß-sheet peptides: a model study. Proteins. 56:110116.[CrossRef][Medline]
Kim, S. Y., J. Lee, and J. Lee. 2004. Folding of small proteins using a single continuous potential. J. Chem. Phys. 120:82718276.[CrossRef][Medline]
Kobayashi, N., S. Endo, and E. Munekata. 1993. Conformational study on the IgG binding domain of protein G. In Peptide Chemistry. N. Yanaihara, editor. ESCOM, Leiden, The Netherlands. 278281.
Kortemme, T., M. Ramírez-Alvarado, and L. Serrano. 1998. Design of a 20-amino acid, three-stranded ß-sheet protein. Science. 281:253256.
Kussell, E., J. Shimada, and E. I. Shakhnovich. 2002. A structure-based method for derivation of all-atom potentials for protein folding. Proc. Natl. Acad. Sci. USA. 99:53435348.
Lockhart, D. J., and P. S. Kim. 1992. Internal Stark Effect measurement of the electric field at the amino acid terminus of an
-helix. Science. 257:947951.
Lockhart, D. J., and P. S. Kim. 1993. Electrostatic screening of charge and dipole interactions with the helix backbone. Science. 260:198202.
López de la Paz, M., E. Lacroix, M. Ramírez-Alvarado, and L. Serrano. 2001. Computer-aided design of ß-sheet peptides. J. Mol. Biol. 312:229246.[CrossRef][Medline]
Lyubartsev, A. P., A. A. Martsinovski, S. V. Shevkunov, and P. N. Vorontsov-Velyaminov. 1992. New approach to Monte Carlo calculation of the free energy: method of expanded ensembles. J. Chem. Phys. 96:17761783.[CrossRef]
Marinari, E., and G. Parisi. 1992. Simulated tempering: a new Monte Carlo scheme. Europhys. Lett. 19:451458.[CrossRef]
Muñoz, V., P. A. Thompson, J. Hofrichter, and W. A. Eaton. 1997. Folding dynamics and mechanism of ß-hairpin formation. Nature. 390:196199.[CrossRef][Medline]
Neidigh, J. W., R. M. Fesinmeyer, and N. H. Andersen. 2002. Designing a 20-residue protein. Nat. Struct. Biol. 9:425430.[CrossRef][Medline]
Nymeyer, H., and A. E. García. 2003. Simulation of the folding equilibrium of
-helical peptides: a comparison of the generalized Born approximation with explicit solvent. Proc. Natl. Acad. Sci. USA. 100:1393413939.
Pande, V. S., and D. S. Rokhsar. 1999. Molecular dynamics simulations of unfolding and refolding of a ß-hairpin fragment of protein G. Proc. Natl. Acad. Sci. USA. 96:90629067.
Pitera, J. W., and W. Swope. 2003. Understanding folding and design: replica-exchange simulations of "Trp-cage" miniproteins. Proc. Natl. Acad. Sci. USA. 100:75877592.
Press, W. H., B. P. Flannery, S. A. Teukolsky, and W. T. Vetterling. 1992. Numerical Recipes in C: The Art of Scientific Computing. Cambridge University Press, Cambridge, UK.
Qiu, L., S. A. Pabit, A. E. Roitberg, and S. J. Hagen. 2002. Smaller and faster: the 20-residue Trp-cage protein folds in 4 µs. J. Am. Chem. Soc. 124:1295212953.[CrossRef][Medline]
Roccatano, D., A. Amadei, A. Di Nola, and H. J. C. Berendsen. 1999. A molecular dynamics study of the 4156 ß-hairpin from B1 domain of protein G.