Biophys J, December 2002, p. 3032-3038, Vol. 83, No. 6


*ISIS, Laboratoire de Chimie Biophysique, Université Louis Pasteur, 67000 Strasbourg, France; †Biochemisches Institut der Universität Zürich, 8057 Zürich, Switzerland; ‡Cambridge University Chemical Laboratory, Cambridge CB2 1EW, United Kingdom; and §Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts 02138 USA
ABSTRACT
Do G

values), whereas those that depend on the former are only qualitative, at best.
INTRODUCTION
Protein folding is one of the essential reactions in living systems. Recently, attention has focused on this reaction not only because of its fundamental role (Fersht, 1999), but also because of the interest in protein folding generated by the availability of many protein sequences from a rapidly increasing number of genomes and the realization that misfolded proteins are involved in disease (Dobson, 1999, 2001). Considerable progress has been made in achieving an understanding of the folding reaction by the use of simplified models (Bryngelson et al., 1995; Chan and Dill, 1998; Dinner et al., 2000). In particular, the problem posed by the "Levinthal Paradox" (namely that a polypeptide chain can find its unique native structure in spite of the very large number of possible denatured conformations) has been solved. It has been shown that a reasonable energy bias toward the native state can reduce the search of conformation space sufficiently for folding to take place on the experimental time scale (Karplus, 1997
). One model, referred to as the "Gō

; Takada, 1999), has been widely used in studies of protein folding (Zhou and Karplus, 1999; Alm and Baker, 1999; Muñoz and Eaton, 1999; Galzitskaya and Finkelstein, 1999; Ozkan et al., 2001; Vendruscolo et al., 2001; Shimada et al., 2001). It is characterized by an energy function that replaces the nonbonded interactions (van der Waals and electrostatic terms) by attractive native-state contact energies; in some cases, non-native repulsions are also present (Zhou and Karplus, 1999; Shimada et al., 2001
). Applications of Gō models include phenomenological descriptions (Alm and Baker, 1999; Muñoz and Eaton, 1999; Galzitskaya and Finkelstein, 1999), lattice model calculations (Ozkan et al., 2001), and three-dimensional Cα or all-atom folding simulations by molecular dynamics (Zhou and Karplus, 1999) and Monte Carlo methods (Shimada et al., 2001); in the latter an excluded volume term is added to prevent collapse of the structure. For the phenomenological descriptions (Alm and Baker, 1999; Muñoz and Eaton, 1999; Galzitskaya and Finkelstein, 1999), the Gō

).
As a consequence, trajectories calculated with Gō

; Vendruscolo et al., 2001; Shimada et al., 2001).
Given the widespread use of Gō

). In an earlier paper (Paci et al., 2002b), we showed that the contact approximation with a cut-off radius of 5.5 Å for the EEF1 potential gives an excellent approximation to the total energy for native and non-native configurations of proteins, even when non-native interactions contribute significantly. It was also shown there that the inclusion of solvent shielding is essential for the validity of the contact model. Here, we examine the validity of the most widely used form of the Gō

), which uses a single parameter to relate a
residue-residue contact to its interaction energy, are similar.
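The single-parameter contact picture described above can be illustrated with a minimal sketch. It assumes that a contact is any pair of atoms from two different residues closer than the 5.5 Å cut-off, and that every contact contributes the same energy. The names `count_contacts`, `go_energy`, and `epsilon`, the toy coordinates, and the numerical value of `epsilon` are hypothetical and are not taken from the paper.

```python
from itertools import combinations
from math import dist

CUTOFF = 5.5  # Å, the contact cut-off radius quoted in the text


def count_contacts(res_a, res_b, cutoff=CUTOFF):
    """Count atom pairs (one atom from each residue) closer than the cut-off."""
    return sum(1 for a in res_a for b in res_b if dist(a, b) < cutoff)


def go_energy(residues, epsilon):
    """Assumed single-parameter form: epsilon times the total number of contacts."""
    total_contacts = sum(
        count_contacts(res_a, res_b) for res_a, res_b in combinations(residues, 2)
    )
    return epsilon * total_contacts


# Toy "structure": each residue is a list of atom coordinates in Å (made up).
residues = [
    [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)],
    [(4.0, 0.0, 0.0)],
    [(20.0, 0.0, 0.0)],
]

# epsilon = -0.05 kcal/mol per contact is an arbitrary illustrative value.
toy_energy = go_energy(residues, epsilon=-0.05)
print(toy_energy)
```

In this toy case only the first two residues are in contact (two atom pairs within the cut-off), so the energy is twice the per-contact value.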
EEF1 and Gō
). The EEF1 energy function can be decomposed into a sum of pairwise residue-residue interactions, EIJ, where I and J correspond to the residues. For each geometry or for an ensemble of geometries, such as those representing the unfolding state (see Methods), we can write the EEF1 energy in the form
|
(1) |

2, the bonded
terms make no contribution, and EIJ can be
written (see Methods)
|
(2) |



|
(3) |

|
(4) |


is a constant parameter. In the
common implementation of G
|
(5) |
is a proportionality constant determined by a fitting
procedure (Muñoz and Eaton, 1999
NATIVE-STATE ANALYSIS
Figure 1 shows the distribution of the interaction energy for the interacting residue pairs. It has a peak close to zero, and the average is −1.2 ± 1.8 kcal/mol for the eight proteins. It is evident that the contact energies cover a wide range and that the use of a single coefficient, as in Eq. 4, is a rough approximation.
Figure 2a shows a scatter plot of the relationship between EIJ as calculated for the native structure with EEF1 and the number of contacts between residues I and J. Data for four proteins are included in the figure (see caption); the lines represent the best (least-squares) fit for individual proteins and for all the proteins simultaneously. Although a qualitative relationship is evident, there is considerable scatter in the distribution.
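A least-squares estimate of a single per-contact coefficient, of the kind represented by the fitted lines in the scatter plot, could be obtained as sketched here. The sketch assumes a line through the origin (the text does not say whether an intercept was fitted), and the data points are made up for illustration; `fit_single_parameter` is a hypothetical helper.

```python
def fit_single_parameter(n_contacts, pair_energies):
    """Least-squares slope of E_IJ = epsilon * N_IJ, with no intercept (assumed)."""
    numerator = sum(n * e for n, e in zip(n_contacts, pair_energies))
    denominator = sum(n * n for n in n_contacts)
    return numerator / denominator


# Made-up data: number of contacts N_IJ and pair energy E_IJ (kcal/mol).
n_contacts = [1, 2, 4]
pair_energies = [-0.1, -0.3, -0.35]

epsilon = fit_single_parameter(n_contacts, pair_energies)

# Residuals measure the scatter that a single coefficient cannot capture.
residuals = [e - epsilon * n for n, e in zip(n_contacts, pair_energies)]
print(epsilon, residuals)
```

Even when the fitted slope is well defined, the residuals for individual residue pairs can be comparable to the pair energies themselves, which is the scatter the text refers to.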
Table 1 shows the calculated total energies obtained for the native states of eight proteins from the EEF1 energy function in the column headed E(EEF1) (Eq. 1). The next column, Econt(EEF1), shows the result obtained with the contact approximation, E

P) and with an average parameter for all the proteins
(
P and 
P;
with 
P', which is chosen so that Eq. 5 yields
the native state (EEF1) energy for each protein; the values are also
listed in Table 2. As can be seen,
P and
P' are very similar for each protein (within 0.01 kcal/mol). Nevertheless, because the number of residue-residue
contacts is in the range 3000-7000 (as listed in Table 1), such small
differences lead to large changes in the total energy. This provides a
cautionary note on the use of such formulations.
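The sensitivity noted above follows from simple arithmetic: in a single-parameter model the total energy is the parameter multiplied by the number of contacts, so any shift in the parameter is multiplied by the contact count. The sketch below uses the 0.01 kcal/mol difference and the 3000-7000 contact range quoted in the text; the function name is hypothetical.

```python
def total_energy_shift(delta_epsilon, n_contacts):
    """Change in total energy caused by a change in the per-contact parameter."""
    return delta_epsilon * n_contacts


# Parameter difference (kcal/mol) and contact counts taken from the text.
shifts = {n: total_energy_shift(0.01, n) for n in (3000, 7000)}

for n, shift in shifts.items():
    print(f"{n} contacts: about {shift:.0f} kcal/mol")
```

A parameter difference of up to 0.01 kcal/mol therefore corresponds to a change of roughly 30 to 70 kcal/mol in the total energy.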
In applications of G
;
Xu et al., 1998
; Cota et al., 2000
), and for transition states
(Vendruscolo et al., 2001
), the quantity of interest is not the
interaction energy of residue pairs, but rather the interaction energy
per residue, E

J
N
|
(6) |
|
(7) |
P) and for all the proteins simultaneously (
P for the various proteins obtained from the
fits to Eq. 6 are given in Tables 1 and 2. The values of
P are very similar to
P and to
P' and, in most cases, are closer than
P
to
P', the parameters that fit the total energy. This is
in accord with the fact that
EGo(
P) is a better approximation
to E(EEF1) than is
EGo(
P) in most cases (see Table
1). This is an important result because it provides a justification for
the use of Eq. 6 in the analysis of protein stability and transition
state structures.
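The per-residue quantity discussed above can be sketched as a sum of pair terms over all partners of a residue. The exact form of the per-residue equation (for example, whether each pair term is shared between the two residues) is not reproduced here, so the plain sum below is an assumption; the helper names, the toy values, and the per-contact parameter are hypothetical.

```python
from collections import defaultdict


def per_residue_energy(pair_energies):
    """Sum the pair energies E_IJ over all partners J of each residue I."""
    totals = defaultdict(float)
    for (i, j), energy in pair_energies.items():
        totals[i] += energy
        totals[j] += energy
    return dict(totals)


def per_residue_contact_energy(pair_contacts, epsilon):
    """Single-parameter estimate: epsilon times the contacts made by each residue."""
    counts = defaultdict(int)
    for (i, j), n in pair_contacts.items():
        counts[i] += n
        counts[j] += n
    return {res: epsilon * n for res, n in counts.items()}


# Made-up pair energies (kcal/mol) and contact counts for three residues.
pair_energies = {(0, 1): -1.0, (0, 2): -0.5, (1, 2): -0.25}
pair_contacts = {(0, 1): 8, (0, 2): 4, (1, 2): 2}

reference = per_residue_energy(pair_energies)
estimate = per_residue_contact_energy(pair_contacts, epsilon=-0.125)
print(reference, estimate)
```

The toy numbers are chosen so that the two dictionaries agree exactly; for real proteins the comparison of the two is what the per-residue fit quantifies.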
NON-NATIVE INTERACTIONS
The determination of the contribution of non-native contacts along folding pathways is important for an evaluation of Gō
Transition states
In Table 3, we show the results for the transition state ensembles (TSE) determined by constraining the calculated Φ values to be equal to the experimental ones, using molecular dynamics and the EEF1 potential (see Methods). The values in Table 3 are obtained by considering a single representative structure of the TSE; corresponding results are obtained if averages over the TSE are made. The structures in the ensembles have a root mean square deviation (RMSD) in the range 4-6 Å from the native state. The first two columns give the EEF1 energy, E(EEF1), used as a reference, and the contact energy calculated with EEF1, Econt(EEF1). As can be seen, the agreement is as good as it is for the native state (Table 1). The energies calculated using only the native contacts, EGo(EEF1), are, in all cases, less negative than the true energies. The non-native contribution, Enon-Go(EEF1), also given in the table, is in the range of −69 to −122 kcal/mol. This shows that non-native contacts contribute significantly to stabilizing the transition state. Table 3 also lists as
EGo(
P) and
Enon-Go(
P), the corresponding
results obtained using Eq. 5 with
P, the best fit of the
energy parameter to the native state for each protein. There are
significant deviations of EGo(
P)
from EGo(EEF1), in addition to the errors in the
latter. The deviations are both positive and negative (between −32 and +60 kcal/mol); in comparison to Econt(EEF1), the differences are all positive, as for EGo(EEF1).
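The separation of a contact energy into native and non-native parts, as used for the transition state results above, can be sketched as follows. The pair energies and the set of native pairs are made up for illustration, and `split_contact_energy` is a hypothetical helper, not a function from the paper.

```python
def split_contact_energy(pair_energies, native_pairs):
    """Split a total contact energy into native and non-native contributions."""
    e_native = 0.0
    e_non_native = 0.0
    for pair, energy in pair_energies.items():
        if pair in native_pairs:
            e_native += energy
        else:
            e_non_native += energy
    return e_native, e_non_native


# Made-up residue-pair energies (kcal/mol) for a non-native conformation.
pair_energies = {(0, 3): -2.0, (1, 5): -1.5, (2, 7): -0.5}
# Pairs that are also in contact in the native structure (assumed).
native_pairs = {(0, 3), (1, 5)}

e_go, e_non_go = split_contact_energy(pair_energies, native_pairs)
print(e_go, e_non_go, e_go + e_non_go)
```

Because the non-native term in this toy example is attractive (negative), the native-only energy is less negative than the full contact energy, which is the pattern described for the transition states.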
Thus, use of the G
P), results in a deeper well
for the native state relative to the transition state than do the
actual energy values; e.g., for procarboxypeptidase A2 (1aye), the EEF1
transition state energy is 50.5 kcal/mol above the native state,
whereas it is calculated to be 124.4 kcal/mol and 92.9 kcal/mol with
EGo(EEF1) and
EGo(
P), respectively. The same is
also true relative to the denatured state, in correspondence with the unrealistically deep funnel-like structure of the energy surface obtained with the Gō model.
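The barrier comparison quoted above for procarboxypeptidase A2 (1aye) can be checked with simple arithmetic. The three energies are taken from the text; the variable names are hypothetical.

```python
# Transition state energy above the native state for 1aye (kcal/mol), from the text.
barrier_eef1 = 50.5      # full EEF1 energy
barrier_go_eef1 = 124.4  # native contacts only, EEF1 pair energies
barrier_go_fit = 92.9    # native contacts only, single fitted parameter

# Amount by which the native-contact-only descriptions deepen the native well.
excess_go_eef1 = barrier_go_eef1 - barrier_eef1
excess_go_fit = barrier_go_fit - barrier_eef1

print(f"{excess_go_eef1:.1f} {excess_go_fit:.1f}")  # prints "73.9 42.4"
```

Both native-contact-only estimates place the transition state well above the EEF1 value, by about 74 and 42 kcal/mol, respectively.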
Thermally induced non-native conformations
Table 4 shows results corresponding to those in Table 3 for a set of non-native conformations of CI2 obtained by an unfolding simulation at 450 K, followed by a quenching simulation at 300 K (see Methods). The contact approximation is valid for these non-native states, as it is for native and transition states. However, the various Gō
P) are similar. For the
largest RMSD analyzed (11.7 Å) the stabilization arising from
non-native contacts, Enon-Go(EEF1) and
Enon-Go(
P) is larger than that
from the native (G
|
In some simulations (Shimada et al., 2001), it has been assumed that non-native contacts are repulsive, which leads to faster folding than the standard Gō

), varied the non-native
interactions over a range that included both attractive and repulsive
values. A non-native repulsive interaction somewhat weaker than the
corresponding native attractive interaction (a ratio of ~0.4) gave a
folding time closest to the experimental value.
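A contact energy with attractive native and weaker repulsive non-native terms, in the spirit of the variation described above, might look like the following sketch. Only the ratio of about 0.4 comes from the text; the functional form, the unit value of `epsilon`, and the function names are assumptions made for illustration.

```python
def contact_energy(is_native, epsilon=1.0, repulsion_ratio=0.4):
    """Energy of one contact: -epsilon if native, +repulsion_ratio * epsilon otherwise."""
    return -epsilon if is_native else repulsion_ratio * epsilon


def conformation_energy(contacts, epsilon=1.0, repulsion_ratio=0.4):
    """Sum the contact energies; `contacts` is an iterable of native/non-native flags."""
    return sum(contact_energy(flag, epsilon, repulsion_ratio) for flag in contacts)


# Toy conformation with three native contacts and two non-native contacts.
toy_contacts = [True, True, True, False, False]
toy_total = conformation_energy(toy_contacts)
print(toy_total)
```

Setting `repulsion_ratio` to a negative value would make the non-native contacts attractive, which covers the other end of the range mentioned in the text.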
VALIDITY OF Gō MODELS
The Gō
). As such, it has been very successful (Takada, 1999). Now the Gō
A comparison between an all-atom molecular-mechanics effective-energy function with solvent shielding and the Gō

; Xu et al., 1998; Cota et al., 2000) and provides a justification for the determination of the coarse-grained structure of transition-state ensembles based on such models (Vendruscolo et al., 2001; Paci et al., 2002a). Gō
; Vendruscolo et al., 2001). For other portions of the potential energy surface, particularly collapsed misfolded states, the non-native contacts neglected in Gō




, 2001
)).
METHODS
We use a molecular mechanics potential energy function (EEF1) for the atoms with an implicit solvent term. EEF1 is based on the CHARMM19 polar hydrogen representation (Neria et al., 1996) with a Gaussian model for solvation (Lazaridis and Karplus, 1999). The function has been used in a variety of applications concerned with the protein folding reaction (Lazaridis and Karplus, 1999), including the high-temperature unfolding of the protein CI2 (Lazaridis and Karplus, 1997), where good agreement was obtained with simulations that used an explicit representation of the solvent (Li and Daggett, 1994).
The effective energy, EEEF1(R), of a protein with conformation R includes the protein internal energy and the solvation free energy. Both can be written as a sum over all residue pairs. Details are given in Lazaridis and Karplus (1999).
Proteins and conformations used for analysis
Eight proteins were used in the analysis. They are acylphosphatase (PDB entry 1aps) (Pastore et al., 1992), chymotrypsin inhibitor 2 or CI2 (PDB entry 2ci2) (McPhalen and James, 1987), the α-spectrin Src homology 3 (SH3) domain (Blanco et al., 1997) (PDB entry 1aey), the third fibronectin type III repeat from tenascin (Leahy et al., 1992) (PDB entry 1ten), α-LA (Ren et al., 1993) (PDB entry 1hml), procarboxypeptidase A2 (Garcia-Saez et al., 1997) (PDB entry 1aye), an immunoglobulin-like module from the titin I-band (Improta et al., 1996) (PDB entry 1tit), and the cell-cycle regulatory protein p13suc1, SUC1 (Endicott et al., 1995). The experimental structure was, in all cases, minimized for 200 steepest descent steps to eliminate bad contacts.
Several types of non-native structures of interest for the understanding of protein folding and unfolding were examined. Transition state ensembles were obtained using an approach based on experimental Φ values to bias the trajectory (Vendruscolo et al., 2001; Paci et al., 2002a) toward conformations where the fraction of native contacts equals the experimental Φ value for those residues for which such a value has been measured. For CI2, high-temperature unfolded states were obtained by increasing the temperature of the Nosé-Hoover thermostat to 450 K during simulations of 1 ns or longer. Collapsed configurations were generated from the high-temperature conformations by decreasing the temperature to 300 K over 200-ps trajectories.
ACKNOWLEDGMENTS
E.P. acknowledges financial support from Forschungskredit der Universität Zürich. M.V. is a Royal Society University Research Fellow. The research of M.K. is supported in part by the Centre National de la Recherche Scientifique (ESA 7006), by the Ministère de l'Education Nationale, de la Recherche et de la Technologie (Strasbourg), and by a grant from the National Institutes of Health (Harvard).
FOOTNOTES
Address reprint requests to Martin Karplus, ISIS, Laboratoire de Chimie Biophysique, Université Louis Pasteur, 4 rue Blaise Pascal, 67000 Strasbourg, France. Tel.: +33-90-241560; Fax: +33-90-241562; E-mail: marci{at}tammy.harvard.edu
Submitted March 14, 2002, and accepted for publication May 1, 2002.
REFERENCES
Φ-values in protein folding kinetics. Nature Struct. Biol. 8:765-769.
α-lactalbumin possesses a distinct zinc binding site. J. Biol. Chem. 268:19292-19298.

© 2002 by the Biophysical Society 0006-3495/02/12/3032/07 $2.00