| HOME | HELP | FEEDBACK | SUBSCRIPTIONS | ARCHIVE | SEARCH | TABLE OF CONTENTS |
Biophys J, August 2002, p. 723-732, Vol. 83, No. 2

and
*Department of Biochemistry, University of Wisconsin, Madison,
Wisconsin 53706;
Department of Biochemistry and Cell
Biology, W.M. Keck Center for Computational Biology, and
Department of Computational and Applied Mathematics,
Rice University, Houston, Texas 77005 USA
| |
ABSTRACT |
|---|
|
|
|---|
The dynamic behavior of proteins in crystals is examined
by comparing theory and experiments. The Gaussian network model (GNM) and a simplified version of the crystallographic translation libration screw (TLS) model are used to calculate mean square fluctuations of
C
atoms for a set of 113 proteins whose structures have been determined by x-ray crystallography. Correlation coefficients between the theoretical estimations and experiment are calculated and
compared. The GNM method gives better correlation with experimental data than the rigid-body libration model and has the added benefit of
being able to calculate correlations between the fluctuations of pairs
of atoms. By incorporating the effect of neighboring molecules in the
crystal the correlation is further improved.
| |
INTRODUCTION |
|---|
|
|
|---|
To understand a protein's function, one must
know about both its structure and dynamics. X-ray crystallography can
give good structural information by allowing the determination of the
average position of atoms and the amplitudes of their displacements
from these average positions. However, this classic analysis tells little about the ways the molecule moves. Many methods, such as molecular dynamic simulations, have been devised for modeling protein
dynamics (MacKerell et al., 1998
), but these often involve complicated
and/or inaccurate potential functions and are computationally expensive. Some researchers have shown that simplified potentials, involving only a few parameters, can give results that are just as
accurate as those of more complicated methods for many purposes (Tirion, 1996
; Levitt et al., 1985
; ben-Avraham and Tirion, 1998
; Hinsen and Kneller, 1999
; Higo and Umeyama, 1997
).
The Gaussian network model (GNM) proposed by Bahar and colleagues
(Bahar et al., 1997
, 1998
; Haliloglu et al., 1997
; Haliloglu and Bahar,
1999
) describe protein mobility in terms of the atoms' local packing
density and exploits concepts developed in the theory of elastic
networks (Eichinger, 1972
; Kloczkowski and Mark, 1989
). Tirion (1996)
has shown that such single-parameter potentials can effectively model
low-frequency modes of protein motion. Bahar et al. (1997)
have further
shown that for myoglobin the GNM gives measurable agreement with
experimental crystallographic B-factors and furthermore can be used to
calculate cross-correlations between motions of different atoms and
compare them with NMR data (Haliloglu and Bahar, 1999
). Although the
GNM contains very little detail and is not amino acid specific, it
gives remarkably reliable results for the C
atoms with much less computation time than traditional dynamics simulations.
Crystallographic structure determination includes information about
thermal and other fluctuations of the atoms in a crystal. Each atom can
be assigned a Debye-Waller temperature factor or B-factor with the
latter proportional to the mean square amplitude of the fluctuations.
Although these factors have some limitations (Kuriyan et al., 1986
)
they represent a solid experimental source of information on the
dynamics of proteins.
The translation libration screw (TLS) model (Schomaker and Trueblood,
1968
; Sternberg et al., 1979
; Kuriyan and Weis, 1991
; Harata et al.,
1999
), developed by Schomaker and Trueblood, models a crystalline
protein as an internally rigid body undergoing motion along
translation, libration, and screw axes. Determining B-factors with the
TLS model requires performing a six-parameter least-squares optimization of the observed and calculated diffraction patterns. In
our study, we are interested in protein structure-based ab initio
calculation of protein motion, so we modify the full TLS treatment to
depend only on the molecular coordinates, calculating the square of the
displacement of each C
from the center of mass
of the protein, corresponding to the lattice-independent libration
component. For simplicity, we will refer to this simplified model as
the libration model.
Although the GNM and libration models have both been shown to be
capable of reproducing experimental B-factors for some test cases, no
studies involving more than a few structures have been reported.
Furthermore, debate continues as to which is more physically accurate
and realistic and why (Haliloglu and Bahar, 1999
).
In this context, we have completed a comparative study between
librational and GNM methods in reproducing crystallographic B-factors
with a set of 113 high-resolution (2.0 Å or better) proteins. We further modified the GNM calculations by incorporating the
effects of neighboring atoms and molecules in the crystal lattice (Fig.
1). The GNM method has just two critical
adjustable parameters, the maximum
C
-C
distance for
which the Hookean springs are attached and the associated force
constant. The sensitivity of the calculation to this first parameter
and analysis of the force constant are also explored.
|
| |
MODEL AND METHODOLOGY |
|---|
|
|
|---|
GNM background
The GNM describes a protein as an elastic network of
carbons
attached by Hookean springs where the atoms fluctuate about their mean
positions. The Kirchhoff or valency-adjacency matrix of such a
structure is constructed using Eq. 1:
|
(1) |
-carbons
and rc is the cutoff distance,
normally ~7.0 Å. The close relationship of this matrix to the
Hessian from classic normal mode analysis has been described (Atilgan
et al., 2001A quantity proportional to the mean-square fluctuations of each atom
and the cross-correlation fluctuations between different atoms are the
diagonal and off-diagonal elements, respectively, of the pseudo inverse
of the Kirchhoff matrix. This inverse can also be expressed as a sum of
eigenvectors as in Eq. 2:
|
(2) |
are the eigenvalues of
, arranged in descending
order, with the smallest, zero-valued eigenvalue omitted. The
qk are the eigenvectors of
, and
the superscript T indicates the transpose. For our symmetric
positive semi-definite matrices the identical pseudo inverse can also
be constructed using singular value decomposition
|

The variance/covariance matrix and B-factor of each atom can be
calculated from the mean-square displacements by Eqs. 3 and 4:
|
(3) |
|
(4) |
is a constant scaling factor.
In the first part of the work we are interested in calculating the
linear correlation coefficient between the experimental and calculated
B-factors as given by
|
(5) |
-atom,
x is the mean value of the
xj values,
yj and y are the corresponding quantities for calculated B-factors, and n is
the total number of C
-atoms. This number
measures only the relative rise and fall of the two curves (Eq. 5) and
does not require that they be scaled properly. The correlation
coefficient can range from
1 (perfect anti correlation) to +1
(perfect correlation). The absolute scale of the theoretical
predictions depends on a spring constant, which can also be determined
by comparing the experimental and theoretical curves (see below).
Method of calculation
Structures
For the comparative study between GNM and libration model to experimental B-factors, a set of 113 proteins from two different nonredundant sets were used. Researchers at Duke (Word et al., 1999
carbons alone were used to model
the protein structures. Though many structures contained counterions or
cofactors on the surface or in the active site, there exists no good
systematic method for modeling these, and it must be done entirely ad
hoc. The only cofactor we have modeled is the heme group, where the
four bridging methylene carbons and the iron atom are all treated as
C
atoms. When calculating the center of mass,
all atoms were given a mass of 12 atomic mass units.
Numerical calculations
We used Mathematica to calculate the GNM-based B-factors for each protein in a batch mode. The Kirchhoff matrix is formed first from Eq. 1. We invert the Kirchhoff matrix with the help of Eq. 2 and with each eigenvector contributing toward the B-factor. We ignore the eigenvector with value zero. Arranging the eigenvectors in the ascending order of their eigenvalues, we found that the first 30% of them are the major contributors toward the B-factor, but inclusion of all eigenvectors helps slightly (Fig. 2). For the anisotropic network model calculations we used Mathematica's Pseudo Inverse function, which uses a singular value decomposition formalism instead of eigen analysis.
|
GNM with contacts and neighbors
Usually GNM assumes springs between C
atoms only within the same molecule. But these molecules are not
isolated entities in crystals. Instead they reside in a lattice with
neighbors. We included the neighbors in our calculation to incorporate
the effect of environment on dynamic behavior. We first included only C
within 7.0 Å of the concerned molecule but
also considered all neighboring molecules surrounding the central
molecule. We used the program CNS (Brunger et al., 1998
) to identify
the neighboring atoms and molecules.
Libration model
As previously mentioned, the TLS model is approximated by
assuming mean square fluctuations are proportional to the square of the
distance of each
carbon from the protein's centroid. The square of
the distance, rather than the distance itself, is used to give the
calculated B-factors the same units as those in the GNM calculations.
Correlation coefficients were calculated by standard procedures.
Determination of kBT/
To calculate B-factors from Eqs. 3 and 4 we determined the value
of kBT/
by least-squares fitting
to the observed B-factors. kBT/
' was also determined by
least-squares fitting with a combined scale and offset parameter to
allow a measure of rigid-body translation components.
| |
RESULTS AND DISCUSSION |
|---|
|
|
|---|
The comparison between theory and experiment was performed for
several models, including libration only, GNM on all
C
atoms, GNM omitting from the correlation
coefficient those atoms making crystal contacts, GNM including the
central molecule and only neighboring atoms, and GNM including complete
neighboring molecules in the lattice.
Because each eigenvector is weighted as the reciprocal of its
respective eigenvalue, it is possible that only the small-eigenvalue terms contribute significantly to the total sum. Therefore, the sum in
Eq. 2 was evaluated using the eigenvectors corresponding to the
smallest 30% as well as using 100% of the eigenvalues. As described
previously, a subset of the eigenvalues are most important in the
computation of the inverse of the Kirchhoff matrix (Haliloglu et al.,
1997
), but for best results all nonzero eigenvalues/singular values
should be included when computationally feasible.
The average correlation coefficient for the agreement between
experimental B-factors and those calculated by the simple GNM procedure
using rc = 7.3 Å was 0.594 with all
and 0.581 with 30% of the eigenvectors (Table
1). The best value for the cutoff for
assigning C
values to be connected,
rc, was also determined to be 7.3 Å by evaluating the GNM model with various values (Table 1). The
individual protein correlations ranged from 0.000 (1DDT) to 0.831 (1FRD) (see Table 3). The average coefficient for the libration method
was 0.515 (Table 2), with values ranging
from
0.423 (1OSA) to 0.886 (1ARU) (Table
3).
However, the GNM gave higher correlation coefficients than the
libration method for 70 (62%) of the 113 structures.
|
|
|
There exists the question of whether the GNM or libration model might be more accurate for certain types of structures. There are 21 cases (for 7.3 Å) where the GNM and libration correlation coefficients differ by at least 0.2. In 18 of these, the GNM coefficient is better. Many of these structures are irregularly shaped (e.g., concave or dumbbell-shaped).
Indeed, one might expect that the libration method would work best on highly spherical structures. The libration theory implicitly assumes that atoms far away from the centroid are closer to the surface. It makes sense that this model would be most applicable when atoms that are the same distance from the centroid are also roughly the same distance from the surface.
In contrast, the theory underlying the GNM rests on local packing
density as the determinant of thermal fluctuation. If a structure is
not well packed with respect to its centroid, if it is concave, for
example, there will be atoms quite near the centroid that are also near
the surface. The GNM presumes, however, that tightly packed
C
atoms will fluctuate less than loosely packed C
atoms. For irregularly shaped
structures, the local C
packing density
becomes much more important in defining the mobility of the atoms,
hence the overall superiority of the GNM models.
Fig. 2, A and B, shows the structures and experimental and calculated B-factors for calmodulin (1OSA) and lithostathine (1LIT). The GNM method gave much better results than the libration for calmodulin, which is shaped like a dumbbell. For litostathine, which is much more regular, though slightly oblong, the libration method gave a higher correlation coefficient, although both methods gave qualitatively similar results. A particularly interesting case is diphtheria toxin (1DDT) where an entire domain is tethered to the rest of the protein by only a single strand of polypeptide. Of course the simple GNM method, which does not include neighbors, severely overestimates the mobility of this domain as seen in the crystallographic B-factors (Fig. 2 C). This high degree of mobility is, however, likely for the protein in solution. (When contacts are included in the calculation, the domain is immobilized, and theory and experiment agree. (see below.) It should be noted, however, that although most of the structures where GNM is superior to libration are nonspherical, some of the structures where the two methods perform equally well are also shaped irregularly.
Additionally, the omission of cofactors from many of the structures may affect their behavior in the computations. A protein that has a cofactor in the active site may experience higher stability in that region than the models account for when the cofactor is omitted. Of the 113 structures in the list, 6 of them contained a heme group. For 5 of these, correlation coefficients were computed with and without the heme, and in all 5 cases, the values improved upon the inclusion of a subset of atoms from the heme in the structure. This suggests that the modeling of other structures can be improved with a systematic and reliable method of treating the cofactors.
Another issue to be noted is that the atomic coordinates used in these
models comes from crystallographic structures. In some of these
structures (mainly atoms near the surface), there is an interaction not
only among atoms belonging to the same protein chain but also between
proteins that are adjacent to one another inside the crystal. If one
omits from the correlation coefficient calculation the
C
atoms within 7.0 Å of neighbors, the
agreement between theory and experiment improves (Table 2).
By including these crystal packing effects the results are dramatically
improved. Adding only neighboring atoms (GNM contact model) is not as
effective as adding entire neighboring molecules (GNM neighbor model).
By including neighboring molecules but taking only the central molecule
for comparison the average correlation coefficient was improved from
0.594 in the simple GNM method to 0.661 in GNM with all neighboring
molecules. Again a maximum spring length of 7.3 Å is better than 7.0 Å (Table 2). Attempts to use the anisotropic version of the GNM model
(Atilgan et al., 2001
; Doruker et al., 2000
) failed to improve the results.
The correlation coefficient analysis measures the relative agreement
between B-factors and GNM dynamics in terms of its positions of peaks
and valleys in the functions, but does not include any measure of the
overall scale of the motions. The factor
kBT/
is essentially a force
constant for the virtual springs connecting C
atoms and sets the overall scale factor. We determined optimal values
for kBT/
and also define
kBT/
' as the scaling factor
including a constant additive offset for each PDB entry (Table 3). The
constant was added because of previous evidence that both static
lattice and dynamic sources of displacements exist in crystals (Kuriyan
et al., 1986
).
The mean and standard deviation of
kBT/
, 0.87 ± 0.46 Å2, suggest that there is some relative
variability in the basic spring constant of the proteins. The
temperature dependence in the theory suggests that those crystal
structures determined at lower temperatures might have smaller
kBT/
values, although it is
well known that crystal structures solved from rapidly quenched samples
retain much of their dynamic disorder as static disorder. In fact, the mean kBT/
for the five
crystal structures determined at ~100-150 K is 0.62, which is
smaller than the overall average.
The mean kBT/
' and offset are
0.96 ± 0.50 Å2 and
0.71 ± 2.39 Å2, respectively. The offset, which can absorb
several types of crystallographic artifacts such as lattice disorder
and other sin(
/
)-dependent data processing errors, has a mean
value very close to zero, implying that there is no large systematic
contribution of lattice disorder to crystallographic B-factor. The
standard deviation of 2 Å2 suggests, however,
that each crystal structure may have circumstances that lead to the
need for such an offset. It is also likely that the simplifying nature
of the model itself introduces some errors that are accommodated by
this variable.
| |
CONCLUSION |
|---|
|
|
|---|
It appears that the GNM model is better suited for estimating protein motions than the libration model, especially for highly irregular or nonspherical structures. Furthermore, it is able to compute cross-correlations between different atoms. These cross-correlations are the determinants of directed motions. Having a theoretical method to test these cross-correlations against experimental data is extremely valuable. With further research to determine a good method for including cofactors in the protein structures, the method could be even more useful.
Biological implication
The function of a protein depends on both its structure and dynamics. Crystallographic analysis routinely provides estimates of the amplitudes of motions of atoms, but the effect of the surrounding lattice on the motions is always an uncertainty. Here we show that a simplified molecular mechanics model can effectively describe protein motions, including the effects of crystal contacts. Because the added neighbors improve our results, our confidence in the method is increased. The results further suggest that GNM calculations on a single protein molecule may give a different and more accurate picture than crystallographic temperature factors give on the dynamics of an isolated protein molecule because the constraining effects of the lattice on the crystallographic result can be factored out.
| |
ACKNOWLEDGMENTS |
|---|
Funding for this work was provided by the Arnold and Mabel Beckman Foundation (J.M.), The Robert A. Welch Foundation (C-1142), the Wisconsin Alumni Research Foundation, the National Science Foundation (ACI-0082645), and National Institutes of Health grants AR40252 and GM64598.
| |
FOOTNOTES |
|---|
Address reprint requests to Dr. George N. Phillips, Jr., Department of Biochemistry, University of Wisconsin, 433 Babcock Drive, Madison, WI 53706. Tel.: 608-263-6142; Fax: 608-262-3453. E-mail: phillips{at}biochem.wisc.edu.
Submitted December 17, 2001, and accepted for publication April 17, 2002.
| |
REFERENCES |
|---|
|
|
|---|
-amylase inhibitor.
Proteins.
40:512-524[Medline].
-lactalbumin refined by full-matrix least-squares method.
J. Mol. Biol.
287:347-358[Medline].
Biophys J, August 2002, p. 723-732, Vol. 83, No. 2
© 2002 by the Biophysical Society 0006-3495/02/08/723/10 $2.00
This article has been cited by other articles:
![]() |
J. G. Su, C. H. Li, R. Hao, W. Z. Chen, and C. Xin Wang Protein Unfolding Behavior Studied by Elastic Network Model Biophys. J., June 15, 2008; 94(12): 4586 - 4596. [Abstract] [Full Text] [PDF] |
||||
![]() |
W. Zheng A Unification of the Elastic Network Model and the Gaussian Network Model for Optimal Description of Protein Conformational Motions and Fluctuations Biophys. J., May 15, 2008; 94(10): 3853 - 3857. [Abstract] [Full Text] [PDF] |
||||
![]() |
E. Eyal and I. Bahar Toward a Molecular Understanding of the Anisotropic Response of Proteins to External Forces: Insights from Elastic Network Models Biophys. J., May 1, 2008; 94(9): 3424 - 3435. [Abstract] [Full Text] [PDF] |
||||
![]() |
T. Z. Sen, M. Kloster, R. L. Jernigan, A. Kolinski, J. M. Bujnicki, and A. Kloczkowski Predicting the Complex Structure and Functional Motions of the Outer Membrane Transporter and Signal Transducer FecA Biophys. J., April 1, 2008; 94(7): 2482 - 2491. [Abstract] [Full Text] [PDF] |
||||
![]() |
K. Hinsen Structural flexibility in proteins: impact of the crystal environment Bioinformatics, February 15, 2008; 24(4): 521 - 528. [Abstract] [Full Text] [PDF] |
||||
![]() |
L. Yang, G. Song, and R. L. Jernigan How Well Can We Understand Large-Scale Protein Motions Using Normal Modes of Elastic Network Models? Biophys. J., August 1, 2007; 93(3): 920 - 929. [Abstract] [Full Text] [PDF] |
||||
![]() |
B. K. Poon, X. Chen, M. Lu, N. K. Vyas, F. A. Quiocho, Q. Wang, and J. Ma Normal mode refinement of anisotropic thermal parameters for a supramolecular complex at 3.42-A crystallographic resolution PNAS, May 8, 2007; 104(19): 7869 - 7874. [Abstract] [Full Text] [PDF] |
||||
![]() |
J. G. Su, X. Jiao, T. G. Sun, C. H. Li, W. Z. Chen, and C. X. Wang Analysis of Domain Movements in Glutamine-Binding Protein with Simple Models Biophys. J., February 15, 2007; 92(4): 1326 - 1335. [Abstract] [Full Text] [PDF] |
||||
![]() |
B. Erman The Gaussian Network Model: Precise Prediction of Residue Fluctuations and Application to Binding Problems Biophys. J., November 15, 2006; 91(10): 3589 - 3599. [Abstract] [Full Text] [PDF] |
||||
![]() |
D. A. Kondrashov, Q. Cui, and G. N. Phillips Jr. Optimization and Evaluation of a Coarse-Grained Model of Protein Motion Using X-Ray Crystal Data Biophys. J., October 15, 2006; 91(8): 2760 - 2767. [Abstract] [Full Text] [PDF] |
||||
![]() |
L.-W. Yang, A. J. Rader, X. Liu, C. J. Jursa, S. C. Chen, H. A. Karimi, and I. Bahar oGNM: online computation of structural dynamics using the Gaussian Network Model. Nucleic Acids Res., July 1, 2006; 34(Web Server issue): W24 - W31. [Abstract] [Full Text] [PDF] |
||||
![]() |
P. J. Bond, J. D. Faraldo-Gomez, S. S. Deol, and M. S. P. Sansom Membrane protein dynamics and detergent interactions within a crystal: A simulation study of OmpA PNAS, June 20, 2006; 103(25): 9518 - 9523. [Abstract] [Full Text] [PDF] |
||||
![]() |
I. H. Shrivastava and I. Bahar Common Mechanism of Pore Opening Shared by Five Different Potassium Channels Biophys. J., June 1, 2006; 90(11): 3929 - 3940. [Abstract] [Full Text] [PDF] |
||||
![]() |
Y. Wang and R. L. Jernigan Comparison of tRNA Motions in the Free and Ribosomal Bound Structures Biophys. J., November 1, 2005; 89(5): 3399 - 3409. [Abstract] [Full Text] [PDF] |
||||
![]() |
M. Lu and J. Ma The Role of Shape in Determining Molecular Motions Biophys. J., October 1, 2005; 89(4): 2395 - 2401. [Abstract] [Full Text] [PDF] |
||||
![]() |
L. Meinhold and J. C. Smith Fluctuations and Correlations in Crystalline Protein Dynamics: A Simulation Analysis of Staphylococcal Nuclease Biophys. J., April 1, 2005; 88(4): 2554 - 2563. [Abstract] [Full Text] [PDF] |
||||
![]() |
M. Delarue and P. Dumas On the use of low-frequency normal modes to enforce collective movements in refining macromolecular structural models PNAS, May 4, 2004; 101(18): 6957 - 6962. [Abstract] [Full Text] [PDF] |
||||
![]() |
G. Li and Q. Cui Analysis of Functional Motions in Brownian Molecular Machines with an Efficient Block Normal Mode Approach: Myosin-II and Ca2+-ATPase Biophys. J., February 1, 2004; 86(2): 743 - 763. [Abstract] [Full Text] [PDF] |
||||
![]() |
P. Radivojac, Z. Obradovic, D. K. Smith, G. Zhu, S. Vucetic, C. J. Brown, J. D. Lawson, and A. K. Dunker Protein flexibility and intrinsic disorder Protein Sci., January 1, 2004; 13(1): 71 - 80. [Abstract] [Full Text] [PDF] |
||||
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| HOME | HELP | FEEDBACK | SUBSCRIPTIONS | ARCHIVE | SEARCH | TABLE OF CONTENTS |