Laboratory of Computational Biology, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, Maryland 20892
Correspondence: Address reprint requests to Wenjun Zheng, E-mail: zhengwj{at}helix.nih.gov.
| ABSTRACT |
|---|
|
|
|---|
10). The performance of this method is also shown to be robust against different choices of pairwise distance constraints and errors in their values. This method, if supplied with the experimentally derived distance constraints (for example, from NMR or other spectroscopic measurements), can be applied to the analysis of protein conformational changes toward transient states.

| INTRODUCTION |
|---|
|
|
|---|
Experimentally, pairwise distances between specified atoms of a protein in its native state (in solution) can be obtained by NMR. Other techniques use fast spectroscopy (for example, site-directed spin labeling combined with electron paramagnetic resonance spectroscopy; see Hubbell et al., 2000) to probe pairwise distances of a protein in a transient state. Computationally, it is well known that even a small number of pairwise distance constraints can significantly improve protein structure modeling (Skolnick et al., 1997; Debe et al., 1999). In the framework of the elastic network model (ENM), because functionally relevant conformational changes generally involve a small number of low-frequency normal modes, it is natural to expect that a small number of pairwise distance constraints, if chosen properly, would be sufficient to obtain a good approximation to the conformational changes.
Technically, in the framework of normal-modes analysis the distance constraints can be either enforced directly as "hard" constraints or incorporated indirectly as "soft" constraints (or restraints):
We will use the above "soft" constraints-based method to computationally predict the conformational changes. We will test this method on a list of test cases to evaluate its performance in terms of both accuracy and robustness.
| MATERIALS AND METHODS |
|---|
|
|
|---|
Given the Cα atomic coordinates for a protein's native structure, we build an elastic network model by using a harmonic potential with a single force constant to account for pairwise interactions between all Cα atoms that are within a cutoff distance (RC = 10 Å). The energy in the elastic network representation of a protein is:

$$E_{\mathrm{network}} = \frac{1}{2} \sum_{d_{ij}^{0} < R_C} C \left(d_{ij} - d_{ij}^{0}\right)^2 \qquad (1)$$

where $C$ is the force constant, $d_{ij}$ is the distance between Cα atoms i and j, and $d_{ij}^{0}$ is the distance between Cα atoms i and j, as given in the crystal structure.
For the above harmonic Hamiltonian we can perform the standard normal-modes analysis, and using the eigenvectors of the lowest-frequency normal modes (starting from mode No. 1 after excluding the six zero modes for translations and rotations) we can compute the overlaps with the conformational changes between two states with known structures (Zheng and Doniach, 2003). The drastic simplification of representing the complex protein structure by an effective harmonic potential is justified by a study (Tirion, 1996), which showed that a potential with a single spring constant reproduces the slow dynamics computed from the normal-modes analysis of a complex all-atom potential.
We note that the cutoff distance RC = 10 Å is selected as a trade-off between the following two considerations: first, RC should be large enough to avoid additional zero modes besides the six rotational and translational modes; second, RC should be small enough to avoid introducing too much nonphysical long-range interaction. In practice, we find similar results for slightly different cutoff distances (data not shown).
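As a rough illustration of the model described above, the following Python sketch builds the Hessian of a single-force-constant network from Cα coordinates and extracts the nonzero normal modes. It assumes NumPy; the function names (`enm_hessian`, `normal_modes`), the unit force constant, and the random toy coordinates are illustrative choices and are not taken from the original work.

```python
import numpy as np

def enm_hessian(coords, cutoff=10.0, force_const=1.0):
    """Hessian (3L x 3L) of the harmonic network energy at the reference structure."""
    coords = np.asarray(coords, dtype=float)
    n = len(coords)
    hess = np.zeros((3 * n, 3 * n))
    for i in range(n):
        for j in range(i + 1, n):
            r = coords[j] - coords[i]
            d = np.linalg.norm(r)
            if d >= cutoff:
                continue  # only pairs within the cutoff are connected by a spring
            block = force_const * np.outer(r, r) / d**2
            hess[3 * i:3 * i + 3, 3 * i:3 * i + 3] += block
            hess[3 * j:3 * j + 3, 3 * j:3 * j + 3] += block
            hess[3 * i:3 * i + 3, 3 * j:3 * j + 3] -= block
            hess[3 * j:3 * j + 3, 3 * i:3 * i + 3] -= block
    return hess

def normal_modes(hess, n_zero=6):
    """Eigenvalues and eigenvectors sorted by eigenvalue, without the rigid-body zero modes."""
    vals, vecs = np.linalg.eigh(hess)
    return vals[n_zero:], vecs[:, n_zero:]

# Toy stand-in for a protein: 30 random points in a 12 A cube (not real data).
rng = np.random.default_rng(0)
coords = rng.uniform(0.0, 12.0, size=(30, 3))
hess = enm_hessian(coords)
eigvals, modes = normal_modes(hess)  # modes[:, 0] plays the role of mode No. 1
```

If the cutoff is chosen too small for a given structure, more than six eigenvalues come out numerically zero, which is the first of the two considerations mentioned above.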
Predict conformational changes from distance constraints
Motivation
Assume we have the three-dimensional coordinates of the initial protein structure's Cα atoms, and N pairwise distance constraints for the unknown end structure. The goal is to predict the conformational change from the initial structure to the end structure. Here we limit our attention to the directionality of the conformational change (a 3L-dimensional vector, where L is the sequence length) but not its amplitude.
There are two different ways to achieve this goal:
Hard distance constraints
One can use a linear combination of the M lowest-frequency modes to satisfy N linearized pairwise distance constraints on residue pairs $(i_n, j_n)$ (n = 1, 2, ..., N). Assume

$$\delta\vec{x} = \sum_{m=1}^{M} A_m \vec{v}_m ,$$

where $\vec{v}_m$ is the eigenvector of mode m and $A_m$ is its coefficient; then it must satisfy the following N linear equations (n = 1, 2, ..., N):

$$\sum_{m=1}^{M} A_m\, \delta d_{n,m} = \delta d_n \qquad (2)$$

where $\delta d_{n,m}$ is the perturbational change of the pairwise distance for $(i_n, j_n)$ caused by the eigenvector of mode m, and $\delta d_n$ is the change of the pairwise distance for $(i_n, j_n)$ derived from the given distance constraint. To satisfy N independent constraints as in Eq. 2, M should be no less than N. If N is equal to M, there is only one solution to Eq. 2; when M > N, there will be multiple solutions.
Our tests have shown that the direct satisfaction of the "hard" distance constraints (M = N) often results in poor overlap between the computed displacement by Eq. 2 and the measured one (see Table 2).
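The hard-constraint approach amounts to solving a small linear system for the mode coefficients. The sketch below, assuming NumPy, shows one way to do this; `pair_gradient` and `hard_constraint_displacement` are hypothetical helper names, the linearized distance change is taken as the projection of a mode onto the gradient of the pair distance, and the two "modes" are random orthonormal vectors standing in for real eigenvectors.

```python
import numpy as np

def pair_gradient(coords, i, j):
    """Derivative of the i-j distance with respect to all 3L coordinates."""
    unit = coords[j] - coords[i]
    unit = unit / np.linalg.norm(unit)
    grad = np.zeros(coords.size)
    grad[3 * i:3 * i + 3] = -unit
    grad[3 * j:3 * j + 3] = unit
    return grad

def hard_constraint_displacement(coords, modes, pairs, delta_d):
    """Combine the given modes so that the N linearized distance changes are met."""
    grads = np.array([pair_gradient(coords, i, j) for i, j in pairs])  # N x 3L
    delta_d_modes = grads @ modes                                       # N x M matrix of per-mode distance changes
    # Exact solution when M = N and the matrix is invertible;
    # minimum-norm solution when M > N (one of the multiple solutions).
    coeffs, *_ = np.linalg.lstsq(delta_d_modes, np.asarray(delta_d, dtype=float), rcond=None)
    return modes @ coeffs

# Toy data: 8 points and two orthonormal stand-in "modes" (M = N = 2).
rng = np.random.default_rng(1)
coords = rng.uniform(0.0, 12.0, size=(8, 3))
modes, _ = np.linalg.qr(rng.normal(size=(24, 2)))
pairs = [(0, 5), (2, 7)]
delta_d = [1.0, -0.5]
dx = hard_constraint_displacement(coords, modes, pairs, delta_d)
```

Because the coefficients are fixed entirely by the N constraints, any error in a constraint or an unlucky choice of pairs propagates directly into the displacement, which is one plausible reading of why this route performs poorly.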
Soft distance constraints
First, we introduce N pairwise distance constraints on residue pairs $(i_n, j_n)$ (n = 1, 2, ..., N) as a perturbation to the Hamiltonian of the elastic network:
![]() | (3) |
where the Hessian matrix $\delta H$ and the force vector $\delta\vec{F}$ are computed as follows:
![]() | (4) |
where $\chi_n$ is the inverse of the "effective" spring constant for pair $(i_n, j_n)$ in the old structure ($\lambda_m$ is the eigenvalue of mode m; $\delta d_{n,m}$ is the perturbational change of the pairwise distance for $(i_n, j_n)$ caused by the eigenvector of mode m); k gives the overall amplitude of the perturbation; and $d_n^{\mathrm{new}}$ ($d_n^{\mathrm{old}}$) is the pairwise distance for pair $(i_n, j_n)$ in the end (initial) structure.
Second, the response displacement $\delta\vec{x}$ induced by the above perturbation ($\delta H$, $\delta\vec{F}$) at second-order approximation is computed as follows:
![]() | (5) |
The first-order approximation is generally as accurate as the second-order one (adding the second-order term makes little difference). The factor of $1/\lambda_m$ favors low-frequency modes in their contribution to $\delta\vec{x}$.
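The linear-response argument behind Eqs. 3 to 5 can be sketched as follows. This is one form consistent with the definitions in the text, written under assumptions: the quadratic restraint, the gradient vector $\vec{g}_n$ of the pair distance, and the symbols $\chi_n$ and $\delta d_n$ are introduced here for illustration and may differ from the notation of the original equations.

```latex
\begin{aligned}
\delta E &= \frac{k}{2}\sum_{n=1}^{N}\frac{1}{\chi_n}\left(d_n - d_n^{\mathrm{new}}\right)^2,
&\chi_n &= \sum_{m}\frac{\delta d_{n,m}^{2}}{\lambda_m},
&\delta d_{n,m} &= \vec{g}_n\cdot\vec{v}_m,\\
\delta\vec{F} &= k\sum_{n=1}^{N}\frac{\delta d_n}{\chi_n}\,\vec{g}_n,
&\delta d_n &= d_n^{\mathrm{new}} - d_n^{\mathrm{old}},\\
\delta\vec{x} &\approx \sum_{m}\frac{\vec{v}_m\cdot\delta\vec{F}}{\lambda_m}\,\vec{v}_m .
\end{aligned}
```

Here $\vec{g}_n = \partial d_n/\partial\vec{x}$ is evaluated at the initial structure and $\vec{v}_m$ is the eigenvector of mode m. For a single constraint with k = 1, projecting $\delta\vec{x}$ onto $\vec{g}_n$ gives $(\delta d_n/\chi_n)\sum_m \delta d_{n,m}^2/\lambda_m = \delta d_n$, so that constraint alone is met to first order; the $1/\lambda_m$ weight is what favors the low-frequency modes.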
It is straightforward to verify the following: under the assumption of linear response, the contribution to the energy perturbation in Eq. 3 from each individual pairwise constraint, by itself, results in a change of that pairwise distance that satisfies the constraint perturbationally. However, when all contributions are added up, none of the constraints is exactly satisfied any more. So the basic assumption is that every pairwise constraint can be enforced by a pairwise force applied to that particular pair "independently", and that the interpair interference can be ignored (for example, one can ignore the change in the pairwise distance for pair 2 caused by the forces applied to pair 1). The interpair interference can be taken into account by tuning the strengths of the individual pairwise perturbations as variables to satisfy the constraints exactly and meanwhile minimize the energy in Eq. 5. However, our test of such an alternative method (data not shown) showed, surprisingly, significantly degraded performance. We suspect that the interpair interferences are probably much weaker in real proteins than described by the ENM.
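The statement that a single soft constraint is met perturbationally, while several constraints interfere, can be checked numerically. The sketch below is a minimal first-order implementation under stated assumptions: it assumes NumPy, weights each pairwise force by the inverse of an effective compliance computed from the modes, and uses hypothetical names (`enm_modes`, `pair_gradient`, `soft_response`) and random toy coordinates rather than a real structure.

```python
import numpy as np

def enm_modes(coords, cutoff=10.0):
    """Nonzero eigenvalues/eigenvectors of a unit-force-constant elastic network."""
    n = len(coords)
    hess = np.zeros((3 * n, 3 * n))
    for i in range(n):
        for j in range(i + 1, n):
            r = coords[j] - coords[i]
            d2 = r @ r
            if d2 < cutoff ** 2:
                block = np.outer(r, r) / d2
                for a, b, sign in ((i, i, 1), (j, j, 1), (i, j, -1), (j, i, -1)):
                    hess[3 * a:3 * a + 3, 3 * b:3 * b + 3] += sign * block
    vals, vecs = np.linalg.eigh(hess)
    return vals[6:], vecs[:, 6:]  # drop the six rigid-body zero modes

def pair_gradient(coords, i, j):
    """Derivative of the i-j distance with respect to all 3L coordinates."""
    unit = coords[j] - coords[i]
    unit = unit / np.linalg.norm(unit)
    grad = np.zeros(coords.size)
    grad[3 * i:3 * i + 3] = -unit
    grad[3 * j:3 * j + 3] = unit
    return grad

def soft_response(coords, vals, vecs, pairs, d_new, k=1.0):
    """First-order response to pairwise forces; returns (dx, achieved, target)."""
    grads = np.array([pair_gradient(coords, i, j) for i, j in pairs])  # N x 3L
    dd_modes = grads @ vecs                       # distance change of each pair per mode
    chi = (dd_modes ** 2 / vals).sum(axis=1)      # inverse effective spring constant per pair
    d_old = np.array([np.linalg.norm(coords[j] - coords[i]) for i, j in pairs])
    delta_d = np.asarray(d_new, dtype=float) - d_old
    force = k * (delta_d / chi) @ grads           # sum of independent pairwise forces
    dx = vecs @ ((vecs.T @ force) / vals)         # each mode weighted by 1/eigenvalue
    return dx, grads @ dx, delta_d

rng = np.random.default_rng(0)
coords = rng.uniform(0.0, 12.0, size=(30, 3))     # toy coordinates, not real data
vals, vecs = enm_modes(coords)
dist = lambda i, j: np.linalg.norm(coords[j] - coords[i])

# One constraint: the achieved linearized change equals the target.
_, achieved_one, target_one = soft_response(
    coords, vals, vecs, [(0, 17)], [dist(0, 17) + 1.0])
# Two constraints sharing an atom: the forces interfere and neither is met exactly.
_, achieved_two, target_two = soft_response(
    coords, vals, vecs, [(0, 17), (3, 17)], [dist(0, 17) + 1.0, dist(3, 17) - 1.0])
```

Comparing `achieved_two` with `target_two` gives a direct measure of the interpair interference that the method chooses to ignore.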
The response displacement as computed above is used as an approximation to the conformational change. Its accuracy can be assessed by calculating its overlap with the measured conformational change (the generalized cosine between these two vectors; see Tama and Sanejouand, 2001); the higher the overlap, the more accurate the prediction.
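The overlap used as the accuracy measure is simply the cosine of the angle between two 3L-dimensional vectors. A minimal standard-library sketch follows; the function name `overlap` and the `signed` switch are illustrative assumptions. The unsigned form is the natural choice when one of the vectors is a normal-mode eigenvector, whose sign is arbitrary.

```python
import math

def overlap(predicted, measured, signed=True):
    """Generalized cosine between two displacement vectors of equal length."""
    if len(predicted) != len(measured):
        raise ValueError("vectors must have the same length")
    dot = sum(p * q for p, q in zip(predicted, measured))
    norm = math.sqrt(sum(p * p for p in predicted)) * math.sqrt(sum(q * q for q in measured))
    value = dot / norm
    return value if signed else abs(value)

# The amplitude does not matter, only the direction:
parallel = overlap([1.0, 0.0, 0.0], [2.0, 0.0, 0.0])             # 1.0
orthogonal = overlap([1.0, 0.0], [0.0, 1.0])                     # 0.0
flipped = overlap([1.0, 0.0], [-3.0, 0.0], signed=False)         # 1.0
```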
Criteria for selecting residue pairs
Pairwise distance constraints can be obtained experimentally by a variety of techniques. Intuitively, only residue pairs with a significant change of distance (δd) during the transition will be useful for predicting the conformational changes. Therefore, selection criteria are needed before the method can be tested. Here, for the purpose of testing cases for which both crystal structures are known, we use the following criteria:
|δd|) during the transition; the significance is assessed by a Z-score, and we keep those with $Z_{|\delta d|} > 1$.

In summary, we select those residue pairs that satisfy the above two conditions and keep them as a pool of pairwise distance constraints for further testing. The pairwise distance constraints used in the later tests are drawn only from this pregenerated pool. Of course, in practice, when only the initial crystal structure is known, this pool of pairwise distance constraints is obtained by experiments.
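The Z-score filter on |δd| can be sketched with the Python standard library as below. Two details are assumptions made for this example because the text does not specify them: the Z-score is computed over all candidate pairs passed in, and the population standard deviation is used. The function name `significant_pairs` and the toy numbers are hypothetical.

```python
from statistics import mean, pstdev

def significant_pairs(distance_changes, z_cutoff=1.0):
    """Return the pairs whose |delta d| has Z-score > z_cutoff, largest change first.

    distance_changes maps a residue pair (i, j) to its distance change delta d.
    """
    magnitudes = {pair: abs(dd) for pair, dd in distance_changes.items()}
    values = list(magnitudes.values())
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0.0:
        return []  # all changes identical: nothing stands out
    selected = [p for p, m in magnitudes.items() if (m - mu) / sigma > z_cutoff]
    return sorted(selected, key=lambda p: magnitudes[p], reverse=True)

# Toy example: only one pair changes its distance appreciably.
changes = {(0, 1): 0.1, (0, 2): -0.2, (1, 3): 0.1, (2, 5): 4.0, (3, 4): -0.3}
pool = significant_pairs(changes)  # -> [(2, 5)]
```

Returning the pool sorted by |δd| makes it straightforward to take the top N pairs for the ideal test.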
Test protocol
We propose the following two procedures to test the accuracy and robustness of the method:
Ideal test
We use the top N residue pairs (ranked by the pairwise distance change |δd|) as the input distance constraints (N = 1, 2, ..., 10); we then compute the response displacement and its overlap with the measured conformational change to assess the performance.
We define the success criteria as follows. A test case is said to pass the ideal test if there exists N ≤ 10 such that using the top N pairs as input results in an overlap with the measured conformational change that is higher than or similar to that of any single mode.
Nonideal test: including the following two tests
Test 1. We randomly pick N pairs from the pool of significant pairs as generated above. For a given N (N = 1, 2, ... , 10), we repeat the calculation 100 times with different randomly selected N pairs and then compute the average and standard deviation of the computed overlaps. The average assesses the average performance whereas the standard deviation gives the robustness of the method.
Test 2. We introduce a random fractional error (following the uniform distribution between −50% and +50%) to the new pairwise distance values. For a given input of top N pairwise constraints, we repeat the calculations 100 times with different inaccurate values of distance constraints and then compute the average and standard deviation of the overlaps. The average assesses the average performance whereas the standard deviation gives the robustness of the method.
We define the success criteria as follows. A test case is said to pass the nonideal tests if there exists N ≤ 10 such that: a), the average overlaps obtained from the above two tests are both higher than or similar to the maximal overlap between the measured conformational change and any single mode; b), the standard deviation is much smaller than the average overlap.
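The two nonideal tests share the same repeat-and-summarize structure, sketched below with the Python standard library. The callback `predict_overlap` is a hypothetical stand-in for the full pipeline (constraints in, overlap with the measured change out); the function names, the fixed seeds, and the use of the population standard deviation are assumptions of this example.

```python
import random
from statistics import mean, pstdev

def random_pair_test(pool, n_pairs, predict_overlap, trials=100, seed=0):
    """Test 1: draw n_pairs at random from the pool, repeated `trials` times."""
    rng = random.Random(seed)
    scores = [predict_overlap(rng.sample(pool, n_pairs)) for _ in range(trials)]
    return mean(scores), pstdev(scores)

def noisy_distance_test(d_new, predict_overlap, trials=100, max_frac=0.5, seed=0):
    """Test 2: apply a uniform fractional error in [-max_frac, +max_frac] to each new distance."""
    rng = random.Random(seed)
    scores = []
    for _ in range(trials):
        noisy = [d * (1.0 + rng.uniform(-max_frac, max_frac)) for d in d_new]
        scores.append(predict_overlap(noisy))
    return mean(scores), pstdev(scores)

# Usage with a dummy predictor that always returns the same overlap.
pool = [(i, i + 10) for i in range(20)]
avg1, sd1 = random_pair_test(pool, 4, lambda pairs: 0.7)
avg2, sd2 = noisy_distance_test([12.0, 25.0, 8.5], lambda distances: 0.7)
```

The returned average and standard deviation are the two quantities compared against the best single-mode overlap in the success criteria; the thresholds for "similar" and "much smaller" are left to the reader because the text does not quantify them.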
Test cases
We test this method for a list of protein pairs with both structures available in the Protein Data Bank (PDB). Fourteen pairs in the list are obtained from a recent study (Tama and Sanejouand, 2001); we only exclude four pairs for reasons such as the lack of dominance of low-frequency modes among the lowest 10 modes. We then supplement the list with eight additional protein pairs from our own studies.
| RESULTS AND DISCUSSION |
|---|
|
|
|---|
|
Then we run the following two tests:
Ideal test
To demonstrate the best performance this method can offer, assume we are given the top N pairs (sorted by |δd|, the pairwise distance change during the transition) from the pool as the input distance constraints (N = 1, 2, ..., 10). For those top N pairwise constraints, we compute the response displacement as defined in Eq. 5, and then calculate its overlap with the measured conformational change. We compare it with the maximal overlap between any single mode and the measured conformational change. We then ask the following two questions to assess the performance: 1), What is the minimum N needed to get a similar or higher overlap than any single mode? 2), What is the highest overlap attained as N varies from 1 to 10? We record these two numbers in Table 2 for all the test cases.
A test case is said to pass the ideal test if our method performs better than, or similarly to, any single mode (see Materials and Methods for details of the success criteria).
Nonideal test
We design the following two nonideal tests to assess the robustness of our method:
A test case is said to pass the nonideal tests if our method "statistically" performs better than, or similarly to, any single mode (see Materials and Methods for details of the success criteria).
Then we go into a detailed discussion of the results. To clearly analyze the results, we classify the 22 test cases into the following three categories:
Successful cases with single-mode dominance
Among the test cases that successfully pass both the ideal and nonideal tests, for 12 of them (see the top part of Table 2 for details) there is a single mode that dominates the measured conformational change. Among these 12 cases, only three are dominated by precisely the lowest-frequency mode (mode No. 1) and four by the second-lowest-frequency mode (mode No. 2); the remaining five have their dominant mode ranging from mode No. 3 to No. 6 (Table 2). Therefore, even for cases with single-mode dominance, a simple choice of the dominant mode based solely on lowest frequency is generally not feasible.
For example, the transition (1ddt → 1mdt) is dominated by mode No. 2 (overlap = 0.564). In both the ideal and nonideal tests, our method captures mode No. 2 as the dominant mode (see Fig. 2). The nonideal test with different choices of input pairs reveals high robustness with slightly reduced performance (average overlap ≈ 0.7, and SD ≈ 0.1). The robustness against errors in the input distance constraints is very strong: for N = 1...10 pairs, the standard deviation is virtually zero.
|
1yts; see Fig. 3) and (2lao → 1lst; see Fig. 4). In both transitions, both nonideal tests reveal very robust performance (small standard deviation).
|
|
1avh).
Successful cases with multimodes dominance
Among the test cases that successfully pass both the ideal and nonideal tests, for five of them (see the bottom part of Table 2) there are two modes that dominate the measured conformational change.
We discuss these cases in details as follows.
Transition (9aat → 1ama) is dominated by mode No. 6 (overlap = 0.515) and No. 7 (overlap = 0.459). In the ideal test, our method (with ≥4 pairs as input) can capture mode No. 6 as the dominant mode together with mode No. 1. This is not surprising because the frequency of mode No. 1 (0.000326) is much lower than that of mode No. 6 (0.057652), which favors its presence in the response displacement. The nonideal test reveals reasonable robustness with different choices of pairs as input (average overlap ≈ 0.5, ± SD ≈ 0.15 for N ≥ 4 pairs). The robustness against errors in the input distance constraints is relatively strong (for N = 1...10 pairs, the SD is always ≤ 0.1).
Transition (1cll → 1ctr) is dominated by three modes: No. 3 (overlap = 0.374), No. 4 (overlap = 0.380), and No. 5 (overlap = 0.405). Our method captures mode No. 3 as the dominant and No. 4 as the subdominant mode (see Fig. 1). This explains its high overlap of 0.69 with the measured conformational change. The nonideal test with different choices of pairs reveals good robustness with slightly reduced performance (average overlap ≈ 0.5, and ± SD ≈ 0.1). The robustness against errors in the input distance constraints is extremely strong: for N = 1...10 pairs, the SD is always <0.003.
|
1anf) is dominated by mode No. 2 (overlap = 0.675) and No. 1 (overlap = 0.650). Our method correctly captures mode No. 2 as the dominant mode and mode No. 1 as the subdominant mode. The nonideal test with different choices of pairs offers almost as good performance as the ideal test (average overlap ≈ 0.8, and ± SD ≈ 0.2) for ≥4 pairs as input. The robustness against errors in the input distance constraints is also very strong: for N = 1...10 pairs, the SD is always <0.02.
Transition (1dfl → 1kk7) is dominated by mode No. 1 (overlap = 0.518) and No. 3 (overlap = 0.475), both of which are correctly captured as dominant or subdominant modes by this method. The nonideal test with different choices of pairs as input reveals slightly lower performance than the ideal test and good robustness (average overlap ≈ 0.5–0.6, ± SD ≈ 0.2) for N ≥ 5 pairs. The robustness against errors in the input distance constraints is relatively strong (for N = 1...10 pairs, the SD is always ≤ 0.1).
Transition (1vom → 1mma) is dominated by mode No. 1 (overlap = 0.558) and No. 2 (overlap = 0.371). Both modes are captured by our method as dominant or subdominant modes. The nonideal test with different choices of pairs as input reveals somewhat lower performance than the ideal test and reasonable robustness (average overlap ≈ 0.5–0.6, ± SD ≈ 0.2) for N ≥ 5 pairs. The robustness against errors in the input distance constraints is very strong (for N = 1...10 pairs, the SD is always ≤ 0.01).
To summarize, in the above five successful cases our method correctly captures one or both of the dominant modes, which then also dominate the predicted conformational change, and thus achieves performance comparable to or better than that of any single mode alone. Although the nonideal test gives somewhat lower performance than the ideal test (more pairs are needed and the overlap varies slightly), the method is generally robust: the results are not sensitive to the choice of pairs from the pool or to the accuracy of the input distance constraints. The robustness against the latter is particularly impressive.
Unsuccessful cases
There are five unsuccessful cases that are discussed as follows:
Transition (8adh → 6adh). There is a dominant mode No. 3 (overlap = 0.68); the ideal test gives reasonable performance (although the overlap of 0.56 is lower than the 0.68 of mode No. 3), and the nonideal test gives reduced performance with good robustness against both the choice of pairs from the pool and the inaccuracy of the input distance constraints. Therefore, this case is actually partially successful.
Transition (3enl → 7enl). There is a weakly dominant mode No. 1 (overlap = 0.345). We obtain a good ideal test result but a worse nonideal test result, although with good robustness.
In the remaining three cases (4dfr → 5dfr, 1hhp → 1ajx, and 1hil → 1him), the ideal test result is good but the nonideal test fails to give robust results (the standard deviation is comparable to the average overlap); namely, the performance is sensitive to the choice of pairs, to errors in the distance constraints, or to both. We note that the pool of significant pairs is relatively small for these three cases, which may make the result strongly susceptible to the contribution of each individual pair and therefore cause weak robustness. Indeed, for the transitions 1hhp → 1ajx and 1hil → 1him, when we enlarge the pool size the robustness is significantly improved (data not shown).
| SUMMARY |
|---|
|
|
|---|
10) of pairwise distance constraints, we have obtained a good overlap between the computed conformational change and the measured one, which is higher than (or close to) the maximal overlap between any single mode and the measured one. In particular, in cases where more than one normal mode dominates, the predicted conformational change can correctly capture all or some of the dominant modes and give a better overlap than any single mode. We also find that increasing the number of constraints generally does not significantly improve the overlap values. The results of the nonideal test are also encouraging: for most of the test cases (17 out of 22), slightly more constraints are needed to match the performance of the ideal test, and the robustness against different choices of pairs of constraints and errors in the values of distance constraints is generally strong. The dependence on the number of constraints is stronger than in the ideal test; the average overlap improves and the variance of the overlap decreases as more constraints are used. Therefore, for practical use of this method, we need to use slightly more constraints than suggested by the ideal test, which improves not only the average performance but also the robustness.
The dependence on the accuracy of the distance constraints is very weak for most of the test cases, even for a relatively large fractional error (up to 50%). This is critical to the practical application of this method with experimentally derived distance constraints, which are usually of limited accuracy.
| CONCLUSION |
|---|
|
|
|---|
10) are used; in several cases even a single constraint has already yielded very good results. This method generally performs better than using any single normal mode, especially in cases where more than one mode dominates the transition. The robustness of the method against different choices of residue pairs and errors in the values of distance constraints has also been shown to be fairly strong. The success of this method lends support to the critical roles of collective low-frequency motions in facilitating biomolecular functions. The easy and accurate triggering of such collective mode(s) by manipulating just a small number of interacting pairs of residues may be essential to the mechanism of allostery initiated by ligand binding or protein-protein interactions.
Compared with other computational methods that utilize the distance constraints to model protein structures (for example, using molecular dynamics simulation with additional energy terms from the constraints as restraints, as implemented in CHARMM by Brooks et al., 1983), this method has the following advantages: first, its implementation is fast and easy; second, it is free from any trapping in local minima; and third, it is applicable to large protein complexes. Furthermore, the conformational change predicted by this method can serve as a zero-order approximation that can be further refined by more sophisticated methods (for example, using dynamical simulations based on all-atom potentials).
Finally, we acknowledge that the ENM has limitations and inaccuracies, and that some protein conformational changes cannot be described by the low-frequency normal modes (for example, some local structural changes). However, the basic idea proposed here is not limited to the ENM; it can be applied to the normal-modes analysis of other force fields such as all-atom potentials.
For future work, we will apply this method with the experimentally derived distance constraints (for example, from NMR or other optical spectroscopy probes) to the analysis of protein conformational changes toward transient states that are difficult to capture by NMR or x-ray crystallography.
| ACKNOWLEDGEMENTS |
|---|
|
|
|---|
This work is supported by funding from the National Institutes of Health.
Submitted on December 21, 2004; accepted for publication February 1, 2005.
| REFERENCES |
|---|
|
|
|---|
Brooks, B., R. Bruccoleri, B. Olafson, D. States, S. Swaminathan, and M. Karplus. 1983. CHARMM: a program for macromolecular energy, minimization, and dynamics calculations. J. Comput. Chem. 4:187–217.
Debe, D., M. Carlson, J. Sadanobu, S. Chan, and W. Goddard. 1999. Protein fold determination from sparse distance restraints. J. Phys. Chem. B. 103:3001–3008.
Delarue, M., and Y. H. Sanejouand. 2002. Simplified normal mode analysis of conformational transitions in DNA-dependent polymerases: the elastic network model. J. Mol. Biol. 320:1011–1024.
Gerstein, M., and W. Krebs. 1998. A database of macromolecular motions. Nucleic Acids Res. 26:4280–4290.
Hubbell, W. L., D. S. Cafiso, and C. Altenbach. 2000. Identifying conformational changes with site-directed spin labeling. Nat. Struct. Biol. 7:735–739.
Isin, B., P. Doruker, and I. Bahar. 2002. Functional motions of influenza virus hemagglutinin: a structure-based analytical approach. Biophys. J. 82:569–581.
Keskin, O., S. Durell, I. Bahar, R. L. Jernigan, and D. G. Covell. 2002. Relating molecular flexibility to function: a case study of tubulin. Biophys. J. 83:663–680.
Kim, M. K., R. L. Jernigan, and G. S. Chirikjian. 2002. Efficient generation of feasible pathways for protein conformational transitions. Biophys. J. 83:1620–1630.
Kundu, S., and R. L. Jernigan. 2004. Molecular mechanism of domain swapping in proteins: an analysis of slower motions. Biophys. J. 86:3846–3854.
Skolnick, J., A. Kolinski, and A. R. Ortiz. 1997. MONSSTER: a method for folding globular proteins with a small number of distance restraints. J. Mol. Biol. 265:217–241.
Tama, F., O. Miyashita, and C. L. Brooks, III. 2004. Normal mode based flexible fitting of high-resolution structure into low-resolution experimental data from cryo-EM. J. Struct. Biol. 147:315–326.
Tama, F., and Y. H. Sanejouand. 2001. Conformational change of proteins arising from normal mode calculations. Protein Eng. 14:1–6.
Tirion, M. M. 1996. Large amplitude elastic motions in proteins from a single-parameter, atomic analysis. Phys. Rev. Lett. 77:1905–1908.
Xu, C., D. Tobi, and I. Bahar. 2003. Allosteric changes in protein structure computed by a simple mechanical model: hemoglobin T↔R2 transition. J. Mol. Biol. 333:153–168.
Zheng, W., and B. R. Brooks. 2005. Identification of dynamical correlations within the myosin motor domain by the normal mode analysis of an elastic network model. J. Mol. Biol. 346:745–759.
Zheng, W., and S. Doniach. 2003. A comparative study of motor-protein motions by using a simple elastic-network model. Proc. Natl. Acad. Sci. USA. 100:13253–13258.