help button home button Biophys. J.
HOME HELP FEEDBACK SUBSCRIPTIONS ARCHIVE SEARCH TABLE OF CONTENTS

This Article
Right arrow Abstract Freely available
Right arrow Full Text (PDF)
Right arrow Alert me when this article is cited
Right arrow Alert me if a correction is posted
Services
Right arrow Similar articles in this journal
Right arrow Similar articles in PubMed
Right arrow Alert me to new issues of the journal
Right arrow Download to citation manager
Right arrow reprints & permissions
Citing Articles
Right arrow Citing Articles via Google Scholar
Google Scholar
Right arrow Articles by Rosen, J. B.
Right arrow Articles by Dill, K. A.
Right arrow Search for Related Content
PubMed
Right arrow PubMed Citation
Right arrow Articles by Rosen, J. B.
Right arrow Articles by Dill, K. A.

Biophys J, December 2000, p. 2818-2824, Vol. 79, No. 6

A Method for Parameter Optimization in Computational Biology

J. B. Rosen,* A. T. Phillips,dagger S. Y. Oh,Dagger and K. A. Dill§

 *Computer Science and Engineering Department, University of California at San Diego, San Diego, California 92093 USA;  dagger Department of Computer Science, University of Wisconsin-Eau Claire, Eau Claire, Wisconsin 54702 USA;  Dagger Department of Mathematics, Chungnam National University, Taejon 305-764, Korea; and  §Department of Pharmaceutical Chemistry, University of California at San Francisco, San Francisco, California 94118 USA




    ABSTRACT
TOP
ABSTRACT
INTRODUCTION
ENERGY FUNCTION IMPROVEMENT BY...
THE ENPOP ALGORITHM
GUARANTEEING REDUCTION IN...
TEST PROBLEM 1: 2D...
MODEL PROBLEM 2: BUMPY...
CONCLUSIONS
REFERENCES

Models in computational biology, such as those used in binding, docking, and folding, are often empirical and have adjustable parameters. Because few of these models are yet fully predictive, the problem may be nonoptimal choices of parameters. We describe an algorithm called ENPOP (energy function parameter optimization) that improves---and sometimes optimizes---the parameters for any given model and for any given search strategy that identifies the stable state of that model. ENPOP iteratively adjusts the parameters simultaneously to move the model global minimum energy conformation for each of m different molecules as close as possible to the true native conformations, based on some appropriate measure of structural error. A proof of principle is given for two very different test problems. The first involves three different two-dimensional model protein molecules having 12 to 37 monomers and four parameters in common. The parameters converge to the values used to design the model native structures. The second problem involves nine bumpy landscapes, each having between 4 and 12 degrees of freedom. For the three adjustable parameters, the globally optimal values are known in advance. ENPOP converges quickly to the correct parameter set.



    INTRODUCTION
TOP
ABSTRACT
INTRODUCTION
ENERGY FUNCTION IMPROVEMENT BY...
THE ENPOP ALGORITHM
GUARANTEEING REDUCTION IN...
TEST PROBLEM 1: 2D...
MODEL PROBLEM 2: BUMPY...
CONCLUSIONS
REFERENCES

There are many models in computational chemistry, biology, and materials science, in which parameterized energy functions are used to predict three-dimensional structures of molecules. Folding, threading, docking, protein-protein recognition, and loop refinement methods are examples from computational biology. It is seldom clear whether failures of such models are attributable to the form of the model's mathematical functions, or to poor choices of the parameters used in them. Such models are usually so computationally expensive that it is impossible to be systematic about finding the "optimal" parameters, i.e., those parameters that minimize some measure of total structural error.

To find the optimal parameters, say for a folding algorithm, it would be necessary to 1) compute the folded structures for many proteins, 2) determine the errors, 3) change the parameters, and then iterate this whole process for many different sets of parameters to find the best ones. This has not been computationally feasible before. Instead, model parameters are usually chosen to varying degrees by physical estimates, guesswork, and arbitrary trial and error. Such efforts involve small searches through large parameter spaces. As with other types of search problems, parameter optimization can depend on the order in which the parameters are chosen, and it can get caught in traps from which it cannot escape.

There are methods for optimizing parameters in certain classes of problems (Esposito and Floudas, 1998; Maiorov and Crippen, 1994; Koretke et al., 1998; Hao and Scheraga, 1996). Two examples are threading (Goldstein et al., 1992; Hendlich et al., 1990; Maiorov and Crippen, 1994; Thomas and Dill, 1996; Mirny and Shakhnovich, 1996; Huber and Torda, 1998; Koretke et al., 1996) and lattice models of folding (Hao and Scheraga, 1996; Goldstein et al., 1992; Shrivastava et al., 1995; Govindarajan and Goldstein, 1995). In both cases, the ability to find optimal parameters is a direct consequence of the facts that 1) the conformational space is discrete, and 2) the global optimum is guaranteed to be among the conformations searched. Whenever global optima can be found through finite searches, there are methods that can be used to learn parameters that can distinguish correct from incorrect structures. This is arguably the principal advantage of threading versus folding algorithms: the former involve discrete searches, so parameters can be improved systematically.

As far as we know, no algorithm yet exists that can find optimal parameters for models having continuous degrees of freedom. Consider protein folding. To improve the parameters in a folding model, you need to recompute the lowest energy structure many times, once after each iteration of small changes in parameters. This means many minimizations. The main problem in optimizing parameters for continuum models is that the typical minimization methods---Monte Carlo, simulated annealing, or molecular dynamics---find only local minima and fall into different energy wells each time, so there is no unique and reproducible mapping from a given set of model parameters to a given model native structure. This lack of reproducibility is the primary reason that parameter optimizations are difficult in continuum models.

We describe here a computational method, called ENPOP (energy parameter optimization), that searches for globally optimal parameters for continuous models. It takes as input 1) a given model and its associated adjustable parameters, 2) the (usually incorrect) best structure for each of m different molecules that is predicted by the starting parameters, 3) the m correct (true) structures that the model should produce, and 4) some measure of structural error between predicted structures and true, known, or correct best structures. The technology that enables us to circumvent the reproducibility limitation and to systematically optimize parameters is the recently developed CGU (convex global underestimator) method that finds global minima---or at least reproducible minima---of energy landscapes, at least for short enough chains (Dill et al., 1997a,b).

Our approach is general and guarantees improved, and sometimes even globally optimal, parameters. It should be applicable to a wide range of models in computational biology and chemistry. It allows any differentiable measure of "structural error" to be minimized over the continuous space of both potential function parameters and molecular conformations, while placing no restrictions on the form of the potential function other than that it be differentiable. Our method simultaneously tracks the global minimum for each of the proteins in the model as the parameters are changed, while providing a guaranteed reduction in the structural error between model-native and true-native states. For this we do not require any additional global minimization (beyond that needed to get the initial global minima), so the method is computationally efficient. Our strategy then is to learn a set of parameters for one set of proteins with known structures and apply those parameters to the prediction of other structures. Our hope is that by finding better parameters for any given model, this method will ultimately allow computational biologists to develop improved models for folding, binding, docking, etc.



    ENERGY FUNCTION IMPROVEMENT BY PARAMETER OPTIMIZATION
TOP
ABSTRACT
INTRODUCTION
ENERGY FUNCTION IMPROVEMENT BY...
THE ENPOP ALGORITHM
GUARANTEEING REDUCTION IN...
TEST PROBLEM 1: 2D...
MODEL PROBLEM 2: BUMPY...
CONCLUSIONS
REFERENCES

We consider some model for an energy landscape, F(Phi alpha ). F(Phi alpha ) is the conformational energy or free energy as a function of the n degrees of freedom (coordinates or bond angles, etc.) that are given by the vector Phi  is in  Rn and as a function of the k parameters of the model that are given by the parameter vector alpha  is in  Rk. The parameters might include Lennard-Jones parameters, steric terms, hydrogen bond or hydrophobic interaction strengths, coefficients of bond angle energies, etc. There is no limitation on the functional forms of the terms. Although the method is general, we will make the discussion more concrete by focusing on protein folding. The predicted native state is given by the global minimum vector Phi G, which has a free energy F(Phi G). If the model energy function were perfect, it would predict the true native structure; that is, we would have Phi G = Phi N, where Phi N is the correct dihedral angle vector for the native state of this protein.

We will consider the prediction accuracy in terms of the error in the degrees of freedom ∥Phi G - Phi N∥2 for each protein. Other measures, such as RMS, could be used with minor modification. To make explicit the dependence of F on the parameters, we let alpha  be a vector of k parameters of the energy function that are common for all proteins. In the energy functions we test here, we typically have k <=  15.

Suppose we wish to improve the native structure predictions for a set of m proteins, with energy functions F(j)(Phi (j)alpha ) and native states Phi N(j), j = 1, ... , m. Note that each potential function F(j) and set of nj independent variables Phi (j) will be different and will depend on the number and sequence of beads in the jth protein. But the vector alpha  is independent of j and will be the same for all proteins. For any fixed value of alpha , we can use the CGU algorithm to find the corresponding global minimum Phi G(j)(alpha ) for each protein. Each such global minimum is characterized by the conditions
F<SUP><UP>j</UP></SUP>(&PHgr;<SUP>(<UP>j</UP>)</SUP><SUB><UP>G</UP></SUB>, &agr;)≤F<SUP>(<UP>j</UP>)</SUP>(&PHgr;<SUP>(<UP>j</UP>)</SUP>, &agr;), (1)

∇<SUB>&PHgr;</SUB>F<SUP>(<UP>j</UP>)</SUP>(&PHgr;<SUP>(<UP>j</UP>)</SUP><SUB><UP>G</UP></SUB>, &agr;)=0, <UP>and</UP> (2)

H<SUP>(<UP>j</UP>)</SUP>(&PHgr;<SUP>(<UP>j</UP>)</SUP><SUB><UP>G</UP></SUB>, &agr;) <UP>positive definite,</UP> (3)
where H(j) is the (nj × nj) Hessian matrix with respect to Phi  of F(j) (Phi (j)alpha ). While unlikely, it is possible that for some initial choice alpha I, some Hessian H(j) may only be positive semidefinite at the corresponding global minimum. If this occurs, a different value of alpha I should be chosen. The total conformational error is defined by
&rgr;(&agr;) <LIM><OP>∑</OP><LL><UP>j=1</UP></LL><UL><UP>m</UP></UL></LIM> w<SUB><UP>j</UP></SUB>&rgr;<SUB><UP>j</UP></SUB>(&agr;), (4)
where wj >=  0 are any arbitrary weighting factors that the user wants to include, and rho j(alpha ) is the jth molecule conformational error, given by
&rgr;<SUB><UP>j</UP></SUB>(&agr;)∥&PHgr;<SUP>(<UP>j</UP>)</SUP><SUB><UP>G</UP></SUB>(&agr;)−&PHgr;<SUP>(<UP>j</UP>)</SUP><SUB><UP>N</UP></SUB>∥<SUP>2</SUP>. (5)
The weights wj can be selected to give more or less importance to certain proteins, for example, based on known accuracies or reliabilities of their structures, but otherwise might normally be set equal to one. The parameter adjustment method proposed here can now be stated as a k-dimensional minimization problem, with simple bounds on the parameters
<LIM><OP><UP>min</UP></OP><LL>&agr;</LL></LIM> &rgr;(&agr;) <UP>subject to &agr;<SUB>min</SUB></UP>≤&agr;≤&agr;<SUB><UP>max</UP></SUB>. (6)
The bounds (alpha min)j and (alpha max)j should be chosen to appropriately restrict the range on the parameter alpha j to values that are meaningful to the problem.

The key computational expense in this method is computing the function rho (alpha ) and its gradient nabla alpha rho . So we use an efficient minimization method that requires relatively few function and gradient calls, the Sequential Quadratic Programming (SQP) method (Gill et al., 1986).

We need to track the changing value of Phi G(j)(alpha ) as alpha  is adjusted. We impose the constraint that Phi G(j)(alpha ) must remain a local minimum of F(j) (Phi (j)alpha ) as alpha  is modified. This is done by requiring that the system of nj nonlinear equations in Eq. 2 remains satisfied as alpha  changes, starting with an initially chosen parameter vector alpha I. Note that this does not guarantee that the updated value of Phi G(j)(alpha ) will also remain the global minimum of F(j)(Phi (j)alpha ), although this is easily checked at the conclusion of the method by another CGU global optimization step.

Phi G(j)(alpha ) can be updated without recomputing the global minimum, by using the implicit function theorem, provided that the Hessians H(j) are positive definite. This condition is satisfied at Phi G(j) by Eq. 3. The implicit function theorem uses a linearized approximation to Eq. 2 at its current values of Phi  and alpha  and therefore gives the required small change Delta alpha to balance any small change Delta Phi . It is valid if the neglected terms in ∥Delta alpha ∥2 and ∥Delta Phi ∥2 are much smaller than the first-order term in ∥Delta alpha ∥ and ∥Delta Phi ∥. For a small change Delta Phi G(j), the corresponding change Delta alpha required to keep Eq. 2 satisfied is given approximately by
H<SUP>(<UP>j</UP>)</SUP>(&PHgr;<SUP>(<UP>j</UP>)</SUP><SUB><UP>G</UP></SUB>, &agr;)&Dgr;&PHgr;<SUP>(<UP>j</UP>)</SUP><SUB><UP>G</UP></SUB>+J<SUP>(<UP>j</UP>)</SUP><SUB>&agr;</SUB>(&PHgr;<SUP>(<UP>j</UP>)</SUP><SUB><UP>G</UP></SUB>, &agr;)&Dgr;&agr;=0, (7)
where Jalpha (j) = Jalpha (j)(Phi G(j)alpha ) are the nj × k Jacobians of nabla Phi F(j)(Phi G(j)alpha ). Because the Hessians H(j) = H(j)(Phi G(j)alpha ) are nonsingular, we can write this as
&Dgr;&PHgr;<SUP>(<UP>j</UP>)</SUP><SUB><UP>G</UP></SUB>=<UP>−</UP>[H<SUP>(<UP>j</UP>)</SUP>]<SUP>−1</SUP>J<SUP>(<UP>j</UP>)</SUP><SUB>&agr;</SUB>&Dgr;&agr;. (8)
Because nabla alpha rho j(alpha ) = -[Jalpha (j)]T[H(j)]-1nabla Phi rho j(alpha ), it then follows that the desired gradient nabla alpha rho (alpha ) of the total conformation error as a function of parameters is given by
∇<SUB>&agr;</SUB>&rgr;(&agr;)=<UP>−</UP>2 <LIM><OP>∑</OP><LL><UP>j=1</UP></LL><UL><UP>m</UP></UL></LIM> w<SUB><UP>j</UP></SUB>[J<SUP>(<UP>j</UP>)</SUP><SUB>&agr;</SUB>]<SUP><UP>T</UP></SUP>[H<SUP>(<UP>j</UP>)</SUP>]<SUP>−1</SUP>(&PHgr;<SUP>(<UP>j</UP>)</SUP><SUB><UP>G</UP></SUB>(&agr;)−&PHgr;<SUP>(<UP>j</UP>)</SUP><SUB><UP>N</UP></SUB>). (9)
Corresponding to each Phi G(j), good approximations to the Hessians H(j) are obtained directly from the SQP implementation of the local minimization, a key phase of the CGU method. Also, the Jacobians Jalpha (j) are relatively easy to compute because Jalpha (j) has only k columns.



    THE ENPOP ALGORITHM
TOP
ABSTRACT
INTRODUCTION
ENERGY FUNCTION IMPROVEMENT BY...
THE ENPOP ALGORITHM
GUARANTEEING REDUCTION IN...
TEST PROBLEM 1: 2D...
MODEL PROBLEM 2: BUMPY...
CONCLUSIONS
REFERENCES

We call our method ENPOP (energy function parameter optimization). Here is the general algorithm:

1. Guess an initial alpha I.

2. Run the CGU method successively on the functions F(j), j = 1, ... , m with alpha  = alpha I fixed. Denote the result of each CGU run by Phi G(j).

3. Using each Phi G(j) from step 2, perform the minimization over alpha  described in Eq. 6. To perform this step, use alpha I as a starting point and use Eq. 9 for the gradient of rho .

To compute the change Delta Phi G(j) corresponding to altered parameters (at each step of the local minimization) Delta alpha , we use the implicit function theorem, which states that (see Eq. 7) Delta Phi G(j) = -[H(j)]-1Jalpha (j)Delta alpha . This step will result in a new parameter set alpha new with corresponding global minima Phi G(j)(alpha new), j = 1, ... , m.

The ENPOP algorithm is quite general and can be applied to any set of molecular structures with a given model and potential function. All that is required is a suitable representation for the Hessian H(j) and Jacobian Jalpha (j) specific to the model.



    GUARANTEEING REDUCTION IN STRUCTURAL ERROR
TOP
ABSTRACT
INTRODUCTION
ENERGY FUNCTION IMPROVEMENT BY...
THE ENPOP ALGORITHM
GUARANTEEING REDUCTION IN...
TEST PROBLEM 1: 2D...
MODEL PROBLEM 2: BUMPY...
CONCLUSIONS
REFERENCES

The method described above optimizes parameters while enforcing the requirement that the initial global minimum remain at least a local minimum of the model energy function. But what guarantees that the original global minimum will not shift to become just a local minimum? In this section, we describe criteria that ensure that the original minimum remains global. The purpose of such a criterion is computational efficiency: when the global minimum is no longer ensured, the CGU can be run again to check whether the current minimum is global. The goal is to run the CGU only the minimum possible number of times, to save computational cost.

When attempting to improve the parameter values for the set of potential functions F(j)(Phi G(j)(alpha ), alpha ), for j = 1, ... , m, it is useful to have prior information about the amount of reduction in the error rho (alpha ) that can be expected. We now show that a lower bound on the decrease in rho (alpha ) can be given, based on information available at the initial value alpha  = alpha I. That is, we show that we can always obtain a parameter vector alpha 1 such that
&rgr;(&agr;<SUB>1</SUB>)≤&rgr;(&agr;<SUB><UP>I</UP></SUB>)−&Dgr;&rgr;, (10)
where Delta rho is given by Eq. 14. Because Delta rho is a guaranteed lower bound, the actual decrease in rho (alpha ) obtained by ENPOP may typically be much greater.

To determine Delta rho , we need to know several quantities that are available once the global minima Phi G(j) have been computed for each protein by the CGU method. For this purpose, we define the gradient in parameter space, g = nabla alpha rho (alpha I), as given by Eq. 9, and the initial energy gaps Delta F(j) for j = 1, ... , m and alpha  = alpha I, by
&Dgr;F<SUP>(<UP>j</UP>)</SUP>=F<SUP>(<UP>j</UP>)</SUP><SUB><UP>LM</UP></SUB>−F<SUP>(<UP>j</UP>)</SUP><SUB><UP>G</UP></SUB>>0. (11)
In Eq. 11, FG(j) triple-bond  F(j)(Phi G(j)(alpha I), alpha I) represents the global minimum energy found by the CGU method, using the parameter vector alpha I, and FLM(j) is the corresponding next lowest local minimum energy, both for the jth protein. If the gradient ∥g∥ = 0, then alpha I is already a stationary point of rho (alpha ), and a new value of alpha I must be selected. Similarly, if Delta F(j) = 0, for any j, then any change in alpha  may cause the coordinates of the global minimum for F(j)(Phi (alpha ), alpha ) to move discontinuously from its current conformation Phi G(j) to the conformation corresponding to the alternate global minimum (because FLM(j) = FG(j)). In this case, the jth protein should be replaced (or removed) in the test set. Therefore, we can assume that Eq. 11 holds for all j = 1, ... , m, and that ∥g∥ > 0.

In addition, we will require the quantities Delta Fmin triple-bond  minj Delta F(j) > 0, the Hessian Hrho (alpha ) of rho (alpha ), and the curvature of rho  in the direction of g as given by
&lgr;=g<SUP><UP>T</UP></SUP>H<SUB>&rgr;</SUB>g/g<SUP><UP>T</UP></SUP>g. (12)
The value of lambda  is bounded by lambda min <=  lambda  <=  lambda max, where lambda min and lambda max are the minimum and maximum eigenvalues of Hrho (alpha I). Because Hrho (alpha ) may be indefinite at alpha I, lambda  may be negative.

Finally, we need a uniform bound on the gradient of F(Phi (alpha ), alpha ) with respect to alpha :
∥∇<SUB>&agr;</SUB>F(&PHgr;(&agr;), &agr;)∥≤&bgr;. (13)
In terms of these quantities, it can be shown that
&Dgr;&rgr;=<FENCE><AR><R><C>∥g∥&Dgr;F<SUB><UP>min</UP></SUB>/2&bgr;</C><C>&lgr;≤0</C></R><R><C>∥g∥&Dgr;F<SUB><UP>min</UP></SUB>/4&bgr;</C><C>&lgr;>0 <UP>and</UP> &lgr;&Dgr;F<SUB><UP>min</UP></SUB>≤2&bgr;∥g∥</C></R><R><C>∥g∥<SUP>2</SUP>/2&lgr;</C><C>&lgr;>0 <UP>and</UP> &lgr;&Dgr;F<SUB><UP>min</UP></SUB>>2&bgr;∥g∥</C></R></AR></FENCE>. (14)
In the first two bounds in Eq. 14, the change in alpha  is limited by the possibility that the current global minimum is replaced by one of the other local minima (it is assumed that a new global minimum will arise only from existing local minima). In the third bound, the decrease in rho (alpha ) is at least that obtained by a single steepest descent step in the direction -g in alpha -space.

We now test ENPOP on two very different problems. The first test involves three short protein molecules, in two-dimensional conformational space, using a simple energy function with four parameters. ENPOP reduces the total conformation error from its initial value rho (alpha I) = 1012.3 to its minimum rho (alpha N) = 0.87. The second test problem is one for which we know in advance the optimum parameter vector alpha N, such that Phi G(alpha N) = Phi N. We do not, of course, use this knowledge in ENPOP, but we can verify that ENPOP computes the correct parameter vector alpha N. This test problem consists of nine different bumpy energy landscapes, each with a different number of degrees of freedom (ranging from 4 to 12), but with a common set of three parameters. We find that ENPOP always finds the optimum parameter vector alpha N, starting with different initial vectors alpha I.



    TEST PROBLEM 1: 2D PROTEIN FOLDING MODEL
TOP
ABSTRACT
INTRODUCTION
ENERGY FUNCTION IMPROVEMENT BY...
THE ENPOP ALGORITHM
GUARANTEEING REDUCTION IN...
TEST PROBLEM 1: 2D...
MODEL PROBLEM 2: BUMPY...
CONCLUSIONS
REFERENCES

We first consider a problem involving three short chain molecules, in two-dimensional conformation space, using a simple energy function. The energy function consists of four terms, a bond length penalty term, a Lennard-Jones (LJ) attraction/repulsion term, a hydrophobic attraction term, and a hydrogen bond term. The energy function contains four adjustable parameters. The test molecules have from 12 to 37 beads and differ from each other in the specification of which beads are hydrophobic and which pairs are hydrogen bonded. The native state conformations are created in advance by arbitrarily choosing bond angles. The four parameters are adjusted by ENPOP so that the global minimum of the energy function, for each molecule, gives a corresponding conformation as close as possible to its known native conformation.

To simplify both the calculation and the discussion, the degrees of freedom are the (xy) coordinates of each bead, (xi, yi), i = 1, ... , nn = 12, 25, 37. For each molecule, bead 1 is fixed at the origin. We let X denote the vector with 2n elements (xiyi), i = 1, ... , n. The conformation of a molecule is then specified by the vector X. Because the distance rij between beads i and j is given by rij2 = (xi - xj)2 + (yi - yj)2, the vector X completely specifies all pairwise distances between beads. We also let alpha  is in  R4 and denote the parameter vector specifying the four parameter values as alpha i, i = 1, ... , 4.

The energy function is
F(<B><UP>X</UP></B>, &agr;)=50 <LIM><OP>∑</OP><LL><UP>i=1</UP></LL><UL><UP>n−1</UP></UL></LIM> [r<SUB><UP>i,i+1</UP></SUB>−1.0]<SUP>2</SUP>+<LIM><OP>∑</OP><LL><UP>i=1</UP></LL><UL><UP>n−1</UP></UL></LIM> <LIM><OP>∑</OP><LL><UP>j>i</UP></LL></LIM> L(r<SUB><UP>ij</UP></SUB>, &agr;<SUB>1</SUB>, &agr;<SUB>2</SUB>)−&agr;<SUB>3</SUB> <LIM><OP>∑</OP><LL><UP>i,j∈H</UP></LL></LIM> r<SUB><UP>ij</UP></SUB><SUP>−6</SUP>−&agr;<SUB>4</SUB> <LIM><OP>∑</OP><LL><UP>i,j∈HB</UP></LL></LIM> r<SUB><UP>ij</UP></SUB><SUP>−6</SUP>, (15)
where alpha i > 0 and
L(r, &agr;<SUB>1</SUB>, &agr;<SUB>2</SUB>)=&agr;<SUB>1</SUB>[<UP>−</UP>2r<SUP>−6</SUP>+0.1&agr;<SUB>2</SUB>r<SUP>−12</SUP>]. (16)
H denotes the set of hydrophobic beads, and HB denotes the set of hydrogen bonded pairs. Note that the minimum in the LJ term occurs at rmin = (0.1alpha 2)1/6, where it has the value (-10alpha 1)/alpha 2.

For each test molecule the sets of hydrophobic and hydrogen-bonded beads are chosen to be different, while the parameter values are the same for all molecules. There is one energy function for all proteins, but folds are different because proteins have different monomer sequences. We represent the energy function for the jth test molecule by Fj(Xjalpha ), j = 1, 2, 3. For each test molecule we have a known native state conformation XNj, j = 1, 2, 3.

Molecule 1 consists of 12 beads, with four of them hydrophobic: H = (1, 2, 11, 12); and there are three hydrogen-bonded pairs: HB = (1, 4), (5, 8), (9, 11). Molecule 2 consists of 25 beads, with seven of them hydrophobic: H = (3, 6, 7, 15, 21, 24, 25); and four hydrogen-bonded pairs: HB = (3, 6), (9, 12), (15, 18), (19, 22). Finally, molecule 3 consists of 37 beads, with 12 of them hydrophobic: H = (1, 5, 8, 9, 12, 17, 20, 21, 24, 29, 32, 33); and seven hydrogen-bonded pairs: HB = (1, 4), (3, 6), (7, 10), (17, 20), (19, 22), (23, 31), (34, 36).

We start with an arbitrary initial choice alpha I of the parameter vector. Corresponding to alpha I are global minimum energy conformations XIj, obtained by minimizing Fj(Xjalpha I), j = 1, 2, 3. These initial and native conformations are shown in Fig. 1, as the first and last of the four conformations for each molecule. Let Xj(alpha ) denote the minimum energy conformation for any alpha . We have Xj(alpha I) = XIj. We aim to find alpha N, so that the conformation error rho (alpha ) is minimized, where
&rgr;(&agr;)=<LIM><OP>∑</OP><LL><UP>j=1</UP></LL><UL><UP>3</UP></UL></LIM> ∥<B><UP>X</UP></B><SUB><UP>j</UP></SUB>(&agr;)−<B><UP>X</UP></B><SUB><UP>Nj</UP></SUB>∥<SUP>2</SUP>. (17)
To illustrate the process of parameter optimization, we explore trajectories in parameter space, from the initial to final parameter vectors. We follow the steepest descent path in parameter space to find the native parameter vector alpha N, such that rho (alpha N) is a minimum. At each descent step Delta alpha we reduce rho (alpha ) by a small amount. The corresponding changes in the function values Fj and corresponding conformations Xj(alpha  + Delta alpha ) are then obtained by local minimizations of Fj(Xjalpha  + Delta alpha ), starting with Xj(alpha ). The SQP code NPSOL (Gill et al., 1986) is used for this local minimization, and because the conformation change due to Delta alpha is small, this computation is fast. A typical steepest descent calculation in parameter space requires several hundred steps and takes 10-30 min on a current workstation. Note that the more efficient SQP minimization method could also have been used in parameter space, but it would have given much less information about the nature of the contours on the rho (alpha ) surface.




View larger version (27K):
[in this window]
[in a new window]
 
FIGURE 1   Three two-dimensional model proteins. The three structures on the right are the true natives, which are the global minimum conformations for the native parameter vector alpha N. The three structures on the left are the global minima for an incorrect initial parameter vector alpha I. ENPOP finds the correct parameters alpha N through an iterative procedure starting with alpha I.

One steepest descent calculation is shown in Figs. 1 and 2. For this example, the conformation changes for each of the three molecules as alpha  changes from alpha I to alpha N are shown in Fig. 1. For each of the three test molecules, the initial conformation (given by Xj(alpha I), j = 1, 2, 3) is shown on the left of the figure. The conformations on the right of the figure show Xj(alpha N), j = 1, 2, 3. The two intermediate conformations, for each molecule, show the Xj(alpha ) for two intermediate values of alpha  along the path in parameter space, as rho (alpha ) was minimized. The red beads are hydrophobic, and hydrogen bonding is between selected pairs of beads, as stated above. The initial values of the parameters were alpha I = (1.0, 0.5, 0.1, 50). The parameter values that minimized rho (alpha ) were alpha N = (1.7, 0.85, 0.73, 35). The initial value of the conformation error was rho (alpha I) = 1012.3, and its final value was rho (alpha N) = 0.87. The difference between the desired native conformations XNj and the final computed conformations Xj(alpha N) are so small that they cannot be distinguished in Fig. 1.




View larger version (16K):
[in this window]
[in a new window]
 
FIGURE 2   Changes in the parameter values as ENPOP steps from alpha I to alpha N, through a steepest descent path in parameter space.

The conformations in Fig. 1 show that initially the hydrogen bond term dominates relative to the hydrophobic term, but that in the native state the hydrophobic term essentially determines the structure. This is because alpha 3 (hydrophobic) increases from 0.1 to 0.73, while alpha 4 (hydrogen bond) decreases from 50 to 35. The manner in which the four parameters changed along the steepest descent path is shown in Fig. 2, which shows the trajectory in parameter space in moving from alpha I to alpha N, along the steepest descent path for rho (alpha ). Because we are following a steepest descent path in parameter space, the value of rho (alpha ) decreases monotonically as alpha  is changed from alpha I to alpha N. However, as shown in Fig. 2, the values of the four parameters, alpha i, i = 1, 2, 3, 4, do not change monotonically as a function of the path length in parameter space. Thus, for example, alpha 1 and alpha 3 are initially decreasing, even though they eventually increase along the steepest descent path. This shows that while minimizing rho (alpha ) in parameter space is a nontrivial problem, it can be successfully accomplished, as illustrated by this example.



    MODEL PROBLEM 2: BUMPY LANDSCAPES WITH KNOWN OPTIMAL PARAMETERS
TOP
ABSTRACT
INTRODUCTION
ENERGY FUNCTION IMPROVEMENT BY...
THE ENPOP ALGORITHM
GUARANTEEING REDUCTION IN...
TEST PROBLEM 1: 2D...
MODEL PROBLEM 2: BUMPY...
CONCLUSIONS
REFERENCES

For the second test problem, we sought a set of bumpy energy landscapes for which we can know the optimal parameter set in advance. The degrees of freedom are Phi  is in  Rn, and the energy function is F(Phi alpha ), which depends in a smooth way on Phi  and on the parameter vector alpha  is in  Rk. In this case we chose the function
F(&PHgr;, &agr;)=&PSgr;−&mgr; <UP>cos</UP>(w&PSgr;), (18)
where Psi  = Psi (Phi alpha ) = 1/2(Phi  - Aalpha )T D(Phi  - Aalpha ), µ > 0 and w > 0 are constants, A is an n × k matrix of rank k, and D is an n × n positive diagonal matrix. For any fixed value of the parameter <A><AC>&agr;</AC><AC>&cjs1171;</AC></A>, F(Phi <A><AC>&agr;</AC><AC>&cjs1171;</AC></A>) has many local minima but attains its unique global minimum at Phi  = Phi G(<A><AC>&agr;</AC><AC>&cjs1171;</AC></A>) = A<A><AC>&agr;</AC><AC>&cjs1171;</AC></A>, with the value
F(&PHgr;<SUB><UP>G</UP></SUB>(<A><AC>&agr;</AC><AC>&cjs1171;</AC></A>), <A><AC>&agr;</AC><AC>&cjs1171;</AC></A>)=<UP>−</UP>&mgr;. (19)
Thus, the solution Phi G(alpha ) to
<UP>global </UP><LIM><OP><UP>min</UP></OP><LL>&PHgr;</LL></LIM> F(&PHgr;, &agr;) (20)
for any fixed alpha  is known a priori. The dependence of F(Phi alpha ) on Phi  is illustrated in Fig. 3 for k = n = 2, µ = 20, w = 5, and
<AR><R><C>A</C><C>=</C><C><FENCE><AR><R><C>0.069</C></R><R><C><UP>−</UP>0.399</C></R></AR>
<AR><R><C>0.501</C></R><R><C>0.034</C></R></AR>
</FENCE>,</C></R><R><C>D</C><C>=</C><C><FENCE><AR><R><C>15.318</C></R><R><C>0.0</C></R></AR>
<AR><R><C>0.0</C></R><R><C>10.452</C></R></AR>
</FENCE>, <UP>and</UP></C></R><R><C>&agr;</C><C>=</C><C><FENCE><AR><R><C><UP>−</UP>0.038</C></R><R><C>0.364</C></R></AR>
</FENCE>.</C></R></AR> (21)
F(Phi alpha ) has a large number of local minima, but the global minimum is at
&PHgr;<SUB><UP>G</UP></SUB>=A&agr;=<FENCE><AR><R><C>0.180</C></R><R><C>0.028</C></R></AR></FENCE>. (22)
This form of the potential function is characterized by a rugged energy landscape with numerous kinetic traps, energy barriers, and narrow pathways to the native state. Although this potential function is artificial, it shares many of the characteristics of real protein folding energy landscapes (Bryngelson and Wolynes, 1987, 1989; Chen and Dill, 1998; Dill and Chan, 1997; Leopold et al., 1992).




View larger version (52K):
[in this window]
[in a new window]
 
FIGURE 3   The potential function F(Phi alpha ) for model problem 2 with k = n = 2.

To extend this test problem to the more general case of m different molecules with native states Phi N(j), j = 1, ... , m, the energy function F(j) for the jth such molecule will be given by Eq. 18 with D = D(j) and A = A(j), j = 1, ... , m. If the A(j) are different, we can usually only make Phi G(j) = Phi N(j) for at most one landscape, where Phi G(j) is the global minimum vector Phi  of Eq. 20 for the jth landscape. In this case, we set the weights wj given in Eq. 4 to wj = 1 for j = 1, ... , m. Then the general problem of determining the vector alpha  that minimizes the average angular error rho (alpha ), over all m landscapes, previously defined by Eqs. 4-6, simply reduces to
<LIM><OP><UP>min</UP></OP><LL>&agr;</LL></LIM> &rgr;(&agr;)=<LIM><OP>∑</OP><LL><UP>j=1</UP></LL><UL><UP>m</UP></UL></LIM> ∥&PHgr;<SUP>(<UP>j</UP>)</SUP><SUB><UP>G</UP></SUB>(&agr;)−&PHgr;<SUP>(<UP>j</UP>)</SUP><SUB><UP>N</UP></SUB>∥<SUP>2</SUP>. (23)
For the specific energy function in Eq. 18, we know that
&PHgr;<SUP>(<UP>j</UP>)</SUP><SUB><UP>G</UP></SUB>(&agr;)=A<SUP>(<UP>j</UP>)</SUP>&agr;. (24)
Therefore, Eq. 23 is the simple least-squares problem for alpha  given by
<LIM><OP><UP>min</UP></OP><LL>&agr;</LL></LIM> <LIM><OP>∑</OP><LL><UP>j=1</UP></LL><UL><UP>m</UP></UL></LIM> ∥A<SUP>(<UP>j</UP>)</SUP>&agr;−&PHgr;<SUP>(<UP>j</UP>)</SUP><SUB><UP>N</UP></SUB>∥<SUP>2</SUP>. (25)
Its solution alpha * is given by the solution to the system of linear equations
<LIM><OP>∑</OP><LL><UP>j=1</UP></LL><UL><UP>m</UP></UL></LIM> <FENCE>A<SUP>(<UP>j</UP>)<SUP><UP>T</UP></SUP></SUP>A<SUP>(<UP>j</UP>)</SUP></FENCE> &agr;=<LIM><OP>∑</OP><LL><UP>j=1</UP></LL><UL><UP>m</UP></UL></LIM> A<SUP>(<UP>j</UP>)<SUP><UP>T</UP></SUP></SUP>&PHgr;<SUP>(<UP>j</UP>)</SUP><SUB><UP>N</UP></SUB>. (26)
Furthermore, the gradient of F with respect to Phi  is given by
∇<SUB>&PHgr;</SUB>F=[1+&mgr;w <UP>sin</UP>(w&PSgr;)]D(&PHgr;−A&agr;). (27)
This gradient is zero at Phi  = Aalpha and at all points where 1 + µw sin(wPsi ) = 0. If the Hessian of F is positive at such a point, it is a local minimum of F.

Because of the special form of F(Phi alpha ), as given by Eq. 18, we know in advance the Hessian and Jacobian matrices that are needed to compute the gradient nabla alpha rho (alpha ). As a result, we were able to use an efficient SQP local minimization directly in the parameter space to minimize rho (alpha ), as given by Eq. 23.

Test problem 2 consists of nine different landscapes, each with a different number of degrees of freedom (ranging from 4 to 12). Therefore, the vector Phi (j), j = 1, ... , 9, for each molecule has a different dimensionality, with Phi (j) is in  Rnj, nj = j + 1, and j = 1, ... , 9. The value of k = 3 is chosen so that there are three parameters to be determined. For each molecule, an (nj × 3) matrix A(j) was generated with linearly independent columns. The native state, denoted by Phi N(j), for each molecule was obtained by choosing a value <A><AC>&agr;</AC><AC>&cjs1171;</AC></A> and then computing Phi (j) = A(j)<A><AC>&agr;</AC><AC>&cjs1171;</AC></A>, j = 1, ... , 9. Each Phi (j) was then randomly perturbed to give a native state Phi N(j). This was done so that no value of alpha  exists for which Phi G(j)(alpha ) = Phi N(j); that is, rho (alpha ) in Eq. 23 cannot be zero. From the solution to Eq. 26, we know a priori the optimal value of alpha * and its corresponding minimum error rho (alpha *) = 2.19. This is necessary for validating that the method can find globally optimal parameters.

A randomly chosen alpha I is in  R3 was used as the starting value for the parameters. This gives the initial error rho (alpha I) = 58.76. Corresponding to this initial choice for alpha , each landscape has an initial global minimum conformation Phi G(j)(alpha I), computed by the CGU global minimization algorithm. This value is, of course, also given directly by A(j)alpha I, but we did not use this information for our tests. The ENPOP algorithm was then applied to improve alpha  so as to simultaneously bring all of the global minima Phi G(j)(alpha ) as close as possible to their corresponding native conformations Phi N(j). ENPOP required five iterations to reduce rho  from rho (alpha I) = 58.76 to its minimum value rho (alpha *) = 2.19. Each iteration gives a reduced value of rho  and corresponding global minimum conformations Phi G(j)(alpha ). The corresponding values of rho (alpha ) obtained by the algorithm at each iteration were 58.76, 11.47, 9.36, 3.56, 2.78, and 2.19. The final value rho (alpha *) = 2.19 is the known minimum possible value of rho (alpha ) for this test problem.



    CONCLUSIONS
TOP
ABSTRACT
INTRODUCTION
ENERGY FUNCTION IMPROVEMENT BY...
THE ENPOP ALGORITHM
GUARANTEEING REDUCTION IN...
TEST PROBLEM 1: 2D...
MODEL PROBLEM 2: BUMPY...
CONCLUSIONS
REFERENCES

We have described a method, called ENPOP, for optimizing the parameters in models used in computational biology, such as in folding and docking, where there is a unique structure at a global minimum of energy. ENPOP iteratively refines the parameters by enforcing the requirement that the energy minimum that represents the best predicted structures move systematically closer to the true native structures. We validate the method on two very different test problems. One is a two-dimensional short-chain protein folding problem. The other involves bumpy energy landscapes for which the optimal parameters are known in advance, to check that the method can identify the unique globally optimal parameters. While these test problems are relatively simple, they show that the ENPOP method is computationally efficient and can improve and optimize the parameters in models of the type that are commonly used in computational biology.


    ACKNOWLEDGMENTS

The authors thank the San Diego Supercomputer Center for computational resources and the National Institutes of Health (grant GM34993) and the National Science Foundation (grant DBI-9996165) for support for this research.


    FOOTNOTES

Received for publication 9 December 1999 and in final form 7 July 2000.

Address reprint requests to Dr. Ken A. Dill, Department of Pharmaceutical Chemistry, University of California at San Francisco, Suite 415, Laurel Heights Campus, 3333 California St., San Francisco, CA 94118. Tel.: 415-476-9964; Fax: 415-476-1508; E-mail: dill{at}maxwell.ucsf.edu.



    REFERENCES
TOP
ABSTRACT
INTRODUCTION
ENERGY FUNCTION IMPROVEMENT BY...
THE ENPOP ALGORITHM
GUARANTEEING REDUCTION IN...
TEST PROBLEM 1: 2D...
MODEL PROBLEM 2: BUMPY...
CONCLUSIONS
REFERENCES