help button home button Biophys. J.
HOME HELP FEEDBACK SUBSCRIPTIONS ARCHIVE SEARCH TABLE OF CONTENTS

This Article
Right arrow Abstract Freely available
Right arrow Full Text (PDF)
Right arrow Alert me when this article is cited
Right arrow Alert me if a correction is posted
Services
Right arrow Similar articles in this journal
Right arrow Similar articles in PubMed
Right arrow Alert me to new issues of the journal
Right arrow Download to citation manager
Right arrow reprints & permissions
Citing Articles
Right arrow Citing Articles via HighWire
Right arrow Citing Articles via Google Scholar
Google Scholar
Right arrow Articles by Mehler, E. L.
Right arrow Articles by Guarnieri, F.
Right arrow Search for Related Content
PubMed
Right arrow PubMed Citation
Right arrow Articles by Mehler, E. L.
Right arrow Articles by Guarnieri, F.

Biophys J, July 1999, p. 3-22, Vol. 77, No. 1

A Self-Consistent, Microenvironment Modulated Screened Coulomb Potential Approximation to Calculate pH-Dependent Electrostatic Effects in Proteins

Ernest L. Mehler and Frank Guarnieri

Department of Physiology and Biophysics, Mount Sinai School of Medicine, CUNY, New York, New York 10029

    ABSTRACT
TOP
ABSTRACT
INTRODUCTION
THEORETICAL BASIS AND...
RESULTS AND DISCUSSION
CONCLUSIONS
APPENDIX
REFERENCES

An improved approach is presented for calculating pH-dependent electrostatic effects in proteins using sigmoidally screened Coulomb potentials (SCP). It is hypothesized that a key determinant of seemingly aberrant behavior in pKa shifts is due to the properties of the unique microenvironment around each residue. To help demonstrate this proposal, an approach is developed to characterize the microenvironments using the local hydrophobicity/hydrophilicity around each residue of the protein. The quantitative characterization of the microenvironments shows that the protein is a complex mosaic of differing dielectric regions that provides a physical basis for modifying the dielectric screening functions: in more hydrophobic microenvironments the screening decreases whereas the converse applies to more hydrophilic regions. The approach was applied to seven proteins providing more than 100 measured pKa values and yielded a root mean square deviation of 0.5 between calculated and experimental values. The incorporation of the local hydrophobicity characteristics into the algorithm allowed the resolution of some of the more intractable problems in the calculation of pKa. Thus, the divergent shifts of the pKa of Glu-35 and Asp-66 in hen egg white lysozyme, which are both about 90% buried, was correctly predicted. Mechanistically, the divergence occurs because Glu-35 is in a hydrophobic microenvironment, while Asp-66 is in a hydrophilic microenvironment. Furthermore, because the calculation of the microenvironmental effects takes very little CPU time, the computational speed of the SCP formulation is conserved. Finally, results from different crystal structures of a given protein were compared, and it is shown that the reliability of the calculated pKa values is sufficient to allow identification of conformations that may be more relevant for the solution structure.

    INTRODUCTION
TOP
ABSTRACT
INTRODUCTION
THEORETICAL BASIS AND...
RESULTS AND DISCUSSION
CONCLUSIONS
APPENDIX
REFERENCES

The electrostatic interaction between two charges in a dielectric medium decreases rapidly as the separation between the two entities is increased. This is a result of the dielectric medium's intrinsic ability to shield the electrostatic field arising from the charge. There is compelling experimental and theoretical evidence that the general functional form of this screening is sigmoidal. Early experiments on organic acids (Debye and Pauling, 1925; Schwarzenbach, 1936; Webb, 1926) were collected together and presented as mean values in tabular form (Conway et al., 1951) that produced a smooth, sigmoidal screening as a function of separation with an asymptotic value of about 80. Experiments on water as a dielectric medium indicate that the electrostatic field, at loci only two solvation layers (<= 6 Å) away from a charge, is dramatically damped, almost approaching the bulk dielectric screening of epsilon  = 80 (Harvey and Hoekstra, 1972; Pennock and Schwan, 1969; Takashima and Schwan, 1965). When two charges are in close proximity, however, the dielectric screening rapidly diminishes approaching the vacuum value because there is no intervening dielectric. These conditions are simultaneously satisfied with a radially dependent sigmoidal screening function. Additionally, it has been demonstrated that the generalized Born-surface area continuum solvation model also gives rise to sigmoidal screening (Mehler, 1996b). Thus, the general sigmoidal screening form applies to both high and low dielectric media. Theoretically, the Lorentz-Debye-Sack (LDS) theory of polar solvation also leads to sigmoidally screened electrostatic fields (Debye, 1929; Lorentz, 1952; Sack, 1926; Sack, 1927). LDS theory has been used to develop radial dielectric permittivity profiles to describe ion and dipole field sources (Ehrenson, 1989) and to derive an expression for the hydration energies of spherical ions (Bucher and Porter, 1986) from the integral form of the Born Equation (Born, 1920). Subsequently, Ehrenson (1989) showed that by using Bötcher's analytic expression (Böttcher, 1938) of Onsager's theory (Onsager, 1936), the reaction field corrected LDS theory for both dipolar and asymmetric ion field sources also yielded sigmoidal screening. (For a review of LDS theory and its relation to protein electrostatics, see Mehler, 1996a.) In various applications (Collura et al., 1994; Mehler, 1996a; Mehler and Solmajer, 1991), it has been shown that sigmoidal screening leads to the reliable prediction of properties dependent on electrostatic effects in proteins, and therefore is an attractive formulation for the development of implicit solvent models (Guarnieri et al., 1998; Hassan et al., 1999).

Recently, a self-consistent approach based on using sigmoidally screened Coulomb potentials (SCP) for calculating pH-dependent properties in proteins has been derived and applied to several systems (Mehler, 1996b). The results indicated that the reliability of the method is similar to that achieved by the finite difference Poisson-Boltzmann (FDPB), but requires substantially less computing effort. Interestingly, although these various methods predict a majority of the pKa shifts correctly, they were consistent in the occasional dramatic errors on singularly important residues. The divergent pKa shifts of the two titratable acids of hen egg white lysozyme, Glu-35 and Asp-66, are particularly problematic. The divergent pKa shifts cannot be simply explained by the buried fraction (Demchuk and Wade, 1996), since the two titratable groups are nearly completely buried. Thus, it appears more likely that the unique local environment around each residue is a key determinant for the pKa of Glu-35 to be two pK units higher than its water value, whereas in Asp-66 the pKa is more than two units lower. Ponnuswamy et al. (1980), showed that at a detailed level, the local environmental hydrophobicity that surrounds each residue is distinct and can be quite different from all other residues, implying that a different screening would be required for each site (for PB calculations, each site would require its own internal permittivity value) as has been pointed out by Warshel (King et al., 1991).

In this work we hypothesize that a key source of single residue aberrations in pKa is a result of the special protein microenvironment that exists around the singular residue. To test this hypothesis, a strategy has been developed to quantitatively characterize the hydrophilicity or hydrophobicity of the microenvironment around each titratable group, and then use this property to modulate the components of the electrostatic free energy operative in proteins. The advantage of this approach is that by using physicochemical properties independent of the parameters required to calculate the electrostatic effects, one might gain additional insight into the relation between protein structure and the forces that control its function. Accounting for every local microenvironment in the entire protein structure could become prohibitively costly. Therefore, to preserve the computational speed of the SCP, the quantitative calculation of the microenvironments has been implemented in a way that requires virtually no additional CPU time.

In the next section of this paper the method for quantifying the microenvironments is presented, and the dielectric screening function used previously (Mehler, 1996a; Mehler and Eichele, 1984) is reformulated to account for the local variation in the dielectric properties of proteins. The algorithm is applied to calculating the pKa in bovine pancreatic trypsin inhibitor (BPTI), triclinic hen egg white lysozyme (HEWL), ribonuclease A (RnaseA), ribonuclease T1 (RnaseT), ribonuclease HI (RnaseHI), calbindinD9k (Cab) and the B1 immunoglobulin G-binding domain of protein G (ProtG). These seven proteins provide a data set of 103 measured pKa values that have been used to develop the parametrization of the algorithm. Subsequently, the method is tested on turkey ovomucoid inhibitor domain 3, HIV protease, and other crystal forms of some of the proteins in the test set. Finally, the results obtained here are compared with other calculations and potential improvements for the quantification of the microenvironments are considered.

    THEORETICAL BASIS AND CHARACTERIZATION OF MICROENVIRONMENTS
TOP
ABSTRACT
INTRODUCTION
THEORETICAL BASIS AND...
RESULTS AND DISCUSSION
CONCLUSIONS
APPENDIX
REFERENCES

Calculation of the electrostatic free energy

The calculation of the pKa of titratable groups in proteins is greatly simplified by considering the thermodynamic cycle (Bashford and Karplus, 1990; Bashford and Karplus, 1991; Warshel, 1981)
<AR><R><C></C><C></C><C>2.303<UP>RTpK</UP><SUP><UP>A</UP></SUP><SUB><UP>a</UP></SUB>(<UP>s</UP>)</C></R><R><C></C><C><UP>AH</UP>(<UP>s</UP>)</C><C>⇐⇒</C><C><UP>A</UP><SUP><UP>−</UP></SUP>(<UP>s</UP>)+<UP>H</UP><SUP><UP>+</UP></SUP>(<UP>s</UP>)</C></R><R><C>&Dgr;&mgr;<SUB><UP>s,p</UP></SUB>(<UP>AH</UP>)</C><C><AR><R><C>⇑</C></R><R><C>⇓</C></R></AR>
</C><C></C><C><AR><R><C>⇑</C></R><R><C>⇓</C></R></AR>
</C><C>&Dgr;&mgr;<SUB><UP>s,p</UP></SUB>(<UP>A</UP><SUP><UP>−</UP></SUP>)</C></R><R><C></C><C><UP>AH</UP>(<UP>p</UP>)</C><C>⇐⇒</C><C><UP>A</UP><SUP><UP>−</UP></SUP>(<UP>p</UP>)+<UP>H</UP><SUP><UP>+</UP></SUP>(<UP>s</UP>)</C></R><R><C></C><C></C><C>2.303<UP>RTpK</UP><SUP><UP>A</UP></SUP><SUB><UP>a</UP></SUB>(<UP>p</UP>)</C></R></AR> (1)
where (p) and (s) refer to protein or solvent, respectively. Thus, the pKa of group A in the protein can be calculated from its pKa in model solvent (water) and the additional changes in free energy that arise when the titratable group is transferred from water into the protein. Here, only the electrostatic contributions are considered and pKa(p) is obtained from the relation
<UP>pK</UP><SUP><UP>A</UP></SUP><SUB><UP>a</UP></SUB>(<UP>p</UP>)=<UP>pK</UP><SUP><UP>A</UP></SUP><SUB><UP>a</UP></SUB>(<UP>s</UP>)+<FR><NU>w<SUP><UP>int</UP></SUP><SUB><UP>A</UP></SUB>+&Dgr;w<SUP><UP>tr</UP></SUP><SUB><UP>A</UP></SUB></NU><DE>2.303<UP>RT</UP></DE></FR>, (2)
where wAint is the interaction free electrostatic energy of the charged group in the field of all the other groups in the protein, and Delta wAtr is the change in self-energy on transferring the group from water into the protein.

In Mehler (1996b), it was shown that with an SCP of the form
&PHgr;(<UP><B>r</B></UP>)=<LIM><OP>∑</OP><LL><UP>j</UP></LL></LIM> <FR><NU>q<SUB><UP>j</UP></SUB></NU><DE><UP>D</UP>(r<SUB><UP>j</UP></SUB>)r<SUB><UP>j</UP></SUB></DE></FR>, (3)
where qj is the charge at point rj, rj = |r - rj| and D(r) is a screening function, the total electrostatic free energy of a system consisting of M groups could be expressed as the sum of the interaction term and a transfer contribution as
w=<LIM><OP>∑</OP><LL><UP>A=1</UP></LL><UL><UP>M</UP></UL></LIM> w<SUP><UP>int</UP></SUP><SUB><UP>A</UP></SUB>+&Dgr;w<SUP><UP>tr</UP></SUP><SUB><UP>A</UP></SUB>, (4)
where A may refer to subgroups of each amino acid residue and other groups contributing to the total charge of the system. As discussed in the introduction, both experimental results and theoretical considerations indicated that D(r) should have a sigmoidal form (for details, see the Appendix) as originally proposed by Mehler and Eichele (1984).

For the calculation of the pKa, it is not necessary to evaluate the total electrostatic free energy, but only the contribution from each titratable group in the field due to all the other groups in the protein. In most pKa calculations the full titration charge is placed at one or more fixed positions in the protonatable moiety of the N titratable residues, and the equilibrium charge state is calculated from the distribution of the charge over the 2N ionization microstates (Bashford and Karplus, 1991; Beroza et al., 1991; Gilson, 1993). In the procedure developed in Mehler (1996b), the assignment of the ionization charge over the atoms of the protonatable moiety was determined variationally to allow it to respond to the protein environment. To reduce computing time, the direct determination of the 2N charge microstates was bypassed by coupling the total variational ionization charge of each group, at the given pH, to the Henderson-Hasselbalch equation. In this paper the method presented in Mehler (1996b) is still used, but has been modified; the details of the modified approach are discussed in the Appendix.

The calculations reported in Mehler (1996b) were based on a single screening function that had been parametrized earlier on the basis of pKa shifts in bifunctional organic acids and bases (Dwds defined in Mehler and Eichele, 1984). Thus, the only parameter introduced in Mehler (1996b) was a scaling of the interaction energy of nearby charges. To improve and generalize the approach followed in Mehler (1996b), it is hypothesized that the screening for groups deeply buried in the protein should be less than for solvent exposed groups. This reflects the commonly accepted view that the protein interior is more hydrophobic than water. To implement this hypothesis, two new screening curves were constructed by redefining D0 and lambda  (see Appendix, Eq. A1) so that, for small separations, one curve would provide more screening than Dwds, used in Mehler (1996b), whereas the other would provide less screening. These two screening functions, D1 and D2, as well as the screening function in water, Dw, are shown in Fig. 1 (D3 will be discussed below) and the values of D0 and lambda  are given in Table I. Although these two screening functions appear to be very similar, for small values of r (see inset, Fig. 1) the energy is quite sensitive to small changes in the screening. Thus, at r = 2 Å, the difference between the two curves for unit charges amounts to nearly 4.5 pK units, while, at r = 3 Å, the difference is 1.4 pK units. In contrast, at r = 8 Å, the difference is only 0.09 pK units and continues to decrease with increasing r. Thus, for small separations between groups, the contributions to the interaction energy are quite sensitive to which screening function is used. This is also the case for the transfer energy, since the Born radii, Ra in Eq. A2, lie between about 1.2 Å and 2.6 Å. Which screening function is used for a given titratable group depends on the fraction of the group's solvent accessible surface area that is buried in the protein, i.e., the buried fraction. When these two screening functions are incorporated into the calculation, there was some improvement in the pKa values, but the pKa of Glu-35 in lysozyme was still too small by more than one pK unit.



View larger version (22K):
[in this window]
[in a new window]
 
FIGURE 1   Screening functions defined in Table 1. Solid line, Dw; short dashed line, D1; long dashed line, D2; and dots, D3. The inset is a detail from 0.5 to 3 Å. (See the Appendix for details).


                              
View this table:
[in this window]
[in a new window]
 
TABLE 1   Parameters for the dielectric screening functions*

Quantification of the microenvironment

Analysis of the local environments around Glu-35 and Asp-66 in lysozyme shows that for the former the nearest neighbors are largely hydrophobic residues, whereas for the latter residue the nearest neighbors are much more hydrophilic. The increased hydrophobicity around Glu-35 has consequences for both components of the electrostatic free energy: the energy transfer term will become more positive (more unfavorable), which can be effected by a decrease in Rp (see Eq. A6). The screening of the Coulomb potential should also decrease, especially for interactions with nearby atoms so that they are larger in magnitude relative to a less hydrophobic region. However, this will be partially offset by the fairly small partial charges on atoms in hydrophobic groups. Since wint is usually negative, whereas wtr will be positive for hydrophobic regions, the resulting shift in pKa will depend on the subtle balance between these two quantities. The transfer of Asp-66 from water to a relatively hydrophilic environment, where Dp(Rp) is larger, will result in a smaller transfer energy (less unfavorable; it could be negative if the local environment is more hydrophilic than water), and the Coulombic screening should be greater. These considerations clearly suggest that the relevant quantity for characterizing the local environment, here called the microenvironment, is the hydrophobicity or hydrophilicity (here we use Hpy to refer to these properties and their values in particular cases) of the residues or their fragments near the given titratable group.

There are a large number of Hpy scales available for evaluating the hydrophobicity of amino acid residues (Cornette et al., 1987). Generally these scales have been used to evaluate the hydrophobicity of a particular residue, or at least its side chain. Here, the hydrophobicity of the particular residue is not required, but instead, because the response of the residue to changes in the local environment is the quantity of interest, it is a quantitative measure of the hydrophobicity/hydrophilicity of the microenvironment that is needed. A hydrophobicity scale based on atoms or small fragments is most appropriate for the algorithm proposed here because it allows the microenvironment to be defined in a simple, distance dependent way. In addition, this type of scale can account for the fact that side chains generally do not have a single Hpy value along their entire length, nor do all the atoms of a side chain necessarily belong to a given microenvironment. A further important advantage of a fragment-based scale is that the presence of coenzymes, prosthetic groups, ligands, or nucleic acids can all be accounted for with a consistent scale. From these considerations, the Rekker Fragmental Hydrophobic Constants (Rekker, 1977; Rekker, 1979) were selected to evaluate the hydrophobicities of the microenvironments in the present calculations.

The Rekker scale was the first fragmental system developed and is based on assigning hydrophobicity parameters and correction factors to small functional fragments to calculate log P values (for details, see Rekker, 1977). This scale has been widely used in the pharmaceutical industry as a means to calculate partition coefficients of potential drug candidates. Here, we use it not to obtain the hydrophobicity/hydrophilicity of a particular residue, but to obtain this quantity for the local protein environment. This is done by (1) identifying fragments of all the residues that are in the neighborhood of the titratable residue, (2) assigning the Rekker coefficients to these fragments, (3) summing them to determine the local hydrophobicity/hydrophilicity of the region around the titratable residue, and (4) assigning an appropriate screening for this environment. Three other fragmental systems have been proposed, of which two are based on functional fragments (Leo et al., 1975; Suzuki and Kudo, 1990) and one is atom based (Ghose and Crippen, 1986). All the fragmental systems that have been proposed to date are calibrated on the basis of partition coefficients (log P). The parameters are therefore completely independent of protein structural information. Thus, their successful use to describe local protein environments will give some indication of the universality of the hydrophobic concept and the transferability of scales developed from different physical concepts. A recent report compared the four methods and showed that the Rekker system was one of the most accurate available (Mannhold et al., 1995).

In this initial application the original values of the hydrophobic fragmental constants and correction factors were used (Rekker, 1977). For each fragment the values were partitioned to the atoms belonging to that fragment, and to simplify the calculations, net charge on a specific group was not accounted for. Then the microenvironment can be defined on the basis of a distance criterion, and only atoms satisfying it are included in the calculation of the total Hpy around the given group. The initial set of values that have been used in the calculations are tabulated in Table 2. Note that the water value was estimated from the fragmental constants of aliphatic OH and Hneg (Rekker, 1977).


                              
View this table:
[in this window]
[in a new window]
 
TABLE 2   Rekker fragmental hydrophobic constants* for amino acid residues

The microenvironment of any group is defined as the region that lies within a sphere of radius 4.25 Å from any (nonhydrogenic) atom belonging to the group. This radius was chosen because it essentially includes the first shell around each atom, but does not extend to atoms beyond this. In addition, this value is in reasonable agreement with the criterion used by Ponnuswamy et al. (1980). In analogy with the formula used to calculate log P from the fragmental hydrophobic constants (Rekker, 1977), the Hpy value for each microenvironment of a group is calculated from the formula
<UP>Hpy<SUB>A</SUB></UP>=<LIM><OP>∑</OP><LL><UP>a</UP></LL><UL>N<SUB><UP>A</UP></SUB></UL></LIM> <LIM><OP>∑</OP><LL><UP>Bb</UP></LL></LIM> d<SUB><UP>b</UP></SUB><UP> RFHC<SUB>b</SUB></UP> (r<SUB><UP>ab</UP></SUB>≤4.25 Å) (<UP>B</UP>≠<UP>A</UP>). (5)
In Eq. 5, RFHCb is the contribution of atom b to the fragment's fragmental hydrophobic constant as given in Table 2, NA is the number of atoms in group A and
d<SUB><UP>b</UP></SUB>=<FENCE><AR><R><C>1</C><C><UP>if atom b has not been counted</UP></C></R><R><C>0</C><C><UP>if atom b has been counted,</UP></C></R></AR></FENCE>
to ensure that each atom in the microenvironment is counted only once. Schematic representations of the microenvironments around Glu-25 and Asp-66 in lysozyme are presented in Fig. 2. These diagrams show that in most cases only some atoms of a given amino acid residue are located within 4.25 Å of one or more atoms of the group in question. Moreover, these arrangements of the atoms in space clearly show how the protein architecture has evolved to create the hydrophobic microenvironment around Glu-35 and the hydrophilic microenvironment around Asp-66.



View larger version (21K):
[in this window]
[in a new window]
 
FIGURE 2   Residues defining the microenvironments of Glu-35 and Asp-66 in HEWL. The number under each residue is its contribution to the total hydrophobicity or hydrophilicity. Atoms within 4.25 Å of any atom of the titratable moiety (shown in bold) are labeled. See text for details.

In addition to the Hpy of the protein microenvironments, the Hpy in the solvent are needed to calculate the gain or loss of hydrophilicity in the solvent to protein transfer process. These were calculated by immersing each titratable residue in an equilibrated water droplet and carrying out 100 ps of dynamics after heating to 300 K. The residues were capped with neutral fragments at the termini. From the interval 70-100 ps, 31 structures were extracted, an Hpy calculated for each structure using the water fragment values given in Table 2, and the final value was obtained from the average of the Hpy values calculated from the 31 structures using Eq. 5. The results for the protonatable residues are given in Table 3 and the atoms belonging to the titratable groups, as defined in the PAR19 parameter set of CHARMM (Brooks et al., 1983), are also shown. It should be noted that in analogy to Rekker's (1977) formula for calculating log P from the hydrophobic fragmental constants, Hpy as defined in Eq. 5 is an extensive quantity. Therefore, in an environment that is uniformly hydrophobic or hydrophilic its magnitude will tend to increase with increasing size of the group. In addition, basic groups will tend to orient the waters with the dipole pointing away from the titration site (oxygen pointing toward the group), which will tend to increase the hydrophilicity of the microenvironment, whereas for the acidic groups the dipole will tend to orient toward the titration site (hydrogens nearer the acidic group) tending to increase the hydrophobicity.


                              
View this table:
[in this window]
[in a new window]
 
TABLE 3   Hydrophilicity of titratable groups in water

Three measures for characterizing the hydrophobicity of the microenvironments are defined: first is HpyA defined in Eq. 5, second is the total hydrophobicity of the microenvironment due to the fraction of the group buried in the protein and the fraction exposed to solvent
<UP>THpy<SUB>A</SUB></UP>=<UP>BF<SUB>A</SUB> Hpy<SUB>A</SUB></UP>+(1−<UP>BF<SUB>A</SUB></UP>)<UP>Hpy</UP><SUP><UP>0</UP></SUP><SUB><UP>A</UP></SUB>, (6)
where HpyA0 is the value of Hpy in the solvent, and finally, a measure of the hydrophobicity of a given group's microenvironment relative to what it would be if the group was totally immersed in water, i.e.,
r<UP>Hpy<SUB>A</SUB></UP>=<UP>THpy<SUB>A</SUB>/Hpy</UP><SUP><UP>0</UP></SUP><SUB><UP>A</UP></SUB>. (7)
These quantities have been calculated for all proteins in the data set, and average values, calculated from the seven proteins in the data set for each type of titratable residue, are given in Table 4. It is of interest that the buried fractions and microenvironments are quite different for the different residue types. The reciprocal of rHpy is an indication of the increase in the environmental hydrophobicity on transferring the titratable group from solvent to its final position in the protein. Thus, for the lysines, this increase is only about 1.5, which is not surprising considering the small value of <OVL>BF</OVL>. Histidine is the second most buried residue, but on the average, its microenvironment appears to be quite hydrophilic, and the increase in hydrophobicity is less than two. The titratable moiety of Tyr is the most buried with the most hydrophobic microenvironments. In an initial more extended study of the microenvironments of all groups in a sample of 13 globular proteins comprising a total of 7132 groups, it was found that <OVL>BF</OVL> and <OVL><IT>r</IT>Hpy</OVL> had values of 0.81 and 0.43, respectively (Mehler, unpublished results). Therefore, the average increase in hydrophobicity is about 2.3, which compares well with the value of 2.6 obtained in an earlier study by Ponnuswamy et al. (1980) using very different methods and assumptions.


                              
View this table:
[in this window]
[in a new window]
 
TABLE 4   Average values of the microenvironment properties for each type of titratable residue

The buried fractions (BF) and microenvironment properties of each titratable moiety in lysozyme are given in Table 5. It is seen that only a few values of Hpy are greater than zero. However, as will be discussed below, it was found that titratable groups with a BF less than 0.7 should be considered as solvent exposed for the pKa calculations. Considering groups with BF > 0.7, only Glu-35 has a positive value of Hpy. The values of THpy or rHpy provide additional insight into the change in the hydrophobicity of the microenvironment when the group is transferred from water to protein. In HEWL, THpy is always negative and the highest value is -0.59 for Glu-35; the next largest value is -1.99 for Tyr-53. Comparison of the microenvironment of Glu-35 with the average values and standard deviations for Glu (Table 4) also clearly exhibits the unusual nature of the local environment around Glu-35.


                              
View this table:
[in this window]
[in a new window]
 
TABLE 5   Buried fractions, screening function assignments, and properties of the microenvironments of HEWL

Considering rHpy, the smaller the value the more hydrophobic the protein microenvironment of the group relative to water. Thus, the value of Glu-35 is the smallest (it implies that its protein microenvironment is around 20 times more hydrophobic than water!) in HEWL, and the next value is five times larger. Comparison of the microenvironments of Glu-35 and Asp-66 shows how the structural characteristics (see Fig. 2) translate into the quantitative description of the differences in hydrophobicity between the two microenvironments provided by the three hydrophobicity parameters. In HEWL, only Asp-52 has an Hpy value slightly more negative than Asp-66. Thus, the protein contribution to the microenvironments of these two residues are the most hydrophilic in HEWL, while that of Glu-35 is the most hydrophobic. It should finally be noted that the values of rHpy can be negative, indicating extreme hydrophobicity, or greater than one, indicating a microenvironment more hydrophilic than water.

Parametrization and the microenvironment

The hydrophobicity measures of the microenvironments in lysozyme (Fig. 2, Table 5) reveal a complex mosaic of different dielectric regions that present determinants of electrostatic properties that are quite independent of the buried fraction. In several important cases a totally buried residue, which usually is considered to exist in a low dielectric region, is shown to actually be found in a quite hydrophilic microenvironment. Thus, transfer of such a residue from water to protein is, in fact, only a small perturbation because the screening in such a microenvironment of the protein will be close to that in water. Nevertheless the microenvironments around the titratable groups of proteins show that they are generally more hydrophobic than in water (compare Tables 3, 4, and 5).

The values of D0 and lambda  used to define D1 and D2 were determined as described above to account for the expected differences in dielectric behavior depending on the extent of burial, but the explicit values were not calibrated against the pKa of the protein data set. To incorporate these two screening functions into the algorithm, a switching value had to be determined, which was accomplished by assigning D1 to all protonatable groups with a BF below a given threshold and D2 for the others. A value of 0.7 was found to give the lowest root mean square deviation (rmsd) between calculated and experimental pKa values in the data set and is used for switching between D1 (BF < 0.7) and D2 (BF >=  0.7) for all subsequent calculations. The calculations carried out in this way do not consider the explicit effects of the microenvironments and therefore are labeled as "menv-free" calculations in the following discussions.

From Fig. 2 and Tables 3, 4, and 5, it is apparent that for individual residues there can be considerable variation around the mean values of the properties characterizing the microenvironments. It seems reasonable, therefore, to make use of this to identify microenvironments that exhibit especially large deviations from the mean, and use the physical implications of these deviant regions to modulate the electrostatic properties of the titratable groups in question. To incorporate this idea into the algorithm, it is necessary to define quantitatively when the properties of a microenvironment deviate enough from the mean values to require modification of the menv-free assignments of the screening functions given above. These parameters were determined in the following way. The microenvironments of groups for which the calculated pKa showed the largest errors were analyzed to rationalize on physical-chemical grounds the sources of the errors and the changes in screening that might lead to improvement. However, only reduction in overall rmsd was used to decide if a particular combination of screening functions was to replace the default assignments depending only on the BF value. This approach was taken because of the observation from the comparison of results using different crystal forms that a particular large error may be due to a conformation that is not relevant for the solution structure (see below) or that the sources of a particular error may be due to one or more inadequacies of the method. For these reasons, and because the results indicate that the quantification of the microenvironments will need further extension and refinement, exhaustive optimization was not carried out. Modification of the default screening was applied to titratable groups with BF >=  0.7, where the hydrophobicity or hydrophilicity of the microenvironments was far from the mean value; the border region with 0.7 <=  BF <=  0.8 was treated differently from groups with BF > 0.8. For groups with very small buried fractions (BF < 0.3) and microenvironments very close to water, a third screening function was introduced (D3 in Table 1 and Fig. 1) that is much closer to Dw than D1 or D2. Table 6 gives the thresholds for which the default conditions are modified for calculating the electrostatic energies, and they are discussed in the following paragraphs.


                              
View this table:
[in this window]
[in a new window]
 
TABLE 6   Buried fraction and microenvironment-dependent assignments of screening functions and scaling factors

The protonatable moiety of Glu-35 in lysozyme is a buried titratable group in a very hydrophobic environment (see Fig. 2 and Table 5). According to Rekker (1977), Table IX, 1, p. 301), amino acid residues for which the fragmental hydrophobic constants sum to a relative hydrophobicity greater than one, are considered lipophilic, and this value has been used here to define an extremely hydrophobic microenvironment. As mentioned above, only a few buried tyrosines are in more hydrophobic microenvironments than Glu-35, but their pKa values have not been measured. The effect of very hydrophobic environments is to reduce screening and to make the transfer of the titratable group from water to the protein more unfavorable. Since the pKa has been measured only for Glu-35, any parameter adjustment is arbitrary, so that in these cases (see Table 6) Delta wtr has been scaled to increase the energy penalty for transferring a charge from solvent to the protein interior.

The average value of Hpy for the microenvironments surrounding the titratable residues in the seven proteins of the data set is about -1.5, but, as noted (see Table 4), the variations around this mean are considerable. For Asp and Glu when BF > 0.7, the microenvironments of a fairly large number of them are considerably more hydrophilic than -1.5. For such cases, the screening D2 may imply a microenvironment that is too hydrophobic so that the large, unfavorable transfer energy results in pKa values that are too high for the acids and too low for the bases. For 14 of the 53 aspartic and glutamic acids in the data set, this was the case, and the overestimation of these pKa values was corrected by using D1 instead of D2 for calculating Delta wtr, because this implies transfer to a more hydrophilic microenvironment. For tyrosine and the bases the number of cases meeting the conditions of burial with BF greater than 0.7 and surrounded by very hydrophilic microenvironments was too small to assess any trends. It was found empirically, however, that using D1 instead of D2 for calculating wint gave a small improvement in the overall rmsd. This result may be fortuitous because it appears to be counterintuitive. At the same time the response of any titratable group to its environment is determined by the balance between wint and Delta wtr, so that, as the differences in properties given in Table 4 suggest, this response may depend on residue type. Finally, the third screening function, D3, is only used when the properties of the microenvironment indicate conditions that are very close to pure solvent.

Computational details

The calculations reported below were carried out using protein coordinates obtained from the protein data bank (Bernstein et al., 1977) as follows: BPTI:4PTI* (Marquart et al., 1983), HEWL:2LZT* (Ramanadham et al., 1981) and 1HEL (Wilson et al., 1992), RnaseA:3RN3* (Howlin et al., 1989), RnaseT:3RNT* (Kostrewa et al., 1989), RnaseHI:2RN2* (Katayanagi et al., 1992) and 1RNH (Yang et al., 1990), Cab:3ICB* (Szebenyi and Moffat, 1986), protG:1PGA* and 1PGB (Gallagher et al., 1994), turkey ovomucoid inhibitor domain 3(OMT3):1PPF (Bode et al., 1986), and hiv protease (1HIV) (Thanki et al., 1992), where the starred coordinate sets were used in the optimization procedure. The preparation of the coordinates was carried out in the same way as in I, and the values of the model pKa used here are N-term, 7.5; C-term, 3.8; His, 6.3; Glu, 4.4; Asp, 4.0; Tyr, 10.0; Lys, 10.4; and Arg, 12.0. The partial charges and group structures of the amino acid residues are those defined in the PAR19 topology file of CHARMM (Brooks et al., 1983), whereas the partial charges of the neutral forms of the titratable groups were taken from the PAR22 (MacKerell et al., 1992; MacKerell et al., 1995) topology files. The (BF)a of Eq. A6 were evaluated from the solvent-accessible surface areas calculated with CHARMM using a probe radius of 1.0 Å, and the group average (defined by CHARMM) was used to scale all the Delta watr for the given group (see Table 5 for the HEWL values). The Born radii were determined in the same way as described in the appendix of Mehler (1996b) except that atomic values were calculated, and the value obtained for Dw was used in both self energy terms in Eq. A6. For charges closer than 1.9 Å, wint was scaled by a factor of 0.35. This scaling of nearby charges is similar to that used in Mehler (1996b) and compensates for the breakdown of the point charge model for nearby charges. However, due to the modified variational procedure (see Appendix) the cutoff distance could be reduced from 2.8 Å in Mehler (1996b) to the present value. Therefore, the scaling is essentially only used where uncompensated steric clashes leave two charges too close together. In addition, ionic strength is taken into account using a Debye screening as in Mehler (1996b). Finally, the polar protons of histidine are treated symmetrically in assigning the charge of the neutral species.

    RESULTS AND DISCUSSION
TOP
ABSTRACT
INTRODUCTION
THEORETICAL BASIS AND...
RESULTS AND DISCUSSION
CONCLUSIONS
APPENDIX
REFERENCES

Calculation of pKa

The calculated pKa for the test suite of seven proteins are presented in Tables 7-13. To evaluate the effect on the pKa of introducing the dependence of the electrostatics on the microenvironment, the pK1/2 calculated with the screening functions D1 and D2 only, i.e., menv-free are also tabulated. Calculations were carried out in steps of 0.25 pH units and the pK1/2 of each titratable group was estimated by linear interpolation between the two pH values where the fraction of charged species crossed the 0.5 value. The ionic strength was adjusted to be as close as possible to the experimental conditions and is 0.1 except where stated otherwise. For most of the proteins in the data set the experimental pKa values to be used for calculating rmsd are clear from the quoted reference, but in some cases more than one choice is possible, and these will be noted as required for clarity. Finally, the Null model introduced by Antosiewicz et al., (1994) assumes that the pKa values in the protein are the same as in the model (solvent), and it was noted that the rmsd for this model is usually quite small, since most of the pKa values do not shift much from their model values. Because of this the rmsd resulting from the calculations provide only a partial measure of reliability. Of equal or greater importance are the number and size of the large errors, and these will also be considered in assessing the reliability of the approach.


                              
View this table:
[in this window]
[in a new window]
 
TABLE 7   pKa values for BPTI

Bovine pancreatic trypsin inhibitor

The titratable groups in BPTI are mostly solvent exposed, thus the pKa values are not shifted far from the null values (see Table 7). For only three residues (Arg-20, Tyr-23, and Tyr-35) are the BFs in the protein greater than 0.7. Tyr-23 is most noteworthy because the microenvironment is very hydrophilic (Hpy = -5.4), and D1 was used to screen the SCP (see Table 6) causing an increase of about 0.5 pK units in pK1/2, which reduces the error to about <FR><NU>1</NU><DE>3</DE></FR> of its menv-free value. The rmsd for BPTI is quite small in agreement with other calculations (Antosiewicz et al., 1994; Demchuk and Wade, 1996; Juffer et al., 1997).

Calbindin D9k

In Cab 9 of the 10 lysine residues are exposed to solvent, so that the microenvironments are less significant in modulating their electrostatic effects. The results given in Table 8 show that the calculated pKa values of all 10 lysines are in good agreement with experiment, and the rmsd is slightly smaller than that calculated from the menv-free conditions. Inspection shows that small shifts in several p/Ca values lead to the decrease in rmsd with the largest shift of 0.2 pK units exhibited by Lys-12.


                              
View this table:
[in this window]
[in a new window]
 
TABLE 8   pKa values for CabD9K

The authors of the NMR measurements (Kesvatera et al., 1996) also used a Monte Carlo method (Svensson et al., 1990; Svensson et al., 1991) to calculate the pKa of the lysine residues. However, the resulting rmsd of 0.86 is larger than the null model value of 0.74. Juffer et al. (1997) calculated the pKa values using a boundary element method to solve the Poisson-Boltzmann equation. In agreement with other studies (Antosiewicz et al., 1996) they found that with a low internal dielectric constant (epsilon i) the results deviated substantially from the observed values. With an epsilon i = 20, agreement was much better (rmsd = 0.51). Using the value of the solvent's dielectric constant for epsilon i gave an rmsd of 0.36 for both zero and 0.1 ionic strength. Except for Lys-55, the other lysine residues are exposed to solvent so that the results of Juffer et al. (1997), that high values of epsilon i yield better results, is not unexpected. It is noteworthy that the pKa values of Lys-25 and Lys-55 are shifted upward one or more pKunits. The BF of these two residues is about 0.5 and 0.8, respectively, so that the upward shift in pKa shows that the interresidue interactions can be as important as the transfer terms in accounting for large changes in pKa from the Null model values.

Hen egg lysozyme

The main problem with HEWL is proper calculation of the pKa of Glu-35, and, at the same time, to obtain reasonable values for the pKa of the other titratable residues. The source of the difficulty is the high hydrophobicity of the microenvironment of Glu-35 as discussed above. Quantification of the hydrophobic characteristics of the microenvironments reveals why solution of the PB equation requires a low epsilon i to give a reasonable result for the pKa of Glu-35, but for many other pKa such a low value causes large deviations from the experimental results. With a larger value of epsilon i, this trend is reversed (Antosiewicz et al., 1994).

The pKa values for HEWL are given in Table 9, and from the rmsd values it is seen that introduction of the microenvironments leads to substantial improvement. A large portion of this improvement is due to Glu-35, but it is seen that the errors in several other pKa values are also reduced, e.g., Tyr-53, Lys-96, and His-15. Other residues show smaller improvements, and a few pKa values show slightly larger errors. Glu-7, which deviates by more than one pK unit from the experimental value, is only 27% buried and not influenced by the microenvironment. This problem is further discussed in the next section.


                              
View this table:
[in this window]
[in a new window]
 
TABLE 9   pKa values* of hen egg lysozyme

Demchuk and Wade (1996) attempted a local characterization for identifying a subset of residues described by a high internal dielectric constant (HD) and a set to be described by a low value (LD). For HEWL, their overall rmsd was 0.63 as compared to 0.49 obtained here. The HD subset (see Demchuk and Wade, 1996) gave an rmsd of only 0.37, while here it was 0.50.

The titration curves of Glu-35 obtained from the menv-free calculation and with inclusion of the microenvironments are shown in Fig. 3. Both curves titrate over a fairly extended pH range. The net effect of the hydrophobic microenvironment (scaling of wtr by 1.75) is to decrease the slope of the titration curve thereby shifting pK1/2 to a larger value and further extending the titration range, so that between a pH of 4 and 6 the equilibrium charged fraction changes from about 0.25 to 0.55. This extended titration range of Glu-35 may have functional significance because it ensures that an ample concentration of lysozyme with protonated Glu-35 is available over a fairly large pH range around the optimal value (Fukamizo et al., 1983). The dependence of the interaction energy and transfer energy on pH is presented in Fig. 4. The effect of scaling is clearly seen in wtr for pH values greater than about eight. In this region, wtr continues to increase, only starting to level off at very high pH. In contrast, the transfer energy from the menv-free calculation levels off around a pH of 10. The behavior of the energy components in the important pH range 4 to 6 is surprising. Here wtr and w0tr are nearly identical, in spite of the scaling factor, but wint is less negative than w0int. From Eq. A6, wtr (q = 1) has a value of 6.8 Kcal/mol and scaling by 1.75 increases this to 12.0 Kcal/mol. At a pH of 6 the menv-free calculation gives q35 = -0.673, yielding 3.1 Kcal/mole for the transfer free energy. Interestingly, the equilibrium value of the scaled Delta wtr is 3.2 Kcal/mol, which is achieved by reducing the menv-free value of q35 to -0.514. In this case the variational procedure lowers the net charge to minimize the unfavorable transfer contribution with the consequence that the pK1/2 value is increased. The effect on the interaction energy is a small increase of 0.3 Kcal/mol (see Fig. 4).



View larger version (16K):
[in this window]
[in a new window]
 
FIGURE 3   Glu-35 of HEWL titration curve. Closed circles, menv-free calculation; open circles, includes microenvironment.



View larger version (19K):
[in this window]
[in a new window]
 
FIGURE 4   Energy components of Glu-35 as a function of pH. Closed symbols, transfer energy; open symbols, interaction energy; diamonds, menv-free calculation; circles, complete calculation.

Immunoglobulin G-binding domain B1 of Protein G

The pKa values of this small protein (56 amino acid residues) were recently reported by Khare et al. (1997), and the calculated results are reported in Table 10. The microenvironments have very little effect on the calculated pKa because only 3 of the 21 titratable residues are buried (Tyr-3, Tyr-45, and Glu-56). Hpy of Glu-56 is 0.93 and rHpy is 0.10, so that the microenvironment of this residue is quite hydrophobic, but not as hydrophobic as for Glu-35 of HEWL. Comparison with the calculated pK1/2 given by Khare et al. (1997) shows reasonably good agreement in the overall trends of the deviations.


                              
View this table:
[in this window]
[in a new window]
 
TABLE 10   pKa values* of the B1 immunoglobulin G-binding domain of protein G

Ribonuclease A and Ribonuclease T1

Results for RnaseA are given in Table 11. Introduction of the microenvironments leads to overall improvement in the calculated pKa. His-48 shows the biggest improvement when the environmental effects are incorporated while the error in the pKa of Glu-2 also decreases substantially. The overall rmsd obtained here was 0.55, whereas Demchuk and Wade (1996) obtained 0.63. For the subset labeled HD, they obtained an rmsd of 0.44, whereas here, an rmsd of 0.60 was obtained. These results from HEWL and RnaseA suggest that the HD selection criterion proposed by Demchuk and Wade (1996) can successfully select a subset of titratable residues with high epsilon i, but that a single smaller epsilon i for the low dielectric residues is not sufficient for discriminating the differences in the microenvironment of these residues.


                              
View this table:
[in this window]
[in a new window]
 
TABLE 11   pKa values of RnaseA

The results of the pK calculations for RnaseT1 are presented in Table 12. The titratable moiety of His-92 exists in a fairly hydrophobic microenvironment, which shifts the pKa by about 0.6 pK units, reducing the error by about half.


                              
View this table:
[in this window]
[in a new window]
 
TABLE 12   pKa values of RnaseT1

Ribonuclease HI

As shown in Table 13, incorporation of the microenvironment improves the pKa values of Glu-32 and Asp-148. Glu-32 is about 71% buried, but in a fairly hydrophilic environment, with Hpy and rHpy having values of -3.45 and 0.48, respectively. Thus, the menv-free screening of D2 for the transfer term is changed to D1, but D2 is still used for the interaction term. The effect of this change in screening is to decrease both transfer and interaction energies leading to a decrease in pK1/2. Similarly, Asp-148 is 97% buried, but with an Hpy of -4.4 is still in a hydrophilic environment, so that the screening used to calculate wtr is set to D1. The effect of the increased screening is a reduction in wtr leading to a decrease of 1 pK unit in the calculated pK1/2. Nevertheless, the remaining error for this residue, as well as Asp-102, is still greater than 1 pK unit. Oda et al. (1994) pointed out that these two residues did not titrate in the pH range of 2 through 8. Both residues form ionic H-bonds with basic residues (Katayanagi et al., 1992). Oda et al. (1994) have also calculated the pKa values of the acidic groups using the FDPB method (Klapper et al., 1986; Warwicker and Watson, 1982). Their rmsd of about 1.6 was obtained with an internal dielectric constant of 10; their calculation would probably have been considerably improved if they had used an epsilon i of 20 (Antosiewicz et al., 1996).


                              
View this table:
[in this window]
[in a new window]
 
TABLE 13   pKa values* for RnaseHI

The calculated pKa of His-124 is too low by more than 1 pK unit. However, as Oda et al. (1994) note, comparison of the 2RN2 (Katayanagi et al., 1992) crystal structure (used in the present calculations) and the 1RNH (Yang et al., 1990) crystal structure shows a very large conformational change for His-124 in the two structures. In 2RN2, His-124 is far away from both catalytic residues, Asp-10 and Asp-70, but close to a Lys from a neighboring molecule (Oda et al., 1993), which would decrease the pK1/2 of His-124 as found here. In contrast, in 1RNH His-124 is close to Asp-10 and Asp-70, and in this configuration His-124 would be in an acidic environment that would increase its pKa as observed; this point will be discussed in a following section.

Turkey ovomucoid third domain and HIV protease

As a test of the microenvironment selection criteria developed using the above proteins, the pKa values of the 15 titratable groups in OMT3 have been calculated. The experimental titrations (Schaller and Robertson, 1995) were carried out at a nominal ionic strength of 10 mM, but the authors point out that, at the end of the titration, the ionic strength is closer to 50 mM. pKa calculations were done at ionic strengths of 10 mM, 25 mM, and 50 mM. The smallest rmsd was obtained at 25 mM and these results are reported in Table 14. Results calculated at 1 M ionic strength are also included since Schaller and Robertson (1995) also measured the pKa values at this ionic strength.


                              
View this table:
[in this window]
[in a new window]
 
TABLE 14   pKa values of turkey ovomucoid third domain

The remarkable aspect of this protein is that, in spite of its small size, the pKa values of four of the carboxy groups are shifted downward by one or more pK units, although all the acidic groups except Asp-27 are solvent exposed. The source of these large pK shifts is attributed to H-bond formation and other electrostatic interactions (Swint-Kruse and Roberson, 1995). From Table 14 it is seen that the overall rmsd agrees well with the values obtained for the other proteins. Only Asp-27 is in error by a larger value, which is in agreement with the trend found by Antosiewicz et al. (1996). At 1 M ionic strength the rmsd is somewhat greater, but the overall trend to larger pK1/2 values as observed from the measurements is preserved. The larger rmsd may be the result of using a simple Debye screening to account for ionic strength. This approximation is probably reasonable in the range up to 200 mM, but for 1 M probably is no longer valid.

As a second test the pKa values of HIV protease have been calculated, but only the results for the Aspartyl dyad are discussed. The experimental data show a range for both pKa1 (4.9-6.8) and pKa2 (3.1-3.7) (Hyland et al., 1991; Ido et al., 1991). The calculation was carried out at zero ionic strength and yielded the values 5.45 and 3.32 for pKa1 (Asp-25A) and pKa2 (Asp-25B), respectively. Thus the calculated values appear to fall within the experimentally determined range, and they are also in reasonable agreement with the recent calculations reported by Luo et al. (1998). The fraction surface area buried in the protein for Asp-25A is 0.88, while for Asp-25B it is 0.85. Thus, as in lysozyme, the separation of the two pKa values cannot be due to differences in the degree of burial. However, because of small differences in conformation of the two monomers, the microenvironments of the two groups are different with values for Hpy of -1.72 (Asp-25A) and -3.97 (Asp-25B). With these values, wint and Delta wtr of both groups are calculated with D2 (see Table 6). Nevertheless, the lower Hpy value for Asp-25B suggests additional interactions with polar groups leading to enhanced stabilization of the deprotonated form. Finally, in the 1HIV structure, Asp-25B is able to form a better H-bond with Gly-27B [R(OD1 - N) = 2.81 Å] than is the case for Asp-25A [R(OD1 - N) = 3.23 Å].

Assessment of the results

Table 15 summarizes the reliability of the SCP approach using the description of microenvironments developed in this paper. Incorporation of the modulating effects of the titration site microenvironments on the electrostatic free energy leads to an improvement of about 15% in the rmsd relative to the menv-free calculation for the entire data set of 103 measured pKa values. It should be realized, however, that with two different screening functions some accounting of environment, based on the buried fraction only, has already been included in the menv-free calculations. Therefore, the rmsd is smaller than that obtained by Antosiewics et al. (1994) from their FDPB calculations, but closer to the overall rmsd obtained by Demchuk and Wade (1996). More significant is the decrease in the larger errors due to inclusion of the environmental effects: at the error threshold of 0.5 pK unit, the decrease is 17% and increases to 33% at a threshold of 1.0 pK unit. A scatter plot of the pKa values from both menv-free and microenvironment calculations is given in Fig. 5. The regression line has a slope of 0.98 and an intercept of 0.21. The correlation coefficient was 0.99. The scatter plot shows that accounting for the microenvironment induces shifts in the points so that they lie closer to the ideal line pKcalc = pKexp.


                              
View this table:
[in this window]
[in a new window]
 
TABLE 15   Summary of results



View larger version (18K):
[in this window]
[in a new window]
 
FIGURE 5   Scatter plot of calculated versus experimental pKa values. Open triangles, menv-free results; closed circles, complete calculation.

An error analysis according to residue type is given in Table 16. With the present data set the largest errors were found in the pKa values of Asp and Glu, with the bases having smaller errors; for Tyr and the termini not enough data is available to be statistically meaningful. The error trends from the present calculations are somewhat different from the trends noted in Antosiewicz et al. (1996), where the rmsd from all the residues were about the same.


                              
View this table:
[in this window]
[in a new window]
 
TABLE 16   Statistical summary by residue

The mean error of Asp shows that the calculated pKa are too large. This trend is clearly seen in Fig. 6, where the experimental (Tanford and Roxby, 1972) and calculated titration curves of lysozyme have been plotted. The overestimation of the acidic pKa values results in the slope of the calculated curve being too steep in the pH region 2-4. In the pH interval 4-6, the titration curve is primarily controlled by His-15 and Glu-35 that both titrate in this interval. The pK1/2 points of the calculated titration curves of these two residues (see inset, Fig. 6) are shifted relative to the measured values (Bartik et al., 1994) such that their equilibrium charges approximately cancel in the pH interval 4.5-5.5 and lead to a flatter slope in this portion of the calculated titration curve. Inspection of the slope of the His-15 experimental titration curve suggests that it is steeper than the calculated slope (Bartik, K., C. Redfield, and C. M. Dobson, private communication). Finally, from a pH of 6 to 11, the experimental and calculated curves are in good agreement.



View larger version (21K):
[in this window]
[in a new window]
 
FIGURE 6   Calculated (open circles) and experimental (closed circles) titration curves of lysozyme. Inset shows calculated titration curves of His-15 and Glu-35.

Limitations of pKa calculations using a single structure

In their analysis of pKa values calculated with the FDPB algorithm, Antosiewicz et al. (1996) included average values obtained from NMR structures, and they noted a trend toward improvements in the results. The use of simulation to incorporate protein flexibility (relaxation effects) into the calculation of protein electrostatics has been addressed by Warshel and collaborators using linear response theory (Sham et al., 1997; Sham et al., 1998). Other approaches have been reported (Alexov and Gunner, 1997; Beroza and Case, 1996), and these authors also found some improvements in the calculated values. However, at present not enough systems have been tested to come to any general conclusions. Variability in the conformation of side chains was already pointed out by Bashford and Karplus (1990) and others (Yang et al., 1993; You and Bashford, 1995) in pK calculations using the trigonal and tetragonal crystal forms of HEWL. These differences are already reflected in the buried fractions of the titratable residues and in their microenvironments, which can show variations that, in some cases, are large enough to change the screening assignments for the calculation of the electrostatic free energy components.

Oda et al. (1993)