help button home button Biophys. J.
HOME HELP FEEDBACK SUBSCRIPTIONS ARCHIVE SEARCH TABLE OF CONTENTS

Originally published as Biophys J. BioFAST on April 8, 2005.
doi:10.1529/biophysj.104.055996
This Article
Right arrow Abstract Freely available
Right arrow Full Text (PDF)
Right arrow Supplemental File
Right arrow All Versions of this Article:
biophysj.104.055996v1
88/6/3888    most recent
Right arrow Alert me when this article is cited
Right arrow Alert me if a correction is posted
Services
Right arrow Similar articles in this journal
Right arrow Similar articles in PubMed
Right arrow Alert me to new issues of the journal
Right arrow Download to citation manager
Right arrow reprints & permissions
Citing Articles
Right arrow Citing Articles via HighWire
Right arrow Citing Articles via Google Scholar
Google Scholar
Right arrow Articles by Archontis, G.
Right arrow Articles by Simonson, T.
Right arrow Search for Related Content
PubMed
Right arrow PubMed Citation
Right arrow Articles by Archontis, G.
Right arrow Articles by Simonson, T.
Biophysical Journal 88:3888-3904 (2005)
© 2005 The Biophysical Society

Proton Binding to Proteins: A Free-Energy Component Analysis Using a Dielectric Continuum Model

Georgios Archontis * and Thomas Simonson {dagger}

* Department of Physics, University of Cyprus, PO20537, CY1678, Nicosia, Cyprus; and {dagger} Laboratoire de Biochimie (UMR7654 du CNRS), Department of Biology, Ecole Polytechnique, 91128 Palaiseau, France

Correspondence: Address reprint requests to Georgios Archontis, E-mail: archonti{at}ucy.ac.cy; or Thomas Simonson, E-mail: thomas.simonson{at}polytechnique.fr.


    ABSTRACT
 TOP
 ABSTRACT
 INTRODUCTION
 THEORY
 NUMERICAL METHODS
 RESULTS
 DISCUSSION
 CONCLUSION
 SUPPLEMENTARY MATERIALS
 ACKNOWLEDGEMENTS
 REFERENCES
 
Proton binding plays a critical role in protein structure and function. We report pKa calculations for three aspartates in two proteins, using a linear response approach, as well as a "standard" Poisson-Boltzmann approach. Averaging over conformations from the two endpoints of the proton-binding reaction, the protein's atomic degrees of freedom are explicitly modeled. Treating macroscopically the protein's electronic polarizability and the solvent, a meaningful model is obtained, without adjustable parameters. It reproduces qualitatively the electrostatic potentials, proton-binding free energies, Marcus reorganization free energies, and pKa shifts from explicit solvent molecular dynamics simulations, and the pKa shifts from experiment. For thioredoxin Asp-26, which has a large pKa upshift, we correctly capture the balance between unfavorable carboxylate desolvation and favorable interactions with a nearby lysine; similarly for RNase A Asp-14, which has a large pKa downshift. For the unshifted thioredoxin Asp-20, desolvation by the protein cavity is overestimated by 2.9 pKa units; several effects could explain this. "Standard" Poisson-Boltzmann methods sidestep this problem by using a large, ad hoc protein dielectric; but protein charge-charge interactions are then incorrectly downscaled, giving an unbalanced description of the reaction and a large error for the shifted pKa values of Asp-26 and Asp-14.


    INTRODUCTION
 TOP
 ABSTRACT
 INTRODUCTION
 THEORY
 NUMERICAL METHODS
 RESULTS
 DISCUSSION
 CONCLUSION
 SUPPLEMENTARY MATERIALS
 ACKNOWLEDGEMENTS
 REFERENCES
 
The accurate determination of amino acid pKa values in proteins is of fundamental interest in biophysical chemistry. Knowledge of the pKa values of residues in the active site of an enzyme helps identify potential proton donors and acceptors, and contributes to our understanding of the reaction mechanism (Warshel, 1981Go; Raquet et al., 1997Go; Nielsen and McCammon, 2003Go). The stability of proteins (Yang and Honig, 1993Go; Swietnicki et al., 1997Go; Schaefer and Karplus, 1997Go; van Vlijmen et al., 1998Go; Morikis et al., 2001Go) and protein-ligand complexes (Mackerell et al., 1995Go) depends on the ionization state of titratable residues. pKa and redox potential shifts provide information on the strength of electrostatic interactions in the protein interior (Sternberg et al., 1987Go; Varadarajan et al., 1989Go). Furthermore, theoretical calculations of protein properties often depend strongly on assumptions about the ionization state of titratable groups (Simonson, 2003Go).

The ionization state of titratable amino acids in solution is known from experiment. However, in a folded protein, the pKa values can be shifted with respect to the solution values. The shifts are caused by a combination of factors, including the loss of interactions with the aqueous environment, interactions with the protein's charged and polar groups, and structural reorganization of the protein in response to proton binding (Warshel, 1987Go; Sham et al., 1997Go; Schutz and Warshel, 2001Go; Simonson et al., 1999Go, 2004Go). These factors compete with each other, making it difficult to predict the direction and magnitude of a particular pKa shift.

Theoretical methods to calculate protein pKa values have been the focus of considerable efforts in the past two decades (Warshel, 1981Go, 1987Go; Warshel et al., 1984Go; Warshel and Russell, 1985Go; Bashford and Karplus, 1990Go; Gilson, 1993Go; Yang and Honig, 1993Go; Antosiewicz et al., 1996Go; Beroza and Case, 1996Go; Schaefer and Karplus, 1997Go; Schaefer et al., 1998Go; Sham et al., 1997Go; Mehler and Guarnieri, 1999Go; Georgescu et al., 2002Go; Nielsen and McCammon, 2003Go; Simonson et al., 2004Go; Warwicker, 1999Go, 2004Go). Several models that use very different assumptions (including a "null" model) have had a roughly comparable success, so that the underlying physics remains partly unclear and controversial. A widely used approach treats the protein as a low-dielectric cavity immersed in a high-dielectric solvent, and determines the electrostatic free energy at various ionization states by solving the Poisson-Boltzmann (PB) equation (Warwicker and Watson, 1982Go; Davis and McCammon, 1990Go; Honig and Nicholls, 1995Go). In applications of these models to pKa calculations, it is customary to treat the protein dielectric constant as an adjustable parameter, which accounts for protein structural reorganization upon a change in the ionization state (proton binding or release), as well as for nonelectrostatic effects such as van der Waals interactions, or the change in the protein conformational entropy upon proton binding. The meaning of the protein dielectric constant and many of the approximations involved in continuum electrostatic calculations have been discussed in detail (Fröhlich, 1949Go; Warshel et al., 1984Go; Sham et al., 1997Go, 1998Go; Schutz and Warshel, 2001Go; Simonson, 2003Go; Simonson and Perahia, 1995aGo; Simonson et al., 1999Go; Smith et al., 1993Go; Krishtalik et al., 1997Go; Nakamura, 1996Go). A difficulty is the consistent parameterization of these models. For example, most current implementations use molecular mechanics charge sets, which have been painstakingly parameterized to reproduce equilibrium electrostatic potentials in combination with a protein dielectric constant of 1 (Cornell et al., 1995Go; Jorgensen and Tirado-Rives, 1988Go; Mackerell et al., 1998Go). This can pose a problem for pKa calculations in which protein reorganization plays a large role. The reorganization requires a protein dielectric significantly >1 (Fröhlich, 1949Go; Sham et al., 1997Go; Smith et al., 1993Go; Simonson and Perahia, 1995aGo; Simonson et al., 1999Go), which is then inconsistent with the charge set.

The idea of estimating pKa values by a linear response approximation (LRA), using structures corresponding to the states before and after ionization, was proposed by Warshel (Lee et al., 1992Go, 1993Go; Sham et al., 1997Go). In this work, we analyze several proton-binding reactions with a simple, LRA method, which combines molecular dynamics simulations with continuum electrostatics. Our method is equivalent to the LRA method of Warshel, in that it expresses the protonation free energy as an electrostatic interaction energy between the inserted charge and the permanent and induced charges of the system, averaged over the equilibrium states before and after proton binding (see Eq. 7). The LRA expression (Eq. 7) is derived from a two-step decomposition of the charge insertion free energy, introduced originally by Marcus (1956)Go in the context of electron transfer theory, and applied recently to study dielectric relaxation in the enzyme aspartyl-tRNA synthetase (Simonson et al., 1999Go; Archontis and Simonson, 2001Go). The solvent and the electronic polarization of the protein are treated by a dielectric continuum model (Warwicker and Watson, 1982Go), whereas the atomic reorganization of the protein is described explicitly, by averaging explicitly over conformations that are representative of the two endpoint states (Lee et al., 1992Go, 1993Go; Sham et al., 1997Go; Aqvist et al., 2002Go). We refer to the method used in this work as PB/LRA. Closely related linear response methods for free-energy calculations have been introduced in the past (Del Buono et al., 1994Go; Aqvist et al., 2002Go), and have been used successfully to calculate binding free energies in several receptor-ligand systems (Aqvist, 1991Go; Aqvist et al., 2002Go; Florian et al., 2002Go, 2003Go; Jorgensen, 2004Go). This PB/LRA method was also employed by Eberini et al. for pKa calculations in apo- and holo-ß-lactoglobulin (Eberini et al., 2004Go) (see below).

The optimum protein dielectric constant for continuum pKa calculations depends on the details of the model. In particular, it depends on which microscopic effects are represented explicitly and which are represented implicitly (Fröhlich, 1949Go; Lee et al., 1992Go, 1993Go; Sham et al., 1997Go; Schutz and Warshel, 2001Go; Simonson, 2003Go; Simonson and Perahia, 1995aGo; Eberini et al., 2004Go; Krishtalik et al., 1997Go). By averaging over the two protein states before and after protonation, we account explicitly for the atomic rearrangements in response to proton binding. As a result, we expect the best results to be obtained with a protein dielectric {epsilon}p = 1 or 2, depending on whether electronic polarization is considered explicitly or not (Fröhlich, 1949Go). Despite this expectation, a larger dielectric constant may still be needed. Eberini et al. employed the PB/LRA method and a continuum model to calculate the pKa of a Glu residue in apo- and holo-ß-lactoglobulin (Eberini et al., 2004Go). Even though the protein relaxation was explicitly accounted for by averaging over the end states, it was found that {epsilon}p {approx} 8 was needed to reproduce the experimental pKa. Proton binding to this particular Glu was associated with large conformational rearrangements in the apo form of the enzyme and changes in the solvent exposure of the titratable Glu. The authors speculated that the high dielectric constant might compensate for limited sampling of the protein conformations in the two end states and/or deviations from linear response. In the work presented here, atomic reorganization of the protein is modeled explicitly, whereas electronic polarization is modeled implicitly. The best results are indeed obtained with a protein dielectric {epsilon}p of 1 or 2.

We use the PB/LRA method to compute the pKa shifts of three carboxylates in two proteins. Two of the carboxylates correspond to interior residues, with high pKa shifts; Asp-26 in thioredoxin, with a large pKa of 7.5 (Langstemo et al., 1991Go) and Asp-14 in ribonuclease A, with a low pKa of 2.0 (Forsyth et al., 2002Go). As discussed by Warshel (Schutz and Warshel, 2001Go), such interior residues represent the proper benchmarks to test the accuracy of a pKa calculation. The third carboxylate is the solvent-exposed Asp-20 in thioredoxin, with an unshifted pKa of 4.0 (Forsyth et al., 2002Go). The chosen residues titrate approximately independently of surrounding ionizable groups; i.e., the ionization state of the other groups can be assumed fixed whereas the carboxylate under study binds a proton at a pH close to its pKa (Dillet et al., 1998Go; Simonson et al., 2004Go); this simplifies the analysis.

The results are compared to experiment and to a recent calculation of the same pKa shifts by molecular dynamics free energy (MDFE) simulations in explicit solvent (Simonson et al., 2004Go). Importantly, the comparison to MDFE includes the electrostatic potentials, the reorganization free energies, the protonation free energies, and the pKa shifts. Semiquantitative agreement with MDFE and with the experimental pKa shifts is obtained using a low, physically reasonable value of the protein dielectric constant (1 or 2). In two out of three cases, the method actually yields better agreement with experiment than do the MDFE simulations (Simonson et al., 2004Go), probably because of a superior treatment of electronic polarizability. Agreement with experiment is rather poor for the unshifted Asp-20 case. Nevertheless, in all three cases, the method yields insights into the mechanisms that determine the pKa.

A detailed group decomposition of the free energy is used to identify important contributions to the pKa shifts. We also compute the reorganization, or relaxation free energies in response to ionization (the second step in the Marcus free-energy decomposition) (Marcus, 1956Go; Krishtalik et al., 1997Go; Sharp, 1998Go; Simonson et al., 1999Go; Archontis and Simonson, 2001Go), even though they are not needed to calculate the pKa values. The relaxation free energy and the total free energy are linked (Simonson et al., 1999Go), so that the relaxation data place a constraint on the protein dielectric constant that can be used in a physically meaningful way for the pKa values. This reduces our reliance on fitting to the experimental data and makes the model more predictive. We show that, although the pKa values are best calculated with a protein dielectric of 1 or 2, the protein reorganization free energy is reproduced using a protein dielectric of 2–8. This difference is not surprising, because the present pKa calculations model the protein atomic reorganization explicitly, whereas the reorganization calculations model it implicitly. This result also agrees with our earlier studies (Simonson et al., 1999Go; Archontis and Simonson, 2001Go) and those of Krishtalik et al. (1997)Go.

We also compare our results to a "standard" PB protocol that uses a single protein endpoint structure (Bashford and Karplus, 1990Go; Yang and Honig, 1993Go; Antosiewicz et al., 1996Go; Raquet et al., 1997Go; Warwicker, 1999Go, 2004Go). Because it does not include the protein relaxation explicitly, the "standard" method yields pKa shifts that depend strongly on the structural model assumed for the protein. When a low dielectric value is used for the protein ({epsilon}p = 2–4), the method fails in all three carboxylate calculations. When the protein dielectric is increased to {epsilon}p {approx} 20, the "standard" method yields pKa shifts in reasonable agreement with experiment for two cases, thanks to a fortunate compensation of errors, even though this high dielectric value is shown to be unphysical for these systems.

The following two sections present the theoretical derivation of the PB/LRA method and the details of the numerical calculations. Results are presented next; the last section is a discussion.


    THEORY
 TOP
 ABSTRACT
 INTRODUCTION
 THEORY
 NUMERICAL METHODS
 RESULTS
 DISCUSSION
 CONCLUSION
 SUPPLEMENTARY MATERIALS
 ACKNOWLEDGEMENTS
 REFERENCES
 
Proton binding as a charge insertion process
The quantity of interest is the double free-energy difference {Delta}{Delta}G between the protonation free energies when the side chain is part of the protein, and when the same chemical group is alone in aqueous solution (Warshel, 1981Go, 1987Go; Bashford and Karplus, 1990Go). {Delta}{Delta}G is proportional to the difference between the pKa of the side chain in the protein, pKa,prot, and that of the chemical group in solution, pKa,model:

(1)
where q is the charge of the ionized form of the titratable residue. The proton binding is modeled in this work as the insertion of a set of point charges {{Delta}qi} onto selected aspartate side-chain atoms (Warshel, 1987Go; Bashford and Karplus, 1990Go). The compound employed as a model of the chemical group in solution is an aspartic acid molecule with N-acetyl and N-methylamide blocking groups (Fig. 1 A). Its pKa is known experimentally to be ~4. We denote by "AspH"/"Asp" the protonated/unprotonated states and by "midpoint" the fictitious intermediate state where the Asp carboxylate carries a net charge of –1/2. The partial charges used for each charge state are shown in Fig. 1.



View larger version (22K):
[in this window]
[in a new window]
 
FIGURE 1  (A) Schematic representation of the model compound (2N-acetyl-1N-methyl-aspartic acid 1-amide), showing the partial charges at the ionized (bottom values) and protonated state (top values). Charges unaffected by ionization are not shown. (B) Schematic view of the titrating side chain, surrounded by the protein (hatched) and solvent (shaded) dielectric media. Fragment A corresponds to the inner, white region (see "Decomposing the free energy into two components").

 
In the continuum dielectric framework used here, both the protein and the surrounding aqueous solution will be treated as uniform dielectric media. The titratable side chain, however, is treated as a cavity (Fig. 1 B). This is consistent with our earlier work (Simonson et al., 1999Go; Archontis and Simonson, 2001Go; Hoefinger and Simonson, 2001Go) and with the continuum models commonly used in quantum chemistry (Cramer and Truhlar, 1999Go). It differs from the usual continuum approach, which embeds the side chain directly within the protein dielectric. A side-chain cavity is essential to obtain reasonable values for the protonation free energy (as compared to MDFE for example). However, it has a rather small effect on the double free-energy difference {Delta}{Delta}G between protein and model compound, which is the experimentally relevant quantity. Indeed, the cavity contributions largely cancel when the model compound result is subtracted, so that this treatment gives a similar result for {Delta}{Delta}G to the usual, "embedded side-chain" approach.

Decomposing the free energy into two components
In the context of electron transfer theory, Marcus introduced a decomposition of the charging free energy (Marcus, 1964Go) that is physically illuminating and leads to a practical method for pKa calculation (Sham et al., 1997Go). The transformation of the system from its initial, "reactant" state to its final, "product" state can be decomposed into a "static" and a "relaxation" step (Fig. 2) (Marcus, 1956Go). The free-energy change {Delta}G can be written

(2)
where and are the static and relaxation free energies, and the superscript designates the starting, reactant state. In the case where a single point charge is inserted at a position "0", the static term is given by

(3)
where is the equilibrium electrostatic potential at the insertion site "0" in the reactant state.



View larger version (11K):
[in this window]
[in a new window]
 
FIGURE 2  Schematic representation of the reactant and product free-energy surfaces, showing the Marcus two-step reaction pathway. The curves shown correspond to the Asp-20 free-energy components, obtained with (static) and (relaxation).

 
The reverse transformation can be carried out by inserting the charge –q into the "product state" at the same site. The corresponding free-energy change {Delta}G' can be written

(4)
with

(5)

The continuum model is almost always assumed to be linear (Marcus, 1964Go; Bashford and Karplus, 1990Go; Honig and Nicholls, 1995Go; Roux and Simonson, 1999Go). Under this assumption, the relaxation free energy is the same in the reactant and product states (Sham et al., 1997Go; Simonson et al., 1999Go; Simonson, 2002Go):

(6)

Combining Eqs. 2, 4, and 6, we obtain for the total free-energy change:

(7)

Equation 7 is identical to the linear response approximation method of Warshel (Lee et al., 1992Go, 1993Go), and is the basis of the linear interaction energy method of Aqvist (e.g., Eq. 3 of Aqvist et al., 2002Go). It shows that the total electrostatic free-energy change can be calculated by averaging the interaction energies between the inserted charge and the permanent and induced charges of the reactant and product states. It has been used for pKa calculations in several proteins (Sham et al., 1997Go; Eberini et al., 2004Go), and for binding free-energy calculations in several receptor-ligand systems (Aqvist, 1991Go; Florian et al., 2002Go, 2003Go; Jorgensen, 2004Go). In earlier work, we used it to study dielectric relaxation in response to charge insertion in the active site of the enzyme aspartyl-tRNA synthetase (Simonson et al., 1999Go; Archontis and Simonson, 2001Go).

Under the linear response approximation, the average of the reactant and product static terms in Eq. 7 is exactly equal to the static term of the fictitious "midpoint" state, in which a charge of q/2 has been introduced. Thus, we can write

(8)

Generalizing these results to the insertion of several charges is straightforward (see, e.g., Simonson et al., 1999Go for details). The charges are inserted onto a set of atoms that constitute a chemical fragment "A". Denoting {Delta}qi the charge inserted onto atom i, Eqs. 7 and 8 take the general form

(9)

(10)
where is the equilibrium electrostatic potential at the insertion site i in the state "xxx". In this work, the Asp carboxylate protonation is modeled by the introduction of a total charge q = +1 onto the side-chain carboxylate atoms (Fig. 1).

Equations 9 and 10 express the free-energy change {Delta}G in terms of the equilibrium electrostatic potentials of the reactant, product, or midpoint states at the charge insertion sites. To evaluate these potentials for a particular protein structure, we solve the Poisson equation numerically, using dielectric values and {epsilon}w = 80, respectively, for the protein and the solvent. The potentials are averaged over several hundred protein structures, generated by molecular dynamics (MD) simulations of the protein in a box of water, with the aspartate of interest in the appropriate charge state (reactant, product, or midpoint state; see Numerical Methods).

Importantly, the linear response assumption also leads to a direct relation between the static and relaxation free-energy components (pointed out in a slightly different context by Marcus, 1965Go; Muegge et al., 1997Go):

(11)

By comparing the relaxation free energy calculated directly (see below) and by Eq. 11, we can verify the self-consistency of our analysis (Simonson et al., 1999Go).

Selection of protein dielectric values
The pKa values calculated from Eqs. 9 and 10 depend strongly on the protein dielectric constant, As discussed by Warshel (Lee et al., 1992Go, 1993Go; Sham et al., 1997Go; Schutz and Warshel, 2001Go) and others (Simonson, 2003Go; Simonson and Perahia, 1995aGo; Eberini et al., 2004Go; Krishtalik et al., 1997Go), the choice of depends, in turn, on the nature of the microscopic effects that are explicitly modeled. By averaging over the two protein states, before and after protonation, this PB/LRA method accounts explicitly for the atomic rearrangements in response to proton binding. As a result, we expect the best results will be obtained with a protein dielectric {epsilon}p = 1 or 2. Still, a larger dielectric constant may be needed even in this case (see, e.g., Eberini et al., 2004Go where a PB/LRA method with was used), to account for factors such as limited sampling of the protein end states or deviations from linear response.

In this work, we use four criteria to determine the optimum The first is to reproduce approximately the known experimental pKa values. This is satisfied with The second is to choose a value consistent with the molecular mechanics charge set used here (Mackerell et al., 1998Go). The molecular mechanics charges are designed to reproduce the equilibrium electrostatic potential with a dielectric constant of 1; the contribution of electronic polarization is included implicitly in the partial charges. The equilibrium structures used here were generated from MD simulations using this same charge set and a protein dielectric of 1 (Simonson et al., 2004Go). The implicit description can sometimes underestimate the effects of electronic polarization; see Simonson et al. (2004)Go for an example. Therefore, a value of between 1 and 2 appears plausible. The third criterion is to reproduce, at least qualitatively, the individual protonation free energies obtained by MDFE for the protein and the model compound. This is more stringent than simply reproducing the difference {Delta}{Delta}G between the protein and the model compound. The fourth criterion is a consistency condition that must be obeyed by any linear response model, including a dielectric continuum model. Indeed, the static and relaxation free-energy components are linked directly by Eq. 11, and indirectly by Eqs. 2, 3, and 9. These relations place constraints on the model, and help delimit which values of are physically meaningful. This criterion has been mostly ignored in previous applications. It requires that we consider in more detail the relaxation free-energy component and that we distinguish between two different protein dielectric constants (Krishtalik et al., 1997Go; Simonson et al., 1999Go).

More about the relaxation free energy
If the system obeys linear response, and if the electrostatic potentials are available for either the midpoint state or for both endpoint states, the total free-energy change {Delta}G can be obtained without ever calculating the relaxation terms (they cancel in Eqs. 710). Nevertheless, their calculation helps constrain the values of the dielectric constant that are physically meaningful. They also provide valuable information about the dielectric behavior of the system, because they directly measure its polarizability in response to the charge insertion (Simonson et al., 1991Go, 1999Go; Simonson and Perahia, 1995bGo; Archontis and Simonson, 2001Go; Simonson, 2002Go). Therefore, we also evaluate the relaxation free energies for different ionization states of the Asp side-chain carboxylate, using a range of values for the protein dielectric.

In the dielectric continuum framework, relaxation in response to the introduced charge is modeled by a redistribution of polarization charge at the protein-solvent boundary, determined by the dielectric constants of the solvent, {epsilon}w = 80 and of the protein, For any state "xxx" (reactant or product), one can show (Simonson et al., 1999Go) that the relaxation free energy is identical to the usual Born self-energy of the inserted charges {{Delta}qi}, given by

(12)

Here, is the reaction potential on the site i in state "xxx", due to the charges {{Delta}qi}.

Because the Poisson equation is linear, one might expect that our PB/LRA model would obey linear response by construction, and and would automatically agree. This is not necessarily true, because the free energies depend on the protein structure, which could vary in a nonlinear manner. However, in practice, it is true to a good approximation.


    NUMERICAL METHODS
 TOP
 ABSTRACT
 INTRODUCTION
 THEORY
 NUMERICAL METHODS
 RESULTS
 DISCUSSION
 CONCLUSION
 SUPPLEMENTARY MATERIALS
 ACKNOWLEDGEMENTS
 REFERENCES
 
To obtain the electrostatic potentials appearing in Eqs. 9 and 10, we performed finite-difference Poisson calculations for a large number of equilibrium structures (50–200) for each charge state of the Asp side chain being considered. The structures were taken from molecular dynamics simulations of each charge state, described elsewhere (Simonson et al., 2004Go). The protein and solvent were treated as two homogeneous dielectric media. The reacting side chain was treated as a cavity (of dielectric unity). The protein-solvent and protein-cavity dielectric boundaries were defined by the molecular surface of the protein and the reacting side chain, respectively, constructed using atomic radii from the CHARMM22 force field, with the exception of the hydrogen radii, which were set to 1.0 Å (Mohan et al., 1992Go; Simonson and Brünger, 1994Go). The probe sphere for the surface construction was 2 Å. With this probe radius, the protein has no internal cavities (other than the reacting side-chain cavity). The solvent dielectric was set to 80 (unless otherwise mentioned). The value of the protein dielectric, was varied between 1 and 4. The permanent charges on the protein atoms were taken from the CHARMM22 molecular mechanics force field (Mackerell et al., 1998Go). The finite-difference Poisson equation was solved in two steps. The first step utilized a cubic-grid spacing of 0.8–1.0 Å, and the second, focusing step a spacing of 0.25–0.30 Å. The calculations were done partly with the UHBD program and partly with the MEAD program (Madura et al., 1995Go; Bashford, 1997Go). In particular, MEAD allows calculations with three distinct dielectric media (cavity, protein, solvent).

When calculating the electrostatic potential at a charge insertion site i (Eqs. 9 and 10), a contribution due to preexisting charge at the same site is subtracted; this "self" contribution exactly cancels when the free energy for the model compound is subtracted. For comparison to molecular dynamics free-energy results, we also subtract interactions between site i and protein atoms that are separated from i by one or two covalent bonds (Mackerell et al., 1998Go). This has essentially no effect on the computed pKa shifts, but it allows a direct comparison with the MDFE free-energy derivatives (below).

Unless otherwise mentioned, all the calculations were done assuming zero ionic strength, so that in fact the Poisson equation was solved, rather than the Poisson-Boltzmann equation. Nevertheless, we always refer to the "PB/LRA" method. A subset of Asp-20 calculations was done with the experimental ionic strength (100 mM monovalent salt concentration) (Forsyth et al., 2002Go); this has a very small effect on the free energies.

Calculations with some alternate solvent treatments are reported in Supplementary Material. These include calculations involving 1–3 explicit water molecules, and calculations with different solvent dielectric constants.


    RESULTS
 TOP
 ABSTRACT
 INTRODUCTION
 THEORY
 NUMERICAL METHODS
 RESULTS
 DISCUSSION
 CONCLUSION
 SUPPLEMENTARY MATERIALS
 ACKNOWLEDGEMENTS
 REFERENCES
 
We calculate the protonation free energies, relative to a model aspartate compound in solution, of three aspartate side chains in two different proteins: Asp-26 in thioredoxin, Asp-14 in RNase A, and Asp-20 in thioredoxin. The protonation free energies have also been calculated by MDFE simulations with an explicit solvent representation and two different force fields (CHARMM and AMBER), as well as by MDFE with two different implicit solvent (generalized Born) representations. These calculations were presented in detail elsewhere (Simonson et al., 2004Go). The CHARMM MDFE runs employed the same atomic charges and radii used here.

To validate the method and the interpretation, we compare the {Delta}{Delta}G from PB/LRA both to experiment and to the CHARMM explicit solvent MDFE runs. In addition, we compare the PB/LRA static free energies and protonation free energies to the explicit solvent MDFE runs. We perform two types of component analyses: group decompositions, which reveal the most important stabilizing interactions, and separate calculations of the relaxation free energies, which provide a measure of the protein polarizability and allow a consistency test of the continuum model. Finally, we compare this PB/LRA method to a more "standard" PB protocol that employs structures from a single endpoint of the protonation reaction. We present the results for each aspartate side chain separately. We begin by the model compound, to illustrate the calculation of the static and protonation free energies. Asp-26 is then discussed in detail, and Asp-14 more briefly. Asp-20, with its unshifted pKa, presents unexpected difficulties and is described last.

The model compound
To determine the tendency of each Asp to be protonated, we must first consider the model compound in solution: an aspartic acid molecule with N-acetyl and N-methylamide blocking groups (Fig. 1), whose experimental pKa is 4.0 (Forsyth et al., 2002Go). To calculate the static terms entering in the free-energy expressions, Eqs. 9 and 10, we need the equilibrium electrostatic potentials at the charge insertion sites (i = Cß, C{gamma}, O{delta}1, O{delta}2, H{delta}; Fig. 1). Calculations are performed for three different equilibrium states: xxx = reactant, product, and midpoint. Direct interaction energies between the inserted charges {Delta}qi and prior charges at the same site i are omitted. We also report the corresponding static free energies from the CHARMM MDFE runs. The MDFE values are identical to the free-energy derivatives (with respect to a charging parameter {lambda}) reported in Table 1 of Simonson et al. (2004)Go. The signs correspond to the direction ASP -> ASPH, i.e., the direction of Asp protonation.


View this table:
[in this window]
[in a new window]
 
TABLE 1  Static free energies from PB/LRA and MDFE

 
The static free energies are given in Table 1. With a solute dielectric constant of the static free-energy ranges from +139.6 (charged, reactant state) to –6.0 kcal/mol (neutral, product state). Using the static free-energy values and Eqs. 9 and 10, we can compute the protonation free energy. The results are listed in Table 2, along with the result from MDFE simulations of the same system with the CHARMM force field (Simonson et al., 2004Go). The pathways "2-point" and "1-point", respectively, are described by Eqs. 9 and 10. A compound pathway, represented by "3-point", involves all three states ("reactant", "midpoint", and "product") and is described by Eq. 9. For the model compound, all pathways yield the same value (+66.5 kcal/mol), indicating that linear response is accurately satisfied (as expected for a small molecule in aqueous solution). This value is somewhat more positive than the corresponding CHARMM MDFE estimate (+60.5 kcal/mol).


View this table:
[in this window]
[in a new window]
 
TABLE 2  Protonation free energies from PB/LRA and MDFE

 
Setting the solute backbone is treated as a bulk dielectric medium, whose polarizability corresponds roughly to the electronic polarizability of a peptide group. We continue to treat the titrating side-chain moiety as a cavity (with a dielectric of one). With this scheme, the static free energies shift slightly (Table 1). The protonation free energy (Table 2) becomes +67.7 kcal/mol, slightly more positive than before (as expected, because of the additional polarizability). As noted in Numerical Methods, treating the side chain as a cavity is consistent with quantum chemistry practice, but differs from the usual PB methods for proteins, which "embed" the side chain directly in the protein dielectric medium. This cavity method is essential to obtain static free energies and protonation free energies in fair agreement with MDFE. It has only a small effect on the experimentally relevant pKa shifts, however, because of cancellation between the cavity contributions in the protein and the model compound (see below).

Thioredoxin Asp-26
Static free energies
In Table 1, we report the static free energies for Asp-26 in different charge states, using a protein dielectric Because atomic reorganization is explicitly accounted for in our method, is the physically realistic range (Fröhlich, 1949Go). When the ionizable side chain forms a small cavity within the protein.

With the static free energy varies from 95.5 kcal/mol (ionized state) to 28.7 kcal/mol (protonated state). Agreement with the explicit solvent MDFE simulations is fair. The larger value in the ionized state originates from favorable interactions between the charged Asp-26 carboxylate and nearby protein groups (particularly Lys-57), and from solvent polarized by Asp-26 itself. In the protonated state, the interactions between Asp-26 and the surrounding groups are weaker and the static free-energy term is much smaller.

Agreement between the midpoint value and the average of the two endpoints is a necessary condition for the system to obey linear response (Eq. 8). In the MD simulations of thioredoxin presented in Simonson et al. (2004)Go, it was observed that the protonated Asp-26 {chi}2 side chain occupies two conformers (proton pointing into the pocket or toward the solvent), which become equivalent at the ionized end state. As shown in Simonson et al. (2004)Go, the presence of the two conformers leads to a significant deviation from linear response for the protein, although the solvent presumably still responds as a linear medium. This LRA model allows for a protein response that is piecewise linear, with different slopes in the first and second halves of the reaction (see below).

With the static terms vary between 116.8 kcal/mol (ionized state) and 10.7 kcal/mol (protonated state), and the midpoint value is 59.5 kcal/mol (Table 1).

Protonation free energy
Using the static free-energy values and Eqs. 9 and 10, we can compute the protonation free energy for Asp-26. The results are listed in Table 2, along with the result from MDFE simulations of the same system with the CHARMM force field (Simonson et al., 2004Go).

The PB/LRA protonation free energies from the three pathways differ somewhat, and are consistently more negative than the corresponding CHARMM MDFE values. If the response to ionization were rigorously linear, all the pathways should give the same result. In fact, with the estimated free-energy change ranges from 50.5 to 62.1 kcal/mol (Table 2). The variability arises because the midpoint static term is very different from the average over the charged- and neutral-state static terms (see Table 1); this indicates a nonlinearity in the protein dielectric response, already noted in the MDFE study (Simonson et al., 2004Go). We recall that in this PB/LRA model, protein atomic reorganization is included explicitly by considering the two endpoint states and the midpoint state. The "3-point" pathway, which uses conformations for all three states, explicitly allows for the possibility that the protein response may be piecewise linear, with different slopes in the first and second halves of the reaction; this is exactly the behavior observed in the earlier MDFE simulations. Therefore, this is presumably the most reliable pathway for Asp-26, giving a protonation free energy of 56.3 kcal/mol. Most of the discrepancy between PB/LRA and MDFE will cancel when the model compound result is subtracted (below).

With all the protein reorganization is included in the atomic rearrangements between the endpoint structures. In fact, electronic polarizability may play a role in the pKa shift. The CHARMM atomic charge set is meant to include electronic polarization implicitly (Mackerell et al., 1998Go), in a mean field way. This average treatment may underestimate local effects in some cases, so that a larger value of may be appropriate. With the Asp-26 protonation free energy increases to 61.7 kcal/mol.

Relative protonation free energy and pKa shift
The protonation free energies with respect to the model compound, and the corresponding pKa shifts are reported in Table 3. With and using the 3-point pathway (expected to be the most reliable one), the protonation free energy of Asp-26 relative to the model compound is {Delta}{Delta}G = –10.2 kcal/mol. This is higher (in absolute value) than experiment, which gives –4.8 kcal/mol, but in very good agreement with MDFE (especially the more reliable, "backward" MDFE run (Simonson et al., 2004Go)). Like PB/LRA with the MDFE method models atomic reorganization of the protein explicitly, but has no explicit electronic polarization.


View this table:
[in this window]
[in a new window]
 
TABLE 3  PB/LRA free energies relative to the model compound

 
With electronic polarizability is explicitly included. PB/LRA then gives {Delta}{Delta}G = –6.0 kcal/mol, in good agreement with experiment.

Group decomposition of the static free energy
The static free energy depends strongly on the state (protonated, ionized, or "midpoint"). To understand this dependency, we decompose the static terms into contributions from selected protein residues (Table 4). Calculations were done with


View this table:
[in this window]
[in a new window]
 
TABLE 4  Thioredoxin Asp-26 group contributions to the PB static free energy

 
The main difference between the protonated (ASPH) and ionized (ASP) static terms arises from the Asp-26 self-energy term ("Asp-26a") and from nearby Lys-57. The Asp-26a term corresponds to the reaction field on Asp-26 due to solvent polarized by its own carboxylate. In agreement with linear response, this term is proportional in each state (ASPH, midpoint, ASP) to the Asp-26 charge in that state. In contrast, the contribution of Lys-57 is fairly constant in the neutral (13.2) and midpoint states (14.8 kcal/mol), becoming much more positive (44.4 kcal/mol) in the charged state. Lys-57 forms a salt bridge with Asp-26 in the charged state; this interaction is completely absent in the protonated and midpoint structures (see Figs. 4 and 5 in Simonson et al., 2004Go).

RNase Asp-14
We have also calculated the protonation free energy for Asp-14 in a different protein, ribonuclease A. The PB static free energies are listed in Table 1. With they vary from 130.0 kcal/mol (ionized state) to 7.7 kcal/mol (neutral state). The midpoint value (70.0 kcal/mol) is close to the average over the endpoints (68.9 kcal/mol), indicating that this protonation reaction is well described by linear response.

The protonation free energy, consequently, has a very small dependency on the choice of pathway (not shown). With the protonation free energy is estimated to be between 70.0 and 70.7 kcal/mol, significantly more positive than the corresponding MDFE value (59.4 kcal/mol). The 3-point value is 70.3 kcal/mol (see Table 2). With the protonation free energy is 67.8 kcal/mol.

The experimental pKa for Asp-14 is 2, two units below the model compound, and corresponding to an experimental relative protonation free energy of 2.7 kcal/mol. With the relative protonation free energy of Asp-14 is estimated here to be 3.8 kcal/mol. With it is 0.1 kcal/mol, in poor agreement with experiment. However, the ribonuclease simulations were done with all the protein carboxylates ionized except for Asp-14 (Simonson et al., 2004Go). These groups are solvent exposed and distant from Asp-14, and it was assumed that their protonation state would not affect the results strongly. In fact, implicit solvent (generalized Born) simulations showed (Simonson et al., 2004Go) that when the other carboxylates are protonated, the Asp-14 protonation free-energy changes by 1.1 kcal/mol. Including this correction, with the Asp-14 PB/LRA protonation free energy increases to 4.9 kcal/mol (see Table 3). With it becomes 0.7 kcal/mol (the correction scales with ). The MDFE calculation yielded a relative protonation free energy of 0 kcal/mol, i.e., a zero pKa shift. The PB/LRA results with either are somewhat more accurate. A dielectric of would give perfect agreement with experiment.

A group decomposition of the RNase Asp-14 free energies is given in Table 5, with The most important contributions come from Asp-14, His-48, Arg-33, and Tyr-25. In the crystal structure (determined at a pH of 5.2), Asp-14 is ionized and interacts with His-48 and Arg-33. In the MD simulations, Asp-14 makes a hydrogen bond to His-48 and forms a solvent-separated interaction with Arg-33. In accord with these observations, the contributions of His-48 and Arg-33 to the PB static term are significant, with His-48 having the larger value. Both contributions decrease by 12–13 kcal/mol in the neutral Asp-14 state, indicating a weakening of interactions upon Asp-14 protonation.


View this table:
[in this window]
[in a new window]
 
TABLE 5  RNase Asp-14 group contributions to the PB static free energy

 
Thioredoxin Asp-20
Asp-20 is completely exposed to the solvent and has an experimental pKa shift of zero. This should be an "easy" case for pKa calculations. In fact, it poses specific problems.

The PB static free energies are listed in Table 1. With the static terms are 131.9 and –9.3 kcal/mol, respectively, for the charged and neutral states. The midpoint value (60.5 kcal/mol) is close to the average over the endpoints (61.3 kcal/mol). With the static free energies in the ionized and half-ionized states are more positive. The qualitative agreement with the MDFE results is very good.

Given the linear dependence of the PB static values on the Asp-20 charge, the protonation free energy is almost independent of the pathway (not shown). With the average over all pathways is {Delta}G = 60.8 kcal/mol (Table 2). The relative protonation free energy with respect to the model is –5.7 kcal/mol (Table 3). The experimental result is 0 kcal/mol. The corresponding estimate from MDFE is –0.9 kcal/mol. Thus, with the PB/LRA method incorrectly predicts a large upwards pKa shift.

To take into account the experimental ionic strength, 100 mM monovalent salt concentration, we also solved the Poisson-Boltzmann equation for the protein in this case. This increases the protonation free energy by ~0.5 kcal/mol, improving the agreement with experiment very slightly.

Decomposition into contributions from all protein residues (Table 6) shows that the Asp-20 carboxylate itself accounts almost entirely for the static PB free-energy values, by polarizing the surrounding solvent. This is because Asp-20 is extensively solvated, and its electrostatic interactions with other protein residues are largely screened. This allows us to explain the underestimated protonation free energy and the incorrect pKa shift with Considering the midpoint, for example, the static free energy of 60.5 kcal/mol is 5.9 kcal/mol weaker than for the model compound (Table 1). This difference is about the same as the error in the protonation free energy. It indicates that the half-charged Asp-20 carboxylate polarizes its surroundings more strongly in the model compound than in the protein. Because the solvent properties are the same in the two cases, it is the protein that must account for the discrepancy. In fact, the protein polarization by Asp-20 is underestimated. Indeed, with electronic polarization is underestimated (because the MD charges only include it in a mean field sense; Mackerell et al., 1998Go). This effect was less evident for Asp-26 and Asp-14 (above), where the protein polarization was dominated by atomic rearrangements.


View this table:
[in this window]
[in a new window]
 
TABLE 6  Thioredoxin Asp-20 group contributions to the PB static free energy

 
Taking the protonation free energy increases to 63.3 kcal/mol and {Delta}{Delta}G increases to –4.4 kcal/mol, or –3.9 kcal/mol, including the effect of ionic strength. This is still rather far from the experimental value of zero.

These results raise the question: why does MDFE successfully predict a small pKa shift for Asp-20? The MDFE calculations use the same charges and the same conformations as the PB/LRA calculations; therefore, they must underpolarize the protein to the same extent. To produce its larger protonation free energy, MDFE must overpolarize the solvent around thioredoxin, compared to the solvent around the model compound. This compensation of errors between protein and solvent polarization makes the MDFE calculation more robust than PB/LRA for this case.

This suggests that in the PB/LRA calculation, the macroscopic treatment of solvent polarization in the vicinity of Asp-20 may be inaccurate. We discuss in Supplementary Material some variants of the model that use slightly different solvent treatments. Some of these improve the agreement with experiment slightly, but they do not change the basic picture described here. Notice that for Asp-26, MDFE with explicit solvent led to an even larger error of 6.1 kcal/mol (Simonson et al., 2004Go).

An obvious way to increase the Asp-20 protonation free energy is to increase the protein dielectric Warshel and co-workers (Warshel et al., 1984Go) have shown that the use of a large dielectric constant for charge-charge interactions is expected to work well in cases of protein surface groups. This has been empirically confirmed by several pKa calculations from various groups. However, we emphasize again that because the present PB/LRA method treats protein reorganization explicitly, the physically realistic range is ; see Fröhlich for a definitive discussion (Fröhlich, 1949Go).

Relaxation free energies
Following the two-step Marcus procedure, the charging free energy can be decomposed into a static and a relaxation step; see Eqs. 2 and 4. If linear response holds, the relaxation free energies are independent of the state; they cancel from Eqs. 9 and 10, and are not needed for {Delta}G. Nevertheless, the relaxation free energies are of interest. They help characterize the dielectric response to proton binding (Simonson and Perahia, 1995bGo; Simonson et al., 1999Go; Archontis and Simonson, 2001Go). In addition, the static and relaxation free energies are linked (Eq. 11). Thus, the relaxation data serve as an additional guide in choosing the optimum protein dielectric constant for the pKa calculations.

To calculate the relaxation free energies, we perform PB calculations for several values of the protein dielectric constant and apply Eq. 12. The results are summarized in Table 7. The relaxation values depend strongly on the protein dielectric constant As discussed above and elsewhere (Simonson et al., 1999Go; Archontis and Simonson, 2001Go; Krishtalik et al., 1997Go), different values will normally be appropriate for and Optimal values can be determined from the linear response consistency relation, Eq. 11. Moderate values of are optimal here. Larger values of 20 or 80, say, are not appropriate.


View this table:
[in this window]
[in a new window]
 
TABLE 7  Relaxation free energies

 
We first discuss Asp-26. The relaxation free energy depends very weakly on the charge state, in accord with linear response (not shown). A dependency could arise if, e.g., the location of the side chain in the protein, or the overall shape of the protein dielectric cavity changed abruptly with the Asp carboxylate charge state. The prediction of the linear response formula (Eq. 11) is also shown in Table 7 for various values of When is used, consistency between the static and relaxation terms is best satisfied with a relaxation dielectric constant (values underlined). When is used, consistency between the static and relaxation terms is best satisfied with a lower relaxation dielectric constant The difference is expected, because the first scheme corresponds to a protein that has explicit electronic polarizability (), so that the relaxation free energy is enhanced.

In the case of RNase Asp-14, the optimum relaxation dielectric constant is when When relaxation is enhanced, and the best

In the case of Asp-20, when is used, consistency between the static and relaxation terms is best satisfied with a relaxation dielectric constant In this case, {Delta}Gr is very weakly sensitive to because Asp-20 is solvent exposed, and the relaxation is largely dominated by solvent.

The range of values found here for all three Asp side chains, is similar to that found for two positions in the active site of the enzyme aspartyl-tRNA synthetase (Simonson et al., 1999Go; Archontis and Simonson, 2001Go). It is consistent with theoretical estimates of reorganization free energies in cytochrome c (Simonson and Perahia, 1995bGo; Simonson, 2002Go; Muegge et al., 1997Go), and with calculations of the average dielectric constant of several proteins (King et al., 1991Go; Smith et al., 1993Go; Simonson and Perahia, 1995aGo; Pitera et al., 2001Go; Simonson, 2003Go). It is also consistent with the polarizability measured by several experimental techniques, including dielectric dispersion by dry protein powders (Bone and Pethig, 1985Go), Stokes shift measurements for a probe bound in the active site of chymotrypsin (Mertz and Krishtalik, 2000Go), and Stark shifts of chromophores in the photosynthetic reaction center (Steffen et al., 1994Go).

Comparison to a "standard" pKa protocol
It is important to compare our pKa results with a PB protocol that is widely used for the calculation of pKa shifts (Bashford and Karplus, 1990Go; Raquet et al., 1997Go; Yang and Honig, 1993Go; Antosiewicz et al., 1996Go; Warwicker, 1999Go, 2004Go). In this "standard" protocol, a single protein structure is used, corresponding to one or the other of the ionization states of the residue of interest. For an Asp, a crystal structure with the Asp side chain in the ionized state is typical. Because the same atomic coordinates are used for both states, the protein structural relaxation is not taken into account explicitly, but only implicitly via the redistribution of polarization charge, governed by the protein and solvent dielectric constants. To obtain the protonation free energy of the same group in solution, analogous PB calculations are performed for a model compound (Fig. 1), embedded in a high-dielectric medium (the solvent).

We emphasize that this method can be implemented by simply adding the static and relaxation free energies computed here for either one of the endpoint structures. The constraint with the "standard" method is to use the same protein dielectric for both components:

(13)
(even though we have already seen that this is not physically realistic with this CHARMM22 charge set). Some of the data are available in Tables 1 and 7; to explore other dielectric values, additional calculations were done. With this method, we computed the pKa shifts of Asp-26, Asp-14, and Asp-20. In each case, two sets of calculations were done using, respectively, structures for the ionized or the protonated Asp state. The structures are the same ones used for the PB/LRA calculations above.

The results are given in Table 8. They vary with the protein dielectric constant as expected, but have also a striking dependence on the structural model assumed for the protein. Dielectric constants of 2–4 yield poor results in all cases. For Asp-26 the relative free energies are –8 and +4 kcal/mol, respectively, when protonated or ionized structures are used with {epsilon}p = 4. The corresponding pKa shifts are 5.8 and –2.8; the experimental value is 3.5. Increasing the protein dielectric constant to 20, the pKa shifts become 1.4 and –0.7, respectively; i.e., they converge toward zero, but deviate considerably from the experimental result.


View this table:
[in this window]
[in a new window]
 
TABLE 8  pKa shifts with the "standard" PB protocol

 
A similar dependency on the protein structure is observed for Asp-14. With a protein dielectric of 4, the free energy is –7 kcal/mol when structures of the protonated endpoint are used and 5 kcal/mol when structures of the ionized endpoint are used. Increasing the protein dielectric to {epsilon}p = 20, the free energies become –3 and 0, respectively. The experimental result is for Asp-14 protonation is 2.7 kcal/mol (–2 pKa units).

In the case of the solvent-exposed Asp-20, the pKa shifts computed with the two sets of structures are more similar. Agreement with experiment is fair when a high protein dielectric of 20 is used.

In the continuum dielectric framework used here, the titratable side chain is treated as a cavity within the protein medium (Fig. 1 B). This differs from the usual approach, which embeds the side chain directly within the protein dielectric. A side-chain cavity is essential to obtain reasonable values for the protonation free energy (compared to MDFE). Results are also reported in Table 8 using an "embedded" side chain. As expected, the effect of the side-chain cavity on the pKa shifts is much smaller than on the absolute protonation free energies (because the cavity contributions largely cancel when the model com