Biophys J, March 2000, p. 1126-1144, Vol. 78, No. 3

Backbone Dipoles Generate Positive Potentials in all Proteins: Origins and Implications of the Effect

M. R. Gunner, Mohammad A. Saleh, Elizabeth Cross, Asif ud-Doula, and Michael Wise

Physics Department, City College of New York, New York 10031

    ABSTRACT

Asymmetry in packing the peptide amide dipole results in larger positive than negative regions in proteins of all folding motifs. The average side chain potential in 305 proteins is 109 ± 30 mV (2.5 ± 0.7 kcal/mol/e). Because the backbone has zero net charge, the non-zero potential is unexpected. The larger oxygen at the negative and smaller proton at the positive end of the amide dipole yield positive potentials because: 1) at allowed phi and psi angles residues come off the backbone into the positive end of their own amide dipole, avoiding the large oxygen; and 2) amide dipoles with their carbonyl oxygen surface exposed and amine proton buried make the protein interior more positive. Twice as many amides have their oxygens exposed as have their amine protons exposed. The distribution of acidic and basic residues shows the importance of the bias toward positive backbone potentials. Thirty percent of the Asp, Glu, Lys, and Arg are buried. Sixty percent of buried residues are acids, only 40% bases. The positive backbone potential stabilizes ionization of 20% of the acids by >3 pH units (-4.1 kcal/mol). Only 6.5% of the bases are equivalently stabilized by negative regions. The backbone stabilizes bound anions such as phosphates and rarely stabilizes bound cations.

    INTRODUCTION

The amide group of the protein backbone is the most prevalent polar group in any protein, and it plays several well established roles in determining protein structure and function. Thus, when a protein folds the backbone NH and C=O groups in the protein interior find hydrogen bonds to replace those made to water in the unfolded polypeptide (Yang and Honig, 1995a, b). The pattern of regular intra-backbone hydrogen bonds yields the protein secondary structures that have been the subject of research going back to the early work of Pauling. Amides in specific motifs have been shown to be important for the stabilization of buried charges. Interactions of charges with the backbone have been identified both by using geometric rules that identify hydrogen bonds (Baker and Hubbard, 1984; Rashin and Honig, 1984; Stickle et al., 1992; McDonald and Thornton, 1994; Gandini et al., 1996) and by calculation of the intra-protein electrostatic potential (Spassov et al., 1997). Interactions of charges with the α-helix dipole (Wada, 1976; Hol et al., 1978; Hol, 1985) have been implicated in increased protein stability (Nicholson et al., 1988; Sali et al., 1988) and in pKa shifts of acidic and basic residues (Aqvist et al., 1991; Sancho et al., 1992; Sitkoff et al., 1994). Amides in loops also make hydrogen bonds to stabilize charges. The backbone is important in calcium (Strynadka and James, 1989), phosphate, and sulfate (Hol, 1985; Quiocho et al., 1987; Jacobson and Quiocho, 1988; Luecke and Quiocho, 1990; He and Quiocho, 1993; Yao et al., 1996) binding sites, and in ion binding in the potassium channel (Doyle et al., 1998). Amides also play important roles in enzyme reactions such as in the oxyanion hole of the serine proteases, where they stabilize the negative charge on the substrate carbonyl in the transition state (James et al., 1980).

Cations are stabilized in regions of negative potential and anions in positive regions. Because the amide group is a dipole, if it is properly oriented it can interact favorably with either charge. However, there is growing evidence that the backbone stabilizes anions more often than cations. For example, there are more bound anions such as phosphate and acidic amino acids at helix N-termini than cations at the C-termini (Hol et al., 1981; Richardson and Richardson, 1988; Gandini et al., 1996). A large positive potential is found at the redox center in iron-sulfur proteins (Langen et al., 1992; Swartz et al., 1996), at the phosphate binding site in α/β barrel proteins (Raychaudhuri et al., 1997), and at a cluster of buried acids in the bacterial photosynthetic reaction centers (Beroza et al., 1995; Lancaster et al., 1996). Calculations show that charges on acidic side chains are better stabilized than bases by the backbone dipoles in aspartate transcarbamylase (Oberoi et al., 1996). The backbone is found to produce a generally positive potential near the protein surface (Spassov et al., 1997). However, there has been no investigation of whether there is a general principle that the potential from the backbone is, on average, positive, or of how the neutral amide dipoles could produce this result.

While the secondary structure motifs are the most obvious consequence of proteins having an amide linkage, this paper will show that the amide group imposes additional, inescapable consequences for protein structure and function. Most simply, the shape of the amide is dominated by the oxygen of the carbonyl (C=O) being substantially larger than the amine HN hydrogen (Fig. 1). One consequence of this is that to avoid a steric clash the peptide R group is trans to the C=O, closer to the HN, at favored phi and psi angles. Moreover, the curvature of a protein's surface favors placing the larger carbonyl oxygen out toward the solvent, while the smaller HN is more likely to be packed in the protein interior. Thus, the asymmetry of the amide group itself imposes an asymmetric packing of the amides within proteins. Electrostatic interactions are the most long-range in proteins. Asymmetry in the orientation of a collection of dipoles, even those that are involved in hydrogen bonds, will generate a significant, non-zero electrostatic potential. This can influence the disposition and energy of the charged groups within proteins.



FIGURE 1   Space filling representation of an amide group. The amine HN (r = 1.0 Å) is substantially smaller than the carbonyl oxygen (r = 1.6 Å). The first atoms (CB) of the two side chains adjacent to the amide are oriented as they would be in an α-helix.

This paper will describe the analysis of many protein structures to show that the neutral backbone dipoles make the electrostatic potential more positive within proteins of all motifs. It will then be shown how the structure of the amide dipole itself, negative toward the carbonyl oxygen and positive toward the amide proton, produces a non-zero potential in all proteins. Lastly, an analysis of the distribution of acidic and basic side chains and ionized substrates and cofactors in many proteins will show a bias toward burying anions rather than cations, which is not unexpected if the backbone dipoles make the protein interior more positive. Each protein represents a balance of many forces such as the hydrophobic effect favoring non-polar residues inside a protein and the solvation energy stabilizing charged residues on the surface. The basic geometry of the amide dipole, by producing more positive potentials within all proteins, adds another term to the forces that influence each protein's folding, structure, and function.

    MATERIALS AND METHODS

Protein structures

Proteins were selected from the Brookhaven data bank (Bernstein et al., 1977) to contain examples of many of the folds in the SCOP classification system (Murzin et al., 1995). SCOP classes are α, all α-helix; β, all β-sheet; α/β, mainly parallel β-sheets (β-α-β units); α + β, mainly antiparallel β-sheets (segregated α and β regions); small, usually dominated by metal ligand, heme, and/or disulfide bridges; multi, multi-domain (α and β); membrane, membrane and cell surface proteins and peptides. SCOP classifies domains independently, so proteins can belong to several motifs. When domains in one protein are in different SCOP classes the protein is designated mixed-motif, a group that includes all SCOP multi-domain proteins.

The following 305 proteins were used. The 141 proteins with resolution of ≤ 1.8 Å are underlined. The 30 structures with resolution ≥ 2.6 Å are in italics.

α-helix:  1aep, 1ala, 1bbh, 1bgc, 1bgd, 1cce, 1ccr, 1clm, 1cmb, 1cpc, 1cpt, 1csh, 1dcc, 1eco, 1fia, 1gmf, 1hdd, 1hrs, 1huw, 1hyp, 1lis, 1lmb, 1lpe, 1mba, 1mbc, 1mdy, 1oct, 1omd, 1par, 1phc, 1r69, 1rhg, 1rib, 1rop, 1utg, 256b, 2abk, 2asr, 2ccy, 2cep, 2cnd, 2cro, 2cts, 2cyp, 2hhb, 2hmq, 2mhr, 2pal, 2wrp, 2ycc, 351c, 3c2c, 3gly, 3icb, 4bp2, 5cpv, 5cyt.

β-sheet:  1aac, 1acx, 1arb, 1avd, 1bbp, 1bcx, 1bgh, 1cau, 1ctm, 1f3g, 1gcs, 1gct, 1gof, 1hbp, 1hcb, 1hlc, 1hmr, 1hne, 1hoe, 1hvj, 1icm, 1ifc, 1igm, 1mdc, 1mjc, 1mup, 1nsc, 1paz, 1plc, 1pmy, 1png, 1ppl, 1pts, 1r1a, 1rbp, 1scs, 1sgt, 1shf, 1shg, 1snc, 1stp, 1ten, 1tie, 1tld, 1tnf, 1ton, 1ttb, 1vmo, 2alp, 2apr, 2ayh, 2aza, 2ca2, 2cab, 2cpl, 2er7, 2fb4, 2ltn, 2mcm, 2mev, 2pab, 2pcy, 2pec, 2plv, 2psg, 2rhe, 2rsp, 2sam, 2sga, 2sil, 2snv, 2sod, 2stv, 3est, 4fgf, 4gcr, 4pep, 4sbv, 6nn9.

α/β:  1aba, 1aco, 1ads, 1alk, 1amp, 1bnh, 1cde, 1cus, 1gdh, 1gpb, 1hmy, 1lct, 1nar, 1nba, 1nip, 1ofv, 1omp, 1rpa, 1rve, 1s01, 1sto, 1thg, 1tml, 1tpf, 1trk, 1ulb, 1wht, 2ak3, 2dkb, 2dri, 2had, 2prk, 2rn2, 2trx, 3chy, 3cla, 3dfr, 3eca, 3hsc, 4fxn, 5p21, 7aat, 8abp.

α + β:  1aak, 1ahc, 1alc, 1apa, 1ast, 1aya, 1brn, 1cew, 1ctf, 1dtp, 1fdd, 1fkf, 1frd, 1fus, 1fxd, 1fxi, 1gmp, 1iag, 1igd, 1lba, 1mat, 1mol, 1npk, 1pkp, 1ppn, 1ris, 1rms, 1sha, 1tbp, 1ubq, 1yat, 2acg, 2act, 2bop, 2chs, 2ci2, 2dnj, 2fxb, 2hpr, 2lzm, 2ms2, 2msb, 2pol, 2ssi, 2uce, 3b5c, 3il8, 4tms, 7rsa, 9rnt.

small:  1aap, 1cbn, 1fas, 1isu, 1nxb, 1rdg, 2cdv, 2ovo, 2sn3, 4ins, 4pti, 4rxn, 9wga.

multi-motif:  1ezm, 1isb, 1sry, 2tmn, 3sdp, 1bia, 1chm, 1cse, 1emd, 1gal, 1glv, 1lvl, 1pca, 1pda, 1phh, 1rbl, 2glt, 2npx, 2reb, 2sic, 3cox, 3grs, 4enl, 4gpd, 4mdh, 5rub, 9ldt, 2cmd, 2pia, 8atc, 1dlh, 1tss, 2aai, 2mha, 1ddt, 1esl, 1dsb, 1glq, 1gne, 1hna, 2gst, 2pgd, 4ts1, 1gia, 1fc2, 1lla, 1prc, 2bpf, 3mdd, 1cdg, 1cdo, 1eft, 1hpl, 2aaa, 8adh, 1gla, 1dlc, 1tnr, 2bbk, 2por, 1rpl, 1gma, 1ppt.

Crystallographic waters, SO4, and PO4 with >10% of their surface exposed to solvent were deleted. The surface exposure was determined with the program SURFV (Sridharan et al., 1992). Protons were added to the proteins with a 1.0 Å bond length and standard geometry.

Calculation of the electrostatic free energy terms for acidic and basic residues

Electrostatic free energy terms were calculated for the ionized form of the acidic residues Asp and Glu and the bases Arg and Lys. DelPhi calculations were run for each residue with charges only on the atoms of this one side chain. All other atoms in the protein had zero charge. Focusing was used (Gilson et al., 1987) so that the minimum resolution for mapping the atoms and surface to the grid for the finite difference solution of the Poisson equation was 0.83 Å/grid. The dielectric constant for the protein ($\varepsilon_{\mathrm{prot}}$) was 4, while that of the surrounding solvent ($\varepsilon_{\mathrm{solv}}$) was 80. For each ionized side chain the same calculation provides the pairwise interactions of the residue with the backbone and its reaction field energy.

Pairwise interactions between the backbone and ionized side chains

The potential was determined at all atoms in the backbone in a protein where a single acidic or basic residue has charge. The free energy of the pairwise interaction between the backbone and side chain i ($\Delta G^{i}_{\mathrm{bkbn}}$) is:

$$\Delta G^{i}_{\mathrm{bkbn}} = \sum_{j=1}^{R} \sum_{b_j=1}^{b_n} \Psi^{s_i}_{b_j}\, q_{b_j} \qquad (1)$$

where $\Psi^{s_i}_{b_j}$ is the potential at atom b in the backbone of the jth residue from charges on the ith side chain. This pairwise interaction was obtained for the $b_n$ atoms of the backbone that bear partial charge ($q_{b_j}$) (Table 1). The interaction was then summed for all R backbone amides in the protein.
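As a rough illustration of the double sum in Eq. 1, the following Python sketch multiplies the potential at each charged backbone atom by that atom's partial charge and adds the products over all residues. The function name, the nested-list layout, and the numbers in the usage lines are assumptions made for this example; in the paper the potentials come from DelPhi, which is not reproduced here. With potentials in mV and charges in units of e, the result is in meV.

```python
def backbone_interaction_mev(potentials_mv, charges_e):
    """Eq. 1 as a plain double sum (hypothetical helper, not DelPhi output).

    potentials_mv[j][b]: potential (mV) at charged backbone atom b of residue j
                         produced by the single charged side chain i.
    charges_e[j][b]:     partial charge (units of e) on that backbone atom.
    Returns the interaction energy in meV (mV times e).
    """
    if len(potentials_mv) != len(charges_e):
        raise ValueError("need one charge list per residue")
    total = 0.0
    for residue_potentials, residue_charges in zip(potentials_mv, charges_e):
        if len(residue_potentials) != len(residue_charges):
            raise ValueError("potentials and charges differ in length")
        for psi, q in zip(residue_potentials, residue_charges):
            total += psi * q
    return total


# Made-up numbers for one amide with two charged atoms.
example = backbone_interaction_mev([[120.0, 80.0]], [[-0.5, 0.5]])
print(example)  # -20.0
```

Because each amide carries zero net charge, a uniform potential across the amide contributes nothing; only the difference in potential between the positive and negative atoms matters.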


                              
TABLE 1   The charges used on the atoms of the backbone amides

Reaction field energy

The reaction field energy (also referred to as the self, solvation, or Born energy) measures the difference in energy of an ion or dipole when it is transferred between media with different abilities to reorganize around charges. Electronic polarization and rearrangement of atomic dipoles both contribute. Using continuum electrostatic theory, the response of the media is encapsulated in the dielectric constant. The reaction field energy is calculated here using an algorithm in DelPhi, which determines the interaction energy between the charges on the protein atoms and charges induced at the protein-water dielectric boundary (Nicholls and Honig, 1991; Sridharan et al., 1992).

The penalty for placing a charge at its location in the protein is the difference between the reaction field energy of the residue in situ and the reaction field energy of the same residue isolated from the protein:
$$\Delta G_{\mathrm{rxn}} = \Delta G_{\mathrm{rxn\ in\ protein}} - \Delta G_{\mathrm{rxn\ in\ soln}} \qquad (2)$$

$\Delta G_{\mathrm{rxn\ in\ protein}}$ and $\Delta G_{\mathrm{rxn\ in\ soln}}$ are both negative, favorable terms. $\Delta G_{\mathrm{rxn}}$ is always a positive, unfavorable energy term because the absolute value of $\Delta G_{\mathrm{rxn\ in\ protein}}$ is always less than that of $\Delta G_{\mathrm{rxn\ in\ soln}}$. The reaction field energies for side chains in solution were obtained for isolated coordinates of each side chain in the protein data bank file 1PRC (Table 2). There is very little variation between different conformers of any side chain, so one reference value is used for each type of residue.


                              
TABLE 2   Reaction field energy in solution for acidic and basic side chains

Calculation of interactions between the backbone and all side chains and bound ligands

Average potential in the protein

The potential was calculated by placing partial charges on all backbone amides. A DelPhi calculation was carried out with a 129³ grid. This provides a grid spacing of >1.0 Å/grid for all but 30 proteins. The potential ($\Psi^{\mathrm{bkbn}}_{a}$) from the backbone at each of the m non-backbone heavy atoms (a) was averaged to determine VP. The potential at waters and other non-protein atoms was not included in the sum.
$$V_P = \frac{1}{m} \sum_{a=1}^{m} \Psi^{\mathrm{bkbn}}_{a} \qquad (3a)$$

In a group of N proteins the average of VP is:

$$\mathrm{Av}V_P = \frac{1}{N} \sum^{N} V_P \qquad (3b)$$

The average potential (VS) from the backbone at a residue was obtained from:

$$V_S = \frac{1}{n} \sum_{a=1}^{n} \Psi^{\mathrm{bkbn}}_{a} \qquad (4a)$$

where there are n non-backbone heavy atoms (a) in the side chain.

In a group of R residues the average of VS is:

$$\mathrm{Av}V_S = \frac{1}{R} \sum^{R} V_S \qquad (4b)$$

The free energy of interaction of the jth side chain or ligand with the backbone is:

$$\Delta G^{j}_{\mathrm{bkbn}} = \sum_{a=1}^{n} \Psi^{\mathrm{bkbn}}_{aj}\, q_a \qquad (5)$$
where $q_a$ is the charge on atom a in an appropriate partial charge set. The free energy of interaction of a side chain or ligand with the backbone ($\Delta G_{\mathrm{bkbn}}$) can be calculated with either Eq. 1 or 5. For Eq. 1 the side chain is charged and the potential is collected at all the atoms of the backbone. For Eq. 5, the backbone is charged and the potential is collected at the side chain atoms.
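The averages in Eqs. 3a, 4a, and 4b and the sum in Eq. 5 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the toy potentials are invented for the example, and the per-atom potentials would in practice be read from a DelPhi calculation. The toy case shows that an atom-weighted average (Eq. 3a) and a residue-weighted average (Eq. 4b) can differ for the same set of atoms.

```python
def mean(values):
    """Arithmetic mean of a non-empty collection."""
    values = list(values)
    if not values:
        raise ValueError("cannot average an empty collection")
    return sum(values) / len(values)


def side_chain_potential(atom_potentials_mv):
    """Eq. 4a: V_S, the mean backbone potential over one side chain's heavy atoms."""
    return mean(atom_potentials_mv)


def protein_potential(side_chains):
    """Eq. 3a: V_P, the mean over every side chain heavy atom in the protein."""
    return mean(p for atoms in side_chains for p in atoms)


def average_side_chain_potential(side_chains):
    """Eq. 4b: mean of V_S over residues, so each residue counts once."""
    return mean(side_chain_potential(atoms) for atoms in side_chains)


def side_chain_backbone_energy_mev(atom_potentials_mv, atom_charges_e):
    """Eq. 5: sum of Psi_a * q_a over the atoms of one side chain, in meV."""
    if len(atom_potentials_mv) != len(atom_charges_e):
        raise ValueError("potentials and charges differ in length")
    return sum(psi * q for psi, q in zip(atom_potentials_mv, atom_charges_e))


# Hypothetical two-residue "protein": a one-atom side chain at 200 mV and a
# seven-atom side chain at 30 mV on every atom.
toy = [[200.0], [30.0] * 7]
print(protein_potential(toy))             # 51.25 (atom-weighted)
print(average_side_chain_potential(toy))  # 115.0 (residue-weighted)
```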

Unless otherwise noted, calculations of VP, VS, and $\Delta G_{\mathrm{bkn}}$ use CHARMM partial atomic charges for backbone (Table 1) and side chains (Brooks et al., 1983); $\varepsilon_{\mathrm{prot}}$ is 4 and $\varepsilon_{\mathrm{solv}}$ is 80. The atomic radii for each atom type were: H 1.2 Å, C 1.8 Å, N 1.5 Å, O 1.6 Å, S 1.9 Å, P 1.2 Å.

Interaction between side chains and specific amide groups

The interaction of each side chain with each amide was calculated in 51 proteins. Each DelPhi calculation had partial charges on only one amide group. Thus, R calculations were made for a protein with R residues. The grid resolution was >0.83 Å/grid for each protein. Where necessary the focusing technique was used centered on the amide that carried the partial charges (Gilson et al., 1987). The net charge was 0 in each run, resulting from ± 0.9 charge for a standard amide and ± 0.75 for Pro. Equations 3 and 4 were used to calculate the average potential from each amide within the protein or at specific side chains; Eq. 5 provided the free energy of interaction between specific side chains and individual amides.

Potential at CB from amide(n) and amide(c) as a function of the phi and psi angle

All non-terminal amino acids in a protein lie between an amide toward the N-terminal (amide(n)) and one toward the C-terminal (amide(c)) (Fig. 2). Two series of 36 Ala tripeptide coordinates were constructed. In one set the phi angle was changed in increments of 10°, in the other the psi angle was varied. For the series with different phi angles, all atoms toward the N-terminal were rotated holding the central CA and CB and all atoms toward the C-terminal rigid. The series with different psi angles were constructed holding the N-terminal and the central CA and CB fixed and rotating atoms toward the C-terminal.
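One way such a series of conformers could be generated is by rotating the moving atoms about the relevant bond axis in 10° steps. The Python sketch below uses Rodrigues' rotation formula; it is an assumption about how to build the series, not the authors' procedure, and the function names are hypothetical. For a phi scan the axis would be the N to CA bond, and for a psi scan the CA to C bond.

```python
import math


def rotate_about_axis(point, axis_start, axis_end, angle_deg):
    """Rotate point about the line axis_start -> axis_end (Rodrigues' formula)."""
    theta = math.radians(angle_deg)
    k = [e - s for s, e in zip(axis_start, axis_end)]
    norm = math.sqrt(sum(c * c for c in k))
    if norm == 0.0:
        raise ValueError("axis_start and axis_end must differ")
    k = [c / norm for c in k]                       # unit vector along the axis
    v = [p - s for p, s in zip(point, axis_start)]  # point relative to the axis
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    k_dot_v = sum(a * b for a, b in zip(k, v))
    k_cross_v = [
        k[1] * v[2] - k[2] * v[1],
        k[2] * v[0] - k[0] * v[2],
        k[0] * v[1] - k[1] * v[0],
    ]
    rotated = [
        v[i] * cos_t + k_cross_v[i] * sin_t + k[i] * k_dot_v * (1.0 - cos_t)
        for i in range(3)
    ]
    return tuple(r + s for r, s in zip(rotated, axis_start))


def dihedral_scan(moving_atoms, axis_start, axis_end, step_deg=10.0, n_steps=36):
    """Return n_steps copies of moving_atoms; copy k is rotated by k * step_deg."""
    return [
        [rotate_about_axis(p, axis_start, axis_end, k * step_deg) for p in moving_atoms]
        for k in range(n_steps)
    ]


# Toy usage: one atom rotated about the z axis in 36 steps of 10 degrees.
frames = dihedral_scan([(1.0, 0.0, 0.0)], (0.0, 0.0, 0.0), (0.0, 0.0, 1.0))
print(len(frames))  # 36
```

Atoms on the fixed side of the bond (the central CA and CB and everything toward the other terminus) are simply left out of `moving_atoms`.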



FIGURE 2   Each non-terminal side chain lies between 2 amides, one toward the N-terminal and the other toward the C-terminal. (A) The amides toward the N-terminal (amide(n)) and C-terminal (amide(c)) of the side chain of residue i. (B) One amide is amide(c) for one side chain (i) and is amide(n) for the next side chain (i + 1) in the protein.

The potential at the central CB was obtained using Coulomb's law assuming a uniform dielectric constant of 4. Calculations with the tripeptides surrounded by water ($\varepsilon_{\mathrm{prot}}$ = 4; $\varepsilon_{\mathrm{solv}}$ = 80) were carried out with DelPhi. In this case the positions of all atoms in the tripeptide modify the dielectric boundary, and so affect the results. The variation of phi was carried out in tripeptides where psi is -60°, while the psi rotation was carried out in peptides where phi is 120°.
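For the uniform-dielectric case, the Coulomb sum is simple enough to sketch directly. The Python below is a minimal illustration, not the authors' code: the function name is hypothetical, the ±0.55 e charges are an assumed CHARMM-like carbonyl pair (Table 1 is not reproduced in this text), and the probe positions are arbitrary. The constant 14399.6 mV·Å per elementary charge is e/(4πε0) expressed in these units.

```python
import math

COULOMB_MV_ANGSTROM = 14399.645  # e / (4 * pi * eps0), in mV * Angstrom per e


def coulomb_potential_mv(point, sources, dielectric=4.0):
    """Potential (mV) at point from (xyz, charge) pairs in a uniform dielectric."""
    total = 0.0
    for xyz, charge in sources:
        r = math.dist(point, xyz)
        if r == 0.0:
            raise ValueError("probe point coincides with a source charge")
        total += charge / (dielectric * r)
    return COULOMB_MV_ANGSTROM * total


# Assumed two-point model of a carbonyl: +0.55 e on C, -0.55 e on O,
# separated by the 1.23 Angstrom C to O bond length along x.
carbonyl = [((0.0, 0.0, 0.0), 0.55), ((1.23, 0.0, 0.0), -0.55)]

behind_carbon = coulomb_potential_mv((-2.0, 0.0, 0.0), carbonyl)
beyond_oxygen = coulomb_potential_mv((3.23, 0.0, 0.0), carbonyl)
print(behind_carbon > 0.0, beyond_oxygen < 0.0)  # True True
```

A probe on the positive side of the dipole sees a positive potential and the mirror-image probe on the oxygen side sees an equal and opposite one, which is the point-charge picture behind the sign of the potential at CB.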

Comparing the surface exposure of the carbonyl O and amine HN for each amide

In the standard protein, the N to HN distance is 1.0 Å and the H radius is 1.2 Å. In contrast the average C to O bond length is 1.23 Å and the O radius is 1.6 Å. This geometry ensures that the O will have more surface to expose to solvent than the HN does. Protein coordinates were prepared where the HN to N distance was 1.23 Å and the HN radius was 1.6 Å. The surface exposure of the O and the modified HN to a 1.4 Å probe were calculated with the program SURFV (Sridharan et al., 1992).

The in situ pKa of acidic and basic residues

The pKa of acids or bases in proteins can be different from that found in solution because interactions in the protein shift the relative energies of the charged and neutral states of a residue or ligand (Churg and Warshel, 1986; Bashford and Karplus, 1990; Gunner and Honig, 1991; Yang et al., 1993; Antosiewicz et al., 1994; Gunner et al., 1997). The complete calculation of residue ionization states is beyond the scope of this paper. However, other interactions in the protein will modify the expected effects of $\Delta G_{\mathrm{rxn}}$ and $\Delta G_{\mathrm{bkn}}$. Thus, if the charge state of all other R residues were fixed, the pKa of residue i would be shifted from its value in solution ($\mathrm{pK}_{\mathrm{soln},i}$) in the following way:

$$\mathrm{pK}_{\mathrm{prot},i} = \mathrm{pK}_{\mathrm{soln},i} + \Delta G^{\mathrm{crg}}_{\mathrm{rxn},i} + \Delta G^{\mathrm{crg}}_{\mathrm{bkn},i} + \Delta G^{\mathrm{crg}}_{\mathrm{other},i} - \Delta G^{\mathrm{neu}}_{\mathrm{rxn},i} - \Delta G^{\mathrm{neu}}_{\mathrm{bkn},i} - \Delta G^{\mathrm{neu}}_{\mathrm{other},i} + \sum_{j=1}^{R} \left( \Delta G^{\mathrm{crg}}_{\mathrm{sdchn}(j),i} - \Delta G^{\mathrm{neu}}_{\mathrm{sdchn}(j),i} \right) \qquad (6)$$

The terms $\Delta G^{\mathrm{crg}}_{\mathrm{bkn},i}$ and $\Delta G^{\mathrm{crg}}_{\mathrm{rxn},i}$, the charged residue's interaction with the backbone and its reaction field energy, are calculated with Eqs. 1 and 2, respectively, and will be described in detail here. The interactions of the neutral forms of a residue ($\Delta G^{\mathrm{neu}}_{\mathrm{bkn},i}$ and $\Delta G^{\mathrm{neu}}_{\mathrm{rxn},i}$) are often small. The final sum represents the difference in the pairwise interactions of the j other polar and charged side chains with residue i in its charged and neutral form. This is the most significant omitted term. Other terms can arise from intra-protein motions that are coupled to the ionization of the residue ($\Delta G_{\mathrm{other}}$). Within the protein the charge states of all residues are interdependent (see Bashford and Karplus, 1990; Yang et al., 1993; Antosiewicz et al., 1994; Alexov and Gunner, 1997 for a more complete description).
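Eq. 6 adds free energy terms directly to a pK, which implies that the terms are expressed in pK units. The Python sketch below shows one way to convert an energy in kcal/mol into pK units, using 2.303RT at an assumed temperature of 298.15 K. The function names and the explicit sign handling for acids and bases are assumptions of this example; Eq. 6 as printed does not spell out a sign convention.

```python
import math

GAS_CONSTANT_KCAL = 1.987204e-3  # kcal / (mol K)


def kcal_per_pk_unit(temperature_k=298.15):
    """Free energy equivalent of one pK (or pH) unit: ln(10) * R * T."""
    return math.log(10.0) * GAS_CONSTANT_KCAL * temperature_k


def pk_shift(ddg_kcal, is_acid, temperature_k=298.15):
    """pK shift caused by ddg_kcal = G(charged) - G(neutral) in the protein
    relative to solution (assumed convention for this sketch).

    Stabilizing the ionized form (negative ddg_kcal) lowers the pK of an acid
    and raises the pK of a base.
    """
    shift = ddg_kcal / kcal_per_pk_unit(temperature_k)
    return shift if is_acid else -shift


print(round(kcal_per_pk_unit(), 2))      # 1.36
print(round(pk_shift(-4.1, True), 1))    # -3.0
print(round(pk_shift(-4.1, False), 1))   # 3.0
```

With this conversion, the -4.1 kcal/mol quoted in the abstract corresponds to a shift of about 3 pH units.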

    RESULTS

The potential from the backbone within proteins

Backbone potential within four representative proteins

The degree to which the backbone amides make protein interiors more positive is shown graphically for four proteins with the basic folding motifs: α, β, α/β, and α + β. The potential at a representative slice through each protein with only backbone dipoles assigned partial charges is visualized with the program GRASP (Nicholls et al., 1991) (Fig. 3). Although the net charge on each protein is zero, the interior is predominantly positive. At least a quarter of the total volume of each protein is at a potential above 120 mV, while <10% is below -120 mV (Table 3).



FIGURE 3   Electrostatic potential at a slice through four proteins with different folds. Potentials calculated and displayed with the program GRASP (Nicholls et al., 1991). Blue regions are at positive and red at negative potential; CHARMM charges, $\varepsilon_{\mathrm{prot}}$ = 4; $\varepsilon_{\mathrm{solv}}$ = 80. (A) α motif: Met-hemerythrin from sipunculid worm (Themiste dyscrita) (2HMQ chain A) (Holmes and Stenkamp, 1991). A 104 residue iron-binding protein in a four-helical up-and-down bundle with a left-handed twist (motif descriptions from the SCOP data base (Murzin et al., 1995)). (B) β motif: human lipid binding protein (1HMR) (Zanotti et al., 1992). A 129 residue 10-stranded meander β-sheet folded upon itself. (C) α/β motif: triose phosphate isomerase from Trypanosoma brucei brucei (1TPF) (Kishan et al., 1994). A 247 residue α/β barrel which has 8 alternating α and β segments forming an internal, parallel β-sheet barrel. (D) α + β motif: bovine ribonuclease A (7RSA) (Wlodawer et al., 1988). A 124 residue protein with a long curved β-sheet and 3 α-helices.


                              
TABLE 3   Electrostatic potentials within four proteins with different folds

Average potential from the amide backbone inside all proteins

The potential from the backbone (VP) was determined in 305 proteins chosen to include representatives of many folding motifs (Eq. 3a). VP determines the potential at non-polar, polar, and ionizable side chains. VP is always positive, ranging from 57 to 244 mV (1.3-5.6 kcal/mol/e). The average VP is 110 ± 30 mV (2.54 ± 0.70 kcal/mol/e) (Eq. 3b, Fig. 4).
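The paired units quoted above (mV and kcal/mol/e) are related through the Faraday constant. The short Python sketch below reproduces the conversion; the function name is hypothetical and the choice of the thermochemical calorie (4184 J/kcal) is an assumption of this example.

```python
FARADAY_C_PER_MOL = 96485.33212  # charge of one mole of elementary charges
JOULE_PER_KCAL = 4184.0          # thermochemical calorie (assumed here)


def mv_to_kcal_per_mol_e(potential_mv):
    """Energy (kcal/mol) of one mole of unit charges at potential_mv millivolts."""
    return potential_mv * 1.0e-3 * FARADAY_C_PER_MOL / JOULE_PER_KCAL


for mv in (57.0, 110.0, 244.0):
    print(mv, round(mv_to_kcal_per_mol_e(mv), 2))
```

Under these assumptions 57, 110, and 244 mV come out near 1.3, 2.5, and 5.6 kcal/mol/e, in line with the values in the text, and the ±30 mV spread corresponds to roughly ±0.7 kcal/mol/e.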



FIGURE 4   The number of proteins with different values of the average electrostatic potential at the side chain heavy atoms (VP). VP was calculated with Eq. 3a for 305 proteins. The patterns for different SCOP protein motifs: α, black; β, horizontal; α + β, diagonal; α/β, cross-hatch; others, white.

The average potential from the backbone is positive for all protein motifs. Helical proteins have on average the smallest potentials (95 ± 23 mV) and α/β proteins the largest (136 ± 36 mV) (Table 4). There are more small or pure α or β proteins among the least positive proteins, and more α/β or mixed motif proteins among the most positive. However, all folds are represented in both the most and least positive proteins studied except for the small proteins.


                              
TABLE 4   The average potential at all non-hydrogen, side chain atoms from the backbone dipoles inside 305 proteins

Importance of specific parameters used in the calculations

The dielectric constants for protein and solvent were varied to determine whether the bias toward the backbone potentials being positive is due to the specific parameters used (Table 4). If the calculations use a uniform dielectric constant of 4, rather than having an $\varepsilon_{\mathrm{solv}}$ of 80, the average potential of the proteins tested is 137 ± 62 mV. Thus the result does not depend on the high dielectric constant of the solvent. Raising the interior dielectric constant diminishes VP without changing its sign (data not shown). The charge distribution can also be varied. For example, moving the 0.1 charge placed on CA in the CHARMM charge set to the HN (EQ charge in Table 1) also yields a positive average potential (93 ± 34 mV).

It is possible to determine the relative importance of the atoms that make up the backbone dipoles in determining VP. Each amide can be viewed as two smaller dipoles with zero net charge: a unit made of the carbonyl (C and O) and one of the amine (HN, N, and CA) (Table 1). For each protein ~77% of the average potential is a result of the C=O dipole while 22% results from the HN-N-CA charges (Fig. 5). The same relative importance can be found in the contribution of each mini-dipole to the dipole moment of the amide. Thus, an amide with CHARMM charges has a dipole moment of 4.2 D. The carbonyl mini-dipole moment is 3.2 D, representing 76% of the total, while it is 1.0 D for the amine.
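The size of the carbonyl mini-dipole can be checked with a point-charge estimate. The Python sketch below is illustrative only: Table 1 is not reproduced in this text, so the ±0.55 e carbonyl charges are an assumed CHARMM-like value, the 1.23 Å separation is the average C to O bond length given in the Methods, and the function name is hypothetical.

```python
import math

DEBYE_PER_E_ANGSTROM = 4.80320  # 1 e * Angstrom expressed in debye


def dipole_moment_debye(atoms):
    """Dipole moment magnitude (D) of a neutral set of (xyz, charge) point charges."""
    net_charge = sum(q for _, q in atoms)
    if abs(net_charge) > 1e-9:
        raise ValueError("dipole moment is origin dependent unless net charge is zero")
    mu = [sum(q * xyz[i] for xyz, q in atoms) for i in range(3)]
    return DEBYE_PER_E_ANGSTROM * math.sqrt(sum(c * c for c in mu))


# Assumed charges (not taken from Table 1): +0.55 e on C, -0.55 e on O.
carbonyl = [((0.0, 0.0, 0.0), 0.55), ((1.23, 0.0, 0.0), -0.55)]
mu_carbonyl = dipole_moment_debye(carbonyl)
print(round(mu_carbonyl, 1))            # 3.2
print(round(100.0 * 3.2 / 4.2))         # 76 (percent of the 4.2 D total)
```

Under these assumptions the estimate agrees with the 3.2 D quoted for the carbonyl, and 3.2 D out of 4.2 D is the 76% share given in the text.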



FIGURE 5   Comparison of the average potential at side chain heavy atoms (VP) for proteins with different charges on the backbone. VP was calculated with Eq. 3a. Charges from Table 1: (open circles), amine (HN, N, CA) charges; (), carbonyl (C, O) charges. The straight lines are described by: 11.91 + 0.77x (r² = 0.96) and -11.2 + 0.22x (r² = 0.71).

Average potential at different types of side chains

The average potential was determined at each side chain (VS) (Table 5). Only 2.0% of the residues are at potentials below -60 mV, while 75.6% are more positive than +60 mV. The average of VS is always positive for all types of residues, ranging from 228 mV for Ala to 32 mV for Arg. The average side chain potential is most positive for small groups such as Ala, Cys, and Ser, and decreases as the side chain becomes larger. This results in the average VS for all side chains being more positive than the average VP for all proteins. The smaller, more positive side chains contribute as much as a large side chain to the average of VS, but not VP.


                              
TABLE 5   The distribution of side chains at different potentials from the backbone amide dipoles

Potential at small molecules, cofactors, and substrates bound to proteins

There are many ligands bound to the proteins analyzed here. The potential from the backbone was investigated at several types of bound molecules (see Table 6).


                              
TABLE 6   The distribution of ligands and cofactors at different potentials from the backbone amide dipoles

The average potential at buried waters is positive, with twice as many waters at potentials >+60 mV as at <-60 mV. Thus, these neutral dipoles are likely to be found at positive potential.

Metals are the only bound cations that are present in any abundance in proteins. Many of the divalent cations cadmium, cobalt, copper, non-heme iron, manganese, magnesium, ytterbium, and zinc are at potentials from the backbone >300 mV. Only Ca²⁺ and Na⁺ are ever found at potentials from the backbone more negative than -70 mV. The importance of specialized backbone motifs for coordinating Ca²⁺ is well established (Strynadka and James, 1989; McPhalen et al., 1991). Thus, the bias toward the backbone being positive inside proteins extends even toward the binding sites for positive ions. With the exception of calcium and sodium, the backbone substantially destabilizes cation binding. These must be bound by protein side chains or anionic ligands.

The positive potential from the backbone at iron sulfur clusters has been previously described (Langen et al., 1992; Swartz et al., 1996). The very positive potential strongly favors the reduced over the oxidized form of these redox sites.

Many enzyme substrates such as ATP or GTP are nucleotides, while many cofactors such as flavins and nicotinamides are derived from nucleotides. Each has negatively charged phosphate groups. The average potential at the phosphates is 435 mV, which will substantially stabilize binding. Small anions such as phosphate or sulfate are also always bound in regions of positive potential from the backbone.

Structure of the amide group yields the imbalance between positive and negative regions generated by the protein backbone

Role of the neighboring amides in generating the bias toward positive potentials in proteins

The potential from each amide at each side chain was determined for 51 proteins that sample several folds and include the most and least positive VP values in each structural class (Table 7). This group of proteins is slightly more positive than the 305 proteins, yielding the small differences among Tables 5-7.


                              
TABLE 7   The contribution of the neighboring and distal amides to the potential at different amino acids

Each non-terminal side chain lies between two neighboring amides, one toward the N-terminal, the other toward the C-terminal (Fig. 2). All other amides in the protein are distal to this side chain. Phi and psi angles define the neighboring amide orientation, secondary and tertiary structures produce the arrangement of the distal amides. Analysis of the potential from neighboring and distal amides shows: 1) the potential from the neighboring amides is always positive; 2) the standard deviation of this potential increases as the flexibility of the side chain increases; 3) the potential from the distal amides is very variable, as seen in the large standard deviation of this value for each type of residue; 4) on average, the distal amides also raise the potential at all residues except at the bases Arg and Lys; and 5) the average potential for Cys from the distal amides is very positive. This is largely due to the very positive values at the Cys that are ligands in iron-sulfur clusters (Table 6) which are over-represented in the group of proteins.

The potential at a side chain (VS) is the sum of the potentials from the neighboring and the distal amides (Fig. 6). The neighboring amides contribute 122 ± 68 mV to the average. The relative constancy of this value shows that, independent of protein motif, the potential from the backbone starts with a bias of ~110 mV within all proteins. Proteins with average potentials less than this have contributions from each group's distal amides that are on average negative. The average potential from the distal amides in the different proteins ranges from -40 to 120 mV, extending to higher positive than negative values.



FIGURE 6   Comparison of the contribution of the neighbor and distal amides to the average potential for 51 proteins. Each residue is charged in turn in each protein and the potential collected at the two neighboring side chains and at the distal side chains. Different protein motifs: alpha, black square; beta, ; alpha + beta, black triangle; alpha/beta, open circle; others, triangle. The straight lines are described by neighboring amides, 86.91 + 0.18x (r² = 0.56); and distal amides, -86.9 + 0.82x (r² = 0.96).
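The two straight-line fits quoted in the Fig. 6 caption can be checked for internal consistency with a short sketch. It assumes that x is the average potential of a protein in mV (the caption does not name the axis), and the function names are illustrative only. Because VS is the sum of the neighboring and distal contributions, the two fits should add up to x itself, and the x value at which the distal fit crosses zero should fall near the ~110 mV bias described in the text.

```python
# Linear fits quoted in the Fig. 6 caption.
# ASSUMPTION: x is the average potential of a protein, in mV.

def neighbor_fit(x_mv: float) -> float:
    """Fitted contribution of the neighboring amides (mV)."""
    return 86.91 + 0.18 * x_mv


def distal_fit(x_mv: float) -> float:
    """Fitted contribution of the distal amides (mV)."""
    return -86.9 + 0.82 * x_mv


# Average potential at which the fitted distal contribution changes sign.
zero_distal_x = 86.9 / 0.82

if __name__ == "__main__":
    for x in (50.0, 109.0, 150.0):
        total = neighbor_fit(x) + distal_fit(x)
        print(f"x = {x:6.1f} mV  neighbor = {neighbor_fit(x):6.1f}  "
              f"distal = {distal_fit(x):6.1f}  sum = {total:6.1f}")
    print(f"distal fit crosses zero near x = {zero_distal_x:.0f} mV")
```

The slopes sum to 1.00 and the intercepts nearly cancel, so the two fits together reproduce x. Under the stated assumption the distal fit turns negative for proteins whose average potential is below roughly 106 mV.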

Why the potential from the neighboring amides is always positive

The potential from the neighboring amides at CB in a medium of uniform dielectric constant is solely determined by the phi angle (for amide(n)) and the psi angle (amide(c)) (Fig. 2). Under these simplified conditions it becomes clear why the potential from the neighboring amides at any residue is almost always positive. The impact of the surrounding solvent and extended side chains on the potential and resulting Delta Gbkn will be described below.
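To illustrate what a potential in a medium of uniform dielectric constant means, the sketch below sums Coulomb terms from a set of point charges. It is not the authors' calculation: the two charges, their positions, and the probe points are hypothetical values chosen only to show that a neutral dipole gives a positive potential on the side of its positive end and a negative potential on the other side.

```python
import math

# Coulomb constant in kcal*Angstrom/(mol*e^2).
COULOMB_KCAL = 332.06


def potential(point, charges, eps=4.0):
    """Potential (kcal/mol/e) at `point` from point charges in a uniform dielectric.

    `charges` is a list of (partial_charge_e, (x, y, z)) with positions in Angstrom.
    """
    total = 0.0
    for q, pos in charges:
        r = math.dist(point, pos)
        total += COULOMB_KCAL * q / (eps * r)
    return total


# HYPOTHETICAL neutral dipole: +0.3 e at the origin, -0.3 e at 2 Angstrom along x.
dipole = [(+0.3, (0.0, 0.0, 0.0)), (-0.3, (2.0, 0.0, 0.0))]

# HYPOTHETICAL probe points on either side of the dipole.
near_positive_end = (-2.0, 0.0, 0.0)
near_negative_end = (4.0, 0.0, 0.0)

if __name__ == "__main__":
    print(potential(near_positive_end, dipole))          # positive
    print(potential(near_negative_end, dipole))          # negative
    print(potential(near_positive_end, dipole, eps=80))  # same sign, smaller
```

Raising the uniform dielectric constant scales every value down by the same factor. A protein in water has a dielectric boundary, which this sketch does not model.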

The potential is shown visually for an amide group along with the CBs for which this is amide(n) and amide(c) (Figs. 2 and 7). The polypeptide chains are arranged with phi and psi angles found in alpha-helices or beta-sheets. In each case the CBs toward the N- or the C-terminal are in the region of positive potential from the amide.



FIGURE 7   Each amide forms the junction between two residues (Fig. 2 B). One amide is amide(c) for residue (i), with an orientation between side chain and amide determined by the psi angle. The same amide is amide(n) for the next side chain (i + 1), and their orientation is described by the phi angle. GRASP (Nicholls et al., 1991) pictures showing the two CBs (green) neighboring one amide in (A) an alpha-helix (phi = -52°, psi = -53°); (B) a beta-strand (phi = -123°, psi = 143°). The five atoms assigned charge are labeled, colored red (negative) or blue (positive), and given a radius that is proportional to the partial charge. The isopotential contours at +0.85 kcal/mol/e (blue) and -0.85 kcal/mol/e (red) were calculated with (C, D) epsilon peptide = epsilon solv = 4; and (E, F) epsilon peptide = 4, epsilon solv = 80.

The potential was determined as a function of the phi and psi angles at the middle CB in an Ala-tripeptide (Fig. 8). The potential from amide(n) is less than zero only for phi values between 40° and 180°, a region that is unfavorable for any residue but Gly because of steric hindrance between CB (of residue i) and the amide(n) (residue i-1) carbonyl oxygen (Ramachandran et al., 1974). Thus, the side chain is constrained to come off the backbone into the positive rather than the negative end of amide(n) because the carbonyl oxygen has a van der Waals radius that is much larger than that of the HN. The phi angles in alpha-helices lie close to the maximum value of the potential, while beta-sheets rotate the side chain into regions of lower potential from amide(n).



FIGURE 8   The potential at the middle CB in an Ala tripeptide from (A) amide(n) as a function of the phi angle and (B) amide(c) as a function of the psi angle (see Fig. 7 A). The potential was calculated with (bold line) epsilon peptide = epsilon solv = 4; (flatter, light line) epsilon peptide = 4, epsilon solv = 80. The relative occurrence of residues with different phi (C) and psi (D) angles in the 305 proteins considered in this study was determined with the program DSSP (Kabsch and Sander, 1983): Solid line, alpha-helix; heavy dotted line, beta-sheet; light line, other.

The potential from amide(c) is always positive, in part because the carbonyl C is always closer than the O to the CB. The region of maximum potential is at values for psi that are disallowed. The potential in helical regions is slightly larger than for beta-sheets.

The potential at CB from the neighboring amides is influenced by the dielectric properties of the surrounding solvent. Thus, the isopotential contours from an amide group are smaller when the amide is immersed in solvent (Fig. 7). However, the pattern of the variation of the potential with phi and psi is independent of solvent (Fig. 8).



FIGURE 9   The dependence of the average potential at the side chain (VS) on the length of the side chain. The average potential (VS), ; the contribution from the neighboring amides, triangle; the contribution of the distal amides, black square. Data from Table 7.

As the side chains become longer the potential from the neighboring amides decreases (Fig. 9). A decrease in the positive potential along individual side chains was noted previously (Spassov et al., 1997). In addition, longer side chains have more allowable rotamers with atoms in different positions relative to the amide dipole, which increases the deviation from the average potential (Table 7).

The amide orientation relative to the protein surface affects the intra-protein potential

Modified protein structures were prepared where the HN to N bond in the amide amine was lengthened to be as long as the O to C bond in the carbonyl and the HN radius was increased to the size of the O. The surface accessibility of O and HN in these modified structures provides a simple, rough estimate of whether each amide points its carbonyl or amine out toward the solvent. With few exceptions, if an amide O is more surface-exposed than its HN, this amide raises the potential in the protein (top right quadrant of Fig. 10). If the O is more buried the amide lowers the potential (bottom left quadrant). The same pattern is found for alpha-helical, beta-sheet, and random coil regions of all protein folds.



FIGURE 10   The difference in the exposure of the HN and O vs. the contribution of that amide to the average potential within the four-helix bundle 2HMQ, the beta-barrel 1HMR, the alpha/beta barrel 1TPF, and the alpha + beta protein 7RSA. Residues in alpha-helices (black square), in beta-sheets (black triangle), and in loops (open circle). The structures were modified as described in the Methods section to equalize the length and size of the HN-N and C-O dipoles. The potential was calculated with epsilon protein = 4, epsilon solv = 80.

The total contributions to the potential from amides with the HN more exposed, the O more exposed, or with little difference between their exposure were compared (Table 8). The residues that have little differential exposure contribute only a small amount to the average potential within the protein. For each protein, the contributions per amide from those with the O more exposed and those with the HN more exposed are of similar magnitude but opposite sign. However, there are always more amides where the O surface exposure exceeds that of the HN than amides with the opposite orientation. Overall 38 ± 6% of the O's in the 305 proteins studied here have at least 10% of their surface exposed, while only 17 ± 6% of the HNs are this exposed. The preponderance of surface-exposed carbonyl oxygens is another reason why the interior of all proteins is at positive potential. This provides a mechanism for raising the potential at buried ligands that lack the interactions with neighboring amides that raise the potential at side chains.


                              
TABLE 8   The contribution of amides to the potential in the protein depends on the amide orientation relative to the protein surface

How the positive potential from the backbone contributes to the free energy of ionized side chains in proteins

The free energy of interaction between side chains and the backbone

The potential is positive at the non-polar residues such as Val (average VS is 163 mV), Ile (145 mV), and Leu (126 mV) (Table 5). Moving from a potential of 0 into a potential of 163 mV would stabilize a negative charge by -3.75 kcal/mol or destabilize a positive one by an equivalent amount. However, despite the significant potential (Psi I), these neutral, non-polar residues contribute little to the free energy of side chain interaction with the backbone (Delta Gbkn), because the net atomic partial charge (qI) is near zero (Eq. 5). The large positive potential at non-polar residues supports the picture that forces other than favorable electrostatic interactions between side chain and amide dipoles are responsible for the predominantly positive protein interior. However, the average VS at the acidic residues Asp and Glu is 45 and 24 mV more positive, respectively, than at their polar analogs Asn and Gln. The bases Arg and Lys do have the least positive average VS. Thus, electrostatic interactions between backbone and side chains do contribute somewhat to the amide orientation that determines the potential.
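The conversion between a potential in mV and the interaction energy of a unit charge in kcal/mol can be sketched as follows. The factor 23.0605 kcal/mol per eV is the standard conversion; the function name is illustrative and is not taken from the paper.

```python
# Standard conversion: 1 eV per particle corresponds to about 23.0605 kcal/mol.
KCAL_PER_MOL_PER_EV = 23.0605


def interaction_energy_kcal(potential_mv: float, charge_e: float) -> float:
    """Energy (kcal/mol) of a charge `charge_e` (in units of e) at `potential_mv` (mV)."""
    return charge_e * (potential_mv / 1000.0) * KCAL_PER_MOL_PER_EV


if __name__ == "__main__":
    # Average VS at Val: a negative unit charge is stabilized, a positive one destabilized.
    print(interaction_energy_kcal(163.0, -1.0))
    print(interaction_energy_kcal(163.0, +1.0))
    # Average side chain potential over the 305 proteins.
    print(interaction_energy_kcal(109.0, +1.0))
```

With this factor, 163 mV corresponds to roughly 3.76 kcal/mol per unit charge, in line with the value quoted in the text, and 109 mV corresponds to about 2.5 kcal/mol.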

VS considers all side chain heavy atoms equally (Eq. 4a). In contrast, Delta Gbkn considers the partial charge on each atom and the potential (Eq. 5). Delta Gbkn is favorable at the basic residues despite the average side chain potential being positive. Thus, the atoms with positive charge must be in regions that are more negative than the average for the residue as a whole. In contrast, the average Glu VS is 108 mV while the average Delta Gbkn is only -68 meV (-1.6 kcal/mol). Thus, the potential must be more positive at atoms that cannot add to the favorable Delta Gbkn because they have little charge.

Loss of reaction field energy of ionized amino acids in proteins

The loss of reaction field energy (Delta Grxn) (Eq. 2) provides a quantitative measure of the distribution of buried charges in proteins. The interactions with the potential created by the backbone will be most important for buried, charged residues. Delta Grxn was calculated for the acids Asp and Glu, and bases Lys and Arg (Figs. 11 and 12; Table 9). Seventy percent have lost <4.1 kcal/mol of the reaction field energy they would have if free in water, shifting the residue pKa by <3 pH units (Eq. 6). Thus, as expected, most of these ionizable residues are near the surface. However, 30% (5501) have Delta Grxn >4.1 kcal/mol. Half of these have lost sufficient reaction field energy to shift their pKa values by 5 pH units (6.8 kcal/mol). A 5 pH unit shift destabilizes an ionized Asp, moving its pKa from 4 to 9. The same Delta Grxn shifts the pKa of an Arg from 12.5 to 7.5. Burial in the protein can also be assessed by the exposure of the side chain to the surface. The fraction of residues that have lost >6.8 kcal/mol Delta Grxn is comparable to the fraction of residues that have <10% of the side chain atoms with significant charge exposed to the solvent (Table 9).
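The correspondence between free energies and pH units used above can be reproduced with the standard relation Delta pKa = Delta G / (RT ln 10). The sketch assumes a temperature of 298.15 K, which is not stated in this section, and the function name is illustrative.

```python
import math

GAS_CONSTANT_KCAL = 1.987204e-3  # kcal/(mol*K)
TEMPERATURE_K = 298.15           # ASSUMPTION: room temperature

# Free energy equivalent to one pH unit, about 1.36 kcal/mol at 298.15 K.
KCAL_PER_PH_UNIT = GAS_CONSTANT_KCAL * TEMPERATURE_K * math.log(10)


def kcal_to_ph_units(delta_g_kcal: float) -> float:
    """Convert a free energy (kcal/mol) into the equivalent number of pH units."""
    return delta_g_kcal / KCAL_PER_PH_UNIT


if __name__ == "__main__":
    for dg in (4.1, 6.8):
        print(f"{dg} kcal/mol is about {kcal_to_ph_units(dg):.1f} pH units")
```

Under this assumption 4.1 kcal/mol is close to 3 pH units and 6.8 kcal/mol is close to 5 pH units, matching the thresholds used for Delta Grxn.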



FIGURE 11   The distribution of acidic and basic side chains with different values of Delta Grxn and Delta Gbkn in 305 proteins with different motifs. CHARMM charges were used for side chains and amides. The net charges in each run were +1 on the bases or -1 on the acids. epsilon protein = 4, epsilon solv = 80. Acids: , Asp; open circle, Glu. Bases: black square, Arg; , Lys.



FIGURE 12   The relationship between Delta Gbkn and Delta Grxn for the acidic and basic amino acids in 305 proteins. The bold line is for -Delta Gbkn = Delta Grxn. The dashed line shows the maximum value for Delta Grxn, when Grxn = 0 and Delta Grxn = -Grxn in soln (Table 2). The ±1.5 kcal/mol has been removed.


                              
TABLE 9   Loss of reaction field energy (Delta Grxn) for ionized acids and bases within proteins

Different propensities are found for burying each type of side chain. There are more buried Asp, similar numbers of buried Arg and Glu, and fewer buried Lys. Overall there are more buried acids than bases (Fig. 11, Table 9). This disparity becomes more significant as Delta Grxn increases. For residues where Delta Grxn is 4.1-6.8 kcal/mol, 56% are acids. Of the residues where Delta Grxn is >6.8 kcal/mol 62% are acids, representing 17% of the acids and 12% of the bases.

Interaction of ionized residues with the backbone

A buried acid or base with a large Delta Grxn will be neutral at physiological pH unless specific elements of the protein stabilize the charge (Eq. 6). Nearby charges or appropriately oriented dipoles can compensate for the loss of reaction field energy. The free energy of stabilization of each acidic and basic residue due to the electrostatic potential from the protein amide dipoles (Delta Gbkn) was calculated with Eq. 1 using CHARMM charges for the backbone (Table 1). Fig. 12 compares Delta Grxn and Delta Gbkn for individual amino acids. No surface-exposed residue (Delta Grxn ~ 0) has a large Delta Gbkn. However, buried groups have a wide range of interactions with the backbone. The straight line of slope 1 in Fig. 12 shows where -Delta Gbkn = Delta Grxn. If there were no other interactions (e.g., with the other protein side chains) the pKa of groups along this line would be identical to that found in solution. There are a small number of residues where stabilization by the potential from the backbone dipoles is larger than the destabilization due to removal from the water dipoles (Fig. 12 and Table 10). In the absence of other interactions the protein would shift the pKa of these acids to lower and of these bases to higher pH values. Prior calculations have shown that hyper-stabilized residues can be functionally important. For example, in the photosynthetic reaction center a cluster of buried acids remains significantly ionized because the acids lie in a region where -Delta Gbkn > Delta Grxn (Lancaster et al., 1996).
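The balance between Delta Grxn and Delta Gbkn described above can be turned into a rough pKa estimate. This is a minimal sketch that assumes no other interactions, a temperature of 298.15 K, and the standard relation of about 1.36 kcal/mol per pH unit; the function and argument names are hypothetical and do not reproduce Eq. 6 of the paper.

```python
import math

# ASSUMPTION: 298.15 K, giving about 1.36 kcal/mol per pH unit.
KCAL_PER_PH_UNIT = 1.987204e-3 * 298.15 * math.log(10)


def estimated_pka(pka_solution: float, dg_rxn: float, dg_bkn: float, is_acid: bool) -> float:
    """Estimate a pKa from the desolvation penalty and backbone interaction only.

    dg_rxn: loss of reaction field energy (kcal/mol, positive when buried).
    dg_bkn: interaction of the ionized group with the backbone (kcal/mol,
            negative when favorable).
    """
    net_penalty = dg_rxn + dg_bkn           # net cost of keeping the group ionized
    shift = net_penalty / KCAL_PER_PH_UNIT  # expressed in pH units
    # A penalty on the ionized form raises an acid pKa and lowers a base pKa.
    return pka_solution + shift if is_acid else pka_solution - shift


if __name__ == "__main__":
    print(estimated_pka(4.0, 6.8, 0.0, is_acid=True))     # buried Asp, no compensation
    print(estimated_pka(4.0, 6.8, -6.8, is_acid=True))    # on the line -dG_bkn = dG_rxn
    print(estimated_pka(12.5, 6.8, 0.0, is_acid=False))   # buried Arg, no compensation
    print(estimated_pka(4.0, 4.1, -6.8, is_acid=True))    # hyper-stabilized acid
```

A residue on the line -Delta Gbkn = Delta Grxn keeps its solution pKa in this picture, while a residue with -Delta Gbkn > Delta Grxn is shifted toward the ionized form.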


                              
TABLE 10   The interaction of ionized acidic and basic side chains with the backbone (Delta Gbkn)

There are fewer residues with large Delta Gbkn than large Delta Grxn (Tables 9 and 10). Only 14% of the acidic or basic residues have a Delta Gbkn whose magnitude exceeds 4.1 kcal/mol. The different types of side chains have the same order of propensities for large values of Delta Gbkn as for Delta Grxn (Asp > Glu ≥ Arg > Lys). However, the difference between acids and bases is far more striking. For example, Delta Gbkn is more favorable than -4.1 kcal/mol for 20% of the acids, while only 6.5% of the bases are stabilized beyond this threshold. For most residues Delta Gbkn is favorable. However, 80% of the strong, favorable interactions with the backbone are to acids, only 20% to bases. Of the small number of residues with unfavorable Delta Gbkn, 93% are bases (Figs. 11, 12). Thus, acids are more likely to be buried than bases and they are much more likely to be stabilized inside the protein by the potential from the amide dipoles. These distinctions are as expected if the potential from the protein backbone creates a bias to favor buried acids and raise the energy of buried bases.

The role of hydrogen bonds in creating favorable interactions between backbone and side chain

A hydrogen bond between the terminus of an acidic side chain and the amide HN or a basic side chain and the amide O generally indicates that the backbone will stabilize the charged residue. The necessity of hydrogen bonds for generating large values of Delta Gbkn was investigated (Table 11). Of the 1942 acids stabilized by >4.1 kcal/mol, 710 make no hydrogen bonds to the backbone. In contrast, of the 526 bases stabilized by the same amount, only 70 make no hydrogen bonds. This result highlights the bias toward the protein being positive inside. Thus, negative regions are almost always formed with local hydrogen bonds, while positive regions can be generated by longer-range interactions.
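The counts quoted above can be turned into fractions with a few lines of arithmetic. The sketch uses only the four numbers given in the text; the variable names are illustrative.

```python
# Counts quoted in the text for residues stabilized by >4.1 kcal/mol.
acids_stabilized = 1942
acids_without_hbond = 710
bases_stabilized = 526
bases_without_hbond = 70

# Fraction of strongly stabilized residues that make no backbone hydrogen bond.
acid_fraction_no_hbond = acids_without_hbond / acids_stabilized
base_fraction_no_hbond = bases_without_hbond / bases_stabilized

# Share of all strongly stabilized residues that are acids.
acid_share = acids_stabilized / (acids_stabilized + bases_stabilized)

if __name__ == "__main__":
    print(f"acids without hydrogen bonds: {acid_fraction_no_hbond:.1%}")
    print(f"bases without hydrogen bonds: {base_fraction_no_hbond:.1%}")
    print(f"acids among strongly stabilized residues: {acid_share:.1%}")
```

About 37% of the strongly stabilized acids make no hydrogen bond to the backbone, compared with about 13% of the bases. The acid share of roughly 79% agrees with the statement that 80% of the strong, favorable backbone interactions are to acids.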


                              
TABLE 11   Acids or bases that are stabilized by the backbone by more than 4.1 kcal/mol (3 pH units) without making hydrogen bonds to the backbone

    DISCUSSION

The average potential from the neutral amide dipoles (VP) is found to be positive in every protein (Table 4, Fig. 4). Larger regions of each protein are at positive rather than negative potential (Fig. 3) and this potential is often large (Tables 5 and 6). The numerical value of the potential depends on the charge distribution used for the amide and the dielectric constant for the protein. However, the average remains positive even when these parameters are varied (Table 4). The potential from the backbone is positive within all proteins for two reasons. First, the side chains of all residues come off the backbone into the positive end of both their neighboring amides (Fig. 2 A). The regions of phi/psi space where side chains are close to the carbonyl oxygen are disallowed because of van der Waals overlap (Ramachandran and Sasisekharan, 1968). The HN proton is much smaller, so the side chain can come closer. Second, the orientation of the amide at the protein surface influences the interior potential. The larger, more highly charged carbonyl O is more than twice as likely to be oriented into the solvent than the amine HN. The amides with their O's more surface-exposed raise the interior potential (Fig. 10). The restrictions in phi/psi space influence the interactions between amides and their neighboring side chains. The distribution of amide orientation at the protein surface raises the potential at distal side chains and bound ligands.

It is remarkable, given the complexity and uniqueness of individual proteins, that the neutral backbone yields a potential that is, on average, significantly positive in every protein. The question is how this bias affects protein structure and function. Empirical rules determined from the distribution of residues in protein structures have established the importance of other forces in proteins. Thus, the hydrophobic effect is recognized by many, though not all, non-polar residues being buried. Similarly, the solvation of charged residues stabilizes them on the surface, where the majority are found (Table 9).

The analysis of the distribution of acidic and basic side chains reveals that despite the energetic penalty for removing charges from water, many are buried. However, there are significantly more buried acids than bases. This is as expected if the positive potential from the amides affects side chain location. There are 1.7 times as many acids that have lost 6.8 kcal/mol (5&nbs