| HOME | HELP | FEEDBACK | SUBSCRIPTIONS | ARCHIVE | SEARCH | TABLE OF CONTENTS |
Biophys J, January 1999, p. 28-39, Vol. 76, No. 1
Molecular Modeling Laboratory and Department of Cell Biology and Biochemistry, United States Army Medical Research Institute of Infectious Diseases, Frederick, Maryland 21702 USA
| |
ABSTRACT |
|---|
|
|
|---|
A continuum model is provided of the free energy terms
that contribute to the molecular association of ricin A-chain (RTA) with the rRNA substrate and several small ligands. The model for RTA
interactions with the RNA was taken from a previously proposed complex
containing a 29-mer oligonucleotide hairpin (Olson, 1997
. Proteins 27:80-95), and models for the ligands were
constructed from x-ray crystallographic structures. The calculated
absolute free energies of complex formation for the RTA-RNA assembly
and several single-residue substitutions are in good agreement with experimental data, given the approximations of evaluating the strain
energy and conformational entropy. The free energy terms were found to
resemble those of protein-protein complexes, with the net unfavorable
electrostatic contribution offset by the favorable nonspecific
hydrophobic effect. Decomposition of the RTA-RNA binding free energy
into individual contributions revealed the electrostatic "hot"
spots arising from charge-charge complementarity of the interfacial
arginines with the RNA phosphate backbone. Base interactions of the
GAGA loop structure dominate the hydrophobic complementarity. A
linear-scaling model was parametrized for evaluating the binding of
small ligands against the rRNA substrate and illustrates the free
energy determinant required for designing specific RTA inhibitors.
| |
INTRODUCTION |
|---|
|
|
|---|
Ribosome-inactivating proteins (RIPs) are a large
family of N-glycosidases that exhibit exquisite specificity
in their catalytic hydrolysis of a single adenine base from among
nearly 7000 nucleotides found in mammalian ribosomes. Depurination
occurs at the first adenine in the loop sequence GAGA, located in a
highly conserved, single-stranded rRNA hairpin. Included among the RIPs
is the extraordinarily toxic ricin protein. Isolated from the
seeds of the castor bean plant Ricinus communis, ricin is a
globular heterodimer consisting of a 267-residue catalytic A-chain
(RTA) linked by a disulfide bond to the cell-surface binding B-chain of
262 residues. Molecular recognition of the catalytic adenine can been
inferred from x-ray crystallographic structures of RTA complexed with
substrate analogs (Monzingo and Robertus, 1992
). The putative adenine
binding mode consists of a combination of specific hydrogen bonds and
aromatic stacking between two invariant tyrosine side chains in the
enzyme specificity site. Available are site-directed mutagenesis
studies elucidating structure-function correlates of ricin (Day et al., 1996
; Kim and Robertus, 1992
; Ready et al., 1991
; Frankel et al., 1990
).
Owing to the remarkable cytotoxicity of ricin, there is considerable
interest in the search for strong competitive inhibitors, yet only
modest success has been achieved to date for small molecules (Yan et
al., 1997
). There is more promise for a transition-state mimic
containing a modified 14-mer ribonucleotide hairpin (Chen et al.,
1996
), though its molecular size limits its practical use as a
therapeutic inhibitor. The failure of smaller similar ribonucleotide
loops to inhibit ricin (Link et al., 1995
) indicates that many issues
remain to be clarified regarding ligand size requirements for
inhibition. Questions arise as to what interactions are critical for
the substrate remote from the catalytic adenine recognition site and,
more importantly, whether small molecules can effectively block the
strong binding of RTA to an extended rRNA substrate.
To better understand the structural interactions between ricin and the
rRNA substrate, a modeled complex was recently developed (Olson, 1997
)
by molecular docking the RTA crystal structure with a 29-mer
oligonucleotide hairpin obtained from an NMR solution structure
(Szewczak and Moore, 1995
; Szewczak et al., 1993
). This particular
oligonucleotide contains the GAGA loop sequence and is depurinated by
RTA. Conformational sampling of docking the two structures was
performed by using a molecular dynamics (MD) simulated-annealing
method. The calculations generated a large ensemble of complex
configurational states and their analysis revealed a vast network of
interactions between the two molecules, mediated by an array of contact
sites arising from a group of arginines. To make the connection with
experimental binding data, this paper extends the previous study by
analyzing the free energy determinants of complex formation for RTA
interactions with the RNA and several small ligand molecules.
We report the results of a continuum model developed for estimating
macromolecular interactions in solution. Electrostatic free energies
were evaluated by solving the nonlinear Poisson-Boltzmann (NLPB)
equation (Reiner and Radke, 1990
; Sharp and Honig, 1990
), and a
free-energy relationship based on the molecular surface area was used
for modeling the hydrophobic effects (Jackson and Sternberg, 1994
). A
comparison of the free energies for both the RTA-RNA assembly and
several protein mutants with experimentally determined affinities is
presented and provides confidence in the overall validity of the
modeled complex. We also present the binding free energy for the
RTA-RNA complex dissected into individual contributions of protein
residues and RNA nucleotides. Understanding these module or group free
energies reveals the binding "signature" (or "fingerprints";
Muegge et al., 1997
) of the interfacial surface, and allows for a
computationally tractable assessment of the structural and functional
features critical for molecular association with the rRNA substrate.
Also reported are free energy terms for the small ligands in their
crystallographic assemblies. Their contrast with the substrate yields
insight into the difficulties encountered in designing small effective
inhibitors against RTA. The paper concludes with a linear-scaling model
that combines the substrate and small ligands into a single continuum
framework that should prove useful in gauging the binding free energies
of novel inhibitors found either by de novo design methods or database searching.
| |
METHODS |
|---|
|
|
|---|
The numerical determination of the binding free energies
(
Gbind) for the RTA-RNA complex is based on
the calculational approach where molecular association is partitioned
into independent contributions arising from electrostatic
(
Gele) and nonpolar
(
Gnp) interactions, plus a strain-energy
(
Gstrain) and conformational entropy
(T
S) term (Gilson and Honig; 1988
; Jackson and
Sternberg, 1995
; Froloff et al., 1997
)
|
(1) |
S combines the terms of
side-chain torsional and translational-rotational entropy, and
T is the absolute temperature. To calculate the total
electrostatic free energy of the system, we used a thermodynamic
pathway of association where
Gele is expressed as a sum of three terms (Gilson and Honig, 1988
|
(2) |
|

Gsolprot-RNA corresponds
to the loss of charge-solvent interaction energy through the partial
desolvation of the electrostatically charged protein on binding;

GsolRNA-prot is the loss of
charge-solvent interaction energy through the partial desolvation of
the RNA on binding; and the term
Gintprot-RNA is the electrostatic
interaction energy between the two macromolecules embedded in the
solvent dielectric continuum. The desolvation energetics are evaluated
from the differences (Gilson and Honig, 1988
|
(3a) |
|
(3b) |
Gsolprot-RNA and
GsolRNA-prot are determined by setting
the atomic charges for the binding partner to zero. This is
accomplished by using a charging cycle similar to that introduced by
Lee et al. (1992)
s) from the
space occupied by the binding partner and replacing it with the low
dielectric medium of the macromolecule (
m). The sum of
electrostatic free energy terms in Eq. 2 is given by the volume
integral of the fixed charge density,
, and the potential,
,
where the potential is evaluated from the NLPB equation (Reiner and
Radke, 1990
|
(4) |
is a function of the Debye length and ionic strength
of the bulk solution, and r is the position vector in the
reference frame centered on the protein-RNA complex. Equation 4 is
typically solved by the finite-difference procedure (Nicholls and
Honig, 1991
|
(5) |
i is the
potential generated by the protein at this point.
Atomic coordinates for the RTA-RNA complex were taken from a previously
reported MD simulated-annealing study of docking RTA with the 29-mer
RNA molecule (Olson, 1997
). The RNA sequence is displayed in Fig.
1 a, where the adenine
depurinated by RTA is designated as A15. Single-residue mutants Y80F,
N209S, and R180Q were constructed from the wild-type complex by
replacing the native side chains with the most favorable substituted
conformers. Partial atomic charges for the protein and RNA were derived
from the AMBER force field (Weiner et al., 1986
). For the mutant R180Q,
conformational relaxation of the phosphoribose backbone centered around
A15 and G16 was performed by using a short energy minimization
calculation (100 cycles using a steepest descent algorithm) and an MD
simulation (1000 iterations at a temperature of 300 K using a timestep
of 1 fs). The catalytic water in the RTA-RNA complex (Olson, 1997
) was
treated explicitly by using the TIP3P model (Jorgensen et al., 1983
).
Force-field cutoffs were set at 12 Å and a dielectric constant of
= 1 was implemented.
|
The DelPhi software package (Gilson et al., 1988
) was used to solve Eq. 4. The complexes and the surrounding solvent were mapped onto a
1733 lattice. Water-accessible surfaces of the
macromolecules were constructed by using a 1.4-Å-radius probe to
define regions of the low dielectric medium (modeled using
m values in the range of 2 to ~7), embedded in a
dielectric solvent water (
s = 80). An ion exclusion
radius of 2.0 Å was added to the surfaces to account for ion size.
Parameters defining atomic radii were taken from the CFF91 molecular
mechanics force field (Maple et al., 1994
). The ionic strength was set
at physiological conditions of 0.145 M. Ionization states for residues
Asp, Glu, Lys, Arg, and His were set corresponding to a neutral pH.
Full Coulombic boundary conditions were applied for all calculations
and the number of nonlinear iterations for solving the NLPB equation
was set at 250. Dummy atoms were used to retain an identical scale and
position on the grid for the complexes and individual structures. Systematic errors were estimated to be <2% when compared with calculations carried out with increased grid resolutions. Calculational dependencies of the results on the position of the molecule on the grid
were acceptably small and, therefore, neglected.
Atomic coordinates for ligands pteroic acid (PTA), formycin
5'-monophosphate (FMP), and adenyl-3',5'-guanosine (ApG) bound to the
RTA protein were kindly provided by Dr. Jon Robertus (Yan et al., 1997
;
Monzingo and Robertus, 1992
). Chemical formulas for PTA and FMP are
illustrated in Fig. 1, b and c, with hydrogens added according to the suggested PTA structure (Yan et al., 1997
, 1998
)
and from previous modeling calculations of FMP (Olson et al., 1995
).
The phosphate group of FMP was modeled either as a monoanion or
dianion. Charges for the ligand-bound proteins were the same as the
RTA-RNA complex, while those of PTA and FMP were derived from ab initio
quantum mechanical calculations by using the 6-31g** basis set. For
the dinucleotide ApG, AMBER charges were used. Calculations of the
electrostatic free energies were performed similar to those described
above for the protein-RNA complex.
Of the remaining free energy terms in Eq. 1, we endeavored only to
calculate the nonpolar contribution arising from the hydrophobic effect, using the following expression for the cavitation free energy
(Jackson and Sternberg, 1994
)
|
(6) |
A is the change in the molecular surface
area on complex formation and
is the surface tension, initially set
at 68 cal/mol/Å2. This value of the surface tension was
taken from a recent continuum analysis of an antibody-antibody complex
and its 16 alanine substitutions (Olson, 1998| |
RESULTS AND DISCUSSION |
|---|
|
|
|---|
We begin this section with an analysis of the absolute binding free energy for the RTA-RNA complex and its decomposition into individual amino acid residues on the protein interfacial surface and nucleotides on the RNA surface. This is followed by a discussion of calculations on small ligands.
Binding free energies for the RTA-RNA complex
Table 1 presents the results of Eq. 1 for the calculated total free energy of molecular association
(
Gcalc) for the wild-type protein and mutants
Y80F, N209S, and R180Q, each interacting with the 29-mer RNA.
Illustrated in Fig. 2 is the analyzed
complex configuration, which was extracted from an ensemble of
conformers generated by MD simulation methods (Olson, 1997
). This
configuration consists of a large network of molecular contacts where
the target adenine base (designated as A15) of the GAGA hairpin
structure is positioned in a recognition site between two tyrosine
rings, one being the residue Tyr-80. The side chain of Asn-209 is
hydrogen-bonded with the guanine base of G14, and Arg-180 contacts both
the adenine base of A15 and the phosphate backbone linking the guanine
G16. A putative catalytic water molecule is positioned in the active site near Arg-180 and was included explicitly in the treatment of the
continuum description. Calculations on this specific conformer were
carried out by using several different mean-field parameters describing
the "macromolecule dielectric" and surface tension. The
experimental binding free energy for the wild type was estimated from
the expression
Gexpt =
RTlnKm, where
Km is the reported Michaelis constant (Kim and
Robertus, 1992
), and R is the universal gas constant. For
the three mutants, relative changes in binding free energies were
estimated from the reported Km values and the catalytic rate constants (Kim and Robertus, 1992
; Ready et al., 1991
) by using an approximation that treats the kinetic
off-rates as constant and much smaller than the corresponding turnover
rates.
|
|
The first set of calculations applied parameter values of
m = 2 and
= 68 cal/mol/Å2, where the
dielectric constant is in the range of values commonly implemented in
continuum models of protein-protein (e.g., Xu et al., 1997
, and
references cited therein) and protein-DNA interactions (Misra et al.,
1994
). Although both continuum parameter values appear to be
reasonable, they are, nonetheless, somewhat arbitrary in their choice
for constructing models of protein-nucleic acid interactions and serve
only as initial estimates. Clearly an
m of 2 accounts
only for electronic polarization and not the protein and solvent
reorganization resulting from mutations. The value of
m
can approach ~2 only when structural relaxation is explicitly taken
into account (Sham et al., 1998
; Warshel and Papazyan, 1998
; Muegge et al., 1998
), and the simulated-annealing method used in
deriving the RTA-RNA complex (Olson, 1997
) achieved the relaxation needed for applying this dielectric value to the wild-type structure. However, and as described further below, the effect of reorganization for the mutants can be modeled implicitly by linearly scaling the free
energy function.
The free energy term
Gconf in Table 1
represents the sum of conformational entropy and strain energy
contributions, and the average of this term was estimated by
empirically fitting the
Gcalc for the
wild-type structure and mutants Y80F and N209S to the
Gexpt values. For this particular dielectric
model, the mutant R180Q was eliminated in the numerical fit. The fitted
Gconf of ~16 kcal/mol is physically
plausible given the size of the protein-RNA complex (Fig. 2
a). Theoretical estimates for protein-protein assemblies
indicate that the rotational and translational entropy loss on complex
formation is in the range of 2 to 15 kcal/mol at room temperature (Page
and Jenks, 1971
; Erickson, 1989
; Finkelstein and Janin, 1989
; Tidor and
Karplus, 1994
; Murphy et al., 1994
). As for strain energies, values
depend on the magnitude of induced fit and should be considerably less
than the free energies of unfolding for both the unbound protein and
RNA molecule. Further evaluations of the continuum model with
m set at 3 or higher yielded unrealistic values for
Gconf (depending, obviously, on the choice of
), and thus suggest a reasonable a priori treatment of the
electrostatics using an
m = 2.
Notwithstanding the many caveats inherently built into the continuum
model, the potential of mean force (
Gele +
Gcav
27 kcal/mol) plus a customary
nonfitted
Gconf estimate of ~15 kcal/mol (10 for entropic and 5 for strain energy; see, e.g., Froloff et al.,
1997
) yields a
Gcalc for the wild-type
assembly in good accord with
Gexpt, which has
an observed value of ~
9 kcal/mol. The significance of this
agreement is that the simulated-annealing method sampled conformational
space sufficiently well in finding a conformer that represents an
accurate description of the global binding mode of the rRNA
substrate. The continuum model reveals that
Gint is the largest free energy term, showing
significant protein-RNA electrostatic interactions. This favorable term
is, however, offset by the electrostatic desolvation cost of both molecules, particularly the RNA structure. The large penalty in desolvating the RNA is not unexpected and is the result of the negatively charged phosphate backbone involvement in the association process. An accurate treatment of the desolvation energetics for the
RNA is critical in achieving a realistic total free energy of binding.
A comparison of 
GsolRNA-prot
calculated by using the linearized Poisson-Boltzmann equation with the
nonlinear form (Eq. 4) shows an increase of nearly 9 kcal/mol, which
results in a significantly less favorable
Gcalc.
Overall, our calculations indicate that the free energy terms
describing the molecular association of the RTA protein with the RNA
parallel those of protein-protein complexes, as analyzed by similar
continuum approaches (Lee et al., 1992
; Jackson and Sternberg, 1995
;
Froloff et al., 1997
; Olson and Cuff, 1998
; Olson, 1998
; Muegge et al.,
1998
). Most importantly, the nonspecific hydrophobic effect provides
the driving force underlying favorable complex formation
(
Gcav =
95.5 kcal/mol), while the net
electrostatic contribution, with its role in conferring conformational
specificity, opposes binding (
Gele = +68.6
kcal/mol). Our result of this balance between the hydrophobic effect
and electrostatic specificity for the RTA-RNA complex is likely to be a
generalization of proteins interacting with nucleic acids and simply
extends earlier models of protein-protein association (e.g., Chothia
and Janin, 1975
), and the more recent analysis of ligand-DNA
interactions (Misra and Honig, 1995
). The magnitude of the predicted
hydrophobic driving force in stabilizing RTA-RNA binding is reflected
in a buried molecular surface area of ~1400 Å2, which is
comparable to large protein-protein interfaces. As described further
below, the many contacts of this interface and their requisite for
recognition and binding have important implications for the development
of RTA inhibitors.
An improvement may be gained for Y80F in the relative free energy
difference from the wild-type complex if the change in side-chain torsional freedom plays a favorable role in binding the rRNA substrate. The crystal structure of Y80F exhibits multiple side-chain
conformations (Kim and Robertus, 1992
), whereas the wild-type structure
shows the tyrosine ring anchored by interactions of the hydroxyl group with the backbone of a neighboring active-site residue (Katzin et al.,
1991
; Rutenber et al., 1991
). Co-crystal structures of RTA and several
other RIPs bound with substrate analogs show disorder in the side chain
of the tyrosine (Weston et al., 1994
; Ren et al., 1994
). Previous MD
simulation analysis of the binding of ApG to RTA found similar
conformational transitions for Tyr-80 (Olson, 1997
). Although these
observations suggest an entropic role for this residue and its
phenylalanine substitution in binding the substrate, a full assessment
of the side-chain conformational freedom is, nevertheless, difficult to
estimate quantitatively by sampling active-site dynamics of both the
unbound and bound states, and requires further study.
For the mutant R180Q, the putative catalytic water molecule was
repositioned in the active site to bridge the glutamine side chain with
N3 of A15 (Day et al., 1996
), and the resultant structure was subjected
to a short MD simulation permitting structural relaxation. Despite
these efforts,
Gcalc shows a significant loss
in free energy from the wild type. The
Gele
penalty can be reduced to obtain a more realistic
Gcalc in better agreement with
Gexpt by an increase in
m to a
value set at ~2.8, assuming the same fitted
Gconf. Using this dielectric constant yields
a predicted
Gcalc =
4.3 kcal/mol (Table 1),
which is in accord with
Gexpt =
5 kcal/mol.
The requirement of a larger
m for modeling the mutation
of a charged residue is not surprising, and it reflects the need for
greater effective damping of the interactions within a dielectric
continuum. In effect, scaling
m to larger values reduces
the overestimation of both solvation and interaction energetics while
avoiding the explicit reorganization of the pre-oriented dipoles found
in the native complex (Sham et al., 1998
; Warshel and Papazyan, 1998
;
Muegge et al., 1997
, 1998
).
Assuming that
m is an adjustable parameter that
represents the contributions that are not treated explicitly in
continuum models, values of
Gcalc
corresponding to the wild-type and mutants Y80F and N209S can be
reconciled with the experimental values if one readjusts
Gele to reduce the overestimation of
electrostatic energetics and absorbs the entropic term into
Gcav. The second set of calculations
presented in Table 1 represents this empirical approach, which consists
of free energies obtained from the following equation:
|
(7) |
and
parameters are electrostatic and cavitation
scaling factors, respectively, and are used to fit new values of
m and
. Albeit this equation was evaluated as a
mean-field model without thermal conformational sampling, it resembles
a linear-response approximation (LRA) constructed with two free energy
terms (Hansson et al., 1998
Gconf. Continuum
parameters for this model were fitted to
Gexpt and yielded a larger
m
value of ~7 and a lower
of ~20 cal/mol/Å2, where
the surface tension is now more characteristic of "microscopic" values used with modeling solvent-accessible surfaces (Jackson and
Sternberg, 1994
Gcalc and in the relative free energy changes
are significantly reduced, while R180Q still requires a larger
m value of ~9 to achieve an agreement with
Gexpt. The hydrophobic effect remains the
driving force for complex formation, although
Gcav now represents an effective free energy
term depending on more than molecular complementarity.
Decomposition of binding free energy
The free energy terms
Gele and
Gcav calculated above represent an aggregate
of many contributions at the binding interface between RTA and the RNA,
which can be decomposed on a per-residue level into individual
contributions arising from both molecules. Various calculational
schemes are available for dissecting binding free energies, the most
notable being the recent methods developed and applied by Warshel and
co-workers (Muegge et al., 1996
-1998
). Here, for electrostatics of the
protein, atomic charges were switched on and off one residue at a time
for interfacial side chains within 3.5 Å of the RNA molecule.
Electrostatics of the backbone and the protein outside of the interface
were treated as part of the remaining aggregate. The hydrophobic effect
was decomposed by using a per-residue buried molecular surface area.
Similar calculations were conducted for the RNA, except the CGAGAG loop
region was dissected in terms of structural groups corresponding to the
bases, phosphate groups, and ribose rings. Combined, the sum of
individual contributions of each molecule assures a complete
thermodynamic cycle and provides an efficient approach of gauging
residue contributions to molecular association without the
computational task of evaluating free energies arising from mutations.
Although this approach is more efficient, a direct comparison with
site-directed mutagenesis is only an approximation. Nevertheless, it is
likely that the relative ranking of these artificial mutants
correlates well with experimental measurements using glycine
substitutions of the protein interacting with, for example, the 14-mer
ribonucleotide transition-state mimic inhibitor (Chen et al., 1996
), or
a 29-mer RNA hairpin with the catalytic adenine replaced with a
nonhydrolyzable complementary ring.
Summarized in Table 2 are the key
contributions for RTA residues interacting with the RNA molecule
calculated by using the scaled parameters
m = 7.1 and
= 20.0 cal/mol/Å2 to evaluate the continuum model.
Similar qualitative results are obtained using
m = 2 and
= 68 cal/mol/Å2 with the entropic factor included.
Alternatively, Warshel and co-workers (Muegge et al., 1997
, 1998
)
suggested the use of a priori
m values of 4 and 20 for
uncharged and charged residues, respectively, in their studies of
dissecting the free energies of protein complexes, although they
neglected the cavitation energetics. The components of
Gcalc are now defined for contact residues (
Gres), where

Gsol is the desolvation of RTA side chains
upon binding the RNA,
Gint is the
corresponding residue side-chain electrostatic interaction, and
Gcav is the effective contribution of the
total residue to the net cavitation energy. Illustrated in Fig.
3 are values for
Gele and
Gcav
projected onto the protein molecular surface (Nicholls et al., 1991
).
Surfaces colored red depict "hot" residues that contribute
significantly to the favorable RNA association, whereas dark
blue-colored surfaces are "cold" residues that destabilize the
binding. The bound ligand shown is the target adenine of the 29-mer RNA
hairpin.
|
|
The decomposed
Gcalc indicates significant
stabilization to the RNA association arising from the positively
charged arginines located in the active-site cleft. These
solvent-exposed residues easily offset their desolvation penalties,
allowing favorable
Gele values to provide
electrostatic complementarity to the negatively charged RNA phosphate
backbone, which is characterized by a much greater desolvation cost.
Several residues of this arginine-rich binding motif are either
invariant or conserved among the RIPs. The two most favorable
contributing residues, Arg-213 and Arg-258, function by anchoring
the phosphate backbone of the Watson-Crick basepair of C13 and G18,
which closes the loop structure of the RNA hairpin (Fig. 2
b). Arg-213 is conserved, showing a positive charge in
nearly 60% of the RIPs, while Arg-258 is highly variable and may be
responsible, in part, for differentiating the binding affinities among
the RIPs for the substrate. An additional possible significance of
these two arginines is that their high affinities may "unzip" the
basepairs in RNA ligands with short stems (Link et al., 1995
), leading
to the lack of inhibition. Further contributions to this RNA molding
are likely to arise from arginines 48 and 134, with the former residue
variable and the latter invariant. The side chain of Arg-134
contributes significantly to substrate specificity through
electrostatic interactions with the base of G14.
The calculations also indicate that in contrast to the arginines, all
of the negatively charged residues in contact with the RNA produce
unfavorable
Gele contributions. These
electrostatic values, ranging from ~2 to 9 kcal/mol, clearly depend
on the choice of ionization state for each asparagine and glutamate.
The net destabilization effect of these negative charges is conceivably the energetic cost of providing a network of charge-charge balance needed in stabilizing the tertiary structure of the unbound active-site cleft (Day et al., 1996
; Kim and Robertus, 1992
). For example, disrupting the ion pair between Glu-177 and Arg-180 by replacing the
arginine with a histidine affects the global stability of RTA if the
imidazole ring is deprotonated (Day et al., 1996
). Further mutagenesis
can be suggested from our electrostatic calculations, particularly
substitutions of many of the high-affinity arginines and several of
their hydrophilic bridges in the active site with hydrophobic isosteres
(Hendsch and Tidor, 1994
), thus predicting a stable protein yet weaker
binding complex. The particularly large unfavorable contribution from
the invariant Glu-177 near the target adenine may also help stabilize
the charge distribution of the transition-state complex.
Hydrophobic effects appear diffuse across the large interfacial area
(Fig. 3), showing an average per residue
Gcav
of ~
0.5 kcal/mol. Two clusters of residues exhibiting larger values
are the tyrosines 80 and 123, and the cluster of Asn-209 and Arg-213. The complementary
Gres values for the RNA
hexamer loop region, dissected per nucleotide, are presented in Table
3. Nearly half of the net RNA
cavitation energy of ~
14 kcal/mol is contributed by the tetraloop
base structure, and the phosphoribose backbone contributes slightly
less. The calculations indicate that stacking A15 between the tyrosines
together with stacking G16 in a recognition site on the opposite side
of Tyr-80 (Fig. 2 b) forms a compacted hydrophobic core with
comparatively low desolvation energetics, reminiscent of protein
folding.
|
In terms of electrostatic specificity, only the base of G14 can easily
balance its desolvation penalty, showing a
Gele =
1.8 kcal/mol. The guanine base
forms hydrogen bonds with Asn-209 and Arg-213, and the pair-wise
Gint for these two interactions are
calculated to have affinities of
0.8 and
2.6 kcal/mol,
respectively. Although both hydrophilic interactions are favorable,
neither is sufficiently strong to compensate for their desolvation cost in stabilizing complex formation (from Tables 2 and 3,

Gsol of 1.3 kcal/mol for the N209-G14
interaction and 5.6 kcal/mol for R213-G14). Rather, the calculations
show that G14 is buried in a favorable electrostatic environment of
surrounding protein residues, and similarly, the two RTA side chains
are buried in an RNA environment that allows binding to occur. A
continuum analysis of hydrophilic bridges for protein-protein complexes
suggests similar favorable energetic roles in stabilizing complex
formation (Xu et al., 1997
). The specific interactions of the A15 base
is balanced by favorable and unfavorable contributions arising from Arg-180 and Glu-177, plus backbone interactions of Val-81 and Gly-121
in the aggregate contribution. The larger desolvation cost of A15 and
G16 relative to the other bases is a result of the unstacking and
rotation of both nucleotides from their initial positions found in the
unbound NMR structure (Szewczak and Moore, 1995
).
RNA binding is dominated by the nonspecific interactions of the
phosphate backbone. Net electrostatic contributions
(
Gele) from the backbones of A15 and G16 are
weak, while the flanking nucleotides exhibit favorable energetics.
Again, this effect is a function of the perturbation placed on rotating
the reverse turn of the hairpin, resulting from the electrostatic
complementarity of the active-site cavity. Contributions arising from
the ribose groups are minimal for this particular RNA conformer in
binding the protein.
Binding free energies for small ligands
Presented in Table 4 are the
Gcalc values for ligands PTA, FMP, and ApG,
bound to the active site of RTA. Calculations of the NLPB equation were
carried out with continuum parameter values set at
m = 7.1 and
= 20.0 cal/mol/Å2, as used in the free energy
decomposition of the RTA-RNA assembly. The FMP binding free energy was
calculated for three different structures, where the first two used the
x-ray crystallographic complex (Monzingo and Robertus, 1992
) and
modeled the phosphate group as a monoanion and dianion. The third
structure was taken from a previously reported MD simulation of the
dianion ligand with RTA (Olson et al., 1995
).
|
The calculations indicate that for PTA and ApG, the free energies of
binding reproduced the estimated
Gexpt values
within ±1 kcal/mol. Similar calculations, using
m = 2 and
= 68 cal/mol/Å2, yielded overestimates of
favorable binding, and thus require various assumptions regarding
entropic costs. As mentioned above, the ansatz of absorbing the
Gconf term for the RNA complex into the
continuum description permits direct comparisons with the binding of
small ligands (described further below). For ligand molecules it has
generally been argued, based on crystallographic thermal factors and
mean-free paths, that the configurational volumes for translational
freedom do not differ appreciably between the bound and free states
(Murphy et al., 1994
). Nevertheless, cratic entropy resulting from
mixing may lead to free energy penalties estimated at 2 kcal/mol for
molecular association at room temperature (Murphy et al., 1994
). Many
empirical studies of binding small ligands to proteins apply either ad
hoc values for translational and rotational entropy of 1 to 2 kcal/mol,
or simply neglect the contribution altogether. In either case, the
Gcalc values using
m = 7.1 and
= 20.0 cal/mol/Å2 appear to be good estimates of
ligand binding.
For calculations on FMP, analysis of the crystal structure modeled
either as a monoanion or dianion ligand yielded values of
Gcalc showing large errors, while the
simulation structure for the dianion reveals a much better agreement. A
possible explanation of this variance may lie in differences of the
binding thermodynamics found in the crystal and in aqueous solution.
The more favorable interaction energy of the simulation structure
results from an ion-pair interaction of the phosphate group with
Arg-258 (Fig. 4 a); an
interaction similar to those dominating the RNA binding. Concomitantly,
this repositioning of the phosphate group accounts for the larger
desolvation costs of both the protein and ligand. The increase in
Gcav is due to the overall optimization of
the complex, in particular, packing of the formycin ring between
tyrosines 80 and 123.
|
A comparison of the free energy terms governing the binding of the
ligands with those of the 29-mer RNA using the same continuum parameter
set (see Table 4 versus Table 1) clearly shows the task at hand in
designing small inhibitors that effectively compete for the active site
of RTA. As expected, the individual energetics for each of the terms
are significantly reduced for the ligands. For example, the cavitation
energies are nearly 20 kcal/mol less stable, and similarly, the
electrostatic interactions are weaker by more than 30 kcal/mol. What is
important, however, as found for the native protein-RNA complex, is the
minimization of
Gele and its offset achieved
by optimizing
Gcav. The ligand PTA exhibits the most favorable
Gele, while ApG shows the
largest buried surface area for determining
Gcav. Neither of the two ligands attracts many of the critical binding points found stabilizing the rRNA substrate, which appear scattered across the large buried surface area
(see Fig. 3). Decomposition of
Gcalc for the
RTA-PTA complex (Table 5) shows Arg-180
making favorable contributions to
Gele, while
many of the arginines predicted in binding the 29-mer RNA contribute
very little.
|
Our calculations for contrasting the small ligands with the large RNA
molecule suggest that an analysis of the key chemical groups in the
active site that interact with the substrate should be useful in the
design of effective inhibitors by focusing on chemical moieties
accessible to the ligands. For example, possible improvement in the
binding affinity of PTA may be gained by extending the size of the
ligand to take advantage of charge-charge interactions with Arg-258
(Fig. 4 b), although the desolvation penalty must be
carefully minimized for both the ligand and protein. Alternatively, the
Gele surface of Fig. 3 indicates regions
remote from the adenine binding site that may provide sufficient
nucleation sites to "grow" small molecules with optimized favorable
free energies. The calculations on PTA also reveal that the cavitation
energies of tyrosines 80 and 123 confer critical nonspecific
interactions with the pterin ring, albeit less than those observed in
the packing of A15 and G16 for the RNA complex. Improvement in this
particular hydrophobic binding may be difficult without creating large
branched ligands (note ApG lacks conformational mimicry with the
binding of G16). As was predicted for the RTA-RNA complex, negatively charged residues destabilize the binding of PTA.
Finally, a unified model for RTA binding of the rRNA substrate and
small ligands was constructed by fitting Eq. 7. Free energy terms for
Gele and
Gcav for
RTA and mutants Y80F and N209S were taken from Table 1, and values for
ligands PTA, FMP (simulation), and ApG from Table 4. Fitting these six
structures yielded an
m = 6.4 and
= 21.5 cal/mol/Å2, and a scatter plot of
Gcalc versus
Gexpt
using these values is presented in Fig.
5. Although the data set is relatively
small, the quality of the fit yields a correlation coefficient of 0.9 and an estimated average absolute error of ±0.5 kcal/mol. The improved
agreement for the small ligands (Table 4 versus Fig. 5) is the result
of including the RNA complex into the fitted continuum parameters, and
may reflect the fact that the binding is dominated by the larger and
more stable complex. Difficulties can typically arise for modeling
small ligands because of their weaker binding affinities and, moreover,
possible multiple binding modes. The lack of these problems in the
RTA-RNA complex contributes significantly to the success of modeling
the mutations and, in general, the continuum modeling of macromolecular
complexes.
|
It is noteworthy that this linear scaling of a protein and several
variants binding its large substrate, combined with scaling of small
ligands complexed to the same native protein, is unique among continuum
analyses of molecular association. The utility of unifying the
continuum parameters allows putative ligands found either by de novo
design methods or database searching to be scored against the substrate
based on free energies, rather than the naive and futile evaluations of
structural-interaction energies via atomic force-fields. Although this
particular point is well known among the molecular docking community,
the calculations presented here clearly demonstrate the significance of
having an empirical approach, which incorporates a realistic
thermodynamic description of binding and which is suitable for
modifying ligands to improve their affinities. These goals are not
unique and are similar with recent reported applications of the LRA
(e.g., Jones-Hertzog and Jorgensen, 1997
), albeit solutions to Eq. 7,
and in particular, the NLPB equation, provide a mean-field approach. It
remains to be seen if this continuum model yields any
"universality" in the parameters
m and
for
gauging the binding affinities of the other RIPs with the 29-mer RNA
and small substrate analogs.
| |
ACKNOWLEDGMENTS |
|---|
We thank A. F. Monzingo and J. D. Robertus for making the crystallographic structures available. We gratefully acknowledge the Biomedical Supercomputing Center of the Frederick Cancer Research and Development Center for a generous grant of computer time.
The work of L.C. was supported by the Army High Performance Computing Research Center under the auspices of the Department of the Army, Army Research Laboratory cooperative agreement number DAAH04-95-2-0003/contract number DAAH04-95-C-0008.
| |
FOOTNOTES |
|---|
Received for publication 31 July 1998 and in final form 5 October 1998.
Address reprint requests to Dr. Mark A. Olson, Molecular Modeling Laboratory, USAMRIID, Department of Cell Biology and Biochemistry, 1425 Porter Street, Frederick, MD 21702. Tel.: 301-619-4236; Fax: 301-619-2348; E-mail: molson{at}ncifcrf.gov.
| |
REFERENCES |
|---|
|
|
|---|
cI repressor and EcoRI endonuclease.
J. Mol. Biol.
238:264-280[Medline].
-momorcharin.
Structure.
2:7-16[Medline].
Biophys J, January 1999, p. 28-39, Vol. 76, No. 1
© 1999 by the Biophysical Society 0006-3495/99/01/28/12 $2.00
This article has been cited by other articles:
![]() |
M. A. Olson, J. H. Carra, V. Roxas-Duncan, R. W. Wannemacher, L. A. Smith, and C. B. Millard Finding a new vaccine in the ricin protein fold Protein Eng. Des. Sel., April 1, 2004; 17(4): 391 - 397. [Abstract] [Full Text] [PDF] |
||||
![]() |
M. A. Olson and T. L. Armendinger Free-energy contributions to complex formation between botulinum neurotoxin type B and synaptobrevin fragment Protein Eng. Des. Sel., September 1, 2002; 15(9): 739 - 743. |