Carbohydrate ligands are important mediators of
biomolecular recognition. Microcalorimetry has found the complex-type
N-linked glycan core pentasaccharide
-GlcNAc-(1
2)-
-Man-(1
3)-[
-GlcNAc-(1
2)-
-Man-(1
6)]-Man to bind to the lectin, Concanavalin A, with almost the same affinity as
the trimannoside, Man-
-(1
6)-[Man-
-(1
3)]-Man. Recent
determination of the structure of the pentasaccharide complex found a
glycosidic linkage
torsion angle to be distorted by
50° from the NMR solution value and perturbation of some key
mannose-protein interactions observed in the structures of the mono-
and trimannoside complexes. To unravel the free energy contributions to
binding and to determine the structural basis for this degeneracy, we
present the results of a series of nanosecond molecular dynamics
simulations, coupled to analysis via the recently developed MM-GB/SA
approach (Srinivasan et al., J. Am. Chem. Soc.
1998, 120:9401-9409
). These calculations indicate that the strength of
key mannose-protein interactions at the monosaccharide site is
preserved in both the oligosaccharides. Although distortion of the
pentasaccharide is significant, the principal factor in reduced binding
is incomplete offset of ligand and protein desolvation due to poorly
matched polar interactions. This analysis implies that, although
Concanavalin A tolerates the additional 6 arm GlcNAc present in the
pentasaccharide, it does not serve as a key recognition determinant.
 |
INTRODUCTION |
The interaction of complex sugars with proteins
dictates a broad spectrum of physiological processes, ranging from
leukocyte adhesion and tumor metastasis to host-pathogen recognition
(Springer and Lasky, 1992
; Sharon and Lis, 1998
). To explore the
associated possibilities for therapeutic intervention, it is therefore
imperative to understand the salient factors governing
carbohydrate-protein recognition. Recent advances in organic synthesis,
high resolution spectroscopy, and microtitration calorimetry have
combined to yield valuable insights into the structural and energetic
principles underlying these important receptor-ligand interactions,
with particularly extensive work on the archetypal carbohydrate-binding protein, the legume lectin concanavalin A (Con A). Con A is a plant
storage and defense protein (Peumans and van Damme, 1995
), which can be
isolated from the legume, jack bean Canavalia ensiformis. The natural substrates of the lectin are thought to be high-order N-linked glycans, constituent components of many cell membrane proteins. A number of the components of these complex sugars are illustrated in Fig. 1. Glycans are linked
to the nitrogen of an asparagyl residue and are usually branched, as
illustrated by the biantennary fucosylated saccharide of bovine
immunoglobulin G, labeled sugar V in Fig. 1 (Fujii et al., 1990
).
Complex-type N-linked glycans also share a common pentasaccharide
sequence (sugar IV). Molecular-level details of Con A in the native
saccharide-free state (Weisgerber and Helliwell, 1993
) and bound to a
variety of saccharide ligands have been determined by x-ray
crystallography (Naismith et al., 1994
; Naismith and Field, 1996
;
Moothoo and Naismith, 1998
; Dimick et al., 1999
). The overall topology
of saccharide-free Con A is found to be an
2
homodimer above pH 7 and an
4 tetramer below
pH 7. Each subunit is a 26-kDa monomer of 237 residues, forming a
-sandwich of two
-sheets. The amino acid sidechains of the lectin
subunit are preorganized into a shallow binding site, partly through
interactions with the two proximal divalent cations,
Ca2+ and Mn2+. With the
modularity of one binding site per subunit, the multimeric arrangement
of Con A permits recognition of multiple sugars with enhanced avidity,
a cooperative phenomenon known as multivalency (Weis and Drickamer,
1996
).

View larger version (16K):
[in this window]
[in a new window]
|
FIGURE 1
Substrates recognized by Con A: mannoside (I),
disaccharide (II), trimannoside (III), pentasaccharide (IV) and a
fucosylated biantennary glycan (V).
|
|
The first Con A complex to be crystallographically determined was a
tetramer of Con A with methyl-
-D-mannopyranoside
(mannoside or
-MeOMan) at each of the four binding sites (Derewenda
et al., 1989
; Naismith et al., 1994
). The overall fold was found to be identical to the native structure, with a considerable number of direct
contacts to the protein forged by
-MeOMan. Subsequent determination
of Con A in complex with the disaccharides, Man-
-(1
6)-Man-
-OMe and Man-
-(1
3)-Man-
-OMe (Bouckaert et al., 1999
), found the O-1-linked mannose to occupy the monosaccharide binding site in an
analogous fashion to
-MeOMan. Similarly, the crystal structure of a
trisaccharide, Man-
-(1
6)-[Man-
-(1
3)]-Man (Naismith and Field, 1996
), found the 1
6 terminal mannose to occupy the
monosaccharide site. The central reducing sugar in the core and the
1
3 terminal mannose occupied an adjacent extended region of the
binding groove, also making direct polar and apolar contacts with the
protein, albeit fewer in number. The O2 of the
reducing sugar was observed to participate in a hydrogen bond with a
structural water. Interestingly, an alternative binding mode for the
trimannoside to one of the four Con A subunits was observed from a
separate crystallographic study (Loris et al., 1996
). In this
structure, the
-(1
3)-linked mannose is rotated with respect to
the other saccharide units, but interacts with the same amino acid
residues as before. This duality in binding mode has also been observed
for the Man-
-(1
2)-Man-
-OMe/Con A crystal structure (Moothoo et
al., 1999
) and clearly highlight the subtle balance of forces governing
lectin-carbohydrate interactions.
The interpretation of this structural information has been
greatly enhanced by thermodynamic parameters of association from microcalorimetry (Table 1). Although
there is variation in measured entropies and enthalpies of binding, the
free energies converge to a value of ~
5.2 kcal/mol for
-MeOMan,
corresponding to a modest affinity constant of 0.82 × 105
M
1. As expected from the
greater number of interactions with the lectin, the trimannosyl ligand
binds with a higher affinity, in fact almost 50-fold greater than the
mannoside, consistent with the notion of the three mannose residues as
an essential recognition determinant in the glycan core. Thermodynamic
measurements have additionally been obtained for the
-GlcNAc-(1
2)-
-Man-(1
3)-[
-GlcNAc-(1
2)-
-Man-(1
6)]-Man pentasaccharide, a carbohydrate fragment common to complex-type N-linked glycan core regions. Given the increased magnitude in binding
free energy of 2 kcal/mol of the trimannoside over the mannoside, it is
perhaps of some surprise that the pentasaccharide binds with only a
slightly higher affinity of 0.9 kcal/mol relative to the trimannoside
(Table 1). In an effort to understand the structural basis of the
lectin specificity, the structure of the Con A/pentasaccharide complex
was recently reported at a resolution of 2.7 Å (Moothoo and Naismith,
1998
). It was found that the 1
6 mannose occupying the monosaccharide
binding site was slightly displaced in orientation relative to the
mannoside and trimannoside conformations, with a loss in the number of
van der Waals but not polar contacts (Fig.
2). This was thought to arise from an unfavorable orientation of the GlcNAc six arm required to alleviate steric hindrance by lectin residues Thr226,
Gly227 and Arg228, where
destabilization lay principally in a distortion of the
dihedral by ~50° from the observed solution minimum, as determined by NMR studies (Homans, 1995
). Ancillary evidence is provided from
calorimetric studies of the disaccharide
-GlcNAc-(1
2)-
-Man (sugar II of Table 1), which only liberates 5.2 kcal/mol upon binding
to Con A. This is essentially the same affinity as for the mannoside,
thereby implying that binding of the GlcNAc unit is energetically
neutral.
Although experimentally inaccessible, determination of the free energy
components associated with these structural features is feasible
through theoretical analysis. Because carbohydrates are dynamic in
nature and interact intimately with aqueous solvent, we chose to use a
recently proposed computational model that performs free energy
analysis on molecular dynamics (MD) trajectories of a receptor-ligand
complex and the ligand free in solution. The molecular
mechanics
Poisson-Boltzmann/surface area (MM-PB/SA) approach
(Srinivasan et al., 1998
) has previously proved useful in the study of
sequence-dependent solvation of DNA helices (Cheatham et al., 1998
),
RNA-protein interactions (Reyes and Kollman, 2000
), protein folding
(Lee et al., 2000
), protein-protein interfaces (Massova and Kollman,
1999
) and protein-ligand interactions (Kuhn and Kollman, 2000a
,b
).
Similar performance has been afforded by the more computationally
expedient generalized Born/surface area (MM-GB/SA) variant of the
MM-PB/SA approach (Srinivasan et al., 1998
). Therefore, we chose to use
the MM-GB/SA model in seeking to capture the important effects of
conformational plasticity and aqueous solvent on carbohydrate-protein
binding thermodynamics and structure. Consequently, the objective of
this study was to explore sugar-protein interactions in the lectin
complexes of sugars I, III, and IV, analyzing nanosecond trajectories
of the bound and free ligands by the MM-GB/SA method to quantify the possible intrinsic energetic penalty of the distorted glycosidic linkage; the concomitant effect of this orientation on the interactions of the sugar moieties with the lectin; and to consider other possible contributions to the observed free energy of binding, notably the
influence of solvent. Comparison with spectroscopic and calorimetric studies also permits assessment of the quality of the MM-GB/SA potential used for dynamics and free energy calculations in its application to carbohydrate-protein recognition.
 |
MATERIALS AND METHODS |
Molecular dynamics simulations
Dynamical studies were performed for saccharide ligands I, III,
and IV, both bound to the lectin receptor Con A and in aqueous solution.
Sugar-protein complexes
Initial geometries for the sugar-protein complexes were
constructed from the crystallographically determined atomic
coordinates: for I, PDB entry code 5cna, resolution 2.0 Å (Naismith et
al., 1994
); for III, PDB entry code 1cvn, resolution 2.3 Å (Naismith and Field, 1996
); and for IV, PDB entry code 1tei, resolution 2.7 Å (Moothoo and Naismith, 1998
). The complexes crystallize as
4 tetramers. However, the four binding sites
exhibit a near identical shape and binding affinity (Dimick et al.,
1999
), with an absence of cooperativity allowing the simulation studies
to focus on the receptor-ligand interaction of a single subunit
(subunit A). Crystallographic waters within 3 Å of the protein were
included in the model, leading to 92 waters for I, 62 for III, and 15 for IV, a reflection of the resolution of the respective crystal
structures. An analysis of water structure in the crystals of 11 legume
lectins by Loris et al. (1994)
has highlighted seven conserved
hydration sites, with two waters bound to each metal ion
(Ca2+ and Mn2+), a water
associated with a
-bulge, another with an
-hairpin and
importantly, a structural water at the binding pocket, which serves as
a bridge from Asn14, Asp16
and Arg228 of Con A to the
O2 oxygen of the reducing mannose. In the
pentasaccharide/Con A structure, however, the second water coordinated
to Ca2+ was absent in subunits A, C, and D,
although present in B. Consequently, an appropriate water was inserted
at the coordination site; without this water, energy minimization led
to gross distortion of the crystallographic geometry. Hydrogens were
added to the protein using the edit module of AMBER (Case et
al., 1999
), and to the carbohydrate ligands using MAVIS (Parkinson et
al., 1998
). A number of hydroxyl rotamers required adjustment to
reproduce the crystallographically observed polar contacts in
subsequent calculations. Before minimization and molecular dynamics,
the three complexes were immersed in further solvent cubes, previously
equilibrated, leading to an average total of 7623 water molecules.
Three sodium counterions were also added at minima in the protein
electrostatic potential to ensure an electroneutral unit cell for
calculation of the periodic Coulomb potential.
All minimization and dynamics were conducted using AMBER 6.0 (Case et
al., 1999
), combined with the Cornell et al. (1995)
force field for the
protein, the TIP3P water potential (Jorgensen, 1983
) and the GLYCAM_93
force field for carbohydrates (Woods et al., 1995
). The carbohydrate
van der Waals parameters were updated for use with AMBER 6.0, by
analogy to Woods et al. (1995)
using for GLYCAM_93 atom types OG and
AC/EC, the OS and CT values from the Cornell et al. (1995)
force field.
The GLYCAM_93 atomic point charges were retained as they had been
derived from fitting to the HF/6-31G* electrostatic potential (ESP).
The efficacy of the GLYCAM potential in describing sugar conformation
and energetics has been evidenced by a number of studies (Woods et al.,
1998
; Bradbrook et al., 1998
, 2000
). Parameters for the metal cations were taken from the crystallographic and simulation study of Con A
complexes by Bradbrook et al. (1998)
, which were derived to reproduce
the coordination geometry about the two ions found in the crystal
structures of Con A, taking values of R* and
of (1.79 Å, 0.014 kcal/mol) for Ca2+ and (1.69 Å, 0.014 kcal/mol)
for Mn2+. Periodic boundary conditions were used
in conjunction with a particle-mesh Ewald summation (Essman et al.,
1995
) to treat long-range electrostatics. To our knowledge, the present
study is the first study to provide an accurate treatment of long-range
electrostatics in the simulation of protein-carbohydrate systems. Van
der Waals interactions were truncated at 9 Å. The SHAKE algorithm (van
Gunsteren et al., 1977
) was used to constrain all covalent bond lengths involving hydrogen. A time step of 2 fs was used. The equilibration protocol involved rounds of energy refinement and simulated annealing, with progressively decreasing harmonic restraints on the
ligand-protein geometries. This was followed by a 200-ps simulation in
the isothermal-isobaric ensemble, in the absence of constraints, at 300 K and 1 atm using a heat bath coupling of 1.0 ps (Berendsen et al.,
1984
) and a pressure coupling of 1.0 ps. Subsequently, the equilibrated
trajectories of the system were allowed to evolve within the canonical
ensemble for 1.0 ns. Every picosecond, structures were archived for analysis.
Sugars in aqueous solution
Corresponding aqueous solution simulations of ligands I, III,
and IV were conducted, using the potential energy function and parameters outlined above. In detailing the construction of
representative solution phase conformations of the flexible sugar
solutes, geometries of the carbohydrate ligands are subsequently
discussed using the crystallographic definition of glycosidic torsions:
is O5-C1-O1-Ci,
is
C1-O1-Ci-C(i-1), and
is
O6-C6-C5-O5. Ci and C(i-1) are aglyconic atoms.
For comparison with NMR results, we also use the International Union of
Pure and Applied Chemistry definition for
, which
is
H as O6-C6-C5-H5. For the methyl
mannoside (I), an initial GG rotamer was considered for the primary
hydroxymethyl group, conventionally defined by the O5-C5-C6-O6 and
C4-C5-C6-O6 dihedral angles. Solution structures for the methyl
trimannoside (III) have been previously determined by NMR studies
(Brisson and Carver, 1983
; Cummings et al., 1986
; Carver and Cummings, 1987
) The
-(1
6) linkage has been shown by NMR and semi-empirical studies to exhibit considerable flexibility (Homans et al., 1987
): two
conformers were found to be populated at room temperature, varying in
the value of the
-(1
6) interresidue dihedral angle
H: conformers with
H of
60° (g) and
180° (t) were observed experimentally, with approximately
equal populations of both (Brisson and Carver, 1983
). Terminal
hydroxymethyl orientations were assigned the GG conformation, as found
in both NMR and crystallographic studies (Imberty et al., 1991
).
Because interconversion between the two conformers was expected to be
slow (Brisson and Carver, 1983
), both conformers were modeled, denoted
III/g (
H =
60°) and
III/t (
H = 180°).
Similarly, an NMR geometry of the pentasaccharide (IV) was made
available (Homans, 1995
), with the CH2OH groups in the GG conformation and a trans orientation of
. Alternative conformers of IV were not considered. As
before, hydrogens were added to the protein using the edit
module of AMBER (Case et al., 1999
), and to the carbohydrate ligands
using MAVIS (Parkinson et al., 1998
), selecting appropriate
hydrogen-bonding hydroxyl orientations. Addition of pre-equilibrated
solvent to the sugars led to an average total number of water molecules
of 506. Energy refinement was followed by simulated annealing and 200 ps of equilibration via NPT dynamics. One nanosecond of data
was acquired at 300 K in the canonical ensemble, with similar
simulation details to the protein-sugar complex studies. Every
picosecond, structures were archived for analysis.
Molecular mechanics and generalized Born/surface area
calculations
Energetic analysis of equidistant points in phase space along
the dynamical trajectories was performed using the MM-GB/SA method of
Srinivasan et al. (1998)
. This hybrid approach calculates a gas-phase
contribution to binding using an all-atom force field and incorporates
the influence of solvent via the Generalized Born continuum solvent
method (Still et al., 1990
; Jayaram et al., 1998
). For computational
expediency, an energy difference between bound and free protein and
ligand is calculated at points along the bound and aqueous trajectories
(in this case, 100 snapshots due to sampling at 10-ps intervals). The
implicit assumption here is that the ensemble generated by the
molecular mechanics potential used in the MD simulations overlaps
substantially with the distribution of configurations that would be
generated by a hybrid MM-GB/SA potential. According to this method, we
may calculate binding affinity of ligand L to receptor R in aqueous
solution,
G
, as (Massova and
Kollman, 1999
)
|
(1)
|
where
H
and
S
is the enthalpy and entropy of
R-L binding respectively, without solvent contributions, and
T is temperature.



is the ensemble-averaged
difference in solvation free energy between R and L, bound and free,
calculated by the GB/SA continuum model. The enthalpy of binding,
H
, can be approximated by the
ensemble-averaged differences, denoted by a bar in Eq. 2, in
bond-angle-dihedral, electrostatic and van der Waals potential energies
upon binding,
|
(2)
|
Calculation of the solute entropy of binding
(
S
in Eq. 1) is problematic due
to integration over the large number of receptor and ligand degrees of
freedom. Approximate approaches have used normal mode calculations on a small number of structures. However, it has been noted that these harmonic calculations, using gas-phase or solvent reaction field minimized structures, may lead to inaccurate estimates of solute entropy (Srinivasan et al., 1998
). An alternative approach is to
directly extract the solute entropy from a complete molecular dynamics
trajectory (Karplus and Kushik, 1981
). Here, the quasi-harmonic configurational entropy is related to the covariance matrix for fluctuations in degrees of freedom,
|
(3)
|
where n is the number of degrees of freedom,
kB is the Boltzmann constant and
is the determinant of the covariance matrix evaluated
over a given trajectory. Due to the large size of the protein, we
estimate the quasi-harmonic entropy of binding of the carbohydrate to
the protein using only the covariance of the ligand torsional degrees
of freedom, i.e., set
S
=
S
and n to be the
number of rotatable bonds in the ligand. The carbohydrate torsions are
expected to exhibit the most significant restriction on binding of all
the protein and ligand geometric degrees of freedom. For comparison, we
also present
S
, calculated
within the harmonic approximation using representative trajectory
snapshots. After Kuhn and Kollman (2000b)
, we calculate the entropy of
binding for the ligand from solution to a subregion of protein within 8 Å of the bound ligand, using the nmode module of AMBER. To
include the effect of bulk solvent, minimizations were performed using a distance-dependent dielectric of 4r. We note that the solvent entropy
contribution to binding is implicitly included within the theoretical
framework of the GB/SA model.
Ensemble-averaged absolute solvation free energies
(

) were used to calculate
the relative binding energy via Eq. 1. At each timestep,
G
was determined using the pairwise
additive generalized Born (GB) model (Still et al., 1990
; Hawkins et
al., 1996
), parameterized to be consistent with AMBER (Jayaram et al.,
1998
).
G
is composed of Coulombic
(
G
) and nonpolar (
G
) components, which may be further resolved into contributions for each atom i of an
N-atom solute:
|
(4)
|
Here, Ai is the
solvent-accessible surface area, and
is a surface tension
parameter, which we take to be 0.0072 kcal/(mol Å). Due to the sum
over individual solute atoms, it is straightforward to consider
contributions to
G
from particular
chemical groups, in this instance, from specific saccharide or amino
acid residues. Nonpolar contributions are calculated using the
MSMS program to compute solvent-accessible surface areas (Sanner et al., 1996
). PARSE atomic radii (Sitkoff et al., 1994
) and
Cornell/GLYCAM 93 charges were used to calculate electrostatic solvation free energies with a solvent dielectric
of 80. Because electrostatic solvation parameters were unavailable for
Ca2+ and Mn2+ ions, we
adopted the van der Waals radii from Bradbrook et al. (1998)
. An
effective Born radius is then obtained by scaling after the method of
Hawkins et al. (1996)
. Following Hawkins et al. (1996)
and Jayaram et
al. (1998)
, we set the ions' screening parameters to unity. We then
calibrated the scale factors against the experimental ion solvation
free energies (Marcus, 1991
). In combination with a small nonpolar
contribution, scale factors of 1.028 (Mn2+) and
1.145 (Ca2+) were found to best reproduce the
metal ion experimental solvation free energies.
 |
RESULTS |
Before energetic postprocessing by the MM-GB/SA approach, we
report structural and dynamic aspects of the molecular dynamics simulations of the carbohydrate, bound and free in solution, in relation to available NMR spectroscopic and x-ray data. To facilitate discussion, the saccharide residues that comprise I, III, and IV, are
labeled according to the scheme in Fig.
3. The central saccharide unit of
pentasaccharide
-GlcNAc-(1
2)-
-Man-(1
3)-[
- GlcNAc-(1
2)-
-Man-(1
6)]-Man
is the reducing mannose and thus labeled red Man. This residue is
linked at O3 to C1 of a second mannose, which we label the 1
3 Man.
In turn, connected to the O2 of this 1
3 Man is
the terminal GlcNAc arm, which we denote as the 3 arm GlcNAc, referring
to which branch from the reducing mannose the
-(1
2)-linked GlcNAc
is connected. The central reducing mannose is also linked at O6 to C1
of an adjacent mannose, labeled 1
6 Man. A terminal GlcNAc residue is
-(1
2)-linked to O2 of this 1
6 Man, and
is called the 6 arm GlcNAc. The 1
6 Man residue is common to all
three ligands (mono-, tri-, and pentasaccharide). III and IV share the
central trimannoside core. Correspondingly, we partition the
carbohydrate binding site on Con A into five regions: the 3 arm binding
site, the 1
3 Man site, the red Man site, the 1
6 Man
(monosaccharide) site, and the 6 arm site. These sites overlap, sharing
several protein residues.

View larger version (22K):
[in this window]
[in a new window]
|
FIGURE 3
Partitioning of five saccharide units of IV for
analysis: 6 arm, a GlcNAc unit, -(1 2)-linked to 1 6 Man,
-(1 6)-linked to red Man; 3 arm, a GlcNAc unit, -(1 2)-linked
to 1 3 Man, -(1 3)-linked to red Man.
|
|
Carbohydrate-solution trajectories
We first consider the conformation of the unbound carbohydrate
ligands in aqueous solution, discussed in terms of chemical group.
Glycosidic linkages
Ensemble-averaged glycosidic angles are compared with experiment
for III and IV (Table 2). For all
linkages, there is good agreement between simulation and NMR average
values. No interconversion is observed between the g and
t forms of the
-(1
6) linkage between 1
6 Man and red
Man in III. This is as expected, given that interconversion occurs
slowly in comparison to the tumbling time of the ligand of
150 ± 20 ps measured by NMR (Brisson and Carver, 1983
).
Interestingly, increased flexibility is found in the
-(1
3)
angle of III/g relative to the
trans conformer, possibly as a consequence of its more
linear shape, with fewer internal hydrogen bonds. This
angle displays considerable flexibility in III and IV, which samples
trans and gauche orientations, reflected in the
large standard deviations (Table 2). Motional averaging around the
-(1
3) linkage has been noted from NMR and simulation studies of a
number of carbohydrates (Homans, 1995
). In general, greater flexibility
is observed in the glycosidic linkages of IV with respect to III and I. Multiple rotamer sampling is reflected in the torsional standard
deviations (Table 2), particularly for the
-(1
2) linkages of the
GlcNAc arms. This increased flexibility at the molecular extremities is
reminiscent of the fraying of oligonucleotides observed in aqueous
solution. Of particular note is the behavior of the
angle of the 6 arm/1
6 Man linkage, which populates the bound rotamer
(
=
132°) to a limited degree (Fig. 4). The distribution of
(
,
) angles for the GlcNAc arms indicates a broad spread in
relative to
(Fig. 4).
This is expected from stereoelectronic control exerted by the
exo-anomeric effect on
of the
-(1
2) linkage absent
on
(Lemieux, 1971
; Juaristi and Cuevas, 1994
).

View larger version (19K):
[in this window]
[in a new window]
|
FIGURE 4
Glycosidic torsion angle sampling for 1 3 Man/arm
(top) and 1 6 Man/arm (bottom) linkages
of IV during molecular dynamics trajectories for IV/Con A complex
(right) and IV in aqueous solution
(left). Spectroscopically determined geometries are also
indicated ( ).
|
|
Hydroxymethyl groups
NMR studies indicate approximately equal populations of GT and GG
(Nishida et al., 1984
; Hori et al., 1990
). In these simulations, the
carbohydrate exocyclic hydroxymethyl groups remain stable in their
initial GT or GG conformation over the timescale of the simulations.
This stability has been observed elsewhere (Simmerling et al., 1998
)
and is a reflection of the sizeable free energy barrier to rotation in
solution, given by potential of mean force calculations to be 4-6
kcal/mol for glucose in solution.
Hydroxyl groups
Primary and secondary carbohydrate hydroxyls are fully engaged in
hydrogen bonds, both to the solvent and to other regions of the solute.
Conformational transitions are frequent, highlighting the dynamic
nature of the water hydrogen bond network encompassing the sugar.
N-acetyl groups
The C-C-N-C dihedrals that define the rotation of the terminal
N-acetyl groups attached to the terminal glucose moieties display restricted rotation in solution, with infrequent transitions between low energy conformers. Inspection of the trajectories reveals frequent
hydrogen bonds between the N-acetyl group of the terminal GlcNAc arms
and the adjacent mannose residues, mediated by two, one, or no bridging
water molecules (for example, direct hydrogen bonds between the 6 arm
amide group and the 6-hydroxyl group of the neighboring mannose).
Direct or indirect intersaccharide hydrogen bonding explains the
limited conformational freedom of the N-acetyl group. Water-bridged
hydrogen bonds are also observed between other adjacent residues in
sugar IV: for example, O4 of the 1
3 Man with
O2 of the reducing Man.
Therefore, we conclude that the three sugars adopt conformations
consistent with NMR-derived parameters, and can, in fact, sample their
bound conformations in aqueous solution, albeit infrequently. The
backbone
torsions are fairly rigid, whereas
can show larger torsional librations. Evidence of
frequent water-mediated interresidue interactions is observed,
particularly involving the N-acetyl arms of IV.
Carbohydrate-protein trajectories
Structural stability is demonstrated by the low-protein backbone
root mean square deviation (RMSD) of ~1 Å over the 3-ns trajectories of Con A in complex with I, III, and IV (Fig.
5). A slight drift observed in the RMSD
is not uncommon for large protein/solvent systems (Tara et al., 1999
).
The seven structural waters identified by Loris et al. (1994)
remain
integral to the protein throughout the three trajectories. To compare
with solution trajectories, we again discuss the dynamics of the
carbohydrate ligands in terms of chemical groups.

View larger version (28K):
[in this window]
[in a new window]
|
FIGURE 5
Backbone RMS deviations (Å) from crystal structure of
Con A during simulation of complex with ligands I
(dotted), III (dashed), and IV
(solid).
|
|
Glycosidic linkages
As with carbohydrate-solution simulations, good agreement is
observed between experimental and MD-derived torsion angles (Table 2).
Bound ligands III and IV do not explore the alternative
crystallographic conformation at the
-(1
3) linkage found by Loris
et al. (1996)
. All torsions experience rigidification upon binding,
although the effect is small for angle
linking 3 arm and
1
3 Man in IV. This motion is illustrated by the
(
,
) plot in Fig. 4, where we can see that
both GlcNAc arms explore a subset of conformations that are found in
solution, but that the 3 arm explores a larger proportion of these,
perhaps due to the higher degree of solvent exposure of the 3 arm in
the lectin complex. B-factors also provide information on the motion of
the protein and carbohydrate (Bradbrook et al., 1998
). Therefore, we
can observe the enhanced flexibility of the 3 arm relative to the rest
of ligand IV by high average x-ray (32.0 Å2) and
MD-calculated (31.5 Å2) B-factors (atoms 48-61
of Fig. 6). Nevertheless, the 6 arm
average x-ray B-factor (21.4 Å2) is not
negligible (Moothoo and Naismith, 1998
). The lowest mobility and
tightest torsional fluctuations are exhibited by the 1
6 Man (Table
2, Fig. 6), with an average x-ray B-factor of 13.0 Å2. Although the trends in atomic
crystallographic B-factors are well reproduced by the simulation, we
note that the B-factors calculated from the trajectories are a lower
limit, due in part to neglect of global rotational and translational
motion in the MD, and averaging that may occur over different
crystallographic geometries.

View larger version (21K):
[in this window]
[in a new window]
|
FIGURE 6
Comparison of calculated (dotted line)
and crystallographic (solid line) B-factors
(Å2) for III/Con A (bold) and IV/Con A
(standard). Saccharide residues are labeled
(a) 6 arm, (b) 1 6 Man,
(c) red Man, (d) 1 3 Man, and
(e) 3 arm.
|
|
Hydroxymethyl groups
In complex with Con A, the exocyclic CH2OH
groups of I, III, and IV exhibit oscillations about a stable GG
conformation, comparable in magnitude to those found in solution. The
exception is the CH2OH group of the 6 arm, which
samples all three rotamers, with 43.7% GT, 55.7% GG, and a trace
amount of TG. This contrasts with a single GG conformer observed in solution.
N-acetyl groups
In both arms of IV, the N-acetyl groups are conformationally
restricted. For the 3 arm, the constraint is by van der Waals contacts
with Tyr12, Pro13 and
His205, and for the 6 arm GlcNAc, it is via a
hydrogen bond to O
of Ser168.
Hydroxyl groups
In the majority of hydroxyls, rotamer sampling is considerably
restricted upon binding. The frequency of sampling for the C6-O6
torsion of the 6 arm reduces by an order of magnitude. For each of the
1
6 Man hydroxyls, nearly unhindered rotation in solution reduces to
a single rotamer, reflecting strong, specific polar carbohydrate-protein hydrogen bonds at the monosaccharide (1
6 Man)
binding site.
Carbohydrate-protein interactions
Accordingly, short average interatomic distances for the key
protein and carbohydrate groups at the monosaccharide binding site, for
I, III, and IV, are observed in the MD simulations (Table 3). The specificity of the interactions
is further highlighted by the high hydrogen bond occupancy over the
trajectory (defined as a heavy atom interatomic distance of <3.5 Å).
The notable exception to this is the small fraction (0.2-0.7) of
occupied Arg228 N···O4 hydrogen bonds. The
non-ideal geometry of this polar interaction has been noted elsewhere
(Naismith et al., 1994
; Naismith and Field, 1996
; Moothoo and Naismith,
1998
). Inspection of the trajectory indicates that, although the
Arg228 backbone N···O3 interaction is well
preserved in I, III, and IV (Table 3), the long, flexible side-chain of
Arg228 explores a range of conformations and,
consequently, a range of interactions with the carbohydrate. We may
classify these interaction geometries into three modes: a single
hydrogen bond from Arg228 NE to O3 (Fig.
7); a bifurcated interaction with the
terminal guanidinyl nitrogen atoms of Arg228 to
O3 (Fig. 7); and a mode without a side-chain hydrogen bond to O3 (Fig.
8). The distribution of these
interactions can be mapped using as a coordinate, the
Arg228 HE···O3 interatomic distance, with
characteristic values 2.0, 3.6, and 4.9 Å, respectively, for the three
modes. Each complex explores two of the three possible modes (Fig.
9). All three modes are consistent with
an Arg228 salt-bridge interaction with
Asp16, as observed in the crystal structure,
although any of the three guanidinyl Ns may act as donor. Thus, we can
see an approximately inverse correlation in population of
Arg228 N···O4 and
Arg228 NE···O3 interactions (Table 3).
View this table:
[in this window]
[in a new window]
|
TABLE 3
Average carbohydrate-protein interatomic distances (Å)
from X-ray crystallography (dxrd) and molecular
dynamics (dMD), with fractional
(focc) occupancy in parentheses
|
|

View larger version (32K):
[in this window]
[in a new window]
|
FIGURE 7
Interactions of ligand IV with Arg228:
(top) single hydrogen bond mode; (bottom)
bifurcated hydrogen bond mode.
|
|

View larger version (36K):
[in this window]
[in a new window]
|
FIGURE 9
Variation of Arg228 HE···O3 distance
(Å) indicating sampling of Arg228 modes over molecular
dynamics trajectories of I (bottom), III
(middle), and IV (top) bound to Con A.
|
|
Fewer contacts are observed for the remaining residues (Table 3),
indicating the less specific nature with which the residues are
recognized. Occupancy of polar contacts is generally high, although low
for the nonideal Asp16
OD2···O2 geometry of red Man and low for
the two protein backbone hydrogen bonds formed with the 6 arm in IV/Con
A (Table 3). In the 6 arm GlcNAc, the MD average interatomic distances
are all greater than in the crystal. We may attribute these
observations to the exploration of different interaction modes, due in
part to Arg228. With respect to
Thr226, and, indirectly,
Gly224, this is due to the multiple rotamer
sampling of the 6 arm exocyclic hydroxymethyl group, reported above
(Fig. 7). We also note that there is sampling of a hydrogen bond
between Thr15 N···O4 of the 1
3 Man, that
is observed in the Loris et al. (1996)
geometry of III (Table 3). The
values of this distance in the III/Con A and IV/Con A x-ray structures
are 3.7 (Naismith and Field, 1996
) and 3.6 Å (Moothoo and
Naismith, 1998
), respectively. During the MD, however, periodic
N···O4 distances of ~3 Å are found.
Finally, we may superimpose the 237-amino acid polypeptide backbones of
the ensemble-averaged structures of I, III, and IV in complex with Con
A from the three trajectories (Fig.
10). There is a clear consensus in the
primary importance of the conserved structure of the 1
6 Man and the
monosaccharide binding site thereof. However, there is some distortion
with respect to the mannoside. Superimposing the protein backbone atoms
of the MD-average structures of I/Con A with III/Con A gives an RMSD of
0.42 Å for the nonhydrogen sugar atoms of the 1
6 Man residue. For
overlay of I/Con A with IV/Con A, we find an RMSD of 0.81 Å. The
higher RMSD for the pentasaccharide is centered around distortion at
the O3 and
-(1
2)-linked O2 atoms.
Perturbation at the O2 site in IV/Con A has been
observed previously in the static x-ray structures (Moothoo and
Naismith, 1998
).

View larger version (11K):
[in this window]
[in a new window]
|
FIGURE 10
Overlay of average MD structures of I, III, and IV in
complex with Con A by superposition of all protein backbone atoms.
|
|
The carbohydrate-protein trajectories demonstrate a stable protein
conformation, good agreement with crystallographic B-factors and
geometric parameters for the ligand, and near complete occupancy of the
dominant carbohydrate-protein interactions observed from static x-ray
structures. The dynamic ensemble also contains a fraction of
alternative interactions arising from ligand and protein contacts not
observed from the static crystal structure, including water-mediated
interactions. Based upon these structurally validated dynamics
simulations of the carbohydrates, bound and in solution, we may analyze
the trajectories using the MM-GB/SA approach to obtain a thermodynamic
interpretation of binding affinity.
Binding free energy
We now turn to consider energetic analysis of binding by
applying Eq. 1 using the solution and protein-bound carbohydrate trajectories. Free energies of binding,
G
, are reported in Table
4. The trends exhibited by the
calorimetrically determined binding free energy (Table 1) are
reproduced by the calculated
G
:
the oligosaccharide ligands IV and III bind to Con A with equal
affinity, and both about 50% more tightly than the mannoside I. The
binding free energy of III/t and III/g are almost
identical, implying that the absolute free energies of the ligands in
solution are the same. This observation is in agreement with NMR
studies that indicate the two conformers to be essentially equally
populated. In subsequent discussion of the energetics of III, we will
use a Boltzmann average based upon the relative binding free energy of
the two ligands in solution (trans is favored by 0.6 kcal/mol). This averaging has a small bearing on the calculated binding
free energies for III, and we do not consider multiple conformational
states of IV in solution. Subsequent incorporation of the entropic
contribution to
G
, via
calculation of normal modes within the harmonic approximation, leads to
a decrease in binding by 4.2 kcal/mol of IV over III. To consider
absolute binding free energies, we may use this estimate of the entropy
with
G
, and incorporate the
free energy change due to loss of global translational and rotational
degrees of freedom of ligand and receptor on binding, of approximately
7-11 kcal/mol (Ajay and Murcko, 1995
). From this, we observe that the
absolute binding energies are overestimated for IV and III, and
underestimated for I, reflecting the difficulty in calculating free
energies of complex systems using a summation of components approach,
and highlights, in particular, limitations in accurately estimating the
solute entropy of binding. One might consider the ligand torsional
degrees of freedom to be the most salient entropic discriminator for
binding. Using only the quasi-harmonic ligand torsional entropy, we
predict a decrease in binding of IV relative to III of 4.5 kcal/mol
(Table 4), perhaps fortuitously close to the estimate from harmonic
calculations. We now proceed to use the MM-GB/SA model as a tool to
identify and quantify the physical contributions to binding of ligands
I, III, and IV, in particular seeking to explain the observed and
calculated low affinity of IV for Con A.
Ligand and receptor contributions to binding
From Table 4, we observe increasing magnitude in enthalpies of
binding,
H
, with increasing ligand size and number of protein-ligand contacts, from
77.8 kcal/mol for I to
166.5 kcal/mol for IV. The enthalpy of binding also
incorporates the cost of distortion on binding, and, as a result, is
not a direct measure of the receptor-ligand interaction. We can
separate
H
into a
receptor-ligand interaction enthalpy
(
H
) and ligand (
H
) and receptor
(
H
) distortion components:
|
(5)
|
Here, we obtain
H
by
restricting the application of the MM-GB/SA method to the ligand in its
bound and solution trajectories (Table
5). We neglect distortion of the protein
(setting
H
to zero), which
would require a further separate MD trajectory of the native lectin in
aqueous solution. The ligand distortion enthalpies increase with ligand
size.
H
is 15.7 kcal/mol for IV,
which is 4.1 kcal/mol larger than the Boltzmann-averaged value for III.
This enthalpy difference agrees well with the results of a study using
the MM2CARB force field to map the glycosidic linkage of a
disaccharide,
-GlcNAc-
-(1
2)-Man. The map suggests that the
energetic penalty of altering the
angle from the
solution to bound value is of the order of 3 kcal/mol (Imberty et al., 1991
).
We can subsequently calculate a receptor-ligand interaction enthalpy,
H
, from Eq. 5. For the three
ligands (Table 6), this quantity follows
the same trends as
H
(Table 4).
Through utilization of the extended binding sites by the two additional
mannose sugar residues, III achieves a 43.6 kcal/mol increase in
binding over the mannoside I. Additional contacts formed by IV lead to
a 51.9 kcal/mol gain in enthalpy over III. We can evaluate the
contribution of each sugar residue to the overall binding enthalpy
(Table 6). From this decomposition, we observe the 1
6 Man saccharide
units in I, III, and IV to interact most strongly with Con A, with an enthalpy of between
80.4 and
86.8 kcal/mol (Table 6). This agrees
well with an enthalpy of
83.8 kcal/mol calculated from previous
simulation of the methyl mannose/Con A complex by Bradbrook et al.
(1998)
. The 1
6 Man saccharide unit of I binds with the highest
affinity of the three ligands, in agreement with the suggestion (Moothoo and Naismith, 1998
) that the methyl mannoside/Con A complex represents the optimal geometry for binding at the monosaccharide site.
However, the associated standard errors for the 1
6 Man/Con A
interaction enthalpies are large (Table 6), and, statistically, we
cannot distinguish between the energies. It has also been postulated that the reducing mannose has the least interaction energy with Con A,
and our calculations are in line with this, with an interaction enthalpy of between
13.3 and
20.0 kcal/mol for the three ligands. In III/Con A, the 1
3 Man is thus the major contributor to the additional affinity arising from the lectin extended binding site. Analysis also reveals that the source of the increased
H
of IV/Con A over III/Con A is
principally protein contacts with the 6 arm GlcNAc (
43.5 kcal/mol),
and a weaker interaction with the 3 arm (
14.2 kcal/mol). These
favorable interactions are accompanied by a slight weakening by 6.7 kcal/mol and by 2.9 kcal/mol at the red Man and 1
3 Man sites
respectively (Table 6).
The saccharide unit-protein interaction enthalpy can be separated
further into individual amino acid contributions to identify key amino
acid residues for sugar binding. Performing this decomposition for the
1
6 Man unit (Fig. 11) clearly
highlights the crucial role of Asp208 in
tethering the sugar, with a significant, mainly electrostatic contribution to
H
, of between
36.5 and
45.5 kcal/mol for the three complexes.
Asp208 forms two short, strong hydrogen bonds to
the saccharide O4 and O6 atoms (Table 3). Significant contributions are
also made by hydrogen bonds with Gly98,
Leu99 and Tyr100.
Interestingly, despite a favorable van der Waals interaction with the
1
6 Man, Ala207 makes a net unfavorable
contribution. The greatest variation is seen for
Arg228, arising from the previously noted
conformational sampling. The low average interaction in III/Con A
arises from sampling principally of the "no hydrogen bond" mode
with respect to O3 of 1
6 Man (Fig. 8), relative to other modes found
for I/Con A and IV/Con A. In III/Con A, the
2.1 kcal/mol interaction
with 1
6 Man comes from the Arg228 backbone
N···O3 interaction, which is conserved (Table 3). A similar
analysis can be applied to the
43.5 kcal/mol interaction enthalpy of
the 6 arm in the pentasaccharide IV with Con A. Here, the major
contributors are residues Thr226 (
12.3
kcal/mol) and Ser168 (
9.4 kcal/mol), which make
hydrogen bond contacts (Table 3). A more modest contribution is made by
the less sampled Gly224 contacts (
4.6
kcal/mol). We also note a large net contribution to
H
of
22.2 kcal/mol by various charged aspartyl residues clustered around the binding site
(Asp16, Asp203,
Asp208, and Asp235),
providing a favorable electrostatic potential for ligand binding.

View larger version (16K):
[in this window]
[in a new window]
|
FIGURE 11
Per amino acid energy decomposition (kcal/mol) for
interaction of Con A with 1 6 Man of I (top), III
(middle) and IV (bottom); van der Waals
contribution (shaded) and total interaction energy
(unshaded).
|
|
To complete the consideration of receptor and ligand
contributions to binding, in the absence of a solvent field, we here consider the entropies of binding. The global ligand entropies of
binding are reported in Table 4. Entropy contributions calculated from
harmonic analysis
(
T
S
) are of similar
magnitude to binding enthalpies, leading to small binding free
energies. For III and IV, we note that these entropy calculations provide a lower limit to the entropy decrease on binding, due to
limited sampling about the
-(1
3) linkage in solution. However, ligand I appears to lose the greatest free energy on binding due to
entropy (30.4 kcal/mol), compared to 20.7 kcal/mol for IV and then 16.5 kcal/mol for III. The origin of this trend is unclear. It has been
found for the avidin system that the entropy contribution from averaged
normal mode calculations can vary by 5 kcal/mol in unfavorable cases
(Kuhn and Kollman, 2000a
). The quasi-harmonic entropy calculations,
although smaller in magnitude because we consider only the dihedral
contribution to the entropy, show a clear progression in magnitude of
entropy loss with increasing ligand size. This entropy can be factored
into saccharide residue contributions, partitioning the covariance
matrix into residue blocks and applying Eq. 3 to the determinant of
each block. The assumption is neglect of inter-residue torsional
correlation, which may be estimated from the difference between the
residue total and the total using the complete ligand covariance
matrix, and is ~6 cal/(K mol) for binding of I, III, and IV (Table
7). Given this approximation, we observe
that, as for
H
(Table 6), the
largest residue contribution to
S
is from the 1
6 Man; this quantity is, on average,
12.8 cal/(K mol). Thus, the most tightly bound fragment has the largest entropy decrease on binding. In general, however, the residue ordering for is
not quite the same as for
S
. With
regard to binding of IV,
S
for the reducing mannose is slightly greater in magnitude than 1
6 Man.
As expected, the 3 arm GlcNAc experiences very little change in entropy
(
3.3 cal/(K mol)) on binding, as it remains effectively free
in solution. Interestingly, the 6 arm yields a small increase of 1.0 cal/(K mol) in entropy on binding. Examination of the covariance matrix
shows the largest contributor to be the C6-O6 torsion, which, in the
bound state, explores a variety of rotamers, but in solution, due
possibly to a restrictive solvent network noted earlier, samples only
the GG conformer. It is unclear how appropriate, however, the
application of a multivariate Gaussian distribution assumed by Eq. 3 is
in this instance, where multiple minima are sampled. Perhaps a bimodal
or higher moment distribution function is more suitable (Fisher, 1993