Molecular Biophysics Laboratory, Institute of Physics, Adam
Mickiewicz University, Umultowska 85, 61-614 Pozna
, Poland
We propose a new, automated method of converting
crystallographic data into a bead model used for the calculations of
hydrodynamic properties of rigid macromolecules. Two types of molecules
are considered: nucleic acids and small proteins. A bead model of short
DNA fragments has been constructed in which each nucleotide is
represented by two identical, partially overlapping spheres: one for
the base and one for the sugar and phosphate group. The optimum radius
= 5.0 Å was chosen on the basis of a comparison of the
calculated translational diffusion coefficients
(DT) and the rotational relaxation times
(
R) with the corresponding experimental data for B-DNA
fragments of 8, 12, and 20 basepairs. This value was assumed for the
calculation DT and
R of
tRNAPhe. Better agreement with the experimental data was
achieved for slightly larger
= 5.7 Å. A similar procedure was
applied to small proteins. Bead models were constructed such that each
amino acid was represented by a single sphere or a pair of identical, partially overlapping spheres, depending on the amino acid's size. Experimental data of DT of small proteins
were used to establish the optimum value of
= 4.5 Å for amino
acids. The lack of experimental data on
R for proteins
restricted the tests to the translational diffusion properties.
 |
INTRODUCTION |
The structure of biopolymers has been a subject
of investigation ever since they were discovered. So far, the most
accurate methods for structure determination have been x-ray
diffraction techniques. These techniques are, however, restricted to
crystallizable molecules, and it has not been established if the
molecular structure in a crystal is identical to that in solution. Many
methods have been developed or adopted to probe biopolymer properties
in solution. The most common are dynamic light scattering (Berne and
Pecora, 1976
), transient electric birefringence (Eden and Elias, 1983
), and nuclear magnetic resonance and ultracentrifugation (Cantor and
Schimmel, 1980
). Hydrodynamic properties measured by these methods do not allow construction of molecular models with atomic resolution (except NMR for small proteins), but still, a lot of effort
has been put into obtaining as much information as possible from the
experimental results. Hydrodynamic properties, for example translational DT and rotational
DR diffusion coefficients, are converted into size and shape parameters of model structures. The
simplest model is a sphere with only one parameter: radius R. The famous Stokes-Einstein relation
DT = kT/(6
R) allows the calculation of a radius
of the model sphere. However, until 1966 exact relations were
restricted to spheres, ellipsoids of revolution (Perrin, 1936
), and
approximate for cylinders (Broersma, 1960a
, b
), which do not reflect
the actual shapes of real macromolecules. Attempts to apply models of
arbitrary shape are based on the so-called bead modeling technique,
where the actual shape is approximated by an ensemble of spheres
(beads) of given radii and positions. The choice of the bead size
depends on the molecule size and the accuracy required. The basis for
the method is the theory of irreversible transport properties of rigid
macromolecules developed by Kirkwood (1954)
. Its present form is a
result of extensive work of several groups of researchers (see Garcia
de la Torre and Bloomfield, 1981
for details). The first applications
to simple structures were aimed at testing the theory. In 1975 Teller
and de Haen carried out calculations of diffusion coefficient for
globular proteins taking into account only the surface atoms of the
molecule (Teller and de Haen, 1975
). In a subsequent paper (Teller et
al., 1979
) they report on the calculations in which the shape of the
protein is modeled by a layer of beads coating the molecule (shell
model). A similar approach was applied later by Pastor and co-workers (Pastor and Karplus, 1988
; Venable and Pastor, 1988
). The high numerical cost of this method strongly restricts its application to
smaller biopolymers. Bloomfield and Garcia de la Torre modeled proteins
in the form of large domains so that the total number of beads was
limited to a few tens. From the point of view of accuracy, the atomic
model seems to be a better choice, but the numerical cost restricts its
applicability to molecules consisting of no more than a few thousand
atoms. However, a model consisting of a few large domains is probably
too crude to follow small differences in
DT and the construction of a bead
model is rather arbitrary. Garcia de la Torre et
al.(1994a)
proposed a good solution for DNA fragments: each bead
corresponds to a single nucleotide. The procedure was fully
reproducible because the beads' centers were placed according to the
helix equation.
We should also mention here works of Porschke (e.g., Antosiewicz
and Porschke, 1989
) devoted to improving different bead models of
biological macromolecules, and the more general approach of Felderhof,
e.g., analytical calculations of multisubunit shell frictional
properties (in McCammon et al., 1975
), and dynamics of interacting
particles (in Felderhof, 1996
, and references therein).
Recently, a few reports appeared devoted to different methods of
constructing bead models of biological macromolecules. Byron (1997)
developed a method based on the local density of atoms. The space
occupied by the molecule was divided into small cubes. In each of them
there is a bead, the size and position of which is determined by the
position of all the atoms in that cube. Such an approach allows an easy
adjustment of the model resolution (cube size) to the molecule size.
Spotorno et al. (1997)
developed a whole set of computer programs
(BEAMS) for generation, visualization, and computation of the
hydrodynamic and conformational properties of bead models of proteins.
The main aim of their method was to construct a bead model from
low-resolution experimental or simulated data as a tool for testing
hypothetical molecule conformations. Hellwig et al. (1997)
constructed
a bead model of a medium-size protein placing beads in the positions of
C
atoms. The calculated
DT values were then compared to
experimental data. An alternative method for the calculation of
hydrodynamic properties was proposed by Zhou (1995a
,b
). He
estimated them through the relations between translational friction and
capacitance and between intrinsic viscosity and polarizability. Those
electrostatic properties were, in turn, calculated by means of the
boundary-element technique.
Encouraged by those results, we tried to develop a fully automatic
algorithm that would convert any crystallographic data into an
appropriate bead model. Since we intended to apply it to macromolecules
of arbitrary shape, the algorithm could not be based on any equation
(as, for example, the helix for DNA), but only on the atomic
coordinates. Nucleic acids seemed to be much simpler than proteins in
terms of structure and homogeneity, so we decided to begin with short
DNA fragments. Having experimental data of both translational and
rotational diffusion coefficients of different DNA fragments, we tried
to adjust our algorithm to provide satisfactory agreement between the
measured and calculated values regardless of the DNA length. A similar
procedure was performed for proteins, although due to much more complex
structure and lack of extensive experimental data, our attempts are
still subject to constant improvement. Also, the lack of
R experimental data in this case restricted
the tests to the translational diffusion coefficients.
 |
THEORY |
Hydrodynamic interactions
The origin of the bead modeling theory was well described in
Garcia de la Torre and Bloomfield, 1981
. Here we give only a brief
description of the method.
Let us consider a rigid assembly of N beads immersed in a
continuous medium. The actual velocity
v'i of the ith bead is changed due to hydrodynamic interactions with other
beads:
|
(1)
|
where vi is the unperturbed
velocity of the ith bead,
Fi is the frictional force acting on
the ith bead, and Tij is the
tensor describing hydrodynamic interactions between the ith and jth beads. A simple multiplication by the ith
bead's frictional coefficient
i gives a
similar equation for the frictional force Fi:
|
(2)
|
where
i is given by Stokes' law
(
i = 6
0
I),
0 being the viscosity of the solvent and
i the radius of the ith bead. For
the sake of convenience, the so-called shielding tensors
Gi are introduced
|
(3)
|
from which the translational friction tensor
may be
calculated:
|
(4)
|
The translational diffusion coefficient
DT of the whole assembly (molecule) is
given by
|
(5)
|
where k is Boltzmann's constant, T is
temperature, and fT is the
translational friction coefficient defined as
|
(6)
|
As the model is analytically insoluble for more then two beads,
the main problem of the theory is to determine the most accurate form
of the hydrodynamic interactions tensor
Tij. For the case of separate beads we
applied the form given by Garcia de la Torre and Bloomfield (1981)
:
|
(7)
|
while for overlapping elements (of the same radius) the formula
developed by Rotne and Prager (1969)
was applied:
|
(8)
|
 |
MATERIALS AND METHODS |
The program HI4, kindly provided by Venable and Pastor, was used
for the calculations of the model's hydrodynamic properties. The
source code was modified to include the Rotne-Prager interaction tensor
for the case of overlapping spheres. All the calculations were
performed on a regular PC using Microsoft Fortran Compiler for Windows.
For some of the structures a large RAM was required (up to 256 MB).
Values of DT and
R (taken as 
) were
calculated for models with different bead sizes. Molecular structures
of appropriate DNA fragments were generated using a computer program (HyperChem, Hypercube Inc., Windows version 4.0).
Construction of a bead model
In order to take full advantage of the atomic coordinates we
decided to group atoms and place the beads at the geometrical centers
of the groups
|
(9)
|
where the vector rj points at the
jth bead's center and nj
is the number of atoms in the jth group of atoms. We decided to include all the atoms (hydrogens) and use small groups (10-30 atoms) to maintain the actual shape of the molecule surface. Since our
preliminary calculations for such models showed that beads have to
overlap to give reasonable results, we restricted our approach to the
case of identical beads.
All our computer programs for constructing the bead models were written
in Turbo Pascal for DOS. The source code and executable versions will
be available upon request from the corresponding author.
 |
RESULTS |
Nucleic acids
Fig. 1 a shows schematic
views of a nucleotide immersed in its model bead. To test the procedure
applicability, we compared the calculated values of
DT and
R with
experimental dynamic light scattering data available for short DNA
fragments (8, 12, and 20 basepairs, Eimer and Pecora, 1991
). The choice
of the experimental method used for comparison with the bead model
calculations was a consequence of the results of Eimer et al. (1990)
.
In their article they compared relaxation times obtained from NMR and
depolarized dynamic light scattering measurements of short
oligonucleotides to the reorientation times of model cylinders
calculated using the formulas derived by Broersma and Garcia de la
Torre. It was found that light scattering data describe the
reorientation of the whole molecule around its shorter axis, while the
NMR cross-relaxation times were superpositions of reorientational times
around both axes and the internal local motions of the molecules. A
schematic view of the DNA 20-mer surrounded by its bead model is shown
in Fig. 1 b. We calculated
DT and
R for
different values of bead radius
in the range 6 Å <
< 8.5 Å. On the basis of the results presented in Fig.
2, a value of
= 7.3 Å was
chosen for further calculations.

View larger version (78K):
[in this window]
[in a new window]
|
FIGURE 1
Graphical representation of the single-bead model of
nucleic acids. (a) A schematic view of the arrangement
of adenine atoms inside a model bead in two projections;
(b) a schematic view of a B-DNA 20-mer in a
ball-and-stick representation immersed in its bead model.
|
|

View larger version (30K):
[in this window]
[in a new window]
|
FIGURE 2
Results of single-bead model calculations of
translational diffusion coefficients and rotational relaxation times
for short DNA fragments; the numbers in the figure denote the number of
basepairs; The rectangles represent the error bars defined by the
experimental errors (height) and projected on the bead radius (width).
Horizontal lines denote the experimental values.
|
|
The same procedure was applied to the tRNAPhe
molecule (for experimental data see Patkowski et al., 1990a
, and for
the structure model see Sussman et al., 1978
). The results of the
calculations shown in Fig. 3 fit the
experimental data for the same value of
= 7.3 Å very well. At
first this result seems to be quite reasonable and we regarded it as a
positive verification of our approach. However, hydration of RNAs and
DNAs can be different, depending on particular RNA local structure.
Looking at Fig. 1 a one can see that there is a lot of empty
space inside the bead, especially out of the base plane. It does not
matter in the case of a double-stranded structure of DNA because this
part is mostly buried inside the model, and due to the overlap of the
beads cancels out in calculations. But in the case of RNA that empty
space may become a source of error because of different packing. To
minimize this effect and to be able to take into account different
sizes of nucleotides containing purines and pyrimidines, we applied a
"double bead" model both for DNAs and tRNA. In this model each
nucleotide was divided into two groups of atoms: one containing the
base and the other one containing the sugar and the phosphate group
(Fig. 4). Due to the method's
constraints, both beads had to have the same size. Again, calculations
were performed for different
values 3 Å <
< 8 Å. The
results, presented in Fig. 5, show the same tendency as for the "single bead" model: shorter DNA fragments seem to be slower in reality than their models for the bead radius corresponding to best-fit value of longer ones. Although the error bars
almost overlap, we think this effect could be relevant as a result of
the "ends effect." All results have been summarized in Table
1.

View larger version (33K):
[in this window]
[in a new window]
|
FIGURE 3
Results of single-bead model calculations of
translational diffusion coefficients and rotational relaxation times
for tRNAPhe. Description of the details as in Fig. 2.
|
|

View larger version (77K):
[in this window]
[in a new window]
|
FIGURE 4
Graphical representation of the double-bead model of
nucleic acids. (a) A schematic view of the arrangement
of adenine atoms inside a model bead in two projections;
(b) a schematic view of a B-DNA 20-mer in ball-and-stick
representation immersed in its bead model.
|
|
View this table:
[in this window]
[in a new window]
|
TABLE 1
Translational diffusion coefficients and rotational
relaxation times of short B-DNA fragments and tRNAPhe
|
|
Small proteins
The procedure applied to model small proteins is similar to the
one applied to nucleic acids. Also in this case we tried both single-
and double-bead models. A single bead model was recently applied by
Eimer's group (Hellwig et al. 1997
) in studies of a large (220 kDa)
MoFe protein from Azotobacter vinelandii (PDB code 2MIN,
Peters et al., 1997
). In their model, centers of beads were placed in
the position of C
atoms. We adopted this
method for our calculations as the single bead model. However, we are
convinced that the nonuniform size and shape of the amino acids (Fig.
6) requires a more detailed representation in the bead modeling procedure. We decided to develop a
"mixed model" in which eight of the smaller amino acids (Gly, Ala,
Val, Ser, Thr, Cys, Pro, Asp) are modeled as single spheres while all
the others are modeled as two partially overlapping spheres. The
criterion, although a little arbitrary, was the value of an average
distance of all the amino acid atoms from their geometrical center
(below 2 Å). Again, the bead center is positioned at the geometrical
center of all the atoms in a group. In the case of eight small amino
acids the group of atoms is equivalent to the residuum and the bead
center is placed in the geometrical center of all its atoms. For larger
amino acids one group is always composed of 10 atoms: the backbone
atoms and the surrounding of the C
atom, while
the second one contains the rest of the atoms. In all cases the center
of the first bead is very close to the C
atom.
The center of the other one is placed at the geometrical center of the
remaining group of atoms. Fig. 6 shows the effects of application of
such a model. In Fig. 6 a we present a small amino acid
modeled with a single sphere and one of the largest amino acids modeled
with two, almost separated spheres. A model of a small protein is shown
in Fig. 6 b.

View larger version (72K):
[in this window]
[in a new window]
|
FIGURE 6
Graphical representation of the bead model of proteins.
(a) A schematic view of glycine and arginine atom
arrangement inside the model beads; (b) a schematic view
of a small protein (BPTI) immersed in its bead model.
|
|
For tests we could use only those proteins for which both
crystallographic structure and experimental values of diffusion coefficients were available. Six small proteins were chosen for the
test calculations: bovine trypsin inhibitor (bovine pancrease), profilin (Acanthamoeba castellanii), insulin (hexamer, pig),
ribonuclease A (bovine pancrease), lysozyme (turkey egg white), and
cellulase (humicola insolens). The crystallographic data
were taken from the Brookhaven Data Bank. Basic information about the
proteins are summarized in Table 2. Due
to the lack of experimental data on rotational relaxation, only the
values of DT were calculated for
different values of
and compared to the experimental data (Fig.
7). A substantial error of the
experimental data is reflected in the uncertainty of the
best value: ±0.3 up to ±0.8 Å (7-18%). The average value of
av = 4.5 Å fits the
experimental data for most of the proteins tested.

View larger version (33K):
[in this window]
[in a new window]
|
FIGURE 7
Results of bead model calculations for small proteins.
The horizontal lines denote experimental values.
Symbols: , BPTI; , insulin hexamer; ,
cellulase; , lysozyme; , profilin; , ribonuclease A. The
rectangles represent experimental error bars.
|
|
It seems that the introduction of the second sphere is important only
for small proteins, for which a detailed structure of the surface may
play a big role, while for the large ones one sphere per amino acid,
centered, e.g., at the position of the C
atom,
is sufficient. We checked both methods on the series of six small
proteins. The numbers presented in the first two columns of Table
3 support that observation. For
smaller proteins (e.g., BPTI, profilin, lysozyme) the difference
between these two methods is of an order of 5%, while for insulin
hexamer (306 amino acids) the difference decreases to much below 1%.
It is clear that for larger structures it is mainly the overall shape
that influences the friction coefficient, while for smaller proteins
every detail of the surface may have a substantial influence on its
hydrodynamic properties.
View this table:
[in this window]
[in a new window]
|
TABLE 3
Comparison of calculation results (translational diffusion
coefficient DT [10 6
cm2/s]) for different bead models
|
|
The choice of the test proteins was strongly affected by three factors:
availability of crystallographic and experimental data and the size of
the molecule (hardware limitations). The latter problem can be
partially solved by applying algorithms calculating the solvent
availability at all the points of the surface. We took advantage of the
ASC (Analytical Surface Calculation) computer program (version 2.12)
written by Eisenhaber (Eisenhaber and Argos, 1993
; Eisenhaber et al.,
1995
). Crystallographic data were converted into our bead model and the
resultant files in the xyzr format were analyzed by the ASC program.
Beads unavailable to the solvent were rejected from the structure and
hydrodynamic calculations were performed on the modified files. The
results, presented in the "ASC dbl." column of Table 3, show that
the diffusion coefficient of the modified structure is either exactly equal (insulin) or higher by a fraction of a percent (smaller proteins)
than DT of the original model. The
fact that such a difference exists, although only beads with zero
surface exposed to the solvent were rejected, probably comes from the
effect of the finite solvent molecule size used in the ASC
calculations. The algorithms calculating hydrodynamic properties of
bead models do not take into account the atomic structure of water.
Even the smallest holes and pores are penetrated by a continuous solvent.
Our test calculations have been also extended to the case of large
proteins. Two large structures were chosen: MoFe protein investigated recently by Hellwig et al. (1997)
with the molecular weight of 220 kd, and ferritin in the apo form (no iron)
with the molecular weight of 474 kd. Ferritin is a multi-domain protein and the crystallographic data are available only for a single domain
(PDB code 1IER, Granier et al., 1997
). We took advantage of a data file
generated at Washington University, in which symmetry relations were
used to generate a whole molecule model from coordinates of atoms of a
single domain. Our hardware did not allow for calculations on the
original double bead models, and ASC algorithms had to be used for both
molecules. In the case of the MoFe protein it was possible to reject
only beads with zero available surface and perform
DT calculations (1300 beads). In the
case of ferritin over 80% of beads had to be rejected, so we couldn't
avoid rejecting beads available to the solvent. In order to estimate
the error of such a procedure we performed a series of calculations for the MoFe protein rejecting beads with different areas of surface accessible to the solvent. For the value used in the case of ferritin (33 Å2) the difference in calculated
DT amounted to 0.54% of that obtained when only beads with no accessible surface were rejected. The results,
presented in the last two rows of Table 3, show a tendency of the model
calculations to overestimate the mobility of large molecules. In the
case of ferritin we were able to perform a comparison with simple
geometrical considerations. It is generally accepted that ferritin has
the form of a spherical protein shell covering the mineral, iron-rich
core. The radius of the cavity is estimated to be 40 Å (Stryer, 1995
).
Taking a molecular weight of 474 kd and the typical partial specific
volume of proteins v = 0.73 cm3/g
(Cantor and Schimmel), we get a value of 58.5 Å for the outer radius, which is >10 Å smaller then the hydrodynamic radius
Rh = 69 Å corresponding to the
measured diffusion coefficient. The very good quality of the sample
(second cumulant below 0.05) and the use of a multi-tau correlator
(ALV5000) practically exclude any possibility of experimental
artifacts, such as the presence of ferritin dimers or larger
aggregates, which might increase the best-fit hydrodynamic radius. To
verify this discrepancy we also measured the distance between the two
most distant molecules in the atomic model of the ferritin molecule.
The result of 135 Å corresponds more to the measured value than to the
calculated one. We think that the source of discrepancy lies in two
facts: 1) the domains have the form of long barrels that cannot form a
smooth spherical surface, which results in the presence of many edges
and holes in the protein shell of ferritin; and 2) the same applies to
the inner cavity for which the diameter of 80 Å is the smallest one
and so the calculations above have wrong assumptions. A uniformly
distributed water shell on the surface, which is incorporated in our
model, seems not to work properly in the conditions where large
portions of water form kinds of puddles that fill all the grooves and
holes in the surface of large multi-domain molecules. The same
argumentation applies to the MoFe protein, which consists of four
subunits bound in such a way that a lot of water can be immobilized in
the space between them. Hydration of smaller molecules probably cannot
be as efficient simply because there is not enough space to form large
grooves or holes in the structure of the molecule.
 |
CONCLUSIONS |
In this paper we have presented a systematic automated procedure
for constructing bead models of biomacromolecules from the atomic
coordinates. Despite the model's simplicity, the results presented
here are quite promising. Comparing the hydrodynamic properties
calculated from these models with experimental data, one can verify or
optimize the ternary structure of biopolymers in solution. We are aware
of the basic problem in our approach; that is, the nonuniform
distribution of water on the surface of macromolecules, which is not
implemented in the model. Improved versions of the method including
calculations of hydration properties of macromolecules are already
considered in our group. More model proteins should be considered in
the testing procedure, preferably with experimental data from one source.
Some numerical problems arise with the growing size of the molecules.
The amount of computer memory required to calculate DT is proportional to
(3N)2, N being the number
of beads. Hence, for large N, the solution of the problem
becomes more expensive and time-consuming. Nevertheless, with the
increasing capabilities of computers today, this restriction will
probably disappear soon. It is possible and quite reasonable in terms
of run time to analyze molecules modeled using 1000 beads even on
medium-size workstations. The size of the largest variable is
3N × 3N × 16 (double precision 16 bytes per number). For N = 1000 the variable size
amounts to 140 MB. In principle, at the time of writing this paper, a
fast PC equipped with sufficient RAM amount (256 MB) can also handle
such a task.
Partial financial support of Volkswagen Stiftung, Federal Republic
of Germany, and the US-Polish Maria Sklodowska-Curie Joint Fund II
Grant MEN/NSF-96-254 is gratefully acknowledged.
Address reprint requests to Dr. Jacek Gapi
ski, A. Mickiewicz
University, Institute of Physics, Umultowska 85, 61-614 Pozna
,
Poland. Tel.: 48-61-8273265; Fax: 48-61-8257758; E-mail:
gapior{at}amu.edu.pl.