help button home button Biophys. J.
HOME HELP FEEDBACK SUBSCRIPTIONS ARCHIVE SEARCH TABLE OF CONTENTS

This Article
Right arrow Abstract Freely available
Right arrow Full Text (PDF)
Right arrow Alert me when this article is cited
Right arrow Alert me if a correction is posted
Services
Right arrow Similar articles in this journal
Right arrow Similar articles in PubMed
Right arrow Alert me to new issues of the journal
Right arrow Download to citation manager
Right arrow reprints & permissions
Citing Articles
Right arrow Citing Articles via HighWire
Right arrow Citing Articles via Google Scholar
Google Scholar
Right arrow Articles by Sen, S.
Right arrow Articles by Nilsson, L.
Right arrow Search for Related Content
PubMed
Right arrow PubMed Citation
Right arrow Articles by Sen, S.
Right arrow Articles by Nilsson, L.

Biophys J, October 1999, p. 1782-1800, Vol. 77, No. 4

Structure, Interaction, Dynamics and Solvent Effects on the DNA-EcoRI complex in Aqueous Solution from Molecular Dynamics Simulation

Srikanta Sen and Lennart Nilsson

Center for Structural Biochemistry, Karolinska Institute, Department of Biosciences, Huddinge, Sweden

    ABSTRACT
TOP
ABSTRACT
INTRODUCTION
METHODS
RESULTS AND DISCUSSION
SUMMARY AND CONCLUSIONS
REFERENCES

A 0.7-ns molecular dynamics simulation of the DNA-EcoRI complex in a 7.0-Å solvent shell indicated a stable behavior of the system. No significant evaporation or smearing of the solvent's outer boundary occurred. The structure and the intermolecular interactions were found to be well maintained during the simulation. The interaction pattern in the simulation was found to be very similar to that in the crystal structure. Most of the specific interactions between the DNA and the protein were found to be enhanced in the simulation compared to that in the crystal structure as a result of improved interaction geometry. The nonspecific interactions were found to be stronger than the specific ones. The specific interactions between the N7 atoms of Gua4 or Ade5 or Ade6 and the protein were found to be present over almost the entire time of the simulation, whereas hydrogen bonds involving the amino groups of the Ade5 and Ade6 with the protein were found to be relatively weaker, with lower probability and shorter lifetime. The time evolution of the root mean square deviations of the DNA and the protein were highly correlated even at the later part of the simulation, showing the tight binding between them. Several long-lived water bridges were found between the DNA backbone atoms and the protein and also between the two protein monomers, which increased the overall stability of the complex. The two protein monomers were found to interact strongly with each other. The energy of the DNA kink deformation was estimated as approximately 31 kcal/mol.

    INTRODUCTION
TOP
ABSTRACT
INTRODUCTION
METHODS
RESULTS AND DISCUSSION
SUMMARY AND CONCLUSIONS
REFERENCES

One of the central issues of modern molecular biology and biophysics is to understand the interactions stabilizing biomolecular complexes in aqueous solution, particularly how a small sequence change can lead to a significant difference in affinity, either directly or through structural rearrangements. Knowledge of these aspects is essential for designing simple artificial biomolecules having desired properties, which can be used for controlling various cellular processes. From the biological point of view, DNA-protein interactions provide the most important class of biomolecular complexes for studying such issues. There has been considerable experimental work in this field (Draper, 1993; Rosenberg, 1991; Kim et al., 1990; Luisi et al., 1991; Newman et al., 1994; Lesser et al., 1990, 1993; Robinson and Sligar, 1993, 1994; Billeter et al., 1993; Qian et al., 1993; Schwabe et al., 1993; Venclovas et al., 1994; Hirsch and Aggarwal, 1995; Eriksson et al., 1995; Szczelkun et al., 1997; Jeltsch et al., 1994; Misra et al., 1994). Although characterization of the interactions between the DNA and the protein in a DNA-protein complex is possible through x-ray crystallography, nuclear magnetic resonance (NMR), or some indirect biochemical experiments, the detailed characterization of the interaction strength and the dynamic features at the atomic level in such cases has not yet proved to be very straightforwardly obtained from experiments. Molecular modeling and molecular dynamics (MD) simulation techniques, through which individual interactions and dynamics at the atomic level can be probed explicitly, provide another way to look into such details of molecular structure and interactions (McCammon and Harvey, 1987; Brooks et al., 1989; van Gunsteren, 1993). Moreover, as most biomolecular interactions take place in an aqueous environment, it is also very important to consider the effects of aqueous solvent, including, among others, competition for hydrogen bonding, screening of electrostatic interactions, and favoring structural rearrangements with minimal exposure of hydrophobic groups to aqueous solvent. There are also cases where water molecules have been seen to supply crucial and specific contacts between protein and DNA (Billeter et al., 1996; Otwinoski et al., 1988; Qian et al., 1993, Karplus and Fearman, 1994; Eriksson et al., 1995; Eriksson and Nilsson, 1995; Kosztin et al., 1997). Again, it is not always possible to characterize the solvent molecules in terms of localization and dynamics in x-ray or NMR experiments. On the other hand, the individual solvent molecules can be followed rather easily in MD simulation studies and can be characterized in full detail (Eriksson et al., 1995; Eriksson and Nilsson, 1995; Kosztin et al., 1997; Alexander et al., 1998). Thus, the results from MD simulations can complement the corresponding experimental studies in improving our insights about structure, dynamics, interactions, and solvent effects involving biomolecular complexes at the atomic level. The aim of the present study is to use molecular modeling and MD simulation techniques for investigating the details of the crucial interactions (both direct and water-mediated) that are responsible for the stability and high specificity in the DNA-EcoRI complex and its dynamics in aqueous solution. Several works (Eriksson et al., 1995; Eriksson and Nilsson, 1995; Bishop et al., 1996, 1997; Kosztin et al., 1997) in this direction involving other DNA-protein complexes have been published in recent years. In our present work we have selected DNA-EcoRI complex as a model system for such studies because a large body of experimental biochemical and biophysical data is readily available (Draper, 1993; Rosenberg, 1991; Kim et al., 1990; Newman et al., 1994; Lesser et al., 1990, 1993; Robinson and Sligar, 1993, 1994; Venclovas et al., 1994; Misra et al., 1994; Szczelkun et al., 1997; Jeltsch et al., 1994) for this system, which can be of great use in evaluating the reliability of the results obtained by the present modeling and simulation studies.

EcoRI is a well known type II restriction endonuclease that binds to DNA, specifically to the base sequence containing the stretch d(Gdown-arrow AATTC)2, and cuts the DNA strands at the position marked by the arrow by hydrolyzing the sugar-phosphate backbone in the presence of Mg+2 ion as a cofactor. Endonucleases are unique in that they can cleave efficiently small (4-6 bp long) recognition sites on foreign DNA but avoid potential cleavage of the cellular DNA at sites that differ by as little as a single basepair. Fig. 1 represents the crystallographic structure of the DNA-EcoRI complex. EcoRI is a dimeric protein. Each monomer contains 277 amino acid residues and binds to the above base sequence on each DNA strand to form the complex. Binding of the protein monomers introduces bends in the structures of both the DNA strands, making the DNA kinked (Rosenberg, 1991; Kim et al., 1990). Protein-DNA interactions are often classified into specific and nonspecific contacts (Billeter, 1996). The interactions between the protein and the DNA bases are known as the specific interactions, and those formed between the protein and the DNA backbone are termed nonspecific interactions because the sugar-phosphate backbone parts are the same for all DNA nucleotides. Clearly, the specific contacts play the major role in the selection of a certain DNA base sequence, whereas the nonspecific ones serve mostly to increase the overall stability of the complex. Structural studies on the DNA-EcoRI complex by crystallography also indicate that the sequence specificity is determined in part by the direct contacts between the protein and the DNA bases. It is found that both the strands are recognized by hydrogen bonds with each purine base and contacts to the pyrimidines and the tight complementarity of the surfaces of the DNA and the protein at the recognition site also provide extensive interactions between the protein and the DNA backbone which further increases the overall stability of the complex (Rosenberg, 1991; Kim et al., 1990; Newman et al., 1994; Lesser et al., 1990, 1993; Venclovas et al., 1994). It is generally believed that specific contacts play the major role in the selection of a certain DNA base sequence and the nonspecific ones serve mostly to increase the overall stability of the complex. There is, however, also the possibility of indirect recognition through the local structural properties of DNA. Such indirect mechanisms may play an important role in protein-DNA complexes like the EcoR1-DNA complex, where protein binding induces large deformations in the DNA.



View larger version (105K):
[in this window]
[in a new window]
 
FIGURE 1   Crystal structure of the EcoRI-DNA complex showing the relative positions of the DNA strands and the protein monomers (colored green and red).

In the present work, we have performed MD simulation studies of the EcoRI-DNA complex in aqueous solution, starting from the 2.5 Å resolution x-ray crystallographic structure of the complex (Rosenberg, 1991; Kim et al., 1990). However, as the complex itself is a huge one (~9000 atoms), use of a solvent box with periodic boundary condition or a solvent sphere with stochastic deformable boundary (SDB) condition to solvate the system makes the size of the overall system quite large, and a MD simulation of reasonable length becomes computationally highly expensive. We therefore have solvated the whole complex by a water shell 7 Å thick around it. The advantage of using a solvent shell is that it solvates the whole complex and at the same time keeps the overall system size reasonably moderate. As no artificial constraints are imposed on the solvent system, the complex is also free to bend for proper relaxation. Such a solvent shell has been used before (Kandadai and Reddy, 1996; McCammon and Harvey, 1987), but its properties have not yet been fully characterized. In the present work we have analyzed some properties of this setup to check whether it can provide a realistic solvent environment and support correct dynamics of the macromolecular complex. We have also compared the time evolutions of a smaller system (EcoRI complex with only the first protein monomer) between MD simulations in a solvent shell setup and in a more accurate SDB setup and found that the behaviors are quite similar, justifying the use of the solvent shell setup for performing MD simulations of large systems, at least for the time scale of several hundreds of picoseconds.

From the 700-ps dynamics trajectory thus obtained, we have compared the details of the direct specific and nonspecific interactions between the DNA and the protein in the simulation structure in aqueous solution with those in the crystal structure. Most of the important specific and nonspecific interactions in the crystal structure are present in the simulated structure in solution and remain well maintained during the dynamics. It is seen that in general, the specific interactions are enhanced in the simulation compared to the crystal structure as a result of either improved H-bonding geometry or the development of a few new additional interactions. We have also analyzed the water bridges in the complex to investigate their role on the specificity and the overall stability of the complex. The time evolution of different properties, such as the interactions between different molecular components, have been investigated, and the observed results have been discussed in terms of solvent interactions. Finally, we have analyzed the energetics of the kink introduced in both the strands of the DNA duplex in the complex. For this analysis, we have allowed the kinked DNA isolated from the complex to relax by MD simulation in aqueous solvent and have checked if the kink is still there in the relaxed average structure of the DNA. We have compared the self energy of the relaxed free DNA in solution with that of the kinked DNA in the complex to have an estimate of the kink energy. Free energy perturbation calculations and ordinary MD simulations for several mutants of the complex have been discussed in detail in the accompanying article (Sen and Nilsson, 1999).

    METHODS
TOP
ABSTRACT
INTRODUCTION
METHODS
RESULTS AND DISCUSSION
SUMMARY AND CONCLUSIONS
REFERENCES

System setup and solvation of the complex

We have used the x-ray crystallographic coordinates of the EcoRI-DNA complex at 2.5 Å resolution (Rosenberg, 1991; Kim et al., 1990) as the starting model structure with an all-atoms representation. The complex system includes the whole of the dimeric protein (except the first 16 amino acid residues in each monomer as obtained from the x-ray crystallographic coordinates) and the associated dodecamer DNA duplex of base sequence d(CGCGAATTCGCG)2 containing the recognition sequence d(GAATTC)2 as the central part of the DNA. In the complex are five histidine residues for each protein monomer, which we have chosen to be neutral, with the NE2 atom protonated, because the experimental pH was about 7 (Rosenberg, 1991; Lesser et al., 1993). First, we have minimized the energy of the crystallographic structure of the complex in vacuum for 500 steepest descent steps to remove any residual bad van der Waal contacts or other strains in the structure and then solvated the complex. Because the complex itself is very large (~ 9000 atoms and dimensions are x = ±36.9 Å, y = ±37.7 Å, and z = ±30.0 Å) we have solvated the molecular complex by a 7.0-Å aqueous solvent shell prepared by immersing the complex in a large sphere of TIP3P water (Jorgensen et al., 1983) and deleting any water molecule which had its oxygen atom at a distance either less than 2.8 Å or more than 7 Å from a non-hydrogen atom of the solute. This keeps the overall size of the solvated system moderate (18,000 atoms). On the other hand, solvating the complex in a water sphere with a water layer of at least 7 Å should have required a radius of about 38 + 7 = 45 Å, and a periodic box would have been even larger. The total charge of the complex is -26 units, so we have included 26 Na+ counterions to make the system electrically neutral. The counterions were placed by replacing 26 water molecules with highest electrostatic energies of the oxygen atoms in the solvent shell and care was taken such that no two counterions were placed closer than 5 Å. Further energy minimization of the whole solvated system was done by 500 steepest descent steps, and this solvated system was used to perform subsequent MD simulation as discussed in the protocol section. For the SDB setup we have immersed the DNA-protein complex with only the first monomer of the protein at the center of a sphere of radius 36 Å filled with TIP3P water molecules and all the water molecules which were closer than 2.8 Å to any atom of the complex were removed. The water molecules in the sphere were experiencing the action of a deformable boundary force arising from mean field interactions of water molecules beyond the 36-Å boundary (Brooks and Karplus, 1983). To make the system electrically neutral 24 sodium counterions were placed as described before and energy minimization was performed as described above.

Dynamics simulation protocol

The dynamics simulation in the case of solvent shell was performed by integrating Newton's equations of motion by the leap-frog algorithm with an integration time step of 2 fs, using the program package CHARMM version 25 along with the combined protein and nucleotide topology and parameters 22 (Brooks et al., 1983; MacKerell et al., 1995, 1998). We used 13 Å as the cutoff value for generating the nonbonded pair list and updated the list every 20 steps. For the nonbonded interactions, force shift option was used, causing the interaction energies and the forces to vanish smoothly at a distance of 12.0 Å. All bond lengths involving hydrogens were kept fixed using the SHAKE algorithm (Ryckaert et al., 1977). During simulation, we first heated the system to 300K over 2 ps and subsequently equilibrated at 300K over 2 ps by assigning velocities from a Gaussian distribution of velocities at 300K. The simulation was then continued and the temperature was checked every 200 steps. If the average temperature was outside the window of ±10K, the velocities of the atoms were rescaled accordingly to maintain the temperature around 300K. During the simulation, the base pairing at either end of the DNA duplex was constrained by harmonic constraint on the hydrogen bonds to minimize possible end effects. The simulation was continued for a total of 700 ps and the trajectories were saved every 100 steps for analysis. The trajectories over the last 300 ps were used to calculate most of the various average structural, interactional, and dynamic properties of the system as described in Analysis (below). The average structure of the complex was calculated over the last 300 ps of the trajectory, taking all the frames into consideration, and the potential energies of the average coordinates of the complex were then energy-minimized by 100 steepest descent steps in vacuum. In the SDB setup the atoms in the region between the radii 34 Å and 36 Å (the buffer region) interacted with a stochastic heat bath at 300K and dissipative forces (Brooks and Karplus, 1983) and followed Langevin dynamics algorithm while, for the atoms inside the 34-Å radius, Newton's equation of motion for ordinary MD simulation was applied. Other aspects were the same as in solvent shell simulation. SDB dynamics simulation was performed for 200 ps.

Procedure for estimating the energetics of the DNA kinks

For investigating the energetics of the kinked DNA, we have isolated the kinked DNA from the crystal structure of the complex and minimized its energy by 300 steepest descent steps and then placed 22 Na+ ions as counterions, each at a distance of 3.5 Å from the phosphorus atom on the bisector of the angle O1P-P-O2P of the respective phosphate group. This DNA with the counterions was then solvated by a 7-Å water shell and the overlapping water molecules were removed. MD simulation of this solvated kinked DNA was then performed, following the protocol described above, in order to allow it to relax into its unstrained conformation in the absence of the protein counterpart in the complex. We monitored the time evolution of the RMSD of the DNA to confirm its relaxation. When the average RMSD was found to be stable over a substantial period of simulation, we assumed that the DNA had reached an average relaxed structure. Then we calculated the average self energy < Erelax> and its fluctuations for the free DNA from the time series obtained from the simulation trajectory over the last 300 ps. Similarly, we have calculated the average self energy < Ekinked> and its fluctuations of the DNA in the complex over the last 300 ps of the simulation, over which the structure is most stable, and have compared it with < Erelax> . Using these two average self energies, we finally calculated Delta E = < Ekinked>  - < Erelax> , which gives a reasonable estimate of the kink deformation energy in the DNA.

Analysis

Structural deviations of the protein-DNA complex from the initial energy-minimized x-ray crystallographic structure were assessed on the basis of the RMSD. In computing the RMSD the overall translational and rotational motions have been removed by superimposing each configuration of the complex from the trajectory at intervals of 100 steps onto the initial structure.

The mean square positional fluctuations < Delta r2> of the atoms with reference to the average structure in solvent, averaged over the trajectories after rigid body alignment against the coordinates of the starting structure, were calculated for comparing the relative dynamic fluctuations of the different residues of the protein and DNA in the complex. The average was taken over all atoms except the hydrogens in a given amino acid residue or nucleotide.

The conformations of the DNA part of the complex in the crystal and the average structure in solution were characterized by the helicoidal parameters calculated by using the software package Curves 5.1 (Lavery and Sklenar, 1988; Ravishankar et al., 1989).

The correlation coefficients between pairs of dynamical quantities were computed to investigate the correlated motions present in the dynamics of the complex. The linear cross-correlation coefficient between two dynamical quantities x and y is given by the expression
<UP>C<SUB>xy</SUB></UP>=⟨&Dgr;x<SUB><UP>i</UP></SUB> · &Dgr;y<SUB><UP>i</UP></SUB>⟩/[⟨&Dgr;x<SUB><UP>i</UP></SUB><SUP>2</SUP>⟩ · ⟨&Dgr;y<SUB><UP>i</UP></SUB><SUP>2</SUP>⟩]<SUP>1/2</SUP>,
where < ·····> denotes an average over the trajectory and Delta xi = xi - < x> , Delta yi = yi - < y> .

The diffusion coefficients for the water molecules and the sodium counterions in the systems were estimated from the mean square displacement, by using the Einstein relation (Allen and Tildesley, 1987)
<UP>lim</UP><SUB>(<UP>t</UP>→∞)</SUB>∂⟨[<UP>R</UP>(t)<UP>−R</UP>(0)]<SUP>2</SUP>⟩/∂t=6<UP>D</UP>
where < ····> denotes the average over all the water molecules except the evaporated ones, R(t) is the position vector of a particular water molecule at time t, and D is the diffusion coefficient.

Solvent-accessible surface area for the whole complex and its different molecular components (DNA, protein) were calculated using the method of Lee and Richards (1971) with a test particle diameter of 1.6 Å.

Direct and water-mediated H-bonding interactions between the monomers and between the DNA and the protein were analyzed using the following criterion. Two atoms are considered to form a H-bond (A---H-D) if the acceptor-hydrogen distance is <2.4 Å and the acceptor-hydrogen donor angle is >135°. A water bridge is defined as an interaction between residues that makes hydrogen bonds to a common water molecule. The number of average hydrogen bonds or water bridges made by an atom is calculated by dividing the sum of the number of H-bonds or water bridges made by the atom in each frame by the total number of frames considered for analysis. Similarly, the average lifetime of a H-bond or a bridge is calculated by summing the duration of all H-bond or bridging events and dividing by the number of such events for any given donor-acceptor pair. In these calculations we used the coordinate frames from the trajectory at an interval of 1 ps and, in the calculations of averages, considered only events with lifetimes >5 ps.

    RESULTS AND DISCUSSION
TOP
ABSTRACT
INTRODUCTION
METHODS
RESULTS AND DISCUSSION
SUMMARY AND CONCLUSIONS
REFERENCES

Properties of the solvent shell and comparison between shell and SDB simulations

First we have checked whether a 7-Å solvent shell behaves properly during the period of simulation to mimic the solvation of the complex in a realistic fashion. Because in such a setup, in the absence of the periodic boundary condition or the boundary force as used in a SDB setup, there is the possibility that the outer solvent surface may diffuse out or the water molecules may evaporate, we have looked into these aspects as follows. We have monitored the number of evaporated water molecules against the simulation time. When the least distance of the oxygen atom of a water molecule from any of the atoms in the solute complex becomes very large (>50 Å), then we consider that particular water molecule to be an evaporated one. Fig. 2 a indicates that even in this setup, not many water molecules are really evaporated, although the number increases in a fairly linear fashion with the increase in the length of the simulation period. Considering the actual number (~3000) of water molecules present in the system, the number (39) of evaporated water molecules at the end of the simulation over 700 ps is insignificant.



View larger version (16K):
[in this window]
[in a new window]
 
FIGURE 2   (a) Time evolution of the number of evaporated water molecules from the solvent shell during MD simulation. (b) Time development of the number of water molecules which are diffused out of the 7-Å-thick initial solvent boundary due to free dynamics. (c) The distribution of diffused water molecules outside the 7-Å-thick layer after 700 ps dynamic simulation of the system. The dotted line indicates the number of evaporated water molecules.

Fig. 2 b shows the time evolution of the number of water molecules residing outside the 7-Å boundary limit. It is found that during the first few tens of picoseconds the number increased very rapidly and then reached an apparent steady state when the number did not change significantly over the simulated time period. The fluctuating nature of these data points is due to the fact that under the action of the mutual interactions among the nearby water molecules and its dynamics, some water molecules go out crossing the 7-Å boundary at the same time some others cross into the boundary; a transient imbalance between these two processes causes the fluctuation. The steady average value of the number of water molecules outside the 7-Å boundary is negligible compared to the total number of water molecules in the system, indicating insignificant smearing of the solvent boundary over the simulation time period.

We have also plotted the number distribution of the water molecules beyond the 7-Å limit of the solvent shell against the distance from the solute surface in the frame of the 700th ps in Fig. 2 c. The result indicates that there is indeed a smearing out of the boundary, but the spreading is limited within only a few angstroms from the initial 7 Å boundary. The dashed line represents the number of evaporated water molecules in that frame.

For further characterization of the dynamic properties of the solvent in the shell, we have calculated the diffusion coefficients of water (Dwat) and the ions (Dion) using the mean squared displacement obtained from the dynamic trajectories and the standard Einstein relation. Estimates were made over a 500-ps time scale for three different time origins and their average was calculated. The estimated values Dwat = (2.04 ± 0.02) × 10 -9 m2/s and Dion = (0.39 ± 0.05) × 10-9 m2/s from the present simulation data are in agreement with earlier studies with TIP3P water model (Eriksson and Nilsson, 1995) and also for the sodium ions (Eriksson and Nilsson, 1995; Koneshan et al., 1998). In this estimation we have excluded the evaporated water molecules. The corresponding experimental value is Dwat = 2.4 × 10-9 m2/s. However, recent simulation with pure TIP3P water yields the diffusion coefficient in the range Dwat = (5.2-5.5) × 10-9 m2/s (van der Spoel et al., 1998), which is higher than our estimated value. Compared to the self-diffusion coefficient of pure TIP3P water (without any solute inside), a smaller value of the estimate is expected in the present case because the DNA-protein complex has a large surface in contact with water and a large number of water molecules interact with it, contributing less to the calculated mean squared displacement value. It may also be noted that as the thickness of the solvent shell is only 7 Å, the motion of the water molecules over longer time scale may be a little bit restricted (the component of motion normal to the solute surface) and hence, calculation of diffusion coefficients using large values of t (i.e., 500 ps) seems to give an incorrect estimate. On the other hand, for shorter time scales the validity of the diffusion equation breaks down. Thus, as previously suggested (Lindahl and Edholm, 1998), we calculated the diffusion coefficient for water over an intermediate time interval (100 ps) and obtained a value of (3.29 ± 0.49) × 10-9 m2/s. Use of a much shorter time interval (10 ps) yields a higher value (4.5 × 10-9 m2/s) for Dwat approaching the value obtained for TIP3P in an unrestrained system (van der Spoel et al., 1998). The variation of the values estimated over different time lengths is due to the restricted degrees of freedom in the shell. We have not calculated the radial distribution function of water molecules in the solvent shell for comparison because it appears that the water shell is not thick enough to produce a meaningful radial distribution function in the present case.

Fig. 3 compares the time evolution of the RMSD of the complex where only one monomer (the first one) is present and dynamic simulations are performed in two completely independent setups, the SDB setup and the 7-Å solvent shell setup. We removed the second monomer from the crystal structure and used this modified system as the test system because, if there is any significant difference between the performance of the two different setups, the difference could appear more pronounced for this non-native system than in the case of the original complex with both the monomers. The comparison shows that there is no significant difference in the behavior of the time evolution of RMSD. All these results together indicate that this 7-Å solvent shell setup is reasonably good for studying the solvent simulation of such large systems over time periods of considerable length.



View larger version (15K):
[in this window]
[in a new window]
 
FIGURE 3   Time development of the RMSD of the complex with a single protein monomer in solvent shell setup (solid line) and SDB setup (dashed line).

Time evolution of the RMSD of different molecular components and the RMSD and fluctuation patterns of the different residues during dynamics

In Fig. 4 a, the solid line represents the time evolution of the RMSD of the non-H atoms of the whole protein-DNA complex over the entire period of simulation. It is clearly seen that the average values of the RMSD of the whole complex as well as the molecular components have become fairly stable over a substantial period of the simulation, indicating that a stable structure of the complex in solution has been reached. However, considering the fact that we have used a 2.5-Å resolution crystal structure for the starting coordinates, the average RMSD (2.57 ± 0.11 Å over the last 400 ps of the trajectory) of the complex compared to the starting structure does not indicate a very significant conformational deviation of the complex in solution. Comparison between the time evolution of the RMSD of the DNA (dotted line) and the protein (dashed line) indicates that more structural rearrangements occurred for the protein compared to the DNA duplex. The average RMSD of the DNA duplex is 1.89 ± 0.15 Å, whereas for the protein it is 2.63 ± 0.12 Å. The relatively larger fluctuations in the RMSD values for the DNA as seen in Fig. 4 a seem to be due to the ends of the DNA duplex (presence of more free ends exposed outside the complex) and also to the smaller number of atoms in it. The average RMSD of the DNA without the end bases is obtained as 1.66 ± 0.13 Å. It is also noticeable that since the protein is about 12 times larger compared to the DNA duplex involved, the time evolution of the whole complex is governed mainly by the behavior of the protein part.



View larger version (20K):
[in this window]
[in a new window]
 
FIGURE 4   Time development of the RMSD of (a) the (i) whole complex (solid line), (ii) the protein (dashed line, almost superimposed on the solid line) and (iii) the DNA duplex (dotted line), (b) of the base atoms (line 1), backbone atoms (line 2) of the DNA, and the backbone atoms (line 3) and the side chains (line 4) of the protein.

Calculation of the linear cross-correlation coefficient between the two time series of the RMSD of the DNA and the protein yields a value of 0.9 for the linear correlation coefficient over the first 100 ps of the dynamics trajectory, indicating that they are highly correlated. On the other hand, the correlation coefficient is found to be only 0.4 over the last 100 ps of the trajectory, implying a loss of correlation. During the first 100 ps there is a common drift in the RMSD time series of the DNA and the protein that may be the cause of the high correlation over this period. Moreover, analysis of the solvent accessible surface area (discussed below) indicates that in the crystal structure of the complex, a large fraction of the DNA surface is covered by the protein and thus, during the initial dynamics, structural change in most parts of the DNA can occur only when it is coupled with a structural change in the protein in the same locality, which allows the DNA to change its local conformation. This may also be partially responsible for the initial high correlation between the two time series. As dynamics goes on, the complex relaxes in solvent and gradually becomes slightly loose, allowing the DNA more freedom to move independently of the protein; thus, the motion between them becomes less correlated. However, the correlation coefficient over the last 100 ps of the trajectory still shows the existence of a significant degree of correlation, indicating tight binding between the DNA and the protein even in the relaxed complex. Fig. 4 b indicates that more rearrangements occurred for the backbone atoms of the DNA compared to the bases, whereas for protein, more structural changes occurred for the side chains than the backbone atoms.

In the case of DNA the fluctuations are smaller compared to the protein and are similar over the nucleotides, except at the ends (data not shown). Fig. 5 represents the plots of the RMSD (Fig. 5 a) and the positional fluctuations of the atoms (Fig. 5 b) in a residue averaged over all the atoms in the residue of monomers EcoRIa (solid line) and EcoRIb (dashed line) in the complex in aqueous solution, with reference to the average structure in solution. Comparison between the fluctuation pattern of the two monomers of the protein shows striking similarity, indicating that the two monomers exhibit very similar dynamics in solution. The regions of the protein monomers that interact with the DNA strand are found to exhibit relatively smaller fluctuation compared to the other parts of the monomer. It is seen that larger changes occurred for the residues that mainly represent the different loops of the protein. The largest change occurred in the residue range Asn121 to Ala142, which corresponds to the longest loop in the structure of the protein.



View larger version (30K):
[in this window]
[in a new window]
 
FIGURE 5   (a) The average RMSD of the protein residues against the residue number of the monomers EcoRIa (thick line) and EcoRIb (thin line) and (b) the average positional fluctuations of the protein residues against the residue number of the monomers EcoRIa (thick line), EcoRIb (thin line), and the temperature factors of the crystal structure, converted to positional fluctuations (dotted line).

The time evolution of the radius of gyration of the complex (data not shown) does not show any significant change throughout the simulation, indicating that the complex maintained its overall shape and size during the simulation in aqueous solvent.

Characterization of the direct interactions between the DNA and the protein

Fig. 6 a compares the specific interaction pattern of the DNA bases of the first strand (DNA1) with the protein dimer of the complex in the crystal structure (dashed line) and the dynamic average structure in aqueous solution (solid line) against the DNA nucleotide number. Fig. 6 b represents the same for the other strand (DNA2) of the DNA. It is seen that in both the strands, all six bases in the recognition site interact with the protein in different ways, showing their different roles in providing the specificity and stability of binding. Interestingly, we found that both in the crystal structure and in the dynamics average in solution, some additional bases in the 5' side of the DNA strand outside the recognition site interact appreciably with the protein. These interactions may have an important role in the initial binding to some extent to the nonspecific region and then moving to the recognition part by a sliding mechanism, as suggested (Szczelkun et al., 1997; Jeltsch et al., 1994).



View larger version (39K):
[in this window]
[in a new window]
 
FIGURE 6   The interaction energies of the individual DNA residues with the whole protein in crystal (dashed line) and in the energy minimized dynamic average structure (solid line) against the DNA nucleotide number. (a) and (b) represent the cases of specific interaction between the protein and the DNA strand 1 and the DNA strand 2, respectively. (c) and (d) represent the nonspecific interactions for the DNA strands 1 and 2, respectively.

It is clearly seen that the interaction pattern for the two DNA strands (Fig. 6, a and b) is very similar to that in the crystal structure, indicating that most of the specific interactions between the DNA and the protein as found in the crystal structure are also present in the dynamic average structure and the interaction strengths, in general, are relatively enhanced in aqueous solution. These plots also indicate that, as in the crystal structure, the most important specific interactions of DNA in the simulation average structure come from Gua4 and Ade6 and the interaction strength is much enhanced in the simulated average structure in solution. The reason for the large enhancement of the specific interaction energy in the case of Gua4 in both strands was found to be closer H-bonding contact resulting from relaxation of the structure in the simulation. The distance between the acceptor and the hydrogen was 3.2 Å in the crystal structure, whereas it was less than 2.4 Å in solution structure.

Fig. 6, c and d, compares the nonspecific interaction pattern of the two DNA strands with protein in the crystal structure and in the average structure. Here also the pattern of the nonspecific interaction between the DNA nucleotide and the protein dimer in the simulation was found to be very similar to that seen in the crystal structure. The nonspecific interaction strengths in the dynamics average were found to be slightly enhanced for some nucleotides compared to the crystal structure, whereas for others it was found to be slightly reduced. The strongest nonspecific interactions of DNA came from Cyt3, Gua4, and Ade5. Specificity is determined by the overall difference between the total DNA-protein complex and a DNA-protein interaction energies of the wild-type complex and a mutant variant. When the difference is taken, it is expected that the common part coming from the nonspecific interaction should be cancelled out, making the specificity dependent mainly on the difference in specific interaction energies. However, the nonspecific interaction actually enhances the overall stability of the complex and thus, a large nonspecific interaction energy only shifts the equilibrium concentration of the complex toward higher value in the complex formation reaction. Fig. 7 a represents the specific-interaction energies between the whole DNA and EcoRIa (the first protein monomer) in the crystal structure against the protein residue number. A clearly distinct pattern of interaction is observed, indicating the relative strengths of the interactions with which the different bases of the DNA and the protein residues are interacting in the complex. The number of each residue that interacts considerably with the DNA (as evidenced by this energy calculation) is indicated at the top of the corresponding histogram bar. The interaction of the DNA is found to occur with several different highly local regions on the one-dimensional sequence of the protein. Fig. 7 b represents the specific-interaction pattern between the whole DNA and EcoRIa in the average structure in aqueous solution. Fig. 7 c represents the same for the whole DNA and EcoRIb (the second monomer of the protein). In a similar fashion Fig. 8 a represents the nonspecific interaction patterns between the whole DNA and EcoRIa in the crystal structure. Fig. 8, b and c, represents the same for the non-specific interaction between the DNA and EcoRIa (Fig. 8 b) and DNA and EcoRIb (Fig. 8 c) in the average structure in solution. Comparison of the interaction strengths for the different residues gives a quantitative measure of the relative importance of the different residues of the protein in the process of molecular recognition and stabilization of the complex. From the plots in Figs. 7 and 8, it is evident that the interaction patterns are very similar to those in the crystal structure. It was also found that the most important interactions for specificity seem to involve the protein residues Ala138, Gly140, Asn141, and Arg145, in which any alteration is expected to result in a considerably altered specificity of DNA sequence, whereas the most important nonspecific interactions come from the residues Lys113, Lys117, Arg145, and Lys148, where mutation should most affect the overall stability of the complex. However, the relative importance of the interactions of these residues in aqueous solution are found to be slightly different than those in the crystal structure. Variation of the interaction strengths in both the cases of specific and nonspecific interactions are mainly found to occur due to improved H-bonding geometry and the development of a few more interactions between the two interacting residues (Table 1). It may be mentioned here that in the crystal there are contacts between the primary protein-DNA complex and the neighboring complexes. In solution environment such contacts are lost. Thus, this physical difference may also be partly responsible for some of the differences in the structural and interactional properties of the complex observed here.



View larger version (15K):
[in this window]
[in a new window]
 
FIGURE 7   Specific interactions of the whole DNA duplex with (a) the protein monomer EcoRIa in crystal structure, (b) the protein monomer EcoRIa in the average structure in aqueous solution, and (c) the protein monomer EcoRIb in the average structure in aqueous solution, against the protein residue number.



View larger version (18K):
[in this window]
[in a new window]
 
FIGURE 8   Nonspecific interaction of the whole DNA duplex with (a) the protein monomer EcoRIa in crystal structure, (b) the protein monomer EcoRIa in the average structure in aqueous solution, and (c) the protein monomer EcoRIb in the average structure in aqueous solution, against the protein residue number.


                              
View this table:
[in this window]
[in a new window]
 
TABLE 1A   H-bonding interaction between the DNA and protein


                              
View this table:
[in this window]
[in a new window]
 
TABLE 1B   H-bonds between residues of monomers

Further calculations of the nonbonded interaction energies show that, as in the crystal structure, each DNA strand is not only interacting mostly with the corresponding protein monomer (DNA1-EcoRIa and DNA2-EcoRIb) but also cross-interacting with the other monomer (DNA1-EcoRIb and DNA2-EcoRIa) with comparable energies in solution simulation (data not shown). Only a few bases in the recognition site are involved in these cross-interactions and the most important interactions come from the protein residues Arg200. Table 1A gives a detailed comparison of the DNA-protein H-bonding interaction pattern in solution simulation with those in crystal structure. It was found that the specific interactions involving the N7 atom of the purine bases Gua4, Ade5, and Ade6 were maintained throughout the simulation, whereas the specific interactions involving the amino groups of the Ade5 and Ade6 were found to be less stable with less H-bonding time and shorter lifetime during the later part of the present simulation. Table 1B gives the comparison of the H-bonding interaction pattern between the two monomers of the protein in crystal and in the average simulation structure.

It is clearly seen from Fig. 5, c and d, that the nonspecific interactions of Ade6 and Thy8 in both DNA strands with the protein are decreased in the simulated average structure. We looked into the local structure and the time development of these interactions and found that each of these nucleotides interacts with a few residues of the protein. During dynamics, in the case of Ade6, the side chain of the interacting protein residue Lys113 moved away from its phosphate oxygen; as a result, the overall interaction was reduced. In a similar way the side chain of the residue Lys117, which was interacting with the DNA nucleotide Thy8 in the crystal structure, moved out due to a rotation of the side chain. As these reductions in interactions happened in both strands of the DNA, it seems to be a natural consequence of the solvent environment. It was also found that the moving out of Lys117 is the result of hydrophilic effect for both strands, which indicated an increase in their interaction energy with water. On the other hand, in the case of Ade6, we found that a water molecule penetrated into the local structure and screened the interaction between Ade6 and Lys113. Moreover, in the H-bond list, it may be noted that a crucial specific interaction (between H61 of Ade5-DNA1 and OD1 of Asn141 of EcoRIb) is absent. To find the reason, we looked into the time evolution of the distance between these two atoms and found that this H-bond was well maintained up to the first 375 ps of the dynamic simulation and then suddenly disrupted due to a rotation of the side chain of the Asn141 of EcoRIb (Fig. 9). However, the same interaction in the other strand was properly maintained throughout the entire simulation period (Fig. 9). As the other important interactions between the DNA and the protein were also maintained, we consider this event an isolated accidental case that does not affect the general validity of the results. Because this conformational change occurred only once, we can say only that this is a possible motion and cannot estimate the likelihood that it will happen.



View larger version (33K):
[in this window]
[in a new window]
 
FIGURE 9   The time evolution of the distance between the atom H61 of Ade5 of the first strand of the DNA and the atom OD1 of Asn141 of the second monomer of the protein (solid line) and the distance between the atom H61 of Ade5 of the second strand of the DNA and the atom OD1 of Asn141 of the first monomer of the protein (dotted line).

Calculation of the non-bonded interaction energies is also quite useful for identifying the protein residues in the interface of the DNA and protein. Consideration of only those protein residues whose energies of interaction with the DNA are >0.5 kcal/mol indicates that parts of the beta  sheets beta I, beta II, beta 2, beta 3, and beta 4, the alpha -helices alpha 1 and alpha 4, and the loops between them form the DNA-protein interface. Only those protein residues within the cutoff for evaluation of non-bonded interactions (12 Å) are included in these calculations, and, because DNA is a charged molecule, there is small but finite possibility that a charged amino acid residue at a larger distance could also have an interaction energy >0.5 kcal/mol. The names of the beta  sheets and alpha  helices are according to the crystal structure (Rosenberg, 1991; Newman et al., 1994).

Characteristics of the DNA in the average structure of the complex in solution

To see if any significant difference occurs to the structure of the kinked DNA of the complex in aqueous solution, we looked into the helicoidal parameters of the DNA in the average structure of the complex obtained from simulation in water and compared them with the corresponding values in the crystal structure. Fig. 10 shows the summary of the comparison of some of the helicoidal parameters between the crystal structure and the solution average structure. A large difference is observed in the average inclination value for the bases in the DNA in simulated average structure of the complex compared to the crystal structure. The Tilt values were also found to be more irregular in the simulated structure than in the crystal structure. The observed differences may be due to the fact that the contacts of the complex with the other neighboring units in the crystal are lost in the solution environment, allowing enhanced local effects. The indirect effects of A-helix bias of CHARMM parameters may also be partially responsible (Yang and Pettitt, 1996; MacKerell, 1997; Feig and Pettitt, 1997; Sprous et al., 1998). Looking at the different helicoidal parameter values of the crystal structure also shows a few unusually large local values (rise, etc.), indicating that the strong interaction between DNA and the protein introduce strong local effects which may be more pronounced in solution. The data also indicate that the strong binding also cause the parameters to be significantly irregular over the entire DNA duplex in the complex. However, the average values over all the nucleotides of Rise (3.69 in the crystal and 3.74 in the simulation) and Twist (31.2 in the crystal and 28.7 in the simulation) show close similarities between crystal structure and the average structure. The average values of the different backbone torsional angles were mostly similar in the crystal structure and in the simulation average structure, except in nucleotides number 8 and 9, for which the torsional angles delta , epsilon , and xi  were found to be significantly different.



View larger version (28K):
[in this window]
[in a new window]
 
FIGURE 10   Summary of the comparison between the helicoidal parameters for the DNA in the EcoRI complex in crystal structure (solid line) and the average structure in aqueous solution (dashed line) against the DNA nucleotide number. The left column represents the first strand of DNA (DNA1) and the second column represents the other strand.

The time evolution of the interactions between DNA and protein

The time evolutions of the interactions between the component molecules are also important quantities for better characterization of the behavior of the complex in aqueous solution. The average values evolution of the interactions of DNA1-EcoRIa and DNA2-EcoRIb over the last 300 ps of the trajectory were obtained as -351 ± 16 kcal/mol and -395 ± 21 kcal/mol, respectively. The averages were computed as the mean of the interaction energies in frames at an interval of 2 ps obtained from the last 300 ps of the dynamic trajectory. The linear correlation coefficient between these two time series was obtained as 0.34, indicating a significant degree of correlation between the dynamics of the direct interaction of these two pairs of systems in the complex. The average values for the DNA1-EcoRIb and DNA2-EcoRIa, calculated in the same way are found to be -138 ± 13 and -133 ± 11 kcal/mol, respectively. The linear correlation coefficient between these two time series was found to be 0.04, indicating practically uncorrelated dynamics. Calculation also showed that the two protein monomers interact favorably with each other and their mutual interaction contributes considerably to stabilizing the complex. For the time evolution of the interaction between the two monomers of the protein it was found that during the early part of the simulation a more favorable interaction was grown, which gradually disappeared, and the interaction reached a steady state with an average interaction strength -808 ± 49 kcal/mol, which is only slightly less favorable than in the initial state. The physical reason behind this behavior seems to be the interaction with water molecules. During initial dynamics only the local interactions between the monomers dominate and grow faster. As the dynamic simulation continues, the complex relaxes and more and more water molecules get access to the protein and adjust themselves around the solute, making favorable interactions with the monomers. This is evident from the time evolution of the interaction energy of the protein monomers with water (discussed below), at the cost of the direct interactions between the two monomers. The time evolution of these energies and the values of the associated fluctuations clearly show that these interactions are well maintained throughout the dynamic simulation. Calculation of the linear correlation between the time series for the interaction between the two monomers of the protein and the DNA1-EcoRIa interaction show a linear correlation coefficient of -0.51 and for the other pair as -0.43 indicating that in each case the interactions are considerably anti-correlated to a similar degree. The DNA1-EcoRIb and DNA2-EcoRIa show correlation with the monomer-monomer interaction of the protein with correlation coefficients 0.61 and 0.17, respectively. The large difference between the correlation coefficients in these two cases is not quite clear. The overall DNA-protein interaction energy is anti-correlated with the dynamics of the monomer-monomer interaction of the protein by a correlation coefficient -0.29. The anti-correlated nature of the time series of the individual DNA-protein interaction and the protein-protein interaction simply implies that during the equilibrium dynamics, the variation in the DNA-protein interaction energy grows or decays partially at the cost or gain of the protein-protein interaction. However, the average interaction energy between the two protein monomers was found to be stronger than any of the average interactions between DNA1 and EcoRIa, DNA2 and EcoRIb, DNA1 and EcoRIb, or DNA2 and EcoRIa, though altogether the DNA-protein interaction is stronger than the monomer-monomer interaction of the protein. Fig. 11 represents the interaction map between the two protein monomers indicating the mutually interacting regions. Only interactions >0.5 kcal/mol in magnitude were considered. The monomer-monomer interface thus consists of selected regions of the monomers. It was found that their interaction consists mostly of the H-bonding interactions and van der Waal interaction also constitutes a significant contribution. In some cases strong interaction of as much as -74.4 kcal/mol between two residues of the two monomers were observed. Throughout the dynamic simulation the interaction energy between the two strands of the DNA remained quite stable, with an average value of -1045 ± 24 kcal/mol.



View larger version (22K):
[in this window]
[in a new window]
 
FIGURE 11   Comparison of the residue-residue maps of the interaction between the two monomers (EcoRIa and EcoRIb) of the protein in crystal structure (ddager ) and in simulation average structure (+). The major secondary structures are also indicated.

Solvent interactions: solvent-accessible surface area, H-bonding, and water bridging

Considering the time developments of the interactions of the DNA and the protein separately with water (Fig. 12) it is seen that in both cases favorable interaction with water grew considerably with time. The interaction energy reached a stable average value for the DNA very rapidly, whereas for the protein the growth of interaction occurred in two phases: a first phase of large growth in interaction occurred very fast, then saturation was reached slowly. A double exponential (E = Einfinity  + A1e-t/t1 + A2e-t/t2) fit of the time evolution of the DNA-water interaction gives the first time constant, t1, as 7.2 ps and the second time constant, t2, as large as 1933.5 ps, indicating that the variation in the second phase is very slow over the time period of our simulation. Similarly, for the time evolution of the protein-water interaction, t1 was obtained as 4.2 ps and t2 as 309.4 ps. Because water molecules are small, their reorientational relaxation time is also small. Thus it seems that the variation in the DNA-water interaction energy is due mainly to the reorientations of the nearby water molecules. The time evolution of the RMSD and the self energy of DNA in the complex also indicated very little change in its structure. On the other hand, it appears that the rapid increase in the protein-water interaction in the early part of the simulation occurs due to the reorientational relaxation of the nearby water molecules, and the slow growth part results due to the relaxation of the protein monomers themselves to enhance their interaction with water. This physical picture is consistent with the above fitting results.



View larger version (16K):
[in this window]
[in a new window]
 
FIGURE 12   Time evolutions of the interactions of water with the protein (solid line) and the DNA (dashed line).

In order to investigate how these interactions of the solvent with the solute are related to the solvent accessible surface area (ASA) of the solute complex, we analyzed different aspects of the ASA value of the whole complex and also separately for the molecular components. Fig. 13 a represents the time evolution of the ASA value of the whole complex. It is clearly seen that during the first 100 ps of dynamics in solution, the complex was gradually loosened up with a considerably larger overall ASA value of the complex and maintained that state throughout the simulation. We also compared the solvent-accessible surface area of the molecular components of the complex between its initial crystal structure and the average values calculated from different frames at an interval of 1 ps from the last 200 ps of the dynamic trajectory. Table 2 summarizes the results for comparison. It was found that the ASA for both the protein and the DNA have increased significantly in the simulation compared to that in the crystal structure. It is also seen from Table 2 that major contributions to the enhancement in the protein's ASA is caused by the increased solvent exposure of the charged, polar and the glycine units of the protein while the non-polar residues showed a significant decrease in ASA value due to hydrophobic effect. Calculation also shows that the hydrophobic residues have the least average ASA (19.7 Å2) per residue and the charged residues have the largest value (67.5 Å2). An overview of the difference in solvent exposure of the individual protein residues of the monomer EcoRIa between the crystal structure and the solution simulation is given in Fig. 13 b as a histogram plot of the difference (Delta ASA = ASAave - ASAcrys) in the residue-wise ASA values of the EcoRIa protein monomer in crystal (ASAcrys) and in the average solution structure (ASAave). A positive bar for a residue in this plot represents an enhancement of the solvent exposure of that residue while a negative bar indicates a reduction of solvent exposure. Considerable enhancement of exposure for some residues are seen clearly, while for some other residues reduction of solvent exposure can be observed. The residues in the interior of the complex have zero exposure and do not show any change. Thus, the ASA calculations also are consistent with the idea that the growth of favorable solvent interaction is partly due to some structural changes of the protein.



View larger version (27K):
[in this window]
[in a new window]
 
FIGURE 13   (a) Time evolution of the solvent-accessible surface area of the whole complex and (b) plot of the difference (DASA = ASAave - ASAcrys) between the ASA values of the individual residues of the protein monomer EcoRIa in crystal (ASAcrys) and in solution average structure (ASAave).


                              
View this table:
[in this window]
[in a new window]
 
TABLE 2   Solvent-accessible surface areas

To have an idea about the amount of surface area of the DNA duplex actually accessible to the solvent, we compared the ASA values of the DNA duplex when it is in the complex and when it is isolated from the complex. The results for the crystal structure as represented in Table 3 indicate that although a large fraction of the DNA surface is covered by the protein, a significant amount is still accessible to the solvent. The average values calculated from the dynamic average (Table 2) show a significant increase in the ASA value for the DNA duplex and this allows the DNA to form considerable number of H-bonds with water and also water bridges with protein. Analysis of H-bonds of water with the DNA indicates that strong H-bonds are mostly formed with the phosphate group oxygens and occasionally with the O5' and O3' atoms of the DNA backbone. However, in some cases the O2, H42, of Cyt, O2 of Thy, N3 of Ade, and N3, H22, O6, and N7 atoms of Gua bases were found to participate in H-bonding with water. Hydrogen bonds between water and the protein occur extensively with both the peptide backbone and the side chains.


                              
View this table:
[in this window]
[in a new window]
 
TABLE 3   Comparison of solvent-accessible surface area values for the DNA in the DNA-EcoRI complex

We have separately analyzed the water bridges between DNA and protein and those between protein and protein. The results are summarized in Table 4, A and B. We found that the DNA formed a considerable number of water bridges with the protein, which are quite long-lived, as seen in Table 4 A. We have also found several strong water bridges between the two monomers of the proteins. All these water bridges increase the overall stability of the entire complex. It may be pointed out that in the present study we did not find any water bridge between the DNA and the protein involving the bases which could enhance the specificity of the DNA-protein interaction. It is clearly seen from Table 4 A that water bridges are formed mainly involving the oxygen atoms of the phosphate groups and the O3' atoms of the DNA backbone. We have not found any water bridge between the two strands of the DNA, which may be a consequence of the fact that the DNA is substantially buried into the protein, but have found some water bridges between different atoms of the same DNA strand, particularly at the ends of the helix. In some cases it is observed that depending on the geometric criteria (i.e., distance and angle) for H-bonds used, calculation shows breaks in the bridging interaction with one atom pair while at the same time developing a new bridge between another atom pair in the same locality. The water molecule only changes its orientation and, as a result, remains trapped in that locality for quite a long time, but the average lifetime for the bridges are less compared to the overall residence time of a water molecule in that locality. In most cases a specific atom pair is bridged by a water molecule over a substantial part of the trajectory where the bridging water molecule is some times exchanged with another water molecule or the same water molecule renews the bridge after a break making the average lifetime shorter.


                              
View this table:
[in this window]
[in a new window]
 
TABLE 4A   DNA-protein water bridges


                              
View this table:
[in this window]
[in a new window]
 
TABLE 4B   Water bridges between protein monomers

Interactions and distribution of counterions

Calculation of the average interaction energies of the counterions with the DNA and separately with the protein indicates that the overall ion-protein interaction energy (-539.5 ± 56 kcal/mol) is stronger than the ion-DNA interaction energy (-342.4 ± 33 kcal/mol). The reason seems to be the relative size of the molecular components. As the protein is very large compared to the DNA and in the complex, it covers a large part of the DNA double helix, the ions, like the water molecules, cannot get much access to the DNA; rather, they get much more access to the protein. As a result the overall interaction energy with protein is found to be more favorable than that with the DNA. However, the interaction energy per residue indicates much stronger interaction with the DNA than with the protein, as expected. Fig 14 shows the superimposed positions of the counterions over the last 200 ps of the dynamics trajectory at an interval of 1 ps obtained after removing the overall rotational and translational motions of the solute from their coordinates. It was found that the ions are mobile but always stay close to the surface of the complex and there are several local spots on the molecular surface where the ions stay most of the time with occasional moving over to a nearby similar spot. These local sites correspond mainly to the phosphate groups of the DNA exposed to the solvent and the negatively charged groups of the protein residues accessible to the solvent.



View larger version (106K):
[in this window]
[in a new window]
 
FIGURE 14   Plot of the superimposed positions of the counterions obtained from snapshots at every 2 ps from the last 200 ps of the trajectory after removing the overall rotational and translational motions from them. The average structure of the complex is also shown. The two protein monomers are represented in red and green and the ion positions are represented in yellow.

Relaxation and energetics of the kink in the DNA

The time evolution of the RMSD of the kinked DNA isolated from the DNA-EcoRI complex with respect to its starting kinked structure shows a rapid growth of the RMSD value over the first 200 ps of dynamic simulation and then the average value is stabilized around 5.5 Å (Fig. 15). This RMSD behavior clearly indicates that the initial distorted structure of the kinked DNA isolated from the complex has been relaxed properly during the MD simulation in aqueous solution. Similar structural studies on the EcoRI recognizing kinked DNA duplex by MD simulation have been performed in a recent work (Duan et al., 1997) and have found