


* Instituto de Biología Molecular y Celular, Universidad Miguel Hernández, Elche (Alicante), Spain;
Centro de Biología Molecular "Severo Ochoa" (Consejo Superior de Investigaciones Cientificas-Universidad Autonoma de Madrid), Universidad Autónoma de Madrid, Madrid, Spain; and
Biocomputation and Complex Systems Physics Institute, Zaragoza, Spain
Correspondence: Address reprint requests to J. L. Neira, Tel.: 34-96-665-8459; E-mail: jlneira{at}umh.es.
| ABSTRACT |
|---|
|
|
|---|
| INTRODUCTION |
|---|
|
|
|---|
CA of HIV-1 is formed by two independently folded domains separated by a flexible linker (16–19). The N-terminal domain (residues 1–145 of the intact protein) is composed of five coiled-coil α-helices, with two additional short α-helices after an extended proline-rich loop (16–18). The C-terminal domain (residues 146–231), CAC, is a dimer both in solution and in the crystal form (19,20). Each CAC monomer is composed of a short 3₁₀-helix followed by an extended strand and four α-helices: α-helix 1 (residues 160–172), α-helix 2 (residues 178–191), α-helix 3 (residues 195–202), and α-helix 4 (residues 209–214), which are connected by short loops or turnlike structures. The dimerization interface is largely formed by the mutual docking of α-helix 2 from each monomer, with the side chain of each tryptophan (Trp184) deeply buried in the dimer interface (19,20). The two other aromatic residues in each monomer, Tyr164 and Tyr169, are within the hydrophobic core of each monomer, well away from the dimer interface.
We have previously described the folding and association reactions of CAC, which involve a monomeric intermediate that rearranges and dimerizes to yield the native dimer (21). The structure of this monomeric intermediate, as shown by CD and fluorescence, is nativelike with a solvent-exposed tryptophan (21). This monomeric intermediate has been observed in the thermal unfolding of the dimeric species by FTIR and NMR spectroscopies (22). We have determined by thermodynamic analysis, using alanine mutants, the energetic contribution of each side chain at the dimerization interface to the folding and association of CAC (23). These studies have shown that the side chain of Trp184 (21,23) and those of Ile150, Leu151, Arg154, Leu172, Glu175, Val181, Met185, and Leu189 are key for CAC dimerization (23). Moreover, the Trp184Ala mutant remains essentially monomeric at protein concentrations of hundreds of micromolar (23). The spectroscopic results, as well as the quantitative agreement of the thermodynamic parameters (free energies and m-values) of the folding of the nonmutated CAC and of the monomeric CAC mutant Trp184Ala (23), indicate that the structure of the latter may resemble that of the monomeric intermediate detected during the folding and association of nonmutated, dimeric CAC.
To provide further insights into the association of CA from HIV, we have solved the three-dimensional structure of the CAC mutant Trp184Ala in solution by NMR. In this work, for the sake of clarity, the mutant will be referred to as CACW40A to denote the position of the mutation in the C-terminal domain; also, we number the amino acids of CACW40A from its first residue, i.e., the added N-terminal methionine is Met1, and the second residue is Ser2 (which corresponds to Ser146 in the numbering of the intact CA). The results revealed that, in CACW40A, only three of the four native α-helices of CAC are stably formed. The second helix seems to be involved in a conformational exchange equilibrium. The third and fourth α-helices (second and third helices in CACW40A) are rotated 90° with respect to their orientation in the nonmutated protein. The extraordinary ability of the CAC monomer to change its structure may contribute to the different modes of association of CA during HIV assembly.
| MATERIALS AND METHODS |
|---|
|
|
|---|
Protein expression and purification
The unlabeled mutant CACW40A protein was expressed in Escherichia coli BL21(DE3) in Luria Broth and purified as described in the literature (21–23), with the following modifications: 1), after ammonium sulfate precipitation, the protein was redissolved in 50 mM acetate buffer, pH 5.2; 2), the same buffer containing NaCl was used to elute the protein from the SP- and Heparin-Sepharose columns; and 3), before the size-exclusion chromatographic step, the protein was dialyzed against 25 mM phosphate buffer, pH 7.3, and the solution was supplemented with NaCl up to a final concentration of 150 mM. Protein stocks were run in SDS-PAGE gels and found to be >97% pure. The 15N-labeled sample was obtained by using Bioexpress medium (Cambridge Isotope Laboratories, Andover, MA). The labeled protein was purified with the same protocol. Protein concentration was calculated from the absorbance measured at 280 nm, by using the extinction coefficients of amino acids (24). Samples were concentrated to the final NMR concentration by using Centriprep Amicon devices (Millipore), with a molecular weight cutoff of 3500 Da.
Fourier transform infrared spectroscopy (FTIR)
Samples of nonmutated CAC and mutant CACW40A in 10 mM phosphate buffer, pH 6.8, were concentrated with Vivaspin devices (Vivascience, Hannover, Germany), with a molecular weight cutoff of 5000 Da, to a final protein concentration of 160 µM. Volumes of 100 µl of this stock sample were dried in a Speed Vac concentrator (Savant, Farmingdale, NY), resuspended in D2O, and incubated at room temperature for 90 min to maximize the H-D exchange of the protein. Finally, the samples were dried again, dissolved in 20 µl of D2O, and placed between a pair of CaF2 windows separated by a 50-µm thick spacer in a Harrick (Ossining, NY) demountable cell. Spectra were acquired at 298 K on a model No. FTIR-66S instrument (Bruker Optik, Ettlingen, Germany) equipped with a deuterated triglycine sulfate detector and fitted with a water bath (Thermo Haake, Paramus, NJ). The cell container was continuously filled with dry air. Usually, 500 scans per sample were taken, averaged, apodized with a Happ-Genzel function, and Fourier-transformed to give a final resolution of 2 cm–1. The contributions of the buffer spectra were subtracted, and the resulting spectra were used for analysis, as described (25). The error in the estimation of the percentage of secondary structure depends mainly on the removal of spectral noise and, based on several model proteins, this error was estimated to be 2% (25). To be conservative, we have considered the estimated error in our band decomposition to be 4%.
Protein secondary structure components were quantified from curve-fitting analysis by band decomposition of the original amide I' band after spectrum smoothing. Spectrum smoothing was carried out by applying the maximum entropy method, assuming that noise and band shape follow a normal distribution. The minimum bandwidth was set to 12 cm–1. The resulting spectra had a signal/noise ratio >4500:1. Derivation of IR spectra was performed using a power of 3 and a breakpoint of 0.3, and Fourier self-deconvolution was performed using a Lorentzian bandwidth of 18 cm–1 and a resolution enhancement factor (the so-called factor k) of 2.0. To quantify the secondary structure, the number and position of the absorbance band components were taken from the deconvoluted spectra; the bandwidth was estimated from the derived spectra; and the absorbance height from the original spectra. The iterative curve-fitting process was performed with Spectra-Calc (Galactic Industries, Salem, NH). The number, position, and band shape were kept fixed during the first 200 iterations. The fittings were further refined by allowing the band positions to vary for 50 additional iterations. The agreement between experimental and theoretical spectra was assessed from the χ² values (within the range 1 × 10–5 to 4.5 × 10–5). The area of the fitted absorbance band components was used to calculate the percentage of secondary structure.
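The last step of the band decomposition, converting fitted band areas into percentages of secondary structure, can be sketched as follows. This is only an illustration: the band positions are those mentioned in this work (1626, 1650, and 1660 cm–1), but the areas are hypothetical numbers chosen for the example, not measured values, and the function name is our own.

```python
def secondary_structure_percentages(band_areas):
    """Convert fitted amide I' band areas into percentages of the total area.

    band_areas maps a band position (cm^-1) to its fitted area (arbitrary units).
    """
    total_area = sum(band_areas.values())
    if total_area <= 0:
        raise ValueError("total band area must be positive")
    return {position: 100.0 * area / total_area
            for position, area in band_areas.items()}


# Hypothetical areas, for illustration only (not data from this study).
example_areas = {1626: 2.0, 1650: 5.0, 1660: 3.0}
percentages = secondary_structure_percentages(example_areas)

for position in sorted(percentages):
    print(f"{position} cm-1: {percentages[position]:.1f}%")
```

Because each percentage is a ratio of areas, any error in a single fitted band propagates to all components, which is one reason to quote a conservative overall uncertainty.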
NMR samples
All NMR experiments were acquired on a model DRX-500 spectrometer (Bruker) equipped with a triple resonance probe and z-axis pulsed-field gradients. Homonuclear 1D- and 2D-NMR experiments were performed with 1 mM samples in 0.5 ml at 293, 298, and 303 K, pH 7 (uncorrected for deuterium isotope effects) in either H2O/D2O (90%/10%, v/v) or D2O. Heteronuclear two- and three-dimensional spectra were acquired at 293 K. Centriprep Amicon devices (Millipore) were used to exchange the sample into 100% D2O, over 4 h at 278 K. TSP was used as the internal chemical shift reference in the homonuclear experiments. 15N chemical shifts were referenced following the method of Wishart (26).
Translational diffusion measurements (DOSY experiments)
Translational self-diffusion measurements were performed with the pulsed-gradient spin-echo NMR method (27,28). The following relationship exists between the translational self-diffusion coefficient, D, and the NMR parameters,
$$\frac{I}{I_0} = \exp\left[-D\,\gamma_H^2\,\delta^2\,G^2\left(\Delta - \frac{\delta}{3} - \frac{\tau}{2}\right)\right] \qquad (1)$$
where I is the measured peak intensity; I0 is the peak intensity in the absence of gradient attenuation; $\gamma_H$ is the gyromagnetic ratio of the proton; δ is the duration (in seconds) of the gradient; G is the strength of the gradient (in T cm–1); Δ is the time (in seconds) between the gradients; and τ is the recovery delay between the bipolar gradients (100 µs). Data are usually represented as –ln(I/I0) vs. G², and the slope of the line is $D\gamma_H^2\delta^2(\Delta - \delta/3 - \tau/2)$, from which D can be easily obtained.
The Stokes-Einstein equation relates D to R, the hydrodynamic radius of the sphere, and to the viscosity of the solvent, η, according to
$$D = \frac{kT}{6\pi\eta R} \qquad (2)$$
where T is the temperature (in K) and k is the Boltzmann constant. The viscosity of a solution is very weakly influenced by the macromolecule component at the concentrations used in the NMR studies, and therefore, the viscosity of the solution is that of the solvent. Solvent viscosity is temperature-dependent, according to Lapham et al. (29):
$$\log_{10}\eta = a + \frac{b}{c - T}$$
The terms a, b, and c are given for a particular D2O:H2O ratio. In our conditions, a 100% D2O solution, the values are a = –4.2911, b = –164.97, and c = 174.24. This yields a value of η = 1.253 × 10–3 kg/(m s) at 293 K, used in our calculations.
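As a numerical check of the solvent viscosity, the empirical temperature dependence can be evaluated directly. The sketch below assumes the functional form log10(η) = a + b/(c – T) and assumes that η is returned in kg m–1 s–1 (Pa s); with the coefficients quoted for 100% D2O this reproduces the digits 1.253 at 293 K. The function name is our own.

```python
def d2o_viscosity(temperature_k, a=-4.2911, b=-164.97, c=174.24):
    """Viscosity of 100% D2O from the empirical fit log10(eta) = a + b / (c - T).

    Assumption: the fit returns eta in kg m^-1 s^-1 (Pa s); T is in kelvin.
    """
    return 10.0 ** (a + b / (c - temperature_k))


eta_293 = d2o_viscosity(293.0)
print(f"eta(293 K) = {eta_293 * 1e3:.3f} x 10^-3 kg/(m s)")
```

In these units the value corresponds to about 1.25 cP, in line with the known higher viscosity of D2O relative to H2O at the same temperature.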
The gradient strength was calibrated by using the diffusion rate for the residual proton water line in a sample containing 100% D2O in a 5-mm tube, and backcalculating G. Experiments were acquired with a postgradient eddy-current relaxation delay of 5 ms. Each experiment was averaged over 128 scans and the number of points was 16 K. The strength of the gradient was varied from 2% of the total power of the gradient coil to 95%, with a sine function shape. The largest protein concentration used was 1 mM, and the other used concentrations were dilutions of that protein stock. The duration of the gradient was varied between 2.2 and 3 ms, and the time between both gradients was between 100 and 150 ms. The most upfield-shifted methyl groups were used to measure the changes in intensity.
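The extraction of D from the gradient-dependent signal decay can be sketched as a linear fit of –ln(I/I0) against G². The example below is a minimal illustration under stated assumptions: it uses the standard attenuation expression for bipolar gradients, SI units throughout (G in T m–1, D in m² s–1; 1 m² s–1 = 10⁴ cm² s–1), and synthetic noise-free intensities generated from an assumed D rather than experimental data. Function names are our own.

```python
import math

GAMMA_H = 2.6752218744e8  # proton gyromagnetic ratio, rad s^-1 T^-1


def attenuation_factor(delta, big_delta, tau):
    """Return gamma_H^2 * delta^2 * (Delta - delta/3 - tau/2), in s T^-2."""
    return GAMMA_H ** 2 * delta ** 2 * (big_delta - delta / 3.0 - tau / 2.0)


def fit_diffusion_coefficient(gradients, intensities, i0, delta, big_delta, tau):
    """Least-squares slope of -ln(I/I0) vs. G^2, divided by the attenuation factor."""
    xs = [g * g for g in gradients]
    ys = [-math.log(i / i0) for i in intensities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope / attenuation_factor(delta, big_delta, tau)


# Synthetic example: gradient duration 2.2 ms, diffusion delay 100 ms,
# recovery delay 100 us, and an assumed D of 1.2e-10 m^2/s (1.2e-6 cm^2/s).
true_d = 1.2e-10
delta, big_delta, tau = 2.2e-3, 0.100, 100e-6
gradients = [0.05 * step for step in range(1, 11)]  # T/m, hypothetical ramp
factor = attenuation_factor(delta, big_delta, tau)
intensities = [math.exp(-true_d * factor * g * g) for g in gradients]

fitted_d = fit_diffusion_coefficient(gradients, intensities, 1.0,
                                     delta, big_delta, tau)
print(f"D = {fitted_d * 1e4:.3e} cm^2/s")
```

With real data, the fit would be applied to the intensities of the most upfield-shifted methyl signals, and a visibly nonlinear plot would suggest more than one diffusing species.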
1D-, 2D-, and 3D-NMR spectroscopy
All spectra were acquired in the phase-sensitive mode. Frequency discrimination in the indirect dimensions was achieved by using either the time-proportional phase incrementation technique, TPPI (30), or States-TPPI (31).
One-dimensional spectra were acquired with 16 K data points, averaged over 512 scans with 6000 Hz of spectral width (12 ppm). 1H homonuclear TOCSY (32), using the MLEV-17 sequence, with mixing times of 45, 60, and 80 ms (33), and NOESY (34) spectra with 80, 150, and 200 ms mixing times were recorded by standard methods with water suppression achieved by WATERGATE (35). Usually, the size of the acquisition data matrix was 2048 x 512 points in t2 and t1, respectively, with 64 scans (TOCSY) or 128 scans (NOESY) per increment in the t2 dimension. The spectral width was 6000 Hz in both dimensions. Before Fourier transformation, the two-dimensional data matrix was multiplied by a phase-shifted sine bell or square sine bell function in both dimensions. The corresponding shift was optimized in every experiment. Baseline correction and zero-filling were applied in both dimensions. All spectra were processed and analyzed by using XWINNMR (Bruker AXS, Karlsruhe, Germany) running on a PC.
Heteronuclear experiments included 15N-HSQC (36), three-dimensional 15N-TOCSY-HSQC (37), three-dimensional 15N-NOESY-HSQC (37), and HNHA (38). Spectral widths were 6000 Hz for 1H and 1000 Hz for 15N. Mixing times for the NOESY experiments were 150 and 200 ms, and 45, 60, and 80 ms for the TOCSY experiments. In the three-dimensional experiments, 2 K data points were acquired in the t3 dimension (1H), and 64 and 512 scans were acquired in the t2 and t1 dimensions, respectively. In the 2D-HSQC experiments, 4 K data points were obtained in the 1H dimension and 128 scans were acquired in the 15N axis.
Geometrical restraints
1H NMR resonances were assigned by using standard sequential assignment procedures (39). Spin systems were identified by analysis of the two- and three-dimensional TOCSY experiments. The use of the three temperatures and the two- and three-dimensional experiments allowed us to resolve most of the ambiguities that arose from chemical shift degeneracies. All spectra were analyzed by using the Sparky software (40).
The signals from the 2D-NOESY experiment recorded at 150 ms were used as input for the structure calculation. After integration, the peak volumes and chemical shifts list were output to a CYANA-compatible format (41). The automatic calibration method implemented in CYANA, CALIBA (42), was run to transform the peak volumes into distances. Default weighting factors were attributed to the different restraint categories for different types of atoms. The upper limits obtained were used without modification for the initial structure calculations.
The assignment of cross-peaks in the NOESY spectrum was carried out in several steps. A preliminary NOE data set (500 NOEs) was used to calculate initial structures with CYANA. The resulting conformers with the lowest target functions were used to assign additional peaks in the NOESY spectrum. The cycle of calculations and assignments was repeated until no further assignments were possible. The quality of the introduced constraints was checked at every step by analyzing the restraint violations of the calculated conformers. The NOE cross-peaks corresponding to restraints that were consistently violated in a significant number of structures were checked for possible overlap in the 3D-NOESY, and the corresponding restraints were modified. The cycle was repeated until no consistent violation was detected. This procedure is exemplified by the region Ser34 to Ala40, where no cross-peaks in the HSQC spectrum were observed (see Results). However, the first runs of the structure calculations indicated the proximity between the aliphatic chains of residues in this region and the aromatic moieties of the first helix (Phe17, Tyr20, Phe24, and Tyr25). If such proximity was observed in more than three-quarters of the calculated structures, and there were unassigned NOEs involving the aromatic moieties, then those NOEs were tentatively assigned to the side chain of a residue in that region.
Slowly exchanging amide protons were identified in a 2D-NOESY recorded after the exchanging step. Hydrogen-bond restraints were included in the last steps of the calculation if 1), the amide proton signal was observed in the exchange NOESY experiment; and 2), the hydrogen bond was present in at least one-third of the structures inspected. Each hydrogen bond was specified by two distance restraints: HN-O distance of 1.7–2.3 Å and N-O distance of 2.4–3.3 Å.
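The two acceptance criteria and the pair of distance restraints per hydrogen bond can be written compactly. The sketch below is illustrative only: the function names, the atom-label format, and the example donor/acceptor pair and structure counts are hypothetical and do not reproduce the actual CYANA input files.

```python
def accept_hydrogen_bond(amide_protected, structures_with_bond, total_structures):
    """Apply both criteria: a protected amide proton in the exchange NOESY and
    the bond present in at least one-third of the inspected structures."""
    return amide_protected and 3 * structures_with_bond >= total_structures


def hydrogen_bond_restraints(donor_residue, acceptor_residue):
    """Return the two distance restraints (atom 1, atom 2, lower, upper; in angstroms)."""
    return [
        (f"{donor_residue}:HN", f"{acceptor_residue}:O", 1.7, 2.3),
        (f"{donor_residue}:N", f"{acceptor_residue}:O", 2.4, 3.3),
    ]


# Hypothetical example: a protected amide whose hydrogen bond appears in
# 20 of 50 inspected conformers.
if accept_hydrogen_bond(True, 20, 50):
    restraints = hydrogen_bond_restraints("Ala30", "Thr27")
    for atom_1, atom_2, lower, upper in restraints:
        print(f"{atom_1} - {atom_2}: {lower}-{upper} A")
```

Since every accepted hydrogen bond contributes two restraints, a list of 20 bonds yields 40 hydrogen-bond restraints.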
The 3JHNHα coupling constants were measured from a 3D-HNHA, and the corresponding backbone dihedral φ-angles were obtained by means of the appropriate Karplus curve (38). The measured 3JHNHα values were not corrected for relaxation effects and were likely to be underestimated by 5–10% (43). Backbone φ-angles were constrained to –60° ± 30° for 3JHNHα < 5.5 Hz, and to –120° ± 60° for 3JHNHα > 7 Hz. From the ratio between intraresidue and sequential HN-Hα NOESY cross-peak intensities and the secondary structure indicated by the conformational shifts of the Hα protons, the ψ-torsion angles were extracted and introduced as constraints. The conformational shifts of the Hα protons were obtained from the tabulated random-coil values (39).
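The mapping from a measured coupling constant to a φ-angle restraint can be sketched as a simple threshold rule. The Karplus function below is included only to show why the thresholds are sensible; its coefficients (6.51, –1.76, 1.60) are a commonly quoted parametrization for 3JHNHα and are an assumption here, not necessarily the exact curve used in this work. Function names are our own.

```python
import math


def karplus_3j_hnha(phi_deg, a=6.51, b=-1.76, c=1.60):
    """3J(HN-Halpha) in Hz from phi, with assumed Karplus coefficients."""
    theta = math.radians(phi_deg - 60.0)
    return a * math.cos(theta) ** 2 + b * math.cos(theta) + c


def phi_restraint(j_hz):
    """Return (lower, upper) bounds on phi in degrees, or None if unrestrained."""
    if j_hz < 5.5:
        return (-90.0, -30.0)    # -60 +/- 30 degrees (helical region)
    if j_hz > 7.0:
        return (-180.0, -60.0)   # -120 +/- 60 degrees (extended region)
    return None                  # intermediate couplings are left unrestrained


for j_value in (4.0, 6.0, 8.5):
    print(j_value, "Hz ->", phi_restraint(j_value))
```

With these coefficients an ideal helical φ of –60° predicts a coupling near 4.1 Hz and an extended φ of –120° predicts about 9.9 Hz, so the 5.5 and 7 Hz cutoffs leave a margin for the 5–10% underestimation mentioned above.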
In the last rounds of the calculations, stereospecific assignments of diastereotopic protons were included. Stereospecific assignments were carried out by using the GLOMSA module of CYANA (42
). The stereospecifically assigned protons were: Asp8 (Hß), Arg10 (H
), Pro13 (H
), Pro16 (Hß), Phe17 (Hß), Arg18 (Hß), Arg29 (Hß), Leu46 (Hß), Val47 (H
), Gln48 (Hß,H
), Pro52 (Hß), Leu61(Hß), Leu67 (Hß, H
), Glu68 (Hß), and Glu69 (Hß).
Structure calculations and analysis
All histidine, arginine, and lysine residues were regarded as positively charged, and glutamate and aspartate side chains as negatively charged. All proline residues that could be assigned (Pro13, Pro16, Pro52, and Pro63; Pro3 and Pro80 could not be assigned) were in a trans conformation, as concluded from the Hα–Hδ NOEs with the corresponding preceding residues in the NOESY experiments. A total of 500 random conformers were annealed in 12,000 steps. The 50 conformers with the lowest target function constituted the final family. Each member of this family was subjected to restrained energy minimization by using the AMBER 8 package (44). Force-field constants of 32 kcal mol–1 Å–2 and 32 kcal mol–1 rad–2 were used for NOE and torsion angle constraints, respectively. Thus, calculations were carried out with the standard force field of the program in vacuo plus the NOE and angle constraint terms.
The quality of the structures was evaluated in terms of deviations from ideal bond lengths and angles, and through Ramachandran plots, by using PROCHECK (45) and PROCHECK-NMR (46). Based on those analyses, the coordinates of the 30 best structures of the AMBER family were chosen. Structures have been represented either with PyMOL (47) or MOLMOL (48).
| RESULTS |
|---|
|
|
|---|
α-helical content of both proteins. In both proteins, there was a clear band assigned unambiguously to α-helix, at 1650 cm–1 (Table 1), which was shifted to smaller wavenumbers in the nonmutated CAC. Further, in both proteins there was a band at 1660 cm–1, which could be assigned to either 3₁₀- or α-helix (both kinds of structure were present in both protein species); and finally, both proteins had a band at 1626 cm–1, whose assignment was ambiguous (51
α-helical structure in both proteins. We can provide, however, an estimation of the overall content of helical structure in both proteins, and conclude that, within the limitations of FTIR and the experimental uncertainty, the helical content in both proteins was identical. This result agrees with that obtained by CD (21
CACW40A is monomeric at NMR concentrations
Some CAC mutants, which, at moderate concentrations, revealed no detectable tendency to dimerize, could show a monomer-dimer equilibrium at the high concentrations used for NMR studies. To determine whether CACW40A was monomeric under NMR conditions, we carried out DOSY experiments in the concentration range from 50 µM to 1 mM.
The translational diffusion coefficient of CACW40A increased linearly as the protein concentration decreased (Fig. 1 B), since at lower protein concentrations, the molecular impairment of the translational diffusion was smaller. The extrapolated D at infinite dilution of the protein (i.e., the y-axis intersection) was 1.20 ± 0.04 x 10–6 cm2 s–1 at 293 K. Equation 2 yielded a hydrodynamic radius for a spherical CACW40A of 14.2 ± 0.3 Å.
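The conversion of the extrapolated diffusion coefficient into a hydrodynamic radius through the Stokes-Einstein relation, D = kT/(6πηR), can be checked numerically. The sketch below assumes a D2O viscosity of 1.253 × 10–3 kg m–1 s–1 at 293 K and converts D from cm² s–1 to SI units before the calculation; the function name is our own.

```python
import math

BOLTZMANN = 1.380649e-23  # J K^-1


def hydrodynamic_radius_angstrom(d_cm2_per_s, temperature_k, viscosity_pa_s):
    """Stokes-Einstein radius of a sphere, R = kT / (6 pi eta D), in angstroms."""
    d_m2_per_s = d_cm2_per_s * 1e-4  # 1 cm^2 = 1e-4 m^2
    radius_m = BOLTZMANN * temperature_k / (
        6.0 * math.pi * viscosity_pa_s * d_m2_per_s)
    return radius_m * 1e10  # 1 m = 1e10 angstroms


radius = hydrodynamic_radius_angstrom(1.20e-6, 293.0, 1.253e-3)
print(f"R = {radius:.1f} A")
```

The result, close to 14.3 Å, agrees within the quoted uncertainty with the radius of 14.2 ± 0.3 Å reported above.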
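The radius expected for an ideal unsolvated sphere can be estimated by considering that the anhydrous molecular volume, $M\bar{V}/N$, equals the volume of a sphere (54):
$$\frac{M\bar{V}}{N} = \frac{4}{3}\pi R^3$$
where M is the molecular mass of the protein, $\bar{V}$ is its partial specific volume, and N is Avogadro's number. The molecular mass of a monomeric CACW40A is 9534.9 Da, and $\bar{V}$ as calculated from amino-acid composition (55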
In addition to the results of the DOSY experiments supporting the monomeric state of CACW40A at millimolar concentrations, there are other independent pieces of evidence supporting the above conclusion. First, previous ultracentrifugation experiments have shown that CACW40A was monomeric up to concentrations of 23 µM, and probably, to much higher concentrations (19); the lowest concentration used in the DOSY experiments was 50 µM (Fig. 1 B). Second, gel filtration chromatography has shown that CACW40A remains monomeric at protein concentrations of hundreds of micromolar (23). Third, NMR relaxation measurements were carried out at two different concentrations (1 mM and 400 µM, which overlap with the concentrations used in the DOSY experiments; see Fig. 1), and they did not show any variation in the rates as the protein concentration was varied (data not shown). Fourth, and finally, DOSY measurements in other proteins have been shown to constitute an unambiguous probe of protein aggregation and oligomerization even at protein concentrations <100 µM (56–62), with results that have also been supported by other biophysical techniques (63,64).
Assignment of CACW40A
In the 2D-HSQC, 63 cross-peaks (out of the expected 80 signals) could be clearly identified. The intensities and linewidths of the cross-peaks in the HSQC varied widely, suggesting that some amino acids were affected by motions on several timescales. For instance, the cross-peaks of residues Ser34 to Ala40, except those of Val37 and Lys38, were not present at any of the explored temperatures. These residues are close to the mutation site, Ala40, and the absence of resonances could be due to line broadening arising from conformational exchange. Absence of signals in HSQC experiments due to exchange processes has also been observed in loops and highly mobile regions (65,66). Resonances of residues at the N-terminus, Ser2 to Ser5, and at the C-terminus, Pro80 to Leu87, could not be unambiguously assigned (Supplementary Material, Table S1). Sequential cross-correlations were established based on Hα-HN (or Hα-Hδ for proline residues), Hß-HN, and/or HN-HN NOEs observed between adjacent residues. For all the residues unambiguously assigned, at least one sequential NOE could be identified.
The intensities of the main chain NOEs, the observed short-, medium-, and long-range NOEs, and the conformational chemical shifts of the Hα resonances indicated that the secondary structure of CACW40A in solution consists of three α-helices (Fig. 2). A total of 24 amide protons showed protection. The location of these protected amide protons is in agreement with the limits of secondary structure determined by NOE data (Fig. 2).
Calculation of the solution structure of CACW40A
A total of 927 NOEs were collected and translated into 779 relevant distance restraints; further, 66 angle restraints and 40 hydrogen-bond restraints (two for each hydrogen bond) were included in the calculations. The 50 structures with the lowest target functions from CYANA had an average target function of 0.88 (with a minimum of 0.39 and a maximum of 1.18), and an RMS deviation of 1.20 ± 0.20 Å for backbone atoms and 1.73 ± 0.22 Å for all atoms in the ordered polypeptide region (Leu7-Gln75). The NMR structure of CACW40A is represented by the best 30 conformers obtained by AMBER (Table 2). The elements of secondary structure comprising the three α-helices are among the best defined regions of the protein (Fig. 3).
α-helix, comprising residues Phe17 to Arg29. Fraying at the N- and C-termini of the helix is observed, as concluded from the hydrogen bonds between Tyr20 and Phe17 (which was not included in the calculations), Ala30 and Thr27, and Gln31 and Leu28.
The first α-helix contains the four aromatic residues of the protein, namely Phe17, Tyr20, Phe24, and Tyr25, which form the main core. The side chain of Val21 is deeply inserted in the core, as shown by NOEs with the side chains of Leu46, Val47, and Met71. Thus, all these NOEs anchor the C-terminal region of the protein to the first helix.
After the first helix a long loop bulges out, whose residues show a large RMS deviation, especially at the N-terminus (Fig. 3 B). Conversely, residues Thr44, Leu45, Leu46, and Val47 form the core of the protein with a low RMS deviation (Fig. 3 B). All these residues belong to the second α-helix in the nonmutated CAC. Thereafter, the second helix (residues Asp53 to Lys59) is highly regular, showing NOEs with Phe17 and with the third α-helix (residues Glu68 to Met71). The following residues, Gly62 to Ala65, adopt a type II ß-turn. The last helix is formed by residues Thr66 to Cys74 and is well defined; in this helix, residues Met70, Met71, and Thr72 are buried. The second and third helices are parallel to each other and perpendicular to the first helix (Fig. 4, B and C).
The analysis of the variances of the 30 conformers with PROCHECK-NMR (46) for the φ-, ψ-, and χ1-angles showed that the polypeptide patches Phe17 to Leu28, Asp53 to Leu58, and Glu68 to Ala73 have the smallest variances, with backbone angles close to those expected for ideal α-helices (φ- and ψ-angles), or with side chains in a well-fixed conformation (for the χ1-angle).
| DISCUSSION |
|---|
|
|
|---|
α-helices connected by loops (Fig. 4). The structural features, and the similarities to and differences from each subunit in dimeric CAC, are discussed next. The one-turn 3₁₀-helix (Leu7-Ile9) of dimeric CAC is also formed in monomeric CACW40A (Fig. 4), as concluded from the FTIR measurements (Table 1) and the observed medium-range NOEs (Fig. 2). The hydrogen-bonding scaffold of the helix was not very stable, as concluded from 1), the hydrogen exchange experiments (Fig. 2); and 2), the fact that some hydrogen bonds are not present in all the structures. Thus, the 3₁₀-helix in CACW40A shows a nativelike conformation with high flexibility.
The first α-helix of dimeric CAC (Phe17 to Arg29) is also present in CACW40A, although the behavior of the four aromatic residues was different from that observed in nonmutated CAC. Whereas Tyr20 and Phe24 showed native contacts (with Leu46 and Asn49; and with Leu45 and Leu46, respectively), Phe17 and Tyr25 adopted nonnative orientations:
Thus, both aromatic residues adopt a nonnative conformation, as shown by the nonnative NOEs with the side chains of Leu46 and Val47 (these residues also have nonnative contacts with Val21). It is tempting to suggest that the changes in the orientation of the aromatic moieties in the first helix are modulated by the proximity of Thr42, Thr44, Leu45, Leu46, and Val47. In dimeric CAC, these latter residues form the last turn of the second helix, where a kink occurs (19,20). Since this helix is not stably formed in CACW40A (see below), the protein must hide most of its hydrophobic residues from the solvent; the solution adopted is to bend the C-cap region of the second helix, making nonnative contacts with the middle and last regions of the first α-helix (Fig. 4 C).
The last two helices are parallel to each other (Fig. 4, A and B), as in dimeric CAC (19,20,67). However, as the region between Leu45 and Asn49 is bent, the last two helices are arranged perpendicularly to the orientation adopted by the same helices in nonmutated dimeric CAC (Fig. 4 B), and are, thus, in a nonnative conformation.
We also inspected whether the highly conserved region in CA from retroviruses, known as the MHR, adopts a nativelike conformation in CACW40A. In dimeric CAC, the conserved MHR residues (Asp8 to Leu28) form a compact strand-turn-helix motif, stabilized by an extensive hydrogen-bonding network (19,20,67). This network involves the side chains of Arg10, Gln11, Glu15, Arg23, and Asn51. We could not detect any of these hydrogen bonds in the calculated structures. However, two pieces of evidence suggest that the MHR adopts a nativelike conformation in CACW40A. Firstly, the structure and orientation of the first α-helix are nativelike (Fig. 4 A and Fig. 5 A). And secondly, the H
protons of Arg10 and Arg23 did show different chemical shifts, suggesting a fixed conformation for both side chains (Supplementary Material, Table S1).
φ- and ψ-angles to those observed for the residue in nonmutated CAC. The region from Met41 to Asn49, which makes contact in the nonmutated CAC with the MHR region of the other monomer (19
The structure of CACW40A and its implications for understanding folding and association of CA of HIV-1
The folding and association of nonmutated CAC (either by chemical- or heat-denaturation) involves a monomeric intermediate (21–23). The spectroscopic evidence from CD and the very similar values for the thermodynamic parameters in chemical denaturation experiments (free energy and m-values) for the monomeric intermediate of nonmutated CAC and the monomeric CACW40A mutant (21) indicate that both monomeric forms are very similar. This monomeric intermediate does not expose large hydrophobic patches to the solvent, as concluded from the lack of 8-anilino-1-naphthalene-sulfonate binding at room temperature (21,22). However, during the thermal unfolding of nonmutated CAC, the monomeric species binds 8-anilino-1-naphthalene-sulfonate (22). With the structure of CACW40A in hand, we suggest that upon heating, the loop region, which makes nonnative contacts, separates from the first α-helix, exposing the hydrophobic residues.
The NMR structure of CACW40A is highly consistent with spectroscopic observations of this mutant carried out by CD and FTIR. However, there is an apparent contradiction. In the structure of the nonmutated CAC, helix 2 (the dimerization helix) is fully formed (19,20), but it is not in the monomeric CACW40A (Fig. 4). If helix 2 were totally unstructured in the monomeric form, one could, in principle, expect a significant decrease in helicity relative to the dimer, as determined by CD and FTIR; this decrease is not observed either in the monomeric intermediate of CAC or in the CACW40A mutant ((21) and Table 1). It could be argued that 1), in CACW40A the putative decrease in absolute ellipticity at 222 nm might be counterbalanced by a similar increase due to the different environment adopted by the aromatic residues (see above), which also absorb at 222 nm (68,69); and 2), that the mathematical deconvolution of the FTIR spectrum might be unreliable. We cannot rule out errors in the deconvolution procedure of FTIR, but the method is robust enough, as has been shown in the determination of the secondary structure of model proteins (49–51). Likewise, we cannot rule out a compensation in the ellipticities upon aromatic rearrangements, although we believe that an exact matching is highly improbable. We favor a simpler explanation based on the reported fact that highly flexible proteins can yield almost nativelike spectra by CD (as well as by other probes), if the interconversion between the different conformations is fast enough (70). We have shown that the segment formed by residues Ser34 to Glu36 and residues Asn39 to Ala40 could not be assigned. The reason behind the absence of assignments is the lack of observable resonances, not the inability to assign the observed ones. The only reasonable explanation for the missing peaks is that those residues are involved in an exchange process that broadens the signals beyond detection (65,66). Such an exchange process could be due to exchange of amide protons with the solvent, or to interconversion between different conformers (conformational exchange). The NMR experiments described in this work were acquired with gradients, which minimize the effects of amide proton exchange with solvent; further, control NOESY experiments were acquired at 283 K, to reduce solvent exchange, and still no resonances could be observed for residues in that region (data not shown). Thus, we propose that this region of monomeric CAC samples a nativelike conformation (as shown by CD and FTIR) and one or more partially folded states. These equilibria should be intermediate on the NMR timescale, but fast enough in the FTIR and CD time regimes.
Unfolding of helical elements has been observed in other helical proteins when a cofactor has been removed (66), or when a single residue has been mutated (71,72). Interestingly, a region that is helical in wild-type apocytochrome b562 (residues 1–20) is unstructured in a single point mutant (71). This mutant also shows nonnative contacts that involve the aromatic moieties of the core, similar to what is observed in CACW40A. The structure of the apocytochrome b562 mutant has been proposed to be similar to that of an unfolding intermediate detected by kinetic methods. However, there is an important difference between the partially unfolded state of CAC and that of apocytochrome b562. In the apocytochrome b562 mutant, the rest of the protein scaffold remains nativelike, while in CACW40A, the last two helices are rotated when compared to nonmutated CAC. To the best of our knowledge, this is the first example where such major rearrangements have been described as a consequence of a single mutation. It could be thought that this rotation is an artifact of the structure calculation. However, several lines of evidence argue against this possibility. First, a large number of nonnative NOEs were observed between the first α-helix and the patch formed by residues Thr42 to Leu46. Second, residues Leu45 and Leu46 were unambiguously assigned and their methyl protons are the most upfield-shifted protons of the spectra, showing easily distinguishable contacts with other regions of the protein (Supplementary Material, Table S1). Finally, the rotation of the last two helices is observed during molecular dynamics simulations (using the standard AMBER force field) of the in silico constructed CACW40A (L. A. Alcaraz, unpublished results). Taken together, these results support the view that nonnative hydrophobic interactions and nonnative structural rearrangements occur in partially unfolded states and, probably, in folding intermediates.
The structural differences observed for the monomeric form of CAC, as revealed by the NMR results with mutant CACW40A, are entirely consistent with the previous thermodynamic data obtained for the monomeric and dimeric forms of CAC and other CAC mutants (21–23). Specifically, the observations that α-helix 2 is not stably formed and that the packing of some residues is altered are consistent with the reduced stability of both CACW40A and the monomeric intermediate, relative to each subunit in the fully formed dimeric CAC (21). Finally, the structure is also consistent with our predictions that the monomeric species of CAC: 1), lacks some of the native tertiary interactions seen in the dimer; and 2), has a conformation that differs from that of the subunits in the dimer (21,22).
Flexibility in CAC and possible biological implications
Our study of the solution structure of CACW40A indicates that CAC, while preserving most of the native scaffold, is structurally highly flexible. In this regard, our results resemble those obtained with a bovine pancreatic trypsin inhibitor mutant whose structure is nativelike, but whose dynamics are highly altered relative to the nonmutated protein (72). However, one question can be raised: is the flexibility observed in CACW40A a consequence of the mutation (as it is in bovine pancreatic trypsin inhibitor), or is it intrinsic to the CAC domain and only enhanced by the mutation, which yields a monomeric species? We address this question by comparing the structural changes detected in CAC in three previous studies and in the present one.
In the first study it was proposed, based on the NMR structure of a structurally homologous CAC protein, that the MHR of CAC forms dimeric swapped domains (73) during assembly of the immature particle, within the context of the intact Gag protein. The authors propose that the MHR of CAC in one of the monomers moves out and disrupts the intramonomer contacts between the 3₁₀-helix and the hydrophobic residues in the first α-helix. Concomitantly, α-helix 2, which forms most of the dimerization region of CAC, is pushed toward the 3₁₀-helix of the other monomer. Thus, the scaffold remains mainly nativelike, but nonnative interactions involve the other monomer and the second (dimerization) helix of CAC. Furthermore, an aromatic residue, homologous to Phe17 in CACW40A, shows contacts with other regions of the domain. We have shown here that in CACW40A, the 3₁₀-helix is flexible and that Phe17 adopts nonnative contacts, indicating plasticity of the N-terminal region. While this article was being reviewed, the x-ray structure of a dimeric domain-swapped variant of CAC was reported by the same authors (74). This domain-swapped structure is obtained with a single-residue deletion mutant of CAC (lacking Ala33), in which the swapping zones are the N-terminal region and the first α-helix; on the other hand, the MHR-containing region is involved in the dimerization interface (74); furthermore, the α-helix 2 region appeared slightly shorter than in the dimeric protein.
In a second study, the four-helix bundle scaffold of each subunit in dimeric CAC is altered by the binding of a heterologous peptide to form a five-helix bundle, as shown by x-ray crystallography (67). Most of the structure of the protein remains unaltered except for the second helix (the dimerization helix), which is displaced 6 Å toward the first helix. Moreover, the majority of the new contacts with the bound peptide occur through the patch of hydrophobic residues Val21, Phe24, Tyr25, Leu28, Arg29, Glu43, Leu46, Leu68, and Met71. The shift changes the dimerization interface of CAC, although the protein is still a dimer. Thus, this study shows that the middle and the C-cap of the first helix, and the C-cap of the second helix, have a tendency to adopt nonnative contacts. Interestingly, the residues involved are nearly the same as those observed to adopt nonnative contacts in CACW40A. Hence, preventing dimerization through the tryptophan-to-alanine replacement leads to a tertiary structure of CAC that looks similar in some regions (but not in others) to that observed in the peptide-bound dimer (67), although the significance of this observation remains to be established.
In a third study it was proposed, based on FTIR, fluorescence, and theoretical analyses, that upon binding to lipid membranes, CAC remains dimeric, but the region around Trp40, and thus the dimerization helix, shows large structural changes (75). Furthermore, the third and fourth helices also seem conformationally altered, while the rest of the structure is nativelike.
In brief, those three studies reveal that, upon binding to other molecules (lipids, peptides, or other regions of the Gag protein), the molecular environment around the dimerization helix might be substantially altered. In the monomeric CACW40A mutant, the absence of intermonomer (quaternary) interactions favors the unfolding of the second helix and changes the orientation of the last two helices. In fact, different biophysical techniques (Table 1 and (21,23)) and molecular dynamics simulations (L. A. Alcaraz, unpublished results) suggest that this region fluctuates between the nativelike conformation and other, more disordered states. It seems that the plasticity shown by the CAC domain (when faced with different molecular environments) relies on the second helical region, and that the changes in this region trigger the changes in the aromatic residues, in the 3₁₀-helix, in the second helix, or in the second, third, and fourth helices (67,73–75). Further, in CACW40A, the orientation of the last two helices is also altered to shield the hydrophobic residues of the second helix from the solvent. Taken together, these findings suggest that the multiple roles of CA during assembly of HIV may be mediated by its rather unique ability to adopt different conformations, while preserving a similar scaffold, in response to different molecular environments.
The fact that the dimerization helix of CAC is not conformationally stable without the intermonomer (quaternary) contacts may also have important implications for the design of drugs able to inhibit the assembly of the HIV-1 capsid by disrupting the CA dimerization interface. We have previously shown that a short peptide comprising the second helical region of CAC (residues Ser34 to Val57) is unfolded in solution, but is able to bind CAC with an affinity that approaches that of the complete CAC domain (76). Similar studies with other peptides have shown that the peptide acquires a helical conformation upon binding, but the exact structure of the isolated peptide in solution is unknown (67,77). The entropic cost of binding an unstructured peptide that mimics the CAC dimerization interface need not detract from its inhibitory capacity, as a similar entropic cost would have to be paid during the formation of helix 2 on dimerization of CA during assembly of the HIV-1 capsid.
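The thermodynamic reasoning in the preceding paragraph can be made concrete with a simple linkage sketch. It assumes, hypothetically, a two-state helix-coil pre-equilibrium in which only the helical state binds, and it lumps the conformational (largely entropic) cost of ordering the helix into a single folding equilibrium constant. The function names and all numerical values are illustrative assumptions, not measurements from this work.

```python
import math

R_GAS = 8.314  # gas constant, J mol^-1 K^-1


def folded_fraction(k_fold):
    """Fraction of molecules in the helical, binding-competent state.

    k_fold is the (hypothetical) two-state equilibrium constant
    [helical] / [unfolded] of the isolated segment.
    """
    return k_fold / (1.0 + k_fold)


def folding_penalty_kj(k_fold, temperature_k=298.0):
    """Free energy penalty (kJ/mol) paid when only the helical state binds."""
    return -R_GAS * temperature_k * math.log(folded_fraction(k_fold)) / 1000.0


def observed_dissociation_constant(kd_folded, k_fold):
    """Apparent Kd when binding is coupled to folding of the helix."""
    return kd_folded / folded_fraction(k_fold)


if __name__ == "__main__":
    # Assumed value: the helix is only ~1% populated, both in the free
    # peptide and in helix 2 of the monomeric domain.
    k_fold = 0.01
    print(f"penalty, free peptide:     {folding_penalty_kj(k_fold):.2f} kJ/mol")
    print(f"penalty, monomeric domain: {folding_penalty_kj(k_fold):.2f} kJ/mol")
```

In this simplified picture, if the helix is similarly unstable in the isolated peptide and in the monomeric domain, both pay the same folding penalty, so the peptide is not disadvantaged relative to the protein it competes with.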
Coordinate and data deposition
The 1H and 15N assignments have been deposited in the BioMagResBank (BMRB accession No. 15137). The atomic coordinates for an ensemble of 30 structures that represent the solution structure of CACW40A have been deposited in the Protein Data Bank, together with the list of restraints used for the structure calculation (PDB code 2jo0).
| SUPPLEMENTARY MATERIAL |
|---|
|
|
|---|
| ACKNOWLEDGEMENTS |
|---|
|
|
|---|
This work was supported by grants from Ministerio de Sanidad y Consumo (MSC) (grant No. FIS 01/0004-02) and Ministerio de Educación y Ciencia (MEC) (grant No. CTQ2005-00360/BQU) to J.L.N., by grants from MSC (No. FIS 01/0004-01) and MEC (No. BIO2003-04445) to M.G.M., and by institutional grants from Grupo Urbasa to the Instituto de Biología Molecular y Celular and from Fundación Ramón Areces to the Centro de Biología Molecular "Severo Ochoa". F.N.B. was the recipient of a predoctoral fellowship from MEC. M.A. was the recipient of a research fellowship from Comunidad Autónoma de Madrid.
| FOOTNOTES |
|---|
Abbreviations used: 1D, one-dimensional; 2D, two-dimensional; 3D, three-dimensional; CA, capsid protein of HIV-1 (p24); CAC, C-terminal domain of CA, comprising residues 146–231 of the intact protein; CACW40A, mutant of CAC with Ala instead of Trp at position 184; CD, circular dichroism; DOSY, diffusion-ordered spectroscopy; FTIR, Fourier transform infrared spectroscopy; Gag, the structural polyprotein of retroviruses; HSQC, heteronuclear single quantum coherence; MHR, major homology region; NMR, nuclear magnetic resonance spectroscopy; NOE, nuclear Overhauser effect; RMS, root mean square.
Submitted on November 14, 2006; accepted for publication April 18, 2007.
| REFERENCES |
|---|
|
|
|---|
2. Freed, E. O., and M. Martin. 2001. HIVs and their replication. In Fields Virology. D. M. Knipe and P. M. Howley, editors. Lippincott, Philadelphia, PA.
3. Vogt, V. M. 1996. Proteolytic processing and particle maturation. Curr. Top. Microbiol. Immunol. 214:95–132.
4. Fuller, T. W., T. Wilk, B. E. Gowen, H.-G. Kräusslich, and V. M. Vogt. 1997. Cryo-electron microscopy reveals ordered domains in the immature HIV-1 particle. Curr. Biol. 7:729–738.
5. Ehrlich, L. S., B. E. Agresta, and C. A. Carter. 1992. Assembly of recombinant human immunodeficiency virus type 1 capsid protein in vitro. J. Virol. 66:4874–4883.
6. Gross, I., H. Hohenberg, and H.-G. Kräusslich. 1997. In vitro assembly properties of purified bacterially expressed capsid proteins of human immunodeficiency virus. Eur. J. Biochem. 249:592–600.
7. Gross, I., H. Hohenberg, C. Huckangel, and H.-G. Kräusslich. 1998. N-terminal extension of human immunodeficiency virus capsid protein converts the in vitro assembly phenotype from tubular to spherical particles. J. Virol. 72:4798–4810.
8. Ganser, B. K., S. Li, V. Y. Kliskho, J. T. Finch, and W. I. Sundquist. 1999. Assembly and analysis of conical models for t