Biophys J, December 2002, p. 3012-3031, Vol. 83, No. 6

Structure Modeling of the Chemokine Receptor CCR5: Implications for Ligand Binding and Selectivity

M. Germana Paterlini

Department of Medicinal Chemistry and Supercomputer Institute, University of Minnesota, Minneapolis, Minnesota 55455 USA


    ABSTRACT

The G-protein coupled receptor CCR5 is the main co-receptor for macrophage-tropic HIV-1 strains. I have built a structural model of the chemokine receptor CCR5 and used it to explain the binding and selectivity of the antagonist TAK779. Models of the extracellular (EC) domains of CCR5 have been constructed and used to rationalize current biological data on the binding of HIV-1 and chemokines. Residues spanning the transmembrane region of CCR5 have been modeled after rhodopsin, and their functional significance examined using the evolutionary trace method. The receptor cavity shares six residues with CC-chemokine receptors CCR1 through CCR4, while seven residues are unique to CCR5. The contribution of these residues to ligand binding and selectivity is tested by molecular docking simulations of TAK779 to CCR1, CCR2, and CCR5. TAK779 binds to CCR5 in the cavity formed by helices 1, 2, 3, and 7 with additional interactions with helices 5 and 6. TAK779 did not dock to either CCR1 or CCR2. The results are consistent with current site-directed mutagenesis data and with the observed selectivity of TAK779 for CCR5 over CCR1 and CCR2. The specific residues responsible for the observed selectivity are identified. The four EC regions of CCR5 have been modeled using constrained simulated annealing simulations. Applied dihedral angle constraints are representative of the secondary structure propensities of these regions. Tertiary interactions, in the form of distance constraints, are generated from available epitope mapping data. Analysis of the 250 simulated structures provides new insights into the design of experiments aimed at determining residue-residue contacts across the EC domains and at mapping CC-chemokines on the surface of the EC domains.


    INTRODUCTION

A critical step in cellular entry by HIV-1 is the binding of the viral envelope protein gp120 to chemokine receptors on the surface of immune cells (Hoffman et al., 1999; Kolchinsky et al., 1999; Rizzuto et al., 1998; Wu et al., 1996). The chemokine receptors CCR5 and CXCR4 are the major co-receptors for R5 and X4 virus strains, respectively (Berger et al., 1998; Robertson et al., 2000). CCR5 has been the target of anti-HIV-1 strategies for disrupting the interaction with the HIV-1 envelope protein gp120. Studies suggest that gp120 binds preferentially to the N-terminus of CCR5 (Dragic et al., 1998; Farzan et al., 1998), and that the second extracellular region is mostly responsible for the binding of the endogenous chemokine peptides (Samson et al., 1997). The two domains are not clearly separated, because some single point mutations in the N-terminus can also abolish the binding of chemokines (Blanpain et al., 1999a). HIV-1 and chemokines mainly interact with the extracellular (EC) regions of CCR5, while small synthetic ligands mostly bind to residues of the transmembrane (TM) region (Dragic et al., 2000).

Chemokine receptors belong to the rhodopsin family of G-protein coupled receptors (GPCR) characterized by a heptahelical fold (7-TM) spanning the plasma membrane (Fig. 1). Modeling of GPCR can draw on the 2.8 Å resolution structure of bovine rhodopsin (Palczewski et al., 2000), whose 7-TM motif has proven valid for other receptors in its family as well (Elling et al., 1997; Mizobe et al., 1996). However, local structural motifs, such as those seen in helices of rhodopsin that contain Gly and Pro (Palczewski et al., 2000), need to be reexamined in the context of CCR5. Given the relative simplicity of the helical fold and ~30% sequence identity with rhodopsin in the transmembrane core region, it should be possible to generate models where ~80% of the Cα atoms are within 3.5 Å of their correct positions (Sanchez and Sali, 1999). Protein models obtained at such resolution have been able to correctly predict the location of binding sites and the size of the ligands (Sanchez and Sali, 1999). Indeed, theoretical GPCR models based on a low-resolution template (Unger et al., 1997; Baldwin et al., 1997) that predates the x-ray structure have been successfully used in designing experiments for structure validation (Lu and Hulme, 2000; Mizobe et al., 1996; Zhou et al., 1994), determining ligand binding sites (Metzger et al., 1996; Simpson et al., 1999; Strader et al., 1994), and formulating hypotheses on mechanisms of receptor activation (Ballesteros et al., 1998; Cotecchia et al., 2000; Flanagan et al., 1999; Javitch et al., 1997, 1998).



FIGURE 1   Schematic representation of the CCR5 sequence. Gray rectangles outline residues in the 7-TM region, TM1 through TM7, and helix 8. EL and IL denote extracellular and intracellular loop regions, respectively. Disulfide bridges between C20 and C269 and C101 and C178 are shown as lines connecting these cysteines. Gray circles denote conservation of strong groups in CCR1 through CCR5. Gray circles with heavy outlines denote identical residues in CCR1 through CCR5. Black circles denote highly conserved residues in the rhodopsin family of GPCR. For ease of comparison with other GPCR, residues are numbered using the highly conserved residues as reference (Ballesteros and Weinstein, 1995). N: 1.50 (TM1); D: 2.50 (TM2); R: 3.50 (TM3); W: 4.50 (TM4); P: 5.50 (TM5); P: 6.50 (TM6); P: 7.50 (TM7).
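The numbering convention in the caption can be illustrated with a short sketch. It assumes that the index is simply 50 plus the sequence offset from the reference residue of the same helix. Only N48 (1.50) and D76 (2.50) are stated explicitly in this article; the reference positions for TM3, TM5, TM6, and TM7 below are back-calculated from labels such as Y108 (3.32), I198 (5.42), Y251 (6.51), and E283 (7.39) and should be read as assumptions. TM4 is omitted because no TM4 residue number is quoted. The function name is hypothetical.

```python
# Reference (X.50) residue numbers for CCR5.
# TM1 and TM2 are quoted in the article; the others are inferred from
# residue labels in the article and are assumptions of this sketch.
REFERENCE_RESIDUE = {1: 48, 2: 76, 3: 126, 5: 206, 6: 250, 7: 294}


def bw_number(helix: int, residue_number: int) -> str:
    """Return the helix.index label for a residue in a given TM helix."""
    offset = residue_number - REFERENCE_RESIDUE[helix]
    return f"{helix}.{50 + offset}"


if __name__ == "__main__":
    for helix, residue in [(1, 37), (2, 86), (3, 108), (6, 251), (7, 283)]:
        print(residue, bw_number(helix, residue))
```

Passing the helix explicitly avoids having to encode helix boundaries, which depend on the model.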

I have used the x-ray structure of rhodopsin, in combination with homology techniques, to derive a structural model of the transmembrane region (TM) of CCR5, and then applied the evolutionary trace method (Lichtarge et al., 1996) to compare the residues of the receptor cavity with those of either distantly or closely related homologs. Evolutionary tracing has been successful in identifying functionally important residues across different protein families (Johnson and Church, 2000; Sowa et al., 2000). When applied to CCR5, the tracing readily isolates residues of the receptor cavity that can either bind small synthetic ligands or function as selectivity filters among closely related CC-chemokine receptors. To test this hypothesis, I have docked the antagonist TAK779 (Baba et al., 1999) to CCR5 and compared its binding mode with available experimental data. The selectivity of TAK779 for CCR5 has been previously measured against closely related receptors and the binding site has been probed using site-directed mutagenesis (Baba et al., 1999; Dragic et al., 2000). Simulations have been carried out using an automated docking protocol (Meng et al., 1992) used in docking simulations of opioid ligands (Subramanian et al., 1998, 2000). In that work, the docking procedure, in conjunction with mutagenesis data, provided binding modes in agreement with structure-activity relationship (SAR) of opioid ligands and explained, in part, the observed selectivity. In this work I show that the docking simulations not only are consistent with the mutagenesis data, but also suggest a rationale for the selectivity of TAK779 for CCR5, as compared with the chemokine receptors CCR1 and CCR2. The results show a conserved region of the binding pocket that may function as a binding locus for chemokine antagonists that share a common ammonium group. Nonconserved residues in helices TM3, TM5, and TM6 would impart receptor selectivity. The docking results outline specific residues that can be targeted by site-directed mutagenesis studies for validation of the proposed binding mode.

The importance of the extracellular (EC) regions in binding and recognition of chemokines and the HIV-1 envelope glycoprotein gp120 makes it paramount to include such regions in the modeling effort. Close proximity among the EC domains is likely, due to the presence of two disulfide bonds, one linking the N-terminus to the third extracellular loop (EL), EL3, and a second between EL1 and EL2 (Blanpain et al., 1999b) (Fig. 1). Indirect information on the 3-D arrangement of the extracellular regions can be inferred from the mapping of epitope residues recognized by monoclonal antibodies (mAbs). The antibody mAb-2D7 recognizes residues in both EL1 and EL2 (Lee et al., 1999), while mAbs PA9 and PA14 bind to amino acids in both the N-terminus and EL2 (Olson et al., 1999).

Modeling of GPCR has been mostly applied to residues in the transmembrane region, and considerably less work has been done on the N-terminus and loops connecting the TM helices (Luo et al., 1997; Paterlini et al., 1997; Moro et al., 1999; Odile-Colson et al., 1998; Pogozheva et al., 1998). Modeling of the extracellular regions of GPCR is challenging because of limited homology with known structures and because of limitations of current loop modeling techniques. The combination of low sequence identity with rhodopsin and short loop length makes it difficult to match the EC regions with homologous segments of known structure using database search programs (Altschul et al., 1997). The three extracellular loops are generally 10-30 residues in length, and thus beyond the applicability of current loop modeling tools (Van Vlijmen and Karplus, 1997; Li et al., 1999; Fiser et al., 2000). Loop modeling is typically applied to isolated loops of globular proteins, while loops of GPCR most likely contain numerous tertiary interactions, as demonstrated by rhodopsin. The tertiary fold of a protein is difficult to predict computationally without the aid of geometrical constraints (Zhang et al., 2002a). In the case of membrane proteins, distance constraints are usually obtained using EPR and fluorescence spectroscopies, or by disulfide bond engineering. For example, EPR spectroscopy has been extensively used to map interactions among the three intracellular loops of rhodopsin (Hubbell et al., 2000). Spectroscopy-derived distances are not available for CCR5. However, indirect distance information can be inferred from epitope mapping results. A previous study on the sensitivity of mAbs to the tertiary structure of their targets has shown that antibodies recognize a region within a 15 Å radius of the epitope residues (Burritt et al., 1998). I have, therefore, transformed the epitope mapping data of CCR5 into distance constraints between the different EC regions to create tertiary contacts among the EC domains.

I have combined homology information, tertiary constraints derived from available experimental data, and ab initio computational tools to model the 95 residues of the extracellular regions of CCR5, which vary in length from a 32-residue-long N-terminus to the 13 residues of EL1. Secondary structure prediction methods were first used to obtain secondary structure propensities of the individual extracellular regions. Additional searches for homologous sequences in the Protein Data Bank (PDB) (Berman et al., 2000) provided validation of the prediction results and initial structural templates for each of the four domains. Conformations of the extracellular regions generated in this fashion were subjected to constrained simulated annealing simulations and analyzed based on conformational energy, structural variability, and average physical characteristics.

The addition of tertiary constraints results in clusters of critical aromatic and acidic residues previously implicated in the binding of chemokines and gp120 (Blanpain et al., 1999a; Howard et al., 1999; Rabut et al., 1998; Zhou et al., 2000). The EC models are used to explain the sensitivity of these residues to mutation and suggest specific residue-residue interactions that could be validated by site-directed mutagenesis and sites for mapping the interaction of the chemokines with CCR5.


    MATERIALS AND METHODS

CCR5 receptor model

The CCR5 sequence (Accession number P51681) was scored against the SWISSPROT database (Bairoch and Apweiler, 2000) using BLAST2.0 (Altschul et al., 1997) with the BLOSUM62 matrix and standard parameters in the Biology Workbench (http://workbench.sdsc.edu) (Subramaniam, 1998). Sequences were aligned using CLUSTALW (Thompson et al., 1994). The suite of programs PERSCAN (Donnelly et al., 1994) was used to identify residues located in the transmembrane region from a set of 50 nonredundant sequences with the highest homology to CCR5 (see Table 1). Sequence identity of the 50 sequences with CCR5 was between 28% and 99% for residues in the transmembrane region.
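As an illustration of how a percent identity such as those quoted above can be computed from an alignment, the sketch below counts matching residues over aligned columns. Skipping columns that contain a gap in either sequence is an assumption; other conventions normalize by the shorter sequence or by the full alignment length, and the convention used by the programs cited here is not stated. The function name is hypothetical. The example uses the two TM5 fragments quoted in the Results (rhodopsin F212-V218 and CCR5 L203-V209).

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent of identical residues over gap-free aligned columns."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # assumption: gapped columns are not counted
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


if __name__ == "__main__":
    rhodopsin_tm5 = "FIIPLIV"  # F212-V218, as quoted in the Results
    ccr5_tm5 = "LVLPLLV"       # L203-V209, as quoted in the Results
    print(round(percent_identity(rhodopsin_tm5, ccr5_tm5), 1))
```

Three of the seven positions match (P, L, and V), so the two fragments are about 43% identical under this convention.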

The length of the CCR5 helices was determined by first considering a minimum length of 18 residues in the core of the lipid bilayer. Up to seven additional residues at each end were then added to allow extension of the helices into the lipid polar head regions (White, 1994). Criteria for finding the end of the helices were based either on the α-helical index (AP index) (Donnelly et al., 1994), calculated using the 50 sequences in Table 1, or on sequence analysis of these regions.

Criteria from sequence analysis, homology to rhodopsin, and the presence of either Gly or Pro near the ends were used to determine the length of the TM helices. Glycine and proline residues are indicators of helix termination or initiation in globular helices (Prieto and Serrano, 1997; Viguera and Serrano, 1999) and have a similar function in rhodopsin as well (Palczewski et al., 2000). Residues modeled in the α-helical conformation are shown in Fig. 1.

The set of 50 unique sequences with the highest sequence homology to CCR5 (Table 1) was also used to compare the orientation of the 7-TM bundle of CCR5 with that of the average template (Baldwin et al., 1997) and of rhodopsin. Calculations of the helix orientations using criteria based on the periodicity of conserved residues (Donnelly et al., 1994) were within 20° of those obtained using periodicity of substitutions (Overington et al., 1992; Donnelly et al., 1994).
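The idea behind a periodicity-based orientation can be sketched as a Fourier-type moment of a per-residue profile evaluated at the α-helical repeat of 100° per residue. This is a generic illustration, not the PERSCAN implementation; the function names, the use of a mean-subtracted profile, and the synthetic input are assumptions.

```python
import math


def helical_moment_angle(profile, degrees_per_residue=100.0):
    """Direction (degrees, 0-360) of the moment of a per-residue profile.

    profile[k] is any per-residue score (e.g. variability or conservation)
    for residue k of the helix; residue 0 defines the 0 degree direction.
    """
    mean = sum(profile) / len(profile)
    x = y = 0.0
    for k, value in enumerate(profile):
        theta = math.radians(k * degrees_per_residue)
        x += (value - mean) * math.cos(theta)
        y += (value - mean) * math.sin(theta)
    return math.degrees(math.atan2(y, x)) % 360.0


def angular_difference(a, b):
    """Smallest absolute difference between two angles in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


if __name__ == "__main__":
    # Synthetic 18-residue profile whose maximum faces 140 degrees.
    profile = [1.0 + math.cos(math.radians(k * 100.0 - 140.0)) for k in range(18)]
    angle = helical_moment_angle(profile)
    print(round(angle), round(angular_difference(angle, 140.0), 1))
```

Two profiles (for example conservation and substitution patterns) can then be compared with `angular_difference`, in the spirit of the 20° agreement quoted above.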

Structural homologs of sequences in the EC regions were obtained by searching the PDB database (Berman et al., 2000) using BLASTP2.0 or PSI-BLAST with either PAM30 or PAM70 matrices. Expectation values of the hits varied from E = 0.0002 to E = 107. The searches produced from 3 to 12 structures for each sequence searched. Sequences found in this fashion were then used in MODELLER4 (Sali and Blundell, 1993) using the "model" routine to generate initial loop structures. Structures 1IRK (residues 1150-1163), 1PFC (residues 390-401), 1UWB (residues 274:B-283:B), and 2BPA (residues 120:2-131:2) were used as templates for the N-terminus. Initial conformations of EL1 made use of structures 1TIA (residues 12-19), 1BGY (residues 31:O-50:O), and 1TMF (residues 206-212 of chain 1). Structures used for EL2 were 1AVQ (residues 178-186 of chain A), 1HRA (residues 32-34), and 2TS1 (residues 248-257). Structures used for EL3 were 1ARP (residues 112-128), 17AJ (residues 266-273), and 2MEV (residues 140-146). Intracellular loop regions were built using the Search Loop module of the InsightII program (InsightII, 2002. Accelrys Inc., San Diego, CA).

Prediction of the secondary structure of the loop residues was obtained using the protein prediction tool PELE, as provided by the Biology Workbench (Subramaniam, 1998). The method compares predictions from seven major algorithms and determines secondary structure based on the minimum consensus for each amino acid.

Side chain rotamers were generated using a backbone-dependent rotamer procedure (Dunbrack and Karplus, 1993). The AMBER5.0 (Case et al., 1995) suite of programs was used for energy minimization and simulated annealing simulations. The model of the transmembrane region was subjected to energy minimization followed by 500 ps of molecular dynamics (MD) simulation at 300 K, during which positional restraints applied to backbone atoms were gradually lowered from 2 to 0.1 kcal/mol·Å². The structural quality of the model was checked using PROCHECK (Laskowski et al., 1993), which placed 94% of the residues in the most favored helical region and the remaining 6% in the additionally allowed helical region. Side chain χ1 and χ2 angles were found in their most favorable regions without stereochemical conflicts.

Structural models of the TM regions of receptors CCR2 and CCR1 were constructed from the CCR5 template by replacing the nonconserved side chains using SCWRL (Dunbrack and Karplus, 1993), followed by energy minimization.

Side chain accessibility areas of the amino acids were calculated with the program ACCESS as a percentage compared with that measured for that residue in an extended Ala-Xxx-Ala tripeptide (Hubbard and Thornton, 1993).

Structures of the EC regions were generated using a simulated annealing procedure consisting of fast heating to 1200 K, cooling to 300 K up to 30 ps, and constant temperature for an additional 70 ps, followed by energy minimization. The backbone dihedral angle φ (C-N-Cα-C) of all residues except Gly was restricted to negative values during annealing, and additional constraints were imposed to maintain trans-ω bonds for non-Pro residues and the correct chirality of the Cα carbons. Constraints made use of a flat well potential function with a force constant of 32 kcal/mol/deg² for angle constraints and 10 kcal/mol·Å² for distance constraints. A similar protocol was used to generate a structure of the intracellular loops. Residues in the transmembrane region were kept in their initial conformation using the belly option of SANDER during the annealing procedure. A set of 250 structures of the EC regions was obtained in this fashion. Clustering of structures based on RMSD was obtained using PADRE (Stahl, M. T., and W. P. Walters. 1995. PADRE. Population analysis and duplicate removal; available by ftp from ccl.osc.edu).
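A flat well restraint of the kind mentioned above can be sketched as a penalty that is zero inside an allowed interval and grows quadratically outside it. This is a simplified form: the AMBER restraint function also switches to a linear regime far from the well, which is omitted here. The function name and the bounds used in the example are assumptions; in particular, treating 15 Å as an upper distance bound is only an illustration of how an epitope-derived constraint might be expressed.

```python
def flat_well_energy(value, lower, upper, force_constant):
    """Zero inside [lower, upper]; harmonic penalty outside the interval."""
    if value < lower:
        return force_constant * (value - lower) ** 2
    if value > upper:
        return force_constant * (value - upper) ** 2
    return 0.0


if __name__ == "__main__":
    # Dihedral restricted to negative values: allowed interval [-180, 0].
    print(flat_well_energy(-65.0, -180.0, 0.0, 32.0))  # inside the well
    # Hypothetical distance restraint with a 15 A upper bound and the
    # 10 kcal/mol/A^2 force constant quoted in the text.
    print(flat_well_energy(17.0, 0.0, 15.0, 10.0))     # 2 A beyond the bound
```

Because the penalty vanishes anywhere inside the interval, the restraint limits the conformational search without favoring one particular value.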

Modeling of TAK779

The geometry of TAK779 was optimized using molecular orbital ab initio methods at the HF/6-31G* level using the Gaussian98 program package (Frisch et al., 2001). Partial atomic charges were obtained using the restrained electrostatic potential (RESP) charge fitting formalism (Bayly et al., 1993). Force field parameters were adapted from similar chemical groups found in the Cornell et al. (1995) force field (parm96.dat of AMBER5.0).

Ligand docking

Docking of the minimum energy structures of TAK779 to the model-built TM domain of CCR5, CCR1, and CCR2 was performed using the automated docking procedure DOCK3.5 (Meng et al., 1992) by following a previously described protocol (Subramanian et al., 1998). The docking cavity was generated using SPHGEN by generating spheres from the solvent-accessible molecular surface of the receptor cavity. Clusters containing 121 spheres for CCR5, 90 for CCR1, and 74 for CCR2 were generated in this fashion. A maximum of six steric overlaps were allowed during the generation of docking orientations. The predicted orientations were scored individually based on the empirical evaluation of the electrostatic and van der Waals energy contributions.

Refinement of the initial receptor-ligand complex was obtained by in vacuo energy minimization (0.001 kcal/mol rms deviation) followed by MD simulations up to 1 ns. A 3.0 kcal/mol positional constraint was applied to the receptor backbone atoms, the cutoff for nonbonded interactions was 8 Å, and the dielectric constant was 4. The temperature of the system was maintained at 298 K with a 0.2 ps coupling constant.
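Both the docking score and the refinement rely on electrostatic and van der Waals terms. The sketch below evaluates a pairwise Coulomb plus 12-6 Lennard-Jones energy with the 8 Å cutoff and dielectric constant of 4 quoted above. It is a generic illustration and not the DOCK or AMBER implementation: the atom dictionary layout, the combining rules, the Coulomb conversion factor, and the toy atoms are assumptions.

```python
import math

COULOMB_CONSTANT = 332.06  # kcal*A/(mol*e^2); approximate force-field value


def coulomb(q1, q2, r, dielectric=4.0):
    """Coulomb energy (kcal/mol) of two point charges at distance r (A)."""
    return COULOMB_CONSTANT * q1 * q2 / (dielectric * r)


def lennard_jones(r, r_min, epsilon):
    """12-6 energy with minimum -epsilon at r = r_min."""
    ratio = (r_min / r) ** 6
    return epsilon * (ratio * ratio - 2.0 * ratio)


def interaction_energy(ligand_atoms, receptor_atoms, cutoff=8.0, dielectric=4.0):
    """Return (electrostatic, van der Waals) sums over pairs within cutoff."""
    elec = vdw = 0.0
    for lig in ligand_atoms:
        for rec in receptor_atoms:
            r = math.dist(lig["xyz"], rec["xyz"])
            if r > cutoff:
                continue
            elec += coulomb(lig["charge"], rec["charge"], r, dielectric)
            # Assumed combining rules: additive radii, geometric-mean well depth.
            r_min = lig["r_min_half"] + rec["r_min_half"]
            eps = math.sqrt(lig["epsilon"] * rec["epsilon"])
            vdw += lennard_jones(r, r_min, eps)
    return elec, vdw


if __name__ == "__main__":
    # Toy atoms with invented coordinates and parameters.
    cation = {"xyz": (0.0, 0.0, 0.0), "charge": 1.0, "r_min_half": 2.0, "epsilon": 0.1}
    anion = {"xyz": (4.0, 0.0, 0.0), "charge": -1.0, "r_min_half": 2.0, "epsilon": 0.1}
    print(interaction_energy([cation], [anion]))
```

With a dielectric constant of 4, a unit charge pair 4 Å apart contributes about -20.8 kcal/mol, which shows why a salt bridge such as the one proposed between the ammonium group and a carboxylate can dominate this type of score.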


    RESULTS

Modeling and characterization of the transmembrane region of CCR5

The model of the transmembrane region of CCR5 is based on the x-ray structure of rhodopsin (Palczewski et al., 2000) and makes use of a previous sequence analysis of 493 GPCR (Baldwin et al., 1997) to determine the average orientation of the helices in the 7-TM bundle with respect to the lipid environment and to outline possible discrepancies between rhodopsin and CCR5. Homology between CCR5 and rhodopsin is lower near the extracellular medium, and variations in these regions may result in a different length and/or different orientation of the helices at these ends. The sequence of CCR5 is schematically shown in Fig. 1, where highly conserved residues are numbered according to the convention of Ballesteros and Weinstein (1995). A set of 50 sequences with the highest sequence identity to CCR5 was used in the comparison (Table 1).


                              
TABLE 1   G-protein coupled receptors with high sequence homology to human CCR5

Calculations of the lipid-facing direction of helices TM1, TM3, TM4, and TM6 were in good agreement with those obtained from a larger set and from the x-ray structure of rhodopsin. The orientation of residues at the extracellular end of TM2 and TM5, while in agreement with that obtained by Baldwin et al. (1997), deviated by ~100° from rhodopsin. This discrepancy originates from bulges in TM2 and TM5 that alter the helical periodicity, thus exposing residues to the lipid environment that would otherwise face the bundle's interior. The bulge in TM5 of rhodopsin is near a conserved proline, flanked by large hydrophobic residues in the F212IIPLIV218 region. The corresponding CCR5 fragment, L203VLPLLV209, has a similar amino acid composition and is likely to adopt a similar conformation. By using the TM5-rhodopsin template, residues K191 and T195 of TM5 orient toward residue T259 of TM6. They would not face each other if TM5 is modeled as an ideal helix. Previous studies have shown that residues at these three positions are proximal to each other, as they can form interhelical disulfide bonds when mutated to Cys in rhodopsin and the tachykinin NK1 receptors (Elling et al., 2000; Struthers et al., 1999). Similarly, replacement with His creates a zinc binding site in both the NK1 and kappa opioid receptors (Thirstrup et al., 1996; Elling et al., 1997). The NK1, kappa opioid, and CCR5 receptors all have an amino acid composition similar to rhodopsin near the conserved Pro. Helix 5 of CCR5 was therefore modeled using the rhodopsin template.

Replacement of the rhodopsin side chains with those of CCR5 resulted in steric conflict between residues T82 and V83 of TM2 and L107 in TM3. Furthermore, the orientation of the C-terminal residues of TM2 did not agree with that predicted by the Fourier transform analysis of the 50 helices or with the previous sequence analysis of GPCR (Baldwin et al., 1997). Closer analysis of helix 2 showed that T82 and V83 replace residues G89 and G90 of rhodopsin at a location where TM2 and TM3 cross and interact with the closest packing. This type of interaction, where large side chains in one helix pack against the Gly backbone of the other helix, is often found in membrane proteins (Javadpour et al., 1999). In rhodopsin, packing is achieved by a local distortion of the helical conformation at these sites. Replacement of the two glycines with the larger CCR5 side chains within this structural motif results in atomic overlap. The extracellular portion of TM2 was therefore modeled by changing the dihedral angles of the L80-P84 fragment according to average values observed in Pro-containing α-helices (Némethy et al., 1992; Sankararamakrishnan and Vishveshwara, 1992). The changes in the torsion angles relieved the atomic overlap between TM2 and TM3 and resulted in an orientation of the C-end residues of TM2 in agreement with both the Fourier analysis and the previous sequence analysis of GPCR (Baldwin et al., 1997). A comparison of dihedral angles in this region between CCR5 and rhodopsin is given in Table 2. The modeled TM2 helix is likely to be valid for those GPCRs that contain the highly conserved TM2-Pro and lack the Gly-Gly motif. Templates for helix 7 and its adjoining helix 8 were taken from the x-ray structure of rhodopsin, based on high sequence homology in both helical regions. Helix 7 of rhodopsin is irregular, containing four residues in the 3₁₀ conformation, flanked by α-helical segments. The model of the transmembrane region of CCR5 is shown in Fig. 2.


                              
TABLE 2   Backbone dihedral angle comparison between modeled CCR5 and rhodopsin at the extracellular end of the second transmembrane helix



FIGURE 2   Structural model of the transmembrane and intracellular regions of CCR5. Shown in red are aromatic and acidic residues conserved in receptors CCR1 through CCR5 (Y37 (1.39), W86 (2.60), Y108 (3.32), Y251 (6.51), and E283 (7.39)). An adjacent cluster consisting of conserved (F79 (2.53), W248 (6.48)) and CCR5-specific (H289 (7.45), F112 (3.36)) aromatic residues is shown in green. Residues Y37, W86, Y108, and E283 are part of the binding pocket for TAK779.

The peptide subfamily of GPCR (Kolakowski, 1994) contains receptors that bind a disparate array of peptides, such as angiotensin, bradykinin, opioid, somatostatin, and chemokines. I have used evolutionary tracing (Lichtarge et al., 1996) to locate functionally important residues in this receptor family based on sequence conservation with the goal of uncovering amino acids in CCR5 that are responsible for binding and selectivity of ligands. Evolutionary trace analysis was restricted to the transmembrane regions, where the accuracy of the model template is sufficient for this type of analysis. The sequences of 50 highly homologous receptors (Table 1) were chosen for the comparison and dendrograms were constructed by including only residues in the transmembrane region. The resulting phylogenetic tree (Fig. 3 A) is similar to the one generated with the full sequence (not shown), thus suggesting a functional specificity for the TM residues. Results from the tracing were visualized using surface area accessibility plots of the TM regions (Fig. 3 B). The periodic pattern mirrors the helical periodicity, with maxima occurring at residues on the outside of the TM bundle, and minima at residues located either at helical interfaces or the interior. The plot readily identifies conserved and CCR5-specific residues of the receptor cavity. The minima observed for residues TM1-N48, TM2-D76, TM3-I116, TM5-L207, TM6-Y244, and TM7-N293 correspond to a network of interacting residues that define the bottom of the receptor cavity. Of these, N48 (1.50, using the numbering of Ballesteros and Weinstein, 1995), D76 (2.50), and N293 (7.49) are highly conserved in the rhodopsin family of GPCR (36% identity cutoff in Fig. 3 A), either Phe or Tyr is found at position Y244 (6.44), and large hydrophobic residues occupy positions I116 (3.40) and L207 (5.51) (Fig. 3 B). The trace obtained with a 76% identity cutoff shows only one additional conserved residue, TM1-Y37 (1.39). Three additional conserved residues at 92% identity, P34 (1.36), W86 (2.60), and C290 (7.46), are found in both chemokine and angiotensin receptors. The highly conserved TM6-Trp (6.48) of GPCR, W248 in CCR5, appears only in the trace at 92% identity, because of sequence variations in CCR7 and the purinergic receptor. An additional six residues are conserved in receptors CCR1 through CCR5 (residues L33 (1.35), Y108 (3.32), L203 (5.47), Y251 (6.51), E283 (7.39), and H289 (7.45)), and the remaining seven residues are unique to CCR5 (Fig. 3 B).
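The classification step of an evolutionary trace can be sketched as follows. The sketch assumes that the sequences have already been split into groups at a chosen identity cutoff (read off the dendrogram) and labels each alignment column as conserved across all groups, invariant within each group but different between groups (class-specific), or neutral. The function name, the labels, and the toy alignment are hypothetical and are not the sequences of Table 1.

```python
def trace_positions(groups):
    """Label each column of a grouped alignment.

    groups maps a group name to a list of aligned sequences of equal length.
    """
    length = len(next(iter(groups.values()))[0])
    labels = []
    for i in range(length):
        consensus = []
        for sequences in groups.values():
            residues = {seq[i] for seq in sequences}
            # A group contributes a consensus residue only if it is invariant.
            consensus.append(residues.pop() if len(residues) == 1 else None)
        if None in consensus:
            labels.append("neutral")
        elif len(set(consensus)) == 1:
            labels.append("conserved")
        else:
            labels.append("class-specific")
    return labels


if __name__ == "__main__":
    toy_groups = {
        "group_a": ["NDYE", "NDYQ"],
        "group_b": ["NEYE", "NEYA"],
    }
    print(trace_positions(toy_groups))
```

Raising the identity cutoff splits the sequences into more and smaller groups, so more columns become invariant within each group; this is why additional residues appear in the traces at 76% and 92% identity.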



FIGURE 3   Analysis of the transmembrane region of CCR5. (A) Sequence identity dendrogram of the 50 GPCR in Table 1. Only residues of the transmembrane region were included in the comparison. Sequences are denoted using their SWISSPROT database entry names. Percent identity values indicate partition identity cutoff used in the trace analysis. The dendrogram was generated using the DRAWGRAM option in the Biology Workbench (http://workbench.sdsc.edu). (B) Solvent accessibility surface area plots of residues in the TM region. Sequences are shown in the direction from the extracellular (left) to the intracellular (right) side. Residues that orient toward the receptor cavity are labeled according to the percent sequence identity using the same symbols as in A. Residues of TM4 are not part of the receptor cavity. Circled star symbols refer to residues common to CCR5 and receptors CCR1 through CCR4. Star symbols denote residues unique to CCR5. Surface areas are given as a percentage compared to the same residues in an Ala-Xxx-Ala peptide (Hubbard and Thornton, 1993).

Automated docking simulations of TAK779 to CCR5

The antagonist TAK779 (Fig. 4) binds with high affinity to CCR5, but its affinity is ~20-fold lower for CCR2 and it does not bind to CCR1 (Baba et al., 1999). I have used a previous automated docking protocol (Subramanian et al., 1998, 2000) to simulate the binding of TAK779 to CCR5, CCR2, and CCR1. Computed docking orientations were sorted based on their docking scoring energy. Docking configurations of TAK779 with energies within 10 kcal/mol of the energy minimum showed interactions of the benzyl-pyran-ammonium group with helices TM1, TM2, and TM7, and close contact of the ammonium nitrogen with E283. The methylphenyl-benzocycloheptenyl moiety mainly interacted with residues in TM3, TM5, and TM6.



FIGURE 4   Chemical structure of TAK779 (N,N-dimethyl-N-(4[[[2-(4-methylphenyl)-6,7-dihydro-5H-benzocyclohepten-8-yl]carbonyl]benzyl]-tetrahydro-2H-pyran) (Baba et al., 1999).

A docked structure with the shortest distance between the basic nitrogen and the carboxyl oxygen atoms of E283 was selected from the ensemble of low-energy orientations and used in MD simulations to further refine the receptor-ligand complex. The resulting docking configuration (Fig. 5 and Table 3) was compared with a previous site-directed mutagenesis study of the binding of TAK779 to CCR5. Residues of the binding pocket were defined using a 5 Å cutoff from TAK779. As shown in Table 3, the docking mode contains several residues implicated in the binding of TAK779 (Dragic et al., 2000). In that study, putative binding residues were identified based on a change of 20% or higher in the efficacy of TAK779 to block HIV-1 entry. Comparison with the docking results shows that the majority of such residues have been found by the docking procedure. Such residues include E283 in TM7 and neighboring aromatic residues TM1-Y37, TM2-W86, and TM3-Y108. Interactions of TAK779 with TM5 and TM6 included T195 and I198 in TM5, and Y251, N252, and L255 in TM6. By comparison, mutagenesis data suggest ligand interaction at I198 (20% change), but a negligible contribution from T195 and L255 (Table 3). Data are not available for Y251 and N252 because mutation of these two amino acids to Ala resulted in poor receptor expression (Dragic et al., 2000).
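The 5 Å cutoff criterion used to define the binding pocket can be sketched as a simple distance filter: a residue belongs to the pocket if any of its atoms lies within the cutoff of any ligand atom. The function name, the input layout, and the coordinates in the example are invented for illustration; the residue labels are borrowed from the text only to make the output readable and do not reproduce the model.

```python
import math


def pocket_residues(receptor_atoms, ligand_coords, cutoff=5.0):
    """Return residue labels with at least one atom within cutoff of the ligand.

    receptor_atoms: iterable of (residue_label, (x, y, z)) tuples.
    ligand_coords:  iterable of (x, y, z) tuples.
    """
    ligand_coords = list(ligand_coords)
    found = []
    for label, xyz in receptor_atoms:
        if label in found:
            continue  # residue already accepted via another atom
        if any(math.dist(xyz, lig) <= cutoff for lig in ligand_coords):
            found.append(label)
    return found


if __name__ == "__main__":
    # Toy coordinates in angstroms; not taken from the CCR5 model.
    receptor = [
        ("E283", (0.0, 0.0, 0.0)),
        ("E283", (1.5, 0.0, 0.0)),
        ("Y37", (4.0, 3.0, 0.0)),
        ("L255", (12.0, 0.0, 0.0)),
    ]
    ligand = [(3.0, 0.0, 0.0)]
    print(pocket_residues(receptor, ligand))
```

An all-pairs search is adequate for a single ligand; a spatial grid or k-d tree would be the usual choice for larger systems.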



FIGURE 5   Orientation of TAK779 in the TM region of CCR5 after 1 ns MD simulations. Side chains within 5 Å of TAK779 are shown with a stick representation. Residues in red have previously been implicated in TAK779 binding (Dragic et al., 2000). Residues in green are side chains whose substitution with Ala did not interfere with TAK779 antiviral activity.


                              
TABLE 3   Residues of CCR5 within a 5 Å cutoff of TAK779, as found in Fig. 5, and comparison with previous site-directed mutagenesis results

The majority of the CCR5 residues that form the binding pocket of TAK779 are also found in receptors CCR1 through CCR4, suggesting that only a few, nonconserved residues are responsible for the selectivity of TAK779. Specifically, selectivity could originate from some of the seven CCR5-unique residues identified by the trace analysis (Fig. 3 B). This hypothesis was tested by docking simulations of TAK779 to CCR2 and CCR1. Structural models of receptors CCR2 and CCR1 were constructed from the CCR5 template by replacing the nonconserved side chains using the backbone-dependent rotamer procedure of Dunbrack and Karplus (1993). Docking of TAK779 to CCR2 resulted in several orientations where the ammonium group of TAK779 was <6 Å from the conserved glutamate in TM7 (7.39). However, most of these structures lay above the TM region without making appreciable interactions with the CCR2 binding pocket. The results suggest that the transmembrane region of CCR2 cannot accommodate TAK779. Manual docking of TAK779 in an orientation similar to that of Fig. 5 resulted in atomic overlap, due to steric clash between TAK779 and R206 in TM5 and F116 and Y124 in TM3. CCR5 contains smaller amino acids at these positions. Position 5.42 is Arg (R206) in CCR2 and Ile (I198) in CCR5, while residues F116 (3.28) and Y124 (3.36) of CCR2 correspond to L104 and F112 of CCR5, respectively. Docking simulations of TAK779 to CCR1 resulted in some orientations similar to those of CCR5, but the docking scoring energy was high, suggesting residual atomic clashes in the binding region. MD simulations of TAK779 to CCR1 starting from a docked orientation similar to that of Fig. 5 resulted in the methylphenyl-benzocycloheptenyl group outside the pocket defined by the TM3, TM5, and TM6 helices. Amino acid comparison shows that two tyrosines at positions 3.33 and 3.37 of CCR1 (Y114 and Y118, respectively) are replaced by Phe in CCR5 (F109 and F113, respectively). 
Other positions involve residues of similar size, such as 6.55 (L255 in CCR5 and I259 in CCR1) and 5.42 (I198 in CCR5 and L203 in CCR1). Apparently, replacing Phe with Tyr is sufficient for the observed displacement from the pocket of CCR1 during the MD simulation.

Modeling and characterization of extracellular and intracellular regions

Modeling of the extracellular regions was performed in two steps. In the first step, information on secondary structure propensity was gathered. In the second step, each region was assembled sequentially into the model. Tertiary constraints were then introduced and the receptor was subjected to a series of simulated annealing simulations with appropriate distance and dihedral angle constraints. Secondary structure information was obtained from applications of prediction methods and by searching for structural templates in the PDB database (Berman et al., 2000).

Secondary structure propensities were obtained by comparing predictions from seven major algorithms, as provided by the Protein Structure Prediction (PELE) module of the Biology Workbench (http://workbench.sdsc.edu). Structure was assigned based on the joint prediction for each amino acid. Database searches included seven amino acids at either end of the loop region to allow stemming of the loops from the helix ends. The searches produced from 3 to 14 hits for each of the loops investigated. In the majority of cases, sequence homology searches produced hits for only portions of the loops, with the average length of the fragment being 11 ± 5 residues. Interestingly, the matches were from either loops or solvent-exposed regions of their respective x-ray structures. The fragments found by the database search were also used to confirm or reject the secondary structure prediction results. If agreement was found, dihedral angle constraints were added to restrict the matched regions to either the α-helical or β-sheet conformation during the simulations.
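One way to picture a joint prediction across several algorithms is a per-residue vote. The sketch below is an assumption about how such a consensus could be formed, not a description of the PELE module: the three-state alphabet (H, E, C), the vote threshold `min_votes`, and the toy prediction strings are all hypothetical.

```python
from collections import Counter

def consensus_structure(predictions, min_votes=4):
    """Combine per-residue predictions (equal-length strings over H/E/C).

    A state is assigned only when at least `min_votes` predictors agree;
    otherwise the position is marked '-' and left unconstrained.
    """
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all predictions must have the same length")
    consensus = []
    for column in zip(*predictions):
        state, votes = Counter(column).most_common(1)[0]
        consensus.append(state if votes >= min_votes else "-")
    return "".join(consensus)

# Seven hypothetical predictor outputs for a five-residue stretch.
preds = ["CEEEC", "CEEEC", "CEECC", "CEEHC", "CCECC", "HEEEC", "CEEHH"]
consensus = consensus_structure(preds)
print(consensus)
```

Positions left as '-' correspond to residues for which no dihedral angle constraint would be applied under this scheme.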

Application of secondary structure prediction to the 32-residue N-terminal region produced an extended conformation for residues 9 to 14 and a helical conformation for residues 26 to 31. Searches for sequence homologs produced fragments from 14 unique PDB structures. Segments that matched residues 26-31 were found to be helical (2 of 2 structures). Region 9-14 was matched by structures containing amino acids classified as being in either an extended or a bend region (4 structures), and it was modeled as an extended structure based on secondary structure prediction. The remaining residues of the N-terminus mostly matched unstructured segments, and the dihedral angles of such residues were not constrained during the simulations. Two prolines, P34 and P35, introduce a helix break between the predicted N-terminus 26-31 helix and TM1.

The first extracellular loop, connecting TM2 with TM3, comprises residues Y89 to M100 of the model. Application of the prediction algorithm resulted in an extension of the TM2 helix to A91. It also showed that the W94DFGNT99 region was unstructured. Database searches resulted in one hit for region F85-A91, corresponding to a helical fragment, and three matches for W94-T99, corresponding to either coil or bend regions of these proteins. The loop structures varied greatly among the three hits, and dihedral angle constraints were not applied to the W94-T99 fragment of EL1.

The disulfide bridge between C101 and C178 (Blanpain et al., 1999b) separates EL2 into two segments, with the N-terminal residues linking TM3 and TM4, while residues C-terminal from C178 connect TM4 to TM5 (Fig. 1). Homology searches did not produce a clear secondary structure preference when searches were done either on the entire EL2 region or on only the N-terminal or C-terminal fragments. Application of homology searches and prediction methods to EL3 resulted in a helical conformation for the L275DQAMQ280 fragment, thus extending the TM7 helix into the extracellular domain by six residues. The database search provided four hits for the L275-Q280 region, three of which were helical. Residues 261-274 of EL3 were matched by four fragments, all of which were in loop regions of these proteins.

Because of the scarcity and limited match of the database results, the fragments were used to generate initial templates and loop optimization was carried out using a constrained simulated annealing protocol. Loops were inserted one at a time by choosing conformations free of steric overlap among the different regions. Dihedral angle constraints were used to restrain the residues to the predicted secondary structures during the simulated annealing procedure. Tertiary constraints were added in the form of distance constraints between residues in different loops, based on available epitope mapping results, summarized in Table 4. Distance constraints r ≤ 15 Å were used between the Cα atoms of epitope residues (Burritt et al., 1998). Constraints were applied between N-terminal residues (region 1-13) and EL2 (residues 168, 176-177), and between EL1-D95 and K171-E172 in EL2. A more generous constraint of r ≤ 25 Å was added between the Cα atoms of D95 and those of residues 1-13 of CCR5 to account for the observed interaction between the first 13 amino acids of the N-terminus and D95 (Hill et al., 1998). Specific side chain-side chain interactions were not obtained in that study. The 25 Å cutoff corresponds to the average diameter of a globular protein of the same size as the extracellular domain of CCR5 (Harpaz et al., 1994). This cutoff assumes that the observed sensitivity upon mutation of D95 may arise not only from direct side chain-side chain contact, but also from structural changes in the globular fold of these regions. As shown in rhodopsin, the EC domains of GPCRs are likely to form a compact folding motif. By comparison, the EC domain of rhodopsin is similar in length to that of CCR5, and the distance between EL1 and the first 13 N-terminus residues varies between ~15 and ~25 Å. The 25 Å distance constraint between the N-terminus and EL1 simply imposes compactness between these two regions, without creating specific residue pair interactions that are not available at this time. The set of geometrical constraints for the 95-residue extracellular domain consisted of a total of 29 distance constraints and 124 (φ, ψ) dihedral angle constraints (Table 4).
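An upper-bound distance constraint of this kind is commonly implemented as a flat-bottom penalty that is zero while the Cα-Cα distance stays within the bound and grows once the bound is exceeded. The sketch below assumes a harmonic penalty and an arbitrary force constant `k`; neither the functional form nor the constant is stated in the text, and the coordinates are hypothetical.

```python
import math

def upper_bound_penalty(r, r_max, k=1.0):
    """Zero penalty while r <= r_max, harmonic in the excess distance otherwise."""
    excess = max(0.0, r - r_max)
    return k * excess ** 2

def restraint_energy(coords, restraints, k=1.0):
    """Sum the penalties over a list of (label_a, label_b, r_max) restraints.

    coords -- dict mapping residue label to the (x, y, z) of its C-alpha atom
    """
    return sum(
        upper_bound_penalty(math.dist(coords[a], coords[b]), r_max, k)
        for a, b, r_max in restraints
    )

# Hypothetical C-alpha positions (Å); only the bounds come from the text.
ca = {"D95": (0.0, 0.0, 0.0), "K171": (12.0, 0.0, 0.0), "D2": (0.0, 28.0, 0.0)}
restraints = [
    ("D95", "K171", 15.0),  # epitope-derived bound, satisfied here (12 Å)
    ("D95", "D2", 25.0),    # looser N-terminus/EL1 bound, violated by 3 Å
]
energy = restraint_energy(ca, restraints)
print(energy)
```

Because the penalty vanishes anywhere inside the bound, a 25 Å limit of this form enforces compactness without favoring any particular residue pair.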


                              
TABLE 4   Geometrical constraints used in modeling simulations of the extracellular loop regions*

The three intracellular loops of CCR5 range from 6 to 10 residues in length (Fig. 1). Residues in IL1 were modeled after rhodopsin, whose sequence homology with CCR5 is 83%. The second and third loops, however, have either low homology or differ in length from those of rhodopsin. These loops were therefore modeled by searching through a loop database for segments with similar end-end distance (InsightII, 2002. Accelrys Inc., San Diego, CA). The receptor model was terminated after helix 8 (Palczewski et al., 2000), which has high homology with rhodopsin. Residues 302 to 352 were not modeled because of low homology to rhodopsin and lack of experimental data. Initial loop structures were then subjected to a simulated annealing protocol, as described in the Methods.

The accuracy of modeled loop structures is typically assessed against the actual x-ray structure of the test proteins (Fiser et al., 2000). Unfortunately, such comparison is not possible for CCR5, because of the poor homology with rhodopsin in the extracellular domains. A previous study of single loops by Fiser et al. (2000) has shown that, given an adequate sampling of the conformational space, the quality of the structures can be inferred from 1) the correlation between the energy of the models and the RMSD, and 2) by the low structural variability among the low-energy structures. These two criteria were used here in the analysis of the simulated EC domains.

The bell-shaped energy profile of the simulated structures (Fig. 6 A) indicates statistically significant sampling of the conformational space. A plot of the RMSD from the lowest energy structure versus energy gave a Pearson coefficient r = 0.42 (Figure 6 B). The backbone atoms of the TM regions and EC domains were used in the superimposition. Structural variability was calculated as the average RMSD of the nine structures with the lowest conformational energies. Fig. 7 shows two low-energy structures representative of low variability (E = -398.8 kcal/mol) and high variability (E = -384.2 kcal/mol). Variability was small for five of these structures, suggesting that one dominant native conformation may have been obtained from the simulations (E = -398.8 kcal/mol in Fig. 7). However, the two criteria for assessing the quality of the EC domains are only partially fulfilled by the modeled structure because of the weak-to-moderate correlation between energy and RMSD, and the presence of low-energy conformations with high variability (Fig. 6, inset).
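The two quality measures used here, the Pearson coefficient between energy and RMSD and the average over all pairwise RMSDs of the nine lowest-energy structures, can be sketched as follows. The data are toy values, and `distance` is a stand-in for an RMSD calculation after superimposition, which is not implemented in this sketch.

```python
import math
from itertools import combinations

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def mean_pairwise(items, distance):
    """Average of distance(a, b) over all unordered pairs; also returns the pair count."""
    pairs = list(combinations(items, 2))
    return sum(distance(a, b) for a, b in pairs) / len(pairs), len(pairs)

# Toy data: perfectly correlated energies and RMSDs.
r = pearson_r([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])

# Nine placeholder "structures"; the absolute difference stands in for an RMSD.
variability, n_pairs = mean_pairwise(list(range(9)), lambda a, b: abs(a - b))
print(r, variability, n_pairs)
```

For nine structures the average runs over (9 × 8)/2 = 36 pairs, which is the pair count returned by `mean_pairwise`.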



FIGURE 6   Analysis of the extracellular loop regions. (A) Histogram of the conformational energy distribution of the 250 structures generated by the simulated annealing procedure. Energies were binned at 10 kcal/mol. (B) Energy plot versus RMSD from the lowest energy structure (E = -398.9 kcal/mol). The straight line is the least-squares fit (Pearson coefficient = 0.42). Inset: the average rms deviation of the nine low-energy structures. The RMSD is calculated from the average of (9 × 8)/2 pairwise RMSDs.



FIGURE 7   Schematic diagram of two representative low energy structures of the EC domains. Hotspot residues are shown in a stick representation. Orange: N-terminus; cyan: EL1; green: EL2; magenta: EL3.

When each EC domain was superimposed individually, rms deviations from the mean coordinates ranged up to 9 Å for the N-terminus, 5 Å for EL1, 7 Å for EL2, and 6 Å for EL3. The structures were then clustered based on structural similarity to find major structural groups. RMSD cutoffs were increased from 2.0 Å until a significant number of structures were represented in each cluster. Clustering of EL1 structures with a 3.5 Å cutoff resulted in 60% of the structures in six clusters containing 14 to 53 loops each. About 50% of EL2 structures were found in nine clusters with 9 to 18 structures each when using a 6.5 Å cutoff. Fifty percent of the EL3 structures clustered in eight groups containing 9 to 27 structures each. Clustering of the N-terminus structures with a 7 Å cutoff resulted in 30% of the structures in five clusters containing 17 to 23 structures each. Results show that, with the exception of the shorter EL1, the clusters contain structures with large structural variations among them.
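The clustering algorithm itself is not specified in the text. As an illustration only, the sketch below assumes a simple leader-style clustering, in which each structure joins the first cluster whose founding member lies within the cutoff, and raises the cutoff from 2.0 Å in fixed steps until a target fraction of structures falls in clusters of a minimum size. The step, minimum size, target fraction, and toy data are all hypothetical, and `distance` stands in for a pairwise RMSD.

```python
def leader_clusters(items, distance, cutoff):
    """Assign each item to the first cluster whose leader is within `cutoff`."""
    clusters = []
    for item in items:
        for cluster in clusters:
            if distance(item, cluster[0]) <= cutoff:
                cluster.append(item)
                break
        else:
            clusters.append([item])  # item founds a new cluster
    return clusters

def scan_cutoff(items, distance, start=2.0, step=0.5, stop=10.0,
                min_size=3, target=0.5):
    """Increase the cutoff until `target` of the items sit in clusters of >= min_size."""
    cutoff = start
    while cutoff <= stop:
        clusters = leader_clusters(items, distance, cutoff)
        covered = sum(len(c) for c in clusters if len(c) >= min_size)
        if covered / len(items) >= target:
            return cutoff, clusters
        cutoff += step
    return None, []

# One-dimensional placeholders for structures; |a - b| stands in for RMSD.
values = [0.0, 1.0, 2.5, 10.0, 11.0, 13.0, 30.0]
cutoff, clusters = scan_cutoff(values, lambda a, b: abs(a - b))
print(cutoff, clusters)
```

A larger final cutoff, as found for the N-terminus and EL2, signals that members of the same cluster can still differ substantially from one another.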

I further analyzed the simulated structures in terms of their average properties to capture global characteristics that may not be apparent from either the lowest energy structures or individual loop conformations. Commonalities in the global fold of the EC domains were found by examining average properties of solvent accessibility and frequency of contact interactions among the domains.

The solvent-accessible surface area of the EC amino acids, averaged over the 250 structures, is shown in Fig. 8 A. Approximately 60% of the N-terminus residues have solvent accessibility >40%. On the contrary, EL3 is clearly the most buried of the loops. The disulfide bridge between C101 and C178 limits the solvent accessibility of six residues near C178 in EL2, where, on average, the C-end of this loop is more solvent-exposed than its N-end portion. The low accessibility of D95 in EL1 arises from interaction of this side chain with residues of the N-terminus.



FIGURE 8   Analysis of the extracellular loop regions. (A) Average solvent accessibility surface areas of the extracellular regions of CCR5. Values represent the average over 250 structures generated by the simulated annealing procedure. (B) Interactions between hotspot residues, defined in Table 5, and the three extracellular loops. Normalized values indicate the probability that a given residue of the EC regions is within 5 Å of hotspot residues. Graphs were generated using the 250 structures obtained from the simulated annealing procedure.

The 250 structures were also evaluated based on interaction of hotspot residues, defined here as residues of the loop regions whose mutation results in loss of binding for either chemokines or gp120. As shown in Table 5, hotspot residues consist of mostly acidic and aromatic amino acids in the N-terminus, EL2, and EL3. The environment of these crucial residues was characterized by selection of the interacting residues from the set of the 250 structures (Fig. 8 B). The majority of the models showed interactions of EL1 with hotspot residues located in the N-terminus (D2-I12 and Y14-E18) and EL2 (Y176-T177). The second extracellular loop presented interactions with the D2-I12 fragment of the N-terminus. Close contacts between the Y14-E18 segment and EL3 originate from the disulfide bond between C20 and C269. Residues at either the C-end of the N-terminus (K26 and R31) or F263-N267 of EL3 did not show significant interaction with the other extracellular regions.
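The contact analysis over the ensemble amounts to counting, for each residue, the fraction of structures in which it lies within 5 Å of a hotspot residue. The sketch below is a simplification under stated assumptions: each residue is represented by a single point rather than by all of its atoms, and the four toy structures and their coordinates are hypothetical.

```python
import math

def contact_probability(structures, hotspots, cutoff=5.0):
    """Fraction of structures in which each non-hotspot residue is within
    `cutoff` (Å) of at least one hotspot residue.

    structures -- list of dicts mapping residue label to one (x, y, z) point
    hotspots   -- set of residue labels treated as hotspots
    """
    counts = {}
    for coords in structures:
        for label, xyz in coords.items():
            if label in hotspots:
                continue
            near = any(
                math.dist(xyz, coords[h]) <= cutoff
                for h in hotspots if h in coords
            )
            counts[label] = counts.get(label, 0) + (1 if near else 0)
    return {label: count / len(structures) for label, count in counts.items()}

# Four toy structures with D2 as the only hotspot (illustrative values).
ensemble = [
    {"D2": (0.0, 0.0, 0.0), "D95": (3.0, 0.0, 0.0), "F263": (20.0, 0.0, 0.0)},
    {"D2": (0.0, 0.0, 0.0), "D95": (4.0, 0.0, 0.0), "F263": (18.0, 0.0, 0.0)},
    {"D2": (0.0, 0.0, 0.0), "D95": (9.0, 0.0, 0.0), "F263": (22.0, 0.0, 0.0)},
    {"D2": (0.0, 0.0, 0.0), "D95": (0.0, 4.5, 0.0), "F263": (15.0, 0.0, 0.0)},
]
probs = contact_probability(ensemble, {"D2"})
print(probs)
```

Applied to the full set of 250 structures, values near 1 would mark residues that consistently pack against a hotspot, while values near 0 would mark residues with no such interaction.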


                              
TABLE 5   Residues implicated in the interaction of CCR5 with either MIP-1β or dual-tropic HIV virus, as obtained from site-directed mutagenesis data

By comparison, simulations performed without the addition of tertiary distance constraints completely lacked interactions among the four domains, except in the immediate proximity of the two disulfide bonds. The four domains did not assemble into a compact fold, and several structures were found where the domains packed against the 7-TM helices. Such conformations are unrealistic in the membrane environment. They appear because the simulated annealing protocol has to be carried out in vacuo, thus ignoring the lipid environment of the TM helices.


    DISCUSSION

The x-ray structure of rhodopsin provides the template for models of homologous GPCRs. However, the specific sequence motifs of rhodopsin give rise to local distortions of the 7-TM helices that may not be present in other receptors. Deviations from the standard helical conformation result in a different orientation of the helices in the bundle and changes in the shape and amino acid composition of the GPCR binding pocket. Specific structural motifs may also provide a mechanism of receptor activation. Recently, Govaerts et al. (2001a) have investigated the role of the T82VP84 motif in TM2 of CCR5 in receptor activation. A P84A mutation resulted in decreased affinity for chemokines, while mutations of T82 impaired receptor activation following binding of chemokines. In that work, the helix bend at P84 was correlated with the activation process. In this work, I have found that the helical distortion in TM2 of rhodopsin is characteristic of Gly-containing helices in interfacial regions (Javadpour et al., 1999), resulting in tight packing with TM3. The Gly motif of rhodopsin was then replaced in CCR5 with a motif derived from Pro-containing helices (Némety et al., 1992; Sankararamakrishnan and Vishveshwara, 1992), and residues at the C-terminal end of TM2 were oriented based on sequence conservation in a set of 50 sequences homologous to CCR5. The resulting helix kink orients the CC-chemokine conserved tryptophan (W86) in the binding pocket, in an orientation similar to that modeled by Govaerts et al. (2001a). Unlike the changes in TM2, helix 5 of CCR5 was modeled after rhodopsin with a bulge in the L203-V209 region. The TM5 and TM7 rhodopsin templates were maintained based on sequence comparison with rhodopsin and consideration of sequence homology and experimental data on closely related receptors.

Receptors in Table 1 share the highest sequence homology with CCR5. Highly conserved residues line the bottom of the receptor cavity (Fig. 3 B), where they form a tight network of interacting side chains. These amino acids comprise the highly conserved "fingerprint" residues characteristic of the rhodopsin-GPCR family (Attwood and Findlay, 1994). It was surprising, however, to find few additional residues shared across receptor classes. Only the angiotensin receptors had residues in common with CC-chemokine receptors (P34 (1.36), W86 (2.60), and C290 (7.46) of CCR5). CCR5 shares six residues with receptors CCR1 to CCR4 (Fig. 3 B). Two conserved aromatic clusters characterize the cavity in these receptors (Fig. 2). Four aromatic residues, TM1-Y37 (1.39), TM2-W86 (2.60), TM3-Y108 (3.32), and TM6-Y251 (6.51), are within an ~6 Å radius from the acidic residue TM7-E283. The second aromatic cluster is adjacent to the first one and includes the highly conserved Y244 (6.44) and W248 (6.48) in TM6 and the CC-chemokine-specific residues TM2-F79 (2.53) and TM7-H289 (7.45). Therefore, the receptor cavity of receptors CCR1 through CCR5 utilizes class-specific aromatic residues for side chain interactions with the "fingerprint" residues of GPCRs (Y244 and W248 in CCR5). Given the implication of these residues in receptor activation (Javitch et al., 1998), their interaction with the conserved aromatic cluster could serve as a common mechanism of activation in proteins CCR1 to CCR5.

Site-directed mutagenesis studies have characterized the binding mode of the CCR5 antagonist TAK-779 (Dragic et al., 2000). Single point mutations of residues L33, Y37, W86, Y108, and E283 to Ala decreased the efficacy of TAK-779 in antagonizing the binding of gp120 to CCR5. The molecular docking simulations show that all of the above residues are part of the TAK779 binding site (Dragic et al., 2000) (Fig. 5 and Table 3). In particular, the docking orientation is characterized by an electrostatic interaction between the ammonium group of TAK779 and E283. This glutamate is conserved in receptors CCR1 through CCR5, is solvent-accessible (Fig. 3 B), and is the only acidic residue in the extracellular end of the 7-TM bundle. Mutagenesis data have shown the importance of this glutamate not only for the binding of TAK779, but for antagonists of other CC-chemokine receptors as well. Recent mutagenesis data on CCR2 showed loss of binding of basic spiropiperidine ligands upon mutation of Glu to either alanine or glutamine (Mirzadegan et al., 2000). Correspondingly, quaternization of the piperidine nitrogen has been shown to be essential for the high affinity of CCR1 antagonists (Liang et al., 2000; Naya et al., 2001). Given that small ligands of CC-chemokine receptors are characterized by a basic amino group (Liang et al., 2000; Ng et al., 1999; Sabroe et al., 2000), it is plausible that receptors CCR1 through CCR5 share a common binding mode characterized by an electrostatic interaction with the conserved TM7-glutamate. Additional interactions of TAK779 with residues in TM5 and TM6 find partial agreement with the mutagenesis data (Table 3). The residues involved are small (T195) or hydrophobic (L104, F112, and L255). It is plausible that their substitution with Ala may perturb the binding energy less than the replacement of charged or polar aromatic residues. 
Similarly, loss of opioid receptor affinities is greatest upon mutation of a critical Asp and nearby polar aromatic residues, while mutations of hydrophobic groups have smaller effects (Surratt et al., 1994; Befort et al., 1996; Metzger et al., 2001).

The docking mode of TAK779 (Fig. 5), while in agreement with the experimental data, differs from a previous orientation proposed from the analysis of these same data (Dragic et al., 2000). It has been previously suggested that the methyl-benzyl-heptadyl moiety binds among TM1, TM2, and TM7, while the positively charged ammonium group would orient toward the extracellular domain (Dragic et al., 2000). Automated docking simulations were not performed in that study. The proposed binding mode of TAK779 presented here is validated by the integration of systematic docking simulations together with mutagenesis, SAR, and sequence analysis.

The binding mode of TAK779 accounts for receptor affinity, but also shows that the above six residues are not unique to CCR5, as they are also found in CCR1 and CCR2, two receptors with low affinity for TAK-779 (Baba et al., 1999). Therefore, the selectivity of TAK-779 for CCR5 must derive from nonconserved side chains, such as those in Fig. 3 B. This hypothesis was tested by performing docking simulations of TAK779 to CCR1 and CCR2. Several docking orientations contained interaction of the ammonium nitrogen with the conserved glutamate in TM7, but the 4-methyl-benzocycloheptenyl group of TAK779 docked close to the extracellular end of the 7-TM bundle. I find that substitution of CCR5 amino acids with bulkier side chains in CCR2 and CCR1 is responsible for the lack of binding deep in the TM pocket. Such residues are R206, F116, and Y124 in CCR2, and Y114 and Y118 in CCR1.

The docking results, when combined with current mutagenesis data and SAR on CCR5 and other CC-chemokine receptors (Dragic et al., 2000; Liang et al., 2000; Mirzadegan et al., 2000; Naya et al., 2001), suggest a common binding mode for CC-chemokine antagonists. Conserved regions of CC-chemokine receptors (i.e., L33 (1.35), Y37 (1.39), W86 (2.60), Y108 (3.32), and E283 (7.39) in CCR5) bind the common chemical component of chemokine antagonists, i.e., the basic nitrogen of the piperidine ring, while residues in TM3, TM5, and TM6 interact with aromatic moieties of these molecules (the methyl-benzyl-heptadyl moiety in CCR5). The proposed binding mode, where conserved regions of receptors bind conserved chemical motifs of ligands, finds similarities with that of other GPCRs, for example opioid and dopamine receptors (Simpson et al., 1999; Metzger et al., 2001; McFadyen et al., 2001). Extensive mutagenesis and docking studies of those two receptor classes have shown that a highly conserved Asp in TM3 (3.32) is responsible for the binding of the basic amino group of aminergic and opioid ligands, while divergent chemical moieties bind nonconserved amino acid residues. While the conserved Asp (3.32) is responsible for a large portion of the binding energy in the majority of aminergic and opioid receptors, the contribution of the nonconserved residues to binding was found only after extensive mutagenesis studies. The proposed binding mode for CC-chemokine antagonists can be used to guide further experimental studies to better define the interactions of TAK-779 with the nonconserved residues in TM5 and TM6.

Binding of TAK779 to CCR1 and CCR2 is likely to include residues of the EC regions, whose steric and electrostatic interactions give rise to poor affinity (CCR2) or lack of binding (CCR1). The likelihood that the EC region of CCR5 could interfere with the binding of the antagonist was investigated by pooling residues within a 5 Å radius of Y37, W86, and E283 from the set of 250 EC structures obtained by simulated annealing. Only 5% of the structures had interactions between the binding pocket residues and either EL1 or EL2. Interactions with EL3 were found in 30% of the structures and centered on Q261 and the N273-Q277 region. These results are in agreement with the experimental findings (Dragic et al., 2000) that mutations of EL1 and EL2 residues have no effect on TAK779 binding, and that Q261A and N273A-Q277A mutations mildly decreased the ability of TAK779 to interfere with gp120 binding. The results therefore exclude a contribution of the loops to the binding of TAK779 to CCR5.

I have used segment-matching (Sanchez and Sali, 1999) in combination with secondary structure prediction methods to generate structural templates and dihedral angle constraints of the four extracellular domains. The database searches provided segments that could be superimposed to generate an initial template. Typically, each loop was reconstructed by matching three to four overlapping segments. Given the geometrical variability of the CCR5 loops, it is unlikely that the matches found for CCR5 are a unique representation of the conformation of the EC domains. Therefore, I have chosen to use the database searches not for obtaining an actual structural template, but to uncover regions where regular secondary structure motifs may occur. Such regular structures occur in rhodopsin (Palczewski et al., 2000), where the N-terminus and EL2 assemble into β-sheet folds, and have been postulated to exist in other receptors as well (Paterlini et al., 1997; Moro et al., 1999; Zhang et al., 2002b). A previous modeling study of EL2 of the κ-opioid receptor based on database matches and secondary structure prediction (Paterlini et al., 1997) has been recently validated by NMR spectroscopy of this loop in a DPC micelle (Zhang et al., 2002b).

The conformational freedom of the individual loops was restricted by application of dihedral angle constraints based on both the predicted and observed secondary structure propensities when available. Tertiary interactions, in the form of distance constraints between different structural domains, were added from considerations of available epitope mapping results. A simulated annealing protocol was then used to generate loop models that were analyzed based on conformational energy criteria, conformational variability, and average physical properties. The set of ~150 dihedral and geometrical constraints used in the simulations was not sufficient for complete structural determination of the 95 residues of EC domains. Loop conformations, when clustered according to their RMSD, showed structural variability from 3.5 Å for EL1 to 7 Å for the N-terminus. However, the distance constraints between the EC loops were necessary to maintain a compact fold of the four regions. Simulations carried out without the addition of tertiary distance constraints resulted in improbable structures void of interactions among the EC domains where the longer loops and the N-terminus packed against the hydrophobic TM helices.

Criteria used to assess the quality of modeled loops have limited applicability to CCR5 because of the length of the loops and interactions among the domains. I have estimated the accuracy of the simulated EC domains from the correlation between the conformational energy and RMSD and from the conformational variability of the low-energy conformations. This approach has been previously tested on single loops up to 12 residues in length (Fiser et al., 2000), and it is applied here to a considerably more complex system. I find a low-to-moderate correlation and low variability of some of the structures, thus suggesting that a dominant low energy conformation has been found (Fig. 7). The quality of the correlation and variability is likely to improve by either increasing the number of constraints or, alternatively, by applying stricter distance constraints. However, available biological data suggested only a range of interacting residues (as between D95 and the N-terminus), thus making it necessary to adopt distance cutoffs that are not biased toward specific residue-residue interactions.

The binding of chemokines and gp120 has been extensively probed by site-directed mutagenesis of residues in the EC regions (Table 5). Single-point mutations can either eliminate essential side chain interactions with ligands, or perturb the tertiary structure of the EC domains. If the models can distinguish between the two cases, they can be used to guide experimental design.

Average properties of the modeled EC regions were used to obtain common folding characteristics and patterns of interacting residues. The average solvent accessibility area (Fig. 8 A) reflects, in part, the geometrical constraints imposed by the disulfide bonds and between the N-terminus and EL2. The lower accessibility of EL3 is likely due to the disulfide bond between C20 and C269, which causes the N-terminus to lie directly above EL3 (Fig. 7). The observation that it has not been possible to raise monoclonal antibodies against EL3 (Lee et al., 1999) finds support in the low solvent accessibility of EL3 in the current model. On the contrary, the C-terminal region of EL2 has clearly high solvent accessibility (Fig. 8 A), in agreement with the finding that several mAbs can recognize epitopes in this region (Lee et al., 1999).

Representative structures of the major structural clusters share common features, despite the great variability among them. As illustrated in the case of low-energy structures (Fig. 7), the N-terminus cuts across the EC domain to reach EL2, thus separating EL1 from EL3. The interactions between the N-terminus and the three loops give rise to clusters of hotspot aromatic and acidic residues, such as those formed with D95 in EL1, Y176-T177 in EL2, and C269 in EL3 (Fig. 8 B). Some of the hotspot residues were also part of the epitope and, as such, were subjected to distance constraints (Table 4). However, the cutoffs used in the simulations were three- to fivefold greater than the one used to identify close contacts (d ≤ 5 Å), and individual residue contacts were not specified. Many of the hotspot residues, such as D2, D95, R168, Y176, and T177, are also characterized by low average solvent accessibility because they are involved in side chain-side chain interactions. I suggest that the loss of binding observed for mutations at these loci originates from a perturbation of the tertiary organization of the EC regions. In contrast, regions such as the C-terminal end of the N-terminus and residues F263-N267 in EL3 both lack specific interactions and are solvent-accessible (Fig. 8, A and B). I propose that the observed loss of binding upon mutagenesis in these regions originates from a direct loss of interactions with chemokines or gp120. Gain-of-function studies, in which the affinity of CCR5 is restored by complementary replacement or swapping of residue pairs (Zhou et al., 1994; Govaerts et al., 2001b), could be used to validate the side chain interactions obtained in this study. For example, experiments could involve residue pairs such as N-term-D2 and EL2-R168, N-term-N13 and EL1-D95, or N-term-E18 and EL3-R274. The putative sites of chemokine interaction outlined here may also be used to map the molecular determinants of RANTES recognition (Nardese et al., 2001) onto the surface of the EC models. For example, the clusters of negatively charged residues can be used to orient RANTES on the EC surface by matching them to the complementary basic clusters of this chemokine in docking simulations.
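The close-contact criterion mentioned above (d ≤ 5 Å) lends itself to a short sketch of how residue-residue contacts could be tallied over an ensemble of structures. The Python below is a minimal illustration under stated assumptions: the distance between two residues is taken as the minimum distance between any pair of their atoms, each structure is a dictionary of residue label to a list of (x, y, z) coordinates, and the coordinates shown are toy values. None of these choices is taken from the study itself.

```python
import math
from itertools import combinations

CONTACT_CUTOFF = 5.0  # angstroms; the close-contact threshold quoted in the text


def min_distance(atoms_a, atoms_b):
    """Smallest distance between any atom of one residue and any atom of another."""
    return min(math.dist(a, b) for a in atoms_a for b in atoms_b)


def contact_frequency(ensemble, cutoff=CONTACT_CUTOFF):
    """Fraction of structures in which each residue pair is in close contact.

    `ensemble` is a list with one dict per structure, mapping a residue
    label to a list of (x, y, z) atom coordinates (assumed layout).
    Pairs that are never in contact are omitted from the result.
    """
    counts = {}
    for structure in ensemble:
        for res_i, res_j in combinations(sorted(structure), 2):
            if min_distance(structure[res_i], structure[res_j]) <= cutoff:
                counts[(res_i, res_j)] = counts.get((res_i, res_j), 0) + 1
    return {pair: n / len(ensemble) for pair, n in counts.items()}


if __name__ == "__main__":
    # Two hypothetical structures with one atom per residue (toy coordinates).
    toy_ensemble = [
        {"D2": [(0.0, 0.0, 0.0)], "R168": [(3.0, 0.0, 0.0)], "F263": [(20.0, 0.0, 0.0)]},
        {"D2": [(0.0, 0.0, 0.0)], "R168": [(8.0, 0.0, 0.0)], "F263": [(20.0, 0.0, 0.0)]},
    ]
    print(contact_frequency(toy_ensemble))
```

Residue pairs with a high contact frequency would be candidates for the complementary-replacement experiments suggested above, whereas residues that appear in no pair would correspond to the regions lacking specific interactions.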


    CONCLUSIONS

Presently, GPCRs can be modeled from rhodopsin, the only member of this family for which the x-ray structure has been solved. The usefulness of these homology models depends greatly on their ability to explain and predict the binding of endogenous ligands and to aid efficiently in the discovery of new synthetic compounds. The sensitivity of the 7-TM template to local distortions, imparted by sequence-specific motifs, can affect the size and nature of the binding cavity. Because sequence homology provides little structural information on the loop regions, these regions remain difficult to characterize structurally, despite their importance in binding and recognition, as exemplified by CCR5.

I have presented a comprehensive model of CCR5 that elucidates both the binding of small ligands and the sensitivity of the receptor to mutations that affect the binding of chemokines and of the HIV-1 coat protein gp120. The computational approach has sought to enhance homology-modeling techniques with ab initio simulations and knowledge-based information to generate structural models that are then corroborated by comparison with additional data. The model of the transmembrane region was validated by probing the binding of TAK779 using automated docking simulations. The docking mode finds support in available experimental data, and specific hypotheses have been formulated to explain both the affinity and the selectivity of TAK779 for CCR5. Application of evolutionary tracing to the 7-TM region readily identifies CC-conserved residues, such as E283 and aromatic residues in TM1, TM2, and TM3, that create a CC-chemokine class-specific receptor pocket. The correlation of the conserved glutamate with the presence of a quaternary ammonium group in current CC-chemokine antagonists further suggests commonalities in ligand binding among CCR1 through CCR5 that can be exploited when designing small ligands for these receptors.

The size and complexity of the EC regions examined here required that supplementary constraints be added during modeling, in addition to knowledge-based information on the secondary structure of the individual loops. These were obtained in the form of tertiary constraints based on the epitope mapping studies. Analysis of the average properties of the simulated structures suggests different roles for the functionally important residues: either maintaining the tertiary structure of the EC domain or being accessible for binding by chemokines or gp120. The constrained simulated annealing protocol, in combination with conformational clustering, provides a systematic approach for generating low-resolution structures of EC domains for further experimental validation and design.

    ACKNOWLEDGMENTS

I thank Loren Gragert for assistance with the computations, and am grateful to Philip Portoghese and Andrew Shenker for helpful discussions.

This work was supported by National Institutes of Health Grant 5K01DA0073 and by the Minnesota Supercomputer Institute.

    FOOTNOTES

Address reprint requests to M. Germana Paterlini, Certusoft Inc., 7831 Glenroy Road, Suite 440, Minneapolis, MN 55439. Tel.: 952-921-0351; Fax: 952-921-0367; E-mail: germana{at}certusoft.com.

Submitted February 7, 2002, and accepted for publication September 10, 2002.


    REFERENCES