Howard Hughes Medical Institute, Department of Biochemistry and Molecular Biophysics, Center for Computational Biology and Bioinformatics, Columbia University, New York, New York 10032
Correspondence: Address reprint requests to Barry Honig, E-mail: bh6{at}columbia.edu.
| ABSTRACT |
|---|
Cα-RMSD values to the native of 2 Å or less in the transmembrane regions) may be obtained for template sequence identities of 30% or higher if an accurate alignment of the sequences is used. Second, we show that secondary-structure prediction algorithms that were developed for water-soluble proteins perform approximately as well for membrane proteins. Third, we provide a comparison of a set of commonly used sequence alignment algorithms as applied to membrane proteins. We find that high-accuracy alignments of membrane protein sequences can be obtained using state-of-the-art profile-to-profile methods that were developed for water-soluble proteins. Improvements are observed when weights derived from the secondary structure of the query and the template are used in the scoring of the alignment, a result which relies on the accuracy of the secondary-structure prediction of the query sequence. The most accurate alignments were obtained using template profiles constructed with the aid of structural alignments. In contrast, a simple sequence-to-sequence alignment algorithm, using a membrane protein-specific substitution matrix, shows no improvement in alignment accuracy. We suggest that profile-to-profile alignment methods should be adopted to maximize the accuracy of homology models of membrane proteins.
| INTRODUCTION |
|---|
1% of the structures available in the protein data bank (PDB) (5
There are several features of membrane proteins that distinguish them from water-soluble proteins. The differences arise because the environment of the transmembrane regions of membrane proteins is different from that in aqueous solution: it is predominantly lipophilic, lacks hydrogen-bonding potential, and provides little screening of electrostatic interactions. At the primary sequence level, this results in significant differences in amino acid composition (7,8) and in the probabilities of amino acid substitutions during evolution (9,10), generally favoring residues with hydrophobic side chains, especially at the protein-lipid interface (11,12). In addition, amino acids have been shown to have different secondary-structure propensities in membrane environments and in aqueous solution (13–15).
The differences in the properties of the two types of protein might be expected to have consequences for the applicability of some homology modeling methods to membrane proteins. For example, differences in amino acid composition and evolutionary substitution probabilities imply that methods for the alignment of protein sequences may not be directly transferable. This possibility has led to the creation of novel amino acid substitution matrices (10,16), which are used to identify probable matches in sequences, and to the introduction of so-called bipartite alignment methods that utilize these matrices in transmembrane regions only (10,16,17).
A second aspect of modeling that may be affected by the differences between membrane proteins and water-soluble proteins is the prediction of secondary structure. We draw a distinction between the secondary structure of a residue and its location relative to the membrane, since every amino acid can be labeled as having both a specific secondary-structure type and a specific location. This distinction is useful because it allows for the unique description of secondary-structure elements peripheral to the membrane (18), as well as coil-like residues within the membrane, e.g., in reentrant loops or unwound helices (19). Thus, a method capable of accurately predicting the secondary structure of each residue in a membrane protein sequence would provide information that is supplementary to that obtained from the prediction of the location of a particular amino acid with respect to the bilayer. More generally, it is important to understand the extent to which secondary-structure prediction algorithms designed for soluble proteins are applicable to membrane proteins.
A third way the membrane environment may affect homology modeling studies involves the presence of unique topological constraints provided by the lipid bilayer (20). In principle, it is possible that the range of relative orientations of helices within the membrane is more restricted than in the aqueous phase, which may limit the structural diversity available to families of membrane proteins. It might also suggest that homology models of membrane proteins are more accurate than models of water-soluble proteins for the same level of sequence identity. It is therefore of interest to assess the relationship between sequence identity and structural similarity for membrane proteins.
In this work, we address the three issues raised above. We analyze the performance of state-of-the-art globular-protein homology modeling strategies using a set of 36 homologous membrane protein structures (HOMEP), comprising 11 families of topologically related proteins. Taking each protein in turn, we use all its family members as templates for the construction of homology models whose accuracy is then determined by comparison to the known structure. Although small on the scale of general sequence alignment benchmark sets such as BAliBASE (21), the HOMEP set is carefully compiled and covers a wide range of sequence identities, varying from 80 to <10%.
| METHODS |
|---|
lucy/homep) was selected from the PDB (5
Two definitions of the transmembrane regions were adopted. The first, referred to as TM, was defined by hand to incorporate all residues in membrane-spanning secondary-structure elements according to DSSP (22) that were also superimposed in the structural alignment of all family members. Thus, the TM regions include residues located at the lipid-water interface as well as within the bilayer (Supplementary Material Table 3). The second definition, referred to as TMDET, comprises only residues in the hydrophobic core of the membrane, as defined by the TMDET algorithm (23) used by the PDB_TM database (24). Two short segments were incorrectly assigned by TMDET and thus excluded from the analysis: a strand (residues 128–133) in a loop region of 1osm and a helical region in the first two N-terminal residues of 1pw4.
Secondary-structure prediction accuracy
Since HOMEP is highly redundant by design, for the analysis of secondary-structure prediction algorithms we used the 40% nonredundant set of membrane proteins from the PDB_TM database from July 1, 2005. After excluding theoretical models, Cα-only structures, and proteins with missing residues, the set contained 106 chains from 71 membrane proteins, of which 92 chains were α-helical and 14 chains were β-barrels. Predictions were obtained with local installations of PSIPRED (25) v2.3, JNET (26), and PHDsec (27), and compared against assignments from DSSP. To obtain the multiple-sequence alignment input for each protein, we ran a PSI-BLAST search on the National Center for Biotechnology Information (NCBI) nonredundant database (nr); we ran three PSI-BLAST iterations including sequences below an E-value cutoff of 5 × 10⁻⁴ and reported sequences with an E-value cutoff of 1 × 10⁻³. No filtering of transmembrane regions was carried out.
We also assessed the composite prediction used by HMAP (28), which is a vector of probabilities for the three states (helix, strand, and coil) determined by direct averaging of the confidence scores from PSIPRED, JNET, and PHDsec. To enable comparison with the DSSP assignments, the prediction at each position was taken as the state with the highest probability.
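The averaging step described above can be illustrated with a minimal sketch. It assumes that each method's confidence scores have already been parsed into one dictionary per residue with keys `H`, `E`, and `C`; that data layout and the function name `composite_prediction` are hypothetical and are not taken from HMAP.

```python
from typing import Dict, List, Sequence

STATES = ("H", "E", "C")  # helix, strand, coil


def composite_prediction(
    per_method: Sequence[Sequence[Dict[str, float]]]
) -> List[str]:
    """Average per-residue confidence vectors over methods, then pick the top state.

    per_method[k][i] holds the H/E/C confidences of method k for residue i.
    The input layout is an assumption made for this illustration.
    """
    n_res = len(per_method[0])
    if any(len(method) != n_res for method in per_method):
        raise ValueError("all methods must cover the same residues")

    calls = []
    for i in range(n_res):
        # Direct (unweighted) average of the confidences for each state.
        avg = {
            s: sum(method[i][s] for method in per_method) / len(per_method)
            for s in STATES
        }
        # The discrete call is the state with the highest averaged value;
        # ties are resolved in the order given by STATES.
        calls.append(max(STATES, key=lambda s: avg[s]))
    return calls


if __name__ == "__main__":
    method_a = [{"H": 0.8, "E": 0.1, "C": 0.1}, {"H": 0.2, "E": 0.3, "C": 0.5}]
    method_b = [{"H": 0.6, "E": 0.2, "C": 0.2}, {"H": 0.1, "E": 0.6, "C": 0.3}]
    method_c = [{"H": 0.7, "E": 0.1, "C": 0.2}, {"H": 0.3, "E": 0.5, "C": 0.2}]
    print(composite_prediction([method_a, method_b, method_c]))
```

In the second residue of the usage example, one method favors coil while the other two favor strand, and the averaged vector selects strand.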
Generation of sequence alignments
Sequence-to-sequence alignments
The dynamic programming algorithm in ClustalW v1.82 (29) was used to align each of the query-template sequence pairs. Gap-open penalties (po) of 9, 10, 11, 12, 15, and 20 were tested in combination with gap-extension penalties (pe) of 0.1 or 1. No clear difference was seen in the Q or AL0 scores (see below) of pairwise alignments using these different gap penalties (data not shown), so the default values (po = 10 and pe = 0.1) were used.
Sequence-to-profile alignments
We carried out PSI-BLAST (30) searches for each template sequence on the nr database, which was clustered at 65% sequence identity; five iterations of PSI-BLAST were carried out using E-value cutoffs as above. The sequence hits were compiled into a multiple-sequence alignment from which very remote homologs were removed according to the sequence threshold of Batalov and Abagyan as described by Tang et al. (28). This purged alignment was then used to create a sequence-based profile to which the query sequence was aligned with ClustalW, creating a sequence-to-profile alignment. A profile is an alternate representation of the primary sequence in which each amino acid position contains a set of probabilities.
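To make the notion of a profile concrete, the following minimal sketch converts a multiple-sequence alignment into per-column residue frequencies. It is an illustration only: the function name `build_profile` is hypothetical, and the sketch omits the sequence weighting and pseudocounts that production tools such as PSI-BLAST apply.

```python
from collections import Counter
from typing import Dict, List


def build_profile(msa: List[str], gap: str = "-") -> List[Dict[str, float]]:
    """Return one residue-frequency dictionary per alignment column.

    Gaps are ignored when computing frequencies; a column made up only of
    gaps yields an empty dictionary.
    """
    if not msa or any(len(row) != len(msa[0]) for row in msa):
        raise ValueError("alignment rows must be non-empty and equal in length")

    profile = []
    for column in zip(*msa):  # iterate over alignment columns
        counts = Counter(res for res in column if res != gap)
        total = sum(counts.values())
        profile.append(
            {res: n / total for res, n in counts.items()} if total else {}
        )
    return profile


if __name__ == "__main__":
    toy_msa = ["ALV", "AIV", "A-V", "GLV"]  # toy alignment, not real data
    for position, column_freqs in enumerate(build_profile(toy_msa), start=1):
        print(position, column_freqs)
```

Each position of the resulting profile sums to one over the observed residues, so a query residue can be scored against a distribution rather than against a single template residue.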
Multiple-sequence alignments
These were generated by combining PSI-BLAST hits (as above) for both query and template into a single nonredundant set of sequences, which were then aligned using ClustalW, T-Coffee (31), Muscle (32), and ProbCons (33).
HMAP profile-to-profile alignments
HMAP is a program for the construction and alignment of structure-based profiles (28) that is similar in its algorithms to other profile-based approaches (34). For each template we generated two types of profile: HMAP [1,2] and HMAP [1,2,3], which combine sequence and secondary- and tertiary-structure information in different ways. The HMAP [1,2] template profiles combined sequence information from a PSI-BLAST search (as above) with a consensus secondary-structure assignment derived from all templates in the family, alongside position-specific weights reflecting the location of ungapped (i.e., core) positions in the alignment. The HMAP [1,2,3] template profiles differ in that the PSI-BLAST hits were taken from all available templates and merged using a structural alignment as a guide. For the query sequence we created a similar HMAP [1,2] profile, except that the secondary structure was obtained from a consensus prediction (see above) and the position-specific weights depended on the confidence levels of those predictions. Query and template profiles were then aligned using a score designed to favor matching of ungapped core regions and of secondary-structure types. Gap penalties were also assigned according to the location of core regions or secondary-structure elements. We used the local-global alignment method where unaligned terminal residues are only penalized in the query.
In the case of the reductase family of proteins, one member (PDB code: 1l0v) comprises two protein chains, whereas the homologous region in the other two reductase proteins consists of a single chain. Alignment therefore required concatenation of the sequences or profiles of the two 1l0v chains; multiple sequence alignments were not possible.
Structure-based alignments
Structure-based sequence alignments were carried out with SKA (35,36). Residues that were matched in the structure alignment were used to define the correct alignment, which is the reference state in the calculation of the percentage of aligned positions that are correctly predicted, Q (see below). The sequence identity for each query-template pair was calculated using this alignment and was defined as the number of identical residues divided by the length of the shorter sequence.
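The sequence identity definition given above can be written out as a short sketch. It assumes the structure-based alignment is available as two equal-length gapped strings; the function name `sequence_identity` and the gap character are illustrative choices, not part of SKA.

```python
def sequence_identity(aligned_query: str, aligned_template: str, gap: str = "-") -> float:
    """Percent identity: identical aligned residues / length of the shorter sequence."""
    if len(aligned_query) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")

    # Count columns where both sequences carry the same (non-gap) residue.
    identical = sum(
        1
        for q, t in zip(aligned_query, aligned_template)
        if q == t and q != gap
    )
    # Ungapped lengths of the two sequences.
    len_query = sum(1 for c in aligned_query if c != gap)
    len_template = sum(1 for c in aligned_template if c != gap)
    shorter = min(len_query, len_template)
    if shorter == 0:
        raise ValueError("sequences must contain at least one residue")
    return 100.0 * identical / shorter


if __name__ == "__main__":
    # Toy alignment: four identical columns, both sequences five residues long.
    print(sequence_identity("ACD-EF", "ACDGE-"))
```

Normalizing by the shorter sequence, rather than by the alignment length, means that long unaligned loops or termini in one protein do not lower the reported identity.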
Measures of accuracy
Models were built using Modeller 6v2 (37) and were assessed using several measures of structure similarity or model accuracy. In addition to the root mean squared deviation of the positions of the Cα atoms (Cα-RMSD), we compare the model with the native structure using two scores that are used to evaluate predictions in CASP (38). Both measures are based on the global distance test (GDT), which determines the number of model-template Cα-atom pairs, G(v), that are within a distance threshold of v Å (39). Using GDT results, the GDT_TS score (40) is then calculated as the average percentage of residues that fit within four different cutoff distances:
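$$\mathrm{GDT\_TS} = \frac{100}{4N}\left[G(1) + G(2) + G(4) + G(8)\right],$$
where N is the number of Cα-atoms in the template structure. A second measure, the AL0 score (37), is calculated using a single distance threshold of 3.8 Å:
$$\mathrm{AL0} = \frac{100\,G(3.8)}{N}.$$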
This threshold corresponds approximately to the distance between adjacent Cα atoms in a peptide chain, so that it tends to reflect structural differences corresponding to shifts in the sequence alignment.
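The GDT-based scores described above can be sketched in a few lines, under several stated assumptions. The cutoffs of 1, 2, 4, and 8 Å are the standard CASP choices for GDT_TS, and 3.8 Å is the typical distance between adjacent Cα atoms; the function names are hypothetical. The sketch also takes per-residue Cα distances from a single, precomputed superposition, whereas the real GDT procedure searches over many superpositions, and CASP's AL0 applies further conditions, so `al0_like` is only an approximation.

```python
from typing import Sequence


def g(distances: Sequence[float], cutoff: float) -> int:
    """G(v): number of Calpha pairs whose distance is within the cutoff v (in Å)."""
    return sum(1 for d in distances if d <= cutoff)


def gdt_ts(
    distances: Sequence[float],
    n_ref: int,
    cutoffs: Sequence[float] = (1.0, 2.0, 4.0, 8.0),  # assumed standard cutoffs
) -> float:
    """Average percentage of reference residues within each cutoff distance."""
    if n_ref <= 0:
        raise ValueError("n_ref must be positive")
    return 100.0 * sum(g(distances, v) for v in cutoffs) / (len(cutoffs) * n_ref)


def al0_like(distances: Sequence[float], n_ref: int, cutoff: float = 3.8) -> float:
    """Simplified AL0-style score: percentage of residues within 3.8 Å."""
    if n_ref <= 0:
        raise ValueError("n_ref must be positive")
    return 100.0 * g(distances, cutoff) / n_ref


if __name__ == "__main__":
    toy_distances = [0.5, 1.5, 3.0, 5.0, 9.0]  # invented values, in Å
    print(gdt_ts(toy_distances, n_ref=5))
    print(al0_like(toy_distances, n_ref=5))
```

Because a residue shifted by one alignment position typically moves by about one Cα-Cα spacing, the single 3.8 Å threshold is more sensitive to alignment shifts than the multi-cutoff average.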
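Sequence alignment accuracy was also measured using the percentage of correctly aligned positions, Q:
$$Q = 100 \times \frac{\text{number of correctly aligned positions}}{\text{number of aligned positions in the structure-based alignment}}$$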
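As an illustration of how Q can be evaluated, the sketch below compares a test alignment with a reference alignment, each given as a pair of gapped strings. It assumes that the denominator is the number of residue pairs matched in the reference (structure-based) alignment; the function names `aligned_pairs` and `q_score` are hypothetical.

```python
from typing import Set, Tuple


def aligned_pairs(seq_a: str, seq_b: str, gap: str = "-") -> Set[Tuple[int, int]]:
    """Return the set of (index in a, index in b) residue pairs that share a column."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = set()
    i = j = 0  # running residue indices in the ungapped sequences
    for x, y in zip(seq_a, seq_b):
        if x != gap and y != gap:
            pairs.add((i, j))
        if x != gap:
            i += 1
        if y != gap:
            j += 1
    return pairs


def q_score(test: Tuple[str, str], reference: Tuple[str, str]) -> float:
    """Percentage of reference residue pairs that the test alignment reproduces."""
    ref_pairs = aligned_pairs(*reference)
    if not ref_pairs:
        raise ValueError("reference alignment has no aligned positions")
    return 100.0 * len(aligned_pairs(*test) & ref_pairs) / len(ref_pairs)


if __name__ == "__main__":
    reference = ("ACDEF", "ACDEF")
    test = ("ACD-EF", "AC-DEF")  # one residue pair is broken by the gaps
    print(q_score(test, reference))
```

Working with residue-index pairs, rather than comparing alignment strings column by column, keeps the score independent of how gaps are laid out in either alignment.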
For ease of comparison, the individual membrane protein models in our set, M (one for each query-template pair), have been ranked according to i) the fraction of the target structure that can be superimposed on the template within a cutoff distance of 5 Å, and ii) the sequence identity between the target and template. These two rankings were combined into a relative difficulty score (41) for each model.
| RESULTS |
|---|
Cα-RMSD and GDT_TS scores of these models, plotted against sequence identity (Fig. 1), provide a benchmark of the likely quality of a membrane protein homology model for a given level of sequence identity, assuming that the correct alignment can be achieved and that no refinement is carried out. Fig. 1 shows that the quality of a membrane protein homology model decreases exponentially with decreasing sequence identity.
Cα-RMSD values of the two data sets match reasonably well. The membrane protein whole-protein Cα-RMSDs are more similar to the values of Flores et al. (43
Secondary-structure prediction accuracy
We ran three different programs on a nonredundant set of membrane proteins of known structure and compared the results with assignments calculated using DSSP (Table 1). The per-residue three-state accuracy (helix, strand, or coil) of the three methods was found to be between 68 and 79%, which is comparable to the ~76% found for globular proteins (25,26,45,46). Similar results were obtained for the composite prediction used by HMAP. Note that the standard deviations are large in all cases, especially for PHDsec and JNET, reflecting a variation in scores that is larger than the 7–10% deviation found for soluble proteins. When considering only the hydrophobic cores, as defined using TMDET, the accuracy improves further, especially for PSIPRED (87%). Comparing the different fold types, we found that α-helical residues in membrane proteins (particularly in the membrane regions) are on average more accurately predicted than β-strand residues, although the data set is smaller for the latter, making such comparisons tentative.
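A per-residue three-state accuracy of the kind reported above can be computed as in the following sketch. The reduction of DSSP's eight states to three (H, G, I to helix; E, B to strand; everything else to coil) is one commonly used convention and is an assumption here, since the mapping used in this analysis is not stated; the function names are hypothetical.

```python
# Assumed 8-to-3 state reduction; any DSSP code not listed is treated as coil.
DSSP_TO_3 = {"H": "H", "G": "H", "I": "H", "E": "E", "B": "E"}


def reduce_dssp(dssp: str) -> str:
    """Collapse a DSSP string into the three states H (helix), E (strand), C (coil)."""
    return "".join(DSSP_TO_3.get(code, "C") for code in dssp)


def three_state_accuracy(predicted: str, dssp: str) -> float:
    """Percentage of residues whose predicted state equals the reduced DSSP state."""
    observed = reduce_dssp(dssp)
    if not observed or len(predicted) != len(observed):
        raise ValueError("prediction and DSSP strings must be non-empty and equal in length")
    correct = sum(1 for p, o in zip(predicted, observed) if p == o)
    return 100.0 * correct / len(observed)


if __name__ == "__main__":
    toy_dssp = "HHGGTSEEB-"        # invented assignment, not real data
    toy_prediction = "HHHCCCEECC"  # invented prediction
    print(reduce_dssp(toy_dssp))
    print(three_state_accuracy(toy_prediction, toy_dssp))
```

Restricting the comparison to a subset of residues, such as those in the TMDET-defined hydrophobic core, only requires filtering both strings with the same residue mask before calling `three_state_accuracy`.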
Structure-based profile-profile alignments
The use of the HMAP [1,2] structure-based profile-to-profile alignment method improves the AL0 scores of the models compared with the ClustalW sequence-to-profile alignments and multiple-sequence alignments (Fig. 3 and Table 2). However, the improvement is less obvious when comparing against the newer multiple-sequence alignment methods and in particular with T-Coffee. The most significant improvement in AL0 obtained from HMAP is seen for the most difficult alignments, with sequence identities of <10%. HMAP [1,2,3] alignments are better than the HMAP [1,2] alignments, especially for pairs of sequences with identities of 0–30%. Three-dimensional information is incorporated here using structural alignment of the available templates to guide the combination of their sequence information, as well as the assignment of weights to the core regions (see Methods). Clearly the higher precision achieved by combining template information in this way leads to greater accuracy in the alignments.
In summary, the HMAP [1,2] and HMAP [1,2,3] structure-based profile-to-profile alignments result in the most accurate models of all the methods compared here. However, the alignments obtained from HMAP are not optimal as defined by the structure-based alignments, which obviously limits the accuracy of the models built on these alignments.
Bipartite alignments
All the alignments presented so far, whether sequence- or profile-based, were calculated using the BLOSUM62 amino acid substitution matrix, which was developed for globular proteins (50). It has been suggested that bipartite alignments, which use different substitution matrices for the transmembrane and water-soluble regions, might be more appropriate for membrane proteins (10,16). We tested the effect of using a bipartite approach in a sequence-to-sequence alignment scheme (10,16) on the HOMEP data set using a simple dynamic programming algorithm where the PHAT matrix (16) was applied to the known transmembrane regions in the template and the BLOSUM62 substitution matrix was used for the remaining residues. Note that in contrast to the STAM method (17), we do not align the transmembrane segments separately and then add the loop regions, but rather align the whole sequence and choose the substitution matrix depending on the assignment of each position (10,16). The bipartite alignments result in models with lower AL0 scores than when BLOSUM62 is used throughout (Fig. 4 and Table 3); similar results are observed using Q scores. Using the TM definition of the transmembrane region (see Methods), the bipartite alignments were worse still, which reflects the unsuitability of the PHAT matrix for residues in the bilayer interfacial region.
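The bipartite scheme described above, in which the whole sequence is aligned and the substitution matrix is chosen per template position, can be sketched as a small global dynamic-programming routine. This is a hedged illustration, not the implementation used here: the two scoring functions are invented toy stand-ins (they do not contain PHAT or BLOSUM62 values), the gap penalty is linear rather than affine, and all names are hypothetical.

```python
from typing import Callable, Sequence

Scorer = Callable[[str, str], float]

HYDROPHOBIC = set("AILMFVW")  # toy residue class used only by the toy scorer


def toy_soluble_score(a: str, b: str) -> float:
    """Toy stand-in for a general-purpose matrix (NOT BLOSUM62 values)."""
    return 2.0 if a == b else -1.0


def toy_tm_score(a: str, b: str) -> float:
    """Toy stand-in for a transmembrane matrix (NOT PHAT values)."""
    if a == b:
        return 3.0
    return 1.0 if a in HYDROPHOBIC and b in HYDROPHOBIC else -2.0


def bipartite_align_score(
    query: str,
    template: str,
    template_in_tm: Sequence[bool],
    tm_score: Scorer = toy_tm_score,
    soluble_score: Scorer = toy_soluble_score,
    gap: float = -4.0,  # assumed linear gap penalty
) -> float:
    """Optimal global alignment score with a per-template-position matrix choice."""
    if len(template) != len(template_in_tm):
        raise ValueError("template and transmembrane mask must have equal length")
    n, m = len(query), len(template)
    # dp[i][j]: best score for aligning query[:i] with template[:j]
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # The matrix is selected from the template's region assignment.
            scorer = tm_score if template_in_tm[j - 1] else soluble_score
            match = dp[i - 1][j - 1] + scorer(query[i - 1], template[j - 1])
            dp[i][j] = max(match, dp[i - 1][j] + gap, dp[i][j - 1] + gap)
    return dp[n][m]


if __name__ == "__main__":
    mask = [False, True, True, True, False]  # toy transmembrane assignment
    print(bipartite_align_score("KLIVE", "KLVVE", mask))
    print(bipartite_align_score("KLIVE", "KLVVE", [False] * 5))
```

Passing an all-`False` mask recovers a conventional single-matrix alignment, which makes it straightforward to compare the two schemes on the same sequence pair.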
Errors in individual alignments
For a few models we observe that the alignments generated using either HMAP [1,2] or HMAP [1,2,3] profiles were less accurate than the ClustalW sequence-to-profile alignments. The largest differences are found for the TonB-coupled receptor family, most strikingly in the models where BtuB (PDB code: 1nqe) is the query or where FepA (PDB code: 1fep) is the query. These errors are likely caused by the low secondary-structure prediction accuracy for the long β-strands in the TonB-coupled receptor family, which is 65.1% with PSIPRED. Other poor quality alignments are found for the seven transmembrane helix models (see Opsins in Supplementary Material Table 1), when rhodopsin (PDB code: 1u19) is either the query or the template, although the HMAP alignments are usually better than the ClustalW sequence-to-profile alignments. The structure of bovine rhodopsin is significantly different from that of the three bacterial opsins: the transmembrane helices of rhodopsin are more distorted and it contains an additional (interfacial) helix, a small β-sheet, and much longer loops and termini. These differences, along with extremely low sequence identities, combine to yield relatively poor quality alignments and models for this family.
| DISCUSSION |
|---|
Cα-RMSD of ~1 Å from the native structure (~95% GDT_TS) in the transmembrane regions. Indeed, an acceptable model of, say, 2 Å Cα-RMSD in the transmembrane regions (~85% GDT_TS) is possible for most proteins above 30% sequence identity. In contrast, below ~25% sequence identity, which is the similarity of many G-protein-coupled receptors to bovine rhodopsin, the only available template, a model may have a transmembrane Cα-RMSD from the native above 3.0 Å (~75% GDT_TS). The accuracy of the complete model, including all extramembranous regions, will be expected to be lower than that of the transmembrane region alone. This analysis indicates the accuracy of a model assuming that the conformation of the template structure reflects the desired conformation of the query protein. However, many membrane proteins are believed to undergo conformational changes during functional processes. Homology models cannot be expected to accurately predict such conformational changes per se: only the conformation closest to that of the chosen template will be adequately represented. Thus, the accurate prediction of many different functional conformations of a membrane protein will require template structures in equivalent conformations to be solved.
Membrane protein sequence alignments
Our analysis of sequence alignment algorithms indicates that those methods that have proved effective for water-soluble proteins work for membrane proteins as well. There is a clear progression in alignment accuracy when recently developed multiple sequence alignment (MSA) algorithms are used and additional improvements are obtained with HMAP's profile-to-profile alignment algorithm. Moreover, the increased use of structural information in the HMAP [1,2,3] alignments yields improvements relative to the HMAP [1,2] alignments. We note that ClustalW (29) is widely used to create sequence alignments for membrane proteins (51–56). Our results suggest that future work would benefit from the use of profile-to-profile methods and/or more advanced MSA techniques.
Our results on a simple bipartite sequence-to-sequence alignment method using the membrane-protein-specific substitution matrix PHAT show no significant improvement in the alignment quality over a traditional alignment using BLOSUM62. Originally, PHAT was shown to improve sensitivity in sequence database searches of membrane proteins (16). However, since database searching aims to best discriminate between similar and dissimilar proteins, rather than to achieve the correct global alignment of two sequences, the optimal parameters for the two applications may differ. There have also been some reported improvements in alignment accuracy using PHAT within the program STAM (17), which might be attributable to the separation and independent alignment of the transmembrane and nontransmembrane regions and to differences in gap penalties, rather than to the choice of substitution matrix. Clearly, the usefulness of membrane-protein-specific substitution matrices is dependent on the context, suggesting that the contribution of the choice of matrix should be carefully assessed in future applications.
Many other strategies have been presented for the alignment of membrane protein sequences (17,57–59) and for database searches (60,61). For example, probable transmembrane regions and loop regions have been aligned separately as independent segments (17,58) and then reassembled. Alignment of hydropathy profiles, rather than of primary sequences, has also been proposed (57). These methods have not been assessed here, either because they are not automated or because they were only suitable for helical proteins. However, it would be interesting to see how these methods compare with the profile-to-profile methods in terms of membrane protein alignment accuracy. Indeed, comparison of models from fully automated methods with those generated by experts in the field (with manual adjustment of alignments, for example) suggests that the manual approaches can lead to higher model accuracies (62). This has relevance to the alignments used in, e.g., G-protein-coupled receptor modeling (63), which have often required manual intervention. Nevertheless, a poor initial alignment may introduce errors that are missed during manual adjustment, particularly at low sequence identities, emphasizing the importance of accurate alignment algorithms.
Secondary-structure prediction
The success of the profile-to-profile methods is dependent on the accurate prediction of secondary structures in the query protein. We have shown that current secondary-structure prediction algorithms, and in particular PSIPRED, are only slightly less accurate for membrane proteins than they are for water-soluble proteins. This is rather surprising, since amino acids in membranes are reported to have different secondary-structure propensities (13–15) and because early prediction methods (64) gave results in poor agreement with experimental data for membrane proteins (65). Our results, which instead assess more recent, neural-network-based approaches using a larger set of high-resolution data, are supported by a previous study of membrane protein β-barrel prediction (66) in which similar results were obtained using PSIPRED (73%). (To our knowledge, no similar study has previously been attempted for helical membrane proteins.)
Neural networks derived from soluble proteins might have been expected to perform poorly on membrane proteins for two reasons: the membrane region imposes different secondary-structure propensities on amino acids, and the algorithms were not trained on membrane protein structures. Their success for membrane proteins may be due to the detection of the periodicity that is present in both sets of proteins. Even though the periodicity is effectively inverted, i.e., the surface of transmembrane regions is more hydrophobic than the interior whereas the surface of water-soluble proteins is more hydrophilic than the interior, the existence of a regular periodic pattern alone may be sufficient to obtain good prediction accuracy. In membrane protein β-barrels, the strands often extend far beyond the hydrophobic bilayer core where their properties are likely to strongly resemble the alternating patterns of water-soluble protein β-strands. However, the five to seven residues that comprise the membrane-spanning part of the strands may have a more complex pattern: the outer face of the barrel will be predominantly hydrophobic, whereas the interior face properties will depend on whether the barrel is filled with protein or water. This might explain the lower accuracy seen for the predictions on the hydrophobic TMDET regions of the β-barrels compared with the whole structures, although definitive interpretations are difficult due to the small number of structures (Table 1).
Secondary structure versus transmembrane prediction
Since they do not predict the same property, it is somewhat specious to directly compare the accuracies of secondary-structure predictions with those of transmembrane predictions. For reference, however, we note that the best-performing transmembrane-helix predictors have two-state per-residue accuracies (i.e., whether a residue is in the membrane or not) of ~80% (67,68). Their accuracy at the segment level (i.e., whether a membrane-spanning helix is detected or not) is generally higher, between 85 and 99%. In the case of the β-barrel predictors, per-residue accuracies of ~82% have been achieved (69). Thus, both the transmembrane helix and transmembrane strand methods are only slightly more accurate than the secondary-structure prediction algorithms. It is noteworthy, though, that as a consequence of the low number of structures available, accuracies for transmembrane predictions may be inflated by overtraining or by tests using proteins that were also included within the training set (68). In contrast, the secondary-structure prediction algorithms were solely trained on water-soluble proteins.
| CONCLUSIONS |
|---|
| SUPPLEMENTARY MATERIAL |
|---|
| ACKNOWLEDGEMENTS |
|---|
This work was supported by the National Science Foundation under grant No. MCB-0416708.
Submitted on February 28, 2006; accepted for publication April 13, 2006.
| REFERENCES |
|---|
2. Wallin, E., and G. von Heijne. 1998. Genome-wide analysis of integral membrane proteins from eubacterial, archean, and eukaryotic organisms. Protein Sci. 7:10291038.
3. Krogh, A., B. Larsson, G. von Heijne, and E. L. L. Sonnhammer. 2001. Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes. J. Mol. Biol. 305:567580.
4. Drews, J. 2000. Drug discovery: a historical perspective. Science. 287:19601964.
5. Berman, H. M., J. Westbrook, Z. Feng, G. Gilliland, T. N. Bhat, H. Weissig, I. N. Shindyalov, and P. E. Bourne. 2000. The protein data bank. Nucleic Acids Res. 28:235242.
6. Petrey, D., and B. Honig. 2005. Protein structure prediction: inroads to biology. Mol. Cell. 20:811819.
7. Wallin, E., T. Tsukihara, S. Yoshikawa, G. von Heijne, and A. Elofsson. 1997. Architecture of helix bundle membrane proteins: an analysis of cytochrome c oxidase from bovine mitochondria. Protein Sci. 6:808815.
8. Liu, Y., D. M. Engelman, and M. Gerstein. 2002. Genomic analysis of membrane protein families: abundance and conserved motifs. Genome Biol. 3:research0054.00510054.0012.
9. Donnelly, D., J. P. Overington, S. V. Ruffle, J. H. A. Nugent, and T. L. Blundell. 1993. Modelling
-helical transmembrane domains: the calculation and use of substitution tables for lipid-facing residues. Protein Sci. 2:5570.
10. Jones, D. T., W. R. Taylor, and J. M. Thornton. 1994. A mutation data matrix for transmembrane proteins. FEBS Lett. 339:269275.
11. Rees, D. C., L. DeAntonio, and D. Eisenberg. 1989. Hydrophobic organization of membrane proteins. Science. 245:510513.
12. Eyre, T. A., L. Partridge, and J. M. Thornton. 2004. Computational analysis of alpha-helical membrane protein structure: implications for the prediction of 3D structural models. Protein Eng. Des. Sel. 17:613624.
13. Li, S.-C., and C. M. Deber. 1994. A measure of helical propensity for amino acids in membrane environments. Nat. Struct. Biol. 1:368373.
14. Blondelle, S. E., B. Forood, R. A. Houghten, and E. Pérez-Payá. 1997. Secondary structure induction in aqueous vs membrane-like environments. Biopolymers. 42:489498.
15. Monné, M., I. Nilsson, A. Elofsson, and G. von Heijne. 1999. Turns in transmembrane helices: determination of the minimal length of a "helical hairpin" and derivation of a fine-grained turn propensity scale. J. Mol. Biol. 293:807814.
16. Ng, P. C., J. G. Henikoff, and S. Henikoff. 2000. PHAT: a transmembrane-specific substitution matrix. Bioinformatics. 16:760766.
17. Shafrir, Y., and H. R. Guy. 2004. STAM: simple transmembrane alignment method. Bioinformatics. 20:758769.
18. Granseth, E., G. von Heijne, and A. Elofsson. 2005. A study of the membrane-water interface region of membrane proteins. J. Mol. Biol. 346:377385.
19. Riek, R. P., I. Rigoutsos, J. Novotny, and R. M. Graham. 2001. Non-
-helical elements modulate polytopic membrane protein architecture. J. Mol. Biol. 306:349362.
20. Bowie, J. U. 2005. Solving the membrane protein folding problem. Nature. 438:581589.
21. Thompson, J. D., P. Koehl, R. Ripp, and O. Poch. 2005. BAliBASE 3.0: latest developments of the multiple sequence alignment benchmark. Proteins. 61:127136.
22. Kabsch, W., and C. Sander. 1983. Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers. 22:25772637.
23. Tusnady, G. E., Z. Dosztanyi, and I. Simon. 2005. TMDET: web server for detecting transmembrane regions of proteins by using their 3D coordinates. Bioinformatics. 21:12761277.
24. Tusnady, G. E., Z. Dosztanyi, and I. Simon. 2005. PDB_TM: selection and membrane localization of transmembrane proteins in the protein data bank. Nucleic Acids Res. 33:D275D278.
25. Jones, D. T. 1999. Protein secondary structure prediction based on position-specific scoring matrices. J. Mol. Biol. 292:195202.
26. Cuff, J. A., and G. J. Barton. 1999. Application of multiple sequence alignment profiles to improve protein secondary structure prediction. Proteins. 40:502511.
27. Rost, B. 1996. PHD: predicting 1D protein structure by profile-based neural networks. Methods Enzymol. 266:525539.
28. Tang, C. L., L. Xie, I. Y. Y. Koh, S. Posy, E. Alexov, and B. Honig. 2003. On the role of structural information in remote homology detection and sequence alignment methods using hybrid sequence profiles. J. Mol. Biol. 334:10431062.
29. Thompson, J. D., D. G. Higgins, and T. J. Gibson. 1994. CLUSTAL_W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22:46734680.
30. Altschul, S. F., W. Gish, W. Miller, E. W. Myers, and D. J. Lipman. 1990. Basic local alignment search tool. J. Mol. Biol. 215:403–410.
31. Notredame, C., D. G. Higgins, and J. Heringa. 2000. T-Coffee: a novel method for fast and accurate multiple sequence alignment. J. Mol. Biol. 302:205–217.
32. Edgar, R. C. 2004. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32:1792–1797.
33. Do, C. B., M. S. P. Mahabhashyam, M. Brudno, and S. Batzoglou. 2005. ProbCons: probabilistic consistency-based multiple sequence alignment. Genome Res. 15:330–340.
34. Ohlson, T., B. Wallner, and A. Elofsson. 2004. Profile-profile methods provide improved fold recognition: a study of different profile-profile alignment methods. Proteins. 57:188–197.
35. Yang, A. S., and B. Honig. 2000. An integrated approach to the analysis and modeling of protein sequences and structures. I. Protein structural alignment and a quantitative measure for protein structural distance. J. Mol. Biol. 301:665–678.
36. Petrey, D., A. Nicholls, and B. Honig. 2003. GRASP2: visualization, surface properties and electrostatics of macromolecular structures and sequences. Methods Enzymol. 374:492–509.
37. Sali, A., and T. L. Blundell. 1993. Comparative protein modelling by satisfaction of spatial restraints. J. Mol. Biol. 234:779–815.
38. Moult, J., K. Fidelis, A. Tramontano, B. Rost, and T. Hubbard. 2005. Critical assessment of methods of protein structure prediction (CASP): round 6. Proteins. 61:3–7.
39. Zemla, A. 2003. LGA: a method for finding 3D similarities in protein structures. Nucleic Acids Res. 31:3370–3374.
40. Moult, J., K. Fidelis, A. Zemla, and R. E. Hubbard. 2001. Critical assessment of methods of protein structure prediction (CASP): round IV. Proteins. 45:2–7.
41. Venclovas, C., A. Zemla, K. Fidelis, and J. Moult. 2003. Assessment of progress over the CASP experiments. Proteins. 53:585–595.
42. Wilson, C. A., J. Kreychman, and M. Gerstein. 2000. Assessing annotation transfer for genomics: quantifying the relations between protein sequence, structure and function through traditional and probabilistic scores. J. Mol. Biol. 297:233–249.
43. Flores, T. P., C. A. Orengo, D. S. Moss, and J. M. Thornton. 1993. Comparison of conformational characteristics in structurally similar protein pairs. Protein Sci. 2:1811–1826.
44. Chothia, C., and A. M. Lesk. 1986. The relation between the divergence of sequence and structure in proteins. EMBO J. 5:823–826.
45. Rost, B., and V. A. Eyrich. 2001. EVA: large-scale analysis of secondary structure prediction. Proteins. 45:192–199.
46. Rost, B. 2001. Review: protein secondary structure prediction continues to rise. J. Struct. Biol. 134:204–218.
47. Thompson, J. D., F. Plewniak, and O. Poch. 1999. A comprehensive comparison of multiple sequence alignment programs. Nucleic Acids Res. 27:2682–2690.
48. Wallace, I. M., G. Blackshields, and D. G. Higgins. 2005. Multiple sequence alignments. Curr. Opin. Struct. Biol. 15:261–266.
49. Elofsson, A. 2002. A study on protein sequence alignment quality. Proteins. 46:330–339.
50. Henikoff, S., and J. G. Henikoff. 1992. Amino acid substitution matrices from protein blocks. Proc. Natl. Acad. Sci. USA. 89:10915–10919.
51. Ogawa, H., and C. Toyoshima. 2002. Homology modeling of the cation binding sites of Na+K+-ATPase. Proc. Natl. Acad. Sci. USA. 99:15977–15982.
52. Casadio, R., I. Jacoboni, A. Messina, and V. De Pinto. 2002. A 3D model of the voltage-dependent anion channel (VDAC). FEBS Lett. 520:1–7.
53. Yang, Q., X. Wang, L. Ye, M. Mentrikoski, E. Mohammadi, Y.-M. Kim, and P. C. Maloney. 2005. Experimental tests of a homology model for OxlT, the oxalate transporter of Oxalobacter formigenes. Proc. Natl. Acad. Sci. USA. 102:8513–8518.
54. Kühlbrandt, W., J. Zeelen, and J. Dietrich. 2002. Structure, mechanism, and regulation of the Neurospora plasma membrane H+-ATPase. Science. 297:1692–1696.
55. Bostina, M., B. Mohsin, W. Kühlbrandt, and I. Collinson. 2005. Atomic model of the E. coli membrane-bound protein translocation complex SecYEG. J. Mol. Biol. 352:1035–1043.
56. Oyedotun, K. S., and B. D. Lemire. 2004. The quaternary structure of the Saccharomyces cerevisiae succinate dehydrogenase: homology modeling, cofactor docking and molecular dynamics simulation studies. J. Biol. Chem. 279:9424–9431.
57. Lolkema, J. S., and D. J. Slotboom. 1998. Estimation of structural similarity of membrane proteins by hydropathy profile alignment. Mol. Membr. Biol. 15:33–42.
58. Bissantz, C., A. Logean, and D. Rognan. 2004. High-throughput modeling of human G-protein coupled receptors: amino acid sequence alignment, three-dimensional model building, and receptor library screening. J. Chem. Inf. Comput. Sci. 44:1162–1176.
59. Cserzo, M., J.-M. Bernassau, I. Simon, and B. Maigret. 1994. New alignment strategy for transmembrane proteins. J. Mol. Biol. 243:388–396.
60. Clements, J. D., and R. E. Martin. 2002. Identification of novel membrane proteins by searching for patterns in hydropathy profiles. Eur. J. Biochem. 269:2101–2107.
61. Hedman, M., H. Deloof, G. Von Heijne, and A. Elofsson. 2002. Improved detection of homologous membrane proteins by inclusion of information from topology predictions. Protein Sci. 11:652–658.
62. Tress, M. L., I. Ezkurdia, O. Graña, G. López, and A. Valencia. 2005. Assessment of predictions submitted for the CASP6 comparative modeling category. Proteins. 61:27–45.
63. Fanelli, F., and P. G. De Benedetti. 2005. Computational modeling approaches to structure-function analysis of G protein-coupled receptors. Chem. Rev. 105:3297–3351.
64. Chou, P. Y., and G. D. Fasman. 1974. Conformational parameters for amino acids in helical, β-sheet and random coil regions calculated from proteins. Biochemistry. 13:211–222.
65. Wallace, B. A., M. Cascio, and D. L. Mielke. 1986. Evaluation of methods for the prediction of membrane protein secondary structures. Proc. Natl. Acad. Sci. USA. 83:9423–9427.
66. Bagos, P. G., T. D. Liakopoulos, I. C. Spyropoulos, and S. J. Hamodrakas. 2004. PRED-TMBB: a web server for predicting the topology of β-barrel outer membrane proteins. Nucleic Acids Res. 32:W400–W404.
67. Chen, C. P., and B. Rost. 2002. State-of-the-art in membrane protein prediction. Appl. Bioinformatics. 1:21–35.
68. Chen, C. P., A. Kernytsky, and B. Rost. 2002. Transmembrane helix predictions revisited. Protein Sci. 11:2774–2791.
69. Bagos, P. G., T. Liakopoulos, and S. Hamodrakas. 2005. Evaluation of methods for predicting the topology of beta-barrel outer membrane proteins and a consensus prediction method. BMC Bioinformatics. 6:7.