* Department of Molecular Biophysics and Physiology, Rush Medical College, Chicago, Illinois;
Department of Human Physiology and Pharmacology, University of Rome "La Sapienza," Rome, Italy;
Department of Chemistry, University of Rome "La Sapienza," Rome, Italy;
Department of Physiology, Loyola University Medical Center, Maywood, Illinois; and ¶ Health and Environment Department, Istituto Superiore di Sanità, Rome, Italy
Correspondence: Address reprint requests to Joseph P. Zbilut. Tel.: 312-942-6008; Fax: 312-942-8711; E-mail: joseph_p_zbilut{at}rush.edu.
| ABSTRACT |
|---|
| INTRODUCTION |
|---|
As clearly stated by Chiti et al. (2002), the possibility of forming aggregates is intrinsic to any protein. Consequently, it may be difficult to draw a clear-cut border between aggregating and nonaggregating proteins. Any change in environmental conditions (pH, temperature, ionic strength, etc.) could in principle drive a protein from an isolated globular existence in solution to the formation of multimeric aggregates and, eventually, to precipitation out of the solvent-solute equilibrium. This possibility is implicit in the character of the hydrophobic interaction. The main driving force shaping protein tertiary structure is the need to be soluble in water: to achieve this, the protein must fold so as to hide hydrophobic residues while exposing polar residues (Bryngelson et al., 1995).
On the same basis, protein-protein interaction can be considered another aspect of the same phenomenon. In general, the search for aggregation cores is not fundamentally different from the search for folding cores, and aggregation can simply be considered an alternative folding. The choice between correct (autonomous) and incorrect (multimeric) folding is a matter of which of the two routes is energetically favored under the given boundary conditions. The choice between alternative foldings is therefore stochastic, and the boundary conditions can dramatically alter the relative probabilities of the two outcomes; an understanding of this process thus calls for a statistical, i.e., probabilistic, characterization. In the case of protein folding, the relative probability of the two is driven by the balance of hydrophobic, charge, and steric effects of the sequence/environment interaction (Dobson, 2003). This implies the possibility of recognizing the relative propensity for aggregation by means of an efficient chemicophysical representation of proteins.
The mainly hydrophobic character of folding processes motivated us to look at the hydrophobicity coding of protein sequences as the first step for such a study. The coding of single monomers by means of chemicophysical properties, while allowing for a mechanistic interpretation of the observed results, turns protein sequence investigation into a classical numerical signal analysis problem.
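The coding step can be illustrated with a minimal sketch. It assumes the Kyte-Doolittle hydropathy scale purely for illustration (the scale actually used for the analyses is not stated in this section), and the function name `encode_hydrophobicity` is hypothetical.

```python
# Kyte-Doolittle hydropathy values (Kyte and Doolittle, 1982).
# Using this particular scale is an assumption made for illustration only.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def encode_hydrophobicity(sequence, scale=KYTE_DOOLITTLE):
    """Map a one-letter amino acid sequence to a list of hydrophobicity values."""
    return [scale[residue] for residue in sequence.upper()]


# A short, made-up peptide becomes an ordered numerical series.
signal = encode_hydrophobicity("AIKVD")
```

Once the sequence is a numerical series, residue position plays the role that time plays in a conventional signal.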
The treatment of proteins as numerical series has a long history, dating back to the pioneering work of Zimmerman et al. (1968) and Kyte and Doolittle (1982). These initial studies, while providing useful insights, were limited by signal analysis methods not completely suitable for protein sequences. Fourier analysis and linear autocorrelation functions (the basic methods used) have strong limitations for protein sequence studies, since they assume stationarity and require signals much longer than an average protein. Thus, although strictly periodic features can be identified, more complicated, less obvious features are easily missed.
The interest in nonlinear systems in the 1980s allowed for a revival of time-series-style analysis of protein sequences with new mathematical methods that do not depend on data length or stationarity. This resurgence was marked by numerous successes: the demonstration of a correlation between the hydrophobicity patterning of peptides and their respective receptors by Mandell et al. (2000); hydrophobicity energy patterns (Selz et al., 1998); the demonstration of a "signature" in terms of hydrophobicity patterning of different classical three-dimensional motifs (Murray et al., 2002); and the demonstration of the predictability of protein stability and protein-protein interaction patterns by our group (Zbilut et al., 2000; Giuliani and Tomasi, 2002). For reviews of this second wave of time series analysis of protein sequences, see Giuliani et al. (2002) and Zbilut et al. (2002).
In the present article, we report the results of a nonlinear signal analysis approach to hydrophobicity patterning, using both a static and a dynamic approach. The static approach was based both on the search for singularities in the distribution of hydrophobicity along the amino acid sequences of aggregating protein systems, and on the classification of different folding behaviors relative to their hydrophobicity patterns. (The term singularity has several definitions depending upon a discipline's perspective. Here, we use the term in a general, nonformal sense of a uniquely occurring pattern. Different patterns can emerge depending upon the analytic tool used. In the present case recurrence quantification was employed; see below.) The protein groups were chosen on the basis of available experimental evidence describing their folding tendencies. The dynamic approach was based on the molecular dynamics simulation of amyloid β-peptide Aβ(1–40) at different pH values known to differ in their permissivity for fibril formation.
Both approaches give a general picture of aggregation mechanisms in terms of the relative propensity of the structures located at hydrophobicity-singular points along the amino acid chain to undergo structural order-disorder transitions. Thus, rather than being linked to a particular structural feature, the ability to form intermolecular aggregates appears to be correlated with conformational flexibility. Moreover, the mechanism governing the formation of protein polymers like collagen or silk is shown to be completely different from the one governing the formation of the aggregates typical of misfolding diseases. The latter is more similar to the mechanism governing the formation of multimeric enzyme complexes, which stresses the crucial role played by small aggregates in misfolding diseases, as suggested by Dobson (2003).
| MATERIALS AND METHODS |
|---|
We applied this kind of approach to two different situations: 1) the discrimination of different folding behavior of different proteins, and 2) the identification of crucial "hotspots" for aggregation behavior along the acylphosphatase (AcP) protein sequence.
In the first situation, 90 protein sequences of specific structural and functional classes were used for analysis: A, mainly α-helical structures; B, mainly β-sheet structures; C, proteins giving rise to long extracellular polymeric structures; D, self-aggregating systems (amyloid, serpins); E, natively unfolded proteins; F, proteins involved in DNA processing through supramolecular structures; G, artificial α-helices with very regular patterning of amino acids (Kamtekar et al., 1993); H, artificial β-sheets with very regular patterning of amino acids (West et al., 1999); and I, proteins known to possess π-helix structures (Fodje and Al-Karadaghi, 2002). (A note regarding the inclusion of π-helices: although the first eight groups are derived from a logical classification of protein groups, group I was included to address the concern that π-helices are an underreported structure of some structural as well as functional significance. The π-helix was also noted in the MD simulation of APP Aβ(1–40) under aggregation-prone conditions; see Table 1.)
Recurrence quantification analysis
RQA is a nonlinear time-series analysis method (Webber and Zbilut, 1994) which, in addition to protein sequence analysis, has been applied with success in a number of other fields ranging from physiology, to theoretical physics, to the analysis of reaction mechanisms. The method has been documented extensively; briefly:
The basis of the method is the projection of the original one-dimensional series into a multidimensional space constituted by successively lagged copies of the original sequence. This corresponds to the generation of the so-called embedding matrix (EM). The EM columns are, in order: a) the original series; b) the series shifted by one amino acid; c) the series shifted by two amino acids; and so on, until the chosen embedding dimension, which varies from three to eight consecutive shifts, is reached. Thus, the EM is a multivariate matrix whose rows (statistical units) are successive patches (or sliding windows) of amino acids with length equal to the embedding dimension, and whose columns (statistical variables) are the whole sequence lagged by successive delays. The EM is an M × N matrix, with M being the number of amino acids minus the embedding dimension (the last amino acids are eliminated by the shifting of the series due to the embedding procedure), and N the embedding dimension.
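The construction of the embedding matrix can be sketched as follows. This is a minimal illustration using NumPy, not the authors' software; the function name `embedding_matrix` is hypothetical. The sketch keeps every complete window, so a series of length L gives L − N + 1 rows at lag 1; dropping one further row would match the row count quoted above.

```python
import numpy as np


def embedding_matrix(series, dim=3, lag=1):
    """Build a delay-embedding matrix: column k is the series shifted by k*lag.

    Each row is therefore a sliding window of `dim` consecutive values
    (for lag=1), i.e., one point in the dim-dimensional embedding space.
    """
    x = np.asarray(series, dtype=float)
    n_rows = len(x) - (dim - 1) * lag  # only complete windows are kept
    if n_rows < 1:
        raise ValueError("series too short for this embedding")
    columns = [x[k * lag : k * lag + n_rows] for k in range(dim)]
    return np.column_stack(columns)


# Example: five values embedded in three dimensions give three rows.
em = embedding_matrix([1, 2, 3, 4, 5], dim=3)
```

Each row of `em` is one patch of three consecutive values, which is the statistical unit referred to in the text.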
The notion of recurrence, which underlies this technique, is well established (Kac, 1959). For any ordered series (temporal or spatial), a recurrence is defined as a point which repeats itself. Because recurrences are simply tallies, they make no mathematical assumptions. Given a reference point, X0, and a ball of radius r in an N-dimensional space, a point X is said to recur if it falls within that ball, i.e., if it belongs to

$$B_r(X_0) = \{X : \lVert X - X_0 \rVert \le r\} \qquad (1)$$
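The recurrence criterion can be made concrete with a short sketch, assuming a Euclidean norm and a radius given in the same units as the embedded points. Both choices are assumptions of this illustration, as are the function names; RQA software may rescale distances before applying the radius.

```python
import numpy as np


def recurrence_matrix(points, radius):
    """Boolean matrix whose (i, j) entry is True when point j lies within
    `radius` of point i (Euclidean distance between embedded rows)."""
    pts = np.asarray(points, dtype=float)
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return dist <= radius


def percent_recurrence(rec):
    """Percentage of distinct point pairs (i < j) that are recurrent.

    Assumes at least two points.
    """
    n = rec.shape[0]
    upper = np.triu_indices(n, k=1)  # each pair once, main diagonal excluded
    return 100.0 * rec[upper].sum() / len(upper[0])


# Three points in the plane: only the first two are within distance 1.
rec = recurrence_matrix([[0, 0], [0, 1], [3, 4]], radius=1.0)
rate = percent_recurrence(rec)
```

The main diagonal is excluded from the tally because every point trivially recurs with itself.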
These six indexes give a summary of the autocorrelation structure of the series.
The application of RQA requires the a priori setting of three measurement parameters: embedding dimension, radius, and line (the minimum number of adjacent recurrent points to be considered as deterministic). On the basis of studies of the maximal information content of protein sequences, as well as our previous analyses, these parameters were set to: embedding dimension, 3; radius, 6 (first minimum of DET as determined by a plot over radius values from 0 to 100; see Fig. 8, below); and line, 2 (Strait and Dewey, 1996; Giuliani et al., 2002; Zbilut et al., 2002; Weiss et al., 2000).
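The role of the line parameter can be illustrated with a sketch of a determinism (DET) calculation. It assumes the common definition of DET as the percentage of recurrent points that fall on diagonal segments of at least `min_line` adjacent points; the function name is hypothetical and the routine is not taken from the authors' software.

```python
import numpy as np


def percent_determinism(rec, min_line=2):
    """Percentage of recurrent points lying on diagonal lines of length
    >= min_line, counted over the upper triangle of a recurrence matrix."""
    rec = np.asarray(rec, dtype=bool)
    n = rec.shape[0]
    recurrent = 0  # all recurrent points above the main diagonal
    in_lines = 0   # recurrent points belonging to sufficiently long lines
    for offset in range(1, n):
        diagonal = np.diagonal(rec, offset=offset)
        recurrent += int(diagonal.sum())
        run = 0
        # A trailing False acts as a sentinel that closes the last run.
        for flag in list(diagonal) + [False]:
            if flag:
                run += 1
            else:
                if run >= min_line:
                    in_lines += run
                run = 0
    return 100.0 * in_lines / recurrent if recurrent else 0.0


# Toy matrix: two adjacent recurrences on one diagonal plus one isolated point.
toy = np.zeros((5, 5), dtype=bool)
toy[0, 1] = toy[1, 2] = True
toy[0, 3] = True
det = percent_determinism(toy, min_line=2)
```

With line set to 2, as in the text, any pair of adjacent recurrent points already counts as deterministic; raising `min_line` makes the criterion stricter.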
|
cwebber/.
The dynamic approach
Although the statistical information from the static analysis can be suggestive, we sought to compare it with the behavior of a clearly aggregating protein via molecular dynamics simulations (MDS). Aβ(1–40) is such a protein, and its relatively short length makes MD manipulation and analysis comparatively easy.
MD simulations
The amyloid β-peptide, Aβ(1–40) (Serpell, 2000), was investigated in aqueous solution at low, medium (pH ranges 2–4 and 5–6, respectively), and neutral pH by three molecular dynamics simulations in an N (number of particles), V (volume), T (temperature) ensemble at normal conditions. The starting configuration was the eighth nuclear magnetic resonance (NMR) model of PDB entry 1BA4, which is the closest to the average NMR structure (Coles et al., 1998).
The different pH environments were created by changing the protonation state of the ionizable residues according to their pKa. Thus, Glu and Asp residues were negatively charged at medium and neutral pH, and His residues were positively charged at low and medium pH. Lys and Arg residues were positively charged under all of the pH conditions considered. The site of protonation of all His residues was based on an analysis performed by the program WHATCHECK (Hooft et al., 1996). To select the protonated nitrogen, the structures were checked for the presence of possible hydrogen bonds by looking at the closest hydrogen bond acceptor. Histidines 6 and 13 were protonated at the N
position and residue 14 at the N
position.
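The assignment of charge states from pKa values can be sketched with a simple threshold rule. The pKa values below are approximate textbook side-chain values, and the three representative pH values (3.0, 5.5, 7.0) are assumptions chosen for illustration; neither is taken from the simulations described here.

```python
# Approximate model side-chain pKa values (textbook figures, assumed here).
MODEL_PKA = {"Asp": 3.9, "Glu": 4.3, "His": 6.0, "Lys": 10.5, "Arg": 12.5}

# Net side-chain charge when the group is protonated.
CHARGE_WHEN_PROTONATED = {"Asp": 0, "Glu": 0, "His": 1, "Lys": 1, "Arg": 1}


def side_chain_charge(residue, ph):
    """Return the dominant side-chain charge at a given pH.

    A group is treated as protonated whenever pH < pKa; losing the proton
    lowers the charge by one unit.
    """
    protonated = ph < MODEL_PKA[residue]
    charge = CHARGE_WHEN_PROTONATED[residue]
    return charge if protonated else charge - 1


# Dominant charges at assumed low, medium, and neutral pH values.
charges = {
    ph: {res: side_chain_charge(res, ph) for res in MODEL_PKA}
    for ph in (3.0, 5.5, 7.0)
}
```

Under these assumptions the rule reproduces the pattern described above: Asp and Glu are negative at medium and neutral pH, His is positive at low and medium pH, and Lys and Arg stay positive throughout.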
Each peptide was immersed in a rectangular box of pre-equilibrated SPC water molecules (Berendsen et al., 1981). Periodic boundary conditions were adopted, and the simulations were performed at constant temperature using Berendsen thermal coupling (Berendsen et al., 1984). To achieve charge neutrality, Na+ and Cl− counterions were added by replacing water molecules at the most negative and most positive electrical potential, respectively. The composition of the simulated systems is reported in Table 2.
Individual trajectories are defined as follows: Aβ(1–40) simulations at low, medium, and neutral pH are referred to as AB40L, AB40M, and AB40N, respectively.
Strategy of analysis
To compare the different simulations, the structures were clustered by the Jarvis-Patrick method applied to their root-mean-square deviation (RMSD) values (Jarvis and Patrick, 1973). The method allocates two structures to the same cluster if they are reciprocal first neighbors and share at least three common neighbors. A structure counts as a first neighbor of another simply by being among the 10 structures with the most similar RMSD. Each simulation lasts 10,000 ps and is sampled every 30 ps, and the results are expressed in terms of the successive visits of the trajectory to the clusters. This allows for an immediate appreciation of the relative configurational stability of the studied trajectory. An MD simulation that remains in the same cluster (or in a few clusters) for the whole simulation period points to a very stable situation and consequently to a very low number of configurational transitions. In contrast, an MD simulation characterized by a large number of clusters and rich dynamics between different configurations (clusters) points to a very flexible system.
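The clustering rule described above can be sketched as follows, assuming a precomputed pairwise RMSD matrix as input. The function name `jarvis_patrick` and the union-find bookkeeping are choices of this illustration, not details of the software actually used.

```python
import numpy as np


def jarvis_patrick(dist, n_neighbors=10, min_shared=3):
    """Cluster items from a square distance matrix (e.g., pairwise RMSD).

    Two items are joined when each is in the other's neighbor list and the
    two lists share at least `min_shared` members.
    """
    dist = np.asarray(dist, dtype=float)
    n = dist.shape[0]
    k = min(n_neighbors, n - 1)
    neighbors = []
    for i in range(n):
        order = [int(j) for j in np.argsort(dist[i], kind="stable") if j != i]
        neighbors.append(set(order[:k]))

    parent = list(range(n))  # union-find forest over the items

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            mutual = j in neighbors[i] and i in neighbors[j]
            if mutual and len(neighbors[i] & neighbors[j]) >= min_shared:
                parent[find(i)] = find(j)

    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]


# Toy example: two well-separated groups of five points on a line.
coords = np.array([0, 1, 2, 3, 4, 100, 101, 102, 103, 104], dtype=float)
toy_dist = np.abs(coords[:, None] - coords[None, :])
labels = jarvis_patrick(toy_dist, n_neighbors=4, min_shared=3)
```

The toy call uses four neighbors only because the example has ten points; the defaults correspond to the 10 neighbors and three shared neighbors quoted in the text. Reading the labels in trajectory order gives the sequence of cluster visits.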
To characterize the structural differences among trajectories, the RMSD per residue with respect to the NMR structure and the secondary structure content were calculated within each cluster. The analysis of secondary structure was done with the DSSP program (Kabsch and Sander, 1983).
| RESULTS AND DISCUSSION |
|---|
Both artificial α and β structures (groups G and H) were selected as synthetic (and thus extremely clean in terms of hydrophobicity distribution) examples of the two main secondary structure motifs. A relative similarity of the amyloid system profile to one of these two poles would suggest that one of the two "ideal" motifs is more fibrillation-prone than the other. The same reasoning underlies the choice of groups A and B, but with the accent on real, natural α and β structures, which are much noisier (in terms of hydrophobicity distribution) than the artificial polypeptides. Group C is made of proteins giving rise to large supramolecular structures, such as natural silk or collagen, and represents a clear example of self-aggregating systems. Unlike amyloidlike structures, these structures are mainly extracellular and are made of an extremely high number of monomers. Polymerizing systems are made of very repetitive patterns of amino acids, whereas the majority of natural proteins have quasirandom sequences.
Group E proteins come from the Dunker list of natively unfolded systems (Dunker et al., 2002). They are proteins that are completely unfolded in solution and perform their physiological roles through order-disorder transitions. The Dunker group has noted that the great majority of these unfolded systems are involved in many protein-protein or protein-DNA interactions. If amyloid systems have a recurrence spectrum analogous to these systems, we could hypothesize a link between molecular flexibility and the propensity to form intermolecular links. The same reasoning holds for group F, which, unlike the other groups, is a functional rather than a structural classification: proteins involved in DNA repair and, in general, in DNA transcription regulation are known to work through the formation of aggregates of different protein species. A similarity of group D amyloid-forming species to proteins of this group could be another indication of a common "signature" of aggregation.
When submitted to RQI, these proteins gave the results indicated in Fig. 2. From the figure it is evident that self-aggregating systems (group D) are very similar to DNA processing proteins (F) and π-helix proteins (I), all characterized by a scale-free flat spectrum devoid of major peaks. The apparent connection between amyloid and DNA-related enzymes may be linked to the capability of forming multimeric aggregates of globular proteins. The extracellular polymerizing systems (C) point to a completely different pattern of hydrophobicity distribution. The two synthetic groups (G and H) show very clear-cut peaks. This is consistent with the design of these artificial proteins, which is based on exactly repetitive motifs forming α-helices and β-sheets, respectively. To substantiate these qualitative observations, it is important to demonstrate that the recurrence spectrum differences are sufficient to discriminate between classes of proteins by means of a quantitative procedure. This was indeed the case when the spectra were submitted to a canonical discriminant analysis (CDA).
The first 60 recurrence intervals (this being the shortest molecule), with tallies being normalized as a percent of total recurrences, were submitted to CDA. The result was an almost perfect discrimination of the 90 proteins as depicted in Table 3.
|
-helices and natural β-sheets, are situated at the extremes of the axis: both situations correspond to very regular arrangement of hydrophobic/hydrophilic residues. Natural β-sheets, with longer periodicities than
s, however, are broken up in their natural setting. The H group (artificial βs) is clearly unique in that it is the most periodic, and clearly defines score 2 as spanning a regularity index. This opposition is of little interest for natural amyloid proteins, which are located at the center of the axis (D). This first analysis confirms the hypothesis that fibril-forming systems are not particularly specialized (like synthetic β-sheets or collagen-like systems) but represent relatively normal proteins (Dobson, 2003).
|
systems on the one hand and all the other proteins on the other. Again there is an opposition between "extremely regular" and "irregular" distributions, from the point of view of
-helix signature of hydrophobicity distribution.
|
-helices (A). This suggests that the recipe for building a natural amyloid system should mix some features of both
-helices and natively unfolded systems. Basically this recipe is not different from the recipe of proteins that for their normal behavior generate supramolecular complexes like DNA processing systems (F).
To synthesize all these observations, we computed a k-means cluster analysis on the data set constituted by the 90 proteins as statistical units and the first four canonical variates as variables. K-means clustering splits a set of objects into a selected number of groups by maximizing between-cluster variation relative to within-cluster variation. It is similar to doing a one-way analysis of variance where the groups are unknown and the largest F value is sought by reassigning members to each group. K-means starts with one cluster and splits it into two by picking the case farthest from the center as a seed for a second cluster and assigning each case to the nearest center. It continues splitting one of the clusters into two (and reassigning cases) until the specified number of clusters is formed. K-means reassigns cases until the within-groups sum of squares can no longer be reduced. The Euclidean distance was used as the distance measure. The results are presented in Table 4. The natural clusters in the four-dimensional space closely correspond to the a priori folding groups. Interestingly, the D group of amyloidlike proteins goes together with natural α-helices (A), DNA repair enzymes (F), and π-helix proteins (I).
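The splitting variant of k-means described above can be sketched in simplified form. This is an illustration under stated assumptions rather than the statistical package's exact routine: new seeds are taken as the case farthest from its current center, and cases are reassigned with ordinary Lloyd iterations using Euclidean distance. The function names are hypothetical.

```python
import numpy as np


def assign(data, centers):
    """Return, for each case, the index of and distance to the nearest center."""
    dist = np.linalg.norm(data[:, None, :] - centers[None, :, :], axis=2)
    return dist.argmin(axis=1), dist.min(axis=1)


def kmeans_by_splitting(data, n_clusters, max_iter=100):
    """Grow from one cluster to n_clusters, seeding each new cluster with
    the case farthest from its center, then reassigning until stable."""
    data = np.asarray(data, dtype=float)
    centers = data.mean(axis=0, keepdims=True)
    labels, nearest = assign(data, centers)
    while True:
        for _ in range(max_iter):
            new_centers = np.array([
                data[labels == c].mean(axis=0) if np.any(labels == c) else centers[c]
                for c in range(len(centers))
            ])
            new_labels, nearest = assign(data, new_centers)
            stable = (np.array_equal(new_labels, labels)
                      and np.allclose(new_centers, centers))
            centers, labels = new_centers, new_labels
            if stable:
                break
        if len(centers) == n_clusters:
            return labels, centers
        seed = data[nearest.argmax()]  # farthest case seeds a new cluster
        centers = np.vstack([centers, seed])
        labels, nearest = assign(data, centers)


# Toy data: two compact groups in the plane.
toy = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
toy_labels, toy_centers = kmeans_by_splitting(toy, n_clusters=2)
```

In the setting of the text, `data` would hold the 90 proteins as rows and the first four canonical variates as columns.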
|
-helices (A), and DNA processing enzymes (F) gives us a clear message on the nature of the protein-protein aggregation process taking place in the amyloidlike systems: These observations come from a statistical analysis of protein ensembles and, suggestive as they are, they are not completely clear. The clustering of groups A, D, F, and I suggests some commonalities, but not beyond the obvious initial classification. To understand these general conclusions, we now shift to a more local view.
Identification of aggregation hotspots
In a previous article we obtained preliminary evidence that singularities in determinism along the sequence can be equated with aggregation hotspots. This link came from the analysis of two prion-like 36-mers, for which the scaling of determinism with radius showed a very clear peak at very low radius, corresponding to a very high interaction probability confined to a very specific portion of the sequence (Zbilut et al., 2000). This peak was present in aggregation-prone peptides and disappeared when the sequence was randomly shuffled (Fig. 5). The presence of such singularities in determinism scaling was demonstrated for the Syrian hamster PrP protein as well.
An almost ideal model system to confirm these findings is human AcP, whose aggregation propensity was carefully analyzed by Chiti et al. (2002). These authors demonstrated the presence of mutational aggregation zones along AcP corresponding to residues 16–31 and 87–98: only mutations in these portions of the sequence are capable of significantly influencing the aggregation behavior of the protein. In the hydropathy profile of AcP, no unique feature of the curve characterizes these sites. In contrast, when the sequence is submitted to the windowed version of RQA with delay = 1, embedding = 3, epoch (window) = 28, overlap = 27 (i.e., shift = 1), unit normalization as scaling, and radius = 30, a unique determinism peak is evident at one of the aggregation-sensitive portions of the sequence (Fig. 6).
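A windowed determinism profile of this kind can be sketched as follows. The sketch assumes that the radius is a percentage of the largest distance found within each window and that line = 2; both are assumptions of this illustration and may differ from the scaling used in the actual analysis. Function names are hypothetical.

```python
import numpy as np


def window_determinism(window, dim=3, lag=1, radius_pct=30.0, min_line=2):
    """DET (%) for one window: embed, threshold distances, count diagonal runs."""
    x = np.asarray(window, dtype=float)
    rows = len(x) - (dim - 1) * lag
    emb = np.column_stack([x[k * lag : k * lag + rows] for k in range(dim)])
    dist = np.linalg.norm(emb[:, None, :] - emb[None, :, :], axis=2)
    max_dist = dist.max()
    if max_dist == 0:
        rec = np.ones_like(dist, dtype=bool)  # constant window: all points recur
    else:
        rec = dist <= (radius_pct / 100.0) * max_dist
    recurrent = 0
    in_lines = 0
    for offset in range(1, rows):
        run = 0
        # A trailing False closes the last run on each diagonal.
        for flag in list(np.diagonal(rec, offset=offset)) + [False]:
            if flag:
                run += 1
                recurrent += 1
            else:
                if run >= min_line:
                    in_lines += run
                run = 0
    return 100.0 * in_lines / recurrent if recurrent else 0.0


def windowed_determinism(series, epoch=28, shift=1, **kwargs):
    """Slide a window of length `epoch` along the series in steps of `shift`."""
    x = np.asarray(series, dtype=float)
    starts = range(0, len(x) - epoch + 1, shift)
    return [window_determinism(x[s : s + epoch], **kwargs) for s in starts]


# A strictly alternating toy signal is fully deterministic in every window.
profile = windowed_determinism([1.0, -1.0] * 15, epoch=28, shift=1)
```

Plotting such a profile against window position is what makes a localized determinism peak visible along the sequence.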
Chiti et al. (2002) were able to finely map, through mutational analysis, the zones of the molecule relevant for folding and the zones relevant for aggregation behavior. We attempted to interpret their data in light of the relative order/disorder status of the different zones along the sequence by using the PONDR computation of Dunker et al. (2002) to determine the disordered areas of AcP (Romero et al., 2001). The results obtained with the PONDR unfolding predictor algorithm are compared with the AcP recurrence plot (RP) in Fig. 8. What becomes immediately apparent is that the "disordered" zone approximately encompasses residues 30–75, which excludes the aggregationally important zones but approximates the zone relevant for folding. Moreover, the RP highlights these same areas by darkened patches of laminarity.
Additionally, RQI was performed on both groups of folding- and aggregation-affecting mutations to determine if there were any preferred interval lengths that affect aggregation behavior. This result is immediately related to the statistical comparisons described in the first section, allowing us to go from a general statistical perspective to a particular mechanistic one.
Fig. 9 reports the recurrence spectra relative to folding and aggregation mutants of AcP. As can be seen from the figure, the two behaviors correspond to different characteristic recurrence intervals. CDA analysis as performed for the previous nine groups was repeated for AcP. The procedure was able to distinguish the folding/aggregation zones again to a significant level (p = 0.03) (Table 6).
It is worth noting that the AB40M trajectory, unlike the other two simulations, presents as a dominant structure (at approximately the same frequency as the α-helix) the structure called the π-helix. This is considered a metastable helix, which was already noted in the Aβ(1–40) peptide and marks systems with a high propensity for conformational transitions (Fodje and Al-Karadaghi, 2002). Thus the dynamical approach is globally in line with the results of the static analysis regarding the nexus between aggregation/folding and conformational flexibility. The conformational flexibility adds another degree of stochastic variability.
The other point to emphasize in the case of Aβ(1–40) is the previously identified correspondence between disordered areas and determinism punctuated by short laminar areas. As can be seen from Fig. 11 and the record of the molecular dynamics simulation, laminar areas of the RP (roughly residues 5–15 and 20–30) exhibit the largest numbers of conformational changes. This does not contradict the above statement regarding AB40M as the richest source of conformational dynamics: irrespective of the overall cluster dynamics, the patchy areas are significant sources of the motions. The conjecture was confirmed by a Pearson correlation analysis of the three MD simulations with the laminar patches (AB40L, r = 0.735, p = 0.001; AB40M, r = 0.555, p = 0.007; AB40N, r = 0.569, p = 0.005; Bonferroni-adjusted probabilities; see also Fig. 12). This tends to substantiate the recent observation of Satheeshkumar and Jayakumar (2003) that the prion protein (113–127) exhibits polymorphic behavior, especially at acidic pH. In this respect, the correlation between deterministically laminar regions and the MDS histograms is highest for the acidic environment. Thus segmental motion does not necessarily translate into structural homogeneity. Of note is the appreciation of a
-helix in the 27–35 region, which is emphasized by a plot of laminarity (Fig. 13). This strong possibility is supported by research suggesting kinetic intermediate helical structures (Gazit, 2002).
| CONCLUSIONS |
|---|
In the present study, both these limitations have been obviated. Specifically, use was made of a signal analysis technique that overcomes the limitations of traditional techniques, which depend upon requirements of stationarity and relative periodicity. Secondly, the elegant work of the Oxford/Florence groups on AcP by means of site-directed mutagenesis has presented specific evidence for the existence of preferential areas important for folding vs. aggregation. We used this information to systematically analyze hydrophobicity patterns via a "static" approach, and carried these observations over to another molecule important in aggregation and fibril formation via a "dynamic" approach.
The results have confirmed the importance of hydrophobicity relative to its patterning along a protein chain. Specifically, an important criterion is what we have labeled its smoothness along the TREND/LAM dimension (Table 10). We derived this observation first from a comparative analysis of broadly classified protein groups, and it was confirmed by further analysis based on the results of Chiti et al. (2002). It becomes apparent that deterministic singularities can become nucleation centers depending upon how laminar or smooth they are. If they are relatively smooth with respect to deterministic patches, they tend to become important for the possibility of aggregation. This may be because residues in such patches maintain a relatively stable hydrophobic profile that favors nearby contacts (perhaps based on hydrophobic cores), breaking only at the termini of the patches. Folding zones, on the other hand, are characterized by broken patches of laminarity (made of the repetition of patches that are internally very diverse in terms of hydrophobicity). These areas are more likely to form connections beyond their immediate (short) patches. Interestingly, this profile corresponds roughly to the disordered zones identified by the work of Dunker et al. (2002) and their PONDR index. The more patchy zones, however, tend to be more liable to change (i.e., from disorder to order) because a mutation has a larger probability of stabilizing two short deterministic patches.
To prove these theoretical statements in the actual case of proteins we need to shift from the sequence analysis perspective to the study of the actual behavior of aggregating systems in solution in terms of both molecular dynamics simulations (MDS) and experimental spectroscopic data. Thus we must shift from what we have called the "static" to the "dynamic" approach.
These observations are further confirmed by application to the Aβ(1–40) peptide. Specifically, the pattern of patchy laminar (folding) versus relatively smooth deterministic (aggregating) zones continues to hold. Two aggregating areas are identified, with a central disordered area containing a unique deterministic singularity. Additionally, the unequal hydrophobic cores forming fibrils are further distinguished by the possibility that the longer patch is responsible for an intermediate
-helix as evidenced by MDS.
Thus the picture emerges that the laminarity of protein deterministic patches is a key to determining folding tendencies. This is supported by our previous work implying that mutations have variable effects depending upon their net effect on the patch (i.e., maintaining the laminarity versus breaking it; Zbilut et al., 1998). Additionally, our work with MDS and the tendency to form laminar blocks is also suggestive (Manetti et al., 2001). Clearly, however, the present results, although highly evocative, are not complete. The results are specific to the studied proteins, and generalization cannot be immediately assumed. Also, additional research on both AcP and Aβ(1–40) has raised questions about the role of charge or other electrostatic measures (Tycko, 2003; Massi et al., 2002). Further investigation into these areas using descriptive differential equations is currently under way. This investigation could give important clues not only for predicting, starting from the sequence, the aggregation propensity of a given system, and thus suggesting possible targets for drugs, but also, on a more speculative but nevertheless very important level, for understanding the dynamics of the so-called misfolding diseases. Along this path, Kellershohn and Laurent (2001) clearly demonstrated the non-Lipschitz character of prion infection and the consequent structural transition, albeit qualitatively. Similarly, Harrison et al. (2001) have proposed a model of "glassy" behavior of alternative states for folding behavior. These data are in agreement with our models for AcP and Aβ(1–40), allowing us to speculate on the existence of a common mechanism underlying a number of apparently disparate and heterogeneous folding/aggregation phenomena.
| ACKNOWLEDGEMENTS |
|---|
This work was supported by a joint Division of Mathematical Sciences/National Institute of General Medical Sciences initiative to support mathematical biology, from the National Science Foundation and National Institutes of Health (NSF DMS 0240230); J. P. Zbilut, Principal Investigator.
Submitted on July 14, 2003; accepted for publication August 7, 2003.
| REFERENCES |
|---|
Berendsen, H. J. C., J. P. M. Postma, W. F. van Gunsteren, and J. Hermans. 1981. Interaction models for water in relation to protein hydration. In Intermolecular Forces. B. Pullman, editor. D. Reidel Publishing Company, Dordrecht, The Netherlands. 331–342.
Bryngelson, J. D., J. N. Onuchic, N. D. Socci, and P. G. Wolynes. 1995. Funnels, pathways, and the energy landscape of protein folding: a synthesis. Proteins. 21:167–195.
Chikishev, A. Y., N. V. Netrebko, Y. M. Romanovsky, W. Ebeling, L. Schimansky-Geier, and A. V. Netrebko. 1998. Stochastic cluster dynamics of macromolecules. Int. J. Bifurc. Chaos. 8:921–926.
Chiti, F., N. Taddei, F. Baroni, C. Capanni, M. Stefani, G. Ramponi, and C. M. Dobson. 2002. Kinetic partitioning of protein folding and aggregation. Nat. Struct. Biol. 9:137–143.
Coles, M., W. Bicknell, A. A. Watson, D. P. Fairlie, and D. J. Craik. 1998. Solution structure of amyloid β-peptide(1–40) in a water-micelle environment. Is the membrane-spanning domain where we think it is? Biochemistry. 37:11064–11077.
Darden, T., D. York, and L. Pedersen. 1993. Particle mesh Ewald: an N*log(N) method for computing Ewald sums. J. Chem. Phys. 98:10089–10092.
Dima, R. I., and D. Thirumalai. 2002. Exploring protein aggregation and self-propagation using lattice models: phase diagram and kinetics. Protein Sci. 11:1036–1049.
Dobson, C. M. 2003. Protein folding and disease: a view from the first Horizon Symposium. Nat. Rev. Drug Discov. 2:154–160.
Dunker, K., C. J. Brown, D. Lawson, L. M. Iakoucheva, and Z. Obradovic. 2002. Intrinsic disorder and protein function. Biochemistry. 41:6573–6582.
Eckmann, J. P., S. O. Kamphorst, and D. Ruelle. 1987. Recurrence plots of dynamical systems. Europhys. Lett. 4:973–977.
Fodje, M. N., and S. Al-Karadaghi. 2002. Occurrence, conformational features and amino acid propensities for the π-helix. Protein Eng. 15:353–358.
Giuliani, A., R. Benigni, J. P. Zbilut, C. L. Webber, Jr., P. Sirabella, and A. Colosimo. 2002. Nonlinear signal analysis methods in the elucidation of protein sequence structure relationships. Chem. Rev. 102:1471–1491.
Giuliani, A., and M. Tomasi. 2002. Recurrence quantification analysis reveals interaction patterns in paramyxoviridae envelope glycoproteins. Proteins. 46:171–176.
Gazit, E. 2002. A possible role for π-stacking in the self-assembly of amyloid fibrils. FASEB J. 16:77–83.
Harrison, P. M., H. S. Chan, S. B. Prusiner, and F. E. Cohen. 2001. Conformational propagation with prion-like characteristics in a simple model of protein folding. Protein Sci. 10:819–835.
Hooft, R. W. W., G. Vriend, C. Sander, and E. E. Abola. 1996. Errors in protein structures. Nature. 381:272.
Jarvis, R. A., and E. A. Patrick. 1973. Clustering using a similarity measure based on shared near neighbors. IEEE Trans. Computers. C22:1025–1034.
Kabsch, W., and C. Sander. 1983. Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers. 22:2577–2637.
Kac, M. 1959. Probability and Related Topics in Physical Sciences. Wiley Intersciences, New York.
Kamtekar, S., J. M. Schiffer, H. Xiong, J. M. Babik, and M. H. Hecht. 1993. Protein design by binary patterning of polar and non-polar amino acids. Science. 262:1680–1685.
Kellershohn, N., and M. Laurent. 2001. Prion diseases: dynamics of the infection and properties of the bistable transition. Biophys. J. 81:2517–2529.
Kirkitadze, M. D., M. M. Condron, and D. B. Teplow. 2001. Identification and characterization of key kinetic intermediates in amyloid β-protein fibrillogenesis. J. Mol. Biol. 312:1103–1119.
Kyte, J., and R. F. Doolittle. 1982. A simple method for displaying the hydropathic character of a protein. J. Mol. Biol. 157:105–132.
Ma, B., and R. Nussinov. 2002. Stabilities and conformations of Alzheimer's β-amyloid peptide oligomers (Aβ16–22, Aβ16–35, and Aβ10–35): sequence effects. Proc. Natl. Acad. Sci. USA. 99:14126–14131.
Mandell, A. J., K. A. Selz, and M. F. Shlesinger. 2000. Protein binding predictions from amino acid primary sequence hydrophobicity. J. Mol. Liq