Biophys J, September 2002, p. 1268-1280, Vol. 83, No. 3
Sheet Using NMR Structures and Sequence Alignments
Institute for Physical Science and Technology, and Department of Chemistry and Biochemistry, University of Maryland, College Park, Maryland 20742 USA
ABSTRACT
Neurodegenerative diseases induced by transmissible
spongiform encephalopathies are associated with prions. The most
spectacular event in the formation of the infectious scrapie form,
referred to as PrPSc, is the conformational change from the
predominantly α-helical conformation of PrPC to the
PrPSc state that is rich in β-sheet content. Using
sequence alignments and structural analysis of the available nuclear
magnetic resonance structures of PrPC, we explore the
propensities of helices in PrPC to be in a β-strand
conformation. Comparison of a number of structural characteristics
(such as solvent accessible area, distribution of (φ, ψ) angles,
mismatches in hydrogen bonds, nature of residues in local and nonlocal
contacts, distribution of regular densities of amino acids, clustering
of hydrophobic and hydrophilic residues in helices) between
PrPC structures and a databank of "normal" proteins
shows that the most unusual features are found in helix 2 (H2)
(residues 172-194) followed by helix 1 (H1)
(residues 144-153). In particular, the C-terminal
residues in H2 are frustrated in their helical state. The databank of
normal proteins consists of 58 helical proteins, 36 α+β proteins,
and 31 β-sheet proteins. Our conclusions are also substantiated by
gapless threading calculations that show that the normalized Z-scores
of prion proteins are similar to those of other α+β proteins with
low helical content. Application of the recently introduced notion of
discordance, namely, incompatibility of the predicted and observed
secondary structures, also points to the frustration of H2 not only in
the wild type but also in mutants of human PrPC. This
suggests that the instability of PrPC proteins may play a
role in their being susceptible to the profound conformational change.
Our analysis shows that, in addition to the previously proposed role
for the segment (90-120) and possibly H1, the C-terminus of H2 and
possibly N-terminus may play a role in the α → β transition. An
implication of our results is that the ease of polymerization depends
on the unfolding rate of the monomer. Sequence alignments show that
helices in avian prion proteins (chicken, duck, crane) are better
accommodated in a helical state, which might explain the absence of
PrPSc formation over finite time scales in these species.
From this analysis, we predict that correlated mutations that reduce
the frustration in the second half of helix 2 in mammalian prion
proteins could inhibit the formation of PrPSc.
INTRODUCTION
Prions are infectious particles that are an
abnormal isoform, PrPSc, of the normal
host-encoded cellular prion protein PrPC
(Prusiner, 1997; Prusiner, 1998; Cohen, 1999). They are believed to be
associated with neurodegenerative diseases in humans and many mammals,
which are caused by transmissible spongiform encephalopathies (TSE).
Examples of TSEs are familial Creutzfeldt-Jakob disease (CJD) in
humans, scrapie of sheep, and bovine spongiform encephalopathy (BSE).
Because no nucleic acid is implicated in the transformation from
PrPC to PrPSc, the
"protein-only" hypothesis was proposed (Prusiner, 1997; Prusiner,
1998; Cohen, 1999), which argues that the same amino-acid sequence
adopts two distinct structures. A wealth of experimental data on
mammalian prions supports this hypothesis (Cohen and Prusiner, 1998).
The protein-only hypothesis has also been demonstrated in recent
studies on Saccharomyces cerevisiae by Weissman and coworkers (Sparrer et al., 2000). They showed that introduction of preconverted Sup35 leads to the formation of self-propagating
[PSI+], which apparently has prion-like characteristics.
The protein-only hypothesis implies that the conformational change
leading to the PrPSc formation from the normal
cellular form PrPC may be spontaneous or might
involve interactions with an unidentified protein X (Telling et al.,
1995). Although the mechanism of conversion of
PrPC into PrPSc and the
tertiary structure of PrPSc are not known,
numerous studies (Pan et al., 1993; Pergami et al., 1996) have shown
that the two isoforms have identical amino-acid sequences. However, the
structures of PrPC and
PrPSc are completely different. Nuclear magnetic
resonance (NMR) structures of several mammalian prion proteins
show that PrPC(121-231) is ~45% helical with
a very low (3-8%) β-sheet content (Riek et al., 1996, 1997; James
et al., 1997; Donne et al., 1997; Zahn et al., 2000). Using Fourier
transform infrared spectroscopy, it has been inferred that the
secondary structure of PrPSc (formed from
PrPC(90-231)) has ~50% β-sheet content
(Caughey et al., 1991; Gasset et al., 1993). An extraordinary
conformational change takes place in the transition from the normal
state to the infectious form. Determination of the NMR structures of
PrPC for a number of species (Syrian hamster,
mouse, bovine, and human) is significant because it potentially offers
hope of understanding, at the molecular level, the mechanism of
PrPC → PrPSc conversion.
Prion proteins, encoded by a single gene, consist of about 250 residues
of which the first 22 form a signal sequence. This is followed by
unstructured, but likely helical, Cu2+ binding
octarepeats rich in glycine (Prusiner, 1998). The NMR structure of the
remaining protein PrP(90-231) shows that the N-terminus is disordered
(Donne et al., 1997). The most ordered portion of the structure
consists of ~103 or 104 residues,
PrPC(121-231). The structures from this region
from three species (mouse, hamster, and human) show that
PrPC consists of three helices and two short
β-strands (Riek et al., 1996, 1997; James et al., 1997; Donne et al.,
1997; Zahn et al., 2000). Using the mouse prion protein as the
reference (Fig. 1), the three helices H1,
H2, and H3 span residues 144-153, 172-194, and 200-224,
respectively. The NMR structure (Fig. 1) shows that there are two small
anti-parallel β-strands (residues 128-131 and 161-164). The contact
map for the mPrP(121-231) structure (Fig. 2) shows that PrPC
has the characteristics of an α+β protein.
We selected (from the Protein Data Bank (PDB)) four prion
proteins with known tertiary structures: 1ag2 (mouse prion
protein, 103 residues), 1b10 (Syrian hamster prion
protein, 104 residues), 1qlz (human prion protein, 104 residues, the first of 20 NMR models), and 1qlx (human prion
protein, 104 residues, representative). According to Structural
Classification of Proteins (SCOP) (Murzin et al., 1995), they belong to
the α+β class of proteins, and, according to Homology Derived
Secondary Structure of Proteins (Sander and Schneider, 1991),
they are very similar because of 90% sequence identity between mPrP
and h1PrP, an 86% sequence identity between shPrP and h1PrP, and a
94% sequence identity between mPrP and shPrP. There is some
variability in the precise assignments of residues in the helices and
the strands, which depends on whether NMR structures correspond to
(90-231) or (121-231). The location of secondary structural elements
also differs depending on the refinements used. In this study, we use
the classification provided in the header of the PDB file (Berman et al.,
2000).
The purpose of our study is to identify those features of PrP that
might play an important role in its conversion to
PrPSc using sequence alignments and NMR
structures. To pinpoint the plausible regions of the
PrPC structures that are susceptible to
conformational fluctuations, we compare a number of characteristics of
the structural motifs in PrPC with those of
"normal" proteins. We determined, from a databank of proteins from
the PDB, the degree of solvent accessibility and the typical
(φ, ψ) angles for the 20 types of amino acids in α-helices and
in β-sheets. Comparison between the typical values and those for the
amino acids of the sequences of PrPC revealed
that some positions in the helices of prion proteins exhibit
characteristics that are distinct from what is normally observed.
Distribution of contacts in PrPC shows that they
differ from normal proteins especially in regard to the short-range
contacts. Most of these differences are localized in helix 2 (H2). Our
results suggest that the C-terminus of H2 is frustrated in the
experimentally found α-helical state. Because the bulk of the
stability of PrPC comes from the core, our
analysis suggests that nearly the whole protein might be involved in
the transition to PrPC* (a species that can
nucleate and polymerize). This implies that there is a large barrier
that separates PrPC* from
PrPC.
MATERIALS AND METHODS
Choice of databanks
The four NMR structures of PrPC analyzed are the mouse mPrP (103 residues, 1ag2), Syrian hamster shPrP (104 residues, 1b10), human h1PrP (104 residues, 1qlz), and human h2PrP (104 residues, 1qlx). The PDB entries follow the lengths of the proteins. For the entry h1PrP, we used the first of the 20 NMR structures, and for the entry h2PrP, we used the representative structure.
Because we are interested in comparing the prion proteins with normal
proteins (for which no alternative conformations leading to fibril
formation have been detected), we assembled databanks of mainly α,
mainly β, and α+β proteins (based on the SCOP classification) from
the PDB. From the resulting sets, we kept only those proteins that
satisfied the following criteria: 1) the experimental method for
determining the three-dimensional (3D) structure of the protein was
x-ray crystallography; 2) the protein is single-chain; 3) no more than
two residues are missing from the protein, and few atoms are missing
from its amino acids; and 4) the experimental
information about the secondary structure of the protein appears in the
PDB header.
With these criteria, we were left with 58 mainly α proteins whose PDB
entries are 1a32(OB), 1a43(OB), 1aa2(OB), 1aep, 1ail(OB), 1aum(OB),
1axn, 1ayi(OB), 1bbc, 1bd8, 1bea(OB), 1beo(OB), 1bfa(OB), 1bgc, 1bgd,
1bgf(OB), 1cei(OB), 1cem, 1csh, 1eca(OB), 1ecd(OB), 1fps(OB), 1gcb,
1hbg(OB), 1hik, 1huw, 1hyp(OB), 1hyt, 1ilk, 1ith, 1lh1(OB), 1lis, 1lki,
1lpe, 1lrv, 1mba(OB), 1mzl(OB), 1nfn, 1pbv, 1poa, 1r69(OB), 1rcb,
1utg(OB), 1vlk, 1vls, 2abk, 2asr, 2bct, 2cro(OB), 2end(OB), 2fha, 2int,
2lhb(OB), 2mhr, 451c(OB), 4cpv(OB), 4icb(OB), 5cyt(OB). The OB
indicates that the architecture of the protein is orthogonal bundle
according to Orengo et al. (1997).
The 36 normal α+β proteins in our comparison set have the PDB
entries 119l, 1ah6, 1ako, 1cby, 1cof, 1ctf, 1fkb, 1gcb, 1han, 1kuh,
1lba, 1lbu, 1mrj, 1npk, 1opd, 1pgx, 1pkp, 1ptf, 1sap, 1snc, 1tif, 1uae,
1ubi, 1vcc, 1vhh, 2aak, 2hpr, 2phy, 2sn3, 3cla, 3lzm, 3ssi, 4bp2, 5pti,
7rsa, 9rnt.
The PDB entries of 31 mainly β proteins in our databank are 1aac,
1amm, 1aol, 1bfg, 1eur, 1fna, 1gpc, 1gpr, 1hoe, 1idk, 1jpc, 1knb, 1kum,
1lcl, 1mai, 1nif, 1pdr, 1pmi, 1srl, 1tul, 1wba, 1whi, 1who, 1xnb,
2ayh, 2cba, 2cna, 2cpl, 2mcm, 2prd, 2sns.
Structural analysis
Using these three databanks, we determined some of the
attributes, namely, the degree of solvent exposure and the (φ, ψ)
angle combination, of the typical amino acid (for each of the 20 types of amino acids) in helices and in strands. Because the N-cap and C-cap
of helices present special features compared to all other positions in
a helix (see, for example, the study of Aurora and Rose, 1998, and
references therein) and we were interested in obtaining the information
about a typical residue in a helix, we treated these two positions as
belonging to the loop category. We also treated helices shorter than
six residues as loops.
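The relabeling rule above can be sketched as follows. This is a minimal illustration, not the code used in the study: the one-letter encoding ('H' helix, 'E' strand, 'L' loop), the function name `relabel_helices`, the choice to take the first and last annotated helix positions as the N-cap and C-cap, and the choice to apply the six-residue cutoff before trimming the caps are all assumptions.

```python
from itertools import groupby


def relabel_helices(ss: str, min_len: int = 6) -> str:
    """Relabel helix caps and short helices as loop.

    ss is a per-residue secondary-structure string using the assumed
    encoding 'H' (helix), 'E' (strand), 'L' (loop).
    """
    out = []
    for label, run in groupby(ss):
        n = len(list(run))
        if label != "H":
            # Strands and loops are left unchanged.
            out.append(label * n)
        elif n < min_len:
            # Helices shorter than six residues are treated as loops.
            out.append("L" * n)
        else:
            # Assumption: the first and last annotated helix positions
            # stand in for the N-cap and C-cap and join the loop class.
            out.append("L" + "H" * (n - 2) + "L")
    return "".join(out)
```

With this convention an eight-residue helix contributes six positions to the helix statistics, and a five-residue helix contributes none.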
Regular density
To decipher residue-residue interactions, Baud and Karlin (1999)
introduced several density measures that give an indication of the
environments. Following these authors, we calculated the regular
density for the nine structural classes formed from the three secondary
structures (α-helix, β-strands, and loops) and the degree of
solvent exposure (buried (b), partially buried (pb), and exposed (e)).
The regular density for an amino acid i (i = 1, 2, ... , 20) in each of
the nine environmental classes is (Baud and Karlin, 1999)
(1) [equation not recoverable from this copy]
where [...] is the structure, T is the threshold value
(= 5 Å), Θ(x) is the Heaviside function, and
NT is the number of structures in the
dataset. We calculated the distribution of [...] is measured in terms of the variable
(2) [equation not recoverable from this copy]
If [...] 1 [...] 1, then there is
significant deviation from normal behavior. If a sequence harbors many
mismatches, then we consider it to be frustrated in that structural
element.
element. We used the following classification of the amino acids:| |
RESULTS |
|---|
|
|
|---|
Characteristics of normal proteins
To ascertain whether PrPC proteins have any unusual structural characteristics, we first computed several characteristics of proteins in our database. These serve as a basis to illustrate the unusual features, if any, of the prion proteins.
Sequence composition
An obvious property to consider is the overall amino-acid
composition. This is relevant because a proteomic sequence analysis of
31 organisms (Michelitsch and Weissman, 2000) revealed a large content
of glutamine/asparagine residues in yeast prions. The unusually high
content of Gln and Asn suggests that the regions containing these
residues may trigger the initial nucleation in the transition to the
β-structures that leads to self-propagation in yeast prions
(Michelitsch and Weissman, 2000). The composition of
PrPC is closer to that of all β proteins or to
those α+β proteins that have a higher strand than helix content.
In many α+β proteins, Ala is a typical helix former (Jiang et al.,
1998). However, in PrPC only H3 contains a single
Ala, and none of the helices is amphipathic. In mPrP, H1 is made up of
three hydrophobic, one polar and six charged residues and has 3.66 residues/turn, and H2 is made up of six hydrophobic, two charged and 15 polar residues and has 3.92 residues/turn. Helix 3 is made up of seven
hydrophobic, eight charged and ten polar residues and has 5.06 residues/turn. Although prion proteins have slightly unusual
composition compared to a typical α+β protein, the differences are
not dramatic.
Solvent exposure ratio for the amino acids in a helix or strand in normal proteins
To characterize the degree of solvent exposure of an amino acid in
a protein, we used R = A_S/A_R,
where A_S is the solvent-accessible area calculated using the Lee and Richards (1971)
algorithm, and A_R is the area of the same type (X) of
amino acid in a Gly-X-Gly extended conformation. The values of
A_R were taken from the literature (Creighton, 1993). Each position, excluding the ends of the chain, in a
protein in its folded conformation has R between 0 and 1. We
computed R for the 20 amino acids from the helices of the 58 mainly
α proteins, and of the 36 α+β proteins. These
distributions obey the known percentages of buried positions (defined
by R being less than 0.05) for each type of amino acid. For
example, Val is buried in 55% of cases, whereas Lys is buried in a
proportion of only 3% in the mainly α proteins. The distribution of
R values for the hydrophobic residues is peaked at small
values of R, with negligibly small peaks at larger values.
For polar and charged amino acids, the distribution is much broader
with a peak at larger values than the ones for the hydrophobic amino acids.
For hydrophobic amino acids, the typical value of R is identified with the median value, whereas, for polar and charged amino acids, the average of the distribution gives the typical ratio (the actual values are in the caption for Fig. 3). For Ala, Gly, and Met, neither of these two definitions led to a reliable typical value.
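The bookkeeping behind these quantities can be sketched as below. This is a minimal illustration under stated assumptions: the accessible areas are taken as given (no accessibility algorithm is implemented), and the function names are hypothetical. Only the ratio R, the 0.05 burial cutoff, and the median-versus-mean rule come from the text.

```python
from statistics import mean, median

BURIED_CUTOFF = 0.05  # a position with R below this value counts as buried


def exposure_ratio(accessible_area: float, reference_area: float) -> float:
    """R = A_S / A_R: accessible area over the Gly-X-Gly reference area."""
    return accessible_area / reference_area


def buried_fraction(ratios: list[float]) -> float:
    """Fraction of observed positions of one amino-acid type that are buried."""
    return sum(r < BURIED_CUTOFF for r in ratios) / len(ratios)


def typical_ratio(ratios: list[float], hydrophobic: bool) -> float:
    """Median for hydrophobic residues, mean for polar and charged ones."""
    return median(ratios) if hydrophobic else mean(ratios)
```

The median is the natural choice for a distribution sharply peaked near zero with a few outliers, which is the shape described for the hydrophobic residues; the mean suits the broader polar and charged distributions.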
Distribution of the (φ, ψ) angles for the typical amino acid in helix or strand
We computed the distribution of the (φ, ψ) angles for each
type of amino acid in the helices and strands from the three datasets described in Methods. The limits of the (φ, ψ) pair for an amino acid (other than Gly and Pro) in an
α-helix are −80° ≤ φ ≤ −48°, −59° ≤ ψ ≤ −27°. The range for a
β-sheet is −150° ≤ φ ≤ −90°, 90° ≤ ψ ≤ 150° (Srinivasan
and Rose, 1995).
We divided the range for each of the two angles in a helix in intervals
of 10°, which led to nine different classes of (φ, ψ) angles. We
assigned to each amino acid a number between 1 and 9, based on its
(φ, ψ) angle. If the angles were outside the specified range of
all classes, an arbitrarily large number (50) was assigned. We obtained
the distribution of these numbers for the 20 amino acids in the helices
of the 58 mainly α proteins, and of the 36 α+β proteins. The
median value, computed using the distribution, was chosen to be the
typical angle for that amino acid. This method of computing the typical
angles is more meaningful than obtaining averages over the distribution
of (φ, ψ) angles. The typical (φ, ψ) angles obtained from the
58 all α proteins using this procedure are: for Cys, Phe, Leu, Trp,
Val, Ile, Met, His, Tyr, Asn, Thr, Arg, Gln, Asp, and Glu,
−70° < φ ≤ −59°, −49° < ψ ≤ −38°, and for Ala, Ser, and
Lys, −70° < φ ≤ −59°, −38° < ψ ≤ −27°. The
corresponding values from the α+β databank are very similar.
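The class assignment for helical residues can be sketched as follows. This is an illustration under stated assumptions: the bin edges are read off the helical limits and the typical intervals quoted in the text, the numbering of the nine classes (row-major in φ, then ψ) is arbitrary, and the low median is used so that the typical class is always an actual class number. The function names are hypothetical.

```python
from bisect import bisect_left
from statistics import median_low

# Assumed bin edges in degrees, three intervals per angle.
PHI_EDGES = (-80.0, -70.0, -59.0, -48.0)
PSI_EDGES = (-59.0, -49.0, -38.0, -27.0)
OUTSIDE = 50  # arbitrarily large label for angles outside the helical range


def _interval(value: float, edges: tuple) -> "int | None":
    """Index of the interval (lo, hi] containing value, or None if outside."""
    if not edges[0] <= value <= edges[-1]:
        return None
    return max(bisect_left(edges, value) - 1, 0)


def helix_class(phi: float, psi: float) -> int:
    """Map a (phi, psi) pair to a class in 1..9, or to OUTSIDE."""
    i = _interval(phi, PHI_EDGES)
    j = _interval(psi, PSI_EDGES)
    if i is None or j is None:
        return OUTSIDE
    return 3 * i + j + 1


def typical_class(classes: list) -> int:
    """Median class number over all occurrences of one amino-acid type."""
    return median_low(classes)
```

Working with class labels rather than raw angles is what makes the median robust here: a few residues labeled 50 shift the median far less than they would shift an average over the angles themselves.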
For each of the two angles in a strand, we divided the above
limits in intervals of 10°, which led to 25 different classes of
(φ, ψ) angles. We attributed to each amino acid a number between 1 and 25, based on which class corresponded to its (φ, ψ) angles. If
the angles fell outside the range, a score of 50 was given. From the
distribution of these numbers for the 20 types of amino acids in the
strands of the 31 mainly β proteins and of the 36 α+β proteins,
the median value was calculated. The typical angle for a given amino
acid in these structures corresponds to the median value.
Structural mismatches in helices of PrPC
The distribution of R values and the angles adopted by amino acids in the helical and strand structures give a calibration of what is typically expected in the database of normal structures. A comparison of these values with those adopted by amino acids in the helices and strands of prion proteins gives hints of plausible structural anomalies.
If R (a measure of solvent accessibility) for an amino
acid in PrPC is different from the typical values
R ± ΔR (the typical values and the
values of the tolerance ΔR are given in the caption of Fig. 3), then we consider it a mismatch. Similarly, if the
(φ, ψ) angle for an amino acid falls outside the typical values,
it is a mismatch. Using these criteria, we find that between 13 and 16 amino acids are mismatches in the four prion proteins (16 out of 52 positions in helices in the mPrP, 16 out of 56 positions in helices in
h1PrP, 13 out of 55 positions in helices in shPrP and 15 out of 56 positions in helices in h2PrP). The differences between R for
each position and the typical value of R for the amino acid
at that position in the helices of the four prion proteins are shown in
Fig. 4. We repeated this analysis for all
the helices in 58 proteins from the mainly α
databank and in the 36 proteins from the α+β databank. The resulting histogram of all the
percent mismatches in angles (that is, the number of mismatches
N_mism divided by the length of the
helix, L_helix) shows that the prion proteins belong to the tail of the distribution (Fig. 5). A histogram for the mismatches in
R also shows that the prion proteins belong to the tail of
the distribution corresponding to values larger than the average of
20% (data not shown). An analysis of the amino acids in the
strands did not reveal any anomalies in prion proteins compared to the
typical α+β protein.
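The mismatch counting described above can be sketched as follows. This is a minimal illustration with hypothetical function names. It assumes that an R mismatch means a deviation from the typical value larger than the tolerance, and that an angle mismatch means a class label different from the typical class for that amino acid.

```python
def r_mismatch(r: float, typical_r: float, tolerance: float) -> bool:
    """True if R lies outside typical_r +/- tolerance."""
    return abs(r - typical_r) > tolerance


def angle_mismatch(angle_class: int, typical_class: int) -> bool:
    """True if the residue's (phi, psi) class differs from the typical class."""
    return angle_class != typical_class


def percent_mismatch(flags: list) -> float:
    """100 * N_mism / L_helix for one helix, given one flag per position."""
    return 100.0 * sum(flags) / len(flags)
```

As a check on scale, the 16 mismatches reported for the 52 helical positions of mPrP correspond to about 31%.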
Environment-dependent structural characteristics reveal unusual number of mismatches in PrPC
Baud and Karlin (1999) introduced nine structural categories to
quantify the environmental propensities of amino acids in folded
proteins. They are based on the three standard secondary structure
states (helix, strand, and loop), and on three side-chain solvent-accessibility levels: As ≤ 10% for the buried state (b),
10% < As ≤ 40% for the partly buried state
(pb), and As > 40% for the exposed
state (e).
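A residue can be assigned to one of the nine categories as sketched below. This is an illustration, not code from the cited work: the function names and the one-letter secondary-structure labels are hypothetical, and the boundaries at 10% and 40% are assumed to be inclusive on the lower class.

```python
def exposure_level(access_pct: float) -> str:
    """Side-chain accessibility level: 'b', 'pb', or 'e'."""
    if access_pct <= 10.0:
        return "b"   # buried
    if access_pct <= 40.0:
        return "pb"  # partly buried
    return "e"       # exposed


def structural_class(ss: str, access_pct: float) -> tuple:
    """Combine a secondary-structure label ('H', 'E', or 'L') with the
    exposure level, giving one of 3 x 3 = 9 environment classes."""
    return (ss, exposure_level(access_pct))
```

For example, a helical residue with 55% side-chain accessibility falls in the class ('H', 'e'), the exposed-helix class discussed in connection with Fig. 6.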
There are clear differences among the various types of amino acids in these nine structural classes in terms of number of neighbors and their identities (aromatic, hydrophobic, polar, positively charged, and negatively charged). Using this approach, we calculated the typical number of neighbors (within 5 Å) and the corresponding standard deviations for the amino acids in each of the nine structural classes from the database of proteins. We compared these reference results with the structural characteristics of amino acids in PrPC in terms of mismatches. The resulting histogram of such mismatches shows that there is a similarity between this analysis and the one in Fig. 5 (data not shown). There are significant deviations from the expected behavior of the regular density (defined in Methods) for amino acids exposed in the helices of PrPC. This is particularly dramatic for exposed hydrophobic residues (Fig. 6 A), buried negatively charged residues (data not shown) and exposed negatively charged residues (Fig. 6 B). These two figures suggest that, in PrPC, there are more mismatches between the structural preferences of amino acids and their actual structural environment than are found in normal proteins. Thus, both secondary-structure analysis and environmental analysis that reflects tertiary structure preferences suggest anomalies in PrPC.
Hydrogen bonds
The present analysis shows that amino acids at some sites in the
PrPC present unusual properties compared to the
average amino acid of the same type in normal proteins. This is
exemplified in the degree of solvent accessibility, i.e., some polar
amino acids (e.g., Glu-146 and Asp-147 in the first helix of mPrP),
which are typically exposed, are buried in PrPC.
It follows that the α-isoform of PrPC should
have many unsatisfied buried hydrogen-bond donors/acceptors. To assess
this, we examined the hydrogen-bond characteristics of normal proteins.
The analysis of the all α protein 1aa2 (108 residues) with the
WHAT CHECK program (Hooft et al., 1996) reveals seven
unsatisfied buried hydrogen-bond donors/acceptors, whereas, for the
α+β protein 1fkb (107 residues), the number is eight, and for the
all β protein 1aac (105 residues), the corresponding number is eight.
McDonald and Thornton (1994) showed that, in normal proteins, only a
low percentage (~6%) of the total number of residues have
unsatisfied buried hydrogen-bond donors/acceptors. The WHAT
CHECK analysis for prion proteins shows 15 (14%) unsatisfied buried hydrogen-bond donors/acceptors in the mPrP (1ag2), also 15 (14%) in the shPrP (1b10), 6 (5.8%) and 9 (8.7%) in the human prion
(h2PrP and h1PrP, respectively). We find that mPrP and shPrP have more
than twice the usual amount of unsatisfied buried hydrogen-bond
donors/acceptors, whereas the human prion protein behaves more like an
average protein in this respect. There is good agreement between the
identity of the sites revealed by this analysis and the problem sites
identified by mismatches in R and the distribution of
(φ, ψ) angles. As an example, the WHAT CHECK
analysis for mPrP reveals unsatisfied hydrogen bonds at positions 130, 139, 141, 142, 143, 145, 151, 155, 161, 166, 170, 174, 183, 187, and
219 (Fig. 4) and we found 130, 145, 151, 174, 183, 187, and 219 to have
mismatches in terms of R and the (φ, ψ) angles.
Local versus nonlocal contacts
We consider two residues to be in contact if they have at least
two of the heavy atoms from their side chains within 5.2 Å. Contacts
between residues separated along the sequence by less than (more than) 20% of the length of
the protein are classified as local (nonlocal). The
tendency of PrPC to undergo the α → β transition
suggests that these structures may only be marginally stable (Cohen et
al., 1994), and hence the distribution of contacts, which points to the
intrinsic stability of proteins, is useful.
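The local/nonlocal split can be sketched as follows. This is a minimal illustration with hypothetical function names. It takes the list of contacting residue pairs as given, and it assumes that a separation exactly equal to 20% of the chain length counts as nonlocal, a case the text leaves open.

```python
from collections import Counter


def contact_range(i: int, j: int, length: int, fraction: float = 0.20) -> str:
    """Classify a contact between residues i and j in a chain of given length."""
    return "local" if abs(i - j) < fraction * length else "nonlocal"


def count_by_range(contacts: list, length: int) -> dict:
    """Count local and nonlocal contacts in a list of (i, j) residue pairs."""
    counts = Counter(contact_range(i, j, length) for i, j in contacts)
    return {"local": counts["local"], "nonlocal": counts["nonlocal"]}
```

For the 103-residue mPrP structure the cutoff is 20.6 residues, so a contact between residues 144 and 150 within H1 is local, whereas the pair Cys-179 and Cys-214 (35 residues apart) is nonlocal.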
We calculated the number and type of the short- and long-range order
contacts in prion proteins and in the proteins from the three
databases. The results, shown in Table 1
and Fig. 7, show that, for a typical
protein (whether all α, α+β, or all β), the number of
short-range (or local) contacts is about twice as large as the number
of long-range (or nonlocal) contacts, whereas, in PrPC, the number of short-range contacts is
approximately equal to that of the number of long-range contacts. More
importantly, the nature of residues that are involved in local contacts
in PrPC is strikingly different from the ones in
normal proteins. In normal proteins the most probable local contacts
are exclusively made up of hydrophobic residues, which is not the case
in prion proteins (Table 1). The numbers of HP (hydrophobic (H) and
polar (P)) and HH contacts are approximately equal in prion proteins, whereas, in normal proteins, there are more HH contacts than HP contacts. The percentages of local contacts of types +−, H−,
and PP are much higher in PrPC than in normal
proteins.
To put the above striking observations about PrPC
in perspective, it is necessary to assess whether similar
characteristics can be found in normal proteins. We searched the
dataset to find instances of local and nonlocal contacts involving
amino acids that are not typically associated with each other. There
are a few proteins with similar characteristics as
PrPC. They are 1axn, 1mzl, 2asr, and 5cyt from
the all α database; 1fkb, 1vhh, 2cpl, and 9rnt from the α+β
database; and 1aac and 1hoe from the all β
database. Most of these proteins have, at short-range, comparable numbers of HP and HH contacts
and many PP contacts, just like PrPC. The all α
proteins also have many local H+ and P+ type of contacts, but none of
them has as many +− and H− contacts as PrPC.
Proteins that appear most similar to PrPC are
2cpl and 1hoe, but neither of them has such a large percentage of +−
contacts as prion proteins. Note that all of the above-mentioned
proteins from the α+β class have a much lower helical content than
PrPC: 1fkb has 1 helix, the rest being sheet,
1vhh is 19% helical and 16% sheet, 2cpl is 12% helical and 27%
sheet and 9rnt has just one helix, the rest being sheet.
This analysis reveals some important characteristics of
PrPC: they have an unusually large number of +−
local contacts; they also have far from typical percentages of both
local and nonlocal HP, H−, PP, and P− contacts; and the percentages of
local H− and P− contacts are similar to what is seen in proteins that
have a much higher sheet content than prion proteins.
Nature of contacts in H1, H2, and H3 in PrPC
The unusual local and nonlocal contacts (+−, HP, H−, PP,
P−) in PrPC are localized in the three helices.
Because we are interested in regions of instability, we examined the
nature of local and nonlocal contacts in H1, H2, and H3. The idea is to
compare the pair of the most probable type of short-range and the most
probable type of long-range contact in each helix from
PrPC with the same pair from the helices of all
other proteins. We define as short-range any contact between an amino
acid of a helix with any other amino acid belonging to the same helix
(including the N- and the C-cap of the helix), and as long-range the
contacts with the amino acids outside the helix. We extracted, from the prion proteins and from all the proteins in the all α
and α+β
datasets, the pair of the most probable types of short and long-range contacts for every helix. We then counted how often each of the pairs
seen in PrPC appears in the other proteins. The
combined results, presented in Tables 2
and 3, suggest that H2 is the most
different from average, followed by H1 and then by H3.
Contacts among H1, H2, and H3
The α → β structural transition in
PrPC(90-231) leads to ~50% β-sheet content
in PrPSc, which suggests that a relatively large
number of the residues in the α-helices must rearrange into a
β-strand conformation. Previous studies, based in part on the
observation that H1 is unusually hydrophilic (Billeter et al., 1995;
Morrissey and Shakhnovich, 1999), have already proposed that, in
PrPSc, H1 is likely to play a role in the
transition. Here, we find that the unusual nature of short- and
long-range contacts in H2 makes it frustrated in the helical state.
Therefore, it is interesting to analyze the contacts among the three
helices to estimate the effects that a structural transition
(α → β) in a helix might have on the other helices. The arrangement
of the helices in PrPC gives them an orthogonal
bundle (OB) architecture (Orengo et al., 1997). Therefore, we selected
47 proteins with this type of architecture from the CATH database
(Orengo et al., 1997). Of these, 27 are among the 58 all α proteins
from our dataset. We calculated the number of interhelical contacts in
each of these 47 proteins and in the four prion proteins. A plot of the
number of contacts formed by a helix versus its length and the
percentage of stabilizing (that is, HH or +−) contacts among these
(data not shown) shows that all three helices from
PrPC are similar (in this respect) to helices
from other proteins.
The analysis of the interhelical contacts reveals that there are many
more (H−) interhelical contacts in prion proteins than in other
proteins (most of which are due to the contacts between H1 and H3), and
that the percentage of HH type of contacts in PrPC is somewhat increased (due to the contacts
between H2 and H3). The large majority of stabilizing (HH and +−)
contacts occur between the amino acids of the first half of H2
(172-183) and those of the second half of H3 (213-224) (around the
disulfide bond between Cys-179 and Cys-214). Contacts between H2 and H3 are very similar (in terms of number and type) to the interhelical contacts seen in other proteins having the same architecture as
PrPC, suggesting that any structural
transformation involving H2 (especially its first half, i.e., residues
172-183) is likely to affect H3 also (especially its second half,
i.e., residues 213-224).
Clustering of hydrophilic and hydrophobic residues suggests that H2 is frustrated
Istrail et al. (1999)
have noted that one of the main features of
an aggregation-prone sequence is that the hydrophobic and hydrophilic
residues are clustered into a few large groups. To assess the
clustering of hydrophobic residues, we looked for how homogeneously the
hydrophobic residues are distributed in each of the two halves of the
three helices in PrPC in comparison to the other
proteins in the dataset. Because there is a rather broad distribution
of helix lengths in proteins, to describe the distribution of
hydrophobic amino acids in each helix, we used the quantity
(3) [equation not recoverable from this copy]
[...] H is presented in Fig. 8. In the helices of normal proteins, the
hydrophobic residues are uniformly distributed in the two halves. Two
of the helices in prion proteins (H1 and H3) obey this rule very well.
The clear difference is in H2, which has nearly all its hydrophobic
amino acids clustered in one half. To check the generality of this
observation, we calculated
H for all the
helices (with minimum length of 23 amino acids) in PDB. Using the
Database of Structural Motifs in Proteins (DSMP), we found 7854 such
helices among the 12,904 proteins in the PDB. The corresponding
histogram of
H (Fig. 8) is very similar to the
one obtained using only the helices in our dataset of proteins.
Evaluating the structural properties of PrPC using threading
Because of our reliance on environmental classes, a threading
study that is based on profiles rather than on specific interactions between proteins would supplement our conclusions. To this end, we use
the standard 3D-1D scores introduced by Eisenberg and co-workers (Bowie
et al., 1991).
To use the profile method for threading, we first determined for each
protein from our dataset the environment in each of its positions. We
performed gapless threading of all the prion sequences on all possible
fragments of structures from the dataset. For scoring, we used the
3D-1D scores, which encode the likelihood of finding each of the 20 amino
acids in each of the 18 possible environments (Bowie et al., 1991). For
comparison, we also threaded all the proteins in
our dataset.
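The gapless threading step described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the two environment labels and all numerical values in `TOY_SCORES` are invented, whereas a real 3D-1D table covers 18 environment classes and 20 amino acids (Bowie et al., 1991).

```python
from typing import Dict, List, Tuple

# Toy 3D-1D table: score[environment][amino_acid].
# The classes "B" and "E" and every value below are invented for
# illustration only.
TOY_SCORES: Dict[str, Dict[str, float]] = {
    "B": {"L": 1.0, "K": -1.0},   # hypothetical buried class
    "E": {"L": -0.5, "K": 0.5},   # hypothetical exposed class
}


def gapless_thread(seq: str, envs: str,
                   table: Dict[str, Dict[str, float]]) -> List[Tuple[int, float]]:
    """Score seq against every contiguous window of envs, with no gaps.

    envs is the string of per-position environment classes of a
    structure. Returns a list of (window start, summed score) pairs.
    """
    n = len(seq)
    results = []
    for start in range(len(envs) - n + 1):
        window = envs[start:start + n]
        score = sum(table[env][aa] for env, aa in zip(window, seq))
        results.append((start, score))
    return results


# Thread a two-residue toy sequence over a four-position toy structure.
scores = gapless_thread("LK", "BEBB", TOY_SCORES)
best = max(scores, key=lambda item: item[1])
```

Because no gaps are allowed, a sequence of length n threaded on a structure of length m yields exactly m - n + 1 scores, and this set of scores is what a Z score would later be computed from.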
According to the scores from the threading analysis, the prion proteins are
very similar to those α+β proteins of similar length that
have little helical content compared with their strand content. The
R3D-1D scores for all α and α+β
proteins having a high helical content (and of similar length to the
prion proteins) are well above 30, which is large compared to the
scores of 20 obtained for prion proteins. An estimator of the stability
of proteins in their native-state conformations is the Z
score (Bowie et al., 1991),
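$$Z = \frac{R_N - \langle R \rangle}{\sigma} \qquad (4)$$

where $R_N$ is the score of the sequence in its native conformation, $\langle R \rangle$ is the average score of
the sequence over all other conformations but the native one, and
$\sigma$ is the corresponding standard deviation. A stable protein is
characterized by a large and positive Z.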
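As a minimal sketch of a Z score of this kind, assume it takes the usual form: the native score minus the mean score over the non-native conformations, divided by their standard deviation. The function name `z_score` and the use of the population standard deviation are assumptions; the text does not say which estimator was used, nor how the Z scores were normalized.

```python
from statistics import mean, pstdev
from typing import Sequence


def z_score(native_score: float, decoy_scores: Sequence[float]) -> float:
    """Return (native - mean(decoys)) / std(decoys).

    decoy_scores holds the scores of the same sequence on all
    non-native conformations. The population standard deviation is
    used here as an assumption.
    """
    sigma = pstdev(decoy_scores)
    if sigma == 0:
        raise ValueError("decoy scores have zero spread")
    return (native_score - mean(decoy_scores)) / sigma
```

For example, `z_score(5.0, [1.0, 3.0])` evaluates to 3.0: the decoys have mean 2 and standard deviation 1, so the native score lies three standard deviations above them.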
The normalized Z scores for the majority of α+β
proteins, with helical/strand content similar to that of
PrPC, are considerably higher than for prion
proteins (data not shown). This suggests that
PrPC, which is rich in helices, is not stable
in its native-state conformation. The normalized Z of 0.7 for PrPC resembles the normalized Z
score for other α+β proteins of similar length and with very low
helical content.
The goodness of the fit between a sequence and a structure can be
assessed using 3D profile scores, which, for correct models, increase
with the length of a protein (Luthy et al., 1992). The R3D-1D scores for the stable proteins
in the PDB are proportional to the sequence length (L). A
plot of the R3D-1D scores for the proteins in our dataset versus L can be fit using
|
(5) |
helical state.
Helix 2 is frustrated in PrPC mutants: Analysis using PHD
It is known that inherited human TSEs (familial CJD, GSS, and FFI)
are associated with mutations in the PRNP gene. According to
SWISS-PROT, seven of the point mutations (D178N, V180I, T183A, H187R, T188R, T188K, T188A) are found in H2 (residues 172-194). A naive application of the helical propensities of Chou and Fasman (1978) would
suggest that, except for D178N, all these point mutations should
lead to better helix formation. However, reliable secondary-structure prediction requires context-dependent propensities based on
multiple sequence alignments, as used in PHD (Profile network from
Heidelberg) (Rost and Sander, 1993). Kallberg et al. (2001)
have also
argued that H2 is "frustrated" in the helical conformation, whereas
H1 and H3 are not. This conclusion was reached by searching 1324 protein structures for sequences
of ≥7 residues that are predicted by
PHD to be in β-strands but are experimentally determined to be in
α-helices. This α/β discordance (or mismatch) was proposed to be
associated with amyloid fibril formation (Kallberg et al., 2001). We
measure the degree of α/β discordance or frustration using
|
(6) |
If Sα/β = 4 and the sequence is
experimentally determined to be in a helical conformation, then the
particular sequence is frustrated (or maximally discordant) in the
predicted secondary structure; Sα/β = 0 is marginal. Negative
values of Sα/β imply that PHD is
unreliable in this prediction.
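Equation 6 is not reproduced in this text. One form that is consistent with the limits stated above (a maximum of 4, a marginal value of 0, and negative values for unreliable predictions) is the mean of R_i - 5 over the residues of the segment, where R_i is a per-residue reliability index on a 0-9 scale for the β-strand prediction. Both the form and the offset of 5 are assumptions made for this sketch, as is the function name `discordance_score`; they should not be read as the paper's definition.

```python
from typing import Sequence


def discordance_score(strand_reliability: Sequence[int], offset: int = 5) -> float:
    """Mean of (R_i - offset) over a segment observed to be helical.

    R_i is an assumed 0-9 reliability index for a beta-strand
    prediction at residue i. offset=5 is an assumption chosen so that
    the score tops out at 4 and equals 0 at the marginal level.
    """
    if not strand_reliability:
        raise ValueError("segment is empty")
    if any(not 0 <= r <= 9 for r in strand_reliability):
        raise ValueError("reliability indices must lie in 0..9")
    return sum(r - offset for r in strand_reliability) / len(strand_reliability)
```

Under these assumptions, a seven-residue helical segment predicted to be strand with reliability 9 at every position scores 4.0 (maximally discordant), and one predicted with reliability 5 throughout scores 0.0 (marginal).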
It has already been noted by Kallberg et al. that wild-type H2 is
discordant in mouse and Syrian hamster. We calculated the Sα/β scores from the PHD
secondary-structure predictions for the point mutants in H2
(Fig. 9). The Sα/β scores are 1.83, 1.94, 1.80, 1.30, 1.80, 1.54, 1.94, and 1.94 for the WT, D178N, V180I, T183A,
H187R, T188K, T188R, and T188A, respectively. In all these
mutants, just as in the WT, helix 2 is discordant or frustrated (Fig. 1). The
differences in the Sα/β scores
suggest that the biophysical characteristics of
PrPC mutants can differ greatly (Liemann and
Glockshuber, 1999). However, in all cases, H2 is frustrated.
DISCUSSION
Experimental evidence
A key experimentally testable prediction of our analysis is that,
in addition to the previously identified residues, even segments of the
relatively rigid core of PrPC, especially the
C-terminal residues of H2, might play a role in the transition to an
assembly-competent state, i.e., one that can nucleate and polymerize.
This result and the observation that the secondary structure of
PrPSc has a high percentage of β-sheet content
imply that the whole protein molecule unfolds substantially in going
from PrPC to an aggregation-prone state. Although
direct and unequivocal experimental validation is currently not
available, many distinct lines of evidence lend support to our
theoretical analysis. These include backbone hydrogen/deuterium (H/D)
exchange dynamics, NMR relaxation measurements, and CD spectroscopy of
the states in the PrPC → PrPSc transition.
Backbone H/D exchange dynamics
It has been postulated that the formation of fibrils in amyloids and prion proteins occurs from (at least partially) populated intermediates, i.e., PrPC → [I] → PrPSc
(Hornemann and Glockshuber, 1998).
1.5 × 103 (Hosszu et al., 1999
KUN, then
exchange occurs only from the unfolded PrPC
rather than from an equilibrium or off-pathway intermediate. Remarkably, it was found that only about 10 residues (Hosszu et al.,
1999
-sheet architecture led them to conclude that "complete or near-complete unfolding must precede rearrangement of
the amyloidogenic intermediate."
Biophysical studies
NMR structures show that the N-terminal residues (90-120) in PrPC are disordered (Riek et al., 1996
-sheet structure (Hornemann and
Glockshuber, 1998
Unfolding of PrPC
Because the core of PrPC might play a role in the conformational change, it follows that substantial unfolding must precede the formation of a species,
-PrP (Baskakov et al.,
2001
-helical PrPC and
-PrP. The time scale for
-PrP, the