Biophys J, September 2000, p. 1196-1205, Vol. 79, No. 3

A Kinetic Model of Vertebrate 20S Proteasome Accounting for the Generation of Major Proteolytic Fragments from Oligomeric Peptide Substrates

Hermann-Georg Holzhütter and Peter-Michael Kloetzel

Humboldt-Universität zu Berlin, Medizinische Fakultät (Charité), Institut für Biochemie, D-10117 Berlin, Germany


    ABSTRACT

There is now convincing evidence that the proteasome contributes to the generation of most of the peptides presented by major histocompatibility complex class I molecules. Here we present a model-based kinetic analysis of fragment patterns generated by the 20S proteasome from oligomeric substrates 20 to 40 residues long. The model consists of ordinary first-order differential equations describing the time evolution of the average probabilities with which fragments can be generated from a given initial substrate. First-order rate laws are used to describe the cleavage of peptide bonds and the release of peptides from the interior of the proteasome to the external space. Numerical estimates for the 27 unknown model parameters are determined across a set of five different proteins with known cleavage patterns. When the validity of the model is tested by a jack-knife procedure, about 80% of the observed fragments are correctly identified, whereas the abundance of false-positive classifications is below 10%. From our theoretical approach, it is inferred that double-cleavage fragments of length 7-13 are predominantly cut out in "C-N-order", i.e., the C-terminus is generated first. This is due to striking differences in the further processing of the two fragments generated by the first cleavage. The upstream fragment exhibits a pronounced tendency to escape from second cleavage, as indicated by a large release rate and a monotone exponential decline of peptide bond accessibility with increasing distance from the first scissile bond. In contrast, the release rate of the downstream fragment is about four orders of magnitude lower, and the accessibility of peptide bonds shows a sharp peak at a distance of about nine residues from the first scissile bond. This finding strongly supports the idea that generation of fragments with well-defined lengths is favored because temporary immobilization of the downstream fragment after the first cleavage renders it susceptible to a second cleavage.


    INTRODUCTION

The proteasome is an intracellular multisubunit protease that catalyzes selective proteolytic protein processing within various cellular signal-transducing pathways, such as cell cycle control, transcriptional regulation, and antigen presentation (Ciechanover, 1994; Goldberg et al., 1995; Coux et al., 1996). Hydrolytic cleavage of peptide bonds takes place in the 20S core proteasome, a barrel-shaped protein complex made up of four staggered rings, each composed of seven subunits. Recognition and unfolding of protein substrates and their translocation into the 20S core complex are mediated by regulatory protein complexes such as 11S and 19S, which, in vivo, may associate with one or both ends of the 20S core (for a recent review see, e.g., Baumeister et al., 1998). The function of the eukaryotic proteasome as a supplier of epitopes presented by the major histocompatibility complex (MHC) class I has greatly intensified experimental work aimed at elucidating the structural and kinetic basis for the high selectivity with which the proteasome cuts out antigenic peptides from precursor proteins (Dick et al., 1994; Eggers et al., 1995; Niedermann et al., 1995; Kuckelkorn et al., 1995; Ossendorp et al., 1996; Groettrup et al., 1996). Nevertheless, a quantitative theoretical model to account for the observed patterns of cleavage fragments is still lacking. In vitro digests of model substrates by the 20S proteasome have provided evidence that the cleavage preference for a given type of peptide bond depends upon the amino acid motif in a larger sequence window around this bond (Shimbara et al., 1997). Based on this finding, we have recently developed a statistical approach to identify cleavage-determining amino acid motifs around the scissile bond (Holzhütter et al., 1999).
This approach has led to the establishment of a mathematical function that relates the overall probability for the cleavage of a given peptide bond to the generic side-chain properties "volume" and "transfer energy" of the bond-flanking amino acid residues. However, knowledge of potential cleavage sites does not suffice for the prediction of potential fragments. Fragment patterns derived from in vitro digests (Ehring et al., 1995; Niedermann et al., 1996; Theobald et al., 1998; Sijts et al., 2000) have provided evidence that the number of major double-cleavage fragments, i.e., those produced in amounts sufficient for identification as individual high performance liquid chromatography (HPLC) fractions, is very small compared with the number of double-cleavage fragments that would result if all possible combinations of cleavage sites were used with equal efficiency. This finding points to the existence of constraints on the consecutive use of cleavage sites, in that cleavage of a peptide at any of the active sites determines the extent to which the peptide bonds of the two resulting successor fragments are accessible for further cleavage at a neighboring active site. In this paper, we present a simplified kinetic scheme for the generation of double-cleavage fragments by the 20S proteasome, which explicitly takes into account such possible correlations between the type and the spatial distance of the two peptide bonds defining the fragment.

Kinetic model

The kinetics of fragment generation by the 20S proteasome is considered as a stop-and-go process, i.e., a new substrate molecule cannot be taken up before all degradation products of the preceding substrate molecule have been released into the extra-proteasomal space. This assumption avoids having to include binding competition explicitly in the model. Furthermore, we restrict our analysis to those double-cleavage fragments (DCFs) that are cut out from the initial substrate by two immediately consecutive cleavages. In this case, there are only two alternative routes of DCF generation, depending on whether the C-terminus or the N-terminus is formed first. This is illustrated in Fig. 1, where $S_{1,n} = \{R_1:R_2:\cdots:R_n\}$ denotes the initial substrate and $S_{i,j}$ [$i \neq 1$, $j \neq n$] is an arbitrary double-cleavage fragment. Generation of the DCF $S_{i,j}$ in C-N-order means that the first cleavage occurs at the P1 residue in sequence position $j$. This results in the formation of the terminal fragment $S_{j+1,n}$ and the so-called N-fragment (or downstream fragment) $S_{1,j}$, which already possesses the C-terminus of the later DCF. Alternatively, fragment generation in N-C-order means that the first cleavage occurs at the P1 residue in sequence position $i-1$, resulting in the formation of the terminal fragment $S_{1,i-1}$ and the so-called C-fragment (or upstream fragment) $S_{i,n}$, which already possesses the N-terminus of the later DCF.



FIGURE 1   Kinetic scheme for the generation of double-cleavage fragments from an oligomeric precursor protein. The double-cleavage fragment $S_{i,j} = \{R_i:R_{i+1}:\cdots:R_j\}$ ($j \geq i$) is considered to be formed from the initial substrate $S_{1,n} = \{R_1:R_2:\cdots:R_n\}$ by two subsequent cleavages. Depending on the order of these two cleavages, one may distinguish between two cleavage routes. C-N-order: the peptide bond $R_j:R_{j+1}$ is cleaved first, resulting in formation of the terminal fragment $S_{j+1,n}$ and the intermediary N-fragment $S_{1,j}$, which may yield the DCF $S_{i,j}$ after second cleavage at $R_{i-1}:R_i$. N-C-order: the peptide bond $R_{i-1}:R_i$ is cleaved first, resulting in formation of the terminal fragment $S_{1,i-1}$ and the intermediary C-fragment $S_{i,n}$, which may yield the DCF $S_{i,j}$ after second cleavage at $R_j:R_{j+1}$. $c_j(i,k)$ denotes the rate for cleavage of fragment $S_{i,k}$ at peptide bond $R_j:R_{j+1}$ ($i \leq j < k$); $r(i,k)$ denotes the release rate of fragment $S_{i,k}$.

To establish kinetic equations associated with the reaction scheme in Fig. 1, we introduce the time-dependent probabilities pi,j and p*i,j to observe the fragment Si,j at time t either inside or outside the proteasome if the substrate S1,n was taken up at zero time. Taking into account the release of fragments into the extra-proteasomal space and neglecting DCF formation by more than two cleavages, the time-dependent evolution of pi,j and p*i,j is governed by the following set of ordinary first-order differential equations:
\[
\frac{d}{dt}\,p_{i,j} = c_{i-1}(1,j)\,p_{1,j} + c_j(i,n)\,p_{i,n} - r_{i,j}\,p_{i,j} \qquad [i \neq 1,\ j \neq n] \tag{1}
\]

\[
\frac{d}{dt}\,p_{1,j} = c_j(1,n)\,p_{1,n} - \Bigl[r_{1,j} + \sum_{k=1}^{j-1} c_k(1,j)\Bigr]\,p_{1,j} \qquad [j \neq n] \tag{2}
\]

\[
\frac{d}{dt}\,p_{i,n} = c_{i-1}(1,n)\,p_{1,n} - \Bigl[r_{i,n} + \sum_{k=i}^{n-1} c_k(i,n)\Bigr]\,p_{i,n} \qquad [i \neq 1] \tag{3}
\]

\[
\frac{d}{dt}\,p_{1,n} = -\Bigl[r_{1,n} + \sum_{k=1}^{n-1} c_k(1,n)\Bigr]\,p_{1,n} \tag{4}
\]

\[
\frac{d}{dt}\,p^*_{i,j} = r_{i,j}\,p_{i,j} \qquad [i = 1 \ldots n,\ j = i \ldots n] \tag{5}
\]
Here, $c_j(i,k)$ denotes the cleavage rate of peptide bond $R_j:R_{j+1}$ in fragment $S_{i,k}$ ($i \leq j < k$) and $r_{i,j}$ is the release rate for fragment $S_{i,j}$.

The equation system 1-5 has to be solved for the initial conditions $p_{1,n}(t = 0) = 1$, $p_{i,j}(t = 0) = 0$ [$i \neq 1$, $j \neq n$]. The homogeneous first-order differential Eq. 4 can be directly integrated, yielding
\[
p_{1,n} = \exp(-\alpha_n t) \tag{6}
\]
where
\[
\alpha_j = r_{1,j} + \sum_{k=1}^{j-1} c_k(1,j) \tag{7}
\]
With $p_{1,n}$ given by expression 6, Eqs. 1-3 possess the general form
\[
\frac{dp}{dt} = \sum_i A_i \exp(-B_i t) - q\,p, \tag{8}
\]
where the $A_i$, $B_i$, and $q$ are time-independent constants. For the initial condition $p(t = 0) = 0$, the inhomogeneous differential Eq. 8 is solved by
\[
p = \sum_i \frac{A_i}{q - B_i}\,\bigl[\exp(-B_i t) - \exp(-q t)\bigr]. \tag{9}
\]
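That Eq. 9 solves Eq. 8 can be checked by direct substitution. The short verification below reads the loss term of Eq. 8 as $-q\,p$ and assumes $q \neq B_i$ for every $i$; it is a consistency check added for the reader, not part of the original derivation.

```latex
\begin{aligned}
\frac{dp}{dt} + q\,p
 &= \sum_i \frac{A_i}{q-B_i}\Bigl[-B_i\,e^{-B_i t} + q\,e^{-q t}\Bigr]
  + \sum_i \frac{q\,A_i}{q-B_i}\Bigl[e^{-B_i t} - e^{-q t}\Bigr] \\
 &= \sum_i \frac{A_i\,(q-B_i)}{q-B_i}\,e^{-B_i t}
  = \sum_i A_i\,e^{-B_i t},
 \qquad
 p(0) = \sum_i \frac{A_i}{q-B_i}\,[1-1] = 0 .
\end{aligned}
```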
Thus the solutions of Eqs. 2 and 3 read
\[
p_{1,j} = \frac{c_j(1,n)}{\alpha_j - \alpha_n}\,\bigl[\exp(-\alpha_n t) - \exp(-\alpha_j t)\bigr] \tag{10}
\]
and
\[
p_{i,n} = \frac{c_{i-1}(1,n)}{\beta_i - \alpha_n}\,\bigl[\exp(-\alpha_n t) - \exp(-\beta_i t)\bigr], \tag{11}
\]
where
\[
\beta_i = r_{i,n} + \sum_{k=i}^{n-1} c_k(i,n) \tag{12}
\]
Inserting expressions 10 and 11 into Eq. 1 and again using the general formula 9, we get
\[
\begin{aligned}
p_{i,j} ={}& \frac{c_{i-1}(1,j)\,c_j(1,n)}{\alpha_j - \alpha_n}
 \left[\frac{\exp(-\alpha_n t) - \exp(-r_{i,j} t)}{r_{i,j} - \alpha_n}
 - \frac{\exp(-\alpha_j t) - \exp(-r_{i,j} t)}{r_{i,j} - \alpha_j}\right] \\
 &+ \frac{c_j(i,n)\,c_{i-1}(1,n)}{\beta_i - \alpha_n}
 \left[\frac{\exp(-\alpha_n t) - \exp(-r_{i,j} t)}{r_{i,j} - \alpha_n}
 - \frac{\exp(-\beta_i t) - \exp(-r_{i,j} t)}{r_{i,j} - \beta_i}\right].
\end{aligned}
\tag{13}
\]
Finally, the probability $p^*_{i,j}$ to find the double-cleavage fragment $S_{i,j}$ in the external compartment is obtained by direct integration of Eq. 5,
\[
\begin{aligned}
p^*_{i,j} ={}& r_{i,j}\int_0^t dt'\,p_{i,j}(t') \\
 ={}& \frac{r_{i,j}\,c_{i-1}(1,j)\,c_j(1,n)}{\alpha_j - \alpha_n}
 \left[\frac{1 - \exp(-\alpha_n t)}{\alpha_n (r_{i,j} - \alpha_n)}
 - \frac{1 - \exp(-r_{i,j} t)}{r_{i,j} (r_{i,j} - \alpha_n)}
 - \frac{1 - \exp(-\alpha_j t)}{\alpha_j (r_{i,j} - \alpha_j)}
 + \frac{1 - \exp(-r_{i,j} t)}{r_{i,j} (r_{i,j} - \alpha_j)}\right] \\
 &+ \frac{r_{i,j}\,c_j(i,n)\,c_{i-1}(1,n)}{\beta_i - \alpha_n}
 \left[\frac{1 - \exp(-\alpha_n t)}{\alpha_n (r_{i,j} - \alpha_n)}
 - \frac{1 - \exp(-r_{i,j} t)}{r_{i,j} (r_{i,j} - \alpha_n)}
 - \frac{1 - \exp(-\beta_i t)}{\beta_i (r_{i,j} - \beta_i)}
 + \frac{1 - \exp(-r_{i,j} t)}{r_{i,j} (r_{i,j} - \beta_i)}\right].
\end{aligned}
\tag{14}
\]
The observation times used in the in vitro digest experiments range from 15 min to several hours and thus are orders of magnitude larger than the characteristic times $\alpha_j^{-1}$, $\beta_i^{-1}$, and $r_{i,j}^{-1}$ determining the time dependence of $p^*_{i,j}$. Therefore, in the following we take the quasistationary limit ($t \to \infty$) of Eq. 14:
\[
\tilde{p}^*_{i,j} = \tilde{p}^*_{i \leftarrow j} + \tilde{p}^*_{i \to j}, \tag{15}
\]
where $\tilde{p}^*_{i \leftarrow j}$ and $\tilde{p}^*_{i \to j}$ are the quasistationary probabilities for the generation of fragment $S_{i,j}$ in C-N-order and N-C-order, respectively:
\[
\tilde{p}^*_{i \leftarrow j} = \frac{c_{i-1}(1,j)\,c_j(1,n)}{\alpha_n\,\alpha_j}, \tag{16}
\]

\[
\tilde{p}^*_{i \to j} = \frac{c_j(i,n)\,c_{i-1}(1,n)}{\alpha_n\,\beta_i}. \tag{17}
\]
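The long-time limit can be checked directly from Eq. 13 with one elementary integral. The sketch below is shown for the C-N route only and assumes that all rates are positive and pairwise distinct; the N-C route follows in the same way with $\beta_i$ in place of $\alpha_j$.

```latex
\int_0^\infty \frac{e^{-a t} - e^{-r t}}{r - a}\,dt
 = \frac{1}{r-a}\Bigl(\frac{1}{a} - \frac{1}{r}\Bigr) = \frac{1}{a\,r}
\quad\Longrightarrow\quad
\tilde{p}^*_{i \leftarrow j}
 = \frac{r_{i,j}\,c_{i-1}(1,j)\,c_j(1,n)}{\alpha_j - \alpha_n}
   \Bigl(\frac{1}{\alpha_n r_{i,j}} - \frac{1}{\alpha_j r_{i,j}}\Bigr)
 = \frac{c_{i-1}(1,j)\,c_j(1,n)}{\alpha_n\,\alpha_j} .
```

The release rate $r_{i,j}$ of the DCF cancels, so in this limit the yield depends only on the cleavage rates and on the total loss rates of the substrate and the intermediate.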
According to Eqs. 16 and 17, the probability of cutting out the double-cleavage fragment $S_{i,j}$ is high if (i) the cleavage rates at the P1 residues $R_{i-1}$ and $R_j$ yielding the N and C terminus of the fragment are large, and (ii) the factors $\alpha_j$ and $\beta_i$ are small, i.e., the cleavage rates at all P1 residues except those defining the two ends of the DCF must be small to prevent degradation of the intermediary fragments $S_{1,j}$ and $S_{i,n}$ to other fragments.
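Eqs. 7, 12, 16, and 17 translate almost literally into code. The following is a minimal sketch, not the authors' implementation: the function name `quasi_stationary` and the callables `c(j, i, k)` and `r(i, k)` are hypothetical, and the constant toy rates are chosen only to make the arithmetic easy to follow.

```python
from typing import Callable, Tuple

# c(j, i, k): cleavage rate of bond R_j:R_{j+1} in fragment S_{i,k}
# r(i, k):    release rate of fragment S_{i,k}
CleavageRate = Callable[[int, int, int], float]
ReleaseRate = Callable[[int, int], float]


def quasi_stationary(i: int, j: int, n: int,
                     c: CleavageRate, r: ReleaseRate) -> Tuple[float, float]:
    """Return the (C-N-order, N-C-order) probabilities of Eqs. 16 and 17."""
    if not (1 < i <= j < n):
        raise ValueError("need 1 < i <= j < n for a double-cleavage fragment")

    def alpha(m: int) -> float:
        # Eq. 7: total loss rate (release + all cleavages) of S_{1,m}
        return r(1, m) + sum(c(k, 1, m) for k in range(1, m))

    def beta(m: int) -> float:
        # Eq. 12: total loss rate (release + all cleavages) of S_{m,n}
        return r(m, n) + sum(c(k, m, n) for k in range(m, n))

    p_cn = c(i - 1, 1, j) * c(j, 1, n) / (alpha(n) * alpha(j))  # Eq. 16
    p_nc = c(j, i, n) * c(i - 1, 1, n) / (alpha(n) * beta(i))   # Eq. 17
    return p_cn, p_nc


# Toy example with hypothetical, sequence-independent rates.
p_cn, p_nc = quasi_stationary(2, 3, 5,
                              c=lambda j, i, k: 1.0,
                              r=lambda i, k: 0.5)
print(p_cn, p_nc, p_cn + p_nc)  # the last value is the total of Eq. 15
```

With these toy rates the loss rates are 4.5 for the substrate, 2.5 for the N-fragment, and 3.5 for the C-fragment, which illustrates point (ii): the route through the intermediate with the smaller loss rate receives the larger share.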

Evidently, the probability p*i,j to derive the double-cleavage fragment Si,j from the initial substrate S1,n equals the amount of this fragment formed relative to the amount of initial substrate utilized. To be identifiable in the experiment, this amount of a peptide has to exceed the background noise. In regression statistics, the common practice to relate binary yes-or-no events to the value of a continuous explanatory variable is to use a logistic-type function (Efron, 1975). Accordingly, we define the probability of a double-cleavage fragment Si,j to be observable in a long-term digestion experiment by
\[
\mathrm{Pobs}_{i,j} = \frac{1}{1 + \exp\bigl(p_c - \tilde{p}^*_{i,j}\bigr)}, \tag{18}
\]
where $p_c > 0$ is a properly chosen cut-off value. A fragment $S_{i,j}$ will be classified as observable when its observation probability is larger than 0.5. In general, the cut-off value $p_c$ will differ for various fragments because the retention time in HPLC analysis and the average ion current in mass spectrometry depend upon the specific side-chain properties of the constituting amino acids. In this paper, we refrained from a detailed consideration of sequence-dependent effects on the experimental identification of fragments and instead used a single cut-off value for all proteolytic fragments.
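The classification rule of Eq. 18 can be sketched in a few lines. The function names and the numerical values below are hypothetical and serve only as an illustration; the point is that an observation probability above 0.5 is equivalent to the quasistationary probability exceeding the cut-off.

```python
import math


def p_obs(p_tilde: float, p_c: float) -> float:
    """Observation probability of Eq. 18."""
    return 1.0 / (1.0 + math.exp(p_c - p_tilde))


def is_observable(p_tilde: float, p_c: float) -> bool:
    """A fragment counts as observable when Eq. 18 exceeds 0.5."""
    return p_obs(p_tilde, p_c) > 0.5


# Hypothetical cut-off and fragment probabilities.
cutoff = 0.3
for p in (0.2, 0.3, 0.4):
    print(p, p_obs(p, cutoff), is_observable(p, cutoff))
```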

The derivation of an empirical rate law for the cleavage rates $c_j(i,k)$ is guided by the consideration that efficient cleavage of the peptide bond $R_j:R_{j+1}$ in fragment $S_{i,k}$ is determined by three factors: the accessibility, $\mathrm{ac}_j(i,k)$, of the bond to an active site capable of cleaving it; the affinity, $\mathrm{af}_j(i,k)$, with which the fragment $S_{i,k}$ binds to this active site; and the catalytic rate, $\mathrm{cr}_j(i,k)$, with which the bond is hydrolyzed. Hence, we put
\[
c_j(i,k) = \mathrm{ac}_j(i,k)\,\mathrm{af}_j(i,k)\,\mathrm{cr}_j(i,k) \tag{19}
\]
with
\[
\mathrm{ac}_j(1,n) = 1, \tag{20}
\]

\[
\mathrm{ac}_j(1,k) = \exp\left[-\frac{(k - j - L_N)^2}{2\sigma_N^2}\right], \tag{21}
\]

\[
\mathrm{ac}_j(k,n) = \exp\left[-\frac{(j - k - L_C)^2}{2\sigma_C^2}\right], \tag{22}
\]

\[
\mathrm{af}_j(i,k) = \Theta\Bigl(\mathrm{CP}_j(i,k) - \frac{1}{2}\Bigr)\,\mathrm{CP}_j(i,k), \tag{23}
\]

\[
\mathrm{cr}_j(i,k) = \mathrm{cr}(R_j) = \mathrm{cr}_j. \tag{24}
\]
Assumption 20 means that all peptide bonds of the initial substrate are accessible to an active site capable of cleaving them. In contrast, one expects some restrictions on the accessibility of the peptide bonds in the N- and C-fragment because they have to move from the first active site to another one before being released (cf. Fig. 2). These restrictions are taken into account by expressions 21 and 22. The parameters $L_C$ and $L_N$ represent optimal sequence separations between consecutively used cleavage sites, i.e., P1 residues located in the sequence positions $j = k - L_N$ and $j = k + L_C$ relative to the P1 position $k$ of the first scissile bond are those having the shortest distance to the active site to be used next.
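The three accessibility cases of Eqs. 20-22 can be collected in one function. This is a sketch under stated assumptions: the function name `accessibility` is hypothetical, and the parameter values in `params` are placeholders chosen for illustration (only the rounded value of about nine residues for the N-fragment separation is taken from the text; the widths and the C-fragment separation are not).

```python
import math


def accessibility(j: int, i: int, k: int, n: int,
                  L_N: float, sigma_N: float,
                  L_C: float, sigma_C: float) -> float:
    """Accessibility ac_j(i, k) of bond R_j:R_{j+1} in fragment S_{i,k}."""
    if not (i <= j < k):
        raise ValueError("bond index must satisfy i <= j < k")
    if i == 1 and k == n:
        return 1.0                                                   # Eq. 20
    if i == 1:   # N-fragment S_{1,k}
        return math.exp(-(k - j - L_N) ** 2 / (2.0 * sigma_N ** 2))  # Eq. 21
    if k == n:   # C-fragment S_{i,n}
        return math.exp(-(j - i - L_C) ** 2 / (2.0 * sigma_C ** 2))  # Eq. 22
    raise ValueError("only the substrate and single-cleavage intermediates are modeled")


# Placeholder parameters (hypothetical values, for illustration only).
params = dict(L_N=9.0, sigma_N=1.5, L_C=0.0, sigma_C=5.0)

n = 30
print(accessibility(5, 1, n, n, **params))    # intact substrate
print(accessibility(11, 1, 20, n, **params))  # N-fragment, bond at distance L_N
print(accessibility(15, 1, 20, n, **params))  # N-fragment, bond off the peak
```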



FIGURE 2   Differences in the accessibility of the peptide bonds of the single-cleavage intermediates to the second active center. Under the assumption that the single-cleavage N-fragment $S_{1,k}$ exhibits a fully extended conformation during its formation at the first active site, the spatial distance $\Delta r_j$ between an arbitrary peptide bond $R_j:R_{j+1}$ and the second cleavage site (to be used next) obeys the relation
\[
\Delta r_j^2 \approx \Delta r_i^2 + |i - j|^2\,\delta^2, \tag{I}
\]
where $\Delta r_i$ represents the shortest distance between the second active site and the intermediate (projecting onto the peptide bond $R_i:R_{i+1}$) and $\delta$ is the average $C_\alpha$-$C_\alpha$ distance. The transition probability for the nearest peptide bond $R_i:R_{i+1}$ to reach the second active site within the time span $\tau$ is given by the solution of the radial-symmetric diffusion equation,
\[
T_\tau(r \mid r_i) = \frac{1}{(4\pi D\tau)^{3/2}}\,\exp\left[-\frac{\Delta r_i^2}{4D\tau}\right], \tag{II}
\]
where $D$ denotes the diffusion coefficient of the intermediate. Using relation (I), it follows that the transition probability for an arbitrary peptide bond,
\[
T_\tau(r \mid r_j) \approx \exp\left[-\frac{|i - j|^2\,\delta^2}{4D\tau}\right] T_\tau(r \mid r_i), \tag{III}
\]
decreases exponentially with squared sequence separation from the best accessible bond $R_i:R_{i+1}$. Equation (III) provides the heuristic basis for the phenomenological expressions 21 and 22, whereby the parameter
\[
\sigma^2 = \frac{D\langle\tau\rangle}{2\delta^2} \tag{IV}
\]
is proportional to the diffusion coefficient and the mean transition time $\langle\tau\rangle$ available for the passage of intermediary fragments from the first to the second active site.

Affinity term 23 is given in terms of the so-called cleavage probability $\mathrm{CP}_j(i,k)$, which relates the probability for the cleavage of the peptide bond $R_j:R_{j+1}$ in fragment $S_{i,k}$ to the presence of certain amino acid motifs in the vicinity of the scissile bond required for the attainment of a proper binding conformation (Holzhütter et al., 1999). $\Theta(x)$ denotes the unit-step function, i.e., $\Theta(x) = 1$ if $x \geq 0$ and $\Theta(x) = 0$ if $x < 0$. The step function is included because the parameters of the cleavage probability were estimated on the basis of an evaluation scheme that classifies the peptide bond as not amenable to hydrolysis (i.e., possessing zero affinity) if $\mathrm{CP}_j(i,k) < 0.5$. The catalytic rates defined through Eq. 24 are assumed to depend exclusively on the type of the P1 residue and enter the model as unknown constants.
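The thresholded affinity of Eq. 23 and the product form of Eq. 19 can be illustrated as follows. The function names are hypothetical, and the cleavage probability, accessibility, and catalytic rate are passed in as plain numbers because their actual values come from the fitted model and from Holzhütter et al. (1999).

```python
def affinity(cp: float) -> float:
    """Eq. 23: the cleavage probability CP, set to zero below the 0.5 threshold."""
    return cp if cp >= 0.5 else 0.0


def cleavage_rate(ac: float, cp: float, cr: float) -> float:
    """Eq. 19: accessibility times affinity times catalytic rate."""
    return ac * affinity(cp) * cr


# Hypothetical inputs: a bond just below and one above the threshold.
print(cleavage_rate(ac=1.0, cp=0.49, cr=2.0))
print(cleavage_rate(ac=1.0, cp=0.80, cr=2.0))
```

A bond with a cleavage probability below 0.5 therefore never contributes to any fragment, however large its catalytic rate, which is why misclassified cleavage sites propagate directly into the DCF predictions.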



FIGURE 3   Catalytic rates for peptide bond cleavage at various P1 residues. The numerical values for the catalytic rates and their variances were obtained by fitting the logistic expression 18 for the observation probability to HPLC-based binary observations (yes or no) on double-cleavage fragment formation for the five different in vitro digests given in Table 1. The variances (indicated by the vertical bars) were assessed from the five different outcomes of the jack-knife procedure described in the main text. For the two residues Gln and His, no catalytic rate could be estimated because experimental information was lacking.

Because structural data for the yeast 20S proteasome suggest that the export of fragments from the interior of the proteasome to the outer space proceeds through narrow openings, the rate equations for the release of the two intermediate fragments formed after the first cleavage are chosen in a size-dependent manner,
\[
r(1,j) = r_N \exp[-\gamma_N (j - 1)], \tag{25}
\]

\[
r(i,n) = r_C \exp[-\gamma_C (n - i)]. \tag{26}
\]
$r_N$ and $r_C$ denote the maximal release rates (with which single amino acids are released), and the exponential factors $\gamma_N$ and $\gamma_C$ determine how sensitively the release depends upon the size of the fragment. The exponential rate laws 25 and 26 are based on the intuition that, when threading a peptide through a narrow opening, each residue of the peptide may potentially interact with the opening, thus hampering the passage of the peptide with a certain probability (say $p_0$). Accordingly, the probability for an uninterrupted passage of a peptide of $n$ residues should decay as $(1 - p_0)^n = \exp(-\gamma n)$, where $\gamma = \ln[1/(1 - p_0)]$.
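The size-dependent release laws 25 and 26 and the threading argument can be sketched numerically. The function names, the per-residue hindrance probability, and the maximal release rates below are hypothetical; the exponents 0.76 and 0.22 and the rate ratio of about $10^4$ are the values reported in the Results. The sketch takes $\gamma = -\ln(1 - p_0)$, which is the value that makes $(1 - p_0)^n = \exp(-\gamma n)$ hold exactly.

```python
import math


def release_rate_n_fragment(j: int, r_N: float, gamma_N: float) -> float:
    """Eq. 25: release rate of the N-fragment S_{1,j}."""
    return r_N * math.exp(-gamma_N * (j - 1))


def release_rate_c_fragment(i: int, n: int, r_C: float, gamma_C: float) -> float:
    """Eq. 26: release rate of the C-fragment S_{i,n}."""
    return r_C * math.exp(-gamma_C * (n - i))


def gamma_from_p0(p0: float) -> float:
    """Exponent gamma such that (1 - p0)**n == exp(-gamma * n)."""
    return -math.log(1.0 - p0)


# Threading argument with a hypothetical per-residue hindrance probability.
p0, length = 0.2, 7
print((1.0 - p0) ** length, math.exp(-gamma_from_p0(p0) * length))

# Release of 10-residue intermediates; r_N is a hypothetical unit rate.
r_N = 1.0
r_C = 1.0e4 * r_N
print(release_rate_n_fragment(10, r_N, 0.76))
print(release_rate_c_fragment(21, 30, r_C, 0.22))
```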

Numerical estimation of model parameters

The cleavage probabilities $\mathrm{CP}_j(i,k)$ defining the affinities, Eq. 23, were taken from Holzhütter et al. (1999). Hence, the kinetic proteasome model defined through Eqs. 1-26 contains 27 unknown parameters: $\mathrm{cr}_i$ [$i = 1, \ldots, 20$], $r_C$, $r_N$, $\gamma_C$, $\gamma_N$, $L_C$, $L_N$, and $p_c$. Numerical estimates for these parameters were obtained by least-squares minimization,
\[
\sum_{i,j} \bigl[\mathrm{Pobs}_{i,j} - O_{i,j}\bigr]^2 \to \text{minimum}, \tag{27}
\]
where $\mathrm{Pobs}_{i,j}$, defined by expression 18, is a continuous function in the range [0, 1] and $O_{i,j}$ is a binary classification, i.e.,
\[
O_{i,j} =
\begin{cases}
1 & \text{if fragment } S_{i,j} \text{ was identified} \\
0 & \text{otherwise.}
\end{cases}
\tag{28}
\]
The sum in Eq. 27 runs over all fragments $S_{i,j}$ that can in principle be generated from the five peptide substrates used in the in vitro experiments chosen as the experimental basis for the model fit (cf. Table 1). The total number of terms in the sum of squares, Eq. 27, was 2113 (72 fragments observed, 2041 fragments not observed). Minimization was carried out by a conjugate-gradient method using the software package SIMFIT of Holzhütter and Colosimo (1990).
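The objective of Eq. 27 is a plain sum of squared differences between the logistic observation probability and the binary observation flag. The sketch below only evaluates that sum for given model outputs; it does not reproduce the conjugate-gradient search or the SIMFIT package, and the fragment dictionary and cut-off are hypothetical toy data.

```python
import math
from typing import Dict, Set, Tuple

Fragment = Tuple[int, int]  # (i, j) of the fragment S_{i,j}


def p_obs(p_tilde: float, p_c: float) -> float:
    """Observation probability of Eq. 18."""
    return 1.0 / (1.0 + math.exp(p_c - p_tilde))


def sum_of_squares(p_tilde: Dict[Fragment, float],
                   observed: Set[Fragment],
                   p_c: float) -> float:
    """Eq. 27, with O_{i,j} of Eq. 28 encoded as membership in `observed`."""
    total = 0.0
    for fragment, p in p_tilde.items():
        o = 1.0 if fragment in observed else 0.0
        total += (p_obs(p, p_c) - o) ** 2
    return total


# Toy data: two candidate fragments, one of which was identified.
toy_p = {(2, 9): 0.5, (3, 9): 0.5}
toy_observed = {(2, 9)}
objective = sum_of_squares(toy_p, toy_observed, p_c=0.5)
print(objective)
```

In a full fit, `p_tilde` would be recomputed from Eqs. 16-26 for every trial parameter vector, and the minimizer would be applied to the resulting function of the 27 parameters.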


                              
TABLE 1   Observed and predicted DCFs for five different in vitro digests


    RESULTS

The numerical estimates of the model parameters are given in Table 2. Because none of the observed double-cleavage fragments was generated through cleavage at Gln or His, no catalytic rate could be assessed for these two residues.


                              
TABLE 2   Model parameters and relative involvement of P1 residues in the production of DCFs

The value of the observation probability, Eq. 18, does not change if the catalytic rates and the release rates $r_C$ and $r_N$ are multiplied by an arbitrary nonzero factor. Therefore, to arrive at absolute values for the rate constants, the additional side constraint $\tau = N_S\,\tau_0$ was used, where $\tau$ is the half-life time of substrate depletion reported for the experiment, $N_S$ is the average number of substrate molecules digested by a single proteasome, and $\tau_0$ represents the elementary turnover time for a single substrate molecule. Because the probability for the substrate molecule to be still unaffected by any cleavage after time $t$ decays as $p_{1,n}(t) = e^{-\alpha_n t}$, we have put $\tau_0 = \ln(2)/\alpha_n$.
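The scaling step can be sketched as follows: the fit fixes the rates only up to a common factor, and the side constraint supplies that factor. The function name and all numerical inputs below are hypothetical; the half-life and the number of substrate molecules per proteasome would have to be taken from the experiment.

```python
import math


def absolute_scale(alpha_n_relative: float, tau: float, n_s: float) -> float:
    """Factor that converts relative rates to absolute ones.

    tau:   half-life of substrate depletion in the experiment (seconds)
    n_s:   average number of substrate molecules digested per proteasome
    """
    tau_0 = tau / n_s                         # side constraint tau = N_S * tau_0
    alpha_n_absolute = math.log(2.0) / tau_0  # from tau_0 = ln(2) / alpha_n
    return alpha_n_absolute / alpha_n_relative


# Hypothetical numbers: a one-hour half-life and 100 substrate molecules.
scale = absolute_scale(alpha_n_relative=1.0, tau=3600.0, n_s=100.0)

# Rate ratios such as Eq. 16 are unchanged by the common factor.
c1, c2, a_n, a_j = 0.3, 0.2, 1.0, 0.5
before = c1 * c2 / (a_n * a_j)
after = (scale * c1) * (scale * c2) / ((scale * a_n) * (scale * a_j))
print(scale, before, after)
```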

The catalytic rates for the various P1 residues differ by several orders of magnitude. The largest values (>1 sec$^{-1}$) were obtained for Cys, Gly, Glu, and Trp, the smallest values (<$10^{-5}$ sec$^{-1}$) for Ser, Ile, and Lys. There is a statistically significant correlation (r = 0.6) between the catalytic rate for a given P1 residue and the frequency with which cleavage at this residue is involved in the generation of a double-cleavage fragment (see fifth and sixth row of Table 2). Only two P1 residues clearly fall outside this correlation: Gly, for which our calculations yielded the largest catalytic rate ($\mathrm{cr}_{\mathrm{Gly}}$ = 9.7 sec$^{-1}$), whereas the relative frequency of double-cleavage fragments involving cleavage at Gly is one of the lowest (3/8 = 0.38 DCFs generated per Gly on average); and Leu, which is most frequently involved in DCF generation (34/15 = 2.3 DCFs per Leu), whereas a surprisingly low catalytic rate ($\mathrm{cr}_{\mathrm{Leu}}$ = 0.009 sec$^{-1}$) was calculated for this residue.

The model parameters reveal significant differences in the release kinetics and the bond accessibilities of the intermediary N- and C-fragments. The length dependency is more pronounced for the N-fragment ($\gamma_N$ = 0.76) than for the C-fragment ($\gamma_C$ = 0.22). Moreover, the release rate for the C-fragments is about four orders of magnitude larger than that for N-fragments of equal length ($r_C/r_N \approx 10^4$).

Intriguingly, the accessibility of the peptide bonds in the N-fragment shows a sharp peak around the sequence position located $L_N = 8.6 \approx 9$ residues away from the first scissile bond. In contrast, the accessibility of the peptide bonds in the C-fragment decreases monotonically with increasing distance from the first scissile bond (cf. Fig. 5). A plausible explanation for these striking discrepancies is that, after the first cleavage, the C-fragment moves freely in a diffusion-like manner, so that the likelihood for a peptide bond of the C-fragment to be cleaved at the same active site decreases with increasing distance from the first scissile bond. In contrast, the very low release rate of the N-fragment points to some fixation of this fragment, which drastically hampers its free motion and thus renders it susceptible to a second cleavage at peptide bonds located in a rather narrow sequence range of 7-13 residues relative to the peptide bond cleaved first. The sharp localization of the accessibility term for the N-fragment entails that double-cleavage fragments of lengths 7-13 are almost exclusively cut out in C-N-order, whereas fragments outside this size range are formed in N-C-order. This can be seen in Fig. 4, which shows, for all correctly predicted DCFs (66 out of 72), the relative probabilities for the two alternative routes of DCF generation as functions of the fragment length.



FIGURE 4   Average relative probability for the generation of a double-cleavage fragment in C-N-order or N-C-order. The average probabilities that a double-cleavage fragment of given size is formed in C-N-order (dark bars) or N-C-order (light bars) were computed by averaging the relative proportions $\tilde{p}^*_{i \leftarrow j}/(\tilde{p}^*_{i \leftarrow j} + \tilde{p}^*_{i \to j})$ and $\tilde{p}^*_{i \to j}/(\tilde{p}^*_{i \leftarrow j} + \tilde{p}^*_{i \to j})$ of the stationary probabilities 16 and 17 across all correctly predicted double-cleavage fragments of identical size.
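The averaging behind Fig. 4 can be sketched as a group-by over fragment length. The function name and the three toy fragments are hypothetical; in the actual analysis the two probabilities per fragment would come from Eqs. 16 and 17 for each correctly predicted DCF.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def order_shares_by_length(
    fragments: Iterable[Tuple[int, int, float, float]]
) -> Dict[int, Tuple[float, float]]:
    """Average C-N and N-C shares per fragment length.

    Each input item is (i, j, p_cn, p_nc) for a fragment S_{i,j}.
    """
    sums = defaultdict(lambda: [0.0, 0])  # length -> [sum of C-N shares, count]
    for i, j, p_cn, p_nc in fragments:
        length = j - i + 1
        sums[length][0] += p_cn / (p_cn + p_nc)
        sums[length][1] += 1
    return {length: (s / m, 1.0 - s / m)
            for length, (s, m) in sorted(sums.items())}


# Toy input: two 9-residue fragments and one 3-residue fragment.
toy = [(2, 10, 0.09, 0.01), (5, 13, 0.03, 0.01), (2, 4, 0.01, 0.03)]
shares = order_shares_by_length(toy)
print(shares)
```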

The quality of the proposed model can be judged from the 2 × 2 contingency tables in Table 1. Except for OvaY51-71, the rates of both false negatives and false positives remained below 10%. Hence, the model allows reduction of the initial set of possible DCFs (cf. numbers in the second row of Table 1) by about 90%, while the remaining subset of predicted DCFs still contains more than 90% of the actually observed ones. It should be emphasized that the quality of DCF predictions made by the model can be strongly influenced by wrong classifications of single cleavage sites. This is the case for OvaY51-71, where the remaining differences between observed and predicted major DCFs are due to the fact that the two cleavage sites at D18 and E21 were not correctly identified because their cleavage probabilities were too low (0.1 and 0.03 with respect to the initial substrate). Hence, the relatively low prediction rate achieved for OvaY51-71 does not necessarily compromise the proposed model, but rather indicates the necessity to improve the identification of cleavage-determining peptide motifs.
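The error rates quoted from the contingency tables can be computed as shown below. This sketch assumes that the false-negative rate is taken relative to the observed fragments and the false-positive rate relative to the fragments not observed; the text does not state the denominators explicitly. The counts of 66 correctly predicted out of 72 observed DCFs and of 2041 fragments not observed are from the text, whereas the number of false positives is a hypothetical placeholder.

```python
from typing import Dict


def classification_rates(tp: int, fn: int, fp: int, tn: int) -> Dict[str, float]:
    """Error rates of a 2 x 2 contingency table (denominators are assumptions)."""
    return {
        "false_negative_rate": fn / (tp + fn),  # missed / all observed
        "false_positive_rate": fp / (fp + tn),  # wrongly predicted / all not observed
    }


hypothetical_fp = 150  # placeholder, not a value from the paper
rates = classification_rates(tp=66, fn=72 - 66,
                             fp=hypothetical_fp, tn=2041 - hypothetical_fp)
print(rates)
```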

To assess the quality of the proposed model in future applications, a jack-knife procedure was applied. To this end, the model parameters were estimated across reduced learning sets compiled by omitting each of the five data sets in turn. The model was then used to predict the fragments observed in the experiment omitted from the learning set. The jack-knife predictions can be seen in the lower 2 × 2 contingency tables in Table 1. The average rate of false negatives was slightly higher (about 20%), and the rate of false positives still remained below the 10% threshold. The variances of the model parameters derived from the five different jack-knife estimates are shown in Table 2. A large variability of the catalytic rate was obtained only for Arg. This might suggest larger differences in the arginyl-specific activity among the proteasome preparations used in the five in vitro experiments. It is worthwhile to note that the structural parameters related to the release kinetics and the accessibility of peptide bonds in the N- and C-fragment exhibited very small variability.
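The leave-one-data-set-out scheme described above can be sketched in a few lines of Python. The `fit` and `predict` callables are hypothetical placeholders for the parameter estimation and the fragment prediction of the model; the toy example replaces them with a simple mean so that the sketch stays self-contained.

```python
import statistics


def jackknife(datasets, fit, predict):
    """Leave-one-data-set-out estimation.

    `datasets` maps a data-set name to its data. For each name, the model is
    fitted on all other data sets and then applied to the omitted one.
    Returns {name: (fitted_params, prediction_for_omitted_set)}.
    """
    results = {}
    for held_out in datasets:
        training = {name: data for name, data in datasets.items() if name != held_out}
        params = fit(training)
        results[held_out] = (params, predict(params, datasets[held_out]))
    return results


# Toy stand-ins: the "model" is the mean of all training values.
def toy_fit(training):
    values = [v for data in training.values() for v in data]
    return sum(values) / len(values)


def toy_predict(params, data):
    return [params for _ in data]


toy_sets = {"a": [1.0, 2.0], "b": [3.0], "c": [6.0]}
runs = jackknife(toy_sets, toy_fit, toy_predict)

# Spread of the parameter across the jack-knife estimates.
param_variance = statistics.variance(params for params, _ in runs.values())
```

The variance across the per-run estimates plays the role of the parameter variances reported in Table 2: a parameter that changes little when any one data set is omitted is well determined by the remaining ones.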

We also applied the model to the fragment pattern of the insulin B chain generated by a γ-interferon-stimulated vertebrate proteasome (see Table 3). The quality of the predictions was similar to that obtained for the constitutive proteasomes used in the other five training experiments. This finding suggests that the cleavage kinetics of the constitutive proteasome and the immunoproteasome are largely similar. Hence the proposed model seems to be well suited to provide reliable predictions of the major cleavage products for both types of proteasomes.


                              
TABLE 3   Observed and predicted DCFs for the digest of the insulin B chain


    DISCUSSION

This paper presents the first attempt to apply mathematical modeling to the analysis of the kinetic mechanisms behind the seemingly erratic pattern of proteolytic fragments generated by the vertebrate 20S proteasome from oligomeric precursors. Considering the uncertainties inherent in the observed fragment patterns (e.g., analytical problems in the detection of extremely polar or hydrophobic fragments, destruction of initially formed DCFs in the further time course of the experiment, or variations in the specific proteasome activity among the various preparations), the model is able to discriminate with reasonable precision between major fractions of double-cleavage fragments associated with distinct HPLC peaks and minor fractions generated in amounts that are too low for individual peak separation. As tested by jack-knife computations, the rate of correct DCF identification is about 80%, and the rate of true negatives is about 90%.

For oligomeric substrates as considered in this paper, detailed experimental information on their shuttling between extra- and intraproteasomal space and on their cleavage at the distinct active sites is not yet available. Hence, the rate laws of the model were chosen to be as simple as possible to retain a tractable number of adjustable parameters. Some simplifying assumptions have been made, which deserve closer inspection:
1.   The time course of fragment generation was treated as a stop-and-go process in which only a single substrate molecule was taken up by the proteasome, then degraded, and the degradation products entirely expelled before the next substrate molecule could be taken up. Because the β-chamber has a volume of about 84 nm³, it may accommodate several hundred closely packed amino acid residues, so that a concomitant processing of several oligomeric substrate molecules cannot be excluded. This could give rise to a competition for the various active sites, which was not taken into account in the model. However, unless this mutual inhibition is of the noncompetitive type and thus may lead to a permanent blockage of active sites, one would expect the competition to cause a general slowdown of the turnover rate, which, in our approach, can be compensated for by an appropriate choice of the cut-off value in the observation probability.
2.   The analysis was restricted to those double-cleavage fragments that were cut out from the initial substrate by only two subsequent cleavages. This restriction seems to be justified by the fact that a ratio of about 1:10 was established between the number of cuts and the length of the protein substrate for both the archaeal and the mammalian proteasome (Kisselev et al., 1998, 1999).
3.   It was presupposed (cf. Eq. 20) that all peptide bonds of the initial substrate are accessible to an active site capable of cleaving them. This assumption might be wrong but was made in the absence of any experimental information on how the substrate crawls through the proteasome and which active site it passes first.
4.   The accessibility of a peptide bond in the N- or C-fragment was assumed to depend only on its sequence separation from the peptide bond cleaved first, but not on the type of P1 residue involved in the second cleavage, nor on the amino acid composition of the fragment, which may influence the adoption of a more extended or bent conformation. No distinction was made between the cleavage-determining amino acid profiles controlling the first and the second cleavage. This simplification may indeed account for the relatively large group of false positives: It is conceivable that the active site performing the second cleavage has no preference for the peptide bonds located in the sequence positions 7-13 (for the N-fragment) or 1-3 (for the C-fragment) away from the first scissile bond, or that attack of these peptide bonds is prevented by the local conformation of the fragment. Further refinement of the model, taking into consideration possible correlations among P1 residues involved in the concerted cleavage of fragments as well as folding properties of shorter peptides inside the proteasome, is desirable but would overload the mathematical theory given the current state of experimental knowledge.
5.   An exponential decay was chosen for the size dependency of the release rate. Unfortunately, systematic studies on the effect of size, charge, and hydrophilicity on the passive transport of peptides through protein pores are not available in the literature. In a study on the paracellular diffusion of peptides through Caco-2 cell monolayers, Pauletti et al. (1997) found a marked decrease of permeability with increasing peptide length, whereas the charge was of minor importance. Their data can be well fitted with the exponential model (Eqs. 25 and 26), yielding a decay constant of γ ≈ 0.2, which is very close to the model value γc = 0.22 determined for the C-fragment.
6.   The parametrization of the proposed kinetic model was achieved by fitting to experimental data obtained in long-term in vitro digests of model peptides. This allowed for taking the stationary limit of the full time-dependent solution (Eq. 14) and thus a considerable simplification of the mathematical expressions. As shown by Stein et al. (1996) for the hydrolysis of small fluorogenic peptides, the short transient phase immediately after onset of the reaction can reveal interesting details of the kinetic mechanism that cannot be observed in the quasistationary reaction regime. For the kinetic analysis of such pre-steady-state experiments, the time-dependent solution (Eq. 14) is relevant. So far, however, short-term digests with oligopeptides or long protein substrates have not been available.
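The exponential size dependency of the release rate discussed in item 5 can be illustrated with a short Python sketch. It assumes the simple form rate(L) = rate(0) · exp(−γL), which stands in for Eqs. 25 and 26 and may differ from them in detail; the prefactor, the function names, and the synthetic data points are hypothetical, and only the decay constant 0.22 is taken from the text.

```python
import math


def release_rate(length, rate_at_zero=1.0, gamma=0.22):
    """Assumed exponential decline of the release rate with fragment length."""
    return rate_at_zero * math.exp(-gamma * length)


def fit_decay_constant(lengths, rates):
    """Estimate gamma as the negative least-squares slope of ln(rate) versus length."""
    logs = [math.log(r) for r in rates]
    n = len(lengths)
    mean_x = sum(lengths) / n
    mean_y = sum(logs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lengths, logs))
    sxx = sum((x - mean_x) ** 2 for x in lengths)
    return -sxy / sxx


# Synthetic, noise-free data generated from the assumed model itself
# (not the permeability data of Pauletti et al.).
lengths = [3, 5, 8, 12, 20]
synthetic = [release_rate(n) for n in lengths]
gamma_hat = fit_decay_constant(lengths, synthetic)

# Number of additional residues over which the release rate halves.
halving_length = math.log(2) / 0.22
```

Under this assumed form, a decay constant of 0.22 means that the release rate halves for roughly every three additional residues, which is consistent with the statement that only longer C-fragments are retained long enough to be cleaved again.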

Inspection of the model parameters reveals substantial differences between the maximal cleavage rates for the various P1 residues. These differences, together with the accessibility profiles for the peptide bonds in the N- and C-fragment, account for the fact that only about 10% of all possible combinations of cleavage sites are actually used to produce double-cleavage fragments in significant amounts.

According to the model, there should be fundamental differences in the further processing of the N- and C-fragment appearing as intermediates after the first cleavage of the initial substrate. The C-fragment exhibits a very high release rate that declines with increasing fragment length. Thus only longer C-fragments (>10 residues) can be retained in the proteasome long enough to undergo further cleavage. The accessibility profile for the peptide bonds of the C-fragment (cf. Fig. 5) suggests that cleavage should proceed at the same active site that performed the first cleavage. The N-fragment has a release rate four orders of magnitude lower than that of the C-fragment. Hence the N-fragment must somehow be kept in a resting state that prevents its rapid diffusion away from the active site. One tempting mechanistic explanation for this transitory immobilization of the N-fragment is the formation of a covalent bond between the carboxyl group of the scissile bond and the OH group of the active Thr(1) (Groll et al., 1997). The accessibility profile for the peptide bonds of the N-fragment displays a sharp peak around the residue located at a sequence separation of LN ≈ 9 residues from the first scissile bond. This finding seems to support the idea that the preferred generation of fragment sizes between 7 and 13 residues is brought about by the concerted action of two active sites located at an appropriate spatial distance from each other (Wenzel et al., 1994). Whether such a tandem arrangement of catalytic activities is indeed present in mammalian proteasomes has to be clarified in future experiments.



FIGURE 5   Accessibility of peptide bonds in the N- and C-fragment after first cleavage. The accessibility profiles shown correspond to expressions 21 and 22 plotted with the model parameters given in Table 2. The accessibility of peptide bonds in the N-fragment exhibits a sharp peak localized between sequence positions -13 ··· -7 counted downstream relative to the sequence position of the first cleavage site (= C-terminus of the N-fragment). In contrast, the accessibility of peptide bonds in the C-fragment decreases monotonically with increasing distance from the first cleavage position. Note that peptide bonds in the C-fragment located at larger sequence separations (>13 residues) from the first cleavage site are significantly more accessible than those in the N-fragment. This accounts for the finding illustrated in Fig. 4 that short (<7) and long (>13) fragments are predominantly cut out from the C-fragment.

Regarding the predictive capacity of the proposed model, one has to bear in mind that no a priori knowledge about cleavage sites was used, because the identification of cleavage sites in a specific fragment was based on the affinity term (Eq. 23) constituted by the so-called cleavage probability. Application of the model allowed for a 90% reduction of the set of possible major double-cleavage fragments, with the reduced set still comprising 80% of the actually observed DCFs.

Finally, it has to be clearly stated that the proposed kinetic model is confined to oligomeric substrates composed of not more than about 40 residues. Substrates of this length should be fully accommodated by the proteasome prior to digestion and are thus supposed to move freely to any active site. This need not be true for long protein substrates, as recently used in digests with proteasomes from different sources (Kisselev et al., 1998, 1999; Nussbaum et al., 1998; Wang et al., 1999). We think that extending and refining the model with experimental data derived for longer substrates and under more physiological conditions (e.g., presence of the 19S regulator or ubiquitination of substrates) could be a promising strategy toward a better understanding of the cleavage mechanisms of the vertebrate proteasome in vivo and, in this way, toward establishing a mathematical tool that allows screening of a given protein sequence for possible epitopes.

    FOOTNOTES

Received for publication 7 January 2000 and in final form 22 May 2000.

Address reprint requests to Hermann-Georg Holzhütter, Humboldt- Universitaet zu Berlin, Institut fuer Biochemie, Medizinische Fakultaet (Charite), Monbijoustr. 2A, D-10117 Berlin, Germany. Tel.: +49-030-2802-6391; Fax: +49-030-2802-6615; Email: hergo{at}rz.hu-berlin.de.


    REFERENCES

© 2000 by the Biophysical Society   0006-3495/00/09/1196/10  $2.00


