Department of Chemistry and Chemical Biology, Harvard University,
Cambridge, Massachusetts 02138
This paper describes a theoretical method for solving
systems of coupled differential equations that describe the kinetics of
complicated reaction networks in which a molecule having multiple reaction sites reacts irreversibly with multiple equivalents of a
ligand (reagent). The members of the network differ in the number of
equivalents of reagent that have reacted, and in the patterns of sites
of reaction. A recursive algorithm generates series, asymptotic, and
average solutions describing this kinetic scheme. This method was
validated by successfully simulating the experimental data for the
kinetics of acylation of insulin.
 |
INTRODUCTION |
Even moderately complex kinetic systems can have
complicated solutions. In fact, it is only for a limited number of
systems (e.g., unimolecular systems, Scheme 1 a) that a
closed-form solutions exist (Berberan-Santos and Martinho, 1990
;
Rodiguin and Rodiguina, 1960
). In other instances, one can either
approximate the answer by an infinite series, or integrate the kinetic
equations numerically. The series solutions often diverge, and
numerical integration may be time-consuming for complicated systems.
In this paper, we present a recursive method that allows us to find
series, asymptotic, and approximate functional solutions for a kinetic
system consisting of an arbitrary number of starting materials,
intermediates, and products (Xi), and a reagent
R that is consumed irreversibly in the sequence of
conversions among them. Upon reaction with R, an
intermediate Xi converts irreversibly to
Xj with a characteristic rate constant
kij. Rate constants are positive, if they
correspond to production of a given species (intermediate or product),
and negative for its consumption. This process is assumed to be
first-order in both the reactant Xi and the
reagent, and second-order overall. Moreover, each of the
Xj can be a product of reaction of more than one
Xi and can itself convert to more than one
species. The reagent itself, aside from reacting with the starting
material and the intermediates (the loss of the reagent due to reaction
with Xi is described by rate constant
kri), can disappear (e.g., by reaction with
solvent or buffer, or by thermal decomposition), following first-order
kinetics with rate constant kR. Such a network
of reactions can be represented in matrix notation as in Scheme
1 b. The important difference between this system and the
unimolecular one (Scheme
1 a), is that now the
equations are coupled by the R(t) function (that is, the
concentration of the reagent R at time t).
Nonlinear systems of equations of this type do not have a known
closed-form solution. We will refer to this kinetic scheme as a
reagent-coupled unimolecular system (RCUS).

View larger version (21K):
[in this window]
[in a new window]
|
Scheme 1
The system of equations in (A) describes the
kinetics of a unimolecular system; it has an analytical solution. In
(B), the equations of the system are coupled by the
R(t) term (the concentration of the reagent). The system is
nonlinear and cannot be solved analytically. Because the reactions are
assumed irreversible, the entries in the matrix above the diagonal are
all zero. The recursive method developed in the paper can also solve
the more general case (C) which, however, is physically
unrealistic.
|
|
An RCUS provides a general model for the kinetics of irreversible
reactions of molecules having multiple recognition sites with
complementary ligands. Such networks of reactions are common in
biology: recations of multiple ligands with a protein (Imai, 1983
;
Matthews and van Holde 1990
; Mattevi, 1996
; Monaco et al., 1978
;
Perutz, 1989
) and humoral immune response (many antibodies attaching to
an antigen) (Tizard, 1984
) are two prominent examples. An RCUS can also
be used to describe kinetics of protein-ligand interactions in certain
analytical techniques, such as affinity capillary electrophoresis (ACE)
(Chu et al., 1994
; Kuhr and Monnig, 1992
; Liu et al., 1994
). Here, we
use our theoretical model to simulate the data obtained by capillary
electrophoresis (Gao et al., 1995
) from the formation of a charge
ladder (Gao et al., 1994
, 1996
; Gao and Whitesides, 1997
) of insulin by acylation.
Insulin is a protein (MW 5700) that is made up of two chains:
,
consisting of 21 amino acid residues, and
, 30 residues long.
Insulin has three amino groups: the
-chain N-terminal glycine residue, the
-chain N-terminal phenylalanine and a lysine group located at position 29 of the
-chain. Each of these amino groups can
be acylated. Thus, acylation gives rise to 7 possible acylation products (Scheme 2) in addition to native
insulin (denoted N): insulin acylated at F (F), acylated at G (G),
acylated at K (K), acylated at both F and G (FG), at both F and K (FK),
at both G and K (GK) and, finally, with all three amino groups acylated (FGK). We studied acylation reactions of insulin with two acylating reagents: 1) the N-hydroxysuccinimide ester of
5-carboxyfluorescein (F-NHS, Figure 1 A), and 2)
fluorescamine (FL, Figure 1 B) (Stein et al., 1973
). We
analyzed the results using our recursive method for solving the
RCUS describing this system, and compared them to the results obtained
by numerical integration of kinetic equations.

View larger version (17K):
[in this window]
[in a new window]
|
Scheme 2
The kinetic scheme for the acylation of insulin.
Insulin can be acylated by an acylating reagent R on three different
residues (F, G, K). Each conversion is first order both in the
substrate and in the reagent. In addition, the reagent can hydrolyze
(the hydrolyzed product is inactive in the acylation reactions). This
kinetic scheme can be described mathematically by an RCUS set of
equations (Scheme 1 B). Assuming the rate constants of the
acylation reactions are independent of the extent and pattern of the
acylation of the protein (no cooperativity), the number of independent
rate constants can be reduced to four. In this case,
k1 = k6 = k8 = k12, k2 = k4 = k9 = k11, and
k3 = k5 = k7 = k10.
|
|
 |
RECURSIVE METHOD FOR SOLVING RCUS |
The procedure starts with solving a simplified version of the
problem, without R(t) dependence; this simplification gives a unimolecular system, such as that in Scheme 1 a. The
eigenvalues pm of the matrix of rate constants
K are obtained from Eq. 1, in which U is the unit
matrix. For a given root pm, a particular
solution is given by
|
(1)
|
|
(2)
|
|
(3)
|
|
(4)
|
|
(5)
|
Eq. 2, and the values of coefficients Aim
are found from Eq. 3 in which
is the delta function
(Berberan-Santos and Martinho, 1990
). The general solution can be
written as a linear combination of particular solutions (Eq. 4). Eq. 4
will be useful in solving the original problem with R(t)
dependence. Note that the following identity holds (Eq. 5).
We postulate the solution to the RCUS in the form of Eq. 6. This
equation is, indeed, a solution to the RCUS, as can easily be proven
(Eq. 7). The solution given by Eq. 6 is still incomplete, because it is
written in terms of R(t), which is unknown.
|
(6)
|
|
(7)
|
We solve this problem recursively, substituting the
i(t) solutions into the
equation for R(t) (Eqs. 8 and 9). Using Eq. 5, and
introducing S(t) =
0t R(t')
dt' and Bm = Cm/pm, we can rewrite Eq. 9 in a slightly modified form (Eq. 10).
|
(8)
|
|
(9)
|
|
(10)
|
This expression can be further simplified to Eq. 11 by making use
of the boundary conditions at time t = 0. Eqs. 9 and 11
will be used to find series, asymptotic, and approximate solutions for
R(t) (note that if R(t) is known, the entire RCUS
problem is solved through Eq. 6).
|
(11)
|
The procedure for obtaining the series solution valid for small
values of time is quite lengthy, and we relegate it to the Appendix. To
investigate the asymptotic behavior of R(t), we first note
that limt
R(t) = 0, and that
S(t) converges in infinity, i.e., limt
S(t) = S
. With these conditions, Eq. 11
for t
simplifies to Eq. 12, which can easily be
solved numerically for S
.
|
(12)
|
To find the asymptotic solution at infinity, we first write
S
as a sum of S(t) and a function
y(t) defined by y(t) =
t
R(t') dt', and substitute for S(t) in Eq. 9
to obtain Eq. 13. Because y(t)
0 for large times, we
have exp(
pmy(t))
0, and the asymptotic solution at infinity can be written as in Eq. 14. By a
similar method, the asymptote for t
0 is found: in this case S(t)
0 and
exp(
pmS(t))
1, leading to Eq. 15.
|
(13)
|
|
(14)
|
|
(15)
|
If the difference between limiting exponential asymptotes is
small, it is reasonable to assume that R(t) may be
approximated by an exponential for all times (average exponential,
Ravex). With this assumption, we impose the
condition that its integral
0
Ravex(t) dt is equal to
the integral of an exact R(t) over this range, i.e., to
S(t). Using a boundary condition at t = 0,
Eq. 16 is obtained:
|
(16)
|
 |
KINETICS OF ACYLATION OF INSULIN |
We used the mathematical model developed in the previous section
to study the kinetics of acylation of insulin (Scheme 2). Partial
acylation of the amino groups of insulin results in a set of
derivatives that can be resolved into eight peaks by capillary electrophoresis based on differences in their values of electrophoretic mobility at pH 6.8. These eight peaks have all been assigned (Gao et
al., 1995
). We denote either of the two acylating agents used in the
experiments (the NHS ester of 5-carboxyfluorescein [1] and
flourescamine [2]), as R (Fig
1). The acylation reactions are
irreversible, first-order in both the reagent and an insulin derivative, and second-order overall. Moreover, we assume no
cooperativity among acylation reactions; that is, we assume there are
only three rate constants, so that, for example, the rate constant for
conversion of N to F is the same as for conversion of GK to FGK. It has
been reported (Gilson and Honig, 1987
), that for the ionic strength ~0.2 (as in our experiments) and for amino groups separated by ~10
Å, the changes in the pKa of an amino group upon acylation of other
amino groups of the same protein are ~0.1. For the amino groups of
insulin, the changes in pKa are expected to be even smaller, because
the distances between these groups are larger (NPhe
NGly, 21 Å; NPhe
NLeu, 25 Å; and NGly
NLeu, 13 Å) than the distances between the
amino groups studied by Gilson and Honig. Noncooperativity assumption
might be erroneous for proteins, in which the distances between the
acylated amino groups are smaller than ~10 Å. Therefore, using
Bronsted's equation, one can conservatively estimate that, by assuming
no cooperativity, one introduces an error of at most ~10% in the
values of rate constants. The nocooperativity assumption also greatly
reduces the computational complexity of the problem (with full
cooperativity, there would be twelve independent rate constants for
insulin, instead of three).

View larger version (11K):
[in this window]
[in a new window]
|
FIGURE 1
Molecular structures of the acylating reagents used in
the experiments: (1) the NHS ester of 5-carboxyfluorescein
and (2) fluorescamine.
|
|
With these simplifying assumptions, the kinetic equations describing
the system can be written in the form of an RCUS as in Eq. 17 (the rate
constants k1, k2, k3,
and kR are positive numbers, and the minus sign
denotes consumption of a given species).
|
(17)
|
Notice that, in our insulin example, the rate constants are the
observed ones, because only deprotonated neutral amino groups react
with the acylating reagent. The values of pm,
Bm, and Cm coefficients
(defined in the previous section) for this system of equations are
given in Table 1. Using these
coefficients, the series, asymptotic, and average exponential solutions
for R(t) were calculated; they are presented in Table
2. In the next section, we will examine
how accurately these equations fit the reality of the acylation of
insulin.
View this table:
[in this window]
[in a new window]
|
TABLE 2
Functional forms of various solutions for the
R(t) function for the kinetic scheme of
acylation of insulin
|
|
 |
RESULTS AND DISCUSSION |
General properties of solutions
We begin this section with the discussion of general properties of
series, asymptotic, and average exponential solutions for the kinetic
scheme of the acylation of insulin. The model parameters (i.e., the
observed rate constants k1, k2,
k3, kR, and the initial concentrations
R(0) and N(0)) are chosen such that they
illustrate the important characteristics of the solutions. Figure
2 A shows an example of a
series solution for R(t) compared with the results of
numerical integration of kinetic equations
(k1:k2:k3:kR = 4:3:2:0 and N(0) = R(0)). With a
fourth-order expansion, the series follows the numerically integrated
solution down to ~0.4R(0) and then diverges to infinity. Expansions
to orders higher than four do not markedly improve the quality of the
solution. They influence, however, the way in which the series
diverges: for even-order expansions
t
R(t) = 
. For small and comparable initial
concentrations of N and R, and when the rate of
decomposition of R is comparable with the rate constants of
the interconversion reactions (k1, k2,
k3), the series solution assumes a very simple form.
Because in such case, kiN(0)
kR, i = 1, 2, 3, we can
approximate R(t) by a single exponential (Eq. 18).
Obviously, the concentrations of acylated derivatives of insulin can be
well approximated by series.
|
|
|
|
|
(18)
|
solutions only in the region, where the series solution for
R(t) is exact (Fig. 2 C and D).

View larger version (35K):
[in this window]
[in a new window]
|
FIGURE 2
Performance of various methods for finding solutions
for R(t) and two intermediates, F(t) and
FG(t), in the acylation of insulin. The model rate constants
were (A and B)
k1:k2:kR = 4:3:2:0
and N(0) = R(0) and (C and
D)
k1:k2:k3:kR = 4:3:2:0.5 and N(0) = 0.4R(0).
|
|
The average exponential solutions are shown in Fig. 2, B-D.
They were obtained by first calculating S
from Eq. 12 using the bisection method (Press et al., 1996
), and then
using this value in Eq. 16. The maximum deviation of these solutions
from the results of numerical integration does not exceed 6%. The
asymptotic solutions for small and large times approximate the
numerical curves even better than average exponentials: the differences for small/large times, respectively, are less than 1% of the
numerically calculated values.
Kinetics of acylation of insulin
Our study of the kinetics of acylation of insulin is an example of
a problem where the values of the rate constants describing the
kinetics of a complex reaction network are unknown and must be decoded
from available experimental data (concentrations of intermediates at
different times and/or for different initial concentrations of
reagents). The procedure for finding the values of rate constants is
conceptually similar to finding a global minimum of a function of many
variables. First, a trial set of rate constants
{k}trial is chosen, and the concentrations
of intermediates are calculated by substituting the trial set into the
kinetic equations and integrating them. Second, the calculated
concentrations of intermediates are compared to the experimental ones,
and the difference between these two sets is quantified by some
function
({k}trial). In our study,
was defined as a square difference between simulated (x) and
experimental (y) data, i.e.,
i=1n
(xi
yi)2, where n is the
number of data. The rate constants are then changed according to some
algorithm, (such as, e.g., grid search or Monte Carlo), and the
procedure is repeated. This cycle continues until the optimal set of
rate constants {k}optimal that minimizes
is found.
This procedure for finding the optimal rate constants involves large
numbers of calculations: even in a simple case of only 10 kinetic
equations and four rate constants, for a grid search with ~100
possible values for each rate constant, and for
~103-104 numerical integration steps for
each trial set {k}trial, the total number of
computer substitutions to be made is on the order of 1013.
Our method of solving an RCUS, makes it possible to reduce this number
significantly. Specifically, because the method does not require
numerical integration of kinetic equations, the CPU time needed for
finding optimal rate constants is shorter by roughly the factor of the
number of numerical integration steps
(~103-104). In the remaining part of this
section, we will illustrate the use of our theoretical method in two
experiments involving acylation of insulin (Fig.
3).

View larger version (14K):
[in this window]
[in a new window]
|
FIGURE 3
Example of an electropherogram for the reaction of
insulin (1.5 µl 260 µM buffer solution) with flourescamine (4 µl
100 mM buffer solution). The products of the reaction were analyzed by
capillary electrophoresis in a capillary of 50 µm (internal diameter)
and 77 cm (length). A voltage of 30 kV was applied to the ends of the
capillary.
|
|
In the first experiment, fluorescein (1) was used as the
acylating agent. The reaction of fluorescein with insulin was monitored as a function of time (Fig. 4). Two
methods were independently used to find the optimal values of rate
constants. In the first approach, the grid search in the space of rate
constants (100 possible values for each rate constant) was performed,
and the kinetic equations were integrated for each trial set
{k}trial (time step 0.01 s). The
procedure gave the ratio of the values of rate constants
k1:k2k3:kR = 1.07:0.69:0.57:1.00, and the standard deviation per peak
(defined
as
=
) between the simulated and the
experimental data was calculated to be 0.0093. The total time required
for calculations (Pentium 233 MHz processor) was ~47 h. In the
alternative approach based on our theoretical method, the numerical
integration was not used. Instead, for each
{k}trial, the average-exponential solution
was found, and the concentrations of the intermediates were calculated from it. This procedure gave the same values of the optimal rate constants. The standard deviation per peak between the simulated and the experimental data was
= 0.0094, and the difference
between this simulation and that based on numerical integration was
calculated to be
= 0.00038. The computer time required for
calculations was only ~50 min.

View larger version (35K):
[in this window]
[in a new window]
|
FIGURE 4
Schematic electropherograms of acylated
derivatives of insulin for different values of time. In this study,
fluorescein was used as the acylating agent. The hatched bars
correspond to experimental data, and the solid bars were generated
using an average-exponential method.
|
|
In the second experiment, fluorescamine (2) was used as the
acylating reagent. This time, the mixture of underivatized protein and
the acylating reagent was allowed to react until stable concentrations of all the derivatives were reached (when the acylating agent disappeared, and the reactions came to a halt; ~2 h). The experiment was repeated for 6 different initial amounts of fluorescamine (Fig.
5). As before, two theoretical approaches
were taken to fit to the experimental data. Using the grid search
method (100 possible values for every rate constant) with numerical
integration (5000 steps for each set of data), the ratios of the rate
constants were found to be
k1:k2:k3:kR = 2.08:2.48:0.88:1.00. The calculations took ~39 h. In the approach
based on our theoretical method, numerical integration was replaced by
calculating a long-time asymptotic solution, giving the same values of
the optimal rate constants. This time, however, the computer time
required was much shorter (~10 min). The difference between the two
simulated sets was
= 0.00027, and that between the
experimental dataset and the set simulated using long-term asymptotic
solution (without numerical integration) was
= 0.028.

View larger version (27K):
[in this window]
[in a new window]
|
FIGURE 5
Set of schematic electropherograms of acylated
derivatives of insulin for different initial amounts of the acylating
agent (fluorescamine). The hatched bars correspond to experimental
data, and the solid bars were generated using long-time asymptotic
method.
|
|
To assess the reasonableness of our results, the relative magnitudes of
the observed rate constants obtained by our method for the acylation of
insulin were compared with those estimated from the Bronsted equation.
An observed rate constant ki, i = 1, 2, 3 can be related to the real rate constant for the acylation reaction by a simple relation ki =
iki,true, in which
i, expresses the fraction of a respective amino group in
unionized form,
i = [RNH2]i/([RNH2]i + [RNH3+]i) = (10pKa-pH + 1)
1. Using the Bronsted
equation log(ki,real) =
pKa, in which
is a parameter that
correlates basicity with nucleophilicity and is assumed to be 0.8 (Jencks and Carriuolo, 1960
), we arrive at Eq. 19 relating the
pKa of an amino group with the observed rate constant of the acylation of this group. The pKa
of Phe(i = 1) and Gly(i = 2) groups of
insulin have
|
(19)
|
previously been experimentally determined from the pH dependence
of the electrophoretic mobilities, and the extent of acylation of these
groups with acetic anhydride (Gao et al., 1995
) to be 8.4 and 7.1, respectively. Using these pKa values, and a
value of 10.0 for Lys(i = 3) (Matthews and van Holde,
1990
), and the value of pH = 6.8 for the acylation reaction, the
relative magnitudes of the observed rate constants are
k1:k2:k3 = 2.5 (±0.7):2.0 (±0.4):1 (±0.25). The deviations in the rate
constants were estimated assuming uncertainty in the values of
pKa of 0.5. The relative magnitudes of the
observed rate constants in our experiments fall into these uncertainty limits.
 |
CONCLUSIONS |
In this paper, we presented a recursive method for solving
reagent-coupled unimolecular systems: series, asymptotic, and average solutions were found. We validated the theory using experimental data
for the acylation of insulin. Because the algorithms presented in this
study are of general nature (but limited to irreversible reactions),
they can be used to study kinetics of more complicated systems, in
which molecules with multiple recognition (or reaction) sites interact
with multiple equivalents of reagent. The calculations performed using
the method are fast, because it does not require numerical integration
of kinetic equations. Although, in this study, we assumed
noncooperativity of acylation reactions, this assumption is not
necessary: the method can solve an RCUS of an arbitrary dimension. When
the rate constants are to be found from the experimental data, our
method can accelerate the calculations, but cannot circumvent the
problems associated with common search algorithms (e.g., Monte Carlo):
for large sets of independent rate constants, finding the global
minimum (best fit) becomes exceedingly complicated and, if the
uncertainties in the experimental data are large, might yield
meaningless solutions.
 |
EXPERIMENTAL |
Equipment
A Beckman P/ACE system 5500 capillary electrophoresis system was
used in the experiments. The capillary tubing (Polymicro Technologie,
Inc., Phoenix, AZ) was of uncoated fused silica with an internal
diameter of 50 µm. The samples (8 nl) were introduced into the
capillary by a positive pressure injection. A voltage of 30 kV was
applied to the capillary. The detection of the derivatized protein was
done by monitoring the UV absorbance at 214 nm.
Reagents
All chemicals were of analytical grade and were used without
further purification. Bovine pancreatic insulin was purchased from
Sigma (St. Louis, MO). 5-carboxyfluorescein succimidyl ester was
purchased from Molecular Probes (Eugene, OR), and fluorescamine was
purchased from Aldrich (Milwaukee, WI). Stock solutions of insulin
(1.5 mg/ml, 260 µM) were prepared by dissolving the protein in pH
12 water and then diluting this solution with an equal volume of
potassium phosphate buffer (100 mM, pH 6.8).
Procedures
1) Acylation with Fluorescamine. Different volumes of 100 mM
buffer solution of Fluorescamine (2, 4, 6, 8, 10, and 12 µl) were
added to six samples of 1.5 µl insulin buffer solution (260 µM).
The mixtures were allowed to react until stable concentrations of
intermediates were reached (~2 h), and were subsequently analyzed by
capillary electrophoresis. 2) Acylation with Fluorescein. Buffer solution of Fluorescein (7.5 µl, 100 mM) was mixed with 500 µl of
buffer insulin solution (260 µl). The reaction was quenched at
specified times (up to 112 min) with fivefold excess of hydroxylamine. The samples were analyzed by capillary electrophoresis.
Computational Methods
All the computer codes were implemented in C language.
Numerical integration of kinetic equations was performed using a
fourth-order Runge-Kutta method. Optimization of the values of rate
constants was achieved using a grid search algorithm. The equation for
S
was solved using a bisection method.
Address reprint requests to G. M. Whitesides, Department of
Chemistry, Harvard University, Cambridge, MA 02138. Tel.: 617-495-9431;
Fax: 617-495-9857; Email: gwhitesides{at}gmwgroup.harvard.edu.