help button home button Biophys. J.
HOME HELP FEEDBACK SUBSCRIPTIONS ARCHIVE SEARCH TABLE OF CONTENTS

Originally published as Biophys J. BioFAST on November 4, 2005.
doi:10.1529/biophysj.105.062935
This Article
Right arrow Abstract Freely available
Right arrow Full Text (PDF)
Right arrow All Versions of this Article:
biophysj.105.062935v1
90/3/765    most recent
Right arrow Alert me when this article is cited
Right arrow Alert me if a correction is posted
Services
Right arrow Similar articles in this journal
Right arrow Similar articles in PubMed
Right arrow Alert me to new issues of the journal
Right arrow Download to citation manager
Right arrow reprints & permissions
Citing Articles
Right arrow Citing Articles via HighWire
Right arrow Citing Articles via Google Scholar
Google Scholar
Right arrow Articles by Zhang, W.
Right arrow Articles by Chen, S.-J.
Right arrow Search for Related Content
PubMed
Right arrow PubMed Citation
Right arrow Articles by Zhang, W.
Right arrow Articles by Chen, S.-J.
Biophysical Journal 90:765-777 (2006)
© 2006 The Biophysical Society

Exploring the Complex Folding Kinetics of RNA Hairpins: I. General Folding Kinetics Analysis

Wenbing Zhang and Shi-Jie Chen

Department of Physics and Astronomy and Department of Biochemistry, University of Missouri, Columbia, Missouri

Correspondence: Address reprint requests to Shi-Jie Chen, E-mail: chenshi{at}missouri.edu.


    ABSTRACT
 TOP
 ABSTRACT
 INTRODUCTION
 THEORY AND METHODS
 EXPERIMENTAL TESTS FOR THE...
 GENERAL THEORY OF HAIRPIN...
 SUMMARY
 ACKNOWLEDGEMENTS
 REFERENCES
 
Depending on the nucleotide sequence, the temperature, and other conditions, RNA hairpin-folding kinetics can be very complex. The complexity with a wide range of cooperative and noncooperative kinetic behaviors arises from the interplay between the formation of the loops, the disruption of the misfolded states, and the formation of the rate-limiting base stacks. With a rate constant model and a kinetic-cluster theory, we explore the broad landscape for RNA hairpin-folding kinetics. The model is validated through direct tests against several experimental measurements. The general kinetic folding mechanisms and the predicted great variety of folding kinetics are directly applicable and quantitatively testable in experiments. The results from this study suggest that 1), previous experimental findings based on the individual hairpins revealed only a small fraction of much broader and more complex RNA hairpin-folding landscapes; 2), even for structures as simple as hairpins, universal folding timescales and pathways do not exist; and 3), to treat the loop size as the sole factor to determine the hairpin-folding rate is an oversimplification.


    INTRODUCTION
 TOP
 ABSTRACT
 INTRODUCTION
 THEORY AND METHODS
 EXPERIMENTAL TESTS FOR THE...
 GENERAL THEORY OF HAIRPIN...
 SUMMARY
 ACKNOWLEDGEMENTS
 REFERENCES
 
RNA hairpins are fundamental building blocks for complex three-dimensional RNA structures. The folding of complex RNA tertiary structures often involves the conformational change of hairpin structures (1Go–7Go). Understanding how a hairpin folds is, therefore, a prerequisite for understanding RNA tertiary structure folding. In addition, RNA functions, including splicing, translational regulation, etc., often involve the sequence-specific folding kinetics of RNA hairpins (8Go–10Go). Motivated by the significant structural and functional roles of RNA hairpins, we explore the complex sequence-dependent RNA hairpin-folding kinetics.

Since the early work of Porschke (11Go), there have been several experimental studies on the folding kinetics of RNA hairpin (12Go–16Go), peptide ß-hairpin (17Go) and DNA hairpin (18Go–24Go). The early experiments focused on loop formation as the rate-limiting step. For most of the sequences studied, the folding was described by a single zipping/unzipping pathway. Moreover, depending on the loop size, most of the hairpins fold in microseconds' timescale in experiments. More recently, experiments from several groups are beginning to shed light on the complexity of RNA and DNA hairpin-folding landscapes, such as the non-Arrhenius and noncooperative kinetic behaviors (18Go–24Go). Since most of the experimental studies are limited to the kinetics specific to the sequences studied, a systematic full exploration for the sequence-dependent complexity of the folding landscape (25Go–27Go) is still missing.

Large-scale atomistic molecular dynamics simulations (28Go–30Go) enable the detailed analysis for the transition states, the kinetic intermediates, and the multiple folding pathways for the specific sequences studied. But the molecular dynamics studies are restricted to very short sequence. Monte Carlo simulations have also been used to model the RNA folding process by several authors (31Go–34Go). However, the conformational and kinetic sampling may not be complete and the method cannot give analytical results that allow for stable predictions for the long time dynamics. Recently, theoretical studies (27Go,35Go) based on statistical mechanical models were developed to study RNA folding kinetics. These statistical mechanical models are based on the master-equation (rate equation) approach (27Go) (see Theory and Methods, below). For short chains, the entire conformational ensemble can be exhaustively enumerated. With the complete conformational ensemble, the master equation can account for the transitions between each and every pair of the conformations and can give the information about the rate-limiting steps (36Go) and the population kinetics. The folding kinetics predicted from the master-equation approach is supported by the atomic simulation (29Go). However, the disadvantage of the master-equation approach is that it cannot give direct information about the microscopic pathways. In addition, the method is limited to short chains due to the rapid increase of the number of conformations for longer chains. More recently, aiming to study the folding mechanism at the level of microscopic pathways and to study kinetics for longer chains, we developed the kinetic-cluster model (detailed in Theory and Methods) by reducing the original large conformational ensemble into pre-equilibrated clusters (37Go), for which we can perform detailed microscopic kinetic analysis.

Experimental and theoretical developments suggest that the RNA hairpin-folding landscape is much more diverse and more complex than that revealed in early experiments. In this study, we develop a kinetic-cluster model for RNA hairpin folding. We go beyond the individual nucleotide sequences by focusing on the general RNA hairpin-folding scenarios as well as the specific features such as the loop and stem dependence. We first test and validate the theory through direct experimental tests for several experimentally measured duplexes and hairpins. We then employ the theory to investigate the general scenarios for RNA hairpin-folding kinetics.


    THEORY AND METHODS
 TOP
 ABSTRACT
 INTRODUCTION
 THEORY AND METHODS
 EXPERIMENTAL TESTS FOR THE...
 GENERAL THEORY OF HAIRPIN...
 SUMMARY
 ACKNOWLEDGEMENTS
 REFERENCES
 
Transition rate model
Kinetic move set
The stability of a basepair in an RNA secondary structure (e.g., a hairpin structure) depends on the identity of the flanking basepairs and the identity of the basepair itself (38Go,39Go). This nearest-neighbor model is rationalized by the fact that RNA secondary structure is stabilized mainly by the base-stacking interactions and the hydrogen bonding—both of which are short-range local interactions and dependent primarily on the identity of adjacent basepairs (40Go). Adjacent basepairs form base stacks. For a given RNA sequence, we can exhaustively enumerate all the possible hairpin conformations according to the base stacks. Since a single (unstacked) basepair is not stable (38Go) and can quickly unfold, we define a kinetic move (equal to an elementary kinetic step for the transformation from one conformation to the other) to be the formation/disruption of a stack or a stacked basepair. Therefore, two conformations are kinetically connected if they can be interconverted through the addition or deletion of a base stack and are kinetically disconnected otherwise. Transitions between the disconnected conformations are disallowed and have rates equal to zero. Here we use the term base stack instead of basepair to define the kinetic move set because a single (unstacked) basepair is not stable. Our kinetic move model will be tested against experiments in Experimental Tests for the Rate Constant Model (see below). To compute the folding kinetics, we need a model to calculate the rate for each kinetic move.

Rate constant for a kinetic move
The transition rate for each kinetic move can be calculated from the general formula

(1)
with "+" for the formation and "–" for the breaking of the stack, where kB is the Boltzmann constant, T is the temperature, and is the free energy barrier of the respective transitions, and is a prefactor to be determined from the experiments.

The formation of a base stack or a loop usually involves an unfavorable entropy loss {Delta}S due to the accompanying restriction in the torsional angles, the desolvation, etc. We assume that the transition state is at the point where the bases have been fixed to the stacked configuration but the bases have not reacted to form the stabilizing hydrogen-bonding and base-stacking interactions. The barrier for the formation of the transition state is entropic:

The breaking of a base stack involves an enthalpy increase {Delta}H due to the disruption of the hydrogen bonding and the base-stacking interactions. We assume that the transition state is at the point where the hydrogen bonding and the base-stacking interactions have been disrupted, but the torsional angles of the chain are not yet liberated from the restricted base-stack configuration. The barrier for the formation of the transition state is enthalpic:

Both and depend on the sequence identity of the bases involved. In our model, {Delta}H, {Delta}S, and are calculated from a statistical mechanical model for RNA folding (41Go). Assuming from Eq. 1, we have

(2)
The above expressions for the rate constants satisfy the detailed balance condition of If {Delta}S and {Delta}H are T-independent, k+ would also be T-independent but k would be strongly dependent on T and is larger for higher T.

Rate for base-stack formation
When a base stack is formed, the increased charge density of RNA backbone would immobilize the counterions and the solvent molecules around the base stack. The reorganization of water molecules, which results in the volume contraction, increases the order of the system and yields an entropy loss (42Go). So the total entropic loss –{Delta}S ({Delta}S > 0) for the formation of a base stack is equal to the sum of the contributions from the loss of conformational entropy –{Delta}Sconf ({Delta}Sconf > 0) and the decrease in the entropy of hydration and counterions {Delta}S' ({Delta}S' > 0) (42Go), which is {Delta}S = {Delta}Sconf + {Delta}S'. As a result, we can write the rate k+ in Eq. 2 in the form of

so k0 is larger than the effective diffusive prefactor

The rate factor is determined not only by the solvent quality (e.g., solvent viscosity) through kdiff, but also by the hydration and the electrostatic states of the nucleotide (bases) through {Delta}S'. Different bases can have different hydration and ion-binding states and so can have different k0 values. For example, the GC pair and the AU pair are hydrated differently, so the GC and AU basepairs can have different k0 values.

Rate for loop formation
According to the nearest-neighbor model, a single (unstacked) basepair is not stable (38Go) and has zero enthalpy ({Delta}H = 0). So the loop conformations closed by a single basepair are unstable and can quickly unfold (see state a in Fig. 1). As a result, because the formation of the stabilizing base stack (b -> c) is much slower than the breaking of the loop (b -> a),

So the unfolded state (a in Fig. 1 A) and the (unstable) looped state (b in Fig. 1 A) can pre-equilibrate before a stable base stack (see state c in Fig. 1 A) is formed to close the loop. Here kb -> c is calculated as k+ in Eq. 2, with {Delta}Sstack equal to the entropy of the stack in state c and kb -> a is calculated as k in Eq. 2 with {Delta}H = 0 for the breaking of the unstacked basepair in state b.



View larger version (15K):
[in this window]
[in a new window]
 
FIGURE 1  (A) A schematic free energy landscape for the loop formation. State a is the fully unfolded state that contains no basepair. State b contains a loop closed by a basepair. State c is a loop conformation stabilized by a closing base stack. Transitions b -> c and c -> b are rate-limited by an entropic barrier of T{Delta}Sstack due to the formation of the closing stack and an enthalpic barrier of {Delta}Hstack for the breaking of the base stack, respectively. Transitions a -> b and b -> a are rate-limited by an entropic barrier of T{Delta}Sloop due to the formation of the loop and an enthalpic barrier of {Delta}Hloop = 0 for the breaking of the single (unstacked) basepair, respectively. Here, according to the nearest-neighbor model, the single basepair in b does not make an enthalpic contribution (i.e., {Delta}Hloop = 0). So the disruption of state b is very fast and the relaxation between state a and the unstable state b is fast. As a result, states a and b can have sufficient time to pre-equilibrate before the loop is stabilized through the b -> c transition. (B) In the kinetic-cluster model, stabilizing the pathway conformation (in the shaded free energy landscape) concomitantly stabilizes the transition state and thus speeds up the intercluster transition. This is because the free energy difference between the transition state and the pathway conformation is equal to the barrier of the kinetic move (= T{Delta}S and {Delta}H for the formation and disruption of a base stack, respectively) and is independent of the free energy of the pathway conformation. Here {Delta}S and {Delta}H are the entropy and enthalpy changes associated with the formation/disruption of the base stack.

 
The rate for the formation of a stabilized loop can be estimated from Fig. 1 as

(3)
where {Delta}Sloop is the entropy loss for the loop closure in the a -> b transition, and is the relative equilibrium population between the open state a and closed state b. Eq. 3 implies that, in the present model, the rate for the formation of a stable loop is determined by not only the rate (kb -> c) for the formation of the closing stack but also the stability (~[b]/[a]) of the closed loop conformation (b in Fig. 1 A).

Rate-limiting steps in RNA hairpin folding
The unfolding process involves the breaking of the native stacks, so the rate-limiting step of unfolding is to disrupt the slow-breaking native stacks. According to Eq. 2, the slow-breaking stacks are those with large enthalpy ({Delta}H).

On the other hand, the folding process involves the formation of the native stacks and the breaking of the nonnative base stacks, so the rate-limiting steps of folding are the formation of the slow-forming native stacks and the breaking of the possible slow-breaking nonnative base stacks. According to Eq. 2, the slow-forming native stacks are those with large {Delta}S and the slow-breaking nonnative stacks are those with large {Delta}H.

The rate constants for kinetic moves can only give rates for individual transitions in the folding process. The overall folding kinetics is determined by the collective and correlated events consisting of all the possible kinetic transitions (moves) in the folding process. To treat the statistics of the kinetic transitions, we discuss the following two theories used in this study: the master-equation method and the kinetic-cluster method.

Models for folding kinetics
Master-equation method
In the master-equation approach, the populational kinetics pi(t) for the ith state (i = 1, .., {Omega}, where {Omega} is the total number of chain conformations) is described as the difference between the rates for transitions entering and leaving the state,

where kj->i and ki->j are the rate constants for the respective transitions. The above master equation has an equivalent matrix form: dp/dt = M · p, where p is the fractional populational vector col (p1, p2, ... , p{Omega}), M is the rate matrix defined as Mij = ki->j for i != j, and Two key issues in the master-equation method are how to compute the rate constants (ki->j and kj->i) and how to solve the master equation.

For a given initial folding condition at t = 0, by diagonalizing the rate matrix M, we have the populational kinetics p(t) for t > 0,

(4)
where –{lambda}m and nm are the mth eigenvalue and eigenvector of the rate matrix M, and Cm is the coefficient that is dependent on the initial condition.

The eigenvalue spectrum gives the rates of the kinetic modes of the system. For a closed isolated system with the rate constants satisfying the detailed balance condition, there always exists an eigenvalue {lambda}1 = 0 for the equilibrium mode (43Go). Physically, the existence of this zero eigenvalue corresponds to the fact that as t -> {infty}, regardless of the initial condition of the system, the system eventually relaxes to the final equilibrium state. All other {lambda}m values are negative and nonzero. The smallest nonzero |{lambda}m| gives the rate of the slowest (rate-limiting) kinetic processes. Especially, if there only one distinctively small nonzero |{lambda}m| exists in the eigenvalue spectrum, the populational kinetics p(t) would be single-exponential with the rate determined by the smallest nonzero |{lambda}m|. The eigenvectors give the basic modes of the kinetic process and are intrinsically related to the energy landscape. In fact, from the eigenvectors we can obtain the rate-limiting steps of the kinetics (36Go).

The great advantage of the master-equation approach is that it is based on the complete ensemble of the conformation states and accounts for the kinetic effect of each and every interconformation transition. For a given rate constant model (see Eq. 1), the master equation can give a rigorous and exact solution for the relaxation kinetics of the system. Therefore, in this study, we will use the master-equation method to test the rate constant model against the experimental data for short sequences.

The master-equation approach has its limitations. The master-equation solution can only give ensemble-averaged macroscopic kinetics and cannot give detailed information about the microscopic pathways. Moreover, RNAs are polymers, whose number of conformations ({Omega}) increases rapidly with the chain length (44Go). So the master-equation approach is limited to short sequences whose rate matrix size is not too large. Because of these reasons (and especially for the first reason), we use the kinetic-cluster method, which, as explained in the next section, can overcome the above two difficulties.

Kinetic-cluster method
The basic idea of the kinetic-cluster method is to classify the large conformational ensemble into a much reduced system of clusters (macrostates) so that the overall kinetics can be represented by the intercluster (instead of interconformation) transitions. A great advantage here is that the microscopic kinetic rates and pathways can be examined in great detail. A number of attempts have been made to model the intercluster kinetics (45Go–50Go,53Go). In general, there are two types of approach. In the first type of approach, the intercluster rate is computed based on the transition state with the lowest barrier (45Go–52Go). For each pair of initial and final states, the optimized pathway is used to estimate the rate constant. In the second type of approach (53Go), conformations that are interconvertible through barrierless transitions are classified as a cluster, and the rate constants are calculated based on all the possible intercluster kinetic pathways. Kinetic Monte Carlo simulations have also been successfully used to obtain the intercluster rate (49Go,54Go). These and other simulations (31Go–34Go) sample the kinetic paths in a stochastic way. In our present kinetic-cluster method, a collection of pre-equilibrated conformations is classified as a cluster (macrostate). Such a cluster includes the barrierless conformational cluster defined in the previous model (53Go) as a special subset. So the present approach is more general. From the intercluster rate constants, we construct the reduced rate matrix. From the eigenvalues and eigenvectors of the reduced rate matrix, we can analyze the rates and pathways for the folding and unfolding kinetics.

Although both the master-equation method and the present kinetic-cluster method can predict the macroscopic kinetics, and both are based on the complete conformational ensemble, the kinetic-cluster approach has the unique advantage of providing the direct information on the microscopic pathway statistics from the intercluster transitions.

The kinetic-cluster method is based on the existence of the pre-equilibrated clusters. The pre-equilibration and cluster formation have been observed in previous experiments and computer simulations (50Go,53Go,55Go) and the kinetic-cluster method has been rigorously validated through extensive tests against the results from the original exact master equations (37Go). However, the kinetic-cluster method has its limitation. The kinetic-cluster method would fail if the system does not have well-defined discrete rate-limiting steps, in which case no pre-equilibration would occur and thus no pre-equilibrated cluster would exist.

The following is a summary for the kinetic-cluster method with applications to RNA hairpin-folding kinetics.

How to classify conformations into clusters. To simplify the illustration, we use simple unfolding kinetics to show the idea. The unfolding of a hairpin involves the breaking of the native base stacks in the native hairpin structure. If the breaking of a native base stack, say, s*, is distinctively slower than the breaking of other native stacks, then the breaking of s* is the rate-limiting step for the unfolding reaction. According to the slow-breaking s*, we can define two clusters for conformations before and after the rate-limiting stack s* is broken:

The existence of the rate-limiting stack s* implies that the transitions between conformations within a cluster are faster than transitions between conformations in different clusters. Therefore, conformations within a cluster may pre-equilibrate before entering a different cluster through intercluster transitions. As a result, each cluster can be treated as pre-equilibrated macrostate and the overall unfolding kinetics is determined by the slow intercluster transitions.

Two types of conformations: Pathway conformation and nonpathway conformations. For an intercluster transition, a conformation Ni in cluster N is transformed to a corresponding conformation Ui in cluster U through the breaking of the rate-limiting stack s*. There exist many such intercluster pathways between the clusters. We define pathway conformations as conformations (such as Ni in cluster N) that are directly connected to the other clusters through kinetic movement. They are called pathway conformations because they form the intercluster pathways. All the other conformations are called nonpathway conformations. We distinguish these two types of conformations because only the pathway conformations directly contribute to the intercluster kinetics.

Intercluster transition rates and the dominant pathways. The intercluster transition rate is given by the sum over all the possible microscopic pathways Ui {leftrightarrow} Ni between pathway conformations in the respective clusters,

(5)
where [Ui] and [Ni] are the equilibrium fractional populations of Ui and Ni in the respective clusters,

(6)
where and are the free energies of conformations Ui and Ni, respectively, and GU and GN are the free energies of clusters U and N, respectively,

(7)
where the summations in Eq. 7 are for all the (pathway and nonpathway) conformations in the respective clusters.

From the above equations for the intercluster rates, we find that the intercluster transition rates are determined by two factors: the stabilities of the pathway conformations (e.g., [Ui] and [Ni]) and the rates (e.g., and ) for each intercluster pathway. The interplay between these two factors leads to the following general conclusions for the intercluster kinetics:

Pathway conformations versus nonpathway conformations. For given rate constants (), stabilizing the pathway conformations (versus the nonpathway conformations) speeds up the intercluster transition. As illustrated in Fig. 1 B, stabilizing the pathway conformation would concomitantly stabilize the transition state and thus speed up the transition. This is because the free energy difference between the transition state and the pathway conformation is determined by the barrier of the kinetic move, which is independent of the stability of the pathway conformation.

Fast versus slow pathway conformations. Some pathway conformations have large transition rate and are thus called fast pathway conformations and others are the slow pathway conformations. Stabilizing the fast pathway conformations (versus the slow pathway conformations) leads to faster intercluster transitions.

Dominant pathways. The probability for the molecule to take a specific pathway, say, Ui -> Ni, is determined by its fractional contribution to the total rate:

(8)
We call the kinetic partitioning factor or pathway partitioning probability. The pathways that have the largest are the dominant pathways. Due to the temperature and sequence-dependence of the rate constants and of the stabilities of the pathway conformations, the dominant pathways can be quite sensitive to the temperature and the sequence. Moreover, the folding and unfolding reactions can involve different rate-limiting base stacks and can thus be described by different clusters. As a result, folding and unfolding reactions can have quite different dominant pathways.


    EXPERIMENTAL TESTS FOR THE RATE CONSTANT MODEL
 TOP
 ABSTRACT
 INTRODUCTION
 THEORY AND METHODS
 EXPERIMENTAL TESTS FOR THE...
 GENERAL THEORY OF HAIRPIN...
 SUMMARY
 ACKNOWLEDGEMENTS
 REFERENCES
 
There are two key issues of the present theory: the rate constant model and the kinetic-cluster method. Although the latter has been extensively validated (37Go), the former requires rigorous tests. The rate constant model includes two ingredients: 1), the definition of the kinetic move; and 2), the transition rate for a kinetic move (see Eq. 2). In this section, we aim to test the rate constant model through experimental comparisons. In addition, we can fit the prefactor k0 in the rate constant equation (Eq. 2) from experiments.

We use the experimentally measured parameters ({Delta}H, {Delta}S) for the base stacks (56Go). To compute the transition rates from Eq. 2, we need to know the prefactor k0. Due to the different energetics of the GC and AU basepairs, we assign different k0 values for stacks with GC and AU basepairs. We determine k0 by fitting the experimental data. Specifically, we use the relaxation rates for the (AnUn)2 duplex (57Go) and for the (A4GCU4)2 duplex (58Go) for AU and GC, respectively. After the prefactors are determined, we would further validate the model by applying the model to predict the relaxation rates for a wide temperature range for other hairpin-folding experiments; see Table 1.


View this table:
[in this window]
[in a new window]
 
TABLE 1  A summary for the experiments used to test the theory

 
It is important to note that for simple sequences such as (AnUn)2, the k0 value serves as a global scaling factor for the overall rates. So for simple sequences (AnUn)2, k0 plays no role in determining the physical nature of the kinetics, namely, the pathways, the transition states, and temperature-dependence of the rates. For all the experimentally measured sequences that we tested, the predicted shapes of the rate versus temperature curves (equal to the Arrhenius plot) agree with the experimental results. Furthermore, the k0 parameters fitted from one experiment can give good predictions for other experiments, i.e., the fitted k0 parameters are transferable between different experiments. These results suggest that the model, including the definition of the kinetic moves and the rate constants for the kinetic moves may be valid and reliable.

In this section, where the goal is to test the rate constant model, we use the original master equation instead of the kinetic-cluster method to remove the possibility of any error caused by the kinetic-cluster method. So the tests would be exclusively focused on the rate constant model. For each sequence tested, we consider the complete ensemble of all the possible conformations and compute the interconformation rate-matrix from Eq. 2. By diagonalizing the rate matrix, we obtain the relaxation rates from the eigenvalues of the rate matrix for different temperatures.

Duplex formation
For the duplex formation, we use the zipper model. The model assumes that all the base stacks occur contiguously in a region. We use base stacks to describe the duplex structure because stacking is the major stabilizing force. Previous approach to the duplex formation is based on a prior assumption on the two states of the transition (57Go). A steady-state approximation was used. In the present study, we do not make a prior assumption about the two-state of the kinetics. So the model can account for any possible kinetic intermediates. The formation of the first stack of the duplex from the single strand is a second-order chemical reaction process, while the helix growth is a first-order (linear) process. Let [A0] and [AN] be the concentration of the single and the fully zipped (native) duplex with N basepairs, respectively. When we use to denote the concentration of the jth state with m stacks, the rate equation can be written as

where ka -> b denotes the a -> b transition rate. Furthermore, because the temperature jump in the experiment is small (~=3 – 4°) (57Go), the concentration deviation from the equilibrium level is small. Therefore, we can use the following linear approximation for the concentration deviation x0 = [A0] – [A0]eq from the equilibrium value With this approximation, the above equation becomes a linear master equation for the concentration deviations of A0, AN, and

The rate for the formation of the first base stack in a contiguous helix region is equal to Here ß is the nucleation probability, i denotes the position of the base stack, and where {Delta}Si is entropic change for the formation of the base stack. After the first stack is formed, the growth rate (e.g., ) of a base stack j is equal to The breaking rate for a base stack j (e.g., ) is equal to where {Delta}Hj is enthalpic change for the formation of the base stack.

The nucleation parameter ß is the probability that two bases in different strands will approach each other. The value ß can be determined from the measured equilibrium constant K, which, in the zipper model, is given by

where i denotes the nucleating base stack, n is the length of the helix, and is the stability of the jth stack.

The {Delta}H and {Delta}S parameters are typically measured under 1 M NaCl condition. However, many of the experiments that we use for comparison were performed under other ionic conditions (see Table 1). For example, the measurements for the (AnUn)2 kinetics is under 0.25 M Na+ condition (57Go). So we need to know the {Delta}H and {Delta}S parameters under arbitrary ionic conditions. We use the following empirical ion concentration-dependent parameters, which are reliable for ion concentrations not too low (≥mM) (59Go):

The lowest nonzero eigenvalue of the above linear rate equation gives the relaxation rate and the relaxation time {tau} = (relaxation rate)–1. Comparison between our model prediction and the (AnUn)2 experimental data gives the prefactor k0 = 6.6 x 1012 s–1 for AU stacks.

The relaxation time {tau} as a function of the reciprocal of temperature is shown in Fig. 2 a. The results show that the theoretical predictions are in good agreement with the experimental data for a wide temperature range and for different chain lengths. The agreement is better for longer chains. For short chains (small n), the theory-experiment difference comes from the finite size effects of the oligomers. For example, the electrostatic effect can cause nonadditivity in the base-stack stabilities (57Go). Moreover, the rates are overestimated for short chains due to the ignored conformations in the zipper model and the possible inaccuracy of the ion-dependence of the energy parameters used in the model.



View larger version (12K):
[in this window]
[in a new window]
 
FIGURE 2  The temperature (T in Kelvin) dependence of the relaxation time for the duplex formation. (Symbol, experiment; line, model.) (a) (AnUn)2 (from bottom to top, n = 4, 5, 6, 7) (0.25 M Na+). (b) (A4GCU4)2 (1.05 M Na+). The experimental relaxation time are determined by 1/{tau} = 4k1C0 + k–1, where k1 and k–1 and C0 are from the experiment (58Go).

 
We can determine the association rate constant k1 and the dissociation rate constant k–1 for the two-state transition through the relaxation time {tau} and the equilibrium constant K,

where is the equilibrium constant, and CN and C0 are the measured concentrations of the duplex and monomer, respectively. Consistent with the experimental results, our model predicts that k1 is only weakly dependent on temperature T and the chain length n, and k–1 decreases as temperature T decreases or duplex length n increases (data not shown).

To determine the prefactor k0 for base stacks with the GC basepair, we study the kinetics for the duplex (A4GCU4)2 (58Go). The duplex involves both AU and GC basepairs. Given the k0 value for the AU pairs, we find that k0 is equal to k0 = 6.6 x 1013 s–1 for the GC pair. The k0 for the GC pair is larger than that for AU because a GC pair involves stronger bonding than an AU pair. In Fig. 2 b, we show the temperature-dependence of the relaxation rate for (A4GCU4)2, and we find good agreement between theory and experiment for a wide range of temperature.

Hairpin formation
Our purpose here is to have further tests on the rate constant model through experimental comparisons for the temperature-rate dependence. We use our model to make predictions for two RNA hairpin-folding experiments (58Go,60Go) that showed quite different temperature-dependence of the relaxation rate. In Fig. 3, a and b, we show the results for sequences A6C6U6 (58Go) at 1 M NaCl and r(AUCCUAUT4UAGGAU) (60Go) at 0.2 M NaCl, respectively. We find reasonably good theory-experiment agreements.



View larger version (11K):
[in this window]
[in a new window]
 
FIGURE 3  The temperature (T in °C) dependence of the relaxation time for hairpin disruption. (Symbol, experiment; dashed line, model.) (a) A6C6U6 ((11Go);1 M NaCl). (b) r(AUCCUAUT4UAGGAU) ((60Go); 0.2 M Na+).

 
The eigenvalues of the rate matrix in the master equation gives the relaxation rate kr. For a two-state transition, from kr, we can obtain the folding rate kf and the unfolding rate ku separately from the following two equations: kr = kf + ku and K = kf/ku, where K is the equilibrium constant, which can be determined from our statistical thermodynamic model. At unfolding temperatures T > Tm (Tm is the folding-unfolding transition temperature), the relaxation process is predominantly an unfolding process, so At folding temperatures T < Tm, the relaxation process is mainly a folding process, so

As shown in Fig. 3 a for sequence A6C6U6, the value kf increases as temperature increases, indicating a positive activation enthalpy for the folding. In our model, such positive activation enthalpy arises from the breaking of the misfolded states, which can be formed through sliding of one or more bases from the native basepair positions. Another possibility for the positive activation enthalpy of folding is for the breaking of the single-strand stacking, which is not considered in the model. In fact, neglecting the breaking of the single-strand stacking may be a reason why our predicted rate is larger than the experiment result in Fig. 3 a.

For sequence r(AUCCUAUT4UAGGAU), the T4 loop was treated as the RNA loop U4. Due to the different chain stiffness, the entropy of T4 loop may be smaller than that of U4. As a result, the rate for loop closure may be underestimated by the model; see Fig. 3 b.


    GENERAL THEORY OF HAIRPIN-FOLDING KINETICS
 TOP
 ABSTRACT
 INTRODUCTION
 THEORY AND METHODS
 EXPERIMENTAL TESTS FOR THE...
 GENERAL THEORY OF HAIRPIN...
 SUMMARY
 ACKNOWLEDGEMENTS
 REFERENCES
 
In this section, using the kinetic-cluster method, we present a general theory for RNA hairpin-folding kinetics. We first discuss the possible slow processes of hairpin folding and the different folding kinetics scenarios with different slow processes as the rate-limiting steps for the overall folding. We then discuss the kinetic clusters and the energy landscapes for each different scenario.

Rate-limiting steps
To fold from a fully unfolded state, the hairpin forming chain undergoes the following four types of the possible slow processes (see Fig. 4):



View larger version (11K):
[in this window]
[in a new window]
 
FIGURE 4  The four types of possible slow folding processes: (a) loop formation, (b) formation of a rate-limiting stack, (c) direct folding, the loop is closed by a rate-limiting stack (shaded) from the fully unfolded state, and (d) detrapping through disruption of the nonnative base stack (pattern-fill).

 
Loop nucleation (Fig. 4 a)
The formation of the first base stack can be rate-limiting because it concurrently causes the closure of a loop, and has a rate constant of

(9)
where {Delta}Sstack and {Delta}Sloop are the entropic losses associated with the formation of the base stack and the corresponding loop. The process is slow if the total entropic loss ({Delta}Sloop + {Delta}Sstack) is large.

Formation of the rate-limiting stack (Fig. 4 b)
If a native base stack s* exists, whose entropy {Delta}S* is exceedingly larger than that of other native stacks, according to Eq. 2, the formation of s* would be exceedingly slower than the formation of other native stacks. As a result, the formation of s*, which has a rate constant of

(10)
is a rate-limiting step for the overall folding process.

Direct folding (Fig. 4 c)
In the loop nucleation process, if the loop is closed by a rate-limiting stack s* from the fully unfolded state, the loop closing process would be extremely slow with a rate constant of

(11)
The contribution from such process to the overall folding kinetics can often be ignored due to the extremely slow rate.

Detrapping (Fig. 4 d)
In the folding reaction, nonnative base stacks formed in the process need to be disrupted for the folding process to proceed. The disruption of nonnative base stacks (detrapping) has a rate constant of

(12)
where {Delta}Hnn is the enthalpy cost to break the nonnative base stack. The detrapping process can be slow for large {Delta}Hnn or low temperature T. The process can become a rate-limiting step if kdetrap is small.

Kinetic cluster
According to the above possible slow steps, we can classify the conformational ensemble into five clusters, C, In, Inn, Nn, and Nnn, such that

(13)
In and Inn are the conformations in cluster I without and with nonnative base stacks, respectively, and Nn and Nnn are the conformations in cluster N with and without nonnative base stacks, respectively. Fig. 5 summarizes the kinetic-cluster classifications. In Fig. 5 a, F = I + N is the cluster for all the folded and partially folded conformations and U = C + I is the cluster for all the conformations with the rate-limiting stack s* disrupted.



View larger version (15K):
[in this window]
[in a new window]
 
FIGURE 5  Summary of the two types of kinetic-cluster classifications.

 
By defining the nonnative clusters Inn and Nnn, we neglect the heterogeneity of the non-native stacks by grouping different nonnative states into the same cluster. This assumption may cause an overestimation for the folding rate due to the neglected transition time between different nonnative conformations within clusters Inn and Nnn. The overestimation for the rate is negligible for larger kdetrap when the disruption of the nonnative stacks is fast and can become important if kdetrap is small. Nevertheless, the separation of nonnative Inn from the nativelike In and nonnative Nnn from native Nn can, to the lowest-order approximation, account for the trapping effect in the folding process.

Energy landscapes
The competition between the loop nucleation (kloop), the formation of the rate-limiting (native) stack (kf*), and the detrapping of the nonnative stacks (kdetrap) results in a great variety of different scenarios for the energy landscapes and folding kinetics (see Fig. 6). We can classify the folding landscapes and folding kinetics into the following different scenarios; see Table 2 for a summary.



View larger version (33K):
[in this window]
[in a new window]
 
FIGURE 6  Schematic free energy landscapes for different folding scenarios. The shaded line shows the formation of the native-like looped structures In from the unfolded state C. Inn represents the non-native intermediate state. (a) Scenario 1: The rate-limiting process is the loop formation. The folding is a two-state (C -> F = I + N). (b) Scenario 2: The rate-limiting process is the formation of a rate-limiting base stack. In and Inn equilibrate quickly and form a pre-equilibrate macrostate (cluster) I. The folding is a two-state process U (= C + I) -> N. (c) Scenario 3: Weakly noncooperative folding with a rate-limiting stack and a (or few) slow-breaking misfolded nonnative base stacks. The kinetics is multi-state (five clusters: C, In, Inn, Nn, Nnn). N = Nn + Nnn (Nn and Nnn not shown). (d) Scenario 4: Strongly noncooperative folding. The disruption of an average nonnative stack is very slow and nearly every non-native state can form a kinetic trap. See also Table 2.

 

View this table:
[in this window]
[in a new window]
 
TABLE 2  A summary for the different scenarios of the folding kinetics

 
Cooperative folding (if detrapping is fast)
The folding is cooperative (two-state) and single-exponential if the detrapping is much faster than both the loop nucleation and the formation of the rate-limiting stack:

(14)
If kdetrap is large, detrapping of nonnative base stacks is not a slow process. As a result, In and Inn, which are separated by the slow-breaking nonnative stacks, can equilibrate quickly. As a result, they can merge to form larger pre-equilibrated cluster I. For the same reason, Nn and Nnn can be merged into a larger cluster N. The overall conformation ensemble can be classified into three clusters: C, I, and N (see Fig. 5 b). According to the competition between the loop nucleation and the formation of the rate-limiting stack, we can further distinguish two different folding scenarios:

Scenario 1: Cooperative folding through loop formation (if the formation of the native base stacks is fast). If the formation of the native base stacks is faster than the loop formation,

(15)
no rate-limiting (native) base-stack s* exists. As a result, cluster I and N, which are separated by the possible slow-forming stacks s*, would quickly pre-equilibrate and form cluster F = I + N (see Fig. 5 a). The overall conformational ensemble is classified into two clusters, C and F, and the folding is a two-state transition: C -> F corresponding to the loop formation (see Fig. 6 a). In fact, for most of the previous RNA and DNA hairpin-folding experiments (11Go), there is no rate-limiting (native) base stack, and the folding reaction C -> F is rate-limited by the loop nucleation.

Microscopically, the transition C -> F can be understood as the nonspecific formation of the loops. Once the looped conformations are formed, they can quickly interconvert to form the stable native structure through the fast breaking/formation of base stacks and the accompanying changes of the loops (e.g., through the chain sliding modes). Since the loop entropy is a logarithmic function of the loop size, which is not sensitive to the loop size change, the transitions between different looped structures within cluster F are not rate-limited by the loop changes, and are thus, fast.

Scenario 2: Cooperative folding through formation of the rate-limiting base stack (if loop formation is fast). If the slowest kinetic move is to form native base stack s*,

(16)
then stack s* is major divider to separate different clusters and the conformational ensemble is reduced to a two-cluster system: U (= C + I) and N (see Fig. 5 b). Cluster U is the collection of conformations without the slow-forming s* (see Fig. 6 b), and the folding kinetics is two-state: U -> N corresponding to the slow formation of s*.

Starting from the fully unfolded conformation C, the chain initially undergoes transition C -> I to form cluster U. Because many different ways exist to form the nonspecific loops from the fully unfolded state, the C -> I process is fast even if kloop is not much larger than kf*.

After initial fast formation of cluster U, the chain undergoes slow transition U -> N to form the rate-limiting stack s*. Some conformations in U may be transiently populated at this stage because they have large (fractional) equilibrium population in U but not in the final overall conformational ensemble U + N. These conformations are kinetic intermediates. How do the emergence of the intermediate states and the two-state cooperativity compromise with each other? Since the different conformations, including the kinetic intermediates, in the cluster U have already reached equilibrium, transitions between conformations in U do not contribute to the folding time. Equivalently speaking, conformations in U act as an effective single state for the folding kinetics. Therefore, regardless of the existence of the kinetic intermediates, the resultant folding kinetics is two-state (U -> N) and single-exponential.

Noncooperative folding (if detrapping is slow)
Scenario 3: Weakly noncooperative folding kinetics. If one or only a few nonnative stacks exist whose disruption rate kdetrap is slow (i.e., comparable to kf*), the folding would be (weakly) noncooperative. For example, if one slow-disruption nonnative base stack exists, we can describe the kinetics using five clusters (C, In, Inn, Nn, Nnn in Fig. 5 b; see Fig. 6 c for a schematic plot of the free energy landscape). The resultant kinetics would be multi-state and multi-exponential.

The kinetic traps would form kinetic intermediates in the folding process. These intermediates are different from the intermediates that emerged in the two-state cooperative folding process. The kinetic traps in the noncooperative kinetics prevent the formation of the pre-equilibrated cluster I (I = In + Inn), while the intermediates in the cooperative kinetics are the result of the pre-equilibration of the cluster U(= C + I).

Scenario 4: Strongly noncooperative folding kinetics. The folding kinetics becomes strongly noncooperative if the detrapping from an average nonnative stack is slow: <kdetrap> ≤ kf*, where <...> denotes the average over all the possible nonnative base stacks. See Fig. 6 d for a schematic free energy landscape. When the above condition is satisfied, on average, each individual nonnative state should be treated as a separate cluster. As a result, the use of In, Inn, Nn, Nnn becomes invalid and the resultant folding kinetics is strongly multi-state (noncooperative) and strongly multiple-exponential.

Folding rate-temperature dependence
In this and the next section, we use a 21-nt sequence UAUAUCGC7CGAUAUA as an example to illustrate the complex hairpin-folding kinetics using the kinetic-cluster theory. From the previously reported statistical thermodynamic statistical model (41Go), we find the folding/unfolding melting temperature Tm~=50°C for this sequence.

As shown in Fig. 7 b, from the large ({Delta}S, {Delta}H) parameters, the formation and disruption of the native base stack s* = (5Go,6Go,16Go,17Go) = (U,C,G,A) are rate-limiting for the folding and unfolding kinetics due to large {Delta}S = 35.5 eu and {Delta}H = 13.3 kcal/mol, respectively. The rate constants for the formation and disruption of s* are given by Eq. 2: =1.29x106s–1and which is equal to 3.19 x 104 s–1 at T = 37°C.



View larger version (28K):
[in this window]
[in a new window]
 
FIGURE 7  (a) Temperature (T in °C) dependence of the relaxation rate for a hairpin-forming sequence UAUAUC- GC7CGAUAUA. (Solid line) From the original complete conformational ensemble (master-equation method); (dashed line) from the two-cluster model (Scenario 2); and (dotted line) from the five-cluster model (Scenario 3). (b) The native structure and energy parameters of the native state. The shaded stack is the rate-limiting stack. (c) The pathway conformations Ui and Ni (i = 1, 2, ... , 20) in the respective clusters U and N and the corresponding intercluster pathways and rates. U1 -> N1 is the direct folding pathway through the loop closing by the rate-limiting stack. N19 -> U19 is a double-breaking pathway that involves the simultaneous breaking of two stacks (one of which is the rate-limiting stack). U4 is a nonnative pathway conformation because it contains a nonnative base stack.

 
Plotted in Fig. 7 a is the temperature-dependence of the relaxation rate kr for the sequence. The predicted kr-T relationship (Arrhenius plot) is consistent with the experimental findings for some DNA hairpins (21Go). The relaxation rate kr is related to the folding rate kf and the unfolding rate ku through the equation kr = kf + ku for a two-state transition. The kinetics is dominated by the folding reaction for T < Tm and kr~=kf and by the unfolding reaction for T > Tm and kr~=ku. The Arrhenius plot in Fig. 7 a has two characteristic temperatures: Tm~=50°C and a rollover temperature Tr~=10°CC. According to Tm and Tr, the kr-T relationship can be classified into three regimes:
  1. Unfolding at T > Tm: kr ~ ku increases as T is increased because the rate for breaking the stack ~kb* is larger for higher T.
  2. Folding at temperature T between Tr and Tm: kr ~ kf increases as T decreases, indicating a negative apparent activation energy.
  3. Folding at low temperature T < Tr: kr ~ kf decreases as T decreases, indicating a positive activation energy.

What causes the rollover of the rate-temperature dependence plot? How to predict the rollover temperature Tr from the microscopic energetics? In this section, we explore the connections between the macroscopic Arrhenius plot and the microscopic kinetic clusters.

Since the formation of s* is the sole dominant rate-limiting step, we can classify the conformational ensemble into two clusters U and N; see Fig. 5 b. For this 21-nt sequence, there are a total of 20 hairpin conformations in cluster U to which the rate-limiting stack s* can be added through a kinetic move. These 20 conformations (Ui, i = 1, 2, ...20; see Fig. 7 c) are the pathway conformations in cluster U. There are 20 corresponding intercluster pathways (see Fig. 7 c). Among the 20 pathways, U1 (= C) -> N1 is through the direct folding and is extremely slow (see Eq. 11). The rest of the 19 pathways are fast-folding pathways (see Fig. 7 c).

The temperature-dependent competition between the stabilities of the following three types of conformations leads to the complex temperature-dependence of the folding rate: 1), the fast-folding (U2, ..., U20); 2), the slow-folding (U1) pathway conformations; and 3), the nonpathway conformations in cluster U. For the given sequence, as shown in Fig. 8 a, a decrease in temperature causes two competing effects:

  1. The fast-folding pathway conformations U2, ..., U20 are stabilized and the slow-folding conformation U1 are relatively destabilized, causing a faster folding rate.
  2. The nonpathway (nonnative) states in U are stabilized, causing a decrease in the total population of the pathway conformations in cluster U and hence a slower folding rate.



View larger version (11K):
[in this window]
[in a new window]
 
FIGURE 8  (a) The populational distribution of the pathway conformations Ui -> Ni for i = 1, 2, ..., 20 (solid for T = 30°C and shaded for T = 90°C). The total faction population of the nonpathway conformation is listed as the state 21 in the figure. (b) The probability for the molecule to fold along Ui -> Ni (i = 1, 2, ...20) (solid for T = 30°C and shaded for T = 90°C). As temperature T is increased from 30°C to 90°C, the populational distribution is shifted from the fast-folding conformation (i = 20) to the slow-folding conformation (i = 1), causing a slow folding process.

 
For the given sequence, the former effect dominates and thus the folding is accelerated at lower temperatures.

As the temperature is further lowered, at a critical temperature Tr, the Arrhenius plot shows a rollover behavior, i.e., the rate decreases as temperature is decreased. Two possible mechanisms for the rollover exist:

  1. In the cooperative folding regime (Scenario 2), the rollover can be caused by the stabilization of the nonpathway conformations or the slow-folding pathway conformations in cluster U, which leads to a slower (instead of faster) folding rate at lower T.
  2. In a noncooperative folding process (Scenarios 3 and 4), the rollover can be caused by the formation of kinetic traps. In this case, as temperature is lowered, the detrapping rate kdetrap decreases, resulting in a decrease in the overall folding rate. In this case, rollover behavior corresponds to the transition from the cooperative to the noncooperative folding kinetics and Tr is determined by the condition for Scenario 3 to occur, which is kdetrap ~ kf*. For the given sequence, we find Tr = 10°C.

As shown in Fig. 7 a, the two-state model cannot account for the rollover behavior. Only after we introduce the five-cluster model (C, In, Inn, Nn, Nnn) in Scenario 3 can we obtain the rollover. This clearly shows that the rollover for this sequence is caused by the second mechanism above. The strongly multi-state (and glassy) folding kinetics occurs at even lower temperatures.

Folding pathways
We discuss the folding pathways for three typical temperatures below. Fig. 8 shows the plot of the kinetic pathway partitioning probability defined in Eq. 8 for all the 20 U -> N pathways in Fig. 7 c. The dominant pathways have the largest

  1. Cooperative folding at high temperature T = 90°C (T > Tm; Scenario 2). We find that the main folding pathways are U1 -> N1 and U2 -> N2, each contributing ~14.5% and 51.6% to the total folding rate, respectively. This is because the populations of U2 and U1 overwhelmingly dominate the population in cluster U.
  2. Cooperative folding at a lower temperature T = 30°C (Tr < T < Tm; Scenario 2). From Fig. 8 b we find the dominant folding pathway is U20 -> N20. Approximately of the population in U folds along this pathway. This is because of U20 is the most stable fast-folding pathway conformation for T < Tm. As the temperature is increased, the population of the slow-folding (fully unfolded state) U1 increases and that of the fast-folding state U20 decreases, so the folding rate decreases.
  3. Noncooperative folding at T = 10°C (T < Tr; Scenario 3). The folding kinetics can be described by the transitions between five clusters: C, In, Inn, Nn, Nnn. From the unfolded state C, there are a large number of pathways to form a nonnative state in Inn, so the chain would quickly fold to Inn from C and Inn becomes a kinetic trap. This is confirmed in the results of populational kinetics (data not shown). The later stage of the folding involves the detrapping from Inn and the formation of the rate-limiting stack s*. If detrapping is faster, the chain would first detrap, then form s*. As the temperature decreases, detrapping slows down and may eventually becomes slower than the formation of s*, which is T-independent in the model. So in such low T case, the chain would first form s*, then detrap to break the nonnative stacks.


    SUMMARY
 TOP
 ABSTRACT
 INTRODUCTION
 THEORY AND METHODS
 EXPERIMENTAL TESTS FOR THE...
 GENERAL THEORY OF HAIRPIN...
 SUMMARY
 ACKNOWLEDGEMENTS
 REFERENCES
 
The kinetic-cluster approach allows us to study the kinetic rates, rate-limiting steps, and the pathways for biologically significant RNA hairpins. The overall hairpin-folding process can be rate-limited by the formation of the loop, the formation of the slow-forming native base stack, and the disruption of the slow-breaking nonnative base stack. The competition between these different processes leads to the great wealth of different RNA hairpin-folding kinetic scenarios. The detailed folding kinetics is sequence-specific. For many sequences, there exists a temperature Tr such that for T < Tr the folding rate decreases as the temperature is decreased because of the decreased rate for breaking a misfolded nonnative base stack or the increased stability of the (