help button home button Biophys. J.
HOME HELP FEEDBACK SUBSCRIPTIONS ARCHIVE SEARCH TABLE OF CONTENTS

This Article
Right arrow Abstract Freely available
Right arrow Full Text (PDF)
Right arrow Alert me when this article is cited
Right arrow Alert me if a correction is posted
Services
Right arrow Similar articles in this journal
Right arrow Similar articles in PubMed
Right arrow Alert me to new issues of the journal
Right arrow Download to citation manager
Right arrow reprints & permissions
Citing Articles
Right arrow Citing Articles via HighWire
Right arrow Citing Articles via Google Scholar
Google Scholar
Right arrow Articles by Venkataramanan, L.
Right arrow Articles by Sigworth, F. J.
Right arrow Search for Related Content
PubMed
Right arrow PubMed Citation
Right arrow Articles by Venkataramanan, L.
Right arrow Articles by Sigworth, F. J.

Biophys J, April 2002, p. 1930-1942, Vol. 82, No. 4

Applying Hidden Markov Models to the Analysis of Single Ion Channel Activity

L. Venkataramanan* and F. J. Sigworthdagger

 *Schlumberger-Doll Research, Ridgefield, Connecticut 06877 USA; and  dagger Department of Cellular and Molecular Physiology, Yale University, New Haven, Connecticut 06520 USA


    ABSTRACT
TOP
ABSTRACT
INTRODUCTION
COMPUTATION OF THE LIKELIHOOD...
NOISE MODEL AND H-NOISE
ISSUES IN PRACTICAL...
SUMMARY
REFERENCES

Hidden Markov models have recently been used to model single ion channel currents as recorded with the patch clamp technique from cell membranes. The estimation of hidden Markov models parameters using the forward-backward and Baum-Welch algorithms can be performed at signal to noise ratios that are too low for conventional single channel kinetic analysis; however, the application of these algorithms relies on the assumptions that the background noise be white and that the underlying state transitions occur at discrete times. To address these issues, we present an "H-noise" algorithm that accounts for correlated background noise and the randomness of sampling relative to transitions. We also discuss three issues that arise in the practical application of the algorithm in analyzing single channel data. First, we describe a digital inverse filter that removes the effects of the analog antialiasing filter and yields a sharp frequency roll-off. This enhances the performance while reducing the computational intensity of the algorithm. Second, the data may be contaminated with baseline drifts or deterministic interferences such as 60-Hz pickup. We propose an extension of previous results to consider baseline drift. Finally, we describe the extension of the algorithm to multiple data sets.


    INTRODUCTION
TOP
ABSTRACT
INTRODUCTION
COMPUTATION OF THE LIKELIHOOD...
NOISE MODEL AND H-NOISE
ISSUES IN PRACTICAL...
SUMMARY
REFERENCES

Recordings of single ion-channel currents provide a wealth of information about the activity of single allosteric protein molecules. The open-closed behavior of ion channels has generally been described in terms of continuous-time Markov models (Colquhoun and Hawkes, 1995) in which model states are taken to correspond to distinct states of protein conformation or ligand binding. Finding the best Markov model description of a channel's behavior is therefore taken to be equivalent to a complete elucidation of the kinetic behavior of the channel protein with the Markov transition probabilities corresponding directly to rate constants of ligand binding and unbinding and of conformational changes.

Finding the best Markov model involves two steps. First, the general topology of the model must be chosen, specifying the number of states and the connectivity that specify the allowable transitions among states. The second step is the optimization of Markov model parameters. Given a Markov model lambda  with N states, the parameters are the current levels µi corresponding to each state qi, i = 1, ... , N, the initial state probability pi i, and the transition rates contained in an (N × N) matrix Q. Maximum-likelihood techniques are typically used to optimize these parameters, and the likelihood-ratio test is commonly used to identify the best model topologies.

Several methods have been used to compute likelihoods and estimate model parameters from single-channel data. Most commonly, threshold detection is used to identify channel-open and channel-closed intervals; the distributions of these dwell times are then fitted to the predictions of Markov models by maximum-likelihood techniques (Magleby and Weiss, 1990; Colquhoun and Sigworth, 1995). Alternatively, the likelihoods of models are computed on the basis of the entire sequence of open and closed dwell times (Horn and Lange, 1983; Ball and Sansom, 1989; Qin et al., 1997). An improved approach to the identification of open and closed intervals has been introduced through the use of the Viterbi algorithm (Fredkin and Rice, 1992a). Discussed in the present paper is an approach that does not require the identification of open and closed intervals at all but makes use of the raw single-channel recording in the form of sampled time course of membrane current. This application of signal processing based on hidden Markov models (HMMs) has already been demonstrated to be particularly useful in characterizing channel behavior when the signal-to-noise ratios are low and when multiple subconductance levels exist (Chung et al., 1990; Fredkin and Rice, 1992b; Chung and Gage, 1998). An excellent overview of the HMM approach to single channel analysis is given by Qin et al. (2000a,b). Various implementations of the HMM algorithms have been applied to experimental single channel data (Becker et al., 1994; Milburn et al., 1995; Michalek et al., 1999; Farokhi et al., 2000). The algorithms described here have been used in the recent studies by Sunderman and Zagotta (1999a,b), Wang et al. (2000), and Zheng et al. (2001).

In the HMM framework, the discrete-time data Y(t) are assumed to be the sum of Gaussian noise n(t) and the "noiseless" channel current as the channel makes transitions from one state to another. The Markov model lambda  underlying the channel activity is considered to be "hidden" because the channel state at any given time is not directly observable. The channel state may be hidden due to the aggregated nature of the HMM (several states may have the same conductance level) and is also hidden by the noise. To mirror the discrete-time nature of the sampled recording, a discrete-time HMM replaces the usual continuous-time Markov model. The parameters of the discrete-time model are the transition probabilities, contained in a (N × N) matrix A; the current levels µi corresponding to each state qi, i = 1, ... , N; and the initial state probability pi i.

The discrete-time transition probability matrix A = {aij, i, j = 1, ... , N} is related to the matrix of rate constants Q of the usual, continuous Markov model by
A=e<SUP><UP>Q&Dgr;t</UP></SUP>, (1)
in which Delta t is the sampling interval; thus aij is the probability of making a transition from state qi to state qj during one sample interval. The objective of our analysis is to estimate the parameters of the discrete HMM lambda  = {A, µ, pi } from the observed discrete-time data Y(t) while realistically modeling the signal and noise.

Likelihood is defined as the probability of observing a particular data set given a model. The evaluation of the likelihood of HMMs has been made practical by an algorithm called the forward-backward procedure. Optimization of the parameters of the model is aided by the Baum-Welch procedure, which through iteration causes a maximum of the likelihood to be approached. These algorithms have been well described in the speech-recognition literature (Baum et al., 1970; Liporace, 1982; Rabiner, 1989).

The applicability of the HMM algorithms to patch-clamp recordings have been limited by several problems (Qin et al., 2000b). First, the traditional forward-backward procedure is highly efficient but is applicable only if the additive noise is white. In practice, the experimental background noise is correlated from sample to sample and has a spectral density that increases with frequency. Second, the data are usually low-pass filtered by an analog anti-aliasing filter such as a Bessel filter before digitization; this produces intermediate values for the sampled signal in the vicinity of state transitions.

Qin et al. (2000b) have addressed these issues by modeling the background noise by an autoregressive (AR) process, so that the data can be reduced to a higher-order (metastate) Markov process with white noise. Their model allows for different noise models for states of different conductances. Under the strong assumption that the noise in each metastate depends only on the current state, the analytical derivatives of the likelihood function with respect to all unknown model parameters are derived. This optimization approach has the advantage that it optimizes for the rate constants directly and thus easily allows constraints on rate constants such as detailed balance.

Our approach in analyzing the single channel data differs in several respects. First, we follow the traditional Baum-Welch approach to estimate the Markov model parameters. The algorithm is slow at low signal to noise ratios (SNR) but has the advantage that it allows a large number of parameters to be reestimated simultaneously, including for example vertices of a fit to a drifting baseline. Our focus is on the estimation of the transition probability matrix A, which can be considered to be equivalent to estimating the rate matrix Q when the sampling interval is small compared with any of the rate constants in Q. Second, we also model the noise as an AR process but make use of a more general description of state-dependent noise. Last, we model the randomness of sampling relative to transitions in the continuous time Markov process by approximating it by an additive noise source called the "H-noise."

In this paper we also address three issues that arise in the practical application of the algorithm in analyzing single channel data. First, the antialiasing filter used during data acquisition typically has a gradual roll-off in the transition region of the frequency response. However, for better performance and to reduce computational complexity of the algorithms, we prefer the use of an antialiasing filter with a sharp roll-off in the frequency domain. Therefore, we describe the implementation of an inverse filter to remove the effects of the gradual roll-off filter and in effect replace it with a digital filter with a sharp roll-off in the transition region. Second, the data may be contaminated with baseline drifts or deterministic interferences (such as 60-Hz pickup). We propose an extension of the results presented in Chung et al. (1990) to model and compensate for baseline drift. Finally, the traditional HMM algorithms focus on parameter estimation from a single data set. Often, in practice, multiple sets of data are recorded under the same experimental conditions. We describe the extension of the algorithm to multiple observation sequences.

This paper is organized as follows. The second section briefly describes the computation of likelihood and estimation of HMM parameters through use of the standard algorithms. The third section presents the noise model and highlights the main features of the H-noise algorithm that model the noise correlation in successive filtered, sampled data points. The fourth section describes the inverse filter and estimation of baseline from the measured single channel data. We conclude with a discussion of the approach in the fifth section.


    COMPUTATION OF THE LIKELIHOOD AND ITS MAXIMIZATION
TOP
ABSTRACT
INTRODUCTION
COMPUTATION OF THE LIKELIHOOD...
NOISE MODEL AND H-NOISE
ISSUES IN PRACTICAL...
SUMMARY
REFERENCES

The likelihood of the hidden Markov model lambda  is defined as the probability of the observed data samples Y(t) = {yt, t = 1, ... , T} given the model,
L=P(y<SUB>1</SUB>, y<SUB>2</SUB>,…, y<SUB><UP>T</UP></SUB>‖&lgr;). (2)
The brute-force numerical computation of the likelihood is intensive as illustrated in the following example. The top trace in Fig. 1 shows noisy simulated data generated from a two-state Markov model and some of the several possible state sequences Si that could have given rise to the observed data. The likelihood L of the HMM can be calculated by summing the probability of the data and the state sequence over all possible state sequences,
L=P(<UP>data</UP>‖<UP>model</UP>)

=<LIM><OP>∑</OP><LL><UP>i</UP></LL></LIM> P(<UP>data</UP>, S<SUB><UP>i</UP></SUB>‖<UP>model</UP>)

=<LIM><OP>∑</OP><LL><UP>i</UP></LL></LIM> P(<UP>data</UP>‖S<SUB><UP>i</UP></SUB>)P(S<SUB><UP>i</UP></SUB>‖<UP>model</UP>) (3)
In general, let st denote the state of the HMM at time t, and let S = {s1, s2, ... , sT} denote a possible state sequence. The probability of this state sequence given the model lambda  can be computed in terms of the initial state probability pi  and the transition probabilities between states ast-1st as,
P(S‖&lgr;)=&pgr;<SUB><UP>s<SUB>1</SUB></UP></SUB> <LIM><OP>∏</OP><LL><UP>t=2</UP></LL><UL><UP>T</UP></UL></LIM> a<SUB><UP>s<SUB>t−1</SUB>s<SUB>t</SUB></UP></SUB>. (4)
Given a state sequence, the probability of the observed data is obtained as,
P(y<SUB>1</SUB>, y<SUB>2</SUB>,…, y<SUB><UP>T</UP></SUB>‖S)=<LIM><OP>∏</OP><LL><UP>t=1</UP></LL><UL><UP>T</UP></UL></LIM> P(y<SUB><UP>t</UP></SUB>‖s<SUB><UP>t</UP></SUB>) (5)
in which P(yt|st) is the probability of obtaining the observation yt given that the channel is in state st. This probability is considered to be independent of observations at previous times; this is the uncorrelated or "white" noise assumption. Let us define a term called the emission probability bi(y) as the probability of observing datum y when channel is in state qi,
b<SUB><UP>i</UP></SUB>(y)&dgr;y=P(y‖q<SUB><UP>i</UP></SUB>), (6)
in which delta y represents the resolution of measurement y. Assuming that delta y is a constant, we can without loss of generality set delta y = 1. In the case of Gaussian white noise with variance sigma 2 the emission probability is a Gaussian function centered on the mean current µi,
b<SUB><UP>i</UP></SUB>(y)=<FR><NU>1</NU><DE><RAD><RCD>2&pgr;</RCD></RAD>&sfgr;</DE></FR> <UP>exp</UP><FENCE><FR><NU><UP>−</UP>(y−&mgr;<SUB><UP>i</UP></SUB>)<SUP>2</SUP></NU><DE>2&sfgr;<SUP>2</SUP></DE></FR></FENCE>. (7)
and the probability of the observations given the state sequence (Eq. 5) is seen to be the product of T Gaussians. Finally, the likelihood is obtained by summing the product of the probabilities specified in Eqs. 4 and 5 over all possible state sequences,
L=<LIM><OP>∑</OP><LL><UP>all S</UP></LL></LIM> P(y<SUB>1</SUB>, y<SUB>2</SUB>,…, y<SUB><UP>T</UP></SUB>‖S)P(S‖&lgr;), (8)
The evaluation of Eq. 8 is very computationally expensive, involving on the order of (2TNT) calculations.



View larger version (9K):
[in this window]
[in a new window]
 
FIGURE 1   Illustration of the likelihood calculation. The top trace shows simulated data from a two-state Markov model. The underlying state sequence is shown in dashed lines. S1, S2, S3, and S4 are four of the many possible state sequences that could have given rise to the observed data. The likelihood L of the HMM is obtained by summing the joint probability of the data and the state sequence over all possible state sequences.

Forward and backward variables

The forward procedure (Rabiner, 1989; Qin et al., 2000a) offers a vast improvement and can account for both white and correlated noise. The issue of correlated noise is addressed by introduction of the concept of a vector or metastate-HMM (Fredkin and Rice, 1992; Venkataramanan et al., 1998). If st is the state of the HMM at time t, then each k-tuple of successive states (st st-1 ... st-k+1) forms a "metastate." A metastate at time t is denoted by the notation I = (qi0 qi1 ... qik-1) in which qij is the state of the HMM at time t - j, j = 0, ... , k - 1 and ij = 1, ... , N.

The forward variable at time t in metastate I is defined as the probability of the first t data measurements and being in metastate I at time t given that the model is lambda , or
&agr;<SUB><UP>t</UP></SUB>(i<SUB>0</SUB>, i<SUB>1</SUB>,…, i<SUB><UP>k−1</UP></SUB>)=P(y<SUB>1</SUB>, y<SUB>2</SUB>,…, y<SUB><UP>t</UP></SUB>, I‖&lgr;). (9)
The forward variable at any time instant in any state can be computed efficiently in terms of the forward variable at the previous time instant and the emission probability. The forward variables at the final time instant can be used to compute the likelihood L of the HMM,
L=<LIM><OP>∑</OP><LL><UP>I</UP></LL></LIM> &agr;<SUB><UP>T</UP></SUB>(I) (10)
In addition, the forward variable together with the backward variable defined below can be used in the estimation of the HMM parameters using the Baum-Welch procedure.

Let Yt be the vector of k samples ending at time t. The backward variable at time instant t in metastate I is defined as the probability of the last T - t data points given that the model lambda  is in metastate I at time t and (k - 1) previous data samples, or
&bgr;<SUB><UP>t</UP></SUB>(i<SUB>0</SUB>, i<SUB>1</SUB>,…, i<SUB><UP>k−1</UP></SUB>)=P(y<SUB><UP>t+1</UP></SUB>, y<SUB><UP>t+2</UP></SUB>,…, y<SUB><UP>T</UP></SUB>‖Y<SUB><UP>t</UP></SUB>, I, &lgr;). (11)
Similar to the forward variable, it can be computed at each time step in terms of the backward variable at the next time step and the emission probability associated with the metastate at that time.

Reestimation

The Baum-Welch reestimation algorithm uses the forward and backward variables to update the model parameters. Let gamma t(I) be the probability of being in metastate I at time t given all of the observed data and the model lambda ,
&ggr;<SUB><UP>t</UP></SUB>(I)=P(I‖y<SUB>1</SUB>, y<SUB>2</SUB>,…, y<SUB><UP>T</UP></SUB>, &lgr;). (12)
It can be computed from the forward and backward variables as,
&ggr;<SUB><UP>t</UP></SUB>(I)=<FR><NU>&agr;<SUB><UP>t</UP></SUB>(I)&bgr;<SUB><UP>t</UP></SUB>(I)</NU><DE>L</DE></FR>, t=1,…, T. (13)
Using the gamma  variables, new estimates can be made of the various model parameters. Let M0 be the set of metastates where the state at time t is qi0 and state at time t - 1 is qi1. Let M1 be the set of metastates where the state at time t - 1 is qi1. The reestimated transition probability from state qi1 to qi0 can be understood as the ratio of the expected number of transitions from state qi1 to qi0 divided by the number of transitions from state qi1,
<A><AC>a</AC><AC>ˆ</AC></A><SUB><UP>i<SUB>1</SUB>i<SUB>0</SUB></UP></SUB>=<FR><NU><LIM><OP>∑</OP><LL><UP>t=1</UP></LL><UL><UP>T</UP></UL></LIM> <LIM><OP>∑</OP><LL><UP>I∈M<SUB>0</SUB></UP></LL></LIM> &ggr;<SUB><UP>t</UP></SUB>(I)</NU><DE><LIM><OP>∑</OP><LL><UP>t=1</UP></LL><UL><UP>T</UP></UL></LIM> <LIM><OP>∑</OP><LL><UP>I∈M<SUB>1</SUB></UP></LL></LIM> &ggr;<SUB><UP>t</UP></SUB>(I)</DE></FR>. (14)
The reestimated current levels can similarly be obtained by solving a linear system of equations (Venkataramanan et al., 1998). The forward-backward procedure followed by the Baum-Welch reestimation formulae is a special case of the expectation maximization algorithm (Baum et al., 1970). Carried out iteratively, it is guaranteed to converge to a local maximum of the likelihood function. The algorithm converges slowly at low SNRs. In practical problems, the algorithm takes on the order of a hundred iterations to converge to the local maximum (Qin et al., 2000b; Zheng et al., 2001).

It is also possible to reconstruct the original, noiseless signal from the measured data in a number of ways. This reconstructed signal is model-dependent and is therefore not unbiased; however it gives the user useful feedback about the performance of the algorithm. In one approach, the Viterbi algorithm can be used to provide the most probable state sequence underlying the measured data (Rabiner, 1989). Alternatively, the cumulative mean or maximum a posteriori probability estimate of the signal can also be easily reconstructed from the a posteriori probabilities (Chung et al., 1990). The cumulative mean estimate gives the expectation value of the channel current based on the probabilities gamma  at each time point. This estimate is given by
<A><AC>x</AC><AC>ˆ</AC></A><SUB><UP>t</UP></SUB>=<LIM><OP>∑</OP><LL><UP>I</UP></LL></LIM> &mgr;<SUB><UP>i<SUB>0</SUB></UP></SUB>&ggr;<SUB><UP>t</UP></SUB>(I) (15)
in which µi0 is the channel current in the final state of metastate I.


    NOISE MODEL AND H-NOISE
TOP
ABSTRACT
INTRODUCTION
COMPUTATION OF THE LIKELIHOOD...
NOISE MODEL AND H-NOISE
ISSUES IN PRACTICAL...
SUMMARY
REFERENCES

The applicability of hidden Markov models to analyze single channel recordings is limited by two problems. First, the experimental background noise is correlated and has a spectral density that increases with frequency. Second, to remove high-frequency noise components, the data are necessarily filtered before sampling. The problem with filtering is illustrated with the following example. The top trace in Fig. 2 is the output of a continuous-time Markov model with two states---an open state and a closed state. The second trace is the signal after passing through an antialiasing filter. The last trace shows the filtered data after sampling. As seen in the last trace, when a channel closes from an open conductance level, there might be a sample taken on the falling edge of the filtered signal. Thus when the channel closes from an open state, the data will consist of a sequence of open current levels, one sample at a random intermediate level and a sequence of samples at the closed level. The problem is to distinguish the presence of the intermediate sample from a two-state channel from the case of a channel passing through a sublevel at say, one-half amplitude. We describe below the "H-noise" algorithm, which addresses the issues of filtering, correlated background noise, and the randomness of sampling relative to transitions.



View larger version (12K):
[in this window]
[in a new window]
 
FIGURE 2   Effects of filtering and sampling continuous-time data. (A) Example of continuous-time data from a two-state HMM. (B) Data after passing through an antialiasing filter. (C) Filtered, sampled data. The impulse response of the filter that was used is given by Eq. 23 with the parameters sigma f = 1 and fc/fs = 0.4.

Noise model for correlated background noise

The background current noise in patch-clamp recordings has a spectral density whose dependence on frequency f is ideally described by S(f) = k0 + k1f2 over the experimentally-accessible frequency range from DC to ~100 kHz (Levis and Rae, 1993; Sigworth, 1995). The white noise component k0 is due to leakage conductances and shot noise associated with the patch-clamp amplifier. The f2-noise term is due to the amplifier voltage noise, which is imposed on the input capacitance; the result is the differential of a white noise process.

In one approach to handle the signal with correlated noise, the observed data Y(t) are passed through a filter that removes the correlation in the noise. The correlated noise can thus be whitened by convolving it with a finite vector of filter coefficients comprising a so-called moving average filter. This is equivalent to modeling the original correlated noise as the output of the inverse of a moving average filter (an AR filter) driven by white Gaussian noise. Because the noise correlation may be unknown, the coefficients of the AR filter are also considered to be unknown. The H-noise algorithm provides a framework for estimation of the AR coefficients from the observed data.

The simulation shown in Fig. 3 compares the performance of the traditional forward-backward and Baum-Welch algorithms on data with additive white and correlated noise. The traditional algorithms assume that the background noise is white. In this case, the number of elements k in a metastate can be reduced to 1, the metastate keeps track of only the current state, and Eqs. 9 and 11 reduce to the standard definition of the forward and backward variables (Rabiner, 1989). Noiseless data were generated from a two-state model consisting of one closed state and one open state,
C <LIM><OP><ARROW>⇄</ARROW></OP><LL>a<SUB><UP>OC</UP></SUB></LL><UL>a<SUB><UP>CO</UP></SUB></UL></LIM> O (I)
with transition probabilities aCO = 0.3 and aOC = 0.1 and unit current amplitude. In the first simulation, the data Y(t) were simulated by adding zero-mean, white, Gaussian noise n(t) with variance sigma <UP><SUB>w</SUB><SUP>2</SUP></UP> = 1.0 to the noiseless signal. In the second simulation, the data Y(t) were simulated by adding correlated, Gaussian noise generated by passing white, Gaussian noise through a finite impulse response filter. These data were analyzed with a two-state HMM. The log-likelihood contour plot for data with white and correlated noise is shown in Fig. 3 (C and E), respectively. It can be seen in Fig. 3 C that the true values of (aCO, aOC) = (0.3, 0.1) are close to the dotted two-unit log-likelihood contour. This indicates that the traditional forward-backward and Baum-Welch algorithm performs well when applied to data with additive white noise. On the other hand, for data with correlated noise, it is clear that the maximum value of (aCO, aOC) is near (0.9, 0.9) and is very far from the true values of (aCO, aOC). Thus, the traditional forward-backward and Baum-Welch algorithm perform poorly and yield biased estimates of the parameters when the noise is correlated.



View larger version (40K):
[in this window]
[in a new window]
 
FIGURE 3   HMM likelihood estimation with white and colored noise. (A) Displayed are the first 1024 of the 20,000 data points of noiseless signal simulated from a two-state Markov model consisting of one closed state (C) and an open state (O). The values of the transition probabilities (aCO, aOC) were chosen to be (0.3, 0.1). The closed state had a current level of 0 and the open state a current level of 1. (B) White Gaussian noise with sigma w = 1.0 was added to the noiseless signal. Displayed are the first 1024 points of the data. (C) Log-likelihood contours obtained as the current levels, the initial state probability vector, and the noise variance were reestimated while fixing the two transition probabilities (aCO, aOC) at various values. The cross represents the true values of (aCO, aOC). The dotted contour represents points at two natural log units below the peak. The arrows show the sequence of reestimated values of the parameters obtained with iteration. The true values of (aCO, aOC) are seen to be close to this contour. (D) Colored noise was added to the signal by passing unity-variance, white noise w(t) through a filter of the form,
c<SUB><UP>t</UP></SUB>=m<SUB>0</SUB>w<SUB><UP>t</UP></SUB>+m<SUB>1</SUB>w<SUB><UP>t−1</UP></SUB>.
with [m0 m1] = [0.8 - 0.6]. Displayed are 1024 points of the 20,000 points of simulated data. (E) Displayed are the log-likelihood contours obtained by traditional HMM analysis. The maximum value of (aCO, aOC) is near (0.9, 0.9) and is very far from the true values indicated by the cross.

H-noise algorithm

This algorithm considers the effects of the antialiasing filter before digitization of the data as well as correlation in the background noise, making use of the metastate framework. The randomness in the time of transition between states of the underlying continuous time Markov process is considered to be reflected as a randomness in the current level at each time instant at the output of the antialiasing filter. This randomness in the current level of a state at each time instant is modeled as a fictitious, nonstationary noise source called H-noise (Venkataramanan et al., 2000).

Shown in Fig. 4 A are three of the infinitely many possible realizations of making a transition from the closed state to the open state and back to the closed state. The response of the antialiasing filter to these three realizations of state transitions is shown in Fig. 4 B. Correspondingly, the current amplitude at each sampling time instant can be thought of as a random variable. As shown in Fig. 4 C, the random current amplitude at each time instant in the metastate can be regarded as the sum of two components---a mean current amplitude and an additional fictitious noise called the H-noise. The H-noise is nonstationary because its statistics are dependent on the underlying metastate; however, these statistics are easily computed from the known step response H of the antialiasing filter. Thus, the total noise in each sample has two components---the correlated background noise and the H noise. In the development of the algorithm, it is assumed that the data are sampled sufficiently quickly such that at most one transition takes place in a sampling interval. The H-noise is also assumed to be Gaussian, which is a reasonable approximation under poor signal to noise ratios.



View larger version (11K):
[in this window]
[in a new window]
 
FIGURE 4   Origin of H-noise: randomness in time of transition between states in a sampling interval is reflected as randomness in current amplitude. (A) Three possible realizations of making a transition from a closed state to open state and back to a closed state. (B) Signal at the output of an anti-aliasing filter (same filter as in Fig. 2) from the three possible realizations in A. (C) Randomness in time of transition between states causes randomness in current amplitude at each sampling time, which is modeled as a fictitious noise called H-noise. Error bars indicate the standard deviation of the H-noise.

Correlated emission probability

The H-noise can be easily incorporated into the metastate HMM framework through defining the emission probability function to consider the noise correlation in successive data samples. This emission probability is then used in the computation of the forward and backward variables.

Recalling that Yt is the vector of k successive data samples ending at time t, let the (k - 1) data samples ending at time t - 1 be denoted by <A><AC>Y</AC><AC>&cjs1171;</AC></A>t-1. The correlated emission probability can be denoted by P(yt|<A><AC>Y</AC><AC>&cjs1171;</AC></A>t-1, I) and is the conditional probability of yt given the metastate I at time t and the (k - 1) previous data samples. It is a Gaussian function of the present sample yt and <A><AC>Y</AC><AC>&cjs1171;</AC></A>t-1 and is given by,
P(y<SUB><UP>t</UP></SUB>‖<A><AC>Y</AC><AC>&cjs1171;</AC></A><SUB><UP>t−1</UP></SUB>, I)=<FR><NU>1</NU><DE><RAD><RCD>2&pgr;</RCD></RAD>&sfgr;<SUB><UP>I</UP></SUB></DE></FR> <UP>exp</UP><FENCE><FR><NU><UP>−</UP>(h<SUP><UP>T</UP></SUP><SUB><UP>I</UP></SUB>(Y<SUB><UP>t</UP></SUB>−&Xgr;m<SUB><UP>I</UP></SUB>))<SUP>2</SUP></NU><DE>2&sfgr;<SUP><UP>2</UP></SUP><SUB><UP>I</UP></SUB></DE></FR></FENCE> (16)
in which m<UP><SUB>I</SUB><SUP>T</SUP></UP> = [µi0 µi1 ... µik-1] is the vector of current levels in metastate I. In Eq. 16, Xi  is referred to as the filter matrix and can be easily computed from the known step response of the antialiasing filter (Venkataramanan et al., 2000). The vector Xi mI denotes the modified current levels corresponding to the different states in metastate I after the temporal mixing of the samples due to the antialiasing filter. The emission probability is maximum when the measured data exactly equals the filtered current levels, i.e., Yt = Xi mI. The different terms in the residual (Yt - Xi mI) are weighted and summed to cancel out their correlations according to the vector hI.

In the special case when the effects of the antialiasing filter and therefore H-noise are ignored, the only contribution to the total noise in each metastate comes from the correlated background noise which is stationary. In this case, the filter matrix Xi  equals an identity matrix. The parameters hI and sigma <UP><SUB>I</SUB><SUP>2</SUP></UP> then have a ready physical explanation. The vector hI is the same for all metastates and corresponds to the coefficients of the moving average prewhitening filter, which would whiten the correlated noise. The parameter sigma <UP><SUB>I</SUB><SUP>2</SUP></UP> is the noise variance at the output of this prewhitening filter. These parameters hI and sigma <UP><SUB>I</SUB><SUP>2</SUP></UP> can be estimated from the estimated correlation Sigma C of the colored background noise.

When the effects of the antialiasing filter are considered, due to the nonstationary nature of the H-noise, the total noise in each metastate is also nonstationary. Let C and O represent the closed and open states of a channel. The variance of the H-noise is highest in metastates that consist of different conductance states such as (C, O, C) or (O, C, O) and is zero for metastates that consist of the same conductance such as (C, C, C) or (O, O, O). Analogous to the previous case, the parameters hI and sigma <UP><SUB>I</SUB><SUP>2</SUP></UP> are referred to as the metastate moving average (MMA) parameters corresponding to metastate I. They can be easily found from an estimate of the covariance matrix Sigma I of noise in the metastate.

Covariance of H-noise

Because the total noise in metastate I is the sum of two independent components (the correlated background noise and the fictitious H-noise) the second order moments of the total noise Sigma I are the sum of the corresponding second order moments of the correlated background noise and the H-noise in the metastate. Thus,
&Sgr;<SUB><UP>I</UP></SUB>=&Sgr;<SUB><UP>C</UP></SUB>+&Sgr;<SUB><UP>H<SUB>I</SUB></UP></SUB>. (17)
in which Sigma C is the autocorrelation of the colored background noise and Sigma HI denotes the covariance matrix of H-noise corresponding to metastate I. It can be computed in terms of the known step response of the antialiasing filter and the estimated current levels of the states.

The time of transition from one state to the next is random relative to the sampling time. Let theta 0 be a random variable distributed between 0 and 1, t - theta 0 indicating the precise time, relative to the sampling time, of a transition that occurs in the interval (t - 1 t) (Fig. 5). Similarly, let theta 1, theta 2, ... , theta k-2 be independent random variables, such that t - j - theta j, j = 0, ... , k - 2 indicates the precise time of transition from the state at time t - (j + 1) to the state at time t - j. Let theta  = {theta 0, theta 1, ... , theta k-2} be the set of these random variables. Our goal is to compute the mean and second moments of variations (the H-noise) in the filtered signal that arise from the random theta .



View larger version (5K):
[in this window]
[in a new window]
 
FIGURE 5   Definition of the variables theta j, which specify the times of transition relative to the sampling times in metastate I. The precise time of transition from the state at time t - (j + 1) to the state at time t - j is given by the random variable t - j - theta j, j = 0, ... , k - 2.

Let m<UP><SUB>I</SUB><SUP>T</SUP></UP> = [µi0 µi1 ... µik-1] be the vector of current levels associated with the metastate I. For a given value of theta , let &mtilde;I(theta ) denote the current levels associated with the metastate modified due to antialiasing filter. Recalling that the covariance of two random variables x1 and x2 is defined as
<UP>cov</UP>(x<SUB>1</SUB>, x<SUB>2</SUB>)=E(x−<A><AC>x</AC><AC>&cjs1171;</AC></A><SUB>1</SUB>)(x−<A><AC>x</AC><AC>&cjs1171;</AC></A><SUB>2</SUB>) (18)
in which the expectation operator E( ) is taken over all possible values of variables x1 and x2 and x1 and x2 denote the mean values of x1 and x2, the covariance matrix of H-noise is similarly obtained as
&Sgr;<SUB><UP>H<SUB>I</SUB></UP></SUB>=E<SUB>ϑ</SUB>([<A><AC>m</AC><AC>˜</AC></A><SUB><UP>I</UP></SUB>(ϑ)−&Xgr;m<SUB><UP>I</UP></SUB>]<SUP><UP>T</UP></SUP>[<A><AC>m</AC><AC>˜</AC></A><SUB><UP>I</UP></SUB>(ϑ)−&Xgr;m<SUB><UP>I</UP></SUB>]) (19)
in which Eϑ( ) denotes the expectation operator taken over all possible values of theta . The matrix Xi  is chosen such that the vector Xi mI equals the mean value of &mtilde;I(theta ), the average again being taken over possible values of theta . Because each element of theta  uniformly takes values between 0 and 1, the above equation can be further reduced to,
&Sgr;<SUB><UP>H<SUB>I</SUB></UP></SUB>=(&mgr;<SUB><UP>i<SUB>0</SUB></UP></SUB>−&mgr;<SUB><UP>i<SUB>1</SUB></UP></SUB>)<SUP>2</SUP>V<SUB>0</SUB>+(&mgr;<SUB><UP>i<SUB>1</SUB></UP></SUB>−&mgr;<SUB><UP>i<SUB>2</SUB></UP></SUB>)<SUP>2</SUP>V<SUB>1</SUB>+… (20)

+(&mgr;<SUB><UP>i<SUB>k−2</SUB></UP></SUB>−&mgr;<SUB><UP>i<SUB>k−1</SUB></UP></SUB>)<SUP>2</SUP>V<SUB><UP>k−2</UP></SUB>,
in which the matrices Vj, j = 0, ... , k - 2 are given by,
V<SUB><UP>j</UP></SUB>=<UP>cov</UP>(H(&thgr;+j−m+1), H(&thgr;+j−n+1)). (21)
These matrices can be computed from the known step response H of the antialiasing filter as theta  uniformly takes values between 0 and 1.

Reestimation

Once the emission probability has been computed, the forward-backward and Baum-Welch algorithm can be used to calculate the likelihood and reestimate the HMM parameters (Venkataramanan et al., 2000). Briefly, our strategy for estimation of the noise model parameters from initial estimates is as follows. The matrix Xi  is fixed, being determined by the antialiasing filter. The reestimation of MMA parameters is equivalent to reestimating matrix Sigma C. The MMA reestimation is similar to reestimation of elements of the transition probability matrix A in Eq. 14. For example, in the case that the noise statistics are the same for all metastates, the noise variance can be reestimated as,
<A><AC>&sfgr;</AC><AC>ˆ</AC></A><SUP>2</SUP>=<FR><NU><LIM><OP>∑</OP><LL><UP>I</UP></LL></LIM> <LIM><OP>∑</OP><LL><UP>t=1</UP></LL><UL><UP>T</UP></UL></LIM> &ggr;<SUB><UP>t</UP></SUB>(I)((h<SUP><UP>T</UP></SUP><SUB><UP>I</UP></SUB>Y<SUB><UP>t</UP></SUB>)−&Xgr;m<SUB><UP>I</UP></SUB>)<SUP>2</SUP></NU><DE><LIM><OP>∑</OP><LL><UP>I</UP></LL></LIM> <LIM><OP>∑</OP><LL><UP>t=1</UP></LL><UL><UP>T</UP></UL></LIM> &ggr;<SUB><UP>t</UP></SUB>(I)</DE></FR>. (22)
The reestimation formulas for other noise model parameters are given in Venkataramanan et al. (2000).


    ISSUES IN PRACTICAL IMPLEMENTATION
TOP
ABSTRACT
INTRODUCTION
COMPUTATION OF THE LIKELIHOOD...
NOISE MODEL AND H-NOISE
ISSUES IN PRACTICAL...
SUMMARY
REFERENCES

Inverse filter

During acquisition of the data, the analog data are passed through an antialiasing filter to remove the high frequency noise and then are sampled at discrete times. Because the underlying signal can be thought of as a stochastic square wave, an analog filter with a sharp roll-off in the transition region in the frequency response produces overshoot and ringing phenomena. This is undesirable if the data are analyzed using traditional techniques such as threshold detection. Therefore, an analog filter with a gradual roll-off and linear phase response such as a Bessel filter has been conventionally used.

For analysis of the data using our algorithm, a filter with a sharp roll-off in the transition region is preferred for the following reasons. First, the noise spectral density is found to increase with frequency. Therefore, for the same cut-off frequency, a filter with a sharp roll-off has a smaller noise standard deviation than a filter with a gradual roll-off. Second, use of a filter with a gradual roll-off requires that the data be sampled at several times the Nyquist rate to prevent aliasing (Colquhoun and Sigworth, 1995). This would necessitate a large size for the metastate to characterize the long step response, thus increasing the computational complexity of the HMM algorithms. On the other hand, analog filters with a sharp roll-off characteristic do not have linear phase response, which can cause problems in analyzing the data.

Qin et al. (2000b) make use of a digital inverse filter to shorten the step response of the recording system. We describe here our implementation of a digital inverse filter that accomplishes a similar purpose. Its function is to remove the effects of the data acquisition system (including the effects of the patch clamp amplifier and the analog antialiasing filter) and produce a desired impulse response. We use an impulse response of the form (Venkataramanan, 1998)
h(n)=<UP>exp</UP><FENCE><UP>−</UP><FR><NU>f<SUP><UP>2</UP></SUP><SUB><UP>x</UP></SUB>n<SUP>2</SUP></NU><DE>&sfgr;<SUP><UP>2</UP></SUP><SUB><UP>f</UP></SUB></DE></FR></FENCE> <FR><NU><UP>sin</UP>(2&pgr;f<SUB><UP>x</UP></SUB>n)</NU><DE>(2&pgr;f<SUB><UP>x</UP></SUB>n)</DE></FR>, f<SUB><UP>x</UP></SUB>=<FR><NU>f<SUB><UP>c</UP></SUB></NU><DE>f<SUB><UP>s</UP></SUB></DE></FR> (23)
Here, fx is the ratio of the bandwidth of the filter fc to the data sampling rate fs. The parameter sigma f controls the width and slope of the frequency response in the transition region. If sigma f 1, the filter is essentially Gaussian with bandwidth fc/sigma f. If sigma f 1, the filter has a sharp roll-off in the transition region with bandwidth fc.

The inverse filter provides two main advantages. First, it provides the ability to increase or decrease the filter bandwidth and therefore allows the analysis of the data at a cut-off and sampling frequency that may be different from that used during data acquisition. Second, the resulting filter cut-off frequency can be placed close to one-half the sampling frequency. This allows the use of a lower-order noise model and a smaller size metastate, thus reducing the computational complexity of the HMM algorithms.

The digital inverse filter is built as follows. Let t(n) and e(n) denote the desired impulse response and measured impulse response of the acquisition system. Let T(omega ) and E(omega ) denote their corresponding Fourier transforms, here expressed as functions of the angular frequency omega  = 2pi f. The impulse response hinv(n) of the inverse filter is then given by
h<SUB><UP>inv</UP></SUB>(n)=F<SUP>−1</SUP><FENCE><FR><NU>T(&ohgr;)</NU><DE>E(&ohgr;)</DE></FR></FENCE> (24)
in which F-1 denotes the inverse Fourier transform operation. Direct measurement of the impulse response e(n) of the recording system is difficult to perform due to the limited dynamic range of the patch-clamp amplifier. Instead, the unit step response of the system denoted by u(n) can easily be obtained by applying a triangle wave voltage through a capacitor into the head stage amplifier (Sigworth, 1995). In the Fourier domain, the impulse response of the acquisition system is related to its step response by
E(&ohgr;)=i&ohgr;U(&ohgr;). (25)
Let ê(n) be an approximation to e(n), obtained by passing the measured step response through a first-order differentiator of the form,
<A><AC>e</AC><AC>ˆ</AC></A>(n)=u(n)−u(n−1). (26)
Taking the Fourier transform of Eq. 26, we obtain,
<A><AC>E</AC><AC>ˆ</AC></A>(&ohgr;)=U(&ohgr;)[1−e<SUP><UP>−i&ohgr;</UP></SUP>]=U(&ohgr;)e<SUP><UP>−i&ohgr;/2</UP></SUP>[e<SUP><UP>i&ohgr;/2</UP></SUP>−e<SUP><UP>−i&ohgr;/2</UP></SUP>]=2iU(&ohgr;)e<SUP><UP>−i&ohgr;/2</UP></SUP><UP> sin</UP>(&ohgr;/2) (27)
From Eqs. 25 and 27,
<FR><NU>E(&ohgr;)</NU><DE>&ohgr;</DE></FR>=<FR><NU><A><AC>E</AC><AC>ˆ</AC></A>(&ohgr;)</NU><DE>2 <UP>sin</UP>(&ohgr;/2)e<SUP><UP>−i&ohgr;/2</UP></SUP></DE></FR> (28)
Therefore, from Eqs. 24 and 28,
h<SUB><UP>inv</UP></SUB>(n)=F<SUP>−1</SUP><FENCE><FR><NU>2T(&ohgr;)<UP>sin</UP>(&ohgr;/2)e<SUP><UP>−i&ohgr;/2</UP></SUP></NU><DE><A><AC>E</AC><AC>ˆ</AC></A>(&ohgr;)&ohgr;</DE></FR></FENCE> (29)
in which the factor e-iomega /2 represents a time delay of one-half sample interval and can be ignored.

The impulse response of the inverse filter is given by Eq. 29 and is computed from the desired impulse response and the measured step response. Of course, the inverse filter coefficients are more accurately obtained when the sampling rate of the step response is larger than the data sampling rate. To avoid artifacts due to noise in the experimental ê(n), we multiply it by a raised cosine window function that is centered on the impulse. To automate the construction of the window, we determine its location and width from the location and width of the impulse, as determined from the first and second moments of |ê(n)|.

In our implementations, values of fc/fs are in the range of 0.4 to 0.5. The cut-off frequency can further be reduced when the data are decimated after inverse filtering. Following are some guidelines in choosing the decimation factor. The greater the decimation factor, the smaller the bandwidth of the inverse filter. This results in good signal to noise ratios in the inverse filtered data as more noise is removed. However, a significant amount of signal information is lost due to the low cut-off frequency of the filter. This will lead to biased estimates of parameters of the Markov model if the transition rates approach the sampling rate. On the other hand, the smaller the decimation factor, the lower is the SNR of the inverse filtered data. Therefore, to obtain unbiased parameter estimates, the noise model must be more accurate leading to larger size of the metastate to capture the correlation in successive noise samples (Venkataramanan et al., 1998). The large size of the metastate leads to higher computational intensity of the H-noise algorithm. In practice, the decimation factor should be chosen such that the final sampling frequency of the data fs should be larger than the expected highest rate in the Markov scheme. It should be noted that the inverse filter can be successfully constructed with an increase in bandwidth, if the frequency response of the acquisition system has a significant magnitude (at least roughly 1/10 of the maximum magnitude) at the desired bandwidth.

An example of the step response and the calculated frequency response of a patch-clamp recording system with a 30-kHz Bessel filter is shown in Fig. 6 (A and B) in the time and frequency domains (dashed line). The step response convolved with the inverse filter coefficients results in a sharp-cutoff filter with fc = 80 kHz and is indicated by the solid lines. The 16-point inverse filter kernel is shown in Fig. 6 (C and D). There exists an overshoot in the final step response (visible also in Fig. 2, where a filter of the form of Eq. 23 was also used). However, data analysis using the algorithms described in this paper are insensitive to the overshoot because the filter matrix Xi  explicitly takes the shape of the step response into account.



View larger version (13K):
[in this window]
[in a new window]
 
FIGURE 6   Construction of the inverse filter used in Zheng et al. (2001). (A) Measured step response of the recording system (dashed line), which included an 8-pole, 30-kHz Bessel filter. The sampling frequency was 200 kHz. The final step response that was obtained by convolution with the inverse filter coefficients is shown by the solid line. (B) Frequency response of the recording system (dashed curve) and the desired frequency response after inverse filtering (solid curve). In building the inverse filter, the parameters were assigned sigma f = 1/<RAD><RCD>2</RCD></RAD> and fc/fs = 0.4, yielding an 80-kHz cutoff frequency. (C) The sixteen inverse filter coefficients hinv(n). (D) The frequency response of the inverse filter.

Baseline estimation

In addition to the Gaussian noise, it is often seen that single-channel data are contaminated with nonrandom interferences such as a sinusoidal hum from the alternating current mains. In this subsection, we extend the results in Chung et al. (1990) to estimate the baseline iteratively from the single channel data in a metastate HMM framework.

The observed single channel data are modeled as the sum of three components---a noiseless signal that consists of current levels of the states obtained as the channel makes transitions from one state to another, Gaussian noise, and the drifting baseline eta (t) = {eta t, t = 1, ... , T}. We present below a method to characterize and extract the baseline from the single channel data.

Generally, baseline drifts are slow variations in the background patch current. Therefore, the baseline can be modeled as a piecewise linear function. Let T denote the total number of points in the data set. Let the data set be divided into r segments of Delta  points each, therefore r = T/Delta . Let the baseline be approximated by a piecewise linear function of the form,
&eegr;<SUB><UP>t</UP></SUB>=<LIM><OP>∑</OP><LL><UP>j=0</UP></LL><UL><UP>r</UP></UL></LIM> &kgr;<SUB><UP>j</UP></SUB>f<SUB><UP>j</UP></SUB>(t) (30)
in which
f<SUB><UP>j</UP></SUB>(t)=<FENCE><AR><R><C>1 −<FR><NU>‖t−j&Dgr;‖</NU><DE>&Dgr;</DE></FR>,</C></R><R><C>‖t−j&Dgr;‖<&Dgr;, t=1,…, T</C></R><R><C>0  <UP>otherwise</UP>.</C></R></AR></FENCE> (31)
A graph of the function fj(t) is shown in Fig. 7.



View larger version (4K):
[in this window]
[in a new window]
 
FIGURE 7   The baseline kernel function fj(t) given by Eq. 32.

The parameters kappa j(j = 0, ... , r) in Eq. 30 are referred to as the vertices of the baseline. They are estimated iteratively in a metastate framework from the measured data Y(t) as follows. Step 1) The baseline is computed from the initial or previous estimates of kappa j(j = 0, ... , r). From Eqs. 30 and 31, the baseline at time t is computed from the vertices kappa j as
&eegr;<SUB><UP>t</UP></SUB>=&kgr;<SUB><UP>j</UP></SUB>+(&kgr;<SUB><UP>j+1</UP></SUB>−&kgr;<SUB><UP>j</UP></SUB>)<FENCE><FR><NU>t</NU><DE>&Dgr;</DE></FR>−j</FENCE>, j&Dgr;≤t≤(j+1)&Dgr; (32)
Step 2) The baseline eta (t) is subtracted from the data Y(t). The resulting data are then analyzed with the H-noise algorithm. Step 3) After obtaining the probability variables gamma t(I) from the forward-backward algorithm, the vertices kappa j(j = 0, ... , r) are updated. Let <A><AC>&kgr;</AC><AC>ˆ</AC></A>n denote the reestimated values of the vertices and let <A><AC>&eegr;</AC><AC>ˆ</AC></A>t = Sigma <UP><SUB>j=0</SUB><SUP>r</SUP></UP> <A><AC>&kgr;</AC><AC>ˆ</AC></A>jfj(t). On the lines of the Appendices in Venkataramanan (1998), a closed form expression for the reestimation of <A><AC>&kgr;</AC><AC>ˆ</AC></A>j can be obtained by solving the system of equations,
&OHgr;<A><AC>&kgr;</AC><AC>ˆ</AC></A>=&Ggr; (33)
in which <A><AC>&kgr;</AC><AC>ˆ</AC></A>T = [<A><AC>&kgr;</AC><AC>ˆ</AC></A>0 <A><AC>&kgr;</AC><AC>ˆ</AC></A>1 ... <A><AC>&kgr;</AC><AC>ˆ</AC></A>r] is the vector of reestimated vertices. Here,
&OHgr;<SUB><UP>nj</UP></SUB>=<LIM><OP>∑</OP><LL><UP>I</UP></LL></LIM> <LIM><OP>∑</OP><LL><UP>t</UP></LL></LIM> <FR><NU>&ggr;<SUB><UP>t</UP></SUB>(I)</NU><DE>&sfgr;<SUP><UP>2</UP></SUP><SUB><UP>I</UP></SUB></DE></FR> (h<SUP><UP>T</UP></SUP><SUB><UP>I</UP></SUB>F<SUB><UP>n</UP></SUB>(t))(h<SUP><UP>T</UP></SUP><SUB><UP>I</UP></SUB>F<SUB><UP>j</UP></SUB>(t)), n, j=0,…, r (34)
and
&Ggr;<SUB><UP>n</UP></SUB>=<LIM><OP>∑</OP><LL><UP>I</UP></LL></LIM> <LIM><OP>∑</OP><LL><UP>t</UP></LL></LIM> <FR><NU>&ggr;<SUB><UP>t</UP></SUB>(I)</NU><DE>&sfgr;<SUP><UP>2</UP></SUP><SUB><UP>I</UP></SUB></DE></FR> (h<SUP><UP>T</UP></SUP><SUB><UP>I</UP></SUB>(Y<SUB><UP>t</UP></SUB>−&Xgr;m<SUB><UP>I</UP></SUB>))(h<SUP><UP>T</UP></SUP><SUB><UP>I</UP></SUB>F<SUB><UP>n</UP></SUB>(t)), n=0,…, r (35)
in which
F<SUB><UP>n</UP></SUB>(t)=[f<SUB><UP>n</UP></SUB>(t)f<SUB><UP>n</UP></SUB>(t−1)…f<SUB><UP>n</UP></SUB>(t−k+1)]<SUP><UP>T</UP></SUP>, n=0,…, r. (36)
This iterative baseline estimation procedure is readily incorporated into the main body of the H-noise algorithm.

It should be noted that the average value of the baseline and the current levels of the states are not independent variables and, for this reason, it is necessary to impose a constraint in their reestimation. In one approach, it is assumed that the average value of the baseline is zero. This constraint is imposed in step 3 of the baseline estimation algorithm presented above by taking the average value of the vertices and subtracting it from each vertex. In this case, the final reestimated values of the current levels will be offset by the average value of the baseline. In an alternative approach, it is assumed that the closed states in the HMM always have a current level of zero. The current levels of all the other states and the average baseline offset are reestimated with respect to the current level of the closed states. The two approaches were found to be equivalent in our simulations.

We present in Table 1 simulation results on the two-state HMM specified in Scheme 1. These results are representative of the simulations performed on various HMMs at various signal to noise ratios. It can be clearly seen on comparing columns 4 and 5 that the parameters of the HMM and noise model have a smaller bias when the baseline is estimated. The log-likelihood of the model with baseline estimation is also much larger than the model without baseline estimation.


                              
View this table:
[in this window]
[in a new window]
 
TABLE 1   Simulation results on the two-state HMM specified in Scheme 1 

The original and estimated baseline and signal from this simulation are also displayed in Fig. 8. On comparison of the of the actual and estimated baselines, it is seen that the baseline reestimation procedure reasonably reconstructs the baseline and allows a good cumulative-mean reconstruction of the original signal.



View larger version (32K):
[in this window]
[in a new window]
 
FIGURE 8   An example of baseline estimation. (A) First 1024 points of the filtered, sampled signal with additive correlated noise. The parameters of the HMM and noise model are given in Table 1. (B) True (dotted curve) and estimated (solid line) baseline functions. (C) The first 1024 po