In this paper, we compare nonparametric kernel estimates
with smoothed histograms as methods for displaying logarithmically transformed dwell-time distributions. Kernel density plots provide a
simpler means for producing estimates of the probability density function (pdf) and they have the advantage of being smoothed in a
well-specified, carefully controlled manner. Smoothing is essential for
multidimensional plots because, with realistic amounts of data, the
number of counts per bin is small. Examples are presented for a
2-dimensional pdf and its associated dependency-difference plot that
display the correlations between successive dwell times.
 |
INTRODUCTION |
Dwell-time histograms and the probability density
functions (pdfs) estimated from them are an important means for
presenting the results of patch-clamp investigations of ion channel
currents. Blatz and Magleby (1986)
, McManus et al. (1987)
and Sigworth
and Sine (1987)
demonstrated that constructing the distribution with a
logarithmic rather than a linear time scale can greatly reduce the
computations required for subsequent fitting and has a number of
advantages for display of the data: it produces a pdf for a single
exponential whose peak height and position on the log axis specify the
amplitude and time constant of the exponential decay; it allows
convenient display of dwell times spanning several orders of magnitude;
and it provides a much better visual impression of the number of
components in the exponential mixture. A major disadvantage has been
that it is not straightforward to construct a histogram with log bins
when the original data have been digitized. The mapping of dwell times
for data recorded at a fixed sample interval to log bins is highly
irregular: some bins for short durations can never receive counts, and
significant irregularities between adjacent bins extend beyond the bin
corresponding to 100 times the sample interval. Stark and Hladky (2000)
have shown that simple approximations based on the nature of the
binning process combined with the transformation rules for probability density functions make it possible to adjust the display of the counts
to compensate for these irregularities. We report here that
construction of kernel density plots (Bowman and Azzalini, 1997
) rather
than histograms provides a simpler method for displaying estimates of
the pdf. These kernel-density plots also have the advantage of
producing an estimate that has been smoothed in a well-specified,
carefully controlled manner. The advantages of smoothing are most
marked in multidimensional plots such as the 2-dimensional histograms
used to display the occurrence of pairs of dwell-times (Magleby and
Song, 1992
) where the number of data counts per bin in a traditional
histogram would be small.
In a traditional histogram, each sojourn in a state or aggregate of
states with the same conductance, i.e., a conductance class, is
represented by a count in the bin corresponding to the dwell time. The
contribution to the plot is a small rectangle with unit height and the
same width as the bin. The placing of the rectangle is the same
regardless of where the dwell time falls within the range of times
within the bin. A nonparametric kernel density estimate is constructed
by representing each sojourn by a kernel function positioned at the
actual value of the sampled dwell time. A common choice of function is
a normal with width still to be specified. The kernel density estimate
is then the sum of the increments for all sojourns in the class.
Because, for a log plot, the increment can be chosen to have unit area, the total area under the resulting curve is equal to the number of
counts, and the pdf can be obtained by dividing all values on the curve
by the total number of counts.
 |
DWELL-TIME KERNEL ESTIMATES |
Single class dwell times
Let t = (ti), i = 1, 2, ... , n be a series of dwell times at a particular class.
Also let the kernel function
be a normal density on the log axis
z = ln(t), with mean located at the value of
the log of the dwell time, and scale dependent width (standard deviation),
i,
|
(1)
|
then the kernel estimate for the pdf of x = (xi), i = 1, ... , n, is
|
(2)
|
For long dwell times,
i can be independent of the
dwell time. Its value affects the appearance of the plot in a manner
analogous to the way the bin width affects a histogram. If the width is much too narrow, the plot consists of many sharp spikes with a visual
impression reflecting the accident of the particular samples taken
rather than the underlying pdf. In contrast, if the width is much too
wide, the plot becomes a single smear with loss of most information
about the data. Fortunately,
i can be chosen simply by
considering the properties of a log dwell-time plot.
When plotted on a logarithmic scale, the width of a exponential
component is independent of its amplitude or time constant. Hence, the
principles for choosing the kernel for a mixture are the same as those
for a single exponential. Estimates constructed with several choices of
constant width,
i =
, are compared in Fig.
1 A with the underlying
theoretical pdf for an exponential with time constant 250 µs sampled
at 5 µs intervals. Data points for the estimates (1000 per curve)
were obtained deterministically in proportion to the value of the pdf
(see Stark and Hladky, 2000
, for details). For each data point, the
kernel increment was evaluated at values of z on a grid with
20 steps per e-fold change in dwell time, which is sufficiently fine to
avoid distorting the shape and size of the kernel. (The grid spacing
must be less than
.) Evaluations were restricted to the range of
z for which
is significant. Consider first the range
xi = ln(ti) >
3. For these values, any
< 0.2 preserves the shape. Larger
values of
smear out the distribution, lowering the peak and adding
weight for longer and shorter dwell times. The present example is not
suitable to demonstrate the effect of making
too small because
there are too many data points and they are deterministically spaced.

View larger version (14K):
[in this window]
[in a new window]
|
FIGURE 1
Comparisons of kernel estimates with a theoretical
exponential probability density function of = 250 µs.
(A) Estimates for different normal kernels with constant
width . values are: solid curve, 0.1;
dashed, 0.2; heavy dots, 0.4; light
dots, 0.8 and 1.6. The theoretical curve is shown as a heavier
line. (B) Estimates with different corrections for the
uncertainty for very short dwell-times. = 50 µs. = 0.1. The m values are dotted line, 0; solid, 1;
and dashed, 2. (C) Demonstration of a close match
between estimate and theory (thin and heavy solid
lines). is 250, = 0.1 m = 1. ±SD
confidence bands for the estimate, computed via Eq. A8, are also shown
as dash-dotted lines. In this and the next two figures, "log" means
logarithm to the base e.
|
|
The series of peaks at short dwell time in the trace for
= 0.1 in Fig. 1 A represent the increments added for dwell times of 1, 2, ... sample intervals. Because the data have been sampled at discrete times, all increments to the kernel estimate for short dwell times are centered at just these precise values. Thus these peaks
are an artifact of the sampling, not a feature of the underlying pdf.
As
is increased, the increments are smeared out over the possible
range, producing a better representation of the underlying signal.
It makes no sense to set the width of the kernel increment to a value
smaller than the uncertainty in the dwell times. This uncertainty must
be at least of the order of a sample interval. A constant uncertainty
on a linear scale transforms to a variable uncertainty on a log scale.
The value of
i corresponding to an uncertainty of
m sample intervals for a dwell time of duration ti is
|
(3)
|
where
is the sample interval. The value of
i calculated from this relation is ln(1.5) for
m = 1 and a dwell time equal to one sample interval,
but it decreases rapidly, falling below 0.1 for dwell times exceeding
five sample intervals. Figure 1 B illustrates results
obtained with several values of m. For m = 1, the kernel estimate is not completely smooth but the variations occur centered about the underlying pdf. For larger m, the
curve is smoother but it lies below the pdf for small dwell times and above it for somewhat larger values. The optimal value of m
is close to 1 regardless of the time constant. In the following, we
have set m = 1.
Eq. 3 cannot be used on its own as a means of specifying the
width, because
i becomes much too small for large dwell
times. A simple, effective procedure is to use instead
|
(4)
|
Figure 1 C displays the result of applying this
relation with
= 0.1, which produces essentially no distortion
of the pdf. Although freedom from distortion favors this small
,
elimination of the roughness resulting from random sampling favors
large
. The more data available the smaller the value of
that
should be used. However, because
= 0.2 produces negligible
distortion, although that produced by 0.8 would normally be
unacceptable, in practice, it is the range 0.2-0.4 that will be
useful. A procedure for the optimal estimation of
given the number
of dwell times available, or, alternatively, an estimate of the minimum
number of times required for a chosen value, is outlined in the
Appendix. For 900 dwell times and an underlying pdf with two spaced
exponential components, the optimal value is between 0.26 and 0.34.
Joint dwell times
Let xi,1 = ln(ti,1) and xi,2 = ln(ti,2) for i = 1, 2, ... ,
n, denote the log-transformed sojourns of consecutive dwell times in any two classes
and
. Bivariate kernel estimates for the joint density h(z1, z2) of adjacent
dwell-time pairs, (xi,1,
xi,2), may be defined by extending Eq. 1 as the
product of univariate kernels along each direction,
|
(5)
|
This function is defined for all points (z1,
z2) on the real plane, because, in principle,
(ln(ti,1), ln(ti,2))
R2. This construction leads to the estimate
|
(6)
|
with
|
(7)
|
Alternatively, it is possible to use a bivariate kernel with
correlated coordinates to allow for smoothing along particular directions, see Bowman and Azzalini (1997)
or Wand and Jones (1995)
.
Estimates based on Eq. 5 can also be used to generate plots of the
dependency difference function introduced by Magleby and Song (1992)
,
|
(8)
|
with h(z1, z2) given by Eq. 6
and h(z1), h(z2), the
one-dimensional (1D) density estimates defined in Eq. 2, which can be calculated from h(z1, z2) as the
marginals
|
(9)
|
with K1 and K2 the
sizes of the discrete grid used to evaluate the joint density along
each coordinate. The dependency difference indicates the importance of
the discrepancy between the observed distribution of dwell-time pairs
and that which would arise if the dwell-times in each class were
independent of each other.
 |
RESULTS |
It is routinely possible to obtain enough experimental data to
construct 1D log dwell-time distributions with both a relatively large
number of bins per decade and sufficient counts per bin to obtain a
reasonable estimate of the pdf. The principal advantage of the use of
kernel estimation in the construction of these 1D estimates is its
simplicity. Binning, correction for the binning artefacts (see Stark
and Hladky, 2000
) and smoothing of the histogram are all replaced with
a single, easily implemented summation of increments. The major
disadvantage is that the smearing correction for the binning artifacts
for dwell-times less than ~5 sample intervals is not as accurate (see
for example Fig. 1 B) as the correction factors for
log-binned histograms (see Stark and Hladky, 2000
).
Two-dimensional (2D) pdfs or difference dependency plots require the
square of the number of bins needed for the 1D plot, and, for these,
the smoothing becomes critical. To illustrate the use of kernels for
the estimation of open-closed dwell-time densities, we have generated
synthetic noisy data using the following four-state reversible cyclic
mechanism (see Scheme 1).
The states were grouped into two classes,
and
, with
= {O1, O2} as the class of "open"
states and
= {C1, C2} the
class of "closed" states. This decomposition induces the following
partition for the Q matrix,
|
(10)
|
with Q
as the submatrix of rate
constants between the states of the
class,
Q
the rates corresponding to the
transitions from the
class to the
class, and so on. The rate
constants ki; i = 1, ... ,
8, were chosen to ensure states with widely separated life times
within each aggregate. These were set to k1 = 6.183, k2 = 0.454, k3 = 2.697, k4 = 1.665, k5 = 0.446, k6 = 13.163, k7 = 0.182, and
k8 = 11.812 (all in arbitrary units). In
this case, both O1 and C1
are short-lived states (
O1 = 0.0551,
C1 = 0.0629), whereas
O2 and C2 are long
(
O2 = 1.351,
C2 = 0.5519). For this setting, long sojourns at each class originate
from events of the type O1
O2 or
C1
C2, which are then
interrupted by brief transitions to the other class by events of the
form C2
O1
C2 or
O2
C1
O2.
To assess the performance of joint histograms and the estimate in Eq. 6, a noisy trace of 900 dwell-time pairs that correspond to the
consecutive sojourns in
and
was generated by sampling at
= 0.005. These data were idealized by using the methods
described in Rosales et al. (2001)
. Kernel estimates, together with the theoretical joint density, are shown in Fig.
2. The highest peaks, located right
foreground, originate from brief sojourns in
followed by long
durations in
. In principle, these pairs could arise from events of
the type O1
C2 or
O1
C2
C1. The
second maxima originates from long sojourns in
followed by short
ones in
. These correspond to the transitions O2
C1 or O1
O2
C1. A third minor peak corresponds to long consecutive
sojourns in both classes, which might appear to correspond to
transitions O2
C2. Figure
2 B presents the kernel estimate obtained from the 900 simulated dwell-time pairs. This estimate was obtained using Eq. 6 with
= 0.35 on a 60 × 60 grid, showing close resemblance to
the theoretical density in Fig. 2 A. Figure 2 C
shows a smoothed histogram constructed using the same sequence of dwell
times by considering a 70 × 70 grid with 10 bins per e-fold
change along each axis. The counts were corrected for the log binning
artefacts as described by Stark and Hladky (2000)
. Without smoothing,
both the raw histogram and the corrected plot had the appearance of a
forest of trees. The histogram shown in Fig. 2 C was
obtained by replacing each point with an average of 361 points weighted according to the inverse of their distance to the center (the central
point and the four nearest neighbors all have weight 1). Similar, but
somewhat more ragged, results were obtained using a uniform average
over 289 points instead of the inverse weighted average over 360. In
both cases, the size of the averaging region were chosen so that the
peak heights would be close to the theoretical pdf and the kernel
estimates, but this agreement was achieved at the expense of
considerable elevation in the distributions for short open or closed
times. Smoother histogram plots could be produced using larger
averaging regions, but, in these, there was severe loss of peak height
and even larger tails.

View larger version (30K):
[in this window]
[in a new window]
|
FIGURE 2
Joint dwell-time densities for dwell-time pairs for
consecutive sojourn in (right axis) and (left
axis). (A) Theoretical density predicted by the
spectral decomposition of Q, (B) kernel density
estimate, and (C) histogram smoothed as described in the
text. Both (B) and (C) are generated from 900 dwell-time pairs fitted to synthetic data generated by the mechanism
described by Q.
|
|
Dependence-difference plots are shown in Fig.
3. Figure 3 A presents the
theoretical results calculated from Eq. 8 using the theoretical joint
and marginal densities obtained from the spectral decomposition of
Q (Colquhoun and Hawkes, 1982
). Figure 3 B shows the dependence-difference contour obtained using Eq. 8 and the estimate
plotted in Fig. 2 B. Figure 3 C presents the
corresponding dependence plot computed from the smoothed histogram
displayed in Fig. 2 C. It can be seen from Fig.
3 B, that positive dependencies (D(z1,
z2) > 0) are related to dwell-time pairs
associated with events of the form O1
C2 and O2
C1,
whereas negative dependencies are seen for the regions that correspond
to the transitions O1
C1 and
O2
C2. Therefore, the third
minor maximum at Fig. 2 B corresponding to long sojourns in
both
and
arises from composite transitions from
O2 to C2, i.e.,
O2
C1
C2 or
O2
O1
C2.

View larger version (57K):
[in this window]
[in a new window]
|
FIGURE 3
Dependence-difference contour plots. (A) is
calculated from the theoretical joint and marginals given by
Q and the dependency-difference definition in Eq. 8.
(B) and (C) correspond to the dependencies
calculated via Eq. 8 and Eq. 9 from the kernel estimate in Fig.
2 B and the histogram in Fig. 2 C,
respectively. The vertical axis is used for the times in , whereas
the horizontal axis for those in .
|
|
 |
DISCUSSION |
The comparison of the estimate for the open-closed density shown
in Fig. 2 B with the theoretical shape in Fig.
2 A, and, even more markedly, the comparison of the
dependency differences computed from them in Fig. 3, show clearly the
utility of kernel density estimation for log-transformed dwell times.
To produce the plot at Fig. 2 C, the histogram was first
constructed and then corrected to account for the aberrations induced
by log binning (Stark and Hladky, 2000
). The corrected counts were
finally smoothed. This sequence is accomplished in a single step when
evaluating the kernel directly from Eq. 6, making these estimates
simpler to compute.
Inspection of Fig. 2 and Fig. 3 may suggest that the corrected and
smoothed histogram is closer than the density estimate to the
underlying pdf. However, the amount of smoothing for the histogram was
chosen to make the highest peak agree with the theoretical distribution
to display a result as favorable as possible for this method. In
practice, this choice is not available because it depends on the
knowledge of the underlying density. By constrast, the kernel method
provides an automatic choice for the amount of smoothing found by
minimizing the bias introduced by the estimation procedure. It should
be emphasized that the results presented show that estimates obtained
with
in the range of 0.2-0.4 do not create or eliminate peaks nor
do they shift peaks appreciably on the z axis or the
(z1, z2) plane.
Construction of kernel estimates and histograms should produce similar
results as the number of dwell times increases asymptotically toward
infinity. However, the kernel estimator can be shown to be more
efficient because its bias (in terms of the mean integrated squared
error, see the Appendix) decreases as
(n
4/5) whereas the bias of a histogram
follows
(n
2/3) (see Wand and Jones, 1995
,
p. 23). This property is particularly useful when estimating joint
densities for the kinetic analysis of ion channels, since the
observations available from a single molecule are usually brief or few
when compared to the number of bins that would be required.
The errors incurred when fitting the kernels generated by Eqs. 4 and A5
with exponential mixtures gives similar results as those shown in Fig.
2 of Sigworth and Sine (1987)
. However, although the smoothing
introduced by the kernel estimation can be chosen to minimize the mean
integrated squared error, quantitative assessment of the effects of
this smoothing on subsequent fitting remains a subtle, open question.
More generally, by taking an exponential mixture of two components with
equal amplitudes and rates
1,
2 = 
1, and varying
from 0 to
, the width is
bounded as
Address reprint requests to Rafael Rosales, Departamento de
Matemáticas, Instituto Venezolano de Invetigaciones
Científicas, Apartado. 21827, Caracas 1020-A, Venezuela. Tel.:
+58-212-5041412/13; Fax: +58-212-5041416; E-mail:
rrosales{at}cauchy.ivic.ve.