densityBC {spatstat.univar}    R Documentation

Kernel Density Estimation with Optional Boundary Correction

Description

A simple implementation of fixed-bandwidth kernel density estimation on the real line, or the positive real half-line, including optional corrections for a boundary at zero.

Usage

  densityBC(x, kernel = "epanechnikov", bw=NULL,
      ...,
      h=NULL,
      adjust = 1,
      weights = rep(1, length(x))/length(x), from, to = max(x), n = 256,
      zerocor = c("none", "weighted", "convolution", "reflection",
                  "bdrykern", "JonesFoster"),
      fast=FALSE,
      internal=list())

Arguments

x

Numeric vector.

kernel

String specifying kernel. Options are "gaussian", "rectangular", "triangular", "epanechnikov", "biweight", "cosine" and "optcosine". (Partial matching is used).

bw, h

Alternative specifications of the scale factor for the kernel. The bandwidth bw is the standard deviation of the kernel (this agrees with the argument bw in density.default). The rescale factor h is the factor by which the ‘standard form’ of the kernel is rescaled. For the Epanechnikov kernel, h = bw * sqrt(5) is the half-width of the support, while for the Gaussian kernel, h = bw is the standard deviation. Either bw or h should be given, and should be a single numeric value, or a character string indicating a bandwidth selection rule as described in density.default.
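The relation h = bw * sqrt(5) for the Epanechnikov kernel can be checked numerically. The following is a minimal sketch in base R, not code from the package; the helper names epan and kh are hypothetical and the value h = 0.4 is arbitrary.

```r
# Standard-form Epanechnikov kernel, supported on [-1, 1]
epan <- function(u) ifelse(abs(u) <= 1, 0.75 * (1 - u^2), 0)

# Kernel rescaled by the factor h, so that its support is [-h, h]
h  <- 0.4
kh <- function(x) epan(x / h) / h

# Total mass and variance of the rescaled kernel, by numerical integration
mass     <- integrate(kh, -h, h)$value
variance <- integrate(function(x) x^2 * kh(x), -h, h)$value

# The bandwidth bw is the standard deviation of the rescaled kernel
bw <- sqrt(variance)
c(mass = mass, bw = bw, h_over_sqrt5 = h / sqrt(5))
```

The last two entries of the printed vector should agree up to integration error, which is the stated relation between bw and h.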

adjust

Numeric value used to rescale the bandwidth bw and halfwidth h. The bandwidth used is adjust * bw. This makes it easy to specify values like ‘half the default’ bandwidth.

weights

Numeric vector of weights associated with x. The weights are not required to sum to 1, and will not be normalised to sum to 1. The weights may include negative values.

from, to

Lower and upper limits of interval on which density should be computed. The default value of from is from=min(x) if zerocor="none", and from=0 otherwise.

n

Number of r values for which density should be computed.

zerocor

String (partially matched) specifying a correction for the boundary effect bias at r=0 when estimating a density on the positive half line. Possible values are "none", "weighted", "convolution", "reflection", "bdrykern" and "JonesFoster".

fast

Logical value specifying whether to perform the calculation rapidly using the Fast Fourier Transform (fast=TRUE) or to use slower, exact code (fast=FALSE, the default). Option zerocor="bdrykern" is not available when fast=TRUE.

internal

Internal use only.

...

Additional arguments are ignored.

Details

If zerocor is absent or given as "none", this function computes the fixed bandwidth kernel estimator of the probability density on the real line.

If zerocor is given and is not "none", it is assumed that the density is confined to the positive half-line, and a boundary correction is applied:

weighted

The contribution from each point x_i is weighted by the factor 1/m(x_i) where m(x) = 1 - F(-x) is the total mass of the kernel centred on x that lies in the positive half-line, and F(x) is the cumulative distribution function of the kernel.

convolution

The estimate of the density f(r) is weighted by the factor 1/m(r) where m(r) = 1 - F(-r) is given above.

reflection

If the kernel centred at data point x_i has a tail that lies on the negative half-line, this tail is reflected onto the positive half-line.

bdrykern

The density estimate is computed using the Boundary Kernel associated with the chosen kernel (Wand and Jones, 1995, page 47). That is, when estimating the density f(r) for values of r close to zero (defined as r < h for all kernels except the Gaussian), the kernel contribution k_h(r - x_i) is multiplied by a term that is a linear function of r - x_i.

JonesFoster

The modification of the Boundary Kernel estimate proposed by Jones and Foster (1996), equal to \overline f(r) \exp( \hat f(r)/\overline f(r) - 1) where \overline f(r) is the convolution estimator and \hat f(r) is the boundary kernel estimator.
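The three simplest corrections described above ("weighted", "convolution" and "reflection") can be sketched directly in base R. This is an illustration only, not the package's implementation: it assumes a Gaussian kernel (so that h = bw and F is pnorm), equal weights, and an arbitrary evaluation grid, and all variable names are hypothetical.

```r
set.seed(1)
x  <- rexp(500)                       # data confined to the positive half-line
bw <- 0.3                             # Gaussian kernel, so h = bw
r  <- seq(0, 5, length.out = 256)     # evaluation grid starting at the boundary
w  <- rep(1 / length(x), length(x))   # equal weights summing to 1

# m(t) = 1 - F(-t): mass on the positive half-line of a kernel centred at t
m <- function(t) 1 - pnorm(-t, sd = bw)

# Kernel values k_h(r_j - x_i); rows index grid points, columns index data
K <- outer(r, x, function(rr, xx) dnorm(rr - xx, sd = bw))

# No correction: plain fixed-bandwidth estimate
f_none <- as.vector(K %*% w)

# "weighted": contribution of each x_i is multiplied by 1/m(x_i)
f_weighted <- as.vector(K %*% (w / m(x)))

# "convolution": the uncorrected estimate at r is multiplied by 1/m(r)
f_convolution <- f_none / m(r)

# "reflection": the tail below zero is folded back, which for a symmetric
# kernel adds the term k_h(r + x_i)
K_refl <- outer(r, x, function(rr, xx) dnorm(rr + xx, sd = bw))
f_reflection <- as.vector((K + K_refl) %*% w)

# Values at the boundary r = 0
c(none = f_none[1], weighted = f_weighted[1],
  convolution = f_convolution[1], reflection = f_reflection[1])
```

At r = 0 a symmetric kernel has m(0) = 1/2, so in this sketch the convolution and reflection estimates are both exactly twice the uncorrected estimate at the boundary.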

If fast=TRUE, the calculations are performed rapidly using density.default which employs the Fast Fourier Transform. If fast=FALSE (the default), the calculations are performed exactly using slower C code.
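The difference between the fast and exact approaches can be illustrated without densityBC itself. The sketch below, in base R, compares the binned FFT approximation of density.default with a direct evaluation of the kernel sum at the same points. The data, bandwidth and variable names are arbitrary choices made for illustration, and the direct sum here is plain R rather than the package's C code.

```r
set.seed(42)
x  <- rnorm(200)
bw <- 0.3

# Binned FFT approximation computed by stats::density.default
d_fft <- density(x, bw = bw, kernel = "gaussian", n = 512)

# Direct evaluation of the kernel sum at the same grid points
f_direct <- sapply(d_fft$x, function(r) mean(dnorm(r - x, sd = bw)))

# Largest absolute discrepancy between the two calculations
max(abs(d_fft$y - f_direct))
```

The discrepancy is expected to be small relative to the density values, since it arises only from binning the data and interpolating onto the output grid.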

Value

An object of class "density" as described in the help file for density.default. It contains at least the entries

x

Vector of x values

y

Vector of density values y = f(x)
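Because the result has class "density", its x and y entries can be used like those of any other density object. The following sketch uses stats::density as a stand-in to produce such an object (an assumption made so that the code runs without the package) and integrates the estimate by the trapezoidal rule.

```r
set.seed(7)
d <- density(rexp(300), kernel = "epanechnikov", bw = 0.2)

class(d)            # "density"
length(d$x)         # number of evaluation points
range(d$x)          # interval on which the estimate was computed

# Trapezoidal approximation to the integral of the estimate over that interval
dx   <- diff(d$x)
area <- sum(dx * (head(d$y, -1) + tail(d$y, -1)) / 2)
area
```

For an uncorrected estimate evaluated over an interval that covers the data and the kernel support, the area should be close to 1.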

Author(s)

Adrian Baddeley Adrian.Baddeley@curtin.edu.au and Martin Hazelton Martin.Hazelton@otago.ac.nz.

References

Baddeley, A., Chang, Y-M., Davies, T.M. and Hazelton, M. (2024) In preparation.

Jones, M.C. and Foster, P.J. (1996) A simple nonnegative boundary correction method for kernel density estimation. Statistica Sinica, 6 (4) 1005–1013.

Wand, M.P. and Jones, M.C. (1995) Kernel Smoothing. Chapman and Hall.

Examples

  sim.dat <- rexp(500)
  fhatN <- densityBC(sim.dat, "biweight", h=0.4)
  fhatB <- densityBC(sim.dat, "biweight", h=0.4, zerocor="bdrykern")
  plot(fhatN, ylim=c(0,1.1), main="density estimates")
  lines(fhatB, col=2)
  curve(dexp(x), add=TRUE, from=0, col=3)
  legend(2, 0.8,
     legend=c("fixed bandwidth", "boundary kernel", "true density"),
     col=1:3, lty=rep(1,3))


[Package spatstat.univar version 3.1-1]