wbs.thresh.cpt {breakfast}    R Documentation

Multiple change-point detection in the mean of a vector using the (Adaptive) WBS method, with the number of change-points chosen by thresholding

Description

This function estimates the number and locations of change-points in the piecewise-constant mean of the noisy input vector, using the (Adaptive) Wild Binary Segmentation method (see Details for the relevant literature references). The number of change-points is chosen via a thresholding-type criterion. The constant means between each pair of neighbouring change-points are also estimated. The method works best when the noise in the input vector is independent and identically distributed Gaussian.

Usage

wbs.thresh.cpt(x, sigma = stats::mad(diff(x)/sqrt(2)), universal = TRUE,
  M = NULL, th.const = NULL, th.const.min.mult = 0.825, adapt = TRUE,
  lambda = 0.9)

Arguments

x

A vector containing the data in which you wish to find change-points.

sigma

The estimate or estimator of the standard deviation of the noise in x; the default is the Median Absolute Deviation of the first differences of x, divided by sqrt(2), which estimates the noise standard deviation under the assumption that the noise is independent and identically distributed Gaussian.

universal

If TRUE, then M and th.const (see below) are chosen automatically in such a way that if the mean of x is constant (i.e. if there are no change-points), the probability of no detection (i.e. est being constant) is approximately lambda. When universal is TRUE, M=1000 is used for longer signals and M<1000 for shorter signals, so that th.const does not exceed 1.3, a value that empirically appears to be too high. If universal is FALSE, then both M and th.const must be specified.

M

The number of randomly selected sub-segments of the data on which to build the CUSUM statistics in the (Adaptive) Wild Binary Segmentation algorithm. If you are using Adaptive Wild Binary Segmentation (adapt=TRUE) and do not wish to set universal to TRUE (and therefore have M chosen for you), try M=1000. If you are using standard Wild Binary Segmentation (adapt=FALSE), try M=20000 or higher.

th.const

Tuning parameter. Change-points are estimated by thresholding the (Adaptive) WBS CUSUMs of x, where the threshold has magnitude th.const * sqrt(2 * log(n)) * sigma and n is the length of x. There is an extra twist if adapt=TRUE; see th.const.min.mult below.

th.const.min.mult

If adapt=TRUE, then the threshold gradually decreases in each recursive pass through the data, but in such a way that it never goes below th.const.min.mult * th.const * sqrt(2 * log(n)) * sigma.

adapt

If TRUE (respectively, FALSE), then Adaptive (respectively, standard) Wild Binary Segmentation is used.

lambda

See the description for the universal parameter above. Currently, the only permitted values are 0.9 and 0.95.
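
The threshold quantities described for sigma, th.const and th.const.min.mult can be computed by hand in base R. The following is a minimal sketch, not the package's internal code: the test signal is made up, and th.const = 1.0 is a hypothetical value chosen only for illustration (with universal=TRUE the function picks th.const itself).

```r
# Illustrative signal: one change-point at index 100, Gaussian noise with sd 0.5
set.seed(1)
x <- c(rep(0, 100), rep(2, 100)) + rnorm(200, sd = 0.5)
n <- length(x)

# Same expression as the default for the sigma argument (see Usage):
# MAD of the first differences of x, rescaled by sqrt(2)
sigma.hat <- stats::mad(diff(x) / sqrt(2))

# The plain standard deviation of x is inflated by the jump in the mean
sd.naive <- stats::sd(x)

th.const <- 1.0              # hypothetical value, for illustration only
th.const.min.mult <- 0.825   # default from Usage

# Threshold magnitude as given in the th.const description
threshold <- th.const * sqrt(2 * log(n)) * sigma.hat

# Lower bound on the decreasing threshold when adapt=TRUE
threshold.floor <- th.const.min.mult * threshold

c(sigma.hat = sigma.hat, sd.naive = sd.naive,
  threshold = threshold, threshold.floor = threshold.floor)
```

Differencing removes the piecewise-constant mean everywhere except at the change-points themselves, and the MAD is robust to those few large differences, which is why sigma.hat stays close to the true noise level while sd.naive does not.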

Details

The change-point detection algorithms used in wbs.thresh.cpt are: standard Wild Binary Segmentation [see "Wild Binary Segmentation for multiple change-point detection", P. Fryzlewicz (2014), Annals of Statistics, 42, 2243-2281] and Adaptive Wild Binary Segmentation [see "Data-adaptive Wild Binary Segmentation", P. Fryzlewicz (2017), in preparation as of September 28th, 2017].
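
The core idea of Wild Binary Segmentation, maximising a CUSUM statistic over randomly drawn sub-segments, can be sketched in base R. This is an illustrative sketch based on the CUSUM contrast from the 2014 paper, not the code used inside wbs.thresh.cpt; the helper name cusum.abs, the test signal and M = 100 are assumptions made for the example, and only a single pass (no recursion, no thresholding) is shown.

```r
# Absolute CUSUM statistics of x on the segment [s, e], one value per
# candidate split (the split is placed after the b-th point of the segment)
cusum.abs <- function(x, s, e) {
  n <- e - s + 1
  seg <- x[s:e]
  b <- 1:(n - 1)
  left <- cumsum(seg)[b]       # sum of the first b points
  right <- sum(seg) - left     # sum of the remaining n - b points
  abs(sqrt((n - b) / (n * b)) * left - sqrt(b / (n * (n - b))) * right)
}

set.seed(1)
x <- c(rep(0, 50), rep(1, 50)) + rnorm(100, sd = 0.2)
n <- length(x)
M <- 100   # number of random sub-segments (illustrative value)

# Draw M random segments [s, e] with s < e; sample.int avoids the
# sample(k, 1) pitfall when only one end-point is available
s <- sample.int(n - 1, M, replace = TRUE)
e <- s + vapply(n - s, function(k) sample.int(k, 1), integer(1))

# Keep the split with the largest CUSUM over all drawn segments
best <- c(stat = -Inf, cpt = NA)
for (m in seq_len(M)) {
  stat <- cusum.abs(x, s[m], e[m])
  if (max(stat) > best[["stat"]]) {
    best <- c(stat = max(stat), cpt = s[m] + which.max(stat) - 1)
  }
}
best
```

In the full method the winning statistic is compared with the threshold described under th.const, and the search is repeated recursively to the left and right of each accepted change-point.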

Value

A list with the following components:

est

The estimated piecewise-constant mean of x.

no.of.cpt

The estimated number of change-points in the piecewise-constant mean of x.

cpt

The estimated locations of change-points in the piecewise-constant mean of x (these are the final indices before the location of each change-point).
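
The relationship between cpt, no.of.cpt and est can be illustrated with a small base R sketch. It assumes that est is obtained by averaging x within each segment delimited by cpt, which is consistent with the description above but is not confirmed here as the package's exact computation; the data and the change-point location are made up.

```r
# Toy data with one change-point: index 3 is the last index before the change
x <- c(0.9, 1.1, 1.0, 3.2, 2.8, 3.0, 3.0)
cpt <- 3
n <- length(x)

# Segment lengths implied by cpt: 1..3 and 4..7
seg.len <- diff(c(0, cpt, n))

# Label each observation with its segment, then average within segments
seg.id <- rep(seq_along(seg.len), times = seg.len)
est <- ave(x, seg.id)          # ave() uses the mean by default

no.of.cpt <- length(cpt)

# The change-point locations can be read back from the fitted mean
cpt.from.est <- which(diff(est) != 0)

est
cpt.from.est
```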

Author(s)

Piotr Fryzlewicz, p.fryzlewicz@lse.ac.uk

See Also

segment.mean, wbs.bic.cpt, wbs.cpt, tguh.cpt, hybrid.cpt, wbs.K.cpt

Examples

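library(breakfast)
# Piecewise-constant "teeth" signal: 40 blocks of length 5 alternating between 0 and 1
teeth <- rep(rep(0:1, each=5), 20)
# Add i.i.d. Gaussian noise with standard deviation 0.2
teeth.noisy <- teeth + rnorm(200)/5
# Estimate change-points with the default (universal, adaptive) settings
teeth.cleaned <- wbs.thresh.cpt(teeth.noisy)
# Plot the fitted piecewise-constant mean, then inspect the estimates
ts.plot(teeth.cleaned$est)
teeth.cleaned$no.of.cpt
teeth.cleaned$cpt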

[Package breakfast version 1.0.0 Index]