Title: | Isolate-Detect Methodology for Multiple Change-Point Detection |
---|---|
Description: | Provides efficient implementation of the Isolate-Detect methodology for the consistent estimation of the number and location of multiple change-points in one-dimensional data sequences from the "deterministic + noise" model. For details on the Isolate-Detect methodology, please see Anastasiou and Fryzlewicz (2018) <https://docs.wixstatic.com/ugd/24cdcc_6a0866c574654163b8255e272bc0001b.pdf>. Currently implemented scenarios are: piecewise-constant signal with Gaussian noise, piecewise-constant signal with heavy-tailed noise, continuous piecewise-linear signal with Gaussian noise, continuous piecewise-linear signal with heavy-tailed noise. |
Authors: | Andreas Anastasiou [aut, cre], Piotr Fryzlewicz [aut] |
Maintainer: | Andreas Anastasiou <[email protected]> |
License: | GPL-3 |
Version: | 0.1.0 |
Built: | 2024-12-10 06:34:46 UTC |
Source: | CRAN |
This function performs the Isolate-Detect methodology based on an information criterion approach, in order to detect multiple change-points in a noisy, continuous, piecewise-linear data sequence, with the noise being Gaussian. More information on how this approach works as well as the relevant literature reference are given in Details.
cplm_ic(x, th_const = 1.25, Kmax = 200, penalty = c("ssic_pen", "sic_pen"), points = 10)
x |
A numeric vector containing the data in which you would like to find change-points. |
th_const |
A positive real number with default value equal to 1.25. It is used to define the threshold value that will be used at the first step of the model selection based Isolate-Detect method; see Details for more information. |
Kmax |
A positive integer with default value equal to 200. It is the
maximum allowed number of estimated change-points in the solution path; see
|
penalty |
A character vector with names of penalty functions used. |
points |
A positive integer with default value equal to 10. It defines the distance between two consecutive end- or start-points of the right- or left-expanding intervals, respectively. |
The approach followed in cplm_ic to detect the change-points is based on identifying the set of change-points that minimises an information criterion. First, we employ sol_path_cplm, which overestimates the number of change-points, using th_const to define the threshold, and then sorts the obtained estimates so that the estimate most likely to be correct appears first and the one least likely to be correct appears last. Let $J$ be the number of estimates that this overestimation approach returns. We obtain a vector $b = (b_1, b_2, \ldots, b_J)$, with the estimates ordered as explained above. We define the collection $\{M_j\}_{j = 0, 1, \ldots, J}$, where $M_0$ is the empty set and $M_j = \{b_1, b_2, \ldots, b_j\}$. Among the collection of models $M_j$, $j = 0, 1, \ldots, J$, we select the one that minimises a predefined information criterion. The obtained set of change-points is therefore a subset of the solution path given in sol_path_cplm. More details can be found in “Detecting multiple generalized change-points by isolating single ones”, Anastasiou and Fryzlewicz (2018), preprint.
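The selection step over nested candidate sets can be sketched in base R as follows. This is only an illustration under stated assumptions, not the package's implementation: the helpers rss_linear_spline and select_by_ic are hypothetical, the candidate vector b is supplied by hand rather than computed by sol_path_cplm, and the SIC-type penalty used here may differ from the "ssic_pen" and "sic_pen" penalties of cplm_ic.

```r
# Hypothetical helper (not part of IDetect): residual sum of squares of the
# least-squares continuous piecewise-linear fit with kinks at `knots`.
rss_linear_spline <- function(x, knots) {
  t <- seq_along(x)
  design <- cbind(1, t)                      # intercept and global slope
  for (k in knots) {
    design <- cbind(design, pmax(t - k, 0))  # hinge term: slope change at k
  }
  sum(lm.fit(design, x)$residuals^2)
}

# Hypothetical helper: evaluate the nested models built from the first
# j = 0, 1, ..., J candidates and keep the one with the smallest criterion.
# The SIC-type penalty below is an assumption made for this sketch.
select_by_ic <- function(x, b) {
  n <- length(x)
  J <- length(b)
  ic <- vapply(0:J, function(j) {
    rss <- rss_linear_spline(x, b[seq_len(j)])
    (n / 2) * log(rss / n) + (j + 2) * log(n)
  }, numeric(1))
  j_hat <- which.min(ic) - 1                 # ic[1] corresponds to the empty model
  list(ic_curve = ic, cpt_ic = sort(b[seq_len(j_hat)]))
}

set.seed(1)
signal <- c(seq(0, 99), seq(98, 0, -1))      # single kink at t = 100
x <- signal + rnorm(length(signal))
b <- c(100, 50, 150)                         # candidates, most plausible first
res <- select_by_ic(x, b)
```

Because the models are nested along the ordered candidates, only J + 1 fits are needed rather than one per subset of candidates.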
A list with the following components:
sol_path |
A vector containing the solution path. |
ic_curve |
A list with values of the chosen information criteria. |
cpt_ic |
A list with the change-points detected for each information criterion considered. |
no_cpt_ic |
The number of change-points detected for each information criterion considered. |
Andreas Anastasiou, [email protected]
ID_cplm and ID, which employ this function. In addition, see pcm_ic for the case of detecting changes in a piecewise-constant signal using the information criterion based approach.
# One change-point in the slope
single.cpt <- c(seq(0, 999, 1), seq(998.5, 499, -0.5))
single.cpt.noise <- single.cpt + rnorm(2000)
cpt.single.ic <- cplm_ic(single.cpt.noise)

# Three change-points in the slope
three.cpt <- c(seq(0, 499, 1), seq(498.5, 249, -0.5), seq(250, 1249, 2), seq(1248, 749, -1))
three.cpt.noise <- three.cpt + rnorm(2000)
cpt.three.ic <- cplm_ic(three.cpt.noise)
This function performs the Isolate-Detect methodology with the thresholding-based stopping rule in order to detect multiple change-points in a continuous, piecewise-linear noisy data sequence, with noise that is Gaussian. See Details for a brief explanation of the Isolate-Detect methodology (with the relevant reference) and of the thresholding-based stopping rule.
cplm_th(x, sigma = stats::mad(diff(diff(x)))/sqrt(6), thr_const = 1.4, thr_fin = sigma * thr_const * sqrt(2 * log(length(x))), s = 1, e = length(x), points = 3, k_l = 1, k_r = 1)
x |
A numeric vector containing the data in which you would like to find change-points. |
sigma |
A positive real number. It is the estimate of the standard deviation
of the noise in |
thr_const |
A positive real number with default value equal to 1.4. It is
used to define the threshold; see |
thr_fin |
With |
s , e
|
Positive integers with |
points |
A positive integer with default value equal to 3. It defines the distance between two consecutive end- or start-points of the right- or left-expanding intervals, respectively; see Details for more information. |
k_l , k_r
|
Positive integer numbers that get updated whenever the function calls itself during the detection process. They are not essential for the function to work, and we include them only to reduce the computational time. |
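The default values of sigma and thr_fin given in the usage of cplm_th can be reproduced with a few lines of base R. The helper name default_threshold is hypothetical and is not part of the package; the explanation of the sqrt(6) factor in the comments is a standard variance calculation for second differences of independent noise, offered here as a plausible reading of the default rather than a statement from the package authors.

```r
# Hypothetical helper (not part of IDetect) reproducing the default values of
# `sigma` and `thr_fin` shown in the usage of cplm_th.
default_threshold <- function(x, thr_const = 1.4) {
  # Second differences remove a linear trend; for i.i.d. noise with standard
  # deviation sigma they have variance (1 + 4 + 1) * sigma^2 = 6 * sigma^2,
  # which explains the division by sqrt(6).
  sigma <- stats::mad(diff(diff(x))) / sqrt(6)
  thr_fin <- sigma * thr_const * sqrt(2 * log(length(x)))
  list(sigma = sigma, thr_fin = thr_fin)
}

set.seed(1)
x <- seq(0, 10, length.out = 5000) + rnorm(5000)  # linear trend + N(0, 1) noise
th <- default_threshold(x)
```

A larger thr_const raises the threshold and so tends to yield fewer detected change-points.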
The change-point detection algorithm used in cplm_th is the Isolate-Detect (ID) methodology described in “Detecting multiple generalized change-points by isolating single ones”, Anastasiou and Fryzlewicz (2018), preprint. The concept is simple and is split into two stages: first, the isolation of each of the true change-points in subintervals of the data domain, and second, their detection. Writing $T$ for the length of x, ID first creates two ordered sets of $K = \lceil T/\mathrm{points} \rceil$ right- and left-expanding intervals as follows. The $j$-th right-expanding interval is $R_j = [1, j \times \mathrm{points}]$, while the $j$-th left-expanding interval is $L_j = [T - j \times \mathrm{points} + 1, T]$. We collect these intervals in the ordered set $S_{RL} = \{R_1, L_1, R_2, L_2, \ldots, R_K, L_K\}$. For a suitably chosen contrast function, ID first identifies the point with the maximum contrast value in $R_1$. If its value exceeds a certain threshold, then it is taken as a change-point. If not, then the process tests the next interval in $S_{RL}$ and repeats the above process. Upon detection, the algorithm makes a new start from the estimated location.
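The interleaved sequence of right- and left-expanding intervals can be listed with a short base R sketch. The helper expanding_intervals is hypothetical and not part of the package. It assumes ceiling(n_obs / points) intervals of each kind, and it clamps the final interval to the data range, which is a choice made for this illustration.

```r
# Hypothetical helper (not part of IDetect): list the intervals in the order
# R_1, L_1, R_2, L_2, ..., R_K, L_K for a sequence of length n_obs.
expanding_intervals <- function(n_obs, points = 3) {
  K <- ceiling(n_obs / points)
  j <- seq_len(K)
  right_end  <- pmin(j * points, n_obs)          # R_j = [1, j * points]
  left_start <- pmax(n_obs - j * points + 1, 1)  # L_j = [n_obs - j * points + 1, n_obs]
  # rbind() followed by as.vector() interleaves the two kinds of interval.
  data.frame(
    interval = paste0(rep(c("R", "L"), times = K), rep(j, each = 2)),
    start    = as.vector(rbind(1, left_start)),
    end      = as.vector(rbind(right_end, n_obs))
  )
}

ivals <- expanding_intervals(10, points = 3)
```

Smaller values of points give more, finer intervals, which helps isolation of nearby change-points at a higher computational cost.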
A numeric vector with the detected change-points.
Andreas Anastasiou, [email protected]
win_cplm_th, ID_cplm, and ID, which employ this function. In addition, see pcm_th for the case of detecting changes in a piecewise-constant signal via thresholding.
# One change-point in the slope
single.cpt <- c(seq(0, 999, 1), seq(998.5, 499, -0.5))
single.cpt.noise <- single.cpt + rnorm(2000)
cpt.single.th <- cplm_th(single.cpt.noise)

# Three change-points in the slope
three.cpt <- c(seq(0, 499, 1), seq(498.5, 249, -0.5), seq(251, 1249, 2), seq(1248, 749, -1))
three.cpt.noise <- three.cpt + rnorm(2000)
cpt.three.th <- cplm_th(three.cpt.noise)

# Many change-points: a repeated triangular wave of length 1980
multi.cpt <- rep(c(seq(0, 49, 1), seq(48, 0, -1)), 20)
multi.cpt.noise <- multi.cpt + rnorm(1980)
cpt.multi.th <- cplm_th(multi.cpt.noise)
This function estimates the signal in a given data sequence x with change-points at cpt. The type of the signal depends on whether the change-points represent changes in a piecewise-constant or continuous, piecewise-linear signal. For more information see Details below.
est_signal(x, cpt, type = c("mean", "slope"))
x |
A numeric vector containing the given data. |
cpt |
A positive integer vector with the locations of the change-points.
If missing, the |
type |
A character string, which defines the type of the detected change-points.
If |
The data points provided in x are assumed to follow
$$X_t = f_t + \sigma\epsilon_t, \quad t = 1, 2, \ldots, T,$$
where $T$ is the total length of the data sequence, $X_t$ are the observed data, $f_t$ is a one-dimensional, deterministic signal with abrupt structural changes at certain points, and $\epsilon_t$ is white noise. We denote by $r_1, r_2, \ldots, r_N$ the elements in cpt, and we set $r_0 = 0$ and $r_{N+1} = T$. Depending on the value that has been passed to type, the returned value is calculated as follows.
For type = "mean", in each segment $\{r_j + 1, \ldots, r_{j+1}\}$, $j = 0, 1, \ldots, N$, $f_t$ is approximated by the mean of $X_t$ calculated over $t \in \{r_j + 1, \ldots, r_{j+1}\}$.
For type = "slope", $f_t$ is approximated by the linear spline fit with knots at $r_1, r_2, \ldots, r_N$ minimising the $l_2$ distance between the fit and the data.
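For the type = "mean" case, the segment-wise averaging described above can be sketched in base R. The helper segment_means is hypothetical and is not the package's est_signal. It assumes that every change-point lies strictly between 1 and the length of x, with each segment running from the position after one change-point up to and including the next one.

```r
# Hypothetical helper (not part of IDetect): piecewise-constant fit obtained by
# averaging the data between consecutive change-points.
segment_means <- function(x, cpt) {
  bounds <- c(0, sort(cpt), length(x))   # segment j covers bounds[j] + 1, ..., bounds[j + 1]
  fit <- numeric(length(x))
  for (j in seq_len(length(bounds) - 1)) {
    idx <- (bounds[j] + 1):bounds[j + 1]
    fit[idx] <- mean(x[idx])
  }
  fit
}

fit_two_segments <- segment_means(c(2, 4, 6, 8), cpt = 2)
fit_no_cpt <- segment_means(c(1, 2, 3), cpt = integer(0))
```

With no change-points the fit reduces to the overall mean of the data.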
A numeric vector with the estimated signal.
Andreas Anastasiou, [email protected]
# Piecewise-constant signal, one change-point
single.cpt.pcm <- c(rep(4, 1000), rep(0, 1000))
single.cpt.pcm.noise <- single.cpt.pcm + rnorm(2000)
cpt.single.pcm <- ID_pcm(single.cpt.pcm.noise)
fit.cpt.single.pcm <- est_signal(single.cpt.pcm.noise, cpt.single.pcm$cpt, type = "mean")

# Piecewise-constant signal, three change-points
three.cpt.pcm <- c(rep(4, 500), rep(0, 500), rep(-4, 500), rep(1, 500))
three.cpt.pcm.noise <- three.cpt.pcm + rnorm(2000)
cpt.three.pcm <- ID_pcm(three.cpt.pcm.noise)
# The detected change-points are stored in the cpt component of the result
fit.cpt.three.pcm <- est_signal(three.cpt.pcm.noise, cpt.three.pcm$cpt, type = "mean")

# Continuous, piecewise-linear signal, one change-point
single.cpt.plm <- c(seq(0, 999, 1), seq(998.5, 499, -0.5))
single.cpt.plm.noise <- single.cpt.plm + rnorm(2000)
cpt.single.plm <- ID_cplm(single.cpt.plm.noise)
fit.cpt.single.plm <- est_signal(single.cpt.plm.noise, cpt.single.plm$cpt, type = "slope")
Using the Isolate-Detect methodology, this function estimates the number and locations of multiple change-points in the noisy, continuous, piecewise-linear input vector x, with noise that is not normally distributed. It also gives the estimated signal, as well as the solution path defined in sol_path_cplm (see Details for the relevant literature reference).
ht_ID_cplm(x, s.ht = 3, q_ht = 300, ht_thr_id = 1.4, ht_th_ic_id = 1.25, p_thr = 1, p_ic = 3)
x |
A numeric vector containing the data in which you would like to find change-points. |
s.ht |
A positive integer number with default value equal to 3. It is used to define the way we pre-average the given data sequence. For more information see Details. |
q_ht |
A positive integer number with default value equal to 300. If the
length of |
ht_thr_id |
A positive real number with default value equal to 1.4. It is
used to define the threshold, if the thresholding approach (described in |
ht_th_ic_id |
A positive real number with default value equal to 1.25. It is
useful only if the model selection based Isolate-Detect method is to be followed
and it is used to define the threshold value that will be used at the first step
(change-point overestimation) of the model selection approach described in
|
p_thr |
A positive integer with default value equal to 1. It is used only
when the threshold based approach (described in |
p_ic |
A positive integer with default value equal to 3. It is used only
when the information criterion based approach (described in |
First, this function calls normalise in order to create a new data sequence, $\tilde{x}$, by taking averages of observations in x. Then, we employ ID_cplm on $\tilde{x}$ to obtain the change-points, namely $\tilde{r}_1, \tilde{r}_2, \ldots, \tilde{r}_{\hat{N}}$, in increasing order. To obtain the original location of the change-points with, on average, the highest accuracy we define
$$\hat{r}_k = (\tilde{r}_k - 1) \times \mathrm{s.ht} + \lfloor \mathrm{s.ht}/2 + 0.5 \rfloor, \quad k = 1, 2, \ldots, \hat{N}.$$
More details can be found in “Detecting multiple generalized change-points by isolating single ones”, Anastasiou and Fryzlewicz (2018), preprint.
A list with the following components:
cpt |
A vector with the detected change-points. |
no_cpt |
The number of change-points detected. |
fit |
A numeric vector with the estimated continuous piecewise-linear signal. |
solution_path |
A vector containing the solution path. |
Andreas Anastasiou, [email protected]
ID_cplm and normalise, which are functions that are used in ht_ID_cplm. In addition, see ht_ID_pcm for the case of piecewise-constant mean signals.
# One change-point in the slope, Student-t noise
single.cpt <- c(seq(0, 1999, 1), seq(1998, -1, -1))
single.cpt.student <- single.cpt + rt(4000, df = 5)
cpt.single <- ht_ID_cplm(single.cpt.student)

# Three change-points in the slope, Student-t noise
three.cpt <- c(seq(0, 3998, 2), seq(3996, -2, -2), seq(0, 3998, 2), seq(3996, -2, -2))
three.cpt.student <- three.cpt + rt(8000, df = 5)
cpt.three <- ht_ID_cplm(three.cpt.student)
Using the Isolate-Detect methodology, this function estimates the number and locations of multiple change-points in the mean of the noisy, piecewise-constant input vector x, with noise that is not normally distributed. It also gives the estimated signal, as well as the solution path defined in sol_path_pcm. See Details for the relevant literature reference.
ht_ID_pcm(x, s.ht = 3, q_ht = 300, ht_thr_id = 1, ht_th_ic_id = 0.9, p_thr = 1, p_ic = 3)
x |
A numeric vector containing the data in which you would like to find change-points. |
s.ht |
A positive integer number with default value equal to 3. It is used to define the way we pre-average the given data sequence (see Details). |
q_ht |
A positive integer number with default value equal to 300. If the
length of |
ht_thr_id |
A positive real number with default value equal to 1. It is
used to define the threshold, if the thresholding approach is to be followed; see
|
ht_th_ic_id |
A positive real number with default value equal to 0.9. It is
useful only if the model selection based Isolate-Detect method is to be followed
and it is used to define the threshold value that will be used at the first step
(change-point overestimation) of the model selection approach described in
|
p_thr |
A positive integer with default value equal to 1. It is used only
when the threshold based approach (as described in |
p_ic |
A positive integer with default value equal to 3. It is used only
when the information criterion based approach (described in |
First, this function calls normalise in order to create a new data sequence, $\tilde{x}$, by taking averages of observations in x. Then, we employ ID_pcm on $\tilde{x}$ to obtain the change-points, namely $\tilde{r}_1, \tilde{r}_2, \ldots, \tilde{r}_{\hat{N}}$, in increasing order. To obtain the original location of the change-points with, on average, the highest accuracy we define
$$\hat{r}_k = (\tilde{r}_k - 1) \times \mathrm{s.ht} + \lfloor \mathrm{s.ht}/2 + 0.5 \rfloor, \quad k = 1, 2, \ldots, \hat{N}.$$
More details can be found in “Detecting multiple generalized change-points by isolating single ones”, Anastasiou and Fryzlewicz (2018), preprint.
A list with the following components:
cpt |
A vector with the detected change-points. |
no_cpt |
The number of change-points detected. |
fit |
A numeric vector with the estimated piecewise-constant signal. |
solution_path |
A vector containing the solution path. |
Andreas Anastasiou, [email protected]
ID_pcm and normalise, which are functions that are used in ht_ID_pcm. In addition, see ht_ID_cplm for the case of continuous and piecewise-linear signals.
# One change-point in the mean, Student-t noise
single.cpt <- c(rep(4, 3000), rep(0, 3000))
single.cpt.student <- single.cpt + rt(6000, df = 5)
cpts_detect <- ht_ID_pcm(single.cpt.student)

# Three change-points in the mean, Student-t noise
three.cpt <- c(rep(4, 2000), rep(0, 2000), rep(-4, 2000), rep(0, 2000))
three.cpt.student <- three.cpt + rt(8000, df = 5)
cpts_detect_three <- ht_ID_pcm(three.cpt.student)
This is the main, general function of the package. It employs more specialised functions in order to estimate the number and locations of multiple change-points in the noisy, piecewise-constant or continuous, piecewise-linear input vector xd. The noise can either follow the Gaussian distribution or not. The approach that is followed is a hybrid between the thresholding approach (explained in pcm_th and cplm_th) and the information criterion approach (explained in pcm_ic and cplm_ic) and estimates the change-points taking into account both these approaches. In addition to the number and the location of the estimated change-points, ID returns the estimated signal, as well as the solution path. For more information and the relevant literature reference, see Details.
ID(xd, th.cons = 1, th.cons_lin = 1.4, th.ic = 0.9, th.ic.lin = 1.25, lambda = 3, lambda.ic = 10, contrast = c("mean", "slope"), ht = FALSE, scale = 3)
xd |
A numeric vector containing the data in which you would like to find change-points. |
th.cons |
A positive real number with default value equal to 1. It is
used to define the threshold, if the thresholding approach (explained in |
th.cons_lin |
A positive real number with default value equal to 1.4. It is
used to define the threshold, if the thresholding approach (explained in |
th.ic |
A positive real number with default value equal to 0.9. It is
useful only if the model selection based Isolate-Detect method (described in
|
th.ic.lin |
A positive real number with default value equal to 1.25. It is
useful only if the model selection based Isolate-Detect method (described in
|
lambda |
A positive integer with default value equal to 3. It is used only when the threshold based approach is to be followed and it defines the distance between two consecutive end- or start-points of the right- or left-expanding intervals, respectively. |
lambda.ic |
A positive integer with default value equal to 10. It is used only when the information criterion based approach is to be followed and it defines the distance between two consecutive end- or start-points of the right- or left-expanding intervals, respectively. |
contrast |
A character string, which defines the type of the contrast function to
be used in the Isolate-Detect algorithm. If |
ht |
A logical variable with default value equal to |
scale |
A positive integer number with default value equal to 3. It is
used to define the way we pre-average the given data sequence only if
|
The data points provided in xd are assumed to follow
$$X_t = f_t + \sigma\epsilon_t, \quad t = 1, 2, \ldots, T,$$
where $T$ is the total length of the data sequence, $X_t$ are the observed data, $f_t$ is a one-dimensional, deterministic signal with abrupt structural changes at certain points, and $\epsilon_t$ are independent and identically distributed random variables with mean zero and variance one. In this function, the following scenarios for $f_t$ are implemented.
Piecewise-constant signal with Gaussian noise. Use contrast = "mean" and ht = FALSE here.
Piecewise-constant signal with heavy-tailed noise. Use contrast = "mean" and ht = TRUE here.
Continuous, piecewise-linear signal with Gaussian noise. Use contrast = "slope" and ht = FALSE here.
Continuous, piecewise-linear signal with heavy-tailed noise. Use contrast = "slope" and ht = TRUE here.
In the case where ht = FALSE: the function first detects the change-points using win_pcm_th (for the case of piecewise-constant signal) or win_cplm_th (for the case of continuous, piecewise-linear signal). If the estimated number of change-points is greater than 100, then the result is returned and we stop. Otherwise, ID proceeds to detect the change-points using pcm_ic (for the case of piecewise-constant signal) or cplm_ic (for the case of continuous, piecewise-linear signal) and this is what is returned.
In the case where ht = TRUE: first we pre-average the given data sequence using normalise and then, on the obtained data sequence, we follow exactly the same procedure as the one when ht = FALSE above.
More details can be found in “Detecting multiple generalized change-points by isolating single ones”, Anastasiou and Fryzlewicz (2018), preprint.
A list with the following components:
cpt |
A vector with the detected change-points. |
no_cpt |
The number of change-points detected. |
fit |
A numeric vector with the estimated signal. |
solution_path |
A vector containing the solution path. |
Andreas Anastasiou, [email protected]
ID_pcm, ID_cplm, ht_ID_pcm, and ht_ID_cplm, which are the functions that are employed in ID, depending on which scenario is imposed by the input arguments.
# Piecewise-constant signal: Gaussian and Student-t noise
single.cpt.mean <- c(rep(4, 3000), rep(0, 3000))
single.cpt.mean.normal <- single.cpt.mean + rnorm(6000)
single.cpt.mean.student <- single.cpt.mean + rt(6000, df = 5)
cpt.single.mean.normal <- ID(single.cpt.mean.normal)
cpt.single.mean.student <- ID(single.cpt.mean.student, ht = TRUE)

# Continuous, piecewise-linear signal: Gaussian and Student-t noise
single.cpt.slope <- c(seq(0, 1999, 1), seq(1998, -1, -1))
single.cpt.slope.normal <- single.cpt.slope + rnorm(4000)
single.cpt.slope.student <- single.cpt.slope + rt(4000, df = 5)
cpt.single.slope.normal <- ID(single.cpt.slope.normal, contrast = "slope")
cpt.single.slope.student <- ID(single.cpt.slope.student, contrast = "slope", ht = TRUE)
This function estimates the number and locations of multiple change-points in the noisy, continuous and piecewise-linear input vector x, using the Isolate-Detect methodology. The noise follows the normal distribution. The estimated signal, as well as the solution path defined in sol_path_cplm, are also given. The function is a hybrid between the thresholding approach of win_cplm_th and the information criterion approach of cplm_ic and estimates the change-points taking into account both these approaches (see Details for more information and the relevant literature reference).
ID_cplm(x, thr_id = 1.4, th_ic_id = 1.25, pointsth = 3, pointsic = 10)
x |
A numeric vector containing the data in which you would like to find change-points. |
thr_id |
A positive real number with default value equal to 1.4. It is
used to define the threshold, if the thresholding approach is to be followed; see
|
th_ic_id |
A positive real number with default value equal to 1.25. It is
useful only if the model selection based Isolate-Detect method is to be followed
and it is used to define the threshold value that will be used at the first step
(change-point overestimation) of the model selection approach described in
|
pointsth |
A positive integer with default value equal to 3. It is used only when the threshold based approach is to be followed and it defines the distance between two consecutive end- or start-points of the right- or left-expanding intervals, respectively. |
pointsic |
A positive integer with default value equal to 10. It is used only when the information criterion based approach is to be followed and it defines the distance between two consecutive end- or start-points of the right- or left-expanding intervals, respectively. |
First, this function detects the change-points using win_cplm_th. If the estimated number of change-points is larger than 100, then the result is returned and we stop. Otherwise, ID_cplm proceeds to detect the change-points using cplm_ic and this is what is returned. To sum up, ID_cplm returns a result based on cplm_ic if the number of change-points estimated by thresholding is at most 100. Otherwise, the result comes from thresholding.
More details can be found in “Detecting multiple generalized change-points by isolating single ones”, Anastasiou and Fryzlewicz (2018), preprint.
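The control flow of this hybrid rule can be sketched in base R with stand-in detectors. Everything in the block is hypothetical: hybrid_detect is not a package function, and detect_th and detect_ic are placeholders for win_cplm_th and cplm_ic that simply return a vector of change-point locations, which simplifies the list that cplm_ic actually returns.

```r
# Hypothetical sketch of the hybrid rule: keep the thresholding result only
# when it reports more than `max_cpt` change-points, otherwise use the
# information criterion based result.
hybrid_detect <- function(x, detect_th, detect_ic, max_cpt = 100) {
  cpt_th <- detect_th(x)
  if (length(cpt_th) > max_cpt) {
    return(list(cpt = cpt_th, method = "threshold"))
  }
  list(cpt = detect_ic(x), method = "ic")
}

# Stand-in detectors used only to exercise the branching logic.
many_cpts <- function(x) seq_len(150)
few_cpts  <- function(x) c(10, 20, 30)
ic_cpts   <- function(x) c(10, 20)

x <- numeric(1000)
res_many <- hybrid_detect(x, many_cpts, ic_cpts)
res_few  <- hybrid_detect(x, few_cpts, ic_cpts)
```

Here res_many keeps the thresholding output, while res_few falls through to the information criterion branch.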
A list with the following components:
cpt |
A vector with the detected change-points. |
no_cpt |
The number of change-points detected. |
fit |
A numeric vector with the estimated continuous piecewise-linear signal. |
solution_path |
A vector containing the solution path. |
Andreas Anastasiou, [email protected]
win_cplm_th and cplm_ic, which are the functions that ID_cplm is based on. In addition, see ID_pcm for the case of detecting changes in the mean of a piecewise-constant signal. The main function ID of the package employs ID_cplm.
# One change-point in the slope
single.cpt <- c(seq(0, 999, 1), seq(998.5, 499, -0.5))
single.cpt.noise <- single.cpt + rnorm(2000)
cpt.single <- ID_cplm(single.cpt.noise)

# Three change-points in the slope
three.cpt <- c(seq(0, 499, 1), seq(498.5, 249, -0.5), seq(250, 1249, 2), seq(1248, 749, -1))
three.cpt.noise <- three.cpt + rnorm(2000)
cpt.three <- ID_cplm(three.cpt.noise)

# Many change-points: a repeated triangular wave of length 1980
multi.cpt <- rep(c(seq(0, 49, 1), seq(48, 0, -1)), 20)
multi.cpt.noise <- multi.cpt + rnorm(1980)
cpt.multi <- ID_cplm(multi.cpt.noise)
This function estimates the number and locations of multiple change-points in the mean of the noisy piecewise-constant input vector x, using the Isolate-Detect methodology. The noise is Gaussian. The estimated signal, as well as the solution path defined in sol_path_pcm, are also given. The function is a hybrid between the thresholding approach of win_pcm_th and the information criterion approach of pcm_ic and estimates the change-points taking into account both these approaches (see Details for more information and the relevant literature reference).
ID_pcm(x, thr_id = 1, th_ic_id = 0.9, pointsth = 3, pointsic = 10)
x |
A numeric vector containing the data in which you would like to find change-points. |
thr_id |
A positive real number with default value equal to 1. It is
used to define the threshold, if the thresholding approach is to be followed; see |
th_ic_id |
A positive real number with default value equal to 0.9. It is
useful only if the model selection based Isolate-Detect method is to be followed.
It is used to define the threshold value that will be used at the first step
(change-point overestimation) of the model selection approach described in |
pointsth |
A positive integer with default value equal to 3. It is used only when the threshold based approach is to be followed and it defines the distance between two consecutive end- or start-points of the right- or left-expanding intervals, respectively. |
pointsic |
A positive integer with default value equal to 10. It is used only when the information criterion based approach is to be followed and it defines the distance between two consecutive end- or start-points of the right- or left-expanding intervals, respectively. |
First, this function detects the change-points using win_pcm_th. If the estimated number of change-points is larger than 100, then the result is returned and we stop. Otherwise, ID_pcm proceeds to detect the change-points using pcm_ic and this is what is returned. To sum up, ID_pcm returns a result based on pcm_ic if the number of change-points estimated by thresholding is at most 100. Otherwise, the result comes from thresholding.
More details can be found in “Detecting multiple generalized change-points by isolating single ones”, Anastasiou and Fryzlewicz (2018), preprint.
A list with the following components:
cpt |
A vector with the detected change-points. |
no_cpt |
The number of change-points detected. |
fit |
A numeric vector with the estimated piecewise-constant signal. |
solution_path |
A vector containing the solution path. |
Andreas Anastasiou, [email protected]
win_pcm_th and pcm_ic, which are the functions that ID_pcm is based on. In addition, see ID_cplm for the case of detecting changes in a continuous, piecewise-linear signal. The main function ID of the package employs ID_pcm.
# One change-point in the mean
single.cpt <- c(rep(4, 1000), rep(0, 1000))
single.cpt.noise <- single.cpt + rnorm(2000)
cpts_detect <- ID_pcm(single.cpt.noise)

# Three change-points in the mean
three.cpt <- c(rep(4, 500), rep(0, 500), rep(-4, 500), rep(1, 500))
three.cpt.noise <- three.cpt + rnorm(2000)
cpts_detect_three <- ID_pcm(three.cpt.noise)

# Many change-points: the mean alternates between 0 and 3 every 50 observations
multi.cpt <- rep(c(rep(0, 50), rep(3, 50)), 20)
multi.cpt.noise <- multi.cpt + rnorm(2000)
cpts_detect_multi <- ID_pcm(multi.cpt.noise)
The IDetect package implements the Isolate-Detect methodology for multiple generalised change-point detection, or sequence segmentation, in one-dimensional data following the “deterministic signal + noise” model. The signal structures that are implemented are: piecewise-constant signal with Gaussian noise, piecewise-constant signal with heavy-tailed noise, piecewise-linear and continuous signal with Gaussian noise, and piecewise-linear and continuous signal with heavy-tailed noise. The main routine of the package is ID.
Andreas Anastasiou, [email protected], Piotr Fryzlewicz, [email protected]
“Detecting multiple generalized change-points by isolating single ones”, Anastasiou and Fryzlewicz (2018), preprint.
ID, ID_pcm, ID_cplm, ht_ID_pcm, and ht_ID_cplm.
#See Examples for ID.
This function pre-processes the given data in order to obtain a noise structure that is closer to satisfying the Gaussianity assumption. See Details for more information and for the relevant literature reference.
normalise(x, sc = 3)
x |
A numeric vector containing the data. |
sc |
A positive integer number with default value equal to 3. It is used to define the way we pre-average the given data sequence. |
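For a given natural number sc and data x of length $T$, let $Q = \lceil T/\mathrm{sc} \rceil$. Then, normalise calculates
$$\tilde{x}_q = \frac{1}{\mathrm{sc}} \sum_{t = (q-1)\,\mathrm{sc} + 1}^{q\,\mathrm{sc}} x_t$$
for $q = 1, 2, \ldots, Q-1$, while
$$\tilde{x}_Q = \frac{1}{T - (Q-1)\,\mathrm{sc}} \sum_{t = (Q-1)\,\mathrm{sc} + 1}^{T} x_t.$$
More details can be found in the preprint “Detecting multiple generalized change-points by isolating single ones”, Anastasiou and Fryzlewicz (2018).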
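The pre-averaging step can be sketched in base R as block means over consecutive, non-overlapping windows of sc observations. The helper pre_average is hypothetical and is not the package's normalise. It assumes that the final block simply averages whatever observations remain when the length of x is not a multiple of sc.

```r
# Hypothetical helper (not part of IDetect): average consecutive blocks of
# `sc` observations; the last block may contain fewer than `sc` values.
pre_average <- function(x, sc = 3) {
  n <- length(x)
  Q <- ceiling(n / sc)                             # number of blocks
  block <- rep(seq_len(Q), each = sc)[seq_len(n)]  # block label of each observation
  as.numeric(tapply(x, block, mean))
}

averaged <- pre_average(1:7, sc = 3)   # blocks: (1, 2, 3), (4, 5, 6), (7)
```

Averaging reduces the weight of the tails of the noise distribution, at the price of a sequence that is roughly sc times shorter.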
The “normalised” vector $\tilde{x}$ of length $Q = \lceil T/\mathrm{sc} \rceil$, where $T$ is the length of x, as explained in Details.
Andreas Anastasiou, [email protected]
ht_ID_pcm and ht_ID_cplm, which are functions that employ normalise.
# Heavy-tailed noise from a Student-t distribution with 5 degrees of freedom
t5 <- rt(n = 10000, df = 5)
n5 <- normalise(t5, sc = 3)
This function performs the Isolate-Detect methodology based on an information criterion approach, in order to detect multiple change-points in the mean of a noisy data sequence, with the noise following the Gaussian distribution. More information on how this approach works as well as the relevant literature reference are given in Details.
pcm_ic(x, th_const = 0.9, Kmax = 200, penalty = c("ssic_pen", "sic_pen"), points = 10)
Argument | Description |
---|---|
x | A numeric vector containing the data in which you would like to find change-points. |
th_const | A positive real number with default value equal to 0.9. It is used to define the threshold value that will be used at the first step of the model selection based Isolate-Detect method; see Details for more information. |
Kmax | A positive integer with default value equal to 200. It is the maximum allowed number of estimated change-points in the solution path algorithm, described in Details below. |
penalty | A character vector with names of the penalty functions used. |
points | A positive integer with default value equal to 10. It defines the distance between two consecutive end- or start-points of the right- or left-expanding intervals, respectively. |
The approach followed in pcm_ic in order to detect the change-points is based on identifying the set of change-points that minimises an information criterion. At first, we employ sol_path_pcm, which overestimates the number of change-points, using th_const to define the threshold, and then sorts the obtained estimates so that the estimate that is most likely to be correct appears first, whereas the one least likely to be correct appears last. Let $J$ be the number of estimates that this overestimation approach returns. We obtain a vector $b = (b_1, b_2, \ldots, b_J)$, with the estimates ordered as explained above. We define the collection $\{M_j\}_{j = 0, 1, \ldots, J}$, where $M_0$ is the empty set and $M_j = \{b_1, b_2, \ldots, b_j\}$. Among the collection of models $M_j$, $j = 0, 1, \ldots, J$, we select the one that minimises a predefined information criterion. The obtained set of change-points is, by construction, a subset of the solution path given in sol_path_pcm. More details can be found in “Detecting multiple generalized change-points by isolating single ones”, Anastasiou and Fryzlewicz (2018), preprint.
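The selection over the nested models can be sketched in base R as follows. This is only an illustration under stated assumptions: the criterion used is a Schwarz-type one, n/2 * log(RSS/n) + j * log(n), chosen here for concreteness rather than taken from the package, and the helper names rss_pcm and select_by_ic are hypothetical. The ordered solution path b is supplied by hand instead of being computed by sol_path_pcm.

```r
# Residual sum of squares of a piecewise-constant fit with change-points cpt
# (a change-point at position b means that the mean changes after x[b]).
rss_pcm <- function(x, cpt) {
  bounds <- c(0, sort(cpt), length(x))
  rss <- 0
  for (k in seq_len(length(bounds) - 1)) {
    seg <- x[(bounds[k] + 1):bounds[k + 1]]
    rss <- rss + sum((seg - mean(seg))^2)
  }
  rss
}

# Evaluate the assumed criterion on the nested models M_0, M_1, ..., M_J,
# where M_j contains the first j elements of the ordered solution path b.
select_by_ic <- function(x, b) {
  n <- length(x)
  J <- length(b)
  ic <- numeric(J + 1)
  for (j in 0:J) {
    ic[j + 1] <- n / 2 * log(rss_pcm(x, b[seq_len(j)]) / n) + j * log(n)
  }
  j_hat <- which.min(ic) - 1
  list(ic_curve = ic, cpt = sort(b[seq_len(j_hat)]))
}

# Deterministic toy data: one jump after observation 50, small alternating noise
x <- c(rep(0, 50), rep(5, 50)) + rep(c(0.1, -0.1), 50)
res <- select_by_ic(x, b = c(50, 20, 80))  # path ordered from most to least likely
res$cpt                                    # the model M_1 = {50} is selected
```

Because the candidate models are nested, only J + 1 fits are compared, instead of every subset of the J estimates.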
A list with the following components:
Component | Description |
---|---|
sol_path | A vector containing the solution path. |
ic_curve | A list with values of the chosen information criteria. |
cpt_ic | A list with the change-points detected for each information criterion considered. |
no_cpt_ic | The number of change-points detected for each information criterion considered. |
Andreas Anastasiou, [email protected]
ID_pcm and ID, which employ this function. In addition, see cplm_ic for the case of detecting changes in a continuous, piecewise-linear signal using the information criterion based approach.
# One change-point in the mean
single.cpt <- c(rep(4, 1000), rep(0, 1000))
single.cpt.noise <- single.cpt + rnorm(2000)
cpt.single.ic <- pcm_ic(single.cpt.noise)

# Three change-points in the mean
three.cpt <- c(rep(4, 500), rep(0, 500), rep(-4, 500), rep(1, 500))
three.cpt.noise <- three.cpt + rnorm(2000)
cpt.three.ic <- pcm_ic(three.cpt.noise)
This function performs the Isolate-Detect methodology (see Details for the relevant literature reference) with the thresholding-based stopping rule in order to detect multiple change-points in the mean of a noisy input vector x, with Gaussian noise. See Details for a brief explanation of the Isolate-Detect methodology, and of the thresholding-based stopping rule.
pcm_th(x, sigma = stats::mad(diff(x)/sqrt(2)), thr_const = 1, thr_fin = sigma * thr_const * sqrt(2 * log(length(x))), s = 1, e = length(x), points = 3, k_l = 1, k_r = 1)
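Argument | Description |
---|---|
x | A numeric vector containing the data in which you would like to find change-points. |
sigma | A positive real number. It is the estimate of the standard deviation of the noise in x. The default value is stats::mad(diff(x)/sqrt(2)). |
thr_const | A positive real number with default value equal to 1. It is used to define the threshold; see thr_fin. |
thr_fin | With T the length of the data sequence, this is a positive real number with default value equal to sigma * thr_const * sqrt(2 * log(T)). It is the threshold used in the detection process. |
s, e | Positive integers with s less than e, which restrict the search for change-points to the observations with indices in [s, e]. The default values are s = 1 and e = length(x). |
points | A positive integer with default value equal to 3. It defines the distance between two consecutive end- or start-points of the right- or left-expanding intervals, respectively; see Details for more information. |
k_l, k_r | Positive integer numbers that get updated whenever the function calls itself during the detection process. They are not essential for the function to work, and we include them only to reduce the computational time. |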
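The default for sigma in the usage of pcm_th is a difference-based estimate. For independent Gaussian noise with standard deviation σ, the first differences of the noise have variance 2σ², so dividing by sqrt(2) restores the original scale, and because stats::mad is based on a median, the few differences that straddle a change-point barely affect it. The sketch below illustrates this on simulated data; the seed and the signal are arbitrary choices made for the illustration.

```r
set.seed(1)
signal <- c(rep(4, 1000), rep(0, 1000))      # one change-point after observation 1000
x <- signal + rnorm(2000)                    # the true noise standard deviation is 1

sd_naive  <- stats::sd(x)                    # inflated by the jump in the mean
sigma_hat <- stats::mad(diff(x) / sqrt(2))   # the default value of sigma in pcm_th

# The default threshold thr_fin with thr_const = 1
thr_fin <- sigma_hat * 1 * sqrt(2 * log(length(x)))

c(sd_naive = sd_naive, sigma_hat = sigma_hat, thr_fin = thr_fin)
```

Here sigma_hat stays close to 1, whereas the plain standard deviation of x is dominated by the size of the jump.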
The change-point detection algorithm that is used in pcm_th is the Isolate-Detect methodology described in “Detecting multiple generalized change-points by isolating single ones”, Anastasiou and Fryzlewicz (2018), preprint. The concept is simple and is split into two stages: firstly, isolation of each of the true change-points in subintervals of the data domain, and secondly their detection. ID first creates two ordered sets of right- and left-expanding intervals as follows. With $T$ the length of x, the $j$-th right-expanding interval is $R_j = [1, j \times \mathrm{points}]$, while the $j$-th left-expanding interval is $L_j = [T - j \times \mathrm{points} + 1, T]$. We collect these intervals in the ordered set $S_{RL} = \{R_1, L_1, R_2, L_2, \ldots\}$. For a suitably chosen contrast function, ID first identifies the point with the maximum contrast value in $R_1$. If its value exceeds a certain threshold, then it is taken as a change-point. If not, then the process tests the next interval in $S_{RL}$ and repeats the above process. Upon detection, the algorithm makes a new start from the estimated location.
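The isolation step can be sketched in base R as follows. This is a simplified illustration rather than the package's implementation: the helper names are hypothetical, the intervals are truncated so that they stay inside [1, T], the contrast is assumed to be the absolute CUSUM statistic, and the search stops at the first detection instead of restarting from the estimated location.

```r
# Ordered set R_1, L_1, R_2, L_2, ... of expanding intervals on 1, ..., n
expanding_intervals <- function(n, points = 3) {
  Q <- ceiling(n / points)
  out <- vector("list", 2 * Q)
  for (j in seq_len(Q)) {
    out[[2 * j - 1]] <- c(1, min(j * points, n))          # right-expanding R_j
    out[[2 * j]]     <- c(max(n - j * points + 1, 1), n)  # left-expanding L_j
  }
  out
}

# Location and value of the largest absolute CUSUM statistic on [s, e]
cusum_max <- function(x, s, e) {
  if (e <= s) return(list(b = NA_integer_, value = 0))
  n <- e - s + 1
  b <- s:(e - 1)                              # candidate change-point locations
  left <- cumsum(x[s:e])[seq_along(b)]        # sums of x[s], ..., x[b]
  right <- sum(x[s:e]) - left                 # sums of x[b + 1], ..., x[e]
  l <- b - s + 1
  r <- e - b
  stat <- abs(sqrt(r / (n * l)) * left - sqrt(l / (n * r)) * right)
  list(b = b[which.max(stat)], value = max(stat))
}

# Scan the intervals in order and return the first location whose
# contrast exceeds the threshold thr (NA if there is none)
first_detection <- function(x, thr, points = 3) {
  for (iv in expanding_intervals(length(x), points)) {
    m <- cusum_max(x, iv[1], iv[2])
    if (m$value > thr) return(m$b)
  }
  NA_integer_
}

x <- c(rep(0, 10), rep(5, 10))            # noiseless jump after observation 10
first_detection(x, thr = 3, points = 3)   # detected in R_4 = [1, 12], at location 10
```

In this toy example the first six intervals contain no change-point, so their contrast is zero and nothing is detected; R_4 = [1, 12] is the first interval containing the jump, and it contains no other change-point, which is the isolation idea described above.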
A numeric vector with the detected change-points.
Andreas Anastasiou, [email protected]
win_pcm_th, ID_pcm, and ID, which employ this function. In addition, see cplm_th for the case of detecting changes in a continuous, piecewise-linear signal via thresholding.
# One change-point in the mean
single.cpt <- c(rep(4, 1000), rep(0, 1000))
single.cpt.noise <- single.cpt + rnorm(2000)
cpt.single.th <- pcm_th(single.cpt.noise)

# Three change-points in the mean
three.cpt <- c(rep(4, 500), rep(0, 500), rep(-4, 500), rep(1, 500))
three.cpt.noise <- three.cpt + rnorm(2000)
cpt.three.th <- pcm_th(three.cpt.noise)

# Many change-points: the mean alternates between 0 and 3 every 50 observations
multi.cpt <- rep(c(rep(0, 50), rep(3, 50)), 20)
multi.cpt.noise <- multi.cpt + rnorm(2000)
cpt.multi.th <- pcm_th(multi.cpt.noise)
This function returns the difference between x and the estimated signal with change-points at cpt. The input in the argument type_chg indicates the type of changes in the signal.
resid_ID(x, cpt, type_chg = c("mean", "slope"), type_res = c("raw", "standardised"))
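Argument | Description |
---|---|
x | A numeric vector containing the data. |
cpt | A positive integer vector with the locations of the change-points. If missing, the ID function is called internally to detect the change-points in x. |
type_chg | A character string, which defines the type of the detected change-points. If type_chg = "mean", the change-points are changes in the mean of a piecewise-constant signal; if type_chg = "slope", they are changes in the slope of a continuous, piecewise-linear signal. |
type_res | A choice of "raw" or "standardised" residuals. |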
If type_res = "raw", the function returns the difference between the data and the estimated signal. If type_res = "standardised", then the function returns the difference between the data and the estimated signal, divided by the estimated standard deviation.
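For changes in the mean, the two kinds of residuals can be sketched in base R as follows. The helper name pcm_residuals is hypothetical, the estimated signal is taken to be the sample mean of each segment, and the standard deviation used for standardising is assumed here to be the sample standard deviation of the raw residuals; the package may estimate it differently.

```r
# Hypothetical helper: residuals from a piecewise-constant fit whose mean
# changes after each position in cpt
pcm_residuals <- function(x, cpt, type_res = c("raw", "standardised")) {
  type_res <- match.arg(type_res)
  bounds <- c(0, sort(cpt), length(x))
  fit <- numeric(length(x))
  for (k in seq_len(length(bounds) - 1)) {
    idx <- (bounds[k] + 1):bounds[k + 1]
    fit[idx] <- mean(x[idx])                 # estimated signal on this segment
  }
  res <- x - fit
  if (type_res == "standardised") {
    res <- res / stats::sd(res)              # assumption: scale by sd of raw residuals
  }
  res
}

x <- c(1, 3, 10, 12)
pcm_residuals(x, cpt = 2, type_res = "raw")           # fitted means are 2 and 11
pcm_residuals(x, cpt = 2, type_res = "standardised")
```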
Andreas Anastasiou, [email protected]
# Detect the change-point in the mean, then inspect both kinds of residuals
single.cpt.pcm <- c(rep(4, 1000), rep(0, 1000))
single.cpt.pcm.noise <- single.cpt.pcm + rnorm(2000)
cpt_detect <- ID(single.cpt.pcm.noise, contrast = "mean")

residuals_cpt_raw <- resid_ID(single.cpt.pcm.noise, cpt = cpt_detect$cpt,
                              type_chg = "mean", type_res = "raw")
residuals_cpt_stand. <- resid_ID(single.cpt.pcm.noise, cpt = cpt_detect$cpt,
                                 type_chg = "mean", type_res = "standardised")

plot(residuals_cpt_raw)
plot(residuals_cpt_stand.)
This function finds two subsets of integers in a given interval [s,e]. The routine is typically not called directly by the user; its result is used in order to construct the expanding intervals, where the Isolate-Detect method is going to be applied. For more details on how the Isolate-Detect methodology works, see References.
s_e_points(r, l, s, e)
Argument | Description |
---|---|
r | A positive integer vector containing the set, from which the end-points of the expanding intervals are to be chosen. |
l | A positive integer vector containing the set, from which the start-points of the expanding intervals are to be chosen. |
s | A positive integer indicating the starting position, in the sense that we will choose the elements from r and l that are greater than s. |
e | A positive integer indicating the finishing position, in the sense that we will choose the elements from r and l that are less than e. |
Component | Description |
---|---|
e_points | A vector containing the points that will be used as end-points, in order to create the right-expanding intervals. It consists of the input e and all the elements in the input vector r that are in (s,e). |
s_points | A vector containing the points that will be used as start-points, in order to create the left-expanding intervals. It consists of the input s and all the elements in the input vector l that are in (s,e). |
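The selection of the two sets of points can be sketched in base R as follows. The helper name expanding_points is hypothetical, and the order in which the points are returned (elements of r or l first, then e or s) is an assumption made for this illustration.

```r
# Hypothetical helper: keep the elements of r and l that lie strictly
# inside (s, e), and append the interval end e or start s respectively
expanding_points <- function(r, l, s, e) {
  list(
    e_points = c(r[r > s & r < e], e),  # end-points of the intervals [s, e_point]
    s_points = c(l[l > s & l < e], s)   # start-points of the intervals [s_point, e]
  )
}

pts <- expanding_points(r = seq(3, 100, 3), l = seq(98, 1, -3), s = 43, e = 86)
pts$e_points   # 45, 48, ..., 84, 86
pts$s_points   # 83, 80, ..., 44, 43
```

With these points, the intervals starting at s = 43 grow to the right until they reach e = 86, and the intervals ending at e = 86 grow to the left until they reach s = 43.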
Andreas Anastasiou, [email protected]
Anastasiou, A. and Fryzlewicz, P. (2018). Detecting multiple generalized change-points by isolating single ones.
s_e_points(r = seq(10, 1000, 10), l = seq(991, 1, -10), s = 435, e = 786)
s_e_points(r = seq(3, 100, 3), l = seq(98, 1, -3), s = 43, e = 86)
This function starts by overestimating the number of true change-points. After that, following an approach based on the values of a suitable contrast function, it sorts the estimated change-points so that the estimate that is most likely to be correct appears first, whereas the one least likely to be correct appears last. The routine is typically not called directly by the user; it is employed in cplm_ic. For more details, see References.
sol_path_cplm(x, thr_ic = 1.25, points = 3)
Argument | Description |
---|---|
x | A numeric vector containing the data in which you would like to find change-points. |
thr_ic | A positive real number with default value equal to 1.25. It is used to define the threshold. The change-points are estimated by thresholding with threshold equal to sigma * thr_ic * sqrt(2 * log(T)), where T is the length of the data sequence x and sigma is the estimated standard deviation of the noise. |
points | A positive integer with default value equal to 3. It defines the distance between two consecutive end- or start-points of the right- or left-expanding intervals, respectively. |
The solution path for the case of continuous piecewise-linear signals.
Andreas Anastasiou, [email protected]
Anastasiou, A. and Fryzlewicz, P. (2018). Detecting multiple generalized change-points by isolating single ones.
# Continuous, piecewise-linear signal with three changes in the slope
three.cpt <- c(seq(0, 499, 1), seq(498.5, 249, -0.5), seq(250.5, 999, 1.5), seq(998, 499, -1))
three.cpt.noise <- three.cpt + rnorm(2000)
solution.path <- sol_path_cplm(three.cpt.noise)
This function starts by overestimating the number of true change-points. After that, following a CUSUM-based approach, it sorts the estimated change-points so that the estimate that is most likely to be correct appears first, whereas the one least likely to be correct appears last. The routine is typically not called directly by the user; it is employed in pcm_ic. For more information, see References.
sol_path_pcm(x, thr_ic = 0.9, points = 3)
Argument | Description |
---|---|
x | A numeric vector containing the data in which you would like to find change-points. |
thr_ic | A positive real number with default value equal to 0.9. It is used to define the threshold. The change-points are estimated by thresholding with threshold equal to sigma * thr_ic * sqrt(2 * log(T)), where T is the length of the data sequence x and sigma is the estimated standard deviation of the noise. |
points | A positive integer with default value equal to 3. It defines the distance between two consecutive end- or start-points of the right- or left-expanding intervals, respectively. |
The solution path for the case of piecewise-constant signals.
Andreas Anastasiou, [email protected]
Anastasiou, A. and Fryzlewicz, P. (2018). Detecting multiple generalized change-points by isolating single ones.
# Piecewise-constant signal with three changes in the mean
three.cpt <- c(rep(4, 4000), rep(0, 4000), rep(-4, 4000), rep(1, 4000))
three.cpt.noise <- three.cpt + rnorm(16000)
solution.path <- sol_path_pcm(three.cpt.noise)
This function performs the windows-based variant of the Isolate-Detect methodology with the thresholding-based stopping rule in order to detect multiple change-points in a continuous, piecewise-linear noisy data sequence, with the noise being Gaussian. It is particularly helpful for very long data sequences, as due to applying Isolate-Detect on moving windows, the computational time is reduced. See Details for a brief explanation of this approach and for the relevant literature reference.
win_cplm_th(xd, sigma = stats::mad(diff(diff(xd)))/sqrt(6), thr_con = 1.4, c_win = 3000, w_points = 3, l_win = 12000)
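Argument | Description |
---|---|
xd | A numeric vector containing the data in which you would like to find change-points. |
sigma | A positive real number. It is the estimate of the standard deviation of the noise in xd. The default value is stats::mad(diff(diff(xd)))/sqrt(6). |
thr_con | A positive real number with default value equal to 1.4. It is used to define the threshold. The change-points are estimated by thresholding with threshold equal to sigma * thr_con * sqrt(2 * log(T)), where T is the length of the data sequence xd. |
c_win | A positive integer with default value equal to 3000. It is the length of each window for the data sequence in hand. Isolate-Detect will be applied in segments of the form [(i-1) * c_win + 1, i * c_win], for i = 1, 2, ..., K, where K depends on the length of the data sequence. |
w_points | A positive integer with default value equal to 3. It defines the distance between two consecutive end- or start-points of the right- or left-expanding intervals, respectively. |
l_win | A positive integer with default value equal to 12000. If the length of the data sequence is less than or equal to l_win, then the windows-based approach is not applied and the change-points are obtained with the classical Isolate-Detect methodology based on thresholding. |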
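The default for sigma in the usage of win_cplm_th relies on second differences. For a linear signal the second differences of the signal are exactly zero, and for independent Gaussian noise with standard deviation σ the second differences of the noise, with weights (1, -2, 1), have variance (1 + 4 + 1)σ² = 6σ², which explains the division by sqrt(6). The sketch below checks this on simulated data; the seed and the signal are arbitrary choices made for the illustration.

```r
set.seed(1)
signal <- c(seq(0, 999, 1), seq(998.5, 499, -0.5))  # one change in the slope
xd <- signal + rnorm(2000)                          # the true noise standard deviation is 1

var_factor <- sum(c(1, -2, 1)^2)                    # variance inflation of second differences
sigma_hat <- stats::mad(diff(diff(xd))) / sqrt(var_factor)

c(var_factor = var_factor, sigma_hat = sigma_hat)
```

Only the second differences that straddle a change in the slope are affected by the signal, and the median-based stats::mad is insensitive to such a small number of values.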
The method that is implemented by this function is based on splitting the given data sequence uniformly into smaller parts (windows), to which Isolate-Detect, based on the thresholding stopping rule (see cplm_th), is then applied.
A numeric vector with the detected change-points.
Andreas Anastasiou, [email protected]
cplm_th, which is the function that win_cplm_th is based on. Also, see ID_cplm and ID, which employ win_cplm_th. In addition, see win_pcm_th for the case of detecting changes in a piecewise-constant signal via thresholding.
# One change in the slope
single.cpt <- c(seq(0, 999, 1), seq(998.5, 499, -0.5))
single.cpt.noise <- single.cpt + rnorm(2000)
cpt.single.th <- win_cplm_th(single.cpt.noise)

# Three changes in the slope, in a sequence long enough for the windows to be used
three.cpt <- c(seq(0, 3999, 1), seq(3998.5, 1999, -0.5), seq(2001, 9999, 2), seq(9998, 5999, -1))
three.cpt.noise <- three.cpt + rnorm(16000)
cpt.three.th <- win_cplm_th(three.cpt.noise)
This function performs the windows-based variant of the Isolate-Detect methodology with the thresholding-based stopping rule in order to detect multiple change-points in the mean of a noisy data sequence, with noise that is Gaussian. It is particularly helpful for very long data sequences, as due to applying Isolate-Detect on moving windows, the computational time is reduced. See Details for a brief explanation of this approach and for the relevant literature reference.
win_pcm_th(xd, sigma = stats::mad(diff(xd)/sqrt(2)), thr_con = 1, c_win = 3000, w_points = 3, l_win = 12000)
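Argument | Description |
---|---|
xd | A numeric vector containing the data in which you would like to find change-points. |
sigma | A positive real number. It is the estimate of the standard deviation of the noise in xd. The default value is stats::mad(diff(xd)/sqrt(2)). |
thr_con | A positive real number with default value equal to 1. It is used to define the threshold, which is equal to sigma * thr_con * sqrt(2 * log(T)), where T is the length of the data sequence xd. |
c_win | A positive integer with default value equal to 3000. It is the length of each window for the data sequence in hand. Isolate-Detect will be applied in segments of the form [(i-1) * c_win + 1, i * c_win], for i = 1, 2, ..., K, where K depends on the length of the data sequence. |
w_points | A positive integer with default value equal to 3. It defines the distance between two consecutive end- or start-points of the right- or left-expanding intervals, respectively. |
l_win | A positive integer with default value equal to 12000. If the length of the data sequence is less than or equal to l_win, then the windows-based approach is not applied and the change-points are obtained with the classical Isolate-Detect methodology based on thresholding. |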
The method that is implemented by this function is based on splitting the given data sequence uniformly into smaller parts (windows), to which Isolate-Detect, based on the threshold stopping rule (see pcm_th), is then applied. An idea of the computational improvement that this structure offers over the classical Isolate-Detect in the case of large data sequences is given in the supplement of “Detecting multiple generalized change-points by isolating single ones”, Anastasiou and Fryzlewicz (2018), preprint.
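The splitting into windows can be sketched in base R as follows. This is a simplified illustration, not the package's implementation: the helper names window_bounds and detect_by_windows are hypothetical, the detector is passed in as a function, and change-points that fall exactly on a window boundary are not handled, although a complete implementation would need to treat them.

```r
# Start and end indices of consecutive windows of length c_win on 1, ..., n
window_bounds <- function(n, c_win = 3000) {
  K <- ceiling(n / c_win)
  data.frame(start = (seq_len(K) - 1) * c_win + 1,
             end   = pmin(seq_len(K) * c_win, n))
}

# Apply detector() to each window and map its output back to the original
# indices; short sequences (length <= l_win) are processed in one piece
detect_by_windows <- function(x, detector, c_win = 3000, l_win = 12000) {
  n <- length(x)
  if (n <= l_win) return(detector(x))
  wb <- window_bounds(n, c_win)
  cpts <- numeric(0)
  for (i in seq_len(nrow(wb))) {
    found <- detector(x[wb$start[i]:wb$end[i]])   # locations within the window
    cpts <- c(cpts, found + wb$start[i] - 1)      # shift to the original scale
  }
  sort(unique(cpts))
}

# Toy detector for noiseless data: positions after which the value changes
toy_detector <- function(y) which(diff(y) != 0)

x <- c(rep(0, 7), rep(1, 8))
window_bounds(length(x), c_win = 5)
detect_by_windows(x, toy_detector, c_win = 5, l_win = 10)   # change after observation 7
```

Each call to the detector only sees c_win observations, which is where the reduction in computational time for very long sequences comes from.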
A numeric vector with the detected change-points.
Andreas Anastasiou, [email protected]
pcm_th, which is the function that win_pcm_th is based on. Also, see ID_pcm and ID, which employ win_pcm_th. In addition, see win_cplm_th for the case of detecting changes in the slope of a piecewise-linear and continuous signal via thresholding.
# One change-point in the mean
single.cpt <- c(rep(4, 1000), rep(0, 1000))
single.cpt.noise <- single.cpt + rnorm(2000)
cpt.single.th <- win_pcm_th(single.cpt.noise)

# Three change-points in the mean, in a sequence long enough for the windows to be used
three.cpt <- c(rep(4, 4000), rep(0, 4000), rep(-4, 4000), rep(1, 4000))
three.cpt.noise <- three.cpt + rnorm(16000)
cpt.three.th <- win_pcm_th(three.cpt.noise)