Package 'dSTEM' reference manual

Title:	Multiple Testing of Local Extrema for Detection of Change Points
Description:	Simultaneously detect the number and locations of change points in piecewise linear models under stationary Gaussian noise allowing autocorrelated random noise. The core idea is to transform the problem of detecting change points into the detection of local extrema (local maxima and local minima)through kernel smoothing and differentiation of the data sequence, see Cheng et al. (2020) <doi:10.1214/20-EJS1751>. A low-computational and fast algorithm call 'dSTEM' is introduced to detect change points based on the 'STEM' algorithm in D. Cheng and A. Schwartzman (2017) <doi:10.1214/16-AOS1458>.
Authors:	Zhibing He <[email protected]>
Maintainer:	Zhibing He <[email protected]>
License:	GPL-3
Version:	2.0-1
Built:	2025-02-01 06:45:55 UTC
Source:	CRAN

Compute convolution function using FFT

Description

Compute convolution function using FFT, similar to 'conv' in matlab

Usage

conv(u, v, shape = c("same", "full"))
conv(u, v, shape = c("same", "full"))

Arguments

`u`	numerical vector
`v`	numerical vector, don't need to have the same length as `u`
`shape`	if 'same', return central part of the convolution and has the same size as `u`; otherwise return the whole sequence of size $lenth(u)+length(v)-1$ .

Value

a vector of convolution, as specified by shape.

References

Matlab document on 'conv': https://www.mathworks.com/help/matlab/ref/conv.html

Examples

u = c(-1,2,3,-2,0,1,2)
v = c(2,4,-1,1)
w = conv(u,v,'same')
u = c(-1,2,3,-2,0,1,2)
v = c(2,4,-1,1)
w = conv(u,v,'same')

Plot data sequence, the first and second-order derivatives, and their local extrema

Description

Plot data sequence, the first and second-order derivatives, and their local extrema

Usage

cp.plt(x, order, icd.noise, H)
cp.plt(x, order, icd.noise, H)

Arguments

`x`	numerical vector of signal or signal-plus-noise data
`order`	order of derivative of data
`icd.noise`	logical value indicating if `x` includes noise
`H`	optional, vector of change-point locations

Value

a plot

Examples

l = 1200
h = seq(150,by=150,length.out=6)
jump = c(0,1.5,2,2.2,1.8,2,1.5)*3
beta1 = c(2,-1,2.5,-3,-0.2,2.5,-0.5)/50
signal = gen.signal(l,h,jump,beta1)
noise = rnorm(length(signal),0,1)
gamma = 25
sdata = smth.gau(signal+noise,gamma)
dy = diff(sdata)
ddy = diff(sdata,differences=2)
cp.plt(signal,0,FALSE)
points(signal+noise,col="grey")
cp.plt(dy,1,H=h)
cp.plt(ddy,2,H=h)

l = 1200
h = seq(150,by=150,length.out=6)
jump = c(0,1.5,2,2.2,1.8,2,1.5)*3
beta1 = c(2,-1,2.5,-3,-0.2,2.5,-0.5)/50
signal = gen.signal(l,h,jump,beta1)
noise = rnorm(length(signal),0,1)
gamma = 25
sdata = smth.gau(signal+noise,gamma)
dy = diff(sdata)
ddy = diff(sdata,differences=2)
cp.plt(signal,0,FALSE)
points(signal+noise,col="grey")
cp.plt(dy,1,H=h)
cp.plt(ddy,2,H=h)

Multiple testing of change points for kernel smoothed data

Description

Multiple testing of change points for kernel smoothed data

Usage

cpTest(
  x,
  order,
  alpha,
  gamma,
  sigma,
  breaks,
  slope,
  untest,
  nu,
  is.constant,
  margin
)
cpTest(
  x,
  order,
  alpha,
  gamma,
  sigma,
  breaks,
  slope,
  untest,
  nu,
  is.constant,
  margin
)

Arguments

`x`	vector of kernel smoothed data
`order`	order of derivative of data
`alpha`	global significant level
`gamma`	bandwidth of Gaussian kernel
`sigma`	standard deviation of kernel smoothed noise
`breaks`	vector of rough estimate of change-point locations, only required when order is 1.
`slope`	vector of rough estimate of slopes associated with `breaks`, only required when order is 1.
`untest`	vector of locations unnecessary to test
`nu`	standard deviation of Gaussian kernel used to generate autocorrelated Gaussian noise, it is 0 if the noise is Gaussian white noise.
`is.constant`	logical value indicating if the signal is piecewise constant, if TRUE, `breaks` and `slope` are not necessary.
`margin`	length of one period of data `x`

Value

a list of estimated change-point locations and threshold for p-value

Examples

## piecewise linear signal
l = 1200
h = seq(150,by=150,length.out=6)
jump = rep(0,7)
beta1 = c(2,-1,2.5,-3,-0.2,2.5)/50
beta1 = c(beta1,-sum(beta1*(c(h[1],diff(h))))/(l-tail(h,1)))
signal = gen.signal(l,h,jump,beta1)
noise = rnorm(length(signal),0,2)
gamma = 25
sdata = smth.gau(signal+noise,gamma)
ddy = diff(sdata,differences=2)
model2 = cpTest(x=ddy,order=2,gamma=gamma,alpha=0.05)
## piecewise constant
l = 1200
h = seq(150,by=150,length.out=6)
jump = c(0,1.5,2,2.2,1.8,2,1.5)
beta1 = rep(0,length(h)+1)
signal = gen.signal(l,h,jump,beta1)
noise = rnorm(length(signal),0,1)
gamma = 25
sdata = smth.gau(signal+noise,gamma)
dy = diff(sdata)
model1 = cpTest(x=dy,order=1,alpha=0.05,gamma=gamma,is.constant=TRUE)
## piecewise linear with jump
l = 1200
h = seq(150,by=150,length.out=6)
jump = c(0,1.5,2,2.2,1.8,2,1.5)*3
beta1 = c(2,-1,2.5,-3,-0.2,2.5,-0.5)/50
signal = gen.signal(l=l,h=h,jump=jump,b1=beta1)
noise = rnorm(length(signal),0,1)
gamma = 25
sdata = smth.gau(signal+noise,gamma)
dy = diff(sdata)
ddy = diff(sdata,differences=2)
model2 = cpTest(x=ddy,order=2,gamma=gamma,alpha=0.1)
breaks = est.pair(vall=model2$vall,peak=model2$peak,gamma=gamma)$cp
slope = est.slope(x=(signal+noise),breaks=breaks)
## piecewise linear signal
l = 1200
h = seq(150,by=150,length.out=6)
jump = rep(0,7)
beta1 = c(2,-1,2.5,-3,-0.2,2.5)/50
beta1 = c(beta1,-sum(beta1*(c(h[1],diff(h))))/(l-tail(h,1)))
signal = gen.signal(l,h,jump,beta1)
noise = rnorm(length(signal),0,2)
gamma = 25
sdata = smth.gau(signal+noise,gamma)
ddy = diff(sdata,differences=2)
model2 = cpTest(x=ddy,order=2,gamma=gamma,alpha=0.05)
## piecewise constant
l = 1200
h = seq(150,by=150,length.out=6)
jump = c(0,1.5,2,2.2,1.8,2,1.5)
beta1 = rep(0,length(h)+1)
signal = gen.signal(l,h,jump,beta1)
noise = rnorm(length(signal),0,1)
gamma = 25
sdata = smth.gau(signal+noise,gamma)
dy = diff(sdata)
model1 = cpTest(x=dy,order=1,alpha=0.05,gamma=gamma,is.constant=TRUE)
## piecewise linear with jump
l = 1200
h = seq(150,by=150,length.out=6)
jump = c(0,1.5,2,2.2,1.8,2,1.5)*3
beta1 = c(2,-1,2.5,-3,-0.2,2.5,-0.5)/50
signal = gen.signal(l=l,h=h,jump=jump,b1=beta1)
noise = rnorm(length(signal),0,1)
gamma = 25
sdata = smth.gau(signal+noise,gamma)
dy = diff(sdata)
ddy = diff(sdata,differences=2)
model2 = cpTest(x=ddy,order=2,gamma=gamma,alpha=0.1)
breaks = est.pair(vall=model2$vall,peak=model2$peak,gamma=gamma)$cp
slope = est.slope(x=(signal+noise),breaks=breaks)

Detection of change points based on 'dSTEM' algorithm

Description

Detection of change points based on 'dSTEM' algorithm

Usage

dstem(
  data,
  type = c("I", "II-step", "II-linear", "mixture"),
  gamma = 20,
  alpha = 0.05
)
dstem(
  data,
  type = c("I", "II-step", "II-linear", "mixture"),
  gamma = 20,
  alpha = 0.05
)

Arguments

`data`	vector of data sequence
`type`	"I" if the change points are piecewise linear and continuous; "II-step" if the change points are piecewise constant and noncontinuous; "II-linear" if the change points are piecewise linear and noncontinuous; "mixture" if both type I and type II change points are include in `data`
`gamma`	bandwidth of Gaussian kernel
`alpha`	global significant level

Value

if type is 'mixture', the output is a list of type I and type II change points, otherwise, it is a list of change points

Examples

## piecewise linear signal
l = 1200
h = seq(150,by=150,length.out=6)
jump = rep(0,7)
beta1 = c(2,-1,2.5,-3,-0.2,2.5)/50
beta1 = c(beta1,-sum(beta1*(c(h[1],diff(h))))/(l-tail(h,1)))
signal = gen.signal(l,h,jump,beta1)
noise = rnorm(length(signal),0,1)
gamma = 25
model = dstem(signal + noise,"I",gamma=gamma,alpha=0.05)
## piecewise constant
l = 1200
h = seq(150,by=150,length.out=6)
jump = c(0,1.5,2,2.2,1.8,2,1.5)
beta1 = rep(0,length(h)+1)
signal = gen.signal(l,h,jump,beta1)
noise = rnorm(length(signal),0,1)
gamma = 25
model = dstem(signal + noise, "II-step",gamma,alpha=0.05)
## piecewise linear with jump
l = 1200
h = seq(150,by=150,length.out=6)
jump = c(0,1.5,2,2.2,1.8,2,1.5)*3
beta1 = c(2,-1,2.5,-3,-0.2,2.5,-0.5)/50
signal = gen.signal(l=l,h=h,jump=jump,b1=beta1)
noise = rnorm(length(signal),0,1)
gamma = 25
model = dstem(signal + noise, "II-linear",gamma,alpha=0.05)
## piecewise linear signal
l = 1200
h = seq(150,by=150,length.out=6)
jump = rep(0,7)
beta1 = c(2,-1,2.5,-3,-0.2,2.5)/50
beta1 = c(beta1,-sum(beta1*(c(h[1],diff(h))))/(l-tail(h,1)))
signal = gen.signal(l,h,jump,beta1)
noise = rnorm(length(signal),0,1)
gamma = 25
model = dstem(signal + noise,"I",gamma=gamma,alpha=0.05)
## piecewise constant
l = 1200
h = seq(150,by=150,length.out=6)
jump = c(0,1.5,2,2.2,1.8,2,1.5)
beta1 = rep(0,length(h)+1)
signal = gen.signal(l,h,jump,beta1)
noise = rnorm(length(signal),0,1)
gamma = 25
model = dstem(signal + noise, "II-step",gamma,alpha=0.05)
## piecewise linear with jump
l = 1200
h = seq(150,by=150,length.out=6)
jump = c(0,1.5,2,2.2,1.8,2,1.5)*3
beta1 = c(2,-1,2.5,-3,-0.2,2.5,-0.5)/50
signal = gen.signal(l=l,h=h,jump=jump,b1=beta1)
noise = rnorm(length(signal),0,1)
gamma = 25
model = dstem(signal + noise, "II-linear",gamma,alpha=0.05)

Identify pairwise local maxima and local minima of the second-order derivative

Description

Identify pairwise local maxima and local minima of the second-order derivative

Usage

est.pair(vall, peak, gamma)
est.pair(vall, peak, gamma)

Arguments

`vall`	vector of locations of significant local minima
`peak`	vector of locations of significant local maxima
`gamma`	bandwidth of Gaussian kernel smoothing function

Value

a list of detected pairs and detected change-point locations through second-order derivative testing

Estimate variance of smoothed Gaussian noise

Description

Estimate variance of smoothed Gaussian noise through its second-order derivative

Usage

est.sigma2(x, gamma, k = 0.5)
est.sigma2(x, gamma, k = 0.5)

Arguments

`x`	numerical vector of second-order derivative of kernel smoothed data
`gamma`	bandwidth of Gaussian kernel
`k`	numerical value, local maxima (minima) are presumed beyond $Mean(x) ± k*SD(x)$

Value

value of estimated variance of smoothed noise

Examples

l=15000; h = seq(150,l,150)
jump = rep(0,length(h)+1); b1 = seq(from=0,by=0.15,length = length(h)+1)
signal = gen.signal(l,h,jump,b1)
data = signal + rnorm(length(signal),0,1) # standard white noise
gamma = 10
ddy = diff(smth.gau(data,gamma),differences=2)
est.sigma2(ddy,gamma,k=0.5) # true value is \eqn{\frac{1}{2\sqrt{pi}\gamma}}
l=15000; h = seq(150,l,150)
jump = rep(0,length(h)+1); b1 = seq(from=0,by=0.15,length = length(h)+1)
signal = gen.signal(l,h,jump,b1)
data = signal + rnorm(length(signal),0,1) # standard white noise
gamma = 10
ddy = diff(smth.gau(data,gamma),differences=2)
est.sigma2(ddy,gamma,k=0.5) # true value is \eqn{\frac{1}{2\sqrt{pi}\gamma}}

Estimate piecewise slope for piecewise linear model

Description

Estimate piecewise slope for piecewise linear model

Usage

est.slope(x, breaks)
est.slope(x, breaks)

Arguments

`x`	numerical vector of signal-plus-noise data
`breaks`	numerical vector of change-point locations

Value

a vector of estimated piecewise slope

Compute TPR and FPR

Description

Compute TPR and FPR

Usage

Fdr(uh, th, b)
Fdr(uh, th, b)

Arguments

`uh`	numerical vector of estimated change point locations
`th`	numerical vector of true change point locations
`b`	location tolerance, usually specified as the bandwidth `gamma`

Value

a dataframe of FDR (FPR) and Power (TPR)

Compute FDR threshold based on Benjamini-Hochberg (BH) algorithm

Description

Compute FDR threshold based on Benjamini-Hochberg (BH) algorithm

Usage

fdrBH(p, q)
fdrBH(p, q)

Arguments

`p`	a vector of p-values
`q`	False Discovery Rate level

Value

p-value threshold based on independence or positive dependence

Examples

fdrBH(seq(0.01,0.1,0.01),q=0.1)

fdrBH(seq(0.01,0.1,0.01),q=0.1)

Generate simulated signals

Description

Generate simulated signals

Usage

gen.signal(l, h, jump, b1, rep = 1, shift = 0)
gen.signal(l, h, jump, b1, rep = 1, shift = 0)

Arguments

`l`	length of data, if data is periodic then the length in each period
`h`	numerical vector of true change point locations
`jump`	numerical vector of jump size at change point locations
`b1`	numerical vector of piecewise slopes
`rep`	number of periods if data is periodic, default is 1
`shift`	numerical vector of vertical shifts for each period, default is 0

Value

a vector of simulated signal

Examples

l = 1200
h = seq(150,by=150,length.out=6)
jump = rep(0,7)
beta1 = c(2,-1,2.5,-3,-0.2,2.5)/50
beta1 = c(beta1,-sum(beta1*(c(h[1],diff(h))))/(l-tail(h,1)))
signal = gen.signal(l,h,jump,beta1)
l = 1200
h = seq(150,by=150,length.out=6)
jump = rep(0,7)
beta1 = c(2,-1,2.5,-3,-0.2,2.5)/50
beta1 = c(beta1,-sum(beta1*(c(h[1],diff(h))))/(l-tail(h,1)))
signal = gen.signal(l,h,jump,beta1)

Stock price of Host & Hotel Resorts (HST)

Description

A subset of daily stock price data of HST from November 7, 2011 to November 5, 2021.

Usage

HST_stock
HST_stock

Format

A data frame with 2,517 rows and 6 columns:

Date: date from November 7, 2011 to November 5, 2021
Close, Open, High, Low: stock price
Volume: stock exchange volume

Source

<https://finance.yahoo.com/quote/HST>

Smoothing data using Gaussian kernel

Description

Smoothing data using Gaussian kernel

Usage

smth.gau(x, gamma)
smth.gau(x, gamma)

Arguments

`x`	numeric vector of values to smooth
`gamma`	bandwidth of Gaussian kernel

Value

vector of smoothed values

Examples

smth.gau(x=rnorm(1000), gamma=20)
smth.gau(x=rnorm(1000), gamma=20)

Compute SNR of a certain change point location

Description

Compute SNR of a certain change point location

Usage

snr(order, gamma, is.jump, jump, diffb, addb)
snr(order, gamma, is.jump, jump, diffb, addb)

Arguments

`order`	order of derivative of data
`gamma`	bandwidth of Gaussian kernel
`is.jump`	logical value indicating if the location to be calculated is a jump point
`jump`	jump height
`diffb`	difference of the slopes on left and right sides of the location
`addb`	sum of the slopes, only used when order is 1

Value

value of SNR

Find local maxima and local minima of data sequence

Description

Find local maxima and local minima of data sequence

Usage

which.peaks(x, partial = FALSE, decreasing = FALSE)
which.peaks(x, partial = FALSE, decreasing = FALSE)

Arguments

`x`	numerical vector contains local maxima (minima)
`partial`	logical value indicating if the two endpoints will be considered
`decreasing`	logical value indicating whether to find local minima

Value

a vector of locations of local maxima or minima

Examples

a = 100:1
which.peaks(a*sin(a/3))
a = 100:1
which.peaks(a*sin(a/3))

Package 'dSTEM'

Help Index

Compute convolution function using FFT

Description

Usage

Arguments

Value

References

Examples

Plot data sequence, the first and second-order derivatives, and their local extrema

Description

Usage

Arguments

Value

Examples

Multiple testing of change points for kernel smoothed data

Description

Usage

Arguments

Value

Examples

Detection of change points based on 'dSTEM' algorithm

Description

Usage

Arguments

Value

See Also

Examples

Identify pairwise local maxima and local minima of the second-order derivative

Description

Usage

Arguments

Value

Estimate variance of smoothed Gaussian noise

Description

Usage

Arguments

Value

Examples

Estimate piecewise slope for piecewise linear model

Description

Usage

Arguments

Value

Compute TPR and FPR

Description

Usage

Arguments

Value

Compute FDR threshold based on Benjamini-Hochberg (BH) algorithm

Description

Usage

Arguments

Value

Examples

Generate simulated signals

Description

Usage

Arguments

Value

Examples

Stock price of Host & Hotel Resorts (HST)

Description

Usage

Format

Source

Smoothing data using Gaussian kernel

Description

Usage

Arguments

Value

Examples

Compute SNR of a certain change point location

Description

Usage

Arguments

Value

Find local maxima and local minima of data sequence

Description

Usage