Title: | Change Point Detection with Missing Values |
---|---|
Description: | A four-step change point detection method that can detect break points in the presence of missing values, proposed by Liu and Safikhani (2023) <https://drive.google.com/file/d/1a8sV3RJ8VofLWikTDTQ7W4XJ76cEj4Fg/view?usp=drive_link>. |
Authors: | Yanxi Liu [aut, cre], Abolfazl Safikhani [aut] |
Maintainer: | Yanxi Liu <[email protected]> |
License: | GPL-2 |
Version: | 0.1.0 |
Built: | 2024-12-24 06:54:24 UTC |
Source: | CRAN |
BIC and HBIC function
BIC(residual, phi)
residual |
residual matrix |
phi |
estimated coefficient matrix of the model |
A list object, which contains the following:
BIC value
HBIC value
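As a rough illustration of how such criteria trade goodness of fit against model size, the sketch below computes a Gaussian-style BIC from a residual matrix and a coefficient matrix. It is not the package's `BIC()`: the name `ic_sketch`, the `gamma` argument, and both penalty forms are assumptions made for illustration, and the `HBIC_like` value only mimics a heavier penalty of the kind used in high-dimensional settings.

```r
# Hypothetical helper, not the package's BIC(): the formulas are assumptions.
ic_sketch <- function(residual, phi, gamma = 0.5) {
  residual <- as.matrix(residual)
  n_obs <- length(residual)          # total number of residual entries
  rss   <- sum(residual^2)           # residual sum of squares
  df    <- sum(phi != 0)             # nonzero coefficients count as parameters
  n_par <- length(phi)               # number of candidate coefficients
  fit   <- n_obs * log(rss / n_obs)  # Gaussian goodness-of-fit term
  list(
    BIC       = fit + log(n_obs) * df,
    HBIC_like = fit + (log(n_obs) + 2 * gamma * log(n_par)) * df
  )
}

residual <- matrix(1, nrow = 10, ncol = 2)  # toy residuals, rss / n_obs = 1
phi      <- matrix(c(1, 0, 2, 0), nrow = 2) # two nonzero coefficients
out      <- ic_sketch(residual, phi)
out$BIC        # equals 2 * log(20), since the fit term is zero here
out$HBIC_like  # larger than BIC because of the extra penalty
```

Under these assumptions, a sparser `phi` with similar residuals always scores lower, which is the behaviour a thresholding step relies on.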
BIC threshold for final parameter estimation
BIC_threshold( beta.final, k, m.hat, brk, data_y, data_x = NULL, b_n = 2, nlam = 20 )
beta.final |
estimated parameter coefficient matrices |
k |
dimensions of parameter coefficient matrices |
m.hat |
number of estimated change points |
brk |
vector of estimated change points |
data_y |
input data matrix (response), with each column representing the time series component |
data_x |
input data matrix (predictor), with each column 1 |
b_n |
the block size |
nlam |
number of hyperparameters for grid search |
lambda.val.best, the tuning parameter lambda selected by BIC.
Perform the BTIE algorithm to detect the structural breaks in large scale high-dimensional mean shift models.
BTIE( data_y, lambda.1.cv = NULL, lambda.2.cv = NULL, max.iteration = 100, tol = 10^(-2), block.size = NULL, refit = FALSE, optimal.block = TRUE, optimal.gamma.val = 1.5, block.range = NULL )
data_y |
input data matrix (response), with each column representing the time series component |
lambda.1.cv |
tuning parameter lambda_1 for fused lasso |
lambda.2.cv |
tuning parameter lambda_2 for fused lasso |
max.iteration |
maximum number of iterations for the fused lasso |
tol |
tolerance for the fused lasso |
block.size |
the block size |
refit |
logical; if TRUE, refit the model, if FALSE, use BIC to find a thresholding value and then output the parameter estimates without refitting. Default is FALSE. |
optimal.block |
logical; if TRUE, grid search to find optimal block size, if FALSE, directly use the default block size. Default is TRUE. |
optimal.gamma.val |
hyperparameter for optimal block size, used if optimal.block == TRUE. Default is 1.5. |
block.range |
the search domain for optimal block size. |
A list object, which contains the following:
set.seed(1)
n <- 1000; p <- 50
brk <- c(333, 666, n+1)
m <- length(brk)
d <- 5
constant.full <- constant_generation(n, p, d, 50, brk)
e.sigma <- as.matrix(1*diag(p))
data_y <- data_generation(n = n, mu = constant.full, sigma = e.sigma, brk = brk)
data_y <- as.matrix(data_y, ncol = p)
data_y_miss <- MCAR(data_y, 0.3)
temp <- BTIE(data_y_miss, optimal.block = FALSE, block.size = 30)
temp$cp.final
Function to generate the constant (mean) parameters given the jump size and break points
constant_generation(n, p, d, vns, brk)
n |
the sample size |
p |
the data dimension |
d |
the number of nonzero coefficients |
vns |
the jump size. It can be a vector or a single value. If it is a single value, the same jump size is used for all break points |
brk |
the break points' locations |
the parameter matrix used to generate data
The function to generate mean shift data
data_generation(n, mu, sigma, brk = n + 1)
n |
the number of data points |
mu |
the matrix of mean parameter |
sigma |
covariance matrix of the white noise |
brk |
vector of change points |
data_y, the matrix of generated mean shift data
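To make the mean shift setting concrete, here is a minimal base R sketch that builds a piecewise-constant mean and adds Gaussian noise. It does not call `constant_generation()` or `data_generation()`; the helper `mean_shift_sketch`, its `means` and `sd` arguments, and the use of independent noise instead of a full covariance matrix are assumptions. The `brk` convention (last entry `n + 1`) follows the BTIE example.

```r
# Hypothetical helper: piecewise-constant mean plus independent Gaussian noise.
mean_shift_sketch <- function(n, p, brk, means, sd = 1) {
  # brk:   break points, with n + 1 as the last entry (convention assumed
  #        from the BTIE example)
  # means: length(brk) x p matrix, one mean vector per segment
  starts <- c(1, head(brk, -1))  # first time point of each segment
  ends   <- brk - 1              # last time point of each segment
  mu <- matrix(0, nrow = n, ncol = p)
  for (s in seq_along(brk)) {
    rows <- starts[s]:ends[s]
    mu[rows, ] <- matrix(means[s, ], nrow = length(rows), ncol = p,
                         byrow = TRUE)
  }
  mu + matrix(rnorm(n * p, sd = sd), nrow = n, ncol = p)
}

set.seed(1)
n <- 300; p <- 4
brk   <- c(100, 200, n + 1)
means <- rbind(rep(0, p), rep(5, p), rep(-5, p))
y <- mean_shift_sketch(n, p, brk, means)
round(colMeans(y[100:199, ]), 1)  # second segment, close to 5 in each column
```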
Perform the block fused lasso with thresholding to detect candidate break points.
first.step( data_y, data_x, lambda1, lambda2, max.iteration = max.iteration, tol = tol, blocks, cv.index, fixed_index = NULL, nonfixed_index = NULL )
data_y |
input data matrix Y, with each column representing the time series component |
data_x |
input data matrix X |
lambda1 |
tuning parameter lambda_1 for fused lasso |
lambda2 |
tuning parameter lambda_2 for fused lasso |
max.iteration |
maximum number of iterations for the fused lasso |
tol |
tolerance for the fused lasso |
blocks |
the blocks |
cv.index |
the index of time points for cross-validation |
fixed_index |
index for a linear regression model in which only some components change. |
nonfixed_index |
index for a linear regression model in which only some components change. |
A list object, which contains the following:
estimated jump size in L2 norm
estimated jump size in L1 norm
estimated change points in the first step
estimated parameters in the first step
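The idea of screening candidate break points by thresholding block-level jumps can be sketched as follows. This is a simplified stand-in, not `first.step()`: it replaces the block fused lasso fit with raw block means, and the helper `block_jump_sketch`, its `threshold` argument, and the returned element names are assumptions.

```r
# Hypothetical helper: raw block means stand in for the fused lasso estimates.
block_jump_sketch <- function(data_y, block.size, threshold) {
  data_y   <- as.matrix(data_y)
  n        <- nrow(data_y)
  block_id <- ceiling(seq_len(n) / block.size)  # block label of each time point
  n_blocks <- max(block_id)
  # One row of column means per block (missing values are skipped).
  block_means <- do.call(rbind, lapply(seq_len(n_blocks), function(b) {
    colMeans(data_y[block_id == b, , drop = FALSE], na.rm = TRUE)
  }))
  jumps   <- diff(block_means)        # change between consecutive blocks
  jump_l2 <- sqrt(rowSums(jumps^2))   # jump size in L2 norm
  jump_l1 <- rowSums(abs(jumps))      # jump size in L1 norm
  keep    <- which(jump_l2 > threshold)
  # Jump k sits between blocks k and k + 1; report the start of block k + 1.
  list(jump.l2 = jump_l2, jump.l1 = jump_l1,
       cp.candidate = keep * block.size + 1)
}

set.seed(1)
n <- 90; p <- 3
mu <- rbind(matrix(0, 30, p), matrix(5, 30, p), matrix(0, 30, p))
data_y <- mu + matrix(rnorm(n * p), n, p)
first <- block_jump_sketch(data_y, block.size = 10, threshold = 4)
first$cp.candidate  # candidates at the block boundaries 31 and 61
```

Because the true breaks here fall on block boundaries, each one produces a single large jump. A break inside a block would typically spread over two adjacent jumps, which is one reason a later step is needed to remove redundant candidates.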
Function to introduce missing values, assuming they are missing completely at random
Heter_missing(data, alpha)
data |
data before missing values are introduced |
alpha |
the list of missing percentages, relative to the whole data |
the data matrix with missing values
Function to impute missing values based on the block size
imputation(data, block.size)
data |
data before the imputation |
block.size |
the block size used to impute the missing values |
the data matrix without missing values after imputation
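One plausible reading of block-based imputation is to fill each missing entry with the mean of the observed values in the same block and column. The sketch below implements that reading only; whether `imputation()` uses block means is an assumption, and `impute_block_sketch` is a hypothetical name.

```r
# Hypothetical helper: fill NAs with the block-wise column mean (an assumed rule).
impute_block_sketch <- function(data, block.size) {
  data     <- as.matrix(data)
  block_id <- ceiling(seq_len(nrow(data)) / block.size)
  for (b in unique(block_id)) {
    rows <- which(block_id == b)
    for (j in seq_len(ncol(data))) {
      miss <- is.na(data[rows, j])
      # A block-column with no observed values is left as NA.
      if (any(miss) && !all(miss)) {
        data[rows[miss], j] <- mean(data[rows, j], na.rm = TRUE)
      }
    }
  }
  data
}

y_miss <- cbind(c(1, NA, 3, 10, NA, 12),
                c(NA, 2, 2, 8, 8, NA))
y_imp <- impute_block_sketch(y_miss, block.size = 3)
y_imp
# column 1 becomes 1, 2, 3, 10, 11, 12; column 2 becomes 2, 2, 2, 8, 8, 8
```

Imputing within blocks rather than over the whole series keeps a level shift from leaking across a break point, provided the blocks are short relative to the segments.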
Function to impute missing values based on the change point candidates
imputation2(data, cp.candidate)
data |
data before the imputation |
cp.candidate |
the change point candidates used to impute the missing values |
the data matrix without missing values after imputation
Function to introduce missing values, assuming they are missing completely at random
MCAR(data, alpha)
data |
data before missing values are introduced |
alpha |
the percentage of missing values, relative to the whole data |
the data matrix with missing values
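Missing completely at random means every entry has the same chance of being masked, independent of its value and position. A minimal base R sketch of that mechanism is shown below; `mcar_sketch` is a hypothetical helper and is not claimed to match the internals of `MCAR()`.

```r
# Hypothetical helper: mask a fraction alpha of all entries uniformly at random.
mcar_sketch <- function(data, alpha) {
  data     <- as.matrix(data)
  n_total  <- length(data)
  n_miss   <- round(alpha * n_total)       # number of entries to mask
  miss_idx <- sample.int(n_total, n_miss)  # positions drawn without replacement
  data[miss_idx] <- NA
  data
}

set.seed(1)
y <- matrix(rnorm(200), nrow = 20, ncol = 10)
y_miss <- mcar_sketch(y, 0.3)
sum(is.na(y_miss))  # 60 of the 200 entries are masked
```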
function to do the prediction
pred(X, phi, j, p.x, p.y, h = 1)
X |
data for prediction |
phi |
parameter matrix |
j |
the start time point for prediction |
p.x |
the dimension of data X |
p.y |
the dimension of data Y |
h |
the length of observation to predict |
prediction matrix
Prediction function (block)
pred.block(X, phi, j, p.x, p.y, h)
X |
data for prediction |
phi |
parameter matrix |
j |
the start time point for prediction |
p.x |
the dimension of data X |
p.y |
the dimension of data Y |
h |
the length of observation to predict |
prediction matrix
Re-impute the missing values and perform the exhaustive search to "thin out" redundant break points.
second.step( data_y, data_x, max.iteration = max.iteration, tol = tol, cp.first, beta.est, blocks, data_y_miss )
data_y |
input data matrix, with each column representing the time series component |
data_x |
input data matrix |
max.iteration |
maximum number of iterations for the fused lasso |
tol |
tolerance for the fused lasso |
cp.first |
the selected break points after the first step |
beta.est |
the estimated parameters from the block fused lasso |
blocks |
the blocks |
data_y_miss |
the data y matrix before the first imputation |
A list object, which contains the following:
the set of selected break points after the exhaustive search step
the estimated coefficient matrix for each segment
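The thinning idea can be illustrated with a small exhaustive search over subsets of candidate break points, scoring each subset by a penalized sum of squared errors. This is a sketch under stated assumptions, not `second.step()`: the criterion, the `penalty` argument, and the helpers `segment_sse` and `thin_sketch` are hypothetical, and the re-imputation of missing values is not reproduced.

```r
# Sum of squared deviations from the segment means, for break points `cps`.
segment_sse <- function(y, cps) {
  starts <- c(1, cps)
  ends   <- c(cps - 1, nrow(y))
  total  <- 0
  for (s in seq_along(starts)) {
    seg   <- y[starts[s]:ends[s], , drop = FALSE]
    total <- total + sum(sweep(seg, 2, colMeans(seg))^2)
  }
  total
}

# Hypothetical helper: try every subset of candidates, keep the best score.
thin_sketch <- function(y, cp.candidate, penalty) {
  y <- as.matrix(y)
  cp.candidate <- sort(unique(cp.candidate))
  k <- length(cp.candidate)
  stopifnot(k <= 15)  # 2^k subsets, so only feasible for few candidates
  best <- list(cps = integer(0), obj = segment_sse(y, integer(0)))
  for (mask in seq_len(2^k - 1)) {
    pick <- as.logical(intToBits(mask))[seq_len(k)]  # subset encoded in bits
    cps  <- cp.candidate[pick]
    obj  <- segment_sse(y, cps) + penalty * length(cps)
    if (obj < best$obj) best <- list(cps = cps, obj = obj)
  }
  best
}

set.seed(1)
n <- 90; p <- 2
mu <- rbind(matrix(0, 30, p), matrix(5, 30, p), matrix(0, 30, p))
y  <- mu + matrix(rnorm(n * p), n, p)
res <- thin_sketch(y, cp.candidate = c(21, 31, 41, 61), penalty = 50)
res$cps  # the redundant candidates 21 and 41 are dropped, leaving 31 and 61
```

The penalty is what makes redundant candidates unattractive: splitting a homogeneous segment lowers the squared error only slightly, while each retained break point costs a fixed amount.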