Title: | The LIC for T Distribution Regression Analysis |
---|---|
Description: | This comprehensive toolkit for T-distributed regression is named "TLIC" (The LIC for T Distribution Regression Analysis). It assumes that the error term follows a T-distribution. The philosophy of the package is described in Guo G. (2020) <doi:10.1080/02664763.2022.2053949>. |
Authors: | Guangbao Guo [aut, cre] |
Maintainer: | Guangbao Guo <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.4 |
Built: | 2025-02-12 05:22:08 UTC |
Source: | CRAN |
Calculate the estimators of beta under the A-opt and D-opt criteria
beta_AD(K = K, nk = nk, alpha = alpha, X = X, y = y)
K | is the number of subsets |
nk | is the length of subsets |
alpha | is the significance level |
X | is the observation matrix |
y | is the response vector |
A list containing:
betaA | The estimator of beta on the A-opt. |
betaD | The estimator of beta on the D-opt. |
Guo, G., Song, H. & Zhu, L. The COR criterion for optimal subset selection in distributed estimation. Statistics and Computing, 34, 163 (2024). doi:10.1007/s11222-024-10471-z
p=6;n=1000;K=2;nk=200;alpha=0.05;sigma=1
e=rnorm(n,0,sigma); beta=c(sort(c(runif(p,0,1))));
data=c(rnorm(n*p,5,10));X=matrix(data, ncol=p);
y=X%*%beta+e;
beta_AD(K=K,nk=nk,alpha=alpha,X=X,y=y)
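To make the A-opt and D-opt idea concrete, here is a minimal base-R sketch of generic A- and D-optimal subset selection. It is not the package's implementation: the random-sampling step, the helper `ols`, and the names `betaA_sketch` and `betaD_sketch` are assumptions introduced only for illustration.

```r
# Sketch only: generic A-/D-optimal subset choice, not the code behind beta_AD.
set.seed(1)
p <- 6; n <- 1000; K <- 2; nk <- 200
X <- matrix(rnorm(n * p, 5, 10), ncol = p)
beta <- sort(runif(p, 0, 1))
y <- X %*% beta + rnorm(n)

# Assumption: subsets are drawn by simple random sampling without replacement.
idx <- replicate(K, sample(n, nk), simplify = FALSE)

# A-criterion: trace of (X'X)^{-1} on each subset; smaller is better.
a_crit <- sapply(idx, function(i) {
  sum(diag(solve(crossprod(X[i, , drop = FALSE]))))
})

# D-criterion: log-determinant of X'X on each subset; larger is better.
d_crit <- sapply(idx, function(i) {
  as.numeric(determinant(crossprod(X[i, , drop = FALSE]), logarithm = TRUE)$modulus)
})

# Hypothetical helper: ordinary least squares on the rows in i.
ols <- function(i) {
  Xi <- X[i, , drop = FALSE]
  solve(crossprod(Xi), crossprod(Xi, y[i]))
}

betaA_sketch <- ols(idx[[which.min(a_crit)]])
betaD_sketch <- ols(idx[[which.max(d_crit)]])
cbind(true = beta, A = betaA_sketch, D = betaD_sketch)
```

The log-determinant is used instead of the raw determinant because it avoids overflow when `X'X` has large entries; the ranking of subsets is the same.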
Calculate the estimator of beta under the COR criterion
beta_cor(K = K, nk = nk, alpha = alpha, X = X, y = y)
K | is the number of subsets |
nk | is the length of subsets |
alpha | is the significance level |
X | is the observation matrix |
y | is the response vector |
A list containing:
betaC | The estimator of beta on the COR. |
Guo, G., Song, H. & Zhu, L. The COR criterion for optimal subset selection in distributed estimation. Statistics and Computing, 34, 163 (2024). doi:10.1007/s11222-024-10471-z
p=6;n=1000;K=2;nk=200;alpha=0.05;sigma=1
e=rnorm(n,0,sigma); beta=c(sort(c(runif(p,0,1))));
data=c(rnorm(n*p,5,10));X=matrix(data, ncol=p);
y=X%*%beta+e;
beta_cor(K=K,nk=nk,alpha=alpha,X=X,y=y)
Calculate the LIC estimator based on the A-optimal and D-optimal criteria
LICnew(X, Y, alpha, K, nk)
X | A matrix of observations (design matrix) with size n x p |
Y | A vector of responses with length n |
alpha | The significance level for confidence intervals |
K | The number of subsets to consider |
nk | The size of each subset |
A list containing:
E5 | The LIC estimator based on the A-optimal and D-optimal criteria. |
Guo, G., Song, H. & Zhu, L. The COR criterion for optimal subset selection in distributed estimation. Statistics and Computing, 34, 163 (2024). doi:10.1007/s11222-024-10471-z
p = 6; n = 1000; K = 2; nk = 200; alpha = 0.05; sigma = 1
e = rnorm(n, 0, sigma); beta = c(sort(c(runif(p, 0, 1))));
data = c(rnorm(n * p, 5, 10)); X = matrix(data, ncol = p);
Y = X %*% beta + e;
LICnew(X = X, Y = Y, alpha = alpha, K = K, nk = nk)
This terr function generates a dataset with a specified number of observations and predictors, along with a response vector that has an error term following a T-distribution.
terr(n, nr, p, dist_type, ...)
n | is the number of observations |
nr | is the number of observations whose error term follows a different T-distribution |
p | is the dimension of the observation |
dist_type | is the type of T-distribution that the error term follows |
... | additional arguments passed to the T-distribution function |
X,Y,e
set.seed(12)
data <- terr(n = 1200, nr = 200, p = 5, dist_type = "student_t")
str(data)
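For readers who want to see what such a dataset might look like without the package, the following base-R sketch simulates a design matrix, a response, and T-distributed errors. The helper `sim_t_errors`, its arguments `df_main` and `df_other`, and the choice to give the last `nr` observations heavier tails are all assumptions made for illustration; they are not taken from the source of `terr`.

```r
# Hypothetical helper, not the package's terr(): n rows, p predictors,
# with the last nr errors drawn from a heavier-tailed Student t-distribution.
sim_t_errors <- function(n, nr, p, df_main = 10, df_other = 3) {
  X <- matrix(rnorm(n * p), ncol = p)      # design matrix (assumed standard normal)
  beta <- sort(runif(p, 0, 1))             # assumed true coefficients
  e <- c(rt(n - nr, df = df_main),         # bulk of the errors
         rt(nr, df = df_other))            # nr errors with a different t-distribution
  Y <- X %*% beta + e
  list(X = X, Y = Y, e = e)
}

set.seed(12)
sim <- sim_t_errors(n = 1200, nr = 200, p = 5)
str(sim)
```

The returned list uses the same component names (`X`, `Y`, `e`) as the value listed above, so it can stand in for the output of `terr` in quick experiments.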
The TLIC function builds on the LIC function by introducing the assumption that the error term follows a T-distribution, thereby enhancing the length and information optimisation criterion.
TLIC(X, Y, alpha = 0.05, K = 10, nk = NULL, dist_type = "student_t")
X | is a design matrix |
Y | is a random response vector of observed values |
alpha | is the significance level |
K | is the number of subsets |
nk | is the sample size of subsets |
dist_type | is the type of T-distribution that the error term follows |
MUopt, Bopt, MAEMUopt, MSEMUopt, opt, Yopt
set.seed(12)
n <- 1200
nr <- 200
p <- 5
data <- terr(n, nr, p, dist_type = "student_t")
TLIC(data$X, data$Y, alpha = 0.05, K = 10, nk = n / 10, dist_type = "student_t")
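The sketch below shows one plausible reading of a "length and information" criterion in base R. For each random subset it computes the average width of the confidence intervals for the fitted means (length) and the log-determinant of `X'X` (information), then keeps the observations common to the shortest-interval subset and the most-informative subset. This is an assumption about the general idea, not the code behind `TLIC`. The object names mirror the return value listed above only for readability, and `nk` is set large here so that the intersection comfortably exceeds `p` rows.

```r
# Sketch only: an assumed length/information subset rule, not the TLIC source.
set.seed(12)
n <- 1200; p <- 5; K <- 10; nk <- 600; alpha <- 0.05
X <- matrix(rnorm(n * p), ncol = p)
Y <- as.numeric(X %*% sort(runif(p)) + rt(n, df = 5))  # t-distributed errors

# Assumption: subsets are drawn by simple random sampling without replacement.
idx <- replicate(K, sample(n, nk), simplify = FALSE)

crit <- sapply(idx, function(i) {
  Xi <- X[i, , drop = FALSE]
  XtX <- crossprod(Xi)
  b <- solve(XtX, crossprod(Xi, Y[i]))
  rss <- sum((Y[i] - Xi %*% b)^2)
  h <- rowSums((Xi %*% solve(XtX)) * Xi)               # leverages h_ii
  len <- 2 * qt(1 - alpha / 2, nk - p) * sqrt(rss / (nk - p)) * mean(sqrt(h))
  info <- as.numeric(determinant(XtX, logarithm = TRUE)$modulus)
  c(length = len, info = info)
})

# Observations shared by the shortest-interval and most-informative subsets.
opt <- intersect(idx[[which.min(crit["length", ])]],
                 idx[[which.max(crit["info", ])]])
Xopt <- X[opt, , drop = FALSE]
Yopt <- Y[opt]
Bopt <- solve(crossprod(Xopt), crossprod(Xopt, Yopt))
MUopt <- as.numeric(Xopt %*% Bopt)
c(MAE = mean(abs(Yopt - MUopt)), MSE = mean((Yopt - MUopt)^2))
```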