| Title: | Variance-Guided Regression for Heteroscedastic Linear Models |
|---|---|
| Description: | Fits variance-guided linear regression models for heteroscedastic data using an iteratively reweighted least squares estimator or an iteratively reweighted lasso estimator. This CRAN release focuses on the global linear mean-variance model in Section 2 of the accompanying preprint <doi:10.36227/techrxiv.177004877.75352102/v1>. The grouping-based nonlinear prediction extension from Section 3 is available in the development version on GitHub. |
| Authors: | Sibei Liu [aut], Min Lu [aut, cre] |
| Maintainer: | Min Lu <[email protected]> |
| License: | GPL (>= 2) |
| Version: | 0.1.4 |
| Built: | 2026-05-29 13:03:41 UTC |
| Source: | https://github.com/cran/varGuid |
Simulated data with nonlinear and interaction effects, containing 500 observations and 15 variables.
data(cobra2d)
    # library(copula)
    # cobra2 = function(n = 500, d = 15, sd = .1, corrv = 0) {
    #   set.seed(1)
    #   d <- max(10, d)
    #   X <- matrix(runif(n * d, -1, 1), ncol = d)
    #   paramlist <- lapply(1:d, function(j) {list(min=-1,max=1)})
    #   myCop <- normalCopula(param=rep(corrv,dim(combn(d,2))[2]), dim = d, dispstr = "un")
    #   myMvd <- mvdc(copula=myCop, margins=rep("unif",d),paramMargins=paramlist)
    #   X[, 1:d] <- rMvdc(n, myMvd)
    #   dta <- data.frame(list(x = X, y = X[,1]*X[,2] + X[,3]^2 - X[,4]*X[,7] + X[,8]*X[,10] - X[,6]^2
    #     + rnorm(n, sd = sd)))
    #   colnames(dta)[1:d] <- paste("x", 1:d, sep = "")
    #   f <- "x1 * x2 + x3 ^ 2 - x4 * x7 + x8 * x10 - x6 ^ 2"
    #   fs <- "I(x1 * x2) + I(x3 ^ 2) + I(-x4 * x7) + I(x8 * x10) - I(x6 ^ 2)"
    #   list(f = f, fs = fs, dta = dta)
    # }
    data(cobra2d)
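The commented generator above relies on the copula package. As a rough, dependency-free illustration of the same mean function, the sketch below assumes `corrv = 0`, in which case the Gaussian copula reduces to independent Uniform(-1, 1) predictors. The function name `sim_cobra2_indep` and its `seed` argument are hypothetical, and the draws will not reproduce the packaged `cobra2d` data.

```r
# Base-R analogue of the commented generator, assuming independent predictors
# (corrv = 0). Hypothetical helper; not part of varGuid.
sim_cobra2_indep <- function(n = 500, d = 15, sd = 0.1, seed = 1) {
  set.seed(seed)
  d <- max(10, d)                               # the mean function uses columns up to x10
  X <- matrix(runif(n * d, -1, 1), ncol = d)    # independent Uniform(-1, 1) predictors
  colnames(X) <- paste0("x", seq_len(d))
  # Interaction and quadratic terms, as in the commented generator
  mu <- X[, 1] * X[, 2] + X[, 3]^2 - X[, 4] * X[, 7] +
    X[, 8] * X[, 10] - X[, 6]^2
  data.frame(X, y = mu + rnorm(n, sd = sd))
}

sim <- sim_cobra2_indep()
dim(sim)   # 500 rows, 16 columns (x1 to x15 plus y)
```

Only x1, x2, x3, x4, x6, x7, x8 and x10 enter the mean function; the remaining columns are pure noise predictors.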
Fits the stage-1 variance-guided linear model for heteroscedastic data using
iteratively reweighted least squares (IRLS) when lasso = FALSE or an
iteratively reweighted lasso procedure when lasso = TRUE. For
lasso = FALSE, the returned object also includes weighted least squares
and heteroscedasticity-consistent inference summaries based on the final fit.
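To make the idea of iteratively reweighted least squares concrete, here is a minimal base-R sketch. It is a generic textbook-style scheme, not the estimator implemented in lmv(): it assumes the log-variance is linear in the predictors, and the function name `irls_sketch` is hypothetical. The `M` and `tol` arguments only mirror the names in the usage below.

```r
# Generic IRLS sketch for a heteroscedastic linear model (NOT varGuid's code).
# Assumption: log(variance) is linear in the predictors.
irls_sketch <- function(X, Y, M = 10, tol = exp(-10)) {
  Xd <- cbind(Intercept = 1, as.matrix(X))       # design matrix with intercept
  beta <- qr.solve(Xd, Y)                        # unweighted least squares start
  w <- rep(1, length(Y))
  iter <- 0
  for (m in seq_len(M)) {
    iter <- m
    r2 <- pmax(drop(Y - Xd %*% beta)^2, 1e-12)   # squared residuals, guarded for log()
    gamma <- qr.solve(Xd, log(r2))               # regress log squared residuals on X
    w <- 1 / exp(drop(Xd %*% gamma))             # weights = inverse fitted variances
    sw <- sqrt(w)
    beta_new <- qr.solve(Xd * sw, Y * sw)        # weighted least squares step
    converged <- max(abs(beta_new - beta)) < tol
    beta <- beta_new
    if (converged) break
  }
  list(beta = beta, weights = w, iterations = iter)
}

# Toy data whose noise level grows with x
set.seed(1)
n <- 300
x <- runif(n, -1, 1)
y <- 1 + 2 * x + rnorm(n, sd = exp(0.5 + x))
fit <- irls_sketch(cbind(x = x), y)
fit$beta
```

Observations with larger estimated variance receive smaller weights, so the noisy region of the data has less influence on the coefficient estimates.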
lmv(X, Y, M = 10, step = 1, tol = exp(-10), lasso = FALSE)
| X | Input matrix with observations in rows and predictors in columns. |
|---|---|
| Y | Numeric response vector. |
| M | Maximum number of iterations. |
| step | Scale parameter for the data weights. |
| tol | Tolerance parameter for convergence. |
| lasso | Logical; if TRUE, the iteratively reweighted lasso is fitted; if FALSE (the default), IRLS is used. |
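A list with the following components:

| beta | Coefficient estimates from the final variance-guided fit. |
|---|---|
| obj.OLS | Unweighted baseline ordinary least squares fit. |
| obj.lasso | Unweighted baseline lasso fit. |
| obj.varGuid | Final fitted model from either the IRLS or the iteratively reweighted lasso procedure. |
| res | Object returned by the variance-model update in the last iteration. |
| obj.varGuid.coef | For lasso = FALSE, the weighted least squares and heteroscedasticity-consistent inference summaries based on the final fit. |
| X | The input design matrix. |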
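The description of lmv() mentions heteroscedasticity-consistent inference summaries. As a generic illustration of that idea, the sketch below computes the basic HC0 (White) sandwich standard errors for an unweighted `lm` fit in base R. The helper `hc0_se` is hypothetical, and nothing here is taken from the package's internal computation, which may use a different HC variant.

```r
# HC0 sandwich standard errors for an unweighted lm fit (illustrative only).
hc0_se <- function(fit) {
  X <- model.matrix(fit)
  e <- residuals(fit)
  bread <- solve(crossprod(X))        # (X'X)^{-1}
  meat <- crossprod(X * e)            # X' diag(e^2) X
  sqrt(diag(bread %*% meat %*% bread))
}

# Toy heteroscedastic data: noise level grows with x
set.seed(1)
x <- runif(300, -1, 1)
y <- 1 + 2 * x + rnorm(300, sd = exp(0.5 + x))
ols <- lm(y ~ x)

se_classical <- sqrt(diag(vcov(ols)))  # assumes constant error variance
se_hc0 <- hc0_se(ols)                  # robust to unequal error variances
cbind(classical = se_classical, HC0 = se_hc0)
```

The classical column relies on a constant-variance assumption, while the HC0 column remains valid in large samples when the error variance changes across observations.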
Sibei Liu and Min Lu
Liu, S. and Lu, M. (2026). Variance-Guided Regression for Heteroscedastic Data. TechRxiv. doi:10.36227/techrxiv.177004877.75352102/v1
    data(cobra2d, package = "varGuid")
    dat <- cobra2d
    set.seed(1)
    tid <- sample(seq_len(nrow(dat)), 200)
    train <- dat[-tid, ]
    yid <- which(colnames(dat) == "y")
    o <- lmv(X = train[, -yid], Y = train[, yid], lasso = FALSE)
    summary(o$obj.varGuid)
    summary(o$obj.OLS)
    head(prd(o, train[, -yid], model = "baseline"))
    head(prd(o, train[, -yid], model = "varGuid"))
    o2 <- lmv(X = train[, -yid], Y = train[, yid], lasso = TRUE)
    o2$beta
    o2$obj.lasso$beta
    head(prd(o2, train[, -yid], model = "baseline"))
    head(prd(o2, train[, -yid], model = "varGuid"))
A lightweight prediction helper for objects returned by lmv(). For
ordinary weighted least squares fits it dispatches to stats::predict().
For iteratively reweighted lasso fits it dispatches to
glmnet::predict.glmnet() and returns a numeric vector.
prd(object, newdata, model = c("varGuid", "baseline"), ...)
| object | An object returned by lmv(). |
|---|---|
| newdata | A matrix or data frame of predictors for prediction. |
| model | Which fitted model to use: "varGuid" or "baseline". |
| ... | Additional arguments passed to the underlying predict method. |
This CRAN release covers the global linear mean-variance model from Section 2 of
Liu and Lu (2026). For the grouping-based nonlinear prediction extension from
Section 3 of the paper, please use the development version available at
devtools::install_github("luminwin/varGuid").
A numeric vector of predictions.
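The dispatch pattern described for prd() can be sketched in base R as follows. This is a simplified stand-in, not the package's implementation: `prd_sketch` is a hypothetical function, the fitted object is a hand-built list holding two `lm` fits under the component names `obj.OLS` and `obj.varGuid`, the weights are treated as known, and the lasso branch is omitted.

```r
# Minimal prediction helper that selects one of two stored fits
# (hypothetical stand-in; only the least squares branch is shown).
prd_sketch <- function(object, newdata, model = c("varGuid", "baseline"), ...) {
  model <- match.arg(model)                      # defaults to "varGuid"
  fit <- if (model == "varGuid") object$obj.varGuid else object$obj.OLS
  as.numeric(stats::predict(fit, newdata = as.data.frame(newdata), ...))
}

# Toy object: an unweighted fit and a weighted fit on the same data
set.seed(1)
d <- data.frame(x = runif(100, -1, 1))
d$y <- 1 + 2 * d$x + rnorm(100, sd = exp(d$x))
d$w <- 1 / exp(2 * d$x)                          # weights assumed known for this sketch
obj <- list(
  obj.OLS = lm(y ~ x, data = d),
  obj.varGuid = lm(y ~ x, data = d, weights = w)
)

head(prd_sketch(obj, d["x"], model = "baseline"))
head(prd_sketch(obj, d["x"], model = "varGuid"))
```

Using `match.arg()` means an unrecognized `model` value fails early with a clear error instead of silently falling back to one of the fits.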
Sibei Liu and Min Lu
Liu, S. and Lu, M. (2026). Variance-Guided Regression for Heteroscedastic Data. TechRxiv. doi:10.36227/techrxiv.177004877.75352102/v1
    data(cobra2d, package = "varGuid")
    dat <- cobra2d
    set.seed(1)
    tid <- sample(seq_len(nrow(dat)), 200)
    train <- dat[-tid, ]
    yid <- which(colnames(dat) == "y")
    o <- lmv(X = train[, -yid], Y = train[, yid], lasso = FALSE)
    head(prd(o, train[, -yid], model = "baseline"))
    head(prd(o, train[, -yid], model = "varGuid"))
    o2 <- lmv(X = train[, -yid], Y = train[, yid], lasso = TRUE)
    head(prd(o2, train[, -yid], model = "baseline"))
    head(prd(o2, train[, -yid], model = "varGuid"))
The varGuid package implements the global linear mean-variance model from
Section 2 of Liu and Lu (2026) using iteratively reweighted least squares and
iteratively reweighted lasso estimation. This CRAN release focuses on the
heteroscedastic linear model and its prediction utilities for the fitted
stage-1 models. For the grouping-based nonlinear prediction extension from
Section 3 of the paper, please use the development version available at
devtools::install_github("luminwin/varGuid").
Sibei Liu and Min Lu
Liu, S. and Lu, M. (2026). Variance-Guided Regression for Heteroscedastic Data. TechRxiv. doi:10.36227/techrxiv.177004877.75352102/v1
    data(cobra2d, package = "varGuid")
    dat <- cobra2d
    set.seed(1)
    tid <- sample(seq_len(nrow(dat)), 200)
    train <- dat[-tid, ]
    yid <- which(colnames(dat) == "y")
    o <- lmv(X = train[, -yid], Y = train[, yid], lasso = FALSE)
    summary(o$obj.varGuid)
    summary(o$obj.OLS)
    head(prd(o, train[, -yid], model = "baseline"))
    head(prd(o, train[, -yid], model = "varGuid"))
    # Iteratively reweighted lasso:
    o2 <- lmv(X = train[, -yid], Y = train[, yid], lasso = TRUE)
    o2$beta
    o2$obj.lasso$beta
    head(prd(o2, train[, -yid], model = "baseline"))
    head(prd(o2, train[, -yid], model = "varGuid"))