Title: | Fast Functions for Liu Regression with Regularization Parameter and Statistics |
---|---|
Description: | Efficient computation of the Liu regression coefficient paths, Liu-related statistics and information criteria for a grid of the regularization parameter. The computations are based on the 'C++' library 'Armadillo' through the 'R' package 'Rcpp'. |
Authors: | Murat Genç [aut, cre] , Ömer Özbilen [aut] |
Maintainer: | Murat Genç <[email protected]> |
License: | GPL (>= 3) |
Version: | 1.0 |
Built: | 2024-12-02 06:43:18 UTC |
Source: | CRAN |
Prints coefficient estimates from a fitted liureg object.
## S3 method for class 'liureg'
coef(object, ...)
object |
A |
... |
Not used in this implementation. |
The returned object is a data.frame containing the coefficients path.
Murat Genç
liureg(), predict(), summary(), pressliu(), residuals()
data("Hitters")
Hitters <- na.omit(Hitters)
X <- model.matrix(Salary ~ ., Hitters)[, -1]
y <- Hitters$Salary
lam <- seq(0, 1, 0.01)
liu.mod <- liureg(X, y, lam)
coef(liu.mod)
For a scalar or vector tuning parameter lambda, covliu computes the covariance matrix for the estimates of a Liu regression model.
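To make the quantity concrete, here is a minimal base R sketch of the covariance of the Liu estimator on simulated data. It assumes the textbook form sigma^2 F (X'X)^{-1} F' with F = (X'X + I)^{-1} (X'X + lambda I) and a known error variance. The helper liu_cov, the simulated data and the value of sigma2 are made up for illustration; covliu may estimate the variance and scale the data differently.

```r
set.seed(3)
n <- 60; p <- 3
X <- scale(matrix(rnorm(n * p), n, p), center = TRUE, scale = FALSE)  # centered predictors
sigma2 <- 2  # assumed known error variance, for illustration only

# Hypothetical helper: covariance of the Liu estimator at one lambda
liu_cov <- function(X, lambda, sigma2) {
  XtX <- crossprod(X)                         # X'X
  Ip  <- diag(ncol(X))
  F_l <- solve(XtX + Ip, XtX + lambda * Ip)   # F = (X'X + I)^{-1} (X'X + lambda I)
  sigma2 * F_l %*% solve(XtX) %*% t(F_l)
}

lam <- c(0, 0.5, 1)
cov_list <- lapply(lam, function(l) liu_cov(X, l, sigma2))
names(cov_list) <- paste0("lam", seq_along(lam))  # one matrix per lambda

# At lambda = 1 the Liu estimator coincides with least squares
all.equal(cov_list$lam3, sigma2 * solve(crossprod(X)), check.attributes = FALSE)
```

Smaller values of lambda shrink the estimates more, so the diagonal entries of cov_list$lam1 are no larger than those of cov_list$lam3.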
covliu(obj)
obj |
A |
The returned object is a list of estimated covariance matrices, one for each value of lambda.
Murat Genç and Ömer Özbilen
liureg(), coef(), predict(), summary(), pressliu(), residuals()
data("Hitters")
Hitters <- na.omit(Hitters)
X <- model.matrix(Salary ~ ., Hitters)[, -1]
y <- Hitters$Salary
lam <- seq(0, 1, 0.01)
liu.mod <- liureg(X, y, lam)
# List of covariance matrices for 101 lambda values
cov.mat <- covliu(liu.mod)
print(cov.mat$lam1)
For each value of the regularization parameter lambda, diagHliu returns the diagonal elements of the hat matrix. Unlike the hatliu function, it calculates only the diagonal elements of the hat matrix, so it is faster than hatliu.
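The speed difference can be illustrated with a short base R sketch. It assumes the usual Liu hat matrix X M X' with M = (X'X + I)^{-1} (X'X + lambda I) (X'X)^{-1}, and uses simulated data. The variable names are illustrative, and this is not the package's compiled implementation.

```r
set.seed(2)
n <- 200; p <- 5
X <- scale(matrix(rnorm(n * p), n, p), center = TRUE, scale = FALSE)
lambda <- 0.3

XtX <- crossprod(X)
Ip  <- diag(p)
# M = (X'X + I)^{-1} (X'X + lambda I) (X'X)^{-1}, a small p x p matrix
M <- solve(XtX + Ip, (XtX + lambda * Ip) %*% solve(XtX))

# Diagonal only: h_ii = x_i' M x_i, computed row by row without an n x n product
h_diag <- rowSums((X %*% M) * X)

# Full hat matrix for comparison (needs O(n^2) memory)
h_full <- diag(X %*% M %*% t(X))
all.equal(h_diag, h_full, check.attributes = FALSE)
```

The diagonal-only version costs on the order of n p^2 operations, whereas forming the full matrix costs on the order of n^2 p.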
diagHliu(obj)
obj |
A |
The returned object is a matrix whose columns are the diagonal elements of the hat matrix for each value of the lambda regularization parameter.
Murat Genç
liureg(), summary(), pressliu(), residuals()
data("Hitters")
Hitters <- na.omit(Hitters)
X <- model.matrix(Salary ~ ., Hitters)[, -1]
y <- Hitters$Salary
lam <- seq(0, 1, 0.01)
liu.mod <- liureg(X, y, lam)
diagHliu(liu.mod)
For each value of the regularization parameter lambda, hatliu returns the hat matrix of Liu regression. The hat matrix for Liu regression is computed using the formula $H_\lambda = X(X^\top X + I)^{-1}(X^\top X + \lambda I)(X^\top X)^{-1}X^\top$.
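As a rough illustration, the following base R sketch builds the Liu hat matrix from its usual definition, H = X (X'X + I)^{-1} (X'X + lambda I) (X'X)^{-1} X', on simulated data. The helper liu_hat and the data are assumptions made for this example. hatliu relies on compiled code and may apply a different scaling to X.

```r
set.seed(1)
n <- 50; p <- 4
X <- scale(matrix(rnorm(n * p), n, p), center = TRUE, scale = FALSE)  # centered predictors

# Hypothetical helper: Liu hat matrix at a single lambda
liu_hat <- function(X, lambda) {
  XtX <- crossprod(X)   # X'X
  Ip  <- diag(ncol(X))
  # X (X'X + I)^{-1} (X'X + lambda I) (X'X)^{-1} X'
  X %*% solve(XtX + Ip, (XtX + lambda * Ip) %*% solve(XtX, t(X)))
}

H_half <- liu_hat(X, 0.5)
H_one  <- liu_hat(X, 1)   # lambda = 1 recovers the least squares hat matrix
H_ols  <- X %*% solve(crossprod(X), t(X))
all.equal(H_one, H_ols, check.attributes = FALSE)
```

For a grid of lambda values, lapply(lam, function(l) liu_hat(X, l)) gives a list with one n x n matrix per value, which mirrors the list structure described for hatliu.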
hatliu(obj)
obj |
A |
The returned object is a list of matrices whose elements are the hat matrices for the values of the lambda regularization parameter.
Murat Genç
liureg(), summary(), pressliu(), residuals()
data("Hitters")
Hitters <- na.omit(Hitters)
X <- model.matrix(Salary ~ ., Hitters)[, -1]
y <- Hitters$Salary
lam <- seq(0, 1, 0.01)
liu.mod <- liureg(X, y, lam)
# Hat matrix list
hatlist <- hatliu(liu.mod)
# Hat matrix for third regularization parameter
hatlist[[3]]
Major League Baseball Data from the 1986 and 1987 seasons.
A data frame with 322 observations of major league players on the following 20 variables with explanations.
AtBat | Number of times at bat in 1986 |
Hits | Number of hits in 1986 |
HmRun | Number of home runs in 1986 |
Runs | Number of runs in 1986 |
RBI | Number of runs batted in 1986 |
Walks | Number of walks in 1986 |
Years | Number of years in the major leagues |
CAtBat | Number of times at bat during his career |
CHits | Number of hits during his career |
CHmRun | Number of home runs during his career |
CRuns | Number of runs during his career |
CRBI | Number of runs batted in during his career |
CWalks | Number of walks during his career |
League | A factor with levels A and N indicating player's league at the end of 1986 |
Division | A factor with levels E and W indicating player's division at the end of 1986 |
PutOuts | Number of put outs in 1986 |
Assists | Number of assists in 1986 |
Errors | Number of errors in 1986 |
Salary | 1987 annual salary on opening day in thousands of dollars |
NewLeague | A factor with levels A and N indicating player's league at the beginning of 1987 |
The dataset was retrieved from the StatLib library maintained at Carnegie Mellon University. This is part of the data used in the 1988 ASA Graphics Section Poster Session. The dataset is available in the R package ISLR2 (James et al., 2022). For more details, see the book, An Introduction to Statistical Learning with applications in R by James et al. (2013).
James, G., Witten, D., Hastie, T., and Tibshirani, R. (2013). An Introduction to Statistical Learning with applications in R, https://www.statlearning.com, Springer-Verlag, New York.
James G, Witten D, Hastie T, Tibshirani R (2022). ISLR2: Introduction to Statistical Learning, Second Edition. R package version 1.3-2, https://CRAN.R-project.org/package=ISLR2.
For each value of lambda, infoliu calculates the values of the AIC and BIC model selection criteria. The criteria are based on the degrees of freedom, $df = \mathrm{trace}(H_\lambda)$, of the Liu regression model, where $H_\lambda$ is the hat matrix of the Liu regression model.
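The role of the hat matrix trace can be sketched in base R as follows. The degrees of freedom are taken as trace(H_lambda), with H_lambda in its usual Liu form. The AIC and BIC expressions below are one common convention, n log(RSS / n) plus a penalty in df, and are an assumption rather than the formulas confirmed for infoliu. The data are simulated and all names are illustrative.

```r
set.seed(7)
n <- 70; p <- 4
X <- scale(matrix(rnorm(n * p), n, p), center = TRUE, scale = FALSE)
y <- drop(X %*% c(1, 0, -1, 0.5) + rnorm(n))
y <- y - mean(y)   # centered response, so no intercept is needed

XtX <- crossprod(X)
Ip  <- diag(p)
info <- t(sapply(seq(0, 1, 0.25), function(l) {
  # Liu hat matrix at lambda = l
  H   <- X %*% solve(XtX + Ip, (XtX + l * Ip) %*% solve(XtX, t(X)))
  df  <- sum(diag(H))              # degrees of freedom = trace of the hat matrix
  rss <- sum((y - H %*% y)^2)      # residual sum of squares
  c(lambda = l, df = df,
    AIC = n * log(rss / n) + 2 * df,        # assumed convention
    BIC = n * log(rss / n) + log(n) * df)   # assumed convention
}))
info
```

Each row of info corresponds to one lambda. At lambda = 1 the degrees of freedom equal p, the least squares value.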
infoliu(obj)
obj |
A |
infoliu returns the matrix of information criteria for each value of the regularization parameter lambda.
Murat Genç
Akaike, H. (1974). A new look at the statistical model identification. IEEE Transactions on Automatic Control, 19(6), 716-723. doi:10.1109/TAC.1974.1100705.
Liu, K. (1993). A new class of biased estimate in linear regression. Communications in Statistics-Theory and Methods, 22(2), 393-402. doi:10.1080/03610929308831027.
Schwarz, G. (1978). Estimating the dimension of a model. Annals of Statistics, 6(2), 461-464. doi:10.1214/aos/1176344136.
data("Hitters")
Hitters <- na.omit(Hitters)
X <- model.matrix(Salary ~ ., Hitters)[, -1]
y <- Hitters$Salary
lam <- seq(0, 1, 0.01)
liu.mod <- liureg(X, y, lam)
infoliu(liu.mod)
lamest computes the Liu tuning parameters provided in the literature. The tuning parameter estimates are based on Liu (1993) doi:10.1080/03610929308831027, Ozkale and Kaciranlar (2007) doi:10.1080/03610920601126522, and Liu (2011) doi:10.1016/j.jspi.2010.05.030.
lamest(obj, ...)
obj |
An object of class |
... |
Not used in this implementation. |
The lamest function computes the following tuning parameter estimates available in the literature.
lam.mm (Liu, 1993) |
|
lam.CL (Liu, 1993) |
|
lam.opt (Liu, 1993) |
|
lam.OK (Ozkale and Kaciranlar, 2007; Liu, 2011) |
with and where and are the th diagonal elements of and , respectively. |
lam.GCV |
This is the value corresponding to the minimum of the generalized cross-validation (GCV) values. The GCV is computed from the residual sum of squares and the trace of the hat matrix at the corresponding value of lambda from Liu regression. |
The returned object contains the Liu tuning parameter estimates from the literature.
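The GCV-based choice can be sketched as a grid search in base R. The sketch assumes the classical form GCV = n RSS / (n - trace(H))^2, which may differ by a constant or by an offset in the denominator from what lamest uses. The data are simulated and the names gcv and lam_gcv are illustrative.

```r
set.seed(8)
n <- 60; p <- 5
X <- scale(matrix(rnorm(n * p), n, p), center = TRUE, scale = FALSE)
y <- drop(X %*% c(1, 1, 0, 0, -1) + rnorm(n))
y <- y - mean(y)

XtX <- crossprod(X)
Ip  <- diag(p)
lam <- seq(0, 1, 0.01)

gcv <- vapply(lam, function(l) {
  # Liu hat matrix at lambda = l
  H   <- X %*% solve(XtX + Ip, (XtX + l * Ip) %*% solve(XtX, t(X)))
  rss <- sum((y - H %*% y)^2)          # residual sum of squares
  n * rss / (n - sum(diag(H)))^2       # assumed GCV form
}, numeric(1))

lam_gcv <- lam[which.min(gcv)]         # grid value with the smallest GCV
lam_gcv
```

Because the search runs over the supplied grid, the resolution of lam_gcv is limited by the grid spacing.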
Murat Genç and Ömer Özbilen
Liu, K. (1993). A new class of biased estimate in linear regression. Communications in Statistics-Theory and Methods, 22(2), 393-402. doi:10.1080/03610929308831027.
Liu, X. Q. (2011). Improved Liu estimator in a linear regression model. Journal of Statistical Planning and Inference, 141(1), 189-196. doi:10.1016/j.jspi.2010.05.030.
Ozkale, M. R. and Kaciranlar, S. (2007). A prediction-oriented criterion for choosing the biasing parameter in Liu estimation. Communications in Statistics-Theory and Methods, 36(10), 1889-1903. doi:10.1080/03610920601126522.
Imdadullah, M., Aslam, M., and Altaf, S. (2017). liureg: A Comprehensive R Package for the Liu Estimation of Linear Regression Model with Collinear Regressors. The R Journal, 9(2), 232-247.
liureg(), predict(), summary(), pressliu(), residuals()
Hitters <- na.omit(Hitters)
X <- model.matrix(Salary ~ ., Hitters)[, -1]
y <- Hitters$Salary
lam <- seq(0, 1, 0.01)
liu.mod <- liureg(X, y, lam)
lamest(liu.mod)
liureg fits coefficient paths for Liu regression models over a grid of values for the regularization (biasing) parameter lambda. The returned object is of class liureg.
liureg(X, y, lambda = 1, scale = c("ulength", "unormal", "none"), ...)
X |
The design matrix of features. |
y |
The response vector. |
lambda |
User-specified values of |
scale |
Scaling type of the design matrix. |
... |
Not used in this implementation. |
The sequence of Liu regression models indexed by the tuning parameter $\lambda$ is obtained by
$$\hat{\beta}_\lambda = (X^\top X + I)^{-1}(X^\top X + \lambda I)\,\hat{\beta}_{OLS},$$
where $\hat{\beta}_{OLS} = (X^\top X)^{-1}X^\top y$ is the ordinary least squares estimator. To obtain the models, the singular value decomposition (SVD) of the matrix $X$ is used. This SVD is done once and is used to generate all models.
Explanatory variables in the design matrix are always centered before fitting a model in the fastliu package. For scaling, two options are possible: unit-length and unit-normal scaling. In unit-length scaling, the matrix of explanatory variables has correlation form. In unit-normal scaling, the explanatory variables have zero mean and unit variance. Coefficient estimates are presented both for the scaled data and on the original scale.
The intercept of the model is not penalized and is computed by $\hat{\beta}_0 = \bar{y} - \bar{X}\hat{\beta}_\lambda$, where $\bar{X}$ is the row vector of means of the explanatory variables and $\hat{\beta}_\lambda$ is computed based on the centered design matrix.
The returned liureg object is used for statistical testing of Liu coefficients, plotting, and computing the Liu regression related statistics.
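The idea of reusing a single SVD for the whole grid can be sketched in base R. With the centered matrix written as U D V', the Liu estimator in its usual form (X'X + I)^{-1} (X'X + lambda I) times the least squares estimator reduces to V diag((d^2 + lambda) / (d (d^2 + 1))) U'y. The sketch below assumes that form, uses centering only (no unit-length or unit-normal scaling) and simulated data. It is not the package's Armadillo code, and all names are illustrative.

```r
set.seed(4)
n <- 80; p <- 4
X <- matrix(rnorm(n * p), n, p)
y <- drop(X %*% c(1, -2, 0.5, 0) + rnorm(n))

Xc <- scale(X, center = TRUE, scale = FALSE)  # centering only
yc <- y - mean(y)

s   <- svd(Xc)                    # computed once: Xc = U D V'
d   <- s$d
Uty <- drop(crossprod(s$u, yc))   # U'y

lam <- seq(0, 1, 0.25)
# Column k holds the coefficients for lam[k]
beta_path <- sapply(lam, function(l) {
  drop(s$v %*% (((d^2 + l) / (d * (d^2 + 1))) * Uty))
})

# Unpenalized intercept for each lambda: ybar - xbar' beta
intercept <- mean(y) - drop(colMeans(X) %*% beta_path)

# Check against the direct matrix formula at lambda = 0.5
XtX <- crossprod(Xc)
Ip  <- diag(p)
b_ols    <- solve(XtX, crossprod(Xc, yc))
b_direct <- solve(XtX + Ip, (XtX + 0.5 * Ip) %*% b_ols)
all.equal(beta_path[, 3], drop(b_direct))
```

Only the vector of shrinkage factors changes with lambda, so each additional grid value costs one matrix-vector product instead of a new linear solve.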
A fitted Liu regression object of class liureg.
Murat Genç and Ömer Özbilen
coef(), predict(), summary(), pressliu(), residuals()
data("Hitters")
Hitters <- na.omit(Hitters)
X <- model.matrix(Salary ~ ., Hitters)[, -1]
y <- Hitters$Salary
lam <- seq(0, 1, 0.05)
liu.mod <- liureg(X, y, lam)
Plot method for liureg objects
## S3 method for class 'liureg'
plot(x, type = c("coefpath", "biasvar", "info"), ...)
x |
A |
type |
What to plot on the vertical axis. |
... |
Other graphical parameters to |
No return value.
Murat Genç
liureg(), predict(), summary()
Hitters <- na.omit(Hitters)
X <- model.matrix(Salary ~ ., Hitters)[, -1]
y <- Hitters$Salary
liu.mod <- liureg(X, y, seq(0, 1, 0.01))
# Liu coefficient paths
plot(liu.mod)
# Bias-variance trade-off
plot(liu.mod, type = "biasvar")
Predict method for liureg objects
## S3 method for class 'liureg'
predict(object, newdata, ...)
object |
A |
newdata |
A data frame of new values for |
... |
Not used in this implementation. |
Depending on whether lambda is a scalar or a vector, the predict.liureg function returns a vector or a matrix of predictions, respectively.
Murat Genç
liureg(), predict(), summary(), pressliu(), residuals()
data("Hitters")
Hitters <- na.omit(Hitters)
X <- model.matrix(Salary ~ ., Hitters)[, -1]
y <- Hitters$Salary
lam <- seq(0, 1, 0.01)
liu.mod <- liureg(X, y, lam)
# Predictions based on original X matrix.
predict(liu.mod)
# Predictions based on newdata. newdata can be a matrix or a data.frame.
predict(liu.mod, newdata = X[1:5, ])
pressliu computes the predicted residual sum of squares (PRESS) based on a Liu regression model.
pressliu(obj, digits = 5L, ...)
obj |
A |
digits |
Decimal places in the columns of the data frame of PRESS values. Can be an integer or a vector of integers. |
... |
Not used in this implementation. |
The PRESS statistic is based on the predicted leave-one-out residual sum of squares. For each observation i, the statistic is computed from the ith diagonal element of the hat matrix corresponding to the least squares estimator, the ith diagonal element of the hat matrix of the Liu estimator, and the residual at the specific value of lambda.
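The leave-one-out idea behind PRESS can be shown by brute force in base R: refit the model n times, each time predicting the held-out observation. The sketch assumes the usual Liu estimator on centered data with an unpenalized intercept. The helpers liu_fit and press_loo and the simulated data are hypothetical. pressliu uses a closed-form expression instead of refitting, and its handling of centering and scaling may give somewhat different numbers.

```r
set.seed(5)
n <- 40; p <- 3
X <- matrix(rnorm(n * p), n, p)
y <- drop(X %*% c(2, -1, 0.5) + rnorm(n))

# Hypothetical helper: Liu fit on centered data with an unpenalized intercept
liu_fit <- function(X, y, lambda) {
  xbar <- colMeans(X)
  Xc   <- sweep(X, 2, xbar)
  yc   <- y - mean(y)
  XtX  <- crossprod(Xc)
  Ip   <- diag(ncol(X))
  b <- drop(solve(XtX + Ip, (XtX + lambda * Ip) %*% solve(XtX, crossprod(Xc, yc))))
  list(b0 = mean(y) - sum(xbar * b), b = b)
}

# PRESS by explicit leave-one-out refitting
press_loo <- function(X, y, lambda) {
  res <- vapply(seq_len(nrow(X)), function(i) {
    fit <- liu_fit(X[-i, , drop = FALSE], y[-i], lambda)
    y[i] - (fit$b0 + sum(X[i, ] * fit$b))   # prediction error for the held-out row
  }, numeric(1))
  sum(res^2)
}

lam   <- seq(0, 1, 0.25)
press <- vapply(lam, function(l) press_loo(X, y, l), numeric(1))
press
```

At lambda = 1 the result agrees with the familiar least squares shortcut sum((e_i / (1 - h_ii))^2), which gives a quick sanity check for the loop.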
The returned object is a vector of PRESS values computed for each lambda.
Murat Genç, Ömer Özbilen
Liu, K. (1993). A new class of biased estimate in linear regression. Communications in Statistics-Theory and Methods, 22(2), 393-402. doi:10.1080/03610929308831027.
Ozkale, M. R. and Kaciranlar, S. (2007). A prediction-oriented criterion for choosing the biasing parameter in Liu estimation. Communications in Statistics-Theory and Methods, 36(10), 1889-1903. doi:10.1080/03610920601126522.
liureg(), pressliu(), residuals()
data("Hitters")
Hitters <- na.omit(Hitters)
X <- model.matrix(Salary ~ ., Hitters)[, -1]
y <- Hitters$Salary
lam <- seq(0, 1, 0.01)
liu.mod <- liureg(X, y, lam)
pressliu(liu.mod)
Prints coefficient paths for Liu regression models over a grid of values for the regularization (biasing) parameter lambda.
## S3 method for class 'liureg'
print(x, digits = max(3, getOption("digits") - 3), ...)
x |
An object of class |
digits |
Number of decimal places in the coefficients data.frame. |
... |
Not used in this implementation. |
The returned object is a data.frame showing the coefficients path.
Murat Genç
liureg(), summary(), pressliu(), residuals()
Hitters <- na.omit(Hitters)
X <- model.matrix(Salary ~ ., Hitters)[, -1]
y <- Hitters$Salary
lam <- seq(0, 1, 0.01)
liu.mod <- liureg(X, y, lam)
print(liu.mod)
statliu computes the statistics related to the Liu regression.
## S3 method for class 'statliu'
print(x, digits = 5, ...)
x |
A |
digits |
Number of decimal places in the data frame of Liu regression statistics. |
... |
Other parameters related to |
The returned object contains the statistics related to the Liu regression.
Murat Genç
liureg(), summary(), pressliu(), residuals()
Hitters <- na.omit(Hitters)
X <- model.matrix(Salary ~ ., Hitters)[, -1]
y <- Hitters$Salary
lam <- seq(0, 1, 0.01)
liu.mod <- liureg(X, y, lam)
stats <- statliu(liu.mod)
print(stats)
Liu Regression Residuals
## S3 method for class 'liureg'
residuals(object, ...)
object |
An object of class |
... |
Not used in this implementation. |
The returned object is a vector or matrix whose columns are Liu residuals for each lambda.
Murat Genç
liureg(), pressliu(), residuals()
Hitters <- na.omit(Hitters)
X <- model.matrix(Salary ~ ., Hitters)[, -1]
y <- Hitters$Salary
lam <- seq(0, 1, 0.01)
liu.mod <- liureg(X, y, lam)
residuals(liu.mod)
statliu computes the statistics related to the Liu regression.
statliu(obj)
obj |
An object of class |
EDF (Liu, 1993; Hastie et al., 2009) | Effective degrees of freedom for each lambda, computed from the number of observations in the design matrix and the hat matrix of Liu regression at that lambda. |
sigma2 | Error variance estimate computed from the Liu regression for each lambda. |
VAR | Variance from the Liu regression for each lambda. |
BIAS2 | Squared bias from the Liu regression for each lambda. |
MSE | Mean squared error (MSE) from the Liu regression for each lambda. |
FVal | F-statistic value from the Liu regression for each lambda. |
GCV | Generalized cross-validation (GCV) from the Liu regression for each lambda. The GCV is computed from the residual sum of squares and the trace of the hat matrix at the corresponding value of lambda. |
R2 | R-squared from the Liu regression for each lambda. |
AdjR2 | Adjusted R-squared from the Liu regression for each lambda. |
The returned object contains the statistics related to the Liu regression.
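The interplay of VAR, BIAS2 and MSE can be illustrated with a small base R simulation in which the true coefficients and error variance are known. It assumes the usual Liu estimator, for which the bias vector is (lambda - 1) (X'X + I)^{-1} beta and the covariance is sigma^2 F (X'X)^{-1} F' with F = (X'X + I)^{-1} (X'X + lambda I). In statliu these quantities have to be estimated from the data, so the helper bias_var and the values below are purely illustrative.

```r
set.seed(6)
n <- 100; p <- 3
X <- scale(matrix(rnorm(n * p), n, p), center = TRUE, scale = FALSE)
beta   <- c(1, -1, 0.5)   # "true" coefficients, known only because the data are simulated
sigma2 <- 1               # "true" error variance

XtX <- crossprod(X)
Ip  <- diag(p)

# Hypothetical helper: total variance, squared bias and MSE at one lambda
bias_var <- function(lambda) {
  F_l  <- solve(XtX + Ip, XtX + lambda * Ip)
  v    <- sigma2 * sum(diag(F_l %*% solve(XtX) %*% t(F_l)))  # trace of the covariance
  bvec <- (lambda - 1) * solve(XtX + Ip, beta)                # bias vector
  c(VAR = v, BIAS2 = sum(bvec^2), MSE = v + sum(bvec^2))
}

tab <- t(sapply(seq(0, 1, 0.25), bias_var))
rownames(tab) <- paste0("lambda=", seq(0, 1, 0.25))
tab
```

Variance grows with lambda while squared bias falls to zero at lambda = 1, which is the trade-off that the "biasvar" plot type is meant to display.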
Murat Genç
Liu, K. (1993). A new class of biased estimate in linear regression. Communications in Statistics-Theory and Methods, 22(2), 393-402. doi:10.1080/03610929308831027.
Hastie, T., Tibshirani, R., Friedman, J. H. (2009). The elements of statistical learning: data mining, inference, and prediction (Vol. 2, pp. 1-758). New York: Springer.
liureg(), summary(), pressliu(), residuals()
Hitters <- na.omit(Hitters)
X <- model.matrix(Salary ~ ., Hitters)[, -1]
y <- Hitters$Salary
lam <- seq(0, 1, 0.01)
liu.mod <- liureg(X, y, lam)
stats <- statliu(liu.mod)
print(stats)
summary method for liureg objects.
## S3 method for class 'liureg'
summary(object, digits, ...)
object |
An object of class |
digits |
Number of decimal places in the data frame of Liu regression statistics. |
... |
Not used in this implementation. |
summary.liureg produces an object with S3 class summary.liureg.
The function returns a list of summary statistics of the Liu regression fit for the grid
of regularization parameter values. Each element of the output list includes:
coefficients | A matrix whose columns are coefficient estimates, scaled coefficient estimates, scaled standard errors, scaled t values and the corresponding p values. |
Statistics | Liu related statistics: R-squared, adjusted R-squared, F statistics, AIC, BIC and MSE values. |
The returned object is a list whose elements are Liu regression coefficient estimates and statistics related to Liu regression.
Murat Genç
liureg(), coef(), predict(), residuals()
Hitters <- na.omit(Hitters)
X <- model.matrix(Salary ~ ., Hitters)[, -1]
y <- Hitters$Salary
lam <- seq(0, 1, 0.01)
liu.mod <- liureg(X, y, lam)
summary(liu.mod)