Title: | Partially Linear Additive Quantile Regression |
---|---|
Description: | Estimation, prediction, thresholding, transformation, and plotting for partially linear additive quantile regression. Intuitive functions for fitting and plotting partially linear additive quantile regression models. Uses and works with functions from the 'quantreg' package. |
Authors: | Adam Maidman [cre, aut] |
Maintainer: | Adam Maidman <[email protected]> |
License: | GPL (>= 2) |
Version: | 2.0 |
Built: | 2024-12-07 06:41:03 UTC |
Source: | CRAN |
Returns the BIC for the partially linear additive quantile regression model from Lee, Noh, and Park (2014).
bic(fit, ...)
fit |
a "plaqr" object. |
... |
additional parameters which will be ignored |
BIC value
Adam Maidman
Lee, E. R., Noh, H., and Park, B. U. (2014). Model selection via Bayesian information criterion for quantile regression models. Journal of the American Statistical Association 109, 216-229.
data(simData)
ss <- vector("list", 2)
ss[[2]]$degree <- 3
fit1 <- plaqr(y~., nonlinVars=~z1+z2, data=simData, splinesettings=ss)
ss[[2]]$degree <- 4
fit2 <- plaqr(y~., nonlinVars=~z1+z2, data=simData, splinesettings=ss)
ss[[2]]$degree <- 5
fit3 <- plaqr(y~., nonlinVars=~z1+z2, data=simData, splinesettings=ss)
bic(fit1)
bic(fit2)
bic(fit3)
Returns an object of class "plaqreffect" which represents the effect plot(s) of the nonlinear term(s) of a "plaqr" object from the plaqr function. A "plaqreffect" object should be plotted using the plot function.
nonlinEffect(fit, select=NULL, renames=NULL)
fit |
a "plaqr" object from the plaqr function. |
select |
a character vector with entries matching nonlinear terms in |
renames |
a character vector with length equal to the number of nonlinear terms in |
A returned "plaqreffect" object to be used with the "plot" function. Each nonlinear term is associated with a list containing information for plotting. See the examples for accessing the list.
Adam Maidman
data(simData)
fit <- plaqr(y~.,~z1+z2,data=simData)
eff1 <- nonlinEffect(fit)
eff1
plot(eff1)
eff2 <- nonlinEffect(fit, select=c("z1","z2"), renames=c("Length", "Height"))
eff2
plot(eff2)
eff3 <- nonlinEffect(fit, select=c("z2","z1"), renames=c("Height", "Length"))
eff3
eff3$z1
eff3$z2
plot(eff3)
par(mfrow=c(1,2))
plot(eff3)
Returns an object of class "plaqr" and "rq" that represents a quantile regression fit. A nonlinear term z is transformed using bs(z) before fitting the model. The formula of the model (as it appears in R) becomes y ~ x1 + x2 + bs(z1) + bs(z2), where bs(z1) is a B-spline.
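As a rough illustration of the formula expansion described above, the sketch below builds the expanded formula by hand and inspects the resulting design matrix. It uses only base R and the 'splines' package; the data-generating lines are copied from the simData description, and the names `linear`, `nonlinear`, `expanded`, and `X` are illustrative. It is an assumption that plaqr assembles its formula in a comparable way internally.

```r
library(splines)  # provides bs()

# Data-generating steps copied from the simData description
set.seed(4)
x1 <- rbinom(100, 1, .5)
x2 <- rnorm(100)
x3 <- rnorm(100)
z1 <- runif(100, 0, 1)
z2 <- runif(100, -1, 1)
y <- 3*x1 + 1.5*x2 + 2*x3 + 5*sin(2*pi*z1) + 5*z2^3 + rnorm(100)
simData <- data.frame(y, x1, x2, x3, z1, z2)

linear    <- c("x1", "x2", "x3")  # illustrative names, not package internals
nonlinear <- c("z1", "z2")

# Wrap each nonlinear term in bs() and append it to the linear terms
expanded <- reformulate(c(linear, paste0("bs(", nonlinear, ")")), response = "y")
print(expanded)

# Design matrix: intercept + 3 linear columns + 3 default cubic basis
# columns for each of the two spline terms
X <- model.matrix(expanded, data = simData)
dim(X)
```

With the default bs() settings (cubic, no interior knots), each nonlinear term contributes three columns to the design matrix.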
plaqr(formula, nonlinVars=NULL, tau=.5, data=NULL, subset, weights, na.action, method = "br", model = TRUE, contrasts = NULL, splinesettings=NULL, ...)
formula |
a formula object, with the response on the left of a ~ operator and the linear terms, separated by + operators, on the right. |
nonlinVars |
a one-sided formula object, with a ~ operator followed by the nonlinear terms, separated by + operators. |
tau |
the quantile to be estimated; this is a number strictly between 0 and 1 (for now). |
data |
a data.frame in which to interpret the variables named in the formula, or in the subset and the weights argument. If this is missing, then the variables in the formula should be on the search list. This may also be a single number to handle some special cases – see below for details. |
subset |
an optional vector specifying a subset of observations to be used in the fitting process. |
weights |
vector of observation weights; if supplied, the algorithm fits to minimize the sum of the weights multiplied into the absolute residuals. The length of weights must be the same as the number of observations. The weights must be nonnegative and it is strongly recommended that they be strictly positive, since zero weights are ambiguous. |
na.action |
a function to filter missing data.
This is applied to the model.frame after any subset argument has been used.
The default (with |
model |
if TRUE then the model frame is returned. This is essential if one wants to call summary subsequently. |
method |
the algorithmic method used to compute the fit. There are several
options: The default method is the modified version of the
Barrodale and Roberts algorithm for |
contrasts |
a list giving contrasts for some or all of the factors
default = |
splinesettings |
a list of length equal to the number of nonlinear effects containing arguments to pass to the bs function. |
... |
additional arguments for the fitting routines
(see the |
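The splinesettings argument is a list with one entry per nonlinear term, each holding arguments for bs. The sketch below shows how such a list could be applied term by term using only the 'splines' package. Forwarding the settings with do.call is an assumption about the mechanism, not a description of plaqr's internals, and the names `z`, `bases`, and `make_basis` are hypothetical.

```r
library(splines)  # provides bs()

set.seed(4)
z1 <- runif(100, 0, 1)
z2 <- runif(100, -1, 1)

# Same construction as in the examples: defaults for the first term,
# custom degree and boundary knots for the second
ss <- vector("list", 2)
ss[[2]]$degree <- 5
ss[[2]]$Boundary.knots <- c(-1, 1)

# Hypothetical helper: pass one term's settings on to bs()
make_basis <- function(values, settings) {
  do.call(bs, c(list(x = values), settings))
}

z <- list(z1 = z1, z2 = z2)
bases <- Map(make_basis, z, ss)

# Number of basis columns per nonlinear term
sapply(bases, ncol)
```

An empty (NULL) entry leaves bs at its defaults, so only the terms that need non-default settings have to be filled in.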
Returns the following:
coefficients |
Coefficients from the fitted model |
x |
optionally the model matrix, if |
y |
optionally the response, if |
residuals |
the residuals from the fit. |
dual |
the vector of dual variables from the fit. |
fitted.values |
fitted values from the fit. |
formula |
the formula that was used in the |
rho |
the value of the objective function at the solution. |
model |
optionally the model frame, if |
linear |
the linear terms used in the model fit. |
nonlinear |
the nonlinear terms used in the model fit. |
z |
the values of the nonlinear terms. |
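For orientation, the objective minimized in quantile regression is the sum of check losses, rho_tau(u) = u * (tau - I(u < 0)), evaluated at the residuals. The sketch below computes that quantity for a small made-up residual vector; the helper name `check_loss` and the residual values are hypothetical, and observation weights are left out for simplicity.

```r
# Check (pinball) loss for residuals u at quantile level tau
check_loss <- function(u, tau) {
  u * (tau - (u < 0))
}

res <- c(-2, -1, 0, 1, 3)  # made-up residuals for illustration

# At tau = 0.5 the loss is half the sum of absolute residuals
obj_median <- sum(check_loss(res, tau = 0.5))

# At tau = 0.9 positive residuals are penalized more than negative ones
obj_upper <- sum(check_loss(res, tau = 0.9))

c(median = obj_median, upper = obj_upper)
```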
Adam Maidman
Hastie, T. J. (1992) Generalized additive models. Chapter 7 of Statistical Models in S eds J. M. Chambers and T. J. Hastie, Wadsworth & Brooks/Cole.
Koenker, R. W. (2005). Quantile Regression, Cambridge U. Press.
Sherwood, B. and Wang, L. (2016). Partially linear additive quantile regression in ultra-high dimension. The Annals of Statistics 44, 288-317.
Maidman, A., Wang, L. (2017). New Semiparametric Method for Predicting High-Cost Patients. Preprint.
data(simData)
ss <- vector("list", 2)
ss[[2]]$degree <- 5
ss[[2]]$Boundary.knots <- c(-1, 1)
plaqr(y~., nonlinVars=~z1+z2, data=simData)
#same as plaqr(formula= y~x1+x2+x3, nonlinVars=~z1+z2, data=simData)
plaqr(y~0, nonlinVars=~z1+z2, data=simData, splinesettings=ss) #no linear terms in the model
plaqr(y~., data=simData) #all linear terms
Makes nonlinear effect plots for the nonlinear effects in a fit returned from the nonlinEffect function. Note: you cannot use this function to plot a "plaqr" object.
## S3 method for class 'plaqreffect'
plot(x, select=NULL, rug = TRUE, jit = TRUE, titles = NULL, pages = 0, type="l", ...)
x |
a "plaqreffect" object returned from the nonlinEffect function. |
select |
vector of indices of nonlinear terms in |
rug |
if TRUE, a rugplot for the x-coordinate is plotted. |
jit |
if TRUE, the x-values of the rug plot are jittered. |
titles |
title(s) as a vector of character strings; by default, titles are chosen for each plot as “Effect of CovariateName (tau=tau)”. |
pages |
number of pages desired for the plots. |
type |
the type of plot that should be drawn. |
... |
additional arguments for the plotting algorithm. |
Adam Maidman
data(simData)
fit <- plaqr(y~.,~z1+z2,data=simData)
eff <- nonlinEffect(fit, select=c("z1","z2"), renames=c("Length", "Height"))
eff
plot(eff)
plot(eff, select=1, col="red")
plot(eff, select=c(2,1), titles=c("Effect Z1","Effect Z2"))
plot(eff, select=1, col="red", lwd=4)
par(mfrow=c(1,2))
plot(eff)
Predicts future values using the median and finds a prediction interval for future values using an upper and lower quantile. The lower quantile is (1-level)/2 and the upper quantile is .5 + level/2.
predictInt(fit, level=.95, newdata=NULL, ...)
fit |
a fitted model of class |
level |
the prediction level required. The lower quantile is (1-level)/2 and the upper quantile is .5 + level/2. |
newdata |
an optional data frame in which to look for variables with which to predict. If omitted, the fitted values are used. |
... |
additional argument(s) for methods. |
a matrix with columns giving the predicted median and lower and upper prediction bounds.
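The quantile levels implied by a given prediction level follow directly from the two formulas above. A minimal sketch of that arithmetic for the default level (the variable names `lower_tau` and `upper_tau` are illustrative):

```r
level <- 0.95

# Quantile levels used for the lower and upper prediction bounds
lower_tau <- (1 - level) / 2
upper_tau <- 0.5 + level / 2

# The point prediction is the median, tau = 0.5
c(lower = lower_tau, median = 0.5, upper = upper_tau)
```

The two bounds are symmetric around the median and leave (1-level)/2 of the probability in each tail.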
Adam Maidman
data(simData)
fit <- plaqr(y~.,~z1+z2,data=simData)
predictInt(fit, level=.95)
Print an object generated by nonlinEffect.
## S3 method for class 'plaqreffect'
print(x, ...)
x |
an object returned from nonlinEffect. |
... |
optional arguments. |
Adam Maidman
Print an object generated by threshold.
## S3 method for class 'thresh'
print(x, ...)
x |
an object returned from threshold. |
... |
optional arguments. |
Adam Maidman
A simulated data set to illustrate the functions in this package.
set.seed(4)
x1 <- rbinom(100, 1,.5)
x2 <- rnorm(100)
x3 <- rnorm(100)
z1 <- runif(100, 0, 1)
z2 <- runif(100, -1, 1)
y <- 3*x1 +1.5*x2 + 2*x3 + 5*sin(2*pi*z1) + 5*z2^3 + rnorm(100)
simData <- data.frame(y,x1,x2,x3,z1,z2)
data(simData)
A data frame with 100 observations on the following 6 variables.
y: the response, expenditure
x1: male/female (a linear term)
x2: distance north/south from center (a linear term)
x3: distance east/west from center (a linear term)
z1: income/(max income) (a nonlinear term)
z2: spending habits on a -1 to 1 scale (frugal to lavish) (a nonlinear term)
Classification of a numerical response into a “high” class and “low” class using a threshold. This function can be used with any model that has a numerical outcome and allows for prediction using the predict function.
threshold(fit, t, newdata=NULL, ...)
fit |
any model with a numerical response. |
t |
the desired threshold value. All values above |
newdata |
an optional data frame in which to look for variables with which to predict. If omitted, no prediction is done. |
... |
additional argument(s) for methods in the |
pred.class |
if |
t |
the threshold. |
train.class |
a vector of the predicted classes of the data used in |
true.class |
a vector of the true classes of the data used in |
train.error |
a scalar equal to the |
true.high |
the number of observations in class “1” using the data used in |
true.low |
the number of observations in class “0” using the data used in |
false.high |
the number of observations truly in class “0”, but predicted to be in class “1” using the data used in |
false.low |
the number of observations truly in class “1”, but predicted to be in class “0” using the data used in |
call |
the |
formula |
the formula used in |
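The quantities in the table above can be illustrated with a small base R sketch. It assumes that values strictly above the threshold are coded as class 1 and all others as class 0, and that true.high and true.low count correctly classified observations; both are assumptions, and the toy vectors and variable names below are hypothetical rather than output from threshold.

```r
# Toy observed responses and model predictions (made-up values)
y_true <- c(4, 10, 12, 7, 9.5)
y_pred <- c(5, 11, 8, 6, 10)
t <- 9  # threshold

# Assumed coding: strictly above t is class 1 ("high"), otherwise 0 ("low")
true_class <- as.integer(y_true > t)
pred_class <- as.integer(y_pred > t)

# Share of observations whose predicted class differs from the true class
train_error <- mean(true_class != pred_class)

# Confusion counts
true_high  <- sum(true_class == 1 & pred_class == 1)
true_low   <- sum(true_class == 0 & pred_class == 0)
false_high <- sum(true_class == 0 & pred_class == 1)
false_low  <- sum(true_class == 1 & pred_class == 0)

c(train_error = train_error, true_high = true_high, true_low = true_low,
  false_high = false_high, false_low = false_low)
```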
Adam Maidman
data(simData)
fit <- plaqr(y~.,~z1+z2,data=simData)
testdata <- .5*simData[4,2:6]
trh <- threshold(fit, t=9, newdata=testdata)
trh$pred.class
trh
Transform the response variable using the one-parameter, symmetric transformation of Geraci and Jones (2015).
trans_parameter(x, parameter, inverse=FALSE)
x |
a vector of values to be transformed (the response variable) |
parameter |
a real-valued transformation parameter. 0 corresponds to the log transformation and 1 corresponds to the identity. See Geraci and Jones (2015) for more information on the one-parameter, symmetric transformation. |
inverse |
If TRUE, the inverse transformation is done to transform the variable back to the original scale. If FALSE, the standard transformation is computed. |
Returns a vector of the transformed (or back-transformed) variable.
Adam Maidman
Geraci, M. and Jones, M. (2015). Improved transformation-based quantile regression. Canadian Journal of Statistics 43, 118-132.
Maidman, A., Wang, L. (2017). New Semiparametric Method for Predicting High-Cost Patients. Preprint.
data(simData)
simData$Y <- exp(simData$y)
tparam <- transform_plaqr(Y~x1+x2+x3, nonlinVars=~z1+z2, data=simData)
simData$newy <- trans_parameter(simData$Y, tparam$parameter)
fit <- plaqr(newy~x1+x2+x3, nonlinVars=~z1+z2, data=simData)
trans_parameter(predictInt(fit), tparam$parameter, inverse=TRUE)
Returns the estimated transformation parameter for the one-parameter symmetric transformation (Geraci and Jones, 2015). Confidence intervals for the transformation parameter can also be created using the bootstrap. The response variable must be strictly positive; a constant can be added to the variable to ensure that all values are positive.
transform_plaqr(formula, nonlinVars=NULL, tau=.5, data=NULL, lambda=seq(0,1,by=.05), confint=NULL, B=99, subset, weights, na.action, method = "br", contrasts = NULL, splinesettings=NULL)
formula |
a formula object, with the response on the left of a ~ operator and the linear terms, separated by + operators, on the right. |
nonlinVars |
a one-sided formula object, with a ~ operator followed by the nonlinear terms, separated by + operators. |
tau |
the quantile to be estimated; this is a number strictly between 0 and 1 (for now). |
data |
a data.frame in which to interpret the variables named in the formula, or in the subset and the weights argument. If this is missing, then the variables in the formula should be on the search list. This may also be a single number to handle some special cases – see below for details. |
lambda |
a real-valued sequence of possible transformation parameters. 0 corresponds to the log transformation and 1 corresponds to the identity. The transformation is symmetric so a negative transformation parameter is redundant and can be avoided. See Geraci and Jones (2015) for more information on the one-parameter, symmetric transformation. |
confint |
a |
B |
the number of bootstrap replications for the confidence interval. If no confidence interval is being created, this argument is ignored. |
subset |
an optional vector specifying a subset of observations to be used in the fitting process. |
weights |
vector of observation weights; if supplied, the algorithm fits to minimize the sum of the weights multiplied into the absolute residuals. The length of weights must be the same as the number of observations. The weights must be nonnegative and it is strongly recommended that they be strictly positive, since zero weights are ambiguous. |
na.action |
a function to filter missing data.
This is applied to the model.frame after any subset argument has been used.
The default (with |
method |
the algorithmic method used to compute the fit. There are several
options: The default method is the modified version of the
Barrodale and Roberts algorithm for |
contrasts |
a list giving contrasts for some or all of the factors
default = |
splinesettings |
a list of length equal to the number of nonlinear effects containing arguments to pass to the bs function. |
Returns the following:
parameter |
The transformation parameter |
Y |
The values of the transformed response |
confint |
If a confidence interval is created, this is the confidence interval for the transformation parameter. Otherwise, |
U |
If a confidence interval is created, a |
P |
If a confidence interval is created, a |
Adam Maidman
Geraci, M. and Jones, M. (2015). Improved transformation-based quantile regression. Canadian Journal of Statistics 43, 118-132.
Maidman, A., Wang, L. (2017). New Semiparametric Method for Predicting High-Cost Patients. Preprint.
data(simData)
simData$Y <- exp(simData$y)
transform_plaqr(Y~x1+x2+x3, nonlinVars=~z1+z2, data=simData)
transform_plaqr(Y~x1+x2+x3, nonlinVars=~z1+z2, confint=.95, data=simData)