Title: | Fit Regularized Nonparametric Regression Models Using COSSO Penalty |
---|---|
Description: | The COSSO regularization method automatically estimates and selects important function components by a soft-thresholding penalty in the context of smoothing spline ANOVA models. Implemented models include mean regression, quantile regression, logistic regression and the Cox regression models. |
Authors: | Hao Helen Zhang [aut, cph], Chen-Yen Lin [aut, cph], Isaac Ray [cre, ctb] |
Maintainer: | Isaac Ray <[email protected]> |
License: | GPL (>= 2) |
Version: | 2.1-2 |
Built: | 2024-11-21 06:32:45 UTC |
Source: | CRAN |
345 male patients' blood test result and liver disorder status.
data(BUPA)
data(BUPA)
CLASS | 0: no liver disorder 1: liver disorder |
MCV | mean corpuscular volume. minimum 65 and maximum 103 in original scale. |
ALKPHOS | alkaline phosphotase. minimum 23 and maximum 138 in original scale. |
SGPT | alamine aminotransferase. minimum 4 and maximum 155 in original scale. |
SGOT | aspartate aminotransferase. minimum 5 and maximum 82 in original scale. |
GAMMAGT | gamma-glutamyl transpeptidase. minimum 5 and maximum 297 in original scale. |
DRINKS | number of alcoholic beverages drunk per day. minimum 0 and maximum 20 in original scale. |
All the variables, except for the response, have been scaled to [0,1] interval. To transform back to the original scale, use the formula:
Richard S. Forsyth at BUPA Medical Research Ltd.
A comprehensive method for fitting various type of regularized nonparametric regression models using cosso penalty. Fits mean, logistic, Cox and quantile regression.
cosso(x,y,tau,family=c("Gaussian","Binomial","Cox","Quantile"),wt=rep(1,ncol(x)), scale=FALSE,nbasis,basis.id,cpus)
cosso(x,y,tau,family=c("Gaussian","Binomial","Cox","Quantile"),wt=rep(1,ncol(x)), scale=FALSE,nbasis,basis.id,cpus)
x |
input matrix; the number of rows is sample size, the number of columns is the data dimension. The range of input variables is scaled to [0,1] for continuous variables. Variables with less than 7 unique values will be considered as discrete variable. |
y |
response vector. Quantitative for |
tau |
the quantile to be estimated, a number strictly between 0 and 1. Argument required when |
family |
response type. Abbreviations are allowed. |
wt |
weights for predictors. Default is |
scale |
if |
nbasis |
number of "knots" to be selected. Ignored when |
basis.id |
index designating selected "knots". Argument is not valid for |
cpus |
number of available processor units. Default is |
In the SS-ANOVA model framework, the regression function is assumed to have an additive form
where denotes intercept and
denotes the main effect of the
-th covariate.
For "Gaussian"
response, the mean function is estimated by minimizing the objective function:
For "Binomial"
response, the log-odd function is estimated by minimizing the objective function:
For "Quantile"
regression model, the quantile function, is estimated by minimizing the objective function:
For "Cox"
regression model, the log-relative hazard function is estimated by minimizing the objective function:
For identifiability sake, the intercept term in Cox model is absorbed into basline hazard, or equivalently set .
For large data sets, we can reduce the computational load of the optimization problem by
selecting a subset of the observations of size nbais as "knots", which reduces the dimension of the
kernel matrices from nobs to nbasis.
Unless specified via basis.id
or nbasis
, the default number of "knots" is (40,12*nobs^(2/9)) for
"Gaussian"
and "Binomial"
and
(35,11 * nobs^(2/9)) for
"Cox"
.
The weights can be specified based on either user's own discretion or adaptively computed from initial
function estimates. See Storlie et al. (2011) for more discussions. One possible choice is to specify the weights
as the inverse norm of initial function estimator, see
SSANOVAwt
.
An object with S3 class "cosso".
y |
the response vector. |
x |
the input matrix. |
Kmat |
a three-dimensional array containing kernel matrices for each input variables. |
wt |
weights for predictors. |
family |
type of regression model. |
basis.id |
indices of observations used as "knots". |
cpus |
number of cpu units used. Will be returned if |
tau |
the quantile to be estimated. Will be returned if |
tune |
a list containing preliminary tuning result and L2-norm. |
Hao Helen Zhang and Chen-Yen Lin
Lin, Y. and Zhang, H. H. (2006) "Component Selection and Smoothing in Smoothing Spline Analysis of Variance Models", Annals of Statistics, 34, 2272–2297.
Leng, C. and Zhang, H. H. (2006) "Model selection in nonparametric hazard regression", Nonparametric Statistics, 18, 417–429.
Zhang, H. H. and Lin, Y. (2006) "Component Selection and Smoothing for Nonparametric Regression in Exponential Families", Statistica Sinica, 16, 1021–1041.
Storlie, C. B., Bondell, H. D., Reich, B. J. and Zhang, H. H. (2011) "Surface Estimation, Variable Selection, and the Nonparametric Oracle Property", Statistica Sinica, 21, 679–705.
plot.cosso
, predict.cosso
, tune.cosso
## Gaussian set.seed(20130310) x=cbind(rbinom(200,1,.7),matrix(runif(200*9,0,1),nc=9)) y=x[,1]+sin(2*pi*x[,2])+5*(x[,4]-0.4)^2+rnorm(200,0,1) G.Obj=cosso(x,y,family="Gaussian") plot.cosso(G.Obj,plottype="Path") ## Not run: ## Use all observations as knots set.seed(20130310) x=cbind(rbinom(200,1,.7),matrix(runif(200*9,0,1),nc=9)) y=x[,1]+sin(2*pi*x[,2])+5*(x[,4]-0.4)^2+rnorm(200,0,1) G.Obj=cosso(x,y,family="Gaussian",nbasis=200) ## Clean up rm(list=ls()) ## Binomial set.seed(20130310) x=cbind(rbinom(200,1,.7),matrix(runif(200*9,0,1),nc=9)) trueProb=1/(1+exp(-x[,1]-sin(2*pi*x[,2])-5*(x[,4]-0.4)^2)) y=rbinom(200,1,trueProb) B.Obj=cosso(x,y,family="Bin") ## Clean up rm(list=ls()) ## Cox set.seed(20130310) x=cbind(rbinom(200,1,.7),matrix(runif(200*9,0,1),nc=9)) hazard=x[,1]+sin(2*pi*x[,2])+5*(x[,4]-0.4)^2 surTime=rexp(200,exp(hazard)) cenTime=rexp(200,exp(-hazard)*runif(1,4,6)) y=cbind(time=apply(cbind(surTime,cenTime),1,min),status=1*(surTime<cenTime)) C.obj=cosso(x,y,family="Cox",cpus=1) ## Try parallel computing C.obj=cosso(x,y,family="Cox",cpus=4) ## Clean up rm(list=ls()) ## Quantile set.seed(20130310) x=cbind(rbinom(200,1,.7),matrix(runif(200*7,0,1),nc=7)) y=x[,1]+sin(2*pi*x[,2])+5*(x[,4]-0.4)^2+rt(200,3) Q.obj=cosso(x,y,0.3,family="Quan",cpus=1) ## Try parallel computing Q.obj=cosso(x,y,0.3,family="Quan",cpus=4) ## End(Not run)
## Gaussian set.seed(20130310) x=cbind(rbinom(200,1,.7),matrix(runif(200*9,0,1),nc=9)) y=x[,1]+sin(2*pi*x[,2])+5*(x[,4]-0.4)^2+rnorm(200,0,1) G.Obj=cosso(x,y,family="Gaussian") plot.cosso(G.Obj,plottype="Path") ## Not run: ## Use all observations as knots set.seed(20130310) x=cbind(rbinom(200,1,.7),matrix(runif(200*9,0,1),nc=9)) y=x[,1]+sin(2*pi*x[,2])+5*(x[,4]-0.4)^2+rnorm(200,0,1) G.Obj=cosso(x,y,family="Gaussian",nbasis=200) ## Clean up rm(list=ls()) ## Binomial set.seed(20130310) x=cbind(rbinom(200,1,.7),matrix(runif(200*9,0,1),nc=9)) trueProb=1/(1+exp(-x[,1]-sin(2*pi*x[,2])-5*(x[,4]-0.4)^2)) y=rbinom(200,1,trueProb) B.Obj=cosso(x,y,family="Bin") ## Clean up rm(list=ls()) ## Cox set.seed(20130310) x=cbind(rbinom(200,1,.7),matrix(runif(200*9,0,1),nc=9)) hazard=x[,1]+sin(2*pi*x[,2])+5*(x[,4]-0.4)^2 surTime=rexp(200,exp(hazard)) cenTime=rexp(200,exp(-hazard)*runif(1,4,6)) y=cbind(time=apply(cbind(surTime,cenTime),1,min),status=1*(surTime<cenTime)) C.obj=cosso(x,y,family="Cox",cpus=1) ## Try parallel computing C.obj=cosso(x,y,family="Cox",cpus=4) ## Clean up rm(list=ls()) ## Quantile set.seed(20130310) x=cbind(rbinom(200,1,.7),matrix(runif(200*7,0,1),nc=7)) y=x[,1]+sin(2*pi*x[,2])+5*(x[,4]-0.4)^2+rt(200,3) Q.obj=cosso(x,y,0.3,family="Quan",cpus=1) ## Try parallel computing Q.obj=cosso(x,y,0.3,family="Quan",cpus=4) ## End(Not run)
This is the ozone data used in Breiman and Friedman (1985). This dataset contains 330 observations, and each observation is a daily measurement.
data(ozone)
data(ozone)
ozone | Ozone reading |
temp | Temperature (degree C). minimum 25 and maximum 93 in original scale. |
invHt | Inversion base height (feet). minimum 111 and maximum 5000 in original scale. |
press | Pressure gradient (mm Hg). minimum -69 and maximum 107 in original scale. |
vis | Visibility (miles). minimum 0 and maximum 350 in original scale. |
milPress | 500 millibar pressure height (m). minimum 5320 and maximum 5950 in original scale. |
hum | Humidity (percent). minimum 19 and maximum 93. |
invTemp | Inversion base temperature (degrees F). minimum -25 and maximum 332 in original scale. |
wind | Wind speed (mph). minimum 0 and maximum 21 in original scale. |
All the variables, except for the response, have been scaled to [0,1] interval. To transform back to the original scale, use the formula:
Breiman, L. and Friedman, J. (1985), "Estimating Optimal Transformations for Multiple Regression and Correlation", Journal of the American Statistical Association, 80, 580–598.
Plot norm solution path or main effects of selected functional components
## S3 method for class 'cosso' plot(x,M,plottype =c("Path","Functionals"),eps=1e-7,...)
## S3 method for class 'cosso' plot(x,M,plottype =c("Path","Functionals"),eps=1e-7,...)
x |
a cosso object. |
M |
a smoothing parameter value. Argument required when |
plottype |
either |
eps |
an effective zero, default is |
... |
additional arguments for plot generic. |
NULL
Hao Helen Zhang and Chen-Yen Lin
set.seed(20130310) x=cbind(rbinom(200,1,.7),matrix(runif(200*7,0,1),nc=7)) trueProb=1/(1+exp(-x[,1]-sin(2*pi*x[,2])-5*(x[,4]-0.4)^2)) y=rbinom(200,1,trueProb) B.Obj=cosso(x,y,family="Bin") plot.cosso(B.Obj,plottype="Path") plot.cosso(B.Obj,M=2,plottype="Func")
set.seed(20130310) x=cbind(rbinom(200,1,.7),matrix(runif(200*7,0,1),nc=7)) trueProb=1/(1+exp(-x[,1]-sin(2*pi*x[,2])-5*(x[,4]-0.4)^2)) y=rbinom(200,1,trueProb) B.Obj=cosso(x,y,family="Bin") plot.cosso(B.Obj,plottype="Path") plot.cosso(B.Obj,M=2,plottype="Func")
Make prediction for future observations or extract the model parameters at a particular smoothing parameter.
## S3 method for class 'cosso' predict(object,xnew,M,type=c("fit","coefficients","nonzero"),eps=1e-7,...)
## S3 method for class 'cosso' predict(object,xnew,M,type=c("fit","coefficients","nonzero"),eps=1e-7,...)
object |
a cosso object. |
xnew |
matrix of new values for |
M |
a smoothing parameter value. M should be taken between 0 and p. If not provided, a cross-validation procedure will be carried out to select an appropriate value. |
type |
if |
eps |
an effective zero, default is |
... |
additional arguments for predict function. |
The object returned depends on type.
When type="fit"
, predicted
function value will be given at the new design points xnew
.
When type="coefficients"
, three sets of coefficients will be returned.
Intercept |
the estimated intercept. If |
coefs |
the estimated coefficients for kernel representers. |
theta |
the estimated scale parameters for each functional component. |
When type="nonzero"
, a list of the indices of the nonconstant functional components will be returned.
Hao Helen Zhang and Chen-Yen Lin
## Gaussian set.seed(20130310) x=cbind(rbinom(200,1,.7),matrix(runif(200*7,0,1),nc=7)) y=x[,1]+sin(2*pi*x[,2])+5*(x[,4]-0.4)^2+rnorm(200,0,1) G.Obj=cosso(x,y,family="Gaussian") predict.cosso(G.Obj,M=2,type="nonzero") predict.cosso(G.Obj,xnew=x[1:3,],M=2,type="fit") ## Clean up rm(list=ls()) ## Not run: ## Binomial set.seed(20130310) x=cbind(rbinom(200,1,.7),matrix(runif(200*9,0,1),nc=9)) trueProb=1/(1+exp(-x[,1]-sin(2*pi*x[,2])-5*(x[,4]-0.4)^2)) y=rbinom(200,1,trueProb) B.Obj=cosso(x,y,family="Bin") f.hat=predict.cosso(B.Obj,xnew=x,M=2,type="fit") prob.hat=1/(1+exp(-f.hat)) ## Clean up rm(list=ls()) ## End(Not run)
## Gaussian set.seed(20130310) x=cbind(rbinom(200,1,.7),matrix(runif(200*7,0,1),nc=7)) y=x[,1]+sin(2*pi*x[,2])+5*(x[,4]-0.4)^2+rnorm(200,0,1) G.Obj=cosso(x,y,family="Gaussian") predict.cosso(G.Obj,M=2,type="nonzero") predict.cosso(G.Obj,xnew=x[1:3,],M=2,type="fit") ## Clean up rm(list=ls()) ## Not run: ## Binomial set.seed(20130310) x=cbind(rbinom(200,1,.7),matrix(runif(200*9,0,1),nc=9)) trueProb=1/(1+exp(-x[,1]-sin(2*pi*x[,2])-5*(x[,4]-0.4)^2)) y=rbinom(200,1,trueProb) B.Obj=cosso(x,y,family="Bin") f.hat=predict.cosso(B.Obj,xnew=x,M=2,type="fit") prob.hat=1/(1+exp(-f.hat)) ## Clean up rm(list=ls()) ## End(Not run)
A preliminary estimate is first obtained by fitting a smoothing spline ANOVA model,
and then use the inverse
-norm,
, as the initial weight for the
-th functional component.
SSANOVAwt(x,y,tau,family=c("Gaussian","Binomial","Cox","Quantile"),mscale=rep(1,ncol(x)), gamma=1,scale=FALSE,nbasis,basis.id,cpus)
SSANOVAwt(x,y,tau,family=c("Gaussian","Binomial","Cox","Quantile"),mscale=rep(1,ncol(x)), gamma=1,scale=FALSE,nbasis,basis.id,cpus)
x |
input matrix; the number of rows is sample size, the number of columns is the data dimension. The range of input variables is scaled to [0,1] for continuous variables. |
y |
response vector. Quantitative for |
tau |
the quantile to be estimated, a number strictly between 0 and 1. Argument required when |
family |
response type. Abbreviations are allowed. |
mscale |
scale parameter for the Gram matrix associated with each function component. Default is |
gamma |
power of inverse |
scale |
if |
nbasis |
number of "knots" to be selected. Ignored when |
basis.id |
index designating selected "knots". Argument is not valid if |
cpus |
number of available processor units. Default is |
The initial mean function is estimated via a smooothing spline objective function. In the SS-ANOVA model framework, the regression function is assumed to have an additive form
where denotes intercept and
denotes the main effect of the
-th covariate.
For "Gaussian"
response, the mean regression function is estimated by minimizing the objective function:
where RSS is residual sum of squares.
For "Binomial"
response, the regression function is estimated by minimizing the objective function:
For "Quantile"
regression model, the quantile function, is estimated by minimizing the objective function:
For "Cox"
regression model, the log-hazard function, is estimated by minimizing the objective function:
The smoothing parameter is tuned by 5-fold Cross-Validation, if
family="Gaussian"
, "Binomial"
or "Quantile"
,
and Approximate Cross-Validation, if family="Cox"
. But the smoothing parameters are given in the argument
mscale
.
The adaptive weights are then fiven by .
wt |
The adaptive weights. |
Hao Helen Zhang and Chen-Yen Lin
Storlie, C. B., Bondell, H. D., Reich, B. J. and Zhang, H. H. (2011) "Surface Estimation, Variable Selection, and the Nonparametric Oracle Property", Statistica Sinica, 21, 679–705.
## Adaptive COSSO Model ## Binomial set.seed(20130310) x=cbind(rbinom(200,1,.7),matrix(runif(200*7,0,1),nc=7)) trueProb=1/(1+exp(-x[,1]-sin(2*pi*x[,2])-5*(x[,4]-0.4)^2)) y=rbinom(200,1,trueProb) Binomial.wt=SSANOVAwt(x,y,family="Bin") ada.B.Obj=cosso(x,y,wt=Binomial.wt,family="Bin") ## Not run: ## Gaussian set.seed(20130310) x=cbind(rbinom(200,1,.7),matrix(runif(200*7,0,1),nc=7)) y=x[,1]+sin(2*pi*x[,2])+5*(x[,4]-0.4)^2+rnorm(200,0,1) Gaussian.wt=SSANOVAwt(designx,response,family="Gau") ada.G.Obj=cosso(x,y,wt=Gaussian.wt,family="Gaussian") ## End(Not run)
## Adaptive COSSO Model ## Binomial set.seed(20130310) x=cbind(rbinom(200,1,.7),matrix(runif(200*7,0,1),nc=7)) trueProb=1/(1+exp(-x[,1]-sin(2*pi*x[,2])-5*(x[,4]-0.4)^2)) y=rbinom(200,1,trueProb) Binomial.wt=SSANOVAwt(x,y,family="Bin") ada.B.Obj=cosso(x,y,wt=Binomial.wt,family="Bin") ## Not run: ## Gaussian set.seed(20130310) x=cbind(rbinom(200,1,.7),matrix(runif(200*7,0,1),nc=7)) y=x[,1]+sin(2*pi*x[,2])+5*(x[,4]-0.4)^2+rnorm(200,0,1) Gaussian.wt=SSANOVAwt(designx,response,family="Gau") ada.G.Obj=cosso(x,y,wt=Gaussian.wt,family="Gaussian") ## End(Not run)
Compute K-fold cross-validated score and plot cross-validated score against a grid values of smooth parameter M.
tune.cosso(object,folds=5,plot.it=TRUE)
tune.cosso(object,folds=5,plot.it=TRUE)
object |
a cosso object. |
folds |
number of folds for corss-validation. Default is |
plot.it |
if |
OptM |
the optimal smoothing parameter for M. |
M |
used tuning grid points. |
cvm |
the mean cross-validated error/minus log-likelihood. |
cvsd |
estimate of standard error of |
Hao Helen Zhang and Chen-Yen Lin
## Binomial set.seed(20130310) x=cbind(rbinom(150,1,.7),matrix(runif(150*5,0,1),nc=5)) trueProb=1/(1+exp(-x[,1]-sin(2*pi*x[,2])-5*(x[,4]-0.4)^2)) y=rbinom(150,1,trueProb) B.Obj=cosso(x,y,family="Bin",nbasis=30) tune.cosso(B.Obj,4,TRUE)
## Binomial set.seed(20130310) x=cbind(rbinom(150,1,.7),matrix(runif(150*5,0,1),nc=5)) trueProb=1/(1+exp(-x[,1]-sin(2*pi*x[,2])-5*(x[,4]-0.4)^2)) y=rbinom(150,1,trueProb) B.Obj=cosso(x,y,family="Bin",nbasis=30) tune.cosso(B.Obj,4,TRUE)
Randomized trial of two treatment regimens for lung cancer.
data(veteran)
data(veteran)
time | survival time |
status | censoring status |
trt | 0=standard 1=test |
celltype | 1=squamous, 2=smallcell, 3=adeno, 4=large. |
karno | Karnofsky performance score. minimum 10 and maximum 99 in original scale. |
diagtime | months from diagnosis to randomization. minimum 1 and maximum 87 in original scale. |
age | in years. minimum 34 and maximum 81 in original scale. |
prior | prior therapy 0=no, 1=yes. |
All the variables, except for the response, have been scaled to [0,1] interval. To transform back to the original scale, use the formula:
Kalbfleisch, J. and Prentice, R.L. (2002), The Statistical Analysis of Failure Time Data (Second Edition) Wiley: New Jersey.