Title: Expectile Regression in Reproducing Kernel Hilbert Space
Description: An efficient algorithm inspired by the majorization-minimization principle for solving the entire solution path of a flexible nonparametric expectile regression estimator constructed in a reproducing kernel Hilbert space.
Authors: Yi Yang, Teng Zhang, Hui Zou
Maintainer: Yi Yang <[email protected]>
License: GPL-2
Version: 1.0.0
Built: 2024-12-13 06:50:31 UTC
Source: CRAN
as.kernelMatrix in package KERE can be used to coerce matrix objects representing a kernel matrix into the kernelMatrix class. These matrices can then be used with the kernelMatrix interfaces that most of the functions in KERE support.
## S4 method for signature 'matrix'
as.kernelMatrix(x, center = FALSE)
x |
matrix to be assigned the kernelMatrix class. |
center |
center the kernel matrix in feature space (default: FALSE) |
Alexandros Karatzoglou
[email protected]
## Create toy data
x <- rbind(matrix(rnorm(10), , 2), matrix(rnorm(10, mean = 3), , 2))
y <- matrix(c(rep(1, 5), rep(-1, 5)))

### Use as.kernelMatrix to label the cov. matrix as a kernel matrix
### which is eq. to using a linear kernel
K <- as.kernelMatrix(crossprod(t(x)))
K
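The center argument can additionally center the kernel matrix in feature space before labeling it; a minimal sketch reusing the toy data above:

## optionally center the kernel matrix in feature space
Kc <- as.kernelMatrix(crossprod(t(x)), center = TRUE)
Kc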
Does k-fold cross-validation for KERE, produces a plot, and returns a value for lambda.
## S3 method for class 'KERE'
cv(x, y, kern, lambda = NULL, nfolds = 5, foldid, omega = 0.5, ...)
x |
matrix of predictors, of dimension N*p; each row is an observation. |
y |
response variable. |
kern |
the built-in kernel classes in KERE. Objects can be created by calling the rbfdot, polydot, tanhdot, vanilladot, anovadot, besseldot, laplacedot, splinedot functions etc. (see example.) |
lambda |
a user supplied lambda sequence. It is better to supply a decreasing sequence of lambda values; if not, the program will sort the user-defined lambda sequence in decreasing order automatically. |
nfolds |
number of folds - default is 5. Although nfolds can be as large as the sample size (leave-one-out CV), it is not recommended for large datasets. The smallest value allowable is nfolds = 3. |
foldid |
an optional vector of values between 1 and nfolds identifying which fold each observation is in. If supplied, nfolds can be missing. |
omega |
the parameter omega in the expectile regression model. The value must be in (0,1). Default is 0.5. |
... |
other arguments that can be passed to KERE. |
The function runs KERE nfolds+1 times; the first to get the lambda sequence, and then the remainder to compute the fit with each of the folds omitted. The average error and standard deviation over the folds are computed.
an object of class cv.KERE is returned, which is a list with the ingredients of the cross-validation fit.
lambda |
the values of lambda used in the fits. |
cvm |
the mean cross-validated error - a vector of length length(lambda). |
cvsd |
estimate of standard error of cvm. |
cvupper |
upper curve = cvm + cvsd. |
cvlo |
lower curve = cvm - cvsd. |
name |
a character string "Expectile Loss". |
lambda.min |
the optimal value of lambda that gives the minimum cross-validation error cvm.min. |
cvm.min |
the minimum cross-validation error. |
Yi Yang, Teng Zhang and Hui Zou
Maintainer: Yi Yang <[email protected]>
Y. Yang, T. Zhang, and H. Zou. "Flexible Expectile Regression in Reproducing Kernel Hilbert Space." ArXiv e-prints: stat.ME/1508.05987, August 2015.
N <- 200
X1 <- runif(N)
X2 <- 2*runif(N)
X3 <- 3*runif(N)
SNR <- 10  # signal-to-noise ratio
Y <- X1**1.5 + 2 * (X2**.5) + X1*X3
sigma <- sqrt(var(Y)/SNR)
Y <- Y + X2*rnorm(N, 0, sigma)
X <- cbind(X1, X2, X3)

# set gaussian kernel
kern <- rbfdot(sigma = 0.1)

# define lambda sequence
lambda <- exp(seq(log(0.5), log(0.01), len = 10))

cv.KERE(x = X, y = Y, kern, lambda = lambda, nfolds = 5, omega = 0.5)
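The returned list can be queried directly; a minimal sketch (rerunning the call above and storing it in a hypothetical object cv.fit) of extracting the selected regularization parameter:

## store the CV fit, then read off the selected lambda and its error
cv.fit <- cv.KERE(x = X, y = Y, kern, lambda = lambda, nfolds = 5, omega = 0.5)
cv.fit$lambda.min   # lambda giving the minimum cross-validated error
cv.fit$cvm.min      # the minimum cross-validation error itself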
The kernel generating functions provided in KERE.

The Gaussian RBF kernel k(x, x') = exp(-sigma ||x - x'||^2)

The Polynomial kernel k(x, x') = (scale <x, x'> + offset)^degree

The Linear kernel k(x, x') = <x, x'>

The Hyperbolic tangent kernel k(x, x') = tanh(scale <x, x'> + offset)

The Laplacian kernel k(x, x') = exp(-sigma ||x - x'||)

The Bessel kernel k(x, x') = (-Bessel_(nu+1)^n (sigma ||x - x'||^2))

The ANOVA RBF kernel k(x, x') = sum_{1 <= i_1 < ... < i_D <= N} prod_{d=1}^{D} k(x_{i_d}, x'_{i_d}) where k(x, x) is a Gaussian RBF kernel.

The Spline kernel prod_{d=1}^{D} (1 + x_d y_d + x_d y_d min(x_d, y_d) - (x_d + y_d)/2 min(x_d, y_d)^2 + min(x_d, y_d)^3/3)
rbfdot(sigma = 1)
polydot(degree = 1, scale = 1, offset = 1)
tanhdot(scale = 1, offset = 1)
vanilladot()
laplacedot(sigma = 1)
besseldot(sigma = 1, order = 1, degree = 1)
anovadot(sigma = 1, degree = 1)
splinedot()
sigma |
The inverse kernel width used by the Gaussian, the Laplacian, the Bessel and the ANOVA kernel |
degree |
The degree of the polynomial, Bessel or ANOVA kernel function. This has to be a positive integer. |
scale |
The scaling parameter of the polynomial and tangent kernel is a convenient way of normalizing patterns without the need to modify the data itself. |
offset |
The offset used in a polynomial or hyperbolic tangent kernel |
order |
The order of the Bessel function to be used as a kernel |
The kernel generating functions are used to initialize a kernel function which calculates the dot (inner) product between two feature vectors in a Hilbert space. These functions can be passed as a kern argument to almost all functions in KERE (e.g., KERE, cv.KERE etc.).
Although using one of the existing kernel functions as a kern argument in various functions in KERE has the advantage that optimized code is used to calculate various kernel expressions, any other function implementing a dot product of class kernel can also be used as a kern argument. This allows the user to use, test and develop special kernels for a given data set or algorithm.
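As a sketch of such a user-defined kernel (the function below is illustrative, not part of the package): any R function of two vector arguments returning a scalar can be labeled with class "kernel" and then used like the built-in generators:

## a hypothetical linear-plus-constant kernel, assigned class "kernel"
mykernel <- function(x, y) { sum(x * y) + 1 }
class(mykernel) <- "kernel"

## it now computes dot products like the built-in kernels
mykernel(rnorm(5), rnorm(5))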
Return an S4 object of class kernel which extends the function class. The resulting function implements the given kernel, calculating the inner (dot) product between two vectors.
kpar |
a list containing the kernel parameters (hyperparameters) used. |
The kernel parameters can be accessed by the kpar function.
If the offset in the Polynomial kernel is set to 0, we obtain homogeneous polynomial kernels; for positive values, we have inhomogeneous kernels. Note that for negative values the kernel does not satisfy Mercer's condition and thus the optimizers may fail.
In the Hyperbolic tangent kernel, if the offset is negative, the likelihood of obtaining a kernel matrix that is not positive definite is much higher (since then even some diagonal elements may be negative); hence, if this kernel has to be used, the offset should always be positive. Note, however, that this is no guarantee that the kernel will be positive definite.
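This caveat can be checked empirically; a small sketch that inspects the spectrum of a hyperbolic tangent kernel matrix on random data (the smallest eigenvalue can turn out negative, since the kernel is not guaranteed positive definite):

## tanh kernel matrices may fail to be positive semi-definite
x <- matrix(rnorm(20 * 3), 20, 3)
K <- kernelMatrix(tanhdot(scale = 1, offset = 1), x)
min(eigen(as.matrix(K), symmetric = TRUE, only.values = TRUE)$values)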
Alexandros Karatzoglou
[email protected]
kernelMatrix, kernelMult, kernelPol
rbfkernel <- rbfdot(sigma = 0.1)
rbfkernel
kpar(rbfkernel)

## create two vectors
x <- rnorm(10)
y <- rnorm(10)

## calculate dot product
rbfkernel(x, y)
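Since the Gaussian RBF kernel is defined as k(x, y) = exp(-sigma ||x - y||^2), the returned value can be verified against a direct computation; a quick sketch reusing the objects above:

## check rbfkernel(x, y) against the closed-form Gaussian RBF formula
all.equal(as.numeric(rbfkernel(x, y)),
          exp(-0.1 * sum((x - y)^2)))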
Fits a regularization path for the kernel expectile regression at a sequence of regularization parameters lambda.
KERE(x, y, kern, lambda = NULL, eps = 1e-08, maxit = 1e4,
  omega = 0.5, gamma = 1e-06, option = c("fast", "normal"))
x |
matrix of predictors, of dimension N*p; each row is an observation. |
y |
response variable. |
kern |
the built-in kernel classes in KERE. Objects can be created by calling the rbfdot, polydot, tanhdot, vanilladot, anovadot, besseldot, laplacedot, splinedot functions etc. (see example.) |
lambda |
a user supplied lambda sequence. It is better to supply a decreasing sequence of lambda values; if not, the program will sort the user-defined lambda sequence in decreasing order automatically. |
eps |
convergence threshold for the majorization minimization algorithm. Each majorization descent loop continues until the relative change in any coefficient is less than eps. Default is 1e-8. |
maxit |
maximum number of loop iterations allowed at fixed lambda value. Default is 1e4. If models do not converge, consider increasing maxit. |
omega |
the parameter omega in the expectile regression model. The value must be in (0,1). Default is 0.5. |
gamma |
a scalar number. If it is specified, the number will be added to each diagonal element of the kernel matrix as perturbation. The default is 1e-6. |
option |
users can choose which method to use to update the inverse matrix in the MM algorithm: "fast" (the default) or "normal". |
Note that the objective function in KERE is

Phi_omega(y - alpha_0 - K alpha) + lambda * alpha' K alpha,

where alpha_0 is the intercept, alpha is the solution vector, and K is the kernel matrix with K_{ij} = K(x_i, x_j). Here Phi_omega denotes the asymmetric squared (expectile) loss, which weights positive residuals by omega and negative residuals by 1 - omega. Users can specify the kernel function to use; options include the Radial Basis kernel, Polynomial kernel, Linear kernel, Hyperbolic tangent kernel, Laplacian kernel, Bessel kernel, ANOVA RBF kernel, and the Spline kernel. Users can also tweak the penalty by choosing a different lambda.
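For concreteness, the expectile loss and the resulting objective can be written out in a few lines of R; this is a hedged sketch for illustration (the function names and the 1/N scaling of the loss term are assumptions, not package internals):

## asymmetric squared error loss: omega weights positive residuals,
## (1 - omega) weights negative ones
expectile.loss <- function(t, omega = 0.5) {
  ifelse(t > 0, omega * t^2, (1 - omega) * t^2)
}

## the penalized objective for a given intercept alpha0, coefficient
## vector alpha, kernel matrix K, response y, and penalty lambda
kere.objective <- function(alpha0, alpha, K, y, lambda, omega = 0.5) {
  r <- y - alpha0 - drop(K %*% alpha)              # residuals
  mean(expectile.loss(r, omega)) +
    lambda * drop(crossprod(alpha, K %*% alpha))   # RKHS penalty
}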
For computing speed, if models are not converging or are running slowly, consider increasing eps before increasing maxit.
An object with S3 class KERE.
call |
the call that produced this object. |
alpha |
a matrix of fitted coefficients, with one column for each lambda value. |
lambda |
the actual sequence of lambda values used. |
npass |
total number of loop iterations corresponding to each lambda value. |
jerr |
error flag, for warnings and errors, 0 if no error. |
Yi Yang, Teng Zhang and Hui Zou
Maintainer: Yi Yang <[email protected]>
Y. Yang, T. Zhang, and H. Zou. "Flexible Expectile Regression in Reproducing Kernel Hilbert Space." ArXiv e-prints: stat.ME/1508.05987, August 2015.
# create data
N <- 200
X1 <- runif(N)
X2 <- 2*runif(N)
X3 <- 3*runif(N)
SNR <- 10  # signal-to-noise ratio
Y <- X1**1.5 + 2 * (X2**.5) + X1*X3
sigma <- sqrt(var(Y)/SNR)
Y <- Y + X2*rnorm(N, 0, sigma)
X <- cbind(X1, X2, X3)

# set gaussian kernel
kern <- rbfdot(sigma = 0.1)

# define lambda sequence
lambda <- exp(seq(log(0.5), log(0.01), len = 10))

# run KERE
m1 <- KERE(x = X, y = Y, kern = kern, lambda = lambda, omega = 0.5)

# plot the solution paths
plot(m1)
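The components listed above can be inspected on the fitted object; a brief sketch continuing the example:

## inspect the fit: one coefficient column per lambda value
dim(m1$alpha)   # coefficient matrix across the lambda sequence
m1$lambda       # the lambda sequence actually used
m1$npass        # loop iterations per lambda value
m1$jerr         # 0 indicates no error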
The built-in kernel classes in KERE

Objects can be created by calls of the form new("rbfkernel"), new("polykernel"), new("tanhkernel"), new("vanillakernel"), new("anovakernel"), new("besselkernel"), new("laplacekernel"), new("splinekernel"), or by calling the rbfdot, polydot, tanhdot, vanilladot, anovadot, besseldot, laplacedot, splinedot functions etc.
.Data: Object of class "function" containing the kernel function

kpar: Object of class "list" containing the kernel parameters
Class "kernel"
, directly.
Class "function"
, by class "kernel"
.
kernelMatrix: signature(kernel = "rbfkernel", x = "matrix"): computes the kernel matrix

kernelPol: signature(kernel = "rbfkernel", x = "matrix"): computes the quadratic kernel expression

kernelMult: signature(kernel = "rbfkernel", x = "matrix"): computes the kernel expansion

kernelFast: signature(kernel = "rbfkernel", x = "matrix", a): computes parts or the full kernel matrix, mainly used in kernel algorithms where columns of the kernel matrix are computed per invocation
Alexandros Karatzoglou
[email protected]
rbfkernel <- rbfdot(sigma = 0.1)
rbfkernel
is(rbfkernel)
kpar(rbfkernel)
kernelMatrix calculates the kernel matrix K_{ij} = k(x_i, x_j) or K_{ij} = k(x_i, y_j).

kernelPol computes the quadratic kernel expression H_{ij} = z_i z_j k(x_i, x_j) or H_{ij} = z_i k_j k(x_i, y_j).

kernelMult calculates the kernel expansion f(x_i) = sum_{j=1}^{m} z_j k(x_i, x_j).

kernelFast computes the kernel matrix, identical to kernelMatrix, except that it also requires the squared norm of the first argument as additional input, useful in iterative kernel matrix calculations.
## S4 method for signature 'kernel'
kernelMatrix(kernel, x, y = NULL)

## S4 method for signature 'kernel'
kernelPol(kernel, x, y = NULL, z, k = NULL)

## S4 method for signature 'kernel'
kernelMult(kernel, x, y = NULL, z, blocksize = 256)

## S4 method for signature 'kernel'
kernelFast(kernel, x, y, a)
kernel |
the kernel function to be used to calculate the kernel matrix. This has to be a function of class kernel. |
x |
a data matrix to be used to calculate the kernel matrix. |
y |
second data matrix to calculate the kernel matrix. |
z |
a suitable vector or matrix of coefficients. |
k |
a suitable vector or matrix of coefficients. |
a |
the squared norm of x, e.g., rowSums(x^2). |
blocksize |
the kernel expansion computations are done blockwise to avoid storing the kernel matrix in memory. |
Common functions used during kernel-based computations.

The kernel parameter can be set to any function, of class kernel, which computes the inner product in feature space between two vector arguments. KERE provides the most popular kernel functions, which can be initialized by using the following functions:
rbfdot - Radial Basis kernel function
polydot - Polynomial kernel function
vanilladot - Linear kernel function
tanhdot - Hyperbolic tangent kernel function
laplacedot - Laplacian kernel function
besseldot - Bessel kernel function
anovadot - ANOVA RBF kernel function
splinedot - the Spline kernel

(see example.)
kernelFast is mainly used in situations where columns of the kernel matrix are computed per invocation. In these cases, evaluating the norm of each row-entry over and over again would cause significant computational overhead.
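A minimal sketch of this pattern: precompute the squared row norms of x once and pass them to kernelFast on every call, instead of recomputing them for each kernel matrix evaluation:

## precompute squared norms once, reuse across kernelFast calls
x <- matrix(rnorm(10 * 2), 10, 2)
rbf <- rbfdot(sigma = 0.05)
a <- rowSums(x^2)               # squared norm of each row of x
K <- kernelFast(rbf, x, x, a)   # same result as kernelMatrix(rbf, x)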
kernelMatrix returns a symmetric positive semi-definite matrix. kernelPol returns a matrix. kernelMult usually returns a one-column matrix.
Alexandros Karatzoglou
[email protected]
rbfdot, polydot, tanhdot, vanilladot
## create random data
x <- matrix(rnorm(10 * 10), 10, 10)

## initialize kernel function
rbf <- rbfdot(sigma = 0.05)
rbf

## calculate kernel matrix
kernelMatrix(rbf, x)

y <- matrix(rnorm(10 * 1), 10, 1)

## calculate the quadratic kernel expression
kernelPol(rbf, x, , y)

## calculate the kernel expansion
kernelMult(rbf, x, , y)
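As the definitions above suggest, kernelMult computes the same quantity as multiplying the full kernel matrix by z, just blockwise; a quick consistency check reusing rbf, x and y from the example:

## kernelMult agrees with the explicit product K %*% y
all.equal(as.numeric(kernelMult(rbf, x, , y)),
          as.numeric(kernelMatrix(rbf, x) %*% y))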
Produces a coefficient profile plot of the coefficient paths for a fitted KERE object.
## S3 method for class 'KERE'
plot(x, ...)
x |
fitted KERE model. |
... |
other graphical parameters to plot. |
A coefficient profile plot is produced. The x-axis is log(lambda). The y-axis is the value of the fitted alpha's.
Yi Yang, Teng Zhang and Hui Zou
Maintainer: Yi Yang <[email protected]>
Y. Yang, T. Zhang, and H. Zou. "Flexible Expectile Regression in Reproducing Kernel Hilbert Space." ArXiv e-prints: stat.ME/1508.05987, August 2015.
# create data
N <- 200
X1 <- runif(N)
X2 <- 2*runif(N)
X3 <- 3*runif(N)
SNR <- 10  # signal-to-noise ratio
Y <- X1**1.5 + 2 * (X2**.5) + X1*X3
sigma <- sqrt(var(Y)/SNR)
Y <- Y + X2*rnorm(N, 0, sigma)
X <- cbind(X1, X2, X3)

# set gaussian kernel
kern <- rbfdot(sigma = 0.1)

# define lambda sequence
lambda <- exp(seq(log(0.5), log(0.01), len = 10))

# run KERE
m1 <- KERE(x = X, y = Y, kern = kern, lambda = lambda, omega = 0.5)

# plot the solution paths
plot(m1)
Similar to other predict methods, this function predicts fitted values from a fitted KERE object.
## S3 method for class 'KERE'
predict(object, kern, x, newx, ...)
object |
fitted KERE model object. |
kern |
the built-in kernel classes in KERE. Objects can be created by calling the rbfdot, polydot, tanhdot, vanilladot, anovadot, besseldot, laplacedot, splinedot functions etc. (see example.) |
x |
the original design matrix used for training KERE. |
newx |
matrix of new values for x at which predictions are to be made. |
... |
other parameters to the predict function. |
The fitted value f(newx) = alpha_0 + K(newx, x) alpha is returned as a size nrow(newx)*length(lambda) matrix for the various lambda values at which the KERE model was fitted.
The fitted values are returned as a size nrow(newx)*length(lambda) matrix. The rows index the observations of newx; the columns index the lambda sequence.
Yi Yang, Teng Zhang and Hui Zou
Maintainer: Yi Yang <[email protected]>
Y. Yang, T. Zhang, and H. Zou. "Flexible Expectile Regression in Reproducing Kernel Hilbert Space." ArXiv e-prints: stat.ME/1508.05987, August 2015.
# create data
N <- 100
X1 <- runif(N)
X2 <- 2*runif(N)
X3 <- 3*runif(N)
SNR <- 10  # signal-to-noise ratio
Y <- X1**1.5 + 2 * (X2**.5) + X1*X3
sigma <- sqrt(var(Y)/SNR)
Y <- Y + X2*rnorm(N, 0, sigma)
X <- cbind(X1, X2, X3)

# set gaussian kernel
kern <- rbfdot(sigma = 0.1)

# define lambda sequence
lambda <- exp(seq(log(0.5), log(0.01), len = 10))

# run KERE
m1 <- KERE(x = X, y = Y, kern = kern, lambda = lambda, omega = 0.5)

# create newx for prediction
N1 <- 5
X1 <- runif(N1)
X2 <- 2*runif(N1)
X3 <- 3*runif(N1)
newx <- cbind(X1, X2, X3)

# make prediction
p1 <- predict.KERE(m1, kern, X, newx)
p1
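In practice the prediction column is usually chosen by cross-validation; a hedged sketch combining cv.KERE with the prediction matrix p1 above:

## select the lambda with the smallest cross-validated error
cv.fit <- cv.KERE(x = X, y = Y, kern, lambda = lambda, nfolds = 5, omega = 0.5)
best <- which.min(cv.fit$cvm)   # index into the lambda sequence
p1[, best]                      # predictions at the selected lambda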