Package 'KERE'

Title: Expectile Regression in Reproducing Kernel Hilbert Space
Description: An efficient algorithm inspired by majorization-minimization principle for solving the entire solution path of a flexible nonparametric expectile regression estimator constructed in a reproducing kernel Hilbert space.
Authors: Yi Yang, Teng Zhang, Hui Zou
Maintainer: Yi Yang <>
License: GPL-2
Version: 1.0.0
Built: 2025-02-11 06:50:38 UTC
Source: CRAN

Help Index

Assign kernelMatrix class to matrix objects


as.kernelMatrix in package KERE assigns the kernelMatrix class to matrix objects that represent a kernel matrix. Such matrices can then be used with the kernelMatrix interfaces that most of the functions in KERE support.


## S4 method for signature 'matrix'
as.kernelMatrix(x, center = FALSE)



matrix to be assigned the kernelMatrix class


center the kernel matrix in feature space (default: FALSE)


Alexandros Karatzoglou

See Also

kernelMatrix, dots


## Create toy data
x <- rbind(matrix(rnorm(10),,2),matrix(rnorm(10,mean=3),,2))
y <- matrix(c(rep(1,5),rep(-1,5)))

### Use as.kernelMatrix to label the cross-product matrix x %*% t(x)
### as a kernel matrix, which is equivalent to using a linear kernel

K <- as.kernelMatrix(crossprod(t(x)))
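
The effect of the center argument can be sketched in base R. This is only an illustration under an assumption: centering in feature space is taken to mean the usual double-centering H K H with H = I - 11'/n, which may differ in detail from what as.kernelMatrix does internally. The helper name center_kernel is hypothetical.

```r
# Toy data: 10 observations in 2 dimensions
set.seed(1)
x <- rbind(matrix(rnorm(10), , 2), matrix(rnorm(10, mean = 3), , 2))

# Linear-kernel Gram matrix: K[i, j] = <x_i, x_j>
K <- tcrossprod(x)   # same values as crossprod(t(x))

# Hypothetical helper: double-center K, i.e. H %*% K %*% H with H = I - 11'/n
center_kernel <- function(K) {
  n <- nrow(K)
  H <- diag(n) - matrix(1 / n, n, n)
  H %*% K %*% H
}

Kc <- center_kernel(K)
```

For a linear kernel, Kc equals the Gram matrix of the column-centered data, so its rows and columns sum to zero.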


Cross-validation for KERE


Does k-fold cross-validation for KERE, produces a plot, and returns a value for lambda.


## S3 method for class 'KERE'
cv(x, y, kern, lambda = NULL, nfolds = 5, foldid, omega = 0.5, ...)



matrix of predictors, of dimension $N \times p$; each row is an observation vector.


response variable.


the built-in kernel classes in KERE. The kern parameter can be set to any function, of class kernel, which computes the inner product in feature space between two vector arguments. KERE provides the most popular kernel functions which can be initialized by using the following functions:

  • rbfdot Radial Basis kernel function,

  • polydot Polynomial kernel function,

  • vanilladot Linear kernel function,

  • tanhdot Hyperbolic tangent kernel function,

  • laplacedot Laplacian kernel function,

  • besseldot Bessel kernel function,

  • anovadot ANOVA RBF kernel function,

  • splinedot the Spline kernel.

Objects can be created by calling the rbfdot, polydot, tanhdot, vanilladot, anovadot, besseldot, laplacedot, splinedot functions etc. (see example.)


a user-supplied lambda sequence. It is better to supply a decreasing sequence of lambda values; if the sequence is not decreasing, the program automatically sorts it in decreasing order.


number of folds - default is 5. Although nfolds can be as large as the sample size (leave-one-out CV), it is not recommended for large datasets. Smallest value allowable is nfolds=3.


an optional vector of values between 1 and nfolds identifying what fold each observation is in. If supplied, nfolds can be missing.


the parameter $\omega$ in the expectile regression model. The value must be in (0,1). Default is 0.5.


other arguments that can be passed to KERE.


The function runs KERE nfolds+1 times; the first to get the lambda sequence, and then the remainder to compute the fit with each of the folds omitted. The average error and standard deviation over the folds are computed.
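
The fold bookkeeping described above can be sketched in base R. Several things here are assumptions made for illustration: folds are assigned by a balanced random permutation, a deliberately trivial stand-in model (the mean of the training folds) replaces the KERE fit, and the standard error is taken as the fold standard deviation divided by sqrt(nfolds). The names expectile_loss and fold_err are hypothetical.

```r
set.seed(1)
N <- 20
nfolds <- 5
omega <- 0.5
y <- rnorm(N)

# Balanced random fold labels in 1..nfolds (the kind of vector foldid holds)
foldid <- sample(rep(seq_len(nfolds), length.out = N))

# Asymmetric squared (expectile) loss of residuals r
expectile_loss <- function(r, omega) ifelse(r > 0, omega, 1 - omega) * r^2

# Stand-in "model": predict the mean of the training folds (NOT a KERE fit)
fold_err <- sapply(seq_len(nfolds), function(k) {
  test <- foldid == k
  pred <- mean(y[!test])
  mean(expectile_loss(y[test] - pred, omega))
})

cvm  <- mean(fold_err)                # mean cross-validated error
cvsd <- sd(fold_err) / sqrt(nfolds)   # assumed form of the standard error
```

In the real procedure this computation would be repeated for every value in the lambda sequence, giving one cvm and one cvsd per lambda.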


an object of class cv.KERE is returned, which is a list with the ingredients of the cross-validation fit.


the values of lambda used in the fits.


the mean cross-validated error - a vector of length length(lambda).


estimate of standard error of cvm.


upper curve = cvm+cvsd.


lower curve = cvm-cvsd.


a character string "Expectile Loss"


the optimal value of lambda that gives minimum cross validation error cvm.


the minimum cross validation error cvm.
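
How the returned quantities relate to one another can be sketched with made-up numbers. cvm and cvsd are named in the text above; the other variable names (cvupper, cvlo, lambda.min, cvm.min) are assumed here for illustration, and the values are invented rather than produced by a KERE fit.

```r
# Hypothetical cross-validation results for four lambda values
lambda <- c(0.5, 0.1, 0.05, 0.01)
cvm    <- c(0.90, 0.40, 0.35, 0.50)   # made-up mean CV errors
cvsd   <- c(0.10, 0.05, 0.04, 0.08)   # made-up standard errors

cvupper <- cvm + cvsd                  # upper curve
cvlo    <- cvm - cvsd                  # lower curve

lambda.min <- lambda[which.min(cvm)]   # lambda with the smallest cvm
cvm.min    <- min(cvm)                 # the smallest cvm itself
```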


Yi Yang, Teng Zhang and Hui Zou
Maintainer: Yi Yang <>


Y. Yang, T. Zhang, and H. Zou. "Flexible Expectile Regression in Reproducing Kernel Hilbert Space." ArXiv e-prints: stat.ME/1508.05987, August 2015.


N <- 200
X1 <- runif(N)
X2 <- 2*runif(N)
X3 <- 3*runif(N)
SNR <- 10 # signal-to-noise ratio
Y <- X1**1.5 + 2 * (X2**.5) + X1*X3
sigma <- sqrt(var(Y)/SNR)
Y <- Y + X2*rnorm(N,0,sigma)
X <- cbind(X1,X2,X3)

# set gaussian kernel 
kern <- rbfdot(sigma=0.1)

# define lambda sequence
lambda <- exp(seq(log(0.5),log(0.01),len=10))

cv.KERE(x=X, y=Y, kern, lambda = lambda, nfolds = 5, omega = 0.5)

Kernel Functions


The kernel generating functions provided in KERE.
The Gaussian RBF kernel $k(x,x') = \exp(-\sigma \|x - x'\|^2)$
The Polynomial kernel $k(x,x') = (scale \langle x, x' \rangle + offset)^{degree}$
The Linear kernel $k(x,x') = \langle x, x' \rangle$
The Hyperbolic tangent kernel $k(x,x') = \tanh(scale \langle x, x' \rangle + offset)$
The Laplacian kernel $k(x,x') = \exp(-\sigma \|x - x'\|)$
The Bessel kernel $k(x,x') = (- Bessel_{(\nu+1)}^n \sigma \|x - x'\|^2)$
The ANOVA RBF kernel $k(x,x') = \sum_{1\leq i_1 \ldots < i_D \leq N} \prod_{d=1}^D k(x_{id}, {x'}_{id})$ where $k(x,x)$ is a Gaussian RBF kernel.
The Spline kernel $\prod_{d=1}^D 1 + x_i x_j + x_i x_j \min(x_i, x_j) - \frac{x_i + x_j}{2} \min(x_i,x_j)^2 + \frac{\min(x_i,x_j)^3}{3}$
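
To make the formulas concrete, here is a minimal base R sketch of four of the kernels listed above, written as plain functions of two numeric vectors. These are illustrative re-implementations with hypothetical names (rbf_kernel and so on), not the optimized kernel objects returned by rbfdot, polydot, tanhdot or laplacedot.

```r
# Gaussian RBF: exp(-sigma * ||x - y||^2)
rbf_kernel <- function(x, y, sigma = 1) {
  exp(-sigma * sum((x - y)^2))
}

# Laplacian: exp(-sigma * ||x - y||)
laplace_kernel <- function(x, y, sigma = 1) {
  exp(-sigma * sqrt(sum((x - y)^2)))
}

# Polynomial: (scale * <x, y> + offset)^degree
poly_kernel <- function(x, y, degree = 1, scale = 1, offset = 1) {
  (scale * sum(x * y) + offset)^degree
}

# Hyperbolic tangent: tanh(scale * <x, y> + offset)
tanh_kernel <- function(x, y, scale = 1, offset = 1) {
  tanh(scale * sum(x * y) + offset)
}

x <- c(1, 2)
y <- c(3, 0)
rbf_kernel(x, y, sigma = 0.1)   # exp(-0.1 * 8), since ||x - y||^2 = 8
poly_kernel(x, y, degree = 2)   # (3 + 1)^2 = 16
```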


rbfdot(sigma = 1)

polydot(degree = 1, scale = 1, offset = 1)

tanhdot(scale = 1, offset = 1)


laplacedot(sigma = 1)

besseldot(sigma = 1, order = 1, degree = 1)

anovadot(sigma = 1, degree = 1)




The inverse kernel width used by the Gaussian, the Laplacian, the Bessel and the ANOVA kernel


The degree of the polynomial, Bessel or ANOVA kernel function. This has to be a positive integer.


The scaling parameter of the polynomial and hyperbolic tangent kernel. It is a convenient way of normalizing patterns without the need to modify the data itself.


The offset used in a polynomial or hyperbolic tangent kernel


The order of the Bessel function to be used as a kernel


The kernel generating functions are used to initialize a kernel function which calculates the dot (inner) product between two feature vectors in a Hilbert space. These functions can be passed as the kern argument to the fitting functions in KERE (e.g., KERE and cv.KERE).

Although using one of the existing kernel functions as a kernel argument in various functions in KERE has the advantage that optimized code is used to calculate various kernel expressions, any other function implementing a dot product of class kernel can also be used as a kernel argument. This allows the user to use, test and develop special kernels for a given data set or algorithm.


Returns an S4 object of class kernel which extends the function class. The resulting function implements the given kernel, calculating the inner (dot) product between two vectors.


a list containing the kernel parameters (hyperparameters) used.

The kernel parameters can be accessed by the kpar function.


If the offset in the Polynomial kernel is set to 0, we obtain homogeneous polynomial kernels; for positive values, we obtain inhomogeneous kernels. Note that for negative values the kernel does not satisfy Mercer's condition, and thus the optimizers may fail.

In the Hyperbolic tangent kernel if the offset is negative the likelihood of obtaining a kernel matrix that is not positive definite is much higher (since then even some diagonal elements may be negative), hence if this kernel has to be used, the offset should always be positive. Note, however, that this is no guarantee that the kernel will be positive.
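
The warning about negative offsets can be checked on a tiny example. The sketch below builds the hyperbolic tangent kernel matrix by hand in base R for two one-dimensional points; tanh_gram is a hypothetical helper, not a KERE function.

```r
# Two one-dimensional points
x <- c(0, 1)

# Hyperbolic tangent kernel matrix: tanh(scale * x_i * x_j + offset)
tanh_gram <- function(x, scale = 1, offset = 1) {
  tanh(scale * outer(x, x) + offset)
}

K_neg <- tanh_gram(x, offset = -1)

# The first diagonal entry is tanh(-1), which is negative
diag(K_neg)

# The determinant is -tanh(-1)^2 < 0, so one eigenvalue must be negative
eigen(K_neg, symmetric = TRUE)$values
```

A matrix with a negative diagonal entry cannot be positive semi-definite, which is why such a kernel matrix can make the optimizer fail.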


Alexandros Karatzoglou

See Also

kernelMatrix, kernelMult, kernelPol


rbfkernel <- rbfdot(sigma = 0.1)


## create two vectors
x <- rnorm(10)
y <- rnorm(10)

## calculate dot product
rbfkernel(x, y)

Fits the regularization paths for the kernel expectile regression.


Fits a regularization path for the kernel expectile regression at a sequence of regularization parameters lambda.


KERE(x, y, kern, lambda = NULL, eps = 1e-08, maxit = 1e4,
omega = 0.5, gamma = 1e-06, option = c("fast", "normal"))



matrix of predictors, of dimension $N \times p$; each row is an observation vector.


response variable.


the built-in kernel classes in KERE. The kern parameter can be set to any function, of class kernel, which computes the inner product in feature space between two vector arguments. KERE provides the most popular kernel functions which can be initialized by using the following functions:

  • rbfdot Radial Basis kernel function,

  • polydot Polynomial kernel function,

  • vanilladot Linear kernel function,

  • tanhdot Hyperbolic tangent kernel function,

  • laplacedot Laplacian kernel function,

  • besseldot Bessel kernel function,

  • anovadot ANOVA RBF kernel function,

  • splinedot the Spline kernel.

Objects can be created by calling the rbfdot, polydot, tanhdot, vanilladot, anovadot, besseldot, laplacedot, splinedot functions etc. (see example.)


a user-supplied lambda sequence. It is better to supply a decreasing sequence of lambda values; if the sequence is not decreasing, the program automatically sorts it in decreasing order.


convergence threshold for the majorization-minimization algorithm. Each majorization descent loop continues until the relative change in the coefficients, $\|\alpha(new) - \alpha(old)\|_2^2 / \|\alpha(old)\|_2^2$, is less than eps. Default value is 1e-8.


maximum number of loop iterations allowed at fixed lambda value. Default is 1e4. If models do not converge, consider increasing maxit.


the parameter $\omega$ in the expectile regression model. The value must be in (0,1). Default is 0.5.


a scalar number. If it is specified, the number will be added to each diagonal element of the kernel matrix as perturbation. The default is 1e-06.


users can choose which method to use to update the inverse matrix in the MM algorithm. "fast" uses a trick described in Yang, Zhang and Zou (2015) to update estimates for each lambda. "normal" uses a naive way for the computation.
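
The stopping rule described for eps can be sketched as a small base R function. This is an illustration of the stated criterion only, not the package's internal code; rel_change is a hypothetical name, and the sketch does not handle the case where the old coefficient vector is exactly zero.

```r
# Relative squared change between successive coefficient vectors:
# ||alpha_new - alpha_old||_2^2 / ||alpha_old||_2^2
rel_change <- function(alpha_new, alpha_old) {
  sum((alpha_new - alpha_old)^2) / sum(alpha_old^2)
}

eps <- 1e-8
alpha_old <- c(1.0, -2.0, 0.5)
alpha_new <- alpha_old + 1e-6   # a tiny update to every coefficient

# The change is about 3e-12 / 5.25, far below eps, so a loop would stop here
rel_change(alpha_new, alpha_old) < eps
```

Because the criterion is relative, a larger eps stops the loop earlier at the cost of less precise coefficients.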


Note that the objective function in KERE is

$$Loss(y - \alpha_0 - K \alpha) + \lambda \alpha^T K \alpha,$$

where $\alpha_0$ is the intercept, $\alpha$ is the solution vector, and $K$ is the kernel matrix with $K_{ij} = K(x_i, x_j)$. Users can specify the kernel function to use; options include the Radial Basis, Polynomial, Linear, Hyperbolic tangent, Laplacian, Bessel, ANOVA RBF and Spline kernels. Users can also tune the penalty by choosing different values of $\lambda$.

For faster computation, if models are not converging or are running slowly, consider increasing eps before increasing maxit.
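
The objective above can be evaluated directly in base R. The sketch below assumes that Loss is the standard asymmetric squared (expectile) loss, with weight omega on positive residuals and 1 - omega on the rest, and that the loss is summed over observations; whether the package sums or averages is not stated here. The names expectile_loss and kere_objective are hypothetical.

```r
# Asymmetric squared loss: weight omega on positive residuals, 1 - omega otherwise
expectile_loss <- function(r, omega = 0.5) {
  stopifnot(omega > 0, omega < 1)
  ifelse(r > 0, omega, 1 - omega) * r^2
}

# Penalized objective for intercept alpha0, coefficients alpha,
# kernel matrix K and penalty lambda (loss summed over observations)
kere_objective <- function(y, K, alpha0, alpha, lambda, omega = 0.5) {
  r <- y - alpha0 - as.vector(K %*% alpha)
  sum(expectile_loss(r, omega)) + lambda * as.numeric(t(alpha) %*% K %*% alpha)
}

# Tiny example: identity kernel matrix, all coefficients zero
y <- c(1, -1)
K <- diag(2)
kere_objective(y, K, alpha0 = 0, alpha = c(0, 0), lambda = 1)
```

With omega = 0.5 the loss is half the squared error, so the fit targets the conditional mean; values of omega closer to 0 or 1 penalize one sign of residual more heavily.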


An object with S3 class KERE.


the call that produced this object.


a nrow(x)*length(lambda) matrix of coefficients. Each column is a solution vector corresponding to a lambda value in the lambda sequence.


the actual sequence of lambda values used.


total number of loop iterations corresponding to each lambda value.


error flag, for warnings and errors, 0 if no error.


Yi Yang, Teng Zhang and Hui Zou
Maintainer: Yi Yang <>


Y. Yang, T. Zhang, and H. Zou. "Flexible Expectile Regression in Reproducing Kernel Hilbert Space." ArXiv e-prints: stat.ME/1508.05987, August 2015.


# create data
N <- 200
X1 <- runif(N)
X2 <- 2*runif(N)
X3 <- 3*runif(N)
SNR <- 10 # signal-to-noise ratio
Y <- X1**1.5 + 2 * (X2**.5) + X1*X3
sigma <- sqrt(var(Y)/SNR)
Y <- Y + X2*rnorm(N,0,sigma)
X <- cbind(X1,X2,X3)

# set gaussian kernel 
kern <- rbfdot(sigma=0.1)

# define lambda sequence
lambda <- exp(seq(log(0.5),log(0.01),len=10))

# run KERE
m1 <- KERE(x=X, y=Y, kern=kern, lambda = lambda, omega = 0.5) 

# plot the solution paths
plot(m1)

Class "kernel", "rbfkernel", "polykernel", "tanhkernel", "vanillakernel"


The built-in kernel classes in KERE

Objects from the Class

Objects can be created by calls of the form new("rbfkernel"), new("polykernel"), new("tanhkernel"), new("vanillakernel"), new("anovakernel"), new("besselkernel"), new("laplacekernel"), new("splinekernel") or by calling the rbfdot, polydot, tanhdot, vanilladot, anovadot, besseldot, laplacedot, splinedot functions.



Object of class "function" containing the kernel function


Object of class "list" containing the kernel parameters


Class "kernel", directly. Class "function", by class "kernel".



signature(kernel = "rbfkernel", x = "matrix"): computes the kernel matrix


signature(kernel = "rbfkernel", x = "matrix"): computes the quadratic kernel expression


signature(kernel = "rbfkernel", x = "matrix"): computes the kernel expansion


signature(kernel = "rbfkernel", x = "matrix"),,a: computes parts or the full kernel matrix, mainly used in kernel algorithms where columns of the kernel matrix are computed per invocation


Alexandros Karatzoglou

See Also



rbfkernel <- rbfdot(sigma = 0.1)

Kernel Matrix functions


kernelMatrix calculates the kernel matrix $K_{ij} = k(x_i,x_j)$ or $K_{ij} = k(x_i,y_j)$.
kernelPol computes the quadratic kernel expression $H_{ij} = z_i z_j k(x_i,x_j)$ or $H_{ij} = z_i k_j k(x_i,y_j)$.
kernelMult calculates the kernel expansion $f(x_i) = \sum_{j=1}^m z_j k(x_i,x_j)$.
kernelFast computes the kernel matrix, identical to kernelMatrix, except that it also requires the squared norm of the first argument as additional input, useful in iterative kernel matrix calculations.
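
The three quantities described above can be written out in a few lines of base R. This is a sketch of the definitions for a Gaussian RBF kernel with a single data matrix, not a replacement for kernelMatrix, kernelPol or kernelMult; all variable names are chosen for illustration.

```r
set.seed(1)
x <- matrix(rnorm(12), 4, 3)   # 4 observations, 3 variables
z <- c(1, -1, 1, -1)           # one weight per observation
sigma <- 0.05

# Kernel matrix: K[i, j] = exp(-sigma * ||x_i - x_j||^2)
K <- exp(-sigma * as.matrix(dist(x))^2)

# Quadratic kernel expression: H[i, j] = z_i * z_j * K[i, j]
H <- outer(z, z) * K

# Kernel expansion: f[i] = sum_j z_j * K[i, j], a one-column matrix
f <- K %*% z
```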


## S4 method for signature 'kernel'
kernelMatrix(kernel, x, y = NULL)

## S4 method for signature 'kernel'
kernelPol(kernel, x, y = NULL, z, k = NULL)

## S4 method for signature 'kernel'
kernelMult(kernel, x, y = NULL, z, blocksize = 256)

## S4 method for signature 'kernel'
kernelFast(kernel, x, y, a)



the kernel function to be used to calculate the kernel matrix. This has to be a function of class kernel, i.e., one generated either by one of the built-in kernel generating functions (e.g., rbfdot etc.) or a user-defined function of class kernel taking two vector arguments and returning a scalar.


a data matrix to be used to calculate the kernel matrix.


second data matrix to calculate the kernel matrix.


a suitable vector or matrix


a suitable vector or matrix


the squared norm of x, e.g., rowSums(x^2)


the kernel expansion computations are done block wise to avoid storing the kernel matrix into memory. blocksize defines the size of the computational blocks.
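
The idea behind blocksize can be sketched in base R: the kernel expansion is accumulated one block of rows at a time, so only a slice of the kernel matrix exists in memory at once. A linear kernel is used for brevity, and kernel_rows is a hypothetical helper; the actual blocking inside kernelMult may differ.

```r
set.seed(1)
x <- matrix(rnorm(50), 10, 5)   # 10 observations
z <- rnorm(10)                  # expansion weights
blocksize <- 4

# Rows `rows` of the linear kernel matrix, computed on demand
kernel_rows <- function(rows) x[rows, , drop = FALSE] %*% t(x)

f <- numeric(nrow(x))
for (start in seq(1, nrow(x), by = blocksize)) {
  rows <- start:min(start + blocksize - 1, nrow(x))
  # Only a length(rows) x n slice of the kernel matrix is held here
  f[rows] <- kernel_rows(rows) %*% z
}
```

The result is the same as forming the full kernel matrix and multiplying by z, but peak memory scales with blocksize times n instead of n squared.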


Common functions used during kernel based computations.
The kernel parameter can be set to any function, of class kernel, which computes the inner product in feature space between two vector arguments. KERE provides the most popular kernel functions which can be initialized by using the following functions:

  • rbfdot Radial Basis kernel function

  • polydot Polynomial kernel function

  • vanilladot Linear kernel function

  • tanhdot Hyperbolic tangent kernel function

  • laplacedot Laplacian kernel function

  • besseldot Bessel kernel function

  • anovadot ANOVA RBF kernel function

  • splinedot the Spline kernel

(see example.)

kernelFast is mainly used in situations where columns of the kernel matrix are computed per invocation. In these cases, evaluating the norm of each row-entry over and over again would cause significant computational overhead.
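
Why precomputed squared norms help can be seen from the identity ||x_i - x_j||^2 = a_i + a_j - 2 <x_i, x_j>, where a holds the squared row norms. The base R sketch below uses that identity for a Gaussian RBF kernel; it illustrates the idea only and does not reproduce the kernelFast implementation.

```r
set.seed(1)
x <- matrix(rnorm(20), 5, 4)
sigma <- 0.05

# Squared row norms, computed once and reused
a <- rowSums(x^2)

# ||x_i - x_j||^2 = a_i + a_j - 2 <x_i, x_j>
sqdist <- outer(a, a, "+") - 2 * tcrossprod(x)
sqdist[sqdist < 0] <- 0   # guard against tiny negative round-off

K_fast <- exp(-sigma * sqdist)

# Reference: the same matrix from explicit pairwise distances
K_ref <- exp(-sigma * as.matrix(dist(x))^2)
```

When columns of the kernel matrix are requested one at a time, a can be passed in each call instead of being recomputed.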


kernelMatrix returns a symmetric positive semi-definite matrix.
kernelPol returns a matrix.
kernelMult usually returns a one-column matrix.


Alexandros Karatzoglou

See Also

rbfdot, polydot, tanhdot, vanilladot


## create a random 10 x 10 data matrix
x <- matrix(rnorm(10*10),10,10)

## initialize kernel function 
rbf <- rbfdot(sigma = 0.05)

## calculate kernel matrix
kernelMatrix(rbf, x)

y <- matrix(rnorm(10*1),10,1)

## calculate the quadratic kernel expression
kernelPol(rbf, x, z = y)

## calculate the kernel expansion
kernelMult(rbf, x, z = y)

Plot coefficients from a "KERE" object


Produces a coefficient profile plot of the coefficient paths for a fitted KERE object.


## S3 method for class 'KERE'
plot(x, ...)



fitted KERE model.


other graphical parameters to plot.


A coefficient profile plot is produced. The x-axis is $\log(\lambda)$. The y-axis is the value of the fitted $\alpha$'s.


Yi Yang, Teng Zhang and Hui Zou
Maintainer: Yi Yang <>


Y. Yang, T. Zhang, and H. Zou. "Flexible Expectile Regression in Reproducing Kernel Hilbert Space." ArXiv e-prints: stat.ME/1508.05987, August 2015.


# create data
N <- 200
X1 <- runif(N)
X2 <- 2*runif(N)
X3 <- 3*runif(N)
SNR <- 10 # signal-to-noise ratio
Y <- X1**1.5 + 2 * (X2**.5) + X1*X3
sigma <- sqrt(var(Y)/SNR)
Y <- Y + X2*rnorm(N,0,sigma)
X <- cbind(X1,X2,X3)

# set gaussian kernel 
kern <- rbfdot(sigma=0.1)

# define lambda sequence
lambda <- exp(seq(log(0.5),log(0.01),len=10))

# run KERE
m1 <- KERE(x=X, y=Y, kern=kern, lambda = lambda, omega = 0.5) 

# plot the solution paths
plot(m1)

Make predictions from a "KERE" object


Similar to other predict methods, this function predicts fitted values from a fitted KERE object.


## S3 method for class 'KERE'
predict(object, kern, x, newx,...)



fitted KERE model object.


the built-in kernel classes in KERE. Objects can be created by calling the rbfdot, polydot, tanhdot, vanilladot, anovadot, besseldot, laplacedot, splinedot functions etc. (see example.)


the original design matrix for training KERE.


matrix of new values for x at which predictions are to be made. NOTE: newx must be a matrix with each row as an observation. The predict function does not accept a vector or other formats of newx.


other parameters to predict function.


The fitted $\alpha_0 + K \alpha$ at newx is returned as a matrix of size nrow(newx)*length(lambda), one column for each lambda value at which the KERE model was fitted.


The fitted $\alpha_0 + K \alpha$ is returned as a matrix of size nrow(newx)*length(lambda). Rows index the observations of newx. Columns index the lambda sequence.
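
The shape of the returned matrix can be illustrated in base R. In the sketch below the kernel matrix is evaluated between the rows of newx and the rows of the training matrix x, then combined with coefficients for each lambda. The coefficient values and the names alpha0, alpha and Knew are invented for illustration; they are not output from KERE, and the internal layout of the fitted object is not assumed.

```r
set.seed(1)
x    <- matrix(runif(30), 10, 3)   # training design matrix
newx <- matrix(runif(6), 2, 3)     # two new observations, one per row
sigma <- 0.1

# Cross-kernel matrix: Knew[i, j] = exp(-sigma * ||newx_i - x_j||^2)
sqdist <- outer(rowSums(newx^2), rowSums(x^2), "+") - 2 * newx %*% t(x)
Knew <- exp(-sigma * sqdist)

# Hypothetical fitted values for 3 lambda values (NOT produced by KERE)
alpha0 <- c(0.1, 0.2, 0.3)               # one intercept per lambda
alpha  <- matrix(rnorm(10 * 3), 10, 3)   # one coefficient column per lambda

# alpha0 + Knew %*% alpha, one column per lambda
pred <- sweep(Knew %*% alpha, 2, alpha0, "+")
dim(pred)   # nrow(newx) rows by length(lambda) columns
```

This also shows why the original design matrix must be supplied at prediction time: the kernel has to be evaluated between each new row and every training row.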


Yi Yang, Teng Zhang and Hui Zou
Maintainer: Yi Yang <>


Y. Yang, T. Zhang, and H. Zou. "Flexible Expectile Regression in Reproducing Kernel Hilbert Space." ArXiv e-prints: stat.ME/1508.05987, August 2015.


# create data
N <- 100
X1 <- runif(N)
X2 <- 2*runif(N)
X3 <- 3*runif(N)
SNR <- 10 # signal-to-noise ratio
Y <- X1**1.5 + 2 * (X2**.5) + X1*X3
sigma <- sqrt(var(Y)/SNR)
Y <- Y + X2*rnorm(N,0,sigma)
X <- cbind(X1,X2,X3)

# set gaussian kernel 
kern <- rbfdot(sigma=0.1)

# define lambda sequence
lambda <- exp(seq(log(0.5),log(0.01),len=10))

# run KERE
m1 <- KERE(x=X, y=Y, kern=kern, lambda = lambda, omega = 0.5) 

# create newx for prediction
N1 <- 5
X1 <- runif(N1)
X2 <- 2*runif(N1)
X3 <- 3*runif(N1)
newx <- cbind(X1,X2,X3)

# make prediction
p1 <- predict.KERE(m1, kern, X, newx)