Getting started with hdsvm

This package provides tools for fitting penalized support vector machines (SVMs).

The strengths and improvements that this package offers relative to other SVM packages are as follows:

  • Compiled Fortran code significantly speeds up the penalized SVM estimation process.

  • The elastic net penalized SVM is solved with a generalized coordinate descent algorithm.

  • Active-set and warm-start strategies are implemented to compute the solution path as λ1 varies.

For this getting-started vignette, we first randomly generate x, an n × p input matrix of predictors, and a response variable y.

library(hdsvm)
set.seed(315)
n <- 100
p <- 400
x1 <- matrix(rnorm(n / 2 * p, -0.25, 0.1), n / 2)
x2 <- matrix(rnorm(n / 2 * p, 0.25, 0.1), n / 2)
x <- rbind(x1, x2)
beta <- 0.1 * rnorm(p)
prob <- plogis(c(x %*% beta))
y <- 2 * rbinom(n, 1, prob) - 1

Then the penalized SVM model is formulated as:

$$ \min_{\beta\in\mathbb{R}^{p},\,\beta_0\in\mathbb{R}} \frac{1}{n} \sum_{i=1}^{n}\max\{1 - y_i (\beta_0 + x_i^\top \beta),\, 0\} + \lambda_1\cdot\|pf_1\circ\beta\|_1 + 0.5\,\lambda_2\cdot\|\beta\|_2^2, \qquad (*) $$

where ∘ represents the Hadamard product.

hdsvm()

Given an input matrix x and a response vector y, hdsvm() estimates the elastic net penalized SVM for a sequence of penalty parameter values. The other main arguments users may supply are:

  • lambda: A user-supplied lambda sequence for the L1 penalty, ideally decreasing so that warm starts can be exploited. The sequence is sorted in decreasing order if it is not already.

  • nlambda: number of lambda values, default is 100.

  • lambda.factor: The factor for calculating the minimal lambda value, defined as lambda.min = lambda.factor * lambda.max. Here, lambda.max is the smallest lambda that results in all coefficients being zero (except the intercept). Defaults to 0.05 if n < p, and 0.001 if n >= p. A very small lambda.factor may result in a saturated model. Does not apply if a lambda sequence is supplied.

  • is_exact: Logical indicating if the solution should be exact (TRUE) or approximate (FALSE). Default is FALSE.

  • lam2: Regularization parameter for the L2 penalty. A single value is used for each fit.

  • pf: L1 penalty factor of length p used for the adaptive LASSO or adaptive elastic net. Separate L1 penalty weights can be applied to each coefficient to allow different amounts of L1 shrinkage, as shown in the sketch below. Can be 0 for some variables (but not all), which imposes no shrinkage and results in that variable always being included in the model. Default is 1 for all variables (and implicitly infinity for variables listed in exclude).

  • exclude: Indices of predictors excluded from the model, treated as having infinite penalty. Defaults to none.

  • dfmax: Maximum number of variables allowed in the model; particularly useful when p is large.

  • pmax: The maximum number of non-zero coefficients ever allowed in the solution path. Each coefficient is counted only once, irrespective of its status across the different models along the path.

  • standardize: Logical flag indicating whether variables should be standardized before fitting. Coefficients are returned on the original scale. Default is TRUE.

  • eps: Convergence criterion.

  • maxit: Maximum number of iterations.

  • sigma: Augmented Lagrangian parameter for quadratic terms, must be positive. Default is 0.05.

  • hval: The smoothing parameter for the smoothed check loss. Default is 1.

lambda <- 10^(seq(1, -4, length.out=30))
lam2 <- 0.01
fit <- hdsvm(x, y, lambda=lambda, lam2=lam2)
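In the notation of (*), lambda supplies the λ1 sequence, lam2 sets λ2, and pf plays the role of pf1. The sketch below (the weights and excluded indices are arbitrary and only for illustration) combines per-coefficient L1 weights with excluded predictors:

pf <- rep(1, p)     # default weight 1 for every coefficient
pf[1:5] <- 0.5      # lighter L1 shrinkage on the first five predictors
fit.pf <- hdsvm(x, y, lambda = lambda, lam2 = lam2, pf = pf, exclude = c(21, 22))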

cv.hdsvm()

This function performs k-fold cross-validation (CV) for the hdsvm() model. It takes the same arguments x, y, and lambda described above, along with additional arguments such as nfolds and foldid that control the folds over which the cross-validation error is computed.

cv.fit <- cv.hdsvm(x, y, lambda=lambda)
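The sketch below fixes the fold assignment explicitly; it assumes the usual cv.* conventions, where foldid overrides nfolds and the selected penalty value is stored as lambda.min:

foldid <- sample(rep(1:5, length.out = n))   # a fixed 5-fold assignment
cv.fit5 <- cv.hdsvm(x, y, lambda = lambda, foldid = foldid)
cv.fit5$lambda.min                           # lambda minimizing the CV error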

nc.hdsvm()

This function fits the penalized SVM model with nonconvex penalties such as SCAD or MCP. It allows flexible control over the regularization parameters and offers advanced options for initializing and optimizing the fit. It takes the same arguments x, y, lambda, and lam2 described above. The other main arguments users may supply are:

  • pen: Specifies the type of nonconvex penalty: “SCAD” or “MCP”.

  • aval: The parameter value for the SCAD or MCP penalty. Default is 3.7 for SCAD and 2 for MCP.

  • ini_beta: Optional initial coefficients to start the fitting process.

  • lla_step: Number of Local Linear Approximation (LLA) steps. Default is 3.

nc.fit <- nc.hdsvm(x=x, y=y, lambda=lambda, lam2=lam2, pen="scad")
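As a further sketch (the penalty choice and the number of LLA steps are arbitrary and only for illustration), an MCP fit with extra LLA refinement steps could be requested as:

nc.fit.mcp <- nc.hdsvm(x = x, y = y, lambda = lambda, lam2 = lam2, pen = "mcp", lla_step = 5)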

cv.nc.hdsvm()

This function conducts k-fold cross-validation for nc.hdsvm(). It takes the same arguments as cv.hdsvm(), together with the nonconvex penalty arguments of nc.hdsvm() such as pen.

cv.nc.fit <- cv.nc.hdsvm(y=y, x=x, lambda=lambda, lam2=lam2, pen="scad")

Methods

A number of S3 methods are provided for hdsvm, cv.hdsvm, nc.hdsvm and cv.nc.hdsvm objects.

  • coef() and predict() return, respectively, a matrix of coefficients and the predictions for a new matrix newx at each lambda. The optional s argument may supply a specific value of λ (not necessarily part of the original sequence) or, for cv.hdsvm and cv.nc.hdsvm objects, a string specifying either "lambda.min" or "lambda.1se".
coefs <- coef(fit, s = fit$lambda[3:5])
preds <- predict(fit, newx = tail(x), s = fit$lambda[3:5])
cv.coefs <- coef(cv.fit, s = c(0.02, 0.03))
cv.preds <- predict(cv.fit, newx = x[50:60, ], s = "lambda.min")
nc.coefs <- coef(nc.fit, s = nc.fit$lambda[3:5])
nc.preds <- predict(nc.fit, newx = tail(x), s = nc.fit$lambda[3:5])
cv.nc.coefs <- coef(cv.nc.fit, s = c(0.02, 0.03))
cv.nc.preds <- predict(cv.nc.fit, newx = x[50:60, ], s = "lambda.min")
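For instance, the coefficients at the more conservative "lambda.1se" choice, and the number of selected predictors at each requested lambda, can be inspected as sketched below (assuming the first row of the coef() matrix holds the intercept):

cv.coefs.1se <- coef(cv.fit, s = "lambda.1se")
colSums(coefs[-1, , drop = FALSE] != 0)   # non-zero predictors per lambda in coefs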