Package 'cossonet' reference manual

Title:	Sparse Nonparametric Regression for High-Dimensional Data
Description:	Estimation of sparse nonlinear functions in nonparametric regression using component selection and smoothing. Designed for the analysis of high-dimensional data, the models support various data types, including exponential family models and Cox proportional hazards models. The methodology is based on Lin and Zhang (2006) <doi:10.1214/009053606000000722>.
Authors:	Jieun Shin [aut, cre]
Maintainer:	Jieun Shin <jieunstat@uos.ac.kr>
License:	GPL-3
Version:	1.0
Built:	2025-03-13 14:20:44 UTC
Source:	CRAN

Fit a sparse nonparametric regression model with cossonet

Description

The cossonet function fits a nonparametric regression model that estimates nonlinear components. It can be applied to continuous, count, binary, and survival responses. The user must specify the response family and the kernel function, among other settings. For cross-validation, the sequences lambda0 and lambda_theta must also be specified, with ranges appropriate for the input data.
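
The penalty sequences mentioned above are plain numeric vectors, so they can be built and inspected before fitting. The following is a minimal base R sketch of a grid shaped like the defaults, plus a narrower grid like the one in the Examples; the variable names `lambda_grid`, `lambda_alt`, and `lambda_narrow` are illustrative and not part of the package.

```r
# Grid shaped like the defaults: 20 values from 2^-10 to 2^10,
# equally spaced on the logarithmic scale.
lambda_grid <- exp(seq(log(2^-10), log(2^10), length.out = 20))

# Equivalent construction: powers of two with equally spaced exponents.
lambda_alt <- 2^seq(-10, 10, length.out = 20)
all.equal(lambda_grid, lambda_alt)

# Successive ratios are constant, which is what "equally spaced on the
# logarithmic scale" means.
ratios <- lambda_grid[-1] / lambda_grid[-length(lambda_grid)]
range(ratios)

# A narrower grid, as used for lambda0 in the Examples.
lambda_narrow <- exp(seq(log(2^-4), log(2^0), length.out = 20))
range(lambda_narrow)
```

Narrowing the range while keeping the same number of points gives a finer search around the region where the cross-validated error is expected to be smallest.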

Usage

cossonet(
  x,
  y,
  family = c("gaussian", "binomial", "poisson", "Cox"),
  wt = rep(1, ncol(x)),
  scale = TRUE,
  nbasis,
  basis.id,
  kernel = c("linear", "gaussian", "poly", "spline"),
  effect = c("main", "interaction"),
  nfold = 5,
  kparam = 1,
  lambda0 = exp(seq(log(2^{-10}), log(2^{10}), length.out = 20)),
  lambda_theta = exp(seq(log(2^{-10}), log(2^{10}), length.out = 20)),
  gamma = 0.95,
  one.std = TRUE
)

Arguments

`x`	Input matrix or data frame of $n$ by $p$. `x` must have at least two columns ($p>1$).
`y`	A response vector with a continuous, binary, or count type. For survival responses, this should be a two-column matrix (or data frame) with columns called 'time' and 'status'.
`family`	A distribution corresponding to the response type. `family="gaussian"` for continuous responses, `family="binomial"` for binary responses, `family="poisson"` for count responses, and `family="Cox"` for survival responses.
`wt`	The weights assigned to the explanatory variables. The default is `rep(1,ncol(x))`.
`scale`	Boolean for whether to scale continuous explanatory variables to values between 0 and 1.
`nbasis`	The number of "knots". If `basis.id` is provided, it is set to the length of `basis.id`.
`basis.id`	The index of the "knot" to select.
`kernel`	The kernel function. One of `linear` (default), `gaussian`, `poly`, or `spline`.
`effect`	The effect of the component. `main` (default) is the main effect, and `interaction` is the two-way interaction.
`nfold`	The number of folds used in cross-validation, that is, the number of subsets into which the data are divided to form the training and validation sets.
`kparam`	Parameter for the Gaussian and polynomial kernel functions. The default is 1.
`lambda0`	A vector of candidate `lambda0` values. The default is a grid of 20 values from $2^{-10}$ to $2^{10}$, equally spaced on a logarithmic scale. This may need to be adjusted based on the input data. Do not set `lambda0` to a single value.
`lambda_theta`	A vector of candidate `lambda_theta` values. The default is a grid of 20 values from $2^{-10}$ to $2^{10}$, equally spaced on a logarithmic scale. This may need to be adjusted based on the input data. Do not set `lambda_theta` to a single value.
`gamma`	Elastic-net mixing parameter, $0 \leq \gamma \leq 1$. If `gamma = 1`, the LASSO penalty is applied, and if `gamma = 0`, the ridge penalty is applied. The default is `gamma = 0.95`.
`one.std`	A logical value indicating whether to apply the "1-standard error rule." When set to `TRUE`, it applies to both the c-step and theta-step, selecting the simplest model within one standard error of the best model.
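
The "1-standard error rule" named under `one.std` can be illustrated in base R. This is a generic sketch of the rule, not the package's internal code: the cross-validation errors are made up for illustration, and treating a larger penalty as the "simpler" model is an assumption that may not match how `cossonet` orders its candidates.

```r
# Hypothetical candidate penalties and cross-validation errors
# (rows = folds, columns = candidate penalties; values are invented).
lambda <- 2^(-3:2)
cv_err <- rbind(
  c(1.30, 1.10, 1.00, 1.02, 1.20, 1.60),
  c(1.20, 1.00, 0.90, 0.98, 1.10, 1.50),
  c(1.40, 1.20, 1.10, 1.06, 1.30, 1.70),
  c(1.25, 1.05, 0.95, 1.00, 1.15, 1.55),
  c(1.35, 1.15, 1.05, 1.04, 1.25, 1.65)
)

# Mean error and its standard error across folds, per candidate.
mean_err <- colMeans(cv_err)
se_err <- apply(cv_err, 2, sd) / sqrt(nrow(cv_err))

# Candidate with the smallest mean error.
best <- which.min(mean_err)

# 1-SE rule: among candidates whose mean error is within one standard
# error of the best, take the largest penalty (assumed simplest model).
threshold <- mean_err[best] + se_err[best]
one_se <- max(which(mean_err <= threshold))

lambda[best]
lambda[one_se]
```

With `one.std = FALSE` the analogue in this sketch would be to use `lambda[best]` directly.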

Value

A list containing information about the fitted model.

Examples


# Generate example data
set.seed(20250101)
tr = data_generation(n = 200, p = 20, SNR = 9, response = "continuous")
tr_x = tr$x
tr_y = tr$y

te = data_generation(n = 1000, p = 20, SNR = 9, response = "continuous")
te_x = te$x
te_y = te$y

# Fit the model
fit = cossonet(tr_x, tr_y, family = 'gaussian', gamma = 0.95, kernel = "spline", scale = TRUE,
      lambda0 = exp(seq(log(2^{-4}), log(2^{0}), length.out = 20)),
      lambda_theta = exp(seq(log(2^{-8}), log(2^{-6}), length.out = 20))
      )


The function `cossonet.predict` computes predicted values for new data from an object returned by the `cossonet` function.

Description

The function cossonet.predict computes predicted values for new data from an object returned by the cossonet function.

Usage

cossonet.predict(model, testx)

Arguments

`model`	The fitted cossonet object.
`testx`	The new data set to be predicted.

Value

A list of predicted values for the new data set.

Examples


set.seed(20250101)
tr = data_generation(n = 200, p = 20, SNR = 9, response = "continuous")
tr_x = tr$x
tr_y = tr$y

te = data_generation(n = 1000, p = 20, SNR = 9, response = "continuous")
te_x = te$x
te_y = te$y

# Fit the model
fit = cossonet(tr_x, tr_y, family = 'gaussian', gamma = 0.95, kernel = "spline", scale = TRUE,
      lambda0 = exp(seq(log(2^{-4}), log(2^{0}), length.out = 20)),
      lambda_theta = exp(seq(log(2^{-8}), log(2^{-6}), length.out = 20))
      )

# Predict new dataset
pred = cossonet.predict(fit, te_x)

The function data_generation generates an example dataset for applying the cossonet function.

Description

The function data_generation generates an example dataset for applying the cossonet function.

Usage

data_generation(
  n,
  p,
  rho,
  SNR,
  response = c("continuous", "binary", "count", "survival")
)

Arguments

`n`	number of observations.
`p`	number of explanatory variables (dimension).
`rho`	a positive integer indicating the correlation strength for the first four informative variables.
`SNR`	signal-to-noise ratio.
`response`	the type of the response variable.
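
To give a feel for what the `SNR` argument controls, here is a minimal base R sketch that adds Gaussian noise to a known signal at a target signal-to-noise ratio. It assumes the common definition SNR = Var(f(x)) / Var(noise); the definition used inside `data_generation` is not stated here and may differ, and the function `f` below is purely hypothetical.

```r
set.seed(1)
n <- 200
x <- runif(n)

# Hypothetical true function, chosen only for illustration.
f <- sin(2 * pi * x)

# Noise standard deviation implied by the assumed definition
# SNR = var(signal) / var(noise).
snr <- 9
sigma <- sqrt(var(f) / snr)

# Continuous response: signal plus Gaussian noise.
y <- f + rnorm(n, sd = sigma)

# The implied ratio recovers the target SNR.
var(f) / sigma^2
```

Under this definition a larger `SNR` means less noise relative to the signal, so the informative variables should be easier to recover.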

Value

a list of explanatory variables, response variables, and true functions.

Examples

# Generate example data
set.seed(20250101)
tr = data_generation(n = 200, p = 20, SNR = 9, response = "continuous")
tr_x = tr$x
tr_y = tr$y

te = data_generation(n = 1000, p = 20, SNR = 9, response = "continuous")
te_x = te$x
te_y = te$y

The function `metric` provides a contingency table for the predicted class and the true class for binary classes.

Description

The function metric provides a contingency table for the predicted class and the true class for binary classes.

Usage

metric(true, est)

Arguments

`true`	binary true class.
`est`	binary predicted class.

Value

a contingency table for the predicted results of binary class responses.
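
For readers unfamiliar with the term, a contingency table for two binary vectors can be built with base R's `table`. The sketch below is only an analogue of what `metric` reports, since its exact output format is not shown here; the vectors `true_class` and `pred_class` are invented for illustration.

```r
# Hypothetical true and predicted binary labels.
true_class <- c(1, 1, 1, 1, 0, 0, 0, 0, 0, 0)
pred_class <- c(1, 1, 0, 1, 0, 0, 1, 0, 0, 0)

# Cross-tabulate; fixing the factor levels keeps both rows and columns
# present even if one class is never predicted.
tab <- table(
  true = factor(true_class, levels = c(0, 1)),
  est  = factor(pred_class, levels = c(0, 1))
)
tab

# Cell counts: rows index the true class, columns the predicted class.
tp <- tab["1", "1"]  # true positives
tn <- tab["0", "0"]  # true negatives
fp <- tab["0", "1"]  # false positives
fn <- tab["1", "0"]  # false negatives
```

Counts such as `tp` and `fp` are the usual starting point for summaries like sensitivity or precision.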

Examples


set.seed(20250101)
tr = data_generation(n = 200, p = 20, SNR = 9, response = "continuous")
tr_x = tr$x
tr_y = tr$y

te = data_generation(n = 1000, p = 20, SNR = 9, response = "continuous")
te_x = te$x
te_y = te$y

# Fit the model
fit = cossonet(tr_x, tr_y, family = 'gaussian', gamma = 0.95, kernel = "spline", scale = TRUE,
      lambda0 = exp(seq(log(2^{-4}), log(2^{0}), length.out = 20)),
      lambda_theta = exp(seq(log(2^{-8}), log(2^{-6}), length.out = 20))
      )

# Predict new dataset
pred = cossonet.predict(fit, te_x)

# Contingency table: truly informative variables (first 4) vs. variables selected by the fit
true_var = c(rep(1, 4), rep(0, 20-4))
est_var = ifelse(fit$theta_step$theta.new > 0, 1, 0)
metric(true_var, est_var)


