Package 'OPTS' reference manual

Title:	Optimization via Subsampling (OPTS)
Description:	Subsampling-based variable selection for low-dimensional generalized linear models. The methods repeatedly subsample the data and, for each subsample, minimize an information criterion (AIC or BIC) over a sequence of nested models. Reference: Marinela Capanu, Mihai Giurcanu, Colin B Begg, Mithat Gonen, "Subsampling based variable selection for generalized linear models".
Authors:	Mihai Giurcanu [aut, cre], Marinela Capanu [aut, ctb], Colin Begg [aut], Mithat Gonen [aut]
Maintainer:	Mihai Giurcanu <[email protected]>
License:	GPL-2
Version:	0.1
Built:	2025-01-26 06:29:34 UTC
Source:	CRAN

Optimization via Subsampling (OPTS)

Description

opts computes the OPTS MLE in the low-dimensional case.

Usage

opts(X, Y, m, crit = "aic", prop_split = 0.5, cutoff = 0.75, ...)

Arguments

`X`	n x p covariate matrix (without intercept)
`Y`	n x 1 binary response vector
`m`	number of subsamples
`crit`	information criterion to select the variables: (a) aic = minimum AIC and (b) bic = minimum BIC
`prop_split`	ratio of subsample size to sample size, default value = 0.5
`cutoff`	cutoff used to select the variables using the stability selection criterion, default value = 0.75
`...`	other arguments passed to the glm function, e.g., family = "binomial"
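
The roles of m, prop_split, crit and cutoff can be illustrated with a minimal base-R sketch. It is not the package's internal code: the helper select_once, the simulated data, the ordering of the nested models by |z| from the full fit, and the use of >= when comparing frequencies with the cutoff are all assumptions made for illustration.

```r
# Sketch only: subsampling-based selection with base R, not the OPTS internals.
set.seed(1)
n <- 300
p <- 6
X <- matrix(rnorm(n * p), n, p, dimnames = list(NULL, paste0("x", 1:p)))
Y <- rbinom(n, 1, plogis(drop(X %*% c(1, 1, 0, 0, 0, 0))))  # x1, x2 active
dat <- data.frame(Y = Y, X)

m <- 20            # number of subsamples
prop_split <- 0.5  # subsample size / sample size
cutoff <- 0.75     # selection-frequency threshold

# Hypothetical helper: pick the minimum-AIC model on one subsample.
select_once <- function(d) {
  vars <- setdiff(names(d), "Y")
  # Assumption: nested models are built by adding predictors in
  # decreasing order of |z| from the full fit.
  full <- glm(Y ~ ., data = d, family = binomial)
  z <- abs(coef(summary(full))[-1, "z value"])
  ord <- names(sort(z, decreasing = TRUE))
  aic_null <- AIC(glm(Y ~ 1, data = d, family = binomial))
  aic_nest <- vapply(seq_along(ord), function(k) {
    AIC(glm(reformulate(ord[1:k], response = "Y"), data = d, family = binomial))
  }, numeric(1))
  k_best <- which.min(c(aic_null, aic_nest)) - 1  # model size 0..p
  setNames(vars %in% ord[seq_len(k_best)], vars)
}

# One column of TRUE/FALSE selections per subsample
picks <- replicate(m, select_once(dat[sample(n, floor(prop_split * n)), ]))

freqs <- rowMeans(picks)  # relative selection frequency per variable
Jhat <- freqs >= cutoff   # variables kept by the stability criterion
print(round(freqs, 2))
print(Jhat)
```

Replacing AIC with BIC in select_once gives the analogue of crit = "bic" under the same assumptions.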

Value

opts returns a list:

`betahat`	OPTS MLE of regression parameter vector
`Jhat`	estimated set of active predictors (TRUE/FALSE) corresponding to the OPTS MLE
`SE`	standard error of OPTS MLE
`freqs`	relative frequency of selection for all variables

Examples

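library(MASS)
library(OPTS)
set.seed(1)  # for reproducibility
P <- 15
N <- 100
M <- 20
# intercept followed by P slopes (length P + 1): four active predictors, rest zero
BETA_vector <- c(0.5, rep(0.5, 2), rep(0.5, 2), rep(0, P - 4))
MU_vector <- numeric(P)
SIGMA_mat <- diag(P)

X <- mvrnorm(N, MU_vector, Sigma = SIGMA_mat)
# linear predictor: design matrix with intercept column times the coefficients
linearPred <- drop(cbind(rep(1, N), X) %*% BETA_vector)
Y <- rbinom(N, 1, plogis(linearPred))

# OPTS-AIC MLE
opts(X, Y, 10, family = "binomial")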

Threshold OPTimization via Subsampling (OPTS_TH)

Description

opts_th computes the threshold OPTS MLE in the low-dimensional case.

Usage

opts_th(X, Y, m, crit = "aic", type = "binseg", prop_split = 0.5,
  prop_trim = 0.2, q_tail = 0.5, ...)

Arguments

`X`	n x p covariate matrix (without intercept)
`Y`	n x 1 binary response vector
`m`	number of subsamples
`crit`	information criterion to select the variables: (a) aic = minimum AIC and (b) bic = minimum BIC
`type`	method used to minimize the trimmed and averaged information criterion: (a) min = observed minimum of the subsampling trimmed average information criterion, (b) sd = observed minimum using the 0.25sd rule (corresponding to OPTS-min in the paper), (c) pelt = PELT changepoint algorithm (corresponding to OPTS-PELT in the paper), (d) binseg = binary segmentation changepoint algorithm (corresponding to OPTS-BinSeg in the paper), (e) amoc = AMOC method
`prop_split`	ratio of subsample size to sample size, default value = 0.5
`prop_trim`	proportion that defines the trimmed mean; default value = 0.2
`q_tail`	quantile of the minimum and maximum p-values across the subsample cutpoints, used to define the range of cutpoints, default value = 0.5
`...`	other arguments passed to the glm function, e.g., family = "binomial"
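
The idea behind type = "min" and prop_trim can be sketched in base R as follows. This is a simplified illustration, not the package's algorithm: the helper aic_at, the cutpoint grid taken from the full-sample marginal p-values, and the reading of prop_trim as the fraction trimmed from each end (the convention of mean(x, trim = )) are assumptions.

```r
# Sketch only: threshold selection by trimmed-average AIC, not the OPTS internals.
set.seed(2)
n <- 300
p <- 6
X <- matrix(rnorm(n * p), n, p, dimnames = list(NULL, paste0("x", 1:p)))
Y <- rbinom(n, 1, plogis(drop(X %*% c(1, 1, 0, 0, 0, 0))))  # x1, x2 active
dat <- data.frame(Y = Y, X)

# Marginal p-value of each predictor from its own univariate logistic fit
pval <- vapply(colnames(X), function(v) {
  fit <- glm(reformulate(v, response = "Y"), data = dat, family = binomial)
  coef(summary(fit))[v, "Pr(>|z|)"]
}, numeric(1))

m <- 20
prop_split <- 0.5
prop_trim <- 0.2
cutpoints <- sort(unique(pval))  # assumed grid: the observed p-values

# Hypothetical helper: AIC of the model that keeps predictors with
# marginal p-value <= cut, fitted on the subsample d.
aic_at <- function(d, cut) {
  keep <- names(pval)[pval <= cut]
  AIC(glm(reformulate(keep, response = "Y"), data = d, family = binomial))
}

# Rows index cutpoints, columns index subsamples
aic_mat <- replicate(m, {
  d <- dat[sample(n, floor(prop_split * n)), ]
  vapply(cutpoints, function(cut) aic_at(d, cut), numeric(1))
})

# Trimmed mean across subsamples, then the cutpoint that minimizes it
aic_trim <- apply(aic_mat, 1, mean, trim = prop_trim)
cuthat <- cutpoints[which.min(aic_trim)]
Jhat <- pval <= cuthat
print(cuthat)
print(Jhat)
```

The changepoint-based types (pelt, binseg, amoc) replace the final which.min step with a changepoint search over the averaged criterion; that step is not shown here.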

Value

opts_th returns a list:

`betahat`	threshold OPTS MLE of regression parameters
`SE`	standard error of the threshold OPTS MLE
`Jhat`	set of active predictors (TRUE/FALSE) corresponding to the threshold OPTS MLE
`cuthat`	estimated cutpoint for variable selection
`pval`	marginal p-values from univariate fit
`cutpoits`	subsample cutpoints
`aic_mean`	mean subsample AIC
`bic_mean`	mean subsample BIC

Examples

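library(MASS)
library(OPTS)
set.seed(1)  # for reproducibility
P <- 15
N <- 100
M <- 20
# intercept followed by P slopes (length P + 1): four active predictors, rest zero
BETA_vector <- c(0.5, rep(0.5, 2), rep(0.5, 2), rep(0, P - 4))
MU_vector <- numeric(P)
SIGMA_mat <- diag(P)

X <- mvrnorm(N, MU_vector, Sigma = SIGMA_mat)
# linear predictor: design matrix with intercept column times the coefficients
linearPred <- drop(cbind(rep(1, N), X) %*% BETA_vector)
Y <- rbinom(N, 1, plogis(linearPred))

# Threshold OPTS-BinSeg MLE
opts_th(X, Y, M, family = "binomial")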
