Title: | Model Selection with FDR Control of Selected Variables |
---|---|
Description: | Selects one model with variable selection FDR controlled at a specified level. A q-value for each potential variable is also returned. The input, variable selection counts over many bootstraps for several levels of penalization, is modeled as coming from a beta-binomial mixture distribution. |
Authors: | Jonatan Kallus [aut, cre] |
Maintainer: | Jonatan Kallus <[email protected]> |
License: | GPL-3 |
Version: | 1.0 |
Built: | 2024-10-31 20:33:15 UTC |
Source: | CRAN |
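As a rough illustration of the modeling assumption in the description, selection counts over B bootstraps can be simulated from a two-component beta-binomial mixture. This is only a sketch: the shape parameters, the proportion of alternative variables, and all variable names are made up for illustration and are not the parameterization the package uses internally.

```r
set.seed(42)
B <- 100          # number of bootstraps
p <- 1000         # number of potential variables
alt.prop <- 0.1   # assumed proportion of alternative (true) variables

# Assign each variable to the null or the alternative component
is.alt <- runif(p) < alt.prop

# Per-variable selection probability: alternatives tend to be selected often,
# nulls rarely (both Beta shapes are illustrative assumptions)
prob <- ifelse(is.alt, rbeta(p, 8, 2), rbeta(p, 1, 8))

# A beta-binomial draw is a binomial count with a Beta-distributed probability
counts <- rbinom(p, size = B, prob = prob)

# Tabulate how many variables were selected 0, 1, ..., B times
count.table <- tabulate(counts + 1, nbins = B + 1)
```

With well-separated components, most of the mass in `count.table` sits near the low end (null variables), with a smaller group near the high end (alternative variables).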
Run first step of model fitting to find good penalization interval
explore(data, B, mc.cores = getOption("mc.cores", 2L))
data |
Matrix of variable presence counts. One column for each variable, one row for each parameter value (e.g. levels of regularization). |
B |
Number of bootstraps used to construct |
mc.cores |
Number of threads to run in parallel (1 turns off parallelization) |
A list with components
pop.sep |
vector of values saying how separated true and false variables are for each level of penalization |
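The layout of the `data` argument (one row per penalization level, one column per variable) can be sketched with base R alone. The selector below, thresholding absolute marginal correlations, is a deliberately simple stand-in for a penalized method, and the simulated data and all names are hypothetical.

```r
set.seed(1)
n <- 50   # observations
p <- 20   # potential variables
B <- 30   # bootstraps

x <- matrix(rnorm(n * p), n, p)
y <- x[, 1] - x[, 2] + rnorm(n)

# Stand-in for levels of regularization: a higher threshold selects fewer variables
thresholds <- c(0.1, 0.2, 0.3, 0.4)

counts <- matrix(0L, nrow = length(thresholds), ncol = p)
for (b in seq_len(B)) {
  idx <- sample.int(n, replace = TRUE)       # bootstrap resample of the rows
  score <- abs(cor(x[idx, ], y[idx]))[, 1]   # one score per variable
  for (k in seq_along(thresholds)) {
    counts[k, ] <- counts[k, ] + as.integer(score > thresholds[k])
  }
}

# counts[k, j] is the number of bootstraps in which variable j was selected
# at level k. Hypothetical next step, if the rope package is installed:
# result <- rope::explore(counts, B)
```

Every entry of `counts` lies between 0 and `B`, which is why `B` has to be passed alongside the matrix.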
explore for adjacency matrices
When modeling graphs, it may be more convenient to store data as matrices instead of row vectors.
exploregraph(data, B, ...)
data |
List of symmetric matrices, one matrix for each penalization level |
B |
Number of bootstraps used to construct |
... |
Additional arguments are passed on to |
A list with components
pop.sep |
vector of values saying how separated true and false variables are for each level of penalization |
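To make the relation between the two input formats concrete, the following base R sketch flattens a list of symmetric count matrices into a matrix with one row per penalization level and one column per edge. The edge ordering used here (upper triangle, column by column) is an assumption for illustration; the package's own conversion function may order edges differently.

```r
p <- 4   # number of nodes, so there are p * (p - 1) / 2 = 6 possible edges

# Hypothetical edge counts for three penalization levels
mats <- lapply(1:3, function(k) {
  m <- matrix(0, p, p)
  m[upper.tri(m)] <- k * (1:6)
  m + t(m)   # make the matrix symmetric
})

# One row per penalization level, one column per edge
counts <- t(sapply(mats, function(m) m[upper.tri(m)]))
```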
Plot rope results
plotrope(result, data, types = c("global"), ...)
result |
An object returned by |
data |
Matrix of variable presence counts. One column for each variable, one row for each parameter value (e.g. levels of regularization). |
types |
List of names of plots to draw (alternatives |
... |
When drawing the fits plot, pass level=v, where v is a vector of indices, to plot only the penalization levels corresponding to v |
Estimates a model from bootstrap counts. The objective is to maximize accuracy while controlling the false discovery rate of selected variables. Developed for high-dimensional models with at least on the order of 10000 variables.
rope(data, B, fdr = 0.1, mc.cores = getOption("mc.cores", 2L), only.first = FALSE)
data |
Matrix of variable presence counts. One column for each variable, one row for each parameter value (e.g. levels of regularization). |
B |
Number of bootstraps used to construct |
fdr |
Vector of target false discovery rates to return selections for |
mc.cores |
Number of threads to run in parallel (1 turns off parallelization) |
only.first |
Skip second part of algorithm. Saves time but gives worse results. |
A list with components
selection |
matrix (one row for each fdr target, one column for each variable) |
q |
vector of q-values, one for each variable |
level |
index of most separating parameter value |
alt.prop |
estimated proportion of alternative variables |
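The shape of the `selection` component can be illustrated by thresholding q-values at each FDR target. This is a sketch of the usual q-value interpretation, not a statement of how the package computes `selection`; the q-values below are made up.

```r
q <- c(0.01, 0.20, 0.07, 0.50, 0.04)   # hypothetical q-values, one per variable
fdr <- c(0.05, 0.1)                     # FDR targets

# One row per FDR target, one column per variable
selection <- t(sapply(fdr, function(level) q <= level))

# Number of variables selected at each target
n.selected <- rowSums(selection)
```

A looser target can only add variables, so each row of `selection` contains the selections of the rows with stricter targets.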
Jonatan Kallus, [email protected]
## Not run:
data # a matrix of selection counts, for 100 bootstraps, with ncol(data)
     # potential variables counted for nrow(data) different penalization levels
fdr <- c(0.05, 0.1)
result <- rope(data, 100, fdr)
## End(Not run)
rope for adjacency matrices
When modeling graphs, it may be more convenient to store data as matrices instead of row vectors.
ropegraph(data, B, ...)
data |
List of symmetric matrices, one matrix for each penalization level |
B |
Number of bootstraps used to construct |
... |
Additional arguments are passed on to |
A list with components
selection |
list of symmetric matrices, one matrix for each fdr target |
q |
symmetric matrix of q-values |
level |
index of most separating parameter value |
alt.prop |
estimated proportion of alternative variables |
## Not run:
data # a list of symmetric matrices, one matrix for each penalization level,
     # each matrix containing selection counts for each edge over 100 bootstraps
fdr <- c(0.05, 0.1)
result <- ropegraph(data, 100, fdr)
## End(Not run)
The data set contains 175 observations for each node, the true network structure that was used to generate the data, and edge presence counts from glasso over 100 bootstraps.
scalefree
A list containing:
A matrix of 175 observations (rows) for 200 variables (columns)
The generating network structure (as a vector)
100, the number of bootstraps used when counting edge presence
The range of penalization used for glasso (the first 9 generate U-shaped histograms)
A matrix of length(lambda) rows and 200*199/2 columns containing presence counts for each edge and each level of penalization
A list of length(lambda) containing matrices of size 200 by 200, the data in W but in an alternative format
A 200 by 200 matrix, the data in g but in an alternative format
If variable selection counts are stored in a matrix, this function converts them into a vector that can be used as input to rope. This can be useful when variables correspond to edges in a graph.
symmetric.matrix2vector(m)
m |
A symmetric matrix |
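The idea behind this conversion can be sketched in base R: a symmetric p by p matrix with an ignored diagonal carries p*(p-1)/2 distinct values, which can be read off one triangle. Reading the upper triangle column by column is an assumption made for this sketch; the ordering the package function uses is not stated here.

```r
# A hypothetical symmetric matrix of edge counts for p = 4 nodes
m <- matrix(0, 4, 4)
m[upper.tri(m)] <- c(3, 0, 7, 1, 5, 2)
m <- m + t(m)

# Keep each edge once by reading only the upper triangle
v <- m[upper.tri(m)]

n.edges <- length(v)   # equals 4 * 3 / 2
```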
This can be convenient for working with the output when rope is used to select graph models.
vector2symmetric.matrix(v)
v |
A vector with length p*(p-1)/2 for some integer p |
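The length requirement determines p: solving p*(p-1)/2 = length(v) gives p = (1 + sqrt(1 + 8*length(v)))/2. The helper below, `vec_to_sym`, is a hypothetical base R sketch of such a conversion, not the package's implementation. It assumes the vector fills the upper triangle column by column and sets the diagonal to zero.

```r
vec_to_sym <- function(v) {
  # Recover p from length(v) = p * (p - 1) / 2
  p <- (1 + sqrt(1 + 8 * length(v))) / 2
  stopifnot(abs(p - round(p)) < 1e-8)   # length must match some integer p
  p <- as.integer(round(p))

  m <- matrix(0, p, p)
  m[upper.tri(m)] <- v   # fill the upper triangle
  m + t(m)               # mirror it to the lower triangle
}

m <- vec_to_sym(1:6)   # 6 = 4 * 3 / 2, so m is 4 by 4
```

A vector of length 5, for example, does not correspond to any integer p, so `vec_to_sym` stops with an error.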