Package 'fence' reference manual

Title:	Using Fence Methods for Model Selection
Description:	This method is a new class of model selection strategies, for mixed model selection, which includes linear and generalized linear mixed models. The idea involves a procedure to isolate a subgroup of what are known as correct models (of which the optimal model is a member). This is accomplished by constructing a statistical fence, or barrier, to carefully eliminate incorrect models. Once the fence is constructed, the optimal model is selected from among those within the fence according to a criterion which can be made flexible. References: 1. Jiang J., Rao J.S., Gu Z., Nguyen T. (2008), Fence Methods for Mixed Model Selection. The Annals of Statistics, 36(4): 1669-1692. <DOI:10.1214/07-AOS517> <https://projecteuclid.org/euclid.aos/1216237296>. 2. Jiang J., Nguyen T., Rao J.S. (2009), A Simplified Adaptive Fence Procedure. Statistics and Probability Letters, 79, 625-629. <DOI:10.1016/j.spl.2008.10.014> <https://www.researchgate.net/publication/23991417_A_simplified_adaptive_fence_procedure> 3. Jiang J., Nguyen T., Rao J.S. (2010), Fence Method for Nonparametric Small Area Estimation. Survey Methodology, 36(1), 3-11. <http://publications.gc.ca/collections/collection_2010/statcan/12-001-X/12-001-x2010001-eng.pdf>. 4. Jiming Jiang, Thuan Nguyen and J. Sunil Rao (2011), Invisible fence methods and the identification of differentially expressed gene sets. Statistics and Its Interface, Volume 4, 403-415. <http://www.intlpress.com/site/pub/files/_fulltext/journals/sii/2011/0004/0003/SII-2011-0004-0003-a014.pdf>. 5. Thuan Nguyen & Jiming Jiang (2012), Restricted fence method for covariate selection in longitudinal data analysis. Biostatistics, 13(2), 303-314. <DOI:10.1093/biostatistics/kxr046> <https://academic.oup.com/biostatistics/article/13/2/303/263903/Restricted-fence-method-for-covariate-selection-in>. 6. Thuan Nguyen, Jie Peng, Jiming Jiang (2014), Fence Methods for Backcross Experiments. Statistical Computation and Simulation, 84(3), 644-662. <DOI:10.1080/00949655.2012.721885> <https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3891925/>. 7. Jiang, J. (2014), The fence methods, in Advances in Statistics, Hindawi Publishing Corp., Cairo. <DOI:10.1155/2014/830821>. 8. Jiming Jiang and Thuan Nguyen (2015), The Fence Methods, World Scientific, Singapore. <https://www.abebooks.com/9789814596060/Fence-Methods-Jiming-Jiang-981459606X/plp>.
Authors:	Jiming Jiang, Jianyang Zhao, J. Sunil Rao, Thuan Nguyen
Maintainer:	Thuan Nguyen <nguythua@ohsu.edu>
License:	BSD_2_clause + file LICENSE
Version:	1.0
Built:	2025-03-07 07:01:17 UTC
Source:	CRAN

Adaptive Fence model selection

Description

Adaptive Fence model selection

Usage

adaptivefence(mf, f, ms, d, lf, pf, bs, grid = 101, bandwidth)
adaptivefence(mf, f, ms, d, lf, pf, bs, grid = 101, bandwidth)

Arguments

`mf`	function for fitting the model
`f`	formula of full model
`ms`	list of formula of candidates models
`d`	data
`lf`	measure lack of fit (to minimize)
`pf`	model selection criteria, e.g., model dimension
`bs`	bootstrap samples
`grid`	grid for c
`bandwidth`	bandwidth for kernel smooth function

Value

`models`	list all model candidates in the model space
`B`	list the number of bootstrap samples that have been used
`lack_of_fit_matrix`	list a matrix of Qs for all model candidates (in columns). Each row is for each bootstrap sample
`Qd_matrix`	list a matrix of QM - QM.tilde for all model candidates. Each row is for each bootrap sample
`bandwidth`	list the value of bandwidth
`model_mat`	list a matrix of selected models at each c values in grid (in columns). Each row is for each bootstrap sample
`freq_mat`	list a matrix of coverage probabilities (frequency/smooth_frequency) of each selected models for a given c value (index)
`c`	list the adaptive choice of c value from which the parsimonious model is selected
`sel_model`	list the selected (parsimonious) model given the adaptive c value

Author(s)

Jiming Jiang Jianyang Zhao J. Sunil Rao Thuan Nguyen

References

Jiang J., Rao J.S., Gu Z., Nguyen T. (2008), Fence Methods for Mixed Model Selection. The Annals of Statistics, 36(4): 1669-1692
Jiang J., Nguyen T., Rao J.S. (2009), A Simplified Adaptive Fence Procedure. Statistics and Probability Letters, 79, 625-629
Thuan Nguyen, Jie Peng, Jiming Jiang (2014), Fence Methods for Backcross Experiments. Statistical Computation and Simulation, 84(3), 644-662

Examples

## Not run: 
require(fence)

#### Example 1 #####
data(iris)
full = Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width + (1|Species)
test_af = fence.lmer(full, iris)
plot(test_af)
test_af$sel_model

#### Example 2 #####
r =1234; set.seed(r)  
p=8; n=150; rho = 0.6
id = rep(1:50,each=3)
R = diag(p)
for(i in 1:p){
  for(j in 1:p){
     R[i,j] = rho^(abs(i-j))
  }
} 
R = 1*R
x=mvrnorm(n, rep(0, p), R)  # all x's are time-varying dependence #
colnames(x)=paste('x',1:p, sep='')
tbetas = c(0,0.5,1,0,0.5,1,0,0.5)  # non-zero beta 2,3,5,6,8
epsilon = rnorm(150)
y = x%*%tbetas + epsilon
colnames(y) = 'y'
data = data.frame(cbind(x,y,id))
full = y ~ x1+x2+x3+x4+x5+x6+x7+x8+(1|id)
#X = paste('x',1:p, sep='', collapse='+')
#full = as.formula(paste('y~',X,'+(1|id)', sep=""))  #same as previous one
fence_obj = fence.lmer(full,data)   # it takes 3-5 min #
plot(fence_obj)
fence_obj$sel_model

## End(Not run)
## Not run: 
require(fence)

#### Example 1 #####
data(iris)
full = Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width + (1|Species)
test_af = fence.lmer(full, iris)
plot(test_af)
test_af$sel_model

#### Example 2 #####
r =1234; set.seed(r)  
p=8; n=150; rho = 0.6
id = rep(1:50,each=3)
R = diag(p)
for(i in 1:p){
  for(j in 1:p){
     R[i,j] = rho^(abs(i-j))
  }
} 
R = 1*R
x=mvrnorm(n, rep(0, p), R)  # all x's are time-varying dependence #
colnames(x)=paste('x',1:p, sep='')
tbetas = c(0,0.5,1,0,0.5,1,0,0.5)  # non-zero beta 2,3,5,6,8
epsilon = rnorm(150)
y = x%*%tbetas + epsilon
colnames(y) = 'y'
data = data.frame(cbind(x,y,id))
full = y ~ x1+x2+x3+x4+x5+x6+x7+x8+(1|id)
#X = paste('x',1:p, sep='', collapse='+')
#full = as.formula(paste('y~',X,'+(1|id)', sep=""))  #same as previous one
fence_obj = fence.lmer(full,data)   # it takes 3-5 min #
plot(fence_obj)
fence_obj$sel_model

## End(Not run)

Adaptive Fence model selection (Small Area Estmation)

Description

Adaptive Fence model selection (Small Area Estmation)

Usage

adaptivefence.fh(mf, f, ms, d, lf, pf, bs, grid = 101, bandwidth, method)
adaptivefence.fh(mf, f, ms, d, lf, pf, bs, grid = 101, bandwidth, method)

Arguments

`mf`	Call function, for example: default calls: function(m, b) eblupFH(formula = m, vardir = D, data = b, method = "FH")
`f`	Full Model
`ms`	find candidate model, findsubmodel.fh(full)
`d`	Dimension number
`lf`	Measures lack of fit using function(res) -res$fit$goodness[1]
`pf`	Dimensions of model
`bs`	Bootstrap
`grid`	grid for c
`bandwidth`	bandwidth for kernel smooth function
`method`	Method to be used. Fay-Herriot method is the default.

Details

In Jiang et. al (2008), the adaptive c value is chosen from the highest peak in the p* vs. c plot. In Jiang et. al (2009), 95% CI is taken into account while choosing such an adaptive choice of c. In Thuan Nguyen et. al (2014), the adaptive c value is chosen from the first peak. This approach works better in the moderate sample size or weak signal situations. Empirically, the first peak becomes highest peak when sample size increases or signals become stronger

Value

`models`	list all model candidates in the model space
`B`	list the number of bootstrap samples that have been used
`lack_of_fit_matrix`	list a matrix of Qs for all model candidates (in columns). Each row is for each bootstrap sample
`Qd_matrix`	list a matrix of QM - QM.tilde for all model candidates. Each row is for each bootrap sample
`bandwidth`	list the value of bandwidth
`model_mat`	list a matrix of selected models at each c values in grid (in columns). Each row is for each bootstrap sample
`freq_mat`	list a matrix of coverage probabilities (frequency/smooth_frequency) of each selected models for a given c value (index)
`c`	list the adaptive choice of c value from which the parsimonious model is selected
`sel_model`	list the selected (parsimonious) model given the adaptive c value

Note

The current Fence package focuses on variable selection. However, Fence methods can be used to select other parameters of interest, e.g., tunning parameter, variance-covariance structure, etc.
The number of bootstrap samples is suggested to be increased, e.g., B=1000 when the sample size is small, or signals are weak

Author(s)

Jiming Jiang Jianyang Zhao J. Sunil Rao Thuan Nguyen

References

Jiang J., Rao J.S., Gu Z., Nguyen T. (2008), Fence Methods for Mixed Model Selection. The Annals of Statistics, 36(4): 1669-1692
Jiang J., Nguyen T., Rao J.S. (2009), A Simplified Adaptive Fence Procedure. Statistics and Probability Letters, 79, 625-629
Thuan Nguyen, Jie Peng, Jiming Jiang (2014), Fence Methods for Backcross Experiments. Statistical Computation and Simulation, 84(3), 644-662

Examples

## Not run: 
require(fence)
### example 1 ####
data("kidney")
data = kidney[-which.max(kidney$x),]     # Delete a suspicious data point #
data$x2 = data$x^2
data$x3 = data$x^3
data$x4 = data$x^4
data$D = data$sqrt.D.^2
plot(data$y ~ data$x)
full = y~x+x2+x3+x4
testfh = fence.sae(full, data, B=1000, fence="adaptive", method="F-H", D = D)
testfh$sel_model
testfh$c

## End(Not run)
## Not run: 
require(fence)
### example 1 ####
data("kidney")
data = kidney[-which.max(kidney$x),]     # Delete a suspicious data point #
data$x2 = data$x^2
data$x3 = data$x^3
data$x4 = data$x^4
data$D = data$sqrt.D.^2
plot(data$y ~ data$x)
full = y~x+x2+x3+x4
testfh = fence.sae(full, data, B=1000, fence="adaptive", method="F-H", D = D)
testfh$sel_model
testfh$c

## End(Not run)

Fence model selection (Linear Mixed Model)

Description

Fence model selection (Linear Mixed Model)

Usage

fence.lmer(full, data, B = 100, grid = 101, fence = c("adaptive",
  "nonadaptive"), cn = NA, REML = TRUE, bandwidth = NA,
  cpus = parallel::detectCores())
fence.lmer(full, data, B = 100, grid = 101, fence = c("adaptive",
  "nonadaptive"), cn = NA, REML = TRUE, bandwidth = NA,
  cpus = parallel::detectCores())

Arguments

`full`	formula of full model
`data`	data
`B`	number of bootstrap samples, parametric bootstrap is used
`grid`	grid for c
`fence`	a procedure of the fence method to be used. It's suggested to choose nonadaptive procedure if c is known; otherwise nonadaptive must be chosen
`cn`	cn value for nonadaptive
`REML`	Restricted Maximum Likelihood approach
`bandwidth`	bandwidth for kernel smooth function
`cpus`	Number of parallel computers

Details

Value

`models`	list all model candidates in the model space
`B`	list the number of bootstrap samples that have been used
`lack_of_fit_matrix`	list a matrix of Qs for all model candidates (in columns). Each row is for each bootstrap sample
`Qd_matrix`	list a matrix of QM - QM.tilde for all model candidates. Each row is for each bootrap sample
`bandwidth`	list the value of bandwidth
`model_mat`	list a matrix of selected models at each c values in grid (in columns). Each row is for each bootstrap sample
`freq_mat`	list a matrix of coverage probabilities (frequency/smooth_frequency) of each selected models for a given c value (index)
`c`	list the adaptive choice of c value from which the parsimonious model is selected
`sel_model`	list the selected (parsimonious) model given the adaptive c value

@note The current Fence package focuses on variable selection. However, Fence methods can be used to select other parameters of interest, e.g., tunning parameter, variance-covariance structure, etc.

Author(s)

Jiming Jiang Jianyang Zhao J. Sunil Rao Thuan Nguyen

References

Jiang J., Rao J.S., Gu Z., Nguyen T. (2008), Fence Methods for Mixed Model Selection. The Annals of Statistics, 36(4): 1669-1692
Jiang J., Nguyen T., Rao J.S. (2009), A Simplified Adaptive Fence Procedure. Statistics and Probability Letters, 79, 625-629
Thuan Nguyen, Jie Peng, Jiming Jiang (2014), Fence Methods for Backcross Experiments. Statistical Computation and Simulation, 84(3), 644-662

Examples

require(fence)
library(snow)

#### Example 1 #####
data(iris)
full = Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width + (1|Species)
# Takes greater than 5 seconds to run
# test_af = fence.lmer(full, iris)
# test_af$c
# test_naf = fence.lmer(full, iris, fence = "nonadaptive", cn = 12)
# plot(test_af)
# test_af$sel_model
# test_naf$sel_model
require(fence)
library(snow)

#### Example 1 #####
data(iris)
full = Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width + (1|Species)
# Takes greater than 5 seconds to run
# test_af = fence.lmer(full, iris)
# test_af$c
# test_naf = fence.lmer(full, iris, fence = "nonadaptive", cn = 12)
# plot(test_af)
# test_af$sel_model
# test_naf$sel_model

Fence model selection (Nonparametric Model)

Description

Fence model selection (Noparametric Model)

Usage

fence.NF(full, data, spline, ps = 1:3, qs = NA, B = 100, grid = 101,
  bandwidth = NA, lambda)
fence.NF(full, data, spline, ps = 1:3, qs = NA, B = 100, grid = 101,
  bandwidth = NA, lambda)

Arguments

`full`	formula of full model
`data`	data
`spline`	variable needed for spline terms
`ps`	order of power
`qs`	number of knots
`B`	number of bootstrap sample, parametric for lmer
`grid`	grid for c
`bandwidth`	bandwidth for kernel smooth function
`lambda`	A grid of lambda values

Value

`models`	list all model candidates with p polynomial degrees and q knots in the model space
`Qd_matrix`	list a matrix of QM - QM.tilde for all model candidates. Each row is for each bootrap sample
`bandwidth`	list the value of bandwidth
`model_mat`	list a matrix of selected models at each c values in grid (in columns). Each row is for each bootstrap sample
`freq_mat`	list a matrix of coverage probabilities (frequency/smooth_frequency) of each selected models for a given c value (index)
`c`	list the adaptive choice of c value from which the parsimonious model is selected
`lambda`	penalty (or smoothing) parameter estimate given selected p and q
`sel_model`	list the selected (parsimonious) model given the adaptive c value
`beta.est.u`	A list of coefficient estimates given a lambda value
`f.x.hat`	A vector of fitted values obtained from a given lambda value and beta.est.u

@note The current Fence method in Nonparametric model focuses on one spline variable. This method can be extended to a general case with more than one spline variables, and includes non-spline variables.

Author(s)

Jiming Jiang Jianyang Zhao J. Sunil Rao Bao-Qui Tran Thuan Nguyen

References

Jiang J., Rao J.S., Gu Z., Nguyen T. (2008), Fence Methods for Mixed Model Selection. The Annals of Statistics, 36(4): 1669-1692
Jiang J., Nguyen T., Rao J.S. (2009), A Simplified Adaptive Fence Procedure. Statistics and Probability Letters, 79, 625-629
Jiang J., Nguyen T., Rao J.S. (2010), Fence Method for Nonparametric Small Area Estimation. Survey Methodology, 36, 1, 3-11

Examples

## Not run: 
require(fence)
n = 100
set.seed(1234)
x=runif(n,0,3)
y = 1-x+x^2- 2*(x-1)^2*(x>1) + 2*(x-2)^2*(x>2) + rnorm(n,sd=.2)
lambda=exp((c(1:60)-30)/3)
data=data.frame(cbind(x,y))   
test_NF = fence.NF(full=y~x, data=data, spline='x', ps=c(1:3), qs=c(2,5), B=1000, lambda=lambda)
plot(test_NF)
summary <- summary(test_NF) 
model_sel <- summary[[1]]
model_sel
lambda_sel <- summary[[2]]
lambda_sel

## End(Not run)
## Not run: 
require(fence)
n = 100
set.seed(1234)
x=runif(n,0,3)
y = 1-x+x^2- 2*(x-1)^2*(x>1) + 2*(x-2)^2*(x>2) + rnorm(n,sd=.2)
lambda=exp((c(1:60)-30)/3)
data=data.frame(cbind(x,y))   
test_NF = fence.NF(full=y~x, data=data, spline='x', ps=c(1:3), qs=c(2,5), B=1000, lambda=lambda)
plot(test_NF)
summary <- summary(test_NF) 
model_sel <- summary[[1]]
model_sel
lambda_sel <- summary[[2]]
lambda_sel

## End(Not run)

Fence model selection (Small Area Estmation)

Description

Fence model selection (Small Area Estmation)

Usage

fence.sae(full, data, B = 100, grid = 101, fence = c("adaptive",
  "nonadaptive"), cn = NA, method = c("F-H", "NER"), D = NA,
  REML = FALSE, bandwidth = NA, cpus = parallel::detectCores())
fence.sae(full, data, B = 100, grid = 101, fence = c("adaptive",
  "nonadaptive"), cn = NA, method = c("F-H", "NER"), D = NA,
  REML = FALSE, bandwidth = NA, cpus = parallel::detectCores())

Arguments

`full`	formular of full model
`data`	data
`B`	number of bootstrap sample, parametric for lmer
`grid`	grid for c
`fence`	fence method to be used, e.g., adaptive, or nonadaptive. It's suggested to choose nonadaptive procedure if c is known; otherwise nonadaptive must be chosen
`cn`	cn for nonadaptive
`method`	Select method to use
`D`	vector containing the D sampling variances of direct estimators for each domain. The values must be sorted as the variables in formula. Only used in FH model
`REML`	Restricted Maximum Likelihood approach
`bandwidth`	bandwidth for kernel smooth function
`cpus`	Number of parallel computers

Details

Value

`models`	list all model candidates in the model space
`B`	list the number of bootstrap samples that have been used
`lack_of_fit_matrix`	list a matrix of Qs for all model candidates (in columns). Each row is for each bootstrap sample
`Qd_matrix`	list a matrix of QM - QM.tilde for all model candidates. Each row is for each bootrap sample
`bandwidth`	list the value of bandwidth
`model_mat`	list a matrix of selected models at each c values in grid (in columns). Each row is for each bootstrap sample
`freq_mat`	list a matrix of coverage probabilities (frequency/smooth_frequency) of each selected models for a given c value (index)
`c`	list the adaptive choice of c value from which the parsimonious model is selected
`sel_model`	list the selected (parsimonious) model given the adaptive c value

Note

The current Fence package focuses on variable selection. However, Fence methods can be used to select other parameters of interest, e.g., tunning parameter, variance-covariance structure, etc.
The number of bootstrap samples is suggested to be increased, e.g., B=1000 when the sample size is small, or signals are weak

Author(s)

Jiming Jiang Jianyang Zhao J. Sunil Rao Thuan Nguyen

References

Jiang J., Rao J.S., Gu Z., Nguyen T. (2008), Fence Methods for Mixed Model Selection. The Annals of Statistics, 36(4): 1669-1692
Jiang J., Nguyen T., Rao J.S. (2009), A Simplified Adaptive Fence Procedure. Statistics and Probability Letters, 79, 625-629
Thuan Nguyen, Jie Peng, Jiming Jiang (2014), Fence Methods for Backcross Experiments. Statistical Computation and Simulation, 84(3), 644-662

Examples

require(fence)
library(snow)
### example 1 ####
data("kidney")
data = kidney[-which.max(kidney$x),]     # Delete a suspicious data point #
data$x2 = data$x^2
data$x3 = data$x^3
data$x4 = data$x^4
data$D = data$sqrt.D.^2
plot(data$y ~ data$x)
full = y~x+x2+x3+x4
# Takes more than 5 seconds to run
# testfh = fence.sae(full, data, B=100, fence="adaptive", method="F-H", D = D)
# testfh$sel_model
# testfh$c
require(fence)
library(snow)
### example 1 ####
data("kidney")
data = kidney[-which.max(kidney$x),]     # Delete a suspicious data point #
data$x2 = data$x^2
data$x3 = data$x^3
data$x4 = data$x^4
data$D = data$sqrt.D.^2
plot(data$y ~ data$x)
full = y~x+x2+x3+x4
# Takes more than 5 seconds to run
# testfh = fence.sae(full, data, B=100, fence="adaptive", method="F-H", D = D)
# testfh$sel_model
# testfh$c

Invisible Fence model selection (Linear Model)

Description

Invisible Fence model selection (Linear Model)

Usage

IF.lm(full, data, B = 100, cpus = 2, lftype = c("abscoef", "pvalue"))
IF.lm(full, data, B = 100, cpus = 2, lftype = c("abscoef", "pvalue"))

Arguments

`full`	formula of full model
`data`	data
`B`	number of bootstrap sample, parametric for lm
`cpus`	number of parallel computers
`lftype`	subtractive measure type, e.g., absolute value of coefficients, p-value, t-value, etc.

Details

This method (Jiang et. al, 2011) is motivated by computational expensive in complex and high dimensional problem. The idea of the method–there is the best model in each dimension (in model space). The boostrapping determines the coverage probability of the selected model in each dimensions. The parsimonious model is the selected model with the highest coverage probabily (except the one for the full model, always probability of 1.)

Value

`full`	list the full model
`B`	list the number of bootstrap samples that have been used
`freq`	list the coverage probabilities of the selected model for each dimension
`size`	list the number of variables in the parsimonious model
`term`	list variables included in the full model
`model`	list the variables selected in-the-order in the parsimonious model

@note The current Invisible Fence focuses on variable selection. The current routine is applicable to the case in which the subtractive measure is the absolute value of the coefficients, p-value, t-value. However, the method can be extended to other subtractive measures. See Jiang et. al (2011) for more details.

Author(s)

Jiming Jiang Jianyang Zhao J. Sunil Rao Thuan Nguyen

References

Jiang J., Rao J.S., Gu Z., Nguyen T. (2008), Fence Methods for Mixed Model Selection. The Annals of Statistics, 36(4): 1669-1692
Jiming Jiang, Thuan Nguyen and J. Sunil Rao (2011), Invisible fence methods and the identification of differentially expressed gene sets. Statistics and Its Interface, Volume 4, 403-415.

Examples

library(fence)
library(MASS)
library(snow)
r =1234; set.seed(r)
p=10; n=300; rho = 0.6
R = diag(p)
for(i in 1:p){
  for(j in 1:p){
     R[i,j] = rho^(abs(i-j))
  }
}
R = 1*R
x=mvrnorm(n, rep(0, p), R)
colnames(x)=paste('x',1:p, sep='')
X = cbind(rep(1,n),x)
tbetas = c(1,1,1,0,1,1,0,1,0,0,0)  # non-zero beta 1,2,4,5,7
epsilon = rnorm(n)
y = as.matrix(X)%*%tbetas + epsilon
colnames(y) = 'y'
data = data.frame(cbind(X,y))
full = y ~ x1+x2+x3+x4+x5+x6+x7+x8+x9+x10
# Takes greater than 5 seconds (~`17 seconds) to run
# obj1 = IF.lm(full = full, data = data, B = 100, lftype = "abscoef")
# sort((names(obj1$model$coef)[-1]))  
# obj2 = IF.lm(full = full, data = data, B = 100, lftype = "pvalue")
# sort(setdiff(names(data[c(-1,-12)]), names(obj2$model$coef)))

library(fence)
library(MASS)
library(snow)
r =1234; set.seed(r)
p=10; n=300; rho = 0.6
R = diag(p)
for(i in 1:p){
  for(j in 1:p){
     R[i,j] = rho^(abs(i-j))
  }
}
R = 1*R
x=mvrnorm(n, rep(0, p), R)
colnames(x)=paste('x',1:p, sep='')
X = cbind(rep(1,n),x)
tbetas = c(1,1,1,0,1,1,0,1,0,0,0)  # non-zero beta 1,2,4,5,7
epsilon = rnorm(n)
y = as.matrix(X)%*%tbetas + epsilon
colnames(y) = 'y'
data = data.frame(cbind(X,y))
full = y ~ x1+x2+x3+x4+x5+x6+x7+x8+x9+x10
# Takes greater than 5 seconds (~`17 seconds) to run
# obj1 = IF.lm(full = full, data = data, B = 100, lftype = "abscoef")
# sort((names(obj1$model$coef)[-1]))  
# obj2 = IF.lm(full = full, data = data, B = 100, lftype = "pvalue")
# sort(setdiff(names(data[c(-1,-12)]), names(obj2$model$coef)))

Invisible Fence model selection (Linear Mixed Model)

Description

Invisible Fence model selection (Linear Mixed Model)

Usage

IF.lmer(full, data, B = 100, REML = TRUE, method = c("marginal",
  "conditional"), cpus = parallel::detectCores(), lftype = c("abscoef",
  "tvalue"))
IF.lmer(full, data, B = 100, REML = TRUE, method = c("marginal",
  "conditional"), cpus = parallel::detectCores(), lftype = c("abscoef",
  "tvalue"))

Arguments

`full`	formula of full model
`data`	data
`B`	number of bootstrap sample, parametric for lmer
`REML`	Restricted maximum likelihood estimation
`method`	choose either marginal (e.g., GEE) or conditional model
`cpus`	Number of parallel computers
`lftype`	subtractive measure type, e.g., absolute value of coefficients, p-value, t-value, etc.

Details

Value

`full`	list the full model
`B`	list the number of bootstrap samples that have been used
`freq`	list the coverage probabilities of the selected model for each dimension
`size`	list the number of variables in the parsimonious model
`term`	list variables included in the full model
`model`	list the variables selected in-the-order in the parsimonious model

Author(s)

Jiming Jiang Jianyang Zhao J. Sunil Rao Thuan Nguyen

References

Jiang J., Rao J.S., Gu Z., Nguyen T. (2008), Fence Methods for Mixed Model Selection. The Annals of Statistics, 36(4): 1669-1692
Jiming Jiang, Thuan Nguyen and J. Sunil Rao (2011), Invisible fence methods and the identification of differentially expressed gene sets. Statistics and Its Interface, Volume 4, 403-415.

Examples

require(fence)
library(snow)
library(MASS)
data("X.lmer")
data = data.frame(X.lmer)
# non-zero beta I.col.2, I.col.3a, I.col.3b, V5, V7, V8, V9
beta = matrix(c(0, 1, 1, 1, 1, 0, 0.1, 0.05, 0.25, 0), ncol = 1)
set.seed(1234)
alpha = rep(rnorm(100), each = 3)
mu = alpha + as.matrix(data[,-1]) %*% beta
data$id = as.factor(data$id)
data$y = mu + rnorm(300)
raw = "y ~ (1|id)+I.col.2+I.col.3a+I.col.3b"
for (i in 5:10) {
    raw = paste0(raw, "+V", i)
}
full = as.formula(raw)
# The following output takes more than 5 seconds (~70 seconds) to run. 

# obj1.lmer = IF.lmer(full = full, data = data, B = 100, method="conditional",lftype = "abscoef")
# sort(obj1.lmer$model) 

# obj2.lmer = IF.lmer(full = full, data = data, B = 100, method="conditional",lftype = "tvalue")
# sort(obj2.lmer$model)

# Similarly, the following scenarios can be run

# obj2.lmer = IF.lmer(full = full, data = data, B = 100, method="conditional",lftype = "tvalue")
# sort(obj2.lmer$model)
# obj1.lm = IF.lmer(full = full, data = data, B = 100, method="marginal", lftype = "abscoef")
# sort(names(obj1.lm$model$coefficients[-1]))
# obj2.lm = IF.lmer(full = full, data = data, B = 100, method="marginal", lftype = "tvalue")
# sort(names(obj2.lm$model$coefficients[-1]))

require(fence)
library(snow)
library(MASS)
data("X.lmer")
data = data.frame(X.lmer)
# non-zero beta I.col.2, I.col.3a, I.col.3b, V5, V7, V8, V9
beta = matrix(c(0, 1, 1, 1, 1, 0, 0.1, 0.05, 0.25, 0), ncol = 1)
set.seed(1234)
alpha = rep(rnorm(100), each = 3)
mu = alpha + as.matrix(data[,-1]) %*% beta
data$id = as.factor(data$id)
data$y = mu + rnorm(300)
raw = "y ~ (1|id)+I.col.2+I.col.3a+I.col.3b"
for (i in 5:10) {
    raw = paste0(raw, "+V", i)
}
full = as.formula(raw)
# The following output takes more than 5 seconds (~70 seconds) to run. 

# obj1.lmer = IF.lmer(full = full, data = data, B = 100, method="conditional",lftype = "abscoef")
# sort(obj1.lmer$model) 

# obj2.lmer = IF.lmer(full = full, data = data, B = 100, method="conditional",lftype = "tvalue")
# sort(obj2.lmer$model)

# Similarly, the following scenarios can be run

# obj2.lmer = IF.lmer(full = full, data = data, B = 100, method="conditional",lftype = "tvalue")
# sort(obj2.lmer$model)
# obj1.lm = IF.lmer(full = full, data = data, B = 100, method="marginal", lftype = "abscoef")
# sort(names(obj1.lm$model$coefficients[-1]))
# obj2.lm = IF.lmer(full = full, data = data, B = 100, method="marginal", lftype = "tvalue")
# sort(names(obj2.lm$model$coefficients[-1]))

Invisible Fence model selection

Description

Invisible Fence model selection

Usage

invisiblefence(mf, f, d, lf, bs)
invisiblefence(mf, f, d, lf, bs)

Arguments

`mf`	Call function, for example: default calls: function(m, b) eblupFH(formula = m, vardir = D, data = b, method = "FH")
`f`	Full model
`d`	Dimension number
`lf`	Measures lack of fit using function(res) -res$fit$goodness[1]
`bs`	Bootstrap

Details

Value

`full`	list the full model
`B`	list the number of bootstrap samples that have been used
`freq`	list the coverage probabilities of the selected model for each dimension
`size`	list the number of variables in the parsimonious model
`term`	list variables included in the full model
`model`	list the variables selected in-the-order in the parsimonious model

Author(s)

Jiming Jiang Jianyang Zhao J. Sunil Rao Thuan Nguyen

References

Jiang J., Rao J.S., Gu Z., Nguyen T. (2008), Fence Methods for Mixed Model Selection. The Annals of Statistics, 36(4): 1669-1692
Jiming Jiang, Thuan Nguyen and J. Sunil Rao (2011), Invisible fence methods and the identification of differentially expressed gene sets. Statistics and Its Interface, Volume 4, 403-415.

Examples

## Not run: 
data("X.lmer")
data = data.frame(X.lmer)
beta = matrix(c(0, 1, 1, 1, 1, 0, 0.1, 0.05, 0.25, 0), ncol = 1)
set.seed(1234)
alpha = rep(rnorm(100), each = 3)
mu = alpha + as.matrix(data[,-1]) %*% beta
data$id = as.factor(data$id)
data$y = mu + rnorm(300)
raw = "y ~ (1|id)+I.col.2+I.col.3a+I.col.3b"
for (i in 5:10) {
    raw = paste0(raw, "+V", i)
}
full = as.formula(raw)
obj1.lmer = IF.lmer(full = full, data = data, B = 100, method="conditional",lftype = "abscoef")
obj1.lmer$model$coefficients
obj2.lmer = IF.lmer(full = full, data = data, B = 100, method="conditional",lftype = "tvalue")
obj2.lmer$model$coefficients
obj1.lm = IF.lmer(full = full, data = data, B = 100, method="marginal", lftype = "abscoef")
obj1.lm$model$coefficients
obj2.lm = IF.lmer(full = full, data = data, B = 100, method="marginal", lftype = "tvalue")
obj2.lm$model$coefficients

## End(Not run)
## Not run: 
data("X.lmer")
data = data.frame(X.lmer)
beta = matrix(c(0, 1, 1, 1, 1, 0, 0.1, 0.05, 0.25, 0), ncol = 1)
set.seed(1234)
alpha = rep(rnorm(100), each = 3)
mu = alpha + as.matrix(data[,-1]) %*% beta
data$id = as.factor(data$id)
data$y = mu + rnorm(300)
raw = "y ~ (1|id)+I.col.2+I.col.3a+I.col.3b"
for (i in 5:10) {
    raw = paste0(raw, "+V", i)
}
full = as.formula(raw)
obj1.lmer = IF.lmer(full = full, data = data, B = 100, method="conditional",lftype = "abscoef")
obj1.lmer$model$coefficients
obj2.lmer = IF.lmer(full = full, data = data, B = 100, method="conditional",lftype = "tvalue")
obj2.lmer$model$coefficients
obj1.lm = IF.lmer(full = full, data = data, B = 100, method="marginal", lftype = "abscoef")
obj1.lm$model$coefficients
obj2.lm = IF.lmer(full = full, data = data, B = 100, method="marginal", lftype = "tvalue")
obj2.lm$model$coefficients

## End(Not run)

kidney

Description

Data used for kidney example

Usage

kidney
kidney

Format

A data frame with 4 variables

Nonadaptive Fence model selection

Description

Nonadaptive Fence model selection

Usage

nonadaptivefence(mf, f, ms, d, lf, pf, cn)
nonadaptivefence(mf, f, ms, d, lf, pf, cn)

Arguments

`mf`	function for fitting the model
`f`	formula of full model
`ms`	list of formula of candidates models
`d`	data
`lf`	measure lack of fit (to minimize)
`pf`	model selection criteria, e.g., model dimension
`cn`	given a specific c value

Value

`models`	list all model candidates in the model space
`lack_of_fit`	list a vector of Qs for all model candidates
`formula`	list the model of the selected parsimonious model
`sel_model`	list the selected (parsimonious) model given the adaptive c value

Author(s)

Jiming Jiang Jianyang Zhao J. Sunil Rao Thuan Nguyen

References

Jiang J., Rao J.S., Gu Z., Nguyen T. (2008), Fence Methods for Mixed Model Selection. The Annals of Statistics, 36(4): 1669-1692
Jiang J., Nguyen T., Rao J.S. (2009), A Simplified Adaptive Fence Procedure. Statistics and Probability Letters, 79, 625-629
Thuan Nguyen, Jie Peng, Jiming Jiang (2014), Fence Methods for Backcross Experiments. Statistical Computation and Simulation, 84(3), 644-662

Examples

## Not run: 
require(fence)

#### Example 1 #####
data(iris)
full = Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width + (1|Species)
test_naf = fence.lmer(full, iris, fence = "nonadaptive", cn = 12)
test_naf$sel_model

## End(Not run)
## Not run: 
require(fence)

#### Example 1 #####
data(iris)
full = Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width + (1|Species)
test_naf = fence.lmer(full, iris, fence = "nonadaptive", cn = 12)
test_naf$sel_model

## End(Not run)

Plot Adaptive Fence model selection

Description

Plot Adaptive Fence model selection

Usage

## S3 method for class 'AF'
plot(x = res, ...)
## S3 method for class 'AF'
plot(x = res, ...)

Arguments

`x`	Object to be plotted
`...`	Additional arguments. CNot currently used.

Plot Nonparametric Fence model selection

Description

Plot Nonparametric Fence model selection

Usage

## S3 method for class 'NF'
plot(x = res, ...)
## S3 method for class 'NF'
plot(x = res, ...)

Arguments

`x`	Object to be plotted
`...`	Additional arguments. CNot currently used.

Adaptive Fence model selection (Restricted Fence)

Description

Adaptive Fence model selection (Restricted Fence)

Usage

RF(full, data, groups, B = 100, grid = 101, bandwidth = NA,
  plot = FALSE, method = c("marginal", "conditional"), id = "id",
  cpus = parallel::detectCores())
RF(full, data, groups, B = 100, grid = 101, bandwidth = NA,
  plot = FALSE, method = c("marginal", "conditional"), id = "id",
  cpus = parallel::detectCores())

Arguments

`full`	formula of full model
`data`	data
`groups`	A list of formulas of (full) model in each bins (groups) of variables
`B`	number of bootstrap sample, parametric for lmer
`grid`	grid for c
`bandwidth`	bandwidth for kernel smooth function
`plot`	Plot object
`method`	either marginal (GEE) or conditional approach is selected
`id`	Subject or cluster id variable
`cpus`	Number of parallel computers

Details

Value

`models`	list all model candidates in the model space
`B`	list the number of bootstrap samples that have been used
`lack_of_fit_matrix`	list a matrix of Qs for all model candidates (in columns). Each row is for each bootstrap sample
`Qd_matrix`	list a matrix of QM - QM.tilde for all model candidates. Each row is for each bootrap sample
`bandwidth`	list the value of bandwidth
`model_mat`	list a matrix of selected models at each c values in grid (in columns). Each row is for each bootstrap sample
`freq_mat`	list a matrix of coverage probabilities (frequency/smooth_frequency) of each selected models for a given c value (index)
`c`	list the adaptive choice of c value from which the parsimonious model is selected
`sel_model`	list the selected (parsimonious) model given the adaptive c value

Note

bandwidth = (cs[2] - cs[1]) * 3. So it's chosen as 3 times grid between two c values.

References

Jiang J., Rao J.S., Gu Z., Nguyen T. (2008), Fence Methods for Mixed Model Selection. The Annals of Statistics, 36(4): 1669-1692
Jiang J., Nguyen T., Rao J.S. (2009), A Simplified Adaptive Fence Procedure. Statistics and Probability Letters, 79, 625-629
Thuan Nguyen, Jiming Jiang (2012), Restricted fence method for covariate selection in longitudinal data analysis. Biostatistics, 13(2), 303-314
Thuan Nguyen, Jie Peng, Jiming Jiang (2014), Fence Methods for Backcross Experiments. Statistical Computation and Simulation, 84(3), 644-662

Examples

## Not run: 
r =1234; set.seed(r)
n = 100; p=15; rho = 0.6
beta = c(1,1,1,0,1,1,0,1,0,0,1,0,0,0,0)  # non-zero beta 1,2,3,V6,V7,V9,V12
id = rep(1:n,each=3)
V.1 = rep(1,n*3)
I.1 = rep(c(1,-1),each=150)
I.2a = rep(c(0,1,-1),n)
I.2b = rep(c(0,-1,1),n)
x = matrix(rnorm(n*3*11), nrow=n*3, ncol=11)
x = cbind(id,V.1,I.1,I.2a,I.2b,x)
R = diag(3)
for(i in 1:3){
 for(j in 1:3){
   R[i,j] = rho^(abs(i-j))
 }
} 
e=as.vector(t(mvrnorm(n, rep(0, 3), R)))  
y = as.vector(x[,-1]%*%beta) + e
data = data.frame(x,y)
raw = "y ~ V.1 + I.1 + I.2a +I.2b"
for (i in 6:16) { raw = paste0(raw, "+V", i)}; full = as.formula(raw)
bin1="y ~ V.1 + I.1 + I.2a +I.2b"
for (i in 6:8) { bin1 = paste0(bin1, "+V", i)}; bin1 = as.formula(bin1)
bin2="y ~ V9"
for (i in 10:16){ bin2 = paste0(bin2, "+V", i)}; bin2 = as.formula(bin2)
# May take longer than 30 min since there are two stages in this RF procedure
obj1.RF = RF(full = full, data = data, groups = list(bin1,bin2), method="conditional")
obj1.RF$sel_model
obj2.RF = RF(full = full, data = data, groups = list(bin1,bin2), B=100, method="marginal")
obj2.RF$sel_model

## End(Not run)
## Not run: 
r =1234; set.seed(r)
n = 100; p=15; rho = 0.6
beta = c(1,1,1,0,1,1,0,1,0,0,1,0,0,0,0)  # non-zero beta 1,2,3,V6,V7,V9,V12
id = rep(1:n,each=3)
V.1 = rep(1,n*3)
I.1 = rep(c(1,-1),each=150)
I.2a = rep(c(0,1,-1),n)
I.2b = rep(c(0,-1,1),n)
x = matrix(rnorm(n*3*11), nrow=n*3, ncol=11)
x = cbind(id,V.1,I.1,I.2a,I.2b,x)
R = diag(3)
for(i in 1:3){
 for(j in 1:3){
   R[i,j] = rho^(abs(i-j))
 }
} 
e=as.vector(t(mvrnorm(n, rep(0, 3), R)))  
y = as.vector(x[,-1]%*%beta) + e
data = data.frame(x,y)
raw = "y ~ V.1 + I.1 + I.2a +I.2b"
for (i in 6:16) { raw = paste0(raw, "+V", i)}; full = as.formula(raw)
bin1="y ~ V.1 + I.1 + I.2a +I.2b"
for (i in 6:8) { bin1 = paste0(bin1, "+V", i)}; bin1 = as.formula(bin1)
bin2="y ~ V9"
for (i in 10:16){ bin2 = paste0(bin2, "+V", i)}; bin2 = as.formula(bin2)
# May take longer than 30 min since there are two stages in this RF procedure
obj1.RF = RF(full = full, data = data, groups = list(bin1,bin2), method="conditional")
obj1.RF$sel_model
obj2.RF = RF(full = full, data = data, groups = list(bin1,bin2), B=100, method="marginal")
obj2.RF$sel_model

## End(Not run)

Summary Adaptive Fence model selection

Description

Summary Adaptive Fence model selection

Usage

## S3 method for class 'AF'
summary(object = res, ...)
## S3 method for class 'AF'
summary(object = res, ...)

Arguments

`object`	Object to be summarized
`...`	addition arguments. Not currently used

Summary Nonparametric Fence model selection

Description

Summary Nonparametric Fence model selection

Usage

## S3 method for class 'NF'
summary(object = res, ...)
## S3 method for class 'NF'
summary(object = res, ...)

Arguments

`object`	Object to be summarized
`...`	addition arguments. Not currently used

X.lmer

Description

Data used in the example for X.lmer

Usage

data(X.lmer)
data(X.lmer)

Format

A data frame with 10 variables:

Package 'fence'

Help Index

Adaptive Fence model selection

Description

Usage

Arguments

Value

Author(s)

References

Examples

Adaptive Fence model selection (Small Area Estmation)

Description

Usage

Arguments

Details

Value

Note

Author(s)

References

Examples

Fence model selection (Linear Mixed Model)

Description

Usage

Arguments

Details

Value

Author(s)

References

Examples

Fence model selection (Nonparametric Model)

Description

Usage

Arguments

Value

Author(s)

References

Examples

Fence model selection (Small Area Estmation)

Description

Usage

Arguments

Details

Value

Note

Author(s)

References

Examples

Invisible Fence model selection (Linear Model)

Description

Usage

Arguments

Details

Value

Author(s)

References

Examples

Invisible Fence model selection (Linear Mixed Model)

Description

Usage

Arguments

Details

Value

Author(s)

References

Examples

Invisible Fence model selection

Description

Usage

Arguments

Details

Value

Author(s)

References

Examples

kidney

Description

Usage

Format

Nonadaptive Fence model selection

Description