Title: Bayesian Generalized Additive Model Selection
Description: Generalized additive model selection via approximate Bayesian inference is provided. Bayesian mixed model-based penalized splines with spike-and-slab-type coefficient prior distributions are used to facilitate fitting and selection. The approximate Bayesian inference engine options are: (1) Markov chain Monte Carlo and (2) mean field variational Bayes. Markov chain Monte Carlo has better Bayesian inferential accuracy, but requires a longer run-time. Mean field variational Bayes is faster, but less accurate. The methodology is described in He and Wand (2023) <arXiv:2201.00412>.
Authors: Virginia X. He [aut], Matt P. Wand [aut, cre]
Maintainer: Matt P. Wand <[email protected]>
License: GPL (>= 2)
Version: 2.0-1
Built: 2024-11-28 06:38:26 UTC
Source: CRAN
Facilitates a graphical check of the Markov chain Monte Carlo samples ("chains") corresponding to key quantities for the predictors selected as having an effect.
checkChains(fitObject,colourVersion = TRUE,paletteNum = 1)
fitObject |
A gamselBayes() fit object for which method is "MCMC". |
colourVersion |
Boolean flag: TRUE = produce the graphical display in colour; FALSE = produce the display in black and white. The default is TRUE. |
paletteNum |
The palette of colours for the graphical display when colourVersion is TRUE. Two palettes, numbered 1 and 2, are available. The default value of paletteNum is 1. |
A graphic is produced that summarises the Markov chain Monte Carlo samples ("chains") corresponding to key quantities for the predictors selected as having an effect. If the predictor is found to have a linear effect then the chain for its coefficient is graphically checked. If the predictor is found to have a non-linear effect then the chain for the vertical slice of the penalized spline fit at the median of the predictor sample is graphically checked. The columns of the graphic are the following summaries of each chain: (1) trace (time series) plot, (2) lag-1 plot in which each chain value is plotted against its previous value and (3) sample autocorrelation function plot as produced by the R function acf(). A rudimentary graphical assessment of convergence involves checking that the trace plots have flat-lined, rather than having any noticeable trends. If the latter occurs then a longer warmup is recommended.
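A longer warmup can be requested through the control argument of gamselBayes(). The following is a minimal sketch, assuming simulated data similar to the examples below; the values 5000 and 2000 are illustrative only:
library(gamselBayes)
set.seed(1) ; n <- 1000
x1 <- rbinom(n,1,0.5) ; x2 <- runif(n) ; x3 <- runif(n)
y <- x1 + sin(2*pi*x2) - x3 + rnorm(n)
# Re-fit with a longer warmup and more kept samples, then re-check the chains:
fitLongerWarmup <- gamselBayes(y,Xlinear = data.frame(x1),Xgeneral = data.frame(x2,x3),
                      control = gamselBayes.control(nWarm = 5000,nKept = 2000))
checkChains(fitLongerWarmup)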
Nothing is returned.
Virginia X. He [email protected] and Matt P. Wand [email protected]
library(gamselBayes)
# Generate some simple regression-type data:
set.seed(1) ; n <- 1000 ; x1 <- rbinom(n,1,0.5)
x2 <- runif(n) ; x3 <- runif(n) ; x4 <- runif(n)
y <- x1 + sin(2*pi*x2) - x3 + rnorm(n)
Xlinear <- data.frame(x1) ; Xgeneral <- data.frame(x2,x3,x4)
# Obtain a gamselBayes() fit for the data:
fitMCMC <- gamselBayes(y,Xlinear,Xgeneral)
summary(fitMCMC)
# Obtain a graphic for checking the chains:
checkChains(fitMCMC)
if (require("Ecdat"))
{
   # Obtain a gamselBayes() fit for data on schools in California, U.S.A.:
   Caschool$log.avginc <- log(Caschool$avginc)
   mathScore <- Caschool$mathscr
   Xgeneral <- Caschool[,c("mealpct","elpct","calwpct","compstu","log.avginc")]
   fitMCMC <- gamselBayes(y = mathScore,Xgeneral = Xgeneral)
   summary(fitMCMC)
   # Obtain a graphic for checking the chains:
   checkChains(fitMCMC)
}
Produces a tabular summary of the estimated effect types from a gamselBayes() fit object.
effectTypes(fitObject)
fitObject |
A gamselBayes() fit object. |
Two tables are printed to standard output. The first table lists the names of the predictors that are estimated as having a linear effect. The second table lists the names of the predictors that are estimated as having a nonlinear effect.
Nothing is returned.
Virginia X. He [email protected] and Matt P. Wand [email protected]
library(gamselBayes)
# Generate some simple regression-type data:
set.seed(1) ; n <- 1000 ; x1 <- rbinom(n,1,0.5)
x2 <- runif(n) ; x3 <- runif(n) ; x4 <- runif(n)
y <- x1 + sin(2*pi*x2) - x3 + rnorm(n)
Xlinear <- data.frame(x1) ; Xgeneral <- data.frame(x2,x3,x4)
# Obtain a gamselBayes() fit for the data:
fit <- gamselBayes(y,Xlinear,Xgeneral)
# Tabulate the estimated effect types:
effectTypes(fit)
Extracts the vector of estimated effect types from a gamselBayes() fit object.
effectTypesVector(fitObject)
fitObject |
A gamselBayes() fit object. |
The result is a vector of character strings having the same length as the total number of predictors inputted through Xlinear and Xgeneral. The character strings are one of "linear", "nonlinear" and "zero" according to whether each predictor is estimated as having a linear effect, a nonlinear effect or zero effect. The ordering in the returned vector matches that of the columns of Xlinear and then the columns of Xgeneral.
A vector of character strings having the same length as the number of predictors, which conveys the estimated effect types.
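Because the ordering of the returned vector is documented, it can be matched against the predictor names to extract, for example, the predictors estimated as acting non-linearly. A brief sketch, using simulated data similar to the example below:
library(gamselBayes)
set.seed(1) ; n <- 1000
x1 <- rbinom(n,1,0.5) ; x2 <- runif(n) ; x3 <- runif(n) ; x4 <- runif(n)
y <- x1 + sin(2*pi*x2) - x3 + rnorm(n)
Xlinear <- data.frame(x1) ; Xgeneral <- data.frame(x2,x3,x4)
fit <- gamselBayes(y,Xlinear,Xgeneral)
effTypes <- effectTypesVector(fit)
table(effTypes)                                  # counts of each estimated effect type
predNames <- c(names(Xlinear),names(Xgeneral))   # matches the documented ordering
predNames[effTypes == "nonlinear"]               # predictors estimated to have non-linear effects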
Virginia X. He [email protected] and Matt P. Wand [email protected]
library(gamselBayes)
# Generate some simple regression-type data:
set.seed(1) ; n <- 1000 ; x1 <- rbinom(n,1,0.5)
x2 <- runif(n) ; x3 <- runif(n) ; x4 <- runif(n)
y <- x1 + sin(2*pi*x2) - x3 + rnorm(n)
Xlinear <- data.frame(x1) ; Xgeneral <- data.frame(x2,x3,x4)
# Obtain a gamselBayes() fit for the data:
fit <- gamselBayes(y,Xlinear,Xgeneral)
# Obtain the vector of effect types:
effectTypesVector(fit)
Selection of predictors, and of the nature of their impact on the mean response (linear versus non-linear), is a fundamental problem in regression analysis. This function uses the generalized additive models framework for estimating predictor effects. An approximate Bayesian inference approach is used, with two options for achieving this: (1) Markov chain Monte Carlo and (2) mean field variational Bayes.
gamselBayes(y,Xlinear = NULL,Xgeneral = NULL,method = "MCMC",lowerMakesSparser = NULL, family = "gaussian",verbose = TRUE,control = gamselBayes.control())
y |
Vector containing the response data. If 'family = "gaussian"' then the response data are modelled as being continuous with a Gaussian distribution. If 'family = "binomial"' then the response data must be binary with 0/1 coding. |
Xlinear |
Data frame with number of rows equal to the length of y, containing predictors that can only have a linear effect on the mean response. |
Xgeneral |
A data frame with number of rows equal to the length of y, containing predictors that potentially have a non-linear effect on the mean response. |
method |
Character string for specifying the approximate Bayesian inference method to be used: "MCMC" for Markov chain Monte Carlo (the default) or "MFVB" for mean field variational Bayes. |
lowerMakesSparser |
A threshold parameter between 0 and 1, which is such that lower values lead to sparser fits. |
family |
Character string for specifying the response family: "gaussian" (the default) or "binomial". |
verbose |
Boolean variable for specifying whether or not progress messages are printed to the console. The default is TRUE. |
control |
Function for controlling the spline bases, Markov chain Monte Carlo sampling, mean field variational Bayes and other specifications. |
Generalized additive model selection via approximate Bayesian inference is provided. Bayesian mixed model-based penalized splines with spike-and-slab-type coefficient prior distributions are used to facilitate fitting and selection. The approximate Bayesian inference engine options are: (1) Markov chain Monte Carlo and (2) mean field variational Bayes. Markov chain Monte Carlo has better Bayesian inferential accuracy, but requires a longer run-time. Mean field variational Bayes is faster, but less accurate. The methodology is described in He and Wand (2023) <arXiv:2201.00412>.
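The fast mean field variational Bayes engine and the binomial family option can be combined. A minimal sketch with a binary response; the simulated data and the logistic form of the true probability are illustrative only and not taken from the package documentation:
library(gamselBayes)
set.seed(1) ; n <- 1000
x1 <- rbinom(n,1,0.5) ; x2 <- runif(n) ; x3 <- runif(n)
probTrue <- plogis(x1 + sin(2*pi*x2) - x3)
yBin <- rbinom(n,1,probTrue)
# Fit with the faster mean field variational Bayes engine and a binomial response:
fitBinMFVB <- gamselBayes(yBin,Xlinear = data.frame(x1),Xgeneral = data.frame(x2,x3),
                          method = "MFVB",family = "binomial")
summary(fitBinMFVB) ; plot(fitBinMFVB)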
An object of class gamselBayes, which is a list with the following components:
method |
the value of method. |
family |
the value of family. |
Xlinear |
the inputted design matrix containing predictors that can only have linear effects. |
Xgeneral |
the inputted design matrix containing predictors that potentially have non-linear effects. |
rangex |
the value of the control parameter |
intKnots |
the value of the control parameter |
truncateBasis |
the value of the control parameter truncateBasis. |
numBasis |
the value of the control parameter numBasis. |
MCMC |
a list in which each component is the retained Markov chain Monte Carlo (MCMC) sample for a model parameter. |
MFVB |
a list in which each component contains the parameters of a mean field variational Bayes approximate posterior density function (q-density). |
effectTypeHat |
an array of character strings, with entry either "zero", "linear" or "nonlinear", signifying the estimated effect type for each candidate predictor. |
meanXlinear |
an array containing the sample means of each column of Xlinear. |
sdXlinear |
an array containing the sample standard deviations of each column of Xlinear. |
meanXgeneral |
an array containing the sample means of each column of Xgeneral. |
sdXgeneral |
an array containing the sample standard deviations of each column of Xgeneral. |
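The components listed above can be inspected directly from the returned list. A brief sketch, assuming simulated data in the style of the examples below:
library(gamselBayes)
set.seed(1) ; n <- 1000
x2 <- runif(n) ; x3 <- runif(n)
y <- sin(2*pi*x2) - x3 + rnorm(n)
fit <- gamselBayes(y,Xgeneral = data.frame(x2,x3))
names(fit)     # all components of the gamselBayes fit object
fit$method     # "MCMC" by default
fit$family     # "gaussian" by default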
Virginia X. He [email protected] and Matt P. Wand [email protected]
Chouldechova, A. and Hastie, T. (2015). Generalized additive model selection. <arXiv:1506.03850v2>.
He, V.X. and Wand, M.P. (2023). Bayesian generalized additive model selection including a fast variational option. <arXiv:2201.00412>.
library(gamselBayes)
# Generate some simple regression-type data:
set.seed(1) ; n <- 1000 ; x1 <- rbinom(n,1,0.5)
x2 <- runif(n) ; x3 <- runif(n) ; x4 <- runif(n)
y <- x1 + sin(2*pi*x2) - x3 + rnorm(n)
Xlinear <- data.frame(x1) ; Xgeneral <- data.frame(x2,x3,x4)
# Obtain a gamselBayes() fit for the data, using Markov chain Monte Carlo:
fitMCMC <- gamselBayes(y,Xlinear,Xgeneral)
summary(fitMCMC) ; plot(fitMCMC) ; checkChains(fitMCMC)
# Obtain a gamselBayes() fit for the data, using mean field variational Bayes:
fitMFVB <- gamselBayes(y,Xlinear,Xgeneral,method = "MFVB")
summary(fitMFVB) ; plot(fitMFVB)
if (require("Ecdat"))
{
   # Obtain a gamselBayes() fit for data on schools in California, U.S.A.:
   Caschool$log.avginc <- log(Caschool$avginc)
   mathScore <- Caschool$mathscr
   Xgeneral <- Caschool[,c("mealpct","elpct","calwpct","compstu","log.avginc")]
   # Obtain a gamselBayes() fit for the data, using Markov chain Monte Carlo:
   fitMCMC <- gamselBayes(y = mathScore,Xgeneral = Xgeneral)
   summary(fitMCMC) ; plot(fitMCMC) ; checkChains(fitMCMC)
   # Obtain a gamselBayes() fit for the data, using mean field variational Bayes:
   fitMFVB <- gamselBayes(y = mathScore,Xgeneral = Xgeneral,method = "MFVB")
   summary(fitMFVB) ; plot(fitMFVB)
}
Function for optional use in calls to gamselBayes() to control spline basis dimension, hyperparameter choice and other specifications for generalized additive model selection via approximate Bayesian inference.
gamselBayes.control(numIntKnots = 25,truncateBasis = TRUE,numBasis = 12,
                    sigmabeta0 = 100000,sbeta = 1000,sepsilon = 1000,
                    su = 1000,rhoBeta = 0.5,rhoU = 0.5,nWarm = 1000,
                    nKept = 1000,nThin = 1,maxIter = 1000,toler = 1e-8,
                    msgCode = 1)
numIntKnots |
The number of interior knots used in construction of the Demmler-Reinsch spline basis functions. The default value of numIntKnots is 25. |
truncateBasis |
Boolean flag: TRUE = truncate the spline basis so that numBasis basis functions are retained; FALSE = retain the full spline basis. The default is TRUE. |
numBasis |
The number of spline basis functions retained after truncation when truncateBasis is TRUE. The default value is 12. |
sigmabeta0 |
The standard deviation hyperparameter for the Normal prior distribution on the intercept parameter for the standardized version of the data used in Bayesian inference. The default is 100000. |
sbeta |
The standard deviation hyperparameter for the Normal prior distribution on the linear component coefficients parameters for the standardized version of the data used in Bayesian inference. The default is 1000. |
sepsilon |
The scale hyperparameter for the Half-Cauchy prior distribution on the error standard deviation parameter for the standardized version of the data used in Bayesian inference. The default is 1000. |
su |
The scale hyperparameter for the Half-Cauchy prior distribution on the spline basis coefficients standard deviation parameter for the standardized version of the data used in Bayesian inference. The default is 1000. |
rhoBeta |
The probability parameter for the Bernoulli prior distribution on the linear component coefficients spike-and-slab auxiliary indicator variables probability parameter. The default is 0.5. |
rhoU |
The probability parameter for the Bernoulli prior distribution on the spline basis coefficients spike-and-slab auxiliary indicator variables probability parameter. The default is 0.5. |
nWarm |
The size of the Markov chain Monte Carlo warmup, a positive integer, when method is Markov chain Monte Carlo. The default is 1000. |
nKept |
The size of the kept Markov chain Monte Carlo samples, a positive integer, when the method is Markov chain Monte Carlo. The default is 1000. |
nThin |
The thinning factor for the kept Markov chain Monte Carlo samples, a positive integer, when the method is Markov chain Monte Carlo. The default is 1. |
maxIter |
The maximum number of mean field variational Bayes iterations, a positive integer, when the method is mean field variational Bayes. The default is 1000. |
toler |
The convergence tolerance for mean field variational Bayes iterations, a positive number less than 0.5, when the method is mean field variational Bayes. Convergence is deemed to have occurred if the relative change in the logarithm of the approximate marginal likelihood falls below toler. The default value is 1e-8. |
msgCode |
Code for the nature of progress messages printed to the screen. If msgCode = 0 then no progress messages are printed. If msgCode = 1 then a message is printed each time approximately 10% of the sampling is completed. If msgCode = 2 then progress messages are printed more frequently. The default is 1. |
A list containing the values of each of the control parameters, packaged to supply the control argument to gamselBayes(). The values for gamselBayes.control() can be specified in the call to gamselBayes().
Virginia X. He [email protected] and Matt P. Wand [email protected]
He, V.X. and Wand, M.P. (2023). Bayesian generalized additive model selection including a fast variational option. <arXiv:2201.00412>.
library(gamselBayes)
# Generate some simple regression-type data:
set.seed(1) ; n <- 1000 ; x1 <- rbinom(n,1,0.5)
x2 <- runif(n) ; x3 <- runif(n) ; x4 <- runif(n)
y <- x1 + sin(2*pi*x2) - x3 + rnorm(n)
Xlinear <- data.frame(x1) ; Xgeneral <- data.frame(x2,x3,x4)
# Obtain a gamselBayes() fit for the data:
fit <- gamselBayes(y,Xlinear,Xgeneral)
summary(fit) ; plot(fit) ; checkChains(fit)
# Now modify some of the control values:
fitControlled <- gamselBayes(y,Xlinear,Xgeneral,control = gamselBayes.control(
                    numIntKnots = 35,truncateBasis = FALSE,
                    sbeta = 10000,su = 10000,nWarm = 2000,nKept = 1500))
summary(fitControlled) ; plot(fitControlled) ; checkChains(fitControlled)
Update a gamselBayes() fit object
Facilitates updating of a gamselBayes() fit object when a key parameter controlling model selection is modified. Use of gamselBayesUpdate() allows for fast tweaking of such a parameter without another, potentially time-consuming, call to gamselBayes().
gamselBayesUpdate(fitObject,lowerMakesSparser = NULL)
fitObject |
A gamselBayes() fit object. |
lowerMakesSparser |
A threshold parameter between 0 and 1, which is such that lower values lead to sparser fits. |
The gamselBayesUpdate() function is applicable when a gamselBayes() fit object has been obtained for particular data inputs y, Xlinear and Xgeneral (as well as other tuning-type inputs) and the analyst is interested in changing the value of the parameter that controls model selection. This parameter is named lowerMakesSparser, and is described above. A call to gamselBayesUpdate() with a new value of lowerMakesSparser produces an updated gamselBayes() fit object with, potentially, different effect type estimates.
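For example, the threshold can be swept over several values and the resulting effect-type classifications compared, without any re-sampling. A minimal sketch using simulated data in the style of the examples below; the threshold values are illustrative only:
library(gamselBayes)
set.seed(1) ; n <- 1000
x1 <- rbinom(n,1,0.5) ; x2 <- runif(n) ; x3 <- runif(n) ; x4 <- runif(n)
y <- x1 + sin(2*pi*x2) - x3 + rnorm(n)
fitOrig <- gamselBayes(y,Xlinear = data.frame(x1),Xgeneral = data.frame(x2,x3,x4))
# Compare the selected effect types over a grid of thresholds;
# gamselBayesUpdate() re-uses the stored fit, so gamselBayes() is not re-run:
for (thresh in c(0.3,0.5,0.7))
   print(effectTypesVector(gamselBayesUpdate(fitOrig,lowerMakesSparser = thresh)))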
An object of class gamselBayes with the same components as those produced by the gamselBayes() function. See help(gamselBayes) for details.
Virginia X. He [email protected] and Matt P. Wand [email protected]
library(gamselBayes)
# Generate some regression-type data:
set.seed(1) ; n <- 5000 ; numPred <- 15
Xgeneral <- as.data.frame(matrix(runif(n*numPred),n,numPred))
names(Xgeneral) <- paste("x",1:numPred,sep="")
y <- as.vector(0.1 + 0.4*Xgeneral[,1] - 2*pnorm(3-6*Xgeneral[,2])
               - 0.9*Xgeneral[,4] + cos(3*pi*Xgeneral[,5]) + 2*rnorm(n))
# Obtain and assess a gamselBayes() fit:
fitOrig <- gamselBayes(y,Xgeneral = Xgeneral)
summary(fitOrig) ; plot(fitOrig)
print(fitOrig$effectTypesHat)
# Update the gamselBayes() fit object with a new value of
# the "lowerMakesSparser" parameter:
fitUpdated <- gamselBayesUpdate(fitOrig,lowerMakesSparser = 0.6)
summary(fitUpdated) ; plot(fitUpdated)
print(fitUpdated$effectTypesHat)
The vignette of the gamselBayes package is displayed using the default PDF viewer. It provides a detailed description of the use of the package for Bayesian generalized additive model selection.
gamselBayesVignette()
Nothing is returned.
Virginia X. He [email protected] and Matt P. Wand [email protected]
gamselBayesVignette()
Plot a gamselBayes() fit
The estimated non-linear components of the generalized additive model selected via gamselBayes() are plotted.
## S3 method for class 'gamselBayes'
plot(x,credLev = 0.95,gridSize = 251,nMC = 5000,varBand = TRUE,
     shade = TRUE,yscale = "response",rug = TRUE,rugSampSize = NULL,estCol = "darkgreen",
     varBandCol = NULL,rugCol = "dodgerblue",mfrow = NULL,xlim = NULL,ylim = NULL,
     xlab = NULL,ylab = NULL,mai = NULL,pages = NULL,cex.axis = 1.5,cex.lab = 1.5,...)
x |
A gamselBayes() fit object. |
credLev |
A number between 0 and 1 such that the credible interval band has (100*credLev)% approximate pointwise coverage. The default value is 0.95. |
gridSize |
The number of grid points used to display each estimated non-linear function curve and its pointwise credible interval band. The default value is 251. |
nMC |
The size of the Monte Carlo sample, a positive integer, for carrying out approximate inference from the mean field variational Bayes-approximate posterior distributions when the method is mean field variational Bayes. The default value is 5000. |
varBand |
Boolean flag specifying whether or not a variability band is included: TRUE = include a pointwise credible interval variability band around each estimated function; FALSE = omit the band. The default is TRUE. |
shade |
Boolean flag specifying whether or not the variability band is displayed using shading: TRUE = display the band as a shaded region; FALSE = display the band via its limit curves. The default is TRUE. |
yscale |
Character string specifying the vertical axis scale for display of estimated non-linear functions: "response" (the default) for the response scale or "link" for the link (linear predictor) scale. |
rug |
Boolean flag specifying whether or not rug-type displays of the predictor data are added beneath each plot: TRUE = add the rug display; FALSE = omit it. The default is TRUE. |
rugSampSize |
The size of the random sample of each predictor to be used in rug-type displays. |
estCol |
Colour of the estimated function curve. The default value is "darkgreen". |
varBandCol |
Colour of the pointwise credible interval variability band. If varBandCol is NULL then a default colour is used. The default value is NULL. |
rugCol |
Colour of rug plot that shows values of the predictor data. The default value is "dodgerblue". |
mfrow |
An optional two-entry vector for specifying the layout of the nonlinear fit displays. |
xlim |
An optional two-column matrix for specification of horizontal frame limits in the plotting of the estimated non-linear predictor effects. The number of rows in xlim should equal the number of non-linear effects being plotted. |
ylim |
The same as xlim, but for the vertical frame limits. |
xlab |
An optional vector of character strings containing labels for the horizontal axes. The number of entries in xlab should equal the number of non-linear effects being plotted. |
ylab |
The same as xlab, but for the vertical axis labels. |
mai |
An optional numerical vector of length 4 for specification of inner margin dimensions of each panel, ordered clockwise from below the panel. |
pages |
An optional positive integer that specifies the number of pages used to display the non-linear function estimates. |
cex.axis |
An optional positive scalar value for specification of the character expansion factor for the axis values. |
cex.lab |
An optional positive scalar value for specification of the character expansion factor for the labels. |
... |
Place-holder for other graphical parameters. |
The estimated non-linear components of the selected generalized additive model are plotted. Each plot corresponds to a slice of the selected generalized additive model surface, with all other predictors set to their median values. Pointwise credible interval bands are also displayed unless varBand is FALSE.
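The layout and credible level of the display can be adjusted via the documented arguments. A brief sketch, with illustrative argument values and simulated data in the style of the examples below:
library(gamselBayes)
set.seed(1) ; n <- 1000
x2 <- runif(n) ; x3 <- runif(n)
y <- sin(2*pi*x2) - x3 + rnorm(n)
fit <- gamselBayes(y,Xgeneral = data.frame(x2,x3))
# Display all non-linear estimates on one page, with 99% credible bands
# drawn without shading:
plot(fit,credLev = 0.99,pages = 1,shade = FALSE)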
Nothing is returned.
Virginia X. He [email protected] and Matt P. Wand [email protected]
library(gamselBayes)
# Generate some simple regression-type data:
set.seed(1) ; n <- 1000 ; x1 <- rbinom(n,1,0.5)
x2 <- runif(n) ; x3 <- runif(n) ; x4 <- runif(n)
y <- x1 + sin(2*pi*x2) - x3 + rnorm(n)
Xlinear <- data.frame(x1) ; Xgeneral <- data.frame(x2,x3,x4)
# Obtain a gamselBayes() fit for the data and print out a summary:
fit <- gamselBayes(y,Xlinear,Xgeneral)
summary(fit)
# Plot the predictor effect(s) estimated as being non-linear:
plot(fit)
# Plot the same fit(s) but with different colours and style:
plot(fit,shade = FALSE,estCol = "darkmagenta",varBandCol = "plum",
     rugCol = "goldenrod")
if (require("Ecdat"))
{
   # Obtain a gamselBayes() fit for data on schools in California, U.S.A.:
   Caschool$log.avginc <- log(Caschool$avginc)
   mathScore <- Caschool$mathscr
   Xgeneral <- Caschool[,c("mealpct","elpct","calwpct","compstu","log.avginc")]
   fit <- gamselBayes(y = mathScore,Xgeneral = Xgeneral)
   summary(fit)
   # Plot the predictor effect(s) estimated as being non-linear:
   plot(fit)
}
Predict from a gamselBayes() fit
Predictions at new values of the predictors are obtained from the generalized additive model selected via gamselBayes().
## S3 method for class 'gamselBayes'
predict(object,newdata,type = "response",...)
object |
A gamselBayes() fit object. |
newdata |
A two-component list with the following components: the first is a data frame containing new data for the predictors in Xlinear and the second is a data frame containing new data for the predictors in Xgeneral. |
type |
A character string for specifying the type of prediction, with the following options: "link", "response" or "terms", which leads to the value as described below. |
... |
A place-holder for other prediction parameters. |
A vector or data frame, depending on the value of type:
If type = "link" then the value is a vector of linear predictor-scale fitted values.
If type = "response" and family = "binomial" then the value is a vector of probability-scale fitted values. Otherwise (i.e. when family = "gaussian") the value is the vector of linear predictor-scale fitted values.
If type = "terms" then the value is a data frame with number of columns equal to the total number of predictors. Each column is the contribution to the vector of linear predictor-scale fitted values from each predictor. These contributions do not include the intercept predicted value. The intercept predicted value is included as an attribute of the returned data frame.
Virginia X. He [email protected] and Matt P. Wand [email protected]
library(gamselBayes)
# Generate some simple regression-type data:
n <- 1000 ; x1 <- rbinom(n,1,0.5)
x2 <- runif(n) ; x3 <- runif(n) ; x4 <- runif(n)
y <- x1 + sin(2*pi*x2) - x3 + rnorm(n)
Xlinear <- data.frame(x1) ; Xgeneral <- data.frame(x2,x3,x4)
names(Xlinear) <- c("x1") ; names(Xgeneral) <- c("x2","x3","x4")
# Obtain and summarise a gamselBayes() fit for the data:
fit <- gamselBayes(y,Xlinear,Xgeneral)
summary(fit)
# Obtain some new data:
nNew <- 10
x1new <- rbinom(nNew,1,0.5) ; x2new <- runif(nNew) ; x3new <- runif(nNew)
x4new <- runif(nNew)
XlinearNew <- data.frame(x1new) ; names(XlinearNew) <- "x1"
XgeneralNew <- data.frame(x2new,x3new,x4new)
names(XgeneralNew) <- c("x2","x3","x4")
newdataList <- list(XlinearNew,XgeneralNew)
# Obtain predictions at the new data:
predObjDefault <- predict(fit,newdata = newdataList)
print(predObjDefault)
Summarise a gamselBayes() fit
Inference summaries of the estimated linear component coefficients of the generalized additive model selected via gamselBayes() are tabulated.
## S3 method for class 'gamselBayes'
summary(object,credLev = 0.95,sigFigs = 5,nMC = 10000,...)
object |
A gamselBayes() fit object. |
credLev |
A number between 0 and 1 such that the credible interval band has (100*credLev)% approximate pointwise coverage. The default value is 0.95. |
sigFigs |
The number of significant figures used for the entries of the summary table. The default value is 5. |
nMC |
The size of the Monte Carlo sample, a positive integer, for carrying out approximate inference from the mean field variational Bayes-approximate posterior distributions when the method is mean field variational Bayes. The default value is 10000. |
... |
Place-holder for other summary parameters. |
If the selected generalized additive model has at least one predictor having a linear effect then a data frame is returned. The columns of the data frame correspond to posterior means and credible interval limits of the linear effect coefficients.
A data frame containing linear effect Bayesian inferential summaries.
Virginia X. He [email protected] and Matt P. Wand [email protected]
library(gamselBayes)
# Generate some simple regression-type data:
set.seed(1) ; n <- 1000 ; x1 <- rbinom(n,1,0.5)
x2 <- runif(n) ; x3 <- runif(n) ; x4 <- runif(n)
y <- x1 + sin(2*pi*x2) - x3 + rnorm(n)
Xlinear <- data.frame(x1) ; Xgeneral <- data.frame(x2,x3,x4)
# Obtain a gamselBayes() fit for the data and print out a summary:
fit <- gamselBayes(y,Xlinear,Xgeneral)
summary(fit)
# Print the summary with different values of some of the arguments:
summary(fit,credLev = 0.99,sigFigs = 3)
if (require("Ecdat"))
{
   # Obtain a gamselBayes() fit for data on schools in California, U.S.A.:
   Caschool$log.avginc <- log(Caschool$avginc)
   mathScore <- Caschool$mathscr
   Xgeneral <- Caschool[,c("mealpct","elpct","calwpct","compstu","log.avginc")]
   fit <- gamselBayes(y = mathScore,Xgeneral = Xgeneral)
   summary(fit)
}