Title: | Bayesian Model Averaging Library |
---|---|
Description: | Bayesian Model Averaging for linear models with a wide choice of (customizable) priors. Built-in priors include coefficient priors (fixed, hyper-g and empirical priors), 5 kinds of model priors, moreover model sampling by enumeration or various MCMC approaches. Post-processing functions allow for inferring posterior inclusion and model probabilities, various moments, coefficient and predictive densities. Plotting functions available for posterior model size, MCMC convergence, predictive and coefficient densities, best models representation, BMA comparison. Also includes Bayesian normal-conjugate linear model with Zellner's g prior, and assorted methods. |
Authors: | Martin Feldkircher and Stefan Zeugner and Paul Hofmarcher |
Maintainer: | Stefan Zeugner <[email protected]> |
License: | BSD_3_clause + file LICENSE |
Version: | 0.3.5 |
Built: | 2024-10-31 20:47:22 UTC |
Source: | CRAN |
Extracts a model out of a bma
object's saved models and converts it
to a zlm
linear model
as.zlm(bmao, model = 1)
as.zlm(bmao, model = 1)
bmao |
A |
model |
The model index, in one of the following forms: |
A bma object stores several 'best' models it encounters (cf. argument
nmodel
in bms
). as.zlm
extracts a single model
and converts it to an object of class zlm
, which represents a
linear model estimated under Zellner's g prior.
The utility
model.frame
allows to transfrom a zlm
model into an OLS
model of class lm
.
a list of class zlm
Stefan Zeugner
bms
for creating bma
objects,
zlm
for creating zlm
objects,
pmp.bma
for displaying the
topmodels in a bma
object
Check http://bms.zeugner.eu for additional help.
data(datafls) mm=bms(datafls[,1:6],mcmc="enumeration") # do a small BMA chain topmodels.bma(mm)[,1:5] #display the best 5 models m2a=as.zlm(mm,4) #extract the fourth best model summary(m2a) # Bayesian Model Selection: # transform the best model into an OLS model: lm(model.frame(as.zlm(mm))) # extract the model only containing the 5th regressor m2b=as.zlm(mm,c(0,0,0,0,1)) # extract the model only containing the 5th regressor in hexcode print(bin2hex(c(0,0,0,0,1))) m2c=as.zlm(mm,"01")
data(datafls) mm=bms(datafls[,1:6],mcmc="enumeration") # do a small BMA chain topmodels.bma(mm)[,1:5] #display the best 5 models m2a=as.zlm(mm,4) #extract the fourth best model summary(m2a) # Bayesian Model Selection: # transform the best model into an OLS model: lm(model.frame(as.zlm(mm))) # extract the model only containing the 5th regressor m2b=as.zlm(mm,c(0,0,0,0,1)) # extract the model only containing the 5th regressor in hexcode print(bin2hex(c(0,0,0,0,1))) m2c=as.zlm(mm,"01")
Returns a matrix whose columns are the (expected value or standard deviations of) coefficients for the best models in a bma object.
beta.draws.bma(bmao, stdev = FALSE)
beta.draws.bma(bmao, stdev = FALSE)
bmao |
a 'bma' object (as e.g. resulting from |
stdev |
if |
Each column presents the coefficients for the model indicated by its
column name. The zero coefficients are the excluded covariates per model.
Note that the coefficients returned are only those of the best (100) models
encountered by the bma
object (cf. argument nmodels
of
bms
).
For aggregate coefficients please refer to coef.bma
.
Note that the elements of beta.draws.bma(bmao)
correspond to
bmao$topmod$betas()
bms
for creating bms objects, coef.bma
for aggregate coefficients
Check http://bms.zeugner.eu for additional help.
#sample a bma object: data(datafls) mm=bms(datafls,burn=500,iter=5000,nmodel=20) #coefficients for all beta.draws.bma(mm) #standard deviations for the fourth- to eight best models beta.draws.bma(mm[4:8],TRUE);
#sample a bma object: data(datafls) mm=bms(datafls,burn=500,iter=5000,nmodel=20) #coefficients for all beta.draws.bma(mm) #standard deviations for the fourth- to eight best models beta.draws.bma(mm[4:8],TRUE);
A list holding results from a BMA iteration chain
Objects can be created via calls to
bms
, but indirectly also via c.bma
A
bma
object is a list whose elements hold information on input and
output for a Bayesian Model Averaging iteration chain, such as from a call
to bms
:
Martin Feldkircher and Stefan Zeugner
bms
for creating bma
objects,
or
topmod
for the topmod object
data(datafls) mm=bms(datafls) #show posterior model size print(mm$info$msize/mm$info$cumsumweights) #is the same number as in summary(mm)
data(datafls) mm=bms(datafls) #show posterior model size print(mm$info$msize/mm$info$cumsumweights) #is the same number as in summary(mm)
Given data and prior information, this function samples all possible model combinations via MC3 or enumeration and returns aggregate results.
bms( X.data, burn = 1000, iter = NA, nmodel = 500, mcmc = "bd", g = "UIP", mprior = "random", mprior.size = NA, user.int = TRUE, start.value = NA, g.stats = TRUE, logfile = FALSE, logstep = 10000, force.full.ols = FALSE, fixed.reg = numeric(0) )
bms( X.data, burn = 1000, iter = NA, nmodel = 500, mcmc = "bd", g = "UIP", mprior = "random", mprior.size = NA, user.int = TRUE, start.value = NA, g.stats = TRUE, logfile = FALSE, logstep = 10000, force.full.ols = FALSE, fixed.reg = numeric(0) )
X.data |
a data frame or a matrix, with the dependent variable in the
first column, followed by the covariates (alternatively, |
burn |
The (positive integer) number of burn-in draws for the MC3 sampler, defaults to 1000. (Not taken into account if mcmc="enumerate") |
iter |
If mcmc is set to an MC3 sampler, then this is the number of
iteration draws to be sampled (ex burn-ins), default 3000 draws. |
nmodel |
the number of best models for which information is stored
(default 500). Best models are used for convergence analysis between
likelihoods and MCMC frequencies, as well as likelihood-based inference. |
mcmc |
a character denoting the model sampler to be used. |
g |
the hyperparameter on Zellner's g-prior for the regression
coefficients. |
mprior |
a character denoting the model prior choice, defaulting to
"random": |
mprior.size |
if |
user.int |
'interactive mode': print out results to console after ending the routine and plots a chart (default TRUE). |
start.value |
specifies the starting model of the iteration chain. For
instance a specific model by the corresponding column indices (e.g.
starting.model=numeric(K) starts from the null model including solely a
constant term) or |
g.stats |
|
logfile |
setting |
logstep |
specifies at which number of posterior draws information is written to the log file; default: 10 000 iterations |
force.full.ols |
default FALSE. If |
fixed.reg |
indices or variable names of |
Ad mcmc
:
Interaction sampler: adding an ".int" to an MC3 sampler
(e.g. "mcmc="bd.int") provides for special treatment of interaction terms.
Interaction terms will only be sampled along with their component variables:
In the colnumn names of X.data, interaction terms need to be denominated by
names consisting of the base terms separated by #
(e.g. an
interaction term of base variables "A"
, "B"
and "C"
needs column name "A#B#C"
). Then variable "A#B#C"
will only be
included in a model if all of the component variables ("A", "B", and "C")
are included.
The MC3 samplers "bd
", "rev.jump
", "bd.int
" and
"rev.jump.int
", iterate away from a starting model by adding,
dropping or swapping (only in the case of rev.jump) covariates.
In an MCMC fashion, they thus randomly draw a candidate model and then move to it in case its marginal likelihood (marg.lik.) is superior to the marg.lik. of the current model.
In case the candidate's marg.lik is inferior, it is randomly accepted or rejected according to a probability formed by the ratio of candidate marg.lik over current marg.lik. Over time, the sampler should thus converge to a sensible distribution. For aggregate results based on these MC3 frequencies, the first few iterations are typically disregarded (the 'burn-ins').
Ad g
and the hyper-g prior: The hyper-g prior introduced by Liang et
al. (2008) puts a prior distribution on the shrinkage factor ,
namely a Beta distribution
that is governed by the
parameter
.
means a uniform prior distribution of the
shrinkage factor, while
close to 2 concentrates the prior
shrinkage factor close to one.
The prior expected value is
. In this sense
g="hyper=UIP"
and
g="hyper=BRIC"
set the prior expected shrinkage such that it conforms
to a fixed UIP-g (eqng=N) or BRIC-g ( ).
A list of class bma
, that may be displayed using e.g.
summary.bma
or coef.bma
. The list contains the
following elements:
info |
a list of aggregate statistics: |
arguments |
a list of the evaluated function arguments
provided to |
topmod |
a 'topmod' object
containing the best drawn models. see |
start.pos |
the positions of the starting model. If bmao is a'bma'
object this corresponds to covariates bmao$reg.names[bmao$start.pos]. If
bmao is a chain that resulted from several starting models (cf.
|
gprior.info |
a list of class |
mprior.info |
a list of class |
X.data |
data.frame or matrix: corresponds to
argument |
reg.names |
character vector: the covariate names to be used for X.data
(corresponds to |
bms.call |
the
original call to the |
The models analyzed are Bayesian
normal-gamma conjugate models with improper constant and variance priors
akin to Fernandez, Ley and Steel (2001): A model can be described as
follows, with
~
:
Moreover, the (improper) prior on the constant is put
proportional to 1. Similarly, the variance prior
is
proportional to
.
There are several ways to speed-up sampling: nmodel=10
saves
only the ten best models, at most a marginal improvement. nmodels=0
does not save the best (500) models, however then posterior convergence and
likelihood-based inference are not possible.
the best models, but not their coefficients, which renders the use of
image.bma
and the paramer exact=TRUE
in functions such as
coef.bma
infeasible. g.stats=FALSE
saves some time by not
retaining the shrinkage factors for the MC3 chain (and the best models).
force.fullobject=TRUE
in contrast, slows sampling down significantly
if mcmc="enumerate"
.
Martin Feldkircher, Paul Hofmarcher, and Stefan Zeugner
http://bms.zeugner.eu: BMS package homepage with help and tutorials
Feldkircher, M. and S. Zeugner (2015): Bayesian Model Averaging Employing Fixed and Flexible Priors: The BMS Package for R, Journal of Statistical Software 68(4).
Feldkircher, M. and S. Zeugner (2009): Benchmark Priors Revisited: On Adaptive Shrinkage and the Supermodel Effect in Bayesian Model Averaging, IMF Working Paper 09/202.
Fernandez, C. E. Ley and M. Steel (2001): Benchmark priors for Bayesian model averaging. Journal of Econometrics 100(2), 381–427
Ley, E. and M. Steel (2008): On the Effect of Prior Assumptions in Bayesian Model Averaging with Applications to Growth Regressions. working paper
Liang, F., Paulo, R., Molina, G., Clyde, M. A., and Berger, J. O. (2008). Mixtures of g Priors for Bayesian Variable Selection. Journal of the American Statistical Association 103, 410-423.
Sala-i-Martin, X. and G. Doppelhofer and R.I. Miller (2004): Determinants of long-term growth: a Bayesian averaging of classical estimates (BACE) approach. American Economic Review 94(4), 813–835
coef.bma
, plotModelsize
and
density.bma
for some operations on the resulting 'bma' object,
c.bma
for integrating separate MC3 chains and splitting of
sampling over several runs.
Check http://bms.zeugner.eu for additional help.
data(datafls) #estimating a standard MC3 chain with 1000 burn-ins and 2000 iterations and uniform model priors bma1 = bms(datafls,burn=1000, iter=2000, mprior="uniform") ##standard coefficients based on exact likelihoods of the 100 best models: coef(bma1,exact=TRUE, std.coefs=TRUE) #suppressing user-interactive output, using a customized starting value, and not saving the best # ...models for only 19 observations (but 41 covariates) bma2 = bms(datafls[20:39,],burn=1000, iter=2000, nmodel=0, start.value=c(1,4,7,30), user.int=FALSE) coef(bma2) #MC3 chain with a hyper-g prior (custom coefficient a=2.1), saving only the 20 best models, # ...and an alternative sampling procedure; putting a log entry to console every 1000th step bma3 = bms(datafls,burn=1000, iter=5000, nmodel=20, g="hyper=2.1", mcmc="rev.jump", logfile="",logstep=1000) image(bma3) #showing the coefficient signs of the 20 best models #enumerating with 10 covariates (= 1024 models), keeping the shrinkage factors # ...of the best 200 models bma4 = bms(datafls[,1:11],mcmc="enumerate",nmodel=200,g.stats=TRUE) #using an interaction sampler for two interaction terms dataint=datafls dataint=cbind(datafls,datafls$LifeExp*datafls$Abslat/1000, datafls$Protestants*datafls$Brit-datafls$Muslim) names(dataint)[ncol(dataint)-1]="LifeExp#Abslat" names(dataint)[ncol(dataint)]="Protestants#Brit#Muslim" bma5 = bms(X.data=dataint,burn=1000,iter=9000,start.value=0,mcmc="bd.int") density(bma5,reg="English") # plot posterior density for covariate "English" # a matrix as X.data argument bms(matrix(rnorm(1000),100,10)) # keeping a set of fixed regressors: bms(datafls, mprior.size=7, fixed.reg = c("PrScEnroll", "LifeExp", "GDP60")) # Note that mprior.size=7 means prior model size of 3 fixed to 4 'uncertain' regressors
data(datafls) #estimating a standard MC3 chain with 1000 burn-ins and 2000 iterations and uniform model priors bma1 = bms(datafls,burn=1000, iter=2000, mprior="uniform") ##standard coefficients based on exact likelihoods of the 100 best models: coef(bma1,exact=TRUE, std.coefs=TRUE) #suppressing user-interactive output, using a customized starting value, and not saving the best # ...models for only 19 observations (but 41 covariates) bma2 = bms(datafls[20:39,],burn=1000, iter=2000, nmodel=0, start.value=c(1,4,7,30), user.int=FALSE) coef(bma2) #MC3 chain with a hyper-g prior (custom coefficient a=2.1), saving only the 20 best models, # ...and an alternative sampling procedure; putting a log entry to console every 1000th step bma3 = bms(datafls,burn=1000, iter=5000, nmodel=20, g="hyper=2.1", mcmc="rev.jump", logfile="",logstep=1000) image(bma3) #showing the coefficient signs of the 20 best models #enumerating with 10 covariates (= 1024 models), keeping the shrinkage factors # ...of the best 200 models bma4 = bms(datafls[,1:11],mcmc="enumerate",nmodel=200,g.stats=TRUE) #using an interaction sampler for two interaction terms dataint=datafls dataint=cbind(datafls,datafls$LifeExp*datafls$Abslat/1000, datafls$Protestants*datafls$Brit-datafls$Muslim) names(dataint)[ncol(dataint)-1]="LifeExp#Abslat" names(dataint)[ncol(dataint)]="Protestants#Brit#Muslim" bma5 = bms(X.data=dataint,burn=1000,iter=9000,start.value=0,mcmc="bd.int") density(bma5,reg="English") # plot posterior density for covariate "English" # a matrix as X.data argument bms(matrix(rnorm(1000),100,10)) # keeping a set of fixed regressors: bms(datafls, mprior.size=7, fixed.reg = c("PrScEnroll", "LifeExp", "GDP60")) # Note that mprior.size=7 means prior model size of 3 fixed to 4 'uncertain' regressors
Combines bma objects (resulting from bms
). Can be used to
split estimation over several machines, or combine the MCMC results obtained
from different starting points.
## S3 method for class 'bma' c(..., recursive = FALSE)
## S3 method for class 'bma' c(..., recursive = FALSE)
... |
At least two 'bma' objects (cf. |
recursive |
retained for compatibility with |
Aggregates the information obtained from several chains. The result is a
'bma' object (cf. 'Values' in bms
) that can be used just as a
standard 'bma' object.
Note that combine_chains
helps in
particular to paralllelize the enumeration of the total model space: A model
with regressors has
potential covariate combinations: With
large (more than 25), this can be pretty time intensive. With the
bms
arguments start.value
and iter
, sampling can
be done in steps: cf. example 'enumeration' below.
bms
for creating bma objects
Check http://bms.zeugner.eu for additional help.
data(datafls) #MCMC case ############################ model1=bms(datafls,burn=1000,iter=4000,mcmc="bd",start.value=c(20,30,35)) model2=bms(datafls,burn=1500,iter=7000,mcmc="bd",start.value=c(1,10,15)) model_all=c(model1,model2) coef(model_all) plot(model_all) #splitting enumeration ######################## #standard case with 12 covariates (4096 differnt combinations): enum0=bms(datafls[,1:13],mcmc="enumerate") # now split the task: # enum1 does everything from model zero (the first model) to model 1999 enum1=bms(datafls[,1:13],mcmc="enumerate",start.value=0,iter=1999) # enum2 does models from index 2000 to the index 3000 (in total 1001 models) enum2=bms(datafls[,1:13],mcmc="enumerate",start.value=2000,iter=1000) # enum3 does models from index 3001 to the end enum3=bms(datafls[,1:13],mcmc="enumerate",start.value=3001) enum_combi=c(enum1,enum2,enum3) coef(enum_combi) coef(enum0) #both enum_combi and enum0 have exactly the same results #(one difference: enum_combi has more 'top models' (1500 instead of 500))
data(datafls) #MCMC case ############################ model1=bms(datafls,burn=1000,iter=4000,mcmc="bd",start.value=c(20,30,35)) model2=bms(datafls,burn=1500,iter=7000,mcmc="bd",start.value=c(1,10,15)) model_all=c(model1,model2) coef(model_all) plot(model_all) #splitting enumeration ######################## #standard case with 12 covariates (4096 differnt combinations): enum0=bms(datafls[,1:13],mcmc="enumerate") # now split the task: # enum1 does everything from model zero (the first model) to model 1999 enum1=bms(datafls[,1:13],mcmc="enumerate",start.value=0,iter=1999) # enum2 does models from index 2000 to the index 3000 (in total 1001 models) enum2=bms(datafls[,1:13],mcmc="enumerate",start.value=2000,iter=1000) # enum3 does models from index 3001 to the end enum3=bms(datafls[,1:13],mcmc="enumerate",start.value=3001) enum_combi=c(enum1,enum2,enum3) coef(enum_combi) coef(enum0) #both enum_combi and enum0 have exactly the same results #(one difference: enum_combi has more 'top models' (1500 instead of 500))
The economic growth data set from Fernandez, Ley and Steel, Journal of Applied Econometrics 2001
datafls
datafls
A data frame with 53940 rows and 10 variables: A data frame with 72 observations on the following 42 variables.
y
numeric: Economic growth 1960-1992 as from the Penn World Tables Rev 6.0
Abslat
numeric: Absolute latitude
Spanish
numeric: Spanish colony dummy
French
numeric: French colony dummy
Brit
numeric: British colony dummy
WarDummy
numeric: War dummy
LatAmerica
numeric: Latin America dummy
SubSahara
numeric; Sub-Sahara dummy
OutwarOr
numeric: Outward Orientation
Area
numeric: Area surface
PrScEnroll
numeric: Primary school enrolment
LifeExp
numeric: Life expectancy
GDP60
numeric: Initial GDP in 1960
Mining
numeric: Fraction of GDP in mining
EcoOrg
numeric: Degree of capitalism
YrsOpen
numeric: Number of years having an open economy
Age
numeric: Age
Buddha
numeric: Fraction Buddhist
Catholic
numeric: Fraction Catholic
Confucian
numeric: Fraction Confucian
EthnoL
numeric: Ethnolinguistic fractionalization
Hindu
numeric: Fraction Hindu
Jewish
numeric: Fraction Jewish
Muslim
numeric: Fraction Muslim
PrExports
numeric: Primary exports 1970
Protestants
numeric: Fraction Protestants
RuleofLaw
numeric: Rule of law
Popg
numeric: Population growth
WorkPop
numeric: workers per inhabitant
LabForce
numeric: Size of labor force
HighEnroll
numeric: Higher education enrolment
PublEdupct
numeric: Public education share
RevnCoup
numeric: Revolutions and coups
PolRights
numeric: Political rights
CivlLib
numeric: Civil liberties
English
numeric: Fraction speaking English
Foreign
numeric: Fraction speaking foreign language
RFEXDist
numeric: Exchange rate distortions
EquipInv
numeric: Equipment investment
NequipInv
numeric: Non-equipment investment
stdBMP
numeric: stand. dev. of black market premium
BlMktPm
numeric: black market premium
Fernandez, C., Ley, E., and Steel, M. F. (2001b). Model Uncertainty in Cross-Country Growth Regressions. Journal of Applied Econometrics, 16:563-576. Data set from https://warwick.ac.uk/fac/sci/statistics/staff/academic-research/steel/steel_homepage/software.
A working paper version of Fernandez, Ley and Steel (2001) is available via https://econpapers.repec.org/article/jaejapmet/v_3a16_3ay_3a2001_3ai_3a5_3ap_3a563-576.htm.
Calculates the mixture marginal posterior densities for the coefficients from a BMA object and plots them
## S3 method for class 'bma' density( x, reg = NULL, addons = "lemsz", std.coefs = FALSE, n = 300, plot = TRUE, hnbsteps = 30, addons.lwd = 1.5, ... )
## S3 method for class 'bma' density( x, reg = NULL, addons = "lemsz", std.coefs = FALSE, n = 300, plot = TRUE, hnbsteps = 30, addons.lwd = 1.5, ... )
x |
|
reg |
A scalar integer or character detailing which covariate's
coefficient should be plotted. If |
addons |
character. Specifies which additional information should be added to the plot via low-level commands (see 'Details' below). |
std.coefs |
logical. If |
n |
numeric. the number of equally spaced points at which the density is to be estimated. |
plot |
logical. If |
hnbsteps |
even integer, default 30. The number of numerical
integration steps to be used in case of a hyper-g prior (cf. argument
|
addons.lwd |
scalar, default 1.5. Line width to be used for the
low-level plotting commands specified by |
... |
Additional arguments for |
The argument addons
specifies what additional information should be
added to the plot(s) via the low-level commands lines
and
legend
:"e"
for the posterior expected value (EV) of
coefficients conditional on inclusion (see argument exact=TRUE
in
coef.bma
),"s"
for 2 times posterior standard
deviation (SD) bounds,"m"
for the posterior median,"b"
for posterior expected values of the individual models whom the density is
averaged over,"E"
for posterior EV under MCMC frequencies (see
argument exact=FALSE
in coef.bma
),"S"
for
the corresponding SD bounds (MCMC),"p"
for plotting the Posterior
Inclusion Probability above the density plot,"l"
for including a
legend
, "z"
for a zero line, "g"
for adding a
grid
Any combination of these letters will give the desired result. Use
addons=""
for not using any of these.
In case of
density.zlm
, only the letters e
, s
, l
, z
,
and g
will have an effect.
The function returns a list containing objects of the class
density
detailing the marginal posterior densities for each
coefficient provided in reg
.
In case of density.zlm
, simple
marginal posterior coefficient densities are computed, while
density.bma
calculates there mixtures over models according to
posterior model probabilities.
These densities contain only the density
points apart from the origin. (see 'Note' below)
As long as plot=TRUE
, the densities are plotted too. Note that (for
density.bma
) if the posterior inclusion probability of a covariate is
zero, then it will not be plotted, and the returned density will be
list(x=numeric(n),y=numeric(n))
.
The computed marginal posterior densities from density.bma
are
a Bayesian Model Averaging mixture of the marginal posterior densities of
the individual models. The accuracy of the result therefore depends on the
number of 'best' models contained in x
(cf. argument nmodel
in
bms
).
The marginal posterior density can be interpreted as 'conditional on inclusion': If the posterior inclusion probability of a variable is smaller than one, then some of its posterior density is Dirac at zero. Therefore the integral of the returned density vector adds up to the posterior inclusion probability, i.e. the probability that the coefficient is not zero.
Correspondingly, the posterior EV and SD specified by addons="es"
are
based on 'best' model likelihoods ('exact') and are conditional on
inclusion. They correspond to the results from command
coef.bma(x,exact=TRUE,condi.coef=TRUE,order.by.pip=FALSE)
(cf. the
example below).
The low-level commands enacted by the argument addons
rely on colors
of the palette
: color 2 for "e"
and "s"
, color 3
for "m"
, color 8 for "b"
, color 4 for "E"
and
"S"
. The default colors may be changed by a call to
palette
.
Up to BMS version 0.3.0, density.bma
may only cope with built-in
gprior
s, not with any user-defined priors.
quantile.coef.density
for extracting quantiles,
coef.bma
for similar concepts, bms
for creating
bma objects
Check http://bms.zeugner.eu for additional help.
data(datafls) mm=bms(datafls) density(mm,reg="SubSahara") density(mm,reg=7,addons="lbz") density(mm,1:9) density(mm,reg=2,addons="zgSE",addons.lwd=2,std.coefs=TRUE) # plot the posterior density only for the very best model density(mm[1],reg=1,addons="esz") #using the calculated density for other purposes... dd=density(mm,reg="SubSahara") plot(dd) dd_list=density(mm,reg=1:3,plot=FALSE,n=400) plot(dd_list[[1]]) #Note that the shown density is only the part that is not zero dd=density(mm,reg="Abslat",addons="esl") pip_Abslat=sum(dd$y)*diff(dd$x)[1] #this pip and the EV conform to what is done by the follwing command coef(mm,exact=TRUE,condi.coef=TRUE)["Abslat",]
data(datafls) mm=bms(datafls) density(mm,reg="SubSahara") density(mm,reg=7,addons="lbz") density(mm,1:9) density(mm,reg=2,addons="zgSE",addons.lwd=2,std.coefs=TRUE) # plot the posterior density only for the very best model density(mm[1],reg=1,addons="esz") #using the calculated density for other purposes... dd=density(mm,reg="SubSahara") plot(dd) dd_list=density(mm,reg=1:3,plot=FALSE,n=400) plot(dd_list[[1]]) #Note that the shown density is only the part that is not zero dd=density(mm,reg="Abslat",addons="esl") pip_Abslat=sum(dd$y)*diff(dd$x)[1] #this pip and the EV conform to what is done by the follwing command coef(mm,exact=TRUE,condi.coef=TRUE)["Abslat",]
Returns a matrix with aggregate covariate-specific Bayesian model Averaging: posterior inclusion probabilites (PIP), post. expected values and standard deviations of coefficients, as well as sign probabilites
estimates.bma( bmao, exact = FALSE, order.by.pip = TRUE, include.constant = FALSE, incl.possign = TRUE, std.coefs = FALSE, condi.coef = FALSE ) ## S3 method for class 'bma' coef( object, exact = FALSE, order.by.pip = TRUE, include.constant = FALSE, incl.possign = TRUE, std.coefs = FALSE, condi.coef = FALSE, ... )
estimates.bma( bmao, exact = FALSE, order.by.pip = TRUE, include.constant = FALSE, incl.possign = TRUE, std.coefs = FALSE, condi.coef = FALSE ) ## S3 method for class 'bma' coef( object, exact = FALSE, order.by.pip = TRUE, include.constant = FALSE, incl.possign = TRUE, std.coefs = FALSE, condi.coef = FALSE, ... )
exact |
if |
order.by.pip |
|
include.constant |
If |
incl.possign |
If |
std.coefs |
If |
condi.coef |
If |
object , bmao
|
a 'bma' object (cf. |
... |
further arguments for other |
More on the argument exact
:
In case the argument
exact=TRUE
, the PIPs, coefficient statistics and conditional sign
probabilities are computed on the basis of the (500) best models the
sampling chain encountered (cf. argument nmodel
in
bms
). Here, the weights for Bayesian model averaging (BMA) are
the posterior marginal likelihoods of these best models.
In case
exact=FALSE
, then these statistics are based on all accepted models
(except burn-ins): If mcmc="enumerate"
then this are simply all
models of the traversed model space, with their marginal likelihoods
providing the weights for BMA.
If, however, the bma object bmao
was based on an MCMC sampler (e.g. when bms
argument
mcmc="bd"
), then BMA statistics are computed differently: In contrast
to above, the weights for BMA are MCMC frequencies, i.e. how often the
respective models were encountered by the MCMC sampler. (cf. a comparison of
MCMC frequencies and marginal likelihoods for the best models via the
function pmp.bma
).
A matrix with five columns (or four if incl.possign=FALSE
)
Column 'PIP' |
Posterior inclusion probabilities |
Column 'Post Mean' |
posterior
expected value of coefficients, unconditional |
Column 'Post SD' |
posterior standard deviation
of coefficients, unconditional or conditional on inclusion, depending on
|
Column 'Cond.Pos.Sign' |
The ratio of how often the
coefficients' expected values were positive conditional on inclusion. (over
all visited models in case |
Column 'Idx' |
the original order of covariates as the were used for sampling. (if included, the constant has index 0) |
bms
for creating bma objects, pmp.bma
for comparing MCMC frequencies and marginal likelihoods.
Check http://bms.zeugner.eu for additional help.
#sample, with keeping the best 200 models: data(datafls) mm=bms(datafls,burn=1000,iter=5000,nmodel=200) #standard BMA PIPs and coefficients from the MCMC sampling chain, based on # ...how frequently the models were drawn coef(mm) #standardized coefficients, ordered by index coef(mm,std.coefs=TRUE,order.by.pip=FALSE) #coefficients conditional on inclusion: coef(mm,condi.coef=TRUE) #same as ests=coef(mm,condi.coef=FALSE) ests[,2]/ests[,1] #PIPs, coefficients, and signs based on the best 200 models estimates.bma(mm,exact=TRUE) #... and based on the 50 best models coef(mm[1:50],exact=TRUE)
#sample, with keeping the best 200 models: data(datafls) mm=bms(datafls,burn=1000,iter=5000,nmodel=200) #standard BMA PIPs and coefficients from the MCMC sampling chain, based on # ...how frequently the models were drawn coef(mm) #standardized coefficients, ordered by index coef(mm,std.coefs=TRUE,order.by.pip=FALSE) #coefficients conditional on inclusion: coef(mm,condi.coef=TRUE) #same as ests=coef(mm,condi.coef=FALSE) ests[,2]/ests[,1] #PIPs, coefficients, and signs based on the best 200 models estimates.bma(mm,exact=TRUE) #... and based on the 50 best models coef(mm[1:50],exact=TRUE)
Computes the value of a Gaussian hypergeometric function
for
and
f21hyper(a, b, c, z)
f21hyper(a, b, c, z)
a |
The parameter |
b |
The parameter |
c |
The parameter |
z |
The parameter |
The function f21hyper
complements the analysis of the 'hyper-g prior'
introduced by Liang et al. (2008).
For parameter values, compare cf.
https://en.wikipedia.org/wiki/Hypergeometric_function.
The value of the Gaussian hypergeometric function
This function is a simple wrapper function of sped-up code that is
intended for sporadic application by the user; it is neither efficient nor
general; for a more general version cf. the package 'hypergeo
'
Liang F., Paulo R., Molina G., Clyde M., Berger J.(2008): Mixtures of g-priors for Bayesian variable selection. J. Am. Statist. Assoc. 103, p. 410-423
https://en.wikipedia.org/wiki/Hypergeometric_function
package hypergeo
for a more proficient implementation.
Check http://bms.zeugner.eu for additional help.
f21hyper(30,1,20,.8) #returns about 165.8197 f21hyper(30,10,20,0) #returns one f21hyper(10,15,20,-0.1) # returns about 0.4872972
f21hyper(30,1,20,.8) #returns about 165.8197 f21hyper(30,10,20,0) #returns one f21hyper(10,15,20,-0.1) # returns about 0.4872972
A utility function for reference: Returns a list with R2 and sum of squares for the OLS model encompassing all potential covariates that are included in a bma object.
fullmodel.ssq(yX.data)
fullmodel.ssq(yX.data)
yX.data |
a bma object (cf. |
Returns a list with some basic OLS statistics
R2 |
The R-squared of the full model |
ymy |
The sum of squares of residuals of the full model |
ypy |
The explained sum of squares of the full model |
yty |
The sum of squares of the (demeaned) dependent variable |
Fstat |
The F-statistic of the full model |
This function is just for quick comparison; for proper OLS estimation
consider lm
bms
for creating bma objects, lm
for
OLS estimation
Check http://bms.zeugner.eu for additional help.
data(datafls) mm=bms(datafls) fullmodel.ssq(mm) #equivalent: fullmodel.ssq(datafls)
data(datafls) mm=bms(datafls) fullmodel.ssq(mm) #equivalent: fullmodel.ssq(datafls)
Calculates the mixture marginal posterior density for the shrinkage factor (g/(1+g)) from a BMA object under the hyper-g prior and plots it
gdensity(x, n = 512, plot = TRUE, addons = "zles", addons.lwd = 1.5, ...)
gdensity(x, n = 512, plot = TRUE, addons = "zles", addons.lwd = 1.5, ...)
x |
A bma object (see |
n |
The integer number of equally spaced points at which the density is to be estimated. see 'Details' below |
plot |
logical. If |
addons |
character, defaulting to |
addons.lwd |
scalar, default 1.5. Line width to be used for the
low-level plotting commands specified by |
... |
Additional arguments for |
The function gdensity
estimates and plots the posterior density for
the shrinkage factor
This is evidently only possible if the
shrinkage factor if not fixed, i.e. if the bma object x
was estimated
with a hyper-g prior - cf. argument g
in bms
The
density is based only on the best models retained in the bma object
x
, cf. argument nmodel
in bms
A note on
argument n
: The points at which the density is estimated start at
, where
and
are the expected value and
standard deviation of the shrinkage factor, respectively. For plotting the
entire domain
use
xlim=c(0,1)
as an argument for
gdensity
.
The argument addons
specifies what additional information should be
added to the plot(s) via the low-level commands lines
and
legend
:"e"
for the posterior expected value (EV) of
the shrinkage factor,"s"
for 2 times posterior standard deviation
(SD) bounds,"m"
for the posterior median,"f"
for
posterior expected values of the individual models whom the density is
averaged over,"z"
for a zero line, "l"
for including a
legend
The following two are only possible if the bma
object collected statistics on shrinkage, cf. argument g.stats
in
bms
"E"
for posterior expected value under MCMC
frequencies (see argument exact
in coef.bma
),"S"
for the corresponding 2 times standard deviation bounds
(MCMC),
Any combination of these letters will give the desired result. Use
addons=""
for not using any of these.
gdensity
returns an object of the class density
detailing the posterior mixture density of the shrinkage factor.
The computed marginal posterior density is a Bayesian Model Averaging
mixture of the marginal posterior densities of the shrinkage factor under
individual models. The accuracy of the result therefore depends on the
number of 'best' models contained in x
(cf. argument nmodel
in
bms
).
Correspondingly, the posterior EV and SD specified by addons="es"
are
based on 'best' model likelihoods ('exact') and are conditional on
inclusion.
The low-level commands enacted by the argument addons
rely on colors
of the palette
: color 2 for "e"
and "s"
, color 3
for "m"
, color 8 for "f"
, color 4 for "E"
and
"S"
. The default colors may be changed by a call to
palette
.
density.bma
for computing coefficient densities,
bms
for creating bma objects, density
for the
general method
Check http://bms.zeugner.eu for additional help.
data(datafls) mm=bms(datafls,g="hyper=UIP") gdensity(mm) # default plotting # the grey bars represent expected shrinkage factors of the individual models gdensity(mm,addons="lzfes") # #plotting the median 'm' and the posterior mean and bounds based on MCMC results: gdensity(mm,addons="zSEm",addons.lwd=2) # plot the posterior shrinkage density only for the very best model gdensity(mm[1],addons="esz") #using the calculated density for other purposes... dd=gdensity(mm,plot=FALSE) plot(dd)
data(datafls) mm=bms(datafls,g="hyper=UIP") gdensity(mm) # default plotting # the grey bars represent expected shrinkage factors of the individual models gdensity(mm,addons="lzfes") # #plotting the median 'm' and the posterior mean and bounds based on MCMC results: gdensity(mm,addons="zSEm",addons.lwd=2) # plot the posterior shrinkage density only for the very best model gdensity(mm[1],addons="esz") #using the calculated density for other purposes... dd=gdensity(mm,plot=FALSE) plot(dd)
An object pertaining to a coefficient prior
A gprior
object holds descriptions
and subfunctions pertaining to coefficient priors. Functions such as
bms
or zlm
rely on this class to 'convert' the
output of OLS results into posterior expressions for a Bayesian Linear
Model. Post-processing functions such as density.bma
also
resort to gprior objects.
There are currently three coefficient prior
structures built into the BMS package, generated by the following functions
(cf. Feldkircher and Zeugner, 2009) : gprior.constg.init
: creates
a Zellner's g-prior object with constant g
.gprior.eblocal.init
: creates an Empricial Bayes Zellner's g-prior.gprior.hyperg.init
: creates a hyper g-prior with a Beta-prior on the
shrinkage parameter.
The following describes the necessary slots
Martin Feldkircher and Stefan Zeugner
Feldkircher, M. and S. Zeugner (2009): Benchmark Priors Revisited: On Adaptive Shrinkage and the Supermodel Effect in Bayesian Model Averaging, IMF Working Paper 09/202.
bms
and zlm
for creating bma
or
zlm
objects.
Check the appendix of vignette(BMS)
for a
more detailed description of built-in priors.
Check
http://bms.zeugner.eu/custompriors.php for examples.
data(datafls) mm1=bms(datafls[,1:10], g="EBL") gg=mm1$gprior.info # is the g-prior object, augmented with some posterior statistics mm2=bms(datafls[,1:10], g=gg) #produces the same result mm3=bms(datafls[,1:10], g=BMS:::.gprior.eblocal.init) #this passes BMS's internal Empirical Bayes g-prior object as the coefficient prior # - any other obejct might be used as well
data(datafls) mm1=bms(datafls[,1:10], g="EBL") gg=mm1$gprior.info # is the g-prior object, augmented with some posterior statistics mm2=bms(datafls[,1:10], g=gg) #produces the same result mm3=bms(datafls[,1:10], g=BMS:::.gprior.eblocal.init) #this passes BMS's internal Empirical Bayes g-prior object as the coefficient prior # - any other obejct might be used as well
A simple-to-use function for converting a logical ('binary') vector into hex code and reverse.
hex2bin(hexcode) bin2hex(binvec)
hex2bin(hexcode) bin2hex(binvec)
hexcode |
a single-element character denoting an integer in hexcode (admissible character: 0 to 9, ato f) |
binvec |
a logical vector (alternatively a vector coercible into logical) |
The argument is an integer in binary form (such as "101"), provided as a
logical (c(T,F,T)
) or numeric vector (c(1,0,1)
).bin2hex
then returns a character denoting this number in hexcode (in
this case "5").
The function hex2bin
does the reverse operation, e.g.
hex2bin("5")
gives (c(1,0,1)
).
bin2hex
returns a single element character; hex2bin
returns a numeric vector equivalent to a logical vector
hex2bin
for converting hexcode into binary vectors,
format.hexmode
for a related R function.
Check http://bms.zeugner.eu for additional help.
bin2hex(c(TRUE,FALSE,TRUE,FALSE,TRUE,TRUE)) bin2hex(c(1,0,1,0,1,1)) hex2bin("b8a") bin2hex(hex2bin("b8a"))
bin2hex(c(TRUE,FALSE,TRUE,FALSE,TRUE,TRUE)) bin2hex(c(1,0,1,0,1,1)) hex2bin("b8a") bin2hex(hex2bin("b8a"))
Plots a grid with signs and inclusion of coefficients vs. posterior model probabilities for the best models in a 'bma' object:
## S3 method for class 'bma' image( x, yprop2pip = FALSE, order.by.pip = TRUE, do.par = TRUE, do.grid = TRUE, do.axis = TRUE, cex.axis = 1, ... )
## S3 method for class 'bma' image( x, yprop2pip = FALSE, order.by.pip = TRUE, do.par = TRUE, do.grid = TRUE, do.axis = TRUE, cex.axis = 1, ... )
x |
a list of class bma (cf. |
yprop2pip |
if |
order.by.pip |
with |
do.par |
Defaults to |
do.grid |
|
do.axis |
|
cex.axis |
font size for the axes (cf. |
... |
Parameters to be passed on to |
Under default settings, blue corresponds to positive sign, red to a negative sign, white to non-inclusion.
coef.bma for the coefficients in matrix form, bms for creating 'bma' objects.
Check http://bms.zeugner.eu for additional help.
data(datafls) model=bms(datafls,nmodel=200) #plot all models image(model,order.by.pip=FALSE) image(model,order.by.pip=TRUE,cex.axis=.8) #plot best 7 models, with other colors image(model[1:7],yprop2pip=TRUE,col=c("black","lightgrey"))
data(datafls) model=bms(datafls,nmodel=200) #plot all models image(model,order.by.pip=FALSE) image(model,order.by.pip=TRUE,cex.axis=.8) #plot best 7 models, with other colors image(model[1:7],yprop2pip=TRUE,col=c("black","lightgrey"))
Returns a vector with summary statistics for a 'bma' object
info.bma(bmao) ## S3 method for class 'bma' summary(object, ...)
info.bma(bmao) ## S3 method for class 'bma' summary(object, ...)
bmao |
same as |
object |
a list/object of class 'bma' that typically results from the
function |
... |
further arguments passed to or from other methods |
info.bma
is equivalent to summary.bma
, its argument
bmao
conforms to the argument object
A character vector summarizing the results of a call to bms
Mean no. of Regressors |
the posterior mean of model size |
Draws |
the number of iterations (ex burn-ins) |
Burnins |
the number of burn-in iterations |
Time |
the time spent on iterating through the model space |
No. of models visited |
the number of times a model was accepted (including burn-ins) |
Modelspace |
the total model
space |
list(list("2^K")) |
the total model space |
Percentage visited |
|
Percentage Topmodels |
number of times the best models were drawn in
percent of |
Corr. PMP |
the correlation between the MCMC frequencies of the best models (the number of times they were drawn) and their marginal likelihoods. |
No. Obs. |
Number of observations |
Model Prior |
a character conforming to the argument |
g-prior |
a character
corresponding to argument |
Shrinkage-Stats |
Posterior expected value und standard deviation (if
applicable) of the shrinkage factor. Only included if argument
|
All of the above statistics can also be directly extracted from the
bma object (bmao
). Therefore summary.bma
only returns a
character vector.
bms
and c.bma
for functions creating
bma objects, print.bma
makes use of summary.bma
.
Check http://bms.zeugner.eu for additional help.
data(datafls) m_fixed=bms(datafls,burn=1000,iter=3000,user.int=FALSE, ) summary(m_fixed) m_ebl=bms(datafls,burn=1000,iter=3000,user.int=FALSE, g="EBL",g.stats=TRUE) info.bma(m_ebl)
data(datafls) m_fixed=bms(datafls,burn=1000,iter=3000,user.int=FALSE, ) summary(m_fixed) m_ebl=bms(datafls,burn=1000,iter=3000,user.int=FALSE, g="EBL",g.stats=TRUE) info.bma(m_ebl)
tests for objects of class "bma"
is.bma(bmao)
is.bma(bmao)
bmao |
a 'bma' object: see 'value' |
Returns TRUE
if bmao is of class 'bma', FALSE
otherwise.
'Output' in bms
for the structure of a 'bma' object
Check http://bms.zeugner.eu for additional help.
data(datafls) mm=bms(datafls,burn=1000, iter=4000) is.bma(mm)
data(datafls) mm=bms(datafls,burn=1000, iter=4000) is.bma(mm)
Computes the Log Predictive Score to evaluate a forecast based on a bma object
lps.bma(object, realized.y, newdata = NULL)
lps.bma(object, realized.y, newdata = NULL)
object |
an object of class |
realized.y |
a vector with realized values of the dependent variables
to be plotted in addition to the predictive density, must have its length
conforming to |
newdata |
Needs to be provided if |
The log predictive score is an indicator for the likelihood of several
forecasts.
It is defined as minus the arithmethic mean of the logarithms
of the point densities for realized.y
given newdata
.
Note
that in most cases is more efficient to first compute the predictive density
object via a call to pred.density
and only then pass the
result on to lps.bma
.
A scalar denoting the log predictive score
pred.density
for constructing predictive densities,
bms
for creating bma
objects, density.bma
for plotting coefficient densities
Check http://bms.zeugner.eu for additional help.
data(datafls) mm=bms(datafls,user.int=FALSE,nmodel=100) #LPS for actual values under the used data (static forecast) lps.bma(mm, realized.y=datafls[,1] , newdata=datafls[,-1]) #the same result via predicitve.density pd=pred.density(mm, newdata=datafls[,-1]) lps.bma(pd,realized.y=datafls[,1]) # similarly for a linear model (not BMA) zz = zlm(datafls) lps.bma(zz, realized.y=datafls[,1] , newdata=datafls[,-1])
data(datafls) mm=bms(datafls,user.int=FALSE,nmodel=100) #LPS for actual values under the used data (static forecast) lps.bma(mm, realized.y=datafls[,1] , newdata=datafls[,-1]) #the same result via predicitve.density pd=pred.density(mm, newdata=datafls[,-1]) lps.bma(pd,realized.y=datafls[,1]) # similarly for a linear model (not BMA) zz = zlm(datafls) lps.bma(zz, realized.y=datafls[,1] , newdata=datafls[,-1])
An object pertaining to a BMA model prior
An mprior
object holds descriptions
and subfunctions pertaining to model priors. The BMA functions
bms
and post-processing functions rely on this class.
There are currently five model prior structures built into the BMS package,
generated by the following functions (cf. the appendix of
vignette(BMS)
): mprior.uniform.init
: creates a uniform
model prior object.mprior.fixedt.init
: creates the popular
binomial model prior object with common inclusion probabilities.mprior.randomt.init
: creates a beta-binomial model prior object.mprior.pip.init
: creates a binomial model prior object that allows
for defining individual prior inclusion probabilities.mprior.customk.init
: creates a model prior object that allows for
defining a custom prior for each model parameter size.
The following
describes the necessary slots:
Martin Feldkircher and Stefan Zeugner
bms
for creating bma
objects.
Check the
appendix of vignette(BMS)
for a more detailed description of built-in
priors.
Check http://bms.zeugner.eu/custompriors.php for examples.
Produces a combined plot: upper row shows prior and posterior model size distribution, lower row shows posterior model probabilities for the best models
## S3 method for class 'bma' plot(x, ...)
## S3 method for class 'bma' plot(x, ...)
x |
an object of class 'bma' |
... |
additional arguments for |
combines the plotting functions plotModelsize
and
plotConv
The upper plot shows the prior and posterior distribution of model
sizes (plotModelsize
).
The lower plot is an indicator of
how well the bma object has converged (plotConv
).
and Paul
Check http://bms.zeugner.eu for additional help.
data(datafls) mm=bms(datafls,user.int=FALSE) plot(mm)
data(datafls) mm=bms(datafls,user.int=FALSE) plot(mm)
Plots a comparison of posterior inclusion probabilites, coefficients or their standard deviation between various bma objects
plotComp( ..., varNr = NULL, comp = "PIP", exact = FALSE, include.legend = TRUE, add.grid = TRUE, do.par = TRUE, cex.xaxis = 0.8 )
plotComp( ..., varNr = NULL, comp = "PIP", exact = FALSE, include.legend = TRUE, add.grid = TRUE, do.par = TRUE, cex.xaxis = 0.8 )
... |
one or more objects of class 'bma' to be compared.
|
varNr |
optionally, covariate indices to be included in the plot, can be either integer vector or character vector - see examples |
comp |
a character denoting what should be compared: |
exact |
if |
include.legend |
whether to include a default legend in the plot
(custom legends can be added with the command |
add.grid |
whether to add a |
do.par |
whether to adjust |
cex.xaxis |
font size scaling parameter for the x-axis - cf. argument
|
coef.bma
for the underlying function
Check http://bms.zeugner.eu for additional help.
## sample two simple bma objects data(datafls) mm1=bms(datafls[,1:15]) mm2=bms(datafls[,1:15]) #compare PIPs plotComp(mm1,mm2) #compare standardized coefficeitns plotComp(mm1,mm2,comp="Std Mean") #...based on the lieklihoods of best models plotComp(mm1,mm2,comp="Std Mean",exact=TRUE) #plot only PIPs for first four covariates plotComp(mm1,mm2,varNr=1:4, col=c("black","red")) #plot only coefficients for covariates 'GDP60 ' and 'LifeExp' plotComp(mm1,mm2,varNr=c("GDP60", "LifeExp"),comp="Post Mean")
## sample two simple bma objects data(datafls) mm1=bms(datafls[,1:15]) mm2=bms(datafls[,1:15]) #compare PIPs plotComp(mm1,mm2) #compare standardized coefficeitns plotComp(mm1,mm2,comp="Std Mean") #...based on the lieklihoods of best models plotComp(mm1,mm2,comp="Std Mean",exact=TRUE) #plot only PIPs for first four covariates plotComp(mm1,mm2,varNr=1:4, col=c("black","red")) #plot only coefficients for covariates 'GDP60 ' and 'LifeExp' plotComp(mm1,mm2,varNr=c("GDP60", "LifeExp"),comp="Post Mean")
Plots the posterior model probabilites based on 1) marginal likelihoods and 2) MCMC frequencies for the best models in a 'bma' object and details the sampler's convergence by their correlation
plotConv(bmao, include.legend = TRUE, add.grid = TRUE, ...)
plotConv(bmao, include.legend = TRUE, add.grid = TRUE, ...)
bmao |
an object of class 'bma' - see |
include.legend |
whether to include a |
add.grid |
whether to include a |
... |
other parameters for |
A call to bms with a MCMC sampler (e.g.
bms(datafls,mcmc="bd",nmodel=100)
uses a Metropolis-Hastings
algorithm to sample through the model space: the frequency of how often
models are drawn converges to the distribution of their posterior marginal
likelihoods.
While sampling, each 'bma' object stores the best models
encountered by its sampling chain with their marginal likelihood and their
MCMC frequencies.plotConv
compares the MCMC frequencies to
marginal likelihoods, and thus visualizes how well the sampler has
converged.
plotConv
is also used by plot.bma
pmp.bma
for posterior model probabilites based on the
two concepts, bms
for creating objects of class 'bma'
Check http://bms.zeugner.eu for additional help.
data(datafls) mm=bms(datafls[,1:12],user.int=FALSE) plotConv(mm) #is similar to matplot(pmp.bma(mm),type="l")
data(datafls) mm=bms(datafls[,1:12],user.int=FALSE) plotConv(mm) #is similar to matplot(pmp.bma(mm),type="l")
Plots posterior and prior model size distribution
plotModelsize( bmao, exact = FALSE, ksubset = NULL, include.legend = TRUE, do.grid = TRUE, ... )
plotModelsize( bmao, exact = FALSE, ksubset = NULL, include.legend = TRUE, do.grid = TRUE, ... )
bmao |
a 'bma' object (cf. |
exact |
if |
ksubset |
integer vector detailing for which model sizes the plot should be done |
include.legend |
if |
do.grid |
if |
... |
parameters passed on to |
As a default, plotModelsize
plots the posterior model size
distribution as a blue line, and the prior model distribution as a dashed
red line.
In addition, it returns a list with the following elements:
mean |
The posterior expected value of model size |
var |
The variance of the posterior model size distribution |
dens |
A vector
detailing the posterior model size distribution from model size |
See also bms
, image.bma
,
density.bma
, plotConv
Check http://bms.zeugner.eu for additional help.
data(datafls) mm=bms(datafls,burn=1500, iter=5000, nmodel=200,mprior="fixed",mprior.size=6) #plot Nb.1 based on aggregate results postdist= plotModelsize(mm) #plot based only on 30 best models plotModelsize(mm[1:30],exact=TRUE,include.legend=FALSE) #plot based on all best models, but showing distribution only for model sizes 1 to 20 plotModelsize(mm,exact=TRUE,ksubset=1:20) # create a plot similar to plot Nb. 1 plot(postdist$dens,type="l") lines(mm$mprior.info$mp.Kdist)
data(datafls) mm=bms(datafls,burn=1500, iter=5000, nmodel=200,mprior="fixed",mprior.size=6) #plot Nb.1 based on aggregate results postdist= plotModelsize(mm) #plot based only on 30 best models plotModelsize(mm[1:30],exact=TRUE,include.legend=FALSE) #plot based on all best models, but showing distribution only for model sizes 1 to 20 plotModelsize(mm,exact=TRUE,ksubset=1:20) # create a plot similar to plot Nb. 1 plot(postdist$dens,type="l") lines(mm$mprior.info$mp.Kdist)
Returns the posterior model probabilites for the best models encountered by a 'bma' object
pmp.bma(bmao, oldstyle = FALSE)
pmp.bma(bmao, oldstyle = FALSE)
bmao |
A bma object (see argument |
oldstyle |
For normal use, leave this at |
A call to bms with an MCMC sampler (e.g.
bms(datafls,mcmc="bd",nmodel=100)
uses a Metropolis-Hastings
algorithm to sample through the model space - and the frequency of how often
models are drawn converges to the distribution of their posterior marginal
likelihoods. While sampling, each 'bma' object stores the best models
encountered by its sampling chain with their marginal likelihood and their
MCMC frequencies.
pmp.bma
then allows for comparing the posterior model probabilities
(PMPs) for the two different methods, similar to plotConv
. It
calculates the PMPs based on marginal likelihoods (first column) and the
PMPs based on MCMC frequencies (second column) for the best x models stored
in the bma object.
The correlation of the two columns is an indicator of how well the MCMC
sampler has converged to the actual PMP distribution - it is therefore also
given in the output of summary.bma
.
The second column is slightly different in case the bms
argument mcmc
was set to mcmc="enumeration"
: In this case, the
second column is also based on marginal likelihoods. The correlation between
the two columns is therefore one.
the result is a matrix, its row names describe the model binaries
There are two columns in the matrix:
PMP (Exact) |
posterior model
probabilities based on the posterior likelihoods of the best models in
|
PMP (MCMC) |
posterior model probabilities of the best
models in |
The second column thus shows the PMPs of the best models relative to
all models the call to bms
has sampled through (therefore
typically the second column adds up to less than one). The first column
relates to the likelihoods of the best models, therefore it would add up to
1. In order estimate for their marginal likelihoods with respect to the
other models (the ones not retained in the best models), these PMP aadding
up to one are multiplied with the sum of PMP of the best models accroding to
MCMC frequencies. Therefore, the two columns have the same column sum.
CAUTION: In package versions up to BMS 0.2.5
, the first column was
indeed set always equal to one. This behaviour can still be mimicked by
setting oldstyle=TRUE
.
plotConv
for plotting pmp.bma
,
pmpmodel
to obtain the PMP for any individual model,
bms
for sampling bma objects
Check http://bms.zeugner.eu for additional help.
## sample BMA for growth dataset, MCMC sampler data(datafls) mm=bms(datafls[,1:10],nmodel=20, mcmc="bd") ## mmodel likelihoods and MCMC frequencies of best 20 models print(mm$topmod) pmp.bma(mm) #first column: posterior model prob based on model likelihoods, # relative to best models in 'mm' #second column: posterior model prob based MCMC frequencies, # relative to all models encountered by 'mm' #consequently, first column adds up to one #second column shows how much of the sampled model space is # contained in the best models colSums(pmp.bma(mm)) #correlation betwwen the two shows how well the sampler converged cor(pmp.bma(mm)[,1],pmp.bma(mm)[,2]) #is the same as given in summary.bma summary(mm)["Corr PMP"] #plot the two model probabilites plotConv(mm) #equivalent to the following chart plot(pmp.bma(mm)[,2], type="s") lines(pmp.bma(mm)[,1],col=2) #moreover, note how the first column is constructed liks=exp(mm$top$lik()) liks/sum(liks) pmp.bma(mm)[,1] #these two are equivalent #the example above does not converge well, #too few iterations and best models # this is already better, but also not good mm=bms(datafls[,1:10],burn=2000,iter=5000,nmodel=200) # in case the sampler has been 'enumeration' instead of MCMC, # then both matrix columns are of course equivalent mm=bms(datafls[,1:10],nmodel=512,mcmc="enumeration") cor(pmp.bma(mm)[,1],pmp.bma(mm)[,2]) colSums(pmp.bma(mm))
## sample BMA for growth dataset, MCMC sampler data(datafls) mm=bms(datafls[,1:10],nmodel=20, mcmc="bd") ## mmodel likelihoods and MCMC frequencies of best 20 models print(mm$topmod) pmp.bma(mm) #first column: posterior model prob based on model likelihoods, # relative to best models in 'mm' #second column: posterior model prob based MCMC frequencies, # relative to all models encountered by 'mm' #consequently, first column adds up to one #second column shows how much of the sampled model space is # contained in the best models colSums(pmp.bma(mm)) #correlation betwwen the two shows how well the sampler converged cor(pmp.bma(mm)[,1],pmp.bma(mm)[,2]) #is the same as given in summary.bma summary(mm)["Corr PMP"] #plot the two model probabilites plotConv(mm) #equivalent to the following chart plot(pmp.bma(mm)[,2], type="s") lines(pmp.bma(mm)[,1],col=2) #moreover, note how the first column is constructed liks=exp(mm$top$lik()) liks/sum(liks) pmp.bma(mm)[,1] #these two are equivalent #the example above does not converge well, #too few iterations and best models # this is already better, but also not good mm=bms(datafls[,1:10],burn=2000,iter=5000,nmodel=200) # in case the sampler has been 'enumeration' instead of MCMC, # then both matrix columns are of course equivalent mm=bms(datafls[,1:10],nmodel=512,mcmc="enumeration") cor(pmp.bma(mm)[,1],pmp.bma(mm)[,2]) colSums(pmp.bma(mm))
Returns the posterior model probability for any model based on bma results
pmpmodel(bmao, model = numeric(0), exact = TRUE)
pmpmodel(bmao, model = numeric(0), exact = TRUE)
bmao |
A bma object as created by |
model |
A model index - either variable names, or a logical with model
binaries, or the model hexcode (cf. |
exact |
If |
If the model as provided in model
is the null or the full model, or
is contained in bmao
's topmod object (cf. argument nmodel
in
bms
),
then the result is the same as in
pmp.bma
.
If not and exact=TRUE
, then pmpmodel
estimates the model based on comparing its marginal likelihood (times model
prior) to the likelihoods in the topmod
object and multiplying by
their sum of PMP according to MCMC frequencies,
A scalar with (an estimate of) the posterior model probability for
model
pmp.bma
for similar
functions
Check http://bms.zeugner.eu for additional help.
## sample BMA for growth dataset, enumeration sampler data(datafls) mm=bms(datafls[,1:10],nmodel=5) #show the best 5 models: pmp.bma(mm) #first column: posterior model prob based on model likelihoods, #second column: posterior model prob based MCMC frequencies, ### Different ways to get the same result: ######### #PMP of 2nd-best model (hex-code representation) pmpmodel(mm,"00c") #PMP of 2nd-best model (binary representation) incls=as.logical(beta.draws.bma(mm)[,2]) pmpmodel(mm,incls) #PMP of 2nd-best model (via variable names) #names of regressors in model "00c": names(datafls[,2:10])[incls] pmpmodel(mm,c("SubSahara", "LatAmerica")) #PMP of 2nd-best model (via positions) pmpmodel(mm,c(6,7)) ####PMP of another model ######### pmpmodel(mm,1:5)
## sample BMA for growth dataset, enumeration sampler data(datafls) mm=bms(datafls[,1:10],nmodel=5) #show the best 5 models: pmp.bma(mm) #first column: posterior model prob based on model likelihoods, #second column: posterior model prob based MCMC frequencies, ### Different ways to get the same result: ######### #PMP of 2nd-best model (hex-code representation) pmpmodel(mm,"00c") #PMP of 2nd-best model (binary representation) incls=as.logical(beta.draws.bma(mm)[,2]) pmpmodel(mm,incls) #PMP of 2nd-best model (via variable names) #names of regressors in model "00c": names(datafls[,2:10])[incls] pmpmodel(mm,c("SubSahara", "LatAmerica")) #PMP of 2nd-best model (via positions) pmpmodel(mm,c(6,7)) ####PMP of another model ######### pmpmodel(mm,1:5)
Returns posterior residual variance, deviance, or pseudo R-squared, according to the chosen prior structure
post.var(object, exact = FALSE, ...)
post.var(object, exact = FALSE, ...)
object |
|
exact |
When |
... |
further arguments passed to or from other methods |
post.var
: Posterior residual variance as according to the prior
definitions contained in object
post.pr2
: A
pseudo-R-squared corresponding to unity minus posterior variance over
dependent variance. deviance.bma
: returns the
deviance
of a bma
model as returned from
bms
. deviance.zlm
: returns the
deviance
of a zlm
model.
bms
for creating bma
objects and priors,
zlm
object.
Check http://bms.zeugner.eu for additional help.
data(datafls) mm=bms(datafls[,1:10]) deviance(mm)/nrow(datafls) # is equivalent to post.var(mm) post.pr2(mm) # is equivalent to 1 - post.var(mm) / ( var(datafls[,1])*(1-1/nrow(datafls)) )
data(datafls) mm=bms(datafls[,1:10]) deviance(mm)/nrow(datafls) # is equivalent to post.var(mm) post.pr2(mm) # is equivalent to 1 - post.var(mm) / ( var(datafls[,1])*(1-1/nrow(datafls)) )
Predictive densities for conditional forecasts
pred.density(object, newdata = NULL, n = 300, hnbsteps = 30, ...)
pred.density(object, newdata = NULL, n = 300, hnbsteps = 30, ...)
object |
|
newdata |
A data.frame, matrix or vector containing variables with which to predict. |
n |
The integer number of equally spaced points at which the density is to be estimated. |
hnbsteps |
The number of numerical integration steps to be used in case
of a hyper-g prior (cf. argument |
... |
arguments to be passed on to |
The predictive density is a mixture density based on the nmodels
best
models in a bma
object (cf. nmodel
in bms
).
The number of 'best models' to retain is therefore vital and should be set
quite high for accuracy.
pred.density
returns a list of class pred.density
with
the following elements
densities() |
a list whose elements each contain the estimated density for each forecasted observation |
fit |
a vector with the expected values of the predictions (the 'point forecasts') |
std.err |
a vector with the standard deviations of the predictions (the 'standard errors') |
dyf(realized.y , predict_index=NULL)
|
Returns the
densities of realized response variables provided in |
lps(realized.y , predict_index=NULL)
|
Computes the log predictive score
for the response varaible provided in |
plot((x , predict_index =
NULL , addons = "eslz" , realized.y = NULL , addons.lwd = 1.5 , ...)
|
the same
as |
n |
The number of equally spaced
points for which the density (under |
nmodel |
The number of best models predictive densities are based upon. |
call |
the call that created this |
In BMS version 0.3.0, pred.density
may only cope with built-in
gprior
s, not with any user-defined priors.
predict.bma
for simple point forecasts,
plot.pred.density
for plotting predictive densities,
lps.bma
for calculating the log predictive score
independently, quantile.pred.density
for extracting quantiles
Check http://bms.zeugner.eu for additional help.
data(datafls) mm=bms(datafls,user.int=FALSE) #predictive densityfor two 'new' data points pd=pred.density(mm,newdata=datafls[1:2,]) #fitted values based on best models, same as predict(mm, exact=TRUE) pd$fit #plot the density for the first forecast observation plot(pd,1) # the same plot ' naked' plot(pd$densities()[[1]]) #predict density for the first forecast observation if the dep. variable is 0 pd$dyf(0,1) #predict densities for both forecasts for the realizations 0 and 0.5 pd$dyf(rbind(c(0,.5),c(0,.5))) # calc. Log Predictive Score if both forecasts are realized at 0: lps.bma(pd,c(0,0))
data(datafls) mm=bms(datafls,user.int=FALSE) #predictive densityfor two 'new' data points pd=pred.density(mm,newdata=datafls[1:2,]) #fitted values based on best models, same as predict(mm, exact=TRUE) pd$fit #plot the density for the first forecast observation plot(pd,1) # the same plot ' naked' plot(pd$densities()[[1]]) #predict density for the first forecast observation if the dep. variable is 0 pd$dyf(0,1) #predict densities for both forecasts for the realizations 0 and 0.5 pd$dyf(rbind(c(0,.5),c(0,.5))) # calc. Log Predictive Score if both forecasts are realized at 0: lps.bma(pd,c(0,0))
Expected value of prediction based on 'bma' object
## S3 method for class 'bma' predict(object, newdata = NULL, exact = FALSE, topmodels = NULL, ...)
## S3 method for class 'bma' predict(object, newdata = NULL, exact = FALSE, topmodels = NULL, ...)
object |
a bma object - see |
newdata |
An optional data.frame, matrix or vector containing variables with which to predict. If omitted, then (the expected values of) the fitted values are returned. |
exact |
If |
topmodels |
index of the models with whom to predict: for instance,
|
... |
further arguments passed to or from other methods. |
A vector with (expected values of) fitted values.
coef.bma
for obtaining coefficients,
bms
for creating bma objects, predict.lm
for a
comparable function
Check http://bms.zeugner.eu for additional help.
data(datafls) mm=bms(datafls,user.int=FALSE) predict(mm) #fitted values based on MCM frequencies predict(mm, exact=TRUE) #fitted values based on best models predict(mm, newdata=1:41) #prediction based on MCMC frequencies predict(mm, newdata=datafls[1,], exact=TRUE) #prediction based on a data.frame # the following two are equivalent: predict(mm, topmodels=1:10) predict(mm[1:10], exact=TRUE)
data(datafls) mm=bms(datafls,user.int=FALSE) predict(mm) #fitted values based on MCM frequencies predict(mm, exact=TRUE) #fitted values based on best models predict(mm, newdata=1:41) #prediction based on MCMC frequencies predict(mm, newdata=datafls[1,], exact=TRUE) #prediction based on a data.frame # the following two are equivalent: predict(mm, topmodels=1:10) predict(mm[1:10], exact=TRUE)
Expected value (And standard errors) of predictions based on 'zlm' linear Bayesian model under Zellner's g prior
## S3 method for class 'zlm' predict(object, newdata = NULL, se.fit = FALSE, ...)
## S3 method for class 'zlm' predict(object, newdata = NULL, se.fit = FALSE, ...)
object |
a zlm linear model object - see |
newdata |
An optional data.frame, matrix or vector containing variables with which to predict. If omitted, then (the expected values of) the fitted values are returned. |
se.fit |
A switch indicating if the standard deviations for the predicted varaibles are required. |
... |
further arguments passed to or from other methods. |
A vector with (expected values of) fitted values.
If
se.fit
is TRUE
, then the output is a list with the following
elements:
fit |
a vector with the expected values of fitted values |
std.err |
a vector with the standard deviations of fitted values |
se.fit |
a vector with the standard errors without the residual scale
akin to |
residual.scale |
The part from the standard deviations that involves the identity matrix.
Note that |
bms
for creating zlm objects,
predict.lm
for a comparable function,
predict.bma
for predicting with bma objects
Check http://bms.zeugner.eu for additional help.
data(datafls) mm=zlm(datafls,g="EBL") predict(mm) #fitted values predict(mm, newdata=1:41) #prediction based on a 'new data point' #prediction based on a 'new data point', with 'standard errors' predict(mm, newdata=datafls[1,], se.fit=TRUE)
data(datafls) mm=zlm(datafls,g="EBL") predict(mm) #fitted values predict(mm, newdata=1:41) #prediction based on a 'new data point' #prediction based on a 'new data point', with 'standard errors' predict(mm, newdata=datafls[1,], se.fit=TRUE)
Print method for objects of class 'topmod', typically the best models stored in a 'bma' object
## S3 method for class 'topmod' print(x, ...)
## S3 method for class 'topmod' print(x, ...)
x |
an object of class 'topmod' - see |
... |
additional arguments passed to |
See pmp.bma
for an explanation of likelihood vs. MCMC
frequency concepts
if x
contains more than one model, then the function returns
a 2-column matrix:
Row Names |
show the model binaries in hexcode |
Column 'Marg.Log.Lik' |
shows the
marginal log-likelihoods of the models in |
Column 'MCMC
Freq' |
shows the MCMC frequencies of the models in |
if x
contains only one model, then more detailed information is shown
for this model:
first line |
'Model Index' provides the model binary in
hexcode, 'Marg.Log.Lik' its marginal log likelhood, 'Sampled Freq.' how
often it was accepted (function |
Estimates |
first column: covariate indices included in the model,
second column: posterior expected value of the coefficients, third column:
their posterior standard deviations (excluded if no coefficients were stored
in the topmod object - cf. argument |
Included Covariates |
the model binary |
Additional
Statistics |
any custom additional statistics saved with the model |
topmod
for creating topmod objects, bms
for their typical use, pmp.bma
for comparing posterior model
probabilities
Check http://bms.zeugner.eu for additional help.
# do some small-scale BMA for demonstration data(datafls) mm=bms(datafls[,1:10],nmodel=20) #print info on the best 20 models print(mm$topmod) print(mm$topmod,digits=10) #equivalent: cbind(mm$topmod$lik(),mm$topmod$ncount()) #now print info only for the second-best model: print(mm$topmod[2]) #compare 'Included Covariates' to: topmodels.bma(mm[2]) #and to as.vector(mm$topmod[2]$bool_binary())
# do some small-scale BMA for demonstration data(datafls) mm=bms(datafls[,1:10],nmodel=20) #print info on the best 20 models print(mm$topmod) print(mm$topmod,digits=10) #equivalent: cbind(mm$topmod$lik(),mm$topmod$ncount()) #now print info only for the second-best model: print(mm$topmod[2]) #compare 'Included Covariates' to: topmodels.bma(mm[2]) #and to as.vector(mm$topmod[2]$bool_binary())
Quantiles for objects of class "density", "pred.density" or "coef.density"
## S3 method for class 'density' quantile(x, probs = seq(0.25, 0.75, 0.25), names = TRUE, normalize = TRUE, ...) ## S3 method for class 'coef.density' quantile(x, probs = seq(0.25, 0.75, 0.25), names = TRUE, ...) ## S3 method for class 'pred.density' quantile(x, probs = seq(0.25, 0.75, 0.25), names = TRUE, ...)
## S3 method for class 'density' quantile(x, probs = seq(0.25, 0.75, 0.25), names = TRUE, normalize = TRUE, ...) ## S3 method for class 'coef.density' quantile(x, probs = seq(0.25, 0.75, 0.25), names = TRUE, ...) ## S3 method for class 'pred.density' quantile(x, probs = seq(0.25, 0.75, 0.25), names = TRUE, ...)
x |
a object of class |
probs |
numeric vector of probabilities with values in [0,1] - elements
very close to the boundaries return |
names |
logical; if |
normalize |
logical; if |
... |
further arguments passed to or from other methods. |
The methods quantile.coef.density
and quantile.pred.density
both apply quantile.density
to densities nested with object of class
coef.density
or pred.density
.
The function
quantile.density
applies generically to the built-in class
density
(as least for versions where there is no such method
in the pre-configured packages).
Note that quantile.density
relies
on trapezoidal integration in order to compute the cumulative densities
necessary for the calculation of quantiles.
If x
is of class density
(or a list with exactly one
element), a vector with quantiles.
If x
is a list
of
densities with more than one element (e.g. as resulting from
pred.density
or coef.density
), then the output is a matrix of
quantiles, with each matrix row corresponding to the respective density.
Stefan Zeugner
quantile.default
for a comparable function,
pred.density
and density.bma
for the
BMA-specific objects.
Check http://bms.zeugner.eu for additional help.
data(datafls) mm = bms(datafls[1:70,], user.int=FALSE) #predict last two observations with preceding 70 obs: pmm = pred.density(mm, newdata=datafls[71:72,], plot=FALSE) #'standard error' quantiles quantile(pmm, c(.05, .95)) #Posterior density for Coefficient of "GDP60" cmm = density(mm, reg="GDP60", plot=FALSE) quantile(cmm, probs=c(.05, .95)) #application to generic density: dd1 = density(rnorm(1000)) quantile(dd1) ## Not run: #application to list of densities: quantile.density( list(density(rnorm(1000)), density(rnorm(1000))) ) ## End(Not run)
data(datafls) mm = bms(datafls[1:70,], user.int=FALSE) #predict last two observations with preceding 70 obs: pmm = pred.density(mm, newdata=datafls[71:72,], plot=FALSE) #'standard error' quantiles quantile(pmm, c(.05, .95)) #Posterior density for Coefficient of "GDP60" cmm = density(mm, reg="GDP60", plot=FALSE) quantile(cmm, probs=c(.05, .95)) #application to generic density: dd1 = density(rnorm(1000)) quantile(dd1) ## Not run: #application to list of densities: quantile.density( list(density(rnorm(1000)), density(rnorm(1000))) ) ## End(Not run)
summary method for class "zlm
"
## S3 method for class 'zlm' summary(object, printout = TRUE, ...)
## S3 method for class 'zlm' summary(object, printout = TRUE, ...)
object |
an object of class |
printout |
If |
... |
further arguments passed to or from other methods |
summary.zlm
prints out coefficients expected values and their
standard deviations, as well as information on the gprior and the log
marginal likelihood. However, it invisibly returns a list with elements as
described below:
A list
with the following elements
residuals |
The expected value of residuals from the model |
coefficients |
The posterior expected values of coefficients (including the intercept) |
coef.sd |
Posterior standard deviations of the coefficients (the
intercept SD is |
gprior |
The g prior as it has been submitted to |
E.shrinkage |
the shrinkage factor |
SD.shrinkage |
(Optionally) the shrinkage factor's posterior standard deviation (in case of a hyper-g prior) |
log.lik |
The log marginal likelihood of the model |
Stefan Zeugner
zlm
for creating zlm
objects,
link{summary.lm}
for a similar function on OLS models
See also http://bms.zeugner.eu for additional help.
data(datafls) #simple example foo = zlm(datafls) summary(foo) sfoo = summary(foo,printout=FALSE) print(sfoo$E.shrinkage)
data(datafls) #simple example foo = zlm(datafls) summary(foo) sfoo = summary(foo,printout=FALSE) print(sfoo$E.shrinkage)
Create or use an updateable list keeping the best x models it encounters (for advanced users)
topmod( nbmodels, nmaxregressors = NA, bbeta = FALSE, lengthfixedvec = 0, liks = numeric(0), ncounts = numeric(0), modelbinaries = matrix(0, 0, 0), betas = matrix(0, 0, 0), betas2 = matrix(0, 0, 0), fixed_vector = matrix(0, 0, 0) )
topmod( nbmodels, nmaxregressors = NA, bbeta = FALSE, lengthfixedvec = 0, liks = numeric(0), ncounts = numeric(0), modelbinaries = matrix(0, 0, 0), betas = matrix(0, 0, 0), betas2 = matrix(0, 0, 0), fixed_vector = matrix(0, 0, 0) )
nbmodels |
The maximum number of models to be retained by the topmod object |
nmaxregressors |
The maximum number of covariates the models in the topmod object are allowed to have |
bbeta |
if |
lengthfixedvec |
The length of an optional fixed vector adhering to
each model (for instance R-squared, etc). If |
liks |
optional vector of log-likelihoods to initialize topmod object
with (length must be |
ncounts |
optional vector of MCMC frequencies to initialize topmod
object with (same length as |
modelbinaries |
optional matrix whose columns detail model binaries to
initialize topmod object with (same nb columns as |
betas |
optional matrix whose columns are coefficients to initialize
topmod object with (same dimensions as |
betas2 |
optional matrix whose columns are coefficients' second moments
to initialize topmod object with (same dimensions as |
fixed_vector |
optional matrix whose columns are a fixed vector
initialize topmod object with (same |
A 'topmod' object (as created by topmod
) holds three basic vectors:
lik
(for the (log) likelihood of models or similar), bool()
for a hexcode presentation of the model binaries (cf. bin2hex
)
and ncount() for the times the models have been drawn.
All these vectors
are sorted descendantly by lik
, and are of the same length. The
maximum length is limited by the argument nbmodels
.
If tmo
is a topmod object, then a call to tmo$addmodel
(e.g.
tmo$addmodel(mylik=4,vec01=c(T,F,F,T))
updates the object tmo
by a model represented by vec01
(here the one including the first and
fourth regressor) and the marginal (log) likelihood lik
(here: 4).
If this model is already part of tmo
, then its respective
ncount
entry is incremented by one; else it is inserted into a
position according to the ranking of lik
.
In addition, there is the possibility to save (the first moments of)
coefficients of a model (betas
) and their second moments
(betas2
), as well as an arbitrary vector of statistics per model
(fixed_vector
).
is.topmod
returns TRUE
if the argument is of class 'topmod'
a call to topmod
returns a list of class "topmod" with the
following elements:
addmodel(mylik , vec01 , vbeta=numeric(0) , vbeta2=numeric(0) , fixedvec=numeric(0))
|
function
that adjusts the list of models in the 'topmod' object (see Details).
|
lik() |
A numeric vector of the best models (log) likelihoods, in decreasing order |
bool() |
A character vector of hexmode expressions
for the model binaries (cf. |
ncount() |
A numeric vector of MCMC frequencies for the best models
(i.e. how often the respective model was introduced by |
nbmodels |
Returns the argument |
nregs |
Returns
the argument |
bool_binary() |
Returns a matrix
whose columns present the models conforming to |
betas() |
a matrix whose columns are the coefficients conforming to
|
betas2() |
similar to |
fixed_vector() |
The columns of this matrix return the
|
topmod
is rather intended as a building block for programming;
it has no direct application for a user of the BMS package.
the object resulting from bms
includes an element of
class 'topmod'
Check http://bms.zeugner.eu for additional help.
#standard use tm= topmod(2,4,TRUE,0) #should keep a maximum two models tm$addmodel(-2.3,c(1,1,1,1),1:4,5:8) #update with some model tm$addmodel(-2.2,c(0,1,1,1),1:3,5:7) #add another model tm$addmodel(-2.2,c(0,1,1,1),1:3,5:7) #add it again -> adjust ncount tm$addmodel(-2.5,c(1,0,0,1),1:2,5:6) #add another model #read out tm$lik() tm$ncount() tm$bool_binary() tm$betas() is.topmod(tm) #extract a topmod oobject only containing the second best model tm2=tm[2] #advanced: should return the same result as #initialize tm2= topmod(2,4,TRUE,0, liks = c(-2.2,-2.3), ncounts = c(2,1), modelbinaries = cbind(c(0,1,1,1),c(1,1,1,1)), betas = cbind(0:3,1:4), betas2 = cbind(c(0,5:7),5:8)) #update tm$addmodel(-2.5,c(1,0,0,1),1:2,5:6) #add another model #read out tm$lik() tm$ncount() tm$bool_binary() tm$betas()
#standard use tm= topmod(2,4,TRUE,0) #should keep a maximum two models tm$addmodel(-2.3,c(1,1,1,1),1:4,5:8) #update with some model tm$addmodel(-2.2,c(0,1,1,1),1:3,5:7) #add another model tm$addmodel(-2.2,c(0,1,1,1),1:3,5:7) #add it again -> adjust ncount tm$addmodel(-2.5,c(1,0,0,1),1:2,5:6) #add another model #read out tm$lik() tm$ncount() tm$bool_binary() tm$betas() is.topmod(tm) #extract a topmod oobject only containing the second best model tm2=tm[2] #advanced: should return the same result as #initialize tm2= topmod(2,4,TRUE,0, liks = c(-2.2,-2.3), ncounts = c(2,1), modelbinaries = cbind(c(0,1,1,1),c(1,1,1,1)), betas = cbind(0:3,1:4), betas2 = cbind(c(0,5:7),5:8)) #update tm$addmodel(-2.5,c(1,0,0,1),1:2,5:6) #add another model #read out tm$lik() tm$ncount() tm$bool_binary() tm$betas()
An updateable list keeping the best x models it encounters in any kind of model iteration
Objects can be created by calls to
topmod
, or indirectly by calls to bms
.
A 'topmod' object (as created by topmod
) holds three basic vectors:
lik
(for the (log) likelihood of models or similar), bool()
for a hexcode presentation of the model binaries (cf. bin2hex
)
and ncount() for the times the models have been drawn.
All these vectors
are sorted descendantly by lik
, and are of the same length. The
maximum length is limited by the argument nbmodels
.
If tmo
is a topmod object, then a call to tmo$addmodel
(e.g.
tmo$addmodel(mylik=4,vec01=c(T,F,F,T))
updates the object tmo
by a model represented by vec01
(here the one including the first and
fourth regressor) and the marginal (log) likelihood lik
(here: 4).
If this model is already part of tmo
, then its respective
ncount
entry is incremented by one; else it is inserted into a
position according to the ranking of lik
.
In addition, there is
the possibility to save (the first moments of) coefficients of a model
(betas
) and their second moments (betas2
), as well as an
arbitrary vector of statistics per model (fixed_vector
).
Martin Feldkircher and Stefan Zeugner
topmod
to create topmod
objects and a more
detailed description,
is.topmod
to test for this class
tm= topmod(2,4,TRUE,0) #should keep a maximum two models tm$addmodel(-2.3,c(1,1,1,1),1:4,5:8) #update with some model tm$addmodel(-2.2,c(0,1,1,1),1:3,5:7) #add another model tm$addmodel(-2.2,c(0,1,1,1),1:3,5:7) #add it again -> adjust ncount tm$addmodel(-2.5,c(1,0,0,1),1:2,5:6) #add another model #read out tm$lik() tm$ncount() tm$bool_binary() tm$betas()
tm= topmod(2,4,TRUE,0) #should keep a maximum two models tm$addmodel(-2.3,c(1,1,1,1),1:4,5:8) #update with some model tm$addmodel(-2.2,c(0,1,1,1),1:3,5:7) #add another model tm$addmodel(-2.2,c(0,1,1,1),1:3,5:7) #add it again -> adjust ncount tm$addmodel(-2.5,c(1,0,0,1),1:2,5:6) #add another model #read out tm$lik() tm$ncount() tm$bool_binary() tm$betas()
Returns a matrix whose columns show which covariates were included in the best models in a 'bma' object. The last two columns detail posterior model probabilities.
topmodels.bma(bmao)
topmodels.bma(bmao)
bmao |
an object of class 'bma' - see |
Each bma class (the result of bms) contains 'top models', the x models with tthe best analytical likelihood that bms had encountered while sampling
See pmp.bma
for an explanation of likelihood vs. MCMC
frequency concepts
Each column in the resulting matrix corresponds to one of the 'best' models in bmao: the first column for the best model, the second for the second-best model, etc. The model binaries have elements 1 if the regressor given by the row name was included in the respective models, and 0 otherwise. The second-last row shows the model's posterior model probability based on marginal likelihoods (i.e. its marginal likelihood over the sum of likelihoods of all best models) The last row shows the model's posterior model probability based on MCMC frequencies (i.e. how often the model was accepted vs sum of acceptance of all models) Note that the column names are hexcode representations of the model binaries (e.g. "03" for c(0,0,0,1,0,0))
topmod
for creating topmod objects, bms
for their typical use, pmp.bma
for comparing posterior model
probabilities
Check http://bms.zeugner.eu for additional help.
data(datafls) #sample with a limited data set for demonstration mm=bms(datafls[,1:12],nmodel=20) #show binaries for all topmodels.bma(mm) #show binaries for 2nd and 3rd best model, without the model probs topmodels.bma(mm[2:3])[1:11,] #access model binaries directly mm$topmod$bool_binary()
data(datafls) #sample with a limited data set for demonstration mm=bms(datafls[,1:12],nmodel=20) #show binaries for all topmodels.bma(mm) #show binaries for 2nd and 3rd best model, without the model probs topmodels.bma(mm[2:3])[1:11,] #access model binaries directly mm$topmod$bool_binary()
Simple utilities retrieving variable names and design matrix from a bma object
## S3 method for class 'bma' variable.names(object, ...)
## S3 method for class 'bma' variable.names(object, ...)
object |
A |
... |
further arguments passed to or from other methods |
All functions are bma
-functions for the generic methods
variable.names
, deviance
, and
model.frame
.
bms
for creating bma objects
Check http://bms.zeugner.eu for additional help.
data(datafls) bma_enum=bms(datafls[1:20,1:10]) model.frame(bma_enum) # similar to bma_enum$arguments$X.data variable.names(bma_enum)[-1] # is equivalent to bma_enum$reg.names
data(datafls) bma_enum=bms(datafls[1:20,1:10]) model.frame(bma_enum) # similar to bma_enum$arguments$X.data variable.names(bma_enum)[-1] # is equivalent to bma_enum$reg.names
Simple utilities retrieving variable names and design matrix from a bma object
## S3 method for class 'zlm' variable.names(object, ...)
## S3 method for class 'zlm' variable.names(object, ...)
object |
A |
... |
further arguments passed to or from other methods |
variable.names.zlm
: method variable.names
for a
zlm
model. vcov.zlm
: the posterior
variance-covariance matrix of the coefficients of a zlm
model
- cf. vcov
logLik.zlm
: a zlm
model's
log-likelihood p(y|M)
according to the implementation of the
respective coefficent prior
zlm
for creating zlm
objects
Check http://bms.zeugner.eu for additional help.
data(datafls) zz=zlm(datafls) variable.names(zz) vcov(zz) logLik(zz)
data(datafls) zz=zlm(datafls) variable.names(zz) vcov(zz) logLik(zz)
Used to fit the Bayesian normal-conjugate linear model with Zellner's g
prior and mean zero coefficient priors. Provides an object similar to the
lm
class.
zlm(formula, data = NULL, subset = NULL, g = "UIP")
zlm(formula, data = NULL, subset = NULL, g = "UIP")
formula |
an object of class "formula" (or one that can be coerced to
that class), such as a data.frame - cf. |
data |
an optional |
subset |
an optional vector specifying a subset of observations to be used in the fitting process. |
g |
specifies the hyperparameter on Zellner's g-prior for the
regression coefficients. |
zlm
estimates the coefficients of the following model where
~
and
is the design matrix
The priors on the intercept and the
variance
are improper:
,
Zellner's g affects the prior on coefficients:
~
.
Note that the prior mean
of coefficients is set to zero by default and cannot be adjusted. Note
moreover that zlm
always includes an intercept.
Returns a list of class zlm
that contains at least the
following elements (cf. lm
):
coefficients |
a named vector of posterior coefficient expected values |
residuals |
the residuals, that is response minus fitted values |
fitted.values |
the fitted mean values |
rank |
the numeric rank of the fitted linear model |
df.residual |
the residual degrees of freedom |
call |
the matched call |
terms |
the |
model |
the model frame used |
coef2moments |
a named vector of coefficient posterior second moments |
marg.lik |
the log marginal likelihood of the model |
gprior.info |
a list detailing information on
the g-prior, cf. output value |
Stefan Zeugner
The representation follows Fernandez, C. E. Ley and M. Steel (2001): Benchmark priors for Bayesian model averaging. Journal of Econometrics 100(2), 381–427
See also http://bms.zeugner.eu for additional help.
The methods summary.zlm
and predict.lm
provide additional insights into zlm
output.
The function
as.zlm
extracts a single out model of a bma
object (as
e.g. created throughbms
).
Moreover, lm
for
the standard OLS object, bms
for the application of zlm
in Bayesian model averaging.
Check http://bms.zeugner.eu for additional help.
data(datafls) #simple example foo = zlm(datafls) summary(foo) #example with formula and subset foo2 = zlm(y~GDP60+LifeExp, data=datafls, subset=2:70) #basic model, omitting three countries summary(foo2)
data(datafls) #simple example foo = zlm(datafls) summary(foo) #example with formula and subset foo2 = zlm(y~GDP60+LifeExp, data=datafls, subset=2:70) #basic model, omitting three countries summary(foo2)
A list holding output from the Bayesian Linar Model under Zellner's g prior akin to class 'lm'
Objects can be created via calls to
zlm
, but indirectly also via as.zlm
.zlm
estimates a Bayesian Linear Model under Zellner's g prior
- its output is very similar to objects of class lm
(cf.
section 'Value')
Martin Feldkircher and Stefan Zeugner
zlm
and as.zlm
for creating zlm
objects,density.zlm
, predict.zlm
and
summary.zlm
for other posterior results