Title: | Output Analysis and Diagnostics for MCMC |
---|---|
Description: | Provides functions for summarizing and plotting the output from Markov Chain Monte Carlo (MCMC) simulations, as well as diagnostic tests of convergence to the equilibrium distribution of the Markov chain. |
Authors: | Martyn Plummer [aut, cre, trl], Nicky Best [aut], Kate Cowles [aut], Karen Vines [aut], Deepayan Sarkar [aut], Douglas Bates [aut], Russell Almond [aut], Arni Magnusson [aut] |
Maintainer: | Martyn Plummer <[email protected]> |
License: | GPL (>= 2) |
Version: | 0.19-4.1 |
Built: | 2024-11-26 06:30:32 UTC |
Source: | CRAN |
the as.ts
method for mcmc
objects coerces an mcmc object to a time series.
## S3 method for class 'mcmc' as.ts(x,...)
## S3 method for class 'mcmc' as.ts(x,...)
x |
an mcmc object |
... |
unused arguments for compatibility with generic |
Martyn Plummer
autocorr
calculates the autocorrelation function for the
Markov chain mcmc.obj
at the lags given by lags
.
The lag values are taken to be relative to the thinning interval
if relative=TRUE
.
High autocorrelations within chains indicate slow mixing and, usually, slow convergence. It may be useful to thin out a chain with high autocorrelations before calculating summary statistics: a thinned chain may contain most of the information, but take up less space in memory. Re-running the MCMC sampler with a different parameterization may help to reduce autocorrelation.
autocorr(x, lags = c(0, 1, 5, 10, 50), relative=TRUE)
autocorr(x, lags = c(0, 1, 5, 10, 50), relative=TRUE)
x |
an mcmc object |
lags |
a vector of lags at which to calculate the autocorrelation |
relative |
a logical flag. TRUE if lags are relative to the thinning interval of the chain, or FALSE if they are absolute difference in iteration numbers |
A vector or array containing the autocorrelations.
Martyn Plummer
autocorr.diag
calculates the autocorrelation function for the
Markov chain mcmc.obj
at the lags given by lags
.
The lag values are taken to be relative to the thinning interval
if relative=TRUE
. Unlike autocorr
, if mcmc.obj
has many parmeters it only computes the autocorrelations with itself and
not the cross correlations. In cases where autocorr
would
return a matrix, this function returns the diagonal of the matrix.
Hence it is more useful for chains with many parameters, but may not
be as helpful at spotting parameters.
If mcmc.obj
is of class mcmc.list
then the returned
vector is the average autocorrelation across all chains.
autocorr.diag(mcmc.obj, ...)
autocorr.diag(mcmc.obj, ...)
mcmc.obj |
an object of class |
... |
optional arguments to be passed to |
A vector containing the autocorrelations.
Russell Almond
Plots the autocorrelation function for each variable in each chain in x.
autocorr.plot(x, lag.max, auto.layout = TRUE, ask, ...)
autocorr.plot(x, lag.max, auto.layout = TRUE, ask, ...)
x |
A Markov Chain |
lag.max |
Maximum value at which to calculate acf |
auto.layout |
If |
ask |
If |
... |
graphical parameters |
Effective standard deviation of population to produce the correct standard errors.
batchSE(x, batchSize=100)
batchSE(x, batchSize=100)
x |
An |
batchSize |
Number of observations to include in each batch. |
Because of the autocorrelation, the usual method of taking
var(x)/n
overstates the precision of the estimate. This method
works around the problem by looking at the means of batches of the
parameter. If the batch size is large enough, the batch means should
be approximately uncorrelated and the normal formula for computing the
standard error should work.
The batch standard error procedure is usually thought to be not as
accurate as the time series methods used in summary
and
effectiveSize
. It is included here for completeness.
A vector giving the standard error for each column of x
.
Russell Almond
Roberts, GO (1996) Markov chain concepts related to sampling algorithms, in Gilks, WR, Richardson, S and Spiegelhalter, DJ, Markov Chain Monte Carlo in Practice, Chapman and Hall, 45-58.
spectrum0.ar
, effectiveSize
,
summary.mcmc
bugs2jags
converts a WinBUGS data in the format called "S-Plus"
(i.e. the format created by the dput
function) and writes it in
dump
format used by JAGS.
NB WinBUGS stores its arrays in row order. This is different
from R and JAGS which both store arrays in column order. This
difference is taken into account by bugs2jags
which will
automatically reorder the data in arrays, without changing the
dimension.
Not yet available in S-PLUS.
bugs2jags(infile, outfile)
bugs2jags(infile, outfile)
infile |
name of the input file |
outfile |
name of the output file |
If the input file is saved from WinBUGS, it must be saved in plain text format. The default format for files saved from WinBUGS is a binary compound document format (with extension odc) that cannot be read by bugs2jags.
Martyn Plummer
Spiegelhalter DJ, Thomas A, Best NG and Lunn D (2003). WinBUGS version 1.4 user manual MRC Biostatistics Unit, Cambridge, UK.
coda.options
is a utility function that queries and sets
options for the codamenu()
function. These settings affect
the behaviour of the functions in the coda library only when they
are called via the codamenu()
interface.
The coda.options()
function behaves just like the
options()
function in the base library, with the additional
feature that coda.options(default=TRUE)
will reset all options
to the default values.
Options can be pretty-printed using the display.coda.options()
function, which groups the options into sections.
Available options are
Bandwidth function used when smoothing samples to produce density estimates. Defaults to Silverman's “Rule of thumb”.
Logical option that determines whether to combine multiple chains when calculating cross-correlations.
Logical option that determines whether to combine multiple chains when plotting.
Logical option that determines whether to combine multiple chains when calculating summary statistics.
For internal use only.
Logical option that determines whether to plot a density plot when plot methods are called for mcmc objects.
Number of significant digits to use when printing.
For Geweke diagnostic, fraction to use from start of chain. Defaults to 0.1
For Geweke diagnostic, fraction to use from end of chain. Default to 0.5.
For Geweke-Brooks plot, number of iterations to use per bin.
For Geweke-Brooks plot, maximum number of bins to
use. This option overrides gr.bin
.
For Heidelberger and Welch diagnostic, the target value for the ratio of half width to sample mean.
Logical option that controls whether to plot a smooth line through a trace plot when plotting MCMC output.
For Raftery and Lewis diagnostic, the target quantile to be estimated
For Raftery and Lewis diagnostic, the required precision.
For Raftery and Lewis diagnostic, the probability of obtaining an estimate in the interval (q-r, q+r).
Vector of quantiles to print when calculating summary statistics for MCMC output.
Logical option that determines whether to plot a trace of the sampled output when plotting MCMC output.
Logical option that determines whether current value of par("mfrow") should be used for plots (TRUE) or whether the optimal layout should be calculated (FALSE).
coda.options(...) display.coda.options(stats = FALSE, plots = FALSE, diags = FALSE)
coda.options(...) display.coda.options(stats = FALSE, plots = FALSE, diags = FALSE)
stats |
logical flag: show summary statistic options? |
plots |
logical flag: show plotting options? |
diags |
logical flag: show plotting options? |
... |
list of options |
Distribution function of the Cramer-von Mises distribution.
pcramer(q, eps)
pcramer(q, eps)
q |
vector of quantiles. |
eps |
accuracy required |
pcramer
gives the distribution function,
Anderson TW. and Darling DA. Asymptotic theory of certain 'goodness of fit' criteria based on stochastic processes. Ann. Math. Statist., 23, 192-212 (1952).
Csorgo S. and Faraway, JJ. The exact and asymptotic distributions of the Cramer-von Mises statistic. J. Roy. Stat. Soc. (B), 58, 221-234 (1996).
crosscorr
calculates cross-correlations between
variables in Markov Chain Monte Carlo output. If x
is an mcmc.list then all chains in x
are combined
before calculating the correlation.
crosscorr(x)
crosscorr(x)
x |
an |
A matrix or 3-d array containing the correlations.
crosscorr.plot
provides an image of the correlation matrix for
x
. If x
is an mcmc.list
object, then all chains
are combined.
The range [-1,1] is divided into a number of equal-length categories
given by the length of col
and assigned the corresponding color.
By default, topographic colours are used as this makes it easier to
distinguish positive and negative correlations.
crosscorr.plot (x, col = topo.colors(10), ...)
crosscorr.plot (x, col = topo.colors(10), ...)
x |
an |
col |
color palette to use |
... |
graphical parameters |
crosscorr
, image
, topo.colors
.
Plots the evolution of the sample quantiles as a function of the number of iterations.
cumuplot(x, probs=c(0.025,0.5,0.975), ylab="", lty=c(2,1), lwd=c(1,2), type="l", ask, auto.layout=TRUE, col=1, ...)
cumuplot(x, probs=c(0.025,0.5,0.975), ylab="", lty=c(2,1), lwd=c(1,2), type="l", ask, auto.layout=TRUE, col=1, ...)
x |
an |
probs |
vector of desired quantiles |
ylab , lty , lwd , type , col
|
graphical parameters |
auto.layout |
If |
ask |
If |
... |
further graphical parameters |
Arni Magnusson
Displays a plot of the density estimate for each variable in x
,
calculated by the density
function. For discrete-valued
variables, a histogram is produced.
densplot(x, show.obs = TRUE, bwf, ylim, xlab, ylab = "", type="l", main, right=TRUE, ...)
densplot(x, show.obs = TRUE, bwf, ylim, xlab, ylab = "", type="l", main, right=TRUE, ...)
x |
An |
show.obs |
Show observations along the x-axis |
bwf |
Function for calculating the bandwidth. If omitted, the bandwidth is calculate by 1.06 times the minimum of the standard deviation and the interquartile range divided by 1.34 times the sample size to the negative one fifth power |
ylim |
Limits on y axis. See |
xlab |
X-axis label. By default this will show the sample size
and the bandwidth used for smoothing. See |
ylab |
Y-axis label. By default, this is blank. See |
type |
Plot type. See |
main |
An overall title for the plot. See |
right |
Logical flag for discrete-valued distributions passed to
the |
.
... |
Further graphical parameters |
For discrete-valued distributions, a histogram is produced and values
are aggregated using the pretty() function. By default, tick marks
appear to the right of the corresponding bar in the histogram and give
the inclusive upper limit of the hist (right=TRUE
). This can be
modified by specifying right=FALSE
. In this case tick marks
appear on the left and specify the inclusive lower limit of the bar.
For continous distributions, if a variable is bounded below at 0, or bounded in the interval [0,1], then the data are reflected at the boundary before being passed to the density() function. This allows correct estimation of a non-zero density at the boundary.
You can call this function directly, but it is more usually called by
the plot.mcmc
function.
Sample size adjusted for autocorrelation.
effectiveSize(x)
effectiveSize(x)
x |
An mcmc or mcmc.list object. |
For a time series x
of length N
, the standard error of the
mean is the square root of var(x)/n
where n
is the
effective sample size. n = N
only when there is no
autocorrelation.
Estimation of the effective sample size requires estimating the
spectral density at frequency zero. This is done by the function
spectrum0.ar
For a mcmc.list
object, the effective sizes are summed across
chains. To get the size for each chain individually use
lapply(x,effectiveSize)
.
A vector giving the effective sample size for each column of x
.
The ‘potential scale reduction factor’ is calculated for each variable in
x
, together with upper and lower confidence limits. Approximate
convergence is diagnosed when the upper limit is close to 1. For
multivariate chains, a multivariate value is calculated that bounds
above the potential scale reduction factor for any linear combination
of the (possibly transformed) variables.
The confidence limits are based on the assumption that the stationary distribution of the variable under examination is normal. Hence the ‘transform’ parameter may be used to improve the normal approximation.
gelman.diag(x, confidence = 0.95, transform=FALSE, autoburnin=TRUE, multivariate=TRUE)
gelman.diag(x, confidence = 0.95, transform=FALSE, autoburnin=TRUE, multivariate=TRUE)
x |
An |
confidence |
the coverage probability of the confidence interval for the potential scale reduction factor |
transform |
a logical flag indicating whether variables in
|
autoburnin |
a logical flag indicating whether only the second half
of the series should be used in the computation. If set to TRUE
(default) and |
multivariate |
a logical flag indicating whether the multivariate potential scale reduction factor should be calculated for multivariate chains |
An object of class gelman.diag
. This is a list with the
following elements:
psrf |
A list containing the point estimates of the potential
scale reduction factor (labelled |
mpsrf |
The point estimate of the multivariate potential scale reduction
factor. This is NULL if there is only one variable in |
The gelman.diag
class has its own print
method.
Gelman and Rubin (1992) propose a general approach to monitoring
convergence of MCMC output in which parallel chains are run
with starting values that are overdispersed relative to the posterior
distribution. Convergence is diagnosed when the chains have ‘forgotten’
their initial values, and the output from all chains is
indistinguishable. The
gelman.diag
diagnostic is applied to a
single variable from the chain. It is based a comparison of within-chain
and between-chain variances, and is similar to a classical analysis of
variance.
There are two ways to estimate the variance of the stationary distribution:
the mean of the empirical variance within each chain, , and
the empirical variance from all chains combined, which can be expressed as
where is the number of iterations and
is the empirical
between-chain variance.
If the chains have converged, then both estimates are unbiased. Otherwise the first method will underestimate the variance, since the individual chains have not had time to range all over the stationary distribution, and the second method will overestimate the variance, since the starting points were chosen to be overdispersed.
The convergence diagnostic is based on the assumption that the target distribution is normal. A Bayesian credible interval can be constructed using a t-distribution with mean
and variance
and degrees of freedom estimated by the method of moments
Use of the t-distribution accounts for the fact that the mean and variance of the posterior distribution are estimated.
The convergence diagnostic itself is
Values substantially above 1 indicate lack of convergence. If the chains have not converged, Bayesian credible intervals based on the t-distribution are too wide, and have the potential to shrink by this factor if the MCMC run is continued.
The multivariate a version of Gelman and Rubin's diagnostic was proposed by Brooks and Gelman (1998). Unlike the univariate proportional scale reduction factor, the multivariate version does not include an adjustment for the estimated number of degrees of freedom.
Gelman, A and Rubin, DB (1992) Inference from iterative simulation using multiple sequences, Statistical Science, 7, 457-511.
Brooks, SP. and Gelman, A. (1998) General methods for monitoring convergence of iterative simulations. Journal of Computational and Graphical Statistics, 7, 434-455.
This plot shows the evolution of Gelman and Rubin's shrink factor as the number of iterations increases.
gelman.plot(x, bin.width = 10, max.bins = 50, confidence = 0.95, transform = FALSE, autoburnin=TRUE, auto.layout = TRUE, ask, col, lty, xlab, ylab, type, ...)
gelman.plot(x, bin.width = 10, max.bins = 50, confidence = 0.95, transform = FALSE, autoburnin=TRUE, auto.layout = TRUE, ask, col, lty, xlab, ylab, type, ...)
x |
an mcmc object |
bin.width |
Number of observations per segment, excluding the first segment which always has at least 50 iterations. |
max.bins |
Maximum number of bins, excluding the last one. |
confidence |
Coverage probability of confidence interval. |
transform |
Automatic variable transformation (see |
autoburnin |
Remove first half of sequence (see |
auto.layout |
If |
ask |
Prompt user before displaying each page of plots. Default is
|
col |
graphical parameter (see |
lty |
graphical parameter (see |
xlab |
graphical parameter (see |
ylab |
graphical parameter (see |
type |
graphical parameter (see |
... |
further graphical parameters. |
The Markov chain is divided into bins according to the arguments
bin.width
and max.bins
. Then the Gelman-Rubin shrink factor
is repeatedly calculated. The first shrink factor is calculated with
observations 1:50, the second with observations ,
the third contains samples
and so on.
If the chain has less than
iterations then
gelman.diag
will exit with an error.
A potential problem with gelman.diag
is that it may mis-diagnose
convergence if the shrink factor happens to be close to 1 by chance.
By calculating the shrink factor at several points in time,
gelman.plot
shows if the shrink factor has really converged, or
whether it is still fluctuating.
Brooks, S P. and Gelman, A. (1998) General Methods for Monitoring Convergence of Iterative Simulations. Journal of Computational and Graphical Statistics, 7, 434-455.
Geweke (1992) proposed a convergence diagnostic for Markov chains based on a test for equality of the means of the first and last part of a Markov chain (by default the first 10% and the last 50%). If the samples are drawn from the stationary distribution of the chain, the two means are equal and Geweke's statistic has an asymptotically standard normal distribution.
The test statistic is a standard Z-score: the difference between the two sample means divided by its estimated standard error. The standard error is estimated from the spectral density at zero and so takes into account any autocorrelation.
The Z-score is calculated under the assumption that the two parts of
the chain are asymptotically independent, which requires that the sum
of frac1
and frac2
be strictly less than 1.
geweke.diag(x, frac1=0.1, frac2=0.5)
geweke.diag(x, frac1=0.1, frac2=0.5)
x |
an mcmc object |
frac1 |
fraction to use from beginning of chain |
frac2 |
fraction to use from end of chain |
Z-scores for a test of equality of means between the first and last parts of the chain. A separate statistic is calculated for each variable in each chain.
Geweke, J. Evaluating the accuracy of sampling-based approaches to calculating posterior moments. In Bayesian Statistics 4 (ed JM Bernado, JO Berger, AP Dawid and AFM Smith). Clarendon Press, Oxford, UK.
If geweke.diag
indicates that the first and last part of a sample
from a Markov chain are not drawn from the same distribution, it may
be useful to discard the first few iterations to see if the rest of the
chain has "converged". This plot shows what happens to Geweke's Z-score
when successively larger numbers of iterations are discarded from the
beginning of the chain. To preserve the asymptotic conditions required
for Geweke's diagnostic, the plot never discards more than half the chain.
The first half of the Markov chain is divided into nbins - 1
segments, then Geweke's Z-score is repeatedly calculated. The first
Z-score is calculated with all iterations in the chain, the second
after discarding the first segment, the third after discarding the first
two segments, and so on. The last Z-score is calculated using only the
samples in the second half of the chain.
geweke.plot(x, frac1 = 0.1, frac2 = 0.5, nbins = 20, pvalue = 0.05, auto.layout = TRUE, ask, ...)
geweke.plot(x, frac1 = 0.1, frac2 = 0.5, nbins = 20, pvalue = 0.05, auto.layout = TRUE, ask, ...)
x |
an mcmc object |
frac1 |
fraction to use from beginning of chain. |
frac2 |
fraction to use from end of chain. |
nbins |
Number of segments. |
pvalue |
p-value used to plot confidence limits for the null hypothesis. |
auto.layout |
If |
ask |
If |
... |
Graphical parameters. |
The graphical implementation of Geweke's diagnostic was suggested by Steve Brooks.
heidel.diag
is a run length control diagnostic based on a criterion
of relative accuracy for the estimate of the mean. The default setting
corresponds to a relative accuracy of two significant digits.
heidel.diag
also implements a convergence diagnostic, and removes
up to half the chain in order to ensure that the means are estimated
from a chain that has converged.
heidel.diag(x, eps=0.1, pvalue=0.05)
heidel.diag(x, eps=0.1, pvalue=0.05)
x |
An |
eps |
Target value for ratio of halfwidth to sample mean |
pvalue |
significance level to use |
The convergence test uses the Cramer-von-Mises statistic to test the null hypothesis that the sampled values come from a stationary distribution. The test is successively applied, firstly to the whole chain, then after discarding the first 10%, 20%, ... of the chain until either the null hypothesis is accepted, or 50% of the chain has been discarded. The latter outcome constitutes ‘failure’ of the stationarity test and indicates that a longer MCMC run is needed. If the stationarity test is passed, the number of iterations to keep and the number to discard are reported.
The half-width test calculates a 95% confidence interval for the
mean, using the portion of the chain which passed the stationarity test.
Half the width of this interval is compared with the estimate of the mean.
If the ratio between the half-width and the mean is lower than eps
,
the halfwidth test is passed. Otherwise the length of the sample is
deemed not long enough to estimate the mean with sufficient accuracy.
The heidel.diag
diagnostic is based on the work of Heidelberger
and Welch (1983), who combined their earlier work on simulation run
length control (Heidelberger and Welch, 1981) with the work of Schruben
(1982) on detecting initial transients using Brownian bridge theory.
If the half-width test fails then the run should be extended. In order to avoid problems caused by sequential testing, the test should not be repeated too frequently. Heidelberger and Welch (1981) suggest increasing the run length by a factor I > 1.5, each time, so that estimate has the same, reasonably large, proportion of new data.
Heidelberger P and Welch PD. A spectral method for confidence interval generation and run length control in simulations. Comm. ACM. 24, 233-245 (1981)
Heidelberger P and Welch PD. Simulation run length control in the presence of an initial transient. Opns Res., 31, 1109-44 (1983)
Schruben LW. Detecting initialization bias in simulation experiments. Opns. Res., 30, 569-590 (1982).
Create Highest Posterior Density (HPD) intervals for the parameters in an MCMC sample.
HPDinterval(obj, prob = 0.95, ...) ## S3 method for class 'mcmc' HPDinterval(obj, prob = 0.95, ...) ## S3 method for class 'mcmc.list' HPDinterval(obj, prob = 0.95, ...)
HPDinterval(obj, prob = 0.95, ...) ## S3 method for class 'mcmc' HPDinterval(obj, prob = 0.95, ...) ## S3 method for class 'mcmc.list' HPDinterval(obj, prob = 0.95, ...)
obj |
The object containing the MCMC sample - usually of class
|
.
prob |
A numeric scalar in the interval (0,1) giving the target
probability content of the intervals. The nominal probability
content of the intervals is the multiple of |
... |
Optional additional arguments for methods. None are used at present. |
For each parameter the interval is constructed from the empirical cdf of the sample as the shortest interval for which the difference in the ecdf values of the endpoints is the nominal probability. Assuming that the distribution is not severely multimodal, this is the HPD interval.
For an "mcmc"
object, a matrix with columns "lower"
and
"upper"
and rows corresponding to the parameters. The
attribute "Probability"
is the nominal probability content of
the intervals. A list of such matrices is returned for an
"mcmc.list"
object.
Douglas Bates
data(line) HPDinterval(line)
data(line) HPDinterval(line)
Sample MCMC output from a simple linear regression model given in the BUGS manual.
data(line)
data(line)
An mcmc
object
Spiegelhalter, D.J., Thomas, A., Best, N.G. and Gilks, W.R. (1995) BUGS: Bayesian inference using Gibbs Sampling, Version 0.5, MRC Biostatistics Unit, Cambridge.
The function mcmc
is used to create a Markov Chain Monte Carlo
object. The input data are taken to be a vector, or a matrix with
one column per variable.
If the optional arguments start
, end
, and thin
are omitted then the chain is assumed to start with iteration 1 and
have thinning interval 1. If data
represents a chain that
starts at a later iteration, the first iteration in the chain should
be given as the start
argument. Likewise, if data
represents a chain that has already been thinned, the thinning
interval should be given as the thin
argument.
An mcmc object may be summarized by the summary
function
and visualized with the plot
function.
MCMC objects resemble time series (ts
) objects and have
methods for the generic functions time
, start
,
end
, frequency
and window
.
mcmc(data= NA, start = 1, end = numeric(0), thin = 1) as.mcmc(x, ...) is.mcmc(x)
mcmc(data= NA, start = 1, end = numeric(0), thin = 1) as.mcmc(x, ...) is.mcmc(x)
data |
a vector or matrix of MCMC output |
start |
the iteration number of the first observation |
end |
the iteration number of the last observation |
thin |
the thinning interval between consecutive observations |
x |
An object that may be coerced to an mcmc object |
... |
Further arguments to be passed to specific methods |
The format of the mcmc class has changed between coda version 0.3
and 0.4. Older mcmc objects will now cause is.mcmc
to
fail with an appropriate warning message. Obsolete mcmc objects can
be upgraded with the mcmcUpgrade
function.
Martyn Plummer
mcmc.list
,
mcmcUpgrade
,
thin
,
window.mcmc
,
summary.mcmc
,
plot.mcmc
.
These are methods for the generic functions as.matrix()
,
as.array()
and as.mcmc()
.
as.matrix()
strips the MCMC attributes from an mcmc
object and returns a matrix. If iters = TRUE
then a
column is added with the iteration number. For mcmc.list
objects, the rows of multiple chains are concatenated and, if
chains = TRUE
a column is added with the chain number.
mcmc.list
objects can be coerced to 3-dimensional arrays
with the as.array()
function.
An mcmc.list
object with a single chain can be coerced
to an mcmc
object with as.mcmc()
. If the argument
has multiple chains, this causes an error.
## S3 method for class 'mcmc' as.matrix(x, iters = FALSE, ...) ## S3 method for class 'mcmc.list' as.matrix(x, iters = FALSE, chains = FALSE, ...) ## S3 method for class 'mcmc.list' as.array(x, drop, ...)
## S3 method for class 'mcmc' as.matrix(x, iters = FALSE, ...) ## S3 method for class 'mcmc.list' as.matrix(x, iters = FALSE, chains = FALSE, ...) ## S3 method for class 'mcmc.list' as.array(x, drop, ...)
x |
An |
iters |
logical flag: add column for iteration number? |
chains |
logical flag: add column for chain number? (if mcmc.list) |
drop |
logical flag: if |
... |
optional arguments to the various methods |
The function ‘mcmc.list’ is used to represent parallel runs of the same chain, with different starting values and random seeds. The list must be balanced: each chain in the list must have the same iterations and the same variables.
Diagnostic functions which act on mcmc
objects may also be applied
to mcmc.list
objects. In general, the chains will be combined,
if this makes sense, otherwise the diagnostic function will be applied
separately to each chain in the list.
Since all the chains in the list have the same iterations, a single time
dimension can be ascribed to the list. Hence there are time series methods
time
, window
, start
, end
, frequency
and thin
for mcmc.list
objects.
An mcmc.list
can be indexed as if it were a single mcmc object
using the [
operator (see examples below). The [[
operator
selects a single mcmc
object from the list.
mcmc.list(...) as.mcmc.list(x, ...) is.mcmc.list(x)
mcmc.list(...) as.mcmc.list(x, ...) is.mcmc.list(x)
... |
a list of mcmc objects |
x |
an object that may be coerced to mcmc.list |
Martyn Plummer
mcmc
.
data(line) x1 <- line[[1]] #Select first chain x2 <- line[,1, drop=FALSE] #Select first var from all chains varnames(x2) == varnames(line)[1] #TRUE
data(line) x1 <- line[[1]] #Select first chain x2 <- line[,1, drop=FALSE] #Select first var from all chains varnames(x2) == varnames(line)[1] #TRUE
These are methods for subsetting mcmc
objects. You can select
iterations using the first dimension and variables using the second
dimension. Selecting iterations will return a vector or matrix, not an
mcmc
object. If you want to do row-subsetting of an mcmc
object and preserve its dimensions, use the window
function.
Subsetting applied to an mcmc.list
object will simultaneously
affect all the parallel chains in the object.
## S3 method for class 'mcmc' x[i,j, drop=missing(i)] ## S3 method for class 'mcmc.list' x[i,j, drop=TRUE]
## S3 method for class 'mcmc' x[i,j, drop=missing(i)] ## S3 method for class 'mcmc.list' x[i,j, drop=TRUE]
x |
An |
i |
Row to extract |
j |
Column to extract |
drop |
if |
In previous releases of CODA, an mcmc
object could
be a single or multiple chains. A new class mcmc.list
has now been introduced to deal with multiple chains and
mcmc
objects can only have data from a single chain.
Objects stored in the old format are now obsolete and must be upgraded.
mcmcUpgrade(x)
mcmcUpgrade(x)
x |
an obsolete |
Martyn Plummer
mcmc
.
The ‘mcpar’ attribute of an MCMC object gives the start iteration the end iteration and the thinning interval of the chain.
It resembles the ‘tsp’ attribute of time series (ts
) objects.
mcpar(x)
mcpar(x)
x |
An |
These functions give the dimensions of an MCMC object
returns the number of iterations.
returns the number of variables.
returns the number of parallel chains.
niter(x) nvar(x) nchain(x)
niter(x) nvar(x) nchain(x)
x |
An |
A numeric vector of length 1:
plot.mcmc
summarizes an mcmc or mcmc.list object
with a trace of the sampled output and a density estimate
for each variable in the chain.
## S3 method for class 'mcmc' plot(x, trace = TRUE, density = TRUE, smooth = FALSE, bwf, auto.layout = TRUE, ask = dev.interactive(), ...)
## S3 method for class 'mcmc' plot(x, trace = TRUE, density = TRUE, smooth = FALSE, bwf, auto.layout = TRUE, ask = dev.interactive(), ...)
x |
an object of class |
trace |
Plot trace of each variable |
density |
Plot density estimate of each variable |
smooth |
Draw a smooth line through trace plots |
bwf |
Bandwidth function for density plots |
auto.layout |
Automatically generate output format |
ask |
Prompt user before each page of plots |
... |
Further arguments |
Martyn Plummer
raftery.diag
is a run length control diagnostic based on a
criterion of accuracy of estimation of the quantile q
. It is
intended for use on a short pilot run of a Markov chain. The number
of iterations required to estimate the quantile to within an
accuracy of +/-
with probability
is calculated. Separate
calculations are performed for each variable within each chain.
If the number of iterations in data
is too small,
an error message is printed indicating the minimum length of
pilot run. The minimum length is the required sample size for a
chain with no correlation between consecutive samples. Positive
autocorrelation will increase the required sample size above this
minimum value. An estimate I
(the ‘dependence factor’) of the
extent to which autocorrelation inflates the required sample size
is also provided. Values of I
larger than 5 indicate strong
autocorrelation which may be due to a poor choice of starting value,
high posterior correlations or ‘stickiness’ of the MCMC algorithm.
The number of ‘burn in’ iterations to be discarded at the beginning of the chain is also calculated.
raftery.diag(data, q=0.025, r=0.005, s=0.95, converge.eps=0.001)
raftery.diag(data, q=0.025, r=0.005, s=0.95, converge.eps=0.001)
data |
an |
q |
the quantile to be estimated. |
r |
the desired margin of error of the estimate. |
s |
the probability of obtaining an estimate in the interval (q-r,q+r). |
converge.eps |
Precision required for estimate of time to convergence. |
A list with class raftery.diag
. A print method is available
for objects of this class. the contents of the list are
tspar |
The time series parameters of |
params |
A vector containing the parameters |
Niters |
The number of iterations in |
resmatrix |
A 3-d array containing the results: |
The estimated sample size for variable U is based on the process where
is the indicator function and u is the
qth quantile of U. The process
is derived from the Markov
chain
data
by marginalization and truncation, but is not itself
a Markov chain. However, may behave as a Markov chain if
it is sufficiently thinned out.
raftery.diag
calculates the
smallest value of thinning interval which makes the thinned
chain
behave as a Markov chain. The required sample size is
calculated from this thinned sequence. Since some data is ‘thrown away’
the sample size estimates are conservative.
The criterion for the number of ‘burn in’ iterations to be
discarded, is that the conditional distribution of
given
should be within
converge.eps
of the equilibrium
distribution of the chain .
raftery.diag
is based on the FORTRAN program ‘gibbsit’
written by Steven Lewis, and available from the Statlib archive.
Raftery, A.E. and Lewis, S.M. (1992). One long run with diagnostics: Implementation strategies for Markov chain Monte Carlo. Statistical Science, 7, 493-497.
Raftery, A.E. and Lewis, S.M. (1995). The number of iterations, convergence diagnostics and generic Metropolis algorithms. In Practical Markov Chain Monte Carlo (W.R. Gilks, D.J. Spiegelhalter and S. Richardson, eds.). London, U.K.: Chapman and Hall.
Input is read interactively and checked against conditions specified
by the arguments what
, lower
, upper
and
answer.in
. If the input does not satisfy all the conditions,
an appropriate error message is produced and the user is prompted
to provide input. This process is repeated until a valid input value
is entered.
read.and.check(message = "", what = numeric(), lower, upper, answer.in, default)
read.and.check(message = "", what = numeric(), lower, upper, answer.in, default)
message |
message displayed before prompting for user input. |
what |
the type of |
lower |
lower limit of input, for numeric input only. |
upper |
upper limit of input, for numeric input only. |
answer.in |
the input must correspond to one of the elements of the
vector |
default |
value assumed if user enters a blank line. |
The value of the valid input. When the default
argument is
specified, a blank line is accepted as valid input and in this case
read.and.check
returns the value of default
.
Since the function does not return a value until it receives valid input, it extensively checks the conditions for consistency before prompting the user for input. Inconsistent conditions will cause an error.
Martyn Plummer
read.coda
reads Markov Chain Monte Carlo output in
the CODA format produced by OpenBUGS and JAGS. By default, all
of the data in the file is read, but the arguments start
,
end
and thin
may be used to read a subset of the
data. If the arguments given to start
, end
or
thin
are incompatible with the data, they are ignored.
read.coda(output.file, index.file, start, end, thin, quiet=FALSE) read.jags(file = "jags.out", start, end, thin, quiet=FALSE)
read.coda(output.file, index.file, start, end, thin, quiet=FALSE) read.jags(file = "jags.out", start, end, thin, quiet=FALSE)
output.file |
The name of the file containing the monitored output |
index.file |
The name of the file containing the index, showing which rows of the output file correspond to which variables |
file |
For JAGS output, the name of the output file. The extension ".out" may be omitted. There must be a corresponding ".ind" file with the same file stem. |
start |
First iteration of chain |
end |
Last iteration of chain |
thin |
Thinning interval for chain |
quiet |
Logical flag. If true, a progress summary will be printed |
An object of class mcmc
containing a representation of
the data in the file.
Karen Vines, Martyn Plummer
Spiegelhalter DJ, Thomas A, Best NG and Gilks WR (1995). BUGS: Bayesian inference Using Gibbs Sampling, Version 0.50. MRC Biostatistics Unit, Cambridge.
mcmc
,
read.coda.interactive
,
read.openbugs
.
read.coda.interactive
reads Markov Chain Monte Carlo output
in the format produced by the classic BUGS program. No arguments are
required. Instead, the user is prompted for the required information.
read.coda.interactive()
read.coda.interactive()
An object of class mcmc.list
containing a representation of
the data in one or more BUGS output files.
This function is normally called by the codamenu
function,
but can also be used on a stand-alone basis.
Nicky Best, Martyn Plummer
Spiegelhalter DJ, Thomas A, Best NG and Gilks WR (1995). BUGS: Bayesian inference Using Gibbs Sampling, Version 0.50. MRC Biostatistics Unit, Cambridge.
mcmc
,
mcmc.list
,
read.coda
,
codamenu
.
read.openbugs
reads Markov Chain Monte Carlo output in
the CODA format produced by OpenBUGS.
This is a convenience wrapper around the function read.coda
which allows you to read all the data output by OpenBUGS by
specifying only the file stem.
read.openbugs(stem="", start, end, thin, quiet=FALSE)
read.openbugs(stem="", start, end, thin, quiet=FALSE)
stem |
Character string giving the stem for the output files. OpenBUGS produces files with names "<stem>CODAindex.txt", "<stem>CODAchain1.txt", "<stem>CODAchain2.txt", ... |
start |
First iteration of chain |
end |
Last iteration of chain |
thin |
Thinning interval for chain |
quiet |
Logical flag. If true, a progress summary will be printed |
An object of class mcmc.list
containing output from all
chains.
Martyn Plummer
Spiegelhalter DJ, Thomas A, Best NG and Lunn D (2004). WinBUGS User Manual, Version 2.0, June 2004, MRC Biostatistics Unit, Cambridge.
rejectionRate
calculates the fraction of time that a
Metropolis–Hastings type chain rejected a proposed move. The rejection
rate is calculates separately for each variable in the mcmc.obj
argument, irregardless of whether the variables were drawn separately or
in a block. In the latter case, the values returned should be the
same.
rejectionRate(x)
rejectionRate(x)
x |
An |
For the purposes of this function, a "rejection" has occurred if the
value of the time series is the same at two successive time points.
This test is done naively using ==
and may produce problems due
to rounding error.
A vector containing the rejection rates, one for each variable.
Russell Almond
The spectral density at frequency zero is estimated by fitting a glm to
the low-frequency end of the periodogram. spectrum0(x)/length(x)
estimates the variance of mean(x)
.
spectrum0(x, max.freq = 0.5, order = 1, max.length = 200)
spectrum0(x, max.freq = 0.5, order = 1, max.length = 200)
x |
A time series. |
max.freq |
The glm is fitted on the frequency range (0, max.freq] |
order |
Order of the polynomial to fit to the periodogram. |
max.length |
The data |
The raw periodogram is calculated for the series x
and a generalized
linear model with family Gamma
and log link is fitted to
the periodogram.
The linear predictor is a polynomial in terms of the frequency. The
degree of the polynomial is determined by the parameter order
.
A list with the following values
spec |
The predicted value of the spectral density at frequency zero. |
Heidelberger and Welch (1991) observed that the usual non-parametric estimator of the spectral density, obtained by smoothing the periodogram, is not appropriate for frequency zero. They proposed an alternative parametric method which consisted of fitting a linear model to the log periodogram of the batched time series. Some technical problems with model fitting in their original proposal can be overcome by using a generalized linear model.
Batching of the data, originally proposed in order to save space, has the side effect of flattening the spectral density and making a polynomial fit more reasonable. Fitting a polynomial of degree zero is equivalent to using the ‘batched means’ method.
The definition of the spectral density used here differs from that used by
spec.pgram
. We consider the frequency range to be between 0 and 0.5,
not between 0 and frequency(x)/2
.
The model fitting may fail on chains with very high autocorrelation.
Heidelberger, P and Welch, P.D. A spectral method for confidence interval generation and run length control in simulations. Communications of the ACM, Vol 24, pp233-245, 1981.
The spectral density at frequency zero is estimated by fitting an
autoregressive model. spectrum0(x)/length(x)
estimates the
variance of mean(x)
.
spectrum0.ar(x)
spectrum0.ar(x)
x |
A time series. |
The ar()
function to fit an autoregressive model to the time
series x. For multivariate time series, separate models are fitted for
each column. The value of the spectral density at zero is then given
by a well-known formula.
A list with the following values
spec |
The predicted value of the spectral density at frequency zero. |
order |
The order of the fitted model |
The definition of the spectral density used here differs from that used by
spec.pgram
. We consider the frequency range to be between 0 and 0.5,
not between 0 and frequency(x)/2
.
summary.mcmc
produces two sets of summary statistics for
each variable:
Mean, standard deviation, naive standard error of the mean (ignoring autocorrelation of the chain) and time-series standard error based on an estimate of the spectral density at 0.
Quantiles of the sample distribution using the quantiles
argument.
## S3 method for class 'mcmc' summary(object, quantiles = c(0.025, 0.25, 0.5, 0.75, 0.975), ...)
## S3 method for class 'mcmc' summary(object, quantiles = c(0.025, 0.25, 0.5, 0.75, 0.975), ...)
object |
an object of class |
quantiles |
a vector of quantiles to evaluate for each variable |
... |
a list of further arguments |
Martyn Plummer
thin
returns the interval between successive
values of a time series. thin(x)
is equivalent
to 1/frequency(x)
.
This is a generic function. Methods have been implemented
for mcmc
objects.
thin(x, ...)
thin(x, ...)
x |
a regular time series |
... |
a list of arguments |
Martyn Plummer
time
.
These are methods for mcmc objects for the generic time series functions.
## S3 method for class 'mcmc' time(x, ...) ## S3 method for class 'mcmc' start(x, ...) ## S3 method for class 'mcmc' end(x, ...) ## S3 method for class 'mcmc' thin(x, ...)
## S3 method for class 'mcmc' time(x, ...) ## S3 method for class 'mcmc' start(x, ...) ## S3 method for class 'mcmc' end(x, ...) ## S3 method for class 'mcmc' thin(x, ...)
x |
an |
... |
extra arguments for future methods |
Displays a plot of iterations vs. sampled values for each variable in the chain, with a separate plot per variable.
traceplot(x, smooth = FALSE, col = 1:6, type = "l", xlab = "Iterations", ylab = "", ...)
traceplot(x, smooth = FALSE, col = 1:6, type = "l", xlab = "Iterations", ylab = "", ...)
x |
An |
smooth |
draw smooth line through trace plot |
col |
graphical parameter (see |
type |
graphical parameter (see |
xlab |
graphical parameter (see |
ylab |
graphical parameter (see |
... |
further graphical parameters |
You can call this function directly, but it is more usually
called by the plot.mcmc
function.
These methods use the Trellis framework as implemented in the
lattice
package to produce space-conserving diagnostic plots
from "mcmc"
and "mcmc.list"
objects. The xyplot
methods produce trace plots. The densityplot
methods and
qqmath
methods produce empirical density and probability
plots. The levelplot
method depicts the correlation of the
series. The acfplot
methods plot the auto-correlation in the
series.
Not yet available in S-PLUS.
## S3 method for class 'mcmc' densityplot(x, data, outer, aspect = "xy", default.scales = list(relation = "free"), start = 1, thin = 1, main = attr(x, "title"), xlab = "", plot.points = "rug", ..., subset) ## S3 method for class 'mcmc.list' densityplot(x, data, outer = FALSE, groups = !outer, aspect = "xy", default.scales = list(relation = "free"), start = 1, thin = 1, main = attr(x, "title"), xlab = "", plot.points = "rug", ..., subset) ## S3 method for class 'mcmc' levelplot(x, data, main = attr(x, "title"), start = 1, thin = 1, ..., xlab = "", ylab = "", cuts = 10, at, col.regions = topo.colors(100), subset) ## S3 method for class 'mcmc' qqmath(x, data, outer, aspect = "xy", default.scales = list(y = list(relation = "free")), prepanel = prepanel.qqmathline, start = 1, thin = 1, main = attr(x, "title"), ylab = "", ..., subset) ## S3 method for class 'mcmc.list' qqmath(x, data, outer = FALSE, groups = !outer, aspect = "xy", default.scales = list(y = list(relation = "free")), prepanel = prepanel.qqmathline, start = 1, thin = 1, main = attr(x, "title"), ylab = "", ..., subset) ## S3 method for class 'mcmc' xyplot(x, data, outer, layout = c(1, nvar(x)), default.scales = list(y = list(relation = "free")), type = 'l', start = 1, thin = 1, xlab = "Iteration number", ylab = "", main = attr(x, "title"), ..., subset) ## S3 method for class 'mcmc.list' xyplot(x, data, outer = FALSE, groups = !outer, aspect = "xy", layout = c(1, nvar(x)), default.scales = list(y = list(relation = "free")), type = 'l', start = 1, thin = 1, xlab = "Iteration number", ylab = "", main = attr(x, "title"), ..., subset) acfplot(x, data, ...) ## S3 method for class 'mcmc' acfplot(x, data, outer, prepanel, panel, type = 'h', aspect = "xy", start = 1, thin = 1, lag.max = NULL, ylab = "Autocorrelation", xlab = "Lag", main = attr(x, "title"), ..., subset) ## S3 method for class 'mcmc.list' acfplot(x, data, outer = FALSE, groups = !outer, prepanel, panel, type = if (groups) 'b' else 'h', aspect = "xy", start = 1, thin = 1, lag.max = NULL, ylab = "Autocorrelation", xlab = "Lag", main = attr(x, "title"), ..., subset)
## S3 method for class 'mcmc' densityplot(x, data, outer, aspect = "xy", default.scales = list(relation = "free"), start = 1, thin = 1, main = attr(x, "title"), xlab = "", plot.points = "rug", ..., subset) ## S3 method for class 'mcmc.list' densityplot(x, data, outer = FALSE, groups = !outer, aspect = "xy", default.scales = list(relation = "free"), start = 1, thin = 1, main = attr(x, "title"), xlab = "", plot.points = "rug", ..., subset) ## S3 method for class 'mcmc' levelplot(x, data, main = attr(x, "title"), start = 1, thin = 1, ..., xlab = "", ylab = "", cuts = 10, at, col.regions = topo.colors(100), subset) ## S3 method for class 'mcmc' qqmath(x, data, outer, aspect = "xy", default.scales = list(y = list(relation = "free")), prepanel = prepanel.qqmathline, start = 1, thin = 1, main = attr(x, "title"), ylab = "", ..., subset) ## S3 method for class 'mcmc.list' qqmath(x, data, outer = FALSE, groups = !outer, aspect = "xy", default.scales = list(y = list(relation = "free")), prepanel = prepanel.qqmathline, start = 1, thin = 1, main = attr(x, "title"), ylab = "", ..., subset) ## S3 method for class 'mcmc' xyplot(x, data, outer, layout = c(1, nvar(x)), default.scales = list(y = list(relation = "free")), type = 'l', start = 1, thin = 1, xlab = "Iteration number", ylab = "", main = attr(x, "title"), ..., subset) ## S3 method for class 'mcmc.list' xyplot(x, data, outer = FALSE, groups = !outer, aspect = "xy", layout = c(1, nvar(x)), default.scales = list(y = list(relation = "free")), type = 'l', start = 1, thin = 1, xlab = "Iteration number", ylab = "", main = attr(x, "title"), ..., subset) acfplot(x, data, ...) ## S3 method for class 'mcmc' acfplot(x, data, outer, prepanel, panel, type = 'h', aspect = "xy", start = 1, thin = 1, lag.max = NULL, ylab = "Autocorrelation", xlab = "Lag", main = attr(x, "title"), ..., subset) ## S3 method for class 'mcmc.list' acfplot(x, data, outer = FALSE, groups = !outer, prepanel, panel, type = if (groups) 'b' else 'h', aspect = "xy", start = 1, thin = 1, lag.max = NULL, ylab = "Autocorrelation", xlab = "Lag", main = attr(x, "title"), ..., subset)
x |
an |
data |
ignored, present for consistency with generic. |
outer |
for the |
groups |
for the |
aspect |
controls the physical aspect ratio of the panel. See
|
default.scales |
this parameter provides a reasonable default
value of the |
type |
a character vector that determines if lines, points,
etc. are drawn on the panel. The default values for the methods are
carefully chosen. See
|
thin |
an optional thinning interval that is applied before the plot is drawn. |
start |
an optional value for the starting point within the series. Values before the starting point are considered part of the "burn-in" of the series and dropped. |
plot.points |
character argument giving the style in which
points are added to the plot. See
|
layout |
a method-specific default for the |
xlab , ylab , main
|
Used to provide default axis annotations and plot labels. |
cuts , at
|
defines number and location of values where colors change |
col.regions |
color palette used |
lag.max |
maximum lag for which autocorrelation is computed. By
default, the value chosen by |
prepanel , panel
|
suitable prepanel and panel functions for
|
... |
other arguments, passed to the lattice function.
Documentation of the corresponding generics in the |
subset |
indices of the subset of the series to plot. The
default is constructed from the |
An object of class "trellis"
. The relevant
update
method can be used to
update components of the object and the
print
method (usually called by
default) will plot it on an appropriate plotting device.
Deepayan Sarkar [email protected]
Lattice
for a brief introduction to
lattice displays and links to further documentation.
data(line) ## Not run: xyplot(line) xyplot(line[[1]], start = 10) densityplot(line, start = 10) qqmath(line, start = 10) levelplot(line[[2]]) acfplot(line, outer = TRUE) ## End(Not run)
data(line) ## Not run: xyplot(line) xyplot(line[[1]], start = 10) densityplot(line, start = 10) qqmath(line, start = 10) levelplot(line[[2]]) acfplot(line, outer = TRUE) ## End(Not run)
varnames()
returns the variable names and chanames
returns the chain names, or NULL if these are not set.
If allow.null = FALSE
then NULL
values will be
replaced with canonical names.
varnames(x, allow.null=TRUE) chanames(x, allow.null=TRUE) varnames(x) <- value chanames(x) <- value
varnames(x, allow.null=TRUE) chanames(x, allow.null=TRUE) varnames(x) <- value chanames(x) <- value
x |
an |
allow.null |
Logical argument that determines whether the function may return NULL |
value |
A character vector, or NULL |
A character vector , or NULL.
window.mcmc
is a method for mcmc
objects which is
normally called by the generic function window
In addition to the generic parameters, start
and end
the additional parameter thin
may be used to thin out the
Markov chain. Setting thin=k selects every kth iteration starting
with the first. Note that the value of thin
is absolute
not relative. The value supplied given to the parameter thin
must be a multiple of thin(x)
.
Values of start
, end
and thin
which are inconsistent
with x
are ignored, but a warning message is issued.
## S3 method for class 'mcmc' window(x, start, end, thin, ...)
## S3 method for class 'mcmc' window(x, start, end, thin, ...)
x |
an mcmc object |
start |
the first iteration of interest |
end |
the last iteration of interest |
thin |
the required interval between successive samples |
... |
futher arguments for future methods |