| Title: | Hybrid Markov Chain Monte Carlo Using Gaussian Processes |
|---|---|
| Description: | Hybrid Markov chain Monte Carlo (MCMC) for sampling from multimodal target distributions when derivatives are unavailable. A Gaussian process approximation is used to emulate derivatives, enabling efficient exploration with parallel tempering. The method is described in Fielding, Nott and Liong (2011) <doi:10.1198/TECH.2010.09195>. The research was carried out as part of the Singapore-Delft Water Alliance Multi-Objective Multi-Reservoir Management programme (R-264-001-272). |
| Authors: | Mark J. Fielding [aut, cre] |
| Maintainer: | Mark J. Fielding <[email protected]> |
| License: | GPL-2 |
| Version: | 7.0.1 |
| Built: | 2026-06-24 18:37:05 UTC |
| Source: | https://github.com/cran/MCMChybridGP |
Hybrid Markov chain Monte Carlo (MCMC) to simulate from
a multimodal target distribution with derivatives unknown.
A Gaussian process fit is used to approximate derivatives.
The Package consists of an Exploratory phase,
with hybrid.explore, followed by a Sampling
phase, with hybrid.sample.
The user is to supply the log-density f of the
target distribution along with a small number of (say 10)
points to get things started.
The Sampling phase allows replacement of the true
target in high temperature chains, or complete replacement
of the target. A full description of the method is given in
Fielding, Nott and Liong (2011).
The authors gratefully acknowledge the support & contributions of the Singapore-Delft Water Alliance. The research presented in this work was carried out as part of the Multi-Objective Multi-Reservoir Management research programme (R-264-001-272).
| Package: | MCMChybridGP |
| Type: | Package |
| Version: | 1.0 |
| Date: | 2009-09-15 |
| License: | GPL-2 |
| LazyLoad: | yes |
Mark James Fielding <[email protected]>
Maintainer: Mark James Fielding <[email protected]>
"Efficient MCMC Schemes for Computationally Expensive Posterior Distributions", Fielding, Nott and Liong (2011).
mu1 <- c(-1, -1) mu2 <- c(+1, +1) sigma.sq <- 0.1225 ub <- c(1.5, 3) X0 <- generateX0(lb=c(-2,-2), ub=ub) f <- function(x) { px <- 1/4/pi/sqrt(sigma.sq) * exp(-1/2/sigma.sq * sum((x - mu1)^2)) + 1/4/pi/sqrt(sigma.sq) * exp(-1/2/sigma.sq * sum((x - mu2)^2)) return(log(px)) } explore.out <- hybrid.explore(f, X0, ub=ub, n=150, graph=TRUE) sample.out <- hybrid.sample(explore.out, n=500, graph=TRUE) opar <- par(mfrow=c(2,1)) plot(density(sample.out$SAMP[,1]), xlab="x1", ylab="f(x)") plot(density(sample.out$SAMP[,2]), xlab="x2", ylab="f(x)") par(opar)mu1 <- c(-1, -1) mu2 <- c(+1, +1) sigma.sq <- 0.1225 ub <- c(1.5, 3) X0 <- generateX0(lb=c(-2,-2), ub=ub) f <- function(x) { px <- 1/4/pi/sqrt(sigma.sq) * exp(-1/2/sigma.sq * sum((x - mu1)^2)) + 1/4/pi/sqrt(sigma.sq) * exp(-1/2/sigma.sq * sum((x - mu2)^2)) return(log(px)) } explore.out <- hybrid.explore(f, X0, ub=ub, n=150, graph=TRUE) sample.out <- hybrid.sample(explore.out, n=500, graph=TRUE) opar <- par(mfrow=c(2,1)) plot(density(sample.out$SAMP[,1]), xlab="x1", ylab="f(x)") plot(density(sample.out$SAMP[,2]), xlab="x2", ylab="f(x)") par(opar)
A function to randomly generate n0-many points
within requested bounds, lb, ub.
The points selected are to have the largest, minimum
distance between any two points.
generateX0 (lb, ub, n0 = 10, npool = 100)generateX0 (lb, ub, n0 = 10, npool = 100)
lb |
Lower (finite) bounds for points generated. |
ub |
Upper (finite) bounds for points generated. |
n0 |
The number of points to be generated. |
npool |
The size of the pool of sets of randomly generated points in determining the best set of points. |
A matrix is returned which can then be supplied to
hybrid.explore.
Mark James Fielding <[email protected]>
"Efficient MCMC Schemes for Computationally Expensive Posterior Distributions", Fielding, Nott and Liong (2011).
lb <- c(-3,-3) ub <- c(3,3) X0 <- generateX0(lb, ub)lb <- c(-3,-3) ub <- c(3,3) X0 <- generateX0(lb, ub)
A function to determine a Gaussian process fit to a set of
points forming a matrix X, given a column
of corresponding values of the log-density
of a target distribution. Returned is a (zero mean)
approximation Ef of the log-density and various
components of the Gaussian process fit as used by
hybrid.explore and hybrid.sample.
GProcess(X, y, params = NULL, request.functions = TRUE, finetune = FALSE)GProcess(X, y, params = NULL, request.functions = TRUE, finetune = FALSE)
X |
A matrix of at least 2 columns with rows representing the points (nodes) for a Gaussian process fit. |
y |
A column of corresponding values of the log-density.
Each entry corresponds to the log-density evaluated
at the respective row in |
params |
Gaussian process parameters as used in |
request.functions |
Optional boolean argument (default TRUE) to request the return of function
approximations |
finetune |
Optional boolean argument (default FALSE) to determine
fine-tuned optimal values in |
Returned is a list as requested consisting of:
Ef |
The Gaussian process approximation of the log-density function. |
sigmaf |
Upon request, a function giving the Gaussian process
approximation of the standard deviation with respect to |
Sigma |
Covariance matrix used in the Gaussian process fit. |
Sigma.inv |
The inverse of the Covariance matrix. |
inverseOK |
Boolean flag to indicate successful calculation
of |
X |
The original |
y |
|
params |
Parameter values for the Gaussian process fit. |
Mark James Fielding <[email protected]>
"Efficient MCMC Schemes for Computationally Expensive Posterior Distributions", Fielding, Nott and Liong (2011).
mu1 <- c(-1, -1) mu2 <- c(+1, +1) sigma.sq <- 0.1225 X <- matrix(c(-2,-1,0,-2,0,2,0,1,2, -2,-1,-2,0,0,0,2,1,2), ncol = 2) f <- function(x) { px <- 1/4/pi/sqrt(sigma.sq) * exp(-1/2/sigma.sq * sum((x - mu1)^2)) + 1/4/pi/sqrt(sigma.sq) * exp(-1/2/sigma.sq * sum((x - mu2)^2)) return(log(px)) } y <- rep(NA, 9) for(i in 1:9) y[i] <- f(X[i,]) Ef <- GProcess(X, y, request.functions = TRUE)$Ef Ey <- NA*y for(i in 1:9) Ey[i] <- Ef(X[i,]) data.frame(X, y, Ey) ## Gaussian process close to exact at points supplied.mu1 <- c(-1, -1) mu2 <- c(+1, +1) sigma.sq <- 0.1225 X <- matrix(c(-2,-1,0,-2,0,2,0,1,2, -2,-1,-2,0,0,0,2,1,2), ncol = 2) f <- function(x) { px <- 1/4/pi/sqrt(sigma.sq) * exp(-1/2/sigma.sq * sum((x - mu1)^2)) + 1/4/pi/sqrt(sigma.sq) * exp(-1/2/sigma.sq * sum((x - mu2)^2)) return(log(px)) } y <- rep(NA, 9) for(i in 1:9) y[i] <- f(X[i,]) Ef <- GProcess(X, y, request.functions = TRUE)$Ef Ey <- NA*y for(i in 1:9) Ey[i] <- Ef(X[i,]) data.frame(X, y, Ey) ## Gaussian process close to exact at points supplied.
The Exploratory phase of hybrid MCMC using a
Gaussian process approximation of derivatives.
The user must provide the log-density of a target distribution
and a small set of (say 10) points forming the matrix X0.
Using a Gaussian process approximation, points are added
until a appropriate set of points are determined
for a final Gaussian process fit to the target distribution.
Results can then be passed to the Sampling phase
or the Gaussian process approximation Ef can be
used to approximate f.
hybrid.explore(f, X0, ..., y0 = NULL, n = 200, L = 1000, delta = 0.003, nchains = 5, T.mult = 1.5, lb = NULL, ub = NULL, maxleap=0.95, finetune = FALSE, Tinclude = nchains, nswaps = choose(nchains, 2), graph = FALSE)hybrid.explore(f, X0, ..., y0 = NULL, n = 200, L = 1000, delta = 0.003, nchains = 5, T.mult = 1.5, lb = NULL, ub = NULL, maxleap=0.95, finetune = FALSE, Tinclude = nchains, nswaps = choose(nchains, 2), graph = FALSE)
f |
A function giving the log-density of the target distribution. |
X0 |
A matrix with rows giving say 10 points for an initial Gaussian process fit. |
... |
Further arguments to be passed to |
y0 |
Corresponding function values if already known. Otherwise the corresponding function values will first be evaluated. |
n |
An optional integer argument. The number of points to be generated for the final Gaussian process fit. |
L |
An optional integer argument. The number of steps to be used in Leapfrog moves in MCMC proposals. |
delta |
An optional numeric argument (default 0.003).
The size of Leapfrog steps to be made, with a suitable
balance between |
nchains |
An optional integer argument. The number of Parallel Tempering chains to be used (default 5). |
T.mult |
An optional numeric argument. The Temperature multiple to be used in Parallel Tempering (default 1.5). |
lb |
An optional numeric argument. Lower bounds placed on X. Only points within bounds are included in Gaussian process fit. |
ub |
An optional numeric argument. Upper bounds placed on X. Only points within bounds are included in Gaussian process fit. |
maxleap |
An optional numeric 0-1 argument (default 0.95).
Used in early stops in Leapfrog
moves. The Leapfrog is stopped when the
standard deviation of |
finetune |
An optional boolean argument (default FALSE). Option for fine-tuning Gaussian process parameters. This may be useful when a good fit is difficult to achieve. |
Tinclude |
An optional integer argument (default |
nswaps |
An optional integer argument. The number of repetitions of proposed swaps between the parallel chains. |
graph |
An optional boolean argument (default FALSE). To request a graphical display of progress during the explore phase. |
The method used in hybrid.explore is described in
Fielding, Nott and Liong (2011).
A list is returned to be used as input to hybrid.sample.
X |
A matrix made up of the set of points used in the final Gaussian process fit. |
y |
A column of the corresponding evaluations of |
f |
The original log-density function supplied. |
maxleap |
The value of |
function.calls |
The number of function calls used to evaluate |
details |
A list containing information particular to |
GPfit |
Ouput from |
The methods used in hybrid.explore and
hybrid.sample
give extensions to the work of Rasmussen (2003), as described in
Fielding, Nott and Liong (2011).
For very low acceptance rates, points included at later stages are likely
to be more useful with a fit only deteriorated by the earlier points.
In such a case a second run of hybrid.explore might be useful,
taking values for X0 and y0 as those output for X and
y from the first run.
Mark J. Fielding <[email protected]>
"Efficient MCMC Schemes for Computationally Expensive Posterior Distributions", Fielding, Nott and Liong (2011).
hybrid.sample, GProcess, generateX0.
mu1 <- c(-1, -1) mu2 <- c(+1, +1) sigma.sq <- 0.16 ub <- c(1.5, 3) X0 <- matrix(c(-2,-1, 0,-2, 0, 1, 0, 1, 1, -2,-1,-2, 0, 0, 0, 2, 1, 2), ncol = 2) f <- function(x) { px <- 1/4/pi/sqrt(sigma.sq) * exp(-1/2/sigma.sq * sum((x - mu1)^2)) + 1/4/pi/sqrt(sigma.sq) * exp(-1/2/sigma.sq * sum((x - mu2)^2)) return(log(px)) } explore.out <- hybrid.explore(f, X0, ub=ub, n=150, graph=TRUE) sample.out <- hybrid.sample(explore.out, n=500, graph=TRUE) opar <- par(mfrow=c(2,1)) plot(density(sample.out$SAMP[,1]), xlab="x1", ylab="f(x)") plot(density(sample.out$SAMP[,2]), xlab="x2", ylab="f(x)") par(opar)mu1 <- c(-1, -1) mu2 <- c(+1, +1) sigma.sq <- 0.16 ub <- c(1.5, 3) X0 <- matrix(c(-2,-1, 0,-2, 0, 1, 0, 1, 1, -2,-1,-2, 0, 0, 0, 2, 1, 2), ncol = 2) f <- function(x) { px <- 1/4/pi/sqrt(sigma.sq) * exp(-1/2/sigma.sq * sum((x - mu1)^2)) + 1/4/pi/sqrt(sigma.sq) * exp(-1/2/sigma.sq * sum((x - mu2)^2)) return(log(px)) } explore.out <- hybrid.explore(f, X0, ub=ub, n=150, graph=TRUE) sample.out <- hybrid.sample(explore.out, n=500, graph=TRUE) opar <- par(mfrow=c(2,1)) plot(density(sample.out$SAMP[,1]), xlab="x1", ylab="f(x)") plot(density(sample.out$SAMP[,2]), xlab="x2", ylab="f(x)") par(opar)
Sampling phase in hybrid MCMC, which takes the output from
hybrid.explore to samples from the target
distribution supplied. The number of chains,
Leapfrog moves and Gaussian process parameters are the
same as used in hybrid.explore,
or the values may be updated here. For a target distribution
time consuming to evaluate, the target can be replaced
completely by the Gaussian process approximation in some
or all of the chains. Bounds supplied act as reflecting
barriers.
hybrid.sample(Explore, n = 1000, replace.target = c(0, 1, 2)[1], lb = NULL, ub = NULL, L = NULL, delta = NULL, nchains = NULL, T.mult = NULL, maxleap = NULL, r = 5, nswaps = NULL, graph = FALSE)hybrid.sample(Explore, n = 1000, replace.target = c(0, 1, 2)[1], lb = NULL, ub = NULL, L = NULL, delta = NULL, nchains = NULL, T.mult = NULL, maxleap = NULL, r = 5, nswaps = NULL, graph = FALSE)
Explore |
Output from |
n |
The number of sampling iterations. |
L |
An optional integer argument passed from |
delta |
An optional numerical argument passed from |
lb |
An optional numeric argument passed from |
ub |
An optional numeric argument passed from |
nchains |
An optional integer argument passed from |
T.mult |
An optional integer argument passed from |
maxleap |
An optional numerical argument passed from |
r |
An optional numerical argument (default 5). A penalty factor on points straying from the region of Gaussian process fit, when the target distribution is replaced. |
nswaps |
An optional integer argument passed from |
replace.target |
The sampling scheme to be used (0, 1 or 2) in acceptance of MCMC proposals. Where 0 represents using the true target distribution in all chains. 1 (default) represents using the true target distribution only in the primary chain (having temperature 1). 2 represents replacing the target distribution in all chains by the Gaussian process approximation. |
graph |
An optional boolean argument (default is FALSE). Request graphical progress display during the sample phase. |
The method used in hybrid.sample
is described in Fielding, Nott and Liong (2011).
A list is returned consisting of the following.
SAMP |
A matrix with rows corresponding to sampled points generated from the target distribution. |
y |
A column of the corrresponding values of the log-density of the target distribution. |
acceptance |
A column of 0 (rejected) and 1 (accepted) giving a record of sampling proposal acceptance. |
function.calls |
The number of function calls to evaluate the true log-density. |
A record is kept throughout a run of hybrid.sample
stored as a global variable list, hybrid.sample.out.
Useful for a run stopped prematurely.
The method used in hybrid.sample gives extensions
to the work of Rasmussen (2003) and is described in
Fielding, Nott and Liong (2011).
Mark J. Fielding <[email protected]>
"Efficient MCMC Schemes for Computationally Expensive Posterior Distributions", Fielding, Nott and Liong (2011).
mu1 <- c(-1, -1) mu2 <- c(+1, +1) sigma.sq <- 0.16 ub <- c(1.5, 3) X0 <- matrix(c(-2,-1, 0,-2, 0, 1, 0, 1, 1, -2,-1,-2, 0, 0, 0, 2, 1, 2), ncol = 2) f <- function(x) { px <- 1/4/pi/sqrt(sigma.sq) * exp(-1/2/sigma.sq * sum((x - mu1)^2)) + 1/4/pi/sqrt(sigma.sq) * exp(-1/2/sigma.sq * sum((x - mu2)^2)) return(log(px)) } explore.out <- hybrid.explore(f, X0, ub=ub, n=150, graph=TRUE) sample.out <- hybrid.sample(explore.out, n=500, graph=TRUE) opar <- par(mfrow=c(2,1)) plot(density(sample.out$SAMP[,1]), xlab="x1", ylab="f(x)") plot(density(sample.out$SAMP[,2]), xlab="x2", ylab="f(x)") par(opar)mu1 <- c(-1, -1) mu2 <- c(+1, +1) sigma.sq <- 0.16 ub <- c(1.5, 3) X0 <- matrix(c(-2,-1, 0,-2, 0, 1, 0, 1, 1, -2,-1,-2, 0, 0, 0, 2, 1, 2), ncol = 2) f <- function(x) { px <- 1/4/pi/sqrt(sigma.sq) * exp(-1/2/sigma.sq * sum((x - mu1)^2)) + 1/4/pi/sqrt(sigma.sq) * exp(-1/2/sigma.sq * sum((x - mu2)^2)) return(log(px)) } explore.out <- hybrid.explore(f, X0, ub=ub, n=150, graph=TRUE) sample.out <- hybrid.sample(explore.out, n=500, graph=TRUE) opar <- par(mfrow=c(2,1)) plot(density(sample.out$SAMP[,1]), xlab="x1", ylab="f(x)") plot(density(sample.out$SAMP[,2]), xlab="x2", ylab="f(x)") par(opar)