Title: | Univariate and Multivariate Spatial-Temporal Modeling |
---|---|
Description: | Fits univariate and multivariate spatio-temporal random effects models for point-referenced data using Markov chain Monte Carlo (MCMC). Details are given in Finley, Banerjee, and Gelfand (2015) <doi:10.18637/jss.v063.i13> and Finley and Banerjee <doi:10.1016/j.envsoft.2019.104608>. |
Authors: | Andrew Finley [aut, cre], Sudipto Banerjee [aut] |
Maintainer: | Andrew Finley <[email protected]> |
License: | GPL (>= 2) |
Version: | 0.4-8 |
Built: | 2024-11-23 06:49:22 UTC |
Source: | CRAN |
Markov chain Monte Carlo for a continuous random vector using an adaptive Metropolis within Gibbs algorithm.
adaptMetropGibbs(ltd, starting, tuning=1, accept.rate=0.44, batch = 1, batch.length=25, report=100, verbose=TRUE, ...)
ltd |
an R function that evaluates the log target density of the
desired equilibrium distribution of the Markov chain. First argument is the
starting value vector of the Markov chain. Pass variables used in
the |
starting |
a real vector of parameter starting values. |
tuning |
a scalar or vector of initial Metropolis tuning values. The vector must be of |
accept.rate |
a scalar or vector of target Metropolis acceptance
rates. The vector must be of |
batch |
the number of batches. |
batch.length |
the number of sampler iterations in each batch. |
report |
the number of batches between acceptance rate reports. |
verbose |
if |
... |
currently no additional arguments. |
A list with the following tags:
p.theta.samples |
a |
acceptance |
the Metropolis acceptance rate at the end of each batch. |
ltd |
|
accept.rate |
|
batch |
|
batch.length |
|
proc.time |
the elapsed CPU and wall time (in seconds). |
This function is a rework of Rosenthal (2007) with some added niceties.
Andrew O. Finley [email protected],
Sudipto Banerjee [email protected]
Roberts G.O. and Rosenthal J.S. (2006). Examples of Adaptive MCMC. http://probability.ca/jeff/ftpdir/adaptex.pdf Preprint.
Rosenthal J.S. (2007). AMCMC: An R interface for adaptive MCMC. Computational Statistics and Data Analysis. 51:5467-5470.
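The full examples below fit a spatial regression and a non-linear model; before those, a minimal sketch (a toy bivariate normal target, not part of the package examples; all values other than ltd and starting are illustrative) may help fix the calling convention and the returned tags:

library(spBayes)
library(coda)

ltd.norm <- function(theta){
  ## log of a standard bivariate normal density, up to an additive constant
  -0.5*sum(theta^2)
}

out <- adaptMetropGibbs(ltd=ltd.norm, starting=c(5, -5),
                        batch=100, batch.length=25, report=50)

## p.theta.samples holds the chain; acceptance holds the per-batch rates
summary(mcmc(out$p.theta.samples))
plot(mcmc(out$acceptance), density=FALSE)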
## Not run: rmvn <- function(n, mu=0, V = matrix(1)){ p <- length(mu) if(any(is.na(match(dim(V),p)))) stop("Dimension problem!") D <- chol(V) t(matrix(rnorm(n*p), ncol=p)%*%D + rep(mu,rep(n,p))) } ########################### ##Fit a spatial regression ########################### set.seed(1) n <- 50 x <- runif(n, 0, 100) y <- runif(n, 0, 100) D <- as.matrix(dist(cbind(x, y))) phi <- 3/50 sigmasq <- 50 tausq <- 20 mu <- 150 s <- (sigmasq*exp(-phi*D)) w <- rmvn(1, rep(0, n), s) Y <- rmvn(1, rep(mu, n) + w, tausq*diag(n)) X <- as.matrix(rep(1, length(Y))) ##Priors ##IG sigma^2 and tau^2 a.sig <- 2 b.sig <- 100 a.tau <- 2 b.tau <- 100 ##Unif phi a.phi <- 3/100 b.phi <- 3/1 ##Functions used to transform phi to continuous support. logit <- function(theta, a, b){log((theta-a)/(b-theta))} logit.inv <- function(z, a, b){b-(b-a)/(1+exp(z))} ##Metrop. target target <- function(theta){ mu.cand <- theta[1] sigmasq.cand <- exp(theta[2]) tausq.cand <- exp(theta[3]) phi.cand <- logit.inv(theta[4], a.phi, b.phi) Sigma <- sigmasq.cand*exp(-phi.cand*D)+tausq.cand*diag(n) SigmaInv <- chol2inv(chol(Sigma)) logDetSigma <- determinant(Sigma, log=TRUE)$modulus[1] out <- ( ##Priors -(a.sig+1)*log(sigmasq.cand) - b.sig/sigmasq.cand -(a.tau+1)*log(tausq.cand) - b.tau/tausq.cand ##Jacobians +log(sigmasq.cand) + log(tausq.cand) +log(phi.cand - a.phi) + log(b.phi -phi.cand) ##Likelihood -0.5*logDetSigma-0.5*(t(Y-X%*%mu.cand)%*%SigmaInv%*%(Y-X%*%mu.cand)) ) return(out) } ##Run a couple chains n.batch <- 500 batch.length <- 25 inits <- c(0, log(1), log(1), logit(3/10, a.phi, b.phi)) chain.1 <- adaptMetropGibbs(ltd=target, starting=inits, batch=n.batch, batch.length=batch.length, report=100) inits <- c(500, log(100), log(100), logit(3/90, a.phi, b.phi)) chain.2 <- adaptMetropGibbs(ltd=target, starting=inits, batch=n.batch, batch.length=batch.length, report=100) ##Check out acceptance rate just for fun plot(mcmc.list(mcmc(chain.1$acceptance), mcmc(chain.2$acceptance))) ##Back transform chain.1$p.theta.samples[,2] <- exp(chain.1$p.theta.samples[,2]) chain.1$p.theta.samples[,3] <- exp(chain.1$p.theta.samples[,3]) chain.1$p.theta.samples[,4] <- 3/logit.inv(chain.1$p.theta.samples[,4], a.phi, b.phi) chain.2$p.theta.samples[,2] <- exp(chain.2$p.theta.samples[,2]) chain.2$p.theta.samples[,3] <- exp(chain.2$p.theta.samples[,3]) chain.2$p.theta.samples[,4] <- 3/logit.inv(chain.2$p.theta.samples[,4], a.phi, b.phi) par.names <- c("mu", "sigma.sq", "tau.sq", "effective range (-log(0.05)/phi)") colnames(chain.1$p.theta.samples) <- par.names colnames(chain.2$p.theta.samples) <- par.names ##Discard burn.in and plot and do some convergence diagnostics chains <- mcmc.list(mcmc(chain.1$p.theta.samples), mcmc(chain.2$p.theta.samples)) plot(window(chains, start=as.integer(0.5*n.batch*batch.length))) gelman.diag(chains) ########################## ##Example of fitting a ##a non-linear model ########################## ##Example of fitting a non-linear model set.seed(1) ######################################################## ##Simulate some data. ######################################################## a <- 0.1 #-Inf < a < Inf b <- 0.1 #b > 0 c <- 0.2 #c > 0 tau.sq <- 0.1 #tau.sq > 0 fn <- function(a,b,c,x){ a+b*exp(x/c) } n <- 200 x <- seq(0,1,0.01) y <- rnorm(length(x), fn(a,b,c,x), sqrt(tau.sq)) ##check out your data plot(x, y) ######################################################## ##The log target density ######################################################## ##Define the log target density used in the Metrop. 
ltd <- function(theta){ ##extract and transform as needed a <- theta[1] b <- exp(theta[2]) c <- exp(theta[3]) tau.sq <- exp(theta[4]) y.hat <- fn(a, b, c, x) ##likelihood logl <- sum(dnorm(y, y.hat, sqrt(tau.sq), log=TRUE)) ##priors IG on tau.sq and normal on a and transformed b, c, d logl <- (logl -(IG.a+1)*log(tau.sq)-IG.b/tau.sq +sum(dnorm(theta[1:3], N.mu, N.v, log=TRUE)) ) ##Jacobian adjustment for tau.sq logl <- logl+log(tau.sq) return(logl) } ######################################################## ##The rest ######################################################## ##Priors IG.a <- 2 IG.b <- 0.01 N.mu <- 0 N.v <- 10 theta.tuning <- c(0.01, 0.01, 0.005, 0.01) ##Run three chains with different starting values n.batch <- 1000 batch.length <- 25 theta.starting <- c(0, log(0.01), log(0.6), log(0.01)) run.1 <- adaptMetropGibbs(ltd=ltd, starting=theta.starting, tuning=theta.tuning, batch=n.batch, batch.length=batch.length, report=100) theta.starting <- c(1.5, log(0.05), log(0.5), log(0.05)) run.2 <- adaptMetropGibbs(ltd=ltd, starting=theta.starting, tuning=theta.tuning, batch=n.batch, batch.length=batch.length, report=100) theta.starting <- c(-1.5, log(0.1), log(0.4), log(0.1)) run.3 <- adaptMetropGibbs(ltd=ltd, starting=theta.starting, tuning=theta.tuning, batch=n.batch, batch.length=batch.length, report=100) ##Back transform samples.1 <- cbind(run.1$p.theta.samples[,1], exp(run.1$p.theta.samples[,2:4])) samples.2 <- cbind(run.2$p.theta.samples[,1], exp(run.2$p.theta.samples[,2:4])) samples.3 <- cbind(run.3$p.theta.samples[,1], exp(run.3$p.theta.samples[,2:4])) samples <- mcmc.list(mcmc(samples.1), mcmc(samples.2), mcmc(samples.3)) ##Summary plot(samples, density=FALSE) gelman.plot(samples) burn.in <- 5000 fn.pred <- function(theta,x){ a <- theta[1] b <- theta[2] c <- theta[3] tau.sq <- theta[4] rnorm(length(x), fn(a,b,c,x), sqrt(tau.sq)) } post.curves <- t(apply(samples.1[burn.in:nrow(samples.1),], 1, fn.pred, x)) post.curves.quants <- summary(mcmc(post.curves))$quantiles plot(x, y, pch=19, xlab="x", ylab="f(x)") lines(x, post.curves.quants[,1], lty="dashed", col="blue") lines(x, post.curves.quants[,3]) lines(x, post.curves.quants[,5], lty="dashed", col="blue") ## End(Not run)
Given observation coordinates and fixed semivariogram parameters, the bayesGeostatExact
function fits a simple Bayesian spatial linear model.
bayesGeostatExact(formula, data = parent.frame(), n.samples, beta.prior.mean, beta.prior.precision, coords, cov.model="exponential", phi, nu, alpha, sigma.sq.prior.shape, sigma.sq.prior.rate, sp.effects=TRUE, verbose=TRUE, ...)
formula |
for a univariate model, this is a symbolic description of the regression model to be fit. See example below. |
data |
an optional data frame containing the variables in the
model. If not found in data, the variables are taken from
|
n.samples |
the number of posterior samples to collect. |
beta.prior.mean |
|
beta.prior.precision |
|
coords |
an |
cov.model |
a quoted key word that specifies the covariance
function used to model the spatial dependence structure among the
observations. Supported covariance model key words are:
|
phi |
the fixed value of the spatial decay. |
nu |
if |
alpha |
the fixed value of the ratio between the nugget
|
sigma.sq.prior.shape |
|
sigma.sq.prior.rate |
|
sp.effects |
a logical value indicating if spatial random effects should be recovered. |
verbose |
if |
... |
currently no additional arguments. |
An object of class bayesGeostatExact
, which is a list with the following tags:
p.samples |
a |
sp.effects |
a matrix that holds samples from the posterior
distribution of the spatial random effects. The rows of this matrix
correspond to the |
args |
a list with the initial function arguments. |
Sudipto Banerjee [email protected],
Andrew O. Finley [email protected]
## Not run: data(FBC07.dat) Y <- FBC07.dat[1:150,"Y.2"] coords <- as.matrix(FBC07.dat[1:150,c("coord.X", "coord.Y")]) n.samples <- 500 n = length(Y) p = 1 phi <- 0.15 nu <- 0.5 beta.prior.mean <- as.matrix(rep(0, times=p)) beta.prior.precision <- matrix(0, nrow=p, ncol=p) alpha <- 5/5 sigma.sq.prior.shape <- 2.0 sigma.sq.prior.rate <- 5.0 ############################## ##Simple linear model with ##the default exponential ##spatial decay function ############################## set.seed(1) m.1 <- bayesGeostatExact(Y~1, n.samples=n.samples, beta.prior.mean=beta.prior.mean, beta.prior.precision=beta.prior.precision, coords=coords, phi=phi, alpha=alpha, sigma.sq.prior.shape=sigma.sq.prior.shape, sigma.sq.prior.rate=sigma.sq.prior.rate) print(summary(m.1$p.samples)) ##Requires MBA package to ##make surfaces library(MBA) par(mfrow=c(1,2)) obs.surf <- mba.surf(cbind(coords, Y), no.X=100, no.Y=100, extend=T)$xyz.est image(obs.surf, xaxs = "r", yaxs = "r", main="Observed response") points(coords) contour(obs.surf, add=T) w.hat <- rowMeans(m.1$sp.effects) w.surf <- mba.surf(cbind(coords, w.hat), no.X=100, no.Y=100, extend=T)$xyz.est image(w.surf, xaxs = "r", yaxs = "r", main="Estimated random effects") points(coords) contour(w.surf, add=T) ############################## ##Simple linear model with ##the matern spatial decay ##function. Note, nu=0.5 so ##should produce the same ##estimates as m.1 ############################## set.seed(1) m.2 <- bayesGeostatExact(Y~1, n.samples=n.samples, beta.prior.mean=beta.prior.mean, beta.prior.precision=beta.prior.precision, coords=coords, cov.model="matern", phi=phi, nu=nu, alpha=alpha, sigma.sq.prior.shape=sigma.sq.prior.shape, sigma.sq.prior.rate=sigma.sq.prior.rate) print(summary(m.2$p.samples)) ############################## ##This time with the ##spherical just for fun ############################## m.3 <- bayesGeostatExact(Y~1, n.samples=n.samples, beta.prior.mean=beta.prior.mean, beta.prior.precision=beta.prior.precision, coords=coords, cov.model="spherical", phi=phi, alpha=alpha, sigma.sq.prior.shape=sigma.sq.prior.shape, sigma.sq.prior.rate=sigma.sq.prior.rate) print(summary(m.3$p.samples)) ############################## ##Another example but this ##time with covariates ############################## data(FORMGMT.dat) n = nrow(FORMGMT.dat) p = 5 ##an intercept an four covariates n.samples <- 50 phi <- 0.0012 coords <- cbind(FORMGMT.dat$Longi, FORMGMT.dat$Lat) coords <- coords*(pi/180)*6378 beta.prior.mean <- rep(0, times=p) beta.prior.precision <- matrix(0, nrow=p, ncol=p) alpha <- 1/1.5 sigma.sq.prior.shape <- 2.0 sigma.sq.prior.rate <- 10.0 m.4 <- bayesGeostatExact(Y~X1+X2+X3+X4, data=FORMGMT.dat, n.samples=n.samples, beta.prior.mean=beta.prior.mean, beta.prior.precision=beta.prior.precision, coords=coords, phi=phi, alpha=alpha, sigma.sq.prior.shape=sigma.sq.prior.shape, sigma.sq.prior.rate=sigma.sq.prior.rate) print(summary(m.4$p.samples)) ##Requires MBA package to ##make surfaces library(MBA) par(mfrow=c(1,2)) obs.surf <- mba.surf(cbind(coords, resid(lm(Y~X1+X2+X3+X4, data=FORMGMT.dat))), no.X=100, no.Y=100, extend=TRUE)$xyz.est image(obs.surf, xaxs = "r", yaxs = "r", main="Observed response") points(coords) contour(obs.surf, add=T) w.hat <- rowMeans(m.4$sp.effects) w.surf <- mba.surf(cbind(coords, w.hat), no.X=100, no.Y=100, extend=TRUE)$xyz.est image(w.surf, xaxs = "r", yaxs = "r", main="Estimated random effects") contour(w.surf, add=T) points(coords, pch=1, cex=1) ## End(Not run)
Given a model formula and data, the bayesLMConjugate
function fits a
simple Bayesian linear model with Normal and inverse-Gamma priors.
bayesLMConjugate(formula, data = parent.frame(), n.samples, beta.prior.mean, beta.prior.precision, prior.shape, prior.rate, ...)
formula |
for a univariate model, this is a symbolic description of the regression model to be fit. See example below. |
data |
an optional data frame containing the variables in the
model. If not found in data, the variables are taken from
|
n.samples |
the number of posterior samples to collect. |
beta.prior.mean |
|
beta.prior.precision |
|
prior.shape |
|
prior.rate |
|
... |
currently no additional arguments. |
An object of class bayesLMConjugate
, which is a list with at
least the following tag:
p.beta.tauSq.samples |
a |
Sudipto Banerjee [email protected],
Andrew O. Finley [email protected]
## Not run: data(FORMGMT.dat) n <- nrow(FORMGMT.dat) p <- 7 ##an intercept and six covariates n.samples <- 500 ## Below we demonstrate the conjugate function in the special case ## with improper priors. The results are the same as for the above, ## up to MC error. beta.prior.mean <- rep(0, times=p) beta.prior.precision <- matrix(0, nrow=p, ncol=p) prior.shape <- -p/2 prior.rate <- 0 m.1 <- bayesLMConjugate(Y ~ X1+X2+X3+X4+X5+X6, data = FORMGMT.dat, n.samples, beta.prior.mean, beta.prior.precision, prior.shape, prior.rate) summary(m.1$p.beta.tauSq.samples) ## End(Not run)
Given an lm
object, the bayesLMRef
function fits a
simple Bayesian linear model with reference (non-informative) priors.
bayesLMRef(lm.obj, n.samples, ...)
lm.obj |
an object returned by |
n.samples |
the number of posterior samples to collect. |
... |
currently no additional arguments. |
See page 355 in Gelman et al. (2004).
An object of class bayesLMRef
, which is a list with at
least the following tag:
p.beta.tauSq.samples |
a |
Sudipto Banerjee [email protected],
Andrew O. Finley [email protected]
Gelman, A., Carlin, J.B., Stern, H.S., and Rubin, D.B. (2004). Bayesian Data Analysis. 2nd ed. Boca Raton, FL: Chapman and Hall/CRC Press.
## Not run: set.seed(1) n <- 100 X <- as.matrix(cbind(1, rnorm(n))) B <- as.matrix(c(1,5)) tau.sq <- 0.1 y <- rnorm(n, X%*%B, sqrt(tau.sq)) lm.obj <- lm(y ~ X-1) summary(lm.obj) ##Now with bayesLMRef n.samples <- 500 m.1 <- bayesLMRef(lm.obj, n.samples) summary(m.1$p.beta.tauSq.samples) ## End(Not run)
Data generated in long-term research studies on the Bartlett Experimental Forest, Bartlett, NH funded by the U.S. Department of Agriculture, Forest Service, Northeastern Research Station.
This dataset holds 1991 and 2002 forest inventory data for 437 points on the BEF. Variables include species-specific basal area and biomass; inventory plot coordinates; slope; elevation; and tasseled cap brightness (TC1), greenness (TC2), and wetness (TC3) components from spring, summer, and fall 2002 Landsat images.
Species-specific basal area and biomass are recorded as a fraction of the totals.
data(BEF.dat)
A data frame containing 437 rows and 208 columns.
BEF.dat inventory data provided by:
Marie-Louise Smith USDA Forest Service Northeastern Research Station [email protected]
Additional variables provided by:
Andrew Lister USDA Forest Service Northeastern Research Station [email protected]
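A minimal sketch of loading the data and confirming the stated dimensions; further column selection is left to the reader since the 208 variable names are not enumerated here:

library(spBayes)

data(BEF.dat)
dim(BEF.dat)          ## 437 plots by 208 variables
head(names(BEF.dat))  ## inspect the available variable names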
The synthetic dataset describes a stationary and isotropic bivariate process. Please refer to the vignette Section 4.2 for specifics.
data(FBC07.dat)
A data frame of 250 rows and 4 columns. Columns 1 and 2 are coordinates and columns 3 and 4 are response variables.
Finley A.O., S. Banerjee, and B.P. Carlin (2007) spBayes: R package for Univariate and Multivariate Hierarchical Point-referenced Spatial Models. Journal of Statistical Software.
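A minimal sketch plotting the two synthetic responses at their coordinates; the coordinate column names follow the bayesGeostatExact example above, and the responses are referenced by position (columns 3 and 4) as stated in the format description:

library(spBayes)

data(FBC07.dat)
coords <- as.matrix(FBC07.dat[, c("coord.X", "coord.Y")])

par(mfrow=c(1, 2))
for(j in 3:4){
  y <- FBC07.dat[, j]
  ## symbol size proportional to the rescaled response
  plot(coords, pch=19, cex=0.5 + 2*(y - min(y))/diff(range(y)),
       main=colnames(FBC07.dat)[j])
}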
Data used for illustrations.
data(FORMGMT.dat)
Computes the inter-site Euclidean distance matrix for one or two sets of points.
iDist(coords.1, coords.2, ...)
coords.1 |
an |
coords.2 |
an |
... |
currently no additional arguments. |
The n x n or n x m inter-site Euclidean distance matrix, where n and m are the numbers of rows in coords.1 and coords.2, respectively.
Andrew O. Finley [email protected],
Sudipto Banerjee [email protected],
## Not run: n <- 10 p1 <- cbind(runif(n),runif(n)) m <- 5 p2 <- cbind(runif(m),runif(m)) D <- iDist(p1, p2) ## End(Not run)
Given univariate design matrices, the function
mkMvX
creates a multivariate design matrix suitable for use in spPredict
.
mkMvX(X)
X |
a list of |
A multivariate design matrix suitable for use in spPredict
.
Andrew O. Finley [email protected],
Sudipto Banerjee [email protected].
## Not run: ##Define some univariate model design matrices ##with intercepts. X.1 <- cbind(rep(1, 10), matrix(rnorm(50), nrow=10)) X.2 <- cbind(rep(1, 10), matrix(rnorm(20), nrow=10)) X.3 <- cbind(rep(1, 10), matrix(rnorm(30), nrow=10)) ##Make a multivariate design matrix suitable ##for use in spPredict. X.mv <- mkMvX(list(X.1, X.2, X.3)) ## End(Not run)
The function mkSpCov
calculates a spatial covariance matrix
given spatial locations and spatial covariance parameters.
mkSpCov(coords, K, Psi, theta, cov.model)
coords |
an |
K |
the |
Psi |
the |
theta |
a vector of |
cov.model |
a quoted keyword that specifies the covariance
function used to model the spatial dependence structure among the
observations. Supported covariance model key words are:
|
Covariance functions return the covariance C(h) between a pair of locations separated by distance h. The covariance function can be written as a product of a variance parameter sigma^2 and a positive definite correlation function rho(h): C(h) = sigma^2 * rho(h), see, e.g.,
Banerjee et al. (2004) p. 27 for more details. The expressions of the correlation functions available in spBayes are given below. More will be added upon request.
For all correlation functions, phi is the spatial decay parameter.
Some of the correlation functions will have an extra parameter nu, the smoothness parameter. K_nu denotes the modified Bessel
function of the third kind of order nu. See
documentation of the function
besselK
for further details.
The following functions are valid for phi > 0 and nu > 0, unless stated otherwise.
gaussian: rho(h) = exp(-(phi*h)^2)
exponential: rho(h) = exp(-phi*h)
matern: rho(h) = (1/(2^(nu-1) * Gamma(nu))) * (phi*h)^nu * K_nu(phi*h)
spherical: rho(h) = 1 - 1.5*phi*h + 0.5*(phi*h)^3 if 0 < h <= 1/phi; 1 if h = 0; and 0 if h > 1/phi
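A minimal univariate sketch (q = 1, nugget Psi set to zero, illustrative parameter values) checking the exponential form above against a direct construction:

library(spBayes)

set.seed(1)
n <- 5
coords <- cbind(runif(n), runif(n))
sigma.sq <- 2
phi <- 6

C <- mkSpCov(coords, K=matrix(sigma.sq), Psi=matrix(0), theta=phi,
             cov.model="exponential")

## direct construction from rho(h) = exp(-phi*h)
C.check <- sigma.sq*exp(-phi*as.matrix(dist(coords)))
max(abs(C - C.check))  ## should be numerically zero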
C |
the |
Andrew O. Finley [email protected],
Sudipto Banerjee [email protected]
## Not run: ##A bivariate spatial covariance matrix n <- 2 ##number of locations q <- 2 ##number of responses at each location nltr <- q*(q+1)/2 ##number of triangular elements in the cross-covariance matrix coords <- cbind(runif(n,0,1), runif(n,0,1)) ##spatial decay parameters theta <- rep(6,q) A <- matrix(0,q,q) A[lower.tri(A,TRUE)] <- rnorm(nltr, 5, 1) K <- A%*%t(A) Psi <- diag(1,q) C <- mkSpCov(coords, K, Psi, theta, cov.model="exponential") ## End(Not run)
Monthly temperature data (Celsius) recorded across the Northeastern US starting in January 2000. Station UTM coordinates and elevation are also included.
data(NETemp.dat)
A data frame containing 356 rows (weather stations) and 132 columns.
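A minimal sketch loading the data and plotting station locations; the UTMX and UTMY column names follow the spDynLM example elsewhere in this manual:

library(spBayes)

data(NETemp.dat)
dim(NETemp.dat)  ## 356 stations by 132 columns

coords <- as.matrix(NETemp.dat[, c("UTMX", "UTMY")])/1000  ## rescale to km
plot(coords, pch=19, cex=0.5, xlab="UTMX (km)", ylab="UTMY (km)",
     main="NETemp.dat station locations")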
These data and subsequent description are drawn from the spTimer package (version 0.7). This data set contains values of daily 8-hour maximum average ozone concentrations (ppb; O3.8HRMAX), maximum temperature (degree Celsius; cMAXTMP), wind speed (knots; WDSP), and relative humidity (RH), obtained from 28 monitoring sites in New York, USA, between July 1 and August 31 in 2006. Each row represents a station and columns hold consecutive daily values.
data(NYOzone.dat)
Columns for NYdata:
1st col = Longitude
2nd col = Latitude
3rd col = X coordinates in UTM projection
4th col = Y coordinates in UTM projection
5th col = Ozone July 1 (O3.8HRMAX.1)
6th col = Ozone July 2 (O3.8HRMAX.2)
...
66th col = Ozone August 31 (O3.8HRMAX.62)
the remaining columns hold cMAXTMP, WDSP, and RH, organized in the same way as the 62 O3.8HRMAX measurements
Bakar, K.S. and S.K. Sahu. spTimer. http://www.southampton.ac.uk/~sks/research/papers/spTimeRpaper.pdf
Sahu, S.K. and K.S. Bakar. (2012) A Comparison of Bayesian Models for Daily Ozone Concentration Levels. Statistical Methodology, 9, 144–157.
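A minimal sketch pulling the 62 daily ozone columns and plotting the series for the first station (rows are stations, columns are consecutive days):

library(spBayes)

data(NYOzone.dat)

o3 <- as.matrix(NYOzone.dat[, grep("O3.8HRMAX", names(NYOzone.dat), fixed=TRUE)])
dim(o3)  ## 28 stations by 62 days

plot(1:ncol(o3), o3[1, ], type="b", pch=19, cex=0.5,
     xlab="day (July 1 - August 31, 2006)", ylab="8-hour max ozone (ppb)")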
The PM10.dat
data frame is a subset of data analyzed in Hamm
et al. (2015) and Datta et al. (2016). Data comprise April 6, 2010
square root transformed PM10 measurements across
central Europe with corresponding output from the LOTOS-EUROS Schaap et
al. (2008) chemistry transport model (CTM). CTM data may differ
slightly from that considered in the studies noted above due to LOTOS-EUROS CTM
code updates. An NA
value is given
at CTM output locations where PM10 is not
observed. Point coordinates are in
"+proj=laea +lat_0=52 +lon_0=10 +x_0=4321000 +y_0=3210000 +ellps=GRS80 +units=km +no_defs".
data(PM10.dat)
Columns for PM10.dat:
x.coord = x coordinate (see projection information in the description)
y.coord = y coordinate (see projection information in the description)
pm10.obs = square root transformed PM10 measurements at
monitoring stations (NA
means there is not a station at the
given location)
pm10.ctm = square root transformed PM10 from CTM
Datta A., S. Banerjee, A.O. Finley, N. Hamm, and M. Schaap (2016). Nonseparable dynamic nearest neighbor Gaussian process models for large spatio-temporal data with an application to particulate matter analysis. Annals of Applied Statistics, 10(3), 1286–1316. ISSN 1932-6157. doi:10.1214/16-AOAS931.
Hamm N. A.O. Finley, M. Schaap, A. Stein (2015). A Spatially Varying Coefficient Model for Mapping PM10 Air Quality at the European scale. Atmospheric Environment, 102, 393–405.
Schaap M., R.M.A Timmermans, M. Roemer, G.A.C. Boersen, P. Builtjes, F. Sauter, G. Velders, J. Beck (2008). The LOTOS-EUROS Model: Description, Validation and Latest Developments. International Journal of Environment and Pollution, 32(2), 270–290.
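A minimal sketch comparing the square root transformed PM10 observations with the CTM output at the monitored locations (pm10.obs is NA elsewhere):

library(spBayes)

data(PM10.dat)

obs <- !is.na(PM10.dat$pm10.obs)
sum(obs)  ## number of monitoring stations

plot(PM10.dat$pm10.ctm[obs], PM10.dat$pm10.obs[obs], pch=19, cex=0.5,
     xlab="sqrt PM10, CTM", ylab="sqrt PM10, observed")
abline(0, 1, col="blue")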
European countries corresponding to PM10.dat
locations and used in Hamm et al. (2015) and Datta et al. (2016). Polygon projection is "+proj=laea +lat_0=52 +lon_0=10 +x_0=4321000 +y_0=3210000 +ellps=GRS80 +units=km +no_defs".
data(PM10.poly)
List of polygons. See example below to convert to a SpatialPolygons
object.
Datta A., S. Banerjee, A.O. Finley, N. Hamm, and M. Schaap (2016). Nonseparable dynamic nearest neighbor Gaussian process models for large spatio-temporal data with an application to particulate matter analysis. Annals of Applied Statistics, 10(3), 1286–1316. ISSN 1932-6157. doi:10.1214/16-AOAS931.
Hamm N. A.O. Finley, M. Schaap, A. Stein (2015). A Spatially Varying Coefficient Model for Mapping PM10 Air Quality at the European scale. Atmospheric Environment, 102, 393–405.
## Not run: library(sp) prj <- "+proj=laea +lat_0=52 +lon_0=10 +x_0=4321000 +y_0=3210000 +ellps=GRS80 +units=km +no_defs" pm10.poly <- SpatialPolygons(PM10.poly, pO = 1:length(PM10.poly), proj4string=CRS(prj)) ## End(Not run)
Given a polygon and a set of points this function returns the subset of points that are within the polygon.
pointsInPoly(poly, points, ...)
poly |
an |
points |
an |
... |
currently no additional arguments. |
It is assumed that the polygon is to be closed by joining the last vertex to the first vertex.
If points are found within the polygon, a vector of the corresponding row indices of points
is returned; otherwise
NA
is returned.
Andrew O. Finley [email protected],
Sudipto Banerjee [email protected],
## Not run: ##Example 1 points <- cbind(runif(1000, 0, 10),runif(1000, 0, 10)) poly <- cbind(c(1:9,8:1), c(1,2*(5:3),2,-1,17,9,8,2:9)) point.indx <- pointsInPoly(poly, points) plot(points, pch=19, cex=0.5, xlab="x", ylab="y", col="red") points(points[point.indx,], pch=19, cex=0.5, col="blue") polygon(poly) ##Example 2 ##a function to partition the domain tiles <- function(points, x.cnt, y.cnt, tol = 1.0e-10){ x.min <- min(points[,1])-tol x.max <- max(points[,1])+tol y.min <- min(points[,2])-tol y.max <- max(points[,2])+tol x.cnt <- x.cnt+1 y.cnt <- y.cnt+1 x <- seq(x.min, x.max, length.out=x.cnt) y <- seq(y.min, y.max, length.out=y.cnt) tile.list <- vector("list", (length(y)-1)*(length(x)-1)) l <- 1 for(i in 1:(length(y)-1)){ for(j in 1:(length(x)-1)){ tile.list[[l]] <- rbind(c(x[j], y[i]), c(x[j+1], y[i]), c(x[j+1], y[i+1]), c(x[j], y[i+1])) l <- l+1 } } tile.list } n <- 1000 points <- cbind(runif(n, 0, 10), runif(n, 0, 10)) grd <- tiles(points, x.cnt=10, y.cnt=10) plot(points, pch=19, cex=0.5, xlab="x", ylab="y") sum.points <- 0 for(i in 1:length(grd)){ polygon(grd[[i]], border="red") point.indx <- pointsInPoly(grd[[i]], points) if(!is.na(point.indx[1])){ sum.points <- length(point.indx)+sum.points text(mean(grd[[i]][,1]), mean(grd[[i]][,2]), length(point.indx), col="red") } } sum.points ## End(Not run)
The function spDiag
calculates DIC, GP, GRS, and associated
statistics given a spLM
, spMvLM
,
spGLM
, spMvGLM
, or
spSVC
object.
spDiag(sp.obj, start=1, end, thin=1, verbose=TRUE, n.report=100, ...)
sp.obj |
an object returned by |
start |
specifies the first sample included in the computation. The |
end |
specifies the last sample included in the computation.
The default is to use all posterior samples in |
thin |
a sample thinning factor. The default of 1 considers all
samples between |
verbose |
if |
n.report |
the interval to report progress. |
... |
currently no additional arguments. |
A list with some of the following tags:
DIC |
a matrix holding DIC and associated statistics, see Banerjee et al. (2004) for details. |
GP |
a matrix holding GP and associated statistics, see Gelfand and Ghosh (1998) for details. |
GRS |
a scoring rule, see Equation 27 in Gneiting and Raftery (2007) for details. |
Andrew O. Finley [email protected],
Sudipto Banerjee [email protected]
Banerjee, S., Carlin, B.P., and Gelfand, A.E. (2004). Hierarchical modeling and analysis for spatial data. Chapman and Hall/CRC Press, Boca Raton,Fla.
Finley, A.O. and S. Banerjee (2019) Efficient implementation of spatially-varying coefficients models.
Gelfand A.E. and Ghosh, S.K. (1998). Model choice: a minimum posterior predictive loss approach. Biometrika. 85:1-11.
Gneiting, T. and Raftery, A.E. (2007). Strictly proper scoring rules, prediction, and estimation. Journal of the American Statistical Association. 102:359-378.
## Not run: rmvn <- function(n, mu=0, V = matrix(1)){ p <- length(mu) if(any(is.na(match(dim(V),p)))) stop("Dimension problem!") D <- chol(V) t(matrix(rnorm(n*p), ncol=p)%*%D + rep(mu,rep(n,p))) } set.seed(1) n <- 100 coords <- cbind(runif(n,0,1), runif(n,0,1)) X <- as.matrix(cbind(1, rnorm(n))) B <- as.matrix(c(1,5)) p <- length(B) sigma.sq <- 2 tau.sq <- 0.1 phi <- 3/0.5 D <- as.matrix(dist(coords)) R <- exp(-phi*D) w <- rmvn(1, rep(0,n), sigma.sq*R) y <- rnorm(n, X%*%B + w, sqrt(tau.sq)) n.samples <- 1000 starting <- list("phi"=3/0.5, "sigma.sq"=50, "tau.sq"=1) tuning <- list("phi"=0.1, "sigma.sq"=0.1, "tau.sq"=0.1) ##too restrictive of prior on beta priors.1 <- list("beta.Norm"=list(rep(0,p), diag(1,p)), "phi.Unif"=c(3/1, 3/0.1), "sigma.sq.IG"=c(2, 2), "tau.sq.IG"=c(2, 0.1)) ##more reasonable prior for beta priors.2 <- list("beta.Norm"=list(rep(0,p), diag(1000,p)), "phi.Unif"=c(3/1, 3/0.1), "sigma.sq.IG"=c(2, 2), "tau.sq.IG"=c(2, 0.1)) cov.model <- "exponential" n.report <- 500 verbose <- TRUE m.1 <- spLM(y~X-1, coords=coords, starting=starting, tuning=tuning, priors=priors.1, cov.model=cov.model, n.samples=n.samples, verbose=verbose, n.report=n.report) m.2 <- spLM(y~X-1, coords=coords, starting=starting, tuning=tuning, priors=priors.2, cov.model=cov.model, n.samples=n.samples, verbose=verbose, n.report=n.report) ##non-spatial model m.3 <- spLM(y~X-1, n.samples=n.samples, verbose=verbose, n.report=n.report) burn.in <- 0.5*n.samples ##recover beta and spatial random effects m.1 <- spRecover(m.1, start=burn.in, verbose=FALSE) m.2 <- spRecover(m.2, start=burn.in, verbose=FALSE) ##lower is better for DIC, GPD, and GRS print(spDiag(m.1)) print(spDiag(m.2)) print(spDiag(m.3)) ## End(Not run)
The function spDynLM
fits Gaussian univariate Bayesian
dynamic space-time regression models for settings where space is viewed as continuous but time is taken
to be discrete.
spDynLM(formula, data = parent.frame(), coords, knots, starting, tuning, priors, cov.model, get.fitted=FALSE, n.samples, verbose=TRUE, n.report=100, ...)
formula |
a list of |
data |
an optional data frame containing the variables in the
model. If not found in data, the variables are taken from
|
coords |
an |
starting |
a list with each tag corresponding to a
parameter name. Valid tags are |
knots |
either a |
tuning |
a list with each tag corresponding to a
parameter name. Valid tags are |
priors |
a list with tags |
cov.model |
a quoted keyword that specifies the covariance
function used to model the spatial dependence structure among the
observations. Supported covariance model key words are:
|
get.fitted |
if |
n.samples |
the number of MCMC iterations. |
verbose |
if |
n.report |
the interval to report Metropolis sampler acceptance and MCMC progress. |
... |
currently no additional arguments. |
Suppose y_t(s) denotes the observation at location s and time t. We model y_t(s) through a measurement equation that
provides a regression specification with a space-time varying
intercept and serially and spatially
uncorrelated zero-centered Gaussian disturbances as measurement error
eps_t(s). Next a transition equation
introduces a p x 1
coefficient vector, say beta_t, which is a purely
temporal component (i.e., time-varying regression parameters), and a
spatio-temporal component u_t(s). Both these are generated through
transition equations, capturing their Markovian dependence in
time. While the transition equation of the purely temporal component
is akin to usual state-space modeling, the spatio-temporal component is
generated using Gaussian spatial processes. The overall model is written
as
y_t(s) = x_t(s)'beta_t + u_t(s) + eps_t(s), with eps_t(s) ~ iid N(0, tau_t^2)
beta_t = beta_{t-1} + eta_t, with eta_t ~ iid N(0, Sigma_eta)
u_t(s) = u_{t-1}(s) + w_t(s), with w_t(s) an independent spatial Gaussian process GP(0, C_t(., theta_t))
Here x_t(s) is a p x 1
vector of predictors
and beta_t
is a p x 1
vector of
coefficients. In addition to an intercept, x_t(s)
can
include location specific variables useful for explaining the
variability in y_t(s). The GP(0, C_t(., theta_t))
denotes a spatial
Gaussian process with covariance function
C_t(.; theta_t). We specify
C_t(s_1, s_2; theta_t) = sigma_t^2 * rho(s_1, s_2; phi_t), where
theta_t = {sigma_t^2, phi_t, nu_t}
and rho(.; phi) is a
correlation function with phi controlling the
correlation decay and sigma_t^2 represents the
spatial variance component. The spatial smoothness parameter,
nu, is used if the Matern spatial correlation function is chosen. We further assume
beta_0 ~ N(m_0, Sigma_0)
and u_0(s) = 0, which completes the prior specification, leading to a well-identified Bayesian hierarchical model that also yields reasonable dependence structures.
An object of class spDynLM
, which is a list with the following
tags:
coords |
the |
p.theta.samples |
a |
p.beta.0.samples |
a |
p.beta.samples |
a |
p.sigma.eta.samples |
a |
p.u.samples |
a |
p.y.samples |
a |
The return object might include additional data used for subsequent prediction and/or model fit evaluation.
Andrew O. Finley [email protected],
Sudipto Banerjee [email protected]
Finley, A.O., S. Banerjee, and A.E. Gelfand. (2012) Bayesian dynamic modeling for large space-time datasets using Gaussian predictive processes. Journal of Geographical Systems, 14:29–47.
Finley, A.O., S. Banerjee, and A.E. Gelfand. (2015) spBayes for large univariate and multivariate point-referenced spatio-temporal data models. Journal of Statistical Software, 63:1–28. https://www.jstatsoft.org/article/view/v063i13.
Gelfand, A.E., S. Banerjee, and D. Gamerman (2005) Spatial Process Modelling for Univariate and Multivariate Dynamic Spatial Data, Environmetrics, 16:465–479.
## Not run: data("NETemp.dat") ne.temp <- NETemp.dat set.seed(1) ##take a chunk of New England ne.temp <- ne.temp[ne.temp[,"UTMX"] > 5500000 & ne.temp[,"UTMY"] > 3000000,] ##subset first 2 years (Jan 2000 - Dec. 2002) y.t <- ne.temp[,4:27] N.t <- ncol(y.t) ##number of months n <- nrow(y.t) ##number of observation per months ##add some missing observations to illistrate prediction miss <- sample(1:N.t, 10) holdout.station.id <- 5 y.t.holdout <- y.t[holdout.station.id, miss] y.t[holdout.station.id, miss] <- NA ##scale to km coords <- as.matrix(ne.temp[,c("UTMX", "UTMY")]/1000) max.d <- max(iDist(coords)) ##set starting and priors p <- 2 #number of regression parameters in each month starting <- list("beta"=rep(0,N.t*p), "phi"=rep(3/(0.5*max.d), N.t), "sigma.sq"=rep(2,N.t), "tau.sq"=rep(1, N.t), "sigma.eta"=diag(rep(0.01, p))) tuning <- list("phi"=rep(5, N.t)) priors <- list("beta.0.Norm"=list(rep(0,p), diag(1000,p)), "phi.Unif"=list(rep(3/(0.9*max.d), N.t), rep(3/(0.05*max.d), N.t)), "sigma.sq.IG"=list(rep(2,N.t), rep(10,N.t)), "tau.sq.IG"=list(rep(2,N.t), rep(5,N.t)), "sigma.eta.IW"=list(2, diag(0.001,p))) ##make symbolic model formula statement for each month mods <- lapply(paste(colnames(y.t),'elev',sep='~'), as.formula) n.samples <- 2000 m.1 <- spDynLM(mods, data=cbind(y.t,ne.temp[,"elev",drop=FALSE]), coords=coords, starting=starting, tuning=tuning, priors=priors, get.fitted =TRUE, cov.model="exponential", n.samples=n.samples, n.report=25) burn.in <- floor(0.75*n.samples) quant <- function(x){quantile(x, prob=c(0.5, 0.025, 0.975))} beta <- apply(m.1$p.beta.samples[burn.in:n.samples,], 2, quant) beta.0 <- beta[,grep("Intercept", colnames(beta))] beta.1 <- beta[,grep("elev", colnames(beta))] plot(m.1$p.beta.0.samples) par(mfrow=c(2,1)) plot(1:N.t, beta.0[1,], pch=19, cex=0.5, xlab="months", ylab="beta.0", ylim=range(beta.0)) arrows(1:N.t, beta.0[1,], 1:N.t, beta.0[3,], length=0.02, angle=90) arrows(1:N.t, beta.0[1,], 1:N.t, beta.0[2,], length=0.02, angle=90) plot(1:N.t, beta.1[1,], pch=19, cex=0.5, xlab="months", ylab="beta.1", ylim=range(beta.1)) arrows(1:N.t, beta.1[1,], 1:N.t, beta.1[3,], length=0.02, angle=90) arrows(1:N.t, beta.1[1,], 1:N.t, beta.1[2,], length=0.02, angle=90) theta <- apply(m.1$p.theta.samples[burn.in:n.samples,], 2, quant) sigma.sq <- theta[,grep("sigma.sq", colnames(theta))] tau.sq <- theta[,grep("tau.sq", colnames(theta))] phi <- theta[,grep("phi", colnames(theta))] par(mfrow=c(3,1)) plot(1:N.t, sigma.sq[1,], pch=19, cex=0.5, xlab="months", ylab="sigma.sq", ylim=range(sigma.sq)) arrows(1:N.t, sigma.sq[1,], 1:N.t, sigma.sq[3,], length=0.02, angle=90) arrows(1:N.t, sigma.sq[1,], 1:N.t, sigma.sq[2,], length=0.02, angle=90) plot(1:N.t, tau.sq[1,], pch=19, cex=0.5, xlab="months", ylab="tau.sq", ylim=range(tau.sq)) arrows(1:N.t, tau.sq[1,], 1:N.t, tau.sq[3,], length=0.02, angle=90) arrows(1:N.t, tau.sq[1,], 1:N.t, tau.sq[2,], length=0.02, angle=90) plot(1:N.t, 3/phi[1,], pch=19, cex=0.5, xlab="months", ylab="eff. 
range (km)", ylim=range(3/phi)) arrows(1:N.t, 3/phi[1,], 1:N.t, 3/phi[3,], length=0.02, angle=90) arrows(1:N.t, 3/phi[1,], 1:N.t, 3/phi[2,], length=0.02, angle=90) y.hat <- apply(m.1$p.y.samples[,burn.in:n.samples], 1, quant) y.hat.med <- matrix(y.hat[1,], ncol=N.t) y.hat.up <- matrix(y.hat[3,], ncol=N.t) y.hat.low <- matrix(y.hat[2,], ncol=N.t) y.obs <- as.vector(as.matrix(y.t[-holdout.station.id, -miss])) y.obs.hat.med <- as.vector(y.hat.med[-holdout.station.id, -miss]) y.obs.hat.up <- as.vector(y.hat.up[-holdout.station.id, -miss]) y.obs.hat.low <- as.vector(y.hat.low[-holdout.station.id, -miss]) y.ho <- as.matrix(y.t.holdout) y.ho.hat.med <- as.vector(y.hat.med[holdout.station.id, miss]) y.ho.hat.up <- as.vector(y.hat.up[holdout.station.id, miss]) y.ho.hat.low <- as.vector(y.hat.low[holdout.station.id, miss]) par(mfrow=c(2,1)) plot(y.obs, y.obs.hat.med, pch=19, cex=0.5, xlab="observed", ylab="fitted", main="Observed vs. fitted") arrows(y.obs, y.obs.hat.med, y.obs, y.obs.hat.up, length=0.02, angle=90) arrows(y.obs, y.obs.hat.med, y.obs, y.obs.hat.low, length=0.02, angle=90) lines(-50:50, -50:50, col="blue") plot(y.ho, y.ho.hat.med, pch=19, cex=0.5, xlab="observed", ylab="predicted", main="Observed vs. predicted") arrows(y.ho, y.ho.hat.med, y.ho, y.ho.hat.up, length=0.02, angle=90) arrows(y.ho, y.ho.hat.med, y.ho, y.ho.hat.low, length=0.02, angle=90) lines(-50:50, -50:50, col="blue") ## End(Not run)
## Not run: data("NETemp.dat") ne.temp <- NETemp.dat set.seed(1) ##take a chunk of New England ne.temp <- ne.temp[ne.temp[,"UTMX"] > 5500000 & ne.temp[,"UTMY"] > 3000000,] ##subset first 2 years (Jan 2000 - Dec. 2002) y.t <- ne.temp[,4:27] N.t <- ncol(y.t) ##number of months n <- nrow(y.t) ##number of observation per months ##add some missing observations to illistrate prediction miss <- sample(1:N.t, 10) holdout.station.id <- 5 y.t.holdout <- y.t[holdout.station.id, miss] y.t[holdout.station.id, miss] <- NA ##scale to km coords <- as.matrix(ne.temp[,c("UTMX", "UTMY")]/1000) max.d <- max(iDist(coords)) ##set starting and priors p <- 2 #number of regression parameters in each month starting <- list("beta"=rep(0,N.t*p), "phi"=rep(3/(0.5*max.d), N.t), "sigma.sq"=rep(2,N.t), "tau.sq"=rep(1, N.t), "sigma.eta"=diag(rep(0.01, p))) tuning <- list("phi"=rep(5, N.t)) priors <- list("beta.0.Norm"=list(rep(0,p), diag(1000,p)), "phi.Unif"=list(rep(3/(0.9*max.d), N.t), rep(3/(0.05*max.d), N.t)), "sigma.sq.IG"=list(rep(2,N.t), rep(10,N.t)), "tau.sq.IG"=list(rep(2,N.t), rep(5,N.t)), "sigma.eta.IW"=list(2, diag(0.001,p))) ##make symbolic model formula statement for each month mods <- lapply(paste(colnames(y.t),'elev',sep='~'), as.formula) n.samples <- 2000 m.1 <- spDynLM(mods, data=cbind(y.t,ne.temp[,"elev",drop=FALSE]), coords=coords, starting=starting, tuning=tuning, priors=priors, get.fitted =TRUE, cov.model="exponential", n.samples=n.samples, n.report=25) burn.in <- floor(0.75*n.samples) quant <- function(x){quantile(x, prob=c(0.5, 0.025, 0.975))} beta <- apply(m.1$p.beta.samples[burn.in:n.samples,], 2, quant) beta.0 <- beta[,grep("Intercept", colnames(beta))] beta.1 <- beta[,grep("elev", colnames(beta))] plot(m.1$p.beta.0.samples) par(mfrow=c(2,1)) plot(1:N.t, beta.0[1,], pch=19, cex=0.5, xlab="months", ylab="beta.0", ylim=range(beta.0)) arrows(1:N.t, beta.0[1,], 1:N.t, beta.0[3,], length=0.02, angle=90) arrows(1:N.t, beta.0[1,], 1:N.t, beta.0[2,], length=0.02, angle=90) plot(1:N.t, beta.1[1,], pch=19, cex=0.5, xlab="months", ylab="beta.1", ylim=range(beta.1)) arrows(1:N.t, beta.1[1,], 1:N.t, beta.1[3,], length=0.02, angle=90) arrows(1:N.t, beta.1[1,], 1:N.t, beta.1[2,], length=0.02, angle=90) theta <- apply(m.1$p.theta.samples[burn.in:n.samples,], 2, quant) sigma.sq <- theta[,grep("sigma.sq", colnames(theta))] tau.sq <- theta[,grep("tau.sq", colnames(theta))] phi <- theta[,grep("phi", colnames(theta))] par(mfrow=c(3,1)) plot(1:N.t, sigma.sq[1,], pch=19, cex=0.5, xlab="months", ylab="sigma.sq", ylim=range(sigma.sq)) arrows(1:N.t, sigma.sq[1,], 1:N.t, sigma.sq[3,], length=0.02, angle=90) arrows(1:N.t, sigma.sq[1,], 1:N.t, sigma.sq[2,], length=0.02, angle=90) plot(1:N.t, tau.sq[1,], pch=19, cex=0.5, xlab="months", ylab="tau.sq", ylim=range(tau.sq)) arrows(1:N.t, tau.sq[1,], 1:N.t, tau.sq[3,], length=0.02, angle=90) arrows(1:N.t, tau.sq[1,], 1:N.t, tau.sq[2,], length=0.02, angle=90) plot(1:N.t, 3/phi[1,], pch=19, cex=0.5, xlab="months", ylab="eff. 
range (km)", ylim=range(3/phi)) arrows(1:N.t, 3/phi[1,], 1:N.t, 3/phi[3,], length=0.02, angle=90) arrows(1:N.t, 3/phi[1,], 1:N.t, 3/phi[2,], length=0.02, angle=90) y.hat <- apply(m.1$p.y.samples[,burn.in:n.samples], 1, quant) y.hat.med <- matrix(y.hat[1,], ncol=N.t) y.hat.up <- matrix(y.hat[3,], ncol=N.t) y.hat.low <- matrix(y.hat[2,], ncol=N.t) y.obs <- as.vector(as.matrix(y.t[-holdout.station.id, -miss])) y.obs.hat.med <- as.vector(y.hat.med[-holdout.station.id, -miss]) y.obs.hat.up <- as.vector(y.hat.up[-holdout.station.id, -miss]) y.obs.hat.low <- as.vector(y.hat.low[-holdout.station.id, -miss]) y.ho <- as.matrix(y.t.holdout) y.ho.hat.med <- as.vector(y.hat.med[holdout.station.id, miss]) y.ho.hat.up <- as.vector(y.hat.up[holdout.station.id, miss]) y.ho.hat.low <- as.vector(y.hat.low[holdout.station.id, miss]) par(mfrow=c(2,1)) plot(y.obs, y.obs.hat.med, pch=19, cex=0.5, xlab="observed", ylab="fitted", main="Observed vs. fitted") arrows(y.obs, y.obs.hat.med, y.obs, y.obs.hat.up, length=0.02, angle=90) arrows(y.obs, y.obs.hat.med, y.obs, y.obs.hat.low, length=0.02, angle=90) lines(-50:50, -50:50, col="blue") plot(y.ho, y.ho.hat.med, pch=19, cex=0.5, xlab="observed", ylab="predicted", main="Observed vs. predicted") arrows(y.ho, y.ho.hat.med, y.ho, y.ho.hat.up, length=0.02, angle=90) arrows(y.ho, y.ho.hat.med, y.ho, y.ho.hat.low, length=0.02, angle=90) lines(-50:50, -50:50, col="blue") ## End(Not run)
The function spGLM
fits univariate Bayesian
generalized linear spatial regression models. Given a set of knots,
spGLM
will also fit a predictive process model (see references below).
spGLM(formula, family="binomial", weights, data = parent.frame(), coords, knots, starting, tuning, priors, cov.model, amcmc, n.samples, verbose=TRUE, n.report=100, ...)
formula |
a symbolic description of the regression model to be fit. See example below. |
family |
currently only supports |
weights |
an optional vector of weights to be used in the fitting
process. Weights correspond to number of trials and offset for
each location for the |
data |
an optional data frame containing the variables in the
model. If not found in |
coords |
an |
knots |
either a |
starting |
a list with each tag corresponding to a
parameter name. Valid tags are |
tuning |
a list with each tag corresponding to a
parameter name. Valid tags are The tuning value for |
priors |
a list with each tag corresponding to a
parameter name. Valid tags are |
cov.model |
a quoted keyword that specifies the covariance
function used to model the spatial dependence structure among the
observations. Supported covariance model key words are:
|
amcmc |
a list with tags |
n.samples |
the number of MCMC iterations. This argument is
ignored if |
verbose |
if |
n.report |
the interval to report Metropolis sampler acceptance and MCMC progress. |
... |
currently no additional arguments. |
If a binomial
model is specified, the response vector is the
number of successful trials at each location and weights
is the
total number of trials at each location.
For a poisson
specification, the weights
vector is the
count offset, e.g., population, at each location. This differs from
the glm
offset
argument which is passed as the
log of this value.
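As an illustrative sketch (not part of the package's own examples; y.counts, pop, x, and coords are hypothetical objects, with x a single covariate so beta has length two), the same population offset enters glm on the log scale but enters spGLM directly through weights:

## Not run: 
##glm() takes the offset on the log scale
m.glm <- glm(y.counts ~ x, family="poisson", offset=log(pop))

##spGLM() takes the offset itself through the weights argument
m.sp <- spGLM(y.counts ~ x, family="poisson", weights=pop, coords=coords,
              starting=list("beta"=rep(0,2), "phi"=3/0.5, "sigma.sq"=1, "w"=0),
              tuning=list("beta"=0.1, "phi"=0.1, "sigma.sq"=0.1, "w"=0.1),
              priors=list("beta.Flat", "phi.Unif"=c(3/1, 3/0.1),
                          "sigma.sq.IG"=c(2, 1)),
              cov.model="exponential", n.samples=1000)
## End(Not run)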
A non-spatial model is fit when coords
is not specified. See
example below.
An object of class spGLM
, which is a list with the following
tags:
coords |
the |
knot.coords |
the |
p.beta.theta.samples |
a |
acceptance |
the Metropolis sampler
acceptance rate. If |
acceptance.w |
if this is a non-predictive process model and
|
acceptance.w.knots |
if this is a predictive process model and |
p.w.knots.samples |
a matrix that holds samples from the posterior
distribution of the knots' spatial random effects. The rows of this matrix
correspond to the |
p.w.samples |
a matrix that holds samples from the posterior
distribution of the locations' spatial random effects. The rows of this matrix
correspond to the |
The return object might include additional data used for subsequent prediction and/or model fit evaluation.
Andrew O. Finley [email protected],
Sudipto Banerjee [email protected]
Banerjee, S., A.E. Gelfand, A.O. Finley, and H. Sang. (2008) Gaussian Predictive Process Models for Large Spatial Datasets. Journal of the Royal Statistical Society Series B, 70:825–848.
Banerjee, S., Carlin, B.P., and Gelfand, A.E. (2004) Hierarchical modeling and analysis for spatial data. Chapman and Hall/CRC Press, Boca Raton, Fla.
Finley, A.O., S. Banerjee, and A.E. Gelfand. (2015) spBayes for large univariate and multivariate point-referenced spatio-temporal data models. Journal of Statistical Software, 63:1–28. https://www.jstatsoft.org/article/view/v063i13.
Finley, A.O., S. Banerjee, and R.E. McRoberts. (2008) A Bayesian approach to quantifying uncertainty in multi-source forest area estimates. Environmental and Ecological Statistics, 15:241–258.
Roberts G.O. and Rosenthal J.S. (2006) Examples of Adaptive MCMC. http://probability.ca/jeff/ftpdir/adaptex.pdf Preprint.
## Not run: library(MBA) library(coda) set.seed(1) rmvn <- function(n, mu=0, V = matrix(1)){ p <- length(mu) if(any(is.na(match(dim(V),p)))) stop("Dimension problem!") D <- chol(V) t(matrix(rnorm(n*p), ncol=p) %*% D + rep(mu,rep(n,p))) } ################################ ##Spatial binomial ################################ ##Generate binary data coords <- as.matrix(expand.grid(seq(0,100,length.out=8), seq(0,100,length.out=8))) n <- nrow(coords) phi <- 3/50 sigma.sq <- 2 R <- sigma.sq*exp(-phi*as.matrix(dist(coords))) w <- rmvn(1, rep(0,n), R) x <- as.matrix(rep(1,n)) beta <- 0.1 p <- 1/(1+exp(-(x%*%beta+w))) weights <- rep(1, n) weights[coords[,1]>mean(coords[,1])] <- 10 y <- rbinom(n, size=weights, prob=p) ##Collect samples fit <- glm((y/weights)~x-1, weights=weights, family="binomial") beta.starting <- coefficients(fit) beta.tuning <- t(chol(vcov(fit))) n.batch <- 200 batch.length <- 50 n.samples <- n.batch*batch.length m.1 <- spGLM(y~1, family="binomial", coords=coords, weights=weights, starting=list("beta"=beta.starting, "phi"=0.06,"sigma.sq"=1, "w"=0), tuning=list("beta"=beta.tuning, "phi"=0.5, "sigma.sq"=0.5, "w"=0.5), priors=list("beta.Normal"=list(0,10), "phi.Unif"=c(0.03, 0.3), "sigma.sq.IG"=c(2, 1)), amcmc=list("n.batch"=n.batch, "batch.length"=batch.length, "accept.rate"=0.43), cov.model="exponential", verbose=TRUE, n.report=10) burn.in <- 0.9*n.samples sub.samps <- burn.in:n.samples print(summary(window(m.1$p.beta.theta.samples, start=burn.in))) beta.hat <- m.1$p.beta.theta.samples[sub.samps,"(Intercept)"] w.hat <- m.1$p.w.samples[,sub.samps] p.hat <- 1/(1+exp(-(x%*%beta.hat+w.hat))) y.hat <- apply(p.hat, 2, function(x){rbinom(n, size=weights, prob=p.hat)}) y.hat.mu <- apply(y.hat, 1, mean) y.hat.var <- apply(y.hat, 1, var) ##Take a look par(mfrow=c(1,2)) surf <- mba.surf(cbind(coords,y.hat.mu),no.X=100, no.Y=100, extend=TRUE)$xyz.est image(surf, main="Interpolated mean of posterior rate\n(observed rate)") contour(surf, add=TRUE) text(coords, label=paste("(",y,")",sep="")) surf <- mba.surf(cbind(coords,y.hat.var),no.X=100, no.Y=100, extend=TRUE)$xyz.est image(surf, main="Interpolated variance of posterior rate\n(observed # of trials)") contour(surf, add=TRUE) text(coords, label=paste("(",weights,")",sep="")) ########################### ##Spatial poisson ########################### ##Generate count data set.seed(1) n <- 100 coords <- cbind(runif(n,1,100),runif(n,1,100)) phi <- 3/50 sigma.sq <- 2 R <- sigma.sq*exp(-phi*as.matrix(dist(coords))) w <- rmvn(1, rep(0,n), R) x <- as.matrix(rep(1,n)) beta <- 0.1 y <- rpois(n, exp(x%*%beta+w)) ##Collect samples beta.starting <- coefficients(glm(y~x-1, family="poisson")) beta.tuning <- t(chol(vcov(glm(y~x-1, family="poisson")))) n.batch <- 500 batch.length <- 50 n.samples <- n.batch*batch.length ##Note tuning list is now optional m.1 <- spGLM(y~1, family="poisson", coords=coords, starting=list("beta"=beta.starting, "phi"=0.06,"sigma.sq"=1, "w"=0), tuning=list("beta"=0.1, "phi"=0.5, "sigma.sq"=0.5, "w"=0.5), priors=list("beta.Flat", "phi.Unif"=c(0.03, 0.3), "sigma.sq.IG"=c(2, 1)), amcmc=list("n.batch"=n.batch, "batch.length"=batch.length, "accept.rate"=0.43), cov.model="exponential", verbose=TRUE, n.report=10) ##Just for fun check out the progression of the acceptance ##as it moves to 43% (same can be seen for the random spatial effects). plot(mcmc(t(m.1$acceptance)), density=FALSE, smooth=FALSE) ##Now parameter summaries, etc. 
burn.in <- 0.9*n.samples sub.samps <- burn.in:n.samples m.1$p.beta.theta.samples[,"phi"] <- 3/m.1$p.beta.theta.samples[,"phi"] plot(m.1$p.beta.theta.samples) print(summary(window(m.1$p.beta.theta.samples, start=burn.in))) beta.hat <- m.1$p.beta.theta.samples[sub.samps,"(Intercept)"] w.hat <- m.1$p.w.samples[,sub.samps] y.hat <- apply(exp(x%*%beta.hat+w.hat), 2, function(x){rpois(n, x)}) y.hat.mu <- apply(y.hat, 1, mean) ##Take a look par(mfrow=c(1,2)) surf <- mba.surf(cbind(coords,y),no.X=100, no.Y=100, extend=TRUE)$xyz.est image(surf, main="Observed counts") contour(surf, add=TRUE) text(coords, labels=y, cex=1) surf <- mba.surf(cbind(coords,y.hat.mu),no.X=100, no.Y=100, extend=TRUE)$xyz.est image(surf, main="Fitted counts") contour(surf, add=TRUE) text(coords, labels=round(y.hat.mu,0), cex=1) ## End(Not run)
The function spLM
fits Gaussian univariate Bayesian
spatial regression models. Given a set of knots, spLM
will also
fit a predictive process model (see references below).
spLM(formula, data = parent.frame(), coords, knots, starting, tuning, priors, cov.model, modified.pp = TRUE, amcmc, n.samples, verbose=TRUE, n.report=100, ...)
formula |
a symbolic description of the regression model to be fit. See example below. |
data |
an optional data frame containing the variables in the
model. If not found in |
coords |
an |
knots |
either a |
starting |
a list with each tag corresponding to a
parameter name. Valid tags are |
tuning |
a list with each tag corresponding to a
parameter name. Valid tags are |
priors |
a list with each tag corresponding to a
parameter name. Valid tags are |
cov.model |
a quoted keyword that specifies the covariance
function used to model the spatial dependence structure among the
observations. Supported covariance model key words are:
|
modified.pp |
a logical value indicating if the modified
predictive process should be used (see references below for
details). Note, if a predictive process model is not used (i.e., |
amcmc |
a list with tags |
n.samples |
the number of MCMC iterations. This argument is
ignored if |
verbose |
if |
n.report |
the interval to report Metropolis sampler acceptance and MCMC progress. |
... |
currently no additional arguments. |
Model parameters can be fixed at their starting
values by setting their
tuning
values to zero.
The no nugget model is specified by removing tau.sq
from the starting
list.
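For instance, a minimal sketch (reusing the simulated y, X, and coords from the examples below) of a no nugget model with phi held fixed at its starting value by a zero tuning value:

## Not run: 
##omit tau.sq from starting (and from tuning/priors) for the no nugget model;
##a zero tuning value fixes phi at its starting value
m.fixed <- spLM(y~X-1, coords=coords,
                starting=list("phi"=3/0.5, "sigma.sq"=50),
                tuning=list("phi"=0, "sigma.sq"=0.1),
                priors=list("beta.Flat", "phi.Unif"=c(3/1, 3/0.1),
                            "sigma.sq.IG"=c(2, 2)),
                cov.model="exponential", n.samples=1000)
## End(Not run)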
An object of class spLM
, which is a list with the following
tags:
coords |
the |
knot.coords |
the |
p.theta.samples |
a |
acceptance |
the Metropolis sampling
acceptance percent. Reported at |
The return object might include additional data used for subsequent prediction and/or model fit evaluation.
Andrew O. Finley [email protected],
Sudipto Banerjee [email protected]
Banerjee, S., A.E. Gelfand, A.O. Finley, and H. Sang. (2008) Gaussian Predictive Process Models for Large Spatial Datasets. Journal of the Royal Statistical Society Series B, 70:825–848.
Banerjee, S., Carlin, B.P., and Gelfand, A.E. (2004). Hierarchical modeling and analysis for spatial data. Chapman and Hall/CRC Press, Boca Raton, FL.
Finley, A.O., S. Banerjee, and A.E. Gelfand. (2015) spBayes for large univariate and multivariate point-referenced spatio-temporal data models. Journal of Statistical Software, 63:1–28. https://www.jstatsoft.org/article/view/v063i13.
Finley, A.O., H. Sang, S. Banerjee, and A.E. Gelfand. (2009) Improving the performance of predictive process modeling for large datasets. Computational Statistics and Data Analysis, 53:2873–2884.
Roberts G.O. and Rosenthal J.S. (2006). Examples of Adaptive MCMC. http://probability.ca/jeff/ftpdir/adaptex.pdf.
library(coda) ## Not run: rmvn <- function(n, mu=0, V = matrix(1)){ p <- length(mu) if(any(is.na(match(dim(V),p)))) stop("Dimension problem!") D <- chol(V) t(matrix(rnorm(n*p), ncol=p)%*%D + rep(mu,rep(n,p))) } set.seed(1) n <- 100 coords <- cbind(runif(n,0,1), runif(n,0,1)) X <- as.matrix(cbind(1, rnorm(n))) B <- as.matrix(c(1,5)) p <- length(B) sigma.sq <- 2 tau.sq <- 0.1 phi <- 3/0.5 D <- as.matrix(dist(coords)) R <- exp(-phi*D) w <- rmvn(1, rep(0,n), sigma.sq*R) y <- rnorm(n, X%*%B + w, sqrt(tau.sq)) n.samples <- 2000 starting <- list("phi"=3/0.5, "sigma.sq"=50, "tau.sq"=1) tuning <- list("phi"=0.1, "sigma.sq"=0.1, "tau.sq"=0.1) priors.1 <- list("beta.Norm"=list(rep(0,p), diag(1000,p)), "phi.Unif"=c(3/1, 3/0.1), "sigma.sq.IG"=c(2, 2), "tau.sq.IG"=c(2, 0.1)) priors.2 <- list("beta.Flat", "phi.Unif"=c(3/1, 3/0.1), "sigma.sq.IG"=c(2, 2), "tau.sq.IG"=c(2, 0.1)) cov.model <- "exponential" n.report <- 500 verbose <- TRUE m.1 <- spLM(y~X-1, coords=coords, starting=starting, tuning=tuning, priors=priors.1, cov.model=cov.model, n.samples=n.samples, verbose=verbose, n.report=n.report) m.2 <- spLM(y~X-1, coords=coords, starting=starting, tuning=tuning, priors=priors.2, cov.model=cov.model, n.samples=n.samples, verbose=verbose, n.report=n.report) burn.in <- 0.5*n.samples ##recover beta and spatial random effects m.1 <- spRecover(m.1, start=burn.in, verbose=FALSE) m.2 <- spRecover(m.2, start=burn.in, verbose=FALSE) round(summary(m.1$p.theta.recover.samples)$quantiles[,c(3,1,5)],2) round(summary(m.2$p.theta.recover.samples)$quantiles[,c(3,1,5)],2) round(summary(m.1$p.beta.recover.samples)$quantiles[,c(3,1,5)],2) round(summary(m.2$p.beta.recover.samples)$quantiles[,c(3,1,5)],2) m.1.w.summary <- summary(mcmc(t(m.1$p.w.recover.samples)))$quantiles[,c(3,1,5)] m.2.w.summary <- summary(mcmc(t(m.2$p.w.recover.samples)))$quantiles[,c(3,1,5)] plot(w, m.1.w.summary[,1], xlab="Observed w", ylab="Fitted w", xlim=range(w), ylim=range(m.1.w.summary), main="Spatial random effects") arrows(w, m.1.w.summary[,1], w, m.1.w.summary[,2], length=0.02, angle=90) arrows(w, m.1.w.summary[,1], w, m.1.w.summary[,3], length=0.02, angle=90) lines(range(w), range(w)) points(w, m.2.w.summary[,1], col="blue", pch=19, cex=0.5) arrows(w, m.2.w.summary[,1], w, col="blue", m.2.w.summary[,2], length=0.02, angle=90) arrows(w, m.2.w.summary[,1], w, col="blue", m.2.w.summary[,3], length=0.02, angle=90) ########################### ##Predictive process model ########################### m.1 <- spLM(y~X-1, coords=coords, knots=c(6,6,0.1), starting=starting, tuning=tuning, priors=priors.1, cov.model=cov.model, n.samples=n.samples, verbose=verbose, n.report=n.report) m.2 <- spLM(y~X-1, coords=coords, knots=c(6,6,0.1), starting=starting, tuning=tuning, priors=priors.2, cov.model=cov.model, n.samples=n.samples, verbose=verbose, n.report=n.report) burn.in <- 0.5*n.samples round(summary(window(m.1$p.beta.samples, start=burn.in))$quantiles[,c(3,1,5)],2) round(summary(window(m.2$p.beta.samples, start=burn.in))$quantiles[,c(3,1,5)],2) round(summary(window(m.1$p.theta.samples, start=burn.in))$quantiles[,c(3,1,5)],2) round(summary(window(m.2$p.theta.samples, start=burn.in))$quantiles[,c(3,1,5)],2) ## End(Not run)
The function spMisalignGLM
fits multivariate Bayesian
generalized linear spatial regression models to misaligned data.
spMisalignGLM(formula, family="binomial", weights, data = parent.frame(), coords, starting, tuning, priors, cov.model, amcmc, n.samples, verbose=TRUE, n.report=100, ...)
formula |
a list of |
family |
currently only supports |
weights |
an optional list of weight vectors associated with each model
in the formula list. Weights correspond to number of trials and offset for
each location for the |
data |
an optional data frame containing the variables in the
model. If not found in |
coords |
a list of |
starting |
a list with tags corresponding to |
tuning |
a list with tags |
priors |
a list with each tag corresponding to a
parameter name. Valid tags are |
cov.model |
a quoted keyword that specifies the covariance
function used to model the spatial dependence structure among the
observations. Supported covariance model key words are:
|
amcmc |
a list with tags |
n.samples |
the number of MCMC iterations. This argument is
ignored if |
verbose |
if |
n.report |
the interval to report Metropolis acceptance and MCMC progress. |
... |
currently no additional arguments. |
If a binomial
model is specified, the response vector is the
number of successful trials at each location and weights
is the
total number of trials at each location.
For a poisson
specification, the weights
vector is the
count offset, e.g., population, at each location. This differs from
the glm
offset
argument which is passed as the
log of this value.
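For example, a sketch (trials.1, trials.2, and trials.3 are hypothetical trial-count vectors; the remaining objects are built as in the example below) where the binomial trial counts are supplied as a list that parallels the formula and coords lists:

## Not run: 
m.bin <- spMisalignGLM(list(y.1~x.1-1, y.2~x.2-1, y.3~x.3-1), family="binomial",
                       weights=list(trials.1, trials.2, trials.3),
                       coords=list(coords.1, coords.2, coords.3),
                       starting=starting, tuning=tuning, priors=priors,
                       amcmc=list("n.batch"=n.batch, "batch.length"=batch.length,
                                  "accept.rate"=0.43),
                       cov.model="exponential", n.report=10)
## End(Not run)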
An object of class spMisalignGLM
, which is a list with the following
tags:
p.beta.theta.samples |
a |
acceptance |
the Metropolis sampler
acceptance rate. If |
acceptance.w |
if |
p.w.samples |
a matrix that holds samples from the posterior
distribution of the locations' spatial random effects. Posterior samples are organized with the first response variable
|
The return object might include additional data used for subsequent prediction and/or model fit evaluation.
Andrew O. Finley [email protected],
Sudipto Banerjee [email protected]
Banerjee, S., A.E. Gelfand, A.O. Finley, and H. Sang. (2008) Gaussian Predictive Process Models for Large Spatial Datasets. Journal of the Royal Statistical Society Series B, 70:825–848.
Banerjee, S., Carlin, B.P., and Gelfand, A.E. (2004). Hierarchical modeling and analysis for spatial data. Chapman and Hall/CRC Press, Boca Raton, Fla.
Finley, A.O., S. Banerjee, and B.D. Cook. (2014) Bayesian hierarchical models for spatially misaligned data. Methods in Ecology and Evolution, 5:514–523.
Finley, A.O., H. Sang, S. Banerjee, and A.E. Gelfand. (2009) Improving the performance of predictive process modeling for large datasets. Computational Statistics and Data Analysis, 53:2873–2884.
Finley, A.O., S. Banerjee, A.R. Ek, and R.E. McRoberts. (2008) Bayesian multivariate process modeling for prediction of forest attributes. Journal of Agricultural, Biological, and Environmental Statistics, 13:60–83.
## Not run: rmvn <- function(n, mu=0, V = matrix(1)){ p <- length(mu) if(any(is.na(match(dim(V),p)))){stop("Dimension problem!")} D <- chol(V) t(matrix(rnorm(n*p), ncol=p)%*%D + rep(mu,rep(n,p))) } set.seed(1) ##generate some data n <- 100 ##number of locations q <- 3 ##number of outcomes at each location nltr <- q*(q+1)/2 ##number of triangular elements in the cross-covariance matrix coords <- cbind(runif(n,0,1), runif(n,0,1)) ##parameters for generating a multivariate spatial GP covariance matrix theta <- rep(3/0.5,q) ##spatial decay A <- matrix(0,q,q) A[lower.tri(A,TRUE)] <- c(1,1,-1,1,0.5,0.25) K <- A%*%t(A) K ##spatial cross-covariance cov2cor(K) ##spatial cross-correlation C <- mkSpCov(coords, K, diag(0,q), theta, cov.model="exponential") w <- rmvn(1, rep(0,nrow(C)), C) ##spatial random effects w.a <- w[seq(1,length(w),q)] w.b <- w[seq(2,length(w),q)] w.c <- w[seq(3,length(w),q)] ##covariate portion of the mean x.a <- cbind(1, rnorm(n)) x.b <- cbind(1, rnorm(n)) x.c <- cbind(1, rnorm(n)) x <- mkMvX(list(x.a, x.b, x.c)) B.1 <- c(1,-1) B.2 <- c(-1,1) B.3 <- c(-1,-1) B <- c(B.1, B.2, B.3) y <- rpois(nrow(C), exp(x%*%B+w)) y.a <- y[seq(1,length(y),q)] y.b <- y[seq(2,length(y),q)] y.c <- y[seq(3,length(y),q)] ##subsample to make spatially misaligned data sub.1 <- 1:50 y.1 <- y.a[sub.1] w.1 <- w.a[sub.1] coords.1 <- coords[sub.1,] x.1 <- x.a[sub.1,] sub.2 <- 25:75 y.2 <- y.b[sub.2] w.2 <- w.b[sub.2] coords.2 <- coords[sub.2,] x.2 <- x.b[sub.2,] sub.3 <- 50:100 y.3 <- y.c[sub.3] w.3 <- w.c[sub.3] coords.3 <- coords[sub.3,] x.3 <- x.c[sub.3,] ##call spMisalignGLM q <- 3 A.starting <- diag(1,q)[lower.tri(diag(1,q), TRUE)] n.batch <- 200 batch.length <- 25 n.samples <- n.batch*batch.length starting <- list("beta"=rep(0,length(B)), "phi"=rep(3/0.5,q), "A"=A.starting, "w"=0) tuning <- list("beta"=rep(0.1,length(B)), "phi"=rep(1,q), "A"=rep(0.1,length(A.starting)), "w"=1) priors <- list("phi.Unif"=list(rep(3/0.75,q), rep(3/0.25,q)), "K.IW"=list(q+1, diag(0.1,q)), rep(0.1,q)) m.1 <- spMisalignGLM(list(y.1~x.1-1, y.2~x.2-1, y.3~x.3-1), family="poisson", coords=list(coords.1, coords.2, coords.3), starting=starting, tuning=tuning, priors=priors, amcmc=list("n.batch"=n.batch, "batch.length"=batch.length, "accept.rate"=0.43), cov.model="exponential", n.report=10) burn.in <- floor(0.75*n.samples) plot(m.1$p.beta.theta.samples, density=FALSE) ##predict for all locations, i.e., observed and not observed out <- spPredict(m.1, start=burn.in, thin=10, pred.covars=list(x.a, x.b, x.c), pred.coords=list(coords, coords, coords)) ##summary and check quants <- function(x){quantile(x, prob=c(0.5,0.025,0.975))} y.hat <- apply(out$p.y.predictive.samples, 1, quants) ##unstack and plot y.a.hat <- y.hat[,1:n] y.b.hat <- y.hat[,(n+1):(2*n)] y.c.hat <- y.hat[,(2*n+1):(3*n)] par(mfrow=c(1,3)) plot(y.a ,y.a.hat[1,], xlab="Observed y.a", ylab="Fitted & predicted y.a") plot(y.b, y.b.hat[1,], xlab="Observed y.b", ylab="Fitted & predicted y.b") plot(y.c, y.c.hat[1,], xlab="Observed y.c", ylab="Fitted & predicted y.c") ## End(Not run)
The function spMisalignLM
fits Gaussian multivariate Bayesian
spatial regression models to misaligned data.
spMisalignLM(formula, data = parent.frame(), coords, starting, tuning, priors, cov.model, amcmc, n.samples, verbose=TRUE, n.report=100, ...)
formula |
a list of |
data |
an optional data frame containing the variables in the
model. If not found in |
coords |
a list of |
starting |
a list with tags corresponding to
|
tuning |
a list with tags |
priors |
a list with tags |
cov.model |
a quoted keyword that specifies the covariance
function used to model the spatial dependence structure among the
observations. Supported covariance model key words are:
|
amcmc |
a list with tags |
n.samples |
the number of MCMC iterations. This argument is
ignored if |
verbose |
if |
n.report |
the interval to report Metropolis acceptance and MCMC progress. |
... |
currently no additional arguments. |
Model parameters can be fixed at their starting
values by setting their
tuning
values to zero.
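As a sketch (using the data, starting, and priors objects constructed in the example below), setting the Psi tuning values to zero holds the non-spatial variances at their starting values while phi and A are still updated:

## Not run: 
tuning.fixed <- list("phi"=rep(0.5,q), "A"=rep(0.01,length(A.starting)),
                     "Psi"=rep(0,q))

m.fixed <- spMisalignLM(list(y.1~x.1-1, y.2~x.2-1, y.3~x.3-1),
                        coords=list(coords.1, coords.2, coords.3),
                        starting=starting, tuning=tuning.fixed, priors=priors,
                        n.samples=5000, cov.model="exponential", n.report=100)
## End(Not run)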
An object of class spMisalignLM
, which is a list with the following
tags:
p.theta.samples |
a |
acceptance |
the Metropolis sampling
acceptance percent. Reported at |
The return object might include additional data used for subsequent prediction and/or model fit evaluation.
Andrew O. Finley [email protected],
Sudipto Banerjee [email protected]
Banerjee, S., A.E. Gelfand, A.O. Finley, and H. Sang. (2008) Gaussian Predictive Process Models for Large Spatial Datasets. Journal of the Royal Statistical Society Series B, 70:825–848.
Banerjee, S., Carlin, B.P., and Gelfand, A.E. (2004). Hierarchical modeling and analysis for spatial data. Chapman and Hall/CRC Press, Boca Raton, Fla.
Finley, A.O., S. Banerjee, and B.D. Cook. (2014) Bayesian hierarchical models for spatially misaligned data. Methods in Ecology and Evolution, 5:514–523.
Finley, A.O., H. Sang, S. Banerjee, and A.E. Gelfand. (2009) Improving the performance of predictive process modeling for large datasets. Computational Statistics and Data Analysis, 53:2873–2884.
Finley, A.O., S. Banerjee, A.R. Ek, and R.E. McRoberts. (2008) Bayesian multivariate process modeling for prediction of forest attributes. Journal of Agricultural, Biological, and Environmental Statistics, 13:60–83.
## Not run: rmvn <- function(n, mu=0, V = matrix(1)){ p <- length(mu) if(any(is.na(match(dim(V),p)))){stop("Dimension problem!")} D <- chol(V) t(matrix(rnorm(n*p), ncol=p)%*%D + rep(mu,rep(n,p))) } set.seed(1) ##generate some data n <- 100 ##number of locations q <- 3 ##number of outcomes at each location nltr <- q*(q+1)/2 ##number of triangular elements in the cross-covariance matrix coords <- cbind(runif(n,0,1), runif(n,0,1)) ##parameters for generating a multivariate spatial GP covariance matrix theta <- rep(3/0.5,q) ##spatial decay A <- matrix(0,q,q) A[lower.tri(A,TRUE)] <- c(1,1,-1,1,0.5,0.25) K <- A%*%t(A) K ##spatial cross-covariance cov2cor(K) ##spatial cross-correlation C <- mkSpCov(coords, K, diag(0,q), theta, cov.model="exponential") w <- rmvn(1, rep(0,nrow(C)), C) ##spatial random effects w.a <- w[seq(1,length(w),q)] w.b <- w[seq(2,length(w),q)] w.c <- w[seq(3,length(w),q)] ##covariate portion of the mean x.a <- cbind(1, rnorm(n)) x.b <- cbind(1, rnorm(n)) x.c <- cbind(1, rnorm(n)) x <- mkMvX(list(x.a, x.b, x.c)) B.1 <- c(1,-1) B.2 <- c(-1,1) B.3 <- c(-1,-1) B <- c(B.1, B.2, B.3) Psi <- c(0.1, 0.1, 0.1) ##non-spatial residual variance, i.e., nugget y <- rnorm(n*q, x%*%B+w, rep(sqrt(Psi),n)) y.a <- y[seq(1,length(y),q)] y.b <- y[seq(2,length(y),q)] y.c <- y[seq(3,length(y),q)] ##subsample to make spatially misaligned data sub.1 <- 1:50 y.1 <- y.a[sub.1] w.1 <- w.a[sub.1] coords.1 <- coords[sub.1,] x.1 <- x.a[sub.1,] sub.2 <- 25:75 y.2 <- y.b[sub.2] w.2 <- w.b[sub.2] coords.2 <- coords[sub.2,] x.2 <- x.b[sub.2,] sub.3 <- 50:100 y.3 <- y.c[sub.3] w.3 <- w.c[sub.3] coords.3 <- coords[sub.3,] x.3 <- x.c[sub.3,] ##call spMisalignLM q <- 3 A.starting <- diag(1,q)[lower.tri(diag(1,q), TRUE)] n.samples <- 5000 starting <- list("phi"=rep(3/0.5,q), "A"=A.starting, "Psi"=rep(1,q)) tuning <- list("phi"=rep(0.5,q), "A"=rep(0.01,length(A.starting)), "Psi"=rep(0.1,q)) priors <- list("phi.Unif"=list(rep(3/0.75,q), rep(3/0.25,q)), "K.IW"=list(q+1, diag(0.1,q)), "Psi.ig"=list(rep(2,q), rep(0.1,q))) m.1 <- spMisalignLM(list(y.1~x.1-1, y.2~x.2-1, y.3~x.3-1), coords=list(coords.1, coords.2, coords.3), starting=starting, tuning=tuning, priors=priors, n.samples=n.samples, cov.model="exponential", n.report=100) burn.in <- floor(0.75*n.samples) plot(m.1$p.theta.samples, density=FALSE) ##recover regression coefficients and random effects m.1 <- spRecover(m.1, start=burn.in) round(summary(m.1$p.theta.recover.samples)$quantiles[,c(3,1,5)],2) round(summary(m.1$p.beta.recover.samples)$quantiles[,c(3,1,5)],2) ##predict for all locations, i.e., observed and not observed out <- spPredict(m.1, start=burn.in, thin=10, pred.covars=list(x.a, x.b, x.c), pred.coords=list(coords, coords, coords)) ##summary and check quants <- function(x){quantile(x, prob=c(0.5,0.025,0.975))} y.hat <- apply(out$p.y.predictive.samples, 1, quants) ##unstack and plot y.a.hat <- y.hat[,1:n] y.b.hat <- y.hat[,(n+1):(2*n)] y.c.hat <- y.hat[,(2*n+1):(3*n)] par(mfrow=c(1,3)) plot(y.a, y.a.hat[1,], xlab="Observed y.a", ylab="Fitted & predicted y.a", xlim=range(y), ylim=range(y.hat), main="") arrows(y.a[-sub.1], y.a.hat[1,-sub.1], y.a[-sub.1], y.a.hat[2,-sub.1], length=0.02, angle=90) arrows(y.a[-sub.1], y.a.hat[1,-sub.1], y.a[-sub.1], y.a.hat[3,-sub.1], length=0.02, angle=90) lines(range(y.a), range(y.a)) plot(y.b, y.b.hat[1,], xlab="Observed y.b", ylab="Fitted & predicted y.b", xlim=range(y), ylim=range(y.hat), main="") arrows(y.b[-sub.2], y.b.hat[1,-sub.2], y.b[-sub.2], y.b.hat[2,-sub.2], length=0.02, angle=90) arrows(y.b[-sub.2], 
y.b.hat[1,-sub.2], y.b[-sub.2], y.b.hat[3,-sub.2], length=0.02, angle=90) lines(range(y.b), range(y.b)) plot(y.c, y.c.hat[1,], xlab="Observed y.c", ylab="Fitted & predicted y.c", xlim=range(y), ylim=range(y.hat), main="") arrows(y.c[-sub.3], y.c.hat[1,-sub.3], y.c[-sub.3], y.c.hat[2,-sub.3], length=0.02, angle=90) arrows(y.c[-sub.3], y.c.hat[1,-sub.3], y.c[-sub.3], y.c.hat[3,-sub.3], length=0.02, angle=90) lines(range(y.c), range(y.c)) ## End(Not run)
The function spMvGLM
fits multivariate Bayesian
generalized linear spatial regression models. Given a set of knots,
spMvGLM
will also fit a predictive process model (see references below).
spMvGLM(formula, family="binomial", weights, data = parent.frame(), coords, knots, starting, tuning, priors, cov.model, amcmc, n.samples, verbose=TRUE, n.report=100, ...)
formula |
a list of |
family |
currently only supports |
weights |
an optional |
data |
an optional data frame containing the variables in the
model. If not found in |
coords |
an |
knots |
either a |
starting |
a list with each tag corresponding to a parameter name. Valid tags are |
tuning |
a list with tags |
priors |
a list with each tag corresponding to a
parameter name. Valid tags are |
cov.model |
a quoted keyword that specifies the covariance
function used to model the spatial dependence structure among the
observations. Supported covariance model key words are:
|
amcmc |
a list with tags |
n.samples |
the number of MCMC iterations. This argument is
ignored if |
verbose |
if |
n.report |
the interval to report Metropolis sampler acceptance and MCMC progress. |
... |
currently no additional arguments. |
If a binomial
model is specified, the response vector is the
number of successful trials at each location and weights
is the
total number of trials at each location.
For a poisson
specification, the weights
vector is the
count offset, e.g., population, at each location. This differs from
the glm
offset
argument which is passed as the
log of this value.
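For example, a sketch (the trial counts are hypothetical; the other objects follow the example below) where q=2 binomial outcomes share an n x q matrix of weights, column j giving the number of trials for outcome j:

## Not run: 
trials <- matrix(10, nrow=n, ncol=q)

m.bin <- spMvGLM(list(y.1~x.1-1, y.2~x.2-1), family="binomial",
                 coords=coords, weights=trials,
                 starting=starting, tuning=tuning, priors=priors,
                 amcmc=list("n.batch"=n.batch, "batch.length"=batch.length,
                            "accept.rate"=0.43),
                 cov.model="exponential", n.report=25)
## End(Not run)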
A non-spatial model is fit when coords
is not specified. See
example below.
An object of class spMvGLM
, which is a list with the following
tags:
coords |
the |
knot.coords |
the |
p.beta.theta.samples |
a |
acceptance |
the Metropolis sampler
acceptance rate. If |
acceptance.w |
if this is a non-predictive process model and
|
acceptance.w.knots |
if this is a predictive process model and |
p.w.knots.samples |
a matrix that holds samples from the posterior
distribution of the knots' spatial random effects. The rows of this matrix
correspond to the |
p.w.samples |
a matrix that holds samples from the posterior
distribution of the locations' spatial random effects. The rows of this matrix
correspond to the |
The return object might include additional data used for subsequent prediction and/or model fit evaluation.
Andrew O. Finley [email protected],
Sudipto Banerjee [email protected]
Finley, A.O., S. Banerjee, and R.E. McRoberts. (2008) A Bayesian approach to quantifying uncertainty in multi-source forest area estimates. Environmental and Ecological Statistics, 15:241–258.
Banerjee, S., A.E. Gelfand, A.O. Finley, and H. Sang. (2008) Gaussian Predictive Process Models for Large Spatial Datasets. Journal of the Royal Statistical Society Series B, 70:825–848.
Finley, A.O., H. Sang, S. Banerjee, and A.E. Gelfand. (2009) Improving the performance of predictive process modeling for large datasets. Computational Statistics and Data Analysis, 53:2873-2884.
Finley, A.O., S. Banerjee, and A.E. Gelfand. (2015) spBayes for large univariate and multivariate point-referenced spatio-temporal data models. Journal of Statistical Software, 63:1–28. https://www.jstatsoft.org/article/view/v063i13.
Banerjee, S., Carlin, B.P., and Gelfand, A.E. (2004). Hierarchical modeling and analysis for spatial data. Chapman and Hall/CRC Press, Boca Raton, Fla.
Roberts G.O. and Rosenthal J.S. (2006) Examples of Adaptive MCMC. http://probability.ca/jeff/ftpdir/adaptex.pdf Preprint.
## Not run: library(MBA) ##Some useful functions rmvn <- function(n, mu=0, V = matrix(1)){ p <- length(mu) if(any(is.na(match(dim(V),p)))){stop("Dimension problem!")} D <- chol(V) t(matrix(rnorm(n*p), ncol=p)%*%D + rep(mu,rep(n,p))) } set.seed(1) ##Generate some data n <- 25 ##number of locations q <- 2 ##number of outcomes at each location nltr <- q*(q+1)/2 ##number of triangular elements in the cross-covariance matrix coords <- cbind(runif(n,0,1), runif(n,0,1)) ##Parameters for the bivariate spatial random effects theta <- rep(3/0.5,q) A <- matrix(0,q,q) A[lower.tri(A,TRUE)] <- c(1,-1,0.25) K <- A%*%t(A) Psi <- diag(0,q) C <- mkSpCov(coords, K, Psi, theta, cov.model="exponential") w <- rmvn(1, rep(0,nrow(C)), C) w.1 <- w[seq(1,length(w),q)] w.2 <- w[seq(2,length(w),q)] ##Covariate portion of the mean x.1 <- cbind(1, rnorm(n)) x.2 <- cbind(1, rnorm(n)) x <- mkMvX(list(x.1, x.2)) B.1 <- c(1,-1) B.2 <- c(-1,1) B <- c(B.1, B.2) weight <- 10 ##i.e., trials p <- 1/(1+exp(-(x%*%B+w))) y <- rbinom(n*q, size=rep(weight,n*q), prob=p) y.1 <- y[seq(1,length(y),q)] y.2 <- y[seq(2,length(y),q)] ##Call spMvLM fit <- glm((y/weight)~x-1, weights=rep(weight, n*q), family="binomial") beta.starting <- coefficients(fit) beta.tuning <- t(chol(vcov(fit))) A.starting <- diag(1,q)[lower.tri(diag(1,q), TRUE)] n.batch <- 100 batch.length <- 50 n.samples <- n.batch*batch.length starting <- list("beta"=beta.starting, "phi"=rep(3/0.5,q), "A"=A.starting, "w"=0) tuning <- list("beta"=beta.tuning, "phi"=rep(1,q), "A"=rep(0.1,length(A.starting)), "w"=0.5) priors <- list("beta.Flat", "phi.Unif"=list(rep(3/0.75,q), rep(3/0.25,q)), "K.IW"=list(q+1, diag(0.1,q))) m.1 <- spMvGLM(list(y.1~x.1-1, y.2~x.2-1), coords=coords, weights=matrix(weight,n,q), starting=starting, tuning=tuning, priors=priors, amcmc=list("n.batch"=n.batch,"batch.length"=batch.length,"accept.rate"=0.43), cov.model="exponential", n.report=25) burn.in <- 0.75*n.samples sub.samps <- burn.in:n.samples print(summary(window(m.1$p.beta.theta.samples, start=burn.in))$quantiles[,c(3,1,5)]) beta.hat <- t(m.1$p.beta.theta.samples[sub.samps,1:length(B)]) w.hat <- m.1$p.w.samples[,sub.samps] p.hat <- 1/(1+exp(-(x%*%beta.hat+w.hat))) y.hat <- apply(p.hat, 2, function(x){rbinom(n*q, size=rep(weight, n*q), prob=p)}) y.hat.mu <- apply(y.hat, 1, mean) ##Unstack to get each response variable fitted values y.hat.mu.1 <- y.hat.mu[seq(1,length(y.hat.mu),q)] y.hat.mu.2 <- y.hat.mu[seq(2,length(y.hat.mu),q)] ##Take a look par(mfrow=c(2,2)) surf <- mba.surf(cbind(coords,y.1),no.X=100, no.Y=100, extend=TRUE)$xyz.est image(surf, main="Observed y.1 positive trials") contour(surf, add=TRUE) points(coords) zlim <- range(surf[["z"]], na.rm=TRUE) surf <- mba.surf(cbind(coords,y.hat.mu.1),no.X=100, no.Y=100, extend=TRUE)$xyz.est image(surf, zlim=zlim, main="Fitted y.1 positive trials") contour(surf, add=TRUE) points(coords) surf <- mba.surf(cbind(coords,y.2),no.X=100, no.Y=100, extend=TRUE)$xyz.est image(surf, main="Observed y.2 positive trials") contour(surf, add=TRUE) points(coords) zlim <- range(surf[["z"]], na.rm=TRUE) surf <- mba.surf(cbind(coords,y.hat.mu.2),no.X=100, no.Y=100, extend=TRUE)$xyz.est image(surf, zlim=zlim, main="Fitted y.2 positive trials") contour(surf, add=TRUE) points(coords) ## End(Not run)
The function spMvLM
fits Gaussian multivariate Bayesian
spatial regression models. Given a set of knots, spMvLM
will
also fit a predictive process model (see references below).
spMvLM(formula, data = parent.frame(), coords, knots, starting, tuning, priors, cov.model, modified.pp = TRUE, amcmc, n.samples, verbose=TRUE, n.report=100, ...)
formula |
a list of |
data |
an optional data frame containing the variables in the
model. If not found in |
coords |
an |
knots |
either a |
starting |
a list with tags corresponding to The value portion of each tag is a vector
that holds the parameter's starting values and are of length
|
tuning |
a list with tags |
priors |
a list with tags |
cov.model |
a quoted keyword that specifies the covariance
function used to model the spatial dependence structure among the
observations. Supported covariance model key words are:
|
modified.pp |
a logical value indicating if the modified
predictive process should be used (see references below for
details). Note, if a predictive process model is not used (i.e., |
amcmc |
a list with tags |
n.samples |
the number of MCMC iterations. This argument is
ignored if |
verbose |
if |
n.report |
the interval to report Metropolis acceptance and MCMC progress. |
... |
currently no additional arguments. |
Model parameters can be fixed at their starting
values by setting their
tuning
values to zero.
The no nugget model is specified by removing Psi
and L
from the starting
list.
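As a sketch (reusing the objects from the example below), the no nugget specification simply drops Psi (and L) from the starting, tuning, and priors lists:

## Not run: 
m.nn <- spMvLM(list(y.1~x.1-1, y.2~x.2-1), coords=coords,
               starting=list("phi"=rep(3/0.5,q), "A"=A.starting),
               tuning=list("phi"=rep(1,q), "A"=rep(0.01,length(A.starting))),
               priors=list("beta.Flat", "phi.Unif"=list(rep(3/0.75,q), rep(3/0.25,q)),
                           "K.IW"=list(q+1, diag(0.1,q))),
               n.samples=1000, cov.model="exponential", n.report=100)
## End(Not run)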
An object of class spMvLM
, which is a list with the following
tags:
coords |
the |
knot.coords |
the |
p.theta.samples |
a |
acceptance |
the Metropolis sampling
acceptance percent. Reported at |
The return object might include additional data used for subsequent prediction and/or model fit evaluation.
Andrew O. Finley [email protected],
Sudipto Banerjee [email protected]
Banerjee, S., A.E. Gelfand, A.O. Finley, and H. Sang. (2008) Gaussian Predictive Process Models for Large Spatial Datasets. Journal of the Royal Statistical Society Series B, 70:825–848.
Banerjee, S., Carlin, B.P., and Gelfand, A.E. (2004). Hierarchical modeling and analysis for spatial data. Chapman and Hall/CRC Press, Boca Raton, Fla.
Finley, A.O., S. Banerjee, and A.E. Gelfand. (2015) spBayes for large univariate and multivariate point-referenced spatio-temporal data models. Journal of Statistical Software, 63:1–28. https://www.jstatsoft.org/article/view/v063i13.
Finley, A.O., H. Sang, S. Banerjee, and A.E. Gelfand. (2009) Improving the performance of predictive process modeling for large datasets. Computational Statistics and Data Analysis, 53:2873–2884.
Finley, A.O., S. Banerjee, A.R. Ek, and R.E. McRoberts. (2008) Bayesian multivariate process modeling for prediction of forest attributes. Journal of Agricultural, Biological, and Environmental Statistics, 13:60–83.
## Not run: rmvn <- function(n, mu=0, V = matrix(1)){ p <- length(mu) if(any(is.na(match(dim(V),p)))){stop("Dimension problem!")} D <- chol(V) t(matrix(rnorm(n*p), ncol=p)%*%D + rep(mu,rep(n,p))) } set.seed(1) ##Generate some data n <- 25 ##number of locations q <- 2 ##number of outcomes at each location nltr <- q*(q+1)/2 ##number of triangular elements in the cross-covariance matrix coords <- cbind(runif(n,0,1), runif(n,0,1)) ##Parameters for the bivariate spatial random effects theta <- rep(3/0.5,q) A <- matrix(0,q,q) A[lower.tri(A,TRUE)] <- c(1,-1,0.25) K <- A%*%t(A) Psi <- diag(0,q) C <- mkSpCov(coords, K, Psi, theta, cov.model="exponential") w <- rmvn(1, rep(0,nrow(C)), C) w.1 <- w[seq(1,length(w),q)] w.2 <- w[seq(2,length(w),q)] ##Covariate portion of the mean x.1 <- cbind(1, rnorm(n)) x.2 <- cbind(1, rnorm(n)) x <- mkMvX(list(x.1, x.2)) B.1 <- c(1,-1) B.2 <- c(-1,1) B <- c(B.1, B.2) Psi <- diag(c(0.1, 0.5)) y <- rnorm(n*q, x%*%B+w, diag(n)%x%Psi) y.1 <- y[seq(1,length(y),q)] y.2 <- y[seq(2,length(y),q)] ##Call spMvLM A.starting <- diag(1,q)[lower.tri(diag(1,q), TRUE)] n.samples <- 1000 starting <- list("phi"=rep(3/0.5,q), "A"=A.starting, "Psi"=rep(1,q)) tuning <- list("phi"=rep(1,q), "A"=rep(0.01,length(A.starting)), "Psi"=rep(0.01,q)) priors <- list("beta.Flat", "phi.Unif"=list(rep(3/0.75,q), rep(3/0.25,q)), "K.IW"=list(q+1, diag(0.1,q)), "Psi.ig"=list(c(2,2), c(0.1,0.1))) m.1 <- spMvLM(list(y.1~x.1-1, y.2~x.2-1), coords=coords, starting=starting, tuning=tuning, priors=priors, n.samples=n.samples, cov.model="exponential", n.report=100) burn.in <- 0.75*n.samples m.1 <- spRecover(m.1, start=burn.in) round(summary(m.1$p.theta.recover.samples)$quantiles[,c(3,1,5)],2) round(summary(m.1$p.beta.recover.samples)$quantiles[,c(3,1,5)],2) m.1.w.hat <- summary(mcmc(t(m.1$p.w.recover.samples)))$quantiles[,c(3,1,5)] m.1.w.1.hat <- m.1.w.hat[seq(1, nrow(m.1.w.hat), q),] m.1.w.2.hat <- m.1.w.hat[seq(2, nrow(m.1.w.hat), q),] par(mfrow=c(1,2)) plot(w.1, m.1.w.1.hat[,1], xlab="Observed w.1", ylab="Fitted w.1", xlim=range(w), ylim=range(m.1.w.hat), main="Spatial random effects w.1") arrows(w.1, m.1.w.1.hat[,1], w.1, m.1.w.1.hat[,2], length=0.02, angle=90) arrows(w.1, m.1.w.1.hat[,1], w.1, m.1.w.1.hat[,3], length=0.02, angle=90) lines(range(w), range(w)) plot(w.2, m.1.w.2.hat[,1], xlab="Observed w.2", ylab="Fitted w.2", xlim=range(w), ylim=range(m.1.w.hat), main="Spatial random effects w.2") arrows(w.2, m.1.w.2.hat[,1], w.2, m.1.w.2.hat[,2], length=0.02, angle=90) arrows(w.2, m.1.w.2.hat[,1], w.2, m.1.w.2.hat[,3], length=0.02, angle=90) lines(range(w), range(w)) ## End(Not run)
The function spPredict collects posterior predictive samples for a set of new locations given a spLM, spMvLM, spGLM, spMvGLM, spMisalignLM, spMisalignGLM, bayesGeostatExact, bayesLMConjugate, bayesLMRef, or spSVC object.
spPredict(sp.obj, pred.coords, pred.covars, joint=FALSE, start=1, end, thin=1, verbose=TRUE, n.report=100, n.omp.threads=1, ...)
sp.obj |
an object returned by |
pred.coords |
for |
pred.covars |
for |
joint |
specifies whether posterior samples should be drawn
from the joint or point-wise predictive distribution. This argument is only
implemented for |
start |
specifies the first sample included in the composition sampling. |
end |
specifies the last sample included in the composition.
The default is to use all posterior samples in |
thin |
a sample thinning factor. The default of 1 considers all
samples between |
verbose |
if |
n.report |
the interval to report sampling progress. |
n.omp.threads |
a positive integer indicating
the number of threads to use for SMP parallel processing. The package must
be compiled for OpenMP support. For most Intel-based machines, we
recommend setting |
... |
currently no additional arguments. |
p.y.predictive.samples |
a matrix that holds the response variable(s) posterior predictive samples, with rows corresponding to the prediction locations and columns to the MCMC samples. For multivariate models the rows are stacked by outcome within prediction location. |
p.w.predictive.samples |
a matrix organized the same as p.y.predictive.samples, holding the spatial random effects posterior predictive samples. |
p.w.predictive.samples.list |
only returned for
|
p.tilde.beta.predictive.samples.list |
only returned for
|
center.scale.pred.covars |
only returned for the
|
Andrew O. Finley [email protected],
Sudipto Banerjee [email protected]
Banerjee, S., Carlin, B.P., and Gelfand, A.E. (2004). Hierarchical modeling and analysis for spatial data. Chapman and Hall/CRC Press, Boca Raton, FL.
Finley, A.O., S. Banerjee, and A.E. Gelfand. (2015) spBayes for large univariate and multivariate point-referenced spatio-temporal data models. Journal of Statistical Software, 63:1–28. https://www.jstatsoft.org/article/view/v063i13.
Finley, A.O. and S. Banerjee (2019) Bayesian spatially varying coefficient models in the spBayes R package. https://arxiv.org/abs/1903.03028.
## Not run: rmvn <- function(n, mu=0, V = matrix(1)){ p <- length(mu) if(any(is.na(match(dim(V),p)))) stop("Dimension problem!") D <- chol(V) t(matrix(rnorm(n*p), ncol=p)%*%D + rep(mu,rep(n,p))) } set.seed(1) n <- 200 coords <- cbind(runif(n,0,1), runif(n,0,1)) X <- as.matrix(cbind(1, rnorm(n))) B <- as.matrix(c(1,5)) p <- length(B) sigma.sq <- 10 tau.sq <- 0.01 phi <- 3/0.5 D <- as.matrix(dist(coords)) R <- exp(-phi*D) w <- rmvn(1, rep(0,n), sigma.sq*R) y <- rnorm(n, X%*%B + w, sqrt(tau.sq)) ##partition the data for out of sample prediction mod <- 1:100 y.mod <- y[mod] X.mod <- X[mod,] coords.mod <- coords[mod,] n.samples <- 1000 starting <- list("phi"=3/0.5, "sigma.sq"=50, "tau.sq"=1) tuning <- list("phi"=0.1, "sigma.sq"=0.1, "tau.sq"=0.1) priors <- list("beta.Flat", "phi.Unif"=c(3/1, 3/0.1), "sigma.sq.IG"=c(2, 5), "tau.sq.IG"=c(2, 0.01)) cov.model <- "exponential" m.1 <- spLM(y.mod~X.mod-1, coords=coords.mod, starting=starting, tuning=tuning, priors=priors, cov.model=cov.model, n.samples=n.samples) m.1.pred <- spPredict(m.1, pred.covars=X, pred.coords=coords, start=0.5*n.samples) y.hat <- apply(m.1.pred$p.y.predictive.samples, 1, mean) quant <- function(x){quantile(x, prob=c(0.025, 0.5, 0.975))} y.hat <- apply(m.1.pred$p.y.predictive.samples, 1, quant) plot(y, y.hat[2,], pch=19, cex=0.5, xlab="observed y", ylab="predicted y") arrows(y[-mod], y.hat[2,-mod], y[-mod], y.hat[1,-mod], angle=90, length=0.05) arrows(y[-mod], y.hat[2,-mod], y[-mod], y.hat[3,-mod], angle=90, length=0.05) ## End(Not run)
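As a small follow-up to the example above, the 95% predictive intervals can be checked against the hold-out observations; this sketch uses only objects created in the example.

## Not run: 
##empirical coverage of the 95% predictive intervals at the 100 hold-out locations
y.ho <- y[-mod]
mean(y.ho >= y.hat[1,-mod] & y.ho <= y.hat[3,-mod])
## End(Not run)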
Function for recovering regression coefficients and spatial random effects for spLM, spMvLM, spMisalignLM, and spSVC objects using composition sampling.
spRecover(sp.obj, get.beta=TRUE, get.w=TRUE, start=1, end, thin=1, verbose=TRUE, n.report=100, n.omp.threads=1, ...)
sp.obj |
an object returned by |
get.beta |
if |
get.w |
if |
start |
specifies the first sample included in the composition sampling. |
end |
specifies the last sample included in the composition.
The default is to use all posterior samples in |
thin |
a sample thinning factor. The default of 1 considers all
samples between |
verbose |
if |
n.report |
the interval to report sampling progress. |
n.omp.threads |
a positive integer indicating
the number of threads to use for SMP parallel processing. The package must
be compiled for OpenMP support. For most Intel-based machines, we
recommend setting |
... |
currently no additional arguments. |
The input sp.obj with posterior samples of regression coefficients and/or spatial random effects appended, including the following tags:
p.theta.recover.samples |
those |
p.beta.recover.samples |
a |
p.w.recover.samples |
a matrix holding posterior samples of the spatial random effects. For multivariate and space-varying coefficient models, rows are stacked by outcome or coefficient within location. |
p.w.recover.samples.list |
only returned for
|
p.tilde.beta.recover.samples.list |
only returned for
|
p.y.samples |
only returned for
|
Andrew O. Finley [email protected],
Sudipto Banerjee [email protected]
Banerjee, S., Carlin, B.P., and Gelfand, A.E. (2004). Hierarchical modeling and analysis for spatial data. Chapman and Hall/CRC Press, Boca Raton, FL.
Finley, A.O., S. Banerjee, and A.E. Gelfand. (2015) spBayes for large univariate and multivariate point-referenced spatio-temporal data models. Journal of Statistical Software, 63:1–28. https://www.jstatsoft.org/article/view/v063i13.
Finley, A.O. and S. Banerjee (2019) Bayesian spatially varying coefficient models in the spBayes R package. https://arxiv.org/abs/1903.03028.
## Not run: rmvn <- function(n, mu=0, V = matrix(1)){ p <- length(mu) if(any(is.na(match(dim(V),p)))) stop("Dimension problem!") D <- chol(V) t(matrix(rnorm(n*p), ncol=p)%*%D + rep(mu,rep(n,p))) } set.seed(1) n <- 50 coords <- cbind(runif(n,0,1), runif(n,0,1)) X <- as.matrix(cbind(1, rnorm(n))) B <- as.matrix(c(1,5)) p <- length(B) sigma.sq <- 10 tau.sq <- 0.01 phi <- 3/0.5 D <- as.matrix(dist(coords)) R <- exp(-phi*D) w <- rmvn(1, rep(0,n), sigma.sq*R) y <- rnorm(n, X%*%B + w, sqrt(tau.sq)) n.samples <- 1000 starting <- list("phi"=3/0.5, "sigma.sq"=50, "tau.sq"=1) tuning <- list("phi"=0.1, "sigma.sq"=0.1, "tau.sq"=0.1) priors <- list("beta.Flat", "phi.Unif"=c(3/1, 3/0.1), "sigma.sq.IG"=c(2, 5), "tau.sq.IG"=c(2, 0.01)) cov.model <- "exponential" m.1 <- spLM(y~X-1, coords=coords, starting=starting, tuning=tuning, priors=priors, cov.model=cov.model, n.samples=n.samples) m.1 <- spRecover(m.1, start=0.5*n.samples, thin=2) summary(window(m.1$p.beta.recover.samples)) w.hat <- apply(m.1$p.w.recover.samples, 1, mean) plot(w, w.hat, xlab="Observed w", ylab="Fitted w") ## End(Not run)
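A short follow-up to the example above: mixing of the recovered regression coefficients and coverage of the spatial random effects can be checked with the coda package (loaded with spBayes). Only objects created in the example are used.

## Not run: 
library(coda)

##effective sample sizes for the recovered regression coefficients
effectiveSize(m.1$p.beta.recover.samples)

##empirical coverage of the 95% credible intervals for the simulated w
w.ci <- apply(m.1$p.w.recover.samples, 1, quantile, prob=c(0.025, 0.975))
mean(w >= w.ci[1,] & w <= w.ci[2,])
## End(Not run)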
The function spSVC fits Gaussian univariate Bayesian spatially-varying coefficient regression models.
spSVC(formula, data = parent.frame(), svc.cols=1, coords, priors, starting, tuning, cov.model, center.scale=FALSE, amcmc, n.samples, n.omp.threads = 1, verbose=TRUE, n.report=100, ...)
formula |
a symbolic description of the regression model to be fit. See example below. |
data |
an optional data frame containing the variables in the
model. If not found in data, the variables are taken from
|
svc.cols |
a vector indicating which columns of the regression
design matrix |
coords |
an |
priors |
a list with each tag corresponding to a parameter name. There are two specifications for the Gaussian Process (GP) on the space-varying coefficients: independent GPs for each space-varying coefficient, or a single multivariate GP with a cross-covariance matrix; the valid prior tags depend on which specification is used. |
starting |
a list with each tag corresponding to a
parameter name. Valid tags are |
tuning |
a list with each tag corresponding to a
parameter name. Valid tags are |
cov.model |
a quoted keyword that specifies the covariance
function used to model the spatial dependence structure among the
observations. Supported covariance model key words are:
|
center.scale |
if |
amcmc |
a list with tags |
n.samples |
the number of MCMC iterations. This argument is
ignored if |
n.omp.threads |
a positive integer indicating
the number of threads to use for SMP parallel processing. The package must
be compiled for OpenMP support. For most Intel-based machines, we
recommend setting |
verbose |
if |
n.report |
the interval to report Metropolis sampler acceptance and MCMC progress. |
... |
currently no additional arguments. |
Model parameters can be fixed at their starting values by setting their tuning values to zero. The no nugget model is specified by removing tau.sq from the starting list.
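The sketch below illustrates these two points together with the amcmc alternative to n.samples. It is not run against data; the amcmc tag names (n.batch, batch.length, accept.rate) are assumed to mirror the adaptive Metropolis setup used elsewhere in the package.

## Not run: 
p <- 2 ##two space-varying coefficients

##no-nugget model: tau.sq is omitted from starting; phi is fixed at its
##starting value by a zero tuning value
starting <- list("phi"=rep(3/0.5, p), "sigma.sq"=rep(1, p))
tuning <- list("phi"=rep(0, p), "sigma.sq"=rep(0.1, p))

##adaptive MCMC specification used in place of n.samples (tag names assumed)
amcmc <- list("n.batch"=200, "batch.length"=25, "accept.rate"=0.43)
## End(Not run)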
An object of class spSVC, which is a list comprising:
coords |
the |
p.theta.samples |
a |
acceptance |
the Metropolis sampling
acceptance percent. Reported at |
The return object will include additional objects used for subsequent parameter recovery, prediction, and model fit evaluation using spRecover, spPredict, and spDiag, respectively.
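A minimal post-fitting sketch, assuming m.1 is a fitted spSVC object and n.samples its number of MCMC iterations, as in the example below; calling spDiag with its default arguments is also an assumption.

## Not run: 
m.1 <- spRecover(m.1, start=floor(0.75*n.samples))
spDiag(m.1)
## End(Not run)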
Andrew O. Finley [email protected],
Sudipto Banerjee [email protected]
Finley, A.O., S. Banerjee, and A.E. Gelfand. (2015) spBayes for large univariate and multivariate point-referenced spatio-temporal data models. Journal of Statistical Software, 63:1–28. https://www.jstatsoft.org/article/view/v063i13.
Roberts G.O. and Rosenthal J.S. (2006). Examples of Adaptive MCMC. http://probability.ca/jeff/ftpdir/adaptex.pdf.
Finley, A.O. and S. Banerjee (2019) Bayesian spatially varying coefficient models in the spBayes R package. https://arxiv.org/abs/1903.03028.
## Not run: library(Matrix) rmvn <- function(n, mu=0, V = matrix(1)){ p <- length(mu) if(any(is.na(match(dim(V),p)))) stop("Dimension problem!") D <- chol(V) t(matrix(rnorm(n*p), ncol=p)%*%D + rep(mu,rep(n,p))) } ##Assume both columns of X are space-varying and the two GPs don't covary set.seed(1) n <- 200 coords <- cbind(runif(n,0,1), runif(n,0,1)) X <- as.matrix(cbind(1, rnorm(n))) colnames(X) <- c("x.1", "x.2") Z <- t(bdiag(as.list(as.data.frame(t(X))))) B <- as.matrix(c(1,5)) p <- length(B) sigma.sq <- c(1,5) tau.sq <- 1 phi <- 3/0.5 D <- as.matrix(dist(coords)) C <- exp(-phi*D)%x%diag(sigma.sq) w <- rmvn(1, rep(0,p*n), C) mu <- as.vector(X%*%B + Z%*%w) y <- rnorm(n, mu, sqrt(tau.sq)) ##fit a model to the simulated dat starting <- list("phi"=rep(3/0.5, p), "sigma.sq"=rep(1, p), "tau.sq"=1) tuning <- list("phi"=rep(0.1, p), "sigma.sq"=rep(0.1, p), "tau.sq"=0.1) cov.model <- "exponential" priors <- list("phi.Unif"=list(rep(3/2, p), rep(3/0.0001, p)), "sigma.sq.IG"=list(rep(2, p), rep(2, p)), "tau.sq.IG"=c(2, 1)) ##fit model n.samples <- 2000 m.1 <- spSVC(y~X-1, coords=coords, starting=starting, svc.cols=c(1,2), tuning=tuning, priors=priors, cov.model=cov.model, n.samples=n.samples, n.omp.threads=4) plot(m.1$p.theta.samples, density=FALSE) ##recover posterior samples m.1 <- spRecover(m.1, start=floor(0.75*n.samples), thin=2, n.omp.threads=4) summary(m.1$p.beta.recover.samples) summary(m.1$p.theta.recover.samples) ##check fitted values quant <- function(x){quantile(x, prob=c(0.025, 0.5, 0.975))} ##fitted y y.hat <- apply(m.1$p.y.samples, 1, quant) rng <- c(-15, 20) plot(y, y.hat[2,], pch=19, cex=0.5, xlab="Fitted y", ylab="Observed y", xlim=rng, ylim=rng) arrows(y, y.hat[2,], y, y.hat[1,], angle=90, length=0.05) arrows(y, y.hat[2,], y, y.hat[3,], angle=90, length=0.05) lines(rng, rng, col="blue") ##recovered w w.hat <- apply(m.1$p.w.recover.samples, 1, quant) w.1.indx <- seq(1, p*n, p) w.2.indx <- seq(2, p*n, p) par(mfrow=c(1,2)) rng <- c(-5,5) plot(w[w.1.indx], w.hat[2,w.1.indx], pch=19, cex=0.5, xlab="Fitted w.1", ylab="Observed w.1", xlim=rng, ylim=rng) arrows(w[w.1.indx], w.hat[2,w.1.indx], w[w.1.indx], w.hat[1,w.1.indx], angle=90, length=0.05) arrows(w[w.1.indx], w.hat[2,w.1.indx], w[w.1.indx], w.hat[3,w.1.indx], angle=90, length=0.05) lines(rng, rng, col="blue") rng <- c(-10,10) plot(w[w.2.indx], w.hat[2,w.2.indx], pch=19, cex=0.5, xlab="Fitted w.2", ylab="Observed w.2", xlim=rng, ylim=rng) arrows(w[w.2.indx], w.hat[2,w.2.indx], w[w.2.indx], w.hat[1,w.2.indx], angle=90, length=0.05) arrows(w[w.2.indx], w.hat[2,w.2.indx], w[w.2.indx], w.hat[3,w.2.indx], angle=90, length=0.05) lines(rng, rng, col="blue") ## End(Not run)
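Prediction from the spSVC fit above can be carried out with spPredict. The following is a hedged sketch: pred.coords and pred.X are invented prediction inputs, and access to the predicted space-varying coefficients via p.tilde.beta.predictive.samples.list follows the Value section of spPredict.

## Not run: 
pred.coords <- cbind(runif(10, 0, 1), runif(10, 0, 1))
pred.X <- cbind(1, rnorm(10))
colnames(pred.X) <- c("x.1", "x.2")

m.1.pred <- spPredict(m.1, pred.coords=pred.coords, pred.covars=pred.X,
                      n.omp.threads=4)

##predicted space-varying coefficients are returned as a list for spSVC objects
names(m.1.pred$p.tilde.beta.predictive.samples.list)
## End(Not run)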
Data simulated from a space-varying coefficients model.
data(SVCMvData.dat)
The data frame generated from the code in the example section below.
## Not run: 
##The dataset was generated with the code below.
library(Matrix)

rmvn <- function(n, mu=0, V = matrix(1)){
  p <- length(mu)
  if(any(is.na(match(dim(V),p)))) stop("Dimension problem!")
  D <- chol(V)
  t(matrix(rnorm(n*p), ncol=p)%*%D + rep(mu,rep(n,p)))
}

set.seed(1)

n <- 200
coords <- cbind(runif(n,0,1), runif(n,0,1))
colnames(coords) <- c("x.coords","y.coords")

X <- as.matrix(cbind(1, rnorm(n), rnorm(n)))
colnames(X) <- c("intercept","a","b")

Z <- t(bdiag(as.list(as.data.frame(t(X)))))

beta <- c(1, 10, -10)
p <- length(beta)

q <- 3
A <- matrix(0, q, q)
A[lower.tri(A, T)] <- c(1, -1, 0, 1, 1, 0.1)
##cross-covariance of the coefficient processes built from its lower-triangular factor
K <- A%*%t(A)
K
cov2cor(K)

phi <- c(3/0.75, 3/0.5, 3/0.5)

Psi <- diag(0,q)
C <- mkSpCov(coords, K, Psi, phi, cov.model="exponential")

tau.sq <- 0.1

w <- rmvn(1, rep(0,q*n), C)
y <- rnorm(n, as.vector(X%*%beta + Z%*%w), sqrt(tau.sq))

w.0 <- w[seq(1, length(w), by=q)]
w.a <- w[seq(2, length(w), by=q)]
w.b <- w[seq(3, length(w), by=q)]

SVCMvData <- data.frame(cbind(coords, y, X[,2:3], w.0, w.a, w.b))

## End(Not run)
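A quick look at the packaged data; the name of the loaded object (SVCMvData.dat) and its column names are assumptions taken from the usage line and the generating code above.

## Not run: 
data(SVCMvData.dat)
str(SVCMvData.dat)
head(SVCMvData.dat)
## End(Not run)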
Data generated as part of a long-term research study on an experimental forest in central Oregon. This dataset holds the coordinates for all trees in the experimental forest. The typical stem measurements are recorded for each tree. Crown radius was measured at the cardinal directions for a subset of trees. Mean crown radius was calculated for all trees using a simple relationship between DBH, Height, and observed crown dimension.
data(WEF.dat)
A data frame containing 2422 rows and 15 columns.
Inventory data of the Zurichberg Forest, Switzerland (see Mandallaz 2008 for details). These data are provided with the kind authorization of the Forest Service of the Canton of Zurich.
This dataset holds the coordinates for all trees in the Zurichberg Forest. Species (SPP), basal area (BAREA), diameter at breast height (DBH), and volume (VOL) are recorded for each tree. See species codes below.
data(Zurich.dat)
A data frame containing 4954 rows and 6 columns.
## Not run: data(Zurich.dat) coords <- Zurich.dat[,c("X_TREE", "Y_TREE")] spp.name <- c("beech","maple","ash","other broadleaves", "spruce","silver fir", "larch", "other coniferous") spp.col <- c("yellow","red","orange","pink", "green","dark green","black","gray") plot(coords, col=spp.col[Zurich.dat$SPP+1], pch=19, cex=0.5, ylab="Northing", xlab="Easting") legend.coords <- c(23,240) legend(legend.coords, pch=19, legend=spp.name, col=spp.col, bty="n") ## End(Not run)
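A short exploratory sketch, assuming the column names given in the description (SPP, DBH, BAREA, VOL).

## Not run: 
data(Zurich.dat)

##tree counts and mean stem measurements by species code
table(Zurich.dat$SPP)
aggregate(cbind(DBH, BAREA, VOL) ~ SPP, data=Zurich.dat, FUN=mean)
## End(Not run)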