Title: Heteroskedastic Gaussian Process Modeling and Design under Replication
Description: Performs Gaussian process regression with heteroskedastic noise following the model by Binois, M., Gramacy, R., Ludkovski, M. (2016) <doi:10.48550/arXiv.1611.05902>, with implementation details in Binois, M. & Gramacy, R. B. (2021) <doi:10.18637/jss.v098.i13>. The input-dependent noise is modeled as another Gaussian process. Replicated observations are encouraged as they yield computational savings. Sequential design procedures based on the integrated mean square prediction error and lookahead heuristics are provided, along with notably fast update functions for adding new observations.
Authors: Mickael Binois [aut, cre], Robert B. Gramacy [aut]
Maintainer: Mickael Binois <[email protected]>
License: LGPL
Version: 1.1.7
Built: 2024-12-04 08:09:56 UTC
Source: CRAN
Allocation of replicates on existing design locations, based on equation (29) of Ankenman et al. (2010)
allocate_mult(model, N, Wijs = NULL, use.Ki = FALSE)
model: hetGP model

N: total budget of replication to allocate

Wijs: optional previously computed matrix of Wijs, to avoid recomputing it; see Wij

use.Ki: should the stored inverse covariance matrix Ki from model be used?
vector with approximated best number of replicates per design
B. Ankenman, B. Nelson, J. Staum (2010), Stochastic kriging for simulation metamodeling, Operations Research, 58(2), 371-382.
##------------------------------------------------------------
## Example: Heteroskedastic GP modeling on the motorcycle data
##------------------------------------------------------------
set.seed(32)

## motorcycle data
library(MASS)
X <- matrix(mcycle$times, ncol = 1)
Z <- mcycle$accel
nvar <- 1
data_m <- find_reps(X, Z, rescale = TRUE)

plot(rep(data_m$X0, data_m$mult), data_m$Z, ylim = c(-160, 90),
     ylab = 'acceleration', xlab = "time")

## Model fitting
model <- mleHetGP(X = list(X0 = data_m$X0, Z0 = data_m$Z0, mult = data_m$mult),
                  Z = Z, lower = rep(0.1, nvar), upper = rep(5, nvar),
                  covtype = "Matern5_2")

## Compute best allocation
A <- allocate_mult(model, N = 1000)

## Create a prediction grid and obtain predictions
xgrid <- matrix(seq(0, 1, length.out = 301), ncol = 1)
predictions <- predict(x = xgrid, object = model)

## Display mean predictive surface
lines(xgrid, predictions$mean, col = 'red', lwd = 2)
## Display 95% confidence intervals
lines(xgrid, qnorm(0.05, predictions$mean, sqrt(predictions$sd2)), col = 2, lty = 2)
lines(xgrid, qnorm(0.95, predictions$mean, sqrt(predictions$sd2)), col = 2, lty = 2)
## Display 95% prediction intervals
lines(xgrid, qnorm(0.05, predictions$mean, sqrt(predictions$sd2 + predictions$nugs)),
      col = 3, lty = 2)
lines(xgrid, qnorm(0.95, predictions$mean, sqrt(predictions$sd2 + predictions$nugs)),
      col = 3, lty = 2)

par(new = TRUE)
plot(NA, NA, xlim = c(0, 1), ylim = c(0, max(A)), axes = FALSE, ylab = "", xlab = "")
segments(x0 = model$X0, x1 = model$X0, y0 = rep(0, nrow(model$X0)), y1 = A, col = 'grey')
axis(side = 4)
mtext(side = 4, line = 2, expression(a[i]), cex = 0.8)
A batch design-evaluated ATO data set, randomly partitioned into training and testing sets, with a fitted hetGP model; similarly, a sequentially designed adaptive-horizon data set and its associated fitted hetGP model
data(ato)
Calling data(ato) causes the following objects to be loaded into the namespace.

X: 2000x8 matrix of inputs coded from 1,...,20 to the unit 8-cube; original inputs can be recreated as X*19 + 1

Z: 2000x10 matrix of normalized outputs obtained in ten replicates at each of the 2000 inputs X. Original outputs can be obtained as Z*sqrt(Zv) + Zm

Zm: scalar mean used to normalize Z

Zv: scalar variance used to normalize Z

train: vector indexing the 1000 rows of X and Z selected for training

Xtrain: 1000x8 matrix obtained as a random partition of X

Ztrain: length-1000 list of vectors containing the selected (replicated) observations at each row of Xtrain

mult: the length of each entry of Ztrain; same as unlist(lapply(Ztrain, length))

kill: a logical vector indicating the rows of Xtrain for which all replicates of Z are selected for Ztrain; same as mult == 10

Xtrain.out: 897x8 matrix comprising the subset of X where not all replicates are selected for training, i.e., those for which kill == FALSE

Ztrain.out: list of length 897 containing the replicates of Z not selected for Ztrain

nc: noiseControl argument used for the mleHetGP calls

out: mleHetGP model based on Xtrain and Ztrain using noiseControl=nc

Xtest: 1000x8 matrix containing the other partition of X, of locations not selected for training

Ztest: 1000x10 matrix of responses from the partition of Z not selected for training

ato.a: 2000x9 matrix of sequentially designed inputs (8) and outputs (1) obtained under an adaptive horizon scheme

Xa: 2000x8 matrix of coded inputs from ato.a, as (ato.a[,1:8]-1)/19

Za: length-2000 vector of outputs from ato.a, as (ato.a[,9] - Zm)/sqrt(Zv)

out.a: mleHetGP model based on Xa and Za using noiseControl=nc
The assemble-to-order (ATO) simulator (Hong & Nelson, 2006) is a queuing simulation targeting inventory management scenarios. The setup is as follows. A company manufactures products. Products are built from base parts called items, some of which are "key" in that the product cannot be built without them. If a random request comes in for a product that is missing a key item, a replenishment order is executed, and is filled after a random period. Holding items in inventory is expensive, so there is a balance between inventory costs and revenue. Hong & Nelson built a Matlab simulator for this setup, which was subsequently reimplemented by Xie et al. (2012).

Binois et al. (2018a) describe an out-of-sample experiment based on this latter implementation in its default (Hong & Nelson) setting, specifying item cost structure, product makeup (their items) and revenue, and distribution of demand and replenishment time, under target stock vector inputs for eight items. They worked with 2000 random uniform input locations (X), and ten replicate responses at each location (Z). The partition of 1000 training data points (Xtrain and Ztrain) and 1000 testing points (Xtest and Ztest) provided here is an example of one that was used for the Monte Carlo experiment in that paper. The elements Xtrain.out and Ztrain.out comprise replicates from the training inputs which were not used in training, so they may be used for out-of-sample testing. For more details on how the partitions were built, see the code in the examples section below.
Binois et al. (2018b) describe an adaptive lookahead horizon scheme for building a sequential design (Xa, Za) of size 2000 whose predictive performance, via proper scores, is almost as good as that of the approximately 5000 training data sites used in each of the Monte Carlo repetitions described above. The example code below demonstrates this via out-of-sample predictions on Xtest (measured against Ztest) when Xtrain and Ztrain are used for training, compared to those from Xa and Za.
The mleHetGP output objects were built with return.matrices=FALSE for more compact storage. Before these objects can be used for calculations, e.g., prediction or design, their covariance matrices need to be rebuilt with rebuild. The generic predict method will call rebuild automatically; however, some of the other methods will not, and it is often more efficient to call rebuild once at the outset rather than for every subsequent predict call.
Mickael Binois, [email protected], and Robert B. Gramacy, [email protected]
Hong L., Nelson B. (2006), Discrete optimization via simulation using COMPASS. Operations Research, 54(1), 115-129.
Xie J., Frazier P., Chick S. (2012). Assemble to Order Simulator. https://web.archive.org/web/20210308024531/http://simopt.org/wiki/index.php?title=Assemble_to_Order&oldid=447.
M. Binois, J. Huang, R. Gramacy, M. Ludkovski (2018a), Replication or exploration? Sequential design for stochastic simulation experiments, arXiv preprint arXiv:1710.03206.
M. Binois, R. B. Gramacy, M. Ludkovski (2018b), Practical heteroskedastic Gaussian process modeling for large simulation experiments, Journal of Computational and Graphical Statistics, 27(4), 808-821. Preprint available on arXiv:1611.05902.
bfs, sirEval, rebuild, horizon, IMSPE_optim, mleHetGP, vignette("hetGP")
data(ato)

## Not run: 
##
## the code below was used to create the random partition
##

## recover the data in its original form
X <- X*19 + 1
Z <- Z*sqrt(Zv) + Zm

## code the inputs and outputs; i.e., undo the transformation above
X <- (X - 1)/19
Zm <- mean(Z)
Zv <- var(as.vector(Z))
Z <- (Z - Zm)/sqrt(Zv)

## random training and testing partition
train <- sample(1:nrow(X), 1000)
Xtrain <- X[train,]
Xtest <- X[-train,]
Ztest <- as.list(as.data.frame(t(Z[-train,])))
Ztrain <- Ztrain.out <- list()
mult <- rep(NA, nrow(Xtrain))
kill <- rep(FALSE, nrow(Xtrain))
for(i in 1:length(train)) {
  reps <- sample(1:ncol(Z), 1)
  w <- sample(1:ncol(Z), reps)
  Ztrain[[i]] <- Z[train[i], w]
  if(reps < 10) Ztrain.out[[i]] <- Z[train[i], -w]
  else kill[i] <- TRUE
  mult[i] <- reps
}

## calculate training locations and outputs for replicates not
## included in Ztrain
Xtrain.out <- Xtrain[!kill,]
Ztrain.out <- Ztrain[which(!kill)]

## fit hetGP model
out <- mleHetGP(X = list(X0 = Xtrain, Z0 = sapply(Ztrain, mean), mult = mult),
                Z = unlist(Ztrain), lower = rep(0.01, ncol(X)),
                upper = rep(30, ncol(X)), covtype = "Matern5_2",
                noiseControl = nc, known = list(beta0 = 0), maxit = 100000,
                settings = list(return.matrices = FALSE))

##
## the adaptive lookahead design is read in and fit as follows
##

Xa <- (ato.a[,1:8] - 1)/19
Za <- ato.a[,9]
Za <- (Za - Zm)/sqrt(Zv)

## uses nc defined above
out.a <- mleHetGP(Xa, Za, lower = rep(0.01, ncol(X)),
                  upper = rep(30, ncol(X)), covtype = "Matern5_2",
                  known = list(beta0 = 0), noiseControl = nc, maxit = 100000,
                  settings = list(return.matrices = FALSE))

## End(Not run)

##
## the following code duplicates a predictive comparison in
## the package vignette
##

## first using the model fit to the train partition (out)
out <- rebuild(out)

## predicting out-of-sample at the test sites
phet <- predict(out, Xtest)
phets2 <- phet$sd2 + phet$nugs
mhet <- as.numeric(t(matrix(rep(phet$mean, 10), ncol = 10)))
s2het <- as.numeric(t(matrix(rep(phets2, 10), ncol = 10)))
sehet <- (unlist(t(Ztest)) - mhet)^2
sc <- - sehet/s2het - log(s2het)
mean(sc)

## predicting at the held-out training replicates
phet.out <- predict(out, Xtrain.out)
phets2.out <- phet.out$sd2 + phet.out$nugs
s2het.out <- mhet.out <- Ztrain.out
for(i in 1:length(mhet.out)) {
  mhet.out[[i]] <- rep(phet.out$mean[i], length(mhet.out[[i]]))
  s2het.out[[i]] <- rep(phets2.out[i], length(s2het.out[[i]]))
}
mhet.out <- unlist(t(mhet.out))
s2het.out <- unlist(t(s2het.out))
sehet.out <- (unlist(t(Ztrain.out)) - mhet.out)^2
sc.out <- - sehet.out/s2het.out - log(s2het.out)
mean(sc.out)

## Not run: 
## then using the model trained from the "adaptive"
## sequential design, with comparison from the "batch"
## one above, using the scores function
out.a <- rebuild(out.a)
sc.a <- scores(out.a, Xtest = Xtest, Ztest = Ztest)
c(batch = mean(sc), adaptive = sc.a)

## an example of one iteration of sequential design
Wijs <- Wij(out.a$X0, theta = out.a$theta, type = out.a$covtype)
h <- horizon(out.a, Wijs = Wijs)
control = list(tol_dist = 1e-4, tol_diff = 1e-4, multi.start = 30, maxit = 100)
opt <- IMSPE_optim(out.a, h, Wijs = Wijs, control = control)
opt$par

## End(Not run)
Data from a Bayes factor MCMC-based simulation experiment comparing Student-t to Gaussian errors in an RJ-based Laplace prior Bayesian linear regression setting
data(bfs)
Calling data(bfs) causes the following objects to be loaded into the namespace.

bfs.exp: 20x11 data.frame whose first column is theta, indicating the mean parameter of an exponential distribution encoding the prior of the Student-t degrees of freedom parameter nu. The remaining ten columns comprise Bayes factor evaluations under that setting

bfs.gamma: 80x7 data.frame whose first two columns are beta and alpha, indicating the second and first parameters to a Gamma distribution encoding the prior of the Student-t degrees of freedom parameter nu. The remaining five columns comprise Bayes factor evaluations under those settings
Gramacy & Pantaleo (2010), Sections 3.3-3.4, describe an experiment involving Bayes factor (BF) calculations to determine if data are leptokurtic (Student-t errors) or not (simply Gaussian) as a function of the prior parameterization on the Student-t degrees of freedom parameter nu. Franck & Gramacy (2018) created a grid of hyperparameter values theta describing the mean of an exponential distribution, evenly spaced in log10 space from 10^(-3) to 10^6, spanning "solidly Student-t" (even Cauchy) to "essentially Gaussian" in terms of the mean of the prior over nu. For each theta setting on the grid they ran the Reversible Jump (RJ) MCMC to approximate the BF of Student-t over Gaussian, feeding in sample likelihood evaluations provided by monomvn's blasso. In order to understand the Monte Carlo variability in those calculations, ten replicates of the BFs under each hyperparameter setting were collected. These data are provided in bfs.exp.

A similar, larger experiment was performed under a Gamma prior with parameters alpha and beta. In this higher dimensional space, a Latin hypercube sample of size eighty was created, and five replicates of BFs were recorded. These data are provided in bfs.gamma.
The examples below involve mleHetTP fits (Chung et al., 2018) to these data and a visualization of the predictive surfaces thus obtained. The code here follows an example provided, with more detail, in vignette("hetGP"). For code showing how these BFs were calculated, see the supplementary material from Franck & Gramacy (2018).
Mickael Binois, [email protected], and Robert B. Gramacy, [email protected]
Franck CT, Gramacy RB (2018). Assessing Bayes factor surfaces using interactive visualization and computer surrogate modeling. Preprint available on arXiv:1809.05580.
Gramacy RB (2017). monomvn: Estimation for Multivariate Normal and Student-t Data with Monotone Missingness. R package version 1.9-7, https://CRAN.R-project.org/package=monomvn.
R.B. Gramacy and E. Pantaleo (2010). Shrinkage regression for multivariate inference with missing data, and an application to portfolio balancing. Bayesian Analysis 5(2), 237-262. Preprint available on arXiv:0907.2135
Chung M, Binois M, Gramacy RB, Moquin DJ, Smith AP, Smith AM (2018). Parameter and Uncertainty Estimation for Dynamical Systems Using Surrogate Stochastic Processes. SIAM Journal on Scientific Computing, 41(4), 2212-2238. Preprint available on arXiv:1802.00852.
ato, sirEval, mleHetTP, vignette("hetGP"), blasso
data(bfs)

##
## Exponential version first
##

thetas <- matrix(bfs.exp$theta, ncol = 1)
bfs <- as.matrix(t(bfs.exp[,-1]))

## the data are heavy tailed, so t-errors help
bfs1 <- mleHetTP(X = list(X0 = log10(thetas), Z0 = colMeans(log(bfs)),
                          mult = rep(nrow(bfs), ncol(bfs))),
                 Z = log(as.numeric(bfs)), lower = 10^(-4), upper = 5,
                 covtype = "Matern5_2")

## predictions on a grid in 1d
dx <- seq(0, 1, length = 100)
dx <- 10^(dx*4 - 3)
p <- predict(bfs1, matrix(log10(dx), ncol = 1))

## visualization
matplot(log10(thetas), t(log(bfs)), col = 1, pch = 21, ylab = "log(bf)",
        main = "Bayes factor surface")
lines(log10(dx), p$mean, lwd = 2, col = 2)
lines(log10(dx), p$mean + 2*sqrt(p$sd2 + p$nugs), col = 2, lty = 2, lwd = 2)
lines(log10(dx), p$mean - 2*sqrt(p$sd2 + p$nugs), col = 2, lty = 2, lwd = 2)
legend("topleft", c("hetTP mean", "hetTP interval"), lwd = 2, lty = 1:2, col = 2)

##
## Now Gamma version
##

D <- as.matrix(bfs.gamma[,1:2])
bfs <- as.matrix(t(bfs.gamma[,-(1:2)]))

## fitting in 2d
bfs2 <- mleHetTP(X = list(X0 = log10(D), Z0 = colMeans(log(bfs)),
                          mult = rep(nrow(bfs), ncol(bfs))),
                 Z = log(as.numeric(bfs)), lower = rep(10^(-4), 2),
                 upper = rep(5, 2), covtype = "Matern5_2", maxit = 100000)

## predictions on a grid in 2d
dx <- seq(0, 1, length = 100)
dx <- 10^(dx*4 - 3)
DD <- as.matrix(expand.grid(dx, dx))
p <- predict(bfs2, log10(DD))

## visualization via image-contour plots
par(mfrow = c(1,2))
mbfs <- colMeans(bfs)
image(log10(dx), log10(dx), t(matrix(p$mean, ncol = length(dx))),
      col = heat.colors(128), xlab = "log10 alpha", ylab = "log10 beta",
      main = "mean log BF")
text(log10(D[,2]), log10(D[,1]), signif(log(mbfs), 2), cex = 0.5)
contour(log10(dx), log10(dx), t(matrix(p$mean, ncol = length(dx))),
        levels = c(-5, -3, -1, 0, 1, 3, 5), add = TRUE, col = 4)
image(log10(dx), log10(dx), t(matrix(sqrt(p$sd2 + p$nugs), ncol = length(dx))),
      col = heat.colors(128), xlab = "log10 alpha", ylab = "log10 beta",
      main = "sd log BF")
text(log10(D[,2]), log10(D[,1]), signif(apply(log(bfs), 2, sd), 2), cex = 0.5)
Compare two models based on the log-likelihood, for hetGP and homGP models
compareGP(model1, model2)
model1, model2: hetGP or homGP models to compare
Best model based on the likelihood, first one in case of a tie
If comparing homoskedastic and heteroskedastic models, the un-penalized likelihood is used for the latter; see, e.g., Binois et al. (2018).
M. Binois, Robert B. Gramacy, M. Ludkovski (2018), Practical heteroskedastic Gaussian process modeling for large simulation experiments,
Journal of Computational and Graphical Statistics, 27(4), 808–821.
Preprint available on arXiv:1611.05902.
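Since this entry provides no example, here is a minimal usage sketch; the toy data-generating function and lengthscale bounds below are illustrative assumptions, not taken from the package documentation.

## Sketch: compare homoskedastic vs heteroskedastic fits on toy 1d data
set.seed(1)
X <- matrix(rep(seq(0, 1, length.out = 10), each = 10), ncol = 1) # 10 sites, 10 replicates
Z <- sin(2*pi*X[,1]) + rnorm(nrow(X), sd = 0.1 + X[,1]) # noise grows with x
mod_hom <- mleHomGP(X, Z, lower = 0.01, upper = 5)
mod_het <- mleHetGP(X, Z, lower = 0.01, upper = 5)
best <- compareGP(mod_hom, mod_het) # returns the model preferred by the likelihood
class(best)

On data with input-dependent noise such as this, the heteroskedastic fit would typically be returned.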
Correlation function of selected type, supporting both isotropic and product forms
cov_gen(X1, X2 = NULL, theta, type = c("Gaussian", "Matern5_2", "Matern3_2"))
X1: matrix of design locations, one point per row

X2: optional second matrix of design locations; if provided, the correlation is computed between X1 and X2, otherwise between X1 and itself

theta: vector of lengthscale parameters (either of size one if isotropic, or of size d if anisotropic)

type: one of "Gaussian", "Matern5_2", "Matern3_2"
Definition of univariate correlation function and hyperparameters:

"Gaussian": c(x, x') = exp(-(x - x')^2 / theta)

"Matern5_2": c(x, x') = (1 + sqrt(5)*|x - x'|/theta + 5*(x - x')^2/(3*theta^2)) * exp(-sqrt(5)*|x - x'|/theta)

"Matern3_2": c(x, x') = (1 + sqrt(3)*|x - x'|/theta) * exp(-sqrt(3)*|x - x'|/theta)

Multivariate correlations are products of univariate ones.
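This entry has no example; the short sketch below (with arbitrary design values) illustrates the correlation matrix computations.

## Sketch: correlation matrices with cov_gen
set.seed(1)
X1 <- matrix(runif(10), ncol = 2) # 5 points in 2d
X2 <- matrix(runif(6), ncol = 2)  # 3 further points
## anisotropic Matern 5/2 with one lengthscale per input dimension
K11 <- cov_gen(X1, theta = c(0.2, 0.5), type = "Matern5_2")     # 5 x 5
K12 <- cov_gen(X1, X2, theta = c(0.2, 0.5), type = "Matern5_2") # 5 x 3
diag(K11) # all ones: correlations, not covariances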
Computes cSUR infill criterion
crit_cSUR(x, model, thres = 0, preds = NULL)
x: matrix of new designs, one point per row (size n x d)

model: homGP or hetGP model, including inverse matrices

thres: threshold for contour finding

preds: optional predictions at x to avoid recomputing them
Lyu, X., Binois, M. & Ludkovski, M. (2018+). Evaluating Gaussian Process Metamodels and Sequential Designs for Noisy Level Set Estimation. arXiv:1807.06712.
## Infill criterion example
set.seed(42)
branin <- function(x){
  m <- 54.8104; s <- 51.9496
  if(is.null(dim(x))) x <- matrix(x, nrow = 1)
  xx <- 15 * x[,1] - 5; y <- 15 * x[,2]
  f <- (y - 5.1 * xx^2/(4 * pi^2) + 5 * xx/pi - 6)^2 + 10 * (1 - 1/(8 * pi)) * cos(xx) + 10
  f <- (f - m)/s
  return(f)
}

ftest <- function(x, sd = 0.1){
  if(is.null(dim(x))) x <- matrix(x, nrow = 1)
  return(apply(x, 1, branin) + rnorm(nrow(x), sd = sd))
}

ngrid <- 101; xgrid <- seq(0, 1, length.out = ngrid)
Xgrid <- as.matrix(expand.grid(xgrid, xgrid))
Zgrid <- ftest(Xgrid)

n <- 20
N <- 500
X <- Xgrid[sample(1:nrow(Xgrid), n),]
X <- X[sample(1:n, N, replace = TRUE),]
Z <- ftest(X)

model <- mleHetGP(X, Z, lower = rep(0.001, 2), upper = rep(1, 2))

critgrid <- apply(Xgrid, 1, crit_cSUR, model = model)

filled.contour(matrix(critgrid, ngrid), color.palette = terrain.colors,
               main = "cSUR criterion")
Computes EI for minimization
crit_EI(x, model, cst = NULL, preds = NULL)
x: matrix of new designs, one point per row (size n x d)

model: homGP or hetGP model, including inverse matrices

cst: optional plugin value used in the EI, see details

preds: optional predictions at x to avoid recomputing them
cst is classically the observed minimum in the deterministic case. In the noisy case, the min of the predictive mean works fine. This is a beta version at this point.
Mockus, J.; Tiesis, V. & Zilinskas, A. (1978).
The application of Bayesian methods for seeking the extremum. Towards Global Optimization, Amsterdam: Elsevier, 2, 2.
Vazquez E, Villemonteix J, Sidorkiewicz M, Walter E (2008).
Global Optimization Based on Noisy Evaluations: An Empirical Study of Two Statistical Approaches,
Journal of Physics: Conference Series, 135, IOP Publishing.
A. Shah, A. Wilson, Z. Ghahramani (2014), Student-t processes as alternatives to Gaussian processes, Artificial Intelligence and Statistics, 877–885.
## Optimization example
set.seed(42)

## Noise field via standard deviation
noiseFun <- function(x, coef = 1.1, scale = 1){
  if(is.null(nrow(x))) x <- matrix(x, nrow = 1)
  return(scale*(coef + cos(x * 2 * pi)))
}

## Test function defined in [0,1]
ftest <- function(x){
  if(is.null(nrow(x))) x <- matrix(x, ncol = 1)
  return(f1d(x) + rnorm(nrow(x), mean = 0, sd = noiseFun(x)))
}

n_init <- 10 # number of unique designs
N_init <- 100 # total number of points
X <- seq(0, 1, length.out = n_init)
X <- matrix(X[sample(1:n_init, N_init, replace = TRUE)], ncol = 1)
Z <- ftest(X)

## Predictive grid
ngrid <- 51
xgrid <- seq(0, 1, length.out = ngrid)
Xgrid <- matrix(xgrid, ncol = 1)

model <- mleHetGP(X = X, Z = Z, lower = 0.001, upper = 1)

EIgrid <- crit_EI(Xgrid, model)
preds <- predict(x = Xgrid, model)

par(mar = c(3,3,2,3) + 0.1)
plot(xgrid, f1d(xgrid), type = 'l', lwd = 1, col = "blue", lty = 3,
     xlab = '', ylab = '', ylim = c(-8, 16))
points(X, Z)
lines(Xgrid, preds$mean, col = 'red', lwd = 2)
lines(Xgrid, qnorm(0.05, preds$mean, sqrt(preds$sd2)), col = 2, lty = 2)
lines(Xgrid, qnorm(0.95, preds$mean, sqrt(preds$sd2)), col = 2, lty = 2)
lines(Xgrid, qnorm(0.05, preds$mean, sqrt(preds$sd2 + preds$nugs)), col = 3, lty = 2)
lines(Xgrid, qnorm(0.95, preds$mean, sqrt(preds$sd2 + preds$nugs)), col = 3, lty = 2)

par(new = TRUE)
plot(NA, NA, xlim = c(0, 1), ylim = c(0, max(EIgrid)), axes = FALSE, ylab = "", xlab = "")
lines(xgrid, EIgrid, lwd = 2, col = 'cyan')
axis(side = 4)
mtext(side = 4, line = 2, expression(EI(x)), cex = 0.8)
mtext(side = 2, line = 2, expression(f(x)), cex = 0.8)
Computes ICU infill criterion
crit_ICU(x, model, thres = 0, Xref, w = NULL, preds = NULL, kxprime = NULL)
x: matrix of new designs, one point per row (size n x d)

model: homGP or hetGP model, including inverse matrices

thres: threshold for contour finding

Xref: matrix of input locations to approximate the integral by a sum

w: optional vector of weights for the Xref locations

preds: optional predictions at x to avoid recomputing them

kxprime: optional covariance matrix between model$X0 and Xref, to avoid its recomputation
Lyu, X., Binois, M. & Ludkovski, M. (2018+). Evaluating Gaussian Process Metamodels and Sequential Designs for Noisy Level Set Estimation. arXiv:1807.06712.
## Infill criterion example
set.seed(42)
branin <- function(x){
  m <- 54.8104; s <- 51.9496
  if(is.null(dim(x))) x <- matrix(x, nrow = 1)
  xx <- 15 * x[,1] - 5; y <- 15 * x[,2]
  f <- (y - 5.1 * xx^2/(4 * pi^2) + 5 * xx/pi - 6)^2 + 10 * (1 - 1/(8 * pi)) * cos(xx) + 10
  f <- (f - m)/s
  return(f)
}

ftest <- function(x, sd = 0.1){
  if(is.null(dim(x))) x <- matrix(x, nrow = 1)
  return(apply(x, 1, branin) + rnorm(nrow(x), sd = sd))
}

ngrid <- 51; xgrid <- seq(0, 1, length.out = ngrid)
Xgrid <- as.matrix(expand.grid(xgrid, xgrid))
Zgrid <- ftest(Xgrid)

n <- 20
N <- 500
X <- Xgrid[sample(1:nrow(Xgrid), n),]
X <- X[sample(1:n, N, replace = TRUE),]
Z <- ftest(X)

model <- mleHetGP(X, Z, lower = rep(0.001, 2), upper = rep(1, 2))

# Precalculations for speedup
preds <- predict(model, x = Xgrid)
kxprime <- cov_gen(X1 = model$X0, X2 = Xgrid, theta = model$theta, type = model$covtype)

critgrid <- apply(Xgrid, 1, crit_ICU, model = model, Xref = Xgrid,
                  preds = preds, kxprime = kxprime)

filled.contour(matrix(critgrid, ngrid), color.palette = terrain.colors,
               main = "ICU criterion")
Compute the integrated mean square prediction error after adding a new design
crit_IMSPE(x, model, id = NULL, Wijs = NULL)
x: matrix for the new design (size 1 x d)

model: homGP or hetGP model

id: instead of providing x, index of an existing design to be replicated

Wijs: optional previously computed matrix of Wijs, to avoid recomputing it; see Wij
The computations are scale free, i.e., values are not multiplied by nu_hat from homGP or hetGP. Currently this function ignores the extra terms related to the estimation of the mean.
deriv_crit_IMSPE for the derivative
## One-d toy example
set.seed(42)
ftest <- function(x, coef = 0.1) return(sin(2*pi*x) + rnorm(1, sd = coef))

n <- 9
designs <- matrix(seq(0.1, 0.9, length.out = n), ncol = 1)
X <- matrix(designs[rep(1:n, sample(1:10, n, replace = TRUE)),])
Z <- apply(X, 1, ftest)

prdata <- find_reps(X, Z, inputBounds = matrix(c(0,1), nrow = 2, ncol = 1))
Z <- prdata$Z
plot(prdata$X0[rep(1:n, times = prdata$mult),], prdata$Z, xlab = "x", ylab = "Y")

model <- mleHetGP(X = list(X0 = prdata$X0, Z0 = prdata$Z0, mult = prdata$mult),
                  Z = Z, lower = 0.1, upper = 5)

ngrid <- 501
xgrid <- matrix(seq(0, 1, length.out = ngrid), ncol = 1)

## Precalculations
Wijs <- Wij(mu1 = model$X0, theta = model$theta, type = model$covtype)

t0 <- Sys.time()
IMSPE_grid <- apply(xgrid, 1, crit_IMSPE, Wijs = Wijs, model = model)
t1 <- Sys.time()
print(t1 - t0)

plot(xgrid, IMSPE_grid * model$nu_hat, xlab = "x", ylab = "crit_IMSPE values")
abline(v = designs)

###############################################################################
## Bi-variate case
nvar <- 2
set.seed(2)
ftest <- function(x, coef = 0.1) return(sin(2*pi*sum(x)) + rnorm(1, sd = coef))

n <- 16 # must be a square
xgrid0 <- seq(0.1, 0.9, length.out = sqrt(n))
designs <- as.matrix(expand.grid(xgrid0, xgrid0))
X <- designs[rep(1:n, sample(1:10, n, replace = TRUE)),]
Z <- apply(X, 1, ftest)

prdata <- find_reps(X, Z, inputBounds = matrix(c(0,1), nrow = 2, ncol = 1))
Z <- prdata$Z

model <- mleHetGP(X = list(X0 = prdata$X0, Z0 = prdata$Z0, mult = prdata$mult),
                  Z = Z, lower = rep(0.1, nvar), upper = rep(1, nvar))

ngrid <- 51
xgrid <- seq(0, 1, length.out = ngrid)
Xgrid <- as.matrix(expand.grid(xgrid, xgrid))

## Precalculations
Wijs <- Wij(mu1 = model$X0, theta = model$theta, type = model$covtype)
t0 <- Sys.time()
IMSPE_grid <- apply(Xgrid, 1, crit_IMSPE, Wijs = Wijs, model = model)

filled.contour(x = xgrid, y = xgrid, matrix(IMSPE_grid, ngrid), nlevels = 20,
               color.palette = terrain.colors, main = "Sequential IMSPE values")
Computes log of EI for minimization, with improved stability with respect to EI
crit_logEI(x, model, cst = NULL, preds = NULL)
x: matrix of new designs, one point per row (size n x d)

model: homGP or hetGP model, including inverse matrices

cst: optional plugin value used in the EI, see details

preds: optional predictions at x to avoid recomputing them
cst is classically the observed minimum in the deterministic case. In the noisy case, the min of the predictive mean works fine. This is a beta version at this point.
Ament, S., Daulton, S., Eriksson, D., Balandat, M., & Bakshy, E. (2024). Unexpected improvements to expected improvement for Bayesian optimization. Advances in Neural Information Processing Systems, 36.
crit_EI for the regular EI criterion, to compare the outcomes
## Optimization example
set.seed(42)

## Noise field via standard deviation
noiseFun <- function(x, coef = 1.1, scale = 1){
  if(is.null(nrow(x))) x <- matrix(x, nrow = 1)
  return(scale*(coef + cos(x * 2 * pi)))
}

## Test function defined in [0,1]
ftest <- function(x){
  if(is.null(nrow(x))) x <- matrix(x, ncol = 1)
  return(f1d(x) + rnorm(nrow(x), mean = 0, sd = noiseFun(x)))
}

n_init <- 10 # number of unique designs
N_init <- 100 # total number of points
X <- seq(0, 1, length.out = n_init)
X <- matrix(X[sample(1:n_init, N_init, replace = TRUE)], ncol = 1)
Z <- ftest(X)

## Predictive grid
ngrid <- 51
xgrid <- seq(0, 1, length.out = ngrid)
Xgrid <- matrix(xgrid, ncol = 1)

model <- mleHetGP(X = X, Z = Z, lower = 0.001, upper = 1)

logEIgrid <- crit_logEI(Xgrid, model)
preds <- predict(x = Xgrid, model)

par(mar = c(3,3,2,3) + 0.1)
plot(xgrid, f1d(xgrid), type = 'l', lwd = 1, col = "blue", lty = 3,
     xlab = '', ylab = '', ylim = c(-8, 16))
points(X, Z)
lines(Xgrid, preds$mean, col = 'red', lwd = 2)
lines(Xgrid, qnorm(0.05, preds$mean, sqrt(preds$sd2)), col = 2, lty = 2)
lines(Xgrid, qnorm(0.95, preds$mean, sqrt(preds$sd2)), col = 2, lty = 2)
lines(Xgrid, qnorm(0.05, preds$mean, sqrt(preds$sd2 + preds$nugs)), col = 3, lty = 2)
lines(Xgrid, qnorm(0.95, preds$mean, sqrt(preds$sd2 + preds$nugs)), col = 3, lty = 2)

par(new = TRUE)
plot(NA, NA, xlim = c(0, 1), ylim = range(logEIgrid), axes = FALSE, ylab = "", xlab = "")
lines(xgrid, logEIgrid, lwd = 2, col = 'cyan')
axis(side = 4)
mtext(side = 4, line = 2, expression(logEI(x)), cex = 0.8)
mtext(side = 2, line = 2, expression(f(x)), cex = 0.8)
Computes MCU infill criterion
crit_MCU(x, model, thres = 0, gamma = 2, preds = NULL)
x: matrix of new designs, one point per row (size n x d)

model: homGP or hetGP model, including inverse matrices

thres: threshold for contour finding

gamma: optional weight in -|f(x) - thres| + gamma * s(x). Defaults to 2.

preds: optional predictions at x to avoid recomputing them
Srinivas, N., Krause, A., Kakade, S, & Seeger, M. (2012).
Information-theoretic regret bounds for Gaussian process optimization
in the bandit setting, IEEE Transactions on Information Theory, 58, pp. 3250-3265.
Bogunovic, J., Scarlett, J., Krause, A. & Cevher, V. (2016).
Truncated variance reduction: A unified approach to Bayesian optimization and level-set estimation,
in Advances in neural information processing systems, pp. 1507-1515.
Lyu, X., Binois, M. & Ludkovski, M. (2018+).
Evaluating Gaussian Process Metamodels and Sequential Designs for Noisy Level Set Estimation. arXiv:1807.06712.
## Infill criterion example
set.seed(42)
branin <- function(x){
  m <- 54.8104; s <- 51.9496
  if(is.null(dim(x))) x <- matrix(x, nrow = 1)
  xx <- 15 * x[,1] - 5; y <- 15 * x[,2]
  f <- (y - 5.1 * xx^2/(4 * pi^2) + 5 * xx/pi - 6)^2 + 10 * (1 - 1/(8 * pi)) * cos(xx) + 10
  f <- (f - m)/s
  return(f)
}

ftest <- function(x, sd = 0.1){
  if(is.null(dim(x))) x <- matrix(x, nrow = 1)
  return(apply(x, 1, branin) + rnorm(nrow(x), sd = sd))
}

ngrid <- 101; xgrid <- seq(0, 1, length.out = ngrid)
Xgrid <- as.matrix(expand.grid(xgrid, xgrid))
Zgrid <- ftest(Xgrid)

n <- 20
N <- 500
X <- Xgrid[sample(1:nrow(Xgrid), n),]
X <- X[sample(1:n, N, replace = TRUE),]
Z <- ftest(X)

model <- mleHetGP(X, Z, lower = rep(0.001, 2), upper = rep(1, 2))

critgrid <- apply(Xgrid, 1, crit_MCU, model = model)

filled.contour(matrix(critgrid, ngrid), color.palette = terrain.colors,
               main = "MCU criterion")
Computes MEE infill criterion
crit_MEE(x, model, thres = 0, preds = NULL)
x: matrix of new designs, one point per row (size n x d)

model: homGP or hetGP model, including inverse matrices

thres: threshold for contour finding

preds: optional predictions at x to avoid recomputing them
Ranjan, P., Bingham, D. & Michailidis, G (2008).
Sequential experiment design for contour estimation from complex computer codes,
Technometrics, 50, pp. 527-541.
Bichon, B., Eldred, M., Swiler, L., Mahadevan, S. & McFarland, J. (2008).
Efficient global reliability analysis for nonlinear implicit performance functions,
AIAA Journal, 46, pp. 2459-2468.
Lyu, X., Binois, M. & Ludkovski, M. (2018+). Evaluating Gaussian Process Metamodels and Sequential Designs for Noisy Level Set Estimation. arXiv:1807.06712.
## Infill criterion example
set.seed(42)
branin <- function(x){
  m <- 54.8104; s <- 51.9496
  if(is.null(dim(x))) x <- matrix(x, nrow = 1)
  xx <- 15 * x[,1] - 5; y <- 15 * x[,2]
  f <- (y - 5.1 * xx^2/(4 * pi^2) + 5 * xx/pi - 6)^2 + 10 * (1 - 1/(8 * pi)) * cos(xx) + 10
  f <- (f - m)/s
  return(f)
}

ftest <- function(x, sd = 0.1){
  if(is.null(dim(x))) x <- matrix(x, nrow = 1)
  return(apply(x, 1, branin) + rnorm(nrow(x), sd = sd))
}

ngrid <- 101; xgrid <- seq(0, 1, length.out = ngrid)
Xgrid <- as.matrix(expand.grid(xgrid, xgrid))
Zgrid <- ftest(Xgrid)

n <- 20
N <- 500
X <- Xgrid[sample(1:nrow(Xgrid), n),]
X <- X[sample(1:n, N, replace = TRUE),]
Z <- ftest(X)

model <- mleHetGP(X, Z, lower = rep(0.001, 2), upper = rep(1, 2))

critgrid <- apply(Xgrid, 1, crit_MEE, model = model)

filled.contour(matrix(critgrid, ngrid), color.palette = terrain.colors,
               main = "MEE criterion")
Search for the best value of an available criterion, possibly using an h-steps lookahead strategy to favor designs with replication
crit_optim(model, crit, ..., h = 2, Xcand = NULL, control = list(multi.start = 10, maxit = 100), seed = NULL, ncores = 1)
model: homGP or hetGP model

crit: considered criterion, given as a string naming one of the crit_* functions of the package, e.g., "crit_EI" or "crit_cSUR"

...: additional parameters of the criterion

h: horizon for the multi-step ahead framework; the decision is made between adding a new design and replicating an existing one. Use h = 0 for the myopic criterion, i.e., without lookahead.

Xcand: optional discrete set of candidates (otherwise a maximin LHS is used to initialize continuous search)

control: list of control parameters when Xcand is NULL, e.g., multi.start for the number of optimization restarts and maxit for the maximum number of iterations

seed: optional seed for the generation of LHS designs with maximinSA_LHS

ncores: number of CPUs available (> 1 means parallel = TRUE); see mclapply
When looking ahead, the kriging believer heuristic is used, meaning that the non-observed value is replaced by the mean prediction in the update.
list with elements:

par: best first design,

value: criterion h-steps ahead starting from adding par,

path: list of elements list(par, value, new) at each step h
M. Binois, J. Huang, R. B. Gramacy, M. Ludkovski (2019),
Replication or exploration? Sequential design for stochastic simulation experiments,
Technometrics, 61(1), 7-23.
Preprint available on arXiv:1710.03206.
###############################################################################
## Bi-variate example (myopic version)
###############################################################################
nvar <- 2
set.seed(42)
ftest <- function(x, coef = 0.1) return(sin(2*pi*sum(x)) + rnorm(1, sd = coef))

n <- 25 # must be a square
xgrid0 <- seq(0.1, 0.9, length.out = sqrt(n))
designs <- as.matrix(expand.grid(xgrid0, xgrid0))
X <- designs[rep(1:n, sample(1:10, n, replace = TRUE)),]
Z <- apply(X, 1, ftest)

model <- mleHomGP(X, Z, lower = rep(0.1, nvar), upper = rep(1, nvar))

ngrid <- 51
xgrid <- seq(0, 1, length.out = ngrid)
Xgrid <- as.matrix(expand.grid(xgrid, xgrid))

preds <- predict(x = Xgrid, object = model)

## Initial plots
contour(x = xgrid, y = xgrid, z = matrix(preds$mean, ngrid),
        main = "Predicted mean", nlevels = 20)
points(model$X0, col = 'blue', pch = 20)

crit <- "crit_EI"
crit_grid <- apply(Xgrid, 1, crit, model = model)
filled.contour(x = xgrid, y = xgrid, matrix(crit_grid, ngrid), nlevels = 20,
               color.palette = terrain.colors, main = "Initial criterion landscape",
               plot.axes = {axis(1); axis(2); points(model$X0, pch = 20)})

## Sequential crit search
nsteps <- 1 # Increase for better results

for(i in 1:nsteps){
  res <- crit_optim(model, crit = crit, h = 0,
                    control = list(multi.start = 50, maxit = 30))
  newX <- res$par
  newZ <- ftest(newX)
  model <- update(object = model, Xnew = newX, Znew = newZ)
}

## Final plots
contour(x = xgrid, y = xgrid, z = matrix(preds$mean, ngrid),
        main = "Predicted mean", nlevels = 20)
points(model$X0, col = 'blue', pch = 20)

crit_grid <- apply(Xgrid, 1, crit, model = model)
filled.contour(x = xgrid, y = xgrid, matrix(crit_grid, ngrid), nlevels = 20,
               color.palette = terrain.colors, main = "Final criterion landscape",
               plot.axes = {axis(1); axis(2); points(model$X0, pch = 20)})

###############################################################################
## Bi-variate example (look-ahead version)
###############################################################################
## Not run: 
nvar <- 2
set.seed(42)
ftest <- function(x, coef = 0.1) return(sin(2*pi*sum(x)) + rnorm(1, sd = coef))

n <- 25 # must be a square
xgrid0 <- seq(0.1, 0.9, length.out = sqrt(n))
designs <- as.matrix(expand.grid(xgrid0, xgrid0))
X <- designs[rep(1:n, sample(1:10, n, replace = TRUE)),]
Z <- apply(X, 1, ftest)

model <- mleHomGP(X, Z, lower = rep(0.1, nvar), upper = rep(1, nvar))

ngrid <- 51
xgrid <- seq(0, 1, length.out = ngrid)
Xgrid <- as.matrix(expand.grid(xgrid, xgrid))

nsteps <- 5 # Increase for more steps
crit <- "crit_EI"

# To use parallel computation (turn off on Windows)
library(parallel)
parallel <- FALSE #TRUE #
if(parallel) ncores <- detectCores() else ncores <- 1

for(i in 1:nsteps){
  res <- crit_optim(model, h = 3, crit = crit, ncores = ncores,
                    control = list(multi.start = 100, maxit = 50))

  # If a replicate is selected
  if(!res$path[[1]]$new) print("Add replicate")

  newX <- res$par
  newZ <- ftest(newX)
  model <- update(object = model, Xnew = newX, Znew = newZ)

  ## Plots
  preds <- predict(x = Xgrid, object = model)
  contour(x = xgrid, y = xgrid, z = matrix(preds$mean, ngrid),
          main = "Predicted mean", nlevels = 20)
  points(model$X0, col = 'blue', pch = 20)
  points(newX, col = "red", pch = 20)

  crit_grid <- apply(Xgrid, 1, crit, model = model)
  filled.contour(x = xgrid, y = xgrid, matrix(crit_grid, ngrid), nlevels = 20,
                 color.palette = terrain.colors,
                 plot.axes = {axis(1); axis(2); points(model$X0, pch = 20)})
}
## End(Not run)
Fast approximated batch-Expected Improvement criterion (for minimization)
crit_qEI(x, model, cst = NULL, preds = NULL)
x: matrix of new designs representing the batch of q points, one point per row (size q x d)

model: homGP or hetGP model, including inverse matrices

cst: optional plugin value used in the EI, see details

preds: optional predictions at x to avoid recomputing them
cst is classically the observed minimum in the deterministic case. In the noisy case, the min of the predictive mean works fine. This is a beta version at this point. It may work for TP models as well.
M. Binois (2015), Uncertainty quantification on Pareto fronts and high-dimensional strategies in Bayesian optimization, with applications in multi-objective automotive design. Ecole Nationale Superieure des Mines de Saint-Etienne, PhD thesis.
## Optimization example (noiseless)
set.seed(42)

## Test function defined in [0,1]
ftest <- f1d

n_init <- 5 # number of unique designs
X <- seq(0, 1, length.out = n_init)
X <- matrix(X, ncol = 1)
Z <- ftest(X)

## Predictive grid
ngrid <- 51
xgrid <- seq(0, 1, length.out = ngrid)
Xgrid <- matrix(xgrid, ncol = 1)

model <- mleHomGP(X = X, Z = Z, lower = 0.01, upper = 1, known = list(g = 2e-8))

# Regular EI function
cst <- min(model$Z0)
EIgrid <- crit_EI(Xgrid, model, cst = cst)
plot(xgrid, EIgrid, type = "l")
abline(v = X, lty = 2) # observations

# Create batch (based on regular EI peaks)
xbatch <- matrix(c(0.37, 0.17, 0.7), 3, 1)
abline(v = xbatch, col = "red")
fqEI <- crit_qEI(xbatch, model, cst = cst)

# Compare with Monte Carlo qEI
preds <- predict(model, xbatch, xprime = xbatch)
nsim <- 1e4
simus <- matrix(rnorm(3 * nsim), nsim) %*% chol(preds$cov)
simus <- simus + matrix(preds$mean, nrow = nsim, ncol = 3, byrow = TRUE)
MCqEI <- mean(apply(cst - simus, 1, function(x) max(c(x, 0))))
Computes targeted mean squared error infill criterion
crit_tMSE(x, model, thres = 0, preds = NULL, seps = 0.05)
x |
matrix of new designs, one point per row (size n x d) |
model |
|
thres |
for contour finding |
preds |
optional predictions at |
seps |
parameter for the target window |
Picheny, V., Ginsbourger, D., Roustant, O., Haftka, R., Kim, N. (2010).
Adaptive designs of experiments for accurate approximation of a target region,
Journal of Mechanical Design (132), p. 071008.
Lyu, X., Binois, M. & Ludkovski, M. (2018+).
Evaluating Gaussian Process Metamodels and Sequential Designs for Noisy Level Set Estimation. arXiv:1807.06712.
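A minimal sketch of the effect of seps, assuming model is a fitted bivariate model as in the example below; widening seps broadens the window around the thres contour that the criterion rewards:
crit_tMSE(matrix(0.5, 1, 2), model, thres = 0, seps = 0.05) # default window
crit_tMSE(matrix(0.5, 1, 2), model, thres = 0, seps = 0.2) # wider target window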
## Infill criterion example set.seed(42) branin <- function(x){ m <- 54.8104; s <- 51.9496 if(is.null(dim(x))) x <- matrix(x, nrow = 1) xx <- 15 * x[,1] - 5; y <- 15 * x[,2] f <- (y - 5.1 * xx^2/(4 * pi^2) + 5 * xx/pi - 6)^2 + 10 * (1 - 1/(8 * pi)) * cos(xx) + 10 f <- (f - m)/s return(f) } ftest <- function(x, sd = 0.1){ if(is.null(dim(x))) x <- matrix(x, nrow = 1) return(apply(x, 1, branin) + rnorm(nrow(x), sd = sd)) } ngrid <- 101; xgrid <- seq(0, 1, length.out = ngrid) Xgrid <- as.matrix(expand.grid(xgrid, xgrid)) Zgrid <- ftest(Xgrid) n <- 20 N <- 500 X <- Xgrid[sample(1:nrow(Xgrid), n),] X <- X[sample(1:n, N, replace = TRUE),] Z <- ftest(X) model <- mleHetGP(X, Z, lower = rep(0.001,2), upper = rep(1,2)) critgrid <- apply(Xgrid, 1, crit_tMSE, model = model) filled.contour(matrix(critgrid, ngrid), color.palette = terrain.colors, main = "tMSE criterion")
Derivative of EI criterion for GP models
deriv_crit_EI(x, model, cst = NULL, preds = NULL)
x |
matrix for the new design (size 1 x d) |
model |
|
cst |
threshold for contour criteria |
preds |
pre-computed preds for contour criteria |
Ginsbourger, D. (2009), Multiples metamodeles pour l'approximation et l'optimisation de fonctions numeriques multivariables, PhD thesis, Ecole Nationale Superieure des Mines de Saint-Etienne.
Roustant, O., Ginsbourger, D., Deville, Y. (2012), DiceKriging, DiceOptim: Two R packages for the analysis of computer experiments by kriging-based metamodeling and optimization, Journal of Statistical Software, 51(1), 1-55.
crit_EI for the criterion
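A minimal sketch of gradient-based EI maximization with optim, assuming a fitted 1d model model and a plugin value cst; fnscale = -1 is used since EI is to be maximized:
x0 <- runif(1) # random starting point in [0, 1]
opt <- optim(x0, fn = crit_EI, gr = deriv_crit_EI, model = model, cst = cst,
  method = "L-BFGS-B", lower = 0, upper = 1, control = list(fnscale = -1))
opt$par # candidate design maximizing EI from this start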
Derivative of crit_IMSPE
deriv_crit_IMSPE(x, model, Wijs = NULL)
x |
matrix for the new design (size 1 x d) |
model |
|
Wijs |
optional previously computed matrix of Wijs, see |
Derivative of the sequential IMSPE with respect to x
crit_IMSPE for the criterion
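A quick finite-difference check of the analytical derivative, as a sketch assuming a fitted 1d model model and a hypothetical candidate x0:
x0 <- matrix(0.42, 1) # hypothetical candidate in [0, 1]
eps <- 1e-6
(crit_IMSPE(x0 + eps, model) - crit_IMSPE(x0, model)) / eps # forward difference
deriv_crit_IMSPE(x0, model) # analytical value, expected to be close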
1d test function (1)
f1d(x)
x |
scalar or matrix (size n x 1) in [0,1] |
A. Forrester, A. Sobester, A. Keane (2008), Engineering design via surrogate modelling: a practical guide, John Wiley & Sons
plot(f1d)
f1d
Noisy 1d test function (1)
Add Gaussian noise with variance r(x) = scale * (1.1 + sin(2 pi x))^2 to f1d
f1d_n(x, scale = 1)
x |
scalar or matrix (size n x 1) in [0,1] |
scale |
scalar in [0, Inf] to control the signal to noise ratio |
X <- matrix(seq(0, 1, length.out = 101), ncol = 1) Xr <- X[sort(sample(x = 1:101, size = 500, replace = TRUE)),, drop = FALSE] plot(Xr, f1d_n(Xr)) lines(X, f1d(X), col = "red", lwd = 2)
1d test function (2)
f1d2(x)
x |
scalar or matrix (size n x 1) in [0,1] |
A. Boukouvalas, and D. Cornford (2009), Learning heteroscedastic Gaussian processes for complex datasets, Technical report.
M. Yuan, and G. Wahba (2004), Doubly penalized likelihood estimator in heteroscedastic regression, Statistics and Probability Letters 69, 11-20.
plot(f1d2)
f1d2
Noisy 1d test function (2)
Add Gaussian noise with variance r(x) = scale * (exp(sin(2 pi x)))^2 to f1d2
f1d2_n(x, scale = 1)
x |
scalar or matrix (size n x 1) in [0,1] |
scale |
scalar in [0, Inf] to control the signal to noise ratio |
X <- matrix(seq(0, 1, length.out = 101), ncol = 1) Xr <- X[sort(sample(x = 1:101, size = 500, replace = TRUE)),, drop = FALSE] plot(Xr, f1d2_n(Xr)) lines(X, f1d2(X), col = "red", lwd = 2)
Prepare data for use with mleHetGP, in particular to find replicated observations
find_reps( X, Z, return.Zlist = TRUE, rescale = FALSE, normalize = FALSE, inputBounds = NULL )
X |
matrix of design locations, one point per row |
Z |
vector of observations at |
return.Zlist |
to return |
rescale |
if |
normalize |
if |
inputBounds |
optional matrix of known boundaries in original input space, of size 2 times |
Replicates are searched based on character representation, using unique.
A list with the following elements that can be passed to the main fitting functions, e.g., mleHetGP and mleHomGP:
X0: matrix with unique design locations, one point per row,
Z0: vector of averaged observations at X0,
mult: number of replicates at X0,
Z: vector with all observations, sorted according to X0,
Zlist: optional list, each element corresponds to observations at a design in X0,
inputBounds: optional matrix, to rescale back to the original input space,
outputStats: optional vector, with mean and variance of the original outputs.
##------------------------------------------------------------ ## Find replicates on the motorcycle data ##------------------------------------------------------------ ## motorcycle data library(MASS) X <- matrix(mcycle$times, ncol = 1) Z <- mcycle$accel data_m <- find_reps(X, Z) # Initial data plot(X, Z, ylim = c(-160, 90), ylab = 'acceleration', xlab = "time") # Display mean values points(data_m$X0, data_m$Z0, pch = 20)
Adapt the look-ahead horizon depending on the replicate allocation or a target ratio
horizon( model, current_horizon = NULL, previous_ratio = NULL, target = NULL, Wijs = NULL )
model |
|
current_horizon |
horizon used for the previous iteration, see details |
previous_ratio |
ratio before adding the previous new design |
target |
scalar in (0,1] for the desired ratio n/N |
Wijs |
optional previously computed matrix of Wijs, see |
If target is provided, along with previous_ratio and current_horizon:
the horizon is increased by one if more replicates are needed but a new point has been added at the previous iteration;
the horizon is decreased by one if new points are needed but a replicate has been added at the previous iteration;
otherwise it is unchanged.
If no target is provided, allocate_mult is used to obtain the best allocation of the existing replicates, then the new horizon is sampled from the difference between the actual allocation and the best one, bounded below by 0. See Binois et al. (2019).
randomly selected horizon for the next iteration (adaptive) if no target is provided, otherwise the updated horizon value.
M. Binois, J. Huang, R. B. Gramacy, M. Ludkovski (2019),
Replication or exploration? Sequential design for stochastic simulation experiments,
Technometrics, 61(1), 7-23.
Preprint available on arXiv:1710.03206.
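A minimal sketch of the target-based scheme within a sequential design loop, assuming a fitted 1d model model and using the f1d_n test function from this package:
h <- 2 # initial horizon
ratio <- length(model$Z0) / length(model$Z) # current ratio n/N
for(i in 1:5){
  res <- IMSPE_optim(model, h = h)
  model <- update(model, Xnew = res$par, Znew = f1d_n(res$par))
  h <- horizon(model, current_horizon = h, previous_ratio = ratio, target = 0.5)
  ratio <- length(model$Z0) / length(model$Z)
}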
IMSPE of a given design
IMSPE( X, theta = NULL, Lambda = NULL, mult = NULL, covtype = NULL, nu = NULL, eps = sqrt(.Machine$double.eps) )
X |
|
theta |
lengthscales |
Lambda |
diagonal matrix for the noise |
mult |
number of replicates at each design |
covtype |
either "Gaussian", "Matern3_2" or "Matern5_2" |
nu |
variance parameter |
eps |
numerical nugget |
One can provide directly a model of class hetGP or homGP, or provide X and all other arguments.
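For instance, a minimal sketch of the model-based call, assuming model is a fitted homGP or hetGP object:
IMSPE(model) # all covariance quantities are extracted from the model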
Search for the best value of the IMSPE criterion, possibly using an h-step lookahead strategy to favor designs with replication
IMSPE_optim( model, h = 2, Xcand = NULL, control = list(tol_dist = 1e-06, tol_diff = 1e-06, multi.start = 20, maxit = 100), Wijs = NULL, seed = NULL, ncores = 1 )
model |
|
h |
horizon for multi-step ahead framework. The decision is made between:
Use |
Xcand |
optional discrete set of candidates (otherwise a maximin LHS is used to initialise continuous search) |
control |
list in case |
Wijs |
optional previously computed matrix of Wijs, see |
seed |
optional seed for the generation of designs with |
ncores |
number of CPUs available (> 1 means parallel = TRUE), see |
The domain needs to be [0, 1]^d for now.
list with elements:
par: best first design,
value: IMSPE h-steps ahead starting from adding par,
path: list of elements list(par, value, new) at each step h
M. Binois, J. Huang, R. B. Gramacy, M. Ludkovski (2019),
Replication or exploration? Sequential design for stochastic simulation experiments,
Technometrics, 61(1), 7-23.
Preprint available on arXiv:1710.03206.
############################################################################### ## Bi-variate example (myopic version) ############################################################################### nvar <- 2 set.seed(42) ftest <- function(x, coef = 0.1) return(sin(2*pi*sum(x)) + rnorm(1, sd = coef)) n <- 25 # must be a square xgrid0 <- seq(0.1, 0.9, length.out = sqrt(n)) designs <- as.matrix(expand.grid(xgrid0, xgrid0)) X <- designs[rep(1:n, sample(1:10, n, replace = TRUE)),] Z <- apply(X, 1, ftest) model <- mleHomGP(X, Z, lower = rep(0.1, nvar), upper = rep(1, nvar)) ngrid <- 51 xgrid <- seq(0,1, length.out = ngrid) Xgrid <- as.matrix(expand.grid(xgrid, xgrid)) preds <- predict(x = Xgrid, object = model) ## Initial plots contour(x = xgrid, y = xgrid, z = matrix(preds$mean, ngrid), main = "Predicted mean", nlevels = 20) points(model$X0, col = 'blue', pch = 20) IMSPE_grid <- apply(Xgrid, 1, crit_IMSPE, model = model) filled.contour(x = xgrid, y = xgrid, matrix(IMSPE_grid, ngrid), nlevels = 20, color.palette = terrain.colors, main = "Initial IMSPE criterion landscape", plot.axes = {axis(1); axis(2); points(model$X0, pch = 20)}) ## Sequential IMSPE search nsteps <- 1 # Increase for better results for(i in 1:nsteps){ res <- IMSPE_optim(model, control = list(multi.start = 30, maxit = 30)) newX <- res$par newZ <- ftest(newX) model <- update(object = model, Xnew = newX, Znew = newZ) } ## Final plots contour(x = xgrid, y = xgrid, z = matrix(preds$mean, ngrid), main = "Predicted mean", nlevels = 20) points(model$X0, col = 'blue', pch = 20) IMSPE_grid <- apply(Xgrid, 1, crit_IMSPE, model = model) filled.contour(x = xgrid, y = xgrid, matrix(IMSPE_grid, ngrid), nlevels = 20, color.palette = terrain.colors, main = "Final IMSPE criterion landscape", plot.axes = {axis(1); axis(2); points(model$X0, pch = 20)}) ############################################################################### ## Bi-variate example (look-ahead version) ############################################################################### ## Not run: nvar <- 2 set.seed(42) ftest <- function(x, coef = 0.1) return(sin(2*pi*sum(x)) + rnorm(1, sd = coef)) n <- 25 # must be a square xgrid0 <- seq(0.1, 0.9, length.out = sqrt(n)) designs <- as.matrix(expand.grid(xgrid0, xgrid0)) X <- designs[rep(1:n, sample(1:10, n, replace = TRUE)),] Z <- apply(X, 1, ftest) model <- mleHomGP(X, Z, lower = rep(0.1, nvar), upper = rep(1, nvar)) ngrid <- 51 xgrid <- seq(0,1, length.out = ngrid) Xgrid <- as.matrix(expand.grid(xgrid, xgrid)) nsteps <- 5 # Increase for more steps # To use parallel computation (turn off on Windows) library(parallel) parallel <- FALSE #TRUE # if(parallel) ncores <- detectCores() else ncores <- 1 for(i in 1:nsteps){ res <- IMSPE_optim(model, h = 3, control = list(multi.start = 100, maxit = 50), ncores = ncores) # If a replicate is selected if(!res$path[[1]]$new) print("Add replicate") newX <- res$par newZ <- ftest(newX) model <- update(object = model, Xnew = newX, Znew = newZ) ## Plots preds <- predict(x = Xgrid, object = model) contour(x = xgrid, y = xgrid, z = matrix(preds$mean, ngrid), main = "Predicted mean", nlevels = 20) points(model$X0, col = 'blue', pch = 20) points(newX, col = "red", pch = 20) ## Precalculations Wijs <- Wij(mu1 = model$X0, theta = model$theta, type = model$covtype) IMSPE_grid <- apply(Xgrid, 1, crit_IMSPE, Wijs = Wijs, model = model) filled.contour(x = xgrid, y = xgrid, matrix(IMSPE_grid, ngrid), nlevels = 20, color.palette = terrain.colors, plot.axes = {axis(1); axis(2); points(model$X0, pch = 20)}) } ## 
End(Not run)
Provide leave-one-out predictions, e.g., for model testing and diagnostics. This is used in the plot method available for GP and TP models.
LOO_preds(model, ids = NULL)
model |
|
ids |
vector of indices of the unique design points considered (defaults to all) |
list with mean and variance predictions at x_i assuming this point has not been evaluated
For TP models, psi is considered fixed.
O. Dubrule (1983), Cross validation of Kriging in a unique neighborhood, Mathematical Geology 15, 687–699.
F. Bachoc (2013), Cross Validation and Maximum Likelihood estimations of hyper-parameters of Gaussian processes with model misspecification, Computational Statistics & Data Analysis, 55–69.
set.seed(32) ## motorcycle data library(MASS) X <- matrix(mcycle$times, ncol = 1) Z <- mcycle$accel nvar <- 1 ## Model fitting model <- mleHomGP(X = X, Z = Z, lower = rep(0.1, nvar), upper = rep(10, nvar), covtype = "Matern5_2", known = list(beta0 = 0)) LOO_p <- LOO_preds(model) # model minus observation(s) at x_i d_mot <- find_reps(X, Z) LOO_ref <- matrix(NA, nrow(d_mot$X0), 2) for(i in 1:nrow(d_mot$X0)){ model_i <- mleHomGP(X = list(X0 = d_mot$X0[-i,, drop = FALSE], Z0 = d_mot$Z0[-i], mult = d_mot$mult[-i]), Z = unlist(d_mot$Zlist[-i]), lower = rep(0.1, nvar), upper = rep(50, nvar), covtype = "Matern5_2", known = list(theta = model$theta, k_theta_g = model$k_theta_g, g = model$g, beta0 = 0)) model_i$nu_hat <- model$nu_hat p_i <- predict(model_i, d_mot$X0[i,,drop = FALSE]) LOO_ref[i,] <- c(p_i$mean, p_i$sd2) } # Compare results range(LOO_ref[,1] - LOO_p$mean) range(LOO_ref[,2] - LOO_p$sd2) # Use of LOO for diagnostics plot(model)
Gaussian process regression when seed (or trajectory) information is provided, based on maximum likelihood estimation of the hyperparameters. Trajectory handling involves observing all times for any given seed.
mleCRNGP( X, Z, T0 = NULL, stype = c("none", "XS"), lower = NULL, upper = NULL, known = NULL, noiseControl = list(g_bounds = c(sqrt(.Machine$double.eps) * 10, 100), rho_bounds = c(0.001, 0.9)), init = NULL, covtype = c("Gaussian", "Matern5_2", "Matern3_2"), maxit = 100, eps = sqrt(.Machine$double.eps), settings = list(return.Ki = TRUE, factr = 1e+07) )
X |
matrix of all designs, one per row. The last column is assumed to contain the integer seed value. |
Z |
vector of all observations. If |
T0 |
optional vector of times (same for all |
stype |
structural assumptions, options include:
When time is present, the Kronecker structure is always used (the alternative is to provide times as an extra variable in |
lower , upper
|
optional bounds for the |
known |
optional list of known parameters, e.g., |
noiseControl |
list with elements,
|
init |
optional list specifying starting values for MLE optimization, with elements:
|
covtype |
covariance kernel type, either 'Gaussian', 'Matern5_2' or 'Matern3_2', see |
maxit |
maximum number of iterations for L-BFGS-B of |
eps |
jitter used in the inversion of the covariance matrix for numerical stability |
settings |
list with argument |
The global covariance matrix of the model is parameterized as nu_hat * (Cx + g Id) * Cs = nu_hat * K, with Cx the spatial correlation matrix between unique designs, depending on the family of kernel used (see cov_gen for available choices) and values of lengthscale parameters. Cs is the correlation matrix between seed values, equal to 1 if the seeds are equal and rho otherwise. nu_hat is the plugin estimator of the variance of the process.
Compared to mleHomGP, here the replications have a specific identifier, i.e., the seed.
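A minimal sketch of the expected design format, with hypothetical values; the last column of X carries the integer seed:
X <- cbind(runif(6), rep(1:3, each = 2)) # 1d inputs paired with seeds 1, 2, 3
# Z <- apply(X, 1, mysim) # mysim is a hypothetical simulator consuming the seed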
a list which is given the S3 class "CRNGP", with elements:
theta: maximum likelihood estimate of the lengthscale parameter(s),
g: maximum likelihood estimate of the nugget variance,
rho: maximum likelihood estimate of the seed correlation parameter,
trendtype: either "SK" if beta0 is given, else "OK",
beta0: estimated trend unless given in input,
nu_hat: plugin estimator of the variance,
ll: log-likelihood value,
X0, S0, T0: values for the spatial, seed and time designs,
Z, eps, covtype, stype: values given in input,
call: user call of the function,
used_args: list with arguments provided in the call,
nit_opt, msg: counts and msg returned by optim,
Ki: inverse covariance matrix (not scaled by nu_hat) (if return.Ki is TRUE in settings),
Ct: if time is used, corresponding covariance matrix,
time: time to train the model, in seconds.
This function is experimental at this time and could evolve in the future.
Xi Chen, Bruce E Ankenman, and Barry L Nelson. The effects of common random numbers on stochastic kriging metamodels. ACM Transactions on Modeling and Computer Simulation (TOMACS), 22(2):1-20, 2012.
Michael Pearce, Matthias Poloczek, and Juergen Branke. Bayesian simulation optimization with common random numbers. In 2019 Winter Simulation Conference (WSC), pages 3492-3503. IEEE, 2019.
A Fadikar, M Binois, N Collier, A Stevens, KB Toh, J Ozik. Trajectory-oriented optimization of stochastic epidemiological models. arXiv preprint arXiv:2305.03926
predict.CRNGP for predictions, simul.CRNGP for generating conditional simulations on a Kronecker grid. summary and plot functions are available as well.
##------------------------------------------------------------ ## Example 1: CRN GP modeling on 1d sims ##------------------------------------------------------------ set.seed(42) nx <- 50 ns <- 5 x <- matrix(seq(0,1, length.out = nx), nx) s <- matrix(seq(1, ns, length.out = ns)) g <- 1e-3 theta <- 0.01 KX <- cov_gen(x, theta = theta) rho <- 0.3 KS <- matrix(rho, ns, ns) diag(KS) <- 1 YY <- MASS::mvrnorm(n = 1, mu = rep(0, nx*ns), Sigma = kronecker(KX, KS) + g * diag(nx*ns)) YYmat <- matrix(YY, ns, nx) matplot(x, t(YYmat), pch = 1, type = "b", lty = 3) Xgrid <- as.matrix(expand.grid(s, x)) Xgrid <- cbind(Xgrid[,2], Xgrid[,1]) ids <- sample(1:nrow(Xgrid), 20) X0 <- Xgrid[ids,] Y0 <- YY[ids] points(X0[,1], Y0, pch = 20, col = 1 + ((X0[,2] - 1) %% 6)) model <- mleCRNGP(X0, Y0, known = list(theta = 0.01, g = 1e-3, rho = 0.3)) preds <- predict(model, x = Xgrid, xprime = Xgrid) matlines(x, t(matrix(preds$mean, ns, nx)), lty = 1) # prediction on new seed (i.e., average prediction) xs1 <- cbind(x, ns+1) predsm <- predict(model, x = xs1) lines(x, predsm$mean, col = "orange", lwd = 3) lines(x, predsm$mean + 2 * sqrt(predsm$sd2), col = "orange", lwd = 2, lty = 3) lines(x, predsm$mean - 2 * sqrt(predsm$sd2), col = "orange", lwd = 2, lty = 3) # Conditional realizations sims <- MASS::mvrnorm(n = 1, mu = preds$mean, Sigma = 1/2 * (preds$cov + t(preds$cov))) plot(Xgrid[,1], sims, col = 1 + ((Xgrid[,2] - 1) %% 6)) points(X0[,1], Y0, pch = 20, col = 1 + ((X0[,2] - 1) %% 6)) ## Not run: ##------------------------------------------------------------ ## Example 2: Homoskedastic GP modeling on 2d sims ##------------------------------------------------------------ set.seed(2) nx <- 31 ns <- 5 d <- 2 x <- as.matrix(expand.grid(seq(0,1, length.out = nx), seq(0,1, length.out = nx))) s <- matrix(seq(1, ns, length.out = ns)) Xgrid <- as.matrix(expand.grid(seq(1, ns, length.out = ns), seq(0,1, length.out = nx), seq(0,1, length.out = nx))) Xgrid <- Xgrid[,c(2, 3, 1)] g <- 1e-3 theta <- c(0.02, 0.05) KX <- cov_gen(x, theta = theta) rho <- 0.33 KS <- matrix(rho, ns, ns) diag(KS) <- 1 YY <- MASS::mvrnorm(n = 1, mu = rep(0, nx*nx*ns), Sigma = kronecker(KX, KS) + g * diag(nx*nx*ns)) YYmat <- matrix(YY, ns, nx*nx) filled.contour(matrix(YYmat[1,], nx)) filled.contour(matrix(YYmat[2,], nx)) ids <- sample(1:nrow(Xgrid), 80) X0 <- Xgrid[ids,] Y0 <- YY[ids] ## Uncomment below for 3D visualisation # library(rgl) # plot3d(Xgrid[,1], Xgrid[,2], YY, col = 1 + (Xgrid[,3] - 1) %% 6) # points3d(X0[,1], X0[,2], Y0, size = 10, col = 1 + ((X0[,3] - 1) %% 6)) model <- mleCRNGP(X0, Y0, known = list(beta0 = 0)) preds <- predict(model, x = Xgrid, xprime = Xgrid) # surface3d(unique(Xgrid[1:nx^2,1]),unique(Xgrid[,2]), matrix(YY[Xgrid[,3]==1], nx), # front = "lines", back = "lines") # aspect3d(1, 1, 1) # surface3d(unique(Xgrid[1:nx^2,1]),unique(Xgrid[,2]), matrix(preds$mean[Xgrid[,3]==1], nx), # front = "lines", back = "lines", col = "red") plot(preds$mean, YY) # prediction on new seed (i.e., average prediction) xs1 <- cbind(x, ns+1) predsm <- predict(model, x = xs1) # surface3d(unique(x[,1]), unique(x[,2]), matrix(predsm$mean, nx), col = "orange", # front = "lines", back = "lines") # Conditional realizations sims <- MASS::mvrnorm(n = 1, mu = preds$mean, Sigma = 1/2 * (preds$cov + t(preds$cov))) # plot3d(X0[,1], X0[,2], Y0, size = 10, col = 1 + ((X0[,3] - 1) %% 6)) # surface3d(unique(x[,1]), unique(x[,2]), matrix(sims[Xgrid[,3] == 1], nx), col = 1, # front = "lines", back = "lines") # surface3d(unique(x[,1]), unique(x[,2]), matrix(sims[Xgrid[,3] == 2], nx), col = 2, # front = "lines", back = "lines") # Faster alternative for conditional realizations # (note: here the design points are part of the simulation points) Xgrid0 <- unique(Xgrid[, -(d + 1), drop = FALSE]) sims2 <- simul(object = model, Xgrid = Xgrid, ids = ids, nsim = 5, check = TRUE) ##------------------------------------------------------------ ## Example 3: Homoskedastic GP modeling on 1d trajectories (with time) ##------------------------------------------------------------ set.seed(42) nx <- 11 nt <- 9 ns <- 7 x <- matrix(sort(seq(0,1, length.out = nx)), nx) s <- matrix(sort(seq(1, ns, length.out = ns))) t <- matrix(sort(seq(0, 1, length.out = nt)), nt) covtype <- "Matern5_2" g <- 1e-3 theta <- c(0.3, 0.5) KX <- cov_gen(x, theta = theta[1], type = covtype) KT <- cov_gen(t, theta = theta[2], type = covtype) rho <- 0.3 KS <- matrix(rho, ns, ns) diag(KS) <- 1 XST <- as.matrix(expand.grid(x, s, t)) Kmc <- kronecker(chol(KT), kronecker(chol(KS), chol(KX))) YY <- t(Kmc) %*% rnorm(nrow(Kmc)) ninit <- 50 XS <- as.matrix(expand.grid(x, s)) ids <- sort(sample(1:nrow(XS), ninit)) XST0 <- cbind(XS[ids[rep(1:ninit, each = nt)],], rep(t[,1], times = ninit)) X0 <- XST[which(duplicated(rbind(XST, XST0), fromLast = TRUE)),] Y0 <- YY[which(duplicated(rbind(XST, XST0), fromLast = TRUE))] # tmp <- hetGP:::find_reps(X = X0[,-3], Y0) model <- mleCRNGP(X = XS[ids,], T0=t, Z = matrix(Y0, ncol = nt), covtype = covtype) preds <- predict(model, x = XS, xprime = XS) # compare with regular CRN GP mref <- mleCRNGP(X = X0[, c(1, 3, 2)], Z = Y0, covtype = covtype) pref <- predict(mref, x = XST[, c(1, 3, 2)], xprime = XST[, c(1, 3, 2)]) print(model$time) # Use Kronecker structure for time print(mref$time) plot(as.vector(preds$mean), YY) plot(pref$mean, YY) ## End(Not run)
Gaussian process regression under input dependent noise based on maximum likelihood estimation of the hyperparameters. A second GP is used to model latent (log-) variances. This function is enhanced to deal with replicated observations.
mleHetGP( X, Z, lower = NULL, upper = NULL, noiseControl = list(k_theta_g_bounds = c(1, 100), g_max = 100, g_bounds = c(1e-06, 1)), settings = list(linkThetas = "joint", logN = TRUE, initStrategy = "residuals", checkHom = TRUE, penalty = TRUE, trace = 0, return.matrices = TRUE, return.hom = FALSE, factr = 1e+09), covtype = c("Gaussian", "Matern5_2", "Matern3_2"), maxit = 100, known = NULL, init = NULL, eps = sqrt(.Machine$double.eps) )
X |
matrix of all designs, one per row, or list with elements:
|
Z |
vector of all observations. If using a list with |
lower , upper
|
optional bounds for the |
noiseControl |
list with elements related to optimization of the noise process parameters:
|
settings |
list for options about the general modeling procedure, with elements:
|
covtype |
covariance kernel type, either |
maxit |
maximum number of iterations for |
init , known
|
optional lists of starting values for mle optimization or that should not be optimized over, respectively.
Values in
|
eps |
jitter used in the inversion of the covariance matrix for numerical stability |
The global covariance matrix of the model is parameterized as nu_hat * (C + Lambda * diag(1/mult)) = nu_hat * K, with C the correlation matrix between unique designs, depending on the family of kernel used (see cov_gen for available choices) and values of lengthscale parameters. nu_hat is the plugin estimator of the variance of the process.
Lambda is the prediction on the noise level given by a second (homoskedastic) GP:
Lambda = C_g (C_g + diag(g/mult))^(-1) Delta,
with C_g the correlation matrix between unique designs for this second GP, with lengthscale hyperparameters theta_g and nugget g, and Delta the variance level at X0, all of which are estimated.
It is generally recommended to use find_reps to pre-process the data, to rescale the inputs to the unit cube and to normalize the outputs.
The noise process lengthscales can be set in several ways:
using k_theta_g (settings$linkThetas == 'joint'), supposed to be greater than one by default; in this case lengthscales of the noise process are multiples of those of the mean process;
if settings$linkThetas == 'constr', then the lower bound on theta_g corresponds to the estimated values of a homoskedastic GP fit;
otherwise lengthscales of the mean and noise processes are independent (both either anisotropic or not).
When no starting nor fixed parameter values are provided with init or known, the initialization process consists of first fitting a homoskedastic model of the data, called modHom. Unless provided with init$theta, initial lengthscales are taken at 10% of the range determined with lower and upper, while init$g_H may be used to pass an initial nugget value. The resulting lengthscales provide initial values for theta (or update them if given in init).
If necessary, a second homoskedastic model, modNugs, is fitted to the empirical residual variance between the prediction given by modHom at X0 and Z (up to modHom$nu_hat). Note that when specifying settings$linkThetas == 'joint', this second homoskedastic model has fixed lengthscale parameters. Starting values for theta_g and g are extracted from modNugs.
Finally, three initialization schemes for Delta are available with settings$initStrategy (a selection sketch follows the note below):
for settings$initStrategy == 'simple', Delta is simply initialized to the estimated g value of modHom. Note that this procedure may fail when settings$penalty == TRUE;
for settings$initStrategy == 'residuals', Delta is initialized to the estimated residual variance from the homoskedastic mean prediction;
for settings$initStrategy == 'smoothed', Delta takes the values predicted by modNugs at X0.
Notice that lower and upper bounds cannot be equal for optim.
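A minimal sketch of selecting the initialization scheme through settings, reusing the motorcycle data of Example 1 below; unspecified settings elements are assumed to keep their defaults:
model <- mleHetGP(X = X, Z = Z, lower = 0.1, upper = 50,
  settings = list(initStrategy = "smoothed"))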
a list which is given the S3 class "hetGP", with elements:
theta: unless given, maximum likelihood estimate (mle) of the lengthscale parameter(s),
Delta: unless given, mle of the nugget vector (non-smoothed),
Lambda: predicted input noise variance at X0,
nu_hat: plugin estimator of the variance,
theta_g: unless given, mle of the lengthscale(s) of the noise/log-noise process,
k_theta_g: if settings$linkThetas == 'joint', mle for the constant by which lengthscale parameters of theta are multiplied to get theta_g,
g: unless given, mle of the nugget of the noise/log-noise process,
trendtype: either "SK" if beta0 is provided, else "OK",
beta0: constant trend of the mean process, plugin-estimator unless given,
nmean: plugin estimator for the constant noise/log-noise process mean,
ll: log-likelihood value, (ll_non_pen) is the value without the penalty,
nit_opt, msg: counts and message returned by optim,
modHom: homoskedastic GP model of class homGP used for initialization of the mean process,
modNugs: homoskedastic GP model of class homGP used for initialization of the noise/log-noise process,
nu_hat_var: variance of the noise process,
used_args: list with arguments provided in the call to the function, which is saved in call,
Ki, Kgi: inverses of the covariance matrices of the mean and noise processes (not scaled by nu_hat and nu_hat_var),
X0, Z0, Z, eps, logN, covtype: values given in input,
time: time to train the model, in seconds.
M. Binois, Robert B. Gramacy, M. Ludkovski (2018), Practical heteroskedastic Gaussian process modeling for large simulation experiments,
Journal of Computational and Graphical Statistics, 27(4), 808–821.
Preprint available on arXiv:1611.05902.
predict.hetGP for predictions, update.hetGP for updating an existing model. summary and plot functions are available as well. mleHetTP provides a Student-t equivalent.
##------------------------------------------------------------ ## Example 1: Heteroskedastic GP modeling on the motorcycle data ##------------------------------------------------------------ set.seed(32) ## motorcycle data library(MASS) X <- matrix(mcycle$times, ncol = 1) Z <- mcycle$accel nvar <- 1 plot(X, Z, ylim = c(-160, 90), ylab = 'acceleration', xlab = "time") ## Model fitting model <- mleHetGP(X = X, Z = Z, lower = rep(0.1, nvar), upper = rep(50, nvar), covtype = "Matern5_2") ## Display averaged observations points(model$X0, model$Z0, pch = 20) ## A quick view of the fit summary(model) ## Create a prediction grid and obtain predictions xgrid <- matrix(seq(0, 60, length.out = 301), ncol = 1) predictions <- predict(x = xgrid, object = model) ## Display mean predictive surface lines(xgrid, predictions$mean, col = 'red', lwd = 2) ## Display 95% confidence intervals lines(xgrid, qnorm(0.05, predictions$mean, sqrt(predictions$sd2)), col = 2, lty = 2) lines(xgrid, qnorm(0.95, predictions$mean, sqrt(predictions$sd2)), col = 2, lty = 2) ## Display 95% prediction intervals lines(xgrid, qnorm(0.05, predictions$mean, sqrt(predictions$sd2 + predictions$nugs)), col = 3, lty = 2) lines(xgrid, qnorm(0.95, predictions$mean, sqrt(predictions$sd2 + predictions$nugs)), col = 3, lty = 2) ##------------------------------------------------------------ ## Example 2: 2D Heteroskedastic GP modeling ##------------------------------------------------------------ set.seed(1) nvar <- 2 ## Branin redefined in [0,1]^2 branin <- function(x){ if(is.null(nrow(x))) x <- matrix(x, nrow = 1) x1 <- x[,1] * 15 - 5 x2 <- x[,2] * 15 (x2 - 5/(4 * pi^2) * (x1^2) + 5/pi * x1 - 6)^2 + 10 * (1 - 1/(8 * pi)) * cos(x1) + 10 } ## Noise field via standard deviation noiseFun <- function(x){ if(is.null(nrow(x))) x <- matrix(x, nrow = 1) return(1/5*(3*(2 + 2*sin(x[,1]*pi)*cos(x[,2]*3*pi) + 5*rowSums(x^2)))) } ## data generating function combining mean and noise fields ftest <- function(x){ return(branin(x) + rnorm(nrow(x), mean = 0, sd = noiseFun(x))) } ## Grid of predictive locations ngrid <- 51 xgrid <- matrix(seq(0, 1, length.out = ngrid), ncol = 1) Xgrid <- as.matrix(expand.grid(xgrid, xgrid)) ## Unique (randomly chosen) design locations n <- 50 Xu <- matrix(runif(n * 2), n) ## Select replication sites randomly X <- Xu[sample(1:n, 20*n, replace = TRUE),] ## obtain training data response at design locations X Z <- ftest(X) ## Formatting of data for model creation (find replicated observations) prdata <- find_reps(X, Z, rescale = FALSE, normalize = FALSE) ## Model fitting model <- mleHetGP(X = list(X0 = prdata$X0, Z0 = prdata$Z0, mult = prdata$mult), Z = prdata$Z, lower = rep(0.01, nvar), upper = rep(10, nvar), covtype = "Matern5_2") ## a quick view into the data stored in the "hetGP"-class object summary(model) ## prediction from the fit on the grid predictions <- predict(x = Xgrid, object = model) ## Visualization of the predictive surface par(mfrow = c(2, 2)) contour(x = xgrid, y = xgrid, z = matrix(branin(Xgrid), ngrid), main = "Branin function", nlevels = 20) points(X, col = 'blue', pch = 20) contour(x = xgrid, y = xgrid, z = matrix(predictions$mean, ngrid), main = "Predicted mean", nlevels = 20) points(Xu, col = 'blue', pch = 20) contour(x = xgrid, y = xgrid, z = matrix(noiseFun(Xgrid), ngrid), main = "Noise standard deviation function", nlevels = 20) points(Xu, col = 'blue', pch = 20) contour(x = xgrid, y= xgrid, z = matrix(sqrt(predictions$nugs), ngrid), main = "Predicted noise values", nlevels = 20) points(Xu, col = 'blue', pch = 20) par(mfrow = c(1, 1))
Student-t process regression under input dependent noise based on maximum likelihood estimation of the hyperparameters. A GP is used to model latent (log-) variances. This function is enhanced to deal with replicated observations.
mleHetTP( X, Z, lower = NULL, upper = NULL, noiseControl = list(k_theta_g_bounds = c(1, 100), g_max = 10000, g_bounds = c(1e-06, 0.1), nu_bounds = c(2 + 0.001, 30), sigma2_bounds = c(sqrt(.Machine$double.eps), 10000)), settings = list(linkThetas = "joint", logN = TRUE, initStrategy = "residuals", checkHom = TRUE, penalty = TRUE, trace = 0, return.matrices = TRUE, return.hom = FALSE, factr = 1e+09), covtype = c("Gaussian", "Matern5_2", "Matern3_2"), maxit = 100, known = list(beta0 = 0), init = list(nu = 3), eps = sqrt(.Machine$double.eps) )
X |
matrix of all designs, one per row, or list with elements:
|
Z |
vector of all observations. If using a list with |
lower , upper
|
bounds for the |
noiseControl |
list with elements related to optimization of the noise process parameters:
|
settings |
list for options about the general modeling procedure, with elements:
|
covtype |
covariance kernel type, either |
maxit |
maximum number of iterations for |
init , known
|
optional lists of starting values for mle optimization or that should not be optimized over, respectively.
Values in
|
eps |
jitter used in the inversion of the covariance matrix for numerical stability |
The global covariance matrix of the model is parameterized as K = sigma2 * C + Lambda * diag(1/mult), with C the correlation matrix between unique designs, depending on the family of kernel used (see cov_gen for available choices).
Lambda is the prediction on the noise level given by a (homoskedastic) GP:
Lambda = C_g (C_g + diag(g/mult))^(-1) Delta,
with C_g the correlation matrix between unique designs for this second GP, with lengthscale hyperparameters theta_g and nugget g, and Delta the variance level at X0, all of which are estimated.
It is generally recommended to use find_reps to pre-process the data, to rescale the inputs to the unit cube and to normalize the outputs.
The noise process lengthscales can be set in several ways:
using k_theta_g (settings$linkThetas == 'joint'), supposed to be greater than one by default; in this case lengthscales of the noise process are multiples of those of the mean process;
if settings$linkThetas == 'constr', then the lower bound on theta_g corresponds to the estimated values of a homoskedastic GP fit;
otherwise lengthscales of the mean and noise processes are independent (both either anisotropic or not).
When no starting nor fixed parameter values are provided with init or known, the initialization process consists of first fitting a homoskedastic model of the data, called modHom. Unless provided with init$theta, initial lengthscales are taken at 10% of the range determined with lower and upper, while init$g_H may be used to pass an initial nugget value. The resulting lengthscales provide initial values for theta (or update them if given in init).
If necessary, a second homoskedastic model, modNugs, is fitted to the empirical residual variance between the prediction given by modHom at X0 and Z (up to modHom$nu_hat). Note that when specifying settings$linkThetas == 'joint', this second homoskedastic model has fixed lengthscale parameters. Starting values for theta_g and g are extracted from modNugs.
Finally, three initialization schemes for Delta are available with settings$initStrategy:
for settings$initStrategy == 'simple', Delta is simply initialized to the estimated g value of modHom. Note that this procedure may fail when settings$penalty == TRUE;
for settings$initStrategy == 'residuals', Delta is initialized to the estimated residual variance from the homoskedastic mean prediction;
for settings$initStrategy == 'smoothed', Delta takes the values predicted by modNugs at X0.
Notice that lower and upper bounds cannot be equal for optim.
a list which is given the S3 class "hetTP", with elements:
theta: unless given, maximum likelihood estimate (mle) of the lengthscale parameter(s),
Delta: unless given, mle of the nugget vector (non-smoothed),
Lambda: predicted input noise variance at X0,
sigma2: plugin estimator of the variance,
theta_g: unless given, mle of the lengthscale(s) of the noise/log-noise process,
k_theta_g: if settings$linkThetas == 'joint', mle for the constant by which lengthscale parameters of theta are multiplied to get theta_g,
g: unless given, mle of the nugget of the noise/log-noise process,
trendtype: either "SK" if beta0 is provided, else "OK",
beta0: constant trend of the mean process, plugin-estimator unless given,
nmean: plugin estimator for the constant noise/log-noise process mean,
ll: log-likelihood value, (ll_non_pen) is the value without the penalty,
nit_opt, msg: counts and message returned by optim,
modHom: homoskedastic GP model of class homGP used for initialization of the mean process,
modNugs: homoskedastic GP model of class homGP used for initialization of the noise/log-noise process,
nu_hat_var: variance of the noise process,
used_args: list with arguments provided in the call to the function, which is saved in call,
X0, Z0, Z, eps, logN, covtype: values given in input,
time: time to train the model, in seconds.
M. Binois, Robert B. Gramacy, M. Ludkovski (2018), Practical heteroskedastic Gaussian process modeling for large simulation experiments,
Journal of Computational and Graphical Statistics, 27(4), 808–821.
Preprint available on arXiv:1611.05902.
A. Shah, A. Wilson, Z. Ghahramani (2014), Student-t processes as alternatives to Gaussian processes, Artificial Intelligence and Statistics, 877–885.
predict.hetTP for predictions. summary and plot functions are available as well.
##------------------------------------------------------------ ## Example 1: Heteroskedastic TP modeling on the motorcycle data ##------------------------------------------------------------ set.seed(32) ## motorcycle data library(MASS) X <- matrix(mcycle$times, ncol = 1) Z <- mcycle$accel nvar <- 1 plot(X, Z, ylim = c(-160, 90), ylab = 'acceleration', xlab = "time") ## Model fitting model <- mleHetTP(X = X, Z = Z, lower = rep(0.1, nvar), upper = rep(50, nvar), covtype = "Matern5_2") ## Display averaged observations points(model$X0, model$Z0, pch = 20) ## A quick view of the fit summary(model) ## Create a prediction grid and obtain predictions xgrid <- matrix(seq(0, 60, length.out = 301), ncol = 1) preds <- predict(x = xgrid, object = model) ## Display mean predictive surface lines(xgrid, preds$mean, col = 'red', lwd = 2) ## Display 95% confidence intervals lines(xgrid, preds$mean + sqrt(preds$sd2) * qt(0.05, df = model$nu + nrow(X)), col = 2, lty = 2) lines(xgrid, preds$mean + sqrt(preds$sd2) * qt(0.95, df = model$nu + nrow(X)), col = 2, lty = 2) ## Display 95% prediction intervals lines(xgrid, preds$mean + sqrt(preds$sd2 + preds$nugs) * qt(0.05, df = model$nu + nrow(X)), col = 3, lty = 2) lines(xgrid, preds$mean + sqrt(preds$sd2 + preds$nugs) * qt(0.95, df = model$nu + nrow(X)), col = 3, lty = 2) ##------------------------------------------------------------ ## Example 2: 2D Heteroskedastic TP modeling ##------------------------------------------------------------ set.seed(1) nvar <- 2 ## Branin redefined in [0,1]^2 branin <- function(x){ if(is.null(nrow(x))) x <- matrix(x, nrow = 1) x1 <- x[,1] * 15 - 5 x2 <- x[,2] * 15 (x2 - 5/(4 * pi^2) * (x1^2) + 5/pi * x1 - 6)^2 + 10 * (1 - 1/(8 * pi)) * cos(x1) + 10 } ## Noise field via standard deviation noiseFun <- function(x){ if(is.null(nrow(x))) x <- matrix(x, nrow = 1) return(1/5*(3*(2 + 2*sin(x[,1]*pi)*cos(x[,2]*3*pi) + 5*rowSums(x^2)))) } ## data generating function combining mean and noise fields ftest <- function(x){ return(branin(x) + rnorm(nrow(x), mean = 0, sd = noiseFun(x))) } ## Grid of predictive locations ngrid <- 51 xgrid <- matrix(seq(0, 1, length.out = ngrid), ncol = 1) Xgrid <- as.matrix(expand.grid(xgrid, xgrid)) ## Unique (randomly chosen) design locations n <- 100 Xu <- matrix(runif(n * 2), n) ## Select replication sites randomly X <- Xu[sample(1:n, 20*n, replace = TRUE),] ## obtain training data response at design locations X Z <- ftest(X) ## Formatting of data for model creation (find replicated observations) prdata <- find_reps(X, Z, rescale = FALSE, normalize = FALSE) ## Model fitting model <- mleHetTP(X = list(X0 = prdata$X0, Z0 = prdata$Z0, mult = prdata$mult), Z = prdata$Z, lower = rep(0.01, nvar), upper = rep(10, nvar), covtype = "Matern5_2") ## a quick view into the data stored in the "hetTP"-class object summary(model) ## prediction from the fit on the grid preds <- predict(x = Xgrid, object = model) ## Visualization of the predictive surface par(mfrow = c(2, 2)) contour(x = xgrid, y = xgrid, z = matrix(branin(Xgrid), ngrid), main = "Branin function", nlevels = 20) points(X, col = 'blue', pch = 20) contour(x = xgrid, y = xgrid, z = matrix(preds$mean, ngrid), main = "Predicted mean", nlevels = 20) points(X, col = 'blue', pch = 20) contour(x = xgrid, y = xgrid, z = matrix(noiseFun(Xgrid), ngrid), main = "Noise standard deviation function", nlevels = 20) points(X, col = 'blue', pch = 20) contour(x = xgrid, y= xgrid, z = matrix(sqrt(preds$nugs), ngrid), main = "Predicted noise values", nlevels = 20) points(X, col = 'blue', pch = 20) par(mfrow = c(1, 1))
Gaussian process regression under homoskedastic noise based on maximum likelihood estimation of the hyperparameters. This function is enhanced to deal with replicated observations.
mleHomGP( X, Z, lower = NULL, upper = NULL, known = NULL, noiseControl = list(g_bounds = c(sqrt(.Machine$double.eps), 100)), init = NULL, covtype = c("Gaussian", "Matern5_2", "Matern3_2"), maxit = 100, eps = sqrt(.Machine$double.eps), settings = list(return.Ki = TRUE, factr = 1e+07) )
X | matrix of all designs, one per row, or list with elements X0 (unique designs), Z0 (averaged observations) and mult (number of replicates at each design)
Z | vector of all observations. If using a list with X, Z has to be ordered with respect to X0, of length sum(mult)
lower, upper | optional bounds for the theta parameter (see cov_gen for the exact parameterization)
known | optional list of known parameters, e.g., beta0, theta or g
noiseControl | list with element g_bounds, a vector giving the minimal and maximal values allowed for the nugget variance g
init | optional list specifying starting values for MLE optimization, e.g., for theta and g
covtype | covariance kernel type, either 'Gaussian', 'Matern5_2' or 'Matern3_2', see cov_gen
maxit | maximum number of iterations for L-BFGS-B of optim
eps | jitter used in the inversion of the covariance matrix for numerical stability
settings | list with argument return.Ki, to include the inverse covariance matrix in the object for further use (e.g., prediction), and factr, the tolerance passed to L-BFGS-B (see optim)
The global covariance matrix of the model is parameterized as nu_hat * (C + g * diag(1/mult)) = nu_hat * K, with C the correlation matrix between unique designs, depending on the family of kernel used (see cov_gen for available choices) and values of lengthscale parameters. nu_hat is the plugin estimator of the variance of the process.
It is generally recommended to use find_reps to pre-process the data, to rescale the inputs to the unit cube and to normalize the outputs.
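Since the hyperparameters and (optionally) Ki are stored in the returned object, this parameterization can be checked numerically. A minimal sketch on the motorcycle data (variable names illustrative; the identity holds up to the eps jitter):

library(MASS)    # for the mcycle data
library(hetGP)

## Fit a homoskedastic GP
fit <- mleHomGP(X = matrix(mcycle$times, ncol = 1), Z = mcycle$accel,
                lower = 0.01, upper = 100)

## Reassemble K = C + g * diag(1/mult) from the stored hyperparameters
C <- cov_gen(fit$X0, theta = fit$theta, type = fit$covtype)
K <- C + diag(fit$g / fit$mult)

## fit$Ki stores the inverse of K (not scaled by nu_hat)
max(abs(K %*% fit$Ki - diag(nrow(K))))  # should be near zero, up to eps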
A list which is given the S3 class "homGP", with elements:
theta: maximum likelihood estimate of the lengthscale parameter(s);
g: maximum likelihood estimate of the nugget variance;
trendtype: either "SK" if beta0 is given, else "OK";
beta0: estimated trend, unless given as input;
nu_hat: plugin estimator of the variance;
ll: log-likelihood value;
X0, Z0, Z, mult, eps, covtype: values given as input;
call: user call of the function;
used_args: list with arguments provided in the call;
nit_opt, msg: counts and message returned by optim;
Ki: inverse covariance matrix, not scaled by nu_hat (returned if return.Ki is TRUE in settings);
time: time to train the model, in seconds.
M. Binois, Robert B. Gramacy, M. Ludkovski (2018), Practical heteroskedastic Gaussian process modeling for large simulation experiments,
Journal of Computational and Graphical Statistics, 27(4), 808–821.
Preprint available on arXiv:1611.05902.
predict.homGP for predictions, update.homGP for updating an existing model. summary and plot functions are available as well. mleHomTP provides a Student-t equivalent.
##------------------------------------------------------------
## Example 1: Homoskedastic GP modeling on the motorcycle data
##------------------------------------------------------------
set.seed(32)
## motorcycle data
library(MASS)
X <- matrix(mcycle$times, ncol = 1)
Z <- mcycle$accel
plot(X, Z, ylim = c(-160, 90), ylab = 'acceleration', xlab = "time")

model <- mleHomGP(X = X, Z = Z, lower = 0.01, upper = 100)

## Display averaged observations
points(model$X0, model$Z0, pch = 20)

xgrid <- matrix(seq(0, 60, length.out = 301), ncol = 1)
predictions <- predict(x = xgrid, object = model)

## Display mean prediction
lines(xgrid, predictions$mean, col = 'red', lwd = 2)
## Display 95% confidence intervals
lines(xgrid, qnorm(0.05, predictions$mean, sqrt(predictions$sd2)), col = 2, lty = 2)
lines(xgrid, qnorm(0.95, predictions$mean, sqrt(predictions$sd2)), col = 2, lty = 2)
## Display 95% prediction intervals
lines(xgrid, qnorm(0.05, predictions$mean, sqrt(predictions$sd2 + predictions$nugs)),
      col = 3, lty = 2)
lines(xgrid, qnorm(0.95, predictions$mean, sqrt(predictions$sd2 + predictions$nugs)),
      col = 3, lty = 2)
Student-t process regression under homoskedastic noise based on maximum likelihood estimation of the hyperparameters. This function is enhanced to deal with replicated observations.
mleHomTP( X, Z, lower = NULL, upper = NULL, known = list(beta0 = 0), noiseControl = list(g_bounds = c(sqrt(.Machine$double.eps), 10000), nu_bounds = c(2 + 0.001, 30), sigma2_bounds = c(sqrt(.Machine$double.eps), 10000)), init = list(nu = 3), covtype = c("Gaussian", "Matern5_2", "Matern3_2"), maxit = 100, eps = sqrt(.Machine$double.eps), settings = list(return.Ki = TRUE, factr = 1e+09) )
X | matrix of all designs, one per row, or list with elements X0 (unique designs), Z0 (averaged observations) and mult (number of replicates at each design)
Z | vector of all observations. If using a list with X, Z has to be ordered with respect to X0, of length sum(mult)
lower, upper | bounds for the theta parameter (see cov_gen for the exact parameterization)
known | optional list of known parameters, e.g., beta0 (fixed to 0 by default), theta, g, sigma2 or nu
noiseControl | list with elements g_bounds, nu_bounds and sigma2_bounds, giving the minimal and maximal values allowed for the nugget variance, the degrees of freedom and the scale parameter, respectively
init | list specifying starting values for MLE optimization, e.g., nu (default list(nu = 3))
covtype | covariance kernel type, either 'Gaussian', 'Matern5_2' or 'Matern3_2', see cov_gen
maxit | maximum number of iterations for L-BFGS-B of optim
eps | jitter used in the inversion of the covariance matrix for numerical stability
settings | list with argument return.Ki, to include the inverse covariance matrix in the object for further use (e.g., prediction), and factr, the tolerance passed to L-BFGS-B (see optim)
The global covariance matrix of the model is parameterized as K = sigma2 * C + g * diag(1/mult), with C the correlation matrix between unique designs, depending on the family of kernel used (see cov_gen for available choices).
It is generally recommended to use find_reps to pre-process the data, to rescale the inputs to the unit cube and to normalize the outputs.
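Analogously to the homGP case above, the stored hyperparameters let one reassemble this covariance matrix explicitly; a minimal sketch on the motorcycle data (variable names illustrative, eps jitter ignored):

library(MASS)    # for the mcycle data
library(hetGP)

fit <- mleHomTP(X = matrix(mcycle$times, ncol = 1), Z = mcycle$accel,
                lower = 0.01, upper = 100)

## Reassemble K = sigma2 * C + g * diag(1/mult) at the unique designs
C <- cov_gen(fit$X0, theta = fit$theta, type = fit$covtype)
K <- fit$sigma2 * C + diag(fit$g / fit$mult)
dim(K)  # one row/column per unique design in fit$X0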
A list which is given the S3 class "homTP", with elements:
theta: maximum likelihood estimate of the lengthscale parameter(s);
g: maximum likelihood estimate of the nugget variance;
trendtype: either "SK" if beta0 is given, else "OK";
beta0: estimated trend, unless given as input;
sigma2: maximum likelihood estimate of the scale variance;
nu2: maximum likelihood estimate of the degrees of freedom parameter;
ll: log-likelihood value;
X0, Z0, Z, mult, eps, covtype: values given as input;
call: user call of the function;
used_args: list with arguments provided in the call;
nit_opt, msg: counts and message returned by optim;
Ki: inverse covariance matrix (returned if return.Ki is TRUE in settings);
time: time to train the model, in seconds.
M. Binois, Robert B. Gramacy, M. Ludkovski (2018), Practical heteroskedastic Gaussian process modeling for large simulation experiments,
Journal of Computational and Graphical Statistics, 27(4), 808–821.
Preprint available on arXiv:1611.05902.
A. Shah, A. Wilson, Z. Ghahramani (2014), Student-t processes as alternatives to Gaussian processes, Artificial Intelligence and Statistics, 877–885.
M. Chung, M. Binois, RB Gramacy, DJ Moquin, AP Smith, AM Smith (2019).
Parameter and Uncertainty Estimation for Dynamical Systems Using Surrogate Stochastic Processes.
SIAM Journal on Scientific Computing, 41(4), 2212-2238.
Preprint available on arXiv:1802.00852.
predict.homTP for predictions. summary and plot functions are available as well.
##------------------------------------------------------------
## Example 1: Homoskedastic Student-t modeling on the motorcycle data
##------------------------------------------------------------
set.seed(32)
## motorcycle data
library(MASS)
X <- matrix(mcycle$times, ncol = 1)
Z <- mcycle$accel
plot(X, Z, ylim = c(-160, 90), ylab = 'acceleration', xlab = "time")

noiseControl <- list(g_bounds = c(1e-3, 1e4))
model <- mleHomTP(X = X, Z = Z, lower = 0.01, upper = 100, noiseControl = noiseControl)
summary(model)

## Display averaged observations
points(model$X0, model$Z0, pch = 20)

xgrid <- matrix(seq(0, 60, length.out = 301), ncol = 1)
preds <- predict(x = xgrid, object = model)

## Display mean prediction
lines(xgrid, preds$mean, col = 'red', lwd = 2)
## Display 95% confidence intervals
lines(xgrid, preds$mean + sqrt(preds$sd2) * qt(0.05, df = model$nu + nrow(X)), col = 2, lty = 2)
lines(xgrid, preds$mean + sqrt(preds$sd2) * qt(0.95, df = model$nu + nrow(X)), col = 2, lty = 2)
## Display 95% prediction intervals
lines(xgrid, preds$mean + sqrt(preds$sd2 + preds$nugs) * qt(0.05, df = model$nu + nrow(X)),
      col = 3, lty = 2)
lines(xgrid, preds$mean + sqrt(preds$sd2 + preds$nugs) * qt(0.95, df = model$nu + nrow(X)),
      col = 3, lty = 2)
Gaussian process prediction at a noisy input x, with centered Gaussian noise of variance sigma_x. Several options are available, with different efficiency/accuracy tradeoffs.
pred_noisy_input(x, model, sigma_x, type = c("simple", "taylor", "exact"))
x | design considered
model | GP model, e.g., of class homGP
sigma_x | input variance
type | available options include "simple", "taylor" and "exact", with different efficiency/accuracy tradeoffs
Beta version.
A. McHutchon and C.E. Rasmussen (2011),
Gaussian process training with input noise,
Advances in Neural Information Processing Systems, 1341-1349.
A. Girard, C.E. Rasmussen, J.Q. Candela and R. Murray-Smith (2003), Gaussian process priors with uncertain inputs application to multiple-step ahead time series forecasting, Advances in Neural Information Processing Systems, 545-552.
################################################################################
### Illustration of prediction with input noise
################################################################################

## noise std deviation function defined in [0,1]
noiseFun <- function(x, coef = 1.1, scale = 0.25){
  if(is.null(nrow(x))) x <- matrix(x, nrow = 1)
  return(scale * (coef + sin(x * 2 * pi)))
}

## data generating function combining mean and noise fields
ftest <- function(x, scale = 0.25){
  if(is.null(nrow(x))) x <- matrix(x, ncol = 1)
  return(f1d(x) + rnorm(nrow(x), mean = 0, sd = noiseFun(x, scale = scale)))
}

ntest <- 101
xgrid <- seq(0, 1, length.out = ntest)
Xgrid <- matrix(xgrid, ncol = 1)
set.seed(42)
Xpred <- Xgrid[rep(1:ntest, each = 100),, drop = FALSE]
Zpred <- matrix(ftest(Xpred), byrow = TRUE, nrow = ntest)
n <- 10
N <- 20
X <- matrix(seq(0, 1, length.out = n))
if(N > n) X <- rbind(X, X[sample(1:n, N - n, replace = TRUE),, drop = FALSE])
X <- X[order(X[,1]),, drop = FALSE]
Z <- apply(X, 1, ftest)

par(mfrow = c(1, 2))
plot(X, Z, ylim = c(-10, 15), xlim = c(-0.1, 1.1))
lines(xgrid, f1d(xgrid))
lines(xgrid, drop(f1d(xgrid)) + 2 * noiseFun(xgrid), lty = 3)
lines(xgrid, drop(f1d(xgrid)) - 2 * noiseFun(xgrid), lty = 3)

model <- mleHomGP(X, Z, known = list(beta0 = 0))
preds <- predict(model, Xgrid)
lines(xgrid, preds$mean, col = "red", lwd = 2)
lines(xgrid, preds$mean - 2 * sqrt(preds$sd2), col = "blue")
lines(xgrid, preds$mean + 2 * sqrt(preds$sd2), col = "blue")
lines(xgrid, preds$mean - 2 * sqrt(preds$sd2 + preds$nugs), col = "blue", lty = 2)
lines(xgrid, preds$mean + 2 * sqrt(preds$sd2 + preds$nugs), col = "blue", lty = 2)

sigmax <- 0.1
X1 <- matrix(0.5)
lines(xgrid, dnorm(xgrid, X1, sigmax) - 10, col = "darkgreen")

# MC experiment
nmc <- 1000
XX <- matrix(rnorm(nmc, X1, sigmax))
pxx <- predict(model, XX)
YXX <- rnorm(nmc, mean = pxx$mean, sd = sqrt(pxx$sd2 + pxx$nugs))
points(XX, YXX, pch = '.')

hh <- hist(YXX, breaks = 51, plot = FALSE)
dd <- density(YXX)
plot(hh$density, hh$mids, ylim = c(-10, 15))
lines(dd$y, dd$x)

# GP predictions
pin1 <- pred_noisy_input(X1, model, sigmax^2, type = "exact")
pin2 <- pred_noisy_input(X1, model, sigmax^2, type = "taylor")
pin3 <- pred_noisy_input(X1, model, sigmax^2, type = "simple")
ygrid <- seq(-10, 15, length.out = ntest)
lines(dnorm(ygrid, pin1$mean, sqrt(pin1$sd2)), ygrid, lty = 2, col = "orange")
lines(dnorm(ygrid, pin2$mean, sqrt(pin2$sd2)), ygrid, lty = 2, col = "violet")
lines(dnorm(ygrid, pin3$mean, sqrt(pin3$sd2)), ygrid, lty = 2, col = "grey")
abline(h = mean(YXX), col = "red") # empirical mean
par(mfrow = c(1, 1))
Gaussian process predictions using a GP object for correlated noise (of class CRNGP).
## S3 method for class 'CRNGP' predict(object, x, xprime = NULL, t0 = NULL, ...)
object | an object of class CRNGP
x | matrix of design locations to predict at (one point per row). The last column is for the integer-valued seed. If trajectories are considered, i.e., with time, the prediction will occur at the same times as the training data unless t0 is provided.
xprime | optional second matrix of predictive locations, to obtain the predictive covariance matrix between x and xprime
t0 | single-column matrix of times to predict at, if trajectories are considered. By default the prediction is at the same times as the training data.
... | no other argument for this method
The full predictive variance corresponds to the sum of sd2 and nugs. See mleHomGP for examples.
A list with elements:
mean: kriging mean;
sd2: kriging variance (filtered, e.g., without the nugget value);
cov: predictive covariance matrix between x and xprime;
nugs: nugget value at each prediction location, for consistency with mleHomGP.
Gaussian process predictions using a heterogeneous noise GP object (of class hetGP).
## S3 method for class 'hetGP' predict(object, x, noise.var = FALSE, xprime = NULL, nugs.only = FALSE, ...)
object | an object of class hetGP
x | matrix of design locations to predict at (one point per row)
noise.var | should the variance of the latent variance process be returned?
xprime | optional second matrix of predictive locations, to obtain the predictive covariance matrix between x and xprime
nugs.only | if TRUE, only the noise variance prediction is returned
... | no other argument for this method.
The full predictive variance corresponds to the sum of sd2 and nugs. See mleHetGP for examples.
A list with elements:
mean: kriging mean;
sd2: kriging variance (filtered, e.g., without the nugget values);
nugs: noise variance prediction;
sd2_var: (returned if noise.var = TRUE) kriging variance of the noise process (i.e., on log-variances if logN = TRUE);
cov: (returned if xprime is given) predictive covariance matrix between x and xprime.
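Under a Gaussian approximation, these elements are all that is needed to build confidence bounds on the latent mean (sd2 only) and prediction bounds for new observations (sd2 + nugs). A minimal sketch, assuming a fitted hetGP model fit and a matrix of prediction locations xgrid (both illustrative):

p <- predict(fit, x = xgrid)

## 95% confidence interval on the latent mean: uses sd2 only
ci_lower <- qnorm(0.025, p$mean, sqrt(p$sd2))
ci_upper <- qnorm(0.975, p$mean, sqrt(p$sd2))

## 95% prediction interval for new observations: adds the noise variance nugs
pi_lower <- qnorm(0.025, p$mean, sqrt(p$sd2 + p$nugs))
pi_upper <- qnorm(0.975, p$mean, sqrt(p$sd2 + p$nugs))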
Student-t process predictions using a heterogeneous noise TP object (of class hetTP).
## S3 method for class 'hetTP' predict(object, x, noise.var = FALSE, xprime = NULL, nugs.only = FALSE, ...)
object | an object of class hetTP
x | matrix of design locations to predict at
noise.var | should the variance of the latent variance process be returned?
xprime | optional second matrix of predictive locations, to obtain the predictive covariance matrix between x and xprime
nugs.only | if TRUE, only return noise variance prediction
... | no other argument for this method.
The full predictive variance corresponds to the sum of sd2 and nugs.
A list with elements:
mean: kriging mean;
sd2: kriging variance (filtered, e.g., without the nugget values);
nugs: noise variance;
sd2_var: (optional) kriging variance of the noise process (i.e., on log-variances if logN = TRUE);
cov: (optional) predictive covariance matrix between x and xprime.
Gaussian process predictions using a homoskedastic noise GP object (of class homGP).
## S3 method for class 'homGP' predict(object, x, xprime = NULL, ...)
object | an object of class homGP
x | matrix of design locations to predict at (one point per row)
xprime | optional second matrix of predictive locations, to obtain the predictive covariance matrix between x and xprime
... | no other argument for this method
The full predictive variance corresponds to the sum of sd2 and nugs. See mleHomGP for examples.
A list with elements:
mean: kriging mean;
sd2: kriging variance (filtered, e.g., without the nugget value);
cov: predictive covariance matrix between x and xprime;
nugs: nugget value at each prediction location, for consistency with mleHomGP.
Student-t process predictions using a homoskedastic noise TP object (of class homTP).
## S3 method for class 'homTP' predict(object, x, xprime = NULL, ...)
object | an object of class homTP
x | matrix of design locations to predict at
xprime | optional second matrix of predictive locations, to obtain the predictive covariance matrix between x and xprime
... | no other argument for this method
The full predictive variance corresponds to the sum of sd2 and nugs.
A list with elements:
mean: kriging mean;
sd2: kriging variance (filtered, e.g., without the nugget value);
cov: predictive covariance matrix between x and xprime;
nugs: nugget value at each prediction location.
Functions to make hetGP objects lighter before exporting them, and to reverse this after import. The rebuild function may also be used to obtain a more robust inverse of the covariance matrix using ginv.
rebuild(object, robust) ## S3 method for class 'homGP' rebuild(object, robust = FALSE) strip(object) ## S3 method for class 'hetGP' rebuild(object, robust = FALSE) ## S3 method for class 'homTP' rebuild(object, robust = FALSE) ## S3 method for class 'hetTP' rebuild(object, robust = FALSE)
object | a homGP, hetGP, homTP or hetTP model, possibly previously lightened with strip
robust | if TRUE, use ginv from MASS for a more robust inversion of the covariance matrix
object with additional or removed slots.
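The intended round trip is to strip a model before serializing it and rebuild it after reading it back; a minimal sketch, assuming a fitted model fit and an illustrative file name:

## Smaller object on disk; internal matrices are dropped by strip
saveRDS(strip(fit), "fit.rds")

## After import, rebuild restores the inverse covariance matrices
fit2 <- rebuild(readRDS("fit.rds"))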
set.seed(32)
## motorcycle data
library(MASS)
X <- matrix(mcycle$times, ncol = 1)
Z <- mcycle$accel

## Model fitting
model <- mleHetGP(X = X, Z = Z, lower = 0.1, upper = 50)

# Check size
object.size(model)

# Remove internal elements, e.g., to save it
model <- strip(model)

# Check new size
object.size(model)

# Now rebuild model, and use ginv instead
model <- rebuild(model, robust = TRUE)
object.size(model)
Score and RMSE function. To assess the performance of the prediction, this function computes the root mean squared error and a proper score function (also known as the negative log-probability density).
scores(model, Xtest, Ztest, return.rmse = FALSE)
model | GP or TP model, e.g., of class homGP or hetGP
Xtest | matrix of new design locations
Ztest | corresponding vector of observations, or alternatively a matrix of size [nrow(Xtest) x number of replicates] or a list of length nrow(Xtest) with at least one value per element
return.rmse | if TRUE, the root mean squared error is returned in addition to the score
T. Gneiting, and A. Raftery (2007). Strictly Proper Scoring Rules, Prediction, and Estimation, Journal of the American Statistical Association, 102(477), 359-378.
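A minimal usage sketch on the motorcycle data, holding out part of the observations as a test set (the split below is illustrative):

library(MASS)    # for the mcycle data
library(hetGP)

set.seed(32)
## Hold out a third of the data for testing
ids <- sample(nrow(mcycle), nrow(mcycle) %/% 3)
model <- mleHetGP(X = matrix(mcycle$times[-ids], ncol = 1),
                  Z = mcycle$accel[-ids], lower = 0.1, upper = 50)

## Proper score of the predictive distribution on held-out data,
## plus the RMSE of the predictive mean
scores(model, Xtest = matrix(mcycle$times[ids], ncol = 1),
       Ztest = mcycle$accel[ids], return.rmse = TRUE)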
Conditional simulation for CRNGP
simul(object, Xgrid, ids, nsim, eps, seqseeds, check)
object | a CRNGP model
Xgrid | matrix of (x, seed) locations where the simulation is performed. All design locations are matched with all seed values; in particular, it is assumed that each unique x value is matched with all seeds before going to the next x value. The last column MUST correspond to seed values.
ids | vector of indices corresponding to observed values in Xgrid
nsim | number of simulations to return
eps | jitter used in the Cholesky decomposition of the covariance matrix for numerical stability
seqseeds | is the seed sequence repeated (e.g., 1 2 3 1 2 3), else it is assumed to be ordered (e.g., 1 1 2 2 3 3)
check | if TRUE, check that Xgrid has the expected structure
Conditional simulation matrix.
Fast conditional simulation for a CRNGP model
## S3 method for class 'CRNGP' simul( object, Xgrid, ids = NULL, nsim = 1, eps = sqrt(.Machine$double.eps), seqseeds = TRUE, check = TRUE )
object | a CRNGP model, e.g., obtained with mleCRNGP
Xgrid | matrix of (x, seed) locations where the simulation is performed. The last column MUST correspond to seed values.
ids | vector of indices corresponding to observed values in Xgrid
nsim | number of simulations to return
eps | jitter used in the Cholesky decomposition of the covariance matrix for numerical stability
seqseeds | is the seed sequence repeated (e.g., 1 2 3 1 2 3), else it is assumed to be ordered (e.g., 1 1 2 2 3 3)
check | if TRUE, check that Xgrid has the expected structure
A matrix of size nrow(Xgrid) x nsim.
J. P. Chiles, P. Delfiner (2012), Geostatistics: Modeling Spatial Uncertainty (Vol. 713), John Wiley & Sons.
C. Chevalier, X. Emery, D. Ginsbourger (2014), Fast Update of Conditional Simulation Ensembles, Mathematical Geosciences.
## Not run: 
##------------------------------------------------------------
## Example: Homoskedastic GP modeling on 2d sims
##------------------------------------------------------------
set.seed(2)
nx <- 31
ns <- 5
d <- 2
x <- as.matrix(expand.grid(seq(0, 1, length.out = nx), seq(0, 1, length.out = nx)))
s <- matrix(seq(1, ns, length.out = ns))
Xgrid <- as.matrix(expand.grid(seq(1, ns, length.out = ns),
                               seq(0, 1, length.out = nx),
                               seq(0, 1, length.out = nx)))
Xgrid <- Xgrid[, c(2, 3, 1)]
g <- 1e-6
theta <- c(0.2, 0.5)
KX <- cov_gen(x, theta = theta)
rho <- 0.33
KS <- matrix(rho, ns, ns)
diag(KS) <- 1
YY <- MASS::mvrnorm(n = 1, mu = rep(0, nx*nx*ns),
                    Sigma = kronecker(KX, KS) + g * diag(nx*nx*ns))
YYmat <- matrix(YY, ns, nx*nx)
filled.contour(matrix(YYmat[1,], nx))
filled.contour(matrix(YYmat[2,], nx))

ids <- sample(1:nrow(Xgrid), 80)
X0 <- Xgrid[ids,]
Y0 <- YY[ids]

# For 3d visualization
# library(rgl)
# plot3d(Xgrid[,1], Xgrid[,2], YY, col = 1 + (Xgrid[,3] - 1) %% 6)
# points3d(X0[,1], X0[,2], Y0, size = 10, col = 1 + ((X0[,3] - 1) %% 6))

model <- mleCRNGP(X0, Y0, known = list(g = 1e-6))

preds <- predict(model, x = Xgrid, xprime = Xgrid)
# surface3d(unique(Xgrid[1:nx^2,1]), unique(Xgrid[,2]), matrix(YY[Xgrid[,3]==1], nx),
#   front = "lines", back = "lines")
# aspect3d(1, 1, 1)
# surface3d(unique(Xgrid[1:nx^2,1]), unique(Xgrid[,2]), matrix(preds$mean[Xgrid[,3]==1], nx),
#   front = "lines", back = "lines", col = "red")

# Conditional realizations (classical way)
set.seed(2)
t0 <- Sys.time()
SigmaCond <- 1/2 * (preds$cov + t(preds$cov))
sims <- t(chol(SigmaCond + diag(sqrt(.Machine$double.eps), nrow(Xgrid)))) %*% rnorm(nrow(Xgrid))
sims <- sims + preds$mean
print(difftime(Sys.time(), t0))
# sims <- MASS::mvrnorm(n = 1, mu = preds$mean, Sigma = 1/2 * (preds$cov + t(preds$cov)))
# plot3d(X0[,1], X0[,2], Y0, size = 10, col = 1 + ((X0[,3] - 1) %% 6))
# surface3d(unique(x[,1]), unique(x[,2]), matrix(sims[Xgrid[,3] == 1], nx), col = 1,
#   front = "lines", back = "lines")
# surface3d(unique(x[,1]), unique(x[,2]), matrix(sims[Xgrid[,3] == 2], nx), col = 2,
#   front = "lines", back = "lines")

# Alternative for conditional realizations
# (note: here the design points are part of the simulation points)
set.seed(2)
t0 <- Sys.time()
condreas <- simul(model, Xgrid, ids = ids)
print(difftime(Sys.time(), t0))
# plot3d(X0[,1], X0[,2], Y0, size = 10, col = 1 + ((X0[,3] - 1) %% 6))
# surface3d(unique(x[,1]), unique(x[,2]), matrix(condreas[Xgrid[,3] == 1], nx), col = 1,
#   front = "lines", back = "lines")
# surface3d(unique(x[,1]), unique(x[,2]), matrix(condreas[Xgrid[,3] == 2], nx), col = 2,
#   front = "lines", back = "lines")

# Alternative using ordered seeds:
Xgrid2 <- as.matrix(expand.grid(seq(0, 1, length.out = nx),
                                seq(0, 1, length.out = nx),
                                seq(1, ns, length.out = ns)))
condreas2 <- simul(model, Xgrid2, ids = ids, seqseeds = FALSE)

## Check that values at X0 are coherent:
# condreas[ids,1] - Y0
# sims[ids,1] - Y0

## Check that the empirical mean/covariance is correct
condreas2 <- simul(model, Xgrid, ids = ids, nsim = 1000)
print(range(rowMeans(condreas2) - preds$mean))
print(range(cov(t(condreas2)) - preds$cov))

## End(Not run)
Epidemiology problem, initial and rescaled to [0,1]^2 versions.
sirEval(x) sirSimulate(S0 = 1990, I0 = 10, M = S0 + I0, beta = 0.75, gamma = 0.5, imm = 0)
x | vector of size two
S0 | initial number of susceptibles
I0 | initial number of infected
M | total population
beta, gamma, imm | control rates
R. Hu, M. Ludkovski (2017), Sequential Design for Ranking Response Surfaces, SIAM/ASA Journal on Uncertainty Quantification, 5(1), 212-239.
## SIR test problem illustration
ngrid <- 10 # increase
xgrid <- seq(0, 1, length.out = ngrid)
Xgrid <- as.matrix(expand.grid(xgrid, xgrid))
nrep <- 5 # increase
X <- Xgrid[rep(1:nrow(Xgrid), nrep),]
Y <- apply(X, 1, sirEval)
dataSIR <- find_reps(X, Y)
filled.contour(xgrid, xgrid, matrix(sapply(dataSIR$Zlist, sd), ngrid),
               xlab = "Susceptibles", ylab = "Infecteds",
               color.palette = terrain.colors)
"hetGP"
-class model fit with new observationsFast update of existing hetGP
model with new observations.
## S3 method for class 'hetGP' update( object, Xnew, Znew, ginit = 0.01, lower = NULL, upper = NULL, noiseControl = NULL, settings = NULL, known = NULL, maxit = 100, method = "quick", ... )
object | previously fit hetGP model
Xnew | matrix of new design locations; one point per row
Znew | vector of new observations at those design locations, of length nrow(Xnew); NAs can be passed, see Details
ginit | minimal value of the smoothing parameter (i.e., nugget of the noise process) for optimization initialization. It is compared to the g value of the fitted noise process in object.
lower, upper, noiseControl, settings, known | optional bounds for MLE optimization, see mleHetGP
maxit | maximum number of iterations for the internal L-BFGS-B optimization method; see optim for further details
method | one of "quick" or "mixed", see Details
... | no other argument for this method.
The update can be performed with or without re-estimating the hyperparameters. In the first case, mleHetGP is called, based on previous values for initialization. The only missing values are the latent variables at the new points, which are initialized based on one of two possible update schemes in method:
"quick": the new delta value is the predicted nugs value from the previous noise model;
"mixed": new values are taken as the barycenter between the prediction given by the noise process and the empirical variance.
The subsequent number of MLE computations can be controlled with maxit.
In case hyperparameters need not be updated, maxit can be set to 0. It is then possible to pass NAs in Znew, and the model can still be used to provide updated variance predictions.
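The maxit = 0 route is handy for lookahead-style computations: a candidate design can be added with an NA response just to inspect the updated predictive variance before actually running the simulator. A minimal sketch, assuming a fitted 1d hetGP model fit (names and values illustrative):

## Add a candidate location without a response; skip MLE with maxit = 0
fit_look <- update(fit, Xnew = matrix(0.5), Znew = NA, maxit = 0)

## Updated (typically reduced) predictive variance near the candidate
xcheck <- matrix(seq(0.4, 0.6, length.out = 11), ncol = 1)
predict(fit_look, x = xcheck)$sd2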
##------------------------------------------------------------
## Sequential update example
##------------------------------------------------------------
set.seed(42)

## Spatially varying noise function
noisefun <- function(x, coef = 1){
  return(coef * (0.05 + sqrt(abs(x)*20/(2*pi))/10))
}

## Initial data set
nvar <- 1
n <- 20
X <- matrix(seq(0, 2 * pi, length = n), ncol = 1)
mult <- sample(1:10, n, replace = TRUE)
X <- rep(X, mult)
Z <- sin(X) + rnorm(length(X), sd = noisefun(X))

## Initial fit
testpts <- matrix(seq(0, 2*pi, length = 10*n), ncol = 1)
model <- model_init <- mleHetGP(X = X, Z = Z, lower = rep(0.1, nvar),
                                upper = rep(50, nvar), maxit = 1000)

## Visualizing initial predictive surface
preds <- predict(x = testpts, model_init)
plot(X, Z)
lines(testpts, preds$mean, col = "red")

## 5 fast update steps
nsteps <- 5
npersteps <- 10
for(i in 1:nsteps){
  newIds <- sort(sample(1:(10*n), npersteps))
  newX <- testpts[newIds, drop = FALSE]
  newZ <- sin(newX) + rnorm(length(newX), sd = noisefun(newX))
  points(newX, newZ, col = "blue", pch = 20)
  model <- update(object = model, Xnew = newX, Znew = newZ)
  X <- c(X, newX)
  Z <- c(Z, newZ)
  plot(X, Z)
  print(model$nit_opt)
}

## Final predictions after the update steps
p_fin <- predict(x = testpts, model)

## Visualizing the result by augmenting earlier plot
lines(testpts, p_fin$mean, col = "blue")
lines(testpts, qnorm(0.05, p_fin$mean, sqrt(p_fin$sd2)), col = "blue", lty = 2)
lines(testpts, qnorm(0.95, p_fin$mean, sqrt(p_fin$sd2)), col = "blue", lty = 2)
lines(testpts, qnorm(0.05, p_fin$mean, sqrt(p_fin$sd2 + p_fin$nugs)), col = "blue", lty = 3)
lines(testpts, qnorm(0.95, p_fin$mean, sqrt(p_fin$sd2 + p_fin$nugs)), col = "blue", lty = 3)

## Now compare to what you would get if you did a full batch fit instead
model_direct <- mleHetGP(X = X, Z = Z, maxit = 1000,
                         lower = rep(0.1, nvar), upper = rep(50, nvar),
                         init = list(theta = model_init$theta,
                                     k_theta_g = model_init$k_theta_g))
p_dir <- predict(x = testpts, model_direct)
print(model_direct$nit_opt)
lines(testpts, p_dir$mean, col = "green")
lines(testpts, qnorm(0.05, p_dir$mean, sqrt(p_dir$sd2)), col = "green", lty = 2)
lines(testpts, qnorm(0.95, p_dir$mean, sqrt(p_dir$sd2)), col = "green", lty = 2)
lines(testpts, qnorm(0.05, p_dir$mean, sqrt(p_dir$sd2 + p_dir$nugs)), col = "green", lty = 3)
lines(testpts, qnorm(0.95, p_dir$mean, sqrt(p_dir$sd2 + p_dir$nugs)), col = "green", lty = 3)
lines(testpts, sin(testpts), col = "red", lty = 2)

## Compare outputs
summary(model_init)
summary(model)
summary(model_direct)
"hetTP"
-class model fit with new observationsFast update of existing hetTP
model with new observations.
## S3 method for class 'hetTP' update( object, Xnew, Znew, ginit = 0.01, lower = NULL, upper = NULL, noiseControl = NULL, settings = NULL, known = NULL, maxit = 100, method = "quick", ... )
object | previously fit hetTP model
Xnew | matrix of new design locations; one point per row
Znew | vector of new observations at those design locations, of length nrow(Xnew); NAs can be passed, see Details
ginit | minimal value of the smoothing parameter (i.e., nugget of the noise process) for optimization initialization. It is compared to the g value of the fitted noise process in object.
lower, upper, noiseControl, settings, known | optional bounds for MLE optimization, see mleHetTP
maxit | maximum number of iterations for the internal L-BFGS-B optimization method; see optim for further details
method | one of "quick" or "mixed", see Details
... | no other argument for this method.
The update can be performed with or without re-estimating the hyperparameters. In the first case, mleHetTP is called, based on previous values for initialization. The only missing values are the latent variables at the new points, which are initialized based on one of two possible update schemes in method:
"quick": the new delta value is the predicted nugs value from the previous noise model;
"mixed": new values are taken as the barycenter between the prediction given by the noise process and the empirical variance.
The subsequent number of MLE computations can be controlled with maxit.
In case hyperparameters need not be updated, maxit can be set to 0. It is then possible to pass NAs in Znew, and the model can still be used to provide updated variance predictions.
##------------------------------------------------------------
## Sequential update example
##------------------------------------------------------------
set.seed(42)

## Spatially varying noise function
noisefun <- function(x, coef = 1){
  return(coef * (0.05 + sqrt(abs(x)*20/(2*pi))/10))
}

## Initial data set
nvar <- 1
n <- 20
X <- matrix(seq(0, 2 * pi, length = n), ncol = 1)
mult <- sample(1:10, n, replace = TRUE)
X <- rep(X, mult)
Z <- sin(X) + noisefun(X) * rt(length(X), df = 10)

## Initial fit
testpts <- matrix(seq(0, 2*pi, length = 10*n), ncol = 1)
model <- model_init <- mleHetTP(X = X, Z = Z, lower = rep(0.1, nvar),
                                upper = rep(50, nvar), maxit = 1000)

## Visualizing initial predictive surface
preds <- predict(x = testpts, model_init)
plot(X, Z)
lines(testpts, preds$mean, col = "red")

## 5 fast update steps
nsteps <- 5
npersteps <- 10
for(i in 1:nsteps){
  newIds <- sort(sample(1:(10*n), npersteps))
  newX <- testpts[newIds, drop = FALSE]
  newZ <- sin(newX) + noisefun(newX) * rt(length(newX), df = 10)
  points(newX, newZ, col = "blue", pch = 20)
  model <- update(object = model, Xnew = newX, Znew = newZ)
  X <- c(X, newX)
  Z <- c(Z, newZ)
  plot(X, Z)
  print(model$nit_opt)
}

## Final predictions after the update steps
p_fin <- predict(x = testpts, model)

## Visualizing the result by augmenting earlier plot
lines(testpts, p_fin$mean, col = "blue")
lines(testpts, qnorm(0.05, p_fin$mean, sqrt(p_fin$sd2)), col = "blue", lty = 2)
lines(testpts, qnorm(0.95, p_fin$mean, sqrt(p_fin$sd2)), col = "blue", lty = 2)
lines(testpts, qnorm(0.05, p_fin$mean, sqrt(p_fin$sd2 + p_fin$nugs)), col = "blue", lty = 3)
lines(testpts, qnorm(0.95, p_fin$mean, sqrt(p_fin$sd2 + p_fin$nugs)), col = "blue", lty = 3)

## Now compare to what you would get if you did a full batch fit instead
model_direct <- mleHetTP(X = X, Z = Z, maxit = 1000,
                         lower = rep(0.1, nvar), upper = rep(50, nvar),
                         init = list(theta = model_init$theta,
                                     k_theta_g = model_init$k_theta_g))
p_dir <- predict(x = testpts, model_direct)
print(model_direct$nit_opt)
lines(testpts, p_dir$mean, col = "green")
lines(testpts, qnorm(0.05, p_dir$mean, sqrt(p_dir$sd2)), col = "green", lty = 2)
lines(testpts, qnorm(0.95, p_dir$mean, sqrt(p_dir$sd2)), col = "green", lty = 2)
lines(testpts, qnorm(0.05, p_dir$mean, sqrt(p_dir$sd2 + p_dir$nugs)), col = "green", lty = 3)
lines(testpts, qnorm(0.95, p_dir$mean, sqrt(p_dir$sd2 + p_dir$nugs)), col = "green", lty = 3)
lines(testpts, sin(testpts), col = "red", lty = 2)

## Compare outputs
summary(model_init)
summary(model)
summary(model_direct)
Update an existing homGP model with new observations.
## S3 method for class 'homGP' update( object, Xnew, Znew = NULL, lower = NULL, upper = NULL, noiseControl = NULL, known = NULL, maxit = 100, ... )
object | initial model of class homGP
Xnew | matrix of new design locations; one point per row
Znew | vector of new observations at those new design locations, of length nrow(Xnew); NAs can be passed, see Details
lower, upper, noiseControl, known | optional bounds for MLE optimization, see mleHomGP
maxit | maximum number of iterations for the internal L-BFGS-B optimization method; see optim for further details
... | no other argument for this method.
In case hyperparameters need not be updated, maxit can be set to 0. It is then possible to pass NAs in Znew, and the model can still be used to provide updated variance predictions.
## Not run: 
##------------------------------------------------------------
## Example: Sequential Homoskedastic GP modeling
##------------------------------------------------------------
set.seed(42)

## Spatially varying noise function
noisefun <- function(x, coef = 1){
  return(coef * (0.05 + sqrt(abs(x)*20/(2*pi))/10))
}

nvar <- 1
n <- 10
X <- matrix(seq(0, 2 * pi, length = n), ncol = 1)
mult <- sample(1:10, n)
X <- rep(X, mult)
Z <- sin(X) + rnorm(length(X), sd = noisefun(X))

testpts <- matrix(seq(0, 2*pi, length = 10*n), ncol = 1)
model <- model_init <- mleHomGP(X = X, Z = Z,
                                lower = rep(0.1, nvar), upper = rep(50, nvar))
preds <- predict(x = testpts, object = model_init)
plot(X, Z)
lines(testpts, preds$mean, col = "red")

nsteps <- 10
for(i in 1:nsteps){
  newIds <- sort(sample(1:(10*n), 10))
  newX <- testpts[newIds, drop = FALSE]
  newZ <- sin(newX) + rnorm(length(newX), sd = noisefun(newX))
  points(newX, newZ, col = "blue", pch = 20)
  model <- update(object = model, newX, newZ)
  X <- c(X, newX)
  Z <- c(Z, newZ)
  plot(X, Z)
  print(model$nit_opt)
}

p_fin <- predict(x = testpts, object = model)
lines(testpts, p_fin$mean, col = "blue")
lines(testpts, qnorm(0.05, p_fin$mean, sqrt(p_fin$sd2)), col = "blue", lty = 2)
lines(testpts, qnorm(0.95, p_fin$mean, sqrt(p_fin$sd2)), col = "blue", lty = 2)
lines(testpts, qnorm(0.05, p_fin$mean, sqrt(p_fin$sd2 + p_fin$nugs)), col = "blue", lty = 3)
lines(testpts, qnorm(0.95, p_fin$mean, sqrt(p_fin$sd2 + p_fin$nugs)), col = "blue", lty = 3)

model_direct <- mleHomGP(X = X, Z = Z, lower = rep(0.1, nvar), upper = rep(50, nvar))
p_dir <- predict(x = testpts, object = model_direct)
print(model_direct$nit_opt)
lines(testpts, p_dir$mean, col = "green")
lines(testpts, qnorm(0.05, p_dir$mean, sqrt(p_dir$sd2)), col = "green", lty = 2)
lines(testpts, qnorm(0.95, p_dir$mean, sqrt(p_dir$sd2)), col = "green", lty = 2)
lines(testpts, qnorm(0.05, p_dir$mean, sqrt(p_dir$sd2 + p_dir$nugs)), col = "green", lty = 3)
lines(testpts, qnorm(0.95, p_dir$mean, sqrt(p_dir$sd2 + p_dir$nugs)), col = "green", lty = 3)
lines(testpts, sin(testpts), col = "red", lty = 2)

## Compare outputs
summary(model_init)
summary(model)
summary(model_direct)

## End(Not run)
Update an existing homTP model with new observations.
## S3 method for class 'homTP' update( object, Xnew, Znew = NULL, lower = NULL, upper = NULL, noiseControl = NULL, known = NULL, maxit = 100, ... )
object | initial model of class homTP
Xnew | matrix of new design locations (one point per row)
Znew | vector of new observations at those new design locations, of length nrow(Xnew); NAs can be passed, see Details
lower, upper, noiseControl, known | optional bounds and settings for MLE optimization, see mleHomTP
maxit | maximum number of iterations for the internal L-BFGS-B optimization method; see optim for further details
... | no other argument for this method
In case hyperparameters need not be updated, maxit can be set to 0. It is then also possible to pass NAs in Znew, in which case the model can still be used to provide updated variance predictions.
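A minimal sketch of this variance-only update (the small homTP fit and data below are illustrative assumptions, not part of the package examples):

library(hetGP)
set.seed(1)
## Small illustrative homTP fit
X <- matrix(seq(0, 1, length.out = 20), ncol = 1)
Z <- sin(2 * pi * X[, 1]) + 0.1 * rt(20, df = 3)
mod0 <- mleHomTP(X = X, Z = Z, lower = 0.05, upper = 5)
## Register planned design locations without responses: maxit = 0 keeps
## hyperparameters fixed, and NAs in Znew only update predictive variances
Xnew <- matrix(runif(5), ncol = 1)
mod1 <- update(object = mod0, Xnew = Xnew, Znew = rep(NA, nrow(Xnew)), maxit = 0)
p <- predict(x = matrix(seq(0, 1, length.out = 51), ncol = 1), object = mod1)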
## Not run: 
##------------------------------------------------------------
## Example: Sequential Homoskedastic TP modeling
##------------------------------------------------------------
set.seed(42)

## Spatially varying noise function
noisefun <- function(x, coef = 1){
  return(coef * (0.05 + sqrt(abs(x)*20/(2*pi))/10))
}

df_noise <- 3
nvar <- 1
n <- 10
X <- matrix(seq(0, 2 * pi, length = n), ncol = 1)
mult <- sample(1:50, n, replace = TRUE)
X <- rep(X, mult)
Z <- sin(X) + noisefun(X) * rt(length(X), df = df_noise)

testpts <- matrix(seq(0, 2*pi, length = 10*n), ncol = 1)
mod <- mod_init <- mleHomTP(X = X, Z = Z, covtype = "Matern5_2",
                            lower = rep(0.1, nvar), upper = rep(50, nvar))
preds <- predict(x = testpts, object = mod_init)
plot(X, Z)
lines(testpts, preds$mean, col = "red")

nsteps <- 10
for(i in 1:nsteps){
  newIds <- sort(sample(1:(10*n), 5))
  newX <- testpts[rep(newIds, times = sample(1:50, length(newIds),
                                             replace = TRUE)), , drop = FALSE]
  newZ <- sin(newX) + noisefun(newX) * rt(length(newX), df = df_noise)
  points(newX, newZ, col = "blue", pch = 20)
  mod <- update(object = mod, newX, newZ)
  X <- c(X, newX)
  Z <- c(Z, newZ)
  plot(X, Z)
  print(mod$nit_opt)
}

p_fin <- predict(x = testpts, object = mod)
lines(testpts, p_fin$mean, col = "blue")
lines(testpts, p_fin$mean + sqrt(p_fin$sd2) * qt(0.05, df = mod$nu + length(Z)),
      col = "blue", lty = 2)
lines(testpts, p_fin$mean + sqrt(p_fin$sd2) * qt(0.95, df = mod$nu + length(Z)),
      col = "blue", lty = 2)
lines(testpts, p_fin$mean + sqrt(p_fin$sd2 + p_fin$nugs) * qt(0.05, df = mod$nu + length(Z)),
      col = "blue", lty = 3)
lines(testpts, p_fin$mean + sqrt(p_fin$sd2 + p_fin$nugs) * qt(0.95, df = mod$nu + length(Z)),
      col = "blue", lty = 3)

mod_dir <- mleHomTP(X = X, Z = Z, covtype = "Matern5_2",
                    lower = rep(0.1, nvar), upper = rep(50, nvar))
p_dir <- predict(x = testpts, object = mod_dir)
print(mod_dir$nit_opt)
lines(testpts, p_dir$mean, col = "green")
lines(testpts, p_dir$mean + sqrt(p_dir$sd2) * qt(0.05, df = mod_dir$nu + length(Z)),
      col = "green", lty = 2)
lines(testpts, p_dir$mean + sqrt(p_dir$sd2) * qt(0.95, df = mod_dir$nu + length(Z)),
      col = "green", lty = 2)
lines(testpts, p_dir$mean + sqrt(p_dir$sd2 + p_dir$nugs) * qt(0.05, df = mod_dir$nu + length(Z)),
      col = "green", lty = 3)
lines(testpts, p_dir$mean + sqrt(p_dir$sd2 + p_dir$nugs) * qt(0.95, df = mod_dir$nu + length(Z)),
      col = "green", lty = 3)
lines(testpts, sin(testpts), col = "red", lty = 2)

## Compare outputs
summary(mod_init)
summary(mod)
summary(mod_dir)

## End(Not run)
Compute double integral of the covariance kernel over a [0,1]^d domain
Wij(mu1, mu2 = NULL, theta, type)
mu1, mu2 | input locations considered
theta | lengthscale hyperparameter of the kernel
type | kernel type, one of "Gaussian", "Matern5_2" or "Matern3_2"
M. Binois, J. Huang, R. B. Gramacy, M. Ludkovski (2019),
Replication or exploration? Sequential design for stochastic simulation experiments,
Technometrics, 61(1), 7-23.
Preprint available on arXiv:1710.03206.
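A minimal sketch of a Wij call on a small 1-d design (design size and hyperparameter values below are illustrative); the resulting matrix of pairwise double integrals can be supplied, e.g., as the Wijs argument of allocate_mult:

library(hetGP)
## Double integrals of the Gaussian kernel over [0,1], for all pairs of rows
mu <- matrix(seq(0, 1, length.out = 5), ncol = 1)
W <- Wij(mu1 = mu, theta = 0.2, type = "Gaussian")
dim(W)  ## 5 x 5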