Title: | Multivariate Normal Probabilities using Vecchia Approximation |
---|---|
Description: | Under a different representation of the multivariate normal (MVN) probability, we can use the Vecchia approximation to sample the integrand at a linear complexity with respect to n. Additionally, both the SOV algorithm from Genz (92) and the exponential-tilting method from Botev (2017) can be adapted to linear complexity. The reference for the method implemented in this package is Jian Cao and Matthias Katzfuss (2024) "Linear-Cost Vecchia Approximation of Multivariate Normal Probabilities" <doi:10.48550/arXiv.2311.09426>. Two major references for the development of our method are Alan Genz (1992) "Numerical Computation of Multivariate Normal Probabilities" <doi:10.1080/10618600.1992.10477010> and Z. I. Botev (2017) "The Normal Law Under Linear Restrictions: Simulation and Estimation via Minimax Tilting" <doi:10.48550/arXiv.1603.04166>. |
Authors: | Jian Cao [aut, cre], Matthias Katzfuss [aut] |
Maintainer: | Jian Cao <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.2.1 |
Built: | 2024-12-26 06:58:29 UTC |
Source: | CRAN |
Univariate ordering under FIC approximation, first m chosen by m iter of dense univariate reordering
FIC_reorder_univar( a, b, m, locs = NULL, covName = NULL, covParms = NULL, covMat = NULL )
FIC_reorder_univar( a, b, m, locs = NULL, covName = NULL, covParms = NULL, covMat = NULL )
a |
lower bound vector for TMVN |
b |
upper bound vector for TMVN |
m |
Vecchia conditioning set size |
locs |
location (feature) matrix n X d |
covName |
covariance function name from the 'GpGp' package |
covParms |
parameters for 'covName' |
covMat |
dense covariance matrix, not needed when 'locs' is not null |
a vector of new order based on FIC assumption and maxmin ordering
library(VeccTMVN) n1 <- 5 n2 <- 5 n <- n1 * n2 m <- 5 locs <- as.matrix(expand.grid((1:n1) / n1, (1:n2) / n2)) covparms <- c(2, 0.1, 0) cov_name <- "matern15_isotropic" a <- rep(-Inf, n) b <- seq(from = -3, to = 3, length.out = n) cat("The output order should be roughly 1 to ", n) cat(FIC_reorder_univar(a, b, m, locs, cov_name, covparms))
library(VeccTMVN) n1 <- 5 n2 <- 5 n <- n1 * n2 m <- 5 locs <- as.matrix(expand.grid((1:n1) / n1, (1:n2) / n2)) covparms <- c(2, 0.1, 0) cov_name <- "matern15_isotropic" a <- rep(-Inf, n) b <- seq(from = -3, to = 3, length.out = n) cat("The output order should be roughly 1 to ", n) cat(FIC_reorder_univar(a, b, m, locs, cov_name, covparms))
Find ordered nearest neighbors based on a correlation Matrix. Assuming the absolute value of the correlation is monotonically decreasing with distance. Returns an n X (m + 1) matrix similar to 'GpGp::find_ordered_nn'.
find_nn_corr(corrMat, m)
find_nn_corr(corrMat, m)
corrMat |
the correlation matrix |
m |
the number of nearest neighbors |
an n X (m + 1) matrix
library(GpGp) library(VeccTMVN) set.seed(123) d <- 3 n <- 100 locs <- matrix(runif(d * n), n, d) covparms <- c(2, 0.01, 0) cov_mat <- GpGp::matern15_isotropic(covparms, locs) m <- 10 NNarray_test <- GpGp::find_ordered_nn(locs, m = m) NNarray <- find_nn_corr(cov_mat, m) cat("Number of mismatch is", sum(NNarray != NNarray_test, na.rm = TRUE))
library(GpGp) library(VeccTMVN) set.seed(123) d <- 3 n <- 100 locs <- matrix(runif(d * n), n, d) covparms <- c(2, 0.01, 0) cov_mat <- GpGp::matern15_isotropic(covparms, locs) m <- 10 NNarray_test <- GpGp::find_ordered_nn(locs, m = m) NNarray <- find_nn_corr(cov_mat, m) cat("Number of mismatch is", sum(NNarray != NNarray_test, na.rm = TRUE))
Get the inverse upper Cholesky factor under the Vecchia approximation
get_sp_inv_chol(covMat, NNarray)
get_sp_inv_chol(covMat, NNarray)
covMat |
the covariance matrix |
NNarray |
n X (m + 1) matrix representing the nearest neighbor indices among previous observations. This is typically the return of GpGp::find_ordered_nn |
upper Cholesky of the inverse of 'covMat'
library(GpGp) n1 <- 10 n2 <- 10 n <- n1 * n2 locs <- as.matrix(expand.grid((1:n1) / n1, (1:n2) / n2)) covparms <- c(2, 0.3, 0) cov_mat <- GpGp::matern15_isotropic(covparms, locs) m <- 30 NNarray <- GpGp::find_ordered_nn(locs, m = m) # Vecchia approx -------------------------------- U_Vecc <- get_sp_inv_chol(cov_mat, NNarray) U <- solve(chol(cov_mat)) cat("Frobenius norm of the difference is", sqrt(sum((U - U_Vecc)^2)))
library(GpGp) n1 <- 10 n2 <- 10 n <- n1 * n2 locs <- as.matrix(expand.grid((1:n1) / n1, (1:n2) / n2)) covparms <- c(2, 0.3, 0) cov_mat <- GpGp::matern15_isotropic(covparms, locs) m <- 30 NNarray <- GpGp::find_ordered_nn(locs, m = m) # Vecchia approx -------------------------------- U_Vecc <- get_sp_inv_chol(cov_mat, NNarray) U <- solve(chol(cov_mat)) cat("Frobenius norm of the difference is", sqrt(sum((U - U_Vecc)^2)))
Compute censored multivariate normal (MVN) log-probabilities that have spatial covariance matrices using Vecchia approximation
loglk_censor_MVN( locs, indCensor, y, bCensor, covName = NULL, covParms = NULL, m = 30, NLevel1 = 10, NLevel2 = 1000, verbose = TRUE )
loglk_censor_MVN( locs, indCensor, y, bCensor, covName = NULL, covParms = NULL, m = 30, NLevel1 = 10, NLevel2 = 1000, verbose = TRUE )
locs |
location (feature) matrix n X d |
indCensor |
indices of locations that have only censored observations |
y |
observed (not censored) values, of length n |
bCensor |
upper bound, above which observations are not censored, can be different for different locations, of length 1 or n |
covName |
covariance function name from the 'GpGp' package |
covParms |
parameters for 'covName' |
m |
Vecchia conditioning set size |
NLevel1 |
first level Monte Carlo sample size |
NLevel2 |
second level Monte Carlo sample size |
verbose |
verbose level |
estimated MVN probability and estimation error
Simulate truncated multivariate normal (TMVN) using the Vecchia approximation
mvrandn( lower, upper, mean, locs = NULL, covName = "matern15_isotropic", covParms = c(1, 0.1, 0), m = 30, sigma = NULL, N = 1000, verbose = FALSE )
mvrandn( lower, upper, mean, locs = NULL, covName = "matern15_isotropic", covParms = c(1, 0.1, 0), m = 30, sigma = NULL, N = 1000, verbose = FALSE )
lower |
lower bound vector for TMVN |
upper |
upper bound vector for TMVN |
mean |
MVN mean |
locs |
location (feature) matrix n X d |
covName |
covariance function name from the 'GpGp' package |
covParms |
parameters for 'covName' |
m |
Vecchia conditioning set size |
sigma |
dense covariance matrix, not needed when 'locs' is not null |
N |
number of samples required |
verbose |
verbose level |
n X N matrix of generated samples
Simulate truncated multivariate normal (TMVT) using the Vecchia approximation
mvrandt( lower, upper, delta, df, locs = NULL, covName = "matern15_isotropic", covParms = c(1, 0.1, 0), m = 30, sigma = NULL, N = 1000, verbose = FALSE )
mvrandt( lower, upper, delta, df, locs = NULL, covName = "matern15_isotropic", covParms = c(1, 0.1, 0), m = 30, sigma = NULL, N = 1000, verbose = FALSE )
lower |
lower bound vector for TMVT |
upper |
upper bound vector for TMVT |
delta |
MVT shifting parameter |
df |
degrees of freedom |
locs |
location (feature) matrix n X d |
covName |
covariance function name from the 'GpGp' package |
covParms |
parameters for 'covName' |
m |
Vecchia conditioning set size |
sigma |
dense covariance matrix, not needed when 'locs' is not null |
N |
number of samples required |
verbose |
verbose level |
n X N matrix of generated samples
Compute multivariate normal (MVN) probabilities that have spatial covariance matrices using Vecchia approximation
pmvn( lower, upper, mean, locs = NULL, covName = "matern15_isotropic", covParms = c(1, 0.1, 0), m = 30, sigma = NULL, reorder = 0, NLevel1 = 12, NLevel2 = 10000, verbose = FALSE, retlog = FALSE, ... )
pmvn( lower, upper, mean, locs = NULL, covName = "matern15_isotropic", covParms = c(1, 0.1, 0), m = 30, sigma = NULL, reorder = 0, NLevel1 = 12, NLevel2 = 10000, verbose = FALSE, retlog = FALSE, ... )
lower |
lower bound vector for TMVN |
upper |
upper bound vector for TMVN |
mean |
MVN mean |
locs |
location (feature) matrix n X d |
covName |
covariance function name from the 'GpGp' package |
covParms |
parameters for 'covName' |
m |
Vecchia conditioning set size |
sigma |
dense covariance matrix, not needed when 'locs' is not null |
reorder |
whether to reorder integration variables. '0' for no, '1' for FIC-based univariate ordering, '2' for Vecchia-based univariate ordering, and '3' for univariate reordering, which appeared faster than '2' |
NLevel1 |
first level Monte Carlo sample size |
NLevel2 |
second level Monte Carlo sample size |
verbose |
verbose or not |
retlog |
TRUE or FALSE for whether to return loglk or not |
... |
could be m_ord for conditioning set size for reordering |
estimated MVN probability and estimation error
NLevel1 = 1
for m = m2
and the same
exponential tilting parameter as m = m1
to compute one MC estimate.
This MC estimate is used to correct the bias from the Vecchia approximationApplying the multi-level Monte Carlo (MLMC) technique to the pmvn function
The function uses NLevel1 = 1
for m = m2
and the same
exponential tilting parameter as m = m1
to compute one MC estimate.
This MC estimate is used to correct the bias from the Vecchia approximation
pmvn_MLMC( lower, upper, mean, locs = NULL, covName = "matern15_isotropic", covParms = c(1, 0.1, 0), m1 = 30, m2 = 100, sigma = NULL, reorder = 0, NLevel1 = 12, NLevel2 = 10000, verbose = FALSE, retlog = FALSE, ... )
pmvn_MLMC( lower, upper, mean, locs = NULL, covName = "matern15_isotropic", covParms = c(1, 0.1, 0), m1 = 30, m2 = 100, sigma = NULL, reorder = 0, NLevel1 = 12, NLevel2 = 10000, verbose = FALSE, retlog = FALSE, ... )
lower |
lower bound vector for TMVN |
upper |
upper bound vector for TMVN |
mean |
MVN mean |
locs |
location (feature) matrix n X d |
covName |
covariance function name from the 'GpGp' package |
covParms |
parameters for 'covName' |
m1 |
the smaller Vecchia conditioning set size for Level 1 MC |
m2 |
the bigger Vecchia conditioning set size for Level 2 MC |
sigma |
dense covariance matrix, not needed when 'locs' is not null |
reorder |
whether to reorder integration variables. '0' for no, '1' for FIC-based univariate ordering, '2' for Vecchia-based univariate ordering, and '3' for univariate reordering, which appeared faster than '2' |
NLevel1 |
first level Monte Carlo sample size |
NLevel2 |
second level Monte Carlo sample size |
verbose |
verbose or not |
retlog |
TRUE or FALSE for whether to return loglk or not |
... |
could be m_ord for conditioning set size for reordering |
estimated MVN probability and estimation error
Compute multivariate Student-t (MVT) probabilities that have spatial covariance matrices using Vecchia approximation
pmvt( lower, upper, delta, df, locs = NULL, covName = "matern15_isotropic", covParms = c(1, 0.1, 0), m = 30, sigma = NULL, reorder = 0, NLevel1 = 12, NLevel2 = 10000, verbose = FALSE, retlog = FALSE, ... )
pmvt( lower, upper, delta, df, locs = NULL, covName = "matern15_isotropic", covParms = c(1, 0.1, 0), m = 30, sigma = NULL, reorder = 0, NLevel1 = 12, NLevel2 = 10000, verbose = FALSE, retlog = FALSE, ... )
lower |
lower bound vector for TMVT |
upper |
upper bound vector for TMVT |
delta |
MVT shifting parameter |
df |
degrees of freedom |
locs |
location (feature) matrix n X d |
covName |
covariance function name from the 'GpGp' package |
covParms |
parameters for 'covName' |
m |
Vecchia conditioning set size |
sigma |
dense covariance matrix, not needed when 'locs' is not null |
reorder |
whether to reorder integration variables. '0' for no, '1' for FIC-based univariate ordering, '2' for Vecchia-based univariate ordering, and '3' for univariate reordering, which appeared faster than '2' |
NLevel1 |
first level Monte Carlo sample size |
NLevel2 |
second level Monte Carlo sample size |
verbose |
verbose or not |
retlog |
TRUE or FALSE for whether to return loglk or not |
... |
could be m_ord for conditioning set size for reordering |
estimated MVT probability and estimation error
NLevel1 = 1
for m = m2
and the same
exponential tilting parameter as m = m1
to compute one MC estimate.
This MC estimate is used to correct the bias from the Vecchia approximationApplying the multi-level Monte Carlo (MLMC) technique to the pmvt function
The function uses NLevel1 = 1
for m = m2
and the same
exponential tilting parameter as m = m1
to compute one MC estimate.
This MC estimate is used to correct the bias from the Vecchia approximation
pmvt_MLMC( lower, upper, delta, df, locs = NULL, covName = "matern15_isotropic", covParms = c(1, 0.1, 0), m1 = 30, m2 = 100, sigma = NULL, reorder = 0, NLevel1 = 12, NLevel2 = 10000, verbose = FALSE, retlog = FALSE, ... )
pmvt_MLMC( lower, upper, delta, df, locs = NULL, covName = "matern15_isotropic", covParms = c(1, 0.1, 0), m1 = 30, m2 = 100, sigma = NULL, reorder = 0, NLevel1 = 12, NLevel2 = 10000, verbose = FALSE, retlog = FALSE, ... )
lower |
lower bound vector for TMVT |
upper |
upper bound vector for TMVT |
delta |
MVT shifting parameter |
df |
degrees of freedom |
locs |
location (feature) matrix n X d |
covName |
covariance function name from the 'GpGp' package |
covParms |
parameters for 'covName' |
m1 |
the smaller Vecchia conditioning set size for Level 1 MC |
m2 |
the bigger Vecchia conditioning set size for Level 2 MC |
sigma |
dense covariance matrix, not needed when 'locs' is not null |
reorder |
whether to reorder integration variables. '0' for no, '1' for FIC-based univariate ordering, '2' for Vecchia-based univariate ordering, and '3' for univariate reordering, which appeared faster than '2' |
NLevel1 |
first level Monte Carlo sample size |
NLevel2 |
second level Monte Carlo sample size |
verbose |
verbose or not |
retlog |
TRUE or FALSE for whether to return loglk or not |
... |
could be m_ord for conditioning set size for reordering |
estimated MVT probability and estimation error
Simulate partially censored multivariate normal (MVN) at censored locations using the Vecchia approximation
ptmvrandn( locs, indCensor, y, bCensor, covName = NULL, covParms = NULL, m = 30, N = 1000, verbose = TRUE, reorder = TRUE )
ptmvrandn( locs, indCensor, y, bCensor, covName = NULL, covParms = NULL, m = 30, N = 1000, verbose = TRUE, reorder = TRUE )
locs |
location (feature) matrix n X d |
indCensor |
indices of locations that have only censored observations |
y |
observed (not censored) values, of length n |
bCensor |
upper bound, above which observations are not censored, can be different for different locations, of length 1 or n |
covName |
covariance function name from the 'GpGp' package |
covParms |
parameters for 'covName' |
m |
Vecchia conditioning set size |
N |
number of samples required |
verbose |
verbose level |
reorder |
whether to Vecchia univariate variable reordering |
n X N matrix of generated samples
Univariate variable reordering, described in Genz and Bretz (2009) If failed due to PD singularity, the unfinished order will be returned and a warning will be issued
univar_order(a, b, sigma)
univar_order(a, b, sigma)
a |
lower integration limits |
b |
upper integration limits |
sigma |
covariance matrix |
the new order
Univariate ordering under Vecchia approximation
Vecc_reorder( a, b, m, locs = NULL, covName = NULL, covParms = NULL, covMat = NULL )
Vecc_reorder( a, b, m, locs = NULL, covName = NULL, covParms = NULL, covMat = NULL )
a |
lower bound vector for TMVN |
b |
upper bound vector for TMVN |
m |
Vecchia conditioning set size |
locs |
location (feature) matrix n X d |
covName |
covariance function name from the 'GpGp' package |
covParms |
parameters for 'covName' |
covMat |
dense covariance matrix, not needed when 'locs' is not null |
new order, nearest neighbor matrix, and coefficient matrix
library(lhs) library(GpGp) library(VeccTMVN) set.seed(123) n <- 100 m <- 5 locs <- lhs::geneticLHS(n, 2) covparms <- c(1, 0.1, 0) cov_name <- "matern15_isotropic" cov_mat <- get(cov_name)(covparms, locs) a <- rep(-Inf, n) b <- runif(n) odr_TN <- TruncatedNormal::cholperm(cov_mat, a, b)$perm rslt <- Vecc_reorder(a, b, m, locs = locs, covName = cov_name, covParms = covparms ) # compare order cat(rslt$order) cat(odr_TN)
library(lhs) library(GpGp) library(VeccTMVN) set.seed(123) n <- 100 m <- 5 locs <- lhs::geneticLHS(n, 2) covparms <- c(1, 0.1, 0) cov_name <- "matern15_isotropic" cov_mat <- get(cov_name)(covparms, locs) a <- rep(-Inf, n) b <- runif(n) odr_TN <- TruncatedNormal::cholperm(cov_mat, a, b)$perm rslt <- Vecc_reorder(a, b, m, locs = locs, covName = cov_name, covParms = covparms ) # compare order cat(rslt$order) cat(odr_TN)
Compute multivariate normal probabilities and sample from multivariate truncated normal distribution, taking advantage of the Vecchia approximation