Cubature Vectorization Results

Introduction

The R cubature package exposes both the hcubature and pcubature routines of the underlying C cubature library, including their vectorized interfaces.

Per the documentation, pcubature is advisable only for smooth integrands in at most three dimensions. In unsuitable cases, the pcubature routines perform significantly worse than vectorized hcubature. So when in doubt, you are better off using hcubature.

Version 2.0 of this package integrates the Cuba library as well, once again providing vectorized interfaces.

The main point of this note is to examine the difference vectorization makes. My recommendations are below in the summary section.
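To make the term concrete: in the non-vectorized interface the integrand receives one point as a numeric vector, whereas with vectorInterface = TRUE it receives a matrix holding one evaluation point per column and returns a matrix with one column per point. The following is a minimal sketch, assuming the cubature package is installed (the call is skipped otherwise); the names f_scalar, f_vector and pts are illustrative and do not appear elsewhere in this note.

```r
# Scalar integrand: x is a numeric vector holding a single point.
f_scalar <- function(x) prod(cos(x))

# Vector integrand: x is a matrix with one point per column;
# the result is a 1 x ncol(x) matrix of function values.
f_vector <- function(x) matrix(cos(x[1, ]) * cos(x[2, ]), ncol = ncol(x))

# Both forms agree on a small batch of points (one point per column).
pts <- matrix(c(0.1, 0.2, 0.5, 0.5, 0.9, 0.3), nrow = 2)
vals_scalar <- apply(pts, 2, f_scalar)
vals_vector <- as.vector(f_vector(pts))

if (requireNamespace("cubature", quietly = TRUE)) {
    r1 <- cubature::hcubature(f_scalar, lowerLimit = c(0, 0), upperLimit = c(1, 1),
                              tol = 1e-6)
    r2 <- cubature::hcubature(f_vector, lowerLimit = c(0, 0), upperLimit = c(1, 1),
                              tol = 1e-6, vectorInterface = TRUE)
    # The exact value of this integral is sin(1)^2.
    print(c(scalar = r1$integral, vector = r2$integral, exact = sin(1)^2))
}
```

The only difference between the two calls is the integrand and the vectorInterface flag; limits and tolerance are passed the same way.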

A Timing Harness

Our harness provides timing results for hcubature, pcubature (where appropriate) and Cuba cuhre calls, each in non-vectorized and vectorized form.

library(benchr)
library(cubature)

harness <- function(which = NULL,
                    f, fv, lowerLimit, upperLimit, tol = 1e-3, times = 20, ...) {

    fns <- c(hc = "Non-vectorized Hcubature",
             hc.v = "Vectorized Hcubature",
             pc = "Non-vectorized Pcubature",
             pc.v = "Vectorized Pcubature",
             cc = "Non-vectorized cubature::cuhre",
             cc_v = "Vectorized cubature::cuhre")
    cc <- function() cubature::cuhre(f = f,
                                     lowerLimit = lowerLimit, upperLimit = upperLimit,
                                     relTol = tol,
                                     ...)
    cc_v <- function() cubature::cuhre(f = fv,
                                       lowerLimit = lowerLimit, upperLimit = upperLimit,
                                       relTol = tol,
                                       nVec = 1024L,
                                       ...)

    hc <- function() cubature::hcubature(f = f,
                                         lowerLimit = lowerLimit,
                                         upperLimit = upperLimit,
                                         tol = tol,
                                         ...)

    hc.v <- function() cubature::hcubature(f = fv,
                                           lowerLimit = lowerLimit,
                                           upperLimit = upperLimit,
                                           tol = tol,
                                           vectorInterface = TRUE,
                                           ...)

    pc <- function() cubature::pcubature(f = f,
                                         lowerLimit = lowerLimit,
                                         upperLimit = upperLimit,
                                         tol = tol,
                                         ...)

    pc.v <- function() cubature::pcubature(f = fv,
                                           lowerLimit = lowerLimit,
                                           upperLimit = upperLimit,
                                           tol = tol,
                                           vectorInterface = TRUE,
                                           ...)
    
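    ## Pick the routines to time; names in `which` that do not match are dropped.
    if (is.null(which)) {
        fnIndices <- seq_along(fns)
    } else {
        fnIndices <- na.omit(match(which, names(fns)))
    }
    ## Build zero-argument calls such as hc() for benchr::benchmark to evaluate.
    fnList <- lapply(names(fns)[fnIndices], function(x) call(x))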

    argList <- c(fnList, times = times, progress = FALSE)
    result <- do.call(benchr::benchmark, args = argList)
    d <- summary(result)[seq_along(fnIndices), ]
    d$expr <- fns[fnIndices]
    d
}
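The which argument selects a subset of routines by their short names. Here is a minimal base-R sketch of that selection step, using the same names as fns in the harness; the value "bogus" is a deliberately unknown name added for illustration.

```r
fns <- c(hc = "Non-vectorized Hcubature",
         hc.v = "Vectorized Hcubature",
         pc = "Non-vectorized Pcubature",
         pc.v = "Vectorized Pcubature",
         cc = "Non-vectorized cubature::cuhre",
         cc_v = "Vectorized cubature::cuhre")

which <- c("hc", "cc_v", "bogus")

# match() returns NA for names it cannot find; na.omit() drops them.
fnIndices <- na.omit(match(which, names(fns)))

# Zero-argument calls, one per selected routine.
fnList <- lapply(names(fns)[fnIndices], function(x) call(x))
selected <- vapply(fnList, deparse, character(1))
print(selected)
```

The discontinuous example further down uses this mechanism with which = c("hc", "hc.v", "cc", "cc_v") to leave pcubature out of the comparison.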

We reel off the timing runs.

Example 1.

func <- function(x) sin(x[1]) * cos(x[2]) * exp(x[3])
func.v <- function(x) {
    matrix(apply(x, 2, function(z) sin(z[1]) * cos(z[2]) * exp(z[3])), ncol = ncol(x))
}

d <- harness(f = func, fv = func.v,
             lowerLimit = rep(0, 3),
             upperLimit = rep(1, 3),
             tol = 1e-5,
             times = 100)
knitr::kable(d, digits = 3, row.names = FALSE)
| expr | n.eval | min | lw.qu | median | mean | up.qu | max | total | relative |
|:---|---:|---:|---:|---:|---:|---:|---:|---:|---:|
| Non-vectorized Hcubature | 100 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.002 | 0.029 | 1.00 |
| Vectorized Hcubature | 100 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.040 | 1.46 |
| Non-vectorized Pcubature | 100 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.003 | 0.094 | 3.13 |
| Vectorized Pcubature | 100 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.004 | 0.125 | 4.36 |
| Non-vectorized cubature::cuhre | 100 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.003 | 0.063 | 2.14 |
| Vectorized cubature::cuhre | 100 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.002 | 0.062 | 2.21 |
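Note that func.v relies on apply, so the integrand is still evaluated one column at a time at the R level. As a sketch of an alternative, the same integrand can be written with arithmetic on whole rows of the input matrix. The name func.v2 is hypothetical, and this variant was not part of the timings above, so its effect on them has not been measured here.

```r
# apply-based version from the example above.
func.v <- function(x) {
    matrix(apply(x, 2, function(z) sin(z[1]) * cos(z[2]) * exp(z[3])), ncol = ncol(x))
}

# Row-wise version: each row of x holds one coordinate for all points.
func.v2 <- function(x) {
    matrix(sin(x[1, ]) * cos(x[2, ]) * exp(x[3, ]), ncol = ncol(x))
}

set.seed(1)
pts <- matrix(runif(3 * 50), nrow = 3)  # 50 random points in the unit cube
same <- isTRUE(all.equal(func.v(pts), func.v2(pts)))
print(same)
```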

Multivariate Normal

Using cubature, we evaluate $\int_R \phi(x)\,dx$, where $\phi(x)$ is the three-dimensional multivariate normal density with mean 0 and covariance matrix $$ \Sigma = \left(\begin{array}{rrr} 1 &\frac{3}{5} &\frac{1}{3}\\ \frac{3}{5} &1 &\frac{11}{15}\\ \frac{1}{3} &\frac{11}{15} & 1 \end{array} \right) $$ and $R$ is $[-\frac{1}{2}, 1] \times [-\frac{1}{2}, 4] \times [-\frac{1}{2}, 2]$.

We construct a scalar function (my_dmvnorm) and a vector analog (my_dmvnorm_v). First the functions.

m <- 3
sigma <- diag(3)
sigma[2,1] <- sigma[1, 2] <- 3/5 ; sigma[3,1] <- sigma[1, 3] <- 1/3
sigma[3,2] <- sigma[2, 3] <- 11/15
logdet <- sum(log(eigen(sigma, symmetric = TRUE, only.values = TRUE)$values))
my_dmvnorm <- function (x, mean, sigma, logdet) {
    x <- matrix(x, ncol = length(x))
    distval <- stats::mahalanobis(x, center = mean, cov = sigma)
    exp(-(3 * log(2 * pi) + logdet + distval)/2)
}

my_dmvnorm_v <- function (x, mean, sigma, logdet) {
    distval <- stats::mahalanobis(t(x), center = mean, cov = sigma)
    exp(matrix(-(3 * log(2 * pi) + logdet + distval)/2, ncol = ncol(x)))
}
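
Both functions compute the density through its logarithm. With dimension $m = 3$, mean $\mu$ and covariance matrix $\Sigma$, the expression inside exp is

$$ \log \phi(x) = -\frac{1}{2}\left( m \log(2\pi) + \log\lvert\Sigma\rvert + (x - \mu)^{T} \Sigma^{-1} (x - \mu) \right), $$

where logdet holds $\log\lvert\Sigma\rvert$, computed as the sum of the log eigenvalues of $\Sigma$, and distval is the squared Mahalanobis distance $(x - \mu)^{T} \Sigma^{-1} (x - \mu)$ returned by stats::mahalanobis.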

Now the timing.

d <- harness(f = my_dmvnorm, fv = my_dmvnorm_v,
             lowerLimit = rep(-0.5, 3),
             upperLimit = c(1, 4, 2),
             tol = 1e-5,
             times = 10,
             mean = rep(0, m), sigma = sigma, logdet = logdet)
knitr::kable(d, digits = 3)
| expr | n.eval | min | lw.qu | median | mean | up.qu | max | total | relative |
|:---|---:|---:|---:|---:|---:|---:|---:|---:|---:|
| Non-vectorized Hcubature | 10 | 0.804 | 0.808 | 0.813 | 0.820 | 0.818 | 0.859 | 8.202 | 618.00 |
| Vectorized Hcubature | 10 | 0.002 | 0.002 | 0.002 | 0.002 | 0.002 | 0.004 | 0.022 | 1.44 |
| Non-vectorized Pcubature | 10 | 0.347 | 0.350 | 0.351 | 0.355 | 0.352 | 0.394 | 3.550 | 267.00 |
| Vectorized Pcubature | 10 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.013 | 1.00 |
| Non-vectorized cubature::cuhre | 10 | 0.333 | 0.335 | 0.335 | 0.336 | 0.337 | 0.339 | 3.358 | 255.00 |
| Vectorized cubature::cuhre | 10 | 0.003 | 0.003 | 0.003 | 0.003 | 0.003 | 0.003 | 0.032 | 2.48 |

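The effect of vectorization is huge: in the relative column, the non-vectorized routines are slower than the fastest run by factors of 618 (hcubature), 267 (pcubature) and 255 (cuhre), while the vectorized routines all stay within a factor of 2.5. So it makes sense for users to vectorize their integrands as much as possible.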
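One way to see where the gain comes from, independently of any integrator, is to evaluate the density on a batch of points, once point by point and once in a single call. The sketch below is self-contained base R and repeats the definitions from above. The batch size of 2000 and the names pts, one_by_one and batched are arbitrary choices for illustration, and the elapsed times it prints will vary by machine.

```r
sigma <- diag(3)
sigma[2, 1] <- sigma[1, 2] <- 3/5; sigma[3, 1] <- sigma[1, 3] <- 1/3
sigma[3, 2] <- sigma[2, 3] <- 11/15
logdet <- sum(log(eigen(sigma, symmetric = TRUE, only.values = TRUE)$values))

my_dmvnorm <- function(x, mean, sigma, logdet) {
    x <- matrix(x, ncol = length(x))
    distval <- stats::mahalanobis(x, center = mean, cov = sigma)
    exp(-(3 * log(2 * pi) + logdet + distval) / 2)
}
my_dmvnorm_v <- function(x, mean, sigma, logdet) {
    distval <- stats::mahalanobis(t(x), center = mean, cov = sigma)
    exp(matrix(-(3 * log(2 * pi) + logdet + distval) / 2, ncol = ncol(x)))
}

set.seed(42)
pts <- matrix(runif(3 * 2000, min = -0.5, max = 1), nrow = 3)  # one point per column

# One R-level call (and one mahalanobis() call) per point.
t_scalar <- system.time(
    one_by_one <- apply(pts, 2, my_dmvnorm,
                        mean = rep(0, 3), sigma = sigma, logdet = logdet)
)
# A single call for the whole batch.
t_vector <- system.time(
    batched <- my_dmvnorm_v(pts, mean = rep(0, 3), sigma = sigma, logdet = logdet)
)

print(c(scalar = t_scalar[["elapsed"]], vector = t_vector[["elapsed"]]))
```

Both paths return the same density values; only the number of R-level function calls differs.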

Furthermore, for this particular example, we expect mvtnorm::pmvnorm to do pretty well since it is specialized for the multivariate normal. The good news is that the vectorized versions of hcubature and pcubature are quite competitive if you compare the table above to the one below.

library(mvtnorm)
g1 <- function() pmvnorm(lower = rep(-0.5, m),
                         upper = c(1, 4, 2), mean = rep(0, m), corr = sigma,
                         alg = Miwa(), abseps = 1e-5, releps = 1e-5)
g2 <- function() pmvnorm(lower = rep(-0.5, m),
                         upper = c(1, 4, 2), mean = rep(0, m), corr = sigma,
                         alg = GenzBretz(), abseps = 1e-5, releps = 1e-5)
g3 <- function() pmvnorm(lower = rep(-0.5, m),
                         upper = c(1, 4, 2), mean = rep(0, m), corr = sigma,
                         alg = TVPACK(), abseps = 1e-5, releps = 1e-5)

knitr::kable(summary(benchr::benchmark(g1(), g2(), g3(), times = 20, progress = FALSE)),
             digits = 3, row.names = FALSE)
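| expr | n.eval | min | lw.qu | median | mean | up.qu | max | total | relative |
|:---|---:|---:|---:|---:|---:|---:|---:|---:|---:|
| g1() | 20 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.004 | 0.027 | 1.01 |
| g2() | 20 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.003 | 0.025 | 1.00 |
| g3() | 20 | 0.000 | 0.001 | 0.001 | 0.001 | 0.001 | 0.002 | 0.023 | 1.00 |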

Product of cosines

testFn0 <- function(x) prod(cos(x))
testFn0_v <- function(x) matrix(apply(x, 2, function(z) prod(cos(z))), ncol = ncol(x))

d <- harness(f = testFn0, fv = testFn0_v,
             lowerLimit = rep(0, 2), upperLimit = rep(1, 2), times = 1000)
knitr::kable(d, digits = 3)
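| expr | n.eval | min | lw.qu | median | mean | up.qu | max | total | relative |
|:---|---:|---:|---:|---:|---:|---:|---:|---:|---:|
| Non-vectorized Hcubature | 1000 | 0 | 0 | 0 | 0 | 0 | 0.000 | 0.042 | 1.00 |
| Vectorized Hcubature | 1000 | 0 | 0 | 0 | 0 | 0 | 0.002 | 0.076 | 1.63 |
| Non-vectorized Pcubature | 1000 | 0 | 0 | 0 | 0 | 0 | 0.002 | 0.056 | 1.29 |
| Vectorized Pcubature | 1000 | 0 | 0 | 0 | 0 | 0 | 0.002 | 0.130 | 2.87 |
| Non-vectorized cubature::cuhre | 1000 | 0 | 0 | 0 | 0 | 0 | 0.003 | 0.368 | 8.26 |
| Vectorized cubature::cuhre | 1000 | 0 | 0 | 0 | 0 | 0 | 0.002 | 0.402 | 9.06 |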

Gaussian function

testFn1 <- function(x) {
    val <- sum(((1 - x) / x)^2)
    scale <- prod((2 / sqrt(pi)) / x^2)
    exp(-val) * scale
}

testFn1_v <- function(x) {
    val <- matrix(apply(x, 2, function(z) sum(((1 - z) / z)^2)), ncol = ncol(x))
    scale <- matrix(apply(x, 2, function(z) prod((2 / sqrt(pi)) / z^2)), ncol = ncol(x))
    exp(-val) * scale
}

d <- harness(f = testFn1, fv = testFn1_v,
             lowerLimit = rep(0, 3), upperLimit = rep(1, 3), times = 10)

knitr::kable(d, digits = 3)
| expr | n.eval | min | lw.qu | median | mean | up.qu | max | total | relative |
|:---|---:|---:|---:|---:|---:|---:|---:|---:|---:|
| Non-vectorized Hcubature | 10 | 0.003 | 0.003 | 0.003 | 0.003 | 0.003 | 0.004 | 0.030 | 29.80 |
| Vectorized Hcubature | 10 | 0.005 | 0.005 | 0.005 | 0.005 | 0.005 | 0.005 | 0.049 | 52.40 |
| Non-vectorized Pcubature | 10 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.001 | 1.00 |
| Vectorized Pcubature | 10 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.002 | 1.89 |
| Non-vectorized cubature::cuhre | 10 | 0.013 | 0.013 | 0.015 | 0.014 | 0.015 | 0.015 | 0.141 | 156.00 |
| Vectorized cubature::cuhre | 10 | 0.020 | 0.020 | 0.021 | 0.021 | 0.022 | 0.022 | 0.211 | 228.00 |
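
As a reference point for checking any of the routines, the exact value of this integral is 1. In each coordinate, the substitution $t = (1 - x)/x$, with $dt = -dx/x^2$, maps $(0, 1]$ onto $[0, \infty)$:

$$ \int_0^1 \frac{2}{\sqrt{\pi}} \, \frac{1}{x^2} \exp\left(-\left(\frac{1 - x}{x}\right)^2\right) dx = \frac{2}{\sqrt{\pi}} \int_0^\infty e^{-t^2} \, dt = 1. $$

Since testFn1 is the product of this one-dimensional integrand over the coordinates, its integral over $[0, 1]^3$ is $1^3 = 1$.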

Discontinuous function

testFn2 <- function(x) {
    radius <- 0.50124145262344534123412
    ifelse(sum(x * x) < radius * radius, 1, 0)
}

testFn2_v <- function(x) {
    radius <- 0.50124145262344534123412
    matrix(apply(x, 2, function(z) ifelse(sum(z * z) < radius * radius, 1, 0)), ncol = ncol(x))
}

d <- harness(which = c("hc", "hc.v", "cc", "cc_v"),
             f = testFn2, fv = testFn2_v,
             lowerLimit = rep(0, 2), upperLimit = rep(1, 2), times = 10)
knitr::kable(d, digits = 3)
| expr | n.eval | min | lw.qu | median | mean | up.qu | max | total | relative |
|:---|---:|---:|---:|---:|---:|---:|---:|---:|---:|
| Non-vectorized Hcubature | 10 | 0.041 | 0.042 | 0.042 | 0.042 | 0.042 | 0.044 | 0.422 | 1.00 |
| Vectorized Hcubature | 10 | 0.047 | 0.049 | 0.049 | 0.049 | 0.049 | 0.050 | 0.488 | 1.16 |
| Non-vectorized cubature::cuhre | 10 | 0.784 | 0.786 | 0.788 | 0.796 | 0.799 | 0.827 | 7.963 | 18.70 |
| Vectorized cubature::cuhre | 10 | 0.877 | 0.883 | 0.891 | 0.901 | 0.920 | 0.949 | 9.009 | 21.10 |
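
The integrand is the indicator of a disk of radius r centered at the origin, and r < 1, so the exact integral over the unit square is the area of a quarter disk, πr²/4. The indicator can also be computed for all columns at once with colSums. The following sketch checks that such a variant agrees with the apply-based testFn2_v on random points; the name testFn2_v2 is hypothetical and the variant was not part of the timings above.

```r
radius <- 0.50124145262344534123412

# apply-based version from above.
testFn2_v <- function(x) {
    matrix(apply(x, 2, function(z) ifelse(sum(z * z) < radius * radius, 1, 0)),
           ncol = ncol(x))
}

# Column-wise version: squared norms of all points in one colSums() call.
testFn2_v2 <- function(x) {
    matrix(as.numeric(colSums(x * x) < radius * radius), ncol = ncol(x))
}

set.seed(7)
pts <- matrix(runif(2 * 1000), nrow = 2)  # 1000 random points in the unit square
agree <- isTRUE(all.equal(testFn2_v(pts), testFn2_v2(pts)))
print(agree)

# Exact value of the integral over [0, 1]^2: area of a quarter disk.
exact <- pi * radius^2 / 4
print(exact)
```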

A Simple Polynomial (product of coordinates)

testFn3 <- function(x) prod(2 * x)
testFn3_v <- function(x) matrix(apply(x, 2, function(z) prod(2 * z)), ncol = ncol(x))

d <- harness(f = testFn3, fv = testFn3_v,
             lowerLimit = rep(0, 3), upperLimit = rep(1, 3), times = 50)
knitr::kable(d, digits = 3)
| expr | n.eval | min | lw.qu | median | mean | up.qu | max | total | relative |
|:---|---:|---:|---:|---:|---:|---:|---:|---:|---:|
| Non-vectorized Hcubature | 50 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.003 | 1.13 |
| Vectorized Hcubature | 50 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.005 | 1.73 |
| Non-vectorized Pcubature | 50 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.003 | 1.00 |
| Vectorized Pcubature | 50 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.002 | 0.006 | 1.57 |
| Non-vectorized cubature::cuhre | 50 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.003 | 0.034 | 11.80 |
| Vectorized cubature::cuhre | 50 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.033 | 12.10 |

Gaussian centered at $\frac{1}{2}$

testFn4 <- function(x) {
    a <- 0.1
    s <- sum((x - 0.5)^2)
    ((2 / sqrt(pi)) / (2. * a))^length(x) * exp (-s / (a * a))
}

testFn4_v <- function(x) {
    a <- 0.1
    r <- apply(x, 2, function(z) {
        s <- sum((z - 0.5)^2)
        ((2 / sqrt(pi)) / (2. * a))^length(z) * exp (-s / (a * a))
    })
    matrix(r, ncol = ncol(x))
}

d <- harness(f = testFn4, fv = testFn4_v,
             lowerLimit = rep(0, 2), upperLimit = rep(1, 2), times = 20)
knitr::kable(d, digits = 3)
| expr | n.eval | min | lw.qu | median | mean | up.qu | max | total | relative |
|:---|---:|---:|---:|---:|---:|---:|---:|---:|---:|
| Non-vectorized Hcubature | 20 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.003 | 0.028 | 1.00 |
| Vectorized Hcubature | 20 | 0.002 | 0.002 | 0.002 | 0.002 | 0.002 | 0.004 | 0.037 | 1.35 |
| Non-vectorized Pcubature | 20 | 0.002 | 0.002 | 0.002 | 0.002 | 0.002 | 0.004 | 0.043 | 1.49 |
| Vectorized Pcubature | 20 | 0.003 | 0.003 | 0.003 | 0.003 | 0.003 | 0.004 | 0.053 | 1.96 |
| Non-vectorized cubature::cuhre | 20 | 0.003 | 0.003 | 0.003 | 0.003 | 0.003 | 0.005 | 0.068 | 2.39 |
| Vectorized cubature::cuhre | 20 | 0.004 | 0.004 | 0.004 | 0.004 | 0.004 | 0.006 | 0.078 | 2.85 |

Double Gaussian

testFn5 <- function(x) {
    a <- 0.1
    s1 <- sum((x - 1 / 3)^2)
    s2 <- sum((x - 2 / 3)^2)
    0.5 * ((2 / sqrt(pi)) / (2. * a))^length(x) * (exp(-s1 / (a * a)) + exp(-s2 / (a * a)))
}
testFn5_v <- function(x) {
    a <- 0.1
    r <- apply(x, 2, function(z) {
        s1 <- sum((z - 1 / 3)^2)
        s2 <- sum((z - 2 / 3)^2)
        0.5 * ((2 / sqrt(pi)) / (2. * a))^length(z) * (exp(-s1 / (a * a)) + exp(-s2 / (a * a)))
    })
    matrix(r, ncol = ncol(x))
}

d <- harness(f = testFn5, fv = testFn5_v,
             lowerLimit = rep(0, 2), upperLimit = rep(1, 2), times = 20)
knitr::kable(d, digits = 3)
| expr | n.eval | min | lw.qu | median | mean | up.qu | max | total | relative |
|:---|---:|---:|---:|---:|---:|---:|---:|---:|---:|
| Non-vectorized Hcubature | 20 | 0.003 | 0.004 | 0.004 | 0.004 | 0.004 | 0.005 | 0.076 | 1.43 |
| Vectorized Hcubature | 20 | 0.005 | 0.005 | 0.005 | 0.005 | 0.005 | 0.006 | 0.098 | 1.85 |
| Non-vectorized Pcubature | 20 | 0.002 | 0.002 | 0.002 | 0.003 | 0.003 | 0.004 | 0.053 | 1.00 |
| Vectorized Pcubature | 20 | 0.003 | 0.003 | 0.003 | 0.004 | 0.003 | 0.005 | 0.072 | 1.31 |
| Non-vectorized cubature::cuhre | 20 | 0.007 | 0.007 | 0.007 | 0.007 | 0.007 | 0.009 | 0.142 | 2.76 |
| Vectorized cubature::cuhre | 20 | 0.008 | 0.008 | 0.008 | 0.009 | 0.010 | 0.010 | 0.178 | 3.33 |

Tsuda’s Example

testFn6 <- function(x) {
    a <- (1 + sqrt(10.0)) / 9.0
    prod( a / (a + 1) * ((a + 1) / (a + x))^2)
}

testFn6_v <- function(x) {
    a <- (1 + sqrt(10.0)) / 9.0
    r <- apply(x, 2, function(z) prod( a / (a + 1) * ((a + 1) / (a + z))^2))
    matrix(r, ncol = ncol(x))
}

d <- harness(f = testFn6, fv = testFn6_v,
             lowerLimit = rep(0, 3), upperLimit = rep(1, 3), times = 20)
knitr::kable(d, digits = 3)
| expr | n.eval | min | lw.qu | median | mean | up.qu | max | total | relative |
|:---|---:|---:|---:|---:|---:|---:|---:|---:|---:|
| Non-vectorized Hcubature | 20 | 0.001 | 0.002 | 0.002 | 0.002 | 0.002 | 0.003 | 0.034 | 1.00 |
| Vectorized Hcubature | 20 | 0.002 | 0.002 | 0.002 | 0.002 | 0.002 | 0.005 | 0.047 | 1.35 |
| Non-vectorized Pcubature | 20 | 0.008 | 0.008 | 0.008 | 0.008 | 0.008 | 0.010 | 0.166 | 5.08 |
| Vectorized Pcubature | 20 | 0.010 | 0.010 | 0.010 | 0.011 | 0.010 | 0.012 | 0.212 | 6.62 |
| Non-vectorized cubature::cuhre | 20 | 0.004 | 0.004 | 0.004 | 0.005 | 0.004 | 0.006 | 0.091 | 2.70 |
| Vectorized cubature::cuhre | 20 | 0.005 | 0.005 | 0.005 | 0.005 | 0.005 | 0.007 | 0.101 | 3.01 |

Morokoff & Caflisch Example

testFn7 <- function(x) {
    n <- length(x)
    p <- 1/n
    (1 + p)^n * prod(x^p)
}
testFn7_v <- function(x) {
    matrix(apply(x, 2, function(z) {
        n <- length(z)
        p <- 1/n
        (1 + p)^n * prod(z^p)
    }), ncol = ncol(x))
}

d <- harness(f = testFn7, fv = testFn7_v,
             lowerLimit = rep(0, 3), upperLimit = rep(1, 3), times = 20)
knitr::kable(d, digits = 3)
| expr | n.eval | min | lw.qu | median | mean | up.qu | max | total | relative |
|:---|---:|---:|---:|---:|---:|---:|---:|---:|---:|
| Non-vectorized Hcubature | 20 | 0.003 | 0.003 | 0.003 | 0.003 | 0.003 | 0.005 | 0.068 | 1.00 |
| Vectorized Hcubature | 20 | 0.004 | 0.004 | 0.004 | 0.004 | 0.004 | 0.007 | 0.083 | 1.23 |
| Non-vectorized Pcubature | 20 | 0.008 | 0.008 | 0.008 | 0.009 | 0.010 | 0.010 | 0.171 | 2.46 |
| Vectorized Pcubature | 20 | 0.009 | 0.009 | 0.010 | 0.010 | 0.011 | 0.012 | 0.209 | 3.18 |
| Non-vectorized cubature::cuhre | 20 | 0.040 | 0.041 | 0.042 | 0.042 | 0.043 | 0.044 | 0.842 | 12.90 |
| Vectorized cubature::cuhre | 20 | 0.040 | 0.042 | 0.042 | 0.042 | 0.043 | 0.045 | 0.847 | 13.00 |
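
For reference, the exact value of this integral over $[0, 1]^n$ is 1 in every dimension $n$. With $p = 1/n$, the integrand factorizes over the coordinates:

$$ \int_{[0,1]^n} (1 + p)^n \prod_{i=1}^{n} x_i^{p} \, dx = (1 + p)^n \left( \int_0^1 t^{p} \, dt \right)^{n} = (1 + p)^n \left( \frac{1}{1 + p} \right)^{n} = 1. $$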

Wang-Landau Sampling 1d, 2d Examples

I.1d <- function(x) {
    sin(4 * x) *
        x * ((x * ( x * (x * x - 4) + 1) - 1))
}
I.1d_v <- function(x) {
    matrix(apply(x, 2, function(z)
        sin(4 * z) *
        z * ((z * ( z * (z * z - 4) + 1) - 1))),
        ncol = ncol(x))
}
d <- harness(f = I.1d, fv = I.1d_v,
             lowerLimit = -2, upperLimit = 2, times = 100)
knitr::kable(d, digits = 3)
| expr | n.eval | min | lw.qu | median | mean | up.qu | max | total | relative |
|:---|---:|---:|---:|---:|---:|---:|---:|---:|---:|
| Non-vectorized Hcubature | 100 | 0 | 0 | 0 | 0.000 | 0.000 | 0.000 | 0.013 | 2.25 |
| Vectorized Hcubature | 100 | 0 | 0 | 0 | 0.000 | 0.000 | 0.002 | 0.023 | 3.75 |
| Non-vectorized Pcubature | 100 | 0 | 0 | 0 | 0.000 | 0.000 | 0.002 | 0.007 | 1.00 |
| Vectorized Pcubature | 100 | 0 | 0 | 0 | 0.000 | 0.000 | 0.000 | 0.016 | 2.74 |
| Non-vectorized cubature::cuhre | 100 | 0 | 0 | 0 | 0.000 | 0.000 | 0.000 | 0.023 | 4.00 |
| Vectorized cubature::cuhre | 100 | 0 | 0 | 0 | 0.001 | 0.001 | 0.002 | 0.053 | 8.87 |

I.2d <- function(x) {
    x1 <- x[1]; x2 <- x[2]
    sin(4 * x1 + 1) * cos(4 * x2) * x1 * (x1 * (x1 * x1)^2 - x2 * (x2 * x2 - x1) +2)
}
I.2d_v <- function(x) {
    matrix(apply(x, 2,
                 function(z) {
                     x1 <- z[1]; x2 <- z[2]
                     sin(4 * x1 + 1) * cos(4 * x2) * x1 * (x1 * (x1 * x1)^2 - x2 * (x2 * x2 - x1) +2)
                 }),
           ncol = ncol(x))
}
d <- harness(f = I.2d, fv = I.2d_v,
             lowerLimit = rep(-1, 2), upperLimit = rep(1, 2), times = 100)
knitr::kable(d, digits = 3)
| expr | n.eval | min | lw.qu | median | mean | up.qu | max | total | relative |
|:---|---:|---:|---:|---:|---:|---:|---:|---:|---:|
| Non-vectorized Hcubature | 100 | 0.004 | 0.004 | 0.004 | 0.005 | 0.004 | 0.006 | 0.474 | 10.60 |
| Vectorized Hcubature | 100 | 0.005 | 0.005 | 0.006 | 0.006 | 0.006 | 0.008 | 0.585 | 13.20 |
| Non-vectorized Pcubature | 100 | 0.000 | 0.000 | 0.000 | 0.001 | 0.000 | 0.048 | 0.089 | 1.00 |
| Vectorized Pcubature | 100 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.003 | 0.070 | 1.52 |
| Non-vectorized cubature::cuhre | 100 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.003 | 0.125 | 2.96 |
| Vectorized cubature::cuhre | 100 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.003 | 0.148 | 3.28 |

Implementation Notes

About the only real modification we have made to the underlying cubature library is that we use M = 16 rather than the default M = 19 suggested by the original author for pcubature. This allows us to comply with CRAN package size limits and seems to work reasonably well for the above tests. Future versions will allow for such customization on demand.

The changes made to the Cuba library are managed in a GitHub repo branch: each time a new release is made, we update the main branch and keep all changes for Unix platforms in a branch named R_pkg against the current main branch. Customization for Windows is done in the package itself using the Makevars.win script.

Summary

The recommendations are:

  1. Vectorize your function. The time spent in so doing pays back enormously. This is easy to do and the examples above show how.

  2. Vectorized hcubature seems to be a good starting point.

  3. For smooth integrands in low dimensions (≤ 3), pcubature might be worth trying out. Experiment before using it in a production package.

Session Info

sessionInfo()
## R version 4.4.2 (2024-10-31)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.1 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so;  LAPACK version 3.12.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=C              
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## time zone: Etc/UTC
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] mvtnorm_1.3-2  cubature_2.1.1 benchr_0.2.5  
## 
## loaded via a namespace (and not attached):
##  [1] digest_0.6.37      R6_2.5.1           fastmap_1.2.0      xfun_0.49         
##  [5] maketools_1.3.1    cachem_1.1.0       knitr_1.49         htmltools_0.5.8.1 
##  [9] rmarkdown_2.29     buildtools_1.0.0   lifecycle_1.0.4    cli_3.6.3         
## [13] RcppProgress_0.4.2 sass_0.4.9         jquerylib_0.1.4    compiler_4.4.2    
## [17] sys_3.4.3          tools_4.4.2        evaluate_1.0.1     bslib_0.8.0       
## [21] Rcpp_1.0.13-1      yaml_2.3.10        jsonlite_1.8.9     rlang_1.1.4