An R Package for Fast Sampling from von Mises Fisher Distribution

Introduction

The package vMF simulates von Mises-Fisher distribution (). Unlike the package movMF (Hornik and Grün 2014), which simulates and estimates mixtures of , vFM performs fast sampling as its source code is written in C++. vFM also computes the density and the normalization constant of .

The von Mises-Fisher distribution is used to model coordinates on a hypersphere of dimension p ≥ 2. Roughly speaking, it is the equivalent of the normal distribution on a hypersphere. As the normal distribution, is characterized by two parameters. The location (or mean directional) parameter μ around which draws will be concentrated and the intensity parameter η which measures the intensity of concentration of the draws around μ. The higher η, the more the draws are concentrated around μ. Compared to the normal distribution, μ is similar to the mean parameter of the normal distribution and 1/η is similar to the standard deviation.

There are several definitions of the density function of . In this package, the density is normalized by the uniform distribution without loss of generality. This is also the case in Mardia and Jupp (2009) and Hornik and Grün (2013).

Let z ∼ ℳ(η, μ). The density of z is given by

fp(z|η, μ) = Cp(η)eηzμ, where $\displaystyle C_p(x) = \left(\frac{x}{2}\right)^{\frac{p}{2}-1}\frac{1}{\Gamma\left(\frac{p}{2}\right)I_{\frac{p}{2}-1}(x)}$ is the normalization constant and I.(.) the Bessel function of the first kind defined by: $$\displaystyle I_{\alpha}(x) = \sum_{m=0}^{\infty}\frac{\left(\frac{x}{2}\right)^{2m+\alpha}}{m!\Gamma(m+\alpha + 1)}.$$ The normalization with respect to the uniform distribution implies Cp(0) = 1.

Simulation from von Mises Fisher distribution

The following algorithm provides a rejection sampling scheme for drawing a sample from with mean directional parameter μ = (0, ..., 0, 1) and concentration (intensity) parameter η ≥ 0 (see Section 2.1 in Hornik and Grün 2014).

  • Step 1. Calculate b using * Step 1. Calculate b using

$$b = \dfrac{p - 1}{2\eta + \sqrt{2\eta^2 + (p - 1)^2}}.$$ Let x0 = (1 − b)/(1 + b) and c = ηx0 + (p − 1)log (1 − x02).

  • Step 2. Generate Z ∼ Beta((p − 1)/2, (p − 1)/2) and U ∼ Unif([0, 1]) and calculate

$$W = \dfrac{1-(1+b)Z}{1-(1-b)Z}.$$

  • Step 3. If

ηW + (p − 1)log (1 − x0W) − c < log (U), go to step 2.

  • Step 4. Generate a uniform (d − 1)-dimensional unit vector V and return

$$\mathbf{X} =\left(\sqrt{1- W^2}\mathbf{V}^{\prime},W\right)^{\prime}$$

The uniform (d − 1)-dimensional unit vector V can be generated by simulating d − 1 independent standard normal random variables and normalizing them so as V2 = 1. To get sampling from with arbitrary mean direction parameter μ, X is multiplied from the left with a matrix where the first d − 1 columns consist of unitary basis vectors of the subspace orthogonal to μ and the last column is equal to μ.

Comparison of vMF and movMF

In this section, I compare vMF and movMF.

library(rbenchmark) 

fcompare <- function(n) {
  benchmark("vMF" = rvMF(n,c(1,0,0)), "movMF" = rmovMF(1,c(1,0,0)))
}

fcompare(1)
#>    test replications elapsed relative user.self sys.self user.child sys.child
#> 2 movMF          100   0.018       18     0.018        0          0         0
#> 1   vMF          100   0.001        1     0.001        0          0         0
fcompare(10)
#>    test replications elapsed relative user.self sys.self user.child sys.child
#> 2 movMF          100   0.018       18     0.019        0          0         0
#> 1   vMF          100   0.001        1     0.001        0          0         0
fcompare(100)
#>    test replications elapsed relative user.self sys.self user.child sys.child
#> 2 movMF          100   0.019     4.75     0.018        0          0         0
#> 1   vMF          100   0.004     1.00     0.004        0          0         0

vMF performs over movMF. The performance of vMF is much better when only few simulations are performed. When the sample is too large, the two package require approximately the same running time.

out  <- unlist(lapply(1:200, function(x) fcompare(x)$elapsed[1]/fcompare(x)$elapsed[2]))
library(ggplot2)
ggplot(data = data.frame(n = 1:200, time = out), aes(x = n, y = time)) +
  geom_point(col = "blue") + geom_hline(yintercept = 1, col = 2)

Many papers use simulations from the von-Mises Fisher distribution in a Markov Chain Monte Carlo (MCMC) process. A single draw is performed at each iteration of the MCMC. This is for example the case in Boucher and Houndetoungan (2022), Breza et al. (2020), McCormick and Zheng (2015). In such a simulation context, using vMF would take much less time than movMF. For example, I consider the process (zt)t ∈ ℕ which follows a random walk of the von-Mises Fisher distribution. The first variable, z0, is randowmly set on a 4-dimensional hypersphere and zt ∼ ℳ(1, zt − 1) t > 0. Simulating this process has about the same complexity as using von-Mises Fisher drawings in an MCMC.

set.seed(123)
P                  <- 4
initial            <- rmovMF(1, rep(0, P))
# Fonction based on vMF to simulate theta
SamplevMF          <- function(n) {
  output           <- matrix(0, n + 1, P)
  output[1, ]      <- initial
  for (i in 1:n) {
    output[i + 1,] <- rvMF(1, output[i,])
  }
  return(output)
}

# Fonction based on movMF to simulate theta
SamplemovMF        <-function(n){
  output           <- matrix(0, n + 1, P)
  output[1, ]      <- initial
  for (i in 1:n) {
    output[i + 1,] <- rmovMF(1, output[i,])
  }
  return(output)
}
benchmark("vMF" = SamplevMF(1000), "movMF" = SamplemovMF(1000))
#>    test replications elapsed relative user.self sys.self user.child sys.child
#> 2 movMF          100  35.605   51.902    35.260    0.032          0         0
#> 1   vMF          100   0.686    1.000     0.643    0.044          0         0

The comparison of the running times vMF is less time-consuming

Boucher, Vincent, and Elysée Aristide Houndetoungan. 2022. “Estimating Peer Effects Using Partial Network Data.”
Breza, Emily, Arun G Chandrasekhar, Tyler H McCormick, and Mengjie Pan. 2020. “Using Aggregated Relational Data to Feasibly Identify Network Structure Without Network Data.” American Economic Review 110 (8): 2454–84.
Hornik, Kurt, and Bettina Grün. 2013. “On Conjugate Families and Jeffreys Priors for von Mises–Fisher Distributions.” Journal of Statistical Planning and Inference 143 (5): 992–99.
———. 2014. “movMF: An r Package for Fitting Mixtures of von Mises-Fisher Distributions.” Journal of Statistical Software 58 (10): 1–31.
Mardia, Kanti V, and Peter E Jupp. 2009. Directional Statistics. Vol. 494. John Wiley & Sons.
McCormick, Tyler H, and Tian Zheng. 2015. “Latent Surface Models for Networks Using Aggregated Relational Data.” Journal of the American Statistical Association 110 (512): 1684–95.