---
title: "An R Package for Fast Sampling from von Mises Fisher Distribution"
author: "Aristide Houndetoungan"
date: "`r Sys.Date()`"
output:
  pdf_document:
    citation_package: natbib
    number_sections: true
  bookdown::pdf_book:
    citation_package: biblatex
bibliography: ["References.bib", "Packages.bib"]
biblio-style: "apalike"
link-citations: true
urlcolor: blue
vignette: >
  %\VignetteIndexEntry{An R Package for Fast Sampling from von Mises Fisher Distribution}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

# Introduction{-#over}

The package **vMF** simulates from the von Mises-Fisher distribution ($\mathcal{M}$). Unlike the package [**movMF**](https://CRAN.R-project.org/package=movMF) [@hornik2014movmf], which simulates and estimates mixtures of $\mathcal{M}$, **vMF** performs fast sampling because its source code is written in C++. **vMF** also computes the density and the normalization constant of $\mathcal{M}$.

The von Mises-Fisher distribution models coordinates on a hypersphere of dimension $p \ge 2$. Roughly speaking, it is the equivalent of the normal distribution on a hypersphere. Like the normal distribution, $\mathcal{M}$ is characterized by two parameters: the location (or mean directional) parameter $\boldsymbol{\mu}$, around which draws are concentrated, and the intensity parameter $\eta$, which measures how strongly the draws concentrate around $\boldsymbol{\mu}$. The higher $\eta$, the more the draws are concentrated around $\boldsymbol{\mu}$. By analogy with the normal distribution, $\boldsymbol{\mu}$ plays the role of the mean parameter and $1/\eta$ that of the standard deviation.

There are several definitions of the density function of $\mathcal{M}$. In this package, the density is normalized by the uniform distribution without loss of generality. This is also the case in @mardia2009directional and @hornik2013conjugate.
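The uniform normalization can be illustrated numerically with the normalization constant $C_p$ defined in the next section. The helper `Cp` below is a minimal base-R sketch written for illustration only; it is not a function exported by **vMF**.

```{r cpcheck, echo = TRUE, eval = TRUE}
# Normalization constant C_p(x) of the von Mises-Fisher density,
# computed with base R's besselI (helper for illustration only)
Cp <- function(p, x) {
  if (x == 0) return(1)  # limit value: the density reduces to the uniform
  (x / 2)^(p / 2 - 1) / (gamma(p / 2) * besselI(x, p / 2 - 1))
}
Cp(3, 1e-8)  # close to 1: consistent with C_p(0) = 1
Cp(3, 2)     # below 1: mass concentrates around the mean direction
```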
Let $\mathbf{z} \sim \mathcal{M}\left(\eta,\boldsymbol{\mu}\right)$. The density of $\mathbf{z}$ is given by
$$f_p(\mathbf{z}|\eta, \boldsymbol{\mu}) = C_p(\eta) e^{\eta\mathbf{z}'\boldsymbol{\mu}},$$
where $\displaystyle C_p(x) = \left(\frac{x}{2}\right)^{\frac{p}{2}-1}\frac{1}{\Gamma\left(\frac{p}{2}\right)I_{\frac{p}{2}-1}(x)}$ is the normalization constant and $I_{\cdot}(\cdot)$ is the Bessel function of the first kind, defined by
$$\displaystyle I_{\alpha}(x) = \sum_{m=0}^{\infty}\frac{\left(\frac{x}{2}\right)^{2m+\alpha}}{m!\,\Gamma(m+\alpha + 1)}.$$

\noindent The normalization with respect to the uniform distribution implies $C_p(0)=1$.

# Simulation from von Mises Fisher distribution{-#sim}

The following algorithm provides a rejection sampling scheme for drawing a sample from $\mathcal{M}$ with mean directional parameter $\boldsymbol{\mu} = (0, \dots, 0, 1)$ and concentration (intensity) parameter $\eta \ge 0$ [see Section 2.1 in @hornik2014movmf].

* Step 1. Calculate $b$ using
$$b = \dfrac{p - 1}{2\eta + \sqrt{4\eta^2 + (p - 1)^2}}.$$
Let $x_0 = (1 - b)/(1 + b)$ and $c = \eta x_0 + (p - 1)\log\left(1 - x_0^2\right)$.
* Step 2. Generate $Z \sim Beta((p-1)/2,(p-1)/2)$ and $U \sim Unif([0, 1])$ and calculate
$$W = \dfrac{1-(1+b)Z}{1-(1-b)Z}.$$
* Step 3. If
$$\eta W+(p-1)\log(1-x_0W)-c<\log(U),$$
go to Step 2.
* Step 4. Generate a uniform $(p-1)$-dimensional unit vector $\mathbf{V}$ and return
$$\mathbf{X} =\left(\sqrt{1- W^2}\mathbf{V}^{\prime},W\right)^{\prime}.$$

The uniform $(p-1)$-dimensional unit vector $\mathbf{V}$ can be generated by simulating $p-1$ independent standard normal random variables and normalizing them so that $\lVert \mathbf{V} \rVert_2 = 1$.
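The steps above can be sketched in plain R. This is an illustrative implementation for the canonical direction $\boldsymbol{\mu} = (0, \dots, 0, 1)$, not the C++ code used by **vMF**; the function name `rvmf_canonical` is mine.

```{r rejsketch, echo = TRUE, eval = TRUE}
# One draw from M(eta, mu) with mu = (0, ..., 0, 1) on the (p-1)-sphere,
# following Steps 1-4 of the rejection scheme above
rvmf_canonical <- function(p, eta) {
  # Step 1: constants b, x0 and c
  b  <- (p - 1) / (2 * eta + sqrt(4 * eta^2 + (p - 1)^2))
  x0 <- (1 - b) / (1 + b)
  cc <- eta * x0 + (p - 1) * log(1 - x0^2)
  repeat {
    # Step 2: candidate W built from a Beta((p-1)/2, (p-1)/2) draw
    Z <- rbeta(1, (p - 1) / 2, (p - 1) / 2)
    U <- runif(1)
    W <- (1 - (1 + b) * Z) / (1 - (1 - b) * Z)
    # Step 3: accept W unless the rejection test sends us back to Step 2
    if (eta * W + (p - 1) * log(1 - x0 * W) - cc >= log(U)) break
  }
  # Step 4: uniform unit vector V from normalized standard normals
  V <- rnorm(p - 1)
  V <- V / sqrt(sum(V^2))
  c(sqrt(1 - W^2) * V, W)
}
set.seed(1)
x <- rvmf_canonical(p = 3, eta = 5)
sum(x^2)  # draws lie on the unit sphere, so the squared norm is 1
```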
To sample from $\mathcal{M}$ with an arbitrary mean direction parameter $\boldsymbol{\mu}$, $\mathbf{X}$ is multiplied from the left by a matrix whose first $p-1$ columns consist of unitary basis vectors of the subspace orthogonal to $\boldsymbol{\mu}$ and whose last column is equal to $\boldsymbol{\mu}$.

# Comparison of vMF and movMF{-#comp}

In this section, I compare **vMF** and [**movMF**](https://CRAN.R-project.org/package=movMF).

```{r ex1, echo = TRUE, eval = TRUE}
library(vMF)
library(movMF)
library(rbenchmark)
fcompare <- function(n) {
  benchmark("vMF" = rvMF(n, c(1, 0, 0)),
            "movMF" = rmovMF(n, c(1, 0, 0)))
}
fcompare(1)
fcompare(10)
fcompare(100)
```

**vMF** outperforms [**movMF**](https://CRAN.R-project.org/package=movMF). The performance gain of **vMF** is largest when only a few draws are requested. For large samples, the two packages require approximately the same running time.

```{r ex11a, echo = FALSE, eval = TRUE}
# load data to save time during the building
load("out.Rdata")
```

```{r ex11b, echo = TRUE, eval = FALSE}
out <- unlist(lapply(1:200, function(x) fcompare(x)$elapsed[1] / fcompare(x)$elapsed[2]))
```

```{r ex11c, echo = TRUE, eval = TRUE, fig.height = 4, fig.align = "center"}
library(ggplot2)
ggplot(data = data.frame(n = 1:200, time = out), aes(x = n, y = time)) +
  geom_point(col = "blue") +
  geom_hline(yintercept = 1, col = 2)
```

Many papers use simulations from the von Mises-Fisher distribution in a Markov Chain Monte Carlo (MCMC) process, where a single draw is performed at each iteration. This is for example the case in @boucher2019partial, @breza2020using, and @mccormick2015latent. In such a simulation context, using **vMF** takes much less time than [**movMF**](https://CRAN.R-project.org/package=movMF). As an example, I consider the process $\left(\mathbf{z}_t\right)_{t\in\mathbb{N}}$, which follows a random walk based on the von Mises-Fisher distribution.
The first variable, $\mathbf{z}_0$, is randomly set on a 4-dimensional hypersphere, and $\mathbf{z}_t \sim \mathcal{M}\left(1,\mathbf{z}_{t - 1}\right)$ for all $t > 0$. Simulating this process has about the same complexity as using von Mises-Fisher draws within an MCMC.

```{r ex2a, echo = TRUE, eval = FALSE}
set.seed(123)
P <- 4
initial <- rmovMF(1, rep(0, P))
# Function based on vMF to simulate the process
SamplevMF <- function(n) {
  output <- matrix(0, n + 1, P)
  output[1, ] <- initial
  for (i in 1:n) {
    output[i + 1, ] <- rvMF(1, output[i, ])
  }
  return(output)
}
# Function based on movMF to simulate the process
SamplemovMF <- function(n) {
  output <- matrix(0, n + 1, P)
  output[1, ] <- initial
  for (i in 1:n) {
    output[i + 1, ] <- rmovMF(1, output[i, ])
  }
  return(output)
}
benchmark("vMF" = SamplevMF(1000), "movMF" = SamplemovMF(1000))
```

```{r ex2b, echo = FALSE, eval = TRUE}
print(outbench)
```

The comparison of running times confirms that **vMF** is much less time-consuming than **movMF** in this sequential setting.