An Introduction to synlik

Introduction

The synlik package provides synthetic likelihood methods for models with intractable likelihoods. The package is meant to be as general-purpose as possible: as long as you can simulate data from your model, you should be able to fit it.

Creating a synlik object

A synlik object is mainly composed of the simulator, which is the function that simulates data from the model of interest. In addition, it is possible to specify a function summaries, which transforms the data generated by simulator into summary statistics. The simulator can generate any kind of output, as long as summaries is able to transform it into a matrix where each row is a simulated vector of statistics. If summaries is not specified, then simulator has to output such a matrix.
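As a rough illustration of the interface described above, here is a minimal sketch of a simulator and summaries pair for a toy Gaussian model. The names normSimul and normStats, the parameter names mu and logSigma, and the nObs entry of extraArgs are all hypothetical; the argument lists (param, nsim, extraArgs, ...) and (x, extraArgs, ...) are assumed to match what synlik expects, so check ?synlik before relying on them.

```r
# Hypothetical toy simulator: returns an nsim x nObs matrix,
# one simulated dataset per row.
normSimul <- function(param, nsim, extraArgs, ...) {
  nObs <- extraArgs$nObs
  matrix(rnorm(nsim * nObs,
               mean = param[["mu"]],
               sd   = exp(param[["logSigma"]])),
         nrow = nsim, ncol = nObs)
}

# Hypothetical summaries function: maps each simulated dataset (row)
# to a row of statistics, here the sample mean and standard deviation.
normStats <- function(x, extraArgs, ...) {
  if (!is.matrix(x)) x <- matrix(x, nrow = 1)
  cbind(rowMeans(x), apply(x, 1, sd))
}

# Five datasets of 20 observations each, then their statistics
sims  <- normSimul(c(mu = 0, logSigma = 0), nsim = 5, extraArgs = list(nObs = 20))
stats <- normStats(sims, extraArgs = list())
dim(stats)
```

The key point is the shape contract: whatever simulator produces, summaries must return a matrix with one row of statistics per simulated dataset.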

Here we set up a synlik object representing the Ricker map. The observations are given by $Y_t \sim \mathrm{Pois}(\phi N_t)$, where the hidden state has the following dynamics: $N_t = r N_{t-1} \exp(-N_{t-1} + e_t)$, with process noise $e_t \sim N(0, \sigma^2)$. This is how we create the object:

library(synlik)

## Loading required package: Rcpp

ricker_sl <- synlik(simulator = rickerSimul,
                    summaries = rickerStats,
                    param = c( logR = 3.8, logSigma = log(0.3), logPhi = log(10) ),
                    extraArgs = list("nObs" = 50, "nBurn" = 50)
)

Here:

rickerSimul and rickerStats are functions provided by synlik.
param is a named vector that contains the log-parameters that will be used by rickerSimul(param, nsim, extraArgs, ...).
extraArgs contains additional arguments required by rickerSimul, see ?rickerSimul for details.
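To make the model equations concrete, the Ricker dynamics given earlier can be sketched in base R. This is only an illustration and not the code behind rickerSimul: the function name simulateRicker, the initial state N0 = 1 and the assumption that $e_t$ is Gaussian with standard deviation $\sigma$ are choices made here, and the packaged simulator may initialise the hidden state differently.

```r
# Illustrative Ricker simulator (hypothetical, not synlik's rickerSimul).
# param holds the log-parameters, as in the synlik object above.
simulateRicker <- function(param, nObs = 50, nBurn = 50, N0 = 1) {
  r     <- exp(param[["logR"]])
  sigma <- exp(param[["logSigma"]])
  phi   <- exp(param[["logPhi"]])

  nTot <- nBurn + nObs
  N    <- numeric(nTot)
  prev <- N0                      # assumed initial hidden state
  for (t in seq_len(nTot)) {
    # N_t = r * N_{t-1} * exp(-N_{t-1} + e_t), with e_t ~ N(0, sigma^2)
    prev <- r * prev * exp(-prev + rnorm(1, mean = 0, sd = sigma))
    N[t] <- prev
  }

  # Discard the burn-in and observe Y_t ~ Pois(phi * N_t)
  rpois(nObs, lambda = phi * N[(nBurn + 1):nTot])
}

y <- simulateRicker(c(logR = 3.8, logSigma = log(0.3), logPhi = log(10)))
length(y)
```

The burn-in period lets the hidden state forget its arbitrary starting value before observations are recorded, which is the role nBurn plays in extraArgs.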

Now we are ready to simulate data from the object:

ricker_sl@data <- simulate(ricker_sl, nsim = 1, seed = 54)

Here ricker_sl@data is just a vector, but synlik allows the simulator to simulate any kind of object, so it is often necessary to encapsulate an adequate plotting function into the object:

ricker_sl@plotFun <- function(input, ...) plot(drop(input), type = 'l', ylab = "Pop", xlab = "Time", ...)
plot(ricker_sl)

If we want to simulate several datasets we simply do:

tmp <- simulate(ricker_sl, nsim = 10)
dim(tmp)

## [1] 10 50

So far we have just simulated data, not summary statistics. In this particular example rickerStats needs to be passed the reference data in ricker_sl@data in order to be able to calculate the statistics. We can do that by using the slot extraArgs:

ricker_sl@extraArgs$obsData <- ricker_sl@data

Now we are ready to simulate summary statistics:

tmp <- simulate(ricker_sl, nsim = 2, stats = TRUE)
tmp

##           [,1]         [,2]         [,3]       [,4]       [,5]  [,6] [,7]
## [1,] 0.8016316 0.0005366660 2.427694e-05 0.03743647 -0.2248205 36.90   17
## [2,] 0.9022046 0.0003317784 1.959014e-05 0.05564859 -0.2073172 37.38   21
##          [,8]      [,9]     [,10]     [,11]      [,12]     [,13]
## [1,] 3180.570 -903.5390 -370.0129 -526.6177   46.91435 -203.1856
## [2,] 3555.596 -863.9911 -930.2048  547.7001 -842.12995  225.4435

We can also check their approximate normality:

checkNorm(ricker_sl)

Looking at the synthetic likelihood

If we want to estimate the value of the synthetic likelihood at a certain location in the parameter space, we can do it by using the function slik as follows:

slik(ricker_sl, 
     param  = c(logR = 3.8, logSigma = log(0.3), logPhi = log(10)),
     nsim   = 1e3)

## [1] -20.44001
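For intuition about the number returned above, the basic idea of a synthetic likelihood can be sketched in a few lines: fit a multivariate normal to the simulated statistics and evaluate its log-density at the observed statistics. The function synLogLik below is a hypothetical illustration using the plain sample mean and covariance; it is not the implementation of slik, which may use a different (for example, more robust) covariance estimator.

```r
# Hypothetical sketch of a Gaussian synthetic log-likelihood.
# simStats: matrix with one row of statistics per simulated dataset.
# obsStats: vector of statistics computed on the observed data.
synLogLik <- function(obsStats, simStats) {
  mu    <- colMeans(simStats)
  Sigma <- cov(simStats)
  d     <- length(mu)

  R <- chol(Sigma)                                   # Sigma = t(R) %*% R
  z <- backsolve(R, obsStats - mu, transpose = TRUE) # solves t(R) z = obs - mu

  # log N(obsStats; mu, Sigma)
  -0.5 * sum(z^2) - sum(log(diag(R))) - 0.5 * d * log(2 * pi)
}

# Tiny deterministic example: mean zero, covariance diag(2/3)
sim <- rbind(c(1, 0), c(-1, 0), c(0, 1), c(0, -1))
synLogLik(c(0, 0), sim)
```

Because the mean and covariance are themselves estimated from nsim simulations, the result is a noisy estimate, which is why repeated evaluations at the same parameters give slightly different values.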

We can also look at slices of this function with respect to each parameter:

slice(object = ricker_sl, 
      ranges = list("logR" = seq(3.5, 3.9, by = 0.01),
                    "logPhi" = seq(2, 2.6, by = 0.01),
                    "logSigma" = seq(-2, -0.5, by = 0.02)), 
      param = c(logR = 3.8, logSigma = log(0.3), logPhi = log(10)), 
      nsim = 1000)
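Conceptually, a one-dimensional slice evaluates the function along a grid of values for one parameter while the others stay fixed at param. The helper sliceSketch below is a hypothetical base-R illustration of that idea on an arbitrary function; it is not how slice is implemented and it does no plotting.

```r
# Hypothetical sketch of 1D slicing: vary one parameter at a time,
# holding the remaining ones at their values in param.
sliceSketch <- function(fun, param, ranges) {
  out <- lapply(names(ranges), function(nam) {
    sapply(ranges[[nam]], function(val) {
      p <- param
      p[nam] <- val      # overwrite only the parameter being sliced
      fun(p)
    })
  })
  setNames(out, names(ranges))
}

# Toy target standing in for a log-likelihood
toyFun <- function(p) -sum(p^2)
sliceSketch(toyFun,
            param  = c(a = 1, b = 2),
            ranges = list(a = c(0, 1), b = c(0, 2)))
```

With a synthetic likelihood in place of toyFun, each grid point requires nsim fresh simulations, so finer grids increase the cost proportionally.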

Finally we can have 2D slices with respect to pairs of parameters:

slice(object = ricker_sl, 
      ranges = list("logR" = seq(3.5, 3.9, by = 0.02),
                    "logPhi" = seq(2, 2.6, by = 0.02)), 
      pairs = TRUE,
      param = c(logR = 3.8, logSigma = log(0.3), logPhi = log(10)), 
      nsim = 1000, 
      multicore = TRUE,
      ncores = 2)

Notice that here we have used the multicore option, which distributes the computation among different cores or cluster nodes. The slik function also provides this option, but for such a simple model the time needed to set up the cluster exceeds the simulation time.

Estimating the parameters by MCMC

The unknown model parameters can be estimated by MCMC, using the smcmc function. Here is an example (you might want to increase niter):

ricker_sl <- smcmc(ricker_sl, 
                   initPar = c(3.2, -1, 2.6),
                   niter = 10, 
                   burn = 3,
                   priorFun = function(input, ...) sum(input), 
                   propCov = diag(c(0.1, 0.1, 0.1))^2, 
                   nsim = 500)

Notice that priorFun returns the log-density of the prior. If we have not reached convergence we can do some more MCMC iterations by using the continue generic:

ricker_sl <- continue(ricker_sl, niter = 10)
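Since priorFun must return a log-density, it may help to see where such a function and the proposal covariance enter a sampler. The sketch below shows a hypothetical log-prior (independent normals whose means and standard deviations are purely illustrative) and a single random-walk Metropolis step. It is a generic textbook step written for illustration, not the algorithm implemented by smcmc, and logLikFun is a stand-in for a synthetic log-likelihood estimate.

```r
# Hypothetical log-prior on (logR, logSigma, logPhi); the hyperparameters
# are illustrative assumptions, not recommendations.
logPrior <- function(input, ...) {
  sum(dnorm(input, mean = c(4, -1, 2), sd = c(1, 1, 1), log = TRUE))
}

# One generic random-walk Metropolis step on the log scale.
metropolisStep <- function(current, currentLogPost, logLikFun, propCov) {
  # Gaussian proposal centred at the current state with covariance propCov
  proposal <- current + drop(crossprod(chol(propCov), rnorm(length(current))))

  # Log-posterior up to a constant: log-likelihood plus log-prior
  propLogPost <- logLikFun(proposal) + logPrior(proposal)

  # Accept with probability min(1, exp(propLogPost - currentLogPost))
  if (log(runif(1)) < propLogPost - currentLogPost) {
    list(par = proposal, logPost = propLogPost, accepted = TRUE)
  } else {
    list(par = current, logPost = currentLogPost, accepted = FALSE)
  }
}

# Usage with a flat stand-in likelihood and the proposal scale used above
start <- c(3.2, -1, 2.6)
step  <- metropolisStep(start, logPrior(start),
                        logLikFun = function(p) 0,
                        propCov   = diag(c(0.1, 0.1, 0.1))^2)
step$accepted
```

Working on the log scale means the prior and likelihood contributions are added rather than multiplied, which is why a prior function returning a density instead of a log-density would silently distort the acceptance ratio.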

Finally we can plot the MCMC output (here we plot a pre-computed object):

data(ricker_smcmc)
addline1 <- function(parNam, ...) abline(h = ricker_smcmc@param[parNam], lwd = 2, lty = 2, col = 3) 
addline2 <- function(parNam, ...) abline(v = ricker_smcmc@param[parNam], lwd = 2, lty = 2, col = 3)

plot(ricker_smcmc, addPlot1 = "addline1", addPlot2 = "addline2")

## [1] "Plotting the MCMC chains"