Before diving into this vignette, we recommend reading the vignette Introduction to LaMa.
The regular HMM formulation needs a key assumption to be applicable, namely the data need to be observed at regular, equidistant time-points such that the transition probabilties can be interpreted meaningfully w.r.t. a specific time unit. If this is not the case, the model used should acocunt for this by building on a mathematical formulation in continuous time. The obvious choice here is to retain most of the HMM model formulation, but replace the unobserved discrete-time Markov chain with a continuous-time Markov chain. However, here it is important to note that the so-called snapshot property needs to be fulfilled, i.e. the observed process at time t can only depend on the state at that time instant and not on the interval since the previous observation. For more details see Glennie et al. (2023).
A continuous-time Markov chain is characterised by a so-called (infinitesimal) generator matrix $$ Q = \begin{pmatrix} q_{11} & q_{12} & \cdots & q_{1N} \\ q_{21} & q_{22} & \cdots & q_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ q_{N1} & q_{N2} & \cdots & q_{NN} \\ \end{pmatrix}, $$ where the diagonal entries are qii = −∑j ≠ iqij, qij ≥ 0 for i ≠ j. This matrix can be interpreted as the derivative of the transition probability matrix and completely describes the dynamics of the state process. The time-spent in a state i is exponentially distributed with rate −qii and conditional on leaving the state, the probability to transition to a state j ≠ i is ωij = qij/ − qii. For a more detailed introduction see Dobrow (2016) (pp. 265 f.). For observation times t1 and t2, we can then obtain the transition probability matrix between these points via the identity Γ(t1, t2) = exp (Q(t2 − t1)), where exp () is the matrix expoential. This follows from the so-called Kolmogorov forward equations, but for more details see Dobrow (2016).
We start by setting parameters to simulate data. In this example, state 1 has a smaller rate and the state dwell time in state one follows and Exp(0.5) distribution, i.e. it exhibits longer dwell times than state 2 with rate 1.
We simulate the continuous-time Markov chain by drawing the exponentially distributed state dwell-times. Within a stay, we can assume whatever structure we like for the observation times, as these are not explicitly modeled. Here we choose to generate them by a Poisson process with rate λ = 1, but this choice is arbitrary. For more details on Poisson point processes, see the MM(M)PP vignette.
set.seed(123)
k = 200 # number of state switches
trans_times = s = rep(NA, k) # time points where the chain transitions
s[1] = sample(1:2, 1) # initial distribuion c(0.5, 0.5)
# exponentially distributed waiting times
trans_times[1] = rexp(1, -Q[s[1],s[1]])
n_arrivals = rpois(1, trans_times[1])
obs_times = sort(runif(n_arrivals, 0, trans_times[1]))
x = rnorm(n_arrivals, mu[s[1]], sigma[s[1]])
for(t in 2:k){
s[t] = c(1,2)[-s[t-1]] # for 2-states, always a state swith when transitioning
# exponentially distributed waiting times
trans_times[t] = trans_times[t-1] + rexp(1, -Q[s[t], s[t]])
n_arrivals = rpois(1, trans_times[t]-trans_times[t-1])
obs_times = c(obs_times,
sort(runif(n_arrivals, trans_times[t-1], trans_times[t])))
x = c(x, rnorm(n_arrivals, mu[s[t]], sigma[s[t]]))
}
Let’s visualise the simulated continuous-time HMM:
color = c("orange", "deepskyblue")
n = length(obs_times)
plot(obs_times[1:50], x[1:50], pch = 16, bty = "n", xlab = "observation times",
ylab = "x", ylim = c(-5,25))
segments(x0 = c(0,trans_times[1:48]), x1 = trans_times[1:49],
y0 = rep(-5,50), y1 = rep(-5,50), col = color[s[1:49]], lwd = 4)
legend("topright", lwd = 2, col = color,
legend = c("state 1", "state 2"), box.lwd = 0)
The likelhood of a continuous-time HMM for observations xt1, …, xtT
at irregular time points t1, …, tT
has the exact same structure as the regular HMM likelihood: L(θ) = δ(1)Γ(t1, t2)P(xt2)Γ(t2, t3)P(xt3)…Γ(tT − 1, tT)P(xtT)1t,
where δ(1), P and 1t are as usual and Γ(tk, tk + 1)
is computed as explained above. Thus we can fit such models using the
standard implementation of the general forward algorithm
forward_g()
with time-varying transition probability
matrices. We can use the generator()
function to compute
the infinitesimal generator matrix from an unconstrained parameter
vector and tpm_cont()
to compute all matrix
exponentials.
nll = function(par, timediff, x, N){
mu = par[1:N]
sigma = exp(par[N+1:N])
Q = generator(par[2*N+1:(N*(N-1))]) # generator matrix
Pi = stationary_cont(Q) # stationary dist of CT Markov chain
Qube = tpm_cont(Q, timediff) # this computes exp(Q*timediff)
allprobs = matrix(1, nrow = length(x), ncol = N)
ind = which(!is.na(x))
for(j in 1:N){
allprobs[ind,j] = dnorm(x[ind], mu[j], sigma[j])
}
-forward_g(Pi, Qube, allprobs)
}
The simulation is very similar but we now also have to draw which state to transition to, as explained in the beginning.
set.seed(123)
k = 200 # number of state switches
trans_times = s = rep(NA, k) # time points where the chain transitions
s[1] = sample(1:3, 1) # uniform initial distribuion
# exponentially distributed waiting times
trans_times[1] = rexp(1, -Q[s[1],s[1]])
n_arrivals = rpois(1, trans_times[1])
obs_times = sort(runif(n_arrivals, 0, trans_times[1]))
x = rnorm(n_arrivals, mu[s[1]], sigma[s[1]])
for(t in 2:k){
# off-diagonal elements of the s[t-1] row of Q divided by the diagonal element
# give the probabilites of the next state
s[t] = sample(c(1:3)[-s[t-1]], 1, prob = Q[s[t-1],-s[t-1]]/-Q[s[t-1],s[t-1]])
# exponentially distributed waiting times
trans_times[t] = trans_times[t-1] + rexp(1, -Q[s[t], s[t]])
n_arrivals = rpois(1, trans_times[t]-trans_times[t-1])
obs_times = c(obs_times,
sort(runif(n_arrivals, trans_times[t-1], trans_times[t])))
x = c(x, rnorm(n_arrivals, mu[s[t]], sigma[s[t]]))
}
par = c(mu = c(5, 10, 25), # state-dependent means
logsigma = c(log(2), log(2), log(6)), # state-dependent sds
qs = rep(0, 6)) # off-diagonals of Q
timediff = diff(obs_times)
system.time(
mod2 <- nlm(nll, par, timediff = timediff, x = x, N = 3, stepmax = 10)
)
#> user system elapsed
#> 2.062 2.633 1.565
# without restricting stepmax, we run into numerical problems