Introduction

RHLP (Regression with a Hidden Logistic Process) provides flexible, user-friendly probabilistic segmentation of time series (or structured longitudinal data) with smooth and/or abrupt regime changes. It uses a mixture-model-based regression approach with a hidden logistic process, fitted by the EM algorithm.

This vignette was written in R Markdown, using the knitr package for production.

See help(package = "samurais") for further details, and citation("samurais") for references.

Load data

library(samurais) # Provides emRHLP() and the toy dataset
data("univtoydataset")

Set up RHLP model parameters

K <- 5 # Number of regimes (mixture components)
p <- 3 # Dimension of beta (order of the polynomial regressors)
q <- 1 # Dimension of w (order of the logistic regression: to be set to 1 for segmentation)
variance_type <- "heteroskedastic" # "heteroskedastic" or "homoskedastic" model
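
To make the role of w concrete, here is a minimal base-R sketch of a logistic process of order q = 1: each regime's probability at input x is a softmax of a linear function of x. The weights W, the grid x, and the choice of 3 regimes are hypothetical values chosen for illustration; they are not estimates from the package.

```r
# Hypothetical example: 3 regimes on a grid of 200 inputs in [0, 1]
x <- seq(0, 1, length.out = 200)

# Illustrative logistic weights (NOT fitted values):
# columns are (intercept, slope) for regimes 1 and 2; regime 3 is the reference
W <- cbind(c(40, -100), c(20, -30))

# Linear predictors; the reference regime is fixed at 0
eta <- cbind(cbind(1, x) %*% W, 0)
eta <- eta - apply(eta, 1, max) # Subtract the row maximum for numerical stability

# Softmax: regime probabilities at each x (rows sum to 1)
pi_k <- exp(eta) / rowSums(exp(eta))

# Most probable regime at each x
regime <- apply(pi_k, 1, which.max)
print(table(regime))
```

With larger slopes the transitions between regimes become sharper; with smaller slopes they become smoother.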

Set up EM parameters

n_tries <- 1
max_iter <- 1500
threshold <- 1e-6
verbose <- TRUE
verbose_IRLS <- FALSE

Estimation

rhlp <- emRHLP(univtoydataset$x, univtoydataset$y, K, p, q, 
               variance_type, n_tries, max_iter, threshold, verbose, verbose_IRLS)
## EM: Iteration : 1 || log-likelihood : -2119.2730847863
## EM: Iteration : 2 || log-likelihood : -1149.01040275042
## EM: Iteration : 3 || log-likelihood : -1118.2038425746
## EM: Iteration : 4 || log-likelihood : -1096.8826062752
## EM: Iteration : 5 || log-likelihood : -1067.55719335696
## EM: Iteration : 6 || log-likelihood : -1037.26620104185
## EM: Iteration : 7 || log-likelihood : -1022.7174307707
## EM: Iteration : 8 || log-likelihood : -1006.118254514
## EM: Iteration : 9 || log-likelihood : -1001.18491882476
## EM: Iteration : 10 || log-likelihood : -1000.91250762673
## EM: Iteration : 11 || log-likelihood : -1000.62280599148
## EM: Iteration : 12 || log-likelihood : -1000.30309886791
## EM: Iteration : 13 || log-likelihood : -999.932334867598
## EM: Iteration : 14 || log-likelihood : -999.484219689836
## EM: Iteration : 15 || log-likelihood : -998.928118018318
## EM: Iteration : 16 || log-likelihood : -998.234244639955
## EM: Iteration : 17 || log-likelihood : -997.359536244659
## EM: Iteration : 18 || log-likelihood : -996.15265481515
## EM: Iteration : 19 || log-likelihood : -994.697863399405
## EM: Iteration : 20 || log-likelihood : -993.186583927774
## EM: Iteration : 21 || log-likelihood : -991.813523755133
## EM: Iteration : 22 || log-likelihood : -990.611295180997
## EM: Iteration : 23 || log-likelihood : -989.539226242094
## EM: Iteration : 24 || log-likelihood : -988.553118850066
## EM: Iteration : 25 || log-likelihood : -987.539963656861
## EM: Iteration : 26 || log-likelihood : -986.073920058718
## EM: Iteration : 27 || log-likelihood : -983.263549767648
## EM: Iteration : 28 || log-likelihood : -979.340492092037
## EM: Iteration : 29 || log-likelihood : -977.468559826356
## EM: Iteration : 30 || log-likelihood : -976.653534229025
## EM: Iteration : 31 || log-likelihood : -976.589338743393
## EM: Iteration : 32 || log-likelihood : -976.589338067356
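
The log stops at iteration 32, well before max_iter. A minimal sketch of why, assuming the stopping rule compares the relative change in log-likelihood between consecutive iterations against threshold (the exact rule used inside emRHLP is not shown in this document, so this is an assumption):

```r
# Last three log-likelihood values copied from the EM log above
loglik <- c(-976.653534229025, -976.589338743393, -976.589338067356)
threshold <- 1e-6

# Assumed stopping rule: relative change between consecutive iterations
rel_change <- abs(diff(loglik) / loglik[-length(loglik)])
converged <- rel_change <= threshold
print(converged)
```

Under this assumption the step from iteration 30 to 31 does not meet the threshold, while the step from 31 to 32 does. An absolute-change rule would give the same stopping point for these values.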

Summary

rhlp$summary()
## ---------------------
## Fitted RHLP model
## ---------------------
## 
## RHLP model with K = 5 components:
## 
##  log-likelihood nu       AIC       BIC       ICL
##       -976.5893 33 -1009.589 -1083.959 -1083.176
## 
## Clustering table (Number of observations in each regimes):
## 
##   1   2   3   4   5 
## 100 120 200 100 150 
## 
## Regression coefficients:
## 
##       Beta(K = 1) Beta(K = 2) Beta(K = 3) Beta(K = 4) Beta(K = 5)
## 1    6.031875e-02   -5.434903   -2.770416    120.7698    4.027543
## X^1 -7.424718e+00  158.705091   43.879453   -474.5887   13.194260
## X^2  2.931652e+02 -650.592347  -94.194780    597.7947  -33.760602
## X^3 -1.823560e+03  865.329795   67.197059   -244.2385   20.402152
## 
## Variances:
## 
##  Sigma2(K = 1) Sigma2(K = 2) Sigma2(K = 3) Sigma2(K = 4) Sigma2(K = 5)
##       1.220624      1.110243      1.079394     0.9779734      1.028332
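
The values in the summary table can be cross-checked by hand. The sketch below assumes the penalized log-likelihood convention AIC = loglik - nu and BIC = loglik - nu * log(n) / 2, which is inferred from the reported numbers rather than taken from the package source. It also assumes n is the sum of the clustering table; variable names are illustrative.

```r
K <- 5; p <- 3; q <- 1
loglik <- -976.5893                  # From the summary above
n <- sum(c(100, 120, 200, 100, 150)) # Clustering table: 670 observations

# Free parameters: polynomial coefficients, logistic weights, one variance per regime
nu <- K * (p + 1) + (K - 1) * (q + 1) + K

aic <- loglik - nu
bic <- loglik - nu * log(n) / 2
print(c(nu = nu, AIC = aic, BIC = bic))
```

The parameter count gives nu = 33, and the resulting AIC and BIC agree with the table to the printed precision.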

Plots

Fitted regressors

rhlp$plot(what = "regressors")

Estimated signal

rhlp$plot(what = "estimatedsignal")

Log-likelihood

rhlp$plot(what = "loglikelihood")
