Package 'ClickClustCont' reference manual

Title:	Mixtures of Continuous Time Markov Models
Description:	Provides an expectation-maximization (EM) algorithm to fit a mixture of continuous time Markov models for use with clickstream or other sequence-type data. Gallaugher, M.P.B. and McNicholas, P.D. (2018) <arXiv:1802.04849>.
Authors:	Michael P.B. Gallaugher, Paul D. McNicholas
Maintainer:	Michael P.B. Gallaugher <gallaump@mcmaster.ca>
License:	GPL (>= 2)
Version:	0.1.7
Built:	2025-03-14 07:16:54 UTC
Source:	CRAN

EM Algorithm for Continuous Time Markov Models

Description

This function fits the continuous time first-order Markov mixture model for each number of groups specified in G and returns the model chosen by the selection criterion (the BIC by default). This is an implementation of the methodology developed in Gallaugher and McNicholas (2019).

Usage

ClickClust_EM(x, t, J, G, itemEM = 5, starts = 100, maxit = 5000,
  tol = 0.001, Contin = TRUE, Verbose = TRUE, seed = 1,
  known = NULL, crit = "BIC", returnall = FALSE)

Arguments

`x`	A list of states
`t`	A list of times spent in each state
`J`	The total number of states
`G`	A vector containing the numbers of groups to test
`itemEM`	The number of emEM iterations for initialization (defaults to 5)
`starts`	The number of random starting values for the emEM algorithm (defaults to 100)
`maxit`	The maximum number of iterations after initialization (defaults to 5000)
`tol`	The tolerance for convergence (defaults to 0.001)
`Contin`	Fit the continuous time model (defaults to TRUE). If FALSE, fit the discrete model.
`Verbose`	Display messages (defaults to TRUE)
`seed`	Sets the seed for the emEM algorithm (defaults to 1)
`known`	A vector of labels for semi-supervised classification. 0 indicates unknown observations. The known labels are denoted by their group number (1,2,3, etc.).
`crit`	The model selection criterion to use ("BIC" or "ICL"). Defaults to "BIC".
`returnall`	If TRUE, returns the results for all numbers of groups considered. Defaults to FALSE.
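
The argument descriptions above do not spell out the exact shape of `x`, `t`, and `known`, so the following is only a sketch of one plausible layout, not the package's documented format. It assumes that each element of `x` is a vector of state labels in 1..J for one clickstream, that the matching element of `t` holds one positive time per visited state, and that `known` has one label per clickstream. The toy values are hypothetical.

```r
# Hypothetical toy data: three clickstreams over J = 4 states.
# ASSUMPTION: one vector of state labels per clickstream, and one
# vector of times (same length) per clickstream.
J <- 4
x <- list(c(1, 3, 2), c(2, 4), c(4, 1, 3, 1))
t <- list(c(0.8, 2.1, 0.4), c(1.5, 0.2), c(0.3, 0.9, 1.1, 2.0))

# Basic consistency checks on the assumed layout.
stopifnot(length(x) == length(t))            # same number of sequences
stopifnot(all(lengths(x) == lengths(t)))     # one time per visited state
stopifnot(all(unlist(x) %in% seq_len(J)))    # states are labelled 1..J
stopifnot(all(unlist(t) > 0))                # times are positive

# No within-state transitions: consecutive states must differ.
no_self_loops <- vapply(x, function(s) all(diff(s) != 0), logical(1))
stopifnot(all(no_self_loops))

# Labels for semi-supervised classification: 0 marks an unknown
# observation, positive integers give the known group number.
known <- c(1, 0, 2)
stopifnot(length(known) == length(x))
```

Checks like these are cheap to run before fitting and catch misaligned sequences early.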

Value

Returns a list with parameter and classification estimates for the best model chosen by the selection criterion.
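
As a hedged illustration of the model selection arguments, the sketch below compares two candidate numbers of groups under the ICL and keeps the results for all of them. It uses only arguments listed under Usage. The structure of the returned object is not documented here, so the sketch inspects it with `str()` rather than assuming any component names. The guard on `requireNamespace()` is an addition for illustration so the code is skipped when the package is not installed.

```r
# Run only if the package is available.
if (requireNamespace("ClickClustCont", quietly = TRUE)) {
  library(ClickClustCont)
  data(SimData)

  # Try G = 2 and G = 3 groups, select by ICL instead of the default BIC,
  # and keep the fits for every G rather than only the best one.
  fit <- ClickClust_EM(x = SimData[[1]], t = SimData[[2]], J = 5,
                       G = 2:3, starts = 10, crit = "ICL",
                       Verbose = FALSE, returnall = TRUE)

  # Component names are not documented above, so look at the top level only.
  str(fit, max.level = 1)
}
```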

References

Michael P.B. Gallaugher and Paul D. McNicholas (2019). Clustering and semi-supervised classification for clickstream data via mixture models. arXiv preprint arXiv:1802.04849v2.

Examples

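library(ClickClustCont)
library(gtools)
data(SimData)
x <- SimData[[1]]  # list of states
t <- SimData[[2]]  # list of times
# Fit a two-group model over J = 5 states using 10 random starts
Click_2G <- ClickClust_EM(x = x, t = t, J = 5, G = 2, starts = 10)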

Revised MSNBC Data

Description

This is a revised version of the MSNBC323 dataset in the R package ClickClust (Melnykov, 2016). The dataset contains the clickstreams with within-state transitions removed (x) and simulated time points (t). See Gallaugher and McNicholas (2019) for further details.

Usage

data(mMSNBC)

Format

An object of class list of length 2.
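
A minimal sketch of loading and inspecting the dataset follows. It assumes, by analogy with SimData, that the first element holds the clickstreams `x` and the second the simulated times `t`; this page does not state the ordering, so check it before relying on it. The `requireNamespace()` guard is added for illustration only.

```r
# Run only if the package is available.
if (requireNamespace("ClickClustCont", quietly = TRUE)) {
  data(mMSNBC, package = "ClickClustCont")

  # The Format section states a list of length 2.
  stopifnot(is.list(mMSNBC), length(mMSNBC) == 2)

  # ASSUMPTION: element order mirrors SimData (states first, times second).
  x <- mMSNBC[[1]]
  t <- mMSNBC[[2]]

  print(length(x))   # number of clickstreams
  print(head(x, 2))  # first two state sequences
  print(head(t, 2))  # corresponding simulated times
}
```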

References

Michael P.B. Gallaugher and Paul D. McNicholas (2019). Clustering and semi-supervised classification for clickstream data via mixture models. arXiv preprint arXiv:1802.04849v2.

Volodymyr Melnykov (2016). ClickClust: An R Package for Model-Based Clustering of Categorical Sequences. Journal of Statistical Software 74(9), 1-34.

Simulated Data

Description

This is a simulated dataset with two groups. It is in the form of a list with the first element being the list of states and the second element being the list of time stamps. This is an example of the simulated data used in Simulation 1B in Gallaugher and McNicholas (2019).

Usage

data(SimData)

Format

An object of class list of length 2.

References

Michael P.B. Gallaugher and Paul D. McNicholas (2019). Clustering and semi-supervised classification for clickstream data via mixture models. arXiv preprint arXiv:1802.04849v2.
