Title: | Analyzes Clickstreams Based on Markov Chains |
---|---|
Description: | A set of tools to read, analyze and write lists of click sequences on websites (i.e., clickstream). A click can be represented by a number, character or string. Clickstreams can be modeled as zero- (only computes occurrence probabilities), first- or higher-order Markov chains. |
Authors: | Michael Scholz, Theo van Kraay |
Maintainer: | Michael Scholz <[email protected]> |
License: | GPL-2 |
Version: | 1.3.3 |
Built: | 2024-12-26 06:42:58 UTC |
Source: | CRAN |
This package allows modeling clickstreams with Markov chains. It supports zero-order, first-order, and higher-order Markov chain models.
Package: | clickstream |
Type: | Package |
Version: | 1.3.3 |
Date: | 2023-09-27 |
License: | GPL-2 |
Depends: | R (>= 3.0), methods |
Michael Scholz
Theo van Kraay
Scholz, M. (2016) R Package clickstream: Analyzing Clickstream Data with Markov Chains, Journal of Statistical Software, 74(4), 1–17.
Ching, W.-K., Huang, X., Ng, M. K., and Siu, T.-K. (2013) Markov Chains – Models, Algorithms and Applications, 2nd edition, New York: Springer-Verlag.
# fitting a simple Markov chain and predicting the next click
clickstreams <- c("User1,h,c,c,p,c,h,c,p,p,c,p,p,o",
                  "User2,i,c,i,c,c,c,d",
                  "User3,h,i,c,i,c,p,c,c,p,c,c,i,d",
                  "User4,c,c,p,c,d",
                  "User5,h,c,c,p,p,c,p,p,p,i,p,o",
                  "User6,i,h,c,c,p,p,c,p,c,d")
cls <- as.clickstreams(clickstreams, header = TRUE)
mc <- fitMarkovChain(cls)
startPattern <- new("Pattern", sequence = c("h", "c"))
predict(mc, startPattern)
plot(mc)
Concatenates two Pattern objects
## S4 method for signature 'Pattern,Pattern'
e1 + e2
e1 | First pattern |
e2 | Second pattern |
Concatenates two Pattern objects.
Michael Scholz
Returns All Absorbing States
absorbingStates(object)
object | An instance of the MarkovChain class. |
Returns the names of all states that never have a successor in a clickstream (i.e., that are absorbing).
Michael Scholz
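The definition above can be illustrated with a small base-R sketch. It is not the package's implementation: `find_absorbing` is a hypothetical helper that works on a plain list of character vectors rather than on a MarkovChain object, and it simply collects the states that are never followed by another click.

```r
# Hypothetical helper (not part of the clickstream package): a state counts as
# absorbing if it never appears before another click in any clickstream.
find_absorbing <- function(streams) {
  states <- unique(unlist(streams))
  # Every click except the last one of each stream has a successor.
  with_successor <- unique(unlist(lapply(streams, function(s) head(s, -1))))
  setdiff(states, with_successor)
}

streams <- list(
  User1 = c("h", "c", "c", "p", "c", "h", "c", "p", "p", "c", "p", "p", "o"),
  User2 = c("i", "c", "i", "c", "c", "c", "d"),
  User4 = c("c", "c", "p", "c", "d")
)
absorbing <- find_absorbing(streams)
print(absorbing)
```

In this toy data only "o" and "d" ever end a clickstream without reappearing earlier, so they are the states reported.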
Coerces a Clickstreams object to a ClickClust object.
as.ClickClust(clickstreamList)
clickstreamList | A list of clickstreams. |
A list consisting of a dataset X and a vector of initial states y.
Michael Scholz
clickstreams <- c("User1,h,c,c,p,c,h,c,p,p,c,p,p,o",
                  "User2,i,c,i,c,c,c,d",
                  "User3,h,i,c,i,c,p,c,c,p,c,c,i,d",
                  "User4,c,c,p,c,d",
                  "User5,h,c,c,p,p,c,p,p,p,i,p,o",
                  "User6,i,h,c,c,p,p,c,p,c,d")
cls <- as.clickstreams(clickstreams, header = TRUE)
X <- as.ClickClust(cls)
Converts a character vector or a character list into a clickstream list. Note that non-alphanumeric characters will be removed.
as.clickstreams(obj, sep = ",", header = TRUE)
obj | The character vector or character list that will be converted into a clickstream list. Each element of the vector must represent exactly one clickstream. |
sep | The character separating clicks (default is ","). |
header | A logical flag indicating whether the first element of each entry in the character vector is the name of the clickstream. |
A list of clickstreams. Each element is a character vector representing the clicks. The name of each list element is either extracted from the character vector or a unique number.
Michael Scholz
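As a rough illustration of the conversion described above, the following base-R sketch splits each entry on the separator, optionally uses the first field as the name, and strips non-alphanumeric characters from the clicks. `parse_streams` is a hypothetical name chosen for this sketch; the package's own function may differ in details such as how names are cleaned.

```r
# Hypothetical re-implementation for illustration only (base R, no package needed).
parse_streams <- function(lines, sep = ",", header = TRUE) {
  parts <- strsplit(lines, sep, fixed = TRUE)
  if (header) {
    # The first field of each entry is taken as the clickstream name.
    names(parts) <- vapply(parts, function(p) p[1], character(1))
    parts <- lapply(parts, function(p) p[-1])
  } else {
    # Without a header, number the clickstreams consecutively.
    names(parts) <- as.character(seq_along(parts))
  }
  # Remove non-alphanumeric characters from every click.
  lapply(parts, function(p) gsub("[^[:alnum:]]", "", p))
}

parsed <- parse_streams(c("User1,h,c,p", "User2,i,c,d"))
print(parsed)
```

The result is a named list of character vectors, which is the shape the other sketches in this manual assume when they avoid the package itself.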
print.Clickstreams, randomClickstreams
clickstreams <- c("User1,h,c,c,p,c,h,c,p,p,c,p,p,o",
                  "User2,i,c,i,c,c,c,d",
                  "User3,h,i,c,i,c,p,c,c,p,c,c,i,d",
                  "User4,c,c,p,c,d",
                  "User5,h,c,c,p,p,c,p,p,p,i,p,o",
                  "User6,i,h,c,c,p,p,c,p,c,d")
cls <- as.clickstreams(clickstreams, header = TRUE)
print(cls)
Coerces a Clickstreams object to a transactions object.
as.moltenTransactions(clickstreamList)
clickstreamList | A list of clickstreams. |
An instance of the old class transactions.
Michael Scholz
clickstreams <- c("User1,h,c,c,p,c,h,c,p,p,c,p,p,o",
                  "User2,i,c,i,c,c,c,d",
                  "User3,h,i,c,i,c,p,c,c,p,c,c,i,d",
                  "User4,c,c,p,c,d",
                  "User5,h,c,c,p,p,c,p,p,p,i,p,o",
                  "User6,i,h,c,c,p,p,c,p,c,d")
cls <- as.clickstreams(clickstreams, header = TRUE)
trans <- as.moltenTransactions(cls)
Coerces a Clickstreams object to a transactions object.
as.transactions(clickstreamList)
clickstreamList | A list of clickstreams. |
An instance of the class transactions.
Michael Scholz
clickstreams <- c("User1,h,c,c,p,c,h,c,p,p,c,p,p,o",
                  "User2,i,c,i,c,c,c,d",
                  "User3,h,i,c,i,c,p,c,c,p,c,c,i,d",
                  "User4,c,c,p,c,d",
                  "User5,h,c,c,p,p,c,p,p,p,i,p,o",
                  "User6,i,h,c,c,p,p,c,p,c,d")
cls <- as.clickstreams(clickstreams, header = TRUE)
trans <- as.transactions(cls)
Calculates the chi-square statistic, p-value, and degrees of freedom for the first-order transition matrix of a MarkovChain object compared with observed state changes.
chiSquareTest(cls, mc)
cls | The clickstream object. |
mc | The Markov chain against which to compare the clickstream data. Please note that the first-order transition matrix is used for performing the chi-square test. |
Theo van Kraay
clickstreams <- c("User1,h,c,c,p,c,h,c,p,p,c,p,p,o",
                  "User2,i,c,i,c,c,c,d",
                  "User3,h,i,c,i,c,p,c,c,p,c,c,i,d",
                  "User4,c,c,p,c,d")
csf <- tempfile()
writeLines(clickstreams, csf)
cls <- readClickstreams(csf, header = TRUE)
unlink(csf)
mc <- fitMarkovChain(cls)
chiSquareTest(cls, mc)
Performs k-means clustering on a list of clickstreams. For each clickstream a transition matrix of a given order is computed. These transition matrices are used as input for performing k-means clustering.
clusterClickstreams(clickstreamList, order = 0, centers, ...)
clickstreamList | A list of clickstreams for which the cluster analysis is performed. |
order | The order of the transition matrices used as input for clustering (default is 0; 0 and 1 are possible). |
centers | The number of clusters. |
... | Additional parameters for k-means clustering (see kmeans). |
This method returns a ClickstreamClusters object (S3 class). It is a list with the following components:
clusters | The resulting list of Clickstreams objects. |
centers | A matrix of cluster centres. |
states | Vector of states. |
totss | The total sum of squares. |
withinss | Vector of within-cluster sum of squares, one component per cluster. |
tot.withinss | Total within-cluster sum of squares, i.e., sum(withinss). |
betweenss | The between-cluster sum of squares, i.e., totss - tot.withinss. |
Michael Scholz
print.ClickstreamClusters, summary.ClickstreamClusters
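To make the clustering idea more concrete, here is a minimal base-R sketch of the zero-order case. It assumes that a zero-order summary of a clickstream is its vector of relative state frequencies and feeds those vectors to `stats::kmeans`. The variable names are illustrative, and the package's internal feature construction may differ.

```r
# Illustrative sketch only: cluster clickstreams by their state frequencies.
raw <- c("h,c,c,p,c,h,c,p,p,c,p,p,o", "i,c,i,c,c,c,d",
         "h,i,c,i,c,p,c,c,p,c,c,i,d", "c,c,p,c,d",
         "h,c,c,p,p,c,p,p,p,i,p,o", "i,h,c,c,p,p,c,p,c,d")
streams <- strsplit(raw, ",", fixed = TRUE)
names(streams) <- paste0("User", seq_along(streams))

states <- sort(unique(unlist(streams)))
# One row per clickstream, one column per state (relative frequencies).
freq <- t(vapply(streams, function(s) {
  as.numeric(table(factor(s, levels = states))) / length(s)
}, numeric(length(states))))
colnames(freq) <- states

set.seed(42)  # k-means starts from random centres
fit <- kmeans(freq, centers = 2)
print(fit$cluster)
```

The `totss`, `withinss`, `tot.withinss`, and `betweenss` components of the `kmeans` result carry the same meaning as the components of the same names listed above.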
clickstreams <- c("User1,h,c,c,p,c,h,c,p,p,c,p,p,o",
                  "User2,i,c,i,c,c,c,d",
                  "User3,h,i,c,i,c,p,c,c,p,c,c,i,d",
                  "User4,c,c,p,c,d",
                  "User5,h,c,c,p,p,c,p,p,p,i,p,o",
                  "User6,i,h,c,c,p,p,c,p,c,d")
cls <- as.clickstreams(clickstreams, header = TRUE)
clusters <- clusterClickstreams(cls, order = 0, centers = 2)
print(clusters)
Class EvaluationResult
Objects can be created by calls of the form new("EvaluationResult", ...). This S4 class describes EvaluationResult objects.
Michael Scholz
# show EvaluationResult definition
showClass("EvaluationResult")
This function fits a list of clickstreams to a Markov chain. Zero-order,
first-order as well as higher-order Markov chains are supported. For
estimating higher-order Markov chains this function solves the following
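linear or quadratic programming problem:
$$\min_{\lambda} \left\| \sum_{i=1}^{k} \lambda_i Q_i X - X \right\|$$
subject to
$$\sum_{i=1}^{k} \lambda_i = 1, \qquad \lambda_i \ge 0.$$
The distribution of states is given as $X$. $\lambda_i$ is the lag parameter for lag $i$ and $Q_i$ the transition matrix.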
fitMarkovChain(clickstreamList, order = 1, verbose = TRUE, control = list())
clickstreamList | A list of clickstreams for which a Markov chain is fitted. |
order | (Optional) The order of the Markov chain that is fitted from the clickstreams. Per default, Markov chains with order = 1 are fitted. |
verbose | (Optional) A logical variable indicating whether warnings and information messages should be printed. |
control | (Optional) The control list of optimization parameters. Parameter |
For solving the quadratic programming problem of higher-order Markov chains, an augmented Lagrange multiplier method from the package Rsolnp is used.
Returns a MarkovChain object.
At least half of the clickstreams need to consist of as many clicks as the order of the Markov chain that should be fitted.
Michael Scholz
This method implements the parameter estimation method presented in Ching, W.-K. et al.: Markov Chains – Models, Algorithms and Applications, 2nd edition, Springer, 2013.
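For the first-order case, the fit reduces to counting transitions and normalizing. The base-R sketch below shows that idea on plain character vectors. `transition_matrix` is a hypothetical helper, not the package's code, and it puts the current state in rows, which is an assumption made for readability and may not match the orientation used by the package.

```r
# Illustrative sketch only: estimate first-order transition probabilities by counting.
transition_matrix <- function(streams) {
  states <- sort(unique(unlist(streams)))
  counts <- matrix(0, length(states), length(states),
                   dimnames = list(from = states, to = states))
  for (s in streams) {
    if (length(s) < 2) next
    from <- s[-length(s)]  # every click except the last
    to <- s[-1]            # the click that follows it
    for (k in seq_along(from)) {
      counts[from[k], to[k]] <- counts[from[k], to[k]] + 1
    }
  }
  totals <- rowSums(counts)
  probs <- counts
  nz <- totals > 0  # states without successors keep an all-zero row
  probs[nz, ] <- counts[nz, , drop = FALSE] / totals[nz]
  probs
}

P <- transition_matrix(list(c("h", "c", "p", "c", "d"),
                            c("h", "c", "c", "d")))
print(P)
```

Higher-order chains additionally need the lag weights from the optimization problem above, which this sketch does not attempt to estimate.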
# fitting a simple Markov chain
clickstreams <- c("User1,h,c,c,p,c,h,c,p,p,c,p,p,o",
                  "User2,i,c,i,c,c,c,d",
                  "User3,h,i,c,i,c,p,c,c,p,c,c,i,d",
                  "User4,c,c,p,c,d",
                  "User5,h,c,c,p,p,c,p,p,p,i,p,o",
                  "User6,i,h,c,c,p,p,c,p,c,d")
cls <- as.clickstreams(clickstreams, header = TRUE)
mc <- fitMarkovChain(cls)
show(mc)
The purpose of this function is to generate pre-computed Markov chain objects from clusters of clickstreams.
fitMarkovChains(clusters, order = 1)
clusters | The clusters from which to generate Markov chain objects. |
order | The order of the Markov chains. |
Theo van Kraay
training <- c("User1,h,c,c,p,c,h,c,p,p,c,p,p,o",
              "User2,i,c,i,c,c,c,d",
              "User3,h,i,c,i,c,p,c,c,p,c,c,i,d",
              "User4,c,c,p,c,d")
test <- c("User1,h,c,c,p,p,h,c,p,p,c,p,p,o",
          "User2,i,c,i,c,c,c,d",
          "User4,c,c,c,c,d")
trainingCLS <- as.clickstreams(training, header = TRUE)
testCLS <- as.clickstreams(test, header = TRUE)
clusters <- clusterClickstreams(trainingCLS, centers = 2)
markovchains <- fitMarkovChains(clusters, order = 1)
Generates a data frame of state frequencies for all clickstreams in a list of clickstreams.
frequencies(clickstreamList)
clickstreamList | A list of clickstreams. |
A data frame containing state frequencies for each clickstream.
Michael Scholz
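A comparable table can be built in base R, which may help when checking what the function returns. The sketch below assumes that "state frequencies" means absolute counts per clickstream. `state_counts` is a hypothetical helper and not the package's implementation.

```r
# Illustrative sketch only: count how often each state occurs in each clickstream.
state_counts <- function(streams) {
  states <- sort(unique(unlist(streams)))
  rows <- lapply(streams, function(s) {
    as.integer(table(factor(s, levels = states)))
  })
  df <- as.data.frame(do.call(rbind, rows))  # row names come from the list names
  names(df) <- states
  df
}

counts <- state_counts(list(
  User1 = c("h", "c", "c", "p", "c", "h", "c", "p", "p", "c", "p", "p", "o"),
  User2 = c("i", "c", "i", "c", "c", "c", "d")
))
print(counts)
```

Each row sums to the length of the corresponding clickstream, which is a quick sanity check on the counts.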
clickstreams <- c("User1,h,c,c,p,c,h,c,p,p,c,p,p,o",
                  "User2,i,c,i,c,c,c,d",
                  "User3,h,i,c,i,c,p,c,c,p,c,c,i,d",
                  "User4,c,c,p,c,d",
                  "User5,h,c,c,p,p,c,p,p,p,i,p,o",
                  "User6,i,h,c,c,p,p,c,p,c,d")
cls <- as.clickstreams(clickstreams, header = TRUE)
frequencyDF <- frequencies(cls)
This is an experimental function for a consensus clustering algorithm that targets a range of average next-state probabilities obtained when fitting each cluster to a Markov chain.
getConsensusClusters(trainingCLS, testCLS, maxIterations = 5, optimalProbMean = 0.5,
                     range = 0.3, centresMin = 2, clusterCentresRange = 0, order = 1,
                     takeHighest = FALSE, verbose = FALSE)
trainingCLS | Clickstream object with training data (this should be the data used to build the Markov chain object). |
testCLS | Clickstream object with test data. |
maxIterations | Number of times to iterate (repeat) through the k-means clustering. |
optimalProbMean | The target average probability of each next page click prediction in a first-order Markov chain. |
range | The range above the optimal probability to target. |
centresMin | The minimum cluster centres to evaluate. |
clusterCentresRange | The additional cluster centres to evaluate. |
order | The order for Markov chains that will be used to evaluate each cluster. |
takeHighest | Determines whether to default to the highest mean next click probability, or error if the target is not reached after the given number of k-means iterations. |
verbose | Should this function report extra information on progress? |
Theo van Kraay
training <- c("User1,h,c,c,p,c,h,c,p,p,c,p,p,o",
              "User2,i,c,i,c,c,c,d",
              "User3,h,i,c,i,c,p,c,c,p,c,c,i,d",
              "User4,h,c,c,p,p,c,p,p,p,i,p,o",
              "User5,i,h,c,c,p,p,c,p,c,d",
              "User6,i,h,c,c,p,p,c,p,c,o",
              "User7,i,h,c,c,p,p,c,p,c,d",
              "User8,i,h,c,c,p,p,c,p,c,d,o")
test <- c("User1,h,c,c,p,c,h,c,p,p,c,p,p,o",
          "User2,i,c,i,c,c,c,d",
          "User3,h,i,c,i,c,p,c,c,p,c,c,i,d")
trainingCLS <- as.clickstreams(training, header = TRUE)
testCLS <- as.clickstreams(test, header = TRUE)
clusters <- getConsensusClusters(trainingCLS, testCLS, maxIterations = 5,
                                 optimalProbMean = 0.40, range = 0.70, centresMin = 2,
                                 clusterCentresRange = 0, order = 1, takeHighest = FALSE,
                                 verbose = FALSE)
markovchains <- fitMarkovChains(clusters)
startPattern <- new("Pattern", sequence = c("i", "h", "c", "p"))
mc <- getOptimalMarkovChain(startPattern, markovchains, clusters)
predict(mc, startPattern)
This is an experimental function for a consensus clustering algorithm that targets a range of average next-state probabilities obtained when fitting each cluster to a Markov chain. This function parallelizes the k-means and Markov chain fitting operations across processor cores and depends on the parallel package.
getConsensusClustersParallel(trainingCLS, testCLS, maxIterations = 5, optimalProbMean = 0.5,
                             range = 0.3, centresMin = 2, clusterCentresRange = 0, order = 1,
                             cores = 2, takeHighest = FALSE, verbose = FALSE)
trainingCLS | Clickstream object with training data (this should be the data used to build the Markov chain object). |
testCLS | Clickstream object with test data. |
maxIterations | Number of times to iterate (repeat) through the k-means clustering. |
optimalProbMean | The target average probability of each next page click prediction in a first-order Markov chain. |
range | The range above the optimal probability to target. |
centresMin | The minimum cluster centres to evaluate. |
clusterCentresRange | The additional cluster centres to evaluate. |
order | The order for Markov chains that will be used to evaluate each cluster. |
cores | Number of cores used for clustering. |
takeHighest | Determines whether to default to the highest mean next click probability, or error if the target is not reached after the given number of k-means iterations. |
verbose | Should this function report extra information on progress? |
Theo van Kraay
training <- c("User1,h,c,c,p,c,h,c,p,p,c,p,p,o",
              "User2,i,c,i,c,c,c,d",
              "User3,h,i,c,i,c,p,c,c,p,c,c,i,d",
              "User4,h,c,c,p,p,c,p,p,p,i,p,o",
              "User5,i,h,c,c,p,p,c,p,c,d",
              "User6,i,h,c,c,p,p,c,p,c,o",
              "User7,i,h,c,c,p,p,c,p,c,d",
              "User8,i,h,c,c,p,p,c,p,c,d,o")
test <- c("User1,h,c,c,p,c,h,c,p,p,c,p,p,o",
          "User2,i,c,i,c,c,c,d",
          "User3,h,i,c,i,c,p,c,c,p,c,c,i,d")
trainingCLS <- as.clickstreams(training, header = TRUE)
testCLS <- as.clickstreams(test, header = TRUE)
clusters <- getConsensusClustersParallel(trainingCLS, testCLS, maxIterations = 3,
                                         optimalProbMean = 0.40, range = 0.70,
                                         centresMin = 2, clusterCentresRange = 0,
                                         order = 1, cores = 1, takeHighest = FALSE,
                                         verbose = FALSE)
markovchains <- fitMarkovChains(clusters)
startPattern <- new("Pattern", sequence = c("i", "h", "c", "p"))
mc <- getOptimalMarkovChain(startPattern, markovchains, clusters)
predict(mc, startPattern)
The purpose of this function is to predict from a pattern using pre-computed Markov chains and corresponding clusters. The Markov chain corresponding to the cluster that is the best fit to the prediction value is used.
getOptimalMarkovChain(startPattern, markovchains, clusters)
startPattern | The pattern object to be used. |
markovchains | The pre-computed Markov chains generated from a set of clusters. |
clusters | The corresponding clusters (should be in the same order as the Markov chains). |
Theo van Kraay
training <- c("User1,h,c,c,p,c,h,c,p,p,c,p,p,o",
              "User2,i,c,i,c,c,c,d",
              "User3,h,i,c,i,c,p,c,c,p,c,c,i,d",
              "User4,c,c,p,c,d")
test <- c("User1,h,c,c,p,p,h,c,p,p,c,p,p,o",
          "User2,i,c,i,c,c,c,d",
          "User4,c,c,c,c,d")
trainingCLS <- as.clickstreams(training, header = TRUE)
testCLS <- as.clickstreams(test, header = TRUE)
clusters <- clusterClickstreams(trainingCLS, centers = 2)
markovchains <- fitMarkovChains(clusters, order = 1)
startPattern <- new("Pattern", sequence = c("c"))
mc <- getOptimalMarkovChain(startPattern, markovchains, clusters)
predict(mc, startPattern)
Plots a Heatmap
hmPlot(object, order = 1, absorptionProbability = FALSE, title = NA,
       lowColor = "yellow", highColor = "red", flip = FALSE)
object | The MarkovChain object. |
order | Order of the transition matrix that should be plotted. Default is 1. |
absorptionProbability | Should the heatmap show absorption probabilities? Default is FALSE. |
title | Title of the heatmap. |
lowColor | Color for the lowest transition probability of 0. Default is "yellow". |
highColor | Color for the highest transition probability of 1. Default is "red". |
flip | Flip to horizontal plot. Default is FALSE. |
Plots a heatmap for a specified transition matrix or the absorption probability matrix of a given MarkovChain object.
Michael Scholz
# fitting a simple Markov chain and plotting a heat map
clickstreams <- c("User1,h,c,c,p,c,h,c,p,p,c,p,p,o",
                  "User2,i,c,i,c,c,c,d",
                  "User3,h,i,c,i,c,p,c,c,p,c,c,i,d",
                  "User4,c,c,p,c,d",
                  "User5,h,c,c,p,p,c,p,p,p,i,p,o",
                  "User6,i,h,c,c,p,p,c,p,c,d")
cls <- as.clickstreams(clickstreams, header = TRUE)
mc <- fitMarkovChain(cls)
hmPlot(mc)
Creates a new Pattern object
## S4 method for signature 'Pattern'
initialize(.Object, sequence, probability, absorbingProbabilities, ...)
.Object | Pattern (name of the class) |
sequence | Click sequence |
probability | Probability for the click sequence |
absorbingProbabilities | Probabilities that the sequence will finally end in one of the absorbing states |
... | Further arguments for the |
Creates a new Pattern object.
Michael Scholz
Class MarkovChain
Objects can be created by calls of the form new("MarkovChain", ...). This S4 class describes MarkovChain objects.
Michael Scholz
# show MarkovChain definition
showClass("MarkovChain")
# fit a simple Markov chain from a list of click streams
clickstreams <- c("User1,h,c,c,p,c,h,c,p,p,c,p,p,o",
                  "User2,i,c,i,c,c,c,d",
                  "User3,h,i,c,i,c,p,c,c,p,c,c,i,d",
                  "User4,c,c,p,c,d",
                  "User5,h,c,c,p,p,c,p,p,p,i,p,o",
                  "User6,i,h,c,c,p,p,c,p,c,d")
cls <- as.clickstreams(clickstreams, header = TRUE)
mc <- fitMarkovChain(cls)
show(mc)
Evaluates the number of occurrences of predicted next clicks vs. the total number of starting pattern occurrences in a given clickstream. The Markov chain used for the prediction can be of any order.
mcEvaluate(mc, startPattern, testCLS)
mc | A MarkovChain object (this should have been built from a set of training data). |
startPattern | The starting pattern for which the next click is predicted and whose observed occurrences are evaluated in the test data. |
testCLS | Clickstream object with test data. |
Theo van Kraay
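The comparison described above can be approximated by hand. The base-R sketch below counts how often a start pattern occurs in the test clickstreams with at least one following click, and how often that following click equals a given predicted click. `count_followers` is a hypothetical helper; the package may define occurrences differently, for example at the end of a clickstream.

```r
# Illustrative sketch only: compare a predicted next click with observed followers.
count_followers <- function(streams, pattern, predicted) {
  n <- length(pattern)
  occurrences <- 0
  hits <- 0
  for (s in streams) {
    if (length(s) <= n) next
    # Only start positions that leave room for a following click are checked.
    for (start in seq_len(length(s) - n)) {
      if (identical(s[start:(start + n - 1)], pattern)) {
        occurrences <- occurrences + 1
        if (s[start + n] == predicted) hits <- hits + 1
      }
    }
  }
  c(occurrences = occurrences, hits = hits)
}

test_streams <- list(
  User1 = c("h", "h", "h", "h", "c", "c", "p", "p", "h", "c", "p", "p", "c", "p", "p", "o"),
  User2 = c("i", "c", "i", "c", "c", "c", "d")
)
res <- count_followers(test_streams, pattern = c("c", "c"), predicted = "p")
print(res)
```

Here the pattern "c", "c" occurs three times with a following click, and "p" follows it once, so a prediction of "p" would be observed in one of three cases.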
training <- c("User1,h,c,c,p,c,h,c,p,p,c,p,p,o",
              "User2,i,c,i,c,c,c,d",
              "User3,h,i,c,i,c,p,c,c,p,c,c,i,d",
              "User4,c,c,p,c,d")
test <- c("User1,h,h,h,h,c,c,p,p,h,c,p,p,c,p,p,o",
          "User2,i,c,i,c,c,c,d",
          "User4,c,c,c,c,d,c,c,c,c")
csf <- tempfile()
writeLines(training, csf)
trainingCLS <- readClickstreams(csf, header = TRUE)
unlink(csf)
csf <- tempfile()
writeLines(test, csf)
testCLS <- readClickstreams(csf, header = TRUE)
unlink(csf)
mc <- fitMarkovChain(trainingCLS, order = 1)
startPattern <- new("Pattern", sequence = c("c","c"))
res <- mcEvaluate(mc, startPattern, testCLS)
res
Evaluates all next page clicks in a clickstream training data set against test data. Handles higher orders by cycling through every possible pattern permutation. Produces a report of observed and expected values in a matrix.
mcEvaluateAll(mc, trainingCLS, testCLS, includeChiSquare = TRUE,
              returnChiSquareOnly = FALSE)
mc | A MarkovChain object that corresponds to a list of clusters. |
trainingCLS | Clickstream object with training data (this should be the data used to build the Markov chain object). |
testCLS | Clickstream object with test data. |
includeChiSquare | Should the result include the chi-square value? |
returnChiSquareOnly | Should the result only consist of the chi-square value? |
Theo van Kraay
training <- c("User1,h,c,c,p,c,h,c,p,p,c,p,p",
              "User2,i,c,i,c,c,c,d")
test <- c("User1,h,c,c,p,c,h,c,d,p,c,d,p",
          "User2,i,c,i,p,c,c,d")
csf <- tempfile()
writeLines(training, csf)
trainingCLS <- readClickstreams(csf, header = TRUE)
unlink(csf)
csf <- tempfile()
writeLines(test, csf)
testCLS <- readClickstreams(csf, header = TRUE)
unlink(csf)
mc <- fitMarkovChain(trainingCLS, order = 2)
mcEvaluateAll(mc, trainingCLS, testCLS)
Evaluates all next page clicks in a clickstream training data set against test data on the basis of a set of pre-computed Markov chains and corresponding clusters. Handles higher orders by cycling through every possible pattern permutation. Produces a report of observed and expected values in a matrix.
mcEvaluateAllClusters(markovchains, clusters, testCLS, trainingCLS,
                      includeChiSquare = TRUE, returnChiSquareOnly = FALSE)
markovchains | A list of MarkovChain objects. |
clusters | The list of clusters. |
testCLS | Clickstream object with test data. |
trainingCLS | Clickstream object with training data (this should be the data used to build the Markov chain object). |
includeChiSquare | Should the result include the chi-square value? |
returnChiSquareOnly | Should the result only consist of the chi-square value? |
Theo van Kraay
training <- c("User1,h,c,c,p,c,h,c,h,o,p,p,c,p,p,o",
              "User2,i,c,i,c,c,c,o,o,o,i,d",
              "User3,h,i,c,i,c,o,i,p,c,c,p,c,c,i,d",
              "User4,c,c,p,c,d,o,i,h,o,o")
test <- c("User1,h,c,c,p,p,h,o,i,c,p,p,c,p,p,o",
          "User2,i,c,i,c,c,c,d",
          "User4,c,c,c,c,d")
csf <- tempfile()
writeLines(training, csf)
trainingCLS <- readClickstreams(csf, header = TRUE)
unlink(csf)
csf <- tempfile()
writeLines(test, csf)
testCLS <- readClickstreams(csf, header = TRUE)
unlink(csf)
clusters <- clusterClickstreams(trainingCLS, centers = 2, order = 1)
markovchains <- fitMarkovChains(clusters, order = 2)
mcEvaluateAllClusters(markovchains, clusters, testCLS, trainingCLS)
Class Pattern
This S4 class describes a click pattern consisting of a sequence of clicks and a probability of occurrence.
Objects can be created by calls of the form new("Pattern", sequence, probability, ...).
Michael Scholz
# show Pattern definition
showClass("Pattern")
# create simple Pattern objects
pattern1 <- new("Pattern", sequence = c("h", "c", "p"))
pattern2 <- new("Pattern", sequence = c("c", "p", "p"), probability = 0.2)
pattern3 <- new("Pattern", sequence = c("h", "p", "p"), probability = 0.35,
                absorbingProbabilities = data.frame(d = 0.6, o = 0.4))
Plots a MarkovChain object
## S4 method for signature 'MarkovChain'
plot(x, order = 1, digits = 2, minProbability = 0, ...)
x | An instance of the MarkovChain class |
order | The order of the transition matrix that should be plotted |
digits | The number of digits of the transition probabilities |
minProbability | Only transitions with a probability >= the specified minProbability will be shown |
... | Further parameters for the |
Plots the transition matrix of the given order of a MarkovChain object as a graph.
Michael Scholz
Predicts the Next Click(s) of a User
## S4 method for signature 'MarkovChain'
predict(object, startPattern, dist = 1, ties = "random")
object | The MarkovChain object. |
startPattern | Starting clicks of a user as a Pattern object. |
dist | (Optional) The number of clicks that should be predicted (default is 1). |
ties | (Optional) The strategy for handling ties in predicting the next click. Possible strategies include "random" (the default). |
This method predicts the next click(s) of a user. The first clicks of a user are given as a Pattern object. The next click(s) are predicted based on the transition probabilities in the MarkovChain object. The probability distribution of the next click (n) is estimated as follows:

$$X^{(n)} = B \cdot \sum_{i=1}^{k} \lambda_i Q_i X^{(n-i)}$$

The distribution of states at time $n$ is given as $X^{(n)}$. The transition matrix for lag $i$ is given as $Q_i$. $\lambda_i$ specifies the lag parameter and $B$ the absorbing probability matrix; $k$ is the order of the Markov chain.
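The estimate described above is a weighted mixture of lagged transition steps, optionally rescaled by an absorbing probability matrix. A minimal base-R sketch of that computation might look as follows. The helper name predict_next, the toy matrices, and the layout in which each column of a transition matrix holds the probabilities of leaving one state are assumptions made for illustration; this is not the package's internal implementation.

```r
# Hypothetical helper (not part of the clickstream package).
# Q:       list of k transition matrices, Q[[i]] for lag i; column = current
#          state, row = next state (assumed layout)
# lambda:  numeric vector of k lag weights
# history: list of k state distributions, history[[i]] is the one at time n - i
# B:       absorbing probability matrix; the identity leaves the mixture unchanged
predict_next <- function(Q, lambda, history, B = diag(nrow(Q[[1]]))) {
  # One weighted term per lag: lambda_i * Q_i %*% X^(n-i)
  terms <- Map(function(l, q, x) l * (q %*% x), lambda, Q, history)
  mix <- Reduce(`+`, terms)
  p <- drop(B %*% mix)
  names(p) <- rownames(Q[[1]])
  p
}

states <- c("a", "b")
Q1 <- matrix(c(0.9, 0.1, 0.5, 0.5), nrow = 2, dimnames = list(states, states))
Q2 <- matrix(c(0.2, 0.8, 0.6, 0.4), nrow = 2, dimnames = list(states, states))

# Last click was "a", the click before that was "b".
p <- predict_next(Q = list(Q1, Q2), lambda = c(0.7, 0.3),
                  history = list(c(1, 0), c(0, 1)))
print(p)
```

With these toy inputs the lag-1 term contributes 0.7 * (0.9, 0.1) and the lag-2 term 0.3 * (0.6, 0.4), so the estimated distribution is (0.81, 0.19).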
Michael Scholz
# fitting a simple Markov chain and predicting the next click
clickstreams <- c("User1,h,c,c,p,c,h,c,p,p,c,p,p,o",
                  "User2,i,c,i,c,c,c,d",
                  "User3,h,i,c,i,c,p,c,c,p,c,c,i,d",
                  "User4,c,c,p,c,d",
                  "User5,h,c,c,p,p,c,p,p,p,i,p,o",
                  "User6,i,h,c,c,p,p,c,p,c,d")
cls <- as.clickstreams(clickstreams, header = TRUE)
mc <- fitMarkovChain(cls)
startPattern <- new("Pattern", sequence = c("h", "c"))
predict(mc, startPattern)

# predict with predefined absorbing probabilities
startPattern <- new("Pattern", sequence = c("h", "c"),
                    absorbingProbabilities = data.frame(d = 0.2, o = 0.8))
predict(mc, startPattern)
Predicts the cluster for a given Pattern object. Potential clusters need to be identified with the method clusterClickstreams before predicting the cluster.
## S3 method for class 'ClickstreamClusters'
predict(object, pattern, ...)
object | A ClickstreamClusters object. |
pattern | Sequence of a user's initial clicks as a Pattern object. |
... | Ignored parameters. |
Returns the index of the cluster to which the given Pattern object most probably belongs.
Michael Scholz
clusterClickstreams, print.ClickstreamClusters
clickstreams <- c("User1,h,c,c,p,c,h,c,p,p,c,p,p,o",
                  "User2,i,c,i,c,c,c,d",
                  "User3,h,i,c,i,c,p,c,c,p,c,c,i,d",
                  "User4,c,c,p,c,d",
                  "User5,h,c,c,p,p,c,p,p,p,i,p,o",
                  "User6,i,h,c,c,p,p,c,p,c,d")
cls <- as.clickstreams(clickstreams, header = TRUE)
clusters <- clusterClickstreams(cls, order = 0, centers = 2)
pattern <- new("Pattern", sequence = c("h", "c"))
predict(clusters, pattern)
Prints a ClickstreamClusters object. A ClickstreamClusters object represents the result of a cluster analysis on a list of clickstreams (see clusterClickstreams).
## S3 method for class 'ClickstreamClusters'
print(x, ...)
x | A ClickstreamClusters object. |
... | Ignored parameters. |
Michael Scholz
clusterClickstreams, summary.ClickstreamClusters
clickstreams <- c("User1,h,c,c,p,c,h,c,p,p,c,p,p,o",
                  "User2,i,c,i,c,c,c,d",
                  "User3,h,i,c,i,c,p,c,c,p,c,c,i,d",
                  "User4,c,c,p,c,d",
                  "User5,h,c,c,p,p,c,p,p,p,i,p,o",
                  "User6,i,h,c,c,p,p,c,p,c,d")
cls <- as.clickstreams(clickstreams, header = TRUE)
clusters <- clusterClickstreams(cls, order = 0, centers = 2)
print(clusters)
Prints a Clickstreams object.
## S3 method for class 'Clickstreams'
print(x, ...)
x | A list of clickstreams. |
... | Ignored parameters. |
Michael Scholz
readClickstreams, randomClickstreams
clickstreams <- c("User1,h,c,c,p,c,h,c,p,p,c,p,p,o",
                  "User2,i,c,i,c,c,c,d",
                  "User3,h,i,c,i,c,p,c,c,p,c,c,i,d",
                  "User4,c,c,p,c,d",
                  "User5,h,c,c,p,p,c,p,p,p,i,p,o",
                  "User6,i,h,c,c,p,p,c,p,c,d")
cls <- as.clickstreams(clickstreams, header = TRUE)
print(cls)
Prints the summary of a MarkovChain object.
## S3 method for class 'MarkovChainSummary'
print(x, ...)
x | A MarkovChainSummary object. |
... | Ignored parameters. |
Michael Scholz
clickstreams <- c("User1,h,c,c,p,c,h,c,p,p,c,p,p,o",
                  "User2,i,c,i,c,c,c,d",
                  "User3,h,i,c,i,c,p,c,c,p,c,c,i,d",
                  "User4,c,c,p,c,d",
                  "User5,h,c,c,p,p,c,p,p,p,i,p,o",
                  "User6,i,h,c,c,p,p,c,p,c,d")
cls <- as.clickstreams(clickstreams, header = TRUE)
mc <- fitMarkovChain(cls)
print(summary(mc))
Generates a Sequence of Clicks
randomClicks(object, startPattern, dist)
object | The MarkovChain object. |
startPattern | Starting clicks as a Pattern object. |
dist | (Optional) The number of clicks that should be generated (default is 1). |
Generates a sequence of clicks by randomly walking through the transition graph of a given MarkovChain object.
Michael Scholz
# fitting a simple Markov chain and predicting the next click
clickstreams <- c("User1,h,c,c,p,c,h,c,p,p,c,p,p,o",
                  "User2,i,c,i,c,c,c,d",
                  "User3,h,i,c,i,c,p,c,c,p,c,c,i,d",
                  "User4,c,c,p,c,d",
                  "User5,h,c,c,p,p,c,p,p,p,i,p,o",
                  "User6,i,h,c,c,p,p,c,p,c,d")
cls <- as.clickstreams(clickstreams, header = TRUE)
mc <- fitMarkovChain(cls)
startPattern <- new("Pattern", sequence = c("h", "c"))
predict(mc, startPattern)
Generates a list of clickstreams by randomly walking through a given transition matrix.
randomClickstreams(
  states,
  startProbabilities,
  transitionMatrix,
  meanLength,
  n = 100
)
states | Names of all possible states. |
startProbabilities | Start probabilities for all states. |
transitionMatrix | Matrix of transition probabilities. |
meanLength | Average length of the clickstreams. |
n | Number of clickstreams to be generated. |
Returns a list of clickstreams.
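To make the idea of a random walk through a transition matrix concrete, here is a small base-R sketch. It is not the package's implementation: the helper random_walk is hypothetical, each column of the matrix is assumed to hold the probabilities of leaving one state (the columns of the example matrix used for this function sum to 1), and drawing stream lengths as 1 + Poisson(meanLength - 1) is an assumption chosen only so that the average length equals meanLength.

```r
# Hypothetical helper: one random walk of `len` clicks.
# Assumed layout: column = current state, row = next state.
random_walk <- function(states, startProbabilities, transitionMatrix, len) {
  clicks <- character(len)
  clicks[1] <- sample(states, 1, prob = startProbabilities)
  for (i in seq_len(len - 1)) {
    current <- match(clicks[i], states)
    clicks[i + 1] <- sample(states, 1, prob = transitionMatrix[, current])
  }
  clicks
}

set.seed(1)
states <- c("a", "b", "c")
startProbabilities <- c(0.2, 0.5, 0.3)
transitionMatrix <- matrix(c(0, 0.4, 0.6, 0.3, 0.1, 0.6, 0.2, 0.8, 0), nrow = 3)

# Assumption: lengths are at least 1 and average to meanLength = 5.
lens <- 1 + rpois(10, lambda = 5 - 1)
cls <- lapply(lens, function(len) {
  random_walk(states, startProbabilities, transitionMatrix, len)
})
print(cls[[1]])
```

Because the probability of moving from "a" to "a" is 0 in this matrix, no generated stream contains two consecutive "a" clicks.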
Michael Scholz
fitMarkovChain, readClickstreams, print.Clickstreams
# generate a simple list of clickstreams
states <- c("a", "b", "c")
startProbabilities <- c(0.2, 0.5, 0.3)
transitionMatrix <- matrix(c(0, 0.4, 0.6, 0.3, 0.1, 0.6, 0.2, 0.8, 0), nrow = 3)
cls <- randomClickstreams(states, startProbabilities, transitionMatrix,
                          meanLength = 5, n = 10)
print(cls)
Reads a list of clickstreams from a CSV file. Note that non-alphanumeric characters will be removed.
readClickstreams(file, sep = ",", header = FALSE)
file | The name of the file from which the clickstreams are to be read. Each line of the file is read as one clickstream. If it does not contain an absolute path, the file name is relative to the current working directory, getwd(). |
sep | The character separating clicks (default is “,”). |
header | A logical flag indicating whether the first entry of each line in the file is the name of the clickstream user. |
A list of clickstreams. Each element is a vector of characters representing the clicks. The name of each list element is either the header of a clickstream file or a unique number.
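The file format and return value described above can be illustrated with a short base-R sketch. The function read_streams is hypothetical and is not the package's reader; in particular, stripping non-alphanumeric characters from the clicks only (and not from the user names) is an assumption.

```r
# Hypothetical stand-in for the reader: one clickstream per line,
# optional user name in the first field.
read_streams <- function(file, sep = ",", header = FALSE) {
  lines <- readLines(file)
  streams <- strsplit(lines, sep, fixed = TRUE)
  if (header) {
    # The first entry of each line names the clickstream.
    names(streams) <- vapply(streams, `[`, character(1), 1)
    streams <- lapply(streams, `[`, -1)
  } else {
    # Otherwise each clickstream gets a unique number as its name.
    names(streams) <- seq_along(streams)
  }
  # Assumption: only the clicks are cleaned of non-alphanumeric characters.
  lapply(streams, function(s) gsub("[^[:alnum:]]", "", s))
}

csf <- tempfile()
writeLines(c("User1,h,c,p-1", "User2,i,d"), csf)
streams <- read_streams(csf, header = TRUE)
unlink(csf)
print(streams)
```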
Michael Scholz
print.Clickstreams, randomClickstreams
clickstreams <- c("User1,h,c,c,p,c,h,c,p,p,c,p,p,o",
                  "User2,i,c,i,c,c,c,d",
                  "User3,h,i,c,i,c,p,c,c,p,c,c,i,d",
                  "User4,c,c,p,c,d",
                  "User5,h,c,c,p,p,c,p,p,p,i,p,o",
                  "User6,i,h,c,c,p,p,c,p,c,d")
csf <- tempfile()
writeLines(clickstreams, csf)
cls <- readClickstreams(csf, header = TRUE)
unlink(csf)
print(cls)
Shows an EvaluationResult object
## S4 method for signature 'EvaluationResult'
show(object)
object | An instance of the EvaluationResult class. |
Shows an EvaluationResult object.
Michael Scholz
Shows a MarkovChain object
## S4 method for signature 'MarkovChain'
show(object)
object | An instance of the MarkovChain class. |
Shows a MarkovChain object.
Michael Scholz
Shows a Pattern object
## S4 method for signature 'Pattern'
show(object)
object | An instance of the Pattern class. |
Shows a Pattern object.
Michael Scholz
Returns All States
states(object)
object | An instance of the MarkovChain class. |
Returns the names of all states of a MarkovChain object.
Michael Scholz
Prints the Summary of a MarkovChain Object
## S4 method for signature 'MarkovChain'
summary(object)
object | An instance of the MarkovChain class. |
Returns a MarkovChainSummary object with the following elements:
desc | A short description of the MarkovChain object. |
observations | The number of observations from which the MarkovChain object was fitted. |
k | The number of estimation parameters. |
logLikelihood | The maximal log-likelihood of the MarkovChain object. |
aic | Akaike's Information Criterion for the MarkovChain object. |
bic | Bayesian Information Criterion for the MarkovChain object. |
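The aic and bic entries relate to k, observations and logLikelihood through the textbook definitions of the two criteria. The sketch below uses those standard formulas; the helper information_criteria and the input numbers are made up for illustration, and it is an assumption that the package applies exactly these definitions.

```r
# Hypothetical helper using the standard definitions:
#   AIC = 2k - 2 logL,   BIC = k log(n) - 2 logL
information_criteria <- function(logLikelihood, k, observations) {
  c(aic = 2 * k - 2 * logLikelihood,
    bic = k * log(observations) - 2 * logLikelihood)
}

ic <- information_criteria(logLikelihood = -50, k = 6, observations = 40)
print(ic)
```

Lower values indicate a better trade-off between fit and number of parameters, which is how the two criteria are typically used to compare chains of different order.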
Generates a summary for a given MarkovChain object.
Michael Scholz
Prints a summary of a ClickstreamClusters object. A ClickstreamClusters object represents the result of a cluster analysis on a list of clickstreams (see clusterClickstreams).
## S3 method for class 'ClickstreamClusters'
summary(object, ...)
object | A ClickstreamClusters object. |
... | Ignored parameters. |
Michael Scholz
clusterClickstreams, print.ClickstreamClusters
clickstreams <- c("User1,h,c,c,p,c,h,c,p,p,c,p,p,o",
                  "User2,i,c,i,c,c,c,d",
                  "User3,h,i,c,i,c,p,c,c,p,c,c,i,d",
                  "User4,c,c,p,c,d",
                  "User5,h,c,c,p,p,c,p,p,p,i,p,o",
                  "User6,i,h,c,c,p,p,c,p,c,d")
cls <- as.clickstreams(clickstreams, header = TRUE)
clusters <- clusterClickstreams(cls, order = 0, centers = 2)
summary(clusters)
Prints a summary of a Clickstreams object.
## S3 method for class 'Clickstreams'
summary(object, ...)
object | A Clickstreams object. |
... | Ignored parameters. |
Michael Scholz
readClickstreams, randomClickstreams
clickstreams <- c("User1,h,c,c,p,c,h,c,p,p,c,p,p,o",
                  "User2,i,c,i,c,c,c,d",
                  "User3,h,i,c,i,c,p,c,c,p,c,c,i,d",
                  "User4,c,c,p,c,d",
                  "User5,h,c,c,p,p,c,p,p,p,i,p,o",
                  "User6,i,h,c,c,p,p,c,p,c,d")
cls <- as.clickstreams(clickstreams, header = TRUE)
summary(cls)
Returns All Transient States
transientStates(object)
object | An instance of the MarkovChain class. |
Returns the names of all states that have a non-zero probability that a user will never return to them (i.e. that are transient).
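For a finite chain, a state is transient exactly when some state can be reached from it from which there is no way back. The base-R sketch below checks this by reachability on a plain transition matrix. The function transient_states and the toy matrix are hypothetical, the matrix is assumed to be laid out with one column per current state, and this is not how the package necessarily computes the result.

```r
# Hypothetical helper: transient states of a finite Markov chain.
# Assumed layout: column = current state, row = next state.
transient_states <- function(transitionMatrix, states) {
  n <- length(states)
  # reach[i, j] == 1: state i can be reached from state j in zero or more clicks
  reach <- ((transitionMatrix > 0) | (diag(n) == 1)) * 1
  for (step in seq_len(n)) {
    reach <- ((reach %*% reach) > 0) * 1
  }
  # escapes[i, j]: i is reachable from j, but j is not reachable from i
  escapes <- (reach == 1) & (t(reach) == 0)
  states[colSums(escapes) > 0]
}

states <- c("h", "c", "d")
# "d" is absorbing: once reached, the user never leaves it.
P <- matrix(c(0.5, 0.5, 0,
              0.3, 0.2, 0.5,
              0,   0,   1), nrow = 3)
result <- transient_states(P, states)
print(result)
```

Here "h" and "c" are transient because both can lead to the absorbing state "d", from which neither can be reached again.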
Michael Scholz
Writes a list of clickstreams to a CSV file.
writeClickstreams(
  clickstreamList,
  file,
  header = TRUE,
  sep = ",",
  quote = TRUE
)
clickstreamList | The list of clickstreams to be written. |
file | The name of the file to which the clickstreams are written. |
header | A logical flag indicating whether the name of each clickstream element should be used as first element. |
sep | The character used to separate clicks (default is “,”). |
quote | A logical flag indicating whether each element of a clickstream will be surrounded by double quotes (default is TRUE). |
Michael Scholz
readClickstreams, clusterClickstreams
clickstreams <- c("User1,h,c,c,p,c,h,c,p,p,c,p,p,o",
                  "User2,i,c,i,c,c,c,d",
                  "User3,h,i,c,i,c,p,c,c,p,c,c,i,d",
                  "User4,c,c,p,c,d",
                  "User5,h,c,c,p,p,c,p,p,p,i,p,o",
                  "User6,i,h,c,c,p,p,c,p,c,d")
cls <- as.clickstreams(clickstreams, header = TRUE)
clusters <- clusterClickstreams(cls, order = 0, centers = 2)
writeClickstreams(cls, file = "clickstreams.csv", header = TRUE, sep = ",")

# Remove the clickstream file
unlink("clickstreams.csv")