Package 'SAGMM' reference manual

Title:	Clustering via Stochastic Approximation and Gaussian Mixture Models
Description:	Computes clustering by fitting Gaussian mixture models (GMM) via stochastic approximation following the methods of Nguyen and Jones (2018) <doi:10.1201/9780429446177>. It also provides some test data generation and plotting functionality to assist with this process.
Authors:	Andrew T. Jones, Hien D. Nguyen
Maintainer:	Andrew T. Jones <[email protected]>
License:	GPL-3
Version:	0.2.4
Built:	2025-02-12 06:49:43 UTC
Source:	CRAN

Return Gamma, a sequence of gain factors

Description

Generate a series of gain factors.

Usage

gainFactors(Number, Burnin)

Arguments

`Number`	Number of values required.
`Burnin`	Number of 'Burnin' values at the beginning of sequence.

Value

Gamma, a vector of gain factors.
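
To get a feel for the returned sequence, the sketch below summarises a gain-factor vector using base R only. `summarise_gains` is a hypothetical helper, not part of SAGMM, and the call to `gainFactors` is guarded so the block still runs when the package is not installed. This manual does not state the exact length or shape of the returned vector, so the summary makes no assumption about either.

```r
# Hypothetical helper (not part of SAGMM): summarise a numeric gain sequence.
summarise_gains <- function(g) {
  stopifnot(is.numeric(g), length(g) > 0)
  list(
    n        = length(g),          # how many gain factors were returned
    range    = range(g),           # smallest and largest value
    n_unique = length(unique(g))   # number of distinct values in the sequence
  )
}

# Only call the package function if SAGMM is installed.
if (requireNamespace("SAGMM", quietly = TRUE)) {
  g <- SAGMM::gainFactors(10^4, 2 * 10^3)
  str(summarise_gains(g))
}
```

Comparing `n` with the requested `Number` is a quick way to confirm how the two arguments are interpreted on your installed version.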

Examples

g<-gainFactors(10^4, 2*10^3)

Generate data for simulations to test the SAGMM package.

Description

This function is primarily a convenience wrapper for MixSim.

Usage

generateSimData(ngroups = 5, Dimensions = 5, Number = 10^4)

Arguments

`ngroups`	Number of mixture components. Default 5.
`Dimensions`	Number of dimensions. Default 5.
`Number`	Number of samples. Default 10^4.

Value

List of results: X, Y, simobject.
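
The element names `X`, `Y` and `simobject` come from the list above; their exact types are not documented here. The sketch below is a minimal shape check under two assumptions: that `X` can be coerced to a matrix with one row per sample, and that `Y` holds one group label per row. `describe_sim` is a hypothetical helper, not part of SAGMM, and the package call is guarded so the block runs without it.

```r
# Hypothetical helper (not part of SAGMM): basic shape checks on a simulation list.
describe_sim <- function(sims) {
  stopifnot(is.list(sims), all(c("X", "Y") %in% names(sims)))
  X <- as.matrix(sims$X)
  list(
    n_samples    = nrow(X),
    n_dimensions = ncol(X),
    group_sizes  = table(sims$Y),              # observations per true group
    labels_match = length(sims$Y) == nrow(X)   # assumed: one label per row of X
  )
}

# Only call the package function if SAGMM is installed.
if (requireNamespace("SAGMM", quietly = TRUE)) {
  sims <- SAGMM::generateSimData(ngroups = 3, Dimensions = 2, Number = 500)
  str(describe_sim(sims))
}
```

If `labels_match` is FALSE on your installation, the assumption about `Y` does not hold and the element should be inspected directly with `str(sims$Y)`.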

Examples

sims<-generateSimData(ngroups=10, Dimensions=10, Number=10^4)
sims<-generateSimData()

SAGMM: A package for Clustering via Stochastic Approximation and Gaussian Mixture Models.

Description

The SAGMM package fits Gaussian mixture models using stochastic approximation, which increases efficiency on large data sets. The primary function, SAGMMFit, allows this to be done in a relatively flexible manner.

Author(s)

Andrew T. Jones and Hien D. Nguyen

References

Nguyen & Jones (2018). Big Data-Appropriate Clustering via Stochastic Approximation and Gaussian Mixture Models. In Data Analytics (pp. 79-96). CRC Press.

Clustering via Stochastic Approximation and Gaussian Mixture Models (GMM)

Description

Fit a GMM via stochastic approximation. See References.

Usage

SAGMMFit(X, Y = NULL, Burnin = 5, ngroups = 5, kstart = 10,
  plot = FALSE)

Arguments

`X`	numeric matrix of the data.
`Y`	Group membership (if known), with groups given as integers in 1:ngroups. If provided ngroups can
`Burnin`	Ratio of observations to use as a burn-in before the algorithm begins.
`ngroups`	Number of mixture components. If Y is provided and ngroups is not, ngroups is overridden by Y.
`kstart`	Number of k-means starts used for initialisation.
`plot`	If TRUE generates a plot of the clustering.

Value

A list containing

`Cluster`	The clustering of each observation.
`plot`	A plot of the clustering (if requested).
`l2`	Estimate of Lambda^2
`ARI1`	Adjusted Rand Index 1 - using k-means
`ARI2`	Adjusted Rand Index 2 - using GMM Clusters
`ARI3`	Adjusted Rand Index 3 - using initialisation k-means
`KM`	Initial K-means clustering of the data.
`pi`	The cluster proportions (vector of length ngroups)
`tau`	tau matrix of conditional probabilities.
`fit`	Full output details from inner C++ loop.
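
The three ARI entries compare a clustering against the known labels in `Y`. As an illustration of what an Adjusted Rand Index measures, here is a minimal base R sketch of the standard contingency-table formula. `adjusted_rand_index` is a hypothetical helper written for this example; this manual does not say how SAGMM computes the index internally, so the values it reports may come from a different routine.

```r
# Hypothetical helper (not part of SAGMM): Adjusted Rand Index of two labelings.
adjusted_rand_index <- function(a, b) {
  stopifnot(length(a) == length(b))
  tab <- table(a, b)                        # contingency table of the two labelings
  n_pairs <- choose(length(a), 2)           # total number of observation pairs
  index <- sum(choose(tab, 2))              # pairs grouped together in both labelings
  sum_a <- sum(choose(rowSums(tab), 2))     # pairs grouped together in a
  sum_b <- sum(choose(colSums(tab), 2))     # pairs grouped together in b
  expected <- sum_a * sum_b / n_pairs       # expected index under random labeling
  max_index <- (sum_a + sum_b) / 2
  # Note: returns NaN when both labelings put everything in a single group.
  (index - expected) / (max_index - expected)
}

# Identical partitions with different label names agree perfectly.
adjusted_rand_index(c(1, 1, 2, 2, 3, 3), c("x", "x", "y", "y", "z", "z"))
```

With a fitted result, a call such as `adjusted_rand_index(sims$Y, res1$Cluster)` would give a value comparable in meaning to the reported indices, where 1 indicates perfect agreement and values near 0 indicate agreement no better than chance.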

Author(s)

Andrew T. Jones and Hien D. Nguyen

References

Nguyen & Jones (2018). Big Data-Appropriate Clustering via Stochastic Approximation and Gaussian Mixture Models. In Data Analytics (pp. 79-96). CRC Press.

Examples

sims<-generateSimData(ngroups=10, Dimensions=10, Number=10^4)
res1<-SAGMMFit(sims$X, sims$Y)
res2<-SAGMMFit(sims$X, ngroups=5)
