Title: | Direct MLE for Multivariate Normal Mixture Distributions |
---|---|
Description: | Multivariate Normal (i.e. Gaussian) Mixture Models (S3) Classes. Fitting models to data using 'MLE' (maximum likelihood estimation) for multivariate normal mixtures via smart parametrization using the 'LDL' (Cholesky) decomposition, see McLachlan and Peel (2000, ISBN:9780471006268), Celeux and Govaert (1995) <doi:10.1016/0031-3203(94)00125-6>. |
Authors: | Nicolas Trutmann [aut, cre], Martin Maechler [aut, ths] (<https://orcid.org/0000-0002-8685-9910>, based on 'nor1mix') |
Maintainer: | Nicolas Trutmann <[email protected]> |
License: | GPL (>= 3) |
Version: | 0.2-0 |
Built: | 2025-01-08 06:51:16 UTC |
Source: | CRAN |
The centered log ratio transformation is Maechler's solution to allowing unconstrained mixture weights optimization.
It has been inspired by Aitchison's centered log ratio,
see also CRAN package compositions' clr()
, and typically
other references on modelling proportions.
clr1(w) clr1inv(lp)
clr1(w) clr1inv(lp)
w |
numeric vector of length |
lp |
numeric vector of length |
Aitchison's clr transformation is slightly different, as it does not drop one coordinate, as we do. Hence the extra ‘1’ in the name of our version.
a numeric vector of length or
, see above.
Martin Maechler
Aitchison, J., 1986. The Statistical Analysis of Compositional Data Monographs on Statistics and Applied Probability. Chapman & Hall Ltd., London (UK).
More in the CRAN package compositions vignette ‘UsingCompositions.pdf’
The first implementation of these was in nor1mix, June 2019, in its
par2norMix()
and
nM2par()
functions.
## Apart from error checking and very large number cases, the R implementation is simply ..clr1 <- function (w) { ln <- log(w) ln[-1L] - mean(ln) } ## and its inverse ..clr1inv <- function(lp) { p1 <- exp(c(-sum(lp), lp)) p1/sum(p1) } lp <- clr1( (1:3)/6 ) clr1inv(lp) stopifnot(all.equal(clr1inv(lp), (1:3)/6)) for(n in 1:100) { k <- 2 + rpois(1, 3) # #{components} lp <- rnorm(k-1) # arbitrary unconstrained ## clr1() and clr1inv() are inverses : stopifnot(all.equal(lp, clr1(clr1inv(lp)))) } wM <- clr1inv(c(720,720,720)) w2 <- clr1inv(c(720,718,717)) stopifnot(is.finite(wM), all.equal(wM, c(0, 1/3, 1/3, 1/3)) , is.finite(w2), all.equal(w2, c(0, 0.84379473, 0.1141952, 0.042010066)) )
## Apart from error checking and very large number cases, the R implementation is simply ..clr1 <- function (w) { ln <- log(w) ln[-1L] - mean(ln) } ## and its inverse ..clr1inv <- function(lp) { p1 <- exp(c(-sum(lp), lp)) p1/sum(p1) } lp <- clr1( (1:3)/6 ) clr1inv(lp) stopifnot(all.equal(clr1inv(lp), (1:3)/6)) for(n in 1:100) { k <- 2 + rpois(1, 3) # #{components} lp <- rnorm(k-1) # arbitrary unconstrained ## clr1() and clr1inv() are inverses : stopifnot(all.equal(lp, clr1(clr1inv(lp)))) } wM <- clr1inv(c(720,720,720)) w2 <- clr1inv(c(720,718,717)) stopifnot(is.finite(wM), all.equal(wM, c(0, 1/3, 1/3, 1/3)) , is.finite(w2), all.equal(w2, c(0, 0.84379473, 0.1141952, 0.042010066)) )
npar()
returns an integer (vector, if p
or k
is)
with the number of free parameters of the corresponding model, which is
also the length(.)
of the parameter vector in our
parametrization, see nMm2par()
.
dfnMm(k, p, model = c("EII","VII","EEI","VEI","EVI", "VVI","EEE","VEE","EVV","VVV"))
dfnMm(k, p, model = c("EII","VII","EEI","VEI","EVI", "VVI","EEE","VEE","EVV","VVV"))
k |
number of mixture components |
p |
dimension of data space, i.e., number of variables (aka “features”). |
model |
a |
integer. degrees of freedom of a model with specified dimensions, components and model type.
(m <- eval(formals(dfnMm)$model)) # list of 10 models w/ differing Sigma # A nice table for a given 'p' and all models, all k in 1:8 sapply(m, dfnMm, k=setNames(,1:8), p = 20)
(m <- eval(formals(dfnMm)$model)) # list of 10 models w/ differing Sigma # A nice table for a given 'p' and all models, all k in 1:8 sapply(m, dfnMm, k=setNames(,1:8), p = 20)
Calculates the probability density function of the multivariate normal distribution.
dnorMmix(x, nMm)
dnorMmix(x, nMm)
x |
a vector or matrix of multivariate observations |
nMm |
a |
Returns the density of nMm
at point x
. Iterates over components of the
mixture and returns weighted sum of dmvnorm
.
Nicolas Trutmann
From 2-dimensional mean vector mu
and 2x2 covariance matrix
sigma
, compute
npoints
equi-angular points on
the 1-alpha
confidence ellipse of bivariate
Gaussian (normal) distribution
.
ellipsePts(mu, sigma, npoints, alpha = 0.05, r = sqrt(qchisq(1 - alpha, df = 2)))
ellipsePts(mu, sigma, npoints, alpha = 0.05, r = sqrt(qchisq(1 - alpha, df = 2)))
mu |
mean vector ( |
sigma |
2x2 |
npoints |
integer specifying the number of points to be computed. |
alpha |
confidence level such that the ellipse should contain 1-alpha of the mass. |
r |
radius of the ellipse, typically computed from |
a numeric matrix of dimension npoints x 2
, containing the
x-y-coordinates of the ellipse points.
This has been inspired by package mixtools's ellipse()
function.
Martin Maechler
xy <- ellipsePts(c(10, 100), sigma = cbind(c(4, 7), c(7, 28)), npoints = 20) plot(xy, type = "b", col=2, cex=2, main="ellipsePts(mu = (10,100), sigma, npoints = 20)") points(10, 100, col=3, cex=3, pch=3) text (10, 100, col=3, expression(mu == "mu"), adj=c(-.1, -.1)) stopifnot(is.matrix(xy), dim(xy) == c(20, 2))
xy <- ellipsePts(c(10, 100), sigma = cbind(c(4, 7), c(7, 28)), npoints = 20) plot(xy, type = "b", col=2, cex=2, main="ellipsePts(mu = (10,100), sigma, npoints = 20)") points(10, 100, col=3, cex=3, pch=3) text (10, 100, col=3, expression(mu == "mu"), adj=c(-.1, -.1)) stopifnot(is.matrix(xy), dim(xy) == c(20, 2))
Simple (but not too simple) R implementation of the (square root free)
Choleksy decomposition.
ldl(m)
ldl(m)
m |
positive semi-definite square matrix, say of dimension |
a list
with two components
L |
a lower triangular matrix with diagonal entries 1. |
D |
numeric vector, the diagonal
|
chol()
in base R, or also a “generalized LDL”
decomposition, the Bunch-Kaufman, BunchKaufman()
in (‘Recommended’) package Matrix.
(L <- rbind(c(1,0,0), c(3,1,0), c(-4,5,1))) D <- c(4,1,9) FF <- L %*% diag(D) %*% t(L) FF LL <- ldl(FF) stopifnot(all.equal(L, LL$L), all.equal(D, LL$D)) ## rank deficient : FF0 <- L %*% diag(c(4,0,9)) %*% t(L) ((L0 <- ldl(FF0))) # !! now fixed with the if(Di == 0) test ## With the "trick", it works: stopifnot(all.equal(FF0, L0$L %*% diag(L0$D) %*% t(L0$L))) ## [hint: the LDL' is no longer unique when the matrix is singular] system.time(for(i in 1:10000) ldl(FF) ) # ~ 0.2 sec (L <- rbind(c( 1, 0, 0, 0), c( 3, 1, 0, 0), c(-4, 5, 1, 0), c(-2,20,-7, 1))) D <- c(4,1, 9, 0.5) F4 <- L %*% diag(D) %*% t(L) F4 L4 <- ldl(F4) stopifnot(all.equal(L, L4$L), all.equal(D, L4$D)) system.time(for(i in 1:10000) ldl(F4) ) # ~ 0.16 sec ## rank deficient : F4.0 <- L %*% diag(c(4,1,9,0)) %*% t(L) ((L0 <- ldl(F4.0))) stopifnot(all.equal(F4.0, L0$L %*% diag(L0$D) %*% t(L0$L))) F4_0 <- L %*% diag(c(4,1,0,9)) %*% t(L) ((L0 <- ldl(F4_0))) stopifnot(all.equal(F4_0, L0$L %*% diag(L0$D) %*% t(L0$L))) ## Large mkLDL <- function(n, rF = function(n) sample.int(n), rFD = function(n) 1+ abs(rF(n))) { L <- diag(nrow=n) L[lower.tri(L)] <- rF(n*(n-1)/2) list(L = L, D = rFD(n)) } (LD <- mkLDL(17)) chkLDL <- function(n, ..., verbose=FALSE, tol = 1e-14) { LD <- mkLDL(n, ...) if(verbose) cat(sprintf("n=%3d ", n)) n <- length(D <- LD$D) L <- LD$L M <- L %*% diag(D) %*% t(L) r <- ldl(M) stopifnot(exprs = { all.equal(M, r$L %*% diag(r$D) %*% t(r$L), tol=tol) all.equal(L, r$L, tol=tol) all.equal(D, r$D, tol=tol) }) if(verbose) cat("[ok]\n") invisible(list(LD = LD, M = M, ldl = r)) } (chkLDL(7)) N <- 99 ## test N random cases set.seed(101) for(i in 1:N) { cat(sprintf("i=%3d, ",i)) chkLDL(rpois(1, lambda = 20), verbose=TRUE) } system.time(chkLDL( 500)) # 0.62 try( ## this almost never "works": system.time( chkLDL( 500, rF = rnorm, rFD = function(n) 10 + runif(n)) ) # 0.64 ) system.time(chkLDL( 600)) # 1.09 ## .. then it grows quickly for (on nb-mm4) ## for n = 1000 it typically *fails*: The matrix M is typically very ill conditioned ## does not depend much on the RNG ? "==> much better conditioned L and hence M : " set.seed(120) L <- as(Matrix::tril(toeplitz(exp(-(0:999)/50))), "matrix") dimnames(L) <- NULL D <- 10 + runif(nrow(L)) M <- L %*% diag(D) %*% t(L) rcond(L) # 0.010006 ! rcond(M) # 9.4956e-5 if(FALSE) # ~ 4-5 sec system.time(r <- ldl(M))
(L <- rbind(c(1,0,0), c(3,1,0), c(-4,5,1))) D <- c(4,1,9) FF <- L %*% diag(D) %*% t(L) FF LL <- ldl(FF) stopifnot(all.equal(L, LL$L), all.equal(D, LL$D)) ## rank deficient : FF0 <- L %*% diag(c(4,0,9)) %*% t(L) ((L0 <- ldl(FF0))) # !! now fixed with the if(Di == 0) test ## With the "trick", it works: stopifnot(all.equal(FF0, L0$L %*% diag(L0$D) %*% t(L0$L))) ## [hint: the LDL' is no longer unique when the matrix is singular] system.time(for(i in 1:10000) ldl(FF) ) # ~ 0.2 sec (L <- rbind(c( 1, 0, 0, 0), c( 3, 1, 0, 0), c(-4, 5, 1, 0), c(-2,20,-7, 1))) D <- c(4,1, 9, 0.5) F4 <- L %*% diag(D) %*% t(L) F4 L4 <- ldl(F4) stopifnot(all.equal(L, L4$L), all.equal(D, L4$D)) system.time(for(i in 1:10000) ldl(F4) ) # ~ 0.16 sec ## rank deficient : F4.0 <- L %*% diag(c(4,1,9,0)) %*% t(L) ((L0 <- ldl(F4.0))) stopifnot(all.equal(F4.0, L0$L %*% diag(L0$D) %*% t(L0$L))) F4_0 <- L %*% diag(c(4,1,0,9)) %*% t(L) ((L0 <- ldl(F4_0))) stopifnot(all.equal(F4_0, L0$L %*% diag(L0$D) %*% t(L0$L))) ## Large mkLDL <- function(n, rF = function(n) sample.int(n), rFD = function(n) 1+ abs(rF(n))) { L <- diag(nrow=n) L[lower.tri(L)] <- rF(n*(n-1)/2) list(L = L, D = rFD(n)) } (LD <- mkLDL(17)) chkLDL <- function(n, ..., verbose=FALSE, tol = 1e-14) { LD <- mkLDL(n, ...) if(verbose) cat(sprintf("n=%3d ", n)) n <- length(D <- LD$D) L <- LD$L M <- L %*% diag(D) %*% t(L) r <- ldl(M) stopifnot(exprs = { all.equal(M, r$L %*% diag(r$D) %*% t(r$L), tol=tol) all.equal(L, r$L, tol=tol) all.equal(D, r$D, tol=tol) }) if(verbose) cat("[ok]\n") invisible(list(LD = LD, M = M, ldl = r)) } (chkLDL(7)) N <- 99 ## test N random cases set.seed(101) for(i in 1:N) { cat(sprintf("i=%3d, ",i)) chkLDL(rpois(1, lambda = 20), verbose=TRUE) } system.time(chkLDL( 500)) # 0.62 try( ## this almost never "works": system.time( chkLDL( 500, rF = rnorm, rFD = function(n) 10 + runif(n)) ) # 0.64 ) system.time(chkLDL( 600)) # 1.09 ## .. then it grows quickly for (on nb-mm4) ## for n = 1000 it typically *fails*: The matrix M is typically very ill conditioned ## does not depend much on the RNG ? "==> much better conditioned L and hence M : " set.seed(120) L <- as(Matrix::tril(toeplitz(exp(-(0:999)/50))), "matrix") dimnames(L) <- NULL D <- 10 + runif(nrow(L)) M <- L %*% diag(D) %*% t(L) rcond(L) # 0.010006 ! rcond(M) # 9.4956e-5 if(FALSE) # ~ 4-5 sec system.time(r <- ldl(M))
mvtnorm::dmvnorm
Compute the log-likelihood of a multivariate normal mixture, by calling
dmvnorm()
(from package mvtnorm).
llmvtnorm(par, x, k, model = c("EII", "VII", "EEI", "VEI", "EVI", "VVI", "EEE", "VEE", "EVV", "VVV"))
llmvtnorm(par, x, k, model = c("EII", "VII", "EEI", "VEI", "EVI", "VVI", "EEE", "VEE", "EVV", "VVV"))
par |
parameter vector as calculated by nMm2par |
x |
numeric data |
k |
number of mixture components. |
model |
assumed model of the distribution |
returns the log-likelihood (a number) of the specified model for the data
( observations)
x
.
dmvnorm()
from package mvtnorm. Our own
function, returning the same: llnorMmix()
.
set.seed(1); x <- rnorMmix(50, MW29) para <- nMm2par(MW29, model=MW29$model) llmvtnorm(para, x, 2, model=MW29$model) # [1] -236.2295
set.seed(1); x <- rnorMmix(50, MW29) para <- nMm2par(MW29, model=MW29$model) llmvtnorm(para, x, 2, model=MW29$model) # [1] -236.2295
Calculates log-likelihood of a dataset, tx, given a normal mixture model as
specified by a parameter vector.
A parameter vector can be obtained by applying nMm2par
to a
norMmix
object.
llnorMmix(par, tx, k, model = c("EII", "VII", "EEI", "VEI", "EVI", "VVI", "EEE", "VEE", "EVV", "VVV"))
llnorMmix(par, tx, k, model = c("EII", "VII", "EEI", "VEI", "EVI", "VVI", "EEE", "VEE", "EVV", "VVV"))
par |
parameter vector |
tx |
Transposed numeric data matrix, i.e. |
k |
number of mixture components. |
model |
assumed distribution model of normal mixture |
returns the log-likelihood (a number) of the specified model for the data
( observations)
x
.
Our alternative function llmvtnorm()
(which is based on
dmvnorm()
from package mvtnorm).
set.seed(1); tx <- t(rnorMmix(50, MW29)) para <- nMm2par(MW29, model=MW29$model) llnorMmix(para, tx, 2, model=MW29$model) # [1] -236.2295
set.seed(1); tx <- t(rnorMmix(50, MW29)) para <- nMm2par(MW29, model=MW29$model) llnorMmix(para, tx, 2, model=MW29$model) # [1] -236.2295
Nicolas Trutmann constructed multivariate versions from most of the
univariate (i.e., one-dimensional) "Marron-Wand" densities as defined in
CRAN package nor1mix, see MarronWand
(in
that package).
## 2-dim examples: MW21 # Gaussian MW22 # Skewed MW23 # Str Skew MW24 # Kurtotic MW25 # Outlier MW26 # Bimodal MW27 # Separated (bimodal) MW28 # Asymmetric Bimodal MW29 # Trimodal MW210 # Claw MW211 # Double Claw MW212 # Asymmetric Claw MW213 # Asymm. Double Claw MW214 # Smooth Comb MW215 # Trimodal ## 3-dim : MW31 MW32 MW33 MW34 ## 5 - dim: MW51 # Gaussian
## 2-dim examples: MW21 # Gaussian MW22 # Skewed MW23 # Str Skew MW24 # Kurtotic MW25 # Outlier MW26 # Bimodal MW27 # Separated (bimodal) MW28 # Asymmetric Bimodal MW29 # Trimodal MW210 # Claw MW211 # Double Claw MW212 # Asymmetric Claw MW213 # Asymm. Double Claw MW214 # Smooth Comb MW215 # Trimodal ## 3-dim : MW31 MW32 MW33 MW34 ## 5 - dim: MW51 # Gaussian
A normal mixture model. The first digit of the number in the variable name encodes the dimension of the mixture; the following digits merely enumerate models, with some correlation to the complexity of the model.
Martin Maechler for 1D; Nicolas Trutmann for 2-D, 3-D and 5-D.
Marron, S. and Wand, M. (1992) Exact Mean Integrated Squared Error; Annals of Statistcs 20, 712–736; doi:10.1214/aos/1176348653.
MW210 plot(MW214, main = "plot( MW214 )") plot(MW51, main = paste("plot( MW51 ); name:", attr(MW51, "name")))
MW210 plot(MW214, main = "plot( MW214 )") plot(MW51, main = paste("plot( MW51 ); name:", attr(MW51, "name")))
From a "norMmix"
(-like) object, return the numeric
parameter vector in our MLE parametrization.
nMm2par(obj, model = c("EII", "VII", "EEI", "VEI", "EVI", "VVI", "EEE", "VEE", "EVV", "VVV"), meanFUN = mean.default, checkX = FALSE)
nMm2par(obj, model = c("EII", "VII", "EEI", "VEI", "EVI", "VVI", "EEE", "VEE", "EVV", "VVV"), meanFUN = mean.default, checkX = FALSE)
obj |
a
|
model |
a |
meanFUN |
a |
checkX |
a boolean. check for positive definiteness of covariance matrix. |
This transformation forms a vector from the parameters of a normal mixture. These consist of weights, means and covariance matrices.
Covariance matrices are given as D and L from the LDLt decomposition
vector containing encoded parameters of the mixture. first, the centered log ratio of the weights, then the means, and then the model specific encoding of the covariances.
the inverse function of nMm2par()
is par2nMm()
.
A <- MW24 nMm2par(A, model = A$model) # [1] -0.3465736 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 # [7] -2.3025851 ## All MW* models in {norMmix} pkg: pkg <- "package:norMmix" lMW <- mget(ls(pattern = "^MW", pkg), envir=as.environment(pkg)) lM.par <- lapply(lMW, nMm2par) ## but these *do* differ ___ FIXME __ modMW <- vapply(lMW, `[[`, "model", FUN.VALUE = "XYZ") cbind(modMW, lengths(lM.par), npar = sapply(lMW, npar))[order(modMW),]
A <- MW24 nMm2par(A, model = A$model) # [1] -0.3465736 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 # [7] -2.3025851 ## All MW* models in {norMmix} pkg: pkg <- "package:norMmix" lMW <- mget(ls(pattern = "^MW", pkg), envir=as.environment(pkg)) lM.par <- lapply(lMW, nMm2par) ## but these *do* differ ___ FIXME __ modMW <- vapply(lMW, `[[`, "model", FUN.VALUE = "XYZ") cbind(modMW, lengths(lM.par), npar = sapply(lMW, npar))[order(modMW),]
Cast nor1mix object as norMmix.
nor1toMmix(object)
nor1toMmix(object)
object |
A nor1mix mixture model to be coerced to norMmix. |
This package was designed to extend the nor1mix
package to the case of
multivariate mixture models. Therefore we include a utility function to cast
1-dimensional mixtures as defined in nor1mix to norMmix
.
A norMmix
object if the appropriate S3method has been implemented.
norMmix
creates a multivariate normal (aka Gaussian) mixture
object, conceptually a mixture of multivariate
(
-dimensional) Gaussians
, for
.
norMmix(mu, Sigma = NULL, weight = rep(1/k, k), name = NULL, model = c("EII", "VII", "EEI", "VEI", "EVI", "VVI", "EEE", "VEE", "EVV", "VVV"))
norMmix(mu, Sigma = NULL, weight = rep(1/k, k), name = NULL, model = c("EII", "VII", "EEI", "VEI", "EVI", "VVI", "EEE", "VEE", "EVV", "VVV"))
mu |
matrix of means, or a vector in which case |
Sigma |
NULL, number, numeric, vector (length = k), matrix (dim = p x k), or array (p x p x k). See details. |
weight |
weights of mixture model components |
name |
gives the option of naming mixture |
model |
see ‘Details’ |
model
must be specified by one of the (currently 10)
character
strings shown in the default.
(In a future version, model
may become optional).
norMmix
as a few nifty ways of constructing simpler matrices from
smaller givens. This happens according to the dimension of the given
value for the Sigma argument:
for a single value d
or NULL
, norMmix()
assumes
all covariance matrices to be diagonal with entries d
or 1
, respectively.
for a vector v
, norMmix
assumes all matrices to
be diagonal with the i-th matrix having diagonal entries v[i]
.
for a matrix m
, norMmix
assumes all matrices to
be diagonal with diagonal vector m[,i]
, i.e., it goes by
columns.
an array is assumed to be the covariance matrices, given explicitly.
FIXME ... give "all" the details ... (from Bachelor's thesis ???)
currently, a list
of class "norMmix"
, with
a name
attribute and components
model |
three-letter |
mu |
(p x k) matrix of component means |
Sigma |
(p x p x k) array of component Covariance matrices
|
weight |
p-vector of mixture probability weights;
non-negative, summing to one: |
k |
integer, the number of components |
dim |
integer, the dimension |
Nicolas Trutmann
__ TODO __
norMmixMLE()
to fit such mixture models to data (an matrix).
“Marron-Wand”-like examples (for testing, etc), such as
MW21
.
## Some of the "MW" objects : % --> ../R/zmarrwandnMm.R # very simple 2d: M21 <- norMmix(mu = cbind(c(0,0)), # 2 x 1 ==> k=2, p=1 Sigma = 1, model = "EII") stopifnot(identical(M21, # even simpler, Sigma = default : norMmix(mu = cbind(c(0,0)), model = "EII"))) m2.2 <- norMmix(mu = cbind(c(0, 0), c(5, 0)), Sigma = c(1, 10), weight = c(7,1)/8, model = "VEI") m22 <- norMmix( name = "one component rotated", mu = cbind( c(0,0) ), Sigma = array(c(55,9, 9,3), dim = c(2,2, 1)), model = "EVV") stopifnot( all.equal(MW22, m22) ) m213 <- norMmix( name = "#13 test VVV", weight = c(0.5, 0.5), mu = cbind( c(0,0), c(30,30) ), Sigma = array(c( 1,3,3,11, 3,6,6,13 ), dim=c(2,2, 2)), model = "VVV") stopifnot( all.equal(MW213, m213) ) str(m213) m34 <- norMmix( name = "#4 3d VEI", weight = c(0.1, 0.9), mu = matrix(rep(0,6), 3,2), Sigma = array(c(diag(1:3), 0.2*diag(3:1)), c(3,3, 2)), model = "VVI" ) stopifnot( all.equal(MW34, m34) )
## Some of the "MW" objects : % --> ../R/zmarrwandnMm.R # very simple 2d: M21 <- norMmix(mu = cbind(c(0,0)), # 2 x 1 ==> k=2, p=1 Sigma = 1, model = "EII") stopifnot(identical(M21, # even simpler, Sigma = default : norMmix(mu = cbind(c(0,0)), model = "EII"))) m2.2 <- norMmix(mu = cbind(c(0, 0), c(5, 0)), Sigma = c(1, 10), weight = c(7,1)/8, model = "VEI") m22 <- norMmix( name = "one component rotated", mu = cbind( c(0,0) ), Sigma = array(c(55,9, 9,3), dim = c(2,2, 1)), model = "EVV") stopifnot( all.equal(MW22, m22) ) m213 <- norMmix( name = "#13 test VVV", weight = c(0.5, 0.5), mu = cbind( c(0,0), c(30,30) ), Sigma = array(c( 1,3,3,11, 3,6,6,13 ), dim=c(2,2, 2)), model = "VVV") stopifnot( all.equal(MW213, m213) ) str(m213) m34 <- norMmix( name = "#4 3d VEI", weight = c(0.1, 0.9), mu = matrix(rep(0,6), 3,2), Sigma = array(c(diag(1:3), 0.2*diag(3:1)), c(3,3, 2)), model = "VVI" ) stopifnot( all.equal(MW34, m34) )
Direct Maximum Likelihood Estimation (MLE) for multivariate normal
mixture models "norMmix"
. Starting from a
clara
(package cluster) clustering plus
one M-step by default, or alternatively from the default start of (package)
mclust, perform direct likelihood maximization via optim()
.
norMmixMLE(x, k, model = c("EII", "VII", "EEI", "VEI", "EVI", "VVI", "EEE", "VEE", "EVV", "VVV"), initFUN = claraInit, ll = c("nmm", "mvt"), keep.optr = TRUE, keep.data = keep.optr, method = "BFGS", maxit = 100, trace = 2, optREPORT = 10, reltol = sqrt(.Machine$double.eps), ...) claraInit(x, k, samples = 128, sampsize = ssClara2kL, trace) mclVVVinit(x, k, ...) ssClara2kL(n, k, p)
norMmixMLE(x, k, model = c("EII", "VII", "EEI", "VEI", "EVI", "VVI", "EEE", "VEE", "EVV", "VVV"), initFUN = claraInit, ll = c("nmm", "mvt"), keep.optr = TRUE, keep.data = keep.optr, method = "BFGS", maxit = 100, trace = 2, optREPORT = 10, reltol = sqrt(.Machine$double.eps), ...) claraInit(x, k, samples = 128, sampsize = ssClara2kL, trace) mclVVVinit(x, k, ...) ssClara2kL(n, k, p)
x |
numeric [n x p] matrix |
k |
positive number of components |
model |
a |
initFUN |
a |
ll |
a string specifying the method to be used for the likelihood
computation; the default, |
keep.optr , keep.data
|
|
method , maxit , optREPORT , reltol
|
arguments for tuning the
optimizer |
trace |
|
... |
|
samples |
the number of subsamples to take in
|
sampsize |
the sample size to take in
|
n , p
|
By default, initFUN=claraInit
, uses clara()
and one M-step from EM-algorithm to initialize parameters
after that uses general optimizer optim()
to calculate the MLE.
To silence the output of norMmixMLE
, set optREPORT
very high and trace
to 0. For details on output behavior, see the "details" section of optim
.
norMmixMLE
returns an object of class
"norMmixMLE"
which is a list
with components
norMmix |
the |
optr |
(if |
npar |
the number of free parameters, a function of |
n |
the sample size, i.e., the number of observations or rows of |
cond |
the result of (the hidden function) |
x |
(if |
MW214 set.seed(105) x <- rnorMmix(1000, MW214) ## Fitting, assuming to know the true model (k=6, "VII") fm1 <- norMmixMLE(x, k = 6, model = "VII", initFUN=claraInit) fm1 # {using print.norMmixMLE() method} fm1M <- norMmixMLE(x, k = 6, model = "VII", initFUN=mclVVVinit) ## Fitting "wrong" overparametrized model: typically need more iterations: fmW <- norMmixMLE(x, k = 7, model = "VVV", maxit = 200, initFUN=claraInit) ## default maxit=100 is often too small ^^^^^^^^^^^ x <- rnorMmix(2^12, MW51) fM5 <- norMmixMLE(x, k = 4) # k = 3 is sufficient fM5 c(logLik = logLik(fM5), AIC = AIC(fM5), BIC = BIC(fM5)) plot(fM5, show.x=FALSE) plot(fM5, lwd=3, pch.data=".") # this takes several seconds fM5big <- norMmixMLE(x, model = "VVV", k = 4, maxit = 300) # k = 3 is sufficient summary(warnings()) fM5big ; c(logLik = logLik(fM5big), AIC = AIC(fM5big), BIC = BIC(fM5big)) plot(fM5big, show.x=FALSE)
MW214 set.seed(105) x <- rnorMmix(1000, MW214) ## Fitting, assuming to know the true model (k=6, "VII") fm1 <- norMmixMLE(x, k = 6, model = "VII", initFUN=claraInit) fm1 # {using print.norMmixMLE() method} fm1M <- norMmixMLE(x, k = 6, model = "VII", initFUN=mclVVVinit) ## Fitting "wrong" overparametrized model: typically need more iterations: fmW <- norMmixMLE(x, k = 7, model = "VVV", maxit = 200, initFUN=claraInit) ## default maxit=100 is often too small ^^^^^^^^^^^ x <- rnorMmix(2^12, MW51) fM5 <- norMmixMLE(x, k = 4) # k = 3 is sufficient fM5 c(logLik = logLik(fM5), AIC = AIC(fM5), BIC = BIC(fM5)) plot(fM5, show.x=FALSE) plot(fM5, lwd=3, pch.data=".") # this takes several seconds fM5big <- norMmixMLE(x, model = "VVV", k = 4, maxit = 300) # k = 3 is sufficient summary(warnings()) fM5big ; c(logLik = logLik(fM5big), AIC = AIC(fM5big), BIC = BIC(fM5big)) plot(fM5big, show.x=FALSE)
This function is generic; method functions can be written to handle specific classes of objects. The following classes have methods written for them:
norMmix
norMmixMLE
fittednorMmix
npar(object, ...)
npar(object, ...)
object |
any R object from the list in the ‘Description’. |
... |
potentially further arguments for methods; Currently, none of the methods for the listed classes do have such. |
Depending on object
:
norMmix |
integer number. |
norMmixMLE |
integer number. |
fittednorMmix |
Nicolas Trutmann
methods(npar) # list available methods npar(MW213)
methods(npar) # list available methods npar(MW213)
Transforms the (numeric) parameter vector of our MLE parametrization of a multivariate
normal mixture model into the corresponding list
of
components determining the model. Additionally (partly redundantly), the
dimension p
and number of components k
need to be specified
as well.
par2nMm(par, p, k, model = c("EII","VII","EEI","VEI","EVI", "VVI","EEE","VEE","EVV","VVV") , name = sprintf("model = %s , components = %s", model, k) )
par2nMm(par, p, k, model = c("EII","VII","EEI","VEI","EVI", "VVI","EEE","VEE","EVV","VVV") , name = sprintf("model = %s , components = %s", model, k) )
par |
the model parameter numeric vector. |
p |
dimension of data space, i.e., number of variables (aka “features”). |
k |
the number of mixture components, a positive integer. |
model |
a |
name |
returns a list
with components
weight |
.. |
mu |
.. |
Sigma |
.. |
k |
.. |
dim |
.. |
This is the inverse function of nMm2par()
.
## TODO: Show to get the list, and then how to get a norMmix() object from the list str(MW213) # List of 6 # $ model : chr "VVV" # $ mu : num [1:2, 1:2] 0 0 30 30 # $ Sigma : num [1:2, 1:2, 1:2] 1 3 3 11 3 6 6 13 # $ weight: num [1:2] 0.5 0.5 # $ k : int 2 # $ dim : int 2 # - attr(*, "name")= chr "#13 test VVV" # - attr(*, "class")= chr "norMmix" para <- nMm2par(MW213, model="EEE") par2nMm(para, 2, 2, model="EEE")
## TODO: Show to get the list, and then how to get a norMmix() object from the list str(MW213) # List of 6 # $ model : chr "VVV" # $ mu : num [1:2, 1:2] 0 0 30 30 # $ Sigma : num [1:2, 1:2, 1:2] 1 3 3 11 3 6 6 13 # $ weight: num [1:2] 0.5 0.5 # $ k : int 2 # $ dim : int 2 # - attr(*, "name")= chr "#13 test VVV" # - attr(*, "class")= chr "norMmix" para <- nMm2par(MW213, model="EEE") par2nMm(para, 2, 2, model="EEE")
This is the S3 method for plotting "norMmix"
objects.
## S3 method for class 'norMmix' plot(x, y=NULL, ...) ## S3 method for class 'norMmixMLE' plot(x, y = NULL, show.x = TRUE, main = sprintf( "norMmixMLE(*, model=\"%s\") fit to n=%d observations in %d dim.", nm$model, x$nobs, nm$dim ), sub = paste0( sprintf("log likelihood: %g; npar=%d", x$logLik, x$npar), if (!is.null(opt <- x$optr)) paste("; optim() counts:", named2char(opt$counts)) ), cex.data = par("cex") / 4, pch.data = 4, ...) ## S3 method for class 'fittednorMmix' plot(x, main = "unnamed", plotbest = FALSE, ...) plot2d (nMm, data = NULL, add = FALSE, main = NULL, sub = NULL, type = "l", lty = 2, lwd = if (!is.null(data)) 2 else 1, xlim = NULL, ylim = NULL, f.lim = 0.05, npoints = 250, lab = FALSE, col = Trubetskoy10[1], col.data = adjustcolor(par("col"), 1/2), cex.data = par("cex"), pch.data = par("pch"), fill = TRUE, fillcolor = col, border = NA, ...) plotnd(nMm, data = NULL, main = NULL, diag.panel = NULL, ...) Trubetskoy10
## S3 method for class 'norMmix' plot(x, y=NULL, ...) ## S3 method for class 'norMmixMLE' plot(x, y = NULL, show.x = TRUE, main = sprintf( "norMmixMLE(*, model=\"%s\") fit to n=%d observations in %d dim.", nm$model, x$nobs, nm$dim ), sub = paste0( sprintf("log likelihood: %g; npar=%d", x$logLik, x$npar), if (!is.null(opt <- x$optr)) paste("; optim() counts:", named2char(opt$counts)) ), cex.data = par("cex") / 4, pch.data = 4, ...) ## S3 method for class 'fittednorMmix' plot(x, main = "unnamed", plotbest = FALSE, ...) plot2d (nMm, data = NULL, add = FALSE, main = NULL, sub = NULL, type = "l", lty = 2, lwd = if (!is.null(data)) 2 else 1, xlim = NULL, ylim = NULL, f.lim = 0.05, npoints = 250, lab = FALSE, col = Trubetskoy10[1], col.data = adjustcolor(par("col"), 1/2), cex.data = par("cex"), pch.data = par("pch"), fill = TRUE, fillcolor = col, border = NA, ...) plotnd(nMm, data = NULL, main = NULL, diag.panel = NULL, ...) Trubetskoy10
x , nMm
|
an R object inheriting from |
y |
further data matrix, first 2 columns will be plotted by
|
... |
further arguments to be passed to another plotting function. |
show.x |
Option for |
data |
Data points to plot. |
add |
This argument is used in the internal function, |
main |
Set main title. See |
sub |
Set subtitle. See |
type |
Graphing type for ellipses border. Defaults to "l". |
lty |
Line type to go with the |
lwd |
Line width as in |
xlim |
Set explicit x limits for 2d plots. |
ylim |
As |
f.lim |
Percentage value for how much to extend |
npoints |
How many points to use in the drawn ellipses. Larger values make them prettier but might affect plot times. |
lab |
Whether to print labels for mixture components. Will print "comp |
col |
Fill color for ellipses. Default is "#4363d8". |
col.data |
Color to be used for data points. |
cex.data |
See |
pch.data |
See |
fill |
Leave ellipses blank with outline or fill them in. |
fillcolor |
Color for infill of ellipses. |
border |
Argument to be passed to |
diag.panel |
Function to plot 2d projections of a higher-dimensional mixture model.
Used by |
plotbest |
Used by |
The plot method calls one of two auxiliary functions, one for dim=2,
another for higher dimensions. The method for 2 dimensional plots also takes a
add
parameter (FALSE
by default), which allows for the ellipses
to be drawn over an existing plot.
The higher dimensional plot method relies on the pairs.default
function
to draw a lattice plot, where the panels are built using the 2 dimensional method.
Trubetskoy10
: A vector of colors for these plots,
chosen to be distinguishable and accessible for the colorblind, according to
https://sashamaps.net/2017/01/11/list-of-20-simple-distinct-colors/,
slightly rearranged, so that the first five colors stand out well on white background.
plot.norMmix
In the 2 dimensional case, returns invisibly coordinates
of bounding ellipses of distribution.
plot(MW212) ## and add a finite sample realization: points(rnorMmix(n=500, MW212)) ## or: x <- points(rnorMmix(n=500, MW212)) plot(MW212, x) ## Example of dim. = p > 2 : plot(MW34)
plot(MW212) ## and add a finite sample realization: points(rnorMmix(n=500, MW212)) ## or: x <- points(rnorMmix(n=500, MW212)) plot(MW212, x) ## Example of dim. = p > 2 : plot(MW34)
Draw n
(p-dimensional) observations randomly from the multivariate normal
mixture distribution specified by obj
.
rnorMmix(n, obj, index = FALSE, permute = TRUE)
rnorMmix(n, obj, index = FALSE, permute = TRUE)
n |
sample size, non-negative. |
obj |
a |
index |
Logical, store the clustering information as first column |
permute |
Logical, indicating if the observations should be randomly permuted after creation “cluster by cluster”. |
n p-dimensional observations, as numeric matrix.
Nicolas Trutmann
x <- rnorMmix(500, MW213) plot(x) x <- rnorMmix(500, MW213, index=TRUE) plot(x[,-1], col=x[,1]) ## using index column to color components
x <- rnorMmix(500, MW213) plot(x) x <- rnorMmix(500, MW213, index=TRUE) plot(x[,-1], col=x[,1]) ## using index column to color components
sllnorMmix()
returns a number, the log-likelihood of the data
x
, given a normal mixture obj
.
sllnorMmix(x, obj)
sllnorMmix(x, obj)
x |
data |
obj |
an R object of class |
Calculates log-likelihood of a dataset, x, given a normal mixture model;
just a simplified wrapper for llnorMmix
.
Removes functionality in favor of ease of use.
double
. See description.
set.seed(2019) x <- rnorMmix(400, MW27) sllnorMmix(x, MW27) # -1986.315
set.seed(2019) x <- rnorMmix(400, MW27) sllnorMmix(x, MW27) # -1986.315