A quick tour of NMoE

Introduction

NMoE (Normal Mixture-of-Experts) provides a flexible modelling framework for heterogeneous regression data with Gaussian (Normal) conditional distributions. NMoE consists of a mixture of K Normal expert regressors (of polynomial degree p) gated by a softmax gating network (of polynomial degree q), and is parameterized by (the density is written out after this list):

  • The gating network parameters: the softmax coefficients alpha’s.
  • The experts network parameters: the location parameters (regression coefficients) beta’s and the variances sigma2’s.
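
In the standard notation for this model family (a sketch of the usual NMoE density; the symbols below are the conventional ones, not taken verbatim from the package), the conditional distribution of the response $y$ given the covariate vector $x$ is

$$f(y \mid x; \Psi) = \sum_{k=1}^{K} \underbrace{\frac{\exp(\alpha_k^\top x)}{\sum_{l=1}^{K} \exp(\alpha_l^\top x)}}_{\text{softmax gate } \pi_k(x;\, \alpha)} \, \mathcal{N}\!\left(y;\; \beta_k^\top x,\; \sigma_k^2\right),$$

where $\alpha_K = 0$ is fixed for identifiability, and $x$ collects the polynomial terms of the input (degree q for the gates, degree p for the experts).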

This vignette was written in R Markdown, using the knitr package for production.

See help(package="meteorits") for further details, and citation("meteorits") for the references.

Application to a simulated dataset

Generate sample

library(meteorits) # Provides sampleUnivNMoE() and emNMoE()

n <- 500 # Size of the sample
alphak <- matrix(c(0, 8), ncol = 1) # Parameters of the gating network
betak <- matrix(c(0, -2.5, 0, 2.5), ncol = 2) # Regression coefficients of the experts
sigmak <- c(1, 1) # Standard deviations of the experts
x <- seq.int(from = -1, to = 1, length.out = n) # Inputs (predictors)

# Generate sample of size n
sample <- sampleUnivNMoE(alphak = alphak, betak = betak, sigmak = sigmak, x = x)
y <- sample$y
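
Before fitting, it can help to eyeball the simulated sample (a minimal base-R sketch, not part of the original vignette):

# Two crossing linear regimes with unit-variance Gaussian noise
# should be visible in the scatter plot.
plot(x, y, pch = 16, cex = 0.5, col = "grey40",
     xlab = "x", ylab = "y", main = "Simulated NMoE sample")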

Set up NMoE model parameters

K <- 2 # Number of regressors/experts
p <- 1 # Order of the polynomial regression (regressors/experts)
q <- 1 # Order of the logistic regression (gating network)
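
Assuming the usual parameter count for a softmax-gated NMoE (an inference from the model definition, not stated in the vignette), these choices imply

$$\mathrm{df} = \underbrace{(K-1)(q+1)}_{\text{gating}} + \underbrace{K(p+1)}_{\text{experts}} + \underbrace{K}_{\text{variances}} = 2 + 4 + 2 = 8,$$

which matches the df = 8 reported by the model summary below.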

Set up EM parameters

n_tries <- 1
max_iter <- 1500
threshold <- 1e-5
verbose <- TRUE
verbose_IRLS <- FALSE
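
Here threshold presumably bounds the relative change of the log-likelihood between two EM iterations (an assumption about emNMoE's convergence test, consistent with the monotonically increasing log-likelihood trace printed below): the algorithm stops at iteration $t$ when

$$\left|\frac{\ell^{(t)} - \ell^{(t-1)}}{\ell^{(t-1)}}\right| < 10^{-5},$$

or after max_iter iterations, whichever comes first.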

Estimation

nmoe <- emNMoE(X = x, Y = y, K, p, q, n_tries, max_iter, 
               threshold, verbose, verbose_IRLS)
## EM NMoE: Iteration: 1 | log-likelihood: -866.137907192821
## EM NMoE: Iteration: 2 | log-likelihood: -865.565209804636
## EM NMoE: Iteration: 3 | log-likelihood: -864.619538863423
## EM NMoE: Iteration: 4 | log-likelihood: -862.687451604584
## EM NMoE: Iteration: 5 | log-likelihood: -858.705037480123
## EM NMoE: Iteration: 6 | log-likelihood: -850.944611349545
## EM NMoE: Iteration: 7 | log-likelihood: -837.32333663068
## EM NMoE: Iteration: 8 | log-likelihood: -816.969375981482
## EM NMoE: Iteration: 9 | log-likelihood: -792.960139525551
## EM NMoE: Iteration: 10 | log-likelihood: -772.413187885903
## EM NMoE: Iteration: 11 | log-likelihood: -759.86631511808
## EM NMoE: Iteration: 12 | log-likelihood: -753.6394583455
## EM NMoE: Iteration: 13 | log-likelihood: -750.649031204516
## EM NMoE: Iteration: 14 | log-likelihood: -749.161999518704
## EM NMoE: Iteration: 15 | log-likelihood: -748.401445264402
## EM NMoE: Iteration: 16 | log-likelihood: -748.004900337244
## EM NMoE: Iteration: 17 | log-likelihood: -747.793619551444
## EM NMoE: Iteration: 18 | log-likelihood: -747.677480343033
## EM NMoE: Iteration: 19 | log-likelihood: -747.610729399699
## EM NMoE: Iteration: 20 | log-likelihood: -747.570029951881
## EM NMoE: Iteration: 21 | log-likelihood: -747.543398687308
## EM NMoE: Iteration: 22 | log-likelihood: -747.524621890016
## EM NMoE: Iteration: 23 | log-likelihood: -747.510435824386
## EM NMoE: Iteration: 24 | log-likelihood: -747.499095243218
## EM NMoE: Iteration: 25 | log-likelihood: -747.489643519091
## EM NMoE: Iteration: 26 | log-likelihood: -747.481537822712
## EM NMoE: Iteration: 27 | log-likelihood: -747.474455362923

Summary

nmoe$summary()
## ------------------------------------------
## Fitted Normal Mixture-of-Experts model
## ------------------------------------------
## 
## NMoE model with K = 2 experts:
## 
##  log-likelihood df       AIC       BIC       ICL
##       -747.4745  8 -755.4745 -772.3329 -813.4375
## 
## Clustering table (Number of observations in each expert):
## 
##   1   2 
## 279 221 
## 
## Regression coefficients:
## 
##     Beta(k = 1) Beta(k = 2)
## 1     0.2817768   0.0273754
## X^1   2.8143684  -2.4503403
## 
## Variances:
## 
##  Sigma2(k = 1) Sigma2(k = 2)
##        1.11643      1.039103
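
Up to label switching (the fitted experts 1 and 2 correspond to the generating experts 2 and 1), the fit recovers the simulation setup well: intercepts near 0, slopes near 2.5 and -2.5, and variances near 1.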

Plots

Mean curve

nmoe$plot(what = "meancurve")

Confidence regions

nmoe$plot(what = "confregions")

Clusters

nmoe$plot(what = "clusters")

Log-likelihood

nmoe$plot(what = "loglikelihood")

Application to a real dataset

Load data

data("tempanomalies")
x <- tempanomalies$Year
y <- tempanomalies$AnnualAnomaly
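
A quick look at the series before fitting (a base-R sketch, not part of the original vignette):

# Annual temperature anomalies over the record; a change in trend
# motivates modelling the series with K = 2 experts.
plot(x, y, type = "l", xlab = "Year", ylab = "Annual anomaly")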

Set up NMoE model parameters

K <- 2 # Number of regressors/experts
p <- 1 # Order of the polynomial regression (regressors/experts)
q <- 1 # Order of the logistic regression (gating network)

Set up EM parameters

n_tries <- 1
max_iter <- 1500
threshold <- 1e-5
verbose <- TRUE
verbose_IRLS <- FALSE

Estimation

nmoe <- emNMoE(X = x, Y = y, K, p, q, n_tries, max_iter, 
               threshold, verbose, verbose_IRLS)
## EM NMoE: Iteration: 1 | log-likelihood: 48.7564014408117
## EM NMoE: Iteration: 2 | log-likelihood: 49.158089857527
## EM NMoE: Iteration: 3 | log-likelihood: 50.2366237848553
## EM NMoE: Iteration: 4 | log-likelihood: 53.0647094633021
## EM NMoE: Iteration: 5 | log-likelihood: 59.1373128650323
## EM NMoE: Iteration: 6 | log-likelihood: 67.4621145659615
## EM NMoE: Iteration: 7 | log-likelihood: 73.3694996911543
## EM NMoE: Iteration: 8 | log-likelihood: 76.0127757391333
## EM NMoE: Iteration: 9 | log-likelihood: 77.6190793770636
## EM NMoE: Iteration: 10 | log-likelihood: 79.2537502900737
## EM NMoE: Iteration: 11 | log-likelihood: 81.2689067880499
## EM NMoE: Iteration: 12 | log-likelihood: 84.0250399902512
## EM NMoE: Iteration: 13 | log-likelihood: 87.9640435580483
## EM NMoE: Iteration: 14 | log-likelihood: 92.5174421679465
## EM NMoE: Iteration: 15 | log-likelihood: 95.1426209669512
## EM NMoE: Iteration: 16 | log-likelihood: 95.9462471840957
## EM NMoE: Iteration: 17 | log-likelihood: 96.208301160411
## EM NMoE: Iteration: 18 | log-likelihood: 96.3268026029042
## EM NMoE: Iteration: 19 | log-likelihood: 96.4057540556682
## EM NMoE: Iteration: 20 | log-likelihood: 96.4757102538877
## EM NMoE: Iteration: 21 | log-likelihood: 96.547053072046
## EM NMoE: Iteration: 22 | log-likelihood: 96.6241263467428
## EM NMoE: Iteration: 23 | log-likelihood: 96.7091784102442
## EM NMoE: Iteration: 24 | log-likelihood: 96.8034058520855
## EM NMoE: Iteration: 25 | log-likelihood: 96.9070986210927
## EM NMoE: Iteration: 26 | log-likelihood: 97.0195237980864
## EM NMoE: Iteration: 27 | log-likelihood: 97.1388310923819
## EM NMoE: Iteration: 28 | log-likelihood: 97.262157860171
## EM NMoE: Iteration: 29 | log-likelihood: 97.3860267272788
## EM NMoE: Iteration: 30 | log-likelihood: 97.5070236189724
## EM NMoE: Iteration: 31 | log-likelihood: 97.622625513492
## EM NMoE: Iteration: 32 | log-likelihood: 97.731919401371
## EM NMoE: Iteration: 33 | log-likelihood: 97.8359658583622
## EM NMoE: Iteration: 34 | log-likelihood: 97.9376147082945
## EM NMoE: Iteration: 35 | log-likelihood: 98.0408127959805
## EM NMoE: Iteration: 36 | log-likelihood: 98.1496069150533
## EM NMoE: Iteration: 37 | log-likelihood: 98.26717592146
## EM NMoE: Iteration: 38 | log-likelihood: 98.3952388392598
## EM NMoE: Iteration: 39 | log-likelihood: 98.5340786953153
## EM NMoE: Iteration: 40 | log-likelihood: 98.6831618166505
## EM NMoE: Iteration: 41 | log-likelihood: 98.8419934490694
## EM NMoE: Iteration: 42 | log-likelihood: 99.0108512522856
## EM NMoE: Iteration: 43 | log-likelihood: 99.191169896525
## EM NMoE: Iteration: 44 | log-likelihood: 99.3857059306659
## EM NMoE: Iteration: 45 | log-likelihood: 99.5987685928119
## EM NMoE: Iteration: 46 | log-likelihood: 99.8367992741779
## EM NMoE: Iteration: 47 | log-likelihood: 100.109480829686
## EM NMoE: Iteration: 48 | log-likelihood: 100.431389231479
## EM NMoE: Iteration: 49 | log-likelihood: 100.823329675216
## EM NMoE: Iteration: 50 | log-likelihood: 101.307465264003
## EM NMoE: Iteration: 51 | log-likelihood: 101.871337841288
## EM NMoE: Iteration: 52 | log-likelihood: 102.387350919616
## EM NMoE: Iteration: 53 | log-likelihood: 102.641826349413
## EM NMoE: Iteration: 54 | log-likelihood: 102.720917798529
## EM NMoE: Iteration: 55 | log-likelihood: 102.720969193023

Summary

nmoe$summary()
## ------------------------------------------
## Fitted Normal Mixture-of-Experts model
## ------------------------------------------
## 
## NMoE model with K = 2 experts:
## 
##  log-likelihood df      AIC      BIC      ICL
##         102.721  8 94.72097 83.07035 83.18514
## 
## Clustering table (Number of observations in each expert):
## 
##  1  2 
## 84 52 
## 
## Regression coefficients:
## 
##       Beta(k = 1)  Beta(k = 2)
## 1   -12.667310466 -42.36310162
## X^1   0.006474817   0.02149318
## 
## Variances:
## 
##  Sigma2(k = 1) Sigma2(k = 2)
##     0.01352312    0.01193059
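
The two experts capture two warming regimes: the slope of the second expert (about 0.021 per year) is roughly three times that of the first (about 0.006 per year), consistent with a faster warming trend in the later part of the record.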

Plots

Mean curve

nmoe$plot(what = "meancurve")

Confidence regions

nmoe$plot(what = "confregions")

Clusters

nmoe$plot(what = "clusters")

Log-likelihood

nmoe$plot(what = "loglikelihood")