A quick tour of NMoE

Introduction

NMoE (Normal Mixture-of-Experts) provides a flexible modelling framework for heterogeneous regression data with Gaussian (Normal) conditional distributions. NMoE consists of a mixture of K Normal expert regressors (of polynomial degree p) gated by a softmax gating network (of polynomial degree q), and is parameterized by (the density is written out after this list):

  • The gating network parameters: the softmax coefficients alpha’s.
  • The experts network parameters: the location parameters (regression coefficients) beta’s and the variances sigma2’s.
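
In the standard notation for this model family (a sketch of the usual NMoE density; the symbols below are the conventional ones, not taken verbatim from the package), the conditional distribution of the response $y$ given the covariate vector $x$ is

$$f(y \mid x; \Psi) = \sum_{k=1}^{K} \underbrace{\frac{\exp(\alpha_k^\top x)}{\sum_{l=1}^{K} \exp(\alpha_l^\top x)}}_{\text{softmax gate } \pi_k(x;\, \alpha)} \, \mathcal{N}\!\left(y;\; \beta_k^\top x,\; \sigma_k^2\right),$$

where $\alpha_K = 0$ is fixed for identifiability, and $x$ collects the polynomial terms of the input (degree q for the gates, degree p for the experts).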

This vignette was written in R Markdown, using the knitr package for production.

See help(package="meteorits") for further details, and citation("meteorits") for the references.

Application to a simulated dataset

Generate sample

library(meteorits) # Provides sampleUnivNMoE() and emNMoE()

n <- 500 # Size of the sample
alphak <- matrix(c(0, 8), ncol = 1) # Parameters of the gating network
betak <- matrix(c(0, -2.5, 0, 2.5), ncol = 2) # Regression coefficients of the experts
sigmak <- c(1, 1) # Standard deviations of the experts
x <- seq.int(from = -1, to = 1, length.out = n) # Inputs (predictors)

# Generate sample of size n
sample <- sampleUnivNMoE(alphak = alphak, betak = betak, sigmak = sigmak, x = x)
y <- sample$y
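
Before fitting, it can help to eyeball the simulated sample (a minimal base-R sketch, not part of the original vignette):

# Two crossing linear regimes with unit-variance Gaussian noise
# should be visible in the scatter plot.
plot(x, y, pch = 16, cex = 0.5, col = "grey40",
     xlab = "x", ylab = "y", main = "Simulated NMoE sample")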

Set up NMoE model parameters

K <- 2 # Number of regressors/experts
p <- 1 # Order of the polynomial regression (regressors/experts)
q <- 1 # Order of the logistic regression (gating network)
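
Assuming the usual parameter count for a softmax-gated NMoE (an inference from the model definition, not stated in the vignette), these choices imply

$$\mathrm{df} = \underbrace{(K-1)(q+1)}_{\text{gating}} + \underbrace{K(p+1)}_{\text{experts}} + \underbrace{K}_{\text{variances}} = 2 + 4 + 2 = 8,$$

which matches the df = 8 reported by the model summary below.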

Set up EM parameters

n_tries <- 1
max_iter <- 1500
threshold <- 1e-5
verbose <- TRUE
verbose_IRLS <- FALSE
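
Here threshold presumably bounds the relative change of the log-likelihood between two EM iterations (an assumption about emNMoE's convergence test, consistent with the monotonically increasing log-likelihood trace printed below): the algorithm stops at iteration $t$ when

$$\left|\frac{\ell^{(t)} - \ell^{(t-1)}}{\ell^{(t-1)}}\right| < 10^{-5},$$

or after max_iter iterations, whichever comes first.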

Estimation

nmoe <- emNMoE(X = x, Y = y, K, p, q, n_tries, max_iter, 
               threshold, verbose, verbose_IRLS)
## EM NMoE: Iteration: 1 | log-likelihood: -866.137907192821
## EM NMoE: Iteration: 2 | log-likelihood: -865.565209804636
## EM NMoE: Iteration: 3 | log-likelihood: -864.619538863423
## EM NMoE: Iteration: 4 | log-likelihood: -862.687451604584
## EM NMoE: Iteration: 5 | log-likelihood: -858.705037480123
## EM NMoE: Iteration: 6 | log-likelihood: -850.944611349545
## EM NMoE: Iteration: 7 | log-likelihood: -837.32333663068
## EM NMoE: Iteration: 8 | log-likelihood: -816.969375981482
## EM NMoE: Iteration: 9 | log-likelihood: -792.960139525551
## EM NMoE: Iteration: 10 | log-likelihood: -772.413187885903
## EM NMoE: Iteration: 11 | log-likelihood: -759.86631511808
## EM NMoE: Iteration: 12 | log-likelihood: -753.6394583455
## EM NMoE: Iteration: 13 | log-likelihood: -750.649031204516
## EM NMoE: Iteration: 14 | log-likelihood: -749.161999518704
## EM NMoE: Iteration: 15 | log-likelihood: -748.401445264402
## EM NMoE: Iteration: 16 | log-likelihood: -748.004900337244
## EM NMoE: Iteration: 17 | log-likelihood: -747.793619551444
## EM NMoE: Iteration: 18 | log-likelihood: -747.677480343033
## EM NMoE: Iteration: 19 | log-likelihood: -747.610729399699
## EM NMoE: Iteration: 20 | log-likelihood: -747.570029951881
## EM NMoE: Iteration: 21 | log-likelihood: -747.543398687308
## EM NMoE: Iteration: 22 | log-likelihood: -747.524621890016
## EM NMoE: Iteration: 23 | log-likelihood: -747.510435824386
## EM NMoE: Iteration: 24 | log-likelihood: -747.499095243218
## EM NMoE: Iteration: 25 | log-likelihood: -747.489643519091
## EM NMoE: Iteration: 26 | log-likelihood: -747.481537822712
## EM NMoE: Iteration: 27 | log-likelihood: -747.474455362923

Summary

nmoe$summary()
## ------------------------------------------
## Fitted Normal Mixture-of-Experts model
## ------------------------------------------
## 
## NMoE model with K = 2 experts:
## 
##  log-likelihood df       AIC       BIC       ICL
##       -747.4745  8 -755.4745 -772.3329 -813.4375
## 
## Clustering table (Number of observations in each expert):
## 
##   1   2 
## 279 221 
## 
## Regression coefficients:
## 
##     Beta(k = 1) Beta(k = 2)
## 1     0.2817768   0.0273754
## X^1   2.8143684  -2.4503403
## 
## Variances:
## 
##  Sigma2(k = 1) Sigma2(k = 2)
##        1.11643      1.039103
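
Up to label switching (the fitted experts 1 and 2 correspond to the generating experts 2 and 1), the fit recovers the simulation setup well: intercepts near 0, slopes near 2.5 and -2.5, and variances near 1.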

Plots

Mean curve

nmoe$plot(what = "meancurve")

Confidence regions

nmoe$plot(what = "confregions")

Clusters

nmoe$plot(what = "clusters")

Log-likelihood

nmoe$plot(what = "loglikelihood")

Application to a real dataset

Load data

data("tempanomalies")
x <- tempanomalies$Year
y <- tempanomalies$AnnualAnomaly
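
A quick look at the series before fitting (a base-R sketch, not part of the original vignette):

# Annual temperature anomalies over the record; a change in trend
# motivates modelling the series with K = 2 experts.
plot(x, y, type = "l", xlab = "Year", ylab = "Annual anomaly")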

Set up NMoE model parameters

K <- 2 # Number of regressors/experts
p <- 1 # Order of the polynomial regression (regressors/experts)
q <- 1 # Order of the logistic regression (gating network)

Set up EM parameters

n_tries <- 1
max_iter <- 1500
threshold <- 1e-5
verbose <- TRUE
verbose_IRLS <- FALSE

Estimation

nmoe <- emNMoE(X = x, Y = y, K, p, q, n_tries, max_iter, 
               threshold, verbose, verbose_IRLS)
## EM NMoE: Iteration: 1 | log-likelihood: 48.7564014408117
## EM NMoE: Iteration: 2 | log-likelihood: 49.158089857527
## EM NMoE: Iteration: 3 | log-likelihood: 50.2366237848553
## EM NMoE: Iteration: 4 | log-likelihood: 53.0647094633021
## EM NMoE: Iteration: 5 | log-likelihood: 59.1373128650323
## EM NMoE: Iteration: 6 | log-likelihood: 67.4621145659615
## EM NMoE: Iteration: 7 | log-likelihood: 73.3694996911543
## EM NMoE: Iteration: 8 | log-likelihood: 76.0127757391333
## EM NMoE: Iteration: 9 | log-likelihood: 77.6190793770636
## EM NMoE: Iteration: 10 | log-likelihood: 79.2537502900737
## EM NMoE: Iteration: 11 | log-likelihood: 81.2689067880499
## EM NMoE: Iteration: 12 | log-likelihood: 84.0250399902512
## EM NMoE: Iteration: 13 | log-likelihood: 87.9640435580483
## EM NMoE: Iteration: 14 | log-likelihood: 92.5174421679465
## EM NMoE: Iteration: 15 | log-likelihood: 95.1426209669512
## EM NMoE: Iteration: 16 | log-likelihood: 95.9462471840957
## EM NMoE: Iteration: 17 | log-likelihood: 96.208301160411
## EM NMoE: Iteration: 18 | log-likelihood: 96.3268026029042
## EM NMoE: Iteration: 19 | log-likelihood: 96.4057540556682
## EM NMoE: Iteration: 20 | log-likelihood: 96.4757102538877
## EM NMoE: Iteration: 21 | log-likelihood: 96.547053072046
## EM NMoE: Iteration: 22 | log-likelihood: 96.6241263467428
## EM NMoE: Iteration: 23 | log-likelihood: 96.7091784102442
## EM NMoE: Iteration: 24 | log-likelihood: 96.8034058520855
## EM NMoE: Iteration: 25 | log-likelihood: 96.9070986210927
## EM NMoE: Iteration: 26 | log-likelihood: 97.0195237980864
## EM NMoE: Iteration: 27 | log-likelihood: 97.1388310923819
## EM NMoE: Iteration: 28 | log-likelihood: 97.262157860171
## EM NMoE: Iteration: 29 | log-likelihood: 97.3860267272788
## EM NMoE: Iteration: 30 | log-likelihood: 97.5070236189724
## EM NMoE: Iteration: 31 | log-likelihood: 97.622625513492
## EM NMoE: Iteration: 32 | log-likelihood: 97.731919401371
## EM NMoE: Iteration: 33 | log-likelihood: 97.8359658583622
## EM NMoE: Iteration: 34 | log-likelihood: 97.9376147082945
## EM NMoE: Iteration: 35 | log-likelihood: 98.0408127959805
## EM NMoE: Iteration: 36 | log-likelihood: 98.1496069150533
## EM NMoE: Iteration: 37 | log-likelihood: 98.26717592146
## EM NMoE: Iteration: 38 | log-likelihood: 98.3952388392598
## EM NMoE: Iteration: 39 | log-likelihood: 98.5340786953153
## EM NMoE: Iteration: 40 | log-likelihood: 98.6831618166505
## EM NMoE: Iteration: 41 | log-likelihood: 98.8419934490694
## EM NMoE: Iteration: 42 | log-likelihood: 99.0108512522856
## EM NMoE: Iteration: 43 | log-likelihood: 99.191169896525
## EM NMoE: Iteration: 44 | log-likelihood: 99.3857059306659
## EM NMoE: Iteration: 45 | log-likelihood: 99.5987685928119
## EM NMoE: Iteration: 46 | log-likelihood: 99.8367992741779
## EM NMoE: Iteration: 47 | log-likelihood: 100.109480829686
## EM NMoE: Iteration: 48 | log-likelihood: 100.431389231479
## EM NMoE: Iteration: 49 | log-likelihood: 100.823329675216
## EM NMoE: Iteration: 50 | log-likelihood: 101.307465264003
## EM NMoE: Iteration: 51 | log-likelihood: 101.871337841288
## EM NMoE: Iteration: 52 | log-likelihood: 102.387350919616
## EM NMoE: Iteration: 53 | log-likelihood: 102.641826349413
## EM NMoE: Iteration: 54 | log-likelihood: 102.720917798529
## EM NMoE: Iteration: 55 | log-likelihood: 102.720969193023

Summary

nmoe$summary()
## ------------------------------------------
## Fitted Normal Mixture-of-Experts model
## ------------------------------------------
## 
## NMoE model with K = 2 experts:
## 
##  log-likelihood df      AIC      BIC      ICL
##         102.721  8 94.72097 83.07035 83.18514
## 
## Clustering table (Number of observations in each expert):
## 
##  1  2 
## 84 52 
## 
## Regression coefficients:
## 
##       Beta(k = 1)  Beta(k = 2)
## 1   -12.667310466 -42.36310162
## X^1   0.006474817   0.02149318
## 
## Variances:
## 
##  Sigma2(k = 1) Sigma2(k = 2)
##     0.01352312    0.01193059
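
The two experts capture two warming regimes: the slope of the second expert (about 0.021 per year) is roughly three times that of the first (about 0.006 per year), consistent with a faster warming trend in the later part of the record.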

Plots

Mean curve

nmoe$plot(what = "meancurve")

Confidence regions

nmoe$plot(what = "confregions")

Clusters

nmoe$plot(what = "clusters")

Log-likelihood

nmoe$plot(what = "loglikelihood")