A quick tour of SNMoE

Introduction

SNMoE (Skew-Normal Mixture-of-Experts) provides a flexible modelling framework for heterogeneous data with possibly skewed distributions, generalizing the standard normal mixture-of-experts model. SNMoE consists of a mixture of K skew-normal expert regressors (of degree p) gated by a softmax gating network (of degree q), and is parameterized by:

  • The gating network parameters alpha’s of the softmax network.
  • The experts network parameters: the location parameters (regression coefficients) beta’s, the scale parameters sigma’s, and the skewness parameters lambda’s.

SNMoE thus generalizes mixtures of normal and skew-normal distributions, as well as mixtures of regressions with these distributions. For example, when q = 0, we retrieve mixtures of skew-normal (or normal) regressions, and when both p = 0 and q = 0, we retrieve a mixture of skew-normal (or normal) distributions. It also reduces to a single skew-normal (or normal) distribution when only one expert is used (K = 1).
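
The conditional density described above can be sketched in a few lines of base R. This is a simplified illustration, not the package internals; in particular, the gating parameterization (with the K-th expert's coefficients fixed at zero) is an assumption:

```r
# Softmax gating network: mixing proportions as a function of the input x.
softmax_gate <- function(x, alphak) {
  # alphak: (q+1) x (K-1) gating coefficients; the K-th expert's are fixed at 0
  eta <- cbind(cbind(1, x) %*% alphak, 0)   # n x K linear predictors
  exp_eta <- exp(eta - apply(eta, 1, max))  # numerically stabilized softmax
  exp_eta / rowSums(exp_eta)
}

# Skew-normal density: 2/sigma * phi(z) * Phi(lambda * z), z = (y - mu)/sigma.
dskewnorm <- function(y, mu, sigma, lambda) {
  z <- (y - mu) / sigma
  2 / sigma * dnorm(z) * pnorm(lambda * z)
}

# SNMoE conditional density of y given x: gating-weighted sum of the
# K skew-normal expert densities, each with a polynomial location in x.
snmoe_density <- function(y, x, alphak, betak, sigmak, lambdak) {
  pik <- softmax_gate(x, alphak)            # n x K mixing proportions
  dens <- 0
  for (k in seq_along(sigmak)) {
    muk <- cbind(1, x) %*% betak[, k]       # expert k location (degree p = 1)
    dens <- dens + pik[, k] * dskewnorm(y, muk, sigmak[k], lambdak[k])
  }
  as.vector(dens)
}
```

Setting lambda = 0 in `dskewnorm` recovers the ordinary normal density, which is how SNMoE nests the normal mixture-of-experts model.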

Model estimation/learning is performed by a dedicated expectation conditional maximization (ECM) algorithm that maximizes the observed-data log-likelihood. We provide simulated examples to illustrate the use of the model in model-based clustering of heterogeneous regression data and in fitting non-linear regression functions.

This vignette was written in R Markdown, using the knitr package for production.

See help(package="meteorits") for further details, and citation("meteorits") for the references.

Application to a simulated dataset

Generate sample

library(meteorits) # Provides sampleUnivSNMoE() and emSNMoE()

n <- 500 # Size of the sample
alphak <- matrix(c(0, 8), ncol = 1) # Parameters of the gating network
betak <- matrix(c(0, -2.5, 0, 2.5), ncol = 2) # Regression coefficients of the experts
lambdak <- c(3, 5) # Skewness parameters of the experts
sigmak <- c(1, 1) # Standard deviations of the experts
x <- seq.int(from = -1, to = 1, length.out = n) # Inputs (predictors)

# Generate sample of size n
sample <- sampleUnivSNMoE(alphak = alphak, betak = betak, sigmak = sigmak, 
                          lambdak = lambdak, x = x)
y <- sample$y
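
Note that the skewness parameters shift each expert's mean away from its regression line. Under the standard skew-normal parameterization, the mean is E[Y] = mu + sigma * delta * sqrt(2/pi) with delta = lambda / sqrt(1 + lambda^2), so we can compute the shift implied by the simulation parameters above:

```r
# Mean shift induced by skewness under the standard skew-normal
# parameterization: E[Y] - mu = sigma * delta * sqrt(2/pi),
# with delta = lambda / sqrt(1 + lambda^2).
lambdak <- c(3, 5) # Skewness parameters of the experts
sigmak <- c(1, 1)  # Standard deviations of the experts
delta <- lambdak / sqrt(1 + lambdak^2)
shift <- sigmak * delta * sqrt(2 / pi)
round(shift, 3) # Shift of each expert's mean above its regression line
```

With lambda = 3 and lambda = 5, both experts' means sit roughly 0.76-0.78 above their regression lines, which is worth keeping in mind when reading the fitted coefficients below.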

Set up SNMoE model parameters

K <- 2 # Number of regressors/experts
p <- 1 # Order of the polynomial regression (regressors/experts)
q <- 1 # Order of the logistic regression (gating network)

Set up EM parameters

n_tries <- 1
max_iter <- 1500
threshold <- 1e-6
verbose <- TRUE
verbose_IRLS <- FALSE
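
Here `n_tries` is the number of EM runs from different initializations, `max_iter` caps the number of ECM iterations, and `threshold` governs the stopping rule. A minimal sketch of a relative log-likelihood stopping rule of this kind (an assumption for illustration, not the package internals):

```r
# Hypothetical stopping rule: stop when the relative improvement of the
# observed-data log-likelihood falls below `threshold`.
converged <- function(ll_new, ll_old, threshold = 1e-6) {
  abs(ll_new - ll_old) / abs(ll_old) < threshold
}

converged(-511.2, -511.3)       # large relative change: keep iterating
converged(-511.0000001, -511.0) # negligible change: stop
```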

Estimation

snmoe <- emSNMoE(X = x, Y = y, K, p, q, n_tries, max_iter, 
                 threshold, verbose, verbose_IRLS)
## EM - SNMoE: Iteration: 1 | log-likelihood: -557.112378185117
## EM - SNMoE: Iteration: 2 | log-likelihood: -513.821018266112
## EM - SNMoE: Iteration: 3 | log-likelihood: -512.35798583579
## EM - SNMoE: Iteration: 4 | log-likelihood: -512.047220072863
## EM - SNMoE: Iteration: 5 | log-likelihood: -511.934665411367
## EM - SNMoE: Iteration: 6 | log-likelihood: -511.855329203015
## EM - SNMoE: Iteration: 7 | log-likelihood: -511.782480754674
## EM - SNMoE: Iteration: 8 | log-likelihood: -511.71375319197
## EM - SNMoE: Iteration: 9 | log-likelihood: -511.649955625955
## EM - SNMoE: Iteration: 10 | log-likelihood: -511.591561915854
## EM - SNMoE: Iteration: 11 | log-likelihood: -511.538534747817
## EM - SNMoE: Iteration: 12 | log-likelihood: -511.49053102614
## EM - SNMoE: Iteration: 13 | log-likelihood: -511.447093205087
## EM - SNMoE: Iteration: 14 | log-likelihood: -511.407800268346
## EM - SNMoE: Iteration: 15 | log-likelihood: -511.372246141678
## EM - SNMoE: Iteration: 16 | log-likelihood: -511.340018156194
## EM - SNMoE: Iteration: 17 | log-likelihood: -511.310768264826
## EM - SNMoE: Iteration: 18 | log-likelihood: -511.284172342321
## EM - SNMoE: Iteration: 19 | log-likelihood: -511.259965708918
## EM - SNMoE: Iteration: 20 | log-likelihood: -511.23790967496
## EM - SNMoE: Iteration: 21 | log-likelihood: -511.217773033645
## EM - SNMoE: Iteration: 22 | log-likelihood: -511.199383029599
## EM - SNMoE: Iteration: 23 | log-likelihood: -511.182538077038
## EM - SNMoE: Iteration: 24 | log-likelihood: -511.167100532942
## EM - SNMoE: Iteration: 25 | log-likelihood: -511.15291807044
## EM - SNMoE: Iteration: 26 | log-likelihood: -511.139868198265
## EM - SNMoE: Iteration: 27 | log-likelihood: -511.127840086375
## EM - SNMoE: Iteration: 28 | log-likelihood: -511.116730510591
## EM - SNMoE: Iteration: 29 | log-likelihood: -511.106490369239
## EM - SNMoE: Iteration: 30 | log-likelihood: -511.097051217936
## EM - SNMoE: Iteration: 31 | log-likelihood: -511.088301636514
## EM - SNMoE: Iteration: 32 | log-likelihood: -511.080192891387
## EM - SNMoE: Iteration: 33 | log-likelihood: -511.072678223232
## EM - SNMoE: Iteration: 34 | log-likelihood: -511.065700521708
## EM - SNMoE: Iteration: 35 | log-likelihood: -511.05918053927
## EM - SNMoE: Iteration: 36 | log-likelihood: -511.053114359
## EM - SNMoE: Iteration: 37 | log-likelihood: -511.047473783831
## EM - SNMoE: Iteration: 38 | log-likelihood: -511.042235860277
## EM - SNMoE: Iteration: 39 | log-likelihood: -511.037337158351
## EM - SNMoE: Iteration: 40 | log-likelihood: -511.032733385276
## EM - SNMoE: Iteration: 41 | log-likelihood: -511.028420855818
## EM - SNMoE: Iteration: 42 | log-likelihood: -511.024378866605
## EM - SNMoE: Iteration: 43 | log-likelihood: -511.02059413479
## EM - SNMoE: Iteration: 44 | log-likelihood: -511.017041404155
## EM - SNMoE: Iteration: 45 | log-likelihood: -511.013708256236
## EM - SNMoE: Iteration: 46 | log-likelihood: -511.010568018274
## EM - SNMoE: Iteration: 47 | log-likelihood: -511.007618454328
## EM - SNMoE: Iteration: 48 | log-likelihood: -511.004839696096
## EM - SNMoE: Iteration: 49 | log-likelihood: -511.002219609299
## EM - SNMoE: Iteration: 50 | log-likelihood: -510.999741061989
## EM - SNMoE: Iteration: 51 | log-likelihood: -510.997412434851
## EM - SNMoE: Iteration: 52 | log-likelihood: -510.995214850967
## EM - SNMoE: Iteration: 53 | log-likelihood: -510.99314054955
## EM - SNMoE: Iteration: 54 | log-likelihood: -510.991172097037
## EM - SNMoE: Iteration: 55 | log-likelihood: -510.98930320531
## EM - SNMoE: Iteration: 56 | log-likelihood: -510.987529165389
## EM - SNMoE: Iteration: 57 | log-likelihood: -510.985859257595
## EM - SNMoE: Iteration: 58 | log-likelihood: -510.984269831596
## EM - SNMoE: Iteration: 59 | log-likelihood: -510.982755404038
## EM - SNMoE: Iteration: 60 | log-likelihood: -510.981319620258
## EM - SNMoE: Iteration: 61 | log-likelihood: -510.979954334698
## EM - SNMoE: Iteration: 62 | log-likelihood: -510.978659854229
## EM - SNMoE: Iteration: 63 | log-likelihood: -510.977427362487
## EM - SNMoE: Iteration: 64 | log-likelihood: -510.976250289218
## EM - SNMoE: Iteration: 65 | log-likelihood: -510.975126712328
## EM - SNMoE: Iteration: 66 | log-likelihood: -510.974061011717
## EM - SNMoE: Iteration: 67 | log-likelihood: -510.97304409357
## EM - SNMoE: Iteration: 68 | log-likelihood: -510.972074312477
## EM - SNMoE: Iteration: 69 | log-likelihood: -510.971148075217
## EM - SNMoE: Iteration: 70 | log-likelihood: -510.970264085304
## EM - SNMoE: Iteration: 71 | log-likelihood: -510.969426837844
## EM - SNMoE: Iteration: 72 | log-likelihood: -510.968621524195
## EM - SNMoE: Iteration: 73 | log-likelihood: -510.967850897754
## EM - SNMoE: Iteration: 74 | log-likelihood: -510.967113310193
## EM - SNMoE: Iteration: 75 | log-likelihood: -510.966404291042
## EM - SNMoE: Iteration: 76 | log-likelihood: -510.965724424982
## EM - SNMoE: Iteration: 77 | log-likelihood: -510.965071111335
## EM - SNMoE: Iteration: 78 | log-likelihood: -510.964443871974
## EM - SNMoE: Iteration: 79 | log-likelihood: -510.963845589862
## EM - SNMoE: Iteration: 80 | log-likelihood: -510.963268720826
## EM - SNMoE: Iteration: 81 | log-likelihood: -510.962714397371
## EM - SNMoE: Iteration: 82 | log-likelihood: -510.962182607623
## EM - SNMoE: Iteration: 83 | log-likelihood: -510.961672175282

Summary

snmoe$summary()
## -----------------------------------------------
## Fitted Skew-Normal Mixture-of-Experts model
## -----------------------------------------------
## 
## SNMoE model with K = 2 experts:
## 
##  log-likelihood df       AIC       BIC       ICL
##       -510.9617 10 -520.9617 -542.0347 -542.0633
## 
## Clustering table (Number of observations in each expert):
## 
##   1   2 
## 249 251 
## 
## Regression coefficients:
## 
##     Beta(k = 1) Beta(k = 2)
## 1      1.037491   0.9575331
## X^1    2.725999  -2.7142375
## 
## Variances:
## 
##  Sigma2(k = 1) Sigma2(k = 2)
##      0.4483975     0.4615512
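
The information criteria printed above appear to follow the penalized log-likelihood convention AIC = L - df and BIC = L - (df/2) log(n); this is an inference from the printed values (with n = 500), not a statement about the package internals:

```r
# Recompute the reported criteria from the printed log-likelihood and df,
# assuming the penalized log-likelihood convention (higher is better).
loglik <- -510.9617
df <- 10  # number of free parameters
n <- 500  # sample size
aic <- loglik - df              # matches the reported -520.9617
bic <- loglik - df * log(n) / 2 # matches the reported -542.0347
round(c(AIC = aic, BIC = bic), 4)
```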

Plots

Mean curve

snmoe$plot(what = "meancurve")

Confidence regions

snmoe$plot(what = "confregions")

Clusters

snmoe$plot(what = "clusters")

Log-likelihood

snmoe$plot(what = "loglikelihood")

Application to a real dataset

Load data

data("tempanomalies")
x <- tempanomalies$Year
y <- tempanomalies$AnnualAnomaly

Set up SNMoE model parameters

K <- 2 # Number of regressors/experts
p <- 1 # Order of the polynomial regression (regressors/experts)
q <- 1 # Order of the logistic regression (gating network)

Set up EM parameters

n_tries <- 1
max_iter <- 1500
threshold <- 1e-6
verbose <- TRUE
verbose_IRLS <- FALSE

Estimation

snmoe <- emSNMoE(X = x, Y = y, K, p, q, n_tries, max_iter, 
                 threshold, verbose, verbose_IRLS)
## EM - SNMoE: Iteration: 1 | log-likelihood: 85.6629398704066
## EM - SNMoE: Iteration: 2 | log-likelihood: 87.6348814122629
## EM - SNMoE: Iteration: 3 | log-likelihood: 88.8206553596048
## EM - SNMoE: Iteration: 4 | log-likelihood: 89.077306290869
## EM - SNMoE: Iteration: 5 | log-likelihood: 89.2552761244491
## EM - SNMoE: Iteration: 6 | log-likelihood: 89.4655065074209
## EM - SNMoE: Iteration: 7 | log-likelihood: 89.6408151120597
## EM - SNMoE: Iteration: 8 | log-likelihood: 89.7453785611423
## EM - SNMoE: Iteration: 9 | log-likelihood: 89.8019158873457
## EM - SNMoE: Iteration: 10 | log-likelihood: 89.8326434504498
## EM - SNMoE: Iteration: 11 | log-likelihood: 89.8516173994911
## EM - SNMoE: Iteration: 12 | log-likelihood: 89.8659299759306
## EM - SNMoE: Iteration: 13 | log-likelihood: 89.8779244167782
## EM - SNMoE: Iteration: 14 | log-likelihood: 89.8883043391909
## EM - SNMoE: Iteration: 15 | log-likelihood: 89.8976280227498
## EM - SNMoE: Iteration: 16 | log-likelihood: 89.9062100560137
## EM - SNMoE: Iteration: 17 | log-likelihood: 89.9141562316792
## EM - SNMoE: Iteration: 18 | log-likelihood: 89.9214916569529
## EM - SNMoE: Iteration: 19 | log-likelihood: 89.9282195274107
## EM - SNMoE: Iteration: 20 | log-likelihood: 89.9343371109122
## EM - SNMoE: Iteration: 21 | log-likelihood: 89.9398450285261
## EM - SNMoE: Iteration: 22 | log-likelihood: 89.9448352313501
## EM - SNMoE: Iteration: 23 | log-likelihood: 89.9493572167746
## EM - SNMoE: Iteration: 24 | log-likelihood: 89.9532720811831
## EM - SNMoE: Iteration: 25 | log-likelihood: 89.9566088757451
## EM - SNMoE: Iteration: 26 | log-likelihood: 89.9593957123153
## EM - SNMoE: Iteration: 27 | log-likelihood: 89.9619306377121
## EM - SNMoE: Iteration: 28 | log-likelihood: 89.9631089137827
## EM - SNMoE: Iteration: 29 | log-likelihood: 89.9642699709444
## EM - SNMoE: Iteration: 30 | log-likelihood: 89.9651774658723
## EM - SNMoE: Iteration: 31 | log-likelihood: 89.9659639146333
## EM - SNMoE: Iteration: 32 | log-likelihood: 89.966531145056
## EM - SNMoE: Iteration: 33 | log-likelihood: 89.9673964689938
## EM - SNMoE: Iteration: 34 | log-likelihood: 89.9677098607269
## EM - SNMoE: Iteration: 35 | log-likelihood: 89.9679626635867
## EM - SNMoE: Iteration: 36 | log-likelihood: 89.9680677554424
## EM - SNMoE: Iteration: 37 | log-likelihood: 89.9684075475609
## EM - SNMoE: Iteration: 38 | log-likelihood: 89.9685237178347
## EM - SNMoE: Iteration: 39 | log-likelihood: 89.9685823566911

Summary

snmoe$summary()
## -----------------------------------------------
## Fitted Skew-Normal Mixture-of-Experts model
## -----------------------------------------------
## 
## SNMoE model with K = 2 experts:
## 
##  log-likelihood df      AIC      BIC      ICL
##        89.96858 10 79.96858 65.40531 65.29506
## 
## Clustering table (Number of observations in each expert):
## 
##  1  2 
## 70 66 
## 
## Regression coefficients:
## 
##       Beta(k = 1)  Beta(k = 2)
## 1   -14.063284318 -33.80514329
## X^1   0.007209172   0.01719828
## 
## Variances:
## 
##  Sigma2(k = 1) Sigma2(k = 2)
##     0.01416763    0.01744145

Plots

Mean curve

snmoe$plot(what = "meancurve")

Confidence regions

snmoe$plot(what = "confregions")

Clusters

snmoe$plot(what = "clusters")

Log-likelihood

snmoe$plot(what = "loglikelihood")