---
title: "A-quick-tour-of-NMoE"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{A-quick-tour-of-NMoE}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include=FALSE}
library(knitr)
knitr::opts_chunk$set(
	fig.align = "center",
	fig.height = 5.5,
	fig.width = 6,
	warning = FALSE,
	collapse = TRUE,
	dev.args = list(pointsize = 10),
	out.width = "90%",
	par = TRUE
)
knit_hooks$set(par = function(before, options, envir)
  { if (before && options$fig.show != "none") 
       par(family = "sans", mar = c(4.1,4.1,1.1,1.1), mgp = c(3,1,0), tcl = -0.5)
})
```

```{r, message = FALSE, echo = FALSE}
library(meteorits)
```

# Introduction

**NMoE** (Normal Mixtures-of-Experts) provides a flexible modelling framework 
for heterogenous data with Gaussian distributions. **NMoE** consists of a 
mixture of *K* Normal expert regressors network (of degree *p*) gated by a 
softmax gating network (of degree *q*) and is represented by:

* The gating network parameters `alpha`'s of the softmax net.
* The experts network parameters: The location parameters (regression 
coefficients) `beta`'s and variances `sigma2`'s.

It was written in R Markdown, using the [knitr](https://cran.r-project.org/package=knitr) 
package for production.

See `help(package="meteorits")` for further details and references provided by
`citation("meteorits")`.

# Application to a simulated dataset

## Generate sample

```{r}
n <- 500 # Size of the sample
alphak <- matrix(c(0, 8), ncol = 1) # Parameters of the gating network
betak <- matrix(c(0, -2.5, 0, 2.5), ncol = 2) # Regression coefficients of the experts
sigmak <- c(1, 1) # Standard deviations of the experts
x <- seq.int(from = -1, to = 1, length.out = n) # Inputs (predictors)

# Generate sample of size n
sample <- sampleUnivNMoE(alphak = alphak, betak = betak, sigmak = sigmak, x = x)
y <- sample$y
```

## Set up tMoE model parameters

```{r}
K <- 2 # Number of regressors/experts
p <- 1 # Order of the polynomial regression (regressors/experts)
q <- 1 # Order of the logistic regression (gating network)
```

## Set up EM parameters

```{r}
n_tries <- 1
max_iter <- 1500
threshold <- 1e-5
verbose <- TRUE
verbose_IRLS <- FALSE
```

## Estimation

```{r}
nmoe <- emNMoE(X = x, Y = y, K, p, q, n_tries, max_iter, 
               threshold, verbose, verbose_IRLS)
```

## Summary

```{r}
nmoe$summary()
```

## Plots

### Mean curve

```{r}
nmoe$plot(what = "meancurve")
```

### Confidence regions

```{r}
nmoe$plot(what = "confregions")
```

### Clusters

```{r}
nmoe$plot(what = "clusters")
```

### Log-likelihood

```{r}
nmoe$plot(what = "loglikelihood")
```

# Application to a real dataset

## Load data

```{r}
data("tempanomalies")
x <- tempanomalies$Year
y <- tempanomalies$AnnualAnomaly
```

## Set up tMoE model parameters

```{r}
K <- 2 # Number of regressors/experts
p <- 1 # Order of the polynomial regression (regressors/experts)
q <- 1 # Order of the logistic regression (gating network)
```

## Set up EM parameters

```{r}
n_tries <- 1
max_iter <- 1500
threshold <- 1e-5
verbose <- TRUE
verbose_IRLS <- FALSE
```

## Estimation

```{r}
nmoe <- emNMoE(X = x, Y = y, K, p, q, n_tries, max_iter, 
               threshold, verbose, verbose_IRLS)
```

## Summary

```{r}
nmoe$summary()
```

## Plots

### Mean curve

```{r}
nmoe$plot(what = "meancurve")
```

### Confidence regions

```{r}
nmoe$plot(what = "confregions")
```

### Clusters

```{r}
nmoe$plot(what = "clusters")
```

### Log-likelihood

```{r}
nmoe$plot(what = "loglikelihood")
```