--- title: "Getting Started with Modeltime Ensemble" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Getting Started with Modeltime Ensemble} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( # collapse = TRUE, message = FALSE, warning = FALSE, paged.print = FALSE, comment = "#>", fig.width = 8, fig.height = 4.5, fig.align = 'center', out.width='95%' ) ``` > Ensemble Algorithms for Time Series Forecasting with Modeltime A `modeltime` extension that that implements ___ensemble forecasting methods___ including model averaging, weighted averaging, and stacking. Let's go through a guided tour to kick the tires on `modeltime.ensemble`. ```{r, echo=F, out.width='100%', fig.align='center'} knitr::include_graphics("stacking.jpg") ``` # Time Series Ensemble Forecasting Example We'll perform the simplest type of forecasting: __Using a simple average of the forecasted models.__ Note that `modeltime.ensemble` has capabilities for more sophisticated model ensembling using: - __Weighted Averaging__ - __Stacking__ using an Elastic Net regression model (meta-learning) ## Libraries Load libraries to complete this short tutorial. ```r # Time Series ML library(tidymodels) library(modeltime) library(modeltime.ensemble) # Core library(tidyverse) library(timetk) interactive <- FALSE ``` ```{r setup, include=FALSE,echo=FALSE} # Time Series ML library(tidymodels) library(modeltime) library(modeltime.ensemble) # Core library(dplyr) library(timetk) interactive <- FALSE ``` ## Collect the Data We'll use the `m750` dataset that comes with `modeltime.ensemble`. We can visualize the dataset. ```{r} m750 %>% plot_time_series(date, value, .color_var = id, .interactive = interactive) ``` # Perform Train / Test Splitting We'll split into a training and testing set. ```{r} splits <- time_series_split(m750, assess = "2 years", cumulative = TRUE) splits %>% tk_time_series_cv_plan() %>% plot_time_series_cv_plan(date, value, .interactive = interactive) ``` # Modeling Once the data has been collected, we can move into modeling. ## Recipe We'll create a Feature Engineering Recipe that can be applied to the data to create features that machine learning models can key in on. This will be most useful for the __Elastic Net (Model 3).__ ```{r} recipe_spec <- recipe(value ~ date, training(splits)) %>% step_timeseries_signature(date) %>% step_rm(matches("(.iso$)|(.xts$)")) %>% step_normalize(matches("(index.num$)|(_year$)")) %>% step_dummy(all_nominal()) %>% step_fourier(date, K = 1, period = 12) recipe_spec %>% prep() %>% juice() ``` ## Model 1 - Auto ARIMA First, we'll make an ARIMA model using Auto ARIMA. ```{r} model_spec_arima <- arima_reg() %>% set_engine("auto_arima") wflw_fit_arima <- workflow() %>% add_model(model_spec_arima) %>% add_recipe(recipe_spec %>% step_rm(all_predictors(), -date)) %>% fit(training(splits)) ``` ## Model 2 - Prophet Next, we'll make a Prophet Model. ```{r} model_spec_prophet <- prophet_reg() %>% set_engine("prophet") wflw_fit_prophet <- workflow() %>% add_model(model_spec_prophet) %>% add_recipe(recipe_spec %>% step_rm(all_predictors(), -date)) %>% fit(training(splits)) ``` ## Model 3 - Elastic Net Third, we'll make an Elastic Net Model using `glmnet`. 
```{r}
model_spec_glmnet <- linear_reg(
  mixture = 0.9,
  penalty = 4.36e-6
) %>%
  set_engine("glmnet")

wflw_fit_glmnet <- workflow() %>%
  add_model(model_spec_glmnet) %>%
  add_recipe(recipe_spec %>% step_rm(date)) %>%
  fit(training(splits))
```

# Modeltime Workflow for Ensemble Forecasting

With the models created, we can create an __Ensemble Average Model using a simple mean of the submodel forecasts.__

## Step 1 - Create a Modeltime Table

Create a _Modeltime Table_ using the `modeltime` package.

```{r}
m750_models <- modeltime_table(
  wflw_fit_arima,
  wflw_fit_prophet,
  wflw_fit_glmnet
)

m750_models
```

## Step 2 - Make an Ensemble

Then use `ensemble_average()` to turn that Modeltime Table into a ___Modeltime Ensemble.___ This is a _fitted ensemble specification_ containing the three submodels and everything needed to forecast future data and to be refitted on new data sets. (A brief sketch of the weighted and stacking alternatives appears at the end of this tutorial.)

```{r}
ensemble_fit <- m750_models %>%
  ensemble_average(type = "mean")

ensemble_fit
```

## Step 3 - Forecast! (the Test Data)

To forecast, just follow the [Modeltime Workflow](https://business-science.github.io/modeltime/articles/getting-started-with-modeltime.html).

```{r}
# Calibration
calibration_tbl <- modeltime_table(
  ensemble_fit
) %>%
  modeltime_calibrate(testing(splits))

# Forecast vs Test Set
calibration_tbl %>%
  modeltime_forecast(
    new_data    = testing(splits),
    actual_data = m750
  ) %>%
  plot_modeltime_forecast(.interactive = interactive)
```

## Step 4 - Refit on Full Data & Forecast Future

Once we're satisfied with the ensemble model, we can `modeltime_refit()` it on the full data set and forecast forward, gaining confidence intervals in the process.

```{r}
refit_tbl <- calibration_tbl %>%
  modeltime_refit(m750)

refit_tbl %>%
  modeltime_forecast(
    h = "2 years",
    actual_data = m750
  ) %>%
  plot_modeltime_forecast(.interactive = interactive)
```

This was a very short tutorial on the simplest type of ensemble forecasting, but there's a lot more to learn.

## Take the High-Performance Forecasting Course

> Become the forecasting expert for your organization

[_High-Performance Time Series Course_](https://university.business-science.io/p/ds4b-203-r-high-performance-time-series-forecasting/)

### Time Series is Changing

Time series is changing. __Businesses now need 10,000+ time series forecasts every day.__ This is what I call a _High-Performance Time Series Forecasting System (HPTSF)_ - Accurate, Robust, and Scalable Forecasting.

__High-Performance Forecasting Systems will save companies by improving accuracy and scalability.__ Imagine what will happen to your career if you can provide your organization a "High-Performance Time Series Forecasting System" (HPTSF System).

### How to Learn High-Performance Time Series Forecasting

I teach how to build an HPTSF System in my [__High-Performance Time Series Forecasting Course__](https://university.business-science.io/p/ds4b-203-r-high-performance-time-series-forecasting). You will learn:

- __Time Series Machine Learning__ (cutting-edge) with `Modeltime` - 30+ Models (Prophet, ARIMA, XGBoost, Random Forest, & many more)
- __Deep Learning__ with `GluonTS` (Competition Winners)
- __Time Series Preprocessing__, Noise Reduction, & Anomaly Detection
- __Feature engineering__ using lagged variables & external regressors
- __Hyperparameter Tuning__
- __Time series cross-validation__
- __Ensembling__ Multiple Machine Learning & Univariate Modeling Techniques (Competition Winner)
- __Scalable Forecasting__ - Forecast 1000+ time series in parallel
- and more.

Become the Time Series Expert for your organization.


[Take the High-Performance Time Series Forecasting Course](https://university.business-science.io/p/ds4b-203-r-high-performance-time-series-forecasting)
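
## Going Further: Weighted & Stacked Ensembles

The mean ensemble above is the simplest option. As a brief sketch (not executed in this vignette), the same `m750_models` table can also feed `ensemble_weighted()`, which takes one loading per submodel. Stacking with `ensemble_model_spec()` goes a step further: it trains a meta-learner (e.g., an Elastic Net) on out-of-sample submodel predictions, which are typically generated first with `modeltime_fit_resamples()` from the `modeltime.resample` package. The loadings below are illustrative only, not tuned.

```r
# A weighted ensemble: higher loadings give a submodel more influence.
# By default the loadings are rescaled to proportions (scale_loadings = TRUE).
ensemble_fit_wt <- m750_models %>%
  ensemble_weighted(loadings = c(3, 3, 1))

ensemble_fit_wt
```

From here, a weighted (or stacked) ensemble drops into the same calibrate / forecast / refit workflow shown in Steps 3 and 4.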