Exogenous Variables

library(nixtlar)

1. Exogenous variables

Exogenous variables are external factors that provide additional information about the behavior of the target variable in time series forecasting. When they are correlated with the target, including them can improve forecast accuracy. Examples include weather data, economic indicators, holiday markers, and promotional sales.

TimeGPT allows you to include exogenous variables when generating a forecast. This vignette will show you how to include them. It assumes you have already set up your API key. If you haven’t done this, please read the Get Started vignette first.

2. Load data

For this vignette, we will use the electricity dataset with exogenous variables included in nixtlar. This dataset contains hourly prices from five different electricity markets, along with two exogenous variables related to the prices (Exogenous1 and Exogenous2) and seven binary variables (day_0 to day_6) indicating the day of the week.

df_exo_vars <- nixtlar::electricity_exo_vars
head(df_exo_vars)
#>   unique_id                  ds     y Exogenous1 Exogenous2 day_0 day_1 day_2
#> 1        BE 2016-10-22 00:00:00 70.00      49593      57253     0     0     0
#> 2        BE 2016-10-22 01:00:00 37.10      46073      51887     0     0     0
#> 3        BE 2016-10-22 02:00:00 37.10      44927      51896     0     0     0
#> 4        BE 2016-10-22 03:00:00 44.75      44483      48428     0     0     0
#> 5        BE 2016-10-22 04:00:00 37.10      44338      46721     0     0     0
#> 6        BE 2016-10-22 05:00:00 35.61      44504      46303     0     0     0
#>   day_3 day_4 day_5 day_6
#> 1     0     0     1     0
#> 2     0     0     1     0
#> 3     0     0     1     0
#> 4     0     0     1     0
#> 5     0     0     1     0
#> 6     0     0     1     0
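
The day_0 to day_6 columns look like one-hot day-of-week flags. In the rows shown, 2016-10-22 (a Saturday) has day_5 = 1, which is consistent with day_0 being Monday. A minimal base R sketch of building such flags for your own data, under that assumption (add_day_flags is a hypothetical helper, not part of nixtlar):

```r
# Hypothetical helper: add one-hot day-of-week columns day_0 ... day_6.
# Assumption: day_0 = Monday, inferred from the rows printed above.
add_day_flags <- function(df, time_col = "ds") {
  ts <- as.POSIXct(df[[time_col]], tz = "UTC")
  # POSIXlt$wday is 0 = Sunday ... 6 = Saturday; shift so that Monday = 0
  dow <- (as.POSIXlt(ts)$wday + 6) %% 7
  for (k in 0:6) {
    df[[paste0("day_", k)]] <- as.integer(dow == k)
  }
  df
}

# Toy input: a Saturday, a Sunday, and a Monday
toy <- data.frame(
  unique_id = "BE",
  ds = c("2016-10-22 00:00:00", "2016-10-23 00:00:00", "2016-10-24 00:00:00")
)
toy <- add_day_flags(toy)
print(toy)
```

If your timestamps are stored in a local time zone, pass that zone instead of "UTC" so that hours near midnight land on the intended day.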

When using exogenous variables, nixtlar distinguishes between historical and future exogenous variables:

  • Historical Exogenous Variables: These should be included in the input data immediately after the identifier (unique_id), timestamp (ds), and target (y) columns. If your dataset contains additional columns that are not exogenous variables, you must remove them before using any core functions of nixtlar.

  • Future Exogenous Variables: These are passed through the X_df argument and must cover the entire forecast horizon. This dataset must include the timestamp column and, if the data contains more than one series, the identifier column.
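
As a rough illustration of the first point, the sketch below drops a column that is not an exogenous variable and places the remaining columns in the expected order. The notes column and the exo_cols vector are hypothetical and exist only for this example.

```r
# Toy input with one column ("notes") that is not an exogenous variable
df_raw <- data.frame(
  unique_id  = c("BE", "BE"),
  ds         = c("2016-10-22 00:00:00", "2016-10-22 01:00:00"),
  notes      = c("checked", "checked"),  # hypothetical extra column
  y          = c(70.00, 37.10),
  Exogenous1 = c(49593, 46073)
)

# Name the exogenous columns explicitly so nothing else slips through
exo_cols <- c("Exogenous1")

# Keep identifier, timestamp, and target first, then the exogenous columns
df_clean <- df_raw[, c("unique_id", "ds", "y", exo_cols)]
print(names(df_clean))
```

Selecting columns by name, rather than dropping by position, keeps the result stable if the source data gains or reorders columns later.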

future_exo_vars <- nixtlar::electricity_future_exo_vars
head(future_exo_vars)
#>   unique_id                  ds Exogenous1 Exogenous2 day_0 day_1 day_2 day_3
#> 1        BE 2016-12-31 00:00:00      64108      70318     0     0     0     0
#> 2        BE 2016-12-31 01:00:00      62492      67898     0     0     0     0
#> 3        BE 2016-12-31 02:00:00      61571      68379     0     0     0     0
#> 4        BE 2016-12-31 03:00:00      60381      64972     0     0     0     0
#> 5        BE 2016-12-31 04:00:00      60298      62900     0     0     0     0
#> 6        BE 2016-12-31 05:00:00      60339      62364     0     0     0     0
#>   day_4 day_5 day_6
#> 1     0     1     0
#> 2     0     1     0
#> 3     0     1     0
#> 4     0     1     0
#> 5     0     1     0
#> 6     0     1     0
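
Before calling the API, it can help to confirm that the future data actually covers the horizon. The sketch below is one possible check, not something nixtlar provides: check_future_exo is a hypothetical helper, and it assumes every exogenous column is supplied for both history and future, as in this vignette.

```r
# Hypothetical helper: TRUE if X_df has the same exogenous columns as df
# and exactly h rows for every series in df.
check_future_exo <- function(df, X_df, h,
                             id_col = "unique_id", time_col = "ds",
                             target_col = "y") {
  hist_exo <- setdiff(names(df), c(id_col, time_col, target_col))
  fut_exo  <- setdiff(names(X_df), c(id_col, time_col))
  same_cols <- setequal(hist_exo, fut_exo)

  ids <- as.character(unique(df[[id_col]]))
  rows_per_id <- table(X_df[[id_col]])
  full_horizon <- all(ids %in% names(rows_per_id)) &&
    all(rows_per_id[ids] == h)

  same_cols && full_horizon
}

# Toy data: three historical hours and two future hours for one series
hist_df <- data.frame(
  unique_id  = "BE",
  ds         = c("2016-10-22 00:00:00", "2016-10-22 01:00:00",
                 "2016-10-22 02:00:00"),
  y          = c(70.00, 37.10, 37.10),
  Exogenous1 = c(49593, 46073, 44927)
)
future_df <- data.frame(
  unique_id  = "BE",
  ds         = c("2016-10-22 03:00:00", "2016-10-22 04:00:00"),
  Exogenous1 = c(44483, 44338)
)

ok_h2 <- check_future_exo(hist_df, future_df, h = 2)
ok_h3 <- check_future_exo(hist_df, future_df, h = 3)
print(c(h2 = ok_h2, h3 = ok_h3))
```

The check only counts rows and compares column names. It does not verify that the future timestamps start immediately after the last historical one.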

3. Forecast with exogenous variables

To generate a forecast with exogenous variables, use the nixtla_client_forecast function as you would for forecasts without them. The only difference is that you must add the future exogenous variables using the X_df argument.

fcst_exo_vars <- nixtla_client_forecast(df_exo_vars, h = 24, X_df = future_exo_vars)
#> Frequency chosen: h
#> Using historical exogenous features: Exogenous1, Exogenous2, day_0, day_1, day_2, day_3, day_4, day_5, day_6
#> Using future exogenous features: Exogenous1, Exogenous2, day_0, day_1, day_2, day_3, day_4, day_5, day_6
head(fcst_exo_vars)
#>   unique_id                  ds  TimeGPT
#> 1        BE 2016-12-31 00:00:00 74.54077
#> 2        BE 2016-12-31 01:00:00 43.34429
#> 3        BE 2016-12-31 02:00:00 44.42921
#> 4        BE 2016-12-31 03:00:00 38.09440
#> 5        BE 2016-12-31 04:00:00 37.38914
#> 6        BE 2016-12-31 05:00:00 39.08574

For comparison, we will also generate a forecast without exogenous variables.

df <- nixtlar::electricity # same dataset but without exogenous variables

fcst <- nixtla_client_forecast(df, h = 24)
#> Frequency chosen: h
head(fcst)
#>   unique_id                  ds  TimeGPT
#> 1        BE 2016-12-31 00:00:00 45.19045
#> 2        BE 2016-12-31 01:00:00 43.24445
#> 3        BE 2016-12-31 02:00:00 41.95839
#> 4        BE 2016-12-31 03:00:00 39.79649
#> 5        BE 2016-12-31 04:00:00 39.20454
#> 6        BE 2016-12-31 05:00:00 40.10878
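
To view the two forecasts side by side, one option is to merge them on the identifier and timestamp columns. The sketch below re-enters the six rows printed above so that it runs without an API call. With the real objects you would merge fcst_exo_vars and fcst directly.

```r
# First six forecast rows, copied from the printed output above
ds <- sprintf("2016-12-31 %02d:00:00", 0:5)
with_exo <- data.frame(
  unique_id = "BE", ds = ds,
  TimeGPT = c(74.54077, 43.34429, 44.42921, 38.09440, 37.38914, 39.08574)
)
without_exo <- data.frame(
  unique_id = "BE", ds = ds,
  TimeGPT = c(45.19045, 43.24445, 41.95839, 39.79649, 39.20454, 40.10878)
)

# Join on series and timestamp; the suffixes tell the two forecasts apart
cmp <- merge(with_exo, without_exo,
             by = c("unique_id", "ds"),
             suffixes = c("_exo", "_base"))
cmp$diff <- cmp$TimeGPT_exo - cmp$TimeGPT_base
print(cmp)
```

The difference column shows where the two forecasts diverge. It does not indicate which one is more accurate, which would require comparing both against held-out actual values.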

4. Plot TimeGPT forecast

nixtlar includes a function, nixtla_client_plot, to plot the historical data and any output from nixtla_client_forecast, nixtla_client_historic, nixtla_client_detect_anomalies, and nixtla_client_cross_validation. If you have long series, you can use max_insample_length to plot only the last N historical values (the forecast will always be plotted in full).

nixtla_client_plot(df_exo_vars, fcst_exo_vars, max_insample_length = 500)
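
Conceptually, plotting only the last N historical values amounts to keeping the final N rows of each series. The sketch below illustrates that idea in base R. It is not nixtlar's internal implementation; keep_last_n is a hypothetical helper, and the toy data uses integers as stand-ins for timestamps.

```r
# Hypothetical helper: keep the last n rows of each series.
# Assumes rows are already sorted by time within each series.
keep_last_n <- function(df, n, id_col = "unique_id") {
  parts <- lapply(split(df, df[[id_col]]), function(g) tail(g, n))
  out <- do.call(rbind, parts)
  rownames(out) <- NULL
  out
}

# Toy data: two series with five observations each
toy <- data.frame(
  unique_id = rep(c("BE", "DE"), each = 5),
  ds        = rep(1:5, times = 2),  # stand-in for timestamps
  y         = 1:10
)

trimmed <- keep_last_n(toy, n = 3)
print(trimmed)
```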