All models in the tsmarch package assume as input a zero-mean time series of returns, which means that conditional mean filtration should be performed outside the package. However, all methods provide an argument (cond_mean) for passing the already estimated conditional mean, which takes care of centering the conditional distribution, particularly during the prediction step. Alternatively, the user can take the simulated distribution of innovations and use it as an input to the conditional mean dynamics simulation. Details of how to work with conditional mean dynamics are available in another vignette. For this demo, we simply center the data by their mean.
suppressMessages(library(tsmarch))
suppressMessages(library(tstests))
suppressMessages(library(xts))
suppressMessages(library(shape))
suppressMessages(library(tsgarch))
suppressMessages(library(tsdistributions))
data(globalindices)
Sys.setenv(TZ = "UTC")
train_set <- 1:1600
test_set <- 1601:1698
series <- 1:5
y <- as.xts(globalindices[, series])
train <- y[train_set,]
mu <- colMeans(train)
train <- sweep(train, 2, mu, "-")
test <- y[test_set,]
test <- sweep(test, 2, mu, "-")
oldpar <- par(mfrow = c(1,1))
gogarch_mod <- gogarch_modelspec(train, distribution = "nig", model = "gjrgarch", components = 4) |> estimate()
summary(gogarch_mod)
#> GOGARCH Model Summary
#> Factor Dynamics: GJRGARCH | MANIG
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> [IC_1]:omega 6.392e-02 9.837e-03 6.498 8.16e-11 ***
#> [IC_1]:alpha1 4.771e-23 2.484e-02 0.000 1.000000
#> [IC_1]:gamma1 2.598e-01 3.721e-02 6.982 2.92e-12 ***
#> [IC_1]:beta1 7.887e-01 2.459e-02 32.079 < 2e-16 ***
#> [IC_1]:skew -4.798e-01 6.345e-02 -7.562 3.97e-14 ***
#> [IC_1]:shape 4.629e+00 1.350e+00 3.429 0.000605 ***
#> [IC_1]:persistence 9.078e-01 1.762e-02 51.512 < 2e-16 ***
#> [IC_2]:omega 7.512e-03 3.586e-03 2.095 0.036167 *
#> [IC_2]:alpha1 6.850e-02 1.558e-02 4.398 1.09e-05 ***
#> [IC_2]:gamma1 -2.482e-02 1.863e-02 -1.333 0.182653
#> [IC_2]:beta1 9.371e-01 1.181e-02 79.315 < 2e-16 ***
#> [IC_2]:skew -3.236e-02 4.635e-02 -0.698 0.485075
#> [IC_2]:shape 1.750e+00 3.336e-01 5.245 1.57e-07 ***
#> [IC_2]:persistence 9.933e-01 5.386e-03 184.432 < 2e-16 ***
#> [IC_3]:omega 3.483e-02 1.020e-02 3.416 0.000635 ***
#> [IC_3]:alpha1 1.061e-01 2.289e-02 4.634 3.58e-06 ***
#> [IC_3]:gamma1 -2.777e-02 2.561e-02 -1.085 0.278043
#> [IC_3]:beta1 8.716e-01 2.145e-02 40.641 < 2e-16 ***
#> [IC_3]:skew 7.628e-02 4.618e-02 1.652 0.098524 .
#> [IC_3]:shape 1.378e+00 2.488e-01 5.536 3.10e-08 ***
#> [IC_3]:persistence 9.635e-01 1.668e-02 57.766 < 2e-16 ***
#> [IC_4]:omega 4.182e-02 1.411e-02 2.963 0.003044 **
#> [IC_4]:alpha1 9.781e-02 2.106e-02 4.645 3.41e-06 ***
#> [IC_4]:gamma1 -6.445e-02 2.753e-02 -2.341 0.019249 *
#> [IC_4]:beta1 8.911e-01 2.407e-02 37.015 < 2e-16 ***
#> [IC_4]:skew 1.856e-01 5.250e-02 3.536 0.000407 ***
#> [IC_4]:shape 3.105e+00 7.132e-01 4.353 1.34e-05 ***
#> [IC_4]:persistence 9.554e-01 1.710e-02 55.862 < 2e-16 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Rotation Matrix (U):
#> 0.774295 -0.388656 -0.496869 -0.050357
#> 0.161894 0.883841 -0.438877 -0.001827
#> 0.610818 0.259545 0.747973 0.008627
#> 0.034062 -0.020222 -0.032318 0.998692
#>
#> N: 1600 | Series: 5 | Factors: 4
#> LogLik: 8246.79, AIC: -16413.6, BIC: -16198.5
In the code snippet above, we used dimensionality reduction in the whitening stage via the components argument. The returned object is of class gogarch.estimate, from which we can proceed to further analyze the series:
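For example, we can quickly inspect the most recent conditional correlation matrix implied by the estimated model (a minimal illustration; the co-moment methods used here are discussed in more detail further below):

# extract the N x N x T conditional correlation array and inspect the last matrix
R <- tscor(gogarch_mod)
round(R[, , dim(R)[3]], 3)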
Online filtering of new data with the existing estimated model can be achieved via the tsfilter method, which returns an object of class gogarch.estimate updated with the new information. This allows us to use the existing estimated model to filter newly arrived information without having to re-estimate. Since the returned object is of the same class as the estimated object, we can then use the existing methods to analyze the new data. The next code snippet shows how to perform 1-step ahead rolling predictions and generate an equal-weighted portfolio value at risk at the 10% quantile.
h <- 98
w <- rep(1/5, 5)
gogarch_filter_mod <- gogarch_mod
var_value <- rep(0, 98)
actual <- as.numeric(coredata(test) %*% w)
# first prediction without filtering update
var_value[1] <- predict(gogarch_mod, h = 1, nsim = 5000, seed = 100) |> value_at_risk(weights = w, alpha = 0.1)
for (i in 2:h) {
gogarch_filter_mod <- tsfilter(gogarch_filter_mod, y = test[i - 1,])
var_value[i] <- predict(gogarch_filter_mod, h = 1, nsim = 5000, seed = 100) |> value_at_risk(weights = w, alpha = 0.1)
}
At time T+0 the initial prediction is made for T+1, and then the model is updated with new information using the tsfilter method, bringing the model's information set up to time T+1, from which the prediction for time T+2 is made, and so forth. This is equivalent to a rolling 1-step ahead prediction without re-estimation.
Having obtained the predicted value at risk from the simulated
distribution, we can then use the var_cp_test
function from
the tstests
package to evaluate the accuracy of the calculation:
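The table below was produced by a call of the following form (the same pattern is used for the DCC and Copula models later in this demo), comparing the realized equal-weighted portfolio returns (actual) against the predicted quantiles (var_value):

as_flextable(var_cp_test(actual, var_value, alpha = 0.1), include.decision = TRUE)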
Test | DoF | Statistic | Pr(>Chisq) | Decision(5%)
---|---|---|---|---
Kupiec (UC) | 1 | 1.8739 | 0.1710 | Fail to Reject H0
CP (CCI) | 1 | 0.8753 | 0.3495 | Fail to Reject H0
CP (CC) | 2 | 2.7493 | 0.2529 | Fail to Reject H0
CP (D) | 1 | 0.1147 | 0.7348 | Fail to Reject H0

Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Coverage: 0.1, Obs: 98, Failures: 6, E[Failures]: 9
Hypothesis(H0): Unconditional(UC), Independent(CCI), Joint Coverage(CC) and Duration(D)
There are 3 methods related to the conditional co-moments of the model: tscov (and tscor) returns the NxNxT conditional covariance (correlation) matrix, tscoskew returns the NxNxNxT conditional co-skewness matrix and tscokurt returns the NxNxNxNxT conditional co-kurtosis matrix. These methods benefit from the use of multiple threads, which can be set via the RcppParallel::setThreadOptions function (though care should be taken regarding the availability of RAM).
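For example, the number of threads can be capped before calling these methods (the value chosen here is arbitrary and should be adjusted to the available hardware and memory):

# limit the number of threads used by the parallelized co-moment calculations
RcppParallel::setThreadOptions(numThreads = 2)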
V <- tscov(gogarch_mod)
S <- tscoskew(gogarch_mod, standardized = TRUE, folded = TRUE)
K <- tscokurt(gogarch_mod, standardized = TRUE, folded = TRUE)
Notice that the standardized and folded arguments are used to return the standardized co-moments in either folded or unfolded form. The unfolded form, which represents the flattened tensor of the co-moments, is useful for calculating the portfolio weighted moments via the Kronecker product (a short sketch of this calculation is given after the portfolio moments example below). These methods are available for both estimated and predicted/simulated objects. To illustrate this, we also generate a 25 step ahead prediction, extract the co-kurtosis distribution, and then combine the estimated and predicted values into a tsmodel.predict object, for which special purpose plots are available from the tsmethods package.
p <- predict(gogarch_mod, h = 25, nsim = 1000, seed = 100)
K_p <- tscokurt(p, standardized = TRUE, folded = TRUE, distribution = TRUE, index = 1:25)
K_p <- t(K_p[1,1,1,1,,])
colnames(K_p) <- as.character(p$forc_dates)
class(K_p) <- "tsmodel.distribution"
L <- list(original_series = xts(K[1,1,1,1,], as.Date(gogarch_mod$spec$target$index)), distribution = K_p)
class(L) <- "tsmodel.predict"
par(mar = c(2,2,1.1,1), pty = "m", cex.axis = 0.8)
plot(L, gradient_color = "orange", interval_color = "cadetblue", median_color = "black", median_type = 2, median_width = 1,
n_original = 100, main = "Kurtosis [AEX]", xlab = "", cex.main = 0.8)
To calculate the weighted portfolio moments we can use the
tsaggregate
method and similarly form an object for
plotting, this time for the portfolio skewness.
port_moments_estimate <- tsaggregate(gogarch_mod, weights = w)
port_moments_predict <- tsaggregate(p, weights = w, distribution = TRUE)
L <- list(original_series = port_moments_estimate$skewness, distribution = port_moments_predict$skewness)
class(L) <- "tsmodel.predict"
par(mar = c(2,2,1.1,1), pty = "m", cex.axis = 0.8)
plot(L, gradient_color = "orange", interval_color = "cadetblue", median_color = "black", median_type = 2, median_width = 1,
n_original = 100, main = "Portfolio Skewness", xlab = "", cex.main = 0.8)
The main vignette discusses in detail the convolution approach for generating a weighted portfolio distribution from which the density and quantile functions can then be approximated. A short example is provided below where we evaluate the FFT approximation against the exact moments for a 98-step ahead prediction.
p <- predict(gogarch_mod, h = 98, nsim = 1000)
port_f_moments <- do.call(cbind, tsaggregate(p, weights = w, distribution = FALSE))
pconv <- tsconvolve(p, weights = w, fft_support = NULL, fft_step = 0.0001, fft_by = 0.00001, distribution = FALSE)
p_c_moments <- matrix(0, ncol = 4, nrow = 98)
for (i in 1:98) {
df <- dfft(pconv, index = i)
mu <- pconv$mu[i]
f_2 <- function(x) (x - mu)^2 * df(x)
f_3 <- function(x) (x - mu)^3 * df(x)
f_4 <- function(x) (x - mu)^4 * df(x)
sprt <- attr(pconv$y[[i]],"support")
p_c_moments[i,2] <- sqrt(integrate(f_2, sprt[1], sprt[2], abs.tol = 1e-8, subdivisions = 500)$value)
p_c_moments[i,3] <- integrate(f_3, sprt[1], sprt[2], abs.tol = 1e-8, subdivisions = 500)$value/p_c_moments[i,2]^3
p_c_moments[i,4] <- integrate(f_4, sprt[1], sprt[2], abs.tol = 1e-8, subdivisions = 500)$value/p_c_moments[i,2]^4
}
par(mar = c(2,2,2,2), mfrow = c(3,1), pty = "m")
matplot(cbind(as.numeric(port_f_moments[,2]), p_c_moments[,2]), type = "l", lty = c(1,3), lwd = c(2, 2), col = c("grey","tomato1"), main = "Sigma", xaxt = "n")
grid()
matplot(cbind(as.numeric(port_f_moments[,3]), p_c_moments[,3]), type = "l", lty = c(1,3), lwd = c(2, 2), col = c("grey","tomato1"), main = "Skewness", xaxt = "n")
grid()
matplot(cbind(as.numeric(port_f_moments[,4]), p_c_moments[,4]), type = "l", lty = c(1,3), lwd = c(2, 2), col = c("grey","tomato1"), main = "Kurtosis")
grid()
This provides a correctness check of the FFT approximation to the inversion of the characteristic function, since we observe that the approximated and exact moments are identical. However, in certain cases care must be taken to calibrate the step size as well as the integration tolerance levels in order to achieve the desired accuracy.
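For instance, one could re-run the convolution on a coarser grid (the values below are purely illustrative) and compare a quantile from each grid to gauge the sensitivity to the step size:

# coarser grid for the FFT inversion (illustrative values only)
pconv_coarse <- tsconvolve(p, weights = w, fft_support = NULL, fft_step = 0.001, fft_by = 0.0001, distribution = FALSE)
# compare the 5% quantile at the first horizon under the two grids
c(fine = qfft(pconv, index = 1)(0.05), coarse = qfft(pconv_coarse, index = 1)(0.05))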
The next plot shows how to generate a value at risk surface for the prediction period. It should be noted that the sample quantiles from the simulated distribution will not exactly match the FFT approximation, since the former is based on simulation whereas the latter is an analytic approximation to the weighted density.
p <- predict(gogarch_mod, h = 98, nsim = 5000)
pconv <- tsconvolve(p, weights = w, fft_support = NULL, fft_step = 0.0001, fft_by = 0.00001, distribution = FALSE)
q_seq <- seq(0.025, 0.975, by = 0.025)
q_surface = matrix(NA, ncol = length(q_seq), nrow = 98)
for (i in 1:98) {
q_surface[i,] <- qfft(pconv, index = i)(q_seq)
}
par(mar = c(1.8,1.8,1.1,1), pty = "m")
col_palette <- drapecol(q_surface, col = femmecol(100), NAcol = "white")
persp(x = 1:98, y = q_seq, z = q_surface, col = col_palette, theta = 45,
phi = 15, expand = 0.5, ltheta = 25, shade = 0.25,
ticktype = "simple", xlab = "Time", ylab = "Quantile",
zlab = "VaR", cex.axis = 0.8, main = "Value at Risk Prediction Surface")
We can also generate the probability integral transform of the weighted distribution which can be used in the expected shortfall test:
pit_value <- pit(pconv, actual)
as_flextable(shortfall_de_test(pit_value, alpha = 0.1), include.decision = TRUE)
Test | Statistic | Pr(>|t|) | Signif. | Decision(5%)
---|---|---|---|---
DE (U) | 0.0234 | 0.1342 |  | Fail to Reject H0
DE (C) | 4.1027 | 0.0428 | * | Reject H0

Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Coverage: 0.1, Obs: 98
Hypothesis(H0): Unconditional(U) and Independent(C)
In the DCC model we need to pre-estimate the univariate dynamics before passing them to the DCC specification as a multi_garch class object. With the exception of the Copula model, the marginal distributions of the univariate GARCH models should always be Normal, irrespective of whether a multivariate Normal or Student distribution is chosen for the DCC model. No checks are performed for this and it is up to the user to ensure that this is the case. Additionally, to allow calculation of the partitioned Hessian, the argument keep_tmb should be set to TRUE in the estimation routine of the univariate models.
garch_model <- lapply(1:5, function(i) {
garch_modelspec(train[,i], model = "gjrgarch") |> estimate(keep_tmb = TRUE)
})
garch_model <- to_multi_estimate(garch_model)
names(garch_model) <- colnames(train)
Once the univariate models have been estimated and converted to the appropriate class, we can then pass the object to the DCC model for estimation:
dcc_mod <- dcc_modelspec(garch_model, dynamics = "adcc", distribution = "mvt") |> estimate()
dcc_mod |> summary()
#> DCC Model Summary
#> DCC Dynamics: ADCC | MVT
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> alpha_1 0.010268 0.002646 3.880 0.000104 ***
#> gamma_1 0.007613 0.003725 2.044 0.040979 *
#> beta_1 0.972286 0.005412 179.666 < 2e-16 ***
#> shape 9.188257 0.750569 12.242 < 2e-16 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> N: 1600 | Series: 5
#> LogLik: 19704.9, AIC: -39351.8, BIC: -39195.9
We chose adcc dynamics for this demo, which allow an asymmetric reaction to positive and negative shocks and can be nicely visualized using a news impact correlation surface plot.
We perform a similar exercise as in the GOGARCH filtering section:
h <- 98
w <- rep(1/5, 5)
dcc_filter_mod <- dcc_mod
var_value <- rep(0, 98)
actual <- as.numeric(coredata(test) %*% w)
# first prediction without filtering update
var_value[1] <- predict(dcc_mod, h = 1, nsim = 5000, seed = 100) |> value_at_risk(weights = w, alpha = 0.1)
for (i in 2:h) {
dcc_filter_mod <- tsfilter(dcc_filter_mod, y = test[i - 1,])
var_value[i] <- predict(dcc_filter_mod, h = 1, nsim = 5000, seed = 100) |> value_at_risk(weights = w, alpha = 0.1)
}
as_flextable(var_cp_test(actual, var_value, alpha = 0.1), include.decision = TRUE)
Test | DoF | Statistic | Pr(>Chisq) | Decision(5%)
---|---|---|---|---
Kupiec (UC) | 1 | 0.3894 | 0.5326 | Fail to Reject H0
CP (CCI) | 1 | 0.1847 | 0.6674 | Fail to Reject H0
CP (CC) | 2 | 0.5741 | 0.7505 | Fail to Reject H0
CP (D) | 1 | 0.0091 | 0.9240 | Fail to Reject H0

Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Coverage: 0.1, Obs: 98, Failures: 8, E[Failures]: 9
Hypothesis(H0): Unconditional(UC), Independent(CCI), Joint Coverage(CC) and Duration(D)
There are 2 ways to obtain the weighted portfolio distribution for the DCC model:

1. analytically, using the weighted conditional mean and covariance together with the quantile function of the model's multivariate distribution (which is closed under linear combinations);
2. by simulation (using the tsaggregate method).

We illustrate both approaches in a quick prediction exercise:
p <- predict(dcc_mod, h = 98, nsim = 5000)
simulated_aggregate <- tsaggregate(p, weights = w, distribution = TRUE)
# there are no conditional mean dynamics, so the simulated means only reflect the uncertainty around zero
weighted_mu <- t(apply(p$mu, 1, rowMeans)) %*% w
H <- tscov(p, distribution = FALSE)
weighted_sigma <- sqrt(sapply(1:98, function(i) w %*% H[,,i] %*% w))
shape <- unname(coef(dcc_mod)["shape"])
simulated_var <- unname(apply(simulated_aggregate$mu, 2, quantile, 0.05))
analytic_var <- qstd(0.05, mu = weighted_mu, sigma = weighted_sigma, shape = shape)
par(mar = c(2,2,1.1,1), pty = "m", cex.axis = 0.8, cex.main = 0.8)
plot(as.Date(p$forc_dates), simulated_var, type = "l", ylab = "", xlab = "", main = "Value at Risk [5%]", ylim = c(-0.039, -0.033))
lines(as.Date(p$forc_dates), analytic_var, col = 2, lty = 2)
legend("topright", c("Simulated","Analytic"), col = 1:2, lty = 1:2, bty = "n")
Note that the DCC dynamics do not have a closed form solution for the multi-step ahead forecast. Approximations have been used in the literature, but in the tsmarch package we have instead opted for a simulation approach. This means that when calling the tscov method on a predicted object, it will either return the full simulated array of covariance matrices, or their average across each horizon when the distribution argument is set to FALSE.
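A quick way to see the difference between the two forms is to compare their dimensions (the exact layout of the simulated array is an assumption here; consult the method documentation):

# full simulated covariance array versus its average across the simulated draws
H_sim <- tscov(p, distribution = TRUE)
H_avg <- tscov(p, distribution = FALSE)
dim(H_sim)
dim(H_avg)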
The Copula model allows different distributions for the margins, providing an additional layer of flexibility. The next sections use the same type of code examples as in the DCC model; once a model is estimated, the methods applied to it and all subsequent methods are the same as in the DCC and GOGARCH models.
distributions <- c(rep("jsu",4), rep("sstd",1))
garch_model <- lapply(1:5, function(i) {
garch_modelspec(train[,i], model = "gjrgarch", distribution = distributions[i]) |> estimate(keep_tmb = TRUE)
})
garch_model <- to_multi_estimate(garch_model)
names(garch_model) <- colnames(train)
cgarch_mod <- cgarch_modelspec(garch_model, dynamics = "adcc",
transformation = "parametric",
copula = "mvt") |> estimate()
cgarch_mod |> summary()
#> CGARCH Model Summary
#> Copula Dynamics: parametric | MVT | ADCC
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> alpha_1 0.009024 0.001942 4.645 3.39e-06 ***
#> gamma_1 0.010816 0.003077 3.515 0.00044 ***
#> beta_1 0.977254 0.004038 242.016 < 2e-16 ***
#> shape 16.912168 2.712104 6.236 4.49e-10 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> N: 1600 | Series: 5
#> LogLik: 19792.3, AIC: -39506.7, BIC: -39297
h <- 98
w <- rep(1/5, 5)
cgarch_filter_mod <- cgarch_mod
var_value <- rep(0, 98)
actual <- as.numeric(coredata(test) %*% w)
# first prediction without filtering update
var_value[1] <- predict(cgarch_mod, h = 1, nsim = 5000, seed = 100) |> value_at_risk(weights = w, alpha = 0.1)
for (i in 2:h) {
cgarch_filter_mod <- tsfilter(cgarch_filter_mod, y = test[i - 1,])
var_value[i] <- predict(cgarch_filter_mod, h = 1, nsim = 5000, seed = 100) |> value_at_risk(weights = w, alpha = 0.1)
}
as_flextable(var_cp_test(actual, var_value, alpha = 0.1), include.decision = TRUE)
Test | DoF | Statistic | Pr(>Chisq) | Decision(5%)
---|---|---|---|---
Kupiec (UC) | 1 | 0.3894 | 0.5326 | Fail to Reject H0
CP (CCI) | 1 | 0.1847 | 0.6674 | Fail to Reject H0
CP (CC) | 2 | 0.5741 | 0.7505 | Fail to Reject H0
CP (D) | 1 | 0.0091 | 0.9240 | Fail to Reject H0

Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Coverage: 0.1, Obs: 98, Failures: 8, E[Failures]: 9
Hypothesis(H0): Unconditional(UC), Independent(CCI), Joint Coverage(CC) and Duration(D)
For the Copula model we rely on the simulated distribution for all calculations:
p <- predict(cgarch_mod, h = 98, nsim = 5000)
simulated_aggregate <- tsaggregate(p, weights = w, distribution = TRUE)
simulated_var <- unname(apply(simulated_aggregate$mu, 2, quantile, 0.05))
par(mar = c(2,2,1.1,1), pty = "m", cex.axis = 0.8, cex.main = 0.8)
plot(as.Date(p$forc_dates), simulated_var, type = "l", ylab = "", xlab = "", main = "Value at Risk [5%]")
We briefly address in this section the question of how to handle conditional mean dynamics. There are effectively 2 approaches available to the user:

1. Pass the estimated conditional mean via the cond_mean argument at every stage of the analysis (i.e. estimation, prediction, filtering, simulation etc.), and the underlying code will take care of re-centering the simulated distributions.
2. Do not pass cond_mean, but instead take the predicted simulated zero-mean correlated residuals and use them as inputs to the prediction method of the conditional mean dynamics.

Whilst the second method may be preferred, not many packages have either an option for generating a simulated predictive distribution or for taking as input a pre-created matrix of correlated residuals. In the next section we illustrate both approaches.
We first estimate the conditional mean dynamics using an AR(6) model, extract the residuals and fitted values and then make a 25 step ahead prediction.
arima_model <- lapply(1:5, function(i){
arima(train[,i], order = c(6,0,0), method = "ML")
})
.residuals <- do.call(cbind, lapply(arima_model, function(x) as.numeric(residuals(x))))
colnames(.residuals) <- colnames(train)
.residuals <- xts(.residuals, index(train))
.fitted <- train - .residuals
.predicted <- do.call(cbind, lapply(1:5, function(i){
as.numeric(predict(arima_model[[i]], n.ahead = 25)$pred)
}))
colnames(.predicted) <- colnames(train)
We then pass the .fitted values to the estimation method and the .predicted values to the prediction method. Technically, the estimation method does not require this if we are only interested in prediction, since the fitted values will not be used. All 3 models in the package handle the conditional mean inputs in the same way, ensuring that the output generated from methods which depend on it is correctly reflected. For this example we will use the DCC model:
dcc_mod_mean <- dcc_modelspec(garch_model, dynamics = "adcc", distribution = "mvt", cond_mean = .fitted) |> estimate()
all.equal(fitted(dcc_mod_mean), .fitted)
#> [1] TRUE
As expected, the fitted method now picks up the cond_mean passed to the model.
p <- predict(dcc_mod_mean, h = 25, cond_mean = .predicted, nsim = 5000, seed = 100)
simulated_mean <- as.matrix(t(apply(p$mu, 1, rowMeans)))
colnames(simulated_mean) <- colnames(train)
all.equal(simulated_mean, .predicted)
#> [1] TRUE
The mean of the simulated predictive distribution for each series and horizon is now the same as the matrix passed (.predicted), as a result of the re-centering operation that is automatically carried out.
In the injection approach, we pass the simulated correlated innovations from the DCC model to the ARIMA simulation, making sure to also pass enough start-up innovations to produce a forward simulation equivalent to a simulated forecast.
res <- p$mu
arima_pred <- lapply(1:5, function(i){
# we eliminate the mean prediction from the simulated predictive distribution
# to obtain the zero mean innovations
res_i <- scale(t(res[,i,]), scale = FALSE, center = TRUE)
sim_p <- do.call(rbind, lapply(1:5000, function(j) {
arima.sim(model = list(ar = coef(arima_model[[i]])[1:6]), n.start = 20, n = 25, innov = res_i[j,], start.innov = as.numeric(tail(.residuals[,i],20))) |> as.numeric() + coef(arima_model[[i]])[7]
}))
return(sim_p)
})
arima_pred <- array(unlist(arima_pred), dim = c(5000, 25, 5))
arima_pred <- aperm(arima_pred, c(2, 3, 1))
simulated_mean <- as.matrix(t(apply(arima_pred, 1, rowMeans)))
colnames(simulated_mean) <- colnames(train)
par(mfrow = c(3,2), mar = c(2,2,2,2))
for (i in 1:5) {
matplot(cbind(simulated_mean[,i], .predicted[,i]), type = "l", lty = c(1,3), lwd = c(2, 2), col = c("grey","tomato1"), ylab = "", xaxt = "n")
grid()
}
par(oldpar)
As expected, the simulated mean is no different from the prediction mean of the ARIMA model.
We visually compare the 2 methods by creating a couple of overlaid distribution plots:
i <- 1
sim_1a <- t(p$mu[,i,])
sim_1b <- t(arima_pred[,i,])
colnames(sim_1a) <- colnames(sim_1b) <- as.character(p$forc_dates)
class(sim_1a) <- class(sim_1b) <- "tsmodel.distribution"
par(mar = c(2,2,1.1,1), pty = "m", cex.axis = 0.8)
plot(sim_1a, gradient_color = "whitesmoke", interval_color = "orange", median_color = "orange")
plot(sim_1b, add = TRUE, gradient_color = "whitesmoke", interval_color = "steelblue", median_color = "steelblue", median_type = 2)
Next we visually inspect the pairwise correlation between the first two series, as implied by each of the two approaches:
j <- 2
sim_2a <- t(p$mu[,j,])
sim_2b <- t(arima_pred[,j,])
colnames(sim_2a) <- colnames(sim_2b) <- as.character(p$forc_dates)
class(sim_2a) <- class(sim_2b) <- "tsmodel.distribution"
C_a <- sapply(1:25, function(i) cor(sim_1a[,i], sim_2a[,i]))
C_b <- sapply(1:25, function(i) cor(sim_1b[,i], sim_2b[,i]))
par(mar = c(2,2,1.1,1), pty = "m", cex.axis = 0.8, cex.main = 0.8)
matplot(cbind(C_a, C_b), type = "l", lty = c(1,3), lwd = c(2, 2), col = c("grey","tomato1"), ylab = "", main = "Pairwise Correlation")
grid()
As can be observed, both methods produce almost identical output.