The bootCT package

VAR / VECM models and the conditional ARDL specification

The starting point of the method on which the bootCT package is based is the (K + 1)-dimensional VAR(p) model

$$\mathbf{A}(L)(\mathbf{z}_t-\boldsymbol{\mu}-\boldsymbol{\eta} t)=\boldsymbol{\varepsilon}_t, \qquad \boldsymbol{\varepsilon}_t \sim N_{K+1}(\mathbf{0},\boldsymbol{\Sigma}), \qquad t=1,2,\dots,T$$

$$\mathbf{A}(L)=\left(\mathbf{I}_{K+1}- \sum_{j=1}^{p}\mathbf{A}_j\mathbf{L}^j\right) $$ Here, the Aj are (K + 1) × (K + 1) matrices, zt is a vector of K + 1 variables, μ and η are (K + 1)-dimensional vectors representing the drift and the trend respectively, and the roots of det(A(z)) = 0 satisfy |z| ≥ 1. If the matrix $\mathbf{A}(1)=\mathbf{I}_{K+1}-\sum_{j=1}^{p}\mathbf{A}_{j}$ is singular, the components of zt turn out to be integrated and possibly cointegrated.
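As a purely illustrative sketch (the coefficient values below are arbitrary and are not taken from the package), the long-run matrix A(1) can be computed from the VAR coefficient matrices and inspected for singularity:

# Arbitrary VAR(2) coefficients for K + 1 = 3 variables (illustration only)
A1 = matrix(c(0.5, 0.1, 0.0,
              0.0, 0.4, 0.2,
              0.1, 0.0, 0.3),3,3,T)
A2 = matrix(c(0.2, 0.0, 0.1,
              0.1, 0.2, 0.0,
              0.0, 0.1, 0.2),3,3,T)

# A(1) = I - A_1 - A_2: a singular A(1) (reduced rank, zero determinant)
# implies integrated and possibly cointegrated components
Amat = diag(3) - A1 - A2
qr(Amat)$rank
det(Amat)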

The VAR model admits a VECM representation

$$ \Delta\mathbf{z}_t=\boldsymbol{\alpha}_{0}+\boldsymbol{\alpha}_{1}t-\mathbf{A}(1)\mathbf{z}_{t-1}+\sum_{j=1}^{p-1}\boldsymbol{\Gamma}_{j}\Delta \mathbf{z}_{t-j}+\boldsymbol{\varepsilon}_t $$ where $\boldsymbol{\Gamma}_{j}=-\sum_{i=j+1}^{p}\mathbf{A}_i$. The matrix A ≡ A(1) (the argument is suppressed from now on for simplicity) contains the long-run cointegrating relationships among the variables, while the Γj's contain the short-run relationships. Defining $\boldsymbol{\Gamma}(1)=\mathbf{I}_{K+1}-\sum_{i=1}^{p-1}\boldsymbol{\Gamma}_{i}$, the intercept and trend terms in the VECM specification are

$$\boldsymbol{\alpha}_0=\mathbf{A}\boldsymbol{\mu}+(\boldsymbol{\Gamma}(1)-\mathbf{A})\boldsymbol{\eta}, \qquad \boldsymbol{\alpha}_1=\mathbf{A}\boldsymbol{\eta}$$ It is important to highlight the five possible cases for the intercept and trend specification (a small numerical sketch follows the list below):

  • Case I: no intercept, no trend, for which μ = η = 0
  • Case II: restricted intercept, no trend, for which α0 = Aμ, η = 0
  • Case III: unrestricted intercept, no trend, for which α0 ≠ Aμ, η = 0
  • Case IV: unrestricted intercept, restricted trend, for which α0 ≠ Aμ, α1 = Aη
  • Case V: unrestricted intercept, unrestricted trend, for which α0 ≠ Aμ, α1 ≠ Aη.
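Continuing the illustrative sketch above (all values are arbitrary, not the package's), the VECM short-run matrices and the deterministic terms follow directly from these definitions:

# (uses A1, A2 and Amat from the previous sketch)
# Gamma_j = -(A_{j+1} + ... + A_p); with p = 2 there is a single short-run matrix
Gamma1.ex   = -A2
GammaSum.ex = diag(3) - Gamma1.ex   # Gamma(1) = I - sum_j Gamma_j

# illustrative drift and trend vectors
mu.ex  = c(2, 2, 2)
eta.ex = c(0.5, 0.5, 0.5)

alpha0.ex = Amat %*% mu.ex + (GammaSum.ex - Amat) %*% eta.ex  # alpha_0 = A mu + (Gamma(1) - A) eta
alpha1.ex = Amat %*% eta.ex                                   # alpha_1 = A eta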

We now introduce the key concept of conditioning in the VECM system. To study the adjustment to equilibrium of a single variable yt, given the other variables xt, the vectors zt and εt are partitioned:

$$ \mathbf{z}_t=\begin{bmatrix} \underset{(1,1)}{y_{t}} \\ \underset{(K,1)}{\mathbf{x}_{t}} \end{bmatrix}\enspace\boldsymbol{\varepsilon}_t=\begin{bmatrix} \underset{(1,1)}{\varepsilon_{yt}} \\ \underset{(K,1)}{\boldsymbol{\varepsilon}_{xt}} \end{bmatrix} $$ The vectors α0, α1, the matrix A and the polynomial matrix Γ(L) are partitioned conformably to zt

$$\boldsymbol{\alpha}_0=\begin{bmatrix} \underset{(1,1)}{\alpha_{0y}} \\ \underset{(K,1)}{\boldsymbol{\alpha}_{0x}} \end{bmatrix}, \enspace \enspace \enspace \boldsymbol{\alpha}_1=\begin{bmatrix} \underset{(1,1)}{\alpha_{1y}} \\ \underset{(K,1)}{\boldsymbol{\alpha}_{1x} } \end{bmatrix}$$

$$\mathbf{A}=\begin{bmatrix} \underset{(1,K+1)}{\mathbf{a}^{'}_{(y)}} \\ \underset{(K,K+1)}{\mathbf{A}_{(x)}} \end{bmatrix} =\begin{bmatrix} \underset{(1,1)}{a_{yy}} & \underset{(1,K)}{\mathbf{a}^{'}_{yx}} \\ \underset{(K,1)}{\mathbf{a}_{xy}} & \underset{(K,K)}{\mathbf{A}_{xx} } \end{bmatrix}, \enspace \enspace \enspace \boldsymbol{\Gamma}(L)=\begin{bmatrix} \underset{(1,K+1)}{\boldsymbol{\gamma}^{'}_{y}(L)} \\ \underset{(K,K+1)}{\boldsymbol{\Gamma}_{(x)}(L)} \end{bmatrix} =\begin{bmatrix} \underset{(1,1)}{\gamma_{yy}(L)} & \underset{(1,K)}{\boldsymbol{\gamma}^{'}_{yx}(L)} \\ \underset{(K,1)}{\boldsymbol{\gamma}_{xy}(L)} & \underset{(K,K)}{\boldsymbol{\Gamma}_{xx}(L) } \end{bmatrix} $$ while for the error term $$ \boldsymbol{\varepsilon}_t \sim N\Bigg(\mathbf{0}, \begin{bmatrix} \underset{(1,1)}{\sigma_{yy}}& \underset{(1,K)}{\boldsymbol{\sigma}_{yx}^{'}} \\ \underset{(K,1)}{\boldsymbol{\sigma}_{xy}} & \underset{(K,K)}{\boldsymbol{\Sigma}_{xx}} \end{bmatrix}\Bigg) $$

In order to condition yt on xt, define $\boldsymbol{\omega}'=\boldsymbol{\sigma}_{yx}'\boldsymbol{\Sigma}_{xx}^{-1}$ and the conditional variance $$\sigma_{y.x}=\sigma_{yy}-\boldsymbol{\sigma}_{yx}'\boldsymbol{\Sigma}_{xx}^{-1}\boldsymbol{\sigma}_{xy}=\sigma_{yy}-\boldsymbol{\omega}'\boldsymbol{\sigma}_{xy}$$ so that $\varepsilon_{yt}=\boldsymbol{\omega}'\boldsymbol{\varepsilon}_{xt}+\nu_{yt}$, where $\nu_{yt}\sim N(0,\sigma_{y.x})$ is independent of εxt. Accordingly, conditioning can be applied to the entire system, obtaining

$$\mathbf{a}_{(y).x}'=\mathbf{a}_{(y)}'-\boldsymbol{\omega}'\mathbf{A}_{(x)}, \qquad \boldsymbol{\gamma}_{y.x}'(L)=\boldsymbol{\gamma}_{y}'(L)-\boldsymbol{\omega}'\boldsymbol{\Gamma}_{(x)}(L)$$

Therefore, the long-run relationships of the VECM are now collected in the matrix $$ {\mathbf{A}_c}=\begin{bmatrix} \mathbf{a}^{'}_{(y).x} \\ \mathbf{A}_{(x)} \end{bmatrix}=\begin{bmatrix} a_{yy}-\boldsymbol{\omega}^{'}\mathbf{a}_{xy} & \mathbf{a}_{yx}^{'}-\boldsymbol{\omega}^{'}\mathbf{A}_{xx} \\ \mathbf{a}_{xy}&\mathbf{A}_{xx} \end{bmatrix} $$

To rule out long-run feedback of yt in the marginal model for xt (i.e., imposing axy = 0), the long-run matrix becomes $$ \widetilde{\mathbf{A}}=\begin{bmatrix}a_{yy} & \mathbf{a}^{'}_{yx}-\boldsymbol{\omega}^{'}\mathbf{A}_{xx} \\ \mathbf{0} & \mathbf{A}_{xx} \end{bmatrix}=\begin{bmatrix} a_{yy} & \widetilde{\mathbf{a}}_{y.x}^{'} \\ \mathbf{0}&\mathbf{A}_{xx}\end{bmatrix} $$
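The following sketch illustrates these conditioning formulas numerically, re-using the values that will appear in the simulation example below (sigma.in, ayy.in, ayx.uc.in, axx.in) and assuming that the first variable plays the role of yt:

sigma.ex = matrix(c(1.69, 0.39, 0.52,
                    0.39, 1.44, -0.3,
                    0.52, -0.3, 1.00),3,3)
sigma.yx = sigma.ex[1,-1]     # sigma_yx
sigma.xx = sigma.ex[-1,-1]    # Sigma_xx

omega.ex = solve(sigma.xx, sigma.yx)                # omega = Sigma_xx^{-1} sigma_xy
sy.x     = sigma.ex[1,1] - sum(omega.ex*sigma.yx)   # conditional variance sigma_{y.x}

ayy.ex = 0.6
ayx.ex = c(0.4, 0.4)
axx.ex = matrix(c(0.30, 0.50,
                  -0.4, 0.30),2,2,T)

a.tilde  = ayx.ex - as.vector(omega.ex %*% axx.ex)  # a~'_{y.x} = a_yx' - omega' A_xx
theta.ex = -a.tilde/ayy.ex                          # theta' in the EC term (see below)

Here (ayy.ex, a.tilde) reproduces the first row of the matrix Ã, while theta.ex collects the long-run coefficients of xt that enter the error correction term defined next.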

Putting everything together, the VECM system for the xt variables remains unaltered, $$\Delta\mathbf{x}_t=\boldsymbol{\alpha}_{0x}+\boldsymbol{\alpha}_{1x}t-\mathbf{A}_{(x)}\mathbf{z}_{t-1}+\boldsymbol{\Gamma}_{(x)}(L)\Delta\mathbf{z}_{t}+\boldsymbol{\varepsilon}_{xt}$$ while the equation for yt in the conditional ARDL specification is

$$ \Delta y_{t}=\alpha_{0.y}+\alpha_{1.y}t -a_{yy}EC_{t-1}+ \sum_{j=1}^{p-1}\boldsymbol{\gamma}'_{y.x,j}\Delta\mathbf{z}_{t-j}+\boldsymbol{\omega}'\Delta\mathbf{x}_{t}+\nu_{yt} $$ The error correction term, $EC_{t-1}$, expressing the long-run equilibrium relationship between yt and xt, is given by $$EC_{t-1}=y_{t-1}-\theta_{0}-\theta_{1}t-\boldsymbol{\theta}'\mathbf{x}_{t-1}$$ with $$\theta_{0}=\mu_{y}-\boldsymbol{\theta}'\boldsymbol{\mu}_{x}, \enspace \theta_{1}=\eta_{y}-\boldsymbol{\theta}'\boldsymbol{\eta}_{x}, \enspace\boldsymbol{\theta}'=-\frac{\widetilde{\mathbf{a}}'_{y.x}}{a_{yy}}=-\frac{\mathbf{a}_{yx}'-\boldsymbol{\omega}'\mathbf{A}_{xx}}{a_{yy}} $$ Turning back to the case specifications, the ARDL equation can be specialized as follows:

  • Case I: no intercept, no trend, therefore θ0 = θ1 = α0.y = α1.y = 0
  • Case II: restricted intercept, no trend. Since α0 = Aμ, θ0 ≠ 0 and the intercept term appears in $EC_{t-1}$; therefore, α0.y = 0. In addition, θ1 = α1.y = 0
  • Case III: unrestricted intercept, no trend. Since α0 ≠ Aμ, θ0 = 0 and α0.y = α0y − ω′α0x ≠ 0. In addition, θ1 = α1.y = 0
  • Case IV: unrestricted intercept, restricted trend, for which α0.y = α0y − ω′α0x ≠ 0. Since α1 = Aη, θ1 ≠ 0 and the trend term appears in $EC_{t-1}$; therefore, α1.y = 0
  • Case V: unrestricted intercept, unrestricted trend. Since α0 ≠ Aμ, θ0 = 0 and α0.y = α0y − ω′α0x ≠ 0. In addition, since α1 ≠ Aη, θ1 = 0 and α1.y = α1y − ω′α1x ≠ 0.

Generating a multivariate time series: the sim_vecm_ardl function

As usual, we load the package, together with a few helper packages used throughout this vignette for data manipulation and plotting:

library(bootCT)
library(reshape2)
library(ggplot2)
library(dplyr)

This first function generates a multivariate time series following a VECM/ARDL specification under a particular case, as detailed above. These are the parameters:

  • The error covariance matrix Σ
sigma.in = matrix(c(1.69, 0.39, 0.52,
                    0.39, 1.44, -0.3,
                    0.52, -0.3, 1.00),3,3)
  • The short-run parameters, Γ1, Γ2
gamma1 = matrix(c(0.60, 0.00, 0.20,
                  0.10, -0.3, 0.00,
                  0.00, -0.3, 0.20),3,3,T)
gamma2 = gamma1 * 0.3
gamma.in = list(gamma1,gamma2)
  • The long-run parameters, ayy, ayx, Axx that will eventually form the matrix $\widetilde{\mathbf A}$
ayy.in = 0.6
ayx.uc.in = c(0.4, 0.4)
axx.in = matrix(c(0.30, 0.50,
                  -0.4, 0.30),2,2,T)
  • The intercept and trend terms (μ, η, α0 and α1), which depend on the chosen case. For instance, in Case II:
case = 2
mu.in = c(2,2,2)
eta.in = c(0,0,0)
azero.in = c(0,0,0)
aone.in = c(0,0,0)

Calling the function to generate T = 200 observations:

data.vecm.ardl_2 =
  sim_vecm_ardl(nobs=200,
                case = 2,
                sigma.in = sigma.in,
                gamma.in = gamma.in,
                axx.in = axx.in,
                ayx.uc.in = ayx.uc.in,
                ayy.in = ayy.in,
                mu.in = mu.in,
                eta.in = eta.in,
                azero.in = azero.in,
                aone.in = aone.in,
                burn.in = 100,
                seed.in = 999)

In the function output, several elements are available. Some represent the function arguments themselves, others are calculated as a byproduct (e.g., α0.y, α1.y, θ0, θ1). The main points of interest are naturally the data and its first difference

head(data.vecm.ardl_2$data)
#>         dep_1_0     ind_1_0   ind_2_0 time
#> 104 -3.53399308  2.51204516 5.3842215    1
#> 105 -0.05750437  1.23912249 7.8679695    2
#> 106 -1.71303429  0.64048684 6.2980765    3
#> 107 -0.81595005 -0.03456059 5.4483791    4
#> 108  2.08297259 -0.78242332 4.1370603    5
#> 109  1.91584685  0.28597121 0.2817475    6
head(data.vecm.ardl_2$diffdata)
#>      d_dep_1_0  d_ind_1_0  d_ind_2_0 time
#> 104 -0.7962310 -2.7284874  1.4603533    1
#> 105  3.4764887 -1.2729227  2.4837480    2
#> 106 -1.6555299 -0.5986356 -1.5698930    3
#> 107  0.8970842 -0.6750474 -0.8496974    4
#> 108  2.8989226 -0.7478627 -1.3113188    5
#> 109 -0.1671257  1.0683945 -3.8553128    6

Notice that the row index of the data does not start from 1: the first burn.in observations are discarded, together with the initial observations needed to accommodate the lags implied by gamma.in (in this example, the index starts at 104).
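This can be checked directly on the row names of the simulated data frame:

rownames(data.vecm.ardl_2$data)[1]  # "104" here, matching the head() output above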

The following code plots the data and the first difference:

df = data.vecm.ardl_2$data
meltdf <- melt(df,id="time")
ggplot(meltdf,
       aes(x = time, y = value, colour = variable, group = variable)) +
       geom_line() + ggtitle("CASE II - Level") + theme_bw()

df = data.vecm.ardl_2$diffdata
meltdf <- melt(df,id="time")
ggplot(meltdf,
       aes(x = time, y = value, colour = variable, group = variable)) +
       geom_line() + ggtitle("CASE II - Diff") + theme_bw()

To generate data in accordance with the intercept/trend specification of Cases III, IV, and V, respectively, the deterministic parameters are set as follows (a sketch of the corresponding function call is given after these blocks):

# case III
case = 3
mu.in = c(0,0,0)
eta.in = c(0,0,0)
azero.in = c(2,2,2)
aone.in = c(0,0,0)

# case IV
case = 4
mu.in = c(0,0,0)
eta.in = c(0.6,0.6,0.6)
azero.in = c(2,2,2)
aone.in = c(0,0,0)

# case V
case = 5
mu.in = c(0,0,0)
eta.in = c(0,0,0)
azero.in = c(2,2,2)
aone.in = c(0.6,0.6,0.6)
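For instance, a minimal sketch of the corresponding call for Case IV, re-using the arguments of the Case II call above with only the deterministic components changed:

data.vecm.ardl_4 =
  sim_vecm_ardl(nobs=200,
                case = 4,
                sigma.in = sigma.in,
                gamma.in = gamma.in,
                axx.in = axx.in,
                ayx.uc.in = ayx.uc.in,
                ayy.in = ayy.in,
                mu.in = c(0,0,0),
                eta.in = c(0.6,0.6,0.6),
                azero.in = c(2,2,2),
                aone.in = c(0,0,0),
                burn.in = 100,
                seed.in = 999)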

Bootstrapping the ARDL test for cointegration

The usual test based on the conditional ARDL model examines the following null hypotheses:

$$ H_0^{F_{ov}}: a_{yy}=0\enspace\cap \enspace\widetilde{\mathbf a}_{y.x}=\mathbf 0 $$ $$ H_0^{F_{ind}}: \widetilde{\mathbf a}_{y.x}=\mathbf 0\qquad \text{(Degeneracy of first type)} $$

$$ H_0^{t}: a_{yy}=0\qquad \text{(Degeneracy of second type)} $$ Note that a form of spurious cointegration can occur, since $\widetilde{\mathbf a}'_{y.x} = \mathbf a'_{yx} - \boldsymbol\omega'\mathbf A_{xx}$.

If $H_0^{F_{ind}}$ is rejected, it might still be that $\mathbf a_{yx} = \mathbf 0$ while $\boldsymbol\omega'\mathbf A_{xx} \neq \mathbf 0$. In that case, no cointegration is really in place.

For this reason, the Find test must also be performed on the unconditional ARDL model (UC), i.e., the model omitting the instantaneous differences of the independent variables (Δxt) along with their coefficients ω. Its null hypothesis is

$$ H_{0,UC}^{F_{ind}}: \mathbf a_{yx}=\mathbf 0 $$ and it is tested on the model

$$ \Delta y_{t}=\alpha_{0.y}+\alpha_{1.y}t -a_{yy}EC_{t-1}+ \sum_{j=1}^{p-1}\boldsymbol{\gamma}'_{y.x,j}\Delta\mathbf{z}_{t-j}+\nu_{yt} $$ in which $\boldsymbol{\theta}'=-\frac{{\mathbf{a}'}_{yx}}{a_{yy}}$.

The usual testing framework of Pesaran et al. (2001) employs the ARDL bound test on the conditional model for Fov and t, while Goh et al. (2017) derived the bound test for Find. However:

  • The t and Find tests are available only for Cases I, III, and V
  • Bound tests may lead to inconclusive results

For this reason, Bertelli et al. (2022) derived a comprehensive bootstrap framework to analyze ARDL-based cointegration.

In order to perform this bootstrap version, the usual “bootstrap ingredients” are necessary:

  • The test statistics coming from the original data: Fov, t, Find and Find, UC.
  • A sampling scheme on the original data.
  • The simulated bootstrap distribution of each statistic under its respective null hypothesis, from which the critical quantiles $c^*_{\alpha,T}$ are calculated.

This bootstrap procedure does the following:

  • In calculating the test statistics, the most appropriate specification for both the ARDL and VECM models has to be chosen in the estimation phase. This can either be imposed by the user or automatically detected via information criteria

  • In deriving the bootstrap distribution of the statistics under each null, the marginal VECM model remains unaltered, while the ARDL model is re-estimated imposing the particular null hypothesis under examination (i.e., discarding the variables related to ayy, $\widetilde{\mathbf a}_{y.x}$, or both). Thus, the residuals of the ARDL and VECM models are calculated. Naturally, for the Find, UC model, the restriction applies to ayx.

  • In each bootstrap iteration, a random sample from the residuals is extracted (of the same length as the original data), and synthetic data are generated using the estimates obtained under each null. Finally, the bootstrap versions Fov(b), t(b) and Find(b) are calculated and stored, for b = 1, …, B. Find, UC(b) is also calculated when appropriate.

  • Bootstrap critical quantiles are calculated, and the decision on each test is based on the comparison between these critical values and the statistics computed on the original data. Notably, Find, UC and its bootstrap distribution under the null are calculated only when they are needed to rule out spurious cointegration. A generic sketch of this kind of residual bootstrap loop is given below.
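The following is a highly simplified residual bootstrap for an F-type test in a toy linear regression. It only illustrates the ingredients listed above (statistic on the original data, resampling of residuals estimated under the null, synthetic data, bootstrap critical values) and does not reproduce the actual boot_ardl internals:

# Toy model y = b0 + b1*x + e, testing H0: b1 = 0 via residual bootstrap
set.seed(1)
n = 100
x = rnorm(n)
y = 0.5 + rnorm(n)                  # data generated under H0

fit.u = lm(y ~ x)                   # unrestricted model
fit.r = lm(y ~ 1)                   # model estimated under the null
F.obs = anova(fit.r, fit.u)$F[2]    # test statistic on the original data

B = 999
F.boot  = numeric(B)
res.r   = residuals(fit.r)          # residuals under the null
fit.hat = fitted(fit.r)
for(b in 1:B){
  e.star = sample(res.r, n, replace = TRUE)   # resample residuals
  y.star = fit.hat + e.star                   # synthetic data generated under the null
  F.boot[b] = anova(lm(y.star ~ 1), lm(y.star ~ x))$F[2]
}
quantile(F.boot, probs = c(0.90, 0.95, 0.99)) # bootstrap critical values
F.obs                                         # to be compared with the critical values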

The boot_ardl function

We use the German macroeconomic dataset of Lütkepohl (2007). We start by loading the data and visualizing it, both in levels and in first differences

data("ger_macro")

# Data preparation (log-transform)
LNDATA = apply(ger_macro[,-1], 2, log)
col_ln = paste0("LN", colnames(ger_macro)[-1])
LNDATA = as.data.frame(LNDATA)
colnames(LNDATA) = col_ln
LNDATA = LNDATA %>% select(LNCONS, LNINCOME, LNINVEST)
LNDATA$DATE = ger_macro$DATE

# First difference
lagdf1 = lag_mts(as.matrix(LNDATA[,-4]), k = c(1,1,1))
DIFF.LNDATA = na.omit(LNDATA[,-4] - lagdf1)
colnames(DIFF.LNDATA) = paste0("D_", colnames(LNDATA)[-4])
DIFF.LNDATA$DATE = ger_macro$DATE[-1]

# Graphs
dfmelt = melt(LNDATA, id = "DATE")
dfmelt = dfmelt%>%arrange(variable,DATE)
diff.dfmelt = melt(DIFF.LNDATA, id = "DATE")
diff.dfmelt = diff.dfmelt%>%arrange(variable,DATE)

ggplot(dfmelt,
       aes(x = DATE, y = value, colour = variable, group = variable)) +
  geom_line() + ggtitle("Level Variables (log-scale)") + theme_bw()


ggplot(diff.dfmelt,
       aes(x = DATE, y = value, colour = variable, group = variable)) +
  geom_line() + ggtitle("Diff. Variables (log-scale)") + theme_bw()

Calling the function using LNCONS as the dependent variable:

BCT_res_CONS = boot_ardl(data = LNDATA,
                         yvar = "LNCONS",
                         xvar = c("LNINCOME","LNINVEST"),
                         case = 3,
                         fix.ardl = NULL,
                         info.ardl = "AIC",
                         fix.vecm = NULL,
                         info.vecm = "AIC",
                         maxlag = 5,
                         a.ardl = 0.1,
                         a.vecm = 0.1,
                         nboot = 2000,
                         a.boot.H0 = c(0.01,0.05,0.1),
                         print = F)

Notably, the parameters fix.vecm and fix.ardl are left blank, meaning that the estimation phase relies on an automatic procedure that selects the best lag order for the short-run parameters employing common information criteria such as the AIC, BIC, SC, FPE, etc. The parameters identifying the criteria, info.vecm and info.ardl, are here set to "AIC", which is also the default choice when they are left unspecified. The parameter maxlag sets the maximum possible value in the lag search. Secondly, the parameters a.vecm and a.ardl set the significance threshold for the single parameters in the short-run part of the model equation for the VECM and ARDL models, respectively: coefficients whose p-value exceeds this threshold are discarded. The argument a.boot.H0 sets the significance levels at which critical values of the bootstrap distribution under the null of Fov, t, Find and Find, UC (if appropriate) are calculated, using nboot residual bootstrap replicates.

A summary method can be called to visualize subsets of the output:

  • The conditional ARDL model estimates
summary(BCT_res_CONS, out="ARDL")
#> CONDITIONAL ARDL MODEL
#> 
#> Call:
#> lm(formula = formula.ardlx, data = na.omit(df.ardl.l[[cond]]))
#> 
#> Residuals:
#>        Min         1Q     Median         3Q        Max 
#> -0.0152061 -0.0050872  0.0003464  0.0037709  0.0242904 
#> 
#> Coefficients:
#>              Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)  0.048190   0.012652   3.809 0.000267 ***
#> LNCONS.l1   -0.306508   0.054653  -5.608 2.62e-07 ***
#> LNINCOME.l1  0.296537   0.054579   5.433 5.42e-07 ***
#> LNINVEST.l1 -0.001366   0.011320  -0.121 0.904253    
#> D_LNCONS.l1 -0.247543   0.078653  -3.147 0.002289 ** 
#> D_LNINCOME   0.470633   0.073736   6.383 9.45e-09 ***
#> D_LNINVEST   0.065370   0.019326   3.382 0.001098 ** 
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 0.00773 on 83 degrees of freedom
#> Multiple R-squared:  0.5405, Adjusted R-squared:  0.5072 
#> F-statistic: 16.27 on 6 and 83 DF,  p-value: 2.724e-12
#> 
#> 
#> BEST LAG ORDER FOR LAGGED DIFFERENCES
#> 
#>     Series lag_dz
#> 1   LNCONS      1
#> 2 LNINCOME      0
#> 3 LNINVEST      0

The lagged levels of LNCONS and LNINCOME are significant, hinting at a possible cointegrating relationship. The estimates of the instantaneous differences of LNINCOME and LNINVEST (i.e., the parameters in ω) are also significant. Among the lagged differences, only the first lag of LNCONS is significant.

  • The VECM model estimates (displayed also for LNCONS for the sake of completeness)
summary(BCT_res_CONS, out="VECM")
#> UNCONDITIONAL VECM MODEL
#> 
#> VAR Estimation Results:
#> ========================= 
#> Endogenous variables: D_LNCONS, D_LNINCOME, D_LNINVEST 
#> Deterministic variables: const 
#> Sample size: 89 
#> Log Likelihood: 749.335 
#> Roots of the characteristic polynomial:
#> 0.5783 0.449 0.449 0.4405 0.3425 0.3425
#> Call:
#> vars::VAR(y = na.omit(dlag0), p = vecmsel, type = typevecm, exogen = na.omit(lagdata))
#> 
#> 
#> Estimation results for equation D_LNCONS: 
#> ========================================= 
#> D_LNCONS = D_LNINCOME.l2 + D_LNINVEST.l2 + const + LNCONS.l1 + LNINCOME.l1 + LNINVEST.l1 
#> 
#>               Estimate Std. Error t value Pr(>|t|)    
#> D_LNINCOME.l2  0.16351    0.09243   1.769 0.080572 .  
#> D_LNINVEST.l2  0.06112    0.02272   2.690 0.008633 ** 
#> const          0.05221    0.01577   3.311 0.001377 ** 
#> LNCONS.l1     -0.27421    0.07058  -3.885 0.000205 ***
#> LNINCOME.l1    0.27269    0.06950   3.923 0.000179 ***
#> LNINVEST.l1   -0.01094    0.01337  -0.819 0.415375    
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> 
#> Residual standard error: 0.009436 on 83 degrees of freedom
#> Multiple R-Squared: 0.8216,  Adjusted R-squared: 0.8087 
#> F-statistic: 63.72 on 6 and 83 DF,  p-value: < 2.2e-16 
#> 
#> 
#> Estimation results for equation D_LNINCOME: 
#> =========================================== 
#> D_LNINCOME = D_LNCONS.l1 + D_LNINVEST.l1 + D_LNINVEST.l2 + const + LNINCOME.l1 + LNINVEST.l1 
#> 
#>               Estimate Std. Error t value Pr(>|t|)  
#> D_LNCONS.l1    0.21122    0.11311   1.867   0.0654 .
#> D_LNINVEST.l1  0.03549    0.02941   1.207   0.2310  
#> D_LNINVEST.l2  0.04951    0.02749   1.801   0.0753 .
#> const          0.03345    0.01655   2.021   0.0465 *
#> LNINCOME.l1   -0.01668    0.01422  -1.173   0.2442  
#> LNINVEST.l1    0.01622    0.01652   0.982   0.3291  
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> 
#> Residual standard error: 0.01106 on 83 degrees of freedom
#> Multiple R-Squared: 0.7722,  Adjusted R-squared: 0.7558 
#> F-statistic:  46.9 on 6 and 83 DF,  p-value: < 2.2e-16 
#> 
#> 
#> Estimation results for equation D_LNINVEST: 
#> =========================================== 
#> D_LNINVEST = D_LNCONS.l1 + D_LNINVEST.l1 + D_LNCONS.l2 + const + LNINCOME.l1 + LNINVEST.l1 
#> 
#>               Estimate Std. Error t value Pr(>|t|)  
#> D_LNCONS.l1    0.89879    0.44170   2.035   0.0451 *
#> D_LNINVEST.l1 -0.18040    0.11117  -1.623   0.1084  
#> D_LNCONS.l2    0.74424    0.43099   1.727   0.0879 .
#> const          0.03621    0.06580   0.550   0.5836  
#> LNINCOME.l1    0.12380    0.05390   2.297   0.0241 *
#> LNINVEST.l1   -0.15237    0.06258  -2.435   0.0170 *
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> 
#> Residual standard error: 0.04289 on 83 degrees of freedom
#> Multiple R-Squared: 0.2557,  Adjusted R-squared: 0.2019 
#> F-statistic: 4.751 on 6 and 83 DF,  p-value: 0.0003303 
#> 
#> 
#> 
#> Covariance matrix of residuals:
#>             D_LNCONS D_LNINCOME D_LNINVEST
#> D_LNCONS   9.354e-05  5.932e-05  1.615e-04
#> D_LNINCOME 5.932e-05  1.286e-04  6.811e-05
#> D_LNINVEST 1.615e-04  6.811e-05  1.933e-03
#> 
#> Correlation matrix of residuals:
#>            D_LNCONS D_LNINCOME D_LNINVEST
#> D_LNCONS     1.0000     0.5409     0.3799
#> D_LNINCOME   0.5409     1.0000     0.1366
#> D_LNINVEST   0.3799     0.1366     1.0000
  • The Johansen test for cointegration on the independent variables
summary(BCT_res_CONS, out="cointVECM")
#> VECM MODEL JOHANSEN COINTEGRATION TESTS
#> 
#> Test type: trace statistic , with linear trend 
#> 
#> Eigenvalues (lambda):
#> [1] 0.07694993 0.02379754
#> 
#> 
#> Values of test statistic and critical values of test:
#> 
#>          test 10pct  5pct  1pct
#> r <= 1 | 2.14  6.50  8.18 11.65
#> r = 0  | 9.27 15.66 17.95 23.52
#> 
#> 
#> Eigenvectors, normalised to first column:
#> (These are the cointegration relations)
#> 
#>             LNINCOME.l1 LNINVEST.l1
#> LNINCOME.l1    1.000000    1.000000
#> LNINVEST.l1   -1.155814    1.416763
#> 
#> 
#> Weights W:
#> (This is the loading matrix)
#> 
#>            LNINCOME.l1  LNINVEST.l1
#> LNINCOME.d -0.01610551 -0.001368126
#> LNINVEST.d  0.12129032 -0.003256883
#> 
#> 
#> =======================================
#> 
#> Test type: maximal eigenvalue statistic (lambda max) , with linear trend 
#> 
#> Eigenvalues (lambda):
#> [1] 0.07694993 0.02379754
#> 
#> 
#> Values of test statistic and critical values of test:
#> 
#>          test 10pct  5pct  1pct
#> r <= 1 | 2.14  6.50  8.18 11.65
#> r = 0  | 7.13 12.91 14.90 19.19
#> 
#> 
#> Eigenvectors, normalised to first column:
#> (These are the cointegration relations)
#> 
#>             LNINCOME.l1 LNINVEST.l1
#> LNINCOME.l1    1.000000    1.000000
#> LNINVEST.l1   -1.155814    1.416763
#> 
#> 
#> Weights W:
#> (This is the loading matrix)
#> 
#>            LNINCOME.l1  LNINVEST.l1
#> LNINCOME.d -0.01610551 -0.001368126
#> LNINVEST.d  0.12129032 -0.003256883

We do not reject the null r = 0 for both the trace and maximal eigenvalue tests; therefore, the independent variables are not cointegrated among themselves.

  • The ARDL bootstrap and bound tests
summary(BCT_res_CONS, out="cointARDL")
#> 
#> 
#> 
#> Observations: 92 
#> Number of Lagged Regressors (not including LDV) (k):  2 
#> Case:  3 
#> --------------------------------------------------------------------------
#> -                         PSS    Fov-test                                -
#> --------------------------------------------------------------------------
#>                  <------- I(0) ----- I(1) ----->
#> 10% critical value        3.170       4.140 
#> 5% critical value         3.790       4.850 
#> 1% critical value         5.150       6.360 
#> 
#> F-statistic =  10.751 
#> 
#>      Bootstrap critical values
#>      1 %      5 %      10 % 
#>    5.610     3.920     3.160 
#> 
#> 
#> Observations: 92 
#> Number of Lagged Regressors (not including LDV) (k):  2 
#> Case:  3 
#> --------------------------------------------------------------------------
#> -                          PSS    t-test                                 -
#> --------------------------------------------------------------------------
#>                   <------- I(0) ----- I(1) ----->
#> 10% critical value        -2.570      -3.210 
#> 5% critical value         -2.860      -3.530 
#> 1% critical value         -3.430      -4.100 
#> 
#> t-statistic =  -5.608 
#> 
#>      Bootstrap critical values
#>      1 %       5 %       10 % 
#>    -3.580     -2.930     -2.610
#> 
#> 
#> Observations: 92 
#> Number of Lagged Regressors (not including LDV) (k):  2 
#> Case:  3 
#> -----------------------------------------------------------------------------
#> -                         SMG    FInd-test                                  -
#> -----------------------------------------------------------------------------
#>                  <------- I(0) ----- I(1) ----->
#> 10% critical value        2.31       4.33 
#> 5% critical value         3.01       5.42 
#> 2.5% critical value       3.74       6.42 
#> 1% critical value         4.71       7.68 
#> 
#> F-statistic =  15.636 
#> 
#>      Bootstrap critical values
#>      1 %      5 %      10 % 
#>    7.760     4.970     4.030

This is the main output of the function. Since case = 3, all the bound tests can be performed. In this example, the statistics not only exceed the I(1) thresholds of the bound tests (in absolute value, for the t-test), but also the bootstrap critical values reported at the bottom of each test result. Cointegration between LNCONS and the other variables is thus confirmed, and no spurious cointegration is detected.