The search.varma() function is one of the three main functions in the ldt
package. This vignette demonstrates basic usage of this function with the
commodity prices dataset (International Commodity Prices (2023)).
Commodity prices are the prices at which raw materials or primary
foodstuffs are bought and sold. This dataset contains monthly prices for
68 primary commodities. Some series start in January 1990 and others
start later.
For this example, we use just the first 5 columns of data:
Here are the last few observations from this subset of the data:
tail(data)
#>         PALLFNF  PEXGALL   PNFUEL   PFANDB    PFOOD
#> 2023M4 170.7647 171.9614 155.8291 144.1869 146.1422
#> 2023M5 157.1340 156.8779 148.7744 137.3836 138.6572
#> 2023M6 154.0691 153.8750 145.9433 135.2476 136.0733
#> 2023M7 157.9088 158.0988 146.1088 135.6483 136.6740
#> 2023M8 161.3679 162.2299 142.8028 130.7136 131.3730
#> 2023M9 168.4047 170.0978 143.4144 129.3996 129.7991
And here are some summary statistics for each variable:
sapply(data, summary)
#>           PALLFNF   PEXGALL    PNFUEL    PFANDB     PFOOD
#> Min.      61.8872  65.91441  55.03738  54.72416  55.14206
#> 1st Qu.  106.9555 108.45966  97.47020  64.14606  63.48380
#> Median   125.5690 129.27773 108.51537  90.77461  91.48711
#> Mean     133.2516 137.56116 111.14108  89.21294  89.77894
#> 3rd Qu.  166.4862 172.35819 131.78403 106.95780 108.36253
#> Max.     241.9187 253.29973 178.30364 162.22220 165.74817
#> NA's     156.0000 156.00000 156.00000  24.00000  24.00000
The columns of the data represent the following variables:
PALLFNF: All Commodity Price Index, 2016 = 100, includes both Fuel and Non-Fuel Price Indices
PEXGALL: Commodities for Index: All, excluding Gold, 2016 = 100
PNFUEL: Non-Fuel Price Index, 2016 = 100, includes Precious Metal, Food and Beverages and Industrial Inputs Price Indices
PFANDB: Food and Beverage Price Index, 2016 = 100, includes Food and Beverage Price Indices
PFOOD: Food Price Index, 2016 = 100, includes Cereal, Vegetable Oils, Meat, Seafood, Sugar, and Other Food (Apple (non-citrus fruit), Bananas, Chana (legumes), Fishmeal, Groundnuts, Milk (dairy), Tomato (veg)) Price Indices
We use the first variable (i.e., PALLFNF) as the target variable and
the MAPE metric to find the best predicting model. Out-of-sample
evaluation limits how complex the models can be, because it
re-estimates each model by maximum likelihood several times.
Although the simUsePreviousEstim argument helps with initializing
maximum likelihood estimation, VARMA model estimation is
time-consuming due to its large number of parameters. We therefore
restrict the model set in two ways: we set a maximum value for the
number of equations allowed in a model, and we set a maximum value
for the parameters of the VARMA model.
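To make the cost of out-of-sample evaluation concrete, here is a minimal base R sketch of the general idea. It is not how ldt implements the evaluation. It assumes a synthetic series, a univariate AR(2) model fitted with stats::arima(), a one-step horizon, and six evaluations (a hypothetical analogue of simFixSize = 6). The helper name mape and all variable names are illustrative.

```r
# Illustrative only: synthetic data and a univariate AR(2), not ldt's internals.
set.seed(123)
n <- 120
# Simulated stationary AR(1) series fluctuating around a level of 100
y <- 100 + as.numeric(arima.sim(model = list(ar = 0.7), n = n, sd = 2))

# Mean absolute percentage error, in percent
mape <- function(actual, forecast) {
  100 * mean(abs((actual - forecast) / actual))
}

n_sim <- 6     # number of out-of-sample evaluations (assumed, mirrors simFixSize)
horizon <- 1   # forecast horizon used in this sketch

forecasts <- numeric(n_sim)
actuals <- numeric(n_sim)
for (i in seq_len(n_sim)) {
  train_end <- n - n_sim + i - horizon              # last observation used for estimation
  fit <- arima(y[1:train_end], order = c(2, 0, 0))  # model is re-estimated at every step
  forecasts[i] <- predict(fit, n.ahead = horizon)$pred[horizon]
  actuals[i] <- y[train_end + horizon]              # value the forecast is compared with
}

oos_mape <- mape(actuals, forecasts)
oos_mape
```

Each evaluation requires a fresh maximum likelihood fit. In a model search this cost is repeated for every candidate model, which is why restricting the model set matters.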
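# Search over VARMA models for the first variable using out-of-sample MAPE.
# The metrics are passed by name ('metrics =') rather than through an
# assignment ('metric <-') inside the call.
search_res <- search.varma(data = get.data(data, endogenous = 5),
                           combinations = get.combinations(sizes = c(1, 2, 3),
                                                           numTargets = 1),
                           maxParams = c(2, 0, 0),
                           metrics = get.search.metrics(typesIn = c(),
                                                        typesOut = c("mape"),
                                                        simFixSize = 6),
                           maxHorizon = 5)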
#> Warning in search.varma(data = get.data(data, endogenous = 5), combinations =
#> get.combinations(sizes = c(1, : 'maxHorizon' argument is different from the
#> maximum horizon in the 'metrics' argument.
print(search_res)
#> LDT search result:
#> Method in the search process: VARMA
#> Expected number of models: 22, searched: 22 , failed: 0 (0%)
#> Elapsed time: 0.01667309 minutes
#> Length of results: 1
#> --------
#> Target (PALLFNF):
#> Evaluation (mape):
#> Best model:
#> endogenous: (3x1) PALLFNF, PEXGALL, PNFUEL
#> exogenous: (Intercept)
#> metric: 2.705854
#> --------
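The reported count of 22 expected models can be reproduced with a short calculation. This is one reading that is consistent with the output, not a description of ldt's internal enumeration. It assumes that every model must contain the single target, that the remaining equations are drawn from the other four variables, and that maxParams = c(2, 0, 0) yields two AR orders (p = 1 and p = 2). All variable names below are illustrative.

```r
# Hypothetical reconstruction of the model count; not ldt's code.
n_endogenous <- 5     # candidate endogenous variables
sizes <- c(1, 2, 3)   # allowed numbers of equations
n_targets <- 1        # the target appears in every model

# For each size, choose the remaining variables from the non-target candidates
n_subsets <- sum(choose(n_endogenous - n_targets, sizes - n_targets))

# Assumption: maxParams = c(2, 0, 0) gives the AR orders p = 1 and p = 2
n_orders <- 2

n_models <- n_subsets * n_orders
n_models
#> [1] 22
```

Under these assumptions there are 1 + 4 + 6 = 11 variable subsets, each estimated with two AR orders.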
The output of the search.varma() function does not contain any
estimation results, only the information required to replicate them.
The summary() function returns a similar structure with the
estimation results included.
We can plot the predicted values along with the out-of-sample evaluations:
This package is suited to empirical studies that aim to reduce
assumptions and summarize the results of uncertainty analysis. This
vignette is only a demonstration, and the search.varma() function
offers other options to explore. For instance, you can experiment with
different evaluation metrics or restrict the model set based on your
specific needs. Alternatively, you can combine modeling with Principal
Component Analysis (PCA) (see the estim.varma() function). Experimenting
with these options is a good way to see how they affect your analysis.