Title: | Nonparametric Estimation of the Trend and Its Derivatives in TS |
---|---|
Description: | The nonparametric trend and its derivatives in equidistant time series (TS) with short-memory stationary errors can be estimated. The estimation is conducted via local polynomial regression using an automatically selected bandwidth obtained by a built-in iterative plug-in algorithm or a bandwidth fixed by the user. A Nadaraya-Watson kernel smoother is also built-in as a comparison. With version 1.1.0, a linearity test for the trend function, forecasting methods and backtesting approaches are implemented as well. The smoothing methods of the package are described in Feng, Y., Gries, T., and Fritz, M. (2020) <doi:10.1080/10485252.2020.1759598>. |
Authors: | Yuanhua Feng [aut] (Paderborn University, Germany), Sebastian Letmathe [aut] (Paderborn University, Germany), Dominik Schulz [aut, cre] (Paderborn University, Germany), Thomas Gries [ctb] (Paderborn University, Germany), Marlon Fritz [ctb] (Paderborn University, Germany) |
Maintainer: | Dominik Schulz <[email protected]> |
License: | GPL-3 |
Version: | 1.1.4 |
Built: | 2024-11-06 06:35:31 UTC |
Source: | CRAN |
Point forecasts and the respective forecasting intervals for autoregressive-moving-average (ARMA) models can be calculated, the latter via bootstrap, by means of this function.
bootCast( X, p = NULL, q = NULL, include.mean = FALSE, n.start = 1000, h = 1, it = 10000, pb = TRUE, cores = future::availableCores(), alpha = 0.95, export.error = FALSE, plot = FALSE, ... )
bootCast( X, p = NULL, q = NULL, include.mean = FALSE, n.start = 1000, h = 1, it = 10000, pb = TRUE, cores = future::availableCores(), alpha = 0.95, export.error = FALSE, plot = FALSE, ... )
X |
a numeric vector that contains the time series that is assumed to follow an ARMA model ordered from past to present. |
p |
an integer value |
q |
an integer value |
include.mean |
a logical value; if set to |
n.start |
an integer that defines the 'burn-in' number
of observations for the simulated ARMA series via bootstrap; is set to
|
h |
an integer that represents the forecasting horizon; if |
it |
an integer that represents the total number of iterations, i.e.,
the number of simulated series; is set to |
pb |
a logical value; for |
cores |
an integer value >0 that states the number of (logical) cores to
use in the bootstrap (or |
alpha |
a numeric vector of length 1 with |
export.error |
a single logical value; if the argument is set to
|
plot |
a logical value that controls the graphical output; for
|
... |
additional arguments for the standard plot function, e.g.,
|
This function is part of the smoots
package and was implemented under
version 1.1.0. For a given time series ,
,
the point forecasts and the respective forecasting intervals will be
calculated. It is assumed that the series follows an ARMA(
) model
where and
are real
numbers (for
and
) and
are i.i.d. (identically and independently
distributed) random variables with zero mean and constant variance.
is equal to
.
The point forecasts and forecasting intervals for the future periods
will be obtained. With respect to the point
forecasts
, where
,
with for
and
for
will be applied.
The forecasting intervals on the other hand are obtained by a forward
bootstrap method that was introduced by Pan and Politis (2016) for
autoregressive models and extended by Lu and Wang (2020) for applications to
autoregressive-moving-average models.
For this purpose, let be the number of the current bootstrap
iteration. Based on the demeaned residuals of the initial ARMA estimation,
different innovation series
will
be sampled. The initial coefficient estimates and the sampled innovation
series are then used to simulate a variety of series
, from which again coefficient estimates will
be obtained. With these newly obtained estimates, proxy residual series
are calculated for
the original series
. Subsequently, point forecasts for the
time points
to
are obtained for each iteration
based on the original series
, the newly obtained
coefficient forecasts and the proxy residual series
.
Simultaneously, "true" forecasts, i.e., true future observations, are
simulated. Within each iteration, the difference between the simulated true
forecast and the bootstrapped point forecast is calculated and saved for each
future time point
to
. The result for these time
points are simulated empirical values of the forecasting error. Denote by
the quantile of the empirical distribution for the
future time point
. Given a predefined confidence level
alpha
, define
alpha
. The
bootstrapped forecasting interval is then
i.e., the forecasting intervals are given by the sum of the respective point forecasts and quantiles of the respective bootstrapped forecasting error distributions.
The function bootCast
allows for different adjustments to
the forecasting progress. At first, a vector with the values of the observed
time series ordered from past to present has to be passed to the argument
X
. Orders and
of the underlying ARMA process can be
defined via the arguments
p
and q
. If only one of these orders
is inserted by the user, the other order is automatically set to 0
. If
none of these arguments are defined, the function will choose orders based on
the Bayesian Information Criterion (BIC) for
. Via the logical argument
include.mean
the user can decide, whether to consider the mean of the
series within the estimation process. By means of n.start
, the number
of "burn-in" observations for the simulated ARMA processes can be regulated.
These observations are usually used for the processes to build up and then
omitted. Furthermore, the argument h
allows for the definition of the
maximum future time point . Point forecasts and forecasting
intervals will be returned for the time points
to
.
it
corresponds to the number of bootstrap iterations. We recommend a
sufficiently high number of repetitions for maximum accuracy of the results.
Another argument is alpha
, which is the equivalent of the confidence
level considered within the calculation of the forecasting intervals, i.e.,
the quantiles
alpha
and
alpha
of the bootstrapped forecasting error distribution
will be obtained.
Since this bootstrap approach needs a lot of computation time, especially for
series with high numbers of observations and when fitting models with many
parameters, parallel computation of the bootstrap iterations is enabled.
With cores
, the number of cores can be defined with an integer.
Nonetheless, for cores = NULL
, no cluster is created and therefore
the parallel computation is disabled. Note that the bootstrapped results are
fully reproducible for all cluster sizes. The progress of the bootstrap can
be observed in the R console, where a progress bar and the estimated
remaining time are displayed for pb = TRUE
.
If the argument export.error
is set to TRUE
, the output of
the function is a list instead of a matrix with additional information on
the simulated forecasting errors. For more information see the section
Value.
For simplicity, the function also incorporates the possibility to directly
create a plot of the output, if the argument plot
is set to
TRUE
. By the additional and optional arguments ...
, further
arguments of the standard plot function can be implemented to shape the
returned plot.
NOTE:
Within this function, the arima
function of the
stats
package with its method "CSS-ML"
is used throughout
for the estimation of ARMA models. Furthermore, to increase the performance,
C++ code via the Rcpp
and
RcppArmadillo
packages was
implemented. Also, the future
and
future.apply
packages are
considered for parallel computation of bootstrap iterations. The progress
of the bootstrap is shown via the
progressr
package.
The function returns a by
matrix with its columns
representing the future time points and the point forecasts, the lower
bounds of the forecasting intervals and the upper bounds of the
forecasting intervals as the rows. If the argument
plot
is set to
TRUE
, a plot of the forecasting results is created.
If export.error = TRUE
is selected, a list with the following
elements is returned instead.
the by
matrix forecasting matrix with point
forecasts and bounds of the forecasting intervals.
a it
by matrix, where each column represents a
future time point
; in each column the
respective
it
simulated forecasting errors are saved.
Dominik Schulz (Research Assistant) (Department of Economics, Paderborn
University),
Package Creator and Maintainer
Feng, Y., Gries, T. and Fritz, M. (2020). Data-driven local polynomial for the trend and its derivatives in economic time series. Journal of Nonparametric Statistics, 32:2, 510-533.
Feng, Y., Gries, T., Letmathe, S. and Schulz, D. (2019). The smoots package in R for semiparametric modeling of trend stationary time series. Discussion Paper. Paderborn University. Unpublished.
Feng, Y., Gries, T., Fritz, M., Letmathe, S. and Schulz, D. (2020). Diagnosing the trend and bootstrapping the forecasting intervals using a semiparametric ARMA. Discussion Paper. Paderborn University. Unpublished.
Lu, X., and Wang, L. (2020). Bootstrap prediction interval for ARMA models with unknown orders. REVSTAT–Statistical Journal, 18:3, 375-396.
Pan, L. and Politis, D. N. (2016). Bootstrap prediction intervals for linear, nonlinear and nonparametric autoregressions. In: Journal of Statistical Planning and Inference 177, pp. 1-27.
### Example 1: Simulated ARMA process ### # Function for drawing from a demeaned chi-squared distribution rchisq0 <- function(n, df, npc = 0) { rchisq(n, df, npc) - df } # Simulation of the underlying process n <- 2000 n.start = 1000 set.seed(23) X <- arima.sim(model = list(ar = c(1.2, -0.7), ma = 0.63), n = n, rand.gen = rchisq0, n.start = n.start, df = 3) + 13.1 # Quick application with low number of iterations # (not recommended in practice) result <- bootCast(X = X, p = 2, q = 1, include.mean = TRUE, n.start = n.start, h = 5, it = 10, cores = 2, plot = TRUE, lty = 3, col = "forestgreen", xlim = c(1950, 2005), type = "b", main = "Exemplary title", pch = "*") result ### Example 2: Application with more iterations ### ## Not run: result2 <- bootCast(X = X, p = 2, q = 1, include.mean = TRUE, n.start = n.start, h = 5, it = 10000, cores = 2, plot = TRUE, lty = 3, col = "forestgreen", xlim = c(1950, 2005), main = "Exemplary title") result2 ## End(Not run)
### Example 1: Simulated ARMA process ### # Function for drawing from a demeaned chi-squared distribution rchisq0 <- function(n, df, npc = 0) { rchisq(n, df, npc) - df } # Simulation of the underlying process n <- 2000 n.start = 1000 set.seed(23) X <- arima.sim(model = list(ar = c(1.2, -0.7), ma = 0.63), n = n, rand.gen = rchisq0, n.start = n.start, df = 3) + 13.1 # Quick application with low number of iterations # (not recommended in practice) result <- bootCast(X = X, p = 2, q = 1, include.mean = TRUE, n.start = n.start, h = 5, it = 10, cores = 2, plot = TRUE, lty = 3, col = "forestgreen", xlim = c(1950, 2005), type = "b", main = "Exemplary title", pch = "*") result ### Example 2: Application with more iterations ### ## Not run: result2 <- bootCast(X = X, p = 2, q = 1, include.mean = TRUE, n.start = n.start, h = 5, it = 10000, cores = 2, plot = TRUE, lty = 3, col = "forestgreen", xlim = c(1950, 2005), main = "Exemplary title") result2 ## End(Not run)
Asymptotically Unbiased Confidence Bounds
confBounds( obj, alpha = 0.95, p = c(0, 1, 2, 3), plot = TRUE, showPar = TRUE, rescale = TRUE, ... )
confBounds( obj, alpha = 0.95, p = c(0, 1, 2, 3), plot = TRUE, showPar = TRUE, rescale = TRUE, ... )
obj |
|
alpha |
the confidence level; a single numeric value between |
p |
the order of polynomial used for the parametric polynomial
regression that is conducted as a benchmark for the trend function;
must satisfy |
plot |
a logical value; for |
showPar |
set to |
rescale |
a single logical value; is set to |
... |
further arguments that can be passed to the |
This function is part of the smoots
package and was implemented under
version 1.1.0. The underlying theory is based on the additive nonparametric
regression function
where is the observed time series,
is the rescaled time
on the interval
,
is a smooth trend function and
are stationary errors with
and
short-range dependence.
The purpose of this function is the estimation of reasonable confidence intervals for the nonparametric trend function and its derivatives. The optimal bandwidth minimizes the Asymptotic Mean Integrated Squared Error (AMISE) criterion, however, local polynomial estimates are (usually) biased. The bias is then (approximately)
where is the order of the local polynomials,
is the
order of the asymptotically equivalent kernel,
is the order of the
of the trend function's derivative,
is the
-th order
derivative of the trend function and
.
is the
-th order asymptotically
equivalent kernel function for estimating
.
A renewed estimation with an adjusted bandwidth
, i.e., a
bandwidth with a smaller order than the optimal bandwidth, is conducted.
, where
is the optimal bandwidth, is implemented.
Following this idea, we have that
converges to
in distribution, where is the sum of autocovariances.
Consequently, the trend (or derivative) estimates are asymptotically unbiased
and normally distributed.
To make use of this function, an object of class smoots
can be given
as input that was created by either msmooth
,
tsmooth
or dsmooth
. Based on the optimal
bandwidth saved within obj
, an adjustment to the bandwidth is made so
that the estimates following the adjusted bandwidth are (relatively)
unbiased.
Based on the input argument alpha
, the level of confidence between
0
and 1
, the respective confidence bounds are calculated for
each observation point.
From the input argument obj
, the order of derivative is automatically
obtained. By means of the argument p
, an order of polynomial is
selected for a parametric regression of the trend function. This is only
meaningful, if the trend (and not its derivatives) is analyzed. Otherwise,
the argument is automatically dropped by the function. Furthermore, if
plot = TRUE
, a plot of the unbiased trend (or derivative) estimates
alongside the confidence bounds is created. If also showPar = TRUE
,
the estimated parametric trend (or parametric constant value for the
derivatives) is added to the confidence bound plot for comparison.
NOTE:
The values that are returned by the function are obtained with respect to
the rescaled time points on the interval . While the plot can be
adjusted and rescaled by means of a given vector with the actual time points,
the numeric output is not rescaled. For this purpose we refer the user to
the
rescale
function of the smoots
package.
This function implements C++ code by means of the
Rcpp
and
RcppArmadillo
packages for
better performance.
A plot is created in the plot window and a list with different components is returned.
a numeric vector of length 1; the level of confidence; input argument.
a numeric vector with one element that represents the adjusted bandwidth for the unbiased trend estimation.
a numeric vector with the estimates following the parametric
regression defined by p
that is conducted as a benchmark for the trend
function; for the trend's derivatives or for p = 0
, a constant value
is the benchmark; the values are obtained with respect to the rescaled time
points on the interval .
the number of observations.
a data frame with the three (numeric) columns ye.ub,
lower and upper; in ye.ub the unbiased trend
estimates, in lower the lower confidence bound and in upper
the upper confidence bound can be found; the values are obtained with respect
to the rescaled time points on the interval .
the order of the trend's derivative considered for the test.
Yuanhua Feng (Department of Economics, Paderborn University),
Author of the Algorithms
Website: https://wiwi.uni-paderborn.de/en/dep4/feng/
Dominik Schulz (Research Assistant) (Department of Economics, Paderborn
University),
Package Creator and Maintainer
Beran, J. and Feng, Y. (2002). Local polynomial fitting with long-memory, short-memory and antipersistent errors. Annals of the Institute of Statistical Mathematics, 54(2), 291-311.
Feng, Y., Gries, T. and Fritz, M. (2020). Data-driven local polynomial for the trend and its derivatives in economic time series. Journal of Nonparametric Statistics, 32:2, 510-533.
Feng, Y., Gries, T., Letmathe, S. and Schulz, D. (2019). The smoots package in R for semiparametric modeling of trend stationary time series. Discussion Paper. Paderborn University. Unpublished.
Feng, Y., Gries, T., Fritz, M., Letmathe, S. and Schulz, D. (2020). Diagnosing the trend and bootstrapping the forecasting intervals using a semiparametric ARMA. Discussion Paper. Paderborn University. Unpublished.
log_gdp <- log(smoots::gdpUS$GDP) est <- msmooth(log_gdp) confBounds(est)
log_gdp <- log(smoots::gdpUS$GDP) est <- msmooth(log_gdp) confBounds(est)
An information criterion is calculated for different orders of an autoregressive-moving-average (ARMA) model.
critMatrix( X, p.max = 5, q.max = 5, criterion = c("bic", "aic"), include.mean = TRUE )
critMatrix( X, p.max = 5, q.max = 5, criterion = c("bic", "aic"), include.mean = TRUE )
X |
a numeric vector that contains the observed time series ordered from past to present; the series is assumed to follow an ARMA process. |
p.max |
an integer value |
q.max |
an integer value |
criterion |
a character value that defines the information criterion
that will be calculated; the Bayesian Information Criterion ( |
include.mean |
a logical value; this argument regulates whether to
estimate the mean of the series ( |
This function is part of the smoots
package and was implemented under
version 1.1.0. The series passed to X
is assumed to follow an
ARMA() model. A
p.max + 1
by q.max + 1
matrix is
calculated for this series. More precisely, the criterion chosen via the
argument criterion
is calculated for all combinations of orders
and
.
Within the function, two information criteria are supported: the Bayesian Information Criterion (BIC) and Akaike's Information Criterion (AIC). The AIC is given by
where is the estimated
innovation variance,
and
are the ARMA orders and
is
the number of observations.
The BIC, on the other hand, is defined by
with being the number of estimated parameters and
being the estimated Log-Likelihood. Since the parameter
only differs with respect to the orders
and
for all
estimated models, the term
is reduced to
within the function. Exemplarily,
if the mean of the series is estimated as well, it is usually considered
within the parameter
when calculating the BIC.
However, since the mean is estimated for all models, not considering this
estimated parameter within the calculation of the BIC will reduce all BIC
values by the same amount of
. Therefore, the selection
via this simplified criterion is still valid, if the number of the estimated
parameters only differs with respect to
and
between the
models that the BIC is obtained for.
The optimal orders are considered to be the ones which minimize either the BIC or the AIC. The use of the BIC is however recommended, because the BIC is consistent, whereas the AIC is not.
NOTE:
Within this function, the arima
function of the
stats
package with its method "CSS-ML"
is used throughout for
the estimation of ARMA models.
The function returns a p.max + 1
by q.max + 1
matrix, where the
rows represent the AR orders from to
and the columns represent the MA orders from
to
. The values within the matrix are the values of
the previously selected information criterion for the different combinations
of
and
.
Dominik Schulz (Research Assistant) (Department of Economics, Paderborn
University),
Package Creator and Maintainer
## Not run: # Simulate an ARMA(2,1) process set.seed(23) X.sim <- stats::arima.sim(model = list(ar = c(1.2, -0.71), ma = 0.46), n = 1000) + 13.1 # Application of the function critMatrix(X.sim) # Result: Via the BIC, the orders p.opt = 2 and q.opt = 1 are selected. ## End(Not run)
## Not run: # Simulate an ARMA(2,1) process set.seed(23) X.sim <- stats::arima.sim(model = list(ar = c(1.2, -0.71), ma = 0.46), n = 1000) + 13.1 # Application of the function critMatrix(X.sim) # Result: Via the BIC, the orders p.opt = 2 and q.opt = 1 are selected. ## End(Not run)
A dataset that contains the daily financial data of the DAX from 1990 to July 2019 (currency in EUR).
dax
dax
A data frame with 7475 rows and 9 variables:
the observation year
the observation month
the observation day
the opening price of the day
the highest price of the day
the lowest price of the day
the closing price of the day
the adjusted closing price of the day
the traded volume
The data was obtained from Yahoo Finance (accessed: 2019-08-22).
This function runs through an iterative process in order to find the optimal bandwidth for the nonparametric estimation of the first or second derivative of the trend in an equidistant time series (with short-memory errors) and subsequently employs the obtained bandwidth via local polynomial regression.
dsmooth( y, d = c(1, 2), mu = c(0, 1, 2, 3), pp = c(1, 3), bStart.p = 0.15, bStart = 0.15 )
dsmooth( y, d = c(1, 2), mu = c(0, 1, 2, 3), pp = c(1, 3), bStart.p = 0.15, bStart = 0.15 )
y |
a numeric vector that contains the time series ordered from past to present. |
||||||||||
d |
an integer |
||||||||||
mu |
an integer
|
||||||||||
pp |
an integer |
||||||||||
bStart.p |
a numeric object that indicates the starting value of the
bandwidth for the iterative process for the calculation of |
||||||||||
bStart |
a numeric object that indicates the starting value of the
bandwidth for the iterative process; should be |
The trend's derivative is estimated based on the additive nonparametric regression model for an equidistant time series
where is the observed time series,
is the rescaled time
on the interval
,
is a smooth and deterministic
trend function and
are stationary errors with
and short-range dependence (see also Beran and Feng,
2002). With this function, the first or second derivative of
can be estimated without a parametric model assumption for the error series.
The iterative-plug-in (IPI) algorithm, which numerically minimizes the Asymptotic Mean Squared Error (AMISE), was proposed by Feng, Gries and Fritz (2020).
Define ,
and
, where
is the order of the polynomial,
is the order of the asymptotically equivalent kernel,
is the order of the trend function's derivative,
,
is the variance factor and
the
-th order equivalent kernel
obtained for the estimation of
in the interior.
is the
-th order derivative (
) of the nonparametric trend.
Furthermore, we define
and
with being the bandwidth and
being the number of
observations. The AMISE is then
The variance factor is first obtained from a pilot-estimation of
the time series' nonparametric trend (
) with polynomial order
. The estimate is then plugged into the iterative procedure for
estimating the first or second derivative (
or
).
For further details on the asymptotic theory or the algorithm, we refer the
user to Feng, Fritz and Gries (2020) and Feng et al. (2019).
The function itself is applicable in the following way: Based on a data input
y
, an order of polynomial pp
for the variance factor estimation
procedure, a starting value for the relative bandwidth bStart.p
in the
variance factor estimation procedure, a kernel function defined by the
smoothness parameter mu
and a starting value for the relative
bandwidth bStart
in the bandwidth estimation procedure, an optimal
bandwidth is numerically calculated for the trend's derivative of order
d
. In fact, aside from the input vector y
, every argument has a
default setting that can be adjusted for the individual case. However, it is
recommended to initially use the default values for the estimation of the
first derivative and adjust the argument d
to d = 2
for the
estimation of the second derivative. Following Feng, Gries and Fritz (2020),
the initial bandwidth does not affect the resulting optimal bandwidth in
theory. However in practice, local minima of the AMISE can influence the
results. Therefore, the default starting bandwidth is set to 0.15
, the
suggested starting bandwidth by Feng, Gries and Fritz (2020) for the
data-driven estimation of the first derivative. The recommended initial
bandwidth for the second derivative, however, is 0.2
and not
0.15
. Thus, if the algorithm does not give suitable results
(especially for d = 2
), the adjustment of the initial bandwidth might
be a good starting point. Analogously, the default starting bandwidth for the
trend estimation for the variance factor is bStart.p = 0.15
, although
according to Feng, Gries and Fritz (2020), bStart.p = 0.1
is suggested
for pp = 1
and bStart.p = 0.2
for pp = 3
. The default is
therefore a compromise between the two suggested values. For more specific
information on the input arguments consult the section Arguments.
After the bandwidth estimation, the nonparametric derivative of the series is calculated with respect to the obtained optimal bandwidth by means of a local polynomial regression. The output object is then a list that contains, among other components, the original time series, the estimates of the derivative and the estimated optimal bandwidth.
The default print method for this function delivers key numbers such as
the iteration steps and the generated optimal bandwidth rounded to the fourth
decimal. The exact numbers and results such as the estimated nonparametric
trend series are saved within the output object and can be addressed via the
$
sign.
NOTE:
The estimates are obtained for the rescaled time points on the interval
. Therefore, the estimated derivatives might not reflect the
derivatives for the actual time points. To rescale them, we refer the
user to the
rescale
function of the smoots
package.
With package version 1.1.0, this function implements C++ code by means
of the Rcpp
and
RcppArmadillo
packages for
better performance.
The function returns a list with different components:
the optimal bandwidth chosen by the IPI-algorithm.
the starting bandwidth for the local polynomial regression based derivative estimation procedure; input argument.
the starting bandwidth for the nonparametric trend estimation that leads to the variance factor estimate; input argument.
indicates whether an enlarged bandwidth was used for the variance
factor estimation or not; it is always set to "Y"
(yes) for this
function.
the estimated variance factor; in contrast to the definitions
given in the Details section, this object actually contains an
estimated value of , i.e. it corresponds to the estimated sum
of autocovariances.
the inflation rate setting.
the bandwidths of the single iterations steps
the estimation method for the variance factor estimation; it is
always estimated nonparametrically ("NP"
) within this function.
the smoothness parameter of the second order kernel; input argument.
the number of observations.
the total number of iterations until convergence.
the original input series; input argument.
the order of polynomial for the local polynomial regression used within derivative estimation procedure.
the order of polynomial for the local polynomial regression used in the variance factor estimation; input argument.
the considered order of the trend's derivative; input argument
d
.
the weighting system matrix used within the local polynomial
regression; this matrix is a condensed version of a complete weighting system
matrix; in each row of ws
, the weights for conducting the smoothing
procedure at a specific observation time point can be found; the first
rows, where
corresponds to the number of
observations,
is the bandwidth considered for smoothing and
denotes the integer part, contain the weights at the
left-hand boundary points; the weights in row
are representative for the estimation at all
interior points and the remaining rows contain the weights for the right-hand
boundary points; each row has exactly
elements,
more specifically the weights for observations of the nearest
time points; moreover, the weights are normalized,
i.e. the weights are obtained under consideration of the time points
, where
.
the nonparametric estimates of the derivative for the rescaled
time points on the interval .
Yuanhua Feng (Department of Economics, Paderborn University),
Author of the Algorithms
Website: https://wiwi.uni-paderborn.de/en/dep4/feng/
Dominik Schulz (Research Assistant) (Department of Economics, Paderborn
University),
Package Creator and Maintainer
Feng, Y., Gries, T. and Fritz, M. (2020). Data-driven local polynomial for the trend and its derivatives in economic time series. Journal of Nonparametric Statistics, 32:2, 510-533.
Feng, Y., Gries, T., Letmathe, S. and Schulz, D. (2019). The smoots package in R for semiparametric modeling of trend stationary time series. Discussion Paper. Paderborn University. Unpublished.
# Logarithm of test data test_data <- gdpUS y <- log(test_data$GDP) t <- seq(from = 1947, to = 2019.25, by = 0.25) # Applied dsmooth function for the trend's first derivative result_d <- dsmooth(y, d = 1, mu = 1, pp = 1, bStart.p = 0.1, bStart = 0.15) estim <- result_d$ye # Plot of the results plot(t, estim, xlab = "Year", ylab = "First derivative", type = "l", main = paste0("Estimated first derivative of the trend for log-quarterly ", "US-GDP, Q1 1947 - Q2 2019"), cex.axis = 0.8, cex.main = 0.8, cex.lab = 0.8, bty = "n") # Print result result_d
# Logarithm of test data test_data <- gdpUS y <- log(test_data$GDP) t <- seq(from = 1947, to = 2019.25, by = 0.25) # Applied dsmooth function for the trend's first derivative result_d <- dsmooth(y, d = 1, mu = 1, pp = 1, bStart.p = 0.1, bStart = 0.15) estim <- result_d$ye # Plot of the results plot(t, estim, xlab = "Year", ylab = "First derivative", type = "l", main = paste0("Estimated first derivative of the trend for log-quarterly ", "US-GDP, Q1 1947 - Q2 2019"), cex.axis = 0.8, cex.main = 0.8, cex.lab = 0.8, bty = "n") # Print result result_d
Generic function which extracts fitted values from a smoots
class
object. Both fitted
and fitted.values
can be called.
## S3 method for class 'smoots' fitted(object, ...)
## S3 method for class 'smoots' fitted(object, ...)
object |
an object from the |
... |
included for consistency with the generic function. |
Fitted values extracted from a smoots
class object.
Sebastian Letmathe (Scientific Employee) (Department of Economics,
Paderborn University),
A dataset that contains the (seasonally adjusted) Gross Domestic Product of the US from the first quarter of 1947 to the second quarter of 2019
gdpUS
gdpUS
A data frame with 290 rows and 3 variables:
the observation year
the observation quarter in the given year
the Gross Domestic Product of the US in billions of chained 2012 US Dollars (annual rate)
The data was obtained from the Federal Reserve Bank of St. Louis (accessed: 2019-09-01).
https://fred.stlouisfed.org/series/GDPC1
This function is an R function for estimating the trend function and its derivatives in an equidistant time series with local polynomial regression and a fixed bandwidth given beforehand.
gsmooth(y, v = 0, p = v + 1, mu = 1, b = 0.15, bb = c(0, 1))
gsmooth(y, v = 0, p = v + 1, mu = 1, b = 0.15, bb = c(0, 1))
y |
a numeric vector that contains the time series data ordered from past to present. |
|||||||||||||||||||||||||
v |
an integer
|
|||||||||||||||||||||||||
p |
an integer Exemplary for
|
|||||||||||||||||||||||||
mu |
an integer
|
|||||||||||||||||||||||||
b |
a real number |
|||||||||||||||||||||||||
bb |
can be set to
|
The trend or its derivatives are estimated based on the additive nonparametric regression model for an equidistant time series
where is the observed time series,
is the rescaled time
on the interval
,
is a smooth and deterministic
trend function and
are stationary errors with
(see also Beran and Feng, 2002).
This function is part of the package smoots
and is used in
the field of analyzing equidistant time series data. It applies the local
polynomial regression method to the input data with an arbitrarily
selectable bandwidth. By these means, the trend as well as its derivatives
can be estimated nonparametrically, even though the result will strongly
depend on the bandwidth given beforehand as an input.
NOTE:
The estimates are obtained with regard to the rescaled time points on the
interval . Thus, if
, the estimates might not
reflect the values for the actual time points. To rescale the estimates, we
refer the user to the
rescale
function of the smoots
package.
With package version 1.1.0, this function implements C++ code by means
of the Rcpp
and
RcppArmadillo
packages for
better performance.
The output object is a list with different components:
the chosen (relative) bandwidth; input argument.
the chosen bandwidth option at the boundaries; input argument.
the chosen smoothness parameter for the second order kernel; input argument.
the number of observations.
the original input series; input argument.
the chosen order of polynomial; input argument.
a vector with the estimated residual series; is set to NULL
for v > 0
.
the order of derivative; input argument.
the weighting system matrix used within the local polynomial
regression; this matrix is a condensed version of a complete weighting system
matrix; in each row of ws
, the weights for conducting the smoothing
procedure at a specific observation time point can be found; the first
rows, where
corresponds to the number of
observations,
is the bandwidth considered for smoothing and
denotes the integer part, contain the weights at the
left-hand boundary points; the weights in row
are representative for the estimation at all
interior points and the remaining rows contain the weights for the right-hand
boundary points; each row has exactly
elements,
more specifically the weights for observations of the nearest
time points; moreover, the weights are normalized,
i.e. the weights are obtained under consideration of the time points
, where
.
a vector with the estimates of the selected nonparametric order of
derivative on the rescaled time interval .
Yuanhua Feng (Department of Economics, Paderborn University),
Author of the Algorithms
Website: https://wiwi.uni-paderborn.de/en/dep4/feng/
Dominik Schulz (Research Assistant) (Department of Economics, Paderborn
University),
Package Creator and Maintainer
Beran, J. and Feng, Y. (2002). Local polynomial fitting with long-memory, short-memory and antipersistent errors. Annals of the Institute of Statistical Mathematics, 54(2), 291-311.
Feng, Y., Gries, T. and Fritz, M. (2020). Data-driven local polynomial for the trend and its derivatives in economic time series. Journal of Nonparametric Statistics, 32:2, 510-533.
Feng, Y., Gries, T., Letmathe, S. and Schulz, D. (2019). The smoots package in R for semiparametric modeling of trend stationary time series. Discussion Paper. Paderborn University. Unpublished.
# Logarithm of test data test_data <- gdpUS y <- log(test_data$GDP) # Applied gsmooth function for the trend with two different bandwidths results1 <- gsmooth(y, v = 0, p = 1, mu = 1, b = 0.28, bb = 1) results2 <- gsmooth(y, v = 0, p = 1, mu = 1, b = 0.11, bb = 1) trend1 <- results1$ye trend2 <- results2$ye # Plot of the results t <- seq(from = 1947, to = 2019.25, by = 0.25) plot(t, y, type = "l", xlab = "Year", ylab = "log(US-GDP)", bty = "n", lwd = 2, main = "Estimated trend for log-quarterly US-GDP, Q1 1947 - Q2 2019") points(t, trend1, type = "l", col = "red", lwd = 1) points(t, trend2, type = "l", col = "blue", lwd = 1) legend("bottomright", legend = c("Trend (b = 0.28)", "Trend (b = 0.11)"), fill = c("red", "blue"), cex = 0.6) title(sub = expression(italic("Figure 1")), col.sub = "gray47", cex.sub = 0.6, adj = 0)
# Logarithm of test data test_data <- gdpUS y <- log(test_data$GDP) # Applied gsmooth function for the trend with two different bandwidths results1 <- gsmooth(y, v = 0, p = 1, mu = 1, b = 0.28, bb = 1) results2 <- gsmooth(y, v = 0, p = 1, mu = 1, b = 0.11, bb = 1) trend1 <- results1$ye trend2 <- results2$ye # Plot of the results t <- seq(from = 1947, to = 2019.25, by = 0.25) plot(t, y, type = "l", xlab = "Year", ylab = "log(US-GDP)", bty = "n", lwd = 2, main = "Estimated trend for log-quarterly US-GDP, Q1 1947 - Q2 2019") points(t, trend1, type = "l", col = "red", lwd = 1) points(t, trend2, type = "l", col = "blue", lwd = 1) legend("bottomright", legend = c("Trend (b = 0.28)", "Trend (b = 0.11)"), fill = c("red", "blue"), cex = 0.6) title(sub = expression(italic("Figure 1")), col.sub = "gray47", cex.sub = 0.6, adj = 0)
This function estimates the nonparametric trend function in an equidistant time series with Nadaraya-Watson kernel regression.
knsmooth(y, mu = 1, b = 0.15, bb = c(0, 1))
knsmooth(y, mu = 1, b = 0.15, bb = c(0, 1))
y |
a numeric vector that contains the time series data ordered from past to present. |
||||||||||||
mu |
an integer
|
||||||||||||
b |
a real number |
||||||||||||
bb |
can be set to
|
The trend is estimated based on the additive nonparametric regression model for an equidistant time series
where is the observed time series,
is the rescaled time
on the interval
,
is a smooth and deterministic
trend function and
are stationary errors with
.
This function is part of the package smoots
and is used for
the estimation of trends in equidistant time series. The applied method
is a kernel regression with arbitrarily selectable second order
kernel, relative bandwidth and boundary method. Especially the chosen
bandwidth has a strong impact on the final result and has thus to be
selected carefully. This approach is not recommended by the authors of this
package.
The output object is a list with different components:
the chosen (relative) bandwidth; input argument.
the chosen bandwidth option at the boundaries; input argument.
the chosen smoothness parameter for the second order kernel; input argument.
the number of observations.
the original input series; input argument.
a vector with the estimated residual series.
a vector with the estimates of the nonparametric trend.
Yuanhua Feng (Department of Economics, Paderborn University),
Author of the Algorithms
Website: https://wiwi.uni-paderborn.de/en/dep4/feng/
Dominik Schulz (Research Assistant) (Department of Economics, Paderborn
University),
Package Creator and Maintainer
Feng, Y. (2009). Kernel and Locally Weighted Regression. Verlag für Wissenschaft und Forschung, Berlin.
# Logarithm of test data test_data <- gdpUS y <- log(test_data$GDP) #Applied knmooth function for the trend with two different bandwidths trend1 <- knsmooth(y, mu = 1, b = 0.28, bb = 1)$ye trend2 <- knsmooth(y, mu = 1, b = 0.05, bb = 1)$ye # Plot of the results t <- seq(from = 1947, to = 2019.25, by = 0.25) plot(t, y, type = "l", xlab = "Year", ylab = "log(US-GDP)", bty = "n", lwd = 2, main = "Estimated trend for log-quarterly US-GDP, Q1 1947 - Q2 2019") points(t, trend1, type = "l", col = "red", lwd = 1) points(t, trend2, type = "l", col = "blue", lwd = 1) legend("bottomright", legend = c("Trend (b = 0.28)", "Trend (b = 0.05)"), fill = c("red", "blue"), cex = 0.6) title(sub = expression(italic("Figure 1")), col.sub = "gray47", cex.sub = 0.6, adj = 0)
# Logarithm of test data test_data <- gdpUS y <- log(test_data$GDP) #Applied knmooth function for the trend with two different bandwidths trend1 <- knsmooth(y, mu = 1, b = 0.28, bb = 1)$ye trend2 <- knsmooth(y, mu = 1, b = 0.05, bb = 1)$ye # Plot of the results t <- seq(from = 1947, to = 2019.25, by = 0.25) plot(t, y, type = "l", xlab = "Year", ylab = "log(US-GDP)", bty = "n", lwd = 2, main = "Estimated trend for log-quarterly US-GDP, Q1 1947 - Q2 2019") points(t, trend1, type = "l", col = "red", lwd = 1) points(t, trend2, type = "l", col = "blue", lwd = 1) legend("bottomright", legend = c("Trend (b = 0.28)", "Trend (b = 0.05)"), fill = c("red", "blue"), cex = 0.6) title(sub = expression(italic("Figure 1")), col.sub = "gray47", cex.sub = 0.6, adj = 0)
Point forecasts and the respective forecasting intervals for trend-stationary time series are calculated.
modelCast( obj, p = NULL, q = NULL, h = 1, method = c("norm", "boot"), alpha = 0.95, it = 10000, n.start = 1000, pb = TRUE, cores = future::availableCores(), np.fcast = c("lin", "const"), export.error = FALSE, plot = FALSE, ... )
modelCast( obj, p = NULL, q = NULL, h = 1, method = c("norm", "boot"), alpha = 0.95, it = 10000, n.start = 1000, pb = TRUE, cores = future::availableCores(), np.fcast = c("lin", "const"), export.error = FALSE, plot = FALSE, ... )
obj |
an object of class |
p |
an integer value |
q |
an integer value |
h |
an integer that represents the forecasting horizon; if |
method |
a character object; defines the method used for the calculation
of the forecasting intervals; with |
alpha |
a numeric vector of length 1 with |
it |
an integer that represents the total number of iterations, i.e.,
the number of simulated series; is set to |
n.start |
an integer that defines the 'burn-in' number
of observations for the simulated ARMA series via bootstrap; is set to
|
pb |
a logical value; for |
cores |
an integer value >0 that states the number of (logical) cores to
use in the bootstrap (or |
np.fcast |
a character object; defines the forecasting method used
for the nonparametric trend; for |
export.error |
a single logical value; if the argument is set to
|
plot |
a logical value that controls the graphical output; for
|
... |
additional arguments for the standard plot function, e.g.,
|
This function is part of the smoots package and was implemented under version 1.1.0. The point forecasts and forecasting intervals are obtained based on the additive nonparametric regression model
where is the observed time series with equidistant design,
is the rescaled time on the interval
,
is a smooth trend function and
are stationary errors with
and
short-range dependence (see also Beran and Feng, 2002). Thus, we assume
to be a trend-stationary time series. Furthermore, we assume
that the rest term
follows an ARMA(
)
model
where ,
, and
,
, are real numbers and
the random variables
are
i.i.d. (identically and independently distributed) with
zero mean and constant variance.
The point forecasts and forecasting intervals for the future periods
will be obtained. With respect to the point
forecasts of
, i.e.,
, where
,
with for
and
for
will be applied. In practice, this procedure will
not be applied directly to
but to
.
The point forecasts of the nonparametric trend are simply obtained following the proposal by Fritz et al. (forthcoming) by
where is a dummy variable that is either equal to the constant value
or
. Consequently, if
,
, i.e., the last trend estimate, is
used as a constant estimate for the future. However, if
, the
trend is extrapolated linearly. The point forecast for the whole component
model is then given by
i.e., it is equal to the sum of the point forecasts of the individual components.
Equivalently to the point forecasts, the forecasting intervals are the sum
of the forecasting intervals of the individual components. To simplify the
process, the forecasting error in ,
which is of order
, is not considered (see Fritz et al.
(forthcoming)), i.e., only the forecasting intervals with respect to the
rest term
will be calculated.
If the distribution of the innovations is non-normal or generally not further
specified, bootstrapping the forecasting intervals is recommended. If they
are however normally distributed or if it is at least assumed that they are,
the forecasting errors are also approximately normally distributed with a
quickly obtainable variance. For further details on the bootstrapping
method, we refer the readers to bootCast
, whereas more
information on the calculation under normality can be found at
normCast
.
In order to apply the function, a smoots
object that was generated as
the result of a trend estimation process needs to be passed to the argument
obj
. The arguments p
and q
represent the orders of the
of the ARMA() model that the error term
is assumed to follow. If both arguments are
set to
NULL
, which is the default setting, orders will be selected
according to the Bayesian Information Criterion (BIC) for all possible
combinations of . Furthermore, the forecasting
horizon can be adjusted by means of the argument
h
, so that point
forecasts and forecasting intervals will be obtained for all time points
.
The function also allows for two calculation approaches for the forecasting
intervals. Via the argument method
, intervals
can be obtained under the assumption that the ARMA innovations are normally
distributed (method = "norm"
). Alternatively, bootstrapped intervals
can be obtained for unknown innovation distributions that are clearly
non-Gaussian (method = "boot"
).
Another argument is alpha
. By passing a value
to this argument, the (alpha
)-percent confidence level for
the forecasting intervals can be defined. If method = "boot"
is
selected, the additional arguments it
and n.start
can be
adjusted. More specifically, it
regulates the number of iterations of
the bootstrap, whereas n.start
sets the number of 'burn-in'
observations in the simulated ARMA processes within the bootstrap that are
omitted.
Since this bootstrap approach for method = "boot"
generally needs a
lot of computation time, especially for
series with high numbers of observations and when fitting models with many
parameters, parallel computation of the bootstrap iterations is enabled.
With cores
, the number of cores can be defined with an integer.
Nonetheless, for cores = NULL
, no cluster is created and therefore
the parallel computation is disabled. Note that the bootstrapped results are
fully reproducible for all cluster sizes. The progress of the bootstrap can
be observed in the R console, where a progress bar and the estimated
remaining time are displayed for pb = TRUE
.
Moreover, the argument np.fcast
allows to set the forecasting method
for the nonparametric trend function. As previously discussed, the two
options are a linear extrapolation of the trend (np.fcast = "lin"
) and
a constant continuation of the last estimated value of the trend
(np.fcast = "const"
).
The function also implements the option to automatically create a plot of
the forecasting results for plot = TRUE
. This includes the feature
to pass additional arguments of the standard plot function to
modelCast
(see also the section 'Examples').
NOTE:
Within this function, the arima
function of the
stats
package with its method "CSS-ML"
is used throughout
for the estimation of ARMA models. Furthermore, to increase the performance,
C++ code via the Rcpp
and
RcppArmadillo
packages was
implemented. Also, the future
and
future.apply
packages are
considered for parallel computation of bootstrap iterations. The progress
of the bootstrap is shown via the
progressr
package.
The function returns a by
matrix with its columns
representing the future time points and the point forecasts, the lower
bounds of the forecasting intervals and the upper bounds of the
forecasting intervals as the rows. If the argument
plot
is set to
TRUE
, a plot of the forecasting results is created.
#'If export.error = TRUE
is selected, a list with the following
elements is returned instead.
the by
forecasting matrix with point forecasts
and bounds of the forecasting intervals.
an it
by matrix, where each column
represents a future time point
; in each column
the respective
it
simulated forecasting errors are saved.
Yuanhua Feng (Department of Economics, Paderborn University),
Author of the Algorithms
Website: https://wiwi.uni-paderborn.de/en/dep4/feng/
Dominik Schulz (Research Assistant) (Department of Economics, Paderborn
University),
Package Creator and Maintainer
Beran, J. and Feng, Y. (2002). Local polynomial fitting with long-memory, short-memory and antipersistent errors. Annals of the Institute of Statistical Mathematics, 54(2), 291-311.
Feng, Y., Gries, T. and Fritz, M. (2020). Data-driven local polynomial for the trend and its derivatives in economic time series. Journal of Nonparametric Statistics, 32:2, 510-533.
Feng, Y., Gries, T., Letmathe, S. and Schulz, D. (2019). The smoots package in R for semiparametric modeling of trend stationary time series. Discussion Paper. Paderborn University. Unpublished.
Feng, Y., Gries, T., Fritz, M., Letmathe, S. and Schulz, D. (2020). Diagnosing the trend and bootstrapping the forecasting intervals using a semiparametric ARMA. Discussion Paper. Paderborn University. Unpublished.
Fritz, M., Forstinger, S., Feng, Y., and Gries, T. (forthcoming). Forecasting economic growth processes for developing economies. Unpublished.
X <- log(smoots::gdpUS$GDP) NPest <- smoots::msmooth(X) modelCast(NPest, h = 5, plot = TRUE, xlim = c(261, 295), type = "b", col = "deepskyblue4", lty = 3, pch = 20, main = "Exemplary title")
X <- log(smoots::gdpUS$GDP) NPest <- smoots::msmooth(X) modelCast(NPest, h = 5, plot = TRUE, xlim = c(261, 295), type = "b", col = "deepskyblue4", lty = 3, pch = 20, main = "Exemplary title")
This function runs an iterative plug-in algorithm to find the optimal bandwidth for the estimation of the nonparametric trend in equidistant time series (with short memory errors) and then employs the resulting bandwidth via either local polynomial or kernel regression.
msmooth( y, p = c(1, 3), mu = c(0, 1, 2, 3), bStart = 0.15, alg = c("A", "B", "N", "NA", "NAM", "NM", "O", "OA", "OAM", "OM"), method = c("lpr", "kr") )
msmooth( y, p = c(1, 3), mu = c(0, 1, 2, 3), bStart = 0.15, alg = c("A", "B", "N", "NA", "NAM", "NM", "O", "OA", "OAM", "OM"), method = c("lpr", "kr") )
y |
a numeric vector that contains the input time series ordered from past to present. |
||||||||||||||||||||||
p |
an integer |
||||||||||||||||||||||
mu |
an integer
|
||||||||||||||||||||||
bStart |
a numeric object that indicates the starting value of the
bandwidth for the iterative process; should be |
||||||||||||||||||||||
alg |
a control parameter (as character) that indicates the
corresponding algorithm used (set to
It is proposed to use |
||||||||||||||||||||||
method |
the smoothing approach; |
The trend is estimated based on the additive nonparametric regression model for an equidistant time series
where is the observed time series,
is the rescaled time
on the interval
,
is a smooth and deterministic
trend function and
are stationary errors with
and short-range dependence (see also Beran and Feng,
2002). With this function
can be estimated without a parametric
model assumption for the error series. Thus, after estimating and removing
the trend, any suitable parametric model, e.g. an ARMA(
) model, can
be fitted to the residuals (see
arima
).
The iterative-plug-in (IPI) algorithm, which numerically minimizes the Asymptotic Mean Squared Error (AMISE), was proposed by Feng, Gries and Fritz (2020).
Define ,
and
, where
is the order of the polynomial,
is the order of the asymptotically equivalent kernel,
is the order of the trend function's derivative,
,
is the variance factor and
the
-th order equivalent kernel
obtained for the estimation of
in the interior.
is the
-th order derivative (
) of the nonparametric trend.
Furthermore, we define
and
with being the bandwidth and
being the number of
observations. The AMISE is then
The function calculates suitable estimates for , the variance
factor, and
over different iterations. In each
iteration, a bandwidth is obtained in accordance with the AMISE that once
more serves as an input for the following iteration. The process repeats
until either convergence or the 40th iteration is reached. For further
details on the asymptotic theory or the algorithm, please consult Feng,
Gries and Fritz (2020) or Feng et al. (2019).
To apply the function, only few arguments are needed: a data input y
,
an order of polynomial p
, a kernel function defined by the smoothness
parameter mu
, a starting value for the relative bandwidth
bStart
and a final smoothing method method
.
In fact, aside from the input vector y
, every argument has a default
setting that can be adjusted for the individual case. It is recommended to
initially use the default values for p
, alg
and
bStart
and adjust them in the rare case of the resulting optimal
bandwidth being either too small or too large. Theoretically, the
initial bandwidth does not affect the selected optimal bandwidth. However, in
practice local minima of the AMISE might exist and influence the selected
bandwidth. Therefore, the default setting is bStart = 0.15
, which is a
compromise between the starting values bStart = 0.1
for p = 1
and bStart = 0.2
for p = 3
that were proposed by Feng, Gries
and Fritz (2020). In the rare case of a clearly unsuitable optimal bandwidth,
a starting bandwidth that differs from the default value is a first
possible approach to obtain a better result. Other argument adjustments can
be tried as well. For more specific information on the input arguments
consult the section Arguments.
When applying the function, an optimal bandwidth is obtained based on the
IPI algorithm proposed by Feng, Gries and Fritz (2020). In a second step,
the nonparametric trend of the series is calculated with respect
to the chosen bandwidth and the selected regression method (lpf
or
kr
). It is notable that p
is automatically set to 1 for
method = "kr"
. The output object is then a list that contains, among
other components, the original time series, the estimated trend values and
the series without the trend.
The default print method for this function delivers key numbers such as
the iteration steps and the generated optimal bandwidth rounded to the fourth
decimal. The exact numbers and results such as the estimated nonparametric
trend series are saved within the output object and can be addressed via the
$
sign.
NOTE:
With package version 1.1.0, this function implements C++ code by means
of the Rcpp
and
RcppArmadillo
packages for
better performance.
The function returns a list with different components:
the Bayesian Information Criterion of the optimal AR()
model when estimating the variance factor via autoregressive models
(if calculated; calculated for
alg = "OA"
and alg = "NA"
).
the Bayesian Information Criterion of the optimal
ARMA() model when estimating the variance factor via
autoregressive-moving-average models (if calculated; calculated for
alg = "OAM"
and alg = "NAM"
).
the percentage of omitted observations on each side of the observation period; always equal to 0.05.
the optimal bandwidth chosen by the IPI-algorithm.
the boundary bandwidth method used within the IPI; always equal to 1.
the starting value of the (relative) bandwidth; input argument.
indicates whether an enlarged bandwidth was used for the variance factor estimation or not; depends on the chosen algorithm.
the estimated variance factor; in contrast to the definitions
given in the Details section, this object actually contains an
estimated value of , i.e. it corresponds to the estimated sum
of autocovariances.
the estimated variance factor obtained by estimation of
autoregressive models (if calculated; alg = "OA"
or "NA"
).
the estimated variance factor obtained by estimation of
autoregressive-moving-average models (if calculated; calculated for
alg = "OAM"
and alg = "NAM"
).
the estimated variance factor obtained by Lag-Window Spectral
Density Estimation following Bühlmann (1996) (if calculated; calculated for
algorithms "A"
, "B"
, "O"
and "N"
).
the estimated variance factor obtained by estimation of
moving-average models (if calculated; calculated for alg = "OM"
and
alg = "NM"
).
the estimated value of .
the setting for the inflation rate according to the chosen algorithm.
the bandwidths of the single iterations steps
the optimal bandwidth for the lag-window spectral density
estimation (if calculated; calculated for algorithms "A"
, "B"
,
"O"
and "N"
).
the Bayesian Information Criterion of the optimal MA()
model when estimating the variance factor via moving-average models (if
calculated; calculated for
alg = "OM"
and alg = "NM"
).
the estimation method for the variance factor estimation; depends on the chosen algorithm.
the smoothness parameter of the second order kernel; input argument.
the number of observations.
the total number of iterations until convergence.
the original input series; input argument.
the order p of the optimal AR() or ARMA(
) model
when estimating the variance factor via autoregressive or
autoregressive-moving average models (if calculated; calculated for
alg = "OA"
, alg = "NA"
, alg = "OAM"
and
alg = "NAM"
).
the order of polynomial used in the IPI-algorithm; also used for the
final smoothing, if method = "lpr"
; input argument.
the order of the optimal MA(
) or ARMA(
)
model when estimating the variance factor via moving-average or
autoregressive-moving average models (if calculated; calculated for
alg = "OM"
,
alg = "NM"
, alg = "OAM"
and alg = "NAM"
).
the estimated residual series.
the considered order of derivative of the trend; is always zero for this function.
the weighting system matrix used within the local polynomial
regression; this matrix is a condensed version of a complete weighting system
matrix; in each row of ws
, the weights for conducting the smoothing
procedure at a specific observation time point can be found; the first
rows, where
corresponds to the number of
observations,
is the bandwidth considered for smoothing and
denotes the integer part, contain the weights at the
left-hand boundary points; the weights in row
are representative for the estimation at all
interior points and the remaining rows contain the weights for the right-hand
boundary points; each row has exactly
elements,
more specifically the weights for observations of the nearest
time points; moreover, the weights are normalized,
i.e. the weights are obtained under consideration of the time points
, where
.
the nonparametric estimates of the trend.
Yuanhua Feng (Department of Economics, Paderborn University),
Author of the Algorithms
Website: https://wiwi.uni-paderborn.de/en/dep4/feng/
Dominik Schulz (Research Assistant) (Department of Economics, Paderborn
University),
Package Creator and Maintainer
Beran, J. and Feng, Y. (2002). Local polynomial fitting with long-memory, short-memory and antipersistent errors. Annals of the Institute of Statistical Mathematics, 54(2), 291-311.
Bühlmann, P. (1996). Locally adaptive lag-window spectral estimation. Journal of Time Series Analysis, 17(3), 247-270.
Feng, Y., Gries, T. and Fritz, M. (2020). Data-driven local polynomial for the trend and its derivatives in economic time series. Journal of Nonparametric Statistics, 32:2, 510-533.
Feng, Y., Gries, T., Letmathe, S. and Schulz, D. (2019). The smoots package in R for semiparametric modeling of trend stationary time series. Discussion Paper. Paderborn University. Unpublished.
### Example 1: US-GDP ### # Logarithm of test data # -> the logarithm of the data is assumed to follow the additive model test_data <- gdpUS y <- log(test_data$GDP) # Applied msmooth function for the trend results <- msmooth(y, p = 1, mu = 1, bStart = 0.1, alg = "A", method = "lpr") res <- results$res ye <- results$ye # Plot of the results t <- seq(from = 1947, to = 2019.25, by = 0.25) matplot(t, cbind(y, ye), type = "ll", lty = c(3, 1), col = c(1, "red"), xlab = "Years", ylab = "Log-Quartlery US-GDP", main = "Log-Quarterly US-GDP vs. Trend, Q1 1947 - Q2 2019") legend("bottomright", legend = c("Original series", "Estimated trend"), fill = c(1, "red"), cex = 0.7) results ## Not run: ### Example 2: German Stock Index ### # The following procedure can be considered, if (log-)returns are assumed # to follow a model from the general class of semiparametric GARCH-type # models (including Semi-GARCH, Semi-Log-GARCH and Semi-APARCH models among # others) with a slowly changing variance over time due to a deterministic, # nonparametric scale function. # Obtain the logarithm of the squared returns returns <- diff(log(dax$Close)) # (log-)returns rt <- returns - mean(returns) # demeaned (log-)returns yt <- log(rt^2) # logarithm of the squared returns # Apply 'smoots' function to the log-data, because the logarithm of # the squared returns follows an additive model with a nonparametric trend # function, if the returns are assumed to follow a semiparametric GARCH-type # model. # In this case, the setting 'alg = "A"' is used in combination with p = 3, as # the resulting estimates appear to be more suitable than for 'alg = "B"'. est <- msmooth(yt, p = 3, alg = "A") m_xt <- est$ye # estimated trend values # Obtain the standardized returns 'eps' and the scale function 'scale.f' res <- est$res # the detrended log-data C <- -log(mean(exp(res))) # an estimate of a constant value needed # for the retransformation scale.f <- exp((m_xt - C) / 2) # estimated values of the scale function in # the returns eps <- rt / scale.f # the estimated standardized returns # -> 'eps' can now be analyzed by any suitable GARCH-type model. # The total volatilities are then the product of the conditional # volatilities obtained from 'eps' and the scale function 'scale.f'. ## End(Not run)
### Example 1: US-GDP ### # Logarithm of test data # -> the logarithm of the data is assumed to follow the additive model test_data <- gdpUS y <- log(test_data$GDP) # Applied msmooth function for the trend results <- msmooth(y, p = 1, mu = 1, bStart = 0.1, alg = "A", method = "lpr") res <- results$res ye <- results$ye # Plot of the results t <- seq(from = 1947, to = 2019.25, by = 0.25) matplot(t, cbind(y, ye), type = "ll", lty = c(3, 1), col = c(1, "red"), xlab = "Years", ylab = "Log-Quartlery US-GDP", main = "Log-Quarterly US-GDP vs. Trend, Q1 1947 - Q2 2019") legend("bottomright", legend = c("Original series", "Estimated trend"), fill = c(1, "red"), cex = 0.7) results ## Not run: ### Example 2: German Stock Index ### # The following procedure can be considered, if (log-)returns are assumed # to follow a model from the general class of semiparametric GARCH-type # models (including Semi-GARCH, Semi-Log-GARCH and Semi-APARCH models among # others) with a slowly changing variance over time due to a deterministic, # nonparametric scale function. # Obtain the logarithm of the squared returns returns <- diff(log(dax$Close)) # (log-)returns rt <- returns - mean(returns) # demeaned (log-)returns yt <- log(rt^2) # logarithm of the squared returns # Apply 'smoots' function to the log-data, because the logarithm of # the squared returns follows an additive model with a nonparametric trend # function, if the returns are assumed to follow a semiparametric GARCH-type # model. # In this case, the setting 'alg = "A"' is used in combination with p = 3, as # the resulting estimates appear to be more suitable than for 'alg = "B"'. est <- msmooth(yt, p = 3, alg = "A") m_xt <- est$ye # estimated trend values # Obtain the standardized returns 'eps' and the scale function 'scale.f' res <- est$res # the detrended log-data C <- -log(mean(exp(res))) # an estimate of a constant value needed # for the retransformation scale.f <- exp((m_xt - C) / 2) # estimated values of the scale function in # the returns eps <- rt / scale.f # the estimated standardized returns # -> 'eps' can now be analyzed by any suitable GARCH-type model. # The total volatilities are then the product of the conditional # volatilities obtained from 'eps' and the scale function 'scale.f'. ## End(Not run)
Point forecasts and the respective forecasting intervals for autoregressive- moving-average (ARMA) models can be calculated, the latter under the assumption of normally distributed innovations, by means of this function.
normCast( X, p = NULL, q = NULL, include.mean = FALSE, h = 1, alpha = 0.95, plot = FALSE, ... )
normCast( X, p = NULL, q = NULL, include.mean = FALSE, h = 1, alpha = 0.95, plot = FALSE, ... )
X |
a numeric vector that contains the time series that is assumed to follow an ARMA model ordered from past to present. |
p |
an integer value |
q |
an integer value |
include.mean |
a logical value; if set to |
h |
an integer that represents the forecasting horizon; if |
alpha |
a numeric vector of length 1 with |
plot |
a logical value that controls the graphical output; for
|
... |
additional arguments for the standard plot function, e.g.,
|
This function is part of the smoots
package and was implemented under
version 1.1.0. For a given time series ,
,
the point forecasts and the respective forecasting intervals will be
calculated.
It is assumed that the series follows an ARMA(
) model
where and
are real
numbers (for
and
) and
are i.i.d. (identically and independently
distributed) random variables with zero mean and constant variance.
is equal to
.
The point forecasts and forecasting intervals for the future periods
will be obtained. With respect to the point
forecasts
, where
,
with for
and
for
will be applied.
The forecasting intervals on the other hand are obtained under the assumption
of normally distributed innovations. Let be the
-percent
quantile of the standard normal distribution. The
-percent
forecasting interval at a point
, where
,
is given by
with being the standard deviation of the forecasting error
at the time point
and with
. For ARMA models with normal
innovations, the variance of the forecasting error can be derived from the
MA(
) representation of the model. It is
where are the coefficients of the MA(
)
representation and
is the
innovation variance.
The function normCast
allows for different adjustments to
the forecasting progress. At first, a vector with the values of the observed
time series ordered from past to present has to be passed to the argument
X
. Orders and
of the underlying ARMA process can be
defined via the arguments
p
and q
. If only one of these orders
is inserted by the user, the other order is automatically set to 0
. If
none of these arguments are defined, the function will choose orders based on
the Bayesian Information Criterion (BIC) for . Via the logical argument
include.mean
the user can decide,
whether to consider the mean of the series within the estimation process.
Furthermore, the argument h
allows for the definition of the maximum
future time point . Point forecasts and forecasting intervals will
be returned for the time points
to
. Another argument
is
alpha
, which is the equivalent of the confidence level considered
within the calculation of the forecasting intervals, i.e., the quantiles
alpha
and
alpha
of the
forecasting intervals will be obtained.
For simplicity, the function also incorporates the possibility to directly
create a plot of the output, if the argument plot
is set to
TRUE
. By the additional and optional arguments ...
, further
arguments of the standard plot function can be implemented to shape the
returned plot.
NOTE:
Within this function, the arima
function of the
stats
package with its method "CSS-ML"
is used throughout
for the estimation of ARMA models.
The function returns a by
matrix with its columns
representing the future time points and the point forecasts, the lower
bounds of the forecasting intervals and the upper bounds of the
forecasting intervals as the rows. If the argument
plot
is set to
TRUE
, a plot of the forecasting results is created.
Dominik Schulz (Research Assistant) (Department of Economics, Paderborn
University),
Package Creator and Maintainer
Feng, Y., Gries, T. and Fritz, M. (2020). Data-driven local polynomial for the trend and its derivatives in economic time series. Journal of Nonparametric Statistics, 32:2, 510-533.
Feng, Y., Gries, T., Letmathe, S. and Schulz, D. (2019). The smoots package in R for semiparametric modeling of trend stationary time series. Discussion Paper. Paderborn University. Unpublished.
Feng, Y., Gries, T., Fritz, M., Letmathe, S. and Schulz, D. (2020). Diagnosing the trend and bootstrapping the forecasting intervals using a semiparametric ARMA. Discussion Paper. Paderborn University. Unpublished.
Fritz, M., Forstinger, S., Feng, Y., and Gries, T. (2020). Forecasting economic growth processes for developing economies. Unpublished.
### Example 1: Simulated ARMA process ### # Simulation of the underlying process n <- 2000 n.start = 1000 set.seed(21) X <- arima.sim(model = list(ar = c(1.2, -0.7), ma = 0.63), n = n, rand.gen = rnorm, n.start = n.start) + 7.7 # Application of normCast() result <- normCast(X = X, p = 2, q = 1, include.mean = TRUE, h = 5, plot = TRUE, xlim = c(1971, 2005), col = "deepskyblue4", type = "b", lty = 3, pch = 16, main = "Exemplary title") result
### Example 1: Simulated ARMA process ### # Simulation of the underlying process n <- 2000 n.start = 1000 set.seed(21) X <- arima.sim(model = list(ar = c(1.2, -0.7), ma = 0.63), n = n, rand.gen = rnorm, n.start = n.start) + 7.7 # Application of normCast() result <- normCast(X = X, p = 2, q = 1, include.mean = TRUE, h = 5, plot = TRUE, xlim = c(1971, 2005), col = "deepskyblue4", type = "b", lty = 3, pch = 16, main = "Exemplary title") result
From a matrix with values of an information criterion for different orders
and
of an autoregressive-moving-average (ARMA) model, the
optimal orders are selected.
optOrd(mat, restr = NULL, sFUN = min)
optOrd(mat, restr = NULL, sFUN = min)
mat |
a numeric matrix, whose rows represent the AR orders
|
restr |
a single expression (not a character object) that defines
further restrictions; the standard logical operators, e.g. |
sFUN |
the selection function; is set to |
Given a matrix mat
filled with the values of an information criterion
for different estimated ARMA() models, where the rows represent
different orders
and where
the columns represent the orders
, the function returns a vector with the optimal orders
and
. Further selection restrictions can be passed to the
argument
restr
as an expression. To implement a restriction, the rows
and columns are addressed via p
and q
, respectively. Moreover,
standard boolean operators such as ==
, >=
or &
can be
used. See the Section Examples for examples of different restrictions.
In many cases, the minimum value of a criterion is considered to indicate
the best model. However, in some other cases a different selection approach
might be appropriate. Therefore, a selection function can be considered by
means of the argument sFUN
. The default is sFUN = min
, i.e. the
function min
is applied to select the optimal
orders.
The function returns a vector with two elements. The first element is the
optimal order , whereas the second element is the selected optimal
order
.
Sebastian Letmathe (Scientific Employee) (Department of Economics,
Paderborn University),
## Not run: set.seed(21) Xt <- arima.sim(model = list(ar = c(1.2, -0.5), ma = 0.7), n = 1000) + 7 mat <- smoots::critMatrix(Xt) optOrd(mat) # without restrictions optOrd(mat, p <= q) # with one restriction optOrd(mat, p >= 1 & q >= 4) # with two restrictions ## End(Not run)
## Not run: set.seed(21) Xt <- arima.sim(model = list(ar = c(1.2, -0.5), ma = 0.7), n = 1000) + 7 mat <- smoots::critMatrix(Xt) optOrd(mat) # without restrictions optOrd(mat, p <= q) # with one restriction optOrd(mat, p >= 1 & q >= 4) # with two restrictions ## End(Not run)
This function regulates how objects created by the package smoots
are
plotted.
## S3 method for class 'smoots' plot(x, t = NULL, rescale = TRUE, which = NULL, ...)
## S3 method for class 'smoots' plot(x, t = NULL, rescale = TRUE, which = NULL, ...)
x |
an input object of class |
t |
an optional vector with time points that will be considered for
the x-axis within the plot; is set to NULL by default and uses a vector
|
rescale |
a single logical value; is set to |
which |
a selector for the plot type so that the interactive prompt is
avoided; for the default, |
... |
additional arguments of the standard plot method. |
None
Dominik Schulz (Research Assistant) (Department of Economics, Paderborn
University),
Package Creator and Maintainer
This function regulates how objects created by the package smoots
are
printed.
## S3 method for class 'smoots' print(x, ...)
## S3 method for class 'smoots' print(x, ...)
x |
an input object of class |
... |
included for compatibility; additional arguments will however not affect the output. |
None
Dominik Schulz (Research Assistant) (Department of Economics, Paderborn
University),
Package Creator and Maintainer
The estimation functions of the smoots
package estimate the
nonparametric trend function or its derivatives on the rescaled
time interval . With this function the derivative estimates can
be rescaled in accordance with a given vector with time points.
rescale(y, x = seq_along(y), v = 1)
rescale(y, x = seq_along(y), v = 1)
y |
a numeric vector or matrix with the derivative estimates obtained
for time points on the interval |
x |
a numeric vector of length |
v |
the order of derivative that is implemented for |
The derivative estimation process is based on the additive time series model
where is the observed time series with equidistant design,
is the rescaled time on
,
is a smooth and
deterministic trend function and
are stationary errors
with E(eps_[t]) = 0 (see also Beran and Feng, 2002). Since the estimates of
the main smoothing functions in
smoots
are obtained with regard to the
rescaled time points , the derivative estimates returned by these
functions are valid for
only. Thus, by passing the returned
estimates to the argument
y
, a vector with the actual time points to
the argument x
and the order of derivative of y
to the argument
v
, a rescaled estimate series is calculated and returned. The function
can also be combined with the numeric output of confBounds
.
Note that the trend estimates, even though they are also obtained for the
rescaled time points , are still valid for the actual time points.
A numeric vector with the rescaled derivative estimates is returned.
Dominik Schulz (Research Assistant) (Department of Economics, Paderborn
University),
Package Creator and Maintainer
data <- smoots::gdpUS Xt <- log(data$GDP) time <- seq(from = 1947.25, to = 2019.5, by = 0.25) d_est <- smoots::dsmooth(Xt) ye_rescale <- smoots::rescale(d_est$ye, x = time, v = 1) plot(time, ye_rescale, type = "l", main = "", ylab = "", xlab = "Year")
data <- smoots::gdpUS Xt <- log(data$GDP) time <- seq(from = 1947.25, to = 2019.5, by = 0.25) d_est <- smoots::dsmooth(Xt) ye_rescale <- smoots::rescale(d_est$ye, x = time, v = 1) plot(time, ye_rescale, type = "l", main = "", ylab = "", xlab = "Year")
Generic function which extracts model residuals from a smoots
class
object. Both residuals
and its abbreviated form resid
can be called.
## S3 method for class 'smoots' residuals(object, ...)
## S3 method for class 'smoots' residuals(object, ...)
object |
an object from the |
... |
included for consistency with the generic function. |
Residuals extracted from a smoots
class object.
Sebastian Letmathe (Scientific Employee) (Department of Economics,
Paderborn University),
A simple backtest of Semi-ARMA models via rolling forecasts can be implemented.
rollCast( y, p = NULL, q = NULL, K = 5, method = c("norm", "boot"), alpha = 0.95, np.fcast = c("lin", "const"), it = 10000, n.start = 1000, pb = TRUE, cores = future::availableCores(), argsSmoots = list(), plot = TRUE, argsPlot = list() )
rollCast( y, p = NULL, q = NULL, K = 5, method = c("norm", "boot"), alpha = 0.95, np.fcast = c("lin", "const"), it = 10000, n.start = 1000, pb = TRUE, cores = future::availableCores(), argsSmoots = list(), plot = TRUE, argsPlot = list() )
y |
a numeric vector that represents the equidistant time series assumed to follow a Semi-ARMA model; must be ordered from past to present. |
p |
an integer value |
q |
an integer value |
K |
a single, positive integer value that defines the number of
out-of-sample observations; the last |
method |
a character object; defines the method used for the calculation
of the forecasting intervals; with |
alpha |
a numeric vector of length 1 with |
np.fcast |
a character object; defines the forecasting method used
for the nonparametric trend; for |
it |
an integer that represents the total number of iterations, i.e.,
the number of simulated series; is set to |
n.start |
an integer that defines the 'burn-in' number
of observations for the simulated ARMA series via bootstrap; is set to
|
pb |
a logical value; for |
cores |
an integer value >0 that states the number of (logical) cores to
use in the bootstrap (or |
argsSmoots |
a list that contains arguments that will be passed to
|
plot |
a logical value that controls the graphical output; for the
default ( |
argsPlot |
a list; additional arguments for the standard plot function,
e.g., |
Define that an observed, equidistant time series , with
, follows
where is the rescaled time on the closed
interval
and
is a nonparametric and
deterministic trend function (see Beran and Feng, 2002, and Feng, Gries and
Fritz, 2020).
, on the other hand, is a stationary process
with
and short-range dependence.
For the purpose of this function,
is assumed
to follow an autoregressive-moving-average (ARMA) model with
Here, the random variables are identically and
independently distributed (i.i.d.) with zero-mean and a constant variance
and the coefficients
and
,
and
, are real numbers.
The combination of both previous formulas will be called a semiparametric
ARMA (Semi-ARMA) model.
An explicit forecasting method of Semi-ARMA models is described in
modelCast
. To backtest a selected model, a slightly adjusted
procedure is used. The data is divided into in-sample and an
out-of-sample values (usually the last observations in the data
are reserved for the out-of-sample observations). A model is fitted to the
in-sample data, whereas one-step rolling point forecasts and forecasting
intervals are obtained for the out-of-sample time points. The proposed
forecasts of the trend are either a linear or a constant extrapolation of
the trend with negligible forecasting intervals, whereas the point forecasts
of the stationary rest term are obtained via the selected ARMA(
)
model (see Fritz et al., 2020). The corresponding forecasting intervals
are calculated under the assumption that the innovations
are either normally distributed (see e.g. pp.
93-94 in Brockwell and Davis, 2016) or via a forward bootstrap (see Lu and
Wang, 2020). For a one-step forecast for time point
, all observations
until time point
are assumed to be known.
The function calculates three important values for backtesting: the number
of breaches, i.e. the number of true observations that lie outside of the
forecasting intervals, the mean absolute scaled error (MASE, see Hyndman
and Koehler, 2006) and the root mean squared scaled error (RMSSE, see
Hyndman and Koehler, 2006) are obtained. For the MASE, a value
indicates a better average forecasting potential than a naive forecasting
approach.
Furthermore, it is independent from the scale of the data and can thus be
used to compare forecasts of different datasets. Closely related is the
RMSSE, however here, the mean of the squared forecasting errors is computed
and scaled by the mean of the squared naive forecasting approach. Then the
root of that value is the RMSSE. Due to the close relation, the
interpretation of the RMSSE is similarly but not identically to the
interpretation of the MASE. Of course, a value close to zero is preferred
in both cases.
To make use of the function, a numeric vector with the values of a time
series that is assumed to follow a Semi-ARMA model needs to be passed to
the argument y
. Moreover, the arguments p
and q
represent the AR and MA orders, respectively, of the underlying ARMA
process in the parametric part of the model. If both values are set to
NULL
, an optimal order in accordance with the Bayesian Information
Criterion (BIC) will be selected. If only one of the values is NULL
,
it will be changed to zero instead. K
defines the number of the
out-of-sample observations; these will be cut off the end of y
, while
the remaining observations are treated as the in-sample observations. For the
out-of-sample time points, rolling forecasts will be obtained.
method
describes the method to use for the computation of the
prediction intervals. Under the normality assumption for the innovations
, intervals can be obtained via
method = "norm". However, if the assumption does not hold, a
bootstrap can be implemented as well (method = "boot"). Both
approaches are explained in more detail in
normCast
and
bootCast
, respectively. With alpha
, the confidence
level of the forecasting intervals can be adjusted, as the
(alpha
)-percent forecasting intervals will be computed. By
means of the argument np.fcast
, the forecasting method for the
nonparametric trend function can be defined. Selectable are a linear
(np.fcast = "lin"
) and a constant (np.fcast = "const"
)
extrapolation. For more information on these methods, we refer the reader to
trendCast
.
it
, n.start
, pb
and cores
are only
relevant for method = "boot"
. With it
the total number of
bootstrap iterations is defined, whereas n.start
regulates, how
many 'burn-in' observations are generated for each simulated ARMA process
in the bootstrap. Since a bootstrap may take a longer computation time,
with the argument cores
the number of cores for parallel computation
of the bootstrap iterations can be defined. Nonetheless, for
cores = NULL
, no cluster is created and therefore
the parallel computation is disabled. Note that the bootstrapped results are
fully reproducible for all cluster sizes. Moreover, for pb = TRUE
,
the progress of the bootstrap approach can be observed in the R console via
a progress bar. Additional information on these four function arguments can
be found in bootCast
.
The argument argsSmoots
is a list. In this list, different arguments
of the function msmooth
can be implemented to adjust the
estimation of the nonparametric part of the complete model. The arguments
of the smoothing function are described in msmooth
.
rollCast
allows for a quick plot of the results. If the logical
argument plot
is set to TRUE
, a graphic with default
settings is created. Nevertheless, users are allowed to implement further
arguments of the standard plot function in the list argsPlot
. For
example, the limits of the plot can be adjusted by xlim
and
ylim
. Furthermore, an argument x
can be included in
argsPlot
with the actual equidistant time points of the whole series
(in-sample & out-of-sample observations). Otherwise, simply 1:n
is
used as the in-sample time points by default.
NOTE:
Within this function, the arima
function of the
stats
package with its method "CSS-ML"
is used throughout for
the estimation of ARMA models. Furthermore, to increase the performance,
C++ code via the Rcpp
and
RcppArmadillo
packages
was implemented. Also, the future
and
future.apply
packages are
considered for parallel computation of bootstrap iterations. The progress
of the bootstrap is shown via the
progressr
package.
A list with different elements is returned. The elements are as follows.
a single numeric value; it describes, what confidence level
(alpha
)-percent has been considered for the forecasting
intervals.
a logical vector that states whether the true
out-of-sample observations lie outside of the forecasting intervals,
respectively; a breach is denoted by
TRUE
.
a numeric vector that contains the margin of the breaches
(in absolute terms) for the out-of-sample time points; if a breach
did not occur, the respective element is set to zero.
a numeric vector that contains the simulated empirical
values of the forecasting error for method = "boot"
; otherwise,
it is set to NULL
.
a numeric vector that contains the point forecasts
of the parametric part of the model.
a numeric matrix that contains the rolling point
forecasts as well as the values of the bounds of the respective forecasting
intervals for the complete model;
the first row contains the point forecasts, the lower bounds of the
forecasting intervals are in the second row and the upper bounds
can be found in the third row.
a numeric vector that contains the obtained trend
forecasts.
a positive integer; states the number of out-of-sample observations as well as the number of forecasts for the out-of-sample time points.
the obtained value of the mean average scaled error for the selected model.
a character object that states, whether the forecasting
intervals were obtained via a bootstrap (method = "boot"
) or under
the normality assumption for the innovations (method = "norm"
).
the output (usually a list) of the nonparametric
trend estimation via msmooth
.
the output (usually a list) of the parametric ARMA
estimation of the detrended series via arima
.
the number of observations (in-sample & out-of-sample observations).
the number of in-sample observations (n - n.out
).
the number of out-of-sample observations (equals K
).
a character object that states the applied forecasting
method for the nonparametric trend function; either a linear (
np.fcast = "lin"
) or a constant np.fcast = "const"
are
possible.
a numeric vector of length 2 with the
alpha
-percent and
{
alpha
}-percent quantiles of
the forecasting error distribution.
the obtained value of the root mean squared scaled error for the selected model.
a numeric vector that contains all true observations (in-sample & out-of-sample observations).
a numeric vector that contains all in-sample observations.
a numeric vector that contains the out-of-sample
observations.
Yuanhua Feng (Department of Economics, Paderborn University),
Author of the Algorithms
Website: https://wiwi.uni-paderborn.de/en/dep4/feng/
Dominik Schulz (Research Assistant) (Department of Economics, Paderborn
University),
Package Creator and Maintainer
Beran, J., and Feng, Y. (2002). Local polynomial fitting with long-memory, short-memory and antipersistent errors. Annals of the Institute of Statistical Mathematics, 54, 291-311.
Brockwell, P. J., and Davis, R. A. (2016). Introduction to time series and forecasting, 3rd edition. Springer.
Fritz, M., Forstinger, S., Feng, Y., and Gries, T. (2020). Forecasting economic growth processes for developing economies. Unpublished.
Feng, Y., Gries, T. and Fritz, M. (2020). Data-driven local polynomial for the trend and its derivatives in economic time series. Journal of Nonparametric Statistics, 32:2, 510-533.
Hyndman, R. J., and Koehler, A. B. (2006). Another look at measures of forecast accuracy. International Journal of Forecasting, 22:4, 679-688.
Lu, X., and Wang, L. (2020). Bootstrap prediction interval for ARMA models with unknown orders. REVSTAT–Statistical Journal, 18:3, 375-396.
lgdp <- log(smoots::gdpUS$GDP) time <- seq(from = 1947.25, to = 2019.5, by = 0.25) backtest <- rollCast(lgdp, K = 5, argsPlot = list(x = time, xlim = c(2012, 2019.5), col = "forestgreen", type = "b", pch = 20, lty = 2, main = "Example")) backtest
lgdp <- log(smoots::gdpUS$GDP) time <- seq(from = 1947.25, to = 2019.5, by = 0.25) backtest <- rollCast(lgdp, K = 5, argsPlot = list(x = time, xlim = c(2012, 2019.5), col = "forestgreen", type = "b", pch = 20, lty = 2, main = "Example")) backtest
The smoots
package provides different applicable functions for the
estimation of the trend or its derivatives in equidistant time series.
The main functions include an automated bandwidth selection method for time
series with short-memory errors. With package version 1.1.0 several
functions for forecasting as well as linearity tests were added.
The smoots
functions are either meant for calculating nonparametric
estimates of the trend of a time series or its derivatives.
msmooth
is the central function of the package. It allows
the user to conduct a local polynomial regression of the trend based on
an optimal bandwidth that is obtained by an iterative plug-in algorithm.
There are also different algorithms implemented concerning the inflation rate
and other factors that can be chosen from (see also: msmooth
).
dsmooth
is a function that calculates the derivatives of the
trend after obtaining the optimal bandwidth by an iterative plug-in
algorithm. The estimates are obtained for rescaled time points on the
interval (see also:
dsmooth
).
tsmooth
is similar to msmooth
as it also calculates the trend
of the series. Instead of using the name of a predefined algorithm that
settles the inflation rate (and other factors), these factors can be manually
and individually adjusted as arguments in the function (see also:
tsmooth
).
gsmooth
is a standard smoothing function that applies the local
polynomial regression method. A bandwidth can be chosen freely. The estimates
are obtained for rescaled time points on the interval
(see also:
gsmooth
).
knsmooth
is a standard smoothing function that applies the kernel
regression method. A bandwidth can be chosen freely
(see also: knsmooth
).
With the publication of the package version 1.1.0, new functions were added. These include functions for forecasting and functions for testing linearity of the deterministic trend.
rescale
helps rescaling the estimates of the derivatives of the
nonparametric trend function, because the estimates are obtained for rescaled
time points on the interval and not for the actual time points
(see also:
rescale
).
critMatrix
is a quick tool for the calculation of information criteria
for ARMA() models with different order combinations
and
. The function returns a matrix with the obtained values of the
selected criterion for the different combinations of
and
(see also:
critMatrix
).
optOrd
is useful in combination with critMatrix
. It
reads a matrix equal in structure to the ones returned by critMatrix
and returns the optimal orders p and q. Furthermore, additional restrictions
for the selection can be imposed (see also: optOrd
).
normCast
provides means to obtain point forecasts as well as
forecasting intervals for a given series under the assumption that it follows
an ARMA() model and that its innovations are normally distributed
(see also:
normCast
).
bootCast
can also be used to calculate point forecasts and forecasting
intervals, if the series of interest follows an ARMA() model.
However, the main difference is that this function should be applied, if the
distribution of the innovations is unknown or explicitly non-Gaussian,
because the underlying bootstrap process does not need a distribution
assumption. In spite of this advantage, it also needs significantly more
computation time. Therefore, with version 1.1.1, the bootstrap can now be
conducted in parallel for an improved performance (see also:
bootCast
).
trendCast
uses a smoots
object that is the output of a trend
estimation and calculates point forecasts of the trend. Forecasting intervals
are omitted for reasons that are explained within the function's
documentation (see also: trendCast
).
modelCast
calculates the point forecasts and forecasting intervals of
a trend-stationary series. Based on a smoots
object that is the output
of a trend estimation, trendCast
is applied to the estimated trend,
whereas either normCast
or bootCast
is applied to the
residuals (see also: modelCast
).
rollCast
is a backtesting function for Semi-ARMA models. A given
series is divided into in-sample and out-of-sample observations, where the
in-sample is used to fit a Semi-ARMA model. One-step rolling forecasts and
the corresponding forecasting intervals are then obtained for the
out-of-sample observations. The quality of the model is then assessed via
a comparison of the true out-of-sample observations with the point forecasts
and forecasting intervals. Different quality criteria are calculated and
returned (see also: modelCast
).
confBounds
calculates the confidence bounds of the estimated trend
or of the estimated derivative of the trend at a predefined confidence
level. A graphic of the confidence bounds alongside a previously chosen
constant or parametric illustration of the estimated series is created. With
this plot it can be tested graphically if the deterministic trend is constant
or if it follows a parametric polynomial model. Furthermore, it can also be
tested, if the derivatives of the trend are constant (see also:
confBounds
).
The package includes four datasets: gdpUS
(see also:
gdpUS
) that has data concerning the quarterly US GDP between
Q1 1947-01 and Q2 2019, tempNH
(see also: tempNH
) with
mean monthly Northern Hemisphere temperature changes from 1880 to 2018,
dax
(see also: dax
) with daily financial time series
data of the German stock index (DAX) from 1990 to July 2019 and vix
(see also: vix
) with daily financial time series data of the
CBOE Volatility Index (VIX) from 1990 to July 2019.
The package is distributed under the General Public License v3 ([GPL-3](https://tldrlegal.com/license/gnu-general-public-license-v3-(gpl-3))).
Yuanhua Feng (Department of Economics, Paderborn University),
Author of the Algorithms
Website: https://wiwi.uni-paderborn.de/en/dep4/feng/
Dominik Schulz (Research Assistant) (Department of Economics, Paderborn
University),
Package Creator and Maintainer
Feng, Y., Gries, T. and Fritz, M. (2020). Data-driven local polynomial for the trend and its derivatives in economic time series. Journal of Nonparametric Statistics, 32:2, 510-533.
Feng, Y., Gries, T., Letmathe, S. and Schulz, D. (2019). The smoots package in R for semiparametric modeling of trend stationary time series. Discussion Paper. Paderborn University. Unpublished.
A dataset that contains mean monthly Northern Hemisphere temperature changes from 1880 to 2018. The base period is 1951 to 1980.
tempNH
tempNH
A data frame with 1668 rows and 3 variables:
the observation year
the observation month
the observed mean monthly temperature changes in degrees Celsius
The data was obtained from the Goddard Institute for Space Studies (part of the National Aeronautics and Space Administration [NASA]) (accessed: 2019-09-25).
https://data.giss.nasa.gov/gistemp/tabledata_v4/NH.Ts+dSST.txt
Forecasting Function for Nonparametric Trend Functions
trendCast(object, h = 1, np.fcast = c("lin", "const"), plot = FALSE, ...)
trendCast(object, h = 1, np.fcast = c("lin", "const"), plot = FALSE, ...)
object |
an object returned by either |
h |
the forecasting horizon; the values |
np.fcast |
the forecasting method; |
plot |
a logical value; if set to |
... |
additional arguments for the standard plot function, e.g.,
|
This function is part of the smoots
package and was implemented under
version 1.1.0. The underlying theory is based on the additive nonparametric
regression function
where is the observed time series with equidistant design,
is the rescaled time on the interval
,
is a smooth and deterministic trend function and
are
stationary errors with
.
The purpose of this function is the forecasting of future values based on
a nonparametric regression model. Following the proposition in Fritz
et al. (2020), point predictions can be conducted
separately for the nonparametric trend function and the
stationary part
. The sum of both forecasts is then the
forecast of
. With this function, only the forecast with respect to
is computable.
Now assume that the variance of the error in the local polynomial
forecasts is negligible when calculating the forecasting intervals. We
define the forecast for time point
,
, by
where is equal to
and
is a
dummy variable. If
, a linear extrapolation is applied. For
,
is the predicted value.
To make use of this function, an object of class smoots
can be given
as input. However, since the discussed approach is only valid for the
estimated trend function, only objects created by msmooth
,
tsmooth
, knsmooth
and link{gsmooth}
, if
the trend was estimated, will be appropriate input objects.
With the input argument h
, a positive integer can be given to the
function that represents the forecasting horizon, i.e. how many future values
are to be estimated. Via the argument np.fcast
the value of the dummy
variable D can be specified and thus the forecasting method. For
np.fcast = "lin"
, is applied, whereas for
np.fcast = "const"
, is set to
.
By means of the argument plot
that can be either set to the logical
values TRUE
or FALSE
, a simple plot of the original series
alongside the local polynomial estimates as well as the forecasted values can
be either generated or suppressed.
The function always returns a vector of forecasted values ordered from
to
. Depending on the setting of the argument
plot
, a generic plot of the results may be generated. Furthermore,
additional arguments of the standard plot function can be passed to this
function as well to adjust the generated plot.
A numeric vector is always returned with the forecasted values. Depending on the setting for the argument plot, a generic plot might be created.
Yuanhua Feng (Department of Economics, Paderborn University),
Author of the Algorithms
Website: https://wiwi.uni-paderborn.de/en/dep4/feng/
Dominik Schulz (Research Assistant) (Department of Economics, Paderborn
University),
Package Creator and Maintainer
Feng, Y., Gries, T. and Fritz, M. (2020). Data-driven local polynomial for the trend and its derivatives in economic time series. Journal of Nonparametric Statistics, 32:2, 510-533.
Feng, Y., Gries, T., Letmathe, S. and Schulz, D. (2019). The smoots package in R for semiparametric modeling of trend stationary time series. Discussion Paper. Paderborn University. Unpublished.
Feng, Y., Gries, T., Fritz, M., Letmathe, S. and Schulz, D. (2020). Diagnosing the trend and bootstrapping the forecasting intervals using a semiparametric ARMA. Discussion Paper. Paderborn University. Unpublished.
Fritz, M., Forstinger, S., Feng, Y., and Gries, T. (2020). Forecasting economic growth processes for developing economies. Unpublished.
log_gdp <- log(smoots::gdpUS$GDP) est <- msmooth(log_gdp) forecasts <- trendCast(est, h = 5, plot = TRUE) forecasts
log_gdp <- log(smoots::gdpUS$GDP) est <- msmooth(log_gdp) forecasts <- trendCast(est, h = 5, plot = TRUE) forecasts
This function runs an iterative plug-in algorithm to find the optimal bandwidth for the estimation of the nonparametric trend in equidistant time series (with short-memory errors) and then employs the resulting bandwidth via either local polynomial or kernel regression. This function allows for more flexibility in its arguments than msmooth.
tsmooth( y, p = c(1, 3), mu = c(0, 1, 2, 3), Mcf = c("NP", "ARMA", "AR", "MA"), InfR = c("Opt", "Nai", "Var"), bStart = 0.15, bvc = c("Y", "N"), bb = c(0, 1), cb = 0.05, method = c("lpr", "kr") )
tsmooth( y, p = c(1, 3), mu = c(0, 1, 2, 3), Mcf = c("NP", "ARMA", "AR", "MA"), InfR = c("Opt", "Nai", "Var"), bStart = 0.15, bvc = c("Y", "N"), bb = c(0, 1), cb = 0.05, method = c("lpr", "kr") )
y |
a numeric vector that contains the time series ordered from past to present. |
||||||||||
p |
an integer |
||||||||||
mu |
an integer
|
||||||||||
Mcf |
method for estimating the variance factor
|
||||||||||
InfR |
a character object that represents the inflation
rate in the form
|
||||||||||
bStart |
a numeric object that indicates the starting value of the
bandwidth for the iterative process; should be |
||||||||||
bvc |
a character object that indicates whether an enlarged bandwidth is
being used for the estimation of the variance factor
|
||||||||||
bb |
can be set to
|
||||||||||
cb |
a numeric value that indicates the percentage of omitted
observations on each side of the observation period for the automated
bandwidth selection; is set to |
||||||||||
method |
the final smoothing approach; |
The trend is estimated based on the additive nonparametric regression model for an equidistant time series
where is the observed time series,
is the rescaled time
on the interval
,
is a smooth and deterministic
trend function and
are stationary errors with
and short-range dependence (see also Beran and Feng,
2002). With this function
can be estimated without a parametric
model assumption for the error series. Thus, after estimating and removing
the trend, any suitable parametric model, e.g. an ARMA(
) model, can
be fitted to the residuals (see
arima
).
The iterative-plug-in (IPI) algorithm, which numerically minimizes the Asymptotic Mean Squared Error (AMISE), was proposed by Feng, Gries and Fritz (2020).
Define ,
and
, where
is the order of the polynomial,
is the order of the asymptotically equivalent kernel,
is the order of the trend function's derivative,
,
is the variance factor and
the
-th order equivalent kernel
obtained for the estimation of
in the interior.
is the
-th order derivative (
) of the nonparametric trend.
Furthermore, we define
and
with being the bandwidth and
being the number of
observations. The AMISE is then
The function calculates suitable estimates for , the variance
factor, and
over different iterations. In each
iteration, a bandwidth is obtained in accordance with the AMISE that once
more serves as an input for the following iteration. The process repeats
until either convergence or the 40th iteration is reached. For further
details on the asymptotic theory or the algorithm, please consult Feng, Gries
and Fritz (2020) or Feng et al. (2019).
To apply the function, more arguments are needed compared to the similar
function msmooth
: a data input y
, an order of polynomial
p
, a kernel weighting function defined by the smoothness parameter
mu
, a variance factor estimation method Mcf
, an inflation rate
setting InfR
(see also Beran and Feng, 2002), a starting value for the
relative bandwidth bStart
, an inflation setting for the variance
factor estimation bvc
, a boundary method bb
, a boundary cut-off
percentage cb
and a final smoothing method method
.
In fact, aside from the input vector y
, every argument has a default
setting that can be adjusted for the individual case. Theoretically, the
initial bandwidth does not affect the selected optimal bandwidth. However, in
practice local minima of the AMISE might exist and influence the selected
bandwidth. Therefore, the default setting is bStart = 0.15
, which is a
compromise between the starting values bStart = 0.1
for p = 1
and bStart = 0.2
for p = 3
that were proposed by Feng, Gries
and Fritz (2020). In the rare case of a clearly unsuitable optimal bandwidth,
a starting bandwidth that differs from the default value is a first
possible approach to obtain a better result. Other argument adjustments can
be tried as well. For more specific information on the input arguments
consult the section Arguments.
When applying the function, an optimal bandwidth is obtained based on the
IPI algorithm proposed by Feng, Gries and Fritz (2020). In a second step,
the nonparametric trend of the series is calculated with respect
to the chosen bandwidth and the selected regression method (lpf
or
kr
). Please note that method = "lpf"
is strongly recommended by
the authors. Moreover, it is notable that p
is automatically set to
1
for method = "kr"
. The output object is then a list that
contains, among other components, the original time series, the estimated
trend values and the series without the trend.
The default print method for this function delivers only key numbers such as
the iteration steps and the generated optimal bandwidth rounded to the fourth
decimal. The exact numbers and results such as the estimated nonparametric
trend series are saved within the output object and can be addressed via the
$
sign.
NOTE:
With package version 1.1.0, this function implements C++ code by means
of the Rcpp
and
RcppArmadillo
packages for
better performance.
The function returns a list with different components:
the Bayesian Information Criterion of the optimal AR()
model when estimating the variance factor via autoregressive models
(if calculated; calculated for
alg = "OA"
and alg = "NA"
).
the Bayesian Information Criterion of the optimal
ARMA() model when estimating the variance factor via
autoregressive-moving-average models (if calculated; calculated for
alg = "OAM"
and alg = "NAM"
).
the percentage of omitted observations on each side of the observation period.
the optimal bandwidth chosen by the IPI-algorithm.
the boundary bandwidth method used within the IPI; always equal to 1.
the starting value of the (relative) bandwidth; input argument.
indicates whether an enlarged bandwidth was used for the variance factor estimation or not; depends on the chosen algorithm.
the estimated variance factor; in contrast to the definitions
given in the Details section, this object actually contains an
estimated value of , i.e. it corresponds to the estimated sum
of autocovariances.
the estimated variance factor obtained by estimation of
autoregressive models (if calculated; alg = "OA"
or "NA"
).
the estimated variance factor obtained by estimation of
autoregressive-moving-average models (if calculated; calculated for
alg = "OAM"
and alg = "NAM"
).
the estimated variance factor obtained by Lag-Window Spectral
Density Estimation following Bühlmann (1996) (if calculated; calculated for
algorithms "A"
, "B"
, "O"
and "N"
).
the estimated variance factor obtained by estimation of
moving-average models (if calculated; calculated for alg = "OM"
and
alg = "NM"
).
the estimated value of .
the setting for the inflation rate according to the chosen algorithm.
the bandwidths of the single iterations steps
the optimal bandwidth for the lag-window spectral density
estimation (if calculated; calculated for algorithms "A"
, "B"
,
"O"
and "N"
).
the Bayesian Information Criterion of the optimal MA()
model when estimating the variance factor via moving-average models (if
calculated; calculated for
alg = "OM"
and alg = "NM"
).
the estimation method for the variance factor estimation; depends on the chosen algorithm.
the smoothness parameter of the second order kernel; input argument.
the number of observations.
the total number of iterations until convergence.
the original input series; input argument.
the order p of the optimal AR() or ARMA(
) model
when estimating the variance factor via autoregressive or
autoregressive-moving average models (if calculated; calculated for
alg = "OA"
, alg = "NA"
, alg = "OAM"
and
alg = "NAM"
).
the order of polynomial used in the IPI-algorithm; also used for the
final smoothing, if method = "lpr"
; input argument.
the order of the optimal MA(
) or ARMA(
)
model when estimating the variance factor via moving-average or
autoregressive-moving average models (if calculated; calculated for
alg = "OM"
,
alg = "NM"
, alg = "OAM"
and alg = "NAM"
).
the estimated residual series.
the considered order of derivative of the trend; is always zero for this function.
the weighting system matrix used within the local polynomial
regression; this matrix is a condensed version of a complete weighting system
matrix; in each row of ws
, the weights for conducting the smoothing
procedure at a specific observation time point can be found; the first
rows, where
corresponds to the number of
observations,
is the bandwidth considered for smoothing and
denotes the integer part, contain the weights at the
left-hand boundary points; the weights in row
are representative for the estimation at all
interior points and the remaining rows contain the weights for the right-hand
boundary points; each row has exactly
elements,
more specifically the weights for observations of the nearest
time points; moreover, the weights are normalized,
i.e. the weights are obtained under consideration of the time points
, where
.
the nonparametric estimates of the trend.
Yuanhua Feng (Department of Economics, Paderborn University),
Author of the Algorithms
Website: https://wiwi.uni-paderborn.de/en/dep4/feng/
Dominik Schulz (Research Assistant) (Department of Economics, Paderborn
University),
Package Creator and Maintainer
Beran, J. and Feng, Y. (2002). Local polynomial fitting with long-memory, short-memory and antipersistent errors. Annals of the Institute of Statistical Mathematics, 54(2), 291-311.
Bühlmann, P. (1996). Locally adaptive lag-window spectral estimation. Journal of Time Series Analysis, 17(3), 247-270.
Feng, Y., Gries, T. and Fritz, M. (2020). Data-driven local polynomial for the trend and its derivatives in economic time series. Journal of Nonparametric Statistics, 32:2, 510-533.
Feng, Y., Gries, T., Letmathe, S. and Schulz, D. (2019). The smoots package in R for semiparametric modeling of trend stationary time series. Discussion Paper. Paderborn University. Unpublished.
### Example 1: US-GDP ### # Logarithm of test data # -> the logarithm of the data is assumed to follow the additive model test_data <- gdpUS y <- log(test_data$GDP) # Applied tsmooth function for the trend result <- tsmooth(y, p = 1, mu = 1, Mcf = "NP", InfR = "Opt", bStart = 0.1, bvc = "Y") trend1 <- result$ye # Plot of the results t <- seq(from = 1947, to = 2019.25, by = 0.25) plot(t, y, type = "l", xlab = "Year", ylab = "log(US-GDP)", bty = "n", lwd = 1, lty = 3, main = "Estimated trend for log-quarterly US-GDP, Q1 1947 - Q2 2019") points(t, trend1, type = "l", col = "red", lwd = 1) title(sub = expression(italic("Figure 1")), col.sub = "gray47", cex.sub = 0.6, adj = 0) result ## Not run: ### Example 2: German Stock Index ### # The following procedure can be considered, if (log-)returns are assumed # to follow a model from the general class of semiparametric GARCH-type # models (including Semi-GARCH, Semi-Log-GARCH and Semi-APARCH models among # others) with a slowly changing variance over time due to a deterministic, # nonparametric scale function. # Obtain the logarithm of the squared returns returns <- diff(log(dax$Close)) # (log-)returns rt <- returns - mean(returns) # demeaned (log-)returns yt <- log(rt^2) # logarithm of the squared returns # Apply 'smoots' function to the log-data, because the logarithm of # the squared returns follows an additive model with a nonparametric trend # function, if the returns are assumed to follow a semiparametric GARCH-type # model. # In this case, the optimal inflation rate is used for p = 3. est <- tsmooth(yt, p = 3, InfR = "Opt") m_xt <- est$ye # estimated trend values # Obtain the standardized returns 'eps' and the scale function 'scale.f' res <- est$res # the detrended log-data C <- -log(mean(exp(res))) # an estimate of a constant value needed # for the retransformation scale.f <- exp((m_xt - C) / 2) # estimated values of the scale function in # the returns eps <- rt / scale.f # the estimated standardized returns # -> 'eps' can now be analyzed by any suitable GARCH-type model. # The total volatilities are then the product of the conditional # volatilities obtained from 'eps' and the scale function 'scale.f'. ## End(Not run)
### Example 1: US-GDP ### # Logarithm of test data # -> the logarithm of the data is assumed to follow the additive model test_data <- gdpUS y <- log(test_data$GDP) # Applied tsmooth function for the trend result <- tsmooth(y, p = 1, mu = 1, Mcf = "NP", InfR = "Opt", bStart = 0.1, bvc = "Y") trend1 <- result$ye # Plot of the results t <- seq(from = 1947, to = 2019.25, by = 0.25) plot(t, y, type = "l", xlab = "Year", ylab = "log(US-GDP)", bty = "n", lwd = 1, lty = 3, main = "Estimated trend for log-quarterly US-GDP, Q1 1947 - Q2 2019") points(t, trend1, type = "l", col = "red", lwd = 1) title(sub = expression(italic("Figure 1")), col.sub = "gray47", cex.sub = 0.6, adj = 0) result ## Not run: ### Example 2: German Stock Index ### # The following procedure can be considered, if (log-)returns are assumed # to follow a model from the general class of semiparametric GARCH-type # models (including Semi-GARCH, Semi-Log-GARCH and Semi-APARCH models among # others) with a slowly changing variance over time due to a deterministic, # nonparametric scale function. # Obtain the logarithm of the squared returns returns <- diff(log(dax$Close)) # (log-)returns rt <- returns - mean(returns) # demeaned (log-)returns yt <- log(rt^2) # logarithm of the squared returns # Apply 'smoots' function to the log-data, because the logarithm of # the squared returns follows an additive model with a nonparametric trend # function, if the returns are assumed to follow a semiparametric GARCH-type # model. # In this case, the optimal inflation rate is used for p = 3. est <- tsmooth(yt, p = 3, InfR = "Opt") m_xt <- est$ye # estimated trend values # Obtain the standardized returns 'eps' and the scale function 'scale.f' res <- est$res # the detrended log-data C <- -log(mean(exp(res))) # an estimate of a constant value needed # for the retransformation scale.f <- exp((m_xt - C) / 2) # estimated values of the scale function in # the returns eps <- rt / scale.f # the estimated standardized returns # -> 'eps' can now be analyzed by any suitable GARCH-type model. # The total volatilities are then the product of the conditional # volatilities obtained from 'eps' and the scale function 'scale.f'. ## End(Not run)
A dataset that contains the daily financial data of the VIX from 1990 to July 2019 (currency in USD).
vix
vix
A data frame with 7452 rows and 9 variables:
the observation year
the observation month
the observation day
the opening price of the day
the highest price of the day
the lowest price of the day
the closing price of the day
the adjusted closing price of the day
the traded volume
The data was obtained from Yahoo Finance (accessed: 2019-08-22).