Title: | Alternative Time Series Analysis |
---|---|
Description: | Contains some tools for testing, analyzing time series data and fitting popular time series models such as ARIMA, Moving Average and Holt Winters, etc. Most functions also provide nice and clear outputs like SAS does, such as identify, estimate and forecast, which are the same statements in PROC ARIMA in SAS. |
Authors: | Debin Qiu |
Maintainer: | Debin Qiu <[email protected]> |
License: | GPL-2 | GPL-3 |
Version: | 3.1.2.1 |
Built: | 2024-10-31 19:46:54 UTC |
Source: | CRAN |
Computes the accurate criterion of smoothed (fitted) values.
accurate(x, x.hat, k, output = TRUE)
accurate(x, x.hat, k, output = TRUE)
x |
a numeric vector of original values. |
x.hat |
a numeric vector of smoothed (fitted) values. |
k |
the number of parameters in obtaining the smoothed (fitted) values. |
output |
a logical value indicating to print the results in R console. The default is
|
See http://www.dms.umontreal.ca/~duchesne/chap12.pdf in page 616 - 617 for the details of calculations for each criterion.
A vector containing the following components:
SST |
the total sum of squares. |
SSE |
the sum of the squared residuals. |
MSE |
the mean squared error. |
RMSE |
the root mean square error. |
MAPE |
the mean absolute percent error. |
MPE |
the mean percent error. |
MAE |
the mean absolute error. |
ME |
the mean error. |
R.squared |
R^2 = 1 - SSE/SST. |
R.adj.squared |
the adjusted R^2. |
RW.R.squared |
the random walk R^2. |
AIC |
the Akaike's information criterion. |
SBC |
the Schwarz's Bayesian criterion. |
APC |
the Amemiya's prediction criterion |
If the model fits the series badly, the model error sum of squares SSE
may be larger than SST
and the R.squared
or RW.R.squared
statistics
will be negative. The RW.R.squared
uses the random walk model for the purpose of
comparison.
Debin Qiu
X <- matrix(rnorm(200),100,2) y <- 0.1*X[,1] + 2*X[,2] + rnorm(100) y.hat <- fitted(lm(y ~ X)) accurate(y,y.hat,2)
X <- matrix(rnorm(200),100,2) y <- 0.1*X[,1] + 2*X[,2] + rnorm(100) y.hat <- fitted(lm(y ~ X)) accurate(y,y.hat,2)
Performs the Augmented Dickey-Fuller test for the null hypothesis
of a unit root of a univarate time series x
(equivalently, x
is a
non-stationary time series).
adf.test(x, nlag = NULL, output = TRUE)
adf.test(x, nlag = NULL, output = TRUE)
x |
a numeric vector or univariate time series. |
nlag |
the lag order with default to calculate the test statistic. See details for the default. |
output |
a logical value indicating to print the test results in R console.
The default is |
The Augmented Dickey-Fuller test incorporates
three types of linear regression models. The first type (type1
) is a linear model
with no drift and linear trend with respect to time:
where is an operator of first order difference, i.e.,
, and
is an error term.
The second type (type2
) is a linear model with drift but no linear trend:
The third type (type3
) is a linear model with both drift and linear trend:
We use the default nlag = floor(4*(length(x)/100)^(2/9))
to
calcuate the test statistic.
The Augmented Dickey-Fuller test statistic is defined as
where is the coefficient estimation
and
is its corresponding estimation of standard error for each
type of linear model. The p.value is
calculated by interpolating the test statistics from the corresponding critical values
tables (see Table 10.A.2 in Fuller (1996)) for each type of linear models with given
sample size
= length(
x
).
The Dickey-Fuller test is a special case of Augmented Dickey-Fuller test
when nlag
= 2.
A list containing the following components:
type1 |
a matrix with three columns: |
type2 |
same as above for the second type of linear model. |
type3 |
same as above for the third type of linear model. |
Missing values are removed.
Debin Qiu
Fuller, W. A. (1996). Introduction to Statistical Time Series, second ed., New York: John Wiley and Sons.
pp.test
, kpss.test
, stationary.test
# ADF test for AR(1) process x <- arima.sim(list(order = c(1,0,0),ar = 0.2),n = 100) adf.test(x) # ADF test for co2 data adf.test(co2)
# ADF test for AR(1) process x <- arima.sim(list(order = c(1,0,0),ar = 0.2),n = 100) adf.test(x) # ADF test for co2 data adf.test(co2)
Performs Portmanteau Q and Lagrange Multiplier tests for the null hypothesis that the residuals of a ARIMA model are homoscedastic.
arch.test(object, output = TRUE)
arch.test(object, output = TRUE)
object |
an object from arima model estimated by
|
output |
a logical value indicating to print the results in R console, including the
plots. The default is |
The ARCH Engle's test is constructed based on the fact that if the residuals
(defined as ) are heteroscedastic, the squared residuals (
) are
autocorrelated. The
first type of test is to examine whether the squares of residuals are a sequence of white
noise, which is called Portmanteau Q test and similar to the Ljung-Box test on the squared
residuals. The second type of test proposed by Engle (1982) is the Lagrange Multiplier
test which is to fit a linear regression model for the squared residuals and examine
whether the fitted model
is significant. So the null hypothesis is that the squared residuals are a sequence
of white noise, namely, the residuals are homoscedastic. The lag parameter
to calculate the test statistics is taken from an integer sequence of
with
step 4 if
, otherwise 2, where
is the number of nonmissing observations.
The plots of residuals, squared residuals, p.values of PQ and LM tests will be drawn if
output = TRUE
.
A matrix with the following five columns:
order |
the lag parameter to calculate the test statistics. |
PQ |
the Portmanteau Q test statistic. |
p.value |
the p.value for PQ test. |
LM |
the Lagrange Multiplier test statistic. |
p.value |
the p.value for LM test. |
Missing values are removed before analysis.
Debin Qiu
Engle, Robert F. (1982). Autoregressive Conditional Heteroscedasticity with Estimates of the Variance of United Kingdom Inflation. Econometrica, 50 (4): 987-1007.
McLeod, A. I. and W. K. Li. Diagnostic Checking ARMA Time Series Models Using Squared-Residual Autocorrelations. Journal of Time Series Analysis. Vol. 4, 1983, pp. 269-273.
x <- rnorm(100) mod <- estimate(x,p = 1) # or mod <- arima(x,order = c(1,0,0)) arch.test(mod)
x <- rnorm(100) mod <- estimate(x,p = 1) # or mod <- arima(x,order = c(1,0,0)) arch.test(mod)
This is an alternative package to analyze the time series data, especially the univariate time series. Compared with other existing functions for time series analysis, most functions in this package provide nice outputs like SAS does for time series. Several functions are exactly the same names as 'arima' procedure in SAS, such as identify
, estimate
, and forecast
, etc. They also have the similar outputs.
Package: | aTSA |
Type: | Package |
Version: | 3.1.2 |
Date: | 2015-06-19 |
License: | GPL-2 | GPL-3 |
For a complete list of functions and dataset, use library(help = aTSA)
.
Debin Qiu
Maintainer: Debin Qiu <[email protected]>
Engle, Robert F.; Granger, Clive W. J. (1987). Co-integration and error correction: Representation, estimation and testing. Econometrica, 55 (2): 251-276.
Engle, Robert F. (1982). Autoregressive Conditional Heteroscedasticity with Estimates of the Variance of United Kingdom Inflation. Econometrica, 50 (4): 987-1007.
Fuller, W. A. (1976). Introduction to Statistical Time Series. New York: John Wiley and Sons.
Hobijn B, Franses PH and Ooms M (2004). Generalization of the KPSS-test for stationarity. Statistica Neerlandica, vol. 58, p. 482-502.
Kwiatkowski, D.; Phillips, P. C. B.; Schmidt, P.; Shin, Y. (1992). Testing the null hypothesis of stationarity against the alternative of a unit root. Journal of Econometrics, 54 (1-3): 159-178.
McLeod, A. I. and W. K. Li. Diagnostic Checking ARMA Time Series Models Using Squared-Residual Autocorrelations. Journal of Time Series Analysis. Vol. 4, 1983, pp. 269-27.
Phillips, P. C. B.; Perron, P. (1988). Testing for a Unit Root in Time Series Regression. Biometrika, 75 (2): 335-346.
Performs Engle-Granger(or EG) tests for the null hypothesis that two or more time series, each of which is I(1), are not cointegrated.
coint.test(y, X, d = 0, nlag = NULL, output = TRUE)
coint.test(y, X, d = 0, nlag = NULL, output = TRUE)
y |
the response |
X |
the exogenous input variable of a numeric vector or a matrix. |
d |
difference operator for both |
nlag |
the lag order to calculate the test statistics. The default is |
output |
a logical value indicating to print the results in R console.
The default is |
To implement the original EG tests, one first has to fit the linear regression
where is the coefficient vector and
is an error term.
With the fitted model, the residuals are obtained, i.e.,
and a Augmented Dickey-Fuller test is utilized to examine whether the sequence of
residuals
is white noise. The null hypothesis of non-cointegration
is equivalent to the null hypothesis that
is white noise. See
adf.test
for more details of Augmented Dickey-Fuller test, as well as the default nlag
.
A matrix for test results with three columns (lag
, EG
, p.value
)
and three rows (type1
, type2
, type3
).
Each row is the test results (including lag parameter,
test statistic and p.value) for each type of linear regression models of residuals
. See
adf.test
for more details of three types of linear models.
Debin Qiu
MacKinnon, J. G. (1991). Critical values for cointegration tests, Ch. 13 in Long-run Economic Relationships: Readings in Cointegration, eds. R. F. Engle and C. W. J. Granger, Oxford, Oxford University Press.
X <- matrix(rnorm(200),100,2) y <- 0.3*X[,1] + 1.2*X[,2] + rnorm(100) # test for original y and X coint.test(y,X) # test for response = diff(y,differences = 1) and # input = apply(X, diff, differences = 1) coint.test(y,X,d = 1)
X <- matrix(rnorm(200),100,2) y <- 0.3*X[,1] + 1.2*X[,2] + rnorm(100) # test for original y and X coint.test(y,X) # test for response = diff(y,differences = 1) and # input = apply(X, diff, differences = 1) coint.test(y,X,d = 1)
Fits an error correction model for univriate response.
ecm(y, X, output = TRUE)
ecm(y, X, output = TRUE)
y |
a response of a numeric vector or univariate time series. |
X |
an exogenous input of a numeric vector or a matrix for multivariate time series. |
output |
a logical value indicating to print the results in R console.
The default is |
An error correction model captures the short term relationship between the
response y
and the exogenous input variable X
. The model is defined as
where is an operator of the first order difference, i.e.,
, and
is a coefficient vector with the
number of elements being the number of columns of
X
(i.e., the number
of exogenous input variables), and which is the
main term in the sense that its coefficient
explains the short term
dynamic relationship between
y
and X
in this model, in which is estimated from the linear regression model
. Here,
and
are both error terms
but from different linear models.
An object with class "lm
", which is the same results of lm
for
fitting linear regression.
Missing values are removed before the analysis. In the results, dX
or
dX1
, dX2
, ... represents the first difference of each exogenous input
variable X
, and dy
is the first difference of response y
.
Debin Qiu
Engle, Robert F.; Granger, Clive W. J. (1987). Co-integration and error correction: Representation, estimation and testing. Econometrica, 55 (2): 251-276.
X <- matrix(rnorm(200),100,2) y <- 0.1*X[,1] + 2*X[,2] + rnorm(100) ecm(y,X)
X <- matrix(rnorm(200),100,2) y <- 0.1*X[,1] + 2*X[,2] + rnorm(100) ecm(y,X)
Estimates an ARIMA model for a univariate time series, including a sparse ARIMA model.
estimate(x, p = 0, d = 0, q = 0, PDQ = c(0, 0, 0), S = NA, method = c("CSS-ML", "ML", "CSS"), intercept = TRUE, output = TRUE, ...)
estimate(x, p = 0, d = 0, q = 0, PDQ = c(0, 0, 0), S = NA, method = c("CSS-ML", "ML", "CSS"), intercept = TRUE, output = TRUE, ...)
x |
a univariate time series. |
p |
the AR order, can be a positive integer or a vector with several positive
integers. The default is |
d |
the degree of differencing. The default is |
q |
the MA order, can be a positive integer or a vector with several positive
integers. The default is |
PDQ |
a vector with three non-negative integers for specification of the seasonal
part of the ARIMA model. The default is |
S |
the period of seasonal ARIMA model. The default is |
method |
fitting method. The default is |
intercept |
a logical value indicating to include the intercept in ARIMA model. The
default is |
output |
a logical value indicating to print the results in R console. The default is
|
... |
optional arguments to |
This function is similar to the ESTIMATE statement in ARIMA procedure of SAS,
except that it does not fit a transfer function model for a univariate time series. The
fitting method is inherited from arima
in stats
package. To be
specific, the pure ARIMA(p,q) is defined as
The p
and q
can be a vector for fitting a sparse ARIMA model. For example,
p = c(1,3),q = c(1,3)
means the ARMA((1,3),(1,3)) model defined as
The PDQ
controls the
order of seasonal ARIMA model, i.e., ARIMA(p,d,q)x(P,D,Q)(S), where S is the seasonal
period. Note that the difference operators d
and D = PDQ
[2] are different.
The d
is equivalent to diff(x,differences = d)
and D is
diff(x,lag = D,differences = S)
, where the default seasonal period is
S = frequency(x)
.
The residual diagnostics plots will be drawn.
A list with class "estimate
" and the same results as
arima
. See arima
for
more details.
Missing values are removed before the estimate. Sparse seasonal ARIMA(p,d,q)x(P,D,Q)(S) model is not allowed.
Debin Qiu
Brockwell, P. J. and Davis, R. A. (1996). Introduction to Time Series and Forecasting. Springer, New York. Sections 3.3 and 8.3.
estimate(lh, p = 1) # AR(1) process estimate(lh, p = 1, q = 1) # ARMA(1,1) process estimate(lh, p = c(1,3)) # sparse AR((1,3)) process # seasonal ARIMA(0,1,1)x(0,1,1)(12) model estimate(USAccDeaths, p = 1, d = 1, PDQ = c(0,1,1))
estimate(lh, p = 1) # AR(1) process estimate(lh, p = 1, q = 1) # ARMA(1,1) process estimate(lh, p = c(1,3)) # sparse AR((1,3)) process # seasonal ARIMA(0,1,1)x(0,1,1)(12) model estimate(USAccDeaths, p = 1, d = 1, PDQ = c(0,1,1))
Performs a simple exponential smoothing for univariate time series with no trend or seasonal pattern.
expsmooth(x, trend = 1, alpha = 0.2, beta = 0.10557, gamma = 0.07168, lead = 0, plot = TRUE)
expsmooth(x, trend = 1, alpha = 0.2, beta = 0.10557, gamma = 0.07168, lead = 0, plot = TRUE)
x |
a numeric vector or univariate time series. |
trend |
the type of trend. See details. |
alpha |
the smoothing parameter for constant component. The default is |
beta |
the smoothing parameter for linear component. The default is |
gamma |
the smoothing parameter for quadratic component. The default is |
lead |
the number of steps ahead for which prediction is required.
The default is |
plot |
a logical value indicating to print the plot of original data v.s smoothed
data. The default is |
Simple exponential smoothing is a weighted average between the most recent
observation and the most recent forecasting, with weights and
, respectively. To be precise, the smoothing equation of single exponential
smoothing (constant model,
trend = 1
) is given by
and the forecasting equation is
for .
The initial value
. For example,
,
,..., etc.
Let be the smoothed values of single exponential smoothing. The double
exponential smoothing (
trend = 2
, a linear model) is to apply a single
exponential smoothing again to the smoothed sequence , with a new smoothing
parameter
beta
. Similarly, we denote the smoothed values of double
exponential smoothing to be . The triple exponential smoothing
(
trend = 3
, a quadratic model) is to apply the single exponential smoothing
to the smoothed sequence with a new smoothing parameter
gamma
. The
default smoothing parameters (weights) alpha
, beta
, gamma
are
taken from the equation 1 - 0.8^{1/trend}
respectively, which is similar
to the FORECAST procedure in SAS.
A list with class "es"
containing the following components:
estimate |
the smoothed values. |
pred |
the predicted values when |
accurate |
the accurate measurements. |
Missing values are removed before the analysis.
Debin Qiu
x <- rnorm(100) es <- expsmooth(x) # trend = 1: a constant model plot(x,type = "l") lines(es$estimate,col = 2) expsmooth(x,trend = 2) # trend = 2: a linear model expsmooth(x,trend = 3) # trend = 3: a quadratic model
x <- rnorm(100) es <- expsmooth(x) # trend = 1: a constant model plot(x,type = "l") lines(es$estimate,col = 2) expsmooth(x,trend = 2) # trend = 2: a linear model expsmooth(x,trend = 3) # trend = 3: a quadratic model
Forecasts from models fitted by arima
or estimate
function.
forecast(object, lead = 1, id = NULL, alpha = 0.05, output = TRUE)
forecast(object, lead = 1, id = NULL, alpha = 0.05, output = TRUE)
object |
the result of an |
lead |
the number of steps ahead for which prediction is required. The default is
|
id |
the id of the observation which is the time. The default is |
alpha |
the significant level for constructing the confidence interval of prediction.
The default is |
output |
a logical value indicating to print the results in R console.
The default is |
This function is originally from predict.Arima
in stats
package,
but has a nice output
including 100*(1 - )% confidence interval and a prediction plot. It is
similar to FORECAST statement in PROC ARIMA of SAS.
A matrix with lead
rows and five columns. Each column represents the number
of steps ahead (Lead
), the predicted values (Forecast
), the standard errors
(S.E
) and the 100*(1 - )% lower bound (
Lower
) and upper bound
(Upper
) of confidence interval.
Debin Qiu
x <- arima.sim(list(order = c(3,0,0),ar = c(0.2,0.4,-0.15)),n = 100) fit <- estimate(x,p = 3) # same as fit <- arima(x,order = c(3,0,0)) forecast(fit,lead = 4) # forecast with id t <- as.Date("2014-03-25") + 1:100 forecast(fit,lead = 4, id = t)
x <- arima.sim(list(order = c(3,0,0),ar = c(0.2,0.4,-0.15)),n = 100) fit <- estimate(x,p = 3) # same as fit <- arima(x,order = c(3,0,0)) forecast(fit,lead = 4) # forecast with id t <- as.Date("2014-03-25") + 1:100 forecast(fit,lead = 4, id = t)
Performs Holt's two-parameter exponential smoothing for linear trend or damped trend.
Holt(x, type = c("additive", "multiplicative"), alpha = 0.2, beta = 0.1057, lead = 0, damped = FALSE, phi = 0.98, plot = TRUE)
Holt(x, type = c("additive", "multiplicative"), alpha = 0.2, beta = 0.1057, lead = 0, damped = FALSE, phi = 0.98, plot = TRUE)
x |
a numeric vector or univariate time series. |
type |
the type of interaction between the level and the linear trend. See details. |
alpha |
the parameter for the level smoothing. The default is |
beta |
the parameter for the trend smoothing. The default is |
lead |
the number of steps ahead for which prediction is required.
The default is |
damped |
a logical value indicating a damped trend. See details. The default is
|
phi |
a smoothing parameter for damped trend. The default is |
plot |
a logical value indicating to print the plot of original data v.s smoothed
data. The default is |
Holt's two parameter is used to forecast a time series with trend, but
wihtout seasonal pattern. For the additive model (type = "additive"
), the
-step-ahead forecast is given by
,
where
in which is the trend component.
For the multiplicative (
type = "multiplicative"
) model, the
-step-ahead forecast is given by
,
where
Compared with the Holt's linear trend that displays a constant increasing or
decreasing, the damped trend generated by exponential smoothing method shows a
exponential growth or decline, which is a situation between simple exponential
smoothing (with 0 increasing or decreasing rate) and Holt's two-parameter smoothing.
If damped = TRUE
, the additive model becomes
The multiplicative model becomes
See Chapter 7.4 for more details in R. J. Hyndman and G. Athanasopoulos (2013).
A list with class "Holt
" containing the following components:
estimate |
the estimate values. |
alpha |
the smoothing parameter used for level. |
beta |
the smoothing parameter used for trend. |
phi |
the smoothing parameter used for damped trend. |
pred |
the predicted values, only available for |
accurate |
the accurate measurements. |
Missing values are removed before analysis.
Debin Qiu
R. J. Hyndman and G. Athanasopoulos, "Forecasting: principles and practice," 2013. [Online]. Available: http://otexts.org/fpp/.
HoltWinters
, expsmooth
, Winters
x <- (1:100)/100 y <- 2 + 1.2*x + rnorm(100) ho0 <- Holt(y) # with additive interaction ho1 <- Holt(y,damped = TRUE) # with damped trend # multiplicative model for AirPassengers data, # although seasonal pattern exists. ho2 <- Holt(AirPassengers,type = "multiplicative")
x <- (1:100)/100 y <- 2 + 1.2*x + rnorm(100) ho0 <- Holt(y) # with additive interaction ho1 <- Holt(y,damped = TRUE) # with damped trend # multiplicative model for AirPassengers data, # although seasonal pattern exists. ho2 <- Holt(AirPassengers,type = "multiplicative")
Checks the white noise and stationarity of a univariate time series, and also identifies an appropriate ARIMA model using AICC criterion.
identify(x, p = NULL, q = NULL, nlag = 6, intercept = TRUE, stat.test = FALSE, method = c("adf", "pp", "kpss"), output = TRUE)
identify(x, p = NULL, q = NULL, nlag = 6, intercept = TRUE, stat.test = FALSE, method = c("adf", "pp", "kpss"), output = TRUE)
x |
a numeric vector or univariate time series. |
p |
the maximum lag order for AR process. The default is |
q |
the maximum lag order for MA process. The default is |
nlag |
the lag parameter to calculate the Ljung-Box test statistic.
The default is |
intercept |
an intercept to be included in ARIMA model, only valid for
|
stat.test |
the stationary test for time series, see |
method |
the method of stationary test, only valid for |
output |
a logical value indicating to print the results in R console.
The default is |
This function is similar to IDENTIFY statement in ARIMA procedure of SAS software,
which is to check the white noise and stationarity for a univariate time
series. The white noise check is accomplished by using Box.test
in stats
package, with the default method type = "Ljung-Box"
.
The stationary check uses the stationary.test
implemented in this package.
The AICC criterion (Burnham and Anderson (2002)) is used to identify an optimal model which has the minimum AICC value. The AICC is defined as
where which is called Akaike information criterion
(Akaike (1974)). Here,
are the number of estimated parameters and observations, respectively.
is the maximized value of the likelihood function for the model.
Four plots are made: plot of original data, ACF plot, PACF plot and p.value of white noise check.
A list with class "identify
" containing the following components:
WNcheck |
a matrix with three columns for results of white noise check for each lag. |
aicc |
the minimum AICC value, only available for |
min.p |
the optimal order |
min.q |
the optimal order |
stnt.test |
a list of stationary test results with three components. See
|
Missing values are removed before the analysis.
Debin Qiu
Akaike, H. (1974), "A new look at the statistical model identification", IEEE Transactions on Automatic Control, 19 (6): 716-723.
Box, G. E. P. and Pierce, D. A. (1970), Distribution of residual correlations in autoregressive-integrated moving average time series models. Journal of the American Statistical Association, 65, 1509-1526.
Burnham, K. P.; Anderson, D. R. (2002), Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach (2nd ed.), Springer-Verlag
Ljung, G. M. and Box, G. E. P. (1978). On a measure of lack of fit in time series models. Biometrika 65, 297-303.
Harvey, A. C. (1993) Time Series Models. 2nd Edition, Harvester Wheatsheaf, NY, pp. 44, 45.
x <- arima.sim(list(order = c(2,0,0),ar = c(0.2,0.4)),n = 100) identify(x) # white noise check identify(x,stat.test = TRUE) # white noise and stationarity check identify(x,p = 3,q = 2) # white noise check and optimal model identification.
x <- arima.sim(list(order = c(2,0,0),ar = c(0.2,0.4)),n = 100) identify(x) # white noise check identify(x,stat.test = TRUE) # white noise and stationarity check identify(x,p = 3,q = 2) # white noise check and optimal model identification.
Performs Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test for the null
hypothesis that x
is a stationary univariate time series.
kpss.test(x, lag.short = TRUE, output = TRUE)
kpss.test(x, lag.short = TRUE, output = TRUE)
x |
a numeric vector or univariate time series. |
lag.short |
a logical value indicating whether the parameter of lag to calculate the test statistic is a short or long term. The default is a short term. See details. |
output |
a logical value indicating to print out the results in R console.
The default is |
The Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test tends to decompose the time series into the sum of a deterministic trend, a random walk, and a stationary error:
where satisfies
, and
are i.i.d
. The null hypothesis is that
, which implies
x
is a stationary time series. In order to calculate the test statistic,
we consider three types of linear regression models.
The first type (type1
) is the one with no drift and deterministic trend,
defined as
The second type (type2
) is the one with drift but no trend:
The third type (type3
) is the one with both drift and trend:
The details of calculation of test statistic (kpss
) can be seen in the references
below. The default parameter of lag to calculate the test statistic is
for short term effect, otherwise,
for long term effect.
The p.value is calculated by the interpolation of test statistic from tables of
critical values (Table 5, Hobijn B., Franses PH. and Ooms M (2004)) for a given
sample size
= length(
x
).
A matrix for test results with three columns (lag
, kpss
,
p.value
) and three rows (type1
, type2
, type3
).
Each row is the test results (including lag parameter, test statistic and p.value) for
each type of linear regression models.
Missing values are removed.
Debin Qiu
Hobijn B, Franses PH and Ooms M (2004). Generalization of the KPSS-test for stationarity. Statistica Neerlandica, vol. 58, p. 482-502.
Kwiatkowski, D.; Phillips, P. C. B.; Schmidt, P.; Shin, Y. (1992). Testing the null hypothesis of stationarity against the alternative of a unit root. Journal of Econometrics, 54 (1-3): 159-178.
adf.test
, pp.test
, stationary.test
# KPSS test for AR(1) process x <- arima.sim(list(order = c(1,0,0),ar = 0.2),n = 100) kpss.test(x) # KPSS test for co2 data kpss.test(co2)
# KPSS test for AR(1) process x <- arima.sim(list(order = c(1,0,0),ar = 0.2),n = 100) kpss.test(x) # KPSS test for co2 data kpss.test(co2)
Applies moving average filter to estimate the linear trend or nonseasonal pattern.
MA(x, nlag = NULL, plot = TRUE)
MA(x, nlag = NULL, plot = TRUE)
x |
a numeric vector or univariate time series. |
nlag |
the number of period to calculate the average. The default is |
plot |
a logical value indicating to print out the plot. The default is |
The moving average filter uses the unweight mean of (2*nlag
+ 1) adjacent
observations. That is,
for .
For the values at the boundary
or
, you can
refer to Equation (7) in Qiu et al., (2013) for details of calculations.
The default method for choosing the optimal
nlag
uses the rule-of-thumb
criterion proposed by Qiu, et al., (2013), in which they showed that the moving
average
is a special case of local linear estimator in the sense that the kernel function is the
uniform one, and the moving average period nlag
is a function of bandwidth. Thus,
choosing the optimal nlag
is equivalent to choosing the optimal bandwidth in local
linear regression.
The plot of original values v.s fitted values will be displayed if plot = TRUE
.
A list with class "MA
" containing the following components:
estimate |
the smoothed values. |
nlag |
the period used to compute the average. |
accurate |
the accurate measurements. |
Debin Qiu
D. Qiu, Q. Shao, and L. Yang (2013), Efficient inference for autoregressive coefficient in the presence of trend. Journal of Multivariate Analysis 114, 40-53.
P.J. Brockwell, R.A. Davis, Time Series: Theory and Methods, second ed., Springer, New York, 1991.
x <- arima.sim(list(order = c(1,0,0),ar = 0.4),n = 100) y <- 5*(1:100)/100 + x MA(y) # moving average filter for co2 data MA(co2)
x <- arima.sim(list(order = c(1,0,0),ar = 0.4),n = 100) y <- 5*(1:100)/100 + x MA(y) # moving average filter for co2 data MA(co2)
Performs the Phillips-Perron test for the null hypothesis of a unit root of
a univariate time series x
(equivalently, x
is a non-stationary time series).
pp.test(x, type = c("Z_rho", "Z_tau"), lag.short = TRUE, output = TRUE)
pp.test(x, type = c("Z_rho", "Z_tau"), lag.short = TRUE, output = TRUE)
x |
a numeric vector or univariate time series. |
type |
the type of Phillips-Perron test. The default is |
lag.short |
a logical value indicating whether the parameter of lag to calculate the statistic is a short or long term. The default is a short term. |
output |
a logical value indicating to print the results in R console. The default
is |
Compared with the Augmented Dickey-Fuller test, Phillips-Perron test makes
correction to the test statistics and is robust to the unspecified autocorrelation
and heteroscedasticity in the errors. There are two types of test statistics,
and
, which have the same asymptotic distributions
as Augmented Dickey-Fuller test statistic,
ADF
. The calculations of each type
of the Phillips-Perron test can be see in the reference below. If the
lag.short = TRUE
, we use the default number of Newey-West lags
,
otherwise
to calculate the test statistics.
In order to calculate the test statistic, we consider
three types of linear regression models. The first type (
type1
) is the one
with no drift and linear trend with respect to time:
where is an error term.
The second type (
type2
) is the one with drift but no linear trend:
The third type (type3) is the one with both drift and linear trend:
The p.value is calculated by the interpolation of test statistics from the critical values
tables (Table 10.A.1 for Z_rho
and 10.A.2 for Z_tau
in Fuller (1996))
with a given sample size = length(
x
).
A matrix for test results with three columns (lag
,Z_rho
or Z_tau
, p.value
) and three rows (type1
, type2
, type3
).
Each row is the test results (including lag parameter, test statistic and p.value) for
each type of linear equation.
Missing values are removed.
Debin Qiu
Phillips, P. C. B.; Perron, P. (1988). Testing for a Unit Root in Time Series Regression. Biometrika, 75 (2): 335-346.
Fuller, W. A. (1996). Introduction to statistical time series, second ed., Wiley, New York.
adf.test
, kpss.test
, stationary.test
# PP test for ar(1) process x <- arima.sim(list(order = c(1,0,0),ar = 0.2),n = 100) pp.test(x) # PP test for co2 data pp.test(co2)
# PP test for ar(1) process x <- arima.sim(list(order = c(1,0,0),ar = 0.2),n = 100) pp.test(x) # PP test for co2 data pp.test(co2)
Performs stationary test for a univariate time series.
stationary.test(x, method = c("adf", "pp", "kpss"), nlag = NULL, type = c("Z_rho", "Z_tau"), lag.short = TRUE, output = TRUE)
stationary.test(x, method = c("adf", "pp", "kpss"), nlag = NULL, type = c("Z_rho", "Z_tau"), lag.short = TRUE, output = TRUE)
x |
a numeric vector or univariate time series. |
method |
a character indicating which test to use. The default is
|
nlag |
the lag order to calculate the test statistic, only valid for
|
type |
the test type, only valid for |
lag.short |
a logical value, only valid for |
output |
a logical value indicating to print the results in R console.
The default is |
This function combines the existing functions adf.test
,
pp.test
and
kpss.test
for testing the stationarity of a univariate time series x
.
The results are the same as one of the adf.test
, pp.test
,
kpss.test
, depending on which test are used.
Missing values are removed.
Debin Qiu
x <- arima.sim(list(order = c(1,0,0),ar = 0.2),n = 100) stationary.test(x) # same as adf.test(x) stationary.test(x, method = "pp") # same as pp.test(x) stationary.test(x, method = "kpss") # same as kpss.test(x)
x <- arima.sim(list(order = c(1,0,0),ar = 0.2),n = 100) stationary.test(x) # same as adf.test(x) stationary.test(x, method = "pp") # same as pp.test(x) stationary.test(x, method = "kpss") # same as kpss.test(x)
Fit a stepwise autoregressive model
stepar(y, xreg = NULL, trend = c("linear", "quadratic", "constant"), order = NULL, lead = 0, newx = NULL, output = TRUE, ...)
stepar(y, xreg = NULL, trend = c("linear", "quadratic", "constant"), order = NULL, lead = 0, newx = NULL, output = TRUE, ...)
y |
a numeric vector of response |
xreg |
a numeric vector or matrix of exogenous input variables. The default is
|
trend |
the type of trend with respective to time. The default is |
order |
the order to fit the AR model for residuals. The default is |
lead |
the number of steps ahead for which prediction is required.
The default is |
newx |
a matrix of new data of |
output |
a logical value indicating to print the results in R console. The default is
|
... |
additional arguments for |
The stewise autoregressive model uses a two-stage procedure to fit time series.
The first stage is to fit a (constant
,linear
,quadratic
)
model with respective to time sequence:
, where
. If
xreg
is supplied,
the fitted model is updated by
for trend = "constant"
, and
for trend = "linear"
, and
for trend = "quadratic"
.
The second stage is to fit an autoregressive process to the residuals of the fitted
model obtained in the first stage, which is accomplished by using ar
function
in stats
package.
A list with class "stepar
" containing the following components:
coef |
a estimated coefficient matrix including the t-test results. |
sigma |
the square root of the estimated variance of the random error. |
R.squared |
the R^2 for fitted model in the first stage. |
pred |
the predictions, only available for |
If lead
> 0 and xreg
is supplied, newx
must also be
supplied in order to make a prediction. The newx
must be a matrix with
the same number of columns as xreg
and the number of rows being
equal to lead
. The predictions should be used with cautions.
Debin Qiu
x <- 5*(1:100)/100 x <- x + arima.sim(list(order = c(1,0,0),ar = 0.4),n = 100) stepar(x) stepar(x,order = 1) # with xreg supplied X <- matrix(rnorm(200),100,2) y <- 0.1*X[,1] + 1.2*X[,2] + rnorm(100) stepar(y,X) # make a prediction with lead = 1; used with caution. newdat1 <- matrix(rnorm(2),nrow = 1) fit1 <- stepar(y,X,lead = 1,newx = newdat1,output = FALSE) # make a prediction with lead = 2; used with caution. newdat2 <- matrix(rnorm(4),nrow = 2) fit2 <- stepar(y,X,lead = 2,newx = newdat2,output = FALSE)
x <- 5*(1:100)/100 x <- x + arima.sim(list(order = c(1,0,0),ar = 0.4),n = 100) stepar(x) stepar(x,order = 1) # with xreg supplied X <- matrix(rnorm(200),100,2) y <- 0.1*X[,1] + 1.2*X[,2] + rnorm(100) stepar(y,X) # make a prediction with lead = 1; used with caution. newdat1 <- matrix(rnorm(2),nrow = 1) fit1 <- stepar(y,X,lead = 1,newx = newdat1,output = FALSE) # make a prediction with lead = 2; used with caution. newdat2 <- matrix(rnorm(4),nrow = 2) fit2 <- stepar(y,X,lead = 2,newx = newdat2,output = FALSE)
Performs an approximate Cox-Stuart or Difference-Sign trend test.
trend.test(x, method = c("cox.stuart", "diff.sign"), plot = FALSE)
trend.test(x, method = c("cox.stuart", "diff.sign"), plot = FALSE)
x |
a numeric vector or univariate time series. |
method |
test method. The default is |
plot |
a logical value indicating to display the plot of data.
The default is |
Cox-Stuart or Difference-Sign test is used to test whether the data have a
increasing or decreasing trend. They are useful to detect the linear or nonlinear trend.
The Cox-Stuart test is constructed as follows.
For the given data , one can divide them into two sequences with
equal number of observations cutted in the midpoint and then take the paired difference,
i.e.,
, where
is the index of
midpoint. Let
be the number of positive or negative values in
. Under the
null hypothesis that data have no trend, for large
= length(x),
is
approximately distributed as
, such that one can immediately obtain
the p value. The exact Cox-Stuart trend test can be seen in
cs.test
of
snpar
package.
The Difference-Sign test is constructed as the similar way as Cox-Stuart test. We first
let for
and then
count the number of positive or negative values in
, defined as
.
Under the null hypothesis,
is approximately distributed as
. Thus, p-value can be calculated based on the null distribution.
A list with class "htest
" containing the following components:
data.name |
a character string giving the names of the data. |
method |
the type of test applied. |
alternative |
a character string describing the alternative hypothesis. |
p.value |
the p-value for the test. |
statistic |
the value of the test statistic with a name describing it. |
Missing values are removed.
Debin Qiu
D.R. Cox and A. Stuart (1955). Some quick sign tests for trend in location and dispersion. Biometrika, Vol. 42, pp. 80-95.
P.J. Brockwell, R.A. Davis, Time Series: Theory and Methods, second ed., Springer, New York, 1991. (p. 37)
x <- rnorm(100) trend.test(x,plot = TRUE) # no trend x <- 5*(1:100)/100 x <- x + arima.sim(list(order = c(1,0,0),ar = 0.4),n = 100) trend.test(x,plot = TRUE) # increasing trend
x <- rnorm(100) trend.test(x,plot = TRUE) # no trend x <- 5*(1:100)/100 x <- x + arima.sim(list(order = c(1,0,0),ar = 0.4),n = 100) trend.test(x,plot = TRUE) # increasing trend
Performs diagnostics for ARIMA model fitted by arima
or
estimate
with output of diagnostic plots.
ts.diag(object, lag.seq = NULL)
ts.diag(object, lag.seq = NULL)
object |
the result of an |
lag.seq |
the sequence of lag to calculate the Ljung-Box test statistics. The default
is |
This function is similar to ts.diag
in stats
package, but with
one more diagnostic plot for the normality of residuals. Also, the default sequence of lags
for a Ljung-Box test is set to be seq(4,24,by = 4)
if sample size ,
otherwise
seq(1,n,4)
. This function has been automatically implemented in
estimate
function.
Diagnostics are plotted, including the ACF plot, PACF plot, p.value of white noise checking plot, and Q-Q plot for residuals.
A matrix for the result of white noise checking by Ljung-Box test.
Debin Qiu
x <- arima.sim(list(order = c(3,0,0),ar = c(0.2,0.4,-0.15)),n = 100) fit <- estimate(x,p = 3) # same as fit <- arima(x,order = c(3,0,0)) ts.diag(fit)
x <- arima.sim(list(order = c(3,0,0),ar = c(0.2,0.4,-0.15)),n = 100) fit <- estimate(x,p = 3) # same as fit <- arima(x,order = c(3,0,0)) ts.diag(fit)
Performs Winters three-parameter smoothing for a univariate time series with seasonal pattern.
Winters(x, period = NULL, trend = 2, lead = 0, plot = TRUE, seasonal = c("additive", "multiplicative"), damped = FALSE, alpha = 0.2, beta = 0.1057, gamma = 0.07168, phi = 0.98, a.start = NA, b.start = NA, c.start = NA)
Winters(x, period = NULL, trend = 2, lead = 0, plot = TRUE, seasonal = c("additive", "multiplicative"), damped = FALSE, alpha = 0.2, beta = 0.1057, gamma = 0.07168, phi = 0.98, a.start = NA, b.start = NA, c.start = NA)
x |
a univariate time series. |
period |
seasonal period. The default is |
trend |
the type of trend component, can be one of 1,2,3 which represents constant,
linear and quadratic trend, respectively. The default is |
lead |
the number of steps ahead for which prediction is required.
The default is |
plot |
a logical value indicating to display the smoothed graph. The default is
|
seasonal |
character string to select an " |
damped |
a logical value indicating to include the damped trend, only valid for
|
alpha |
the parameter of level smoothing. The default is |
beta |
the parameter of trend smoothing. The default is |
gamma |
the parameter of season smoothing. The default is |
phi |
the parameter of damped trend smoothing, only valid for |
a.start |
the starting value for level smoothing. The default is |
b.start |
the starting value for trend smoothing. The default is |
c.start |
the starting value for season smoothing. The default is |
The Winters filter is used to decompose the trend and seasonal components by
updating equations. This is similar to the function HoltWinters
in
stats
package but may be in different perspective. To be precise, it uses the
updating equations similar to exponential
smoothing to fit the parameters for the following models when
seasonal = "additive"
.
If the trend is constant (trend = 1
):
If the trend is linear (trend = 2
):
If the trend is quadratic (trend = 3
):
Here are the trend parameters,
is the seasonal
parameter for the
season corresponding to time
.
For the multiplicative season, the models are as follows.
If the trend is constant (
trend = 1
):
If the trend is linear (trend = 2
):
If the trend is quadratic (trend = 3
):
When seasonal = "multiplicative"
, the updating equations for each parameter can
be seen in page 606-607 of PROC FORECAST document of SAS. Similarly, for the
additive seasonal model, the 'division' (/) for and
in page 606-607
is changed to 'minus' (-).
The default starting values for are computed by a time-trend regression over
the first cycle of time series. The default starting values for the seasonal factors are
computed from seasonal averages. The default smoothing parameters (weights)
alpha,
beta, gamma
are taken from the equation 1 - 0.8^{1/trend}
respectively. You can
also use the HoltWinters
function to get the optimal smoothing parameters
and plug them back in this function.
The prediction equation is for
trend = 2
and
seasonal = "additive"
. Similar equations can be derived for the other options. When
the damped = TRUE
, the prediction equation is
. More details can be
referred to R. J. Hyndman and G. Athanasopoulos (2013).
A list with class "Winters
" containing the following components:
season |
the seasonal factors. |
estimate |
the smoothed values. |
pred |
the prediction, only available with |
accurate |
the accurate measurements. |
The sum of seasonal factors is equal to the seasonal period. This restriction is to ensure the identifiability of seasonality in the models above.
Debin Qiu
P. R. Winters (1960) Forecasting sales by exponentially weighted moving averages, Management Science 6, 324-342.
R. J. Hyndman and G. Athanasopoulos, "Forecasting: principles and practice," 2013. [Online]. Available: http://otexts.org/fpp/.
Winters(co2) Winters(AirPassengers, seasonal = "multiplicative")
Winters(co2) Winters(AirPassengers, seasonal = "multiplicative")