Title: | Testing, Monitoring, and Dating Structural Changes: C++ Version |
---|---|
Description: | A fast implementation with additional experimental features for testing, monitoring and dating structural changes in (linear) regression models. 'strucchangeRcpp' features tests/methods from the generalized fluctuation test framework as well as from the F test (Chow test) framework. This includes methods to fit, plot and test fluctuation processes (e.g. cumulative/moving sum, recursive/moving estimates) and F statistics, respectively. These methods are described in Zeileis et al. (2002) <doi:10.18637/jss.v007.i02>. Finally, the breakpoints in regression models with structural changes can be estimated together with confidence intervals, and their magnitude as well as the model fit can be evaluated using a variety of statistical measures. |
Authors: | Dainius Masiliunas [aut, cre] , Achim Zeileis [aut] , Marius Appel [aut], Friedrich Leisch [aut], Kurt Hornik [aut], Christian Kleiber [aut], Andrei Mirt [ctb] , Bruce Hansen [ctb], Edgar C. Merkle [ctb], Nikolaus Umlauf [ctb] |
Maintainer: | Dainius Masiliunas <[email protected]> |
License: | GPL-2 | GPL-3 |
Version: | 1.5-4-1.0.0 |
Built: | 2024-11-17 06:57:35 UTC |
Source: | CRAN |
Data about the number of youth homicides in Boston during the ‘Boston Gun Project’—a policing initiative aiming at lowering homicide victimization among young people in Boston.
data("BostonHomicide")
data("BostonHomicide")
A data frame containing 6 monthly time series and two factors coding seasonality and year, respectively.
time series. Number of youth homicides.
time series. Boston population (aged 25-44), linearly interpolated from annual data.
time series. Population of black males (aged 15-24), linearly interpolated from annual data.
time series. Number of adult homicides (aged 25 and older).
time series. Number of adult homicides (aged 35-44).
time series. Teen unemployment rate (in percent).
factor coding the month.
factor coding the year.
The ‘Boston Gun Project’ is a policing initiative aiming at lowering youth homicides in Boston. The project began in early 1995 and implemented the so-called ‘Operation Ceasefire’ intervention which began in the late spring of 1996.
Piehl et al. (2004), Figure 1, Figure 3, and Table 1.
From the table it is not clear how the data should be linearly interpolated.
Here, it was chosen to use the given observations for July of the corresponding
year and then use approx
with rule = 2
.
Piehl A.M., Cooper S.J., Braga A.A., Kennedy D.M. (2003), Testing for Structural Breaks in the Evaluation of Programs, The Review of Economics and Statistics, 85(3), 550-558.
Kennedy D.M., Piehl A.M., Braga A.A. (1996), Youth Violence in Boston: Gun Markets, Serious Youth Offenders, and a Use-Reduction Strategy, Law and Contemporary Problems, 59, 147-183.
data("BostonHomicide") attach(BostonHomicide) ## data from Table 1 tapply(homicides, year, mean) populationBM[0:6*12 + 7] tapply(ahomicides25, year, mean) tapply(ahomicides35, year, mean) population[0:6*12 + 7] unemploy[0:6*12 + 7] ## model A ## via OLS fmA <- lm(homicides ~ populationBM + season) anova(fmA) ## as GLM fmA1 <- glm(homicides ~ populationBM + season, family = poisson) anova(fmA1, test = "Chisq") ## model B & C fmB <- lm(homicides ~ populationBM + season + ahomicides25) fmC <- lm(homicides ~ populationBM + season + ahomicides25 + unemploy) detach(BostonHomicide)
data("BostonHomicide") attach(BostonHomicide) ## data from Table 1 tapply(homicides, year, mean) populationBM[0:6*12 + 7] tapply(ahomicides25, year, mean) tapply(ahomicides35, year, mean) population[0:6*12 + 7] unemploy[0:6*12 + 7] ## model A ## via OLS fmA <- lm(homicides ~ populationBM + season) anova(fmA) ## as GLM fmA1 <- glm(homicides ~ populationBM + season, family = poisson) anova(fmA1, test = "Chisq") ## model B & C fmB <- lm(homicides ~ populationBM + season + ahomicides25) fmC <- lm(homicides ~ populationBM + season + ahomicides25 + unemploy) detach(BostonHomicide)
A generic function computing boundaries for structural change tests
boundary(x, ...)
boundary(x, ...)
x |
an object. Use |
... |
additional arguments affecting the boundary. |
an object of class "ts"
with the same time properties as
the time series in x
boundary.efp
, boundary.mefp
,
boundary.Fstats
Computes boundary for an object of class "efp"
## S3 method for class 'efp' boundary(x, alpha = 0.05, alt.boundary = FALSE, functional = "max", ...)
## S3 method for class 'efp' boundary(x, alpha = 0.05, alt.boundary = FALSE, functional = "max", ...)
x |
an object of class |
alpha |
numeric from interval (0,1) indicating the confidence level for which the boundary of the corresponding test will be computed. |
alt.boundary |
logical. If set to |
functional |
indicates which functional should be applied to the
empirical fluctuation process. See also |
... |
currently not used. |
an object of class "ts"
with the same time properties as
the process in x
## Load dataset "nhtemp" with average yearly temperatures in New Haven data("nhtemp") ## plot the data plot(nhtemp) ## test the model null hypothesis that the average temperature remains constant ## over the years ## compute OLS-CUSUM fluctuation process temp.cus <- efp(nhtemp ~ 1, type = "OLS-CUSUM") ## plot the process without boundaries plot(temp.cus, alpha = 0.01, boundary = FALSE) ## add the boundaries in another colour bound <- boundary(temp.cus, alpha = 0.01) lines(bound, col=4) lines(-bound, col=4)
## Load dataset "nhtemp" with average yearly temperatures in New Haven data("nhtemp") ## plot the data plot(nhtemp) ## test the model null hypothesis that the average temperature remains constant ## over the years ## compute OLS-CUSUM fluctuation process temp.cus <- efp(nhtemp ~ 1, type = "OLS-CUSUM") ## plot the process without boundaries plot(temp.cus, alpha = 0.01, boundary = FALSE) ## add the boundaries in another colour bound <- boundary(temp.cus, alpha = 0.01) lines(bound, col=4) lines(-bound, col=4)
Computes boundary for an object of class "Fstats"
## S3 method for class 'Fstats' boundary(x, alpha = 0.05, pval = FALSE, aveF = FALSE, asymptotic = FALSE, ...)
## S3 method for class 'Fstats' boundary(x, alpha = 0.05, pval = FALSE, aveF = FALSE, asymptotic = FALSE, ...)
x |
an object of class |
alpha |
numeric from interval (0,1) indicating the confidence level for which the boundary of the supF test will be computed. |
pval |
logical. If set to |
aveF |
logical. If set to |
asymptotic |
logical. If set to |
... |
currently not used. |
an object of class "ts"
with the same time properties as
the time series in x
## Load dataset "nhtemp" with average yearly temperatures in New Haven data("nhtemp") ## plot the data plot(nhtemp) ## test the model null hypothesis that the average temperature remains ## constant over the years for potential break points between 1941 ## (corresponds to from = 0.5) and 1962 (corresponds to to = 0.85) ## compute F statistics fs <- Fstats(nhtemp ~ 1, from = 0.5, to = 0.85) ## plot the p values without boundary plot(fs, pval = TRUE, alpha = 0.01) ## add the boundary in another colour lines(boundary(fs, pval = TRUE, alpha = 0.01), col = 2)
## Load dataset "nhtemp" with average yearly temperatures in New Haven data("nhtemp") ## plot the data plot(nhtemp) ## test the model null hypothesis that the average temperature remains ## constant over the years for potential break points between 1941 ## (corresponds to from = 0.5) and 1962 (corresponds to to = 0.85) ## compute F statistics fs <- Fstats(nhtemp ~ 1, from = 0.5, to = 0.85) ## plot the p values without boundary plot(fs, pval = TRUE, alpha = 0.01) ## add the boundary in another colour lines(boundary(fs, pval = TRUE, alpha = 0.01), col = 2)
Computes boundary for an object of class "mefp"
## S3 method for class 'mefp' boundary(x, ...)
## S3 method for class 'mefp' boundary(x, ...)
x |
an object of class |
... |
currently not used. |
an object of class "ts"
with the same time properties as
the monitored process
df1 <- data.frame(y=rnorm(300)) df1[150:300,"y"] <- df1[150:300,"y"]+1 me1 <- mefp(y~1, data=df1[1:50,,drop=FALSE], type="ME", h=1, alpha=0.05) me2 <- monitor(me1, data=df1) plot(me2, boundary=FALSE) lines(boundary(me2), col="green", lty="44")
df1 <- data.frame(y=rnorm(300)) df1[150:300,"y"] <- df1[150:300,"y"]+1 me1 <- mefp(y~1, data=df1[1:50,,drop=FALSE], type="ME", h=1, alpha=0.05) me2 <- monitor(me1, data=df1) plot(me2, boundary=FALSE) lines(boundary(me2), col="green", lty="44")
A generic function for computing the breakdates corresponding to breakpoints (and their confidence intervals).
breakdates(obj, format.times = FALSE, ...)
breakdates(obj, format.times = FALSE, ...)
obj |
An object of class |
format.times |
logical. If set to |
... |
currently not used. |
Breakpoints are the number of observations that are the last in one
segment and breakdates are the corresponding points on the underlying
time scale. The breakdates can be formatted which enhances readability
in particular for quarterly or monthly time series. For example the
breakdate 2002.75
of a monthly time series will be formatted to
"2002(10)"
.
A vector or matrix containing the breakdates.
## Nile data with one breakpoint: the annual flows drop in 1898 ## because the first Ashwan dam was built data("Nile") plot(Nile) bp.nile <- breakpoints(Nile ~ 1) summary(bp.nile) plot(bp.nile) ## compute breakdates corresponding to the ## breakpoints of minimum BIC segmentation breakdates(bp.nile) ## confidence intervals ci.nile <- confint(bp.nile) breakdates(ci.nile) ci.nile plot(Nile) lines(ci.nile)
## Nile data with one breakpoint: the annual flows drop in 1898 ## because the first Ashwan dam was built data("Nile") plot(Nile) bp.nile <- breakpoints(Nile ~ 1) summary(bp.nile) plot(bp.nile) ## compute breakdates corresponding to the ## breakpoints of minimum BIC segmentation breakdates(bp.nile) ## confidence intervals ci.nile <- confint(bp.nile) breakdates(ci.nile) ci.nile plot(Nile) lines(ci.nile)
Generates a factor encoding the segmentation given by a set of breakpoints.
breakfactor(obj, breaks = NULL, labels = NULL, ...)
breakfactor(obj, breaks = NULL, labels = NULL, ...)
obj |
An object of class |
breaks |
an integer specifying the number of breaks
to extract (only if |
labels |
a vector of labels for the returned factor,
by default the segments are numbered starting from
|
... |
further arguments passed to |
A factor encoding the segmentation.
## Nile data with one breakpoint: the annual flows drop in 1898 ## because the first Ashwan dam was built data("Nile") plot(Nile) ## compute breakpoints bp.nile <- breakpoints(Nile ~ 1) ## fit and visualize segmented and unsegmented model fm0 <- lm(Nile ~ 1) fm1 <- lm(Nile ~ breakfactor(bp.nile, breaks = 1)) lines(fitted(fm0), col = 3) lines(fitted(fm1), col = 4) lines(bp.nile, breaks = 1)
## Nile data with one breakpoint: the annual flows drop in 1898 ## because the first Ashwan dam was built data("Nile") plot(Nile) ## compute breakpoints bp.nile <- breakpoints(Nile ~ 1) ## fit and visualize segmented and unsegmented model fm0 <- lm(Nile ~ 1) fm1 <- lm(Nile ~ breakfactor(bp.nile, breaks = 1)) lines(fitted(fm0), col = 3) lines(fitted(fm1), col = 4) lines(bp.nile, breaks = 1)
Computation of breakpoints in regression relationships. Given a number of breaks the function computes the optimal breakpoints.
## S3 method for class 'formula' breakpoints(formula, h = 0.15, breaks = c("BIC", "LWZ", "RSS", "all"), data = list(), hpc = c("none", "foreach"), ...) ## S3 method for class 'matrix' breakpoints(obj, y, h = 0.15, breaks = c("BIC", "LWZ", "RSS", "all"), hpc = c("none", "foreach"), ...) ## S3 method for class 'breakpointsfull' breakpoints(obj, breaks = c("BIC", "LWZ", "RSS", "all"), ...) ## S3 method for class 'breakpointsfull' summary(object, breaks = NULL, sort = NULL, format.times = NULL, ...) ## S3 method for class 'breakpoints' lines(x, breaks = NULL, lty = 2, ...) ## S3 method for class 'breakpointsfull' coef(object, breaks = NULL, names = NULL, ...) ## S3 method for class 'breakpointsfull' fitted(object, breaks = NULL, bp = NULL, ...) ## S3 method for class 'breakpointsfull' residuals(object, breaks = NULL, ...) ## S3 method for class 'breakpointsfull' vcov(object, breaks = NULL, names = NULL, het.reg = TRUE, het.err = TRUE, vcov. = NULL, sandwich = TRUE, ...)
## S3 method for class 'formula' breakpoints(formula, h = 0.15, breaks = c("BIC", "LWZ", "RSS", "all"), data = list(), hpc = c("none", "foreach"), ...) ## S3 method for class 'matrix' breakpoints(obj, y, h = 0.15, breaks = c("BIC", "LWZ", "RSS", "all"), hpc = c("none", "foreach"), ...) ## S3 method for class 'breakpointsfull' breakpoints(obj, breaks = c("BIC", "LWZ", "RSS", "all"), ...) ## S3 method for class 'breakpointsfull' summary(object, breaks = NULL, sort = NULL, format.times = NULL, ...) ## S3 method for class 'breakpoints' lines(x, breaks = NULL, lty = 2, ...) ## S3 method for class 'breakpointsfull' coef(object, breaks = NULL, names = NULL, ...) ## S3 method for class 'breakpointsfull' fitted(object, breaks = NULL, bp = NULL, ...) ## S3 method for class 'breakpointsfull' residuals(object, breaks = NULL, ...) ## S3 method for class 'breakpointsfull' vcov(object, breaks = NULL, names = NULL, het.reg = TRUE, het.err = TRUE, vcov. = NULL, sandwich = TRUE, ...)
obj , object
|
an object of class |
y |
response vector. |
formula |
a symbolic description for the model in which breakpoints will be estimated. |
h |
minimal segment size either given as fraction relative to the sample size or as an integer giving the minimal number of observations in each segment. |
breaks |
either a positive integer specifying the maximal number of breaks to be calculated,
or a string specifying the information criterion to use to automatically determine
the optimal number of breaks (see also |
data |
an optional data frame containing the variables in the model. By
default the variables are taken from the environment which |
hpc |
a character specifying the high performance computing support.
Default is |
... |
arguments passed to |
sort |
logical. If set to |
format.times |
logical. If set to |
x |
an object of class |
lty |
line type. |
names |
a character vector giving the names of the segments. If of length
1 it is taken to be a generic prefix, e.g. |
bp |
integer vector denoting the breakpoint indices for which to get the fitted values.
Default is to choose according to |
het.reg |
logical. Should heterogeneous regressors be assumed? If set
to |
het.err |
logical. Should heterogeneous errors be assumed? If set
to |
vcov. |
a function to extract the covariance matrix
for the coefficients of a fitted model of class |
sandwich |
logical. Is the function |
All procedures in this package are concerned with testing or assessing deviations from stability in the classical linear regression model
In many applications it is reasonable to assume
that there are breakpoints, where the coefficients shift from
one stable regression relationship to a different one. Thus,
there are
segments in which the regression coefficients are
constant, and the model can be rewritten as
where denotes the segment index. In practice the breakpoints
are rarely given exogenously, but have to be estimated.
breakpoints
estimates these breakpoints by minimizing the residual sum of
squares (RSS) of the equation above.
The foundation for estimating breaks in time series regression models
was given by Bai (1994) and was extended to multiple breaks by Bai (1997ab)
and Bai & Perron (1998). breakpoints
implements the algorithm
described in Bai & Perron (2003) for simultaneous estimation of
multiple breakpoints. The distribution function used for the confidence
intervals for the breakpoints is given in Bai (1997b). The ideas behind
this implementation are described in Zeileis et al. (2003).
The algorithm for computing the optimal breakpoints given the number
of breaks is based on a dynamic programming approach. The underlying
idea is that of the Bellman principle. The main computational effort
is to compute a triangular RSS matrix, which gives the residual
sum of squares for a segment starting at observation and
ending at
with
<
.
Given a formula
as the first argument, breakpoints
computes
an object of class "breakpointsfull"
which inherits from "breakpoints"
.
This contains in particular the triangular RSS
matrix and functions to extract an optimal segmentation. A summary
of this object will give the breakpoints (and associated) breakdates
for all segmentations up to the maximal number of breaks together
with the associated RSS, BIC and LWZ. These will be plotted if plot
is applied and thus visualize the minimum BIC and LWZ estimators of the number
of breakpoints. From an object of class "breakpointsfull"
an
arbitrary number of breaks
(admissible by the minimum segment
size h
) can be extracted by another application of
breakpoints
, returning an object of class "breakpoints"
.
This contains only the breakpoints for the specified number of breaks
and some model properties (number of observations, regressors, time
series properties and the associated RSS) but not the triangular RSS
matrix and related extractor functions. The set of breakpoints which
is associated by default with a "breakpointsfull"
object is
the minimum BIC partition.
Breakpoints are the number of observations that are the last in one
segment, it is also possible to compute the corresponding breakdates
which are the breakpoints on the underlying time scale. The breakdates
can be formatted which enhances readability in particular for quarterly
or monthly time series. For example the breakdate 2002.75
of a monthly
time series will be formatted to "2002(10)"
. See breakdates
for more details.
From a "breakpointsfull"
object confidence intervals for the breakpoints
can be computed using the method of confint
.
The breakdates corresponding to the breakpoints can again be computed
by breakdates
. The breakpoints and their confidence
intervals can be visualized by lines
. Convenience functions are
provided for extracting the coefficients and covariance matrix, fitted
values and residuals of segmented models.
The log likelihood as well as some information criteria can be computed
using the methods for the logLik
, AIC
and LWZ
. As
for linear models the log likelihood is computed on a normal model and
the degrees of freedom are the number of regression coefficients multiplied
by the number of segments plus the number of estimated breakpoints plus
1 for the error variance. More details can be found on the help page of
the method logLik.breakpoints
.
As the maximum of a sequence of F statistics is equivalent to the minimum
OLS estimator of the breakpoint in a 2-segment partition it can be
extracted by breakpoints
from an object of class "Fstats"
as computed by Fstats
. However, this cannot be used to extract
a larger number of breakpoints.
For illustration see the commented examples below and Zeileis et al. (2003).
Optional support for high performance computing is available, currently using
foreach
for the dynamic programming algorithm.
If hpc = "foreach"
is to be used, a parallel backend should be registered
before. See foreach
for more information.
An object of class "breakpoints"
is a list with the following
elements:
the breakpoints of the optimal partition with the
number of breaks specified (set to NA
if the optimal 1-segment
solution is reported),
the associated RSS,
the number of observations,
the number of regressors,
the function call,
the time series properties tsp
of the data,
if any, c(1/nobs, 1, nobs)
otherwise.
If applied to a formula
as first argument, breakpoints
returns an object of class
"breakpointsfull"
(which inherits from "breakpoints"
), that
contains some additional (or slightly different) elements such as:
the breakpoints of the minimum BIC partition,
a function which takes two arguments i,j
and computes
the residual sum of squares for a segment starting at observation i
and
ending at j
by looking up the corresponding element in the triangular
RSS matrix RSS.triang
,
a list encoding the triangular RSS matrix.
Bai J. (1994), Least Squares Estimation of a Shift in Linear Processes, Journal of Time Series Analysis, 15, 453-472.
Bai J. (1997a), Estimating Multiple Breaks One at a Time, Econometric Theory, 13, 315-352.
Bai J. (1997b), Estimation of a Change Point in Multiple Regression Models, Review of Economics and Statistics, 79, 551-563.
Bai J., Perron P. (1998), Estimating and Testing Linear Models With Multiple Structural Changes, Econometrica, 66, 47-78.
Bai J., Perron P. (2003), Computation and Analysis of Multiple Structural Change Models, Journal of Applied Econometrics, 18, 1-22.
Zeileis A., Kleiber C., Krämer W., Hornik K. (2003), Testing and Dating of Structural Changes in Practice, Computational Statistics and Data Analysis, 44, 109-123. doi:10.1016/S0167-9473(03)00030-6.
Zeileis A., Shah A., Patnaik I. (2010), Testing, Monitoring, and Dating Structural Changes in Exchange Rate Regimes, Computational Statistics and Data Analysis, 54(6), 1696–1706. doi:10.1016/j.csda.2009.12.005.
## Nile data with one breakpoint: the annual flows drop in 1898 ## because the first Ashwan dam was built data("Nile") plot(Nile) ## F statistics indicate one breakpoint fs.nile <- Fstats(Nile ~ 1) plot(fs.nile) breakpoints(fs.nile) lines(breakpoints(fs.nile)) ## or bp.nile <- breakpoints(Nile ~ 1) summary(bp.nile) ## the BIC and LWZ also choose one breakpoint plot(bp.nile) breakpoints(bp.nile) breakpoints(bp.nile, breaks = "LWZ") ## fit null hypothesis model and model with 1 breakpoint fm0 <- lm(Nile ~ 1) fm1 <- lm(Nile ~ breakfactor(bp.nile, breaks = 1)) plot(Nile) lines(ts(fitted(fm0), start = 1871), col = 3) lines(ts(fitted(fm1), start = 1871), col = 4) lines(bp.nile) ## confidence interval ci.nile <- confint(bp.nile) ci.nile lines(ci.nile) ## UK Seatbelt data: a SARIMA(1,0,0)(1,0,0)_12 model ## (fitted by OLS) is used and reveals (at least) two ## breakpoints - one in 1973 associated with the oil crisis and ## one in 1983 due to the introduction of compulsory ## wearing of seatbelts in the UK. data("UKDriverDeaths") seatbelt <- log10(UKDriverDeaths) seatbelt <- cbind(seatbelt, lag(seatbelt, k = -1), lag(seatbelt, k = -12)) colnames(seatbelt) <- c("y", "ylag1", "ylag12") seatbelt <- window(seatbelt, start = c(1970, 1), end = c(1984,12)) plot(seatbelt[,"y"], ylab = expression(log[10](casualties))) ## testing re.seat <- efp(y ~ ylag1 + ylag12, data = seatbelt, type = "RE") plot(re.seat) ## dating bp.seat <- breakpoints(y ~ ylag1 + ylag12, data = seatbelt, h = 0.1) summary(bp.seat) lines(bp.seat, breaks = 2) ## minimum BIC partition plot(bp.seat) breakpoints(bp.seat) ## the BIC would choose 0 breakpoints although the RE and supF test ## clearly reject the hypothesis of structural stability. Bai & ## Perron (2003) report that the BIC has problems in dynamic regressions. ## due to the shape of the RE process of the F statistics choose two ## breakpoints and fit corresponding models bp.seat2 <- breakpoints(bp.seat, breaks = 2) fm0 <- lm(y ~ ylag1 + ylag12, data = seatbelt) fm1 <- lm(y ~ breakfactor(bp.seat2)/(ylag1 + ylag12) - 1, data = seatbelt) ## plot plot(seatbelt[,"y"], ylab = expression(log[10](casualties))) time.seat <- as.vector(time(seatbelt)) lines(time.seat, fitted(fm0), col = 3) lines(time.seat, fitted(fm1), col = 4) lines(bp.seat2) ## confidence intervals ci.seat2 <- confint(bp.seat, breaks = 2) ci.seat2 lines(ci.seat2)
## Nile data with one breakpoint: the annual flows drop in 1898 ## because the first Ashwan dam was built data("Nile") plot(Nile) ## F statistics indicate one breakpoint fs.nile <- Fstats(Nile ~ 1) plot(fs.nile) breakpoints(fs.nile) lines(breakpoints(fs.nile)) ## or bp.nile <- breakpoints(Nile ~ 1) summary(bp.nile) ## the BIC and LWZ also choose one breakpoint plot(bp.nile) breakpoints(bp.nile) breakpoints(bp.nile, breaks = "LWZ") ## fit null hypothesis model and model with 1 breakpoint fm0 <- lm(Nile ~ 1) fm1 <- lm(Nile ~ breakfactor(bp.nile, breaks = 1)) plot(Nile) lines(ts(fitted(fm0), start = 1871), col = 3) lines(ts(fitted(fm1), start = 1871), col = 4) lines(bp.nile) ## confidence interval ci.nile <- confint(bp.nile) ci.nile lines(ci.nile) ## UK Seatbelt data: a SARIMA(1,0,0)(1,0,0)_12 model ## (fitted by OLS) is used and reveals (at least) two ## breakpoints - one in 1973 associated with the oil crisis and ## one in 1983 due to the introduction of compulsory ## wearing of seatbelts in the UK. data("UKDriverDeaths") seatbelt <- log10(UKDriverDeaths) seatbelt <- cbind(seatbelt, lag(seatbelt, k = -1), lag(seatbelt, k = -12)) colnames(seatbelt) <- c("y", "ylag1", "ylag12") seatbelt <- window(seatbelt, start = c(1970, 1), end = c(1984,12)) plot(seatbelt[,"y"], ylab = expression(log[10](casualties))) ## testing re.seat <- efp(y ~ ylag1 + ylag12, data = seatbelt, type = "RE") plot(re.seat) ## dating bp.seat <- breakpoints(y ~ ylag1 + ylag12, data = seatbelt, h = 0.1) summary(bp.seat) lines(bp.seat, breaks = 2) ## minimum BIC partition plot(bp.seat) breakpoints(bp.seat) ## the BIC would choose 0 breakpoints although the RE and supF test ## clearly reject the hypothesis of structural stability. Bai & ## Perron (2003) report that the BIC has problems in dynamic regressions. ## due to the shape of the RE process of the F statistics choose two ## breakpoints and fit corresponding models bp.seat2 <- breakpoints(bp.seat, breaks = 2) fm0 <- lm(y ~ ylag1 + ylag12, data = seatbelt) fm1 <- lm(y ~ breakfactor(bp.seat2)/(ylag1 + ylag12) - 1, data = seatbelt) ## plot plot(seatbelt[,"y"], ylab = expression(log[10](casualties))) time.seat <- as.vector(time(seatbelt)) lines(time.seat, fitted(fm0), col = 3) lines(time.seat, fitted(fm1), col = 4) lines(bp.seat2) ## confidence intervals ci.seat2 <- confint(bp.seat, breaks = 2) ci.seat2 lines(ci.seat2)
Generators for efpFunctional
objects suitable for aggregating
empirical fluctuation processes to test statistics along (ordinal)
categorical variables.
catL2BB(freq) ordL2BB(freq, nproc = NULL, nrep = 1e5, probs = c(0:84/100, 850:1000/1000), ...) ordwmax(freq, algorithm = mvtnorm::GenzBretz(), ...)
catL2BB(freq) ordL2BB(freq, nproc = NULL, nrep = 1e5, probs = c(0:84/100, 850:1000/1000), ...) ordwmax(freq, algorithm = mvtnorm::GenzBretz(), ...)
freq |
object specifying the category frequencies for the
categorical variable to be used for aggregation: either a
|
nproc |
numeric. Number of processes used for simulating
from the asymptotic distribution (passed to |
nrep |
numeric. Number of replications used for simulating
from the asymptotic distribution (passed to |
probs |
numeric vector specifying for which probabilities critical values should be tabulated. |
... |
further arguments passed to |
algorithm |
algorithm specification passed to |
Merkle, Fan, and Zeileis (2014) discuss three functionals that are
suitable for aggregating empirical fluctuation processes along categorical
variables, especially ordinal variables. The functions catL2BB
,
ordL2BB
, and ordwmax
all require a specification of the
relative frequencies within each category (which can be computed from
various specifications, see arguments). All of them employ
efpFunctional
(Zeileis 2006) internally to set up an
object that can be employed with gefp
fluctuation
processes.
catL2BB
results in a chi-squared test. This is essentially
the LM test counterpart to the likelihood ratio test that assesses
a split into unordered categories.
ordL2BB
is the ordinal counterpart to supLM
where aggregation is done along the ordered categories (rather than
continuously). The asymptotic distribution is non-standard and needs
to be simulated for every combination of frequencies and number of
processes. Hence, this is somewhat more time-consuming compared to
the closed-form solution employed in catL2BB
. It is also
possible to store the result of ordL2BB
in case it needs to
be applied several gefp
fluctuation processes.
ordwmax
is a weighted double maximum test based on ideas
previously suggested by Hothorn and Zeileis (2008) in the context of
maximally selected statistics. The asymptotic distribution is
(multivariate) normal and computed by means of pmvnorm
.
An object of class efpFunctional
.
Hothorn T., Zeileis A. (2008), Generalized Maximally Selected Statistics. Biometrics, 64, 1263–1269.
Merkle E.C., Fan J., Zeileis A. (2014), Testing for Measurement Invariance with Respect to an Ordinal Variable. Psychometrika, 79(4), 569–584. doi:10.1007/S11336-013-9376-7.
Zeileis A. (2006), Implementing a Class of Structural Change Tests: An Econometric Computing Approach. Computational Statistics & Data Analysis, 50, 2987–3008. doi:10.1016/j.csda.2005.07.001.
## artificial data set.seed(1) d <- data.frame( x = runif(200, -1, 1), z = factor(rep(1:4, each = 50)), err = rnorm(200) ) d$y <- rep(c(0.5, -0.5), c(150, 50)) * d$x + d$err ## empirical fluctuation process scus <- gefp(y ~ x, data = d, fit = lm, order.by = ~ z) ## chi-squared-type test (unordered LM-type test) LMuo <- catL2BB(scus) plot(scus, functional = LMuo) sctest(scus, functional = LMuo) ## ordinal maxLM test (with few replications only to save time) maxLMo <- ordL2BB(scus, nrep = 10000) plot(scus, functional = maxLMo) sctest(scus, functional = maxLMo) ## ordinal weighted double maximum test WDM <- ordwmax(scus) plot(scus, functional = WDM) sctest(scus, functional = WDM)
## artificial data set.seed(1) d <- data.frame( x = runif(200, -1, 1), z = factor(rep(1:4, each = 50)), err = rnorm(200) ) d$y <- rep(c(0.5, -0.5), c(150, 50)) * d$x + d$err ## empirical fluctuation process scus <- gefp(y ~ x, data = d, fit = lm, order.by = ~ z) ## chi-squared-type test (unordered LM-type test) LMuo <- catL2BB(scus) plot(scus, functional = LMuo) sctest(scus, functional = LMuo) ## ordinal maxLM test (with few replications only to save time) maxLMo <- ordL2BB(scus, nrep = 10000) plot(scus, functional = maxLMo) sctest(scus, functional = maxLMo) ## ordinal weighted double maximum test WDM <- ordwmax(scus) plot(scus, functional = WDM) sctest(scus, functional = WDM)
Computes confidence intervals for breakpoints.
## S3 method for class 'breakpointsfull' confint(object, parm = NULL, level = 0.95, breaks = NULL, het.reg = TRUE, het.err = TRUE, vcov. = NULL, sandwich = TRUE, ...) ## S3 method for class 'confint.breakpoints' lines(x, col = 2, angle = 90, length = 0.05, code = 3, at = NULL, breakpoints = TRUE, ...)
## S3 method for class 'breakpointsfull' confint(object, parm = NULL, level = 0.95, breaks = NULL, het.reg = TRUE, het.err = TRUE, vcov. = NULL, sandwich = TRUE, ...) ## S3 method for class 'confint.breakpoints' lines(x, col = 2, angle = 90, length = 0.05, code = 3, at = NULL, breakpoints = TRUE, ...)
object |
an object of class |
parm |
the same as |
level |
the confidence level required. |
breaks |
an integer specifying the number of breaks to be used. By default the breaks of the minimum BIC partition are used. |
het.reg |
logical. Should heterogeneous regressors be assumed? If set
to |
het.err |
logical. Should heterogeneous errors be assumed? If set
to |
vcov. |
a function to extract the covariance matrix
for the coefficients of a fitted model of class |
sandwich |
logical. Is the function |
x |
an object of class |
col , angle , length , code
|
arguments passed to |
at |
position on the y axis, where the confidence arrows should be drawn. By default they are drawn at the bottom of the plot. |
breakpoints |
logical. If |
... |
currently not used. |
As the breakpoints are integers (observation numbers) the corresponding confidence intervals are also rounded to integers.
The distribution function used for the computation of confidence intervals of breakpoints is given in Bai (1997). The procedure, in particular the usage of heterogeneous regressors and/or errors, is described in more detail in Bai & Perron (2003).
The breakpoints should be computed from a formula with breakpoints
,
then the confidence intervals for the breakpoints can be derived by
confint
and these can be visualized by lines
. For an
example see below.
A matrix containing the breakpoints and their lower and upper confidence boundary for the given level.
Bai J. (1997), Estimation of a Change Point in Multiple Regression Models, Review of Economics and Statistics, 79, 551-563.
Bai J., Perron P. (2003), Computation and Analysis of Multiple Structural Change Models, Journal of Applied Econometrics, 18, 1-22.
## Nile data with one breakpoint: the annual flows drop in 1898 ## because the first Ashwan dam was built data("Nile") plot(Nile) ## dating breaks bp.nile <- breakpoints(Nile ~ 1) ci.nile <- confint(bp.nile, breaks = 1) lines(ci.nile)
## Nile data with one breakpoint: the annual flows drop in 1898 ## because the first Ashwan dam was built data("Nile") plot(Nile) ## dating breaks bp.nile <- breakpoints(Nile ~ 1) ci.nile <- confint(bp.nile, breaks = 1) lines(ci.nile)
Weekly closing values of the Dow Jones Industrial Average.
data("DJIA")
data("DJIA")
A weekly univariate time series of class "zoo"
from 1971-07-01 to 1974-08-02.
Appendix A in Hsu (1979).
Hsu D. A. (1979), Detecting Shifts of Parameter in Gamma Sequences with Applications to Stock Price and Air Traffic Flow Analysis, Journal of the American Statistical Association, 74, 31–40.
data("DJIA") ## look at log-difference returns djia <- diff(log(DJIA)) plot(djia) ## convenience functions ## set up a normal regression model which ## explicitely also models the variance normlm <- function(formula, data = list()) { rval <- lm(formula, data = data) class(rval) <- c("normlm", "lm") return(rval) } estfun.normlm <- function(obj) { res <- residuals(obj) ef <- NextMethod(obj) sigma2 <- mean(res^2) rval <- cbind(ef, res^2 - sigma2) colnames(rval) <- c(colnames(ef), "(Variance)") return(rval) } ## normal model (with constant mean and variance) for log returns m1 <- gefp(djia ~ 1, fit = normlm, vcov = meatHAC, sandwich = FALSE) plot(m1, aggregate = FALSE) ## suggests a clear break in the variance (but not the mean) ## dating bp <- breakpoints(I(djia^2) ~ 1) plot(bp) ## -> clearly one break bp time(djia)[bp$breakpoints] ## visualization plot(djia) abline(v = time(djia)[bp$breakpoints], lty = 2) lines(time(djia)[confint(bp)$confint[c(1,3)]], rep(min(djia), 2), col = 2, type = "b", pch = 3)
data("DJIA") ## look at log-difference returns djia <- diff(log(DJIA)) plot(djia) ## convenience functions ## set up a normal regression model which ## explicitely also models the variance normlm <- function(formula, data = list()) { rval <- lm(formula, data = data) class(rval) <- c("normlm", "lm") return(rval) } estfun.normlm <- function(obj) { res <- residuals(obj) ef <- NextMethod(obj) sigma2 <- mean(res^2) rval <- cbind(ef, res^2 - sigma2) colnames(rval) <- c(colnames(ef), "(Variance)") return(rval) } ## normal model (with constant mean and variance) for log returns m1 <- gefp(djia ~ 1, fit = normlm, vcov = meatHAC, sandwich = FALSE) plot(m1, aggregate = FALSE) ## suggests a clear break in the variance (but not the mean) ## dating bp <- breakpoints(I(djia^2) ~ 1) plot(bp) ## -> clearly one break bp time(djia)[bp$breakpoints] ## visualization plot(djia) abline(v = time(djia)[bp$breakpoints], lty = 2) lines(time(djia)[confint(bp)$confint[c(1,3)]], rep(min(djia), 2), col = 2, type = "b", pch = 3)
US labor productivity in the manufacturing/durables sector.
data("durab")
data("durab")
durab
is a multivariate monthly time series from 1947(3)
to 2001(4) with variables
growth rate of the Industrial Production Index to average weekly labor hours in the manufacturing/durables sector,
lag 1 of the series y
,
The data set is available from Bruce Hansen's homepage https://www.ssc.wisc.edu/~bhansen/. For more information see Hansen (2001).
Hansen B. (2001), The New Econometrics of Structural Change: Dating Breaks in U.S. Labor Productivity, Journal of Economic Perspectives, 15, 117–128.
Zeileis A., Leisch F., Kleiber C., Hornik K. (2005), Monitoring Structural Change in Dynamic Econometric Models, Journal of Applied Econometrics, 20, 99–121.
data("durab") ## use AR(1) model as in Hansen (2001) and Zeileis et al. (2005) durab.model <- y ~ lag ## historical tests ## OLS-based CUSUM process ols <- efp(durab.model, data = durab, type = "OLS-CUSUM") plot(ols) ## F statistics fs <- Fstats(durab.model, data = durab, from = 0.1) plot(fs) ## F statistics based on heteroskadisticy-consistent covariance matrix fsHC <- Fstats(durab.model, data = durab, from = 0.1, vcov = function(x, ...) vcovHC(x, type = "HC", ...)) plot(fsHC) ## monitoring Durab <- window(durab, start=1964, end = c(1979, 12)) ols.efp <- efp(durab.model, type = "OLS-CUSUM", data = Durab) newborder <- function(k) 1.723 * k/192 ols.mefp <- mefp(ols.efp, period=2) ols.mefp2 <- mefp(ols.efp, border=newborder) Durab <- window(durab, start=1964) ols.mon <- monitor(ols.mefp) ols.mon2 <- monitor(ols.mefp2) plot(ols.mon) lines(boundary(ols.mon2), col = 2) ## Note: critical value for linear boundary taken from Table III ## in Zeileis et al. 2005: (1.568 + 1.896)/2 = 1.732 is a linear ## interpolation between the values for T = 2 and T = 3 at ## alpha = 0.05. A typo switched 1.732 to 1.723. ## dating bp <- breakpoints(durab.model, data = durab) summary(bp) plot(summary(bp)) plot(ols) lines(breakpoints(bp, breaks = 1), col = 3) lines(breakpoints(bp, breaks = 2), col = 4) plot(fs) lines(breakpoints(bp, breaks = 1), col = 3) lines(breakpoints(bp, breaks = 2), col = 4)
data("durab") ## use AR(1) model as in Hansen (2001) and Zeileis et al. (2005) durab.model <- y ~ lag ## historical tests ## OLS-based CUSUM process ols <- efp(durab.model, data = durab, type = "OLS-CUSUM") plot(ols) ## F statistics fs <- Fstats(durab.model, data = durab, from = 0.1) plot(fs) ## F statistics based on heteroskadisticy-consistent covariance matrix fsHC <- Fstats(durab.model, data = durab, from = 0.1, vcov = function(x, ...) vcovHC(x, type = "HC", ...)) plot(fsHC) ## monitoring Durab <- window(durab, start=1964, end = c(1979, 12)) ols.efp <- efp(durab.model, type = "OLS-CUSUM", data = Durab) newborder <- function(k) 1.723 * k/192 ols.mefp <- mefp(ols.efp, period=2) ols.mefp2 <- mefp(ols.efp, border=newborder) Durab <- window(durab, start=1964) ols.mon <- monitor(ols.mefp) ols.mon2 <- monitor(ols.mefp2) plot(ols.mon) lines(boundary(ols.mon2), col = 2) ## Note: critical value for linear boundary taken from Table III ## in Zeileis et al. 2005: (1.568 + 1.896)/2 = 1.732 is a linear ## interpolation between the values for T = 2 and T = 3 at ## alpha = 0.05. A typo switched 1.732 to 1.723. ## dating bp <- breakpoints(durab.model, data = durab) summary(bp) plot(summary(bp)) plot(ols) lines(breakpoints(bp, breaks = 1), col = 3) lines(breakpoints(bp, breaks = 2), col = 4) plot(fs) lines(breakpoints(bp, breaks = 1), col = 3) lines(breakpoints(bp, breaks = 2), col = 4)
Computes an empirical fluctuation process according to a specified method from the generalized fluctuation test framework, which includes CUSUM and MOSUM tests based on recursive or OLS residuals, parameter estimates or ML scores (OLS first order conditions).
## S3 method for class 'formula' efp(formula, data, type = , h = 0.15, dynamic = FALSE, rescale = TRUE, lrvar = FALSE, vcov = NULL, ...) ## S3 method for class 'matrix' efp(X, y, type = , h = 0.15, dynamic = FALSE, rescale = TRUE, ...)
## S3 method for class 'formula' efp(formula, data, type = , h = 0.15, dynamic = FALSE, rescale = TRUE, lrvar = FALSE, vcov = NULL, ...) ## S3 method for class 'matrix' efp(X, y, type = , h = 0.15, dynamic = FALSE, rescale = TRUE, ...)
X , y , formula
|
specification of the linear regression model:
either by a regressor matrix |
data |
an optional data frame containing the variables in the model. By
default the variables are taken from the environment which |
type |
specifies which type of fluctuation process will be
computed, the default is |
h |
bandwidth (for MOSUM and ME processes only), either specified relative to the sample size as a numeric from interval (0,1), or as an integer >= 1 determining the absolute bandwidth size (number of samples). |
dynamic |
logical. If |
rescale |
logical. If |
lrvar |
logical or character. Should a long-run variance estimator
be used for the residuals? By default, the standard OLS variance is employed.
Alternatively, |
vcov |
a function to extract the covariance matrix for the coefficients
of the fitted model (only for |
... |
currently not used. |
If type
is one of "Rec-CUSUM"
, "OLS-CUSUM"
,
"Rec-MOSUM"
or "OLS-MOSUM"
the function efp
will return a
one-dimensional empirical process of sums of residuals. Either it will be based
on recursive residuals or on OLS residuals and the process will contain
CUmulative SUMs or MOving SUMs of residuals in a certain data window.
For the MOSUM and ME processes all estimations are done for the
observations in a moving data window, whose size is determined by h
and
which is shifted over the whole sample.
If type
is either "RE"
or "ME"
a
k-dimensional process will be returned, if k is the number of
regressors in the model, as it is based on recursive OLS estimates of the
regression coefficients or moving OLS estimates respectively. The recursive
estimates test is also called fluctuation test, therefore setting type
to "fluctuation"
was used to specify it in earlier versions of
strucchange. It still can be used now, but will be forced to "RE"
.
If type
is "Score-CUSUM"
or "Score-MOSUM"
a k+1-dimensional
process will be returned, one for each score of the regression coefficients and one for
the scores of the variance. The process gives the decorrelated cumulative sums of the ML
scores (in a Gaussian model) or first order conditions respectively (in an OLS framework).
If there is a single structural change point , the recursive CUSUM path
starts to depart from its mean 0 at
. The Brownian bridge type paths
will have their respective peaks around
.
The Brownian bridge increments type paths should have a strong change at
.
The function plot
has a method to plot the empirical fluctuation process; with
sctest
the corresponding test on structural change can be
performed.
efp
returns a list of class "efp"
with components including:
process |
the fitted empirical fluctuation process of class
|
type |
a string with the |
nreg |
the number of regressors, |
nobs |
the number of observations, |
par |
the bandwidth |
Brown R.L., Durbin J., Evans J.M. (1975), Techniques for testing constancy of regression relationships over time, Journal of the Royal Statistical Society, B, 37, 149-163.
Chu C.-S., Hornik K., Kuan C.-M. (1995), MOSUM tests for parameter constancy, Biometrika, 82, 603-617.
Chu C.-S., Hornik K., Kuan C.-M. (1995), The moving-estimates test for parameter stability, Econometric Theory, 11, 669-720.
Hansen B. (1992), Testing for Parameter Instability in Linear Models, Journal of Policy Modeling, 14, 517-533.
Hjort N.L., Koning A. (2002), Tests for Constancy of Model Parameters Over Time, Nonparametric Statistics, 14, 113-132.
Krämer W., Ploberger W., Alt R. (1988), Testing for structural change in dynamic models, Econometrica, 56, 1355-1369.
Kuan C.-M., Hornik K. (1995), The generalized fluctuation test: A unifying view, Econometric Reviews, 14, 135 - 161.
Kuan C.-M., Chen (1994), Implementing the fluctuation and moving estimates tests in dynamic econometric models, Economics Letters, 44, 235-239.
Ploberger W., Krämer W. (1992), The CUSUM test with OLS residuals, Econometrica, 60, 271-285.
Zeileis A., Leisch F., Hornik K., Kleiber C. (2002), strucchange
:
An R Package for Testing for Structural Change in Linear Regression Models,
Journal of Statistical Software, 7(2), 1-38.
doi:10.18637/jss.v007.i02.
Zeileis A. (2005), A Unified Approach to Structural Change Tests Based on ML Scores, F Statistics, and OLS Residuals. Econometric Reviews, 24, 445–466. doi:10.1080/07474930500406053.
Zeileis A. (2006), Implementing a Class of Structural Change Tests: An Econometric Computing Approach. Computational Statistics & Data Analysis, 50, 2987–3008. doi:10.1016/j.csda.2005.07.001.
Zeileis A., Hornik K. (2007), Generalized M-Fluctuation Tests for Parameter Instability, Statistica Neerlandica, 61, 488–508. doi:10.1111/j.1467-9574.2007.00371.x.
gefp
, plot.efp
, print.efp
,
sctest.efp
, boundary.efp
## Nile data with one breakpoint: the annual flows drop in 1898 ## because the first Ashwan dam was built data("Nile") plot(Nile) ## test the null hypothesis that the annual flow remains constant ## over the years ## compute OLS-based CUSUM process and plot ## with standard and alternative boundaries ocus.nile <- efp(Nile ~ 1, type = "OLS-CUSUM") plot(ocus.nile) plot(ocus.nile, alpha = 0.01, alt.boundary = TRUE) ## calculate corresponding test statistic sctest(ocus.nile) ## UK Seatbelt data: a SARIMA(1,0,0)(1,0,0)_12 model ## (fitted by OLS) is used and reveals (at least) two ## breakpoints - one in 1973 associated with the oil crisis and ## one in 1983 due to the introduction of compulsory ## wearing of seatbelts in the UK. data("UKDriverDeaths") seatbelt <- log10(UKDriverDeaths) seatbelt <- cbind(seatbelt, lag(seatbelt, k = -1), lag(seatbelt, k = -12)) colnames(seatbelt) <- c("y", "ylag1", "ylag12") seatbelt <- window(seatbelt, start = c(1970, 1), end = c(1984,12)) plot(seatbelt[,"y"], ylab = expression(log[10](casualties))) ## use RE process re.seat <- efp(y ~ ylag1 + ylag12, data = seatbelt, type = "RE") plot(re.seat) plot(re.seat, functional = NULL) sctest(re.seat)
## Nile data with one breakpoint: the annual flows drop in 1898 ## because the first Ashwan dam was built data("Nile") plot(Nile) ## test the null hypothesis that the annual flow remains constant ## over the years ## compute OLS-based CUSUM process and plot ## with standard and alternative boundaries ocus.nile <- efp(Nile ~ 1, type = "OLS-CUSUM") plot(ocus.nile) plot(ocus.nile, alpha = 0.01, alt.boundary = TRUE) ## calculate corresponding test statistic sctest(ocus.nile) ## UK Seatbelt data: a SARIMA(1,0,0)(1,0,0)_12 model ## (fitted by OLS) is used and reveals (at least) two ## breakpoints - one in 1973 associated with the oil crisis and ## one in 1983 due to the introduction of compulsory ## wearing of seatbelts in the UK. data("UKDriverDeaths") seatbelt <- log10(UKDriverDeaths) seatbelt <- cbind(seatbelt, lag(seatbelt, k = -1), lag(seatbelt, k = -12)) colnames(seatbelt) <- c("y", "ylag1", "ylag12") seatbelt <- window(seatbelt, start = c(1970, 1), end = c(1984,12)) plot(seatbelt[,"y"], ylab = expression(log[10](casualties))) ## use RE process re.seat <- efp(y ~ ylag1 + ylag12, data = seatbelt, type = "RE") plot(re.seat) plot(re.seat, functional = NULL) sctest(re.seat)
Computes an object for aggregating, plotting and testing empirical fluctuation processes.
efpFunctional(functional = list(comp = function(x) max(abs(x)), time = max), boundary = function(x) rep(1, length(x)), computePval = NULL, computeCritval = NULL, plotProcess = NULL, lim.process = "Brownian bridge", nobs = 10000, nrep = 50000, nproc = 1:20, h = 0.5, probs = c(0:84/100, 850:1000/1000))
efpFunctional(functional = list(comp = function(x) max(abs(x)), time = max), boundary = function(x) rep(1, length(x)), computePval = NULL, computeCritval = NULL, plotProcess = NULL, lim.process = "Brownian bridge", nobs = 10000, nrep = 50000, nproc = 1:20, h = 0.5, probs = c(0:84/100, 850:1000/1000))
functional |
either a function for aggregating fluctuation processes
or a list with two functions names |
boundary |
a boundary function. |
computePval |
a function for computing p values. If neither
|
computeCritval |
a function for computing critical values. If neither
|
plotProcess |
a function for plotting the empirical process,
if set to |
lim.process |
a string specifying the limiting process. |
nobs |
integer specifying the number of observations of each Brownian motion simulated. |
nrep |
integer specifying the number of replications. |
nproc |
integer specifying for which number of processes
Brownian motions should be simulated. If set to |
h |
bandwidth parameter for increment processes. |
probs |
numeric vector specifying for which probabilities critical values should be tabulated. |
efpFunctional
computes an object of class "efpFunctional"
which then knows how to do inference based on empirical fluctuation processes
(currently only for gefp
objects and not yet for efp
objects) and how to visualize the corresponding processes.
efpFunctional
s for many frequently used test statistics are provided:
maxBB
for the double maximum statistic, meanL2BB
for the Cramer-von Mises
statistic, or rangeBB
for the range statistic. Furthermore, supLM
generates an object of class "efpFunctional"
for a certain trimming parameter,
see the examples. More details can be found in Zeileis (2006). Based on
Merkle, Fan, and Zeileis (2014), further efpFunctional
generators for
aggregating along (ordered) categorical variables have been added:
catL2BB
, ordL2BB
, ordwmax
.
For setting up an efpFunctional
, the functions
computeStatistic
, computePval
, and plotProcess
need to be
supplied. These should have the following interfaces:
computeStatistic
should take a single argument which is the process
itself, i.e., essentially a n x k matrix where n is the number of
observations and k the number of processes (regressors).
computePval
should take two arguments: a scalar test statistic and the
number of processes k.
plotProcess
should take two arguments: an object of class "gefp"
and alpha
the level of significance for any boundaries or critical
values to be visualized.
efpFunctional
returns a list of class "efpFunctional"
with components including:
plotProcess |
a function for plotting empirical fluctuation processes, |
computeStatistic |
a function for computing a test statistic from an empirical fluctuation process, |
computePval |
a function for computing the corresponding p value, |
computeCritval |
a function for computing critical values. |
Merkle E.C., Zeileis A. (2013), Tests of Measurement Invariance without Subgroups: A Generalization of Classical Methods. Psychometrika, 78(1), 59–82. doi:10.1007/S11336-012-9302-4
Merkle E.C., Fan J., Zeileis A. (2014), Testing for Measurement Invariance with Respect to an Ordinal Variable. Psychometrika, 79(4), 569–584. doi:10.1007/S11336-013-9376-7.
Zeileis A. (2005), A Unified Approach to Structural Change Tests Based on ML Scores, F Statistics, and OLS Residuals. Econometric Reviews, 24, 445–466. doi:10.1080/07474930500406053.
Zeileis A. (2006), Implementing a Class of Structural Change Tests: An Econometric Computing Approach. Computational Statistics & Data Analysis, 50, 2987–3008. doi:10.1016/j.csda.2005.07.001.
Zeileis A., Hornik K. (2007), Generalized M-Fluctuation Tests for Parameter Instability, Statistica Neerlandica, 61, 488–508. doi:10.1111/j.1467-9574.2007.00371.x.
gefp
, supLM
, catL2BB
, sctest.default
data("BostonHomicide") gcus <- gefp(homicides ~ 1, family = poisson, vcov = kernHAC, data = BostonHomicide) plot(gcus, functional = meanL2BB) gcus sctest(gcus, functional = meanL2BB) y <- rnorm(1000) x1 <- runif(1000) x2 <- runif(1000) ## supWald statistic computed by Fstats() fs <- Fstats(y ~ x1 + x2, from = 0.1) plot(fs) sctest(fs) ## compare with supLM statistic scus <- gefp(y ~ x1 + x2, fit = lm) plot(scus, functional = supLM(0.1)) sctest(scus, functional = supLM(0.1)) ## seatbelt data data("UKDriverDeaths") seatbelt <- log10(UKDriverDeaths) seatbelt <- cbind(seatbelt, lag(seatbelt, k = -1), lag(seatbelt, k = -12)) colnames(seatbelt) <- c("y", "ylag1", "ylag12") seatbelt <- window(seatbelt, start = c(1970, 1), end = c(1984,12)) scus.seat <- gefp(y ~ ylag1 + ylag12, data = seatbelt) ## double maximum test plot(scus.seat) ## range test plot(scus.seat, functional = rangeBB) ## Cramer-von Mises statistic (Nyblom-Hansen test) plot(scus.seat, functional = meanL2BB) ## supLM test plot(scus.seat, functional = supLM(0.1))
data("BostonHomicide") gcus <- gefp(homicides ~ 1, family = poisson, vcov = kernHAC, data = BostonHomicide) plot(gcus, functional = meanL2BB) gcus sctest(gcus, functional = meanL2BB) y <- rnorm(1000) x1 <- runif(1000) x2 <- runif(1000) ## supWald statistic computed by Fstats() fs <- Fstats(y ~ x1 + x2, from = 0.1) plot(fs) sctest(fs) ## compare with supLM statistic scus <- gefp(y ~ x1 + x2, fit = lm) plot(scus, functional = supLM(0.1)) sctest(scus, functional = supLM(0.1)) ## seatbelt data data("UKDriverDeaths") seatbelt <- log10(UKDriverDeaths) seatbelt <- cbind(seatbelt, lag(seatbelt, k = -1), lag(seatbelt, k = -12)) colnames(seatbelt) <- c("y", "ylag1", "ylag12") seatbelt <- window(seatbelt, start = c(1970, 1), end = c(1984,12)) scus.seat <- gefp(y ~ ylag1 + ylag12, data = seatbelt) ## double maximum test plot(scus.seat) ## range test plot(scus.seat, functional = rangeBB) ## Cramer-von Mises statistic (Nyblom-Hansen test) plot(scus.seat, functional = meanL2BB) ## supLM test plot(scus.seat, functional = supLM(0.1))
Computes a series of F statistics for a specified data window.
Fstats(formula, from = 0.15, to = NULL, data = list(), vcov. = NULL)
Fstats(formula, from = 0.15, to = NULL, data = list(), vcov. = NULL)
formula |
a symbolic description for the model to be tested |
from , to
|
numeric. If |
data |
an optional data frame containing the variables in the model. By
default the variables are taken from the environment which |
vcov. |
a function to extract the covariance matrix
for the coefficients of a fitted model of class |
For every potential change point in from:to
a F statistic (Chow
test statistic) is computed. For this an OLS model is fitted for the
observations before and after the potential change point, i.e. 2k
parameters have to be estimated, and the error sum of squares is computed (ESS).
Another OLS model for all observations with a restricted sum of squares (RSS) is
computed, hence k
parameters have to be estimated here. If n
is
the number of observations and k
the number of regressors in the model,
the formula is:
Note that this statistic has an asymptotic chi-squared distribution with k degrees of freedom and (under the assumption of normality) F/k has an exact F distribution with k and n - 2k degrees of freedom.
Fstats
returns an object of class "Fstats"
, which contains
mainly a time series of F statistics. The function plot
has a
method to plot the F statistics or the
corresponding p values; with sctest
a
supF-, aveF- or expF-test on structural change can be performed.
Andrews D.W.K. (1993), Tests for parameter instability and structural change with unknown change point, Econometrica, 61, 821-856.
Hansen B. (1992), Tests for parameter instability in regressions with I(1) processes, Journal of Business & Economic Statistics, 10, 321-335.
Hansen B. (1997), Approximate asymptotic p values for structural-change tests, Journal of Business & Economic Statistics, 15, 60-67.
plot.Fstats
, sctest.Fstats
,
boundary.Fstats
## Nile data with one breakpoint: the annual flows drop in 1898 ## because the first Ashwan dam was built data("Nile") plot(Nile) ## test the null hypothesis that the annual flow remains constant ## over the years fs.nile <- Fstats(Nile ~ 1) plot(fs.nile) sctest(fs.nile) ## visualize the breakpoint implied by the argmax of the F statistics plot(Nile) lines(breakpoints(fs.nile)) ## UK Seatbelt data: a SARIMA(1,0,0)(1,0,0)_12 model ## (fitted by OLS) is used and reveals (at least) two ## breakpoints - one in 1973 associated with the oil crisis and ## one in 1983 due to the introduction of compulsory ## wearing of seatbelts in the UK. data("UKDriverDeaths") seatbelt <- log10(UKDriverDeaths) seatbelt <- cbind(seatbelt, lag(seatbelt, k = -1), lag(seatbelt, k = -12)) colnames(seatbelt) <- c("y", "ylag1", "ylag12") seatbelt <- window(seatbelt, start = c(1970, 1), end = c(1984,12)) plot(seatbelt[,"y"], ylab = expression(log[10](casualties))) ## compute F statistics for potential breakpoints between ## 1971(6) (corresponds to from = 0.1) and 1983(6) (corresponds to ## to = 0.9 = 1 - from, the default) ## compute F statistics fs <- Fstats(y ~ ylag1 + ylag12, data = seatbelt, from = 0.1) ## this gives the same result fs <- Fstats(y ~ ylag1 + ylag12, data = seatbelt, from = c(1971, 6), to = c(1983, 6)) ## plot the F statistics plot(fs, alpha = 0.01) ## plot F statistics with aveF boundary plot(fs, aveF = TRUE) ## perform the expF test sctest(fs, type = "expF")
## Nile data with one breakpoint: the annual flows drop in 1898 ## because the first Ashwan dam was built data("Nile") plot(Nile) ## test the null hypothesis that the annual flow remains constant ## over the years fs.nile <- Fstats(Nile ~ 1) plot(fs.nile) sctest(fs.nile) ## visualize the breakpoint implied by the argmax of the F statistics plot(Nile) lines(breakpoints(fs.nile)) ## UK Seatbelt data: a SARIMA(1,0,0)(1,0,0)_12 model ## (fitted by OLS) is used and reveals (at least) two ## breakpoints - one in 1973 associated with the oil crisis and ## one in 1983 due to the introduction of compulsory ## wearing of seatbelts in the UK. data("UKDriverDeaths") seatbelt <- log10(UKDriverDeaths) seatbelt <- cbind(seatbelt, lag(seatbelt, k = -1), lag(seatbelt, k = -12)) colnames(seatbelt) <- c("y", "ylag1", "ylag12") seatbelt <- window(seatbelt, start = c(1970, 1), end = c(1984,12)) plot(seatbelt[,"y"], ylab = expression(log[10](casualties))) ## compute F statistics for potential breakpoints between ## 1971(6) (corresponds to from = 0.1) and 1983(6) (corresponds to ## to = 0.9 = 1 - from, the default) ## compute F statistics fs <- Fstats(y ~ ylag1 + ylag12, data = seatbelt, from = 0.1) ## this gives the same result fs <- Fstats(y ~ ylag1 + ylag12, data = seatbelt, from = c(1971, 6), to = c(1983, 6)) ## plot the F statistics plot(fs, alpha = 0.01) ## plot F statistics with aveF boundary plot(fs, aveF = TRUE) ## perform the expF test sctest(fs, type = "expF")
Computes an empirical M-fluctuation process from the scores of a fitted model.
gefp(..., fit = glm, scores = estfun, vcov = NULL, decorrelate = TRUE, sandwich = TRUE, order.by = NULL, fitArgs = NULL, parm = NULL, data = list())
gefp(..., fit = glm, scores = estfun, vcov = NULL, decorrelate = TRUE, sandwich = TRUE, order.by = NULL, fitArgs = NULL, parm = NULL, data = list())
... |
specification of some model which is passed together
with |
fit |
|
scores |
a function which extracts the scores or estimating
function from the fitted object: |
vcov |
a function to extract the covariance matrix
for the coefficients of the fitted model:
|
decorrelate |
logical. Should the process be decorrelated? |
sandwich |
logical. Is the function |
order.by |
Either a vector |
fitArgs |
List of additional arguments which could be passed to
the |
parm |
integer or character specifying the component of the estimating functions which should be used (by default all components are used). |
data |
an optional data frame containing the variables in the |
gefp
returns a list of class "gefp"
with components including:
process |
the fitted empirical fluctuation process of class
|
nreg |
the number of regressors, |
nobs |
the number of observations, |
fit |
the fit function used, |
scores |
the scores function used, |
fitted.model |
the fitted model. |
Zeileis A. (2005), A Unified Approach to Structural Change Tests Based on ML Scores, F Statistics, and OLS Residuals. Econometric Reviews, 24, 445–466. doi:10.1080/07474930500406053.
Zeileis A. (2006), Implementing a Class of Structural Change Tests: An Econometric Computing Approach. Computational Statistics & Data Analysis, 50, 2987–3008. doi:10.1016/j.csda.2005.07.001.
Zeileis A., Hornik K. (2007), Generalized M-Fluctuation Tests for Parameter Instability, Statistica Neerlandica, 61, 488–508. doi:10.1111/j.1467-9574.2007.00371.x.
Zeileis A., Shah A., Patnaik I. (2010), Testing, Monitoring, and Dating Structural Changes in Exchange Rate Regimes, Computational Statistics and Data Analysis, 54(6), 1696–1706. doi:10.1016/j.csda.2009.12.005.
data("BostonHomicide") gcus <- gefp(homicides ~ 1, family = poisson, vcov = kernHAC, data = BostonHomicide) plot(gcus, aggregate = FALSE) gcus sctest(gcus)
data("BostonHomicide") gcus <- gefp(homicides ~ 1, family = poisson, vcov = kernHAC, data = BostonHomicide) plot(gcus, aggregate = FALSE) gcus sctest(gcus)
German M1 money demand.
data("GermanM1")
data("GermanM1")
GermanM1
is a data frame containing 12 quarterly time series
from 1961(1) to 1995(4) and two further variables. historyM1
is the subset of GermanM1
up to 1990(2), i.e., the data before
the German monetary unification on 1990-06-01. monitorM1
is the complement of historyM1
, i.e., the data after
the unification. All three data frames contain the variables
time series. Logarithm of real M1 per capita,
time series. Logarithm of a price index,
time series. Logarithm of real per capita gross national product,
time series. Long-run interest rate,
time series. First differences of m
,
time series. First differences of lag 2 of y
,
time series. First differences of R
,
time series. First differences of lag 1 of R
,
time series. First differences of p
,
time series. Lag 1 of m
,
time series. Lag 1 of y
,
time series. Lag 1 of R
,
factor coding the seasonality,
vector containing the OLS residuals of the Lütkepohl et al. (1999) model fitted in the history period.
Lütkepohl et al. (1999) investigate the linearity and stability of German M1 money demand: they find a stable regression relation for the time before the monetary union on 1990-06-01 but a clear structural instability afterwards.
Zeileis et al. (2005) use a model with
ecm.res
instead of m1
, y1
and R1
, which
leads to equivalent results in the history period but slightly
different results in the monitoring period. The reason for the
replacement is that stationary regressors are needed for the
structural change tests. See references and the examples below for
more details.
The data is provided by the German central bank and is available online in the data archive of the Journal of Applied Econometrics http://qed.econ.queensu.ca/jae/1999-v14.5/lutkepohl-terasvirta-wolters/.
Lütkepohl H., Teräsvirta T., Wolters J. (1999), Investigating Stability and Linearity of a German M1 Money Demand Function, Journal of Applied Econometrics, 14, 511-525.
Zeileis A., Leisch F., Kleiber C., Hornik K. (2005), Monitoring Structural Change in Dynamic Econometric Models, Journal of Applied Econometrics, 20, 99–121.
data("GermanM1") ## Lütkepohl et al. (1999) use the following model LTW.model <- dm ~ dy2 + dR + dR1 + dp + m1 + y1 + R1 + season ## Zeileis et al. (2005) use M1.model <- dm ~ dy2 + dR + dR1 + dp + ecm.res + season ## historical tests ols <- efp(LTW.model, data = GermanM1, type = "OLS-CUSUM") plot(ols) re <- efp(LTW.model, data = GermanM1, type = "fluctuation") plot(re) fs <- Fstats(LTW.model, data = GermanM1, from = 0.1) plot(fs) ## monitoring M1 <- historyM1 ols.efp <- efp(M1.model, type = "OLS-CUSUM", data = M1) newborder <- function(k) 1.5778*k/118 ols.mefp <- mefp(ols.efp, period = 2) ols.mefp2 <- mefp(ols.efp, border = newborder) M1 <- GermanM1 ols.mon <- monitor(ols.mefp) ols.mon2 <- monitor(ols.mefp2) plot(ols.mon) lines(boundary(ols.mon2), col = 2) ## dating bp <- breakpoints(LTW.model, data = GermanM1) summary(bp) plot(bp) plot(fs) lines(confint(bp))
data("GermanM1") ## Lütkepohl et al. (1999) use the following model LTW.model <- dm ~ dy2 + dR + dR1 + dp + m1 + y1 + R1 + season ## Zeileis et al. (2005) use M1.model <- dm ~ dy2 + dR + dR1 + dp + ecm.res + season ## historical tests ols <- efp(LTW.model, data = GermanM1, type = "OLS-CUSUM") plot(ols) re <- efp(LTW.model, data = GermanM1, type = "fluctuation") plot(re) fs <- Fstats(LTW.model, data = GermanM1, from = 0.1) plot(fs) ## monitoring M1 <- historyM1 ols.efp <- efp(M1.model, type = "OLS-CUSUM", data = M1) newborder <- function(k) 1.5778*k/118 ols.mefp <- mefp(ols.efp, period = 2) ols.mefp2 <- mefp(ols.efp, border = newborder) M1 <- GermanM1 ols.mon <- monitor(ols.mefp) ols.mon2 <- monitor(ols.mefp2) plot(ols.mon) lines(boundary(ols.mon2), col = 2) ## dating bp <- breakpoints(LTW.model, data = GermanM1) summary(bp) plot(bp) plot(fs) lines(confint(bp))
Data about the number of marriages, illegitimate and legitimate births, and deaths in the Austrian Alpine village Grossarl during the 18th and 19th century.
data("Grossarl")
data("Grossarl")
Grossarl
is a data frame containing 6 annual time series
(1700 - 1899), 3 factors coding policy interventions and 1 vector
with the year (plain numeric).
time series. Number of marriages,
time series. Number of illegitimate births,
time series. Number of legitimate births,
time series. Number of deaths,
time series. Fraction of illegitimate births,
time series. Number of marriages in the previous year,
ordered factor coding 4 different political regimes,
ordered factor coding 5 different moral regulations,
ordered factor coding 5 different marriage restrictions,
numeric. Year of observation.
The data frame contains historical demographic data from Grossarl, a village in the Alpine region of Salzburg, Austria, during the 18th and 19th century. During this period, the total population of Grossarl did not vary much on the whole, with the very exception of the period of the protestant emigrations in 1731/32.
Especially during the archbishopric, moral interventions aimed at lowering the proportion of illegitimate baptisms. For details see the references.
Parish registers provide the basic demographic series of baptisms and burials (which is almost equivalent to births and deaths in the study area) and marriages. For more information see Veichtlbauer et al. (2006).
Veichtlbauer O., Zeileis A., Leisch F. (2006), The Impact Of Policy Interventions on a Pre-Industrial Population System in the Austrian Alps, forthcoming.
Zeileis A., Veichtlbauer O. (2002), Policy Interventions Affecting Illegitimacy in Preindustrial Austria: A Structural Change Analysis, In R. Dutter (ed.), Festschrift 50 Jahre Österreichische Statistische Gesellschaft, 133-146, Österreichische Statistische Gesellschaft.
data("Grossarl") ## time series of births, deaths, marriages ########################################### with(Grossarl, plot(cbind(deaths, illegitimate + legitimate, marriages), plot.type = "single", col = grey(c(0.7, 0, 0)), lty = c(1, 1, 3), lwd = 1.5, ylab = "annual Grossarl series")) legend("topright", c("deaths", "births", "marriages"), col = grey(c(0.7, 0, 0)), lty = c(1, 1, 3), bty = "n") ## illegitimate births ###################### ## lm + MOSUM plot(Grossarl$fraction) fm.min <- lm(fraction ~ politics, data = Grossarl) fm.ext <- lm(fraction ~ politics + morals + nuptiality + marriages, data = Grossarl) lines(ts(fitted(fm.min), start = 1700), col = 2) lines(ts(fitted(fm.ext), start = 1700), col = 4) mos.min <- efp(fraction ~ politics, data = Grossarl, type = "OLS-MOSUM") mos.ext <- efp(fraction ~ politics + morals + nuptiality + marriages, data = Grossarl, type = "OLS-MOSUM") plot(mos.min) lines(mos.ext, lty = 2) ## dating bp <- breakpoints(fraction ~ 1, data = Grossarl, h = 0.1) summary(bp) ## RSS, BIC, AIC plot(bp) plot(0:8, AIC(bp), type = "b") ## probably use 5 or 6 breakpoints and compare with ## coding of the factors as used by us ## ## politics 1803 1816 1850 ## morals 1736 1753 1771 1803 ## nuptiality 1803 1810 1816 1883 ## ## m = 5 1753 1785 1821 1856 1878 ## m = 6 1734 1754 1785 1821 1856 1878 ## 6 2 5 1 4 3 ## fitted models coef(bp, breaks = 6) plot(Grossarl$fraction) lines(fitted(bp, breaks = 6), col = 2) lines(ts(fitted(fm.ext), start = 1700), col = 4) ## marriages ############ ## lm + MOSUM plot(Grossarl$marriages) fm.min <- lm(marriages ~ politics, data = Grossarl) fm.ext <- lm(marriages ~ politics + morals + nuptiality, data = Grossarl) lines(ts(fitted(fm.min), start = 1700), col = 2) lines(ts(fitted(fm.ext), start = 1700), col = 4) mos.min <- efp(marriages ~ politics, data = Grossarl, type = "OLS-MOSUM") mos.ext <- efp(marriages ~ politics + morals + nuptiality, data = Grossarl, type = "OLS-MOSUM") plot(mos.min) lines(mos.ext, lty = 2) ## dating bp <- breakpoints(marriages ~ 1, data = Grossarl, h = 0.1) summary(bp) ## RSS, BIC, AIC plot(bp) plot(0:8, AIC(bp), type = "b") ## probably use 3 or 4 breakpoints and compare with ## coding of the factors as used by us ## ## politics 1803 1816 1850 ## morals 1736 1753 1771 1803 ## nuptiality 1803 1810 1816 1883 ## ## m = 3 1738 1813 1875 ## m = 4 1738 1794 1814 1875 ## 2 4 1 3 ## fitted models coef(bp, breaks = 4) plot(Grossarl$marriages) lines(fitted(bp, breaks = 4), col = 2) lines(ts(fitted(fm.ext), start = 1700), col = 4)
data("Grossarl") ## time series of births, deaths, marriages ########################################### with(Grossarl, plot(cbind(deaths, illegitimate + legitimate, marriages), plot.type = "single", col = grey(c(0.7, 0, 0)), lty = c(1, 1, 3), lwd = 1.5, ylab = "annual Grossarl series")) legend("topright", c("deaths", "births", "marriages"), col = grey(c(0.7, 0, 0)), lty = c(1, 1, 3), bty = "n") ## illegitimate births ###################### ## lm + MOSUM plot(Grossarl$fraction) fm.min <- lm(fraction ~ politics, data = Grossarl) fm.ext <- lm(fraction ~ politics + morals + nuptiality + marriages, data = Grossarl) lines(ts(fitted(fm.min), start = 1700), col = 2) lines(ts(fitted(fm.ext), start = 1700), col = 4) mos.min <- efp(fraction ~ politics, data = Grossarl, type = "OLS-MOSUM") mos.ext <- efp(fraction ~ politics + morals + nuptiality + marriages, data = Grossarl, type = "OLS-MOSUM") plot(mos.min) lines(mos.ext, lty = 2) ## dating bp <- breakpoints(fraction ~ 1, data = Grossarl, h = 0.1) summary(bp) ## RSS, BIC, AIC plot(bp) plot(0:8, AIC(bp), type = "b") ## probably use 5 or 6 breakpoints and compare with ## coding of the factors as used by us ## ## politics 1803 1816 1850 ## morals 1736 1753 1771 1803 ## nuptiality 1803 1810 1816 1883 ## ## m = 5 1753 1785 1821 1856 1878 ## m = 6 1734 1754 1785 1821 1856 1878 ## 6 2 5 1 4 3 ## fitted models coef(bp, breaks = 6) plot(Grossarl$fraction) lines(fitted(bp, breaks = 6), col = 2) lines(ts(fitted(fm.ext), start = 1700), col = 4) ## marriages ############ ## lm + MOSUM plot(Grossarl$marriages) fm.min <- lm(marriages ~ politics, data = Grossarl) fm.ext <- lm(marriages ~ politics + morals + nuptiality, data = Grossarl) lines(ts(fitted(fm.min), start = 1700), col = 2) lines(ts(fitted(fm.ext), start = 1700), col = 4) mos.min <- efp(marriages ~ politics, data = Grossarl, type = "OLS-MOSUM") mos.ext <- efp(marriages ~ politics + morals + nuptiality, data = Grossarl, type = "OLS-MOSUM") plot(mos.min) lines(mos.ext, lty = 2) ## dating bp <- breakpoints(marriages ~ 1, data = Grossarl, h = 0.1) summary(bp) ## RSS, BIC, AIC plot(bp) plot(0:8, AIC(bp), type = "b") ## probably use 3 or 4 breakpoints and compare with ## coding of the factors as used by us ## ## politics 1803 1816 1850 ## morals 1736 1753 1771 1803 ## nuptiality 1803 1810 1816 1883 ## ## m = 3 1738 1813 1875 ## m = 4 1738 1794 1814 1875 ## 2 4 1 3 ## fitted models coef(bp, breaks = 4) plot(Grossarl$marriages) lines(fitted(bp, breaks = 4), col = 2) lines(ts(fitted(fm.ext), start = 1700), col = 4)
Computation of log likelihood and AIC type information criteria for partitions given by breakpoints.
## S3 method for class 'breakpointsfull' logLik(object, breaks = NULL, ...) ## S3 method for class 'breakpointsfull' AIC(object, breaks = NULL, ..., k = 2) ## S3 method for class 'breakpointsfull' LWZ(object, ...) ## S3 method for class 'breakpoints' LWZ(object, ...)
## S3 method for class 'breakpointsfull' logLik(object, breaks = NULL, ...) ## S3 method for class 'breakpointsfull' AIC(object, breaks = NULL, ..., k = 2) ## S3 method for class 'breakpointsfull' LWZ(object, ...) ## S3 method for class 'breakpoints' LWZ(object, ...)
object |
an object of class |
breaks |
if |
... |
for |
k |
the penalty parameter to be used, the default |
As for linear models the log likelihood is computed on a normal model and the degrees of freedom are the number of regression coefficients multiplied by the number of segments plus the number of estimated breakpoints plus 1 for the error variance.
If AIC
or LWZ
is applied to an object of class "breakpointsfull"
breaks
can be a vector of integers and the AIC or LWZ for each corresponding
partition will be returned. By default the maximal number of breaks stored
in the object
is used. See below for an example.
An object of class "logLik"
or a simple vector containing
the AIC respectively.
Liu, J., Wu, S., & Zidek, J. V. (1997). On segmented multivariate regression. Statistica Sinica, 497-525.
## Nile data with one breakpoint: the annual flows drop in 1898 ## because the first Ashwan dam was built data("Nile") plot(Nile) bp.nile <- breakpoints(Nile ~ 1) summary(bp.nile) plot(bp.nile) ## BIC of partitions with 0 to 5 breakpoints plot(0:5, AIC(bp.nile, k = log(bp.nile$nobs)), type = "b") ## AIC plot(0:5, AIC(bp.nile), type = "b") ## LWZ plot(0:5, LWZ(bp.nile), type = "b") ## BIC, AIC, LWZ, log likelihood of a single partition bp.nile1 <- breakpoints(bp.nile, breaks = 1) AIC(bp.nile1, k = log(bp.nile1$nobs)) AIC(bp.nile1) LWZ(bp.nile1) logLik(bp.nile1)
## Nile data with one breakpoint: the annual flows drop in 1898 ## because the first Ashwan dam was built data("Nile") plot(Nile) bp.nile <- breakpoints(Nile ~ 1) summary(bp.nile) plot(bp.nile) ## BIC of partitions with 0 to 5 breakpoints plot(0:5, AIC(bp.nile, k = log(bp.nile$nobs)), type = "b") ## AIC plot(0:5, AIC(bp.nile), type = "b") ## LWZ plot(0:5, LWZ(bp.nile), type = "b") ## BIC, AIC, LWZ, log likelihood of a single partition bp.nile1 <- breakpoints(bp.nile, breaks = 1) AIC(bp.nile1, k = log(bp.nile1$nobs)) AIC(bp.nile1) LWZ(bp.nile1) logLik(bp.nile1)
Computes magnitude statistics for breakpoints.
## S3 method for class 'breakpointsfull' magnitude(object, interval = 0.1, breaks = NULL, component = "trend", ...)
## S3 method for class 'breakpointsfull' magnitude(object, interval = 0.1, breaks = NULL, component = "trend", ...)
object |
an object of class |
interval |
the (one-side) interval over which to compute the magnitude for the columns
|
breaks |
how many breaks to use or which statistic to use for estimating it,
see |
component |
which covariate(s) used for fitting the |
... |
ignored |
The breakpoint magnitude is estimated using several statistics: the difference (diff) in the fitted value immediately before and after the break, and the root mean squared deviation (RMSD), mean absolute deviation (MAD) and mean deviation (MD) between the fitted values of the model preceding the break extrapolated over interval samples after the break, and vice versa. RMSD and MAD should be more robust estimators of magnitude compared to diff.
A list with the following elements, compatible with the magnitude format as output by bfast
:
a matrix containing the magnitude estimates (in columns) for each breakpoint (in rows),
sample number at which the largest break was detected, twice
fitted values before and after the largest break
the maximum diff magnitude of the largest break
time of the largest break (same as m.x
, but single value)
data(Nile) trend <- 1:length(Nile) bp.nile <- breakpoints(Nile ~ trend) # The trend component is default, set "component" to the # name of your coviariate(s), if it is different. magnitude(bp.nile)
data(Nile) trend <- 1:length(Nile) bp.nile <- breakpoints(Nile ~ trend) # The trend component is default, set "component" to the # name of your coviariate(s), if it is different. magnitude(bp.nile)
Online monitoring of structural breaks in a linear regression model. A sequential fluctuation test based on parameter estimates or OLS residuals signals structural breaks.
## S3 method for class 'formula' mefp(formula, type = c("OLS-CUSUM", "OLS-MOSUM", "RE", "ME", "fluctuation"), data, h = 1, alpha = 0.05, functional = c("max", "range"), period = 10, tolerance = .Machine$double.eps^0.5, CritvalTable = NULL, rescale = NULL, border = NULL, ...) ## S3 method for class 'matrix' mefp(obj, y, type = c("OLS-CUSUM", "OLS-MOSUM", "RE", "ME", "fluctuation"), h = 1, alpha = 0.05, functional = c("max", "range"), period = 10, tolerance = .Machine$double.eps^0.5, CritvalTable = NULL, rescale = NULL, border = NULL, ...) ## S3 method for class 'efp' mefp(obj, alpha=0.05, functional = c("max", "range"), period = 10, tolerance = .Machine$double.eps^0.5, CritvalTable = NULL, rescale = NULL, border = NULL, ...) monitor(obj, data = NULL, verbose = TRUE) monitor.matrix(obj, X, y, verbose = TRUE)
## S3 method for class 'formula' mefp(formula, type = c("OLS-CUSUM", "OLS-MOSUM", "RE", "ME", "fluctuation"), data, h = 1, alpha = 0.05, functional = c("max", "range"), period = 10, tolerance = .Machine$double.eps^0.5, CritvalTable = NULL, rescale = NULL, border = NULL, ...) ## S3 method for class 'matrix' mefp(obj, y, type = c("OLS-CUSUM", "OLS-MOSUM", "RE", "ME", "fluctuation"), h = 1, alpha = 0.05, functional = c("max", "range"), period = 10, tolerance = .Machine$double.eps^0.5, CritvalTable = NULL, rescale = NULL, border = NULL, ...) ## S3 method for class 'efp' mefp(obj, alpha=0.05, functional = c("max", "range"), period = 10, tolerance = .Machine$double.eps^0.5, CritvalTable = NULL, rescale = NULL, border = NULL, ...) monitor(obj, data = NULL, verbose = TRUE) monitor.matrix(obj, X, y, verbose = TRUE)
formula |
specification of the linear regression model by a |
obj |
object of class |
X |
regression matrix X. |
y |
response vector. |
data |
an optional data frame containing the variables in the model. By
default the variables are taken from the environment which |
type |
specifies which type of fluctuation process will be computed. |
h |
(only used for MOSUM/ME processes). A numeric scalar from interval (0,1) specifying the size of the data window relative to the sample size. |
alpha |
Significance level of the test, i.e., probability of type I error. |
functional |
Determines if maximum or range of parameter differences is used as statistic. |
period |
(only used for MOSUM/ME processes). Maximum time (relative to the history period) that will be monitored. Default is 10 times the history period. |
tolerance |
Tolerance for numeric |
CritvalTable |
Table of critical values, this table is
interpolated to get critical values
for arbitrary |
rescale |
If |
border |
An optional user-specified border function for the empirical process. This argument is under development. |
verbose |
If |
... |
Currently not used. |
mefp
creates an object of class "mefp"
either
from a model formula or from an object of class "efp"
. In
addition to the arguments of efp
, the type of statistic
and a significance level for the monitoring must be specified. The
monitoring itself is performed by monitor
, which can be
called arbitrarily often on objects of class "mefp"
. If new
data have arrived, then the empirical fluctuation process is computed
for the new data. If the process crosses the boundaries corresponding
to the significance level alpha
, a structural break is detected
(and signaled).
The typical usage is to initialize the monitoring by creation of an
object of class "mefp"
either using a formula or an
"efp"
object. Data available at this stage are considered the
history sample, which is kept fixed during the complete
monitoring process, and may not contain any structural changes.
Subsequent calls to monitor
perform a sequential test of the
null hypothesis of no structural change in new data against the
general alternative of changes in one or more of the coefficients of
the regression model.
The recursive
estimates test is also called fluctuation test, therefore setting type
to "fluctuation"
was used to specify it in earlier versions of
strucchange. It still can be used now, but will be forced to "RE"
An object of class mefp
that includes components such as:
breakpoint |
index of the detected breakpoint, |
process |
a matrix denoting the coefficients of the empirical fluctuation process, |
computeEmpProc |
a function to compute the empirical process
based on |
last |
number of observations, |
type |
a string with the |
alpha |
the significance level. |
Leisch F., Hornik K., Kuan C.-M. (2000), Monitoring Structural Changes with the Generalized Fluctuation Test, Econometric Theory, 16, 835–854.
Zeileis A., Leisch F., Kleiber C., Hornik K. (2005), Monitoring Structural Change in Dynamic Econometric Models, Journal of Applied Econometrics, 20, 99–121. doi:10.1002/jae.776.
Zeileis A. (2005), A Unified Approach to Structural Change Tests Based on ML Scores, F Statistics, and OLS Residuals. Econometric Reviews, 24, 445–466. doi:10.1080/07474930500406053.
Zeileis A., Shah A., Patnaik I. (2010), Testing, Monitoring, and Dating Structural Changes in Exchange Rate Regimes, Computational Statistics and Data Analysis, 54(6), 1696–1706. doi:10.1016/j.csda.2009.12.005.
df1 <- data.frame(y=rnorm(300)) df1[150:300,"y"] <- df1[150:300,"y"]+1 ## use the first 50 observations as history period e1 <- efp(y~1, data=df1[1:50,,drop=FALSE], type="ME", h=1) me1 <- mefp(e1, alpha=0.05) ## the same in one function call me1 <- mefp(y~1, data=df1[1:50,,drop=FALSE], type="ME", h=1, alpha=0.05) ## monitor the 50 next observations me2 <- monitor(me1, data=df1[1:100,,drop=FALSE]) plot(me2) # and now monitor on all data me3 <- monitor(me2, data=df1) plot(me3) ## Load dataset "USIncExp" with income and expenditure in the US ## and choose a suitable subset for the history period data("USIncExp") USIncExp3 <- window(USIncExp, start=c(1969,1), end=c(1971,12)) ## initialize the monitoring with the formula interface me.mefp <- mefp(expenditure~income, type="ME", rescale=TRUE, data=USIncExp3, alpha=0.05) ## monitor the new observations for the year 1972 USIncExp3 <- window(USIncExp, start=c(1969,1), end=c(1972,12)) me.mefp <- monitor(me.mefp) ## monitor the new data for the years 1973-1976 USIncExp3 <- window(USIncExp, start=c(1969,1), end=c(1976,12)) me.mefp <- monitor(me.mefp) plot(me.mefp, functional = NULL)
df1 <- data.frame(y=rnorm(300)) df1[150:300,"y"] <- df1[150:300,"y"]+1 ## use the first 50 observations as history period e1 <- efp(y~1, data=df1[1:50,,drop=FALSE], type="ME", h=1) me1 <- mefp(e1, alpha=0.05) ## the same in one function call me1 <- mefp(y~1, data=df1[1:50,,drop=FALSE], type="ME", h=1, alpha=0.05) ## monitor the 50 next observations me2 <- monitor(me1, data=df1[1:100,,drop=FALSE]) plot(me2) # and now monitor on all data me3 <- monitor(me2, data=df1) plot(me3) ## Load dataset "USIncExp" with income and expenditure in the US ## and choose a suitable subset for the history period data("USIncExp") USIncExp3 <- window(USIncExp, start=c(1969,1), end=c(1971,12)) ## initialize the monitoring with the formula interface me.mefp <- mefp(expenditure~income, type="ME", rescale=TRUE, data=USIncExp3, alpha=0.05) ## monitor the new observations for the year 1972 USIncExp3 <- window(USIncExp, start=c(1969,1), end=c(1972,12)) me.mefp <- monitor(me.mefp) ## monitor the new data for the years 1973-1976 USIncExp3 <- window(USIncExp, start=c(1969,1), end=c(1976,12)) me.mefp <- monitor(me.mefp) plot(me.mefp, functional = NULL)
Macroeconomic time series from the United Kingdom with variables for estimating the Phillips curve equation.
data("PhillipsCurve")
data("PhillipsCurve")
A multivariate annual time series from 1857 to 1987 with the columns
Logarithm of the consumer price index,
Logarithm of nominal wages,
Unemployment rate,
First differences of p
,
First differences of w
,
First differences of u
Lag 1 of u
,
Lag 1 of dp
.
The data is available online in the data archive of the Journal of Applied Econometrics http://qed.econ.queensu.ca/jae/2003-v18.1/bai-perron/.
Alogoskoufis G.S., Smith R. (1991), The Phillips Curve, the Persistence of Inflation, and the Lucas Critique: Evidence from Exchange Rate Regimes, American Economic Review, 81, 1254-1275.
Bai J., Perron P. (2003), Computation and Analysis of Multiple Structural Change Models, Journal of Applied Econometrics, 18, 1-22.
## load and plot data data("PhillipsCurve") uk <- window(PhillipsCurve, start = 1948) plot(uk[, "dp"]) ## AR(1) inflation model ## estimate breakpoints bp.inf <- breakpoints(dp ~ dp1, data = uk, h = 8) plot(bp.inf) summary(bp.inf) ## fit segmented model with three breaks fac.inf <- breakfactor(bp.inf, breaks = 2, label = "seg") fm.inf <- lm(dp ~ 0 + fac.inf/dp1, data = uk) summary(fm.inf) ## Results from Table 2 in Bai & Perron (2003): ## coefficient estimates coef(bp.inf, breaks = 2) ## corresponding standard errors sqrt(sapply(vcov(bp.inf, breaks = 2), diag)) ## breakpoints and confidence intervals confint(bp.inf, breaks = 2) ## Phillips curve equation ## estimate breakpoints bp.pc <- breakpoints(dw ~ dp1 + du + u1, data = uk, h = 5, breaks = 5) ## look at RSS and BIC plot(bp.pc) summary(bp.pc) ## fit segmented model with three breaks fac.pc <- breakfactor(bp.pc, breaks = 2, label = "seg") fm.pc <- lm(dw ~ 0 + fac.pc/dp1 + du + u1, data = uk) summary(fm.pc) ## Results from Table 3 in Bai & Perron (2003): ## coefficient estimates coef(fm.pc) ## corresponding standard errors sqrt(diag(vcov(fm.pc))) ## breakpoints and confidence intervals confint(bp.pc, breaks = 2, het.err = FALSE)
## load and plot data data("PhillipsCurve") uk <- window(PhillipsCurve, start = 1948) plot(uk[, "dp"]) ## AR(1) inflation model ## estimate breakpoints bp.inf <- breakpoints(dp ~ dp1, data = uk, h = 8) plot(bp.inf) summary(bp.inf) ## fit segmented model with three breaks fac.inf <- breakfactor(bp.inf, breaks = 2, label = "seg") fm.inf <- lm(dp ~ 0 + fac.inf/dp1, data = uk) summary(fm.inf) ## Results from Table 2 in Bai & Perron (2003): ## coefficient estimates coef(bp.inf, breaks = 2) ## corresponding standard errors sqrt(sapply(vcov(bp.inf, breaks = 2), diag)) ## breakpoints and confidence intervals confint(bp.inf, breaks = 2) ## Phillips curve equation ## estimate breakpoints bp.pc <- breakpoints(dw ~ dp1 + du + u1, data = uk, h = 5, breaks = 5) ## look at RSS and BIC plot(bp.pc) summary(bp.pc) ## fit segmented model with three breaks fac.pc <- breakfactor(bp.pc, breaks = 2, label = "seg") fm.pc <- lm(dw ~ 0 + fac.pc/dp1 + du + u1, data = uk) summary(fm.pc) ## Results from Table 3 in Bai & Perron (2003): ## coefficient estimates coef(fm.pc) ## corresponding standard errors sqrt(diag(vcov(fm.pc))) ## breakpoints and confidence intervals confint(bp.pc, breaks = 2, het.err = FALSE)
Plot and lines method for objects of class "efp"
## S3 method for class 'efp' plot(x, alpha = 0.05, alt.boundary = FALSE, boundary = TRUE, functional = "max", main = NULL, ylim = NULL, ylab = "Empirical fluctuation process", ...) ## S3 method for class 'efp' lines(x, functional = "max", ...)
## S3 method for class 'efp' plot(x, alpha = 0.05, alt.boundary = FALSE, boundary = TRUE, functional = "max", main = NULL, ylim = NULL, ylab = "Empirical fluctuation process", ...) ## S3 method for class 'efp' lines(x, functional = "max", ...)
x |
an object of class |
alpha |
numeric from interval (0,1) indicating the confidence level for which the boundary of the corresponding test will be computed. |
alt.boundary |
logical. If set to |
boundary |
logical. If set to |
functional |
indicates which functional should be applied to the
process before plotting and which boundaries should be used. If set to |
main , ylim , ylab , ...
|
high-level |
Plots are available for the "max"
functional for all process types.
For Brownian bridge type processes the maximum or mean squared Euclidean norm
("maxL2"
and "meanL2"
) can be used for aggregating before plotting.
No plots are available for the "range"
functional.
Alternative boundaries that are proportional to the standard deviation of the corresponding limiting process are available for processes with Brownian motion or Brownian bridge limiting processes.
efp
returns an object of class "efp"
which inherits
from the class "ts"
or "mts"
respectively. The function
plot
has a method to plot the
empirical fluctuation process; with sctest
the corresponding test for
structural change can be performed.
Brown R.L., Durbin J., Evans J.M. (1975), Techniques for testing constancy of regression relationships over time, Journal of the Royal Statistical Society, B, 37, 149-163.
Chu C.-S., Hornik K., Kuan C.-M. (1995), MOSUM tests for parameter constancy, Biometrika, 82, 603-617.
Chu C.-S., Hornik K., Kuan C.-M. (1995), The moving-estimates test for parameter stability, Econometric Theory, 11, 669-720.
Krämer W., Ploberger W., Alt R. (1988), Testing for structural change in dynamic models, Econometrica, 56, 1355-1369.
Kuan C.-M., Hornik K. (1995), The generalized fluctuation test: A unifying view, Econometric Reviews, 14, 135 - 161.
Kuan C.-M., Chen (1994), Implementing the fluctuation and moving estimates tests in dynamic econometric models, Economics Letters, 44, 235-239.
Ploberger W., Krämer W. (1992), The CUSUM test with OLS residuals, Econometrica, 60, 271-285.
Zeileis A., Leisch F., Hornik K., Kleiber C. (2002), strucchange
:
An R Package for Testing for Structural Change in Linear Regression Models,
Journal of Statistical Software, 7(2), 1-38.
doi:10.18637/jss.v007.i02.
Zeileis A. (2004), Alternative Boundaries for CUSUM Tests, Statistical Papers, 45, 123–131.
## Load dataset "nhtemp" with average yearly temperatures in New Haven data("nhtemp") ## plot the data plot(nhtemp) ## test the model null hypothesis that the average temperature remains ## constant over the years ## compute Rec-CUSUM fluctuation process temp.cus <- efp(nhtemp ~ 1) ## plot the process plot(temp.cus, alpha = 0.01) ## and calculate the test statistic sctest(temp.cus) ## compute (recursive estimates) fluctuation process ## with an additional linear trend regressor lin.trend <- 1:60 temp.me <- efp(nhtemp ~ lin.trend, type = "fluctuation") ## plot the bivariate process plot(temp.me, functional = NULL) ## and perform the corresponding test sctest(temp.me)
## Load dataset "nhtemp" with average yearly temperatures in New Haven data("nhtemp") ## plot the data plot(nhtemp) ## test the model null hypothesis that the average temperature remains ## constant over the years ## compute Rec-CUSUM fluctuation process temp.cus <- efp(nhtemp ~ 1) ## plot the process plot(temp.cus, alpha = 0.01) ## and calculate the test statistic sctest(temp.cus) ## compute (recursive estimates) fluctuation process ## with an additional linear trend regressor lin.trend <- 1:60 temp.me <- efp(nhtemp ~ lin.trend, type = "fluctuation") ## plot the bivariate process plot(temp.me, functional = NULL) ## and perform the corresponding test sctest(temp.me)
Plotting method for objects of class "Fstats"
## S3 method for class 'Fstats' plot(x, pval = FALSE, asymptotic = FALSE, alpha = 0.05, boundary = TRUE, aveF = FALSE, xlab = "Time", ylab = NULL, ylim = NULL, ...)
## S3 method for class 'Fstats' plot(x, pval = FALSE, asymptotic = FALSE, alpha = 0.05, boundary = TRUE, aveF = FALSE, xlab = "Time", ylab = NULL, ylim = NULL, ...)
x |
an object of class |
pval |
logical. If set to |
asymptotic |
logical. If set to |
alpha |
numeric from interval (0,1) indicating the confidence level for which the boundary of the supF test will be computed. |
boundary |
logical. If set to |
aveF |
logical. If set to |
xlab , ylab , ylim , ...
|
high-level |
No return value, called for side effects.
Andrews D.W.K. (1993), Tests for parameter instability and structural change with unknown change point, Econometrica, 61, 821-856.
Hansen B. (1992), Tests for parameter instability in regressions with I(1) processes, Journal of Business & Economic Statistics, 10, 321-335.
Hansen B. (1997), Approximate asymptotic p values for structural-change tests, Journal of Business & Economic Statistics, 15, 60-67.
Fstats
, boundary.Fstats
,
sctest.Fstats
## Load dataset "nhtemp" with average yearly temperatures in New Haven data("nhtemp") ## plot the data plot(nhtemp) ## test the model null hypothesis that the average temperature remains ## constant over the years for potential break points between 1941 ## (corresponds to from = 0.5) and 1962 (corresponds to to = 0.85) ## compute F statistics fs <- Fstats(nhtemp ~ 1, from = 0.5, to = 0.85) ## plot the F statistics plot(fs, alpha = 0.01) ## and the corresponding p values plot(fs, pval = TRUE, alpha = 0.01) ## perform the aveF test sctest(fs, type = "aveF")
## Load dataset "nhtemp" with average yearly temperatures in New Haven data("nhtemp") ## plot the data plot(nhtemp) ## test the model null hypothesis that the average temperature remains ## constant over the years for potential break points between 1941 ## (corresponds to from = 0.5) and 1962 (corresponds to to = 0.85) ## compute F statistics fs <- Fstats(nhtemp ~ 1, from = 0.5, to = 0.85) ## plot the F statistics plot(fs, alpha = 0.01) ## and the corresponding p values plot(fs, pval = TRUE, alpha = 0.01) ## perform the aveF test sctest(fs, type = "aveF")
This is a method of the generic plot
function for
for "mefp"
objects as returned by mefp
or
monitor
. It plots the empirical fluctuation process (or a
functional thereof) as a time series plot, and includes boundaries
corresponding to the significance level of the monitoring procedure.
## S3 method for class 'mefp' plot(x, boundary = TRUE, functional = "max", main = NULL, ylab = "Empirical fluctuation process", ylim = NULL, ...)
## S3 method for class 'mefp' plot(x, boundary = TRUE, functional = "max", main = NULL, ylab = "Empirical fluctuation process", ylim = NULL, ...)
x |
an object of class |
boundary |
if |
functional |
indicates which functional should be applied to a
multivariate empirical process. If set to |
main , ylab , ylim , ...
|
high-level |
No return value, called for side effects.
df1 <- data.frame(y=rnorm(300)) df1[150:300,"y"] <- df1[150:300,"y"]+1 me1 <- mefp(y~1, data=df1[1:50,,drop=FALSE], type="ME", h=1, alpha=0.05) me2 <- monitor(me1, data=df1) plot(me2)
df1 <- data.frame(y=rnorm(300)) df1[150:300,"y"] <- df1[150:300,"y"]+1 me1 <- mefp(y~1, data=df1[1:50,,drop=FALSE], type="ME", h=1, alpha=0.05) me2 <- monitor(me1, data=df1) plot(me2)
US ex-post real interest rate: the three-month treasury bill deflated by the CPI inflation rate.
data("RealInt")
data("RealInt")
A quarterly time series from 1961(1) to 1986(3).
The data is available online in the data archive of the Journal of Applied Econometrics http://qed.econ.queensu.ca/jae/2003-v18.1/bai-perron/.
Bai J., Perron P. (2003), Computation and Analysis of Multiple Structural Change Models, Journal of Applied Econometrics, 18, 1-22.
Zeileis A., Kleiber C. (2005), Validating Multiple Structural Change Models - A Case Study. Journal of Applied Econometrics, 20, 685-690.
## load and plot data data("RealInt") plot(RealInt) ## estimate breakpoints bp.ri <- breakpoints(RealInt ~ 1, h = 15) plot(bp.ri) summary(bp.ri) ## fit segmented model with three breaks fac.ri <- breakfactor(bp.ri, breaks = 3, label = "seg") fm.ri <- lm(RealInt ~ 0 + fac.ri) summary(fm.ri) ## setup kernel HAC estimator vcov.ri <- function(x, ...) kernHAC(x, kernel = "Quadratic Spectral", prewhite = 1, approx = "AR(1)", ...) ## Results from Table 1 in Bai & Perron (2003): ## coefficient estimates coef(bp.ri, breaks = 3) ## corresponding standard errors sapply(vcov(bp.ri, breaks = 3, vcov = vcov.ri), sqrt) ## breakpoints and confidence intervals confint(bp.ri, breaks = 3, vcov = vcov.ri) ## Visualization plot(RealInt) lines(as.vector(time(RealInt)), fitted(fm.ri), col = 4) lines(confint(bp.ri, breaks = 3, vcov = vcov.ri))
## load and plot data data("RealInt") plot(RealInt) ## estimate breakpoints bp.ri <- breakpoints(RealInt ~ 1, h = 15) plot(bp.ri) summary(bp.ri) ## fit segmented model with three breaks fac.ri <- breakfactor(bp.ri, breaks = 3, label = "seg") fm.ri <- lm(RealInt ~ 0 + fac.ri) summary(fm.ri) ## setup kernel HAC estimator vcov.ri <- function(x, ...) kernHAC(x, kernel = "Quadratic Spectral", prewhite = 1, approx = "AR(1)", ...) ## Results from Table 1 in Bai & Perron (2003): ## coefficient estimates coef(bp.ri, breaks = 3) ## corresponding standard errors sapply(vcov(bp.ri, breaks = 3, vcov = vcov.ri), sqrt) ## breakpoints and confidence intervals confint(bp.ri, breaks = 3, vcov = vcov.ri) ## Visualization plot(RealInt) lines(as.vector(time(RealInt)), fitted(fm.ri), col = 4) lines(confint(bp.ri, breaks = 3, vcov = vcov.ri))
A generic function for computing the recursive residuals (standardized one step prediction errors) of a linear regression model.
## Default S3 method: recresid(x, y, start = ncol(x) + 1, end = nrow(x), tol = sqrt(.Machine$double.eps)/ncol(x), qr.tol = 1e-7, ...) ## S3 method for class 'formula' recresid(formula, data = list(), ...) ## S3 method for class 'lm' recresid(x, data = list(), ...)
## Default S3 method: recresid(x, y, start = ncol(x) + 1, end = nrow(x), tol = sqrt(.Machine$double.eps)/ncol(x), qr.tol = 1e-7, ...) ## S3 method for class 'formula' recresid(formula, data = list(), ...) ## S3 method for class 'lm' recresid(x, data = list(), ...)
x , y , formula
|
specification of the linear regression model:
either by a regressor matrix |
start , end
|
integer. Index of the first and last observation, respectively, for which recursive residuals should be computed. By default, the maximal range is selected. |
tol |
numeric. A relative tolerance for precision of recursive coefficient estimates, see details. |
qr.tol |
numeric. The |
data |
an optional data frame containing the variables in the model. By
default the variables are taken from the environment which |
... |
currently not used. |
Recursive residuals are standardized one-step-ahead prediction errors. Under the usual assumptions for the linear regression model they are (asymptotically) normal and i.i.d. (see Brown, Durbin, Evans, 1975, for details).
The default method computes the initial coefficient estimates via QR
decomposition, using lm.fit
. In subsequent steps, the
updating formula provided by Brown, Durbin, Evans (1975) is employed.
To avoid numerical instabilities in the first steps (with typically
small sample sizes), the QR solution is computed for comparison.
When the relative difference (assessed bey all.equal
)
between the two solutions falls below tol
, only the updating
formula is used in subsequent steps.
A vector containing the recursive residuals.
Brown R.L., Durbin J., Evans J.M. (1975), Techniques for testing constancy of regression relationships over time, Journal of the Royal Statistical Society, B, 37, 149-163.
x <- rnorm(100) + rep(c(0, 2), each = 50) rr <- recresid(x ~ 1) plot(cumsum(rr), type = "l") plot(efp(x ~ 1, type = "Rec-CUSUM"))
x <- rnorm(100) + rep(c(0, 2), each = 50) rr <- recresid(x ~ 1) plot(cumsum(rr), type = "l") plot(efp(x ~ 1, type = "Rec-CUSUM"))
Computes the root of a symmetric and positive semidefinite matrix.
root.matrix(X)
root.matrix(X)
X |
a symmetric and positive semidefinite matrix |
a symmetric matrix of same dimensions as X
X <- matrix(c(1,2,2,8), ncol=2) test <- root.matrix(X) ## control results X test %*% test
X <- matrix(c(1,2,2,8), ncol=2) test <- root.matrix(X) ## control results X test %*% test
Computes the root of the Gramian X^TX.
root.matrix.crossprod(X)
root.matrix.crossprod(X)
X |
a matrix |
a symmetric matrix V
where V^2
equals X^TX
set.seed(1) n <- 100 p <- 3 X <- matrix(rnorm(n*p),nrow=n, ncol=p) test <- root.matrix.crossprod(X) ## control results t(X) %*% X test %*% test
set.seed(1) n <- 100 p <- 3 X <- matrix(rnorm(n*p),nrow=n, ncol=p) test <- root.matrix.crossprod(X) ## control results t(X) %*% X test %*% test
Bibliographic information about papers related to structural change and changepoints published in 27 different econometrics and statistics journals.
data("scPublications")
data("scPublications")
A data frame containing information on 835 structural change papers in 9 variables.
character. Author(s) of the paper.
character. Title of the paper.
factor. In which journal was the paper published?
numeric. Year of publication.
numeric. Journal volume.
character. Issue within the journal volume.
numeric. Page on which the paper begins.
numeric. Page on which the paper ends.
factor. Is the journal an econometrics or statistics journal?
The data set scPublications
includes
bibliographic information about publications related to structural change and obtained
from the ‘ISI Web of Science’. The query was based on the ‘Science Citation Index Expanded’
and ‘Social Sciences Citation Index’ (for the full range of years available: 1900-2006 and
1956-2006, respectively). The ‘Source Title’ was restricted to the 27 journals
in the data frame and the ‘Topic’ to be one of the following:
structural change, structural break, structural stability, structural instability,
parameter instability, parameter stability, parameter constancy, change point,
changepoint, change-point, breakpoint, break-point, break point, CUSUM, MOSUM.
Additionally, the famous CUSUM paper of Brown, Durbin and Evans (1975) was added
manually to scPublications
(because it did not match the query above).
ISI Web of Science at https://www.webofknowledge.com/. Queried by James Bullard.
## construct time series: ## number of sc publications in econometrics/statistics data("scPublications") ## select years from 1987 and ## `most important' journals pub <- scPublications pub <- subset(pub, year > 1986) tab1 <- table(pub$journal) nam1 <- names(tab1)[as.vector(tab1) > 9] ## at least 10 papers tab2 <- sapply(levels(pub$journal), function(x) min(subset(pub, journal == x)$year)) nam2 <- names(tab2)[as.vector(tab2) < 1991] ## started at least in 1990 nam <- nam1[nam1 %in% nam2] pub <- subset(pub, as.character(journal) %in% nam) pub$journal <- factor(pub$journal) pub_data <- pub ## generate time series pub <- with(pub, tapply(type, year, table)) pub <- zoo(t(sapply(pub, cbind)), 1987:2006) colnames(pub) <- levels(pub_data$type) ## visualize plot(pub, ylim = c(0, 35))
## construct time series: ## number of sc publications in econometrics/statistics data("scPublications") ## select years from 1987 and ## `most important' journals pub <- scPublications pub <- subset(pub, year > 1986) tab1 <- table(pub$journal) nam1 <- names(tab1)[as.vector(tab1) > 9] ## at least 10 papers tab2 <- sapply(levels(pub$journal), function(x) min(subset(pub, journal == x)$year)) nam2 <- names(tab2)[as.vector(tab2) < 1991] ## started at least in 1990 nam <- nam1[nam1 %in% nam2] pub <- subset(pub, as.character(journal) %in% nam) pub$journal <- factor(pub$journal) pub_data <- pub ## generate time series pub <- with(pub, tapply(type, year, table)) pub <- zoo(t(sapply(pub, cbind)), 1987:2006) colnames(pub) <- levels(pub_data$type) ## visualize plot(pub, ylim = c(0, 35))
Generic function for performing structural change tests.
sctest(x, ...)
sctest(x, ...)
x |
an object. |
... |
arguments passed to methods. |
sctest
is a generic function for performing/extracting structural
change tests based on various types of objects. The strucchange
package
provides various types of methods.
First, structural change tests based on
F statistics in linear regression models (Fstats
),
empirical fluctuation processes in linear regression models (efp
),
and generalized empirical fluctuation processes in parametric models (gefp
)
are available in the corresponding sctest
methods.
Second, convenience interfaces for carrying out structural change tests
in linear regression models and general parametric models are provided in
sctest.formula
and sctest.default
, respectively.
An object of class "htest"
containing:
statistic |
the test statistic, |
p.value |
the corresponding p value, |
method |
a character string with the method used, |
data.name |
a character string with the data name. |
Zeileis A., Leisch F., Hornik K., Kleiber C. (2002), strucchange
:
An R Package for Testing for Structural Change in Linear Regression Models,
Journal of Statistical Software, 7(2), 1-38.
doi:10.18637/jss.v007.i02.
Zeileis A. (2006), Implementing a Class of Structural Change Tests: An Econometric Computing Approach. Computational Statistics & Data Analysis, 50, 2987–3008. doi:10.1016/j.csda.2005.07.001.
sctest.formula
, sctest.default
,
sctest.Fstats
, sctest.efp
, sctest.gefp
Performs model-based tests for structural change (or parameter instability) in parametric models.
## Default S3 method: sctest(x, order.by = NULL, functional = maxBB, vcov = NULL, scores = estfun, decorrelate = TRUE, sandwich = TRUE, parm = NULL, plot = FALSE, from = 0.1, to = NULL, nobs = NULL, nrep = 50000, width = 0.15, xlab = NULL, ...)
## Default S3 method: sctest(x, order.by = NULL, functional = maxBB, vcov = NULL, scores = estfun, decorrelate = TRUE, sandwich = TRUE, parm = NULL, plot = FALSE, from = 0.1, to = NULL, nobs = NULL, nrep = 50000, width = 0.15, xlab = NULL, ...)
x |
a model object. The model class can in principle be arbitrary
but needs to provide suitable methods for extracting the |
order.by |
either a vector |
functional |
either a character specification of the functional
to be used or an |
vcov |
a function to extract the covariance matrix
for the coefficients of the fitted model:
|
scores |
a function which extracts the scores or estimating
function from the fitted object: |
decorrelate |
logical. Should the process be decorrelated? |
sandwich |
logical. Is the function |
parm |
integer or character specifying the component of the estimating functions which should be used (by default all components are used). |
plot |
logical. Should the result of the test also be visualized? |
from , to
|
numeric. In case the |
nobs , nrep
|
numeric. In case the |
width |
numeric. In case the |
xlab , ...
|
graphical parameters passed to the plot method (in case
|
sctest.default
is a convenience interface to gefp
for
structural change tests (or parameter instability tests) in general
parametric models. It proceeds in the following steps:
The generalized empirical fluctuation process (or score-based CUSUM process)
is computed via scus <- gefp(x, fit = NULL, ...)
where ...
comprises the arguments order.by
, vcov
, scores
, decorrelate
,
sandwich
, parm
that are simply passed on to gefp
.
The empirical fluctuation process is visualized (if plot = TRUE
) via
plot(scus, functional = functional, ...)
.
The empirical fluctuation is assessed by the corresponding significance test
via sctest(scus, functional = functional)
.
The main motivation for prociding the convenience interface is that these three steps can be easily carried out in one go along with a two convenience options:
By default, the covariance is computed by an outer-product of gradients
estimator just as in gefp
. This is always available based on the scores
.
Additionally, by setting vcov = "info"
, the corresponding information
matrix can be used. Then the average information is assumed to be provided by
the vcov
method for the model class. (Note that this is only sensible
for models estimated by maximum likelihood.)
Instead of providing the functional
by an efpFunctional
object, the test labels employed by Merkle and Zeileis (2013) and Merkle, Fan,
and Zeileis (2013) can be used for convenience. Namely, for continuous numeric
orderings, the following functionals are available:
functional = "DM"
or "dmax"
provides the double-maximum test (maxBB
).
"CvM"
is the Cramer-von Mises functional meanL2BB
.
"supLM"
or equivalently "maxLM"
is Andrews' supLM test
(supLM
). "MOSUM"
or "maxMOSUM"
is the MOSUM
functional (maxMOSUM
), and "range"
is the range
functional rangeBB
. Furthermore, several functionals suitable
for (ordered) categorical order.by
variables are provided:
"LMuo"
is the unordered LM test (catL2BB
),
"WDMo"
is the weighted double-maximum test for ordered variables
(ordwmax
), and "maxLMo"
is the maxLM test for
ordered variables (ordL2BB
).
The theoretical model class is introduced in Zeileis and Hornik (2007) with a
unifying view in Zeileis (2005), especially from an econometric perspective.
Zeileis (2006) introduces the underling computational tools gefp
and
efpFunctional
.
Merkle and Zeileis (2013) discuss the methods in the context of measurement invariance which is particularly relevant to psychometric models for cross section data. Merkle, Fan, and Zeileis (2014) extend the results to ordered categorical variables.
Zeileis, Shah, and Patnaik (2013) provide a unifying discussion in the context of time series methods, specifically in financial econometrics.
An object of class "htest"
containing:
statistic |
the test statistic, |
p.value |
the corresponding p value, |
method |
a character string with the method used, |
data.name |
a character string with the data name. |
Merkle E.C., Zeileis A. (2013), Tests of Measurement Invariance without Subgroups: A Generalization of Classical Methods. Psychometrika, 78(1), 59–82. doi:10.1007/S11336-012-9302-4
Merkle E.C., Fan J., Zeileis A. (2014), Testing for Measurement Invariance with Respect to an Ordinal Variable. Psychometrika, 79(4), 569–584. doi:10.1007/S11336-013-9376-7.
Zeileis A. (2005), A Unified Approach to Structural Change Tests Based on ML Scores, F Statistics, and OLS Residuals. Econometric Reviews, 24, 445–466. doi:10.1080/07474930500406053.
Zeileis A. (2006), Implementing a Class of Structural Change Tests: An Econometric Computing Approach. Computational Statistics & Data Analysis, 50, 2987–3008. doi:10.1016/j.csda.2005.07.001.
Zeileis A., Hornik K. (2007), Generalized M-Fluctuation Tests for Parameter Instability, Statistica Neerlandica, 61, 488–508. doi:10.1111/j.1467-9574.2007.00371.x.
Zeileis A., Shah A., Patnaik I. (2010), Testing, Monitoring, and Dating Structural Changes in Exchange Rate Regimes, Computational Statistics and Data Analysis, 54(6), 1696–1706. doi:10.1016/j.csda.2009.12.005.
## Zeileis and Hornik (2007), Section 5.3, Figure 6 data("Grossarl") m <- glm(cbind(illegitimate, legitimate) ~ 1, family = binomial, data = Grossarl, subset = time(fraction) <= 1800) sctest(m, order.by = 1700:1800, functional = "CvM")
## Zeileis and Hornik (2007), Section 5.3, Figure 6 data("Grossarl") m <- glm(cbind(illegitimate, legitimate) ~ 1, family = binomial, data = Grossarl, subset = time(fraction) <= 1800) sctest(m, order.by = 1700:1800, functional = "CvM")
Performs a generalized fluctuation test.
## S3 method for class 'efp' sctest(x, alt.boundary = FALSE, functional = c("max", "range", "maxL2", "meanL2"), ...)
## S3 method for class 'efp' sctest(x, alt.boundary = FALSE, functional = c("max", "range", "maxL2", "meanL2"), ...)
x |
an object of class |
alt.boundary |
logical. If set to |
functional |
indicates which functional should be applied to the empirical fluctuation process. |
... |
currently not used. |
The critical values for the MOSUM tests and the ME test are just
tabulated for confidence levels between 0.1 and 0.01, thus the p
value approximations will be poor for other p values. Similarly the
critical values for the maximum and mean squared Euclidean norm ("maxL2"
and "meanL2"
) are tabulated for confidence levels between 0.2 and 0.005.
An object of class "htest"
containing:
statistic |
the test statistic, |
p.value |
the corresponding p value, |
method |
a character string with the method used, |
data.name |
a character string with the data name. |
Brown R.L., Durbin J., Evans J.M. (1975), Techniques for testing constancy of regression relationships over time, Journal of the Royal Statistical Society, B, 37, 149-163.
Chu C.-S., Hornik K., Kuan C.-M. (1995), MOSUM tests for parameter constancy, Biometrika, 82, 603-617.
Chu C.-S., Hornik K., Kuan C.-M. (1995), The moving-estimates test for parameter stability, Econometric Theory, 11, 669-720.
Krämer W., Ploberger W., Alt R. (1988), Testing for structural change in dynamic models, Econometrica, 56, 1355-1369.
Kuan C.-M., Hornik K. (1995), The generalized fluctuation test: A unifying view, Econometric Reviews, 14, 135 - 161.
Kuan C.-M., Chen (1994), Implementing the fluctuation and moving estimates tests in dynamic econometric models, Economics Letters, 44, 235-239.
Ploberger W., Krämer W. (1992), The CUSUM Test with OLS Residuals, Econometrica, 60, 271-285.
Zeileis A., Leisch F., Hornik K., Kleiber C. (2002), strucchange
:
An R Package for Testing for Structural Change in Linear Regression Models,
Journal of Statistical Software, 7(2), 1-38.
doi:10.18637/jss.v007.i02.
Zeileis A. (2004), Alternative Boundaries for CUSUM Tests, Statistical Papers, 45, 123–131.
## Load dataset "nhtemp" with average yearly temperatures in New Haven data("nhtemp") ## plot the data plot(nhtemp) ## test the model null hypothesis that the average temperature remains ## constant over the years compute OLS-CUSUM fluctuation process temp.cus <- efp(nhtemp ~ 1, type = "OLS-CUSUM") ## plot the process with alternative boundaries plot(temp.cus, alpha = 0.01, alt.boundary = TRUE) ## and calculate the test statistic sctest(temp.cus) ## compute moving estimates fluctuation process temp.me <- efp(nhtemp ~ 1, type = "ME", h = 0.2) ## plot the process with functional = "max" plot(temp.me) ## and perform the corresponding test sctest(temp.me)
## Load dataset "nhtemp" with average yearly temperatures in New Haven data("nhtemp") ## plot the data plot(nhtemp) ## test the model null hypothesis that the average temperature remains ## constant over the years compute OLS-CUSUM fluctuation process temp.cus <- efp(nhtemp ~ 1, type = "OLS-CUSUM") ## plot the process with alternative boundaries plot(temp.cus, alpha = 0.01, alt.boundary = TRUE) ## and calculate the test statistic sctest(temp.cus) ## compute moving estimates fluctuation process temp.me <- efp(nhtemp ~ 1, type = "ME", h = 0.2) ## plot the process with functional = "max" plot(temp.me) ## and perform the corresponding test sctest(temp.me)
Performs tests for structural change in linear regression models.
## S3 method for class 'formula' sctest(formula, type = , h = 0.15, alt.boundary = FALSE, functional = c("max", "range", "maxL2", "meanL2"), from = 0.15, to = NULL, point = 0.5, asymptotic = FALSE, data, ...)
## S3 method for class 'formula' sctest(formula, type = , h = 0.15, alt.boundary = FALSE, functional = c("max", "range", "maxL2", "meanL2"), from = 0.15, to = NULL, point = 0.5, asymptotic = FALSE, data, ...)
formula |
a formula describing the model to be tested. |
type |
a character string specifying the structural change test that is
to be performed, the default is |
h |
numeric from interval (0,1) specifying the bandwidth. Determines the size of the data window relative to the sample size (for MOSUM and ME tests only). |
alt.boundary |
logical. If set to |
functional |
indicates which functional should be used to aggregate the empirical fluctuation processes to a test statistic. |
from , to
|
numeric. If |
point |
parameter of the Chow test for the potential change point.
Interpreted analogous to the |
asymptotic |
logical. If |
data |
an optional data frame containing the variables in the model. By
default the variables are taken from the environment which
|
... |
sctest.formula
is a convenience interface for performing structural change
tests in linear regression models based on efp
and Fstats
.
It is mainly a wrapper for sctest.efp
and sctest.Fstats
as it fits an empirical fluctuation process
first or computes the F statistics respectively and subsequently performs the
corresponding test. The Chow test and the Nyblom-Hansen test are available explicitly here.
An alternative convenience interface for performing structural change tests in general
parametric models (based on gefp
) is available in sctest.default
.
An object of class "htest"
containing:
statistic |
the test statistic, |
p.value |
the corresponding p value, |
method |
a character string with the method used, |
data.name |
a character string with the data name. |
sctest.efp
, sctest.Fstats
, sctest.default
## Example 7.4 from Greene (1993), "Econometric Analysis" ## Chow test on Longley data data("longley") sctest(Employed ~ Year + GNP.deflator + GNP + Armed.Forces, data = longley, type = "Chow", point = 7) ## which is equivalent to segmenting the regression via fac <- factor(c(rep(1, 7), rep(2, 9))) fm0 <- lm(Employed ~ Year + GNP.deflator + GNP + Armed.Forces, data = longley) fm1 <- lm(Employed ~ fac/(Year + GNP.deflator + GNP + Armed.Forces), data = longley) anova(fm0, fm1) ## estimates from Table 7.5 in Greene (1993) summary(fm0) summary(fm1)
## Example 7.4 from Greene (1993), "Econometric Analysis" ## Chow test on Longley data data("longley") sctest(Employed ~ Year + GNP.deflator + GNP + Armed.Forces, data = longley, type = "Chow", point = 7) ## which is equivalent to segmenting the regression via fac <- factor(c(rep(1, 7), rep(2, 9))) fm0 <- lm(Employed ~ Year + GNP.deflator + GNP + Armed.Forces, data = longley) fm1 <- lm(Employed ~ fac/(Year + GNP.deflator + GNP + Armed.Forces), data = longley) anova(fm0, fm1) ## estimates from Table 7.5 in Greene (1993) summary(fm0) summary(fm1)
Performs the supF-, aveF- or expF-test
## S3 method for class 'Fstats' sctest(x, type = c("supF", "aveF", "expF"), asymptotic = FALSE, ...)
## S3 method for class 'Fstats' sctest(x, type = c("supF", "aveF", "expF"), asymptotic = FALSE, ...)
x |
an object of class |
type |
a character string specifying which test will be performed. |
asymptotic |
logical. Only necessary if |
... |
currently not used. |
If x
contains just a single F statistic and type is
"supF"
or "aveF"
the Chow test will be performed.
The original GAUSS code for computing the p values of the supF-, aveF- and expF-test was written by Bruce Hansen and is available from https://www.ssc.wisc.edu/~bhansen/. R port by Achim Zeileis.
An object of class "htest"
containing:
statistic |
the test statistic, |
p.value |
the corresponding p value, |
method |
a character string with the method used, |
data.name |
a character string with the data name. |
Andrews D.W.K. (1993), Tests for parameter instability and structural change with unknown change point, Econometrica, 61, 821-856.
Andrews D.W.K., Ploberger W. (1994), Optimal tests when a nuisance parameter is present only under the alternative, Econometrica, 62, 1383-1414.
Hansen B. (1992), Tests for parameter instability in regressions with I(1) processes, Journal of Business & Economic Statistics, 10, 321-335.
Hansen B. (1997), Approximate asymptotic p values for structural-change tests, Journal of Business & Economic Statistics, 15, 60-67.
## Load dataset "nhtemp" with average yearly temperatures in New Haven data(nhtemp) ## plot the data plot(nhtemp) ## test the model null hypothesis that the average temperature remains ## constant over the years for potential break points between 1941 ## (corresponds to from = 0.5) and 1962 (corresponds to to = 0.85) ## compute F statistics fs <- Fstats(nhtemp ~ 1, from = 0.5, to = 0.85) ## plot the F statistics plot(fs, alpha = 0.01) ## and the corresponding p values plot(fs, pval = TRUE, alpha = 0.01) ## perform the aveF test sctest(fs, type = "aveF")
## Load dataset "nhtemp" with average yearly temperatures in New Haven data(nhtemp) ## plot the data plot(nhtemp) ## test the model null hypothesis that the average temperature remains ## constant over the years for potential break points between 1941 ## (corresponds to from = 0.5) and 1962 (corresponds to to = 0.85) ## compute F statistics fs <- Fstats(nhtemp ~ 1, from = 0.5, to = 0.85) ## plot the F statistics plot(fs, alpha = 0.01) ## and the corresponding p values plot(fs, pval = TRUE, alpha = 0.01) ## perform the aveF test sctest(fs, type = "aveF")
Computes the inverse of the cross-product of a matrix X.
solveCrossprod(X, method = c("qr", "chol", "solve"))
solveCrossprod(X, method = c("qr", "chol", "solve"))
X |
a matrix, typically a regressor matrix. |
method |
a string indicating whether the QR decomposition,
the Cholesky decomposition or |
Using the Cholesky decomposition of X'X (as computed by crossprod(X)
)
is computationally faster and preferred to solve(crossprod(X))
. Using the
QR decomposition of X is slower but should be more accurate.
a matrix containing the inverse of crossprod(X)
.
X <- cbind(1, rnorm(100)) solveCrossprod(X) solve(crossprod(X))
X <- cbind(1, rnorm(100)) solveCrossprod(X) solve(crossprod(X))
A multivariate series of all S&P 500 stock prices in the second half of the year 2001, i.e., before and after the terrorist attacks of 2001-09-11.
data("SP2001")
data("SP2001")
A multivariate daily "zoo"
series with "Date"
index
from 2001-07-31 to 2001-12-31 (103 observations) of all 500 S&P
stock prices.
Yahoo! Finance: https://finance.yahoo.com/.
Zeileis A., Leisch F., Kleiber C., Hornik K. (2005), Monitoring Structural Change in Dynamic Econometric Models, Journal of Applied Econometrics, 20, 99–121.
## load and transform data ## (DAL: Delta Air Lines, LU: Lucent Technologies) data("SP2001") stock.prices <- SP2001[, c("DAL", "LU")] stock.returns <- diff(log(stock.prices)) ## price and return series plot(stock.prices, ylab = c("Delta Air Lines", "Lucent Technologies"), main = "") plot(stock.returns, ylab = c("Delta Air Lines", "Lucent Technologies"), main = "") ## monitoring of DAL series myborder <- function(k) 1.939*k/28 x <- as.vector(stock.returns[, "DAL"][1:28]) dal.cusum <- mefp(x ~ 1, type = "OLS-CUSUM", border = myborder) dal.mosum <- mefp(x ~ 1, type = "OLS-MOSUM", h = 0.5, period = 4) x <- as.vector(stock.returns[, "DAL"]) dal.cusum <- monitor(dal.cusum) dal.mosum <- monitor(dal.mosum) ## monitoring of LU series x <- as.vector(stock.returns[, "LU"][1:28]) lu.cusum <- mefp(x ~ 1, type = "OLS-CUSUM", border = myborder) lu.mosum <- mefp(x ~ 1, type = "OLS-MOSUM", h = 0.5, period = 4) x <- as.vector(stock.returns[, "LU"]) lu.cusum <- monitor(lu.cusum) lu.mosum <- monitor(lu.mosum) ## pretty plotting ## (needs some work because lm() does not keep "zoo" attributes) cus.bound <- zoo(c(rep(NA, 27), myborder(28:102)), index(stock.returns)) mos.bound <- as.vector(boundary(dal.mosum)) mos.bound <- zoo(c(rep(NA, 27), mos.bound[1], mos.bound), index(stock.returns)) ## Lucent Technologies: CUSUM test plot(zoo(c(lu.cusum$efpprocess, lu.cusum$process), index(stock.prices)), ylim = c(-1, 1) * coredata(cus.bound)[102], xlab = "Time", ylab = "empirical fluctuation process") abline(0, 0) abline(v = as.Date("2001-09-10"), lty = 2) lines(cus.bound, col = 2) lines(-cus.bound, col = 2) ## Lucent Technologies: MOSUM test plot(zoo(c(lu.mosum$efpprocess, lu.mosum$process), index(stock.prices)[-(1:14)]), ylim = c(-1, 1) * coredata(mos.bound)[102], xlab = "Time", ylab = "empirical fluctuation process") abline(0, 0) abline(v = as.Date("2001-09-10"), lty = 2) lines(mos.bound, col = 2) lines(-mos.bound, col = 2) ## Delta Air Lines: CUSUM test plot(zoo(c(dal.cusum$efpprocess, dal.cusum$process), index(stock.prices)), ylim = c(-1, 1) * coredata(cus.bound)[102], xlab = "Time", ylab = "empirical fluctuation process") abline(0, 0) abline(v = as.Date("2001-09-10"), lty = 2) lines(cus.bound, col = 2) lines(-cus.bound, col = 2) ## Delta Air Lines: MOSUM test plot(zoo(c(dal.mosum$efpprocess, dal.mosum$process), index(stock.prices)[-(1:14)]), ylim = range(dal.mosum$process), xlab = "Time", ylab = "empirical fluctuation process") abline(0, 0) abline(v = as.Date("2001-09-10"), lty = 2) lines(mos.bound, col = 2) lines(-mos.bound, col = 2)
## load and transform data ## (DAL: Delta Air Lines, LU: Lucent Technologies) data("SP2001") stock.prices <- SP2001[, c("DAL", "LU")] stock.returns <- diff(log(stock.prices)) ## price and return series plot(stock.prices, ylab = c("Delta Air Lines", "Lucent Technologies"), main = "") plot(stock.returns, ylab = c("Delta Air Lines", "Lucent Technologies"), main = "") ## monitoring of DAL series myborder <- function(k) 1.939*k/28 x <- as.vector(stock.returns[, "DAL"][1:28]) dal.cusum <- mefp(x ~ 1, type = "OLS-CUSUM", border = myborder) dal.mosum <- mefp(x ~ 1, type = "OLS-MOSUM", h = 0.5, period = 4) x <- as.vector(stock.returns[, "DAL"]) dal.cusum <- monitor(dal.cusum) dal.mosum <- monitor(dal.mosum) ## monitoring of LU series x <- as.vector(stock.returns[, "LU"][1:28]) lu.cusum <- mefp(x ~ 1, type = "OLS-CUSUM", border = myborder) lu.mosum <- mefp(x ~ 1, type = "OLS-MOSUM", h = 0.5, period = 4) x <- as.vector(stock.returns[, "LU"]) lu.cusum <- monitor(lu.cusum) lu.mosum <- monitor(lu.mosum) ## pretty plotting ## (needs some work because lm() does not keep "zoo" attributes) cus.bound <- zoo(c(rep(NA, 27), myborder(28:102)), index(stock.returns)) mos.bound <- as.vector(boundary(dal.mosum)) mos.bound <- zoo(c(rep(NA, 27), mos.bound[1], mos.bound), index(stock.returns)) ## Lucent Technologies: CUSUM test plot(zoo(c(lu.cusum$efpprocess, lu.cusum$process), index(stock.prices)), ylim = c(-1, 1) * coredata(cus.bound)[102], xlab = "Time", ylab = "empirical fluctuation process") abline(0, 0) abline(v = as.Date("2001-09-10"), lty = 2) lines(cus.bound, col = 2) lines(-cus.bound, col = 2) ## Lucent Technologies: MOSUM test plot(zoo(c(lu.mosum$efpprocess, lu.mosum$process), index(stock.prices)[-(1:14)]), ylim = c(-1, 1) * coredata(mos.bound)[102], xlab = "Time", ylab = "empirical fluctuation process") abline(0, 0) abline(v = as.Date("2001-09-10"), lty = 2) lines(mos.bound, col = 2) lines(-mos.bound, col = 2) ## Delta Air Lines: CUSUM test plot(zoo(c(dal.cusum$efpprocess, dal.cusum$process), index(stock.prices)), ylim = c(-1, 1) * coredata(cus.bound)[102], xlab = "Time", ylab = "empirical fluctuation process") abline(0, 0) abline(v = as.Date("2001-09-10"), lty = 2) lines(cus.bound, col = 2) lines(-cus.bound, col = 2) ## Delta Air Lines: MOSUM test plot(zoo(c(dal.mosum$efpprocess, dal.mosum$process), index(stock.prices)[-(1:14)]), ylim = range(dal.mosum$process), xlab = "Time", ylab = "empirical fluctuation process") abline(0, 0) abline(v = as.Date("2001-09-10"), lty = 2) lines(mos.bound, col = 2) lines(-mos.bound, col = 2)
Generators for efpFunctional
objects suitable for aggregating
empirical fluctuation processes to test statistics along continuous
variables (i.e., along time in time series applications).
supLM(from = 0.15, to = NULL) maxMOSUM(width = 0.15)
supLM(from = 0.15, to = NULL) maxMOSUM(width = 0.15)
from , to
|
numeric from interval (0, 1) specifying start and end
of trimmed sample period. By default, |
width |
a numeric from interval (0,1) specifying the bandwidth. Determines the size of the moving data window relative to sample size. |
supLM
and maxMOSUM
generate efpFunctional
objects for Andrews' supLM test and a (maximum) MOSUM test, respectively,
with the specified optional parameters (from
and to
,
and width
, respectively). The resulting objects can be used in
combination with empirical fluctuation processes of class gefp
for significance testing and visualization. The corresponding statistics
are useful for carrying out structural change tests along a continuous
variable (i.e., along time in time series applications). Further typical
efpFunctional
s for this setting are the double-maximum
functional maxBB
and the Cramer-von Mises functional
meanL2BB
.
An object of class efpFunctional
.
Merkle E.C., Zeileis A. (2013), Tests of Measurement Invariance without Subgroups: A Generalization of Classical Methods. Psychometrika, 78(1), 59–82. doi:10.1007/S11336-012-9302-4
Zeileis A. (2005), A Unified Approach to Structural Change Tests Based on ML Scores, F Statistics, and OLS Residuals. Econometric Reviews, 24, 445–466. doi:10.1080/07474930500406053.
Zeileis A. (2006), Implementing a Class of Structural Change Tests: An Econometric Computing Approach. Computational Statistics & Data Analysis, 50, 2987–3008. doi:10.1016/j.csda.2005.07.001.
Zeileis A., Hornik K. (2007), Generalized M-Fluctuation Tests for Parameter Instability, Statistica Neerlandica, 61, 488–508. doi:10.1111/j.1467-9574.2007.00371.x.
## seatbelt data data("UKDriverDeaths") seatbelt <- log10(UKDriverDeaths) seatbelt <- cbind(seatbelt, lag(seatbelt, k = -1), lag(seatbelt, k = -12)) colnames(seatbelt) <- c("y", "ylag1", "ylag12") seatbelt <- window(seatbelt, start = c(1970, 1), end = c(1984,12)) ## empirical fluctuation process scus.seat <- gefp(y ~ ylag1 + ylag12, data = seatbelt) ## supLM test plot(scus.seat, functional = supLM(0.1)) ## MOSUM test plot(scus.seat, functional = maxMOSUM(0.25)) ## double maximum test plot(scus.seat) ## range test plot(scus.seat, functional = rangeBB) ## Cramer-von Mises statistic (Nyblom-Hansen test) plot(scus.seat, functional = meanL2BB)
## seatbelt data data("UKDriverDeaths") seatbelt <- log10(UKDriverDeaths) seatbelt <- cbind(seatbelt, lag(seatbelt, k = -1), lag(seatbelt, k = -12)) colnames(seatbelt) <- c("y", "ylag1", "ylag12") seatbelt <- window(seatbelt, start = c(1970, 1), end = c(1984,12)) ## empirical fluctuation process scus.seat <- gefp(y ~ ylag1 + ylag12, data = seatbelt) ## supLM test plot(scus.seat, functional = supLM(0.1)) ## MOSUM test plot(scus.seat, functional = maxMOSUM(0.25)) ## double maximum test plot(scus.seat) ## range test plot(scus.seat, functional = rangeBB) ## Cramer-von Mises statistic (Nyblom-Hansen test) plot(scus.seat, functional = meanL2BB)
Personal income and personal consumption expenditures in the US between January 1959 and February 2001 (seasonally adjusted at annual rates).
data("USIncExp")
data("USIncExp")
A multivariate monthly time series from 1959(1) to 2001(2) with variables
monthly personal income (in billion US dollars),
monthly personal consumption expenditures (in billion US Dollars).
https://web.archive.org/web/20201205041942/http://www.economagic.com/
A. Zeileis, F. Leisch, K. Hornik, C. Kleiber (2002), strucchange: An R Package for Testing for Structural Change in Linear Regression Models. Journal of Statistical Software 7(2), 1–38.
## These example are presented in the vignette distributed with this ## package, the code was generated by Stangle("strucchange-intro.Rnw") ################################################### ### chunk number 1: data ################################################### data("USIncExp") plot(USIncExp, plot.type = "single", col = 1:2, ylab = "billion US$") legend(1960, max(USIncExp), c("income", "expenditures"), lty = c(1,1), col = 1:2, bty = "n") ################################################### ### chunk number 2: subset ################################################### data("USIncExp") USIncExp2 <- window(USIncExp, start = c(1985,12)) ################################################### ### chunk number 3: ecm-setup ################################################### coint.res <- residuals(lm(expenditure ~ income, data = USIncExp2)) coint.res <- lag(ts(coint.res, start = c(1985,12), freq = 12), k = -1) USIncExp2 <- cbind(USIncExp2, diff(USIncExp2), coint.res) USIncExp2 <- window(USIncExp2, start = c(1986,1), end = c(2001,2)) colnames(USIncExp2) <- c("income", "expenditure", "diff.income", "diff.expenditure", "coint.res") ecm.model <- diff.expenditure ~ coint.res + diff.income ################################################### ### chunk number 4: ts-used ################################################### plot(USIncExp2[,3:5], main = "") ################################################### ### chunk number 5: efp ################################################### ocus <- efp(ecm.model, type="OLS-CUSUM", data=USIncExp2) me <- efp(ecm.model, type="ME", data=USIncExp2, h=0.2) ################################################### ### chunk number 6: efp-boundary ################################################### bound.ocus <- boundary(ocus, alpha=0.05) ################################################### ### chunk number 7: OLS-CUSUM ################################################### plot(ocus) ################################################### ### chunk number 8: efp-boundary2 ################################################### plot(ocus, boundary = FALSE) lines(bound.ocus, col = 4) lines(-bound.ocus, col = 4) ################################################### ### chunk number 9: ME-null ################################################### plot(me, functional = NULL) ################################################### ### chunk number 10: efp-sctest ################################################### sctest(ocus) ################################################### ### chunk number 11: efp-sctest2 ################################################### sctest(ecm.model, type="OLS-CUSUM", data=USIncExp2) ################################################### ### chunk number 12: Fstats ################################################### fs <- Fstats(ecm.model, from = c(1990, 1), to = c(1999,6), data = USIncExp2) ################################################### ### chunk number 13: Fstats-plot ################################################### plot(fs) ################################################### ### chunk number 14: pval-plot ################################################### plot(fs, pval=TRUE) ################################################### ### chunk number 15: aveF-plot ################################################### plot(fs, aveF=TRUE) ################################################### ### chunk number 16: Fstats-sctest ################################################### sctest(fs, type="expF") ################################################### ### chunk number 17: Fstats-sctest2 ################################################### sctest(ecm.model, type = "expF", from = 49, to = 162, data = USIncExp2) ################################################### ### chunk number 18: mefp ################################################### USIncExp3 <- window(USIncExp2, start = c(1986, 1), end = c(1989,12)) me.mefp <- mefp(ecm.model, type = "ME", data = USIncExp3, alpha = 0.05) ################################################### ### chunk number 19: monitor1 ################################################### USIncExp3 <- window(USIncExp2, start = c(1986, 1), end = c(1990,12)) me.mefp <- monitor(me.mefp) ################################################### ### chunk number 20: monitor2 ################################################### USIncExp3 <- window(USIncExp2, start = c(1986, 1)) me.mefp <- monitor(me.mefp) me.mefp ################################################### ### chunk number 21: monitor-plot ################################################### plot(me.mefp) ################################################### ### chunk number 22: mefp2 ################################################### USIncExp3 <- window(USIncExp2, start = c(1986, 1), end = c(1989,12)) me.efp <- efp(ecm.model, type = "ME", data = USIncExp3, h = 0.5) me.mefp <- mefp(me.efp, alpha=0.05) ################################################### ### chunk number 23: monitor3 ################################################### USIncExp3 <- window(USIncExp2, start = c(1986, 1)) me.mefp <- monitor(me.mefp) ################################################### ### chunk number 24: monitor-plot2 ################################################### plot(me.mefp)
## These example are presented in the vignette distributed with this ## package, the code was generated by Stangle("strucchange-intro.Rnw") ################################################### ### chunk number 1: data ################################################### data("USIncExp") plot(USIncExp, plot.type = "single", col = 1:2, ylab = "billion US$") legend(1960, max(USIncExp), c("income", "expenditures"), lty = c(1,1), col = 1:2, bty = "n") ################################################### ### chunk number 2: subset ################################################### data("USIncExp") USIncExp2 <- window(USIncExp, start = c(1985,12)) ################################################### ### chunk number 3: ecm-setup ################################################### coint.res <- residuals(lm(expenditure ~ income, data = USIncExp2)) coint.res <- lag(ts(coint.res, start = c(1985,12), freq = 12), k = -1) USIncExp2 <- cbind(USIncExp2, diff(USIncExp2), coint.res) USIncExp2 <- window(USIncExp2, start = c(1986,1), end = c(2001,2)) colnames(USIncExp2) <- c("income", "expenditure", "diff.income", "diff.expenditure", "coint.res") ecm.model <- diff.expenditure ~ coint.res + diff.income ################################################### ### chunk number 4: ts-used ################################################### plot(USIncExp2[,3:5], main = "") ################################################### ### chunk number 5: efp ################################################### ocus <- efp(ecm.model, type="OLS-CUSUM", data=USIncExp2) me <- efp(ecm.model, type="ME", data=USIncExp2, h=0.2) ################################################### ### chunk number 6: efp-boundary ################################################### bound.ocus <- boundary(ocus, alpha=0.05) ################################################### ### chunk number 7: OLS-CUSUM ################################################### plot(ocus) ################################################### ### chunk number 8: efp-boundary2 ################################################### plot(ocus, boundary = FALSE) lines(bound.ocus, col = 4) lines(-bound.ocus, col = 4) ################################################### ### chunk number 9: ME-null ################################################### plot(me, functional = NULL) ################################################### ### chunk number 10: efp-sctest ################################################### sctest(ocus) ################################################### ### chunk number 11: efp-sctest2 ################################################### sctest(ecm.model, type="OLS-CUSUM", data=USIncExp2) ################################################### ### chunk number 12: Fstats ################################################### fs <- Fstats(ecm.model, from = c(1990, 1), to = c(1999,6), data = USIncExp2) ################################################### ### chunk number 13: Fstats-plot ################################################### plot(fs) ################################################### ### chunk number 14: pval-plot ################################################### plot(fs, pval=TRUE) ################################################### ### chunk number 15: aveF-plot ################################################### plot(fs, aveF=TRUE) ################################################### ### chunk number 16: Fstats-sctest ################################################### sctest(fs, type="expF") ################################################### ### chunk number 17: Fstats-sctest2 ################################################### sctest(ecm.model, type = "expF", from = 49, to = 162, data = USIncExp2) ################################################### ### chunk number 18: mefp ################################################### USIncExp3 <- window(USIncExp2, start = c(1986, 1), end = c(1989,12)) me.mefp <- mefp(ecm.model, type = "ME", data = USIncExp3, alpha = 0.05) ################################################### ### chunk number 19: monitor1 ################################################### USIncExp3 <- window(USIncExp2, start = c(1986, 1), end = c(1990,12)) me.mefp <- monitor(me.mefp) ################################################### ### chunk number 20: monitor2 ################################################### USIncExp3 <- window(USIncExp2, start = c(1986, 1)) me.mefp <- monitor(me.mefp) me.mefp ################################################### ### chunk number 21: monitor-plot ################################################### plot(me.mefp) ################################################### ### chunk number 22: mefp2 ################################################### USIncExp3 <- window(USIncExp2, start = c(1986, 1), end = c(1989,12)) me.efp <- efp(ecm.model, type = "ME", data = USIncExp3, h = 0.5) me.mefp <- mefp(me.efp, alpha=0.05) ################################################### ### chunk number 23: monitor3 ################################################### USIncExp3 <- window(USIncExp2, start = c(1986, 1)) me.mefp <- monitor(me.mefp) ################################################### ### chunk number 24: monitor-plot2 ################################################### plot(me.mefp)