Title: | Statistical Learning for Big Dependent Data |
---|---|
Description: | Programs for analyzing large-scale time series data. They include functions for automatic specification and estimation of univariate time series, for clustering time series, for multivariate outlier detection, for quantile plotting of many time series, for dynamic factor models and for creating input data for deep learning programs. Examples of using the package can be found in the Wiley book 'Statistical Learning with Big Dependent Data' by Daniel Peña and Ruey S. Tsay (2021). ISBN 9781119417385. |
Authors: | Angela Caro [aut], Antonio Elias [aut, cre], Daniel Peña [aut], Ruey S. Tsay [aut] |
Maintainer: | Antonio Elias <[email protected]> |
License: | GPL-3 |
Version: | 0.0.4 |
Built: | 2024-10-28 07:01:51 UTC |
Source: | CRAN |
Automatic selection and estimation of a regular or possibly seasonal ARIMA model for a given time series.
arimaID( zt, maxorder = c(5, 1, 3), criterion = "bic", period = c(12), output = TRUE, method = "CSS-ML", pv = 0.01, spv = 0.01, transpv = 0.05, nblock = 0 )
zt |
T by 1 vector of an observed scalar time series without any missing values. |
maxorder |
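Maximum order of (p, d, q), where p is the AR order, d the degree of differencing, and q the MA order. Default value is c(5, 1, 3). |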
criterion |
Information criterion used for model selection. Either AIC or BIC. Default is "bic". |
period |
Seasonal period. Default value is 12. |
output |
If TRUE it returns the differencing order, the selected order and the minimum value of the criterion. Default is TRUE. |
method |
Estimation method. See the arima command in R. Possible values are "CSS-ML", "ML", and "CSS". Default is "CSS-ML". |
pv |
P-value for unit-root test. Default value is 0.01. |
spv |
P-value for detecting seasonality. Default value is 0.01. |
transpv |
P-value for checking non-linear transformation. Default value is 0.05. |
nblock |
Number of blocks used in checking non-linear transformations. Default value is floor(sqrt(T)). |
The program proceeds in the following steps:
(1) Check for seasonality: a multiplicative ARIMA(p,0,0)(1,0,0)_s model is fitted to the scalar time series, and the estimated seasonal AR coefficient is tested for significance.
(2) Check for non-linear transformation: the series is divided into a given number of consecutive blocks, and in each block the Mean Absolute Deviation (MAD) and the median are computed. The log of the MAD is regressed on the log of the median, and the slope defines the non-linear transformation.
(3) Select orders: up to the maximum order given by maxorder.
A list containing:
data - The time series. If any non-linear transformation is taken, "data" is the transformed series.
order - Regular ARIMA order.
sorder - Seasonal ARIMA order.
period - Seasonal period.
include.mean - Switch concerning the inclusion of mean in the model.
data(TaiwanAirBox032017)
fit <- arimaID(TaiwanAirBox032017[,1])
Select an ARIMA model for a non-seasonal scalar time series. It uses the augmented Dickey-Fuller (ADF) test to check for unit roots. The maximum degree of differencing is 2.
arimaSpec( zt, maxorder = c(5, 1, 4), criterion = "bic", output = FALSE, method = "CSS-ML", pv = 0.01 )
zt |
T by 1 vector of an observed scalar time series without missing values. |
maxorder |
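Maximum order of (p, d, q), where p is the AR order, d the degree of differencing, and q the MA order. Default value is c(5, 1, 4). |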
criterion |
Information criterion used for model selection. Either AIC or BIC. Default is "bic". |
output |
If TRUE it returns the differencing order, the selected order and the minimum value of the criterion. Default is FALSE. |
method |
Estimation method. See the arima command in R. Possible values are "CSS-ML", "ML", and "CSS". Default is "CSS-ML". |
pv |
P-value for unit-root test. Default is 0.01. |
The AR order is found by fitting a pure AR model to the differenced series; the maximum AR order tried is the minimum of the default AR order and the order of the pure AR model. The MA order is found by fitting a pure MA model, using rank-based Ljung-Box statistics; the maximum MA order tried is the minimum of the default MA order and the order of the pure MA model. Finally, the AR order is sequentially decreased and the MA order increased to obtain the best model according to the specified criterion.
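The idea of choosing orders by minimizing an information criterion can be illustrated with a minimal base-R sketch. It is not the package's implementation: it assumes a series that needs no differencing, it replaces the sequential search with an exhaustive grid, and the helper name select_arma_bic is hypothetical.

```r
# Hypothetical helper (not part of SLBDD): exhaustive BIC search over
# ARMA(p, q) orders for a series assumed to need no differencing.
select_arma_bic <- function(z, max_p = 3, max_q = 2) {
  n <- length(z)
  best <- list(order = c(0, 0, 0), crit = Inf)
  for (p in 0:max_p) {
    for (q in 0:max_q) {
      fit <- tryCatch(
        suppressWarnings(arima(z, order = c(p, 0, q), method = "ML")),
        error = function(e) NULL
      )
      if (is.null(fit)) next
      # BIC = -2 logLik + log(n) * (number of coefficients + 1 for the variance)
      crit <- -2 * fit$loglik + log(n) * (length(fit$coef) + 1)
      if (crit < best$crit) best <- list(order = c(p, 0, q), crit = crit)
    }
  }
  best
}

set.seed(1)
z <- arima.sim(model = list(ar = 0.7), n = 300)  # simulated AR(1) series
sel <- select_arma_bic(z)
sel$order  # selected (p, d, q)
```

The grid search is slower than a sequential search but easier to read; the returned list mirrors the order and crit items described for arimaSpec.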
A list containing:
order - Regular ARIMA order.
crit - Minimum criterion.
include.mean - Switch about including mean in the model.
data(TaiwanAirBox032017)
fit <- arimaSpec(as.matrix(TaiwanAirBox032017[,1]))
Check the seasonality of each component of a multiple time series.
chksea(x, period = c(12), p = 0, alpha = 0.05, output = TRUE)
x |
T by k data matrix: T data points in rows with each row being data at a given time point, and k time series in columns. |
period |
seasonal period. Default value is 12. |
p |
Regular AR order. Default value is max(floor(log(T)),1). |
alpha |
Type-I error for the t-ratio of seasonal coefficients. Default value is 0.05. |
output |
If TRUE, it reports whether the series has seasonality. Default is TRUE. |
Seasonality is checked by fitting a seasonal AR(1) model together with a regular AR(p) model to each scalar time series and testing whether the estimated seasonal AR coefficient is significant.
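A minimal base-R sketch of this kind of check is given here, assuming a normal approximation for the t-ratio of the seasonal AR coefficient. The helper name seasonal_ar_test and the simulated series are illustrative assumptions, not part of the package.

```r
# Hypothetical helper (not part of SLBDD): test the seasonal AR(1)
# coefficient in an AR(p) x seasonal AR(1) model.
seasonal_ar_test <- function(z, period = 12, p = 1, alpha = 0.05) {
  fit <- arima(z, order = c(p, 0, 0),
               seasonal = list(order = c(1, 0, 0), period = period),
               method = "ML")
  est <- unname(fit$coef["sar1"])
  se <- sqrt(fit$var.coef["sar1", "sar1"])
  t_ratio <- est / se
  p_value <- 2 * pnorm(-abs(t_ratio))   # normal approximation
  list(seasonal = p_value < alpha, t_ratio = t_ratio)
}

set.seed(2)
e <- rnorm(360)
# Simulate z_t = 0.8 * z_{t-12} + e_t
z <- stats::filter(e, filter = c(rep(0, 11), 0.8), method = "recursive")
res <- seasonal_ar_test(as.numeric(z))
res$seasonal
```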
A list containing:
Seasonal - TRUE or FALSE.
period - Seasonal period.
data(TaiwanAirBox032017)
output <- chksea(TaiwanAirBox032017[,1])
Check for possible non-linear transformations of a multiple time series, series by series.
chktrans(x, block = 0, output = FALSE, period = 1, pv = 0.05)
x |
T by k data matrix: T data points in rows with each row being data at a given time point, and k time series in columns. |
block |
Number of blocks used in the linear regression. Default value is floor(sqrt(T)). |
output |
If TRUE it returns the estimates, the transformation code for each series (log, sqrt or No-trans) and the numbers of non-linear transformations. Default is FALSE. |
period |
Seasonal period. |
pv |
P-value = pv/log(1 + k) is used to check the significance of the coefficients. Default value is 0.05. |
Each series is divided into a given number of consecutive blocks and in each of them the mean absolute deviation (MAD) and the median are computed. A regression of the log of the MAD with respect to the log of the median is run and the slope defines the non-linear transformation.
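The block regression can be sketched in base R as follows. This is an illustration under stated assumptions, not the package code: the helper name block_slope is hypothetical, the series must be strictly positive, and the reading of the slope follows the usual variance-stabilizing argument (a slope near 1 points to a log transformation, near 0.5 to a square root, near 0 to no transformation). The cutoffs the package actually applies are not stated here.

```r
# Hypothetical helper (not part of SLBDD): slope of log(MAD) on log(median)
# over consecutive blocks. Requires a strictly positive series.
block_slope <- function(z, nblock = floor(sqrt(length(z)))) {
  block <- cut(seq_along(z), breaks = nblock, labels = FALSE)
  med <- as.numeric(tapply(z, block, median))
  mad_blk <- as.numeric(tapply(z, block, function(b) mean(abs(b - median(b)))))
  fit <- lm(log(mad_blk) ~ log(med))
  unname(coef(fit)[2])
}

set.seed(3)
level <- exp(seq(log(10), log(100), length.out = 400))
z <- level * exp(rnorm(400, sd = 0.1))   # spread proportional to level
slope <- block_slope(z)
slope
```

In this simulated example the spread grows in proportion to the level, so the estimated slope should be close to 1.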
A list containing:
lnTran - Column locations of series that require log-transformation.
sqrtTran - Column locations of series that require square-root transformation.
noTran - Column locations of series that require no transformation.
tran - A vector indicating checking results, where 0 means no transformation, 1 means log-transformation, 2 means square-root transformation.
tranX - Transformed series. This is only provided if the number of series requiring transformation is sufficiently large, i.e. greater than .
Summary - Number of time series that require log-transformation, square-root transformation and no transformation.
data(TaiwanAirBox032017)
output <- chktrans(TaiwanAirBox032017[,1])
Daily sales, in natural logarithms, of a clothing brand in 25 provinces in China from January 1, 2008, to December 9, 2012. The number of observations is 1812.
data(clothing)
An object of class "data.frame".
Chang, J., Guo, B., and Yao, Q. (2018). Principal component analysis for second-order stationary vector time series. The Annals of Statistics, 46(5), 2094-2124.
Identification of groups using projections of a vector of features of each time series in directions of extreme kurtosis coefficient.
ClusKur(x)
x |
p by k data matrix: p features or variables for each time series and k time series in columns. |
A list containing:
lbl - Cluster labels (possible outliers get negative labels).
ncl - Number of clusters.
data(Stockindexes99world)
S <- Stockindexes99world[,-1]
v1 <- apply(S, 2, mean)
v2 <- apply(S, 2, sd)
M <- rbind(v1, v2)
out <- ClusKur(M)
Monthly consumer price indexes of European countries and the United States of America for the period January 2000 to October 2015. There are 33 indexes, and the column names are the names of the countries. The number of observations is 190.
data("CPIEurope200015")
An object of class "data.frame".
The function estimates the Dynamic Factor Model by Principal Components and by the estimator of Lam et al. (2011).
dfmpc(x, stand = 0, mth = 4, r, lagk = 0)
x |
T by k data matrix: T data points in rows with each row being data at a given time point, and k time series in columns. |
stand |
Data standardization. The default is stand = 0, and x is not transformed. If stand = 1, each column of x has zero mean; if stand = 2, it also has unit variance. |
mth |
Method to estimate the number of factors and the common component (factors and loadings):
|
r |
Number of factors. By default, it is estimated by the Lam and Yao (2012) criterion. |
lagk |
Maximum number of lags considered in the combined matrix. The default is lagk = 3. |
A list with the following items:
r - Estimated number of common factors. If mth = 0, r is given by the user.
F - Estimated common factor matrix (T x r).
L - Estimated loading matrix (k x r).
E - Estimated noise matrix (T x k).
VarF - Proportion of variability explained by the factor and the accumulated sum.
MarmaF - Matrix giving the number of AR, MA, seasonal AR and seasonal MA coefficients for the Factors, plus the seasonal period and the number of non-seasonal and seasonal differences.
MarmaE - Matrix giving the number of AR, MA, seasonal AR and seasonal MA coefficients for the noises, plus the seasonal period and the number of non-seasonal and seasonal differences.
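As a rough illustration of the principal-components part of the estimation, the sketch below computes factors, loadings and noise from the eigenvectors of the sample covariance matrix. It is a textbook construction under stated assumptions, not the package's code: the helper name pc_factors is hypothetical, the number of factors r is supplied by the user, the normalization may differ from that of dfmpc, and the Lam et al. (2011) estimator is not shown.

```r
# Hypothetical helper (not part of SLBDD): principal-component estimate
# of a factor model x = Fhat %*% t(L) + E with r factors.
pc_factors <- function(x, r) {
  x <- scale(as.matrix(x), center = TRUE, scale = FALSE)
  eig <- eigen(crossprod(x) / nrow(x), symmetric = TRUE)
  L <- eig$vectors[, seq_len(r), drop = FALSE]  # k x r loadings
  Fhat <- x %*% L                               # T x r factors
  E <- x - Fhat %*% t(L)                        # T x k noise
  VarF <- eig$values[seq_len(r)] / sum(eig$values)
  list(F = Fhat, L = L, E = E, VarF = VarF)
}

set.seed(4)
f <- as.numeric(arima.sim(model = list(ar = 0.8), n = 200))  # one latent factor
x <- outer(f, runif(10, 0.5, 1.5)) + matrix(rnorm(200 * 10, sd = 0.5), 200, 10)
fit <- pc_factors(x, r = 1)
dim(fit$F); dim(fit$L); dim(fit$E)
```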
Ahn, S. C. and Horenstein, A. R. (2013). Eigenvalue ratio test for the number of factors. Econometrica, 81(3):1203–1227.
Bai, J. and Ng, S. (2002). Determining the number of factors in approximate factor models. Econometrica, 70(1):191–221.
Caro, A. and Peña, D. (2020). A test for the number of factors in dynamic factor models. UC3M Working papers. Statistics and Econometrics.
Lam, C. and Yao, Q. (2012). Factor modeling for high-dimensional time series: inference for the number of factors. The Annals of Statistics, 40(2):694–726.
Lam, C., Yao, Q., and Bathia, N. (2011). Estimation of latent factors for high-dimensional time series. Biometrika, 98(4):901–918.
data(TaiwanAirBox032017)
dfm1 <- dfmpc(as.matrix(TaiwanAirBox032017[1:100,1:30]), mth=4)
R command to set up the training and forecasting data for deep learning.
DLdata(x, forerate = 0.2, locY = 1, lag = 1)
x |
T by k data matrix: T data points in rows and k time series in columns. |
forerate |
Fraction of sample size to form the forecasting (or testing) sample. |
locY |
Locator for the dependent variable. |
lag |
Number of lags to be used to form predictors. |
A list containing:
Xtrain - Standardized predictors matrix.
Ytrain - Dependent variable in training sample.
Xtest - Predictor in testing sample, standardized according to Xtrain.
Ytest - Dependent variable in the testing sample.
nfore - Number of forecasts.
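The kind of data preparation described above can be sketched in base R. The helper below is hypothetical (make_dl_data is not a package function) and makes several assumptions: lags 1 to lag of all series are used as predictors, the last round(T * forerate) rows form the testing sample, and the testing predictors are standardized with the training mean and standard deviation.

```r
# Hypothetical helper (not part of SLBDD): lagged predictors with a
# train/test split; the test predictors reuse the training mean and sd.
make_dl_data <- function(x, forerate = 0.2, locY = 1, lag = 1) {
  x <- as.matrix(x)
  k <- ncol(x)
  emb <- embed(x, lag + 1)               # row t: x[t, ], x[t-1, ], ..., x[t-lag, ]
  y <- emb[, locY]                       # current value of the dependent series
  X <- emb[, -seq_len(k), drop = FALSE]  # lags 1..lag of all k series
  nfore <- round(nrow(x) * forerate)
  itrain <- seq_len(nrow(emb) - nfore)
  mu <- colMeans(X[itrain, , drop = FALSE])
  s <- apply(X[itrain, , drop = FALSE], 2, sd)
  Xs <- scale(X, center = mu, scale = s)
  list(Xtrain = Xs[itrain, , drop = FALSE], Ytrain = y[itrain],
       Xtest = Xs[-itrain, , drop = FALSE], Ytest = y[-itrain],
       nfore = nfore)
}

set.seed(5)
x <- matrix(rnorm(100 * 3), nrow = 100, ncol = 3)
d <- make_dl_data(x, forerate = 0.2, locY = 2, lag = 2)
dim(d$Xtrain); dim(d$Xtest)
```

Standardizing the testing predictors with training statistics avoids leaking information from the forecasting sample into the training step.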
x <- matrix(rnorm(70000), nrow=700, ncol=100)
m1 <- DLdata(x, forerate=c(200/nrow(x)), lag=6, locY=6)
Random draw of polynomial coefficients for stationary AR models or invertible MA models. The resulting polynomial has all its roots outside the unit circle.
draw.coef(deg, delta = 0.02)
deg |
Degree of the polynomial. Maximum degree is 5. |
delta |
The minimum distance of a polynomial root from the boundary 1 or -1. The default is 0.02. |
The coefficients of the drawn polynomial.
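One way to obtain coefficients whose polynomial has all roots outside the unit circle is to draw the roots first and then expand the product. The sketch below does this for real roots only. It is an illustration, not the package's algorithm: the helper name draw_poly_coef is hypothetical, the sign convention 1 - c_1 z - ... - c_deg z^deg is an assumption, and delta is read here as a margin on the root modulus.

```r
# Hypothetical helper (not part of SLBDD): draw real roots with modulus
# above 1 + delta and expand prod(1 - z / root) into
# 1 - c_1 z - ... - c_deg z^deg.
draw_poly_coef <- function(deg, delta = 0.02) {
  roots <- sample(c(-1, 1), deg, replace = TRUE) * (1 + delta + rexp(deg))
  poly <- 1                                   # coefficients in increasing powers
  for (r in roots) poly <- c(poly, 0) - c(0, poly) / r  # multiply by (1 - z / r)
  -poly[-1]
}

set.seed(6)
cf <- draw_poly_coef(3)
Mod(polyroot(c(1, -cf)))   # moduli of the roots of the resulting polynomial
```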
draw.coef(2)
Plot the observed time series and selected empirical dynamic quantiles (EDQs) computed as in Peña, Tsay and Zamar (2019).
edqplot( x, prob = c(0.05, 0.5, 0.95), h = 30, loc = NULL, color = c("yellow", "red", "green") )
x |
T by k data matrix: T data points in rows with each row being data at a given time point, and k time series in columns. |
prob |
Probability, the quantile series of which is to be computed. Default values are 0.05, 0.5, 0.95. |
h |
Number of time series used in the algorithm. Default value is 30. |
loc |
Locations of the EDQ. If loc is not null, then prob is not used. |
color |
Colors for plotting the EDQ. Default is "yellow", "red", and "green". |
The observed time series plot with the selected EDQs.
Peña, D., Tsay, R., and Zamar, R. (2019). Empirical dynamic quantiles for visualization of high-dimensional time series. Technometrics, 61(4):429–444.
data(TaiwanAirBox032017)
edqplot(TaiwanAirBox032017[1:100,1:25])
Compute empirical dynamic quantile (EDQ) for a given probability "p" based on the weighted algorithm proposed in the article by Peña, Tsay and Zamar (2019).
edqts(x, p = 0.5, h = 30)
x |
T by k data matrix: T data points in rows with each row being data at a given time point, and k time series in columns. |
p |
Probability, the quantile series of which is to be computed. Default value is 0.5. |
h |
Number of time series observations used in the algorithm. Larger values of h take longer to compute. Default value is 30. |
The column of the matrix x which stores the "p" EDQ of interest.
Peña, D., Tsay, R., and Zamar, R. (2019). Empirical dynamic quantiles for visualization of high-dimensional time series. Technometrics, 61(4):429–444.
data(TaiwanAirBox032017)
edqts(TaiwanAirBox032017[,1:25])
Data obtained from the Federal Reserve Bank, after processing to remove missing values.
data("FREDMDApril19")
An object of class "data.frame".
This function computes the gap and the number of groups using the gap statistic.
gap.clus(DistanceMatrix, Clusters, B)
DistanceMatrix |
Square matrix of GCC distances. |
Clusters |
Matrix of member labels. |
B |
Number of iterations for the bootstrap. |
A list containing:
- optim.k: number of groups.
- gap.values: gap values.
Alonso, A. M. and Peña, D. (2019). Clustering time series by linear dependency. Statistics and Computing, 29(4):655–676.
data(TaiwanAirBox032017)
library(TSclust)
z <- diff(as.matrix(TaiwanAirBox032017[1:50,1:8]))
Macf <- as.matrix(diss(t(z), METHOD = "ACF", lag.max = 5))
sc1 <- hclust(as.dist(Macf), method = "complete")
memb <- cutree(sc1, 1:8)
ngroups <- gap.clus(Macf, memb, 100)
Clustering of time series using the Generalized Cross Correlation (GCC) measure of linear dependency proposed in Alonso and Peña (2019).
GCCclus(x, lag, rs, thres, plot, printSummary = TRUE, lag.set, silh = 1)
x |
T by k data matrix: T data points in rows with each row being data at a given time point, and k time series in columns. |
lag |
Selected lag for computing the GCC between the pairs of series. Default value is computed inside the program. |
rs |
Relative size of the minimum group considered. Default value is 0.05. |
thres |
Percentile in the distribution of distances that define observations that are not considered outliers. Default value is 0.9. |
plot |
If the value is TRUE, a clustermatrix plot of distances and a dendrogram are presented. Default is FALSE. |
printSummary |
If the value is TRUE, the function prints a summary table of the clustering. Default is TRUE. |
lag.set |
If lag is not specified and the user wants to use a non-consecutive set of lags instead of lags 1 to 'lag', the set can be given as, for example, lag.set = c(1, 4, 7). |
silh |
If silh = 1, the standard silhouette statistic is used; if silh = 2, the modified procedure is used. Default value is 1. |
First, the matrix of Generalized Cross Correlation (GCC) is built using the subroutine GCCmatrix. Then a hierarchical grouping is constructed, and the number of clusters is selected by either the silhouette statistic or a modified silhouette statistic. The modified silhouette procedure is as follows:
(1) Series that join the groups at a distance larger than a given threshold of the distribution of the distances are disregarded.
(2) A minimum size for the groups is defined by rs (relative size); groups smaller than rs are disregarded.
(3) The final groups are obtained in two steps:
First, the silhouette statistic is applied to the set of time series that satisfy conditions (1) and (2).
Second, the series disregarded in steps (1) and (2) are candidates to be assigned to their closest group. The median and the MAD of the group are used to check whether the series is an outlier with respect to the group. If it is an outlier, it is included in group 0 of outlier series. The distance between a series and a group is usually the distance to the closest series in the group (single linkage), but it could be the distance to the mean of the group.
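The standard silhouette step (silh = 1) can be sketched in base R. The code below is only an illustration: avg_silhouette is a hypothetical helper, the distance 1 - |correlation| is used as a stand-in for the GCC distance, average linkage is an assumption, and the modified procedure with outlier handling is not reproduced.

```r
# Hypothetical helper (not part of SLBDD): average silhouette width
# for a distance matrix d and a vector of cluster labels.
avg_silhouette <- function(d, labels) {
  d <- as.matrix(d)
  s <- numeric(nrow(d))
  for (i in seq_len(nrow(d))) {
    own <- labels == labels[i]
    if (sum(own) == 1) next                   # singleton: silhouette stays 0
    a <- sum(d[i, own]) / (sum(own) - 1)      # mean distance within own cluster
    b <- min(tapply(d[i, !own], labels[!own], mean))  # nearest other cluster
    s[i] <- (b - a) / max(a, b)
  }
  mean(s)
}

set.seed(7)
f1 <- rnorm(200); f2 <- rnorm(200)            # two independent common factors
x <- cbind(sapply(1:4, function(i) f1 + rnorm(200, sd = 0.3)),
           sapply(1:4, function(i) f2 + rnorm(200, sd = 0.3)))
d <- as.dist(1 - abs(cor(x)))                 # stand-in for the GCC distance
tree <- hclust(d, method = "average")
ks <- 2:5
sil <- sapply(ks, function(k) avg_silhouette(d, cutree(tree, k)))
best_k <- ks[which.max(sil)]
best_k
```

With two well-separated groups of four series each, the average silhouette should peak at two clusters.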
A list containing:
- Table of number of clusters found and number of observations in each cluster. Group 0 indicates the outlier group in the case it exists.
- sal: A list with four objects
labels: assignments of time series to each of the groups.
groups: is a list of matrices. Each matrix corresponds to the set of time series that make up each group. For example, $groups[[i]] contains the set of time series that belong to the ith group.
matrix: GCC distance matrix.
gmatrix: GCC distance matrices in each group.
Two plots are included: (1) a clustermatrix plot with the distances inside each group in the diagonal boxes and the distances between series in two groups in the off-diagonal boxes, and (2) the dendrogram.
Alonso, A. M. and Peña, D. (2019). Clustering time series by linear dependency. Statistics and Computing, 29(4):655–676.
data(TaiwanAirBox032017)
output <- GCCclus(TaiwanAirBox032017[1:50,1:8])
Build the GCC similarity matrix between time series proposed in Alonso and Peña (2019).
GCCmatrix(x, lag, model, lag.set)
x |
T by k data matrix: T data points in rows with each row being data at a given time point, and k time series in columns. |
lag |
Selected lag for computing the GCC between the pairs of series. Default value is computed inside the program. |
model |
Model specification. When the value lag is unknown for the user, the model specification is chosen between GARCH model or AR model. Default is ARMA model. |
lag.set |
If lag is not specified and the user wants to use a non-consecutive set of lags instead of lags 1 to 'lag', the set can be given as, for example, lag.set = c(1, 4, 7). |
A list containing:
DM - A matrix object with the distance matrix.
k_GCC - The lag used to calculate GCC measure.
Alonso, A. M. and Peña, D. (2019). Clustering time series by linear dependency. Statistics and Computing, 29(4):655–676.
data(TaiwanAirBox032017)
output <- GCCmatrix(TaiwanAirBox032017[,1:3])
GDP of the United States, United Kingdom, France, Australia, Germany, and Canada. Quarterly data from 1980 to 2018. The original data were downloaded from FRED (Federal Reserve Bank of St Louis). The GDP is based on expenditures.
data(gdpsimple6c8018)
An object of class "data.frame".
Plot a selected time series using timewise quantiles as the background.
i.plot(x, idx = 1, prob = c(0.25, 0.5, 0.75), xtime = NULL)
x |
T by k data matrix: T data points in rows with each row being data at a given time point, and k time series in columns. |
idx |
Selected time series. |
prob |
Probability, the quantile series of which is to be computed. Default values are 0.25, 0.5, 0.75. |
xtime |
A vector with the values for the x labels. Default values are 1, 2, 3, ... |
standardized - Matrix containing the standardized time series.
data(TaiwanAirBox032017)
output <- i.plot(TaiwanAirBox032017[,1:3])
Use sum of absolute deviations to select the individual time series that is closest to a given timewise quantile series.
i.qplot(x, prob = 0.5, box = 3, xtime = NULL)
x |
T by k data matrix: T data points in rows with each row being data at a given time point, and k time series in columns. |
prob |
Probability, the quantile series of which is to be computed. Default value is 0.5. |
box |
Number of boxplots for the difference series between the selected series and the timewise quantile with the given probability. The differences are divided into blocks. Default value is 3. |
xtime |
A vector with the values for the x labels. Default values are 1, 2, 3, ... |
A list containing:
standardized - A matrix containing standardized time series.
qts - The timewise quantile of order prob.
selected - The closest time series to the given timewise quantile series.
data(TaiwanAirBox032017)
output <- i.qplot(TaiwanAirBox032017[,1:3])
Use sum of absolute deviations to select the individual time series that is closest to a given timewise quantile series.
i.qrank(x, prob = 0.5)
x |
T by k data matrix: T data points in rows with each row being data at a given time point, and k time series in columns. |
prob |
Probability, the quantile series of which is to be computed. Default value is 0.5. |
A list containing:
standardized - A matrix containing standardized time series.
qts - The timewise quantile of order prob.
ranks - Rank of the individual time series according to the given timewise quantile series.
crit - Sum of absolute deviations of each individual series. Distance of each series to the quantile.
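The computation behind qts, crit and ranks can be sketched in a few lines of base R. This is a simplified illustration: rank_by_quantile is a hypothetical helper, and it skips the standardization step that the function applies before computing the quantile series.

```r
# Hypothetical helper (not part of SLBDD): rank series by their sum of
# absolute deviations from the timewise quantile (no standardization here).
rank_by_quantile <- function(x, prob = 0.5) {
  x <- as.matrix(x)
  qts <- unname(apply(x, 1, quantile, probs = prob))  # quantile at each time point
  crit <- colSums(abs(x - qts))                       # distance of each series
  list(qts = qts, crit = crit, ranks = rank(crit, ties.method = "first"))
}

set.seed(8)
base <- cumsum(rnorm(50))
x <- cbind(base - 1, base, base + 2)   # the second series is the timewise median
res <- rank_by_quantile(x)
res$ranks
```

In this constructed example the second series coincides with the timewise median, so it has zero distance and receives rank 1.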
data(TaiwanAirBox032017)
output <- i.qrank(TaiwanAirBox032017[,1:3])
Use out-of-sample Root Mean Square Error to select the penalty parameter of LASSO-type linear regression.
Lambda.sel(X, y, newX, newY, family = "gaussian", alpha = 1)
X |
Matrix of predictors of the estimation sample. |
y |
Dependent variables of the estimation sample. |
newX |
Design matrix in the forecasting subsample. |
newY |
Dependent variable in the forecasting subsample. |
family |
Response type. See the glmnet command in R. Possible types are "gaussian", "binomial", "poisson", "multinomial", "cox", "mgaussian". Default is "gaussian". |
alpha |
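The elasticnet mixing parameter, with 0 <= alpha <= 1; alpha = 1 is the lasso penalty and alpha = 0 the ridge penalty. See the glmnet command in R. Default value is 1. |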
A list containing:
lambda.min - lambda that achieves the minimum mean square error.
beta - estimated coefficients for lambda.min.
mse - mean squared error.
lambda - the actual sequence of lambda values used.
X <- cbind(rnorm(200), rnorm(200,2,1), rnorm(200,4,1))
y <- rnorm(200)
newX <- cbind(rnorm(200), rnorm(200,2,1), rnorm(200,4,1))
newy <- rnorm(200)
output <- Lambda.sel(X, y, newX, newy)
Latitude and longitude of the Air Boxes used in TaiwanAirBox032017.
data(locations032017)
An object of class "data.frame".
Chen, L.J. et al. (2017). Open framework for participatory PM2.5 monitoring in smart cities. IEEE Access Journal, Vol. 5, pp. 14441-14454.
Based on market capitalization. The first 5 deciles (the smallest 10 percent, the next 10 percent, etc.). The time span is from January 1962 to December 2018. The data are from CRSP (Center for Research in Security Prices). The first column is calendar time.
data("mdec1to5")
An object of class "data.frame".
The original data are from FRED (Federal Reserve Bank of St Louis) and the unit is US Dollars. The time span is from January 1992 to December 2018.
data("mexpimpcnus")
An object of class "data.frame".
Plot multiple time series in one frame and return standardized time series.
mts.plot(x, title = "mts plot", scaling = TRUE, xtime = NULL)
x |
T by k data matrix: T data points in rows with each row being data at a given time point, and k time series in columns. |
title |
Character with the title of the plot. Default title is "mts plot". |
scaling |
If scaling = TRUE (default), then each series is standardized based on its own range. If scaling = FALSE, then the original series is used. |
xtime |
A vector with the values for the x labels. Default values are 1, 2, 3, ... |
standardized - Matrix containing the standardized time series.
data(TaiwanAirBox032017)
output <- mts.plot(TaiwanAirBox032017[,1:5])
Plot timewise quantiles in one frame.
mts.qplot( x, title = "mts quantile plot", prob = c(0.25, 0.5, 0.75), scaling = TRUE, xtime = NULL, plot = TRUE )
x |
T by k data matrix: T data points in rows with each row being data at a given time point, and k time series in columns. |
title |
Character with the title of the plot. Default title is "mts quantile plot". |
prob |
Probability, the quantile series of which is to be computed. Default values are 0.25, 0.5, 0.75. |
scaling |
If scaling = TRUE (default), then each series is standardized based on its own range. If scaling = FALSE, then the original series is used. |
xtime |
A vector with the values for the x labels. Default values are 1, 2, 3, ... |
plot |
Receives TRUE or FALSE values. If the value is TRUE, a quantile plot is presented. Default is TRUE. |
A list containing:
standardized - A matrix containing standardized time series.
qseries - Matrix of timewise quantiles series of order prob.
data(TaiwanAirBox032017)
output <- mts.qplot(TaiwanAirBox032017[,1:5])
Use an upper and a lower timewise quantile series to highlight the possible outliers in a collection of time series.
outlier.plot(x, prob = 0.05, percent = 0.05, xtime = NULL)
x |
T by k data matrix: T data points in rows with each row being data at a given time point, and k time series in columns. |
prob |
Tail probability. That is, the two quantile series are those of orders prob and 1-prob. prob is restricted to be in (0,0.15). Default value is 0.05. |
percent |
The number of possible outliers in each side is T*k*prob*percent. |
xtime |
A vector with the values for the x labels. Default values are 1, 2, 3, ... |
A list containing:
standardized - A matrix containing standardized time series.
qts - The timewise quantile of order prob.
minseries - The timewise minimum of the standardized time series.
maxseries - The timewise maximum of the standardized time series.
data(TaiwanAirBox032017)
output <- outlier.plot(TaiwanAirBox032017[,1:3])
Use LASSO estimation to identify outliers in a set of time series by creating dummy variables for every time point.
outlierLasso( zt, p = 12, crit = 3.5, family = "gaussian", standardize = TRUE, alpha = 1, jend = 3 )
zt |
T by 1 vector of an observed scalar time series without missing values. |
p |
Seasonal period. Default value is 12. |
crit |
Criterion. Default is 3.5. |
family |
Response type. See the glmnet command in R. Possible types are "gaussian", "binomial", "poisson", "multinomial", "cox", "mgaussian". Default is "gaussian". |
standardize |
Logical flag for zt variable standardization. See the glmnet command in R. Default is TRUE. |
alpha |
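Elasticnet mixing parameter, with 0 <= alpha <= 1; alpha = 1 is the lasso penalty and alpha = 0 the ridge penalty. See the glmnet command in R. Default value is 1. |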
jend |
Number of first and last observations assumed to not be level shift outliers. Default value is 3. |
A list containing:
nAO - Number of additive outliers.
nLS - Number of level shifts.
data(TaiwanAirBox032017)
output <- outlierLasso(TaiwanAirBox032017[1:100,1])
Outlier detection in high dimensional time series by using projections as in Galeano, Peña and Tsay (2006).
outliers.hdts(x, r.max, type)
x |
T by k data matrix: T data points in rows with each row being data at a given time point, and k time series in columns. |
r.max |
The maximum number of factors including stationary and non-stationary. |
type |
The type of series, i.e., 1 if stationary or 2 if nonstationary. |
A list containing:
x.clean - The time series cleaned at the end of the procedure (n x m).
P.clean - The estimate of the loading matrix if the number of factors is positive.
Ft.clean - The estimated dynamic factors if the number of factors is positive.
Nt.clean - The idiosyncratic residuals if the number of factors is positive.
times.idi.out - The times of the idiosyncratic outliers.
comps.idi.out - The components of the noise affected by the idiosyncratic outliers.
sizes.idi.out - The sizes of the idiosyncratic outliers.
stats.idi.out - The statistics of the idiosyncratic outliers.
times.fac.out - The times of the factor outliers.
comps.fac.out - The dynamic factors affected by the factor outliers.
sizes.fac.out - The sizes of the factor outliers.
stats.fac.out - The statistics of the factor outliers.
x.kurt - The time series cleaned in the kurtosis sub-step (n x m).
times.kurt - The outliers detected in the kurtosis sub-step.
pro.kurt - The projection number of the detected outliers in the kurtosis sub-step.
n.pro.kurt - The number of projections leading to outliers in the kurtosis sub-step.
x.rand - The time series cleaned in the random projections sub-step (n x m).
times.rand - The outliers detected in the random projections sub-step.
x.uni - The time series cleaned after the univariate substep (n x m).
times.uni - The vector of outliers detected with the univariate substep.
comps.uni - The components affected by the outliers detected with the univariate substep.
r.rob - The number of factors estimated (1 x 1).
P.rob - The estimate of the loading matrix (m x r.rob).
V.rob - The estimate of the orthonormal complement to P (m x (m - r.rob)).
I.cov.rob - The matrix (V'GnV)^-1 used to compute the statistics to detect the idiosyncratic outliers.
IC.1 - The values of the information criterion of Bai and Ng.
Galeano, P., Peña, D., and Tsay, R. S. (2006). Outlier detection in multivariate time series by projection pursuit. Journal of the American Statistical Association, 101(474), 654-669.
data(TaiwanAirBox032017)
output <- outliers.hdts(as.matrix(TaiwanAirBox032017[1:100,1:3]), r.max = 1, type = 2)
Weekly series of the electricity price for each hour of each day during 678 weeks in the 8 regions of New England. There are 1344 series, each corresponding to one of the seven days, one of the 24 hours, and one of the 8 regions. The first series corresponds to the price in the first region at 1 am CT on Thursday 01/01/2004, the second to 2 am on the same day, and so on. Thus the first 192 series (24 hours x 8 regions) are the prices for all the hours of Thursday in the eight regions, the next 192 are for Friday, and so on. These series were used in the articles Alonso and Peña (2019) and Peña, Tsay and Zamar (2019). The series have been corrected for missing values on the days when the clocks change for daylight saving time.
data(PElectricity1344)
An object of class "data.frame".
Alonso, A. M. and Peña, D. (2019). Clustering time series by linear dependency. Statistics and Computing, 29(4), 655-676.
Peña, D., Tsay, R. S., and Zamar, R. (2019). Empirical dynamic quantiles for visualization of high-dimensional time series. Technometrics, 61(4), 429-444.
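The column layout described above can be turned into simple index arithmetic. The sketch below uses a hypothetical helper, series_position(), which is not part of the package. It assumes that hours vary fastest, then regions, then days, which is one reading of the description and should be checked against the column names of the data.

```r
# Hypothetical helper: map a column index (1..1344) to day, region and hour,
# assuming hours vary fastest, then regions, then days (starting on Thursday).
series_position <- function(idx) {
  stopifnot(all(idx >= 1), all(idx <= 1344))
  days <- c("Thursday", "Friday", "Saturday", "Sunday",
            "Monday", "Tuesday", "Wednesday")
  zero <- idx - 1                      # switch to 0-based arithmetic
  data.frame(
    index  = idx,
    day    = days[zero %/% 192 + 1],   # 192 = 24 hours x 8 regions per day
    region = (zero %% 192) %/% 24 + 1,
    hour   = zero %% 24 + 1
  )
}

series_position(c(1, 2, 192, 193, 1344))
```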
Boxplots of selected quantiles of each time series (Column-wise operations).
quantileBox(x, prob = c(0.25, 0.5, 0.75))
x |
T by k data matrix: T data points in rows with each row being data at a given time point, and k time series in columns. |
prob |
Probabilities of the quantiles to be computed for each series. Default values are 0.25, 0.5, 0.75. |
Boxplot.
data(TaiwanAirBox032017)
quantileBox(TaiwanAirBox032017[, 1:3])
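For readers without the package at hand, the column-wise operation can be approximated in base R. This is a minimal sketch on simulated data and not the package's implementation. The matrix q and the axis labels are illustrative choices.

```r
set.seed(1)
x <- matrix(rnorm(200 * 6), nrow = 200, ncol = 6)   # simulated T = 200, k = 6
prob <- c(0.25, 0.5, 0.75)

# One row per series, one column per requested quantile.
q <- t(apply(x, 2, quantile, probs = prob))
colnames(q) <- paste0("q", prob)

# One box per quantile level, summarising its spread across the k series.
boxplot(q, xlab = "Quantile level", ylab = "Value")
```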
R command to set up the input and output for a Recurrent Neural Network. It is used in the Wiley book Statistical Learning with Big Dependent Data by Daniel Peña and Ruey S. Tsay (2021).
rnnStream(z, h = 25, nfore = 200)
z |
Input in integer values. |
h |
Number of lags used as input. |
nfore |
Data points in the testing subsample. |
A list containing:
Xfit - Predictor in training sample (binary).
Yfit - Dependent variable in the training sample (binary).
yp - Dependent variable in testing sample.
Xp - Predictor in the testing sample (binary).
X - Predictor in the training sample.
yfit - Dependent variable in the training sample.
newX - Predictor in the testing sample.
output <- rnnStream(rnorm(100), h=5, nfore=20)
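The roles of h and nfore can be illustrated with a lagged design matrix built in base R. This sketch only shows the lagging and the training/testing split. It does not reproduce the binary encoding returned by rnnStream, and the object names are illustrative.

```r
set.seed(2)
z <- rpois(100, lambda = 3)   # simulated integer-valued input
h <- 5                        # number of lags used as predictors
nfore <- 20                   # size of the testing subsample

# embed() returns rows (z[t], z[t-1], ..., z[t-h]) for t = h + 1, ..., length(z).
E <- embed(z, h + 1)
y <- E[, 1]
X <- E[, -1, drop = FALSE]

# Hold out the last nfore rows for testing.
train <- seq_len(nrow(E) - nfore)
X_train <- X[train, , drop = FALSE]
y_train <- y[train]
X_test <- X[-train, , drop = FALSE]
y_test <- y[-train]
```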
Auto-model specification of a scalar seasonal time series. The period should be given.
sarimaSpec( zt, maxorder = c(2, 1, 3), maxsea = c(1, 1, 1), criterion = "bic", period = 12, output = FALSE, method = "CSS-ML", include.mean = TRUE )
zt |
T by 1 vector of an observed scalar time series without missing values. |
maxorder |
Maximum order of |
maxsea |
Maximum order of |
criterion |
Information criterion used for model selection. Either AIC or BIC. Default is "bic". |
period |
Seasonal period. The default is 12. |
output |
If TRUE it returns the differencing order, the selected order and the minimum value of the criterion. Default is FALSE. |
method |
Estimation method. See the arima command in R. Possible values are "CSS-ML", "ML", and "CSS". Default is "CSS-ML". |
include.mean |
Should the model include a mean/intercept term? Default is TRUE. |
ADF unit-root test is used to assess seasonal and regular differencing. For seasonal unit-root test, critical value associated with pv = 0.01 is used.
A list containing:
data - The time series. If any transformation is taken, "data" is the transformed series.
order - Regular ARIMA order.
sorder - Seasonal ARIMA order.
period - Seasonal period.
include.mean - Switch about including mean in the model.
data(TaiwanAirBox032017)
output <- sarimaSpec(TaiwanAirBox032017[1:100, 1])
Scatterplot of two selected-lag ACFs.
scatterACF(x, lags = c(1, 2))
x |
T by k data matrix: T data points in rows with each row being data at a given time point, and k time series in columns. |
lags |
Set of lags. Default values are 1, 2. |
A list containing:
acf1 - Autocorrelation function of order lags[1].
acf2 - Autocorrelation function of order lags[2].
data(TaiwanAirBox032017)
output <- scatterACF(TaiwanAirBox032017[, 1:100])
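The same kind of plot can be sketched with stats::acf on simulated data. The helper acf_at() below is hypothetical, and the code is an illustration of the idea rather than the package's code.

```r
set.seed(3)
# Simulated T = 200 by k = 8 matrix of AR(1) series with different coefficients.
phi <- seq(-0.7, 0.7, length.out = 8)
x <- sapply(phi, function(p) as.numeric(arima.sim(list(ar = p), n = 200)))
lags <- c(1, 2)

# acf()$acf stores lag 0 first, so lag l sits at position l + 1.
acf_at <- function(series, lags) {
  a <- acf(series, lag.max = max(lags), plot = FALSE)$acf
  a[lags + 1]
}
r <- t(apply(x, 2, acf_at, lags = lags))   # k rows, one column per lag

plot(r[, 1], r[, 2], xlab = "ACF at lag 1", ylab = "ACF at lag 2")
```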
To be used after the command "SummaryModel". The input "M" must be an output from "SummaryModel".
Selected time series of a given order.
SelectedSeries(M, order = c(1, 0, 1))
M |
Matrix that is an output from "SummaryModel" command, that is, M1, M2 or M3. |
order |
Specification of the non-seasonal part of the ARIMA model:
the three integer components |
The number of series with the given order and the names of the resulting series.
data(TaiwanAirBox032017)
outputSummaryModel <- SummaryModel(TaiwanAirBox032017[, 1:3])
SelectedSeries(outputSummaryModel$M1, order = c(2, 0, 0))
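The selection step amounts to matching rows of an order matrix against a target order. The sketch below uses a hypothetical matrix M with made-up series names and row names. The actual layout of the matrices returned by SummaryModel may differ.

```r
# Hypothetical order matrix: one row per series, columns p, d, q.
M <- matrix(c(2, 0, 0,
              1, 0, 1,
              2, 0, 0,
              0, 0, 1),
            ncol = 3, byrow = TRUE,
            dimnames = list(c("s1", "s2", "s3", "s4"), c("p", "d", "q")))
target <- c(2, 0, 0)

# Keep the rows whose (p, d, q) equal the requested order.
hit <- apply(M, 1, function(row) all(row == target))
selected <- list(count = sum(hit), names = rownames(M)[hit])
selected
```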
Find the number of clusters using the standard Silhouette statistic. The clustering is hierarchical.
silh.clus(nClus, distanceMatrix, method)
nClus |
Maximum number of groups. |
distanceMatrix |
Matrix of distances. |
method |
Hierarchical clustering method: "single", "average", or "complete". |
A list containing:
nClus - Number of groups
list - Silhouette statistics for each value of nclus.
data(TaiwanAirBox032017)
output_gcc <- GCCmatrix(TaiwanAirBox032017[1:100, 1:10])
output <- silh.clus(nClus = 3, distanceMatrix = output_gcc$DM, method = "complete")
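The idea of choosing the number of groups by the average silhouette width can be sketched with stats::hclust and cluster::silhouette. This assumes the recommended package cluster is installed. The simulated points and the Euclidean distance stand in for a distance matrix such as the one returned by GCCmatrix, and the code is not the package's implementation.

```r
set.seed(4)
# Simulated data with three well-separated groups of 10 points each.
pts <- rbind(matrix(rnorm(20, mean = 0), ncol = 2),
             matrix(rnorm(20, mean = 10), ncol = 2),
             matrix(rnorm(20, mean = 20), ncol = 2))
d <- dist(pts)
hc <- hclust(d, method = "complete")

# Average silhouette width for each candidate number of groups (2, ..., max_k).
max_k <- 5
avg_sil <- sapply(2:max_k, function(k) {
  s <- cluster::silhouette(cutree(hc, k = k), d)
  mean(s[, "sil_width"])
})
names(avg_sil) <- 2:max_k

# The number of groups with the largest average silhouette width.
best_k <- as.integer(names(avg_sil)[which.max(avg_sil)])
```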
Generate a unit-root ARIMA, possibly seasonal, time series.
sim.urarima( T = 300, ar = c(0.5), ma = c(-0.5), d = 1, sar = NULL, sma = NULL, D = 0, period = 12, ini = 200, df = 50 )
T |
Number of observations. |
ar |
Vector with the autoregressive coefficients. Default value is 0.5. |
ma |
Vector with the moving average coefficients. Default value is -0.5. |
d |
Order of first-differencing. Default value is 1. |
sar |
Seasonal autoregressive coefficients. Default is NULL. |
sma |
Seasonal moving average coefficients. Default is NULL. |
D |
Order of seasonal differencing. Default value is 0. |
period |
Seasonal period. Default value is 12. |
ini |
Length of ‘burn-in’ period. Default value is 200. |
df |
If df |
A time series vector.
x <- sim.urarima()
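A comparable non-seasonal unit-root series can be generated with stats::arima.sim, shown here as a rough point of comparison. The coefficients are illustrative, and the sign convention for the moving average part in sim.urarima may differ from the one used by arima.sim.

```r
set.seed(5)
# ARIMA(1, 1, 1): stats::arima.sim() writes the MA polynomial as (1 + theta B),
# which may differ in sign from the convention used by sim.urarima().
x <- arima.sim(model = list(order = c(1, 1, 1), ar = 0.5, ma = 0.3),
               n = 300, n.start = 200)

# The level series wanders; its first difference is a stationary ARMA(1, 1).
w <- diff(x)
op <- par(mfrow = c(2, 1))
plot(x, main = "Simulated unit-root series")
plot(w, main = "First difference")
par(op)
```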
To be used after the command "sSummaryModel". The input "M" must be output from "sSummaryModel".
Selected seasonal time series of a given order.
sSelectedSeries(M, order = c(0, 1, 1, 0, 1, 1))
M |
Matrix that is an output from "sSummaryModel" command, that is, M1, M2, M3, M4, M5, or M6. |
order |
order of ARIMA model |
A list with the series names and count.
data(TaiwanAirBox032017)
outputSummaryModel <- sSummaryModel(TaiwanAirBox032017[, 1:3])
sSelectedSeries(outputSummaryModel$M1)
Models specified by "sarimaSpec".
sSummaryModel( x, maxorder = c(3, 1, 2), period = 12, criterion = "bic", method = "CSS" )
x |
T by k data matrix: T data points in rows with each row being data at a given time point, and k time series in columns. |
maxorder |
Maximum order of ARIMA model |
period |
Seasonal period. The default is 12. |
criterion |
Information criterion used for model selection. Either AIC or BIC. Default is "bic". |
method |
Estimation method. See the arima command in R. Possible values are "CSS-ML", "ML", and "CSS". Default is "CSS". |
A list containing:
Order - Order of ARIMA model of each series. A matrix of (ncol(x),6). The six columns are "p","d","q", "P", "D", "Q".
Mean - A logical vector indicating whether each series needs a constant (or mean).
M1 - Contains orders of the stationary series.
M2 - Contains orders of series with (d=1) and (D=0).
M3 - Contains orders of series with (d=2) and (D=0).
M4 - Contains orders of series with (d=0) and (D=1).
M5 - Contains orders of series with (d=1) and (D=1).
M6 - Contains orders of series with (d=2) and (D=1).
data(TaiwanAirBox032017)
summary <- sSummaryModel(TaiwanAirBox032017[, 1:3])
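The grouping of series by regular and seasonal differencing orders (d, D) can be sketched on a hypothetical order matrix. The matrix, its row names and the group labels are made up for illustration and do not come from the package.

```r
# Hypothetical order matrix with columns p, d, q, P, D, Q (one row per series).
Order <- matrix(c(1, 0, 0, 0, 0, 0,
                  0, 1, 1, 0, 0, 0,
                  2, 0, 1, 0, 1, 1,
                  0, 1, 1, 0, 1, 1,
                  1, 1, 0, 0, 0, 0),
                ncol = 6, byrow = TRUE,
                dimnames = list(paste0("s", 1:5),
                                c("p", "d", "q", "P", "D", "Q")))

# Label each series by its differencing pattern, then split the series names.
pattern <- paste0("d=", Order[, "d"], ",D=", Order[, "D"])
groups <- split(rownames(Order), pattern)
groups
```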
To compute and plot the observed and simulated distances for measuring similarity between time series. The distance can be computed using ACF, PACF, AR-coefficients, or Periodogram.
stepp( x, M = 100, lmax = 5, alpha = 0.95, dismethod = "ACF", clumethod = "complete" )
x |
T by k data matrix: T data points in rows with each row being data at a given time point, and k time series in columns. |
M |
Number of simulation realizations. Default value is 100. |
lmax |
Number of lags used (for ACF, PACF, AR-coefficient). Default value is 5. |
alpha |
Quantile used in the plotting. Default value is 0.95. |
dismethod |
Summary statistics of each time series to be used in computing distance. Choices include “ACF”, “PACF”, “AR.PIC” and “PER”. Default is "ACF". |
clumethod |
Hierarchical clustering method: choices include “single”, “average”, and “complete”. Default is “complete”. |
The Empirical Dynamic Quantile of the series is obtained, a set of T x k series is generated and the heights in the dendrogram are obtained. This is repeated M times and the alpha quantile of the results of the M simulations is reported. Both the dendrogram's heights and the steps (differences) of these heights are compared.
Two plots are given in output:
The first plot shows the “height” of the dendrogram. Solid line is the observed height. The points denote the alpha quantile of heights based on the simulated series.
The second plot shows the “step” of the dendrogram (increments of heights). Solid line is the observed increments and the points are those of the selected quantile for the simulated series.
A list containing:
mh - alpha quantile of heights based on the simulated series.
mdh - increments of selected quantile for the simulated series.
hgt - observed height.
hgtincre - observed increments.
Mh - the alpha quantile of the results of the M simulations are reported.
data(TaiwanAirBox032017)
output <- stepp(as.matrix(TaiwanAirBox032017[, 1:50]), M = 2)
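The observed heights and steps can be reproduced in outline with base R. The sketch below covers only the observed part, using an ACF-based Euclidean distance as an assumption. It does not simulate from the Empirical Dynamic Quantile, so no reference quantiles are drawn. The names hgt and hgtincre mirror the output names above.

```r
set.seed(6)
# Simulated T = 150 by k = 10 matrix: five AR(1) series and five white-noise series.
x <- cbind(sapply(1:5, function(i) as.numeric(arima.sim(list(ar = 0.8), n = 150))),
           matrix(rnorm(150 * 5), ncol = 5))
lmax <- 5

# Describe each series by its first lmax sample autocorrelations (lag 0 dropped).
feat <- t(apply(x, 2, function(s) acf(s, lag.max = lmax, plot = FALSE)$acf[-1]))
hc <- hclust(dist(feat), method = "complete")

hgt <- hc$height          # merge heights, in non-decreasing order
hgtincre <- diff(hgt)     # steps between consecutive heights

op <- par(mfrow = c(1, 2))
plot(hgt, type = "l", ylab = "Height")
plot(hgtincre, type = "l", ylab = "Step")
par(op)
```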
Standardized daily stock indexes of the 99 most important financial markets around the world from January 3, 2000, to December 16, 2015, with a total of 4163 observations. The first column contains the dates and the names of the indexes are the column names.
data(Stockindexes99world)
An object of class "data.frame".
Compute and plot summary statistics of cross-correlation matrices (CCM) for high-dimensional time series.
Summaryccm(x, max.lag = 12)
x |
T by k data matrix: T data points in rows with each row being data at a given time point, and k time series in columns. |
max.lag |
The number of lags for CCM. |
A list containing:
pvalue - P-values of Chi-square tests of individual-lag CCM being zero-matrix.
ndiag - Percentage of significant diagonal elements for each lag.
noff - Percentage of significant off-diagonal elements for each lag.
data(TaiwanAirBox032017)
output <- Summaryccm(as.matrix(TaiwanAirBox032017[, 1:4]))
Collects all models specified by "arimaSpec".
SummaryModel(x, maxorder = c(5, 1, 3), criterion = "bic", method = "CSS")
x |
T by k data matrix: T data points in rows with each row being data at a given time point, and k time series in columns. |
maxorder |
Maximum order of |
criterion |
Information criterion used for model selection. Either AIC or BIC. Default is "bic". |
method |
Estimation method. See the arima command in R. Possible values are "CSS-ML", "ML", and "CSS". Default is "CSS". |
A list containing:
Order - Orders of each series. A matrix of (ncol(x),3). The three columns are "p", "d", "q".
Mean - A logical vector indicating whether each series needs a constant (or mean).
M1 - A matrix with three columns (p, 0, q). The number of rows is the number of stationary time series. M1 is NULL if there is no stationary series.
M2 - A matrix with three columns (p, 1, q). The number of rows is the number of first-differenced series. M2 is NULL if there is no first-differenced series.
M3 - A matrix with three columns (p, 2, q). The number of rows is the number of 2nd-differenced series. M3 is NULL if there is no 2nd-differenced series.
data - Time series.
x <- matrix(rnorm(300, mean = 10, sd = 4), ncol = 3, nrow = 100)
summary <- SummaryModel(x)
Use the command "tso" of the R package "tsoutliers" to identify outliers for each individual time series.
SummaryOutliers( x, type = c("LS", "AO", "TC"), tsmethod = "arima", args.tsmethod = list(order = c(5, 0, 0)) )
x |
T by k data matrix: T data points in rows with each row being data at a given time point, and k time series in columns. |
type |
A character vector indicating the type of outlier to be considered by the detection procedure. See 'types' in tso function. |
tsmethod |
The framework for time series modeling. Default is "arima". See 'tsmethod' in tso function. |
args.tsmethod |
An optional list containing arguments to be passed to the function invoking the method selected in tsmethod. See 'args.tsmethod' in tso function. Default value is list(order = c(5, 0, 0)). |
A list containing:
Otable - Summary of various types of outliers detected.
x.cleaned - Outlier-adjusted data.
xadja - T-dimensional vector containing the number of time series that have an outlier at a given time point.
data(TaiwanAirBox032017)
output <- SummaryOutliers(TaiwanAirBox032017[1:50, 1:3])
Hourly PM2.5 measurements constructed from random minute observations collected by AirBox devices in March 2017. There are 744 observations and 516 series.
data(TaiwanAirBox032017)
An object of class "data.frame".
https://sites.google.com/site/cclljj/research/dataset-airbox
Chen, L.J. et al. (2017). Open framework for participatory PM2.5 monitoring in smart cities. IEEE Access Journal, Vol. 5, pp. 14441-14454.
Hourly PM2.5 measurements at 15 monitoring stations in the southern part of Taiwan
from January 1, 2006 to December 31, 2015. The first two columns are
the date and the hour. Missing values are filled using a fixed window around the missing values.
Data for February 29 are removed so that there are 87600 observations in total.
data(TaiwanPM25)
An object of class "data.frame".
Three series with 106 observations (from year 1910 to 2016) of the deviation of November temperatures from their average value in three regions: Europe, North America and South America. The first column contains the years. Units are degrees Celsius.
data(temperatures)
An object of class "data.frame".
https://www.ncdc.noaa.gov/cag/
Find the median of each time series in the time span and obtain the boxplots of the medians.
ts.box(x, maxbox = 200)
x |
T by k data matrix: T data points in rows with each row being data at a given time point, and k time series in columns. |
maxbox |
Maximum number of boxes. Default value is 200. |
Boxplots of the medians of subperiods.
data(TaiwanAirBox032017)
ts.box(as.matrix(TaiwanAirBox032017[, 1:10]), maxbox = 10)
It uses simple linear regression as the weak learner to perform L2 Boosting for time series data.
tsBoost(y, X, v = 0.01, m = 1000, rm.mean = TRUE)
y |
T by 1 scalar dependent variable. |
X |
T by k data matrix of predictors: T data points in rows with each row being data at a given time point, and k time series in columns. |
v |
Learning rate of boosting. Default value is 0.01. |
m |
Maximum number of boosting iterations. Default is 1000. |
rm.mean |
A logical value. Default is TRUE. If rm.mean=TRUE, both the dependent variable and the predictors are mean-adjusted. If rm.mean=FALSE, no mean adjustment is made. |
A list containing:
beta - the estimates of coefficient vector.
residuals - residuals after the boosting fit.
m - the maximum number of boosting iterations (from input).
v - learning rate (from input).
selection - the indexes for selected predictors. That is, the indexes for large beta estimates.
count - the number of selected predictors.
yhat - the fitted value of y.
data(TaiwanAirBox032017)
output <- tsBoost(TaiwanAirBox032017[, 1], TaiwanAirBox032017[, 2])
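To make the roles of v and m concrete, the following is a minimal sketch of componentwise L2 boosting with simple linear regression as the weak learner. The function l2_boost() is hypothetical and written for illustration. It is not the code of tsBoost, it always mean-adjusts the data, and it omits the selection and count outputs.

```r
l2_boost <- function(y, X, v = 0.01, m = 1000) {
  X <- scale(as.matrix(X), center = TRUE, scale = FALSE)  # mean-adjust predictors
  r <- y - mean(y)                                         # mean-adjust response
  beta <- numeric(ncol(X))
  ss <- colSums(X^2)
  for (i in seq_len(m)) {
    b <- as.numeric(crossprod(X, r)) / ss   # slope of each simple regression on r
    j <- which.max(b^2 * ss)                # predictor giving the largest RSS drop
    beta[j] <- beta[j] + v * b[j]           # shrunken update of one coefficient
    r <- r - v * b[j] * X[, j]              # update the current residuals
  }
  list(beta = beta, residuals = r, yhat = y - r)
}

# Simulated example: only predictors 1 and 3 enter the true model.
set.seed(7)
X <- matrix(rnorm(200 * 5), ncol = 5)
y <- 2 * X[, 1] - X[, 3] + rnorm(200, sd = 0.5)
fit <- l2_boost(y, X, v = 0.1, m = 500)
round(fit$beta, 2)
```

A small learning rate v with a large m gives slower but smoother fitting. In practice the number of iterations acts as the regularisation parameter.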
Series of Gross Domestic Product at market prices, Household and NPISH final consumption expenditure and Gross Fixed Capital Formation for the 19 Euro Area member countries, a total of 57 series. The source of the data is Eurostat and the data were extracted on 08-07-2019. The data are seasonally and calendar adjusted and include 76 quarterly observations from Q1-2000 to Q4-2018.
data("UMEdata20002018")
An object of class "data.frame".