Title: Functional Time Series Analysis
Description: Functions for visualizing, modeling, forecasting and hypothesis testing of functional time series.
Authors: Rob Hyndman [aut], Han Lin Shang [aut, cre, cph]
Maintainer: Han Lin Shang <[email protected]>
License: GPL-3
Version: 6.4
Built: 2024-12-01 08:34:21 UTC
Source: CRAN
This package presents descriptive statistics of functional data; implements principal component regression and partial least squares regression to provide point and distributional forecasts for functional data; utilizes functional linear regression, ordinary least squares, penalized least squares, ridge regression and moving block approaches to dynamically update point and distributional forecasts when partial data points in the most recent curve are observed; performs stationarity tests for functional time series; and estimates the long-run covariance function using a kernel sandwich estimator.
Rob J Hyndman and Han Lin Shang
Maintainer: Han Lin Shang <[email protected]>
References in Statistics
R. J. Hyndman and H. L. Shang (2009) "Forecasting functional time series (with discussion)", Journal of the Korean Statistical Society, 38(3), 199-221.
R. J. Hyndman and H. L. Shang (2010) "Rainbow plots, bagplots, and boxplots for functional data", Journal of Computational and Graphical Statistics, 19(1), 29-45.
H. L. Shang and R. J. Hyndman (2011) "Nonparametric time series forecasting with dynamic updating", Mathematics and Computers in Simulation, 81(7), 1310-1324.
H. L. Shang (2011) "rainbow: an R package for visualizing functional time series", The R Journal, 3(2), 54-59.
H. L. Shang (2013) "Functional time series approach for forecasting very short-term electricity demand", Journal of Applied Statistics, 40(1), 152-168.
H. L. Shang (2013) "ftsa: An R package for analyzing functional time series", The R Journal, 5(1), 64-72.
H. L. Shang (2014) "A survey of functional principal component analysis", Advances in Statistical Analysis, 98(2), 121-142.
H. L. Shang (2014) "Bayesian bandwidth estimation for a functional nonparametric regression model with mixed types of regressors and unknown error density", Journal of Nonparametric Statistics, 26(3), 599-615.
H. L. Shang (2014) "Bayesian bandwidth estimation for a semi-functional partial linear regression model with unknown error density", Computational Statistics, 29(3-4), 829-848.
H. L. Shang (2015) "Resampling techniques for estimating the distribution of descriptive statistics of functional data", Communications in Statistics - Simulation and Computation, 44(3), 614- 635.
H. L. Shang (2016) "Mortality and life expectancy forecasting for a group of populations in developed countries: A robust multilevel functional data method", in C. Agostinelli, A. Basu, P. Filzmoser, D. Mukherjee (ed.), Recent Advances in Robust Statistics: Theory and Applications, Springer, India, pp. 169-184.
H. L. Shang (2016) "Mortality and life expectancy forecasting for a group of populations in developed countries: A multilevel functional data method", Annals of Applied Statistics, 10(3), 1639-1672.
H. L. Shang (2016) "A Bayesian approach for determining the optimal semi-metric and bandwidth in scalar-on-function quantile regression with unknown error density and dependent functional data", Journal of Multivariate Analysis, 146, 95-104.
H. L. Shang (2017) "Functional time series forecasting with dynamic updating: An application to intraday particulate matter concentration", Econometrics and Statistics, 1, 184-200.
H. L. Shang (2017) "Forecasting Intraday S&P 500 Index Returns: A Functional Time Series Approach", Journal of Forecasting, 36(7), 741-755.
H. L. Shang and R. J. Hyndman (2017) "Grouped functional time series forecasting: An application to age-specific mortality rates", Journal of Computational and Graphical Statistics, 26(2), 330-343.
G. Rice and H. L. Shang (2017) "A plug-in bandwidth selection procedure for long-run covariance estimation with stationary functional time series", Journal of Time Series Analysis, 38(4), 591-609.
P. Reiss, J. Goldsmith, H. L. Shang and R. T. Ogden (2017) "Methods for scalar-on-function regression", International Statistical Review, 85(2), 228-249.
P. Kokoszka, G. Rice and H. L. Shang (2017) "Inference for the autocovariance of a functional time series under conditional heteroscedasticity", Journal of Multivariate Analysis, 162, 32-50.
Y. Gao, H. L. Shang and Y. Yang (2017) "High-dimensional functional time series forecasting", in G. Aneiros, E. Bongiorno, R. Cao and P. Vieu (ed.), Functional Statistics and Related Fields, Springer, Cham, pp. 131-136.
Y. Gao and H. L. Shang (2017) "Multivariate functional time series forecasting: An application to age-specific mortality rates", Risks, 5(2), Article 21.
H. L. Shang (2018) "Visualizing rate of change: An application to age-specific fertility rates", Journal of the Royal Statistical Society: Series A (Statistics in Society), 182(1), 249-262.
H. L. Shang (2018) "Bootstrap methods for stationary functional time series", Statistics and Computing, 28(1), 1-10.
Y. Gao, H. L. Shang and Y. Yang (2019) "High-dimensional functional time series forecasting: An application to age-specific mortality rates", Journal of Multivariate Analysis, 170, 232-243.
D. Li, P. M. Robinson and H. L. Shang (2020) "Long-range dependent curve time series", Journal of the American Statistical Association: Theory and Methods, 115(530), 957-971.
H. L. Shang (2020) "A comparison of Hurst exponent estimators in long-range dependent curve time series", Journal of Time Series Econometrics, 12(1).
D. Li, P. M. Robinson and H. L. Shang (2021) "Local Whittle estimation of long range dependence for functional time series", Journal of Time Series Analysis, 42(5-6), 685-695.
H. L. Shang and R. Xu (2021) "Functional time series forecasting of extreme values", Communications in Statistics: Case Studies, Data Analysis and Applications, 7(2), 182-199.
U. Beyaztas, H. L. Shang and Z. Yaseen (2021) "Development of functional autoregressive model based exogenous hydrometeorological variables for river flow prediction", Journal of Hydrology, 598, 126380.
U. Beyaztas and H. L. Shang (2022) "Machine learning-based functional time series forecasting: Application to age-specific mortality rates", Forecasting, 4(1), 394-408.
Y. Yang, Y. Yang and H. L. Shang (2022) "Feature extraction for functional time series: Theory and application to NIR spectroscopy data", Journal of Multivariate Analysis, 189, 104863.
A. E. Fernandez, R. Jimenez and H. L. Shang (2022) "On projection methods for functional time series forecasting", Journal of Multivariate Analysis, 189, 104890.
H. L. Shang (2022) "Not all long-memory estimators are born equal: A case of non-stationary curve time series", The Canadian Journal of Statistics, 50(1), 357-380.
X. Huang, H. L. Shang and D. Pitt (2022) "Permutation entropy and its variants for measuring temporal dependence", Australian and New Zealand Journal of Statistics, 64(4), 442-477.
H. L. Shang, J. Cao and P. Sang (2022) "Stopping time detection of wood panel compression: A functional time series approach", Journal of the Royal Statistical Society: Series C, 71(5), 1205-1224.
C. Tang, H. L. Shang and Y. Yang (2022) "Clustering and forecasting multiple functional time series", The Annals of Applied Statistics, 16(4), 2523-2553.
J. Trinka, H. Haghbin, M. Maadooliat and H. L. Shang (2023) "Functional time series forecasting: Functional singular spectrum analysis approaches", Stat, 12(1), e621.
D. Li, P. M. Robinson and H. L. Shang (2023) "Nonstationary fractionally integrated functional time series", Bernoulli, 29(2), 1505-1526.
X. Huang and H. L. Shang (2023) "Nonlinear autocorrelation function of functional time series", Nonlinear Dynamics: An International Journal of Nonlinear Dynamics and Chaos in Engineering Systems, 111, 2537-2554.
H. L. Shang (2023) "Sieve bootstrapping memory parameter in long-range dependent stationary functional time series", AStA Advances in Statistical Analysis, 107, 421-441.
E. Paparoditis and H. L. Shang (2023) "Bootstrap prediction bands for functional time series", Journal of the American Statistical Association: Theory and Methods, 118(542), 972-986.
Y. Gao, H. L. Shang and Y. Yang (2024) "Factor-augmented smoothing model for functional data", Statistica Sinica, 34(1), 1-26.
References in Population Studies
H. L. Shang, H. Booth and R. J. Hyndman (2011) "Point and interval forecasts of mortality rates and life expectancy: a comparison of ten principal component methods", Demographic Research, 25(5), 173-214.
H. L. Shang (2012) "Point and interval forecasts of age-specific fertility rates: a comparison of functional principal component methods", Journal of Population Research, 29(3), 249-267.
H. L. Shang (2012) "Point and interval forecasts of age-specific life expectancies: a model averaging", Demographic Research, 27, 593-644.
H. L. Shang, A. Wisniowski, J. Bijak, P. W. F. Smith and J. Raymer (2014) "Bayesian functional models for population forecasting", in M. Marsili and G. Capacci (eds), Proceedings of the Sixth Eurostat/UNECE Work Session on Demographic Projections, Istituto nazionale di statistica, Rome, pp. 313-325.
H. L. Shang (2015) "Selection of the optimal Box-Cox transformation parameter for modelling and forecasting age-specific fertility", Journal of Population Research, 32(1), 69-79.
H. L. Shang (2015) "Forecast accuracy comparison of age-specific mortality and life expectancy: Statistical tests of the results", Population Studies, 69(3), 317-335.
H. L. Shang, P. W. F. Smith, J. Bijak, A. Wisniowski (2016) "A multilevel functional data method for forecasting population, with an application to the United Kingdom", International Journal of Forecasting, 32(3), 629-649.
H. L. Shang (2017) "Reconciling forecasts of infant mortality rates at national and sub-national levels: Grouped time-series method", Population Research and Policy Review, 36(1), 55-84.
R. J. Hyndman, Y. Zeng and H. L. Shang (2021) "Forecasting the old-age dependency ratio to determine the best pension age", Australian and New Zealand Journal of Statistics, 63(2), 241-256.
Y. Yang and H. L. Shang (2022) "Is the group structure important in grouped functional time series?", Journal of Data Science, 20(3), 303-324.
H. L. Shang and Y. Yang (2022) "Forecasting Australian subnational age-specific mortality rates", Journal of Population Research, 38, 1-24.
Y. Yang, H. L. Shang and J. Raymer (2024) "Forecasting Australian fertility by age, region, and birthplace", International Journal of Forecasting, in press.
References in Actuarial Studies
H. L. Shang and S. Haberman (2017) "Grouped multivariate and functional time series forecasting: An application to annuity pricing", Presented at the Living to 100 Symposium, Orlando Florida, January 4-6, 2017.
H. L. Shang and S. Haberman (2017) "Grouped multivariate and functional time series forecasting: An application to annuity pricing", Insurance: Mathematics and Economics, 75, 166-179.
H. L. Shang and S. Haberman (2018) "Model confidence sets and forecast combination: An application to age-specific mortality", Genus - Journal of Population Sciences, 74, Article number: 19.
H. L. Shang and S. Haberman (2020) "Forecasting multiple functional time series in a group structure: an application to mortality", ASTIN Bulletin, 50(2), 357-379.
H. L. Shang (2020) "Dynamic principal component regression for forecasting functional time series in a group structure", Scandinavian Actuarial Journal, 2020(4), 307-322.
H. L. Shang and S. Haberman (2020) "Forecasting age distribution of death counts: An application to annuity pricing", Annals of Actuarial Science, 14(1), 150-169.
H. L. Shang, S. Haberman and R. Xu (2022) "Multi-population modelling and forecasting age-specific life-table death counts", Insurance: Mathematics and Economics, 106, 239-253.
References in Finance
F. Kearney and H. L. Shang (2020) "Uncovering predictability in the evolution of the WTI oil futures curve", European Financial Management, 26(1), 238-257.
H. L. Shang, K. Ji and U. Beyaztas (2021) "Granger causality of bivariate stationary curve time series", Journal of Forecasting, 40(4), 626-635.
S. Butler, P. Kokoszka, H. Miao and H. L. Shang (2021) "Neural network prediction of crude oil futures using B-splines", Energy Economics, 94, 105080.
H. L. Shang and F. Kearney (2022) "Dynamic functional time series forecasts of foreign exchange implied volatility surfaces", International Journal of Forecasting, 38(3), 1025-1049.
H. L. Shang and K. Ji (2023) "Forecasting intraday financial time series with sieve bootstrapping and dynamic updating", Journal of Forecasting, 42(8), 1973-1988.
Functional time series of the log mortality rates for the female population in each of 3 US states. Each functional time series comprises the ages from 0 to 100+.
data("all_hmd_male_data")
data("all_hmd_male_data")
An n by p matrix with n = 186 observations on p = 101 ages from 0 to 100+.
The matrix contains the functional time series of the US female log mortality rates: 186 curves stacked by rows, corresponding to 62 years times 3 states. Each curve contains 101 functional values.
United States Mortality Database (2023), Department of Demography, University of California, Berkeley. Available at usa.mortality.org (data downloaded on March 15, 2023).
data(all_hmd_female_data)
Functional time series of the log mortality rates for the male population in each of 3 US states. Each functional time series comprises the ages from 0 to 100+.
data("all_hmd_male_data")
data("all_hmd_male_data")
An n by p matrix with n = 186 observations on p = 101 ages from 0 to 100+.
The matrix contains the functional time series of the US male log mortality rates: 186 curves stacked by rows, corresponding to 62 years times 3 states. Each curve contains 101 functional values.
United States Mortality Database (2023), Department of Demography, University of California, Berkeley. Available at usa.mortality.org (data downloaded on March 15, 2023).
data(all_hmd_male_data)
Mean function, variance function, median function and trimmed mean function of functional data
centre(x, type)
x: The functional data to be summarized; in the example below, a data matrix of discretized curves is supplied.
type: Mean, variance, median or trimmed mean?
Returns the mean function, variance function, median function or trimmed mean function.
Han Lin Shang
See also: pcscorebootstrapdata, mean.fts, median.fts, sd.fts, var.fts
# The mean function is often removed in functional principal component analysis.
# The trimmed mean function is sometimes employed for robustness in the presence of outliers.
# In calculating the trimmed mean function, several functional depth measures are employed.
centre(x = ElNino_ERSST_region_1and2$y, type = "mean")
centre(x = ElNino_ERSST_region_1and2$y, type = "var")
centre(x = ElNino_ERSST_region_1and2$y, type = "median")
centre(x = ElNino_ERSST_region_1and2$y, type = "trimmed")
Log-ratio transformation from constrained space to unconstrained space, where a standard nonparametric function-on-function regression can be applied.
CoDa_BayesNW(data, normalization, m = 5001, band_choice = c("Silverman", "DPI"),
  kernel = c("gaussian", "epanechnikov"))
data: Densities or a raw data matrix of dimension N by p, where N denotes the sample size and p denotes the dimensionality.
normalization: Should the data be standardized first?
m: Number of grid points within the data range.
band_choice: Method for selecting the optimal bandwidth ("Silverman" or "DPI").
kernel: Type of kernel function ("gaussian" or "epanechnikov").
1. Compute the geometric mean function.
2. Apply the centered log-ratio transformation.
3. Apply a nonparametric function-on-function regression to the transformed data.
4. Transform the forecasts back to the compositional data.
5. Add back the geometric means to obtain the forecasts of the density function.
Out-of-sample density forecasts
Han Lin Shang
Egozcue, J. J., Diaz-Barrero, J. L. and Pawlowsky-Glahn, V. (2006) ‘Hilbert space of probability density functions based on Aitchison geometry’, Acta Mathematica Sinica, 22, 1175-1182.
Ferraty, F. and Shang, H. L. (2021) ‘Nonparametric density-on-density regression’, working paper.
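For intuition, a minimal base-R sketch of steps 1-2 and the back-transformation in steps 4-5 (the centred log-ratio transform and its inverse) is given below. It uses a hypothetical matrix dens of density values on a common grid and only illustrates the transformation; it is not the CoDa_BayesNW implementation.
set.seed(1)
dens <- matrix(rexp(20 * 50), nrow = 20)         # hypothetical densities: 20 curves, 50 grid points
dens <- dens / rowSums(dens)                     # normalise each row to a composition
geom_mean <- exp(colMeans(log(dens)))            # step 1: geometric mean function
clr <- log(dens) - rowMeans(log(dens))           # step 2: centred log-ratio transformation
back <- exp(clr)                                 # steps 4-5: inverse transformation
back <- back / rowSums(back)                     # recovers the original compositions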
## Not run: 
CoDa_BayesNW(data = DJI_return, normalization = "TRUE", band_choice = "DPI",
  kernel = "epanechnikov")
## End(Not run)
Log-ratio transformation from constrained space to unconstrained space, where a standard functional principal component analysis can be applied.
CoDa_FPCA(data, normalization, h_scale = 1, m = 5001, band_choice = c("Silverman", "DPI"),
  kernel = c("gaussian", "epanechnikov"), varprop = 0.99, fmethod)
data: Densities or a raw data matrix of dimension n by p, where n denotes the sample size and p denotes the dimensionality.
normalization: Should the data be standardized first?
h_scale: Scaling parameter in the kernel density estimator.
m: Number of grid points within the data range.
band_choice: Method for selecting the optimal bandwidth ("Silverman" or "DPI").
kernel: Type of kernel function ("gaussian" or "epanechnikov").
varprop: Proportion of variance explained.
fmethod: Univariate time series forecasting method.
1. Compute the geometric mean function.
2. Apply the centered log-ratio transformation.
3. Apply FPCA to the transformed data.
4. Forecast the principal component scores.
5. Transform the forecasts back to the compositional data.
6. Add back the geometric means to obtain the forecasts of the density function.
Out-of-sample forecast densities
Han Lin Shang
Boucher, M.-P. B., Canudas-Romo, V., Oeppen, J. and Vaupel, J. W. (2017) ‘Coherent forecasts of mortality with compositional data analysis’, Demographic Research, 37, 527-566.
Egozcue, J. J., Diaz-Barrero, J. L. and Pawlowsky-Glahn, V. (2006) ‘Hilbert space of probability density functions based on Aitchison geometry’, Acta Mathematica Sinica, 22, 1175-1182.
See also: Horta_Ziegelmann_FPCA, LQDT_FPCA, skew_t_fun
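A minimal base-R sketch of steps 2-5 (clr transformation, FPCA via the singular value decomposition, a naive score forecast and the inverse transformation) follows. The matrix dens is hypothetical and a random-walk forecast stands in for the fitted univariate model used by CoDa_FPCA.
set.seed(1)
dens <- matrix(rexp(20 * 50), nrow = 20)
dens <- dens / rowSums(dens)                                    # hypothetical compositions
clr <- log(dens) - rowMeans(log(dens))                          # step 2: clr transformation
clr_c <- sweep(clr, 2, colMeans(clr))                           # centre across time
pca <- svd(clr_c)                                               # step 3: FPCA via SVD
scores <- pca$u %*% diag(pca$d)                                 # principal component scores
k <- which(cumsum(pca$d^2) / sum(pca$d^2) >= 0.99)[1]           # retain 99% of variance (varprop)
fore_scores <- scores[nrow(scores), 1:k]                        # step 4: naive (random-walk) forecast
fore_clr <- colMeans(clr) + drop(pca$v[, 1:k] %*% fore_scores)  # forecast clr curve
fore_density <- exp(fore_clr) / sum(exp(fore_clr))              # step 5: back to a density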
## Not run: 
CoDa_FPCA(data = DJI_return, normalization = "TRUE", band_choice = "DPI",
  kernel = "epanechnikov", varprop = 0.9, fmethod = "ETS")
## End(Not run)
Computes differences of an fts object at each variable.
## S3 method for class 'fts'
diff(x, lag = 1, differences = 1, ...)
x: An object of class fts.
lag: An integer indicating which lag to use.
differences: An integer indicating the order of the difference.
...: Other arguments.
An object of class fts.
Rob J Hyndman
# ElNino is an object of sliced functional time series.
# Differencing is sometimes used to achieve stationarity.
diff(x = ElNino_ERSST_region_1and2)
Dow Jones Industrial Average (DJIA) is a stock market index that shows how 30 large publicly owned companies based in the United States have traded during a standard NYSE trading session. We consider monthly cross-sectional returns from April 2004 to December 2017. The data were obtained from the CRSP (Center for Research in Security Prices) database.
data("DJI_return")
data("DJI_return")
A data matrix
Kokoszka, P., Miao, H., Petersen, A. and Shang, H. L. (2019) ‘Forecasting of density functions with an application to cross-sectional and intraday returns’, International Journal of Forecasting, 35(4), 1304-1317.
data(DJI_return)
Functional principal component analysis is used to decompose multiple functional time series. This function uses a functional panel data model to reduce dimensions for multiple functional time series.
dmfpca(y, M = NULL, J = NULL, N = NULL, tstart = 0, tlength = 1)
y: A data matrix containing the functional responses. Each row contains the measurements from one function at a set of grid points, and each column contains the measurements of all functions at a particular grid point.
M: Number of groups of functions; in the example below, length(sim_ex_cluster).
J: Number of functions in each object.
N: Number of grid points per function.
tstart: Start point of the grid points.
tlength: Length of the interval over which the functions are evaluated.
K1: Number of components for the common time-trend.
K2: Number of components for the residual component.
lambda1: A vector containing all common time-trend eigenvalues in non-increasing order.
lambda2: A vector containing all residual component eigenvalues in non-increasing order.
phi1: A matrix containing all common time-trend eigenfunctions. Each row contains an eigenfunction evaluated at the same set of grid points as the input data. The eigenfunctions are in the same order as the corresponding eigenvalues.
phi2: A matrix containing all residual component eigenfunctions, with the same layout as phi1.
scores1: A matrix containing the estimated common time-trend principal component scores. Each row corresponds to the common time-trend scores for a particular subject in a cluster; the number of rows equals that of the input matrix y. Each column contains the scores of one common time-trend component for all subjects.
scores2: A matrix containing the estimated residual component principal component scores. Each row corresponds to the residual component (level 2) scores for a particular subject in a cluster; the number of rows equals that of the input matrix y. Each column contains the scores of one residual component for all subjects.
mu: A vector containing the overall mean function.
eta: A matrix containing the deviations from the overall mean function to the country-specific mean functions. The number of rows is the number of countries.
Chen Tang and Han Lin Shang
Rice, G. and Shang, H. L. (2017) "A plug-in bandwidth selection procedure for long-run covariance estimation with stationary functional time series", Journal of Time Series Analysis, 38, 591-609.
Shang, H. L. (2016) "Mortality and life expectancy forecasting for a group of populations in developed countries: A multilevel functional data method", The Annals of Applied Statistics, 10, 1639-1672.
Di, C.-Z., Crainiceanu, C. M., Caffo, B. S. and Punjabi, N. M. (2009) "Multilevel functional principal component analysis", The Annals of Applied Statistics, 3, 458-488.
## The following takes about 10 seconds to run
## Not run: 
y <- do.call(rbind, sim_ex_cluster)
MFPCA.sim <- dmfpca(y, M = length(sim_ex_cluster), J = nrow(sim_ex_cluster[[1]]),
  N = ncol(sim_ex_cluster[[1]]), tlength = 1)
## End(Not run)
A functional linear regression is used to address the problem of dynamic updating, when partial data in the most recent curve are observed.
dynamic_FLR(dat, newdata, holdoutdata, order_k_percent = 0.9, order_m_percent = 0.9,
  pcd_method = c("classical", "M"), robust_lambda = 2.33, bootrep = 100, pointfore,
  level = 80)
dat: The historical functional data; in the example below, a data matrix whose columns are the observed curves.
newdata: A data vector of newly arrived observations.
holdoutdata: A data vector of holdout samples used to evaluate point forecast accuracy.
order_k_percent: Select the number of components in the first block of functions that explains at least 90 percent of the total variation.
order_m_percent: Select the number of components in the second block of functions that explains at least 90 percent of the total variation.
pcd_method: Method used for the principal component decomposition. Possibilities are "M", "rapca" and "classical".
robust_lambda: Tuning parameter in the two-step robust functional principal component analysis, when pcd_method = "M".
bootrep: Number of bootstrap samples.
pointfore: If TRUE, point forecasts are produced; if FALSE, interval forecasts are produced (see the example below).
level: Nominal coverage probability.
This function is designed to dynamically update point and interval forecasts, when partial data in the most recent curve are observed.
update_forecast: Updated forecasts.
holdoutdata: Holdout sample.
err: Forecast errors.
order_k: Number of principal components in the first block of functions.
order_m: Number of principal components in the second block of functions.
update_comb: Bootstrapped forecasts for the dynamically updated time period.
update_comb_lb_ub: Lower and upper prediction bounds, obtained by taking the corresponding quantiles.
err_boot: Bootstrapped in-sample forecast errors for the dynamically updated time period.
Han Lin Shang
H. Shen and J. Z. Huang (2008) "Interday forecasting and intraday updating of call center arrivals", Manufacturing and Service Operations Management, 10(3), 391-410.
H. Shen (2009) "On modeling and forecasting time series of curves", Technometrics, 51(3), 227-238.
H. L. Shang and R. J. Hyndman (2011) "Nonparametric time series forecasting with dynamic updating", Mathematics and Computers in Simulation, 81(7), 1310-1324.
J-M. Chiou (2012) "Dynamical functional prediction and classification with application to traffic flow prediction", Annals of Applied Statistics, 6(4), 1588-1614.
H. L. Shang (2013) "Functional time series approach for forecasting very short-term electricity demand", Journal of Applied Statistics, 40(1), 152-168.
H. L. Shang (2015) "Forecasting Intraday S&P 500 Index Returns: A Functional Time Series Approach", Journal of Forecasting, 36(7), 741-755.
H. L. Shang (2017) "Functional time series forecasting with dynamic updating: An application to intraday particulate matter concentration", Econometrics and Statistics, 1, 184-200.
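To illustrate the dynamic-updating idea, a minimal base-R sketch follows: the unobserved remainder of the newest curve is regressed on its observed beginning using the historical curves and ordinary least squares. The matrix dat and the vector newdata are hypothetical; dynamic_FLR itself works with principal component decompositions of the two blocks rather than a direct least squares fit.
set.seed(1)
dat <- matrix(rnorm(12 * 60), nrow = 12)                 # 60 historical curves observed at 12 points
obs_idx <- 1:4                                           # observed part of the newest curve
X <- t(dat[obs_idx, ])                                   # first block: observed points of each curve
Y <- t(dat[-obs_idx, ])                                  # second block: points to be updated
beta <- solve(crossprod(X), crossprod(X, Y))             # least squares regression coefficients
newdata <- rnorm(length(obs_idx))                        # partially observed new curve
update_forecast <- drop(newdata %*% beta)                # updated forecast of the remaining points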
dynamic_FLR_point = dynamic_FLR(dat = ElNino_ERSST_region_1and2$y[, 1:68],
  newdata = ElNino_ERSST_region_1and2$y[1:4, 69],
  holdoutdata = ElNino_ERSST_region_1and2$y[5:12, 69], pointfore = TRUE)
dynamic_FLR_interval = dynamic_FLR(dat = ElNino_ERSST_region_1and2$y[, 1:68],
  newdata = ElNino_ERSST_region_1and2$y[1:4, 69],
  holdoutdata = ElNino_ERSST_region_1and2$y[5:12, 69], pointfore = FALSE)
Four methods, namely block moving (BM), ordinary least squares (OLS) regression, penalized least squares (PLS) regression and ridge regression (RR), are provided to address the problem of dynamic updating when partial data in the most recent curve are observed.
dynupdate(data, newdata = NULL, holdoutdata, method = c("ts", "block", "ols", "pls", "ridge"),
  fmethod = c("arima", "ar", "ets", "ets.na", "rwdrift", "rw"),
  pcdmethod = c("classical", "M", "rapca"), ngrid = max(1000, ncol(data$y)), order = 6,
  robust_lambda = 2.33, lambda = 0.01, value = FALSE, interval = FALSE, level = 80,
  pimethod = c("parametric", "nonparametric"), B = 1000)
data: An object of class fts.
newdata: A data vector of newly arrived observations.
holdoutdata: A data vector of holdout samples used to evaluate point forecast accuracy.
method: Forecasting method. The latter four ("block", "ols", "pls", "ridge") can dynamically update point forecasts.
fmethod: Univariate time series forecasting method used when method = "ts".
pcdmethod: Method used for the principal component decomposition. Possibilities are "M", "rapca" and "classical".
ngrid: Number of grid points used in the calculations; set to the maximum of 1000 and ncol(data$y).
order: Number of principal components to fit.
robust_lambda: Tuning parameter in the two-step robust functional principal component analysis, when pcdmethod = "M".
lambda: Penalty parameter used when method = "pls" or "ridge".
value: When ...
interval: When TRUE, prediction intervals are constructed.
level: Nominal coverage probability.
pimethod: Parametric or nonparametric method for constructing the prediction intervals.
B: Number of bootstrap samples.
This function is designed to dynamically update point and interval forecasts, when partial data in the most recent curve are observed.
If method = "classical"
, then standard functional principal component decomposition is used, as described by Ramsay and Dalzell (1991).
If method = "rapca"
, then the robust principal component algorithm of Hubert, Rousseeuw and Verboven (2002) is used.
If method = "M"
, then the hybrid algorithm of Hyndman and Ullah (2005) is used.
forecasts: An object of class fts containing the point forecasts.
bootsamp: Bootstrap samples used to construct the prediction intervals.
low: An object of class fts containing the lower bound of the prediction intervals.
up: An object of class fts containing the upper bound of the prediction intervals.
Han Lin Shang
J. O. Ramsay and C. J. Dalzell (1991) "Some tools for functional data analysis (with discussion)", Journal of the Royal Statistical Society: Series B, 53(3), 539-572.
M. Hubert and P. J. Rousseeuw and S. Verboven (2002) "A fast robust method for principal components with applications to chemometrics", Chemometrics and Intelligent Laboratory Systems, 60(1-2), 101-111.
R. J. Hyndman and M. S. Ullah (2007) "Robust forecasting of mortality and fertility rates: A functional data approach", Computational Statistics and Data Analysis, 51(10), 4942-4956.
H. Shen and J. Z. Huang (2008) "Interday forecasting and intraday updating of call center arrivals", Manufacturing and Service Operations Management, 10(3), 391-410.
H. Shen (2009) "On modeling and forecasting time series of curves", Technometrics, 51(3), 227-238.
H. L. Shang and R. J. Hyndman (2011) "Nonparametric time series forecasting with dynamic updating", Mathematics and Computers in Simulation, 81(7), 1310-1324.
H. L. Shang (2013) "Functional time series approach for forecasting very short-term electricity demand", Journal of Applied Statistics, 40(1), 152-168.
H. L. Shang (2017) "Forecasting Intraday S&P 500 Index Returns: A Functional Time Series Approach", Journal of Forecasting, 36(7), 741-755.
H. L. Shang (2017) "Functional time series forecasting with dynamic updating: An application to intraday particulate matter concentration", Econometrics and Statistics, 1, 184-200.
See also: ftsm, forecast.ftsm, plot.fm, residuals.fm, summary.fm
# ElNino is an object of sliced functional time series, constructed from a univariate time series.
# When we observe some newly arrived information in the most recent time period, this function
# allows us to update the point and interval forecasts for the remaining time period.
dynupdate(data = ElNino_ERSST_region_1and2, newdata = ElNino_ERSST_region_1and2$y[1:4, 69],
  holdoutdata = ElNino_ERSST_region_1and2$y[5:12, 57], method = "block", interval = FALSE)
Eigenvalue ratio and growth ratio
ER_GR(data)
data: An n by p matrix, where n denotes the sample size and p denotes the number of discretized data points in a curve.
k_ER: The number of components selected by the eigenvalue ratio.
k_GR: The number of components selected by the growth ratio.
Han Lin Shang
Lam, C. and Yao, Q. (2012). Factor modelling for high-dimensional time series: Inference for the number of factors. The Annals of Statistics, 40, 694-726.
Ahn, S. and Horenstein, A. (2013). Eigenvalue ratio test for the number of factors. Econometrica, 81, 1203-1227.
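For intuition, a minimal base-R sketch of the eigenvalue ratio criterion is given below (the growth ratio is analogous but uses ratios of logged tail sums of the eigenvalues); the matrix X is hypothetical and this is not the ER_GR implementation.
set.seed(1)
X <- matrix(rnorm(50 * 20), nrow = 50)                     # hypothetical n x p data matrix
X_c <- sweep(X, 2, colMeans(X))                            # centre the curves
lambda <- eigen(crossprod(X_c) / nrow(X_c), symmetric = TRUE)$values
ER <- lambda[1:10] / lambda[2:11]                          # eigenvalue ratios
k_ER <- which.max(ER)                                      # selected number of components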
ER_GR(pm_10_GR$y)
Computes the forecast error measure.
error(forecast, forecastbench, true, insampletrue, method = c("me", "mpe", "mae", "mse",
  "sse", "rmse", "mdae", "mdse", "mape", "mdape", "smape", "smdape", "rmspe", "rmdspe",
  "mrae", "mdrae", "gmrae", "relmae", "relmse", "mase", "mdase", "rmsse"), giveall = FALSE)
forecast: Out-of-sample forecast values.
forecastbench: Forecast values from a benchmark method, such as the random walk.
true: Out-of-sample holdout values.
insampletrue: In-sample values.
method: Forecast error measure to use.
giveall: If TRUE, all error measures are returned.
Bias measures:
If method = "me", the forecast error measure is the mean error. If method = "mpe", it is the mean percentage error.
Forecast accuracy error measures:
If method = "mae", the forecast error measure is the mean absolute error. If method = "mse", it is the mean square error. If method = "sse", it is the sum of squared errors. If method = "rmse", it is the root mean square error. If method = "mdae", it is the median absolute error. If method = "mdse", it is the median square error. If method = "mape", it is the mean absolute percentage error. If method = "mdape", it is the median absolute percentage error. If method = "rmspe", it is the root mean square percentage error. If method = "rmdspe", it is the root median square percentage error.
Forecast accuracy symmetric error measures:
If method = "smape", the forecast error measure is the symmetric mean absolute percentage error. If method = "smdape", it is the symmetric median absolute percentage error.
Forecast accuracy relative error measures:
If method = "mrae", the forecast error measure is the mean relative absolute error. If method = "mdrae", it is the median relative absolute error. If method = "gmrae", it is the geometric mean relative absolute error. If method = "relmae", it is the relative mean absolute error. If method = "relmse", it is the relative mean square error.
Forecast accuracy scaled error measures:
If method = "mase", the forecast error measure is the mean absolute scaled error. If method = "mdase", it is the median absolute scaled error. If method = "rmsse", it is the root mean square scaled error.
A numeric value.
Han Lin Shang
P. A. Thompson (1990) "An MSE statistic for comparing forecast accuracy across series", International Journal of Forecasting, 6(2), 219-227.
C. Chatfield (1992) "A commentary on error measures", International Journal of Forecasting, 8(1), 100-102.
S. Makridakis (1993) "Accuracy measures: theoretical and practical concerns", International Journal of Forecasting, 9(4), 527-529.
R. J. Hyndman and A. Koehler (2006) "Another look at measures of forecast accuracy", International Journal of Forecasting, 22(3), 443-473.
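For intuition, a base-R sketch of two of the measures described above (mean absolute error and mean absolute scaled error) is given below, using the same toy numbers as the examples that follow; the scaling factor of the mase is the in-sample mean absolute error of the naive one-step forecast.
fore <- 1:5; true <- 11:15; insample <- 16:20
mae <- mean(abs(true - fore))                     # mean absolute error
scale_factor <- mean(abs(diff(insample)))         # in-sample naive one-step MAE
mase <- mean(abs(true - fore)) / scale_factor     # mean absolute scaled error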
# Forecast error measures can be categorized into three groups: (1) scale-dependent,
# (2) scale-independent but with a possible zero denominator,
# (3) scale-independent with a non-zero denominator.
error(forecast = 1:2, true = 3:4, method = "mae")
error(forecast = 1:5, forecastbench = 6:10, true = 11:15, method = "mrae")
error(forecast = 1:5, forecastbench = 6:10, true = 11:15, insampletrue = 16:20, giveall = TRUE)
Creates subsets of an fts object.
extract(data, direction = c("time", "x"), timeorder, xorder)
data: An object of class fts.
direction: In the time direction or the x variable direction?
timeorder: Indexes of the time order.
xorder: Indexes of the x variable order.
When xorder is specified, the function returns an fts object with the same arguments as data but with a subset of the x variables. When timeorder is specified, it returns an fts object with the same arguments as data but with a subset of the time variables.
Han Lin Shang
# ElNino is an object of class sliced functional time series.
# This function truncates the data series row-wise or column-wise.
extract(data = ElNino_ERSST_region_1and2, direction = "time", timeorder = 1980:2006) # Last 27 curves
extract(data = ElNino_ERSST_region_1and2, direction = "x", xorder = 1:8) # First 8 x variables
Computes the functional autocorrelation function at various lags.
facf(fun_data, lag_value_range = seq(0, 20, by = 1))
fun_data: A data matrix of dimension n by p, where n denotes the sample size and p denotes the dimensionality.
lag_value_range: Range of lag values at which the functional autocorrelation is computed.
The autocovariance function at lag h is estimated by \hat{\gamma}_h(t, s) = n^{-1} \sum_{i=1}^{n-h} (X_i(t) - \bar{X}(t))(X_{i+h}(s) - \bar{X}(s)). A functional analog of the autocorrelation is then defined as \hat{\rho}_h = \|\hat{\gamma}_h\| / \int \hat{\gamma}_0(t, t) dt, that is, the L2 norm of the lag-h autocovariance divided by the integral of the lag-0 autocovariance along its diagonal.
A vector of functional autocorrelation values at the specified lags.
Han Lin Shang
L. Horváth, G. Rice and S. Whipple (2016) "Adaptive bandwidth selection in the long run covariance estimator of functional time series", Computational Statistics and Data Analysis, 100, 676-693.
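A minimal base-R sketch of this discretized calculation follows, using a hypothetical matrix X whose rows are the observed curves; facf performs the same kind of computation on its fun_data argument.
set.seed(1)
X <- matrix(rnorm(60 * 30), nrow = 60)              # 60 hypothetical curves at 30 grid points
X_c <- sweep(X, 2, colMeans(X))                     # centred curves
n <- nrow(X_c)
facf_lag <- function(h) {
  gamma_h <- crossprod(X_c[1:(n - h), , drop = FALSE],
                       X_c[(1 + h):n, , drop = FALSE]) / n   # lag-h autocovariance surface
  gamma_0 <- crossprod(X_c) / n                              # lag-0 autocovariance surface
  sqrt(sum(gamma_h^2)) / sum(diag(gamma_0))                  # L2 norm over the trace at lag 0
}
sapply(1:5, facf_lag)                               # functional autocorrelations at lags 1 to 5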
facf_value = facf(fun_data = t(ElNino_ERSST_region_1and2$y))
Decomposition by functional analysis of variance fitted by means.
FANOVA(data_pop1, data_pop2, year = 1959:2020, age = 0:100, n_prefectures = 51,
  n_populations = 2)
data_pop1: A p by n data matrix for the first population.
data_pop2: A p by n data matrix for the second population.
year: Vector of the years considered in each population.
n_prefectures: Number of prefectures.
age: Vector of the ages considered in each year.
n_populations: Number of populations.
FGE_mean: The functional grand effect, a vector of dimension p.
FRE_mean: The functional row effect, a matrix of dimension length(row_partition_index) by p.
FCE_mean: The functional column effect, a matrix of dimension length(column_partition_index) by p.
Cristian Felipe Jimenez Varon, Ying Sun, Han Lin Shang
C. F. Jimenez Varon, Y. Sun and H. L. Shang (2023) "Forecasting high-dimensional functional time series: Application to sub-national age-specific mortality".
Ramsay, J. and B. Silverman (2006). Functional Data Analysis. Springer Series in Statistics. Chapter 13. New York: Springer
# The US mortality data 1959-2020 for two populations and three states
# (New York, California, Illinois).
# Compute the functional ANOVA decomposition fitted by means.
FANOVA_means <- FANOVA(data_pop1 = t(all_hmd_male_data), data_pop2 = t(all_hmd_female_data),
  year = 1959:2020, age = 0:100, n_prefectures = 3, n_populations = 2)
# 1. The functional grand effect
FGE = FANOVA_means$FGE_mean
# 2. The functional row effect
FRE = FANOVA_means$FRE_mean
# 3. The functional column effect
FCE = FANOVA_means$FCE_mean
The coefficients from the fitted object are forecasted using a multivariate time-series forecasting method. The forecast coefficients are then multiplied by the functional principal components to obtain a forecast curve.
farforecast(object, h = 10, var_type = "const", Dmax_value, Pmax_value, level = 80, PI = FALSE)
object: An object of class fts.
h: Forecast horizon.
var_type: Type of multivariate time series forecasting method; the type of deterministic regressors to include in the vector autoregressive (VAR) model.
Dmax_value: Maximum number of components considered.
Pmax_value: Maximum VAR order considered.
level: Nominal coverage probability of the prediction error bands.
PI: When TRUE, prediction intervals are constructed.
1. Decompose the smooth curves via a functional principal component analysis (FPCA).
2. Fit a multivariate time-series model to the principal component score matrix.
3. Forecast the principal component scores using the fitted multivariate time-series models. The order of VAR is selected optimally via an information criterion.
4. Multiply the forecast principal component scores by the estimated principal components to obtain forecasts of the future curves.
5. Prediction intervals are constructed by taking quantiles of the one-step-ahead forecast errors.
point_fore: Point forecasts.
order_select: Selected VAR order and number of retained components.
PI_lb: Lower bound of the prediction intervals.
PI_ub: Upper bound of the prediction intervals.
Han Lin Shang
A. Aue, D. D. Norinho and S. Hormann (2015) "On the prediction of stationary functional time series", Journal of the American Statistical Association, 110(509), 378-392.
J. Klepsch, C. Klüppelberg and T. Wei (2017) "Prediction of functional ARMA processes with an application to traffic data", Econometrics and Statistics, 1, 128-149.
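For intuition, a minimal base-R sketch of steps 2-3 is given below: a first-order vector autoregression is fitted to a hypothetical matrix of principal component scores by least squares and used for a one-step-ahead forecast. farforecast selects the VAR order and deterministic terms automatically; this sketch is only an illustration.
set.seed(1)
scores <- matrix(rnorm(50 * 3), ncol = 3)           # hypothetical principal component scores
Y <- scores[-1, ]                                   # scores at time t
X <- cbind(1, scores[-nrow(scores), ])              # intercept and scores at time t - 1
A <- solve(crossprod(X), crossprod(X, Y))           # VAR(1) coefficients by least squares
score_fore <- drop(c(1, scores[nrow(scores), ]) %*% A)   # one-step-ahead score forecast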
sqrt_pm10 = sqrt(pm_10_GR$y)
multi_forecast_sqrt_pm10 = farforecast(object = fts(seq(0, 23.5, by = 0.5), sqrt_pm10),
  h = 1, Dmax_value = 5, Pmax_value = 3)
Computes bootstrap or smoothed bootstrap samples based on independent and identically distributed functional data.
fbootstrap(data, estad = func.mean, alpha = 0.05, nb = 200, suav = 0, media.dist = FALSE,
  graph = FALSE, ...)
data: An object of class fts.
estad: Estimate function of interest; the default, func.mean, estimates the mean function. Other functional location estimators can also be supplied.
alpha: Significance level used in the smoothed bootstrap.
nb: Number of bootstrap samples.
suav: Smoothing parameter.
media.dist: Estimate the mean function?
graph: Produce graphical output?
...: Other arguments.
A list containing the following components is returned.
estimate: The estimated function of interest.
max.dist: Maximum distance of the bootstrap samples.
rep.dist: Distances of the bootstrap samples.
resamples: Bootstrap samples.
center: Functional mean.
Han Lin Shang
A. Cuevas and M. Febrero and R. Fraiman (2006), "On the use of the bootstrap for estimating functions with functional data", Computational Statistics and Data Analysis, 51(2), 1063-1074.
A. Cuevas and M. Febrero and R. Fraiman (2007), "Robust estimation and classification for functional data via projection-based depth notions", Computational Statistics, 22(3), 481-496.
M. Febrero and P. Galeano and W. Gonzalez-Manteiga (2007) "A functional analysis of NOx levels: location and scale estimation and outlier detection", Computational Statistics, 22(3), 411-427.
M. Febrero and P. Galeano and W. Gonzalez-Manteiga (2008) "Outlier detection in functional data by depth measures, with application to identify abnormal NOx levels", Environmetrics, 19(4), 331-345.
M. Febrero and P. Galeano and W. Gonzalez-Manteiga (2010) "Measures of influence for the functional linear model with scalar response", Journal of Multivariate Analysis, 101(2), 327-339.
J. A. Cuesta-Albertos and A. Nieto-Reyes (2010) "Functional classification and the random Tukey depth. Practical issues", Combining Soft Computing and Statistical Methods in Data Analysis, Advances in Intelligent and Soft Computing, 77, 123-130.
D. Gervini (2012) "Outlier detection and trimmed estimation in general functional spaces", Statistica Sinica, 22(4), 1639-1660.
H. L. Shang (2015) "Re-sampling techniques for estimating the distribution of descriptive statistics of functional data", Communication in Statistics–Simulation and Computation, 44(3), 614-635.
H. L. Shang (2018) Bootstrap methods for stationary functional time series, Statistics and Computing, 28(1), 1-10.
# Bootstrapping the distribution of a summary statistic of functional data.
fbootstrap(data = ElNino_ERSST_region_1and2)
The coefficients from the fitted object are forecasted using either an ARIMA model (method = "arima"), an AR model (method = "ar"), an exponential smoothing method (method = "ets"), a linear exponential smoothing method allowing missing values (method = "ets.na"), or a random walk with drift model (method = "rwdrift"). The forecast coefficients are then multiplied by the principal components to obtain a forecast curve.
## S3 method for class 'ftsm'
forecast(object, h = 10, method = c("ets", "arima", "ar", "ets.na", "rwdrift", "rw",
  "struct", "arfima"), level = 80, jumpchoice = c("fit", "actual"),
  pimethod = c("parametric", "nonparametric"), B = 100, usedata = nrow(object$coeff),
  adjust = TRUE, model = NULL, damped = NULL, stationary = FALSE, ...)
object: Output from ftsm.
h: Forecast horizon.
method: Univariate time series forecasting method. Current possibilities are "ets", "arima", "ets.na", "rwdrift" and "rw".
level: Coverage probability of the prediction intervals.
jumpchoice: Jump-off point for the forecasts. Possibilities are "actual" and "fit". If "actual", the forecasts are bias-adjusted by the difference between the fit and the last year of observed data; otherwise, no adjustment is used. See Booth et al. (2006) for details on the jump-off point.
pimethod: Indicates whether a parametric method is used to construct the prediction intervals.
B: Number of bootstrap samples.
usedata: Number of time periods to use in the forecasts. The default is to use all.
adjust: If TRUE, the forecast variances are adjusted so that the one-step forecast variances match the empirical one-step forecast errors.
model: When method = "ets", an optional model specification passed to the underlying exponential smoothing routine.
damped: When method = "ets", whether a damped trend is used in the exponential smoothing model.
stationary: If TRUE, a stationary forecasting method is used for the coefficients.
...: Other arguments passed to the forecast routine.
1. Obtain a smooth curve for each time period using a nonparametric smoothing technique.
2. Decompose the smooth curves via a functional principal component analysis.
3. Fit a univariate time series model to each of the principal component scores.
4. Forecast the principal component scores using the fitted time series models.
5. Multiply the forecast principal component scores by the fixed principal components to obtain forecasts of the future curves.
6. The estimated variances of the error terms (smoothing error and model residual error) are used to compute prediction intervals for the forecasts.
List with the following components:
mean: An object of class fts containing the point forecasts.
lower: An object of class fts containing the lower bound of the prediction intervals.
upper: An object of class fts containing the upper bound of the prediction intervals.
fitted: An object of class fts containing the one-step-ahead fitted values.
error: An object of class fts containing the one-step-ahead forecast errors.
coeff: List of objects of class forecast containing the coefficients and their forecasts.
coeff.error: One-step-ahead forecast errors for each of the coefficients.
var: List containing the various components of variance: model, error, mean, total and coeff.
model: The fitted ftsm model.
bootsamp: An array of bootstrap forecast samples, returned when pimethod = "nonparametric".
Rob J Hyndman
H. Booth, R. J. Hyndman, L. Tickle and P. de Jong (2006) "Lee-Carter mortality forecasting: A multi-country comparison of variants and extensions", Demographic Research, 15, 289-310.
B. Erbas and R. J. Hyndman and D. M. Gertig (2007) "Forecasting age-specific breast cancer mortality using functional data model", Statistics in Medicine, 26(2), 458-470.
R. J. Hyndman and M. S. Ullah (2007) "Robust forecasting of mortality and fertility rates: A functional data approach", Computational Statistics and Data Analysis, 51(10), 4942-4956.
R. J. Hyndman and H. Booth (2008) "Stochastic population forecasts using functional data models for mortality, fertility and migration", International Journal of Forecasting, 24(3), 323-342.
R. J. Hyndman and H. L. Shang (2009) "Forecasting functional time series" (with discussion), Journal of the Korean Statistical Society, 38(3), 199-221.
H. L. Shang (2012) "Functional time series approach for forecasting very short-term electricity demand", Journal of Applied Statistics, 40(1), 152-168.
H. L. Shang (2013) "ftsa: An R package for analyzing functional time series", The R Journal, 5(1), 64-72.
H. L. Shang, A. Wisniowski, J. Bijak, P. W. F. Smith and J. Raymer (2014) "Bayesian functional models for population forecasting", in M. Marsili and G. Capacci (eds), Proceedings of the Sixth Eurostat/UNECE Work Session on Demographic Projections, Istituto nazionale di statistica, Rome, pp. 313-325.
H. L. Shang (2015) "Selection of the optimal Box-Cox transformation parameter for modelling and forecasting age-specific fertility", Journal of Population Research, 32(1), 69-79.
H. L. Shang (2015) "Forecast accuracy comparison of age-specific mortality and life expectancy: Statistical tests of the results", Population Studies, 69(3), 317-335.
H. L. Shang, P. W. F. Smith, J. Bijak, A. Wisniowski (2016) "A multilevel functional data method for forecasting population, with an application to the United Kingdom", International Journal of Forecasting, 32(3), 629-649.
See also: ftsm, forecastfplsr, plot.ftsf, plot.fm, residuals.fm, summary.fm
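For intuition, a minimal base-R sketch of steps 3-4 is given below: a univariate model is fitted to each column of a hypothetical principal component score matrix and forecast ahead. forecast.ftsm offers several univariate methods; a simple AR fit from base R stands in here.
set.seed(1)
scores <- matrix(rnorm(40 * 2), ncol = 2)           # hypothetical principal component scores
score_fore <- sapply(seq_len(ncol(scores)), function(k) {
  fit <- ar(scores[, k], order.max = 2)             # univariate model for the k-th score series
  predict(fit, n.ahead = 10)$pred                   # 10-step-ahead score forecasts
})
dim(score_fore)                                     # a 10 x 2 matrix of forecast scores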
# ElNino is an object of class sliced functional time series.
# Via functional principal component decomposition, the dynamics are captured
# by a few principal components and their scores.
# The principal component scores are forecasted by an exponential smoothing method.
# The forecast curves are constructed as the forecast principal component scores
# multiplied by the fixed principal components, plus the mean function.
forecast(object = ftsm(ElNino_ERSST_region_1and2), h = 10, method = "ets")
forecast(object = ftsm(ElNino_ERSST_region_1and2, weight = TRUE))
Forecast high-dimensional functional principal component model.
## S3 method for class 'hdfpca'
forecast(object, h = 3, level = 80, B = 50, ...)
object: An object of class hdfpca.
h: Forecast horizon.
level: Prediction interval level; the default is 80 percent.
B: Number of bootstrap replications.
...: Other arguments passed to the forecast routine.
The low-dimensional factors are forecasted with autoregressive integrated moving average (ARIMA) models separately. The forecast functions are then calculated using the forecast factors. Bootstrap prediction intervals are constructed by resampling from the forecast residuals of the ARIMA models.
forecast: A list containing the h-step-ahead forecast functions for each population.
upper: Upper bound of the prediction interval for each population.
lower: Lower bound of the prediction interval for each population.
Y. Gao and H. L. Shang
Y. Gao, H. L. Shang and Y. Yang (2019) "High-dimensional functional time series forecasting: An application to age-specific mortality rates", Journal of Multivariate Analysis, 170, 232-243.
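A minimal base-R sketch of the interval construction described above follows: one hypothetical low-dimensional factor is forecast with an ARIMA model and its residuals are resampled to form a bootstrap prediction interval. This only illustrates the idea and is not the forecast.hdfpca implementation.
set.seed(1)
factor_series <- arima.sim(list(ar = 0.5), n = 80)  # one hypothetical low-dimensional factor
fit <- arima(factor_series, order = c(1, 0, 0))     # ARIMA model fitted to the factor
point_fore <- as.numeric(predict(fit, n.ahead = 1)$pred)     # one-step-ahead point forecast
boot_fore <- point_fore + sample(as.numeric(residuals(fit)), 50, replace = TRUE)
quantile(boot_fore, c(0.1, 0.9))                    # 80 percent bootstrap prediction interval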
## Not run: 
hd_model = hdfpca(hd_data, order = 2, r = 2)
hd_model_fore = forecast.hdfpca(object = hd_model, h = 1)
## End(Not run)
The decentralized response is forecasted by multiplying the estimated regression coefficient with the new decentralized predictor
forecastfplsr(object, components, h)
object: An object of class fts.
components: Number of optimal components.
h: Forecast horizon.
An object of class fts, containing the forecasts of the responses.
Han Lin Shang
R. J. Hyndman and H. L. Shang (2009) "Forecasting functional time series" (with discussion), Journal of the Korean Statistical Society, 38(3), 199-221.
See also: forecast.ftsm, ftsm, plot.fm, plot.ftsf, residuals.fm, summary.fm
# A set of functions are decomposed by functional partial least squares decomposition.
# By forecasting the univariate partial least squares scores, the forecast curves are
# obtained by multiplying the forecast scores by the fixed functional partial least
# squares functions, plus the fixed mean function.
forecastfplsr(object = ElNino_ERSST_region_1and2, components = 2, h = 5)
Fits a functional partial least squares regression (PLSR) model using the nonlinear iterative partial least squares (NIPALS) algorithm or the simple partial least squares (SIMPLS) algorithm.
fplsr(data, order = 6, type = c("simpls", "nipals"), unit.weights = TRUE, weight = FALSE,
  beta = 0.1, interval = FALSE, method = c("delta", "boota"), alpha = 0.05, B = 100,
  adjust = FALSE, backh = 10)
data: An object of class fts.
order: Number of principal components to fit.
type: When type = "simpls", the SIMPLS algorithm is used; when type = "nipals", the NIPALS algorithm is used (see Details).
unit.weights: Constrains the predictor loading weights to have unit norm.
weight: When weight = TRUE, a set of geometrically decaying weights, controlled by beta, is applied to the observations.
beta: When weight = TRUE, the speed of geometric decay of the weights.
interval: When interval = TRUE, prediction intervals are constructed.
method: Method used for computing the prediction intervals ("delta" or "boota").
alpha: 1 - alpha is the nominal coverage probability of the prediction intervals.
B: Number of replications.
adjust: When adjust = TRUE, the prediction intervals are adjusted by out-of-sample adjustment factors (see lbadjfactor and ubadjfactor).
backh: When adjust = TRUE, the number of holdout periods used to compute the adjustment factors.
Point forecasts:
The NIPALS function implements the orthogonal scores algorithm, as described in Martens and Naes (1989). This is one of the two classical PLSR algorithms; the other is the simple partial least squares regression of de Jong (1993). The difference between the two approaches is that NIPALS deflates the original predictors and responses, while SIMPLS deflates the covariance matrix of the original predictors and responses. Thus, SIMPLS is more computationally efficient than NIPALS.
In a functional data set, the functional PLSR can be performed by setting the functional responses to be one lag ahead of the functional predictors, as shown in the sketch below. This idea is adapted from the autoregressive Hilbertian process of order 1 (ARH(1)) of Bosq (2000).
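The following is a minimal sketch (not the package's internal code) of this lag-one construction, assuming the curves are stored in the y component of the fts object as a p by n matrix, as in the examples below.

# Lag-one set-up behind the functional PLSR / ARH(1) idea:
# predictors are curves 1, ..., n-1 and responses are curves 2, ..., n.
curves <- ElNino_ERSST_region_1and2$y   # assumed p x n matrix of discretized curves
n <- ncol(curves)
X <- curves[, 1:(n - 1)]                # functional predictors
Y <- curves[, 2:n]                      # functional responses, one lag ahead
dim(X); dim(Y)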
Distributional forecasts:
Parametric method:
Influenced by the work of Denham (1997) and Phatak et al. (1993), one way of constructing prediction intervals in PLSR is via a local linearization method (also known as the delta method), which can be understood as retaining the first two terms of a Taylor series expansion. The variance of the coefficient estimators can then be approximated, from which analytic-formula-based prediction intervals are constructed.
Nonparametric method:
After discretizing and decentralizing the functional predictors and responses, a PLSR model with a chosen number of latent components is built, and the fit residuals between the observed and fitted responses are calculated. The next step is to generate bootstrap samples by randomly sampling with replacement from these residuals; adding the bootstrapped residuals to the original response variables produces new bootstrap responses. PLSR models are then constructed from the centered and discretized predictors and the bootstrapped responses to obtain bootstrapped regression coefficients and point forecasts, from which empirical prediction intervals and kernel density plots are constructed (a schematic sketch follows).
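The following schematic sketch illustrates the residual bootstrap described above using placeholder matrices; it is an illustration of the idea only, not the code used inside fplsr.

# Residual bootstrap sketch: 'Y' plays the role of the discretized responses and
# 'fit' the PLSR fitted values; both are placeholders generated here for illustration.
set.seed(123)
n <- 20; p <- 5
Y <- matrix(rnorm(n * p), n, p)
fit <- Y + matrix(rnorm(n * p, sd = 0.1), n, p)
res <- Y - fit                           # fit residuals
B <- 100
Y_boot <- replicate(B, fit + res[sample(n, replace = TRUE), , drop = FALSE])
# Refitting the PLSR to each Y_boot[, , b] gives bootstrapped coefficients and
# point forecasts; empirical prediction intervals are pointwise quantiles of these.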
A list containing the following components is returned.
B |
|
P |
|
Q |
|
T |
|
R |
|
Yscores |
|
projection |
|
meanX |
An object of class |
meanY |
An object of class |
Ypred |
An object of class |
fitted |
An object of class |
residuals |
An object of class |
Xvar |
A vector with the amount of predictor variance explained by each number of component. |
Xtotvar |
Total variance in predictors. |
weight |
When |
x1 |
Time period of a |
y1 |
Variables of a |
ypred |
Returns the original functional predictors. |
y |
Returns the original functional responses. |
bootsamp |
Bootstrapped point forecasts. |
lb |
Lower bound of prediction intervals. |
ub |
Upper bound of prediction intervals. |
lbadj |
Adjusted lower bound of prediction intervals. |
ubadj |
Adjusted upper bound of prediction intervals. |
lbadjfactor |
Adjusted lower bound factor, which lies generally between 0.9 and 1.1. |
ubadjfactor |
Adjusted upper bound factor, which lies generally between 0.9 and 1.1. |
Han Lin Shang
S. Wold and A. Ruhe and H. Wold and W. J. Dunn (1984) "The collinearity problem in linear regression. The partial least squares (PLS) approach to generalized inverses", SIAM Journal of Scientific and Statistical Computing, 5(3), 735-743.
S. de Jong (1993) "SIMPLS: an alternative approach to partial least square regression", Chemometrics and Intelligent Laboratory Systems, 18(3), 251-263.
C. J. F. ter Braak and S. de Jong (1993) "The objective function of partial least squares regression", Journal of Chemometrics, 12(1), 41-54.
B. Dayal and J. MacGregor (1997) "Recursive exponentially weighted PLS and its applications to adaptive control and prediction", Journal of Process Control, 7(3), 169-179.
B. D. Marx (1996) "Iteratively reweighted partial least squares estimation for generalized linear regression", Technometrics, 38(4), 374-381.
L. Xu and J-H. Jiang and W-Q. Lin and Y-P. Zhou and H-L. Wu and G-L. Shen and R-Q. Yu (2007) "Optimized sample-weighted partial least squares", Talanta, 71(2), 561-566.
A. Phatak and P. Reilly and A. Penlidis (1993) "An approach to interval estimation in partial least squares regression", Analytica Chimica Acta, 277(2), 495-501.
M. Denham (1997) "Prediction intervals in partial least squares", Journal of Chemometrics, 11(1), 39-52.
D. Bosq (2000) Linear Processes in Function Spaces, New York: Springer.
N. Faber (2002) "Uncertainty estimation for multivariate regression coefficients", Chemometrics and Intelligent Laboratory Systems, 64(2), 169-179.
J. A. Fernandez Pierna and L. Jin and F. Wahl and N. M. Faber and D. L. Massart (2003) "Estimation of partial least squares regression prediction uncertainty when the reference values carry a sizeable measurement error", Chemometrics and Intelligent Laboratory Systems, 65(2), 281-291.
P. T. Reiss and R. T. Ogden (2007), "Functional principal component regression and functional partial least squares", Journal of the American Statistical Association, 102(479), 984-996.
C. Preda, G. Saporta (2005) "PLS regression on a stochastic process", Computational Statistics and Data Analysis, 48(1), 149-158.
C. Preda, G. Saporta, C. Leveder (2007) "PLS classification of functional data", Computational Statistics, 22, 223-235.
A. Delaigle and P. Hall (2012), "Methodology and theory for partial least squares applied to functional data", Annals of Statistics, 40(1), 322-352.
M. Febrero-Bande, P. Galeano, W. González-Manteiga (2017), "Functional principal component regression and functional partial least-squares regression: An overview and a comparative study", International Statistical Review, 85(1), 61-83.
ftsm
, forecast.ftsm
, plot.fm
,
summary.fm
, residuals.fm
, plot.fmres
# When weight = FALSE, all observations are assigned equal weights.
# When weight = TRUE, all observations are assigned geometrically decaying weights.
fplsr(data = ElNino_ERSST_region_1and2, order = 6, type = "nipals")
fplsr(data = ElNino_ERSST_region_1and2, order = 6)
fplsr(data = ElNino_ERSST_region_1and2, weight = TRUE)
fplsr(data = ElNino_ERSST_region_1and2, unit.weights = FALSE)
fplsr(data = ElNino_ERSST_region_1and2, unit.weights = FALSE, weight = TRUE)
# The prediction intervals are calculated numerically.
fplsr(data = ElNino_ERSST_region_1and2, interval = TRUE, method = "delta")
# The prediction intervals are calculated by the bootstrap method.
fplsr(data = ElNino_ERSST_region_1and2, interval = TRUE, method = "boota")
Fits a principal component model to a fts
object. The
function uses optimal orthonormal principal components obtained from a
principal components decomposition.
ftsm(y, order = 6, ngrid = max(500, ncol(y$y)), method = c("classical", "M", "rapca"), mean = TRUE, level = FALSE, lambda = 3, weight = FALSE, beta = 0.1, ...)
y |
An object of class |
order |
Number of principal components to fit. |
ngrid |
Number of grid points to use in calculations. Set to maximum of 500 and |
method |
Method to use for principal components decomposition. Possibilities are “M”, “rapca” and “classical”. |
mean |
If |
level |
If |
lambda |
Tuning parameter for robustness when |
weight |
When |
beta |
When |
... |
Additional arguments controlling the fitting procedure. |
If method = "classical"
, then standard functional principal component decomposition is used, as described by
Ramsay and Dalzell (1991).
If method = "rapca"
, then the robust principal component algorithm of Hubert, Rousseeuw and Verboven (2002) is used.
If method = "M"
, then the hybrid algorithm of Hyndman and Ullah (2005) is used.
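The three choices of method can be compared directly; a brief hedged illustration using the El Nino data set from the examples below:

# Classical, robust (RAPCA) and hybrid (M) principal component decompositions.
fit_classical <- ftsm(y = ElNino_ERSST_region_1and2, order = 6, method = "classical")
fit_rapca <- ftsm(y = ElNino_ERSST_region_1and2, order = 6, method = "rapca")
fit_M <- ftsm(y = ElNino_ERSST_region_1and2, order = 6, method = "M", lambda = 3)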
Object of class “ftsm” with the following components:
x1 |
Time period of a |
y1 |
Variables of a |
y |
Original functional time series or sliced functional time series. |
basis |
Matrix of principal components evaluated at value of |
basis2 |
Matrix of principal components excluded from the selected model. |
coeff |
Matrix of coefficients (one column for each coefficient series). The first column is all ones. |
coeff2 |
Matrix of coefficients associated with the principal components excluded from the selected model. |
fitted |
An object of class |
residuals |
An object of class |
varprop |
Proportion of variation explained by each principal component. |
wt |
Weight associated with each time period. |
v |
Measure of variation for each time period. |
mean.se |
Measure of standard error associated with the mean. |
Rob J Hyndman
J. O. Ramsay and C. J. Dalzell (1991) "Some tools for functional data analysis (with discussion)", Journal of the Royal Statistical Society: Series B, 53(3), 539-572.
M. Hubert and P. J. Rousseeuw and S. Verboven (2002) "A fast robust method for principal components with applications to chemometrics", Chemometrics and Intelligent Laboratory Systems, 60(1-2), 101-111.
B. Erbas and R. J. Hyndman and D. M. Gertig (2007) "Forecasting age-specific breast cancer mortality using functional data model", Statistics in Medicine, 26(2), 458-470.
R. J. Hyndman and M. S. Ullah (2007) "Robust forecasting of mortality and fertility rates: A functional data approach", Computational Statistics and Data Analysis, 51(10), 4942-4956.
R. J. Hyndman and H. Booth (2008) "Stochastic population forecasts using functional data models for mortality, fertility and migration", International Journal of Forecasting, 24(3), 323-342.
R. J. Hyndman and H. L. Shang (2009) "Forecasting functional time series (with discussion)", Journal of the Korean Statistical Society, 38(3), 199-221.
ftsmweightselect
, forecast.ftsm
, plot.fm
, plot.ftsf
, residuals.fm
, summary.fm
# ElNino is an object of class sliced functional time series, constructed
# from a univariate time series.
# By default, all observations are assigned equal weighting.
ftsm(y = ElNino_ERSST_region_1and2, order = 6, method = "classical", weight = FALSE)
# When weight = TRUE, geometrically decaying weights are used.
ftsm(y = ElNino_ERSST_region_1and2, order = 6, method = "classical", weight = TRUE)
The coefficients from the fitted object are forecasted
using either an ARIMA model (method = "arima"
), an AR model (method = "ar"
),
an exponential smoothing method (method = "ets"
), a linear exponential smoothing
method allowing missing values (method = "ets.na"
), or a random walk with drift model
(method = "rwdrift"
). The forecast coefficients are then multiplied by the principal
components to obtain a forecast curve.
ftsmiterativeforecasts(object, components, iteration = 20)
object |
An object of class |
components |
Number of principal components. |
iteration |
Number of iterative one-step-ahead forecasts. |
1. Obtain a smooth curve for each time period using a nonparametric smoothing technique.
2. Decompose the smooth curves via a functional principal component analysis.
3. Fit a univariate time series model to each of the principal component scores.
4. Forecast the principal component scores using the fitted time series models.
5. Multiply the forecast principal component scores by the fixed principal components to obtain forecast curves.
6. Use the estimated variances of the error terms (smoothing error and model residual error) to compute prediction intervals for the forecasts (a minimal sketch follows this list).
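A minimal sketch of steps 2 to 6 using a single call to ftsm followed by forecast, with the smoothed Australian fertility data used in the example below; the component names mean, lower and upper are assumed to follow the value list given below.

fit <- ftsm(y = Australiasmoothfertility, order = 2)   # step 2: functional principal component decomposition
fc <- forecast(object = fit, h = 1)                    # steps 3-5: model and forecast the scores, rebuild curves
fc$mean                                                # step 6: point forecasts
fc$lower; fc$upper                                     # step 6: prediction bounds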
List with the following components:
mean |
An object of class |
lower |
An object of class |
upper |
An object of class |
fitted |
An object of class |
error |
An object of class |
coeff |
List of objects of type |
coeff.error |
One-step-ahead forecast errors for each of the coefficients. |
var |
List containing the various components of variance: model, error, mean, total and coeff. |
model |
Fitted |
bootsamp |
An array of |
Han Lin Shang
H. Booth and R. J. Hyndman and L. Tickle and P. D. Jong (2006) "Lee-Carter mortality forecasting: A multi-country comparison of variants and extensions", Demographic Research, 15, 289-310.
B. Erbas and R. J. Hyndman and D. M. Gertig (2007) "Forecasting age-specific breast cancer mortality using functional data model", Statistics in Medicine, 26(2), 458-470.
R. J. Hyndman and M. S. Ullah (2007) "Robust forecasting of mortality and fertility rates: A functional data approach", Computational Statistics and Data Analysis, 51(10), 4942-4956.
R. J. Hyndman and H. Booth (2008) "Stochastic population forecasts using functional data models for mortality, fertility and migration", International Journal of Forecasting, 24(3), 323-342.
R. J. Hyndman and H. L. Shang (2009) "Forecasting functional time series (with discussion)", Journal of the Korean Statistical Society, 38(3), 199-221.
ftsm
, plot.ftsf
, plot.fm
, residuals.fm
, summary.fm
# Iterative one-step-ahead forecasts via functional principal component analysis.
ftsmiterativeforecasts(object = Australiasmoothfertility, components = 2, iteration = 5)
The geometrically decaying weights are used to estimate the mean curve and functional principal components, where more weights are assigned to the more recent data than the data from the distant past.
ftsmweightselect(data, ncomp = 6, ntestyear, errorcriterion = c("mae", "mse", "mape"))
data |
An object of class |
ncomp |
Number of components. |
ntestyear |
Number of holdout observations used to assess the forecast accuracy. |
errorcriterion |
Error measure. |
The data set is split into a fitting period and a forecasting period. Using the data in the fitting period, we compute the one-step-ahead forecast and calculate the forecast error. We then extend the fitting period by one observation and repeat the forecasting procedure until the fitting period covers the entire data set. The forecast accuracy is measured by the forecast error averaged over the years in the forecasting period, and an optimization algorithm selects the weight parameter that minimizes this averaged forecast error (a schematic sketch of the expanding-window evaluation follows).
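The following is a schematic expanding-window evaluation of a single candidate weight parameter (not the package's internal optimizer); it assumes the fts() constructor from the rainbow package and the ElNinosmooth data object mentioned in the note below.

# Average one-step-ahead MAE of ftsm forecasts for a given weight parameter beta0.
expanding_mae <- function(dat, beta0, ncomp = 6, ntestyear = 5) {
  n <- ncol(dat$y)
  errs <- numeric(ntestyear)
  for (k in seq_len(ntestyear)) {
    fit_end <- n - ntestyear + k - 1
    train <- fts(dat$x, dat$y[, 1:fit_end])           # expanding fitting period
    fit <- ftsm(train, order = ncomp, weight = TRUE, beta = beta0)
    fc <- forecast(fit, h = 1)
    errs[k] <- mean(abs(fc$mean$y[, 1] - dat$y[, fit_end + 1]))
  }
  mean(errs)                                          # averaged forecast error
}
# e.g. expanding_mae(ElNinosmooth, beta0 = 0.1, ntestyear = 1)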
Optimal weight parameter.
This function can be computationally intensive; for example, ftsmweightselect(ElNinosmooth, ntestyear = 1) takes about half a minute to compute.
Han Lin Shang
R. J. Hyndman and H. L. Shang (2009) "Forecasting functional time series (with discussion)", Journal of the Korean Statistical Society, 38(3), 199-221.
One-step-ahead forecasts for any given quantile(s) of a functional time series of extreme values, using a generalized additive extreme value (GAEV) model.
GAEVforecast(data, q, d.loc.max = 10, d.logscale.max = 10)
data |
a n by p data matrix, where n denotes the number of functional objects and p denotes the number of realizations on each functional object |
q |
a required scalar or vector of GEV quantiles that are of forecasting interest |
d.loc.max |
the maximum number of basis functions considered for the location parameter |
d.logscale.max |
the maximum number of basis functions considered for the (log-)scale parameter |
For a functional time series of extreme values, the GAEV model assumes that each observation follows a generalized extreme value (GEV) distribution whose location and (log-)scale parameters are expressed as linear combinations of cubic regression spline basis functions, the numbers of basis functions being positive integers. The optimal numbers of basis functions are chosen by minimizing the Kullback-Leibler divergence on the test set using a leave-one-out cross-validation technique.
The one-step-ahead forecasts of the joint basis coefficients are produced using a vector autoregressive model, whose order is selected via the corrected Akaike information criterion; the one-step-ahead forecasts of the GEV parameters can then be computed accordingly. The one-step-ahead forecast for the q-th quantile of the extreme values is obtained from the forecast GEV parameters (location mu, scale sigma and shape xi) through the GEV quantile function, mu + sigma * ((-log(q))^(-xi) - 1) / xi for xi not equal to 0.
kdf.location |
the optimal number of basis functions considered for the location parameter |
kdf.logscale |
the optimal number of basis functions considered for the (log-)scale parameter |
basis.location |
the basis functions for the location parameter |
basis.logscale |
the basis functions for the (log-)scale parameter |
para.location.pred |
the predicted location function |
para.scale.pred |
the predicted scale function |
para.shape.pred |
the predicted shape parameter |
density.pred |
the predicted density function(s) for the given quantile(s) |
Ruofan Xu and Han Lin Shang
Shang, H. L. and Xu, R. (2021) ‘Functional time series forecasting of extreme values’, Communications in Statistics Case Studies Data Analysis and Applications, in press.
## Not run: 
library(evd)
data = matrix(rgev(1000), ncol = 50)
GAEVforecast(data = data, q = c(0.02, 0.7), d.loc.max = 5, d.logscale.max = 5)
## End(Not run)
We generate multiple populations of functional time series. For each population, the function observed at each time point is constructed as a linear combination of three basis functions, with time-varying coefficients (principal component scores) plus an independent error term; the construction of the coefficients and the basis functions is described in the details below.
data("hd_data")
data("hd_data")
The coefficients for all populations are combined and generated jointly from a factor model: at each time point, the combined coefficient vector is the product of a factor loading matrix and a low-dimensional factor vector. The scores associated with the fourth and higher basis functions are assumed to have mean zero and zero variance, so only the first three sets of coefficients are constructed.
The first set of coefficients is generated with factor loadings whose elements are drawn independently from a uniform distribution. The corresponding factors are generated from autoregressive models of order 1, AR(1), with independent innovations, and are scaled so that most of the variance of the first set of coefficients can be explained by a single factor. The second set of coefficients is constructed in the same way as the first.
The third set of functional principal component scores is also generated, but with small values: its factor loadings and AR(1) factors are drawn with correspondingly small magnitudes.
Finally, three smooth basis functions are constructed on a common grid, and the functional time series for each population is obtained by summing the products of the coefficients and the basis functions.
Y. Gao, H. L. Shang and Y. Yang (2018) High-dimensional functional time series forecasting: An application to age-specific mortality rates, Journal of Multivariate Analysis, forthcoming.
data(hd_data)
Fit a high dimensional functional principal component analysis model to a multiple-population of functional time series data.
hdfpca(y, order, q = sqrt(dim(y[[1]])[2]), r)
y |
A list, where each item is a population of functional time series. Each item is a data matrix of dimension p by n, where p is the number of discrete points in each function and n is the sample size |
order |
The number of principal component scores to retain in the first step dimension reduction |
q |
The tuning parameter used in the first step dimension reduction, by default it is equal to the square root of the sample size |
r |
The number of factors to retain in the second step dimension reduction |
In the first step, dynamic functional principal component analysis is performed on each population and then in the second step, factor models are fitted to the resulting principal component scores. The high-dimensional functional time series are thus reduced to low-dimensional factors.
y |
The input data |
p |
The number of discrete points in each function |
fitted |
A list containing the fitted functions for each population |
m |
The number of populations |
model |
Model 1 contains the first-step dynamic functional principal component analysis models; model 2 contains the second-step high-dimensional principal component analysis models |
order |
Input order |
r |
Input r |
Y. Gao and H. L. Shang
Y. Gao, H. L. Shang and Y. Yang (2018) High-dimensional functional time series forecasting: An application to age-specific mortality rates, Journal of Multivariate Analysis, forthcoming.
hd_model = hdfpca(hd_data, order = 2, r = 2)
Implementation of a dynamic functional principal component analysis to forecast densities.
Horta_Ziegelmann_FPCA(data, gridpoints, h_scale = 1, p = 5, m = 5001, kernel = c("gaussian", "epanechnikov"), band_choice = c("Silverman", "DPI"), VAR_type = "both", lag_maximum = 6, no_boot = 1000, alpha_val = 0.1, ncomp_select = "TRUE", D_val = 10)
data |
Densities or raw data matrix of dimension N by p, where N denotes sample size and p denotes dimensionality |
gridpoints |
Grid points |
h_scale |
Scaling parameter in the kernel density estimator |
p |
Number of backward parameters |
m |
Number of grid points |
kernel |
Type of kernel function |
band_choice |
Selection of optimal bandwidth |
VAR_type |
Type of vector autoregressive process |
lag_maximum |
A tuning parameter in the |
no_boot |
A tuning parameter in the |
alpha_val |
A tuning parameter in the |
ncomp_select |
A tuning parameter in the |
D_val |
A tuning parameter in the |
1) Compute a kernel covariance function.
2) Via an eigen-decomposition, decompose each density into a set of functional principal components and their associated scores.
3) Fit a vector autoregressive model to the scores, with the order selected by the Akaike information criterion.
4) Obtain forecast densities by multiplying the estimated functional principal components by the forecast scores.
5) Since the forecast densities may be negative or may not sum to one, normalize them accordingly (a schematic sketch follows this list).
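A schematic sketch of steps 2 to 5 on a hypothetical matrix of densities evaluated on a common grid; for simplicity a univariate AR model replaces the VAR described above, and none of this is the package's internal code.

set.seed(42)
N <- 50; m <- 101
u <- seq(0, 1, length.out = m)                       # hypothetical grid
dens <- t(replicate(N, dnorm(u, mean = runif(1, 0.4, 0.6), sd = 0.1)))
mu <- colMeans(dens)
cen <- sweep(dens, 2, mu)                            # centred densities
eig <- eigen(crossprod(cen) / N)                     # empirical covariance on the grid
phi <- eig$vectors[, 1:2]                            # leading principal components
scores <- cen %*% phi                                # principal component scores
score_fc <- apply(scores, 2, function(s) as.numeric(predict(ar(s), n.ahead = 1)$pred))
dens_fc <- as.numeric(mu + phi %*% score_fc)         # one-step-ahead density forecast
dens_fc <- pmax(dens_fc, 0)
dens_fc <- dens_fc / sum(dens_fc * diff(u)[1])       # normalization step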
Yhat.fix_den |
Forecast density |
u |
Grid points |
du |
Distance between two successive grid points |
Ybar_est |
Mean of density functions |
psihat_est |
Estimated functional principal components |
etahat_est |
Estimated principal component scores |
etahat_pred_val |
Forecast principal component scores |
selected_d0 |
Selected number of components |
selected_d0_pvalues |
p-values associated with the selected functional principal components |
thetahat_val |
Estimated eigenvalues |
Han Lin Shang
Horta, E. and Ziegelmann, F. (2018) ‘Dynamics of financial returns densities: A functional approach applied to the Bovespa intraday index’, International Journal of Forecasting, 34, 75-88.
Bathia, N., Yao, Q. and Ziegelmann, F. (2010) ‘Identifying the finite dimensionality of curve time series’, The Annals of Statistics, 38, 3353-3386.
CoDa_FPCA
, LQDT_FPCA
, skew_t_fun
## Not run: 
Horta_Ziegelmann_FPCA(data = DJI_return, kernel = "epanechnikov",
                      band_choice = "DPI", ncomp_select = "FALSE")
## End(Not run)
Tests whether an object is of class fts
.
is.fts(x)
x |
Arbitrary R object. |
Rob J Hyndman
# Check whether ElNino_ERSST_region_1and2 is an object of class fts.
is.fts(x = ElNino_ERSST_region_1and2)
Computes integrated squared forecast error (ISFE) values for functional time series models of various orders.
isfe.fts(data, max.order = N - 3, N = 10, h = 5:10, method = c("classical", "M", "rapca"), mean = TRUE, level = FALSE, fmethod = c("arima", "ar", "ets", "ets.na", "struct", "rwdrift", "rw", "arfima"), lambda = 3, ...)
data |
An object of class |
max.order |
Maximum number of principal components to fit. |
N |
Minimum number of functional observations to be used in fitting a model. |
h |
Forecast horizons over which to average. |
method |
Method to use for principal components decomposition. Possibilities are “M”, “rapca” and “classical”. |
mean |
Indicates if mean term should be included. |
level |
Indicates if level term should be included. |
fmethod |
Method used for forecasting. Current possibilities are “ets”, “arima”, “ets.na”, “struct”, “rwdrift” and “rw”. |
lambda |
Tuning parameter for robustness when |
... |
Additional arguments controlling the fitting procedure. |
Numeric matrix with (max.order+1)
rows and length(h)
columns
containing ISFE values for models of orders 0:(max.order)
.
This function can be very time consuming for data with large dimensionality or large sample size.
By setting max.order
small, computational speed can be dramatically increased.
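A hedged example call, wrapped in a Not run block because the computation can be slow; the argument values here are illustrative only.

## Not run: 
isfe.fts(data = ElNino_ERSST_region_1and2, max.order = 2, N = 20, h = 1:2,
         method = "classical", fmethod = "arima")
## End(Not run)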
Rob J Hyndman
R. J. Hyndman and M. S. Ullah (2007) "Robust forecasting of mortality and fertility rates: A functional data approach", Computational Statistics and Data Analysis, 51(10), 4942-4956.
ftsm
, forecast.ftsm
, plot.fm
, plot.fmres
, summary.fm
, residuals.fm
Bandwidth estimation in the long-run covariance function for a functional time series, using different types of kernel function
long_run_covariance_estimation(dat, C0 = 3, H = 3)
dat |
A matrix of p by n, where p denotes the number of grid points and n denotes sample size |
C0 |
A tuning parameter used in the adaptive bandwidth selection algorithm of Rice |
H |
A tuning parameter used in the adaptive bandwidth selection algorithm of Rice |
An estimated covariance function of size (p by p)
Han Lin Shang
L. Horvath, G. Rice and S. Whipple (2016) Adaptive bandwidth selection in the long run covariance estimation of functional time series, Computational Statistics and Data Analysis, 100, 676-693.
G. Rice and H. L. Shang (2017) A plug-in bandwidth selection procedure for long run covariance estimation with stationary functional time series, Journal of Time Series Analysis, 38(4), 591-609.
D. Li, P. M. Robinson and H. L. Shang (2018) Long-range dependent curve time series, Journal of the American Statistical Association: Theory and Methods, under revision.
dum = long_run_covariance_estimation(dat = ElNino_OISST_region_1and2$y[,1:5])
Probability density function, cumulative distribution function and quantile density function are three characterizations of a distribution. Of these three, the quantile density function is the least constrained: its only constraint is non-negativity, and after a log transformation there is no constraint at all.
LQDT_FPCA(data, gridpoints, h_scale = 1, M = 3001, m = 5001, lag_maximum = 4, no_boot = 1000, alpha_val = 0.1, p = 5, band_choice = c("Silverman", "DPI"), kernel = c("gaussian", "epanechnikov"), forecasting_method = c("uni", "multi"), varprop = 0.85, fmethod, VAR_type)
data |
Densities or raw data matrix of dimension N by p, where N denotes sample size and p denotes dimensionality |
gridpoints |
Grid points |
h_scale |
Scaling parameter in the kernel density estimator |
M |
Number of grid points between 0 and 1 |
m |
Number of grid points within the data range |
lag_maximum |
A tuning parameter in the |
no_boot |
A tuning parameter in the |
alpha_val |
A tuning parameter in the |
p |
Number of backward parameters |
band_choice |
Selection of optimal bandwidth |
kernel |
Type of kernel function |
forecasting_method |
Univariate or multivariate time series forecasting method |
varprop |
Proportion of variance explained |
fmethod |
If |
VAR_type |
If |
1) Transform the densities f into log quantile densities Y, together with the constant c specifying the value of the cdf of the target density f at 0.
2) Compute predictions of the future log quantile density and the c value.
3) Transform the forecasts from Step 2 back into the predicted density f (a minimal sketch of the transformation in Step 1 follows).
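A minimal sketch (under stated assumptions) of the log quantile density transformation in Step 1, applied to a single example density on a grid; it illustrates the transform only, not the forecasting steps.

x <- seq(-4, 4, length.out = 401)
f <- dnorm(x)                                  # an example density
Fx <- cumsum(f) * diff(x)[1]
Fx <- Fx / max(Fx)                             # cdf on the grid
t_grid <- seq(0.001, 0.999, length.out = 201)
Q <- approx(Fx, x, xout = t_grid)$y            # quantile function Q(t)
fQ <- approx(x, f, xout = Q)$y                 # f evaluated at Q(t)
log_qd <- -log(fQ)                             # log quantile density: log(1 / f(Q(t)))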
L2Diff |
L2 norm difference between reconstructed and actual densities |
unifDiff |
Uniform Metric excluding missing boundary values (due to boundary cutoff) |
density_reconstruct |
Reconstructed densities |
density_original |
Actual densities |
dens_fore |
Forecast densities |
totalMass |
Assess loss of mass incurred by boundary cutoff |
u |
m number of grid points |
Han Lin Shang
Petersen, A. and Muller, H.-G. (2016) ‘Functional data analysis for density functions by transformation to a Hilbert space’, The Annals of Statistics, 44, 183-218.
Jones, M. C. (1992) ‘Estimating densities, quantiles, quantile densities and density quantiles’, Annals of the Institute of Statistical Mathematics, 44, 721-727.
CoDa_FPCA
, Horta_Ziegelmann_FPCA
, skew_t_fun
## Not run: 
LQDT_FPCA(data = DJI_return, band_choice = "DPI", kernel = "epanechnikov",
          forecasting_method = "uni", fmethod = "ets")
## End(Not run)
Dimension reduction via maximum autocorrelation factors
MAF_multivariate(data, threshold)
data |
A p by n data matrix, where p denotes the number of variables and n denotes the sample size |
threshold |
A threshold level for retaining the optimal number of factors |
MAF |
Maximum autocorrelation factor scores |
MAF_loading |
Maximum autocorrelation factors |
Z |
Standardized original data |
recon |
Reconstruction via maximum autocorrelation factors |
recon_err |
Reconstruction errors between the standardized original data and reconstruction via maximum autocorrelation factors |
ncomp_threshold |
Number of maximum autocorrelation factors selected by explaining autocorrelation at and above a given level of threshold |
ncomp_eigen_ratio |
Number of maximum autocorrelation factors selected by eigenvalue ratio tests |
Han Lin Shang
M. A. Haugen, B. Rajaratnam and P. Switzer (2015). Extracting common time trends from concurrent time series: Maximum autocorrelation factors with applications, arXiv paper https://arxiv.org/abs/1502.01073.
MAF_multivariate(data = pm_10_GR_sqrt$y, threshold = 0.85)
Computes mean of functional time series at each variable.
## S3 method for class 'fts'
mean(x, method = c("coordinate", "FM", "mode", "RP", "RPD", "radius"), na.rm = TRUE, alpha, beta, weight, ...)
x |
An object of class |
method |
Method for computing the mean function. |
na.rm |
A logical value indicating whether NA values should be stripped before the computation proceeds. |
alpha |
Tuning parameter when |
beta |
Trimming percentage, by default it is 0.25, when |
weight |
Hard thresholding or soft thresholding. |
... |
Other arguments. |
If method = "coordinate"
, it computes the coordinate-wise functional mean.
If method = "FM"
, it computes the mean of trimmed functional data ordered by the functional depth of Fraiman and Muniz (2001).
If method = "mode"
, it computes the mean of trimmed functional data ordered by -modal functional depth.
If method = "RP"
, it computes the mean of trimmed functional data ordered by random projection depth.
If method = "RPD"
, it computes the mean of trimmed functional data ordered by random projection derivative depth.
If method = "radius"
, it computes the mean of trimmed functional data ordered by the notion of alpha-radius.
A list containing x
= variables and y
= mean rates.
Rob J Hyndman, Han Lin Shang
O. Hossjer and C. Croux (1995) "Generalized univariate signed rank statistics for testing and estimating a multivariate location parameter", Journal of Nonparametric Statistics, 4(3), 293-308.
A. Cuevas and M. Febrero and R. Fraiman (2006) "On the use of bootstrap for estimating functions with functional data", Computational Statistics and Data Analysis, 51(2), 1063-1074.
A. Cuevas and M. Febrero and R. Fraiman (2007), "Robust estimation and classification for functional data via projection-based depth notions", Computational Statistics, 22(3), 481-496.
M. Febrero and P. Galeano and W. Gonzalez-Manteiga (2007) "A functional analysis of NOx levels: location and scale estimation and outlier detection", Computational Statistics, 22(3), 411-427.
M. Febrero and P. Galeano and W. Gonzalez-Manteiga (2008) "Outlier detection in functional data by depth measures, with application to identify abnormal NOx levels", Environmetrics, 19(4), 331-345.
M. Febrero and P. Galeano and W. Gonzalez-Manteiga (2010) "Measures of influence for the functional linear model with scalar response", Journal of Multivariate Analysis, 101(2), 327-339.
J. A. Cuesta-Albertos and A. Nieto-Reyes (2010) "Functional classification and the random Tukey depth. Practical issues", Combining Soft Computing and Statistical Methods in Data Analysis, Advances in Intelligent and Soft Computing, 77, 123-130.
D. Gervini (2012) "Outlier detection and trimmed estimation in general functional spaces", Statistica Sinica, 22(4), 1639-1660.
median.fts
, var.fts
, sd.fts
, quantile.fts
# Calculate the mean function by the different depth measures.
mean(x = ElNino_ERSST_region_1and2, method = "coordinate")
mean(x = ElNino_ERSST_region_1and2, method = "FM")
mean(x = ElNino_ERSST_region_1and2, method = "mode")
mean(x = ElNino_ERSST_region_1and2, method = "RP")
mean(x = ElNino_ERSST_region_1and2, method = "RPD")
mean(x = ElNino_ERSST_region_1and2, method = "radius", alpha = 0.5, beta = 0.25, weight = "hard")
mean(x = ElNino_ERSST_region_1and2, method = "radius", alpha = 0.5, beta = 0.25, weight = "soft")
Computes median of functional time series at each variable.
## S3 method for class 'fts'
median(x, na.rm, method = c("hossjercroux", "coordinate", "FM", "mode", "RP", "RPD", "radius"), alpha, beta, weight, ...)
x |
An object of class |
na.rm |
Remove any missing value. |
method |
Method for computing median. |
alpha |
Tuning parameter when |
beta |
Trimming percentage, by default it is 0.25, when |
weight |
Hard thresholding or soft thresholding. |
... |
Other arguments. |
If method = "coordinate"
, it computes a coordinate-wise median.
If method = "hossjercroux"
, it computes the L1-median using the Hossjer-Croux algorithm.
If method = "FM"
, it computes the median of trimmed functional data ordered by the functional depth of Fraiman and Muniz (2001).
If method = "mode"
, it computes the median of trimmed functional data ordered by -modal functional depth.
If method = "RP"
, it computes the median of trimmed functional data ordered by random projection depth.
If method = "RPD"
, it computes the median of trimmed functional data ordered by random projection derivative depth.
If method = "radius"
, it computes the median of trimmed functional data ordered by the notion of alpha-radius.
A list containing x
= variables and y
= median rates.
Rob J Hyndman, Han Lin Shang
O. Hossjer and C. Croux (1995) "Generalized univariate signed rank statistics for testing and estimating a multivariate location parameter", Journal of Nonparametric Statistics, 4(3), 293-308.
A. Cuevas and M. Febrero and R. Fraiman (2006) "On the use of bootstrap for estimating functions with functional data", Computational Statistics and Data Analysis, 51(2), 1063-1074.
A. Cuevas and M. Febrero and R. Fraiman (2007), "Robust estimation and classification for functional data via projection-based depth notions", Computational Statistics, 22(3), 481-496.
M. Febrero and P. Galeano and W. Gonzalez-Manteiga (2007) "A functional analysis of NOx levels: location and scale estimation and outlier detection", Computational Statistics, 22(3), 411-427.
M. Febrero and P. Galeano and W. Gonzalez-Manteiga (2008) "Outlier detection in functional data by depth measures, with application to identify abnormal NOx levels", Environmetrics, 19(4), 331-345.
M. Febrero and P. Galeano and W. Gonzalez-Manteiga (2010) "Measures of influence for the functional linear model with scalar response", Journal of Multivariate Analysis, 101(2), 327-339.
J. A. Cuesta-Albertos and A. Nieto-Reyes (2010) "Functional classification and the random Tukey depth. Practical issues", Combining Soft Computing and Statistical Methods in Data Analysis, Advances in Intelligent and Soft Computing, 77, 123-130.
D. Gervini (2012) "Outlier detection and trimmed estimation in general functional spaces", Statistica Sinica, 22(4), 1639-1660.
mean.fts
, var.fts
, sd.fts
, quantile.fts
# Calculate the median function by the different depth measures.
median(x = ElNino_ERSST_region_1and2, method = "hossjercroux")
median(x = ElNino_ERSST_region_1and2, method = "coordinate")
median(x = ElNino_ERSST_region_1and2, method = "FM")
median(x = ElNino_ERSST_region_1and2, method = "mode")
median(x = ElNino_ERSST_region_1and2, method = "RP")
median(x = ElNino_ERSST_region_1and2, method = "RPD")
median(x = ElNino_ERSST_region_1and2, method = "radius", alpha = 0.5, beta = 0.25, weight = "hard")
median(x = ElNino_ERSST_region_1and2, method = "radius", alpha = 0.5, beta = 0.25, weight = "soft")
Fit a multilevel functional principal component model. The function uses two-step functional principal component decompositions.
MFDM(mort_female, mort_male, mort_ave, percent_1 = 0.95, percent_2 = 0.95, fh, level = 80, alpha = 0.2, MCMCiter = 100, fmethod = c("auto_arima", "ets"), BC = c(FALSE, TRUE), lambda)
mort_female |
Female mortality (p by n matrix), where p denotes the dimension and n denotes the sample size. |
mort_male |
Male mortality (p by n matrix). |
mort_ave |
Total mortality (p by n matrix). |
percent_1 |
Cumulative percentage used for determining the number of common functional principal components. |
percent_2 |
Cumulative percentage used for determining the number of sex-specific functional principal components. |
fh |
Forecast horizon. |
level |
Nominal coverage probability of a prediction interval. |
alpha |
1 - Nominal coverage probability. |
MCMCiter |
Number of MCMC iterations. |
fmethod |
Univariate time-series forecasting method. |
BC |
If Box-Cox transformation is performed. |
lambda |
If |
The basic idea of the multilevel functional data method is to decompose functions from different sub-populations into an aggregated average, a sex-specific deviation from the aggregated average, a common trend, a sex-specific trend and measurement error. The common and sex-specific trends are modelled by projecting them onto the eigenvectors of the covariance operators of the aggregated and sex-specific centred stochastic processes, respectively.
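A hedged usage sketch with small hypothetical mortality matrices (p ages by n years); in practice the observed female, male and total age-specific mortality rates should be supplied, and the run is wrapped in a Not run block because the MCMC step can be slow.

## Not run: 
p <- 101; n <- 30
mort_female <- matrix(runif(p * n, 1e-5, 1e-2), p, n)   # hypothetical female rates
mort_male <- matrix(runif(p * n, 1e-5, 1e-2), p, n)     # hypothetical male rates
mort_ave <- (mort_female + mort_male) / 2               # hypothetical total rates
fit <- MFDM(mort_female, mort_male, mort_ave, percent_1 = 0.95, percent_2 = 0.95,
            fh = 5, fmethod = "auto_arima", BC = FALSE)
fit$mort_female_fore
## End(Not run)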
first_percent |
Percentage of total variation explained by the first common functional principal component. |
female_percent |
Percentage of total variation explained by the first female functional principal component in the residual. |
male_percent |
Percentage of total variation explained by the first male functional principal component in the residual. |
mort_female_fore |
Forecast female mortality in the original scale. |
mort_male_fore |
Forecast male mortality in the original scale. |
It can be quite time consuming, especially when MCMCiter is large.
Han Lin Shang
C. M. Crainiceanu and J. Goldsmith (2010) "Bayesian functional data analysis using WinBUGS", Journal of Statistical Software, 32(11).
C-Z. Di and C. M. Crainiceanu and B. S. Caffo and N. M. Punjabi (2009) "Multilevel functional principal component analysis", The Annals of Applied Statistics, 3(1), 458-488.
V. Zipunnikov and B. Caffo and D. M. Yousem and C. Davatzikos and B. S. Schwartz and C. Crainiceanu (2015) "Multilevel functional principal component analysis for high-dimensional data", Journal of Computational and Graphical Statistics, 20, 852-873.
H. L. Shang, P. W. F. Smith, J. Bijak, A. Wisniowski (2016) "A multilevel functional data method for forecasting population, with an application to the United Kingdom", International Journal of Forecasting, 32(3), 629-649.
ftsm
, forecast.ftsm
, fplsr
, forecastfplsr
A multilevel functional principal component analysis for performing clustering analysis
MFPCA(y, M = NULL, J = NULL, N = NULL)
y |
A data matrix containing functional responses. Each row contains measurements from a function at a set of grid points, and each column contains measurements of all functions at a particular grid point |
M |
Number of countries |
J |
Number of functional responses in each country |
N |
Number of grid points per function |
K1 |
Number of components at level 1 |
K2 |
Number of components at level 2 |
K3 |
Number of components at level 3 |
lambda1 |
A vector containing all level 1 eigenvalues in non-increasing order |
lambda2 |
A vector containing all level 2 eigenvalues in non-increasing order |
lambda3 |
A vector containing all level 3 eigenvalues in non-increasing order |
phi1 |
A matrix containing all level 1 eigenfunctions. Each row contains an eigenfunction evaluated at the same set of grid points as the input data. The eigenfunctions are in the same order as the corresponding eigenvalues |
phi2 |
A matrix containing all level 2 eigenfunctions. Each row contains an eigenfunction evaluated at the same set of grid points as the input data. The eigenfunctions are in the same order as the corresponding eigenvalues |
phi3 |
A matrix containing all level 3 eigenfunctions. Each row contains an eigenfunction evaluated at the same set of grid points as the input data. The eigenfunctions are in the same order as the corresponding eigenvalues |
scores1 |
A matrix containing estimated level 1 principal component scores. Each row corresponds to the level 1 scores for a particular subject in a cluster. The number of rows is the same as that of the input matrix |
scores2 |
A matrix containing estimated level 2 principal component scores. Each row corresponds to the level 2 scores for a particular subject in a cluster. The number of rows is the same as that of the input matrix |
scores3 |
A matrix containing estimated level 3 principal component scores. Each row corresponds to the level 3 scores for a particular subject in a cluster. The number of rows is the same as that of the input matrix |
mu |
A vector containing the overall mean function |
eta |
A matrix containing the deviation from overall mean function to country-specific mean function. The number of rows is the number of countries |
Rj |
Common trend |
Uij |
Country-specific mean function |
Chen Tang, Yanrong Yang and Han Lin Shang
Clustering the multiple functional time series. The function uses the functional panel data model to cluster different time series into subgroups
mftsc(X, alpha)
X |
A list of sets of smoothed functional time series to be clustered; each element is a p x q matrix, where p is the sample size and q is the number of grid points of each function |
alpha |
A threshold for the adjusted Rand index, used to measure the similarity of the memberships between successive iterations; it can be any value greater than 0.9 |
As an initial step, conventional k-means clustering is performed on the dynamic FPC scores; an iterative membership-updating process is then applied by fitting the MFPCA model.
iteration |
the number of iterations until convergence |
membership |
a list of all the membership matrices at each iteration |
member.final |
the final membership |
Chen Tang, Yanrong Yang and Han Lin Shang
## Not run: 
data(sim_ex_cluster)
cluster_result <- mftsc(X = sim_ex_cluster, alpha = 0.99)
cluster_result$member.final
## End(Not run)
Decomposition by one-way functional median polish.
One_way_median_polish(Y, n_prefectures=51, year=1959:2020, age=0:100)
Y |
The multivariate functional data, which are a matrix with dimension n by 2p, where n is the sample size and p is the dimensionality. |
year |
Vector with the years considered in each population. |
n_prefectures |
Number of prefectures. |
age |
Vector with the ages considered in each year. |
grand_effect |
Grand_effect, a vector of dimension p. |
row_effect |
Row_effect, a matrix of dimension length(row_partition_index) by p. |
Cristian Felipe Jimenez Varon, Ying Sun, Han Lin Shang
C. F. Jimenez Varon, Y. Sun and H. L. Shang (2023) "Forecasting high-dimensional functional time series: Application to sub-national age-specific mortality", arXiv.
Y. Sun and M. G. Genton (2012) "Functional median polish", Journal of Agricultural, Biological, and Environmental Statistics, 17(3), 354-376.
One_way_Residuals
, Two_way_median_polish
, Two_way_Residuals
# The US mortality data 1959-2020, for one population (female)
# and 3 states (New York, California, Illinois).
# First define the parameters.
year = 1959:2020
age = 0:100
n_prefectures = 3
# Load the US data. Make sure it is a matrix.
Y <- all_hmd_female_data
# Compute the functional median polish decomposition.
FMP <- One_way_median_polish(Y, n_prefectures = 3, year = 1959:2020, age = 0:100)
# The results:
## 1. The functional grand effect
FGE <- FMP$grand_effect
## 2. The functional row effect
FRE <- FMP$row_effect
Decomposition of functional time series into deterministic (from functional median polish), and functional residuals
One_way_Residuals(Y, n_prefectures = 51, year = 1959:2020, age = 0:100)
Y |
The multivariate functional data, which are a matrix with dimension n by 2p, where n is the sample size and p is the dimensionality. |
n_prefectures |
Number of prefectures. |
year |
Vector with the years considered in each population. |
age |
Vector with the ages considered in each year. |
A matrix of dimension n by p.
Cristian Felipe Jimenez Varon, Ying Sun, Han Lin Shang
C. F. Jimenez Varon, Y. Sun and H. L. Shang (2023) "Forecasting high-dimensional functional time series: Application to sub-national age-specific mortality", arXiv.
Y. Sun and M. G. Genton (2012) "Functional median polish", Journal of Agricultural, Biological, and Environmental Statistics, 17(3), 354-376.
# The US mortality data 1959-2020, for one population (female)
# and 3 states (New York, California, Illinois).
# First define the parameters.
year = 1959:2020
age = 0:100
n_prefectures = 3
# Load the US data. Make sure it is a matrix.
Y <- all_hmd_female_data
# Compute the functional residuals.
FMP_residuals <- One_way_Residuals(Y, n_prefectures = 3, year = 1959:2020, age = 0:100)
Computes bootstrap or smoothed bootstrap samples based on either independent and identically distributed functional data or functional time series.
pcscorebootstrapdata(dat, bootrep, statistic, bootmethod = c("st", "sm", "mvn", "stiefel", "meboot"), smo)
dat |
An object of class |
bootrep |
Number of bootstrap samples. |
statistic |
Summary statistics. |
bootmethod |
Bootstrap method. When |
smo |
Smoothing parameter. |
We presume that each curve is observed on a common grid of points, so that the raw data set consists of an n by p data matrix, where n denotes the number of curves and p the number of grid points. By applying the singular value decomposition, the decentralized data matrix can be decomposed as U D V', where the crossproducts of U and V with themselves are identity matrices. Holding the mean, D and V fixed at their realized values, there are four re-sampling methods that differ mainly in the way U is re-sampled (a sketch of method (a) follows this list).
(a) Obtain the re-sampled singular column matrix by randomly sampling with replacement from the original principal component scores.
(b) To avoid repeated values appearing in the bootstrapped principal component scores, adopt a smooth bootstrap procedure that adds a white-noise component to the bootstrap samples.
(c) Because the principal component scores asymptotically follow a standard multivariate normal distribution, randomly draw principal component scores from a multivariate normal distribution with the mean vector and covariance matrix of the original principal component scores.
(d) Because the crossproduct of U is the identity matrix, U can be regarded as a point on the Stiefel manifold (the space of matrices with orthonormal columns), so principal component scores can be randomly drawn from the Stiefel manifold.
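A schematic sketch of re-sampling method (a) in base R (not the package's internal code), assuming the curves are stored as a p by n matrix as in the example below, transposed so that the rows of U correspond to observations.

dat <- ElNino_ERSST_region_1and2$y          # assumed p x n matrix of curves
X <- t(dat)                                 # n observations by p grid points
mu <- colMeans(X)
Xc <- sweep(X, 2, mu)                       # decentralized data
sv <- svd(Xc)                               # Xc = U D V'
U_boot <- sv$u[sample(nrow(sv$u), replace = TRUE), , drop = FALSE]  # resample scores
X_boot <- sweep(U_boot %*% diag(sv$d) %*% t(sv$v), 2, mu, FUN = "+")
boot_curves <- t(X_boot)                    # one bootstrap sample, back to p x n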
bootdata |
Bootstrap samples. If the original data matrix is |
meanfunction |
Bootstrap summary statistics. If the original data matrix is |
Han Lin Shang
H. D. Vinod (2004), "Ranking mutual funds using unconventional utility theory and stochastic dominance", Journal of Empirical Finance, 11(3), 353-377.
A. Cuevas, M. Febrero, R. Fraiman (2006), "On the use of the bootstrap for estimating functions with functional data", Computational Statistics and Data Analysis, 51(2), 1063-1074.
D. S. Poskitt and A. Sengarapillai (2013), "Description length and dimensionality reduction in functional data analysis", Computational Statistics and Data Analysis, 58, 98-113.
H. L. Shang (2015), "Re-sampling techniques for estimating the distribution of descriptive statistics of functional data", Communications in Statistics–Simulation and Computation, 44(3), 614-635.
H. L. Shang (2018), "Bootstrap methods for stationary functional time series", Statistics and Computing, 28(1), 1-10.
# Bootstrapping the distribution of a summary statistic of functional data.
boot1 = pcscorebootstrapdata(dat = ElNino_ERSST_region_1and2$y, bootrep = 200, statistic = "mean", bootmethod = "st")
boot2 = pcscorebootstrapdata(dat = ElNino_ERSST_region_1and2$y, bootrep = 200, statistic = "mean", bootmethod = "sm", smo = 0.05)
boot3 = pcscorebootstrapdata(dat = ElNino_ERSST_region_1and2$y, bootrep = 200, statistic = "mean", bootmethod = "mvn")
boot4 = pcscorebootstrapdata(dat = ElNino_ERSST_region_1and2$y, bootrep = 200, statistic = "mean", bootmethod = "stiefel")
boot5 = pcscorebootstrapdata(dat = ElNino_ERSST_region_1and2$y, bootrep = 200, statistic = "mean", bootmethod = "meboot")
When class(x)[1] = ftsm
, plot showing the principal components in the top row of plots and the coefficients in the bottom row of plots.
When class(x)[1] = fm
, plot showing the predictor scores in the top row of plots and the response loadings in the bottom row of plots.
## S3 method for class 'fm'
plot(x, order, xlab1 = x$y$xname, ylab1 = "Principal component", xlab2 = "Time", ylab2 = "Coefficient", mean.lab = "Mean", level.lab = "Level", main.title = "Main effects", interaction.title = "Interaction", basiscol = 1, coeffcol = 1, outlier.col = 2, outlier.pch = 19, outlier.cex = 0.5, ...)
x |
|
order |
Number of principal components to plot. Default is all principal components in a model. |
xlab1 |
x-axis label for principal components. |
xlab2 |
x-axis label for coefficient time series. |
ylab1 |
y-axis label for principal components. |
ylab2 |
y-axis label for coefficient time series. |
mean.lab |
Label for mean component. |
level.lab |
Label for level component. |
main.title |
Title for main effects. |
interaction.title |
Title for interaction terms. |
basiscol |
Colors for principal components if |
coeffcol |
Colors for time series coefficients if |
outlier.col |
Colors for outlying years. |
outlier.pch |
Plotting character for outlying years. |
outlier.cex |
Size of plotting character for outlying years. |
... |
Plotting parameters. |
Function produces a plot.
Rob J Hyndman
R. J. Hyndman and M. S. Ullah (2007) "Robust forecasting of mortality and fertility rates: A functional data approach", Computational Statistics and Data Analysis, 51(10), 4942-4956.
R. J. Hyndman and H. Booth (2008) "Stochastic population forecasts using functional data models for mortality, fertility and migration", International Journal of Forecasting, 24(3), 323-342.
R. J. Hyndman and H. L. Shang (2009) "Forecasting functional time series (with discussion)", Journal of the Korean Statistical Society, 38(3), 199-221.
ftsm
, forecast.ftsm
, residuals.fm
, summary.fm
, plot.fmres
, plot.ftsf
plot(x = ftsm(y = ElNino_ERSST_region_1and2))
Functions to produce a plot of residuals from a fitted functional model.
## S3 method for class 'fmres'
plot(x, type = c("image", "fts", "contour", "filled.contour", "persp"), xlab = "Year", ylab = "Age", zlab = "Residual", ...)
x |
Generated by |
type |
Type of plot to use. Possibilities are |
xlab |
Label for |
ylab |
Label for |
zlab |
Label for |
... |
Plotting parameters. |
Produces a plot.
Rob J Hyndman
ftsm
, forecast.ftsm
, plot.fm
, plot.fmres
, residuals.fm
, summary.fm
# The colorspace package was used to provide a more coherent color option.
plot(residuals(ftsm(y = ElNino_ERSST_region_1and2)), type = "filled.contour",
     xlab = "Month", ylab = "Residual sea surface temperature")
# colorspace package was used to provide a more coherent color option. plot(residuals(ftsm(y = ElNino_ERSST_region_1and2)), type = "filled.contour", xlab = "Month", ylab = "Residual sea surface temperature")
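The other documented values of type can be displayed from the same residual object; for instance, a perspective plot (a minor variation of the example above, using only arguments listed on this help page):
res = residuals(ftsm(y = ElNino_ERSST_region_1and2))
plot(res, type = "persp", xlab = "Year", ylab = "Month", zlab = "Residual")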
Plot fitted model components for a fts object.
## S3 method for class 'ftsf'
plot(x, plot.type = c("function", "components", "variance"), components,
  xlab1 = fit$y$xname, ylab1 = "Basis function", xlab2 = "Time", ylab2 = "Coefficient",
  mean.lab = "Mean", level.lab = "Level", main.title = "Main effects",
  interaction.title = "Interaction", vcol = 1:3, shadecols = 7, fcol = 4,
  basiscol = 1, coeffcol = 1, outlier.col = 2, outlier.pch = 19, outlier.cex = 0.5, ...)
x |
Output from forecast.ftsm. |
plot.type |
Type of plot: "function", "components" or "variance". |
components |
Number of principal components. |
xlab1 |
x-axis label for principal components. |
xlab2 |
x-axis label for coefficient time series. |
ylab1 |
y-axis label for principal components. |
ylab2 |
y-axis label for coefficient time series. |
mean.lab |
Label for mean component. |
level.lab |
Label for level component. |
main.title |
Title for main effects. |
interaction.title |
Title for interaction terms. |
vcol |
Colors to use if plot.type = "variance". |
shadecols |
Color for shading of prediction intervals when plot.type = "function". |
fcol |
Color of point forecasts when plot.type = "function". |
basiscol |
Colors for principal components if plot.type = "components". |
coeffcol |
Colors for time series coefficients if plot.type = "components". |
outlier.col |
Colors for outlying years. |
outlier.pch |
Plotting character for outlying years. |
outlier.cex |
Size of plotting character for outlying years. |
... |
Plotting parameters. |
When plot.type = "function"
, it produces a plot of the forecast functions;
When plot.type = "components"
, it produces a plot of the principla components and coefficients with forecasts and prediction intervals for each coefficient;
When plot.type = "variance"
, it produces a plot of the variance components.
Function produces a plot.
Rob J Hyndman
R. J. Hyndman and M. S. Ullah (2007) "Robust forecasting of mortality and fertility rates: A functional data approach", Computational Statistics and Data Analysis, 51(10), 4942-4956.
R. J. Hyndman and H. Booth (2008) "Stochastic population forecasts using functional data models for mortality, fertility and migration", International Journal of Forecasting, 24(3), 323-342.
R. J. Hyndman and H. L. Shang (2009) "Forecasting functional time series (with discussion)", Journal of the Korean Statistical Society, 38(3), 199-221.
H. L. Shang, H. Booth and R. J. Hyndman (2011) "Point and interval forecasts of mortality rates and life expectancy: A comparison of ten principal component methods", Demographic Research, 25(5), 173-214.
ftsm, plot.fm, plot.fmres, residuals.fm, summary.fm
plot(x = forecast(object = ftsm(y = ElNino_ERSST_region_1and2)))
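The remaining documented values of plot.type can be applied to the same forecast object; a brief illustration using only arguments documented above (components is supplied explicitly, since the usage shows no default for it):
fc = forecast(object = ftsm(y = ElNino_ERSST_region_1and2))
plot(x = fc, plot.type = "components", components = 2)
plot(x = fc, plot.type = "variance")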
Plot showing the basis functions in the top row of plots and the coefficients in the bottom row of plots.
## S3 method for class 'ftsm'
plot(x, components, components.start = 0, xlab1 = x$y$xname, ylab1 = "Basis function",
  xlab2 = "Time", ylab2 = "Coefficient", mean.lab = "Mean", level.lab = "Level",
  main.title = "Main effects", interaction.title = "Interaction", basiscol = 1,
  coeffcol = 1, outlier.col = 2, outlier.pch = 19, outlier.cex = 0.5, ...)
x |
Output from ftsm. |
components |
Number of principal components to plot. |
components.start |
Plotting specified component. |
xlab1 |
x-axis label for basis functions. |
xlab2 |
x-axis label for coefficient time series. |
ylab1 |
y-axis label for basis functions. |
ylab2 |
y-axis label for coefficient time series. |
mean.lab |
Label for mean component. |
level.lab |
Label for level component. |
main.title |
Title for main effects. |
interaction.title |
Title for interaction terms. |
basiscol |
Colors for basis functions if plot.type="components". |
coeffcol |
Colors for time series coefficients if plot.type="components". |
outlier.col |
Colour for outlying years. |
outlier.pch |
Plotting character for outlying years. |
outlier.cex |
Size of plotting character for outlying years. |
... |
Plotting parameters. |
None. Function produces a plot.
Rob J Hyndman
R. J. Hyndman and M. S. Ullah (2007) "Robust forecasting of mortality and fertility rates: A functional data approach", Computational Statistics and Data Analysis, 51(10), 4942-4956.
R. J. Hyndman and H. L. Shang (2009) "Forecasting functional time series (with discussion)", Journal of the Korean Statistical Society, 38(3), 199-221.
forecast.ftsm, ftsm, plot.fm, plot.ftsf, residuals.fm, summary.fm
# plot different principal components.
plot.ftsm(ftsm(y = ElNino_ERSST_region_1and2, order = 2), components = 2)
Plot showing the basis functions of the predictors in the top row, followed by the basis functions of the responses in the second row, then the coefficients in the bottom row of plots.
plotfplsr(x, xlab1 = x$ypred$xname, ylab1 = "Basis function", xlab2 = "Time", ylab2 = "Coefficient", mean.lab = "Mean", interaction.title = "Interaction")
x |
Output from fplsr. |
xlab1 |
x-axis label for basis functions. |
ylab1 |
y-axis label for basis functions. |
xlab2 |
x-axis label for coefficient time series. |
ylab2 |
y-axis label for coefficient time series. |
mean.lab |
Label for mean component. |
interaction.title |
Title for interaction terms. |
None. Function produces a plot.
Han Lin Shang
R. J. Hyndman and M. S. Ullah (2007) "Robust forecasting of mortality and fertility rates: A functional data approach", Computational Statistics and Data Analysis, 51(10), 4942-4956.
R. J. Hyndman and H. L. Shang (2009) "Forecasting functional time series (with discussion)", Journal of the Korean Statistical Society, 38(3), 199-221.
forecast.ftsm, ftsm, plot.fm, plot.ftsf, residuals.fm, summary.fm
# Fit the data by the functional partial least squares.
ausfplsr = fplsr(data = ElNino_ERSST_region_1and2, order = 2)
plotfplsr(x = ausfplsr)
This data set consists of half-hourly measurements of the concentration (measured in ug/m3) of particulate matter with an aerodynamic diameter of less than 10um, abbreviated PM10, in ambient air taken at Graz-Mitte, Austria, from October 1, 2010 until March 31, 2011. To stabilise the variance, a square-root transformation can be applied to the data.
data(pm_10_GR)
As epidemiological and toxicological studies have pointed to negative health effects, European Union (EU) regulation sets pollution standards for the level of the concentration. Policy makers have to ensure compliance with these EU rules and need reliable statistical tools to determine, and justify to the public, appropriate measures such as partial traffic regulation (see Stadlober, Hormann and Pfeiler, 2008).
Thanks to Professor Siegfried Hormann for providing this data set. The original data source is https://www.umwelt.steiermark.at/cms/
A. Aue, D. D. Norinho, S. Hormann (2015) "On the prediction of stationary functional time series", Journal of the American Statistical Association, 110(509), 378-392.
E. Stadlober, S. Hormann, B. Pfeiler (2008) "Quality and performance of a PM10 daily forecasting model", Atmospheric Environment, 42, 1098-1109.
S. Hormann, B. Pfeiler, E. Stadlober (2005) "Analysis and prediction of particulate matter PM10 for the winter season in Graz", Austrian Journal of Statistics, 34(4), 307-326.
H. L. Shang (2017) "Functional time series forecasting with dynamic updating: An application to intraday particulate matter concentration", Econometrics and Statistics, 1, 184-200.
plot(pm_10_GR)
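As noted above, a square-root transformation can be used to stabilise the variance. A minimal sketch of that transformation, assuming pm_10_GR is a standard fts object (grid points in $x, data matrix in $y) and using the fts() constructor from the rainbow package that ftsa builds on; the result is comparable to the pm_10_GR_sqrt object used in the stationarity-test example elsewhere in this manual.
pm_10_sqrt = fts(x = pm_10_GR$x, y = sqrt(pm_10_GR$y), xname = pm_10_GR$xname, yname = "Square root of PM10")
plot(pm_10_sqrt)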
Generic functions for quantile.
quantile(x, ...)
x |
Numeric vector whose sample quantiles are wanted, or an object of a class for which a method has been defined. |
... |
Arguments passed to specific methods. |
Refer to specific methods. For numeric vectors, see the quantile function in the stats package.
Han Lin Shang
Computes quantiles of functional time series at each variable.
## S3 method for class 'fts'
quantile(x, probs, ...)
x |
An object of class fts. |
probs |
Quantile percentages. |
... |
Other arguments. |
Returns the quantiles at each variable.
Han Lin Shang
mean.fts, median.fts, var.fts, sd.fts
quantile(x = ElNino_ERSST_region_1and2)
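A further example supplying explicit probabilities, assuming probs is given as proportions in [0, 1], as for the quantile function in the stats package:
quantile(x = ElNino_ERSST_region_1and2, probs = c(0.25, 0.5, 0.75))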
After fitting a functional model, it is useful to inspect the residuals. This function extracts the relevant information from the fit object and puts it in a form suitable for plotting with image, persp, contour, filled.contour, etc.
## S3 method for class 'fm'
residuals(object, ...)
object |
Output from ftsm or fplsr. |
... |
Other arguments. |
Produces an object of class “fmres” containing the residuals from the model.
Rob J Hyndman
B. Erbas and R. J. Hyndman and D. M. Gertig (2007) "Forecasting age-specific breast cancer mortality using functional data model", Statistics in Medicine, 26(2), 458-470.
R. J. Hyndman and M. S. Ullah (2007) "Robust forecasting of mortality and fertility rates: A functional data approach", Computational Statistics and Data Analysis, 51(10), 4942-4956.
R. J. Hyndman and H. Booth (2008) "Stochastic population forecasts using functional data models for mortality, fertility and migration", International Journal of Forecasting, 24(3), 323-342.
H. L. Shang (2012) "Point and interval forecasts of age-specific fertility rates: a comparison of functional principal component methods", Journal of Population Research, 29(3), 249-267.
H. L. Shang (2012) "Point and interval forecasts of age-specific life expectancies: a model averaging", Demographic Research, 27, 593-644.
ftsm, forecast.ftsm, summary.fm, plot.fm, plot.fmres
plot(residuals(object = ftsm(y = ElNino_ERSST_region_1and2)), xlab = "Year", ylab = "Month")
Generic functions for standard deviation.
sd(...)
... |
Arguments passed to specific methods. |
The sd function in the stats package is replaced by sd.default. Refer to specific methods. For numeric vectors, see the sd function in the stats package.
Han Lin Shang
Computes standard deviation of functional time series at each variable.
## S3 method for class 'fts'
sd(x, method = c("coordinate", "FM", "mode", "RP", "RPD", "radius"), trim = 0.25,
  alpha, weight, ...)
x |
An object of class fts. |
method |
Method for computing the standard deviation. |
trim |
Percentage of trimming. |
alpha |
Tuning parameter when method = "radius". |
weight |
Hard thresholding or soft thresholding. |
... |
Other arguments. |
If method = "coordinate"
, it computes coordinate-wise standard deviation functions.
If method = "FM"
, it computes the standard deviation functions of trimmed functional data ordered by the functional depth of
Fraiman and Muniz (2001).
If method = "mode"
, it computes the standard deviation functions of trimmed functional data ordered by -modal functional
depth.
If method = "RP"
, it computes the standard deviation functions of trimmed functional data ordered by random projection
depth.
If method = "RPD"
, it computes the standard deviation functions of trimmed functional data ordered by random projection with
derivative depth.
If method = "radius"
, it computes the standard deviation function of trimmed functional data ordered by the notion of alpha-radius.
A list containing x = variables and y = standard deviation rates.
Han Lin Shang
O. Hossjer and C. Croux (1995) "Generalized univariate signed rank statistics for testing and estimating a multivariate location parameter", Nonparametric Statistics, 4(3), 293-308.
A. Cuevas and M. Febrero and R. Fraiman (2006) "On the use of bootstrap for estimating functions with functional data", Computational Statistics and Data Analysis, 51(2), 1063-1074.
A. Cuevas and M. Febrero and R. Fraiman (2007), "Robust estimation and classification for functional data via projection-based depth notions", Computational Statistics, 22(3), 481-496.
M. Febrero and P. Galeano and W. Gonzalez-Manteiga (2007) "A functional analysis of NOx levels: location and scale estimation and outlier detection", Computational Statistics, 22(3), 411-427.
M. Febrero and P. Galeano and W. Gonzalez-Manteiga (2008) "Outlier detection in functional data by depth measures, with application to identify abnormal NOx levels", Environmetrics, 19(4), 331-345.
M. Febrero and P. Galeano and W. Gonzalez-Manteiga (2010) "Measures of influence for the functional linear model with scalar response", Journal of Multivariate Analysis, 101(2), 327-339.
J. A. Cuesta-Albertos and A. Nieto-Reyes (2010) "Functional classification and the random Tukey depth. Practical issues", Combining Soft Computing and Statistical Methods in Data Analysis, Advances in Intelligent and Soft Computing, 77, 123-130.
D. Gervini (2012) "Outlier detection and trimmed estimation in general functional spaces", Statistica Sinica, 22(4), 1639-1660.
mean.fts, median.fts, var.fts, quantile.fts
# Fraiman-Muniz depth was arguably the oldest functional depth.
sd(x = ElNino_ERSST_region_1and2, method = "FM")
sd(x = ElNino_ERSST_region_1and2, method = "coordinate")
sd(x = ElNino_ERSST_region_1and2, method = "mode")
sd(x = ElNino_ERSST_region_1and2, method = "RP")
sd(x = ElNino_ERSST_region_1and2, method = "RPD")
sd(x = ElNino_ERSST_region_1and2, method = "radius", alpha = 0.5, weight = "hard")
sd(x = ElNino_ERSST_region_1and2, method = "radius", alpha = 0.5, weight = "soft")
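The trim argument documented above sets the proportion of curves trimmed before the depth-based estimate is computed (0.25 by default); for example, with lighter trimming:
sd(x = ElNino_ERSST_region_1and2, method = "FM", trim = 0.1)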
We generate two groups of functional time series. For each series i in {1, ..., m} in a given cluster c, with c in {1, 2}, the t-th function, t in {1, ..., T}, is generated as the sum of a cluster-specific mean function, a common-time-trend component, a residual component and measurement error.
data("sim_ex_cluster")
data("sim_ex_cluster")
The mean functions of the two clusters differ. For both clusters, one set of score variates is generated from an autoregressive process of order 1 with parameter 0.7, while the two remaining sets of score variates are generated from independent and identically distributed random variables with two different distributions.
The two basis functions (indexed by k in {1, 2}) for the common-time trend of the first cluster differ from the corresponding two basis functions for the common-time trend of the second cluster.
Similarly, the two basis functions (indexed by k in {1, 2}) for the residual component of the first cluster differ from the corresponding two basis functions for the residual component of the second cluster.
The measurement error at each continuum point x is generated from independent and identically distributed random variables.
data(sim_ex_cluster)
Fitting a parametric skewed t distribution using the method of Fernandez and Steel (1998).
skew_t_fun(data, gridpoints, M = 5001)
data |
a data matrix of dimension |
gridpoints |
Grid points |
M |
Number of grid points. |
1) Fit a skewed t distribution to the data and obtain four latent parameters;
2) Transform the four latent parameters so that they are unconstrained;
3) Fit a vector autoregressive model to the transformed latent parameters;
4) Obtain their forecasts and back-transform them to the original scales;
5) Via the skewed t distribution from Step 1), obtain the density forecast using the forecast latent parameters.
A sketch of these steps appears after the example below.
m |
Grid points within data range |
skewed_t_den_fore |
Density forecasts via a skewed t distribution |
This is a parametric approach for fitting and forecasting density.
Han Lin Shang
C. Fernandez and M. F. J. Steel (1998) "On Bayesian modeling of fat tails and skewness", Journal of the American Statistical Association: Theory and Methods, 93(441), 359-371.
CoDa_FPCA, Horta_Ziegelmann_FPCA, LQDT_FPCA
skew_t_fun(DJI_return)
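The following is a minimal sketch of the five steps described above, not the function's internal code. It assumes the fGarch package (for the skewed t density dsstd) and the vars package (for the vector autoregression), and it treats DJI_return as a numeric matrix whose columns index the time periods; the helper fit_skew_t is purely illustrative.
library(fGarch)   # dsstd(): density of the skewed (standardised) t distribution
library(vars)     # VAR(), predict(): vector autoregression and its forecasts
# Steps 1-2: fit the four skew-t parameters by maximum likelihood, working directly
# on an unconstrained scale (mean, log sd, log(nu - 2), log xi).
fit_skew_t = function(x) {
  negll = function(par) {
    -sum(log(fGarch::dsstd(x, mean = par[1], sd = exp(par[2]),
                           nu = exp(par[3]) + 2, xi = exp(par[4]))))
  }
  optim(c(mean(x), log(sd(x)), log(5), 0), negll)$par
}
theta = t(apply(DJI_return, 2, fit_skew_t))
colnames(theta) = c("mu", "log_sd", "log_nu", "log_xi")
# Steps 3-4: fit a VAR(1) to the transformed parameters, forecast one step ahead,
# then back-transform the forecasts to the original scales.
fc = predict(vars::VAR(theta, p = 1), n.ahead = 1)
par_fc = sapply(fc$fcst, function(z) z[1, "fcst"])
par_fc = c(par_fc[1], exp(par_fc[2]), exp(par_fc[3]) + 2, exp(par_fc[4]))
# Step 5: evaluate the density forecast on a grid spanning the data range (M = 5001).
grid = seq(min(DJI_return), max(DJI_return), length.out = 5001)
den_fore = fGarch::dsstd(grid, mean = par_fc[1], sd = par_fc[2],
                         nu = par_fc[3], xi = par_fc[4])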
Detecting the optimal stopping time for the glue curing of wood panels in an automatic process environment.
stop_time_detect(data, forecasting_method = c("ets", "arima", "rw"))
data |
An object of class fts. |
forecasting_method |
A univariate time series forecasting method: "ets", "arima" or "rw". |
break_points_strucchange |
Breakpoints detected by the regression approach |
break_points_ecp |
Breakpoints detected by the distance-based approach |
err_forward |
Forward integrated squared forecast errors |
err_backward |
Backward integrated squared forecast errors (ISFEs) |
ncomp_select_forward |
Number of components selected by the eigenvalue ratio tests based on the forward ISFEs |
ncomp_select_backward |
Number of components selected by the eigenvalue ratio tests based on the backward ISFEs |
Han Lin Shang
P. Bekhta, G. Ortynska and J. Sedliacik (2014) "Properties of modified phenol-formaldehyde adhesive for plywood panels manufactured from high moisture content veneer", Drvna Industrija, 65(4), 293-301.
For detecting the optimal stopping time, we simulate a curve time series that follows a functional autoregression of order 1, with a breakpoint in the middle point of the entire sample.
stop_time_sim_data(sample_size, omega, seed_number)
sample_size |
Number of curves |
omega |
Noise level |
seed_number |
Random seed number |
An object of class fts
Han Lin Shang
stop_time_sim_data(sample_size = 401, omega = 0.1, seed_number = 123)
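A hedged follow-up combining the two stopping-time functions documented above: the simulated curves are passed to stop_time_detect() with one of its documented forecasting methods, and the breakpoint estimates are read from the value components listed in that help entry.
sim_curves = stop_time_sim_data(sample_size = 401, omega = 0.1, seed_number = 123)
detect_out = stop_time_detect(data = sim_curves, forecasting_method = "ets")
detect_out$break_points_strucchange
detect_out$break_points_ecp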
Summarizes a basis function model fitted to a functional time series. It returns various measures of goodness-of-fit.
## S3 method for class 'fm'
summary(object, ...)
object |
Output from ftsm or fplsr. |
... |
Other arguments. |
None.
Rob J Hyndman
ftsm, forecast.ftsm, residuals.fm, plot.fm, plot.fmres
summary(object = ftsm(y = ElNino_ERSST_region_1and2))
A hypothesis test for stationarity of functional time series.
T_stationary(sample, L = 49, J = 500, MC_rep = 1000, cumulative_var = .90, Ker1 = FALSE, Ker2 = TRUE, h = ncol(sample)^.5, pivotal = FALSE, use_table = FALSE, significance)
sample |
A matrix of discretised curves of dimension (p by n), where p represents the dimensionality and n represents sample size. |
L |
Number of Fourier basis functions. |
J |
Truncation level used to approximate the distribution of the squared integrals of Brownian bridges that appear in the limit distribution. |
MC_rep |
Number of replications. |
cumulative_var |
Amount of variance explained. |
Ker1 |
Flat top kernel in (4.1) of Horvath et al. (2014). |
Ker2 |
Flat top kernel in (7) of Politis (2003). |
h |
Kernel bandwidth. |
pivotal |
If pivotal = TRUE, the pivotal test statistic is used. |
use_table |
If use_table = TRUE, critical values are taken from a stored table at the specified significance level rather than computed by simulation. |
significance |
Level of significance. Possibilities are “10%”, “5%”, “1%”. |
As in traditional (scalar and vector) time series analysis, many inferential procedures for functional time series assume stationarity. Stationarity is required for functional dynamic regression models, for bootstrap and resampling methods for functional time series and for the functional analysis of volatility.
p-value |
p-value of the hypothesis test of stationarity. |
Greg Rice and Han Lin Shang
L. Horvath and P. Kokoszka (2012) Inference for Functional Data with Applications, Springer, New York.
L. Horvath, P. Kokoszka, G. Rice (2014) "Testing stationarity of functional time series", Journal of Econometrics, 179(1), 66-82.
D. N. Politis (2003) "Adaptive bandwidth choice", Journal of Nonparametric Statistics, 15(4-5), 517-533.
A. Aue, G. Rice, O. Sönmez (2018) "Detecting and dating structural breaks in functional data without dimension reduction", Journal of the Royal Statistical Society: Series B, 80(3), 509-529.
result = T_stationary(sample = pm_10_GR_sqrt$y)
result_pivotal = T_stationary(sample = pm_10_GR_sqrt$y, J = 100, MC_rep = 5000, h = 20, pivotal = TRUE)
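A further variant using the table-based critical values documented above; this is a sketch of the intended usage, assuming (as is usual for tabulated critical values) that the pivotal statistic is combined with use_table, and significance must be one of the three listed levels.
result_table = T_stationary(sample = pm_10_GR_sqrt$y, pivotal = TRUE, use_table = TRUE, significance = "5%")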
Decomposition by two-way functional median polish
Two_way_median_polish(Y, year=1959:2020, age=0:100, n_prefectures=51, n_populations=2)
Y |
A matrix with dimension n by 2p. The functional data. |
year |
Vector with the years considered in each population. |
n_prefectures |
Number of prefectures |
age |
Vector with the ages considered in each year. |
n_populations |
Number of populations. |
grand_effect |
grand_effect, a vector of dimension p |
row_effect |
row_effect, a matrix of dimension length(row_partition_index) by p. |
col_effect |
col_effect, a matrix of dimension length(column_partition_index) by p |
Cristian Felipe Jimenez Varon, Ying Sun, Han Lin Shang
C. F. Jimenez Varon, Y. Sun and H. L. Shang (2023) "Forecasting high-dimensional functional time series: Application to sub-national age-specific mortality".
Y. Sun and M. G. Genton (2012) "Functional median polish", Journal of Agricultural, Biological, and Environmental Statistics, 17(3), 354-376.
# The US mortality data 1959-2020 for two populations and three states
# (New York, California, Illinois).
# Compute the functional median polish decomposition.
FMP = Two_way_median_polish(cbind(all_hmd_male_data, all_hmd_female_data),
  n_prefectures = 3, year = 1959:2020, age = 0:100, n_populations = 2)
## 1. The functional grand effect
FGE = FMP$grand_effect
## 2. The functional row effect
FRE = FMP$row_effect
## 3. The functional column effect
FCE = FMP$col_effect
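A quick follow-up check that the returned effects have the dimensions stated in the value section above (here p = 101 ages, from 0 to 100):
length(FGE)   # grand effect: a vector of dimension p
dim(FRE)      # row effects: a matrix with p columns
dim(FCE)      # column effects: a matrix with p columns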
Decomposition of a functional time series into deterministic components (from the functional median polish) and time-varying components (functional residuals).
Two_way_Residuals(Y, n_prefectures, year, age, n_populations)
Y |
A matrix with dimension n by 2p. The functional data |
year |
Vector with the years considered in each population |
n_prefectures |
Number of prefectures |
age |
Vector with the ages considered in each year |
n_populations |
Number of populations |
residuals1 |
A matrix with dimension n by p |
residuals2 |
A matrix with dimension n by p |
rd |
A logical vector of length two confirming that the deterministic and time-varying components sum up to the original data. |
R |
A matrix with the same dimension as Y. This represents the time-varying component in the decomposition. |
Fixed_comp |
A matrix with the same dimension as Y. This represents the deterministic component in the decomposition. |
Cristian Felipe Jimenez Varon, Ying Sun, Han Lin Shang
C. F. Jimenez Varon, Y. Sun and H. L. Shang (2023) "Forecasting high-dimensional functional time series: Application to sub-national age-specific mortality".
Y. Sun and M. G. Genton (2012) "Functional median polish", Journal of Agricultural, Biological, and Environmental Statistics, 17(3), 354-376.
# The US mortality data 1959-2020, for two populations
# and three states (New York, California, Illinois).
# Column binds the data from both populations.
Y = cbind(all_hmd_male_data, all_hmd_female_data)
# Decompose FTS into deterministic (from functional median polish)
# and time-varying components (functional residuals).
FMP_residuals <- Two_way_Residuals(Y, n_prefectures = 3, year = 1959:2020,
  age = 0:100, n_populations = 2)
# The results
## 1. The functional residuals from population 1
Residuals_pop_1 = FMP_residuals$residuals1
## 2. The functional residuals from population 2
Residuals_pop_2 = FMP_residuals$residuals2
## 3. A logic vector whose components indicate whether the sum of deterministic
## and time-varying components recover the original FTS.
Construct_data = FMP_residuals$rd
## 4. Time-varying components for all the populations: the functional residuals
All_pop_functional_residuals <- FMP_residuals$R
## 5. The deterministic components from the functional median polish decomposition
deterministic_comp <- FMP_residuals$Fixed_comp
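Since rd is documented as a logical vector confirming that the decomposition adds back to the data, a one-line sanity check can follow the example; the commented line shows the assumed additive relationship between the two returned components.
all(FMP_residuals$rd)   # TRUE if the deterministic and time-varying parts recover Y
# reconstructed = FMP_residuals$Fixed_comp + FMP_residuals$R   # assumed additive reconstruction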
Decomposition of a functional time series into deterministic components (from a functional analysis of variance fitted by means) and time-varying components (functional residuals).
Two_way_Residuals_means(data_pop1, data_pop2, year, age, n_prefectures, n_populations)
data_pop1 |
A p by n matrix |
data_pop2 |
A p by n matrix |
year |
Vector with the years considered in each population. |
n_prefectures |
Number of prefectures |
age |
Vector with the ages considered in each year. |
n_populations |
Number of populations. |
residuals1 |
A matrix with dimension n by p. |
residuals2 |
A matrix with dimension n by p. |
rd |
A logical vector of length two confirming that the deterministic and time-varying components sum up to the original data. |
R |
A matrix of dimension n by 2p. This represents the time-varying component in the decomposition. |
Fixed_comp |
A matrix of dimension n by 2p. This represents the deterministic component in the decomposition. |
Cristian Felipe Jimenez Varon, Ying Sun, Han Lin Shang
C. F. Jimenez Varon, Y. Sun and H. L. Shang (2023) "Forecasting high-dimensional functional time series: Application to sub-national age-specific mortality".
J. Ramsay and B. Silverman (2006) Functional Data Analysis, Springer Series in Statistics, Springer, New York, Chapter 13.
# The US mortality data 1959-2020, for two populations
# and three states (New York, California, Illinois).
# Compute the functional ANOVA decomposition fitted by means.
FANOVA_means_residuals <- Two_way_Residuals_means(data_pop1 = t(all_hmd_male_data),
  data_pop2 = t(all_hmd_female_data), year = 1959:2020, age = 0:100,
  n_prefectures = 3, n_populations = 2)
# The results
## 1. The functional residuals from population 1
Residuals_pop_1 = FANOVA_means_residuals$residuals1
## 2. The functional residuals from population 2
Residuals_pop_2 = FANOVA_means_residuals$residuals2
## 3. A logic vector whose components indicate whether the sum of deterministic
## and time-varying components recover the original FTS.
Construct_data = FANOVA_means_residuals$rd
## 4. Time-varying components for all the populations: the functional residuals
All_pop_functional_residuals <- FANOVA_means_residuals$R
## 5. The deterministic components from the functional ANOVA decomposition
deterministic_comp <- FANOVA_means_residuals$Fixed_comp
Generic functions for variance.
var(...)
... |
Arguments passed to specific methods. |
The var function in the stats package is replaced by var.default. Refer to specific methods. For numeric vectors, see the var function in the stats package.
Rob J Hyndman and Han Lin Shang
Computes variance functions of functional time series at each variable.
## S3 method for class 'fts'
var(x, method = c("coordinate", "FM", "mode", "RP", "RPD", "radius"), trim = 0.25,
  alpha, weight, ...)
x |
An object of class fts. |
method |
Method for computing the variance. |
trim |
Percentage of trimming. |
alpha |
Tuning parameter when method = "radius". |
weight |
Hard thresholding or soft thresholding. |
... |
Other arguments. |
If method = "coordinate"
, it computes coordinate-wise variance.
If method = "FM"
, it computes the variance of trimmed functional data ordered by the functional depth of Fraiman and Muniz (2001).
If method = "mode"
, it computes the variance of trimmed functional data ordered by -modal functional depth.
If method = "RP"
, it computes the variance of trimmed functional data ordered by random projection depth.
If method = "RPD"
, it computes the variance of trimmed functional data ordered by random projection derivative depth.
If method = "radius"
, it computes the standard deviation function of trimmed functional data ordered by the notion of alpha-radius.
A list containing x = variables and y = variance rates.
Han Lin Shang
O. Hossjer and C. Croux (1995) "Generalized univariate signed rank statistics for testing and estimating a multivariate location parameter", Nonparametric Statistics, 4(3), 293-308.
A. Cuevas and M. Febrero and R. Fraiman (2006) "On the use of bootstrap for estimating functions with functional data", Computational Statistics and Data Analysis, 51(2), 1063-1074.
A. Cuevas and M. Febrero and R. Fraiman (2007), "Robust estimation and classification for functional data via projection-based depth notions", Computational Statistics, 22(3), 481-496.
M. Febrero and P. Galeano and W. Gonzalez-Manteiga (2007) "A functional analysis of NOx levels: location and scale estimation and outlier detection", Computational Statistics, 22(3), 411-427.
M. Febrero and P. Galeano and W. Gonzalez-Manteiga (2008) "Outlier detection in functional data by depth measures, with application to identify abnormal NOx levels", Environmetrics, 19(4), 331-345.
M. Febrero and P. Galeano and W. Gonzalez-Manteiga (2010) "Measures of influence for the functional linear model with scalar response", Journal of Multivariate Analysis, 101(2), 327-339.
J. A. Cuesta-Albertos and A. Nieto-Reyes (2010) "Functional classification and the random Tukey depth. Practical issues", Combining Soft Computing and Statistical Methods in Data Analysis, Advances in Intelligent and Soft Computing, 77, 123-130.
D. Gervini (2012) "Outlier detection and trimmed estimation in general functional spaces", Statistica Sinica, 22(4), 1639-1660.
mean.fts, median.fts, sd.fts, quantile.fts
# Fraiman-Muniz depth was arguably the oldest functional depth.
var(x = ElNino_ERSST_region_1and2, method = "FM")
var(x = ElNino_ERSST_region_1and2, method = "coordinate")
var(x = ElNino_ERSST_region_1and2, method = "mode")
var(x = ElNino_ERSST_region_1and2, method = "RP")
var(x = ElNino_ERSST_region_1and2, method = "RPD")
var(x = ElNino_ERSST_region_1and2, method = "radius", alpha = 0.5, weight = "hard")
var(x = ElNino_ERSST_region_1and2, method = "radius", alpha = 0.5, weight = "soft")
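As a sanity check implied by the definitions above (both var.fts and sd.fts return a list with x = variables and y = the pointwise estimates), the coordinate-wise standard deviation should equal the square root of the coordinate-wise variance:
v = var(x = ElNino_ERSST_region_1and2, method = "coordinate")
s = sd(x = ElNino_ERSST_region_1and2, method = "coordinate")
all.equal(as.numeric(s$y), sqrt(as.numeric(v$y)))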