| Title: | High-Dimensional Functional Time Series Analysis |
|---|---|
| Description: | Offers methods for visualising, modelling, and forecasting high-dimensional functional time series, also known as functional panel data. Documentation about 'hdftsa' is initially provided via the paper by Cristian F. Jimenez-Varon, Ying Sun and Han Lin Shang (2024, Journal of Computational and Graphical Statistics). |
| Authors: | Han Lin Shang [aut, cre] (ORCID: <https://orcid.org/0000-0003-1769-6430>) |
| Maintainer: | Han Lin Shang <[email protected]> |
| License: | GPL-3 |
| Version: | 1.1 |
| Built: | 2026-06-01 07:14:01 UTC |
| Source: | https://github.com/cran/hdftsa |
Offers methods for visualising, modelling, and forecasting high-dimensional functional time series, also known as functional panel data. Documentation for 'hdftsa' is provided in the paper by Cristian F. Jimenez-Varon, Ying Sun, and Han Lin Shang (2024, Journal of Computational and Graphical Statistics).
D. Li, R. Li and H. L. Shang (2024) Detection and estimation of structural breaks in high-dimensional functional time series, Annals of Statistics, 52(4), 1716-1740.
C. F. Jimenez-Varon, Y. Sun and H. L. Shang (2024) Forecasting high-dimensional functional time series: Application to sub-national age-specific mortality, Journal of Computational and Graphical Statistics, 33(4), 1160-1174.
C. F. Jimenez-Varon, Y. Sun and H. L. Shang (2025) Forecasting density-valued functional panel data, Australian and New Zealand Journal of Statistics, 67(3), 401-415.
H. Wang, T. Guan and H. L. Shang (2025) Interpretable additive model for analyzing functional panel data, Journal of Multivariate Analysis, in press.
H. L. Shang (2025) Forecasting a time series of Lorenz curves: one-way functional analysis of variance, Journal of Applied Statistics, 52(15), 2924-2940.
C. Tang, H. L. Shang, Y. Yang and Y. Yang (2025) Forecasting high-dimensional functional time series with dual-factor structures, Journal of the Royal Statistical Society: Series A, in press.
C. Leng, D. Li, H. L. Shang and Y. Xia (2026) Covariance function estimation for high-dimensional functional time series with dual factor structures, Journal of Business and Economic Statistics, in press.
H. L. Shang (2026) Conformal prediction for high-dimensional functional time series: Applications to subnational mortality. https://arxiv.org/abs/2603.10674.
H. L. Shang and C. F. Jimenez-Varon (2026) Interpretable models for forecasting high-dimensional functional time series.
Functional time series of log mortality rates for the female population in the US, one for each of 3 states. Each functional time series covers the ages from 0 to 100+.
    data("all_hmd_female_data")
An n x p matrix with n = 186 observations at p = 101 ages, from 0 to 100+.
The data generated corresponds to the FTS for the female US log-mortality. The matrix contains 186 FTS stacked by rows. They correspond to 62 (number of years) times 3 (states). Each FTS contains 101 functional values.
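The stacked layout described above can be unpacked into one block per state. The following is a minimal sketch on a simulated matrix of the same shape. It assumes, hypothetically, that the 62 yearly curves of each state occupy consecutive rows; the row order of the actual data should be checked before relying on this.

```r
# Simulated stand-in with the documented shape:
# 186 rows (62 years x 3 states) and 101 ages.
set.seed(1)
n_year <- 62
n_state <- 3
n_age <- 101
stacked <- matrix(rnorm(n_year * n_state * n_age),
                  nrow = n_year * n_state, ncol = n_age)

# Assumption: rows 1-62 belong to state 1, rows 63-124 to state 2, and so on.
state_id <- rep(seq_len(n_state), each = n_year)
by_state <- lapply(split(seq_len(nrow(stacked)), state_id),
                   function(rows) stacked[rows, , drop = FALSE])

# Each block is one functional time series: 62 curves observed at 101 ages.
sapply(by_state, dim)
```

If the rows turn out to be ordered year by year instead, `rep(seq_len(n_state), times = n_year)` would be the matching index.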
United States Mortality Database (2023). University of California, Berkeley (USA). Department of Demography at the University of California, Berkeley. Available at usa.mortality.org (data downloaded on March 15, 2023).
    data(all_hmd_female_data)
Functional time series of log mortality rates for the male population in the US, one for each of 3 states. Each functional time series covers the ages from 0 to 100+.
    data("all_hmd_male_data")
An n x p matrix with n = 186 observations at p = 101 ages, from 0 to 100+.
The data generated corresponds to the FTS for the male US log-mortality. The matrix contains 186 FTS stacked by rows. They correspond to 62 (number of years) times 3 (states). Each FTS contains 101 functional values.
United States Mortality Database (2023). University of California, Berkeley (USA). Department of Demography at the University of California, Berkeley. Available at usa.mortality.org (data downloaded on March 15, 2023).
    data(all_hmd_male_data)
Functional principal component analysis is used to decompose multiple functional time series. This function uses a functional panel data model to reduce dimensions for multiple functional time series.
    dmfpca(y, M = NULL, J = NULL, N = NULL, tstart = 0, tlength = 1)
y |
A data matrix containing functional responses. Each row contains measurements from a function at a set of grid points, and each column contains measurements of all functions at a particular grid point |
M |
Number of |
J |
Number of functions in each object |
N |
Number of grid points per function |
tstart |
Start point of the grid points |
tlength |
Length of the interval that the functions are evaluated at |
K1 |
Number of components for the common time-trend |
K2 |
Number of components for the residual component |
lambda1 |
A vector containing all common time-trend eigenvalues in non-increasing order |
lambda2 |
A vector containing all residual component eigenvalues in non-increasing order |
phi1 |
A matrix containing all common time-trend eigenfunctions. Each row contains an eigenfunction evaluated at the same set of grid points as the input data. The eigenfunctions are in the same order as the corresponding eigenvalues |
phi2 |
A matrix containing all residual component eigenfunctions. Each row contains an eigenfunction evaluated at the same set of grid points as the input data. The eigenfunctions are in the same order as the corresponding eigenvalues. |
scores1 |
A matrix containing estimated common time-trend principal component scores. Each row corresponds to the common time-trend scores for a particular subject in a cluster. The number of rows is the same as that of the input matrix y. Each column contains the scores for a common time-trend component for all subjects. |
scores2 |
A matrix containing estimated residual component principal component scores. Each row corresponds to the residual component scores for a particular subject in a cluster. The number of rows is the same as that of the input matrix y. Each column contains the scores for a residual component for all subjects. |
mu |
A vector containing the overall mean function. |
eta |
A matrix containing the deviation from the overall mean function to the country-specific mean function. The number of rows is the number of countries. |
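The row layout of `y` is easiest to see on a toy input. The sketch below mirrors the row-wise stacking used in the package example for this function; the sizes, the object list, and the helper `row_of` are hypothetical and only illustrate how `M`, `J` and `N` relate to the dimensions of `y`.

```r
# Hypothetical toy input: M objects, each a J x N matrix
# (J functions evaluated on N grid points).
set.seed(2)
M <- 4
J <- 10
N <- 25
objects <- replicate(M, matrix(rnorm(J * N), nrow = J, ncol = N),
                     simplify = FALSE)

# Stack by rows to obtain the (M * J) x N input matrix.
y <- do.call(rbind, objects)

# Row index of the j-th function of the m-th object under this stacking.
row_of <- function(m, j) (m - 1) * J + j

dim(y)
all.equal(y[row_of(3, 2), ], objects[[3]][2, ])
```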
Chen Tang and Han Lin Shang
Rice, G. and Shang, H. L. (2017) "A plug-in bandwidth selection procedure for long-run covariance estimation with stationary functional time series", Journal of Time Series Analysis, 38, 591-609.
Shang, H. L. (2016) "Mortality and life expectancy forecasting for a group of populations in developed countries: A multilevel functional data method", The Annals of Applied Statistics, 10, 1639-1672.
Di, C.-Z., Crainiceanu, C. M., Caffo, B. S. and Punjabi, N. M. (2009) "Multilevel functional principal component analysis", The Annals of Applied Statistics, 3, 458-488.
    ## The following takes about 10 seconds to run ##
    ## Not run:
    y <- do.call(rbind, sim_ex_cluster)
    MFPCA.sim <- dmfpca(y, M = length(sim_ex_cluster), J = nrow(sim_ex_cluster[[1]]),
                        N = ncol(sim_ex_cluster[[1]]), tlength = 1)
    ## End(Not run)
Forecast a high-dimensional functional principal component model.
    ## S3 method for class 'hdfpca'
    forecast(object, h = 3, level = 80, B = 50, ...)
object |
An object of class 'hdfpca' |
h |
Forecast horizon |
level |
Prediction interval level, the default is 80 percent |
B |
Number of bootstrap replications |
... |
Other arguments passed to forecast routine. |
The low-dimensional factors are forecasted separately using autoregressive integrated moving-average (ARIMA) models. The forecast functions are then calculated using the forecast factors. Bootstrap prediction intervals are constructed by resampling from the forecast residuals of the ARIMA models.
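The idea of forecasting one factor and bootstrapping its residuals can be sketched in base R. This is an illustration under stated assumptions, not the package's implementation: the factor series is simulated, a fixed AR(1) replaces any automatic ARIMA order selection, and the interval is built by adding resampled in-sample residuals to the point forecasts.

```r
set.seed(3)
# Hypothetical series standing in for one estimated low-dimensional factor.
factor_series <- as.numeric(arima.sim(model = list(ar = 0.6), n = 120))

h <- 3       # forecast horizon
level <- 80  # nominal coverage in percent
B <- 200     # bootstrap replications

# Assumption: a fixed AR(1) model, chosen only to keep the sketch short.
fit <- arima(factor_series, order = c(1, 0, 0))
point <- as.numeric(predict(fit, n.ahead = h)$pred)

# Resample the fitted model's residuals and add them to the point forecasts.
res <- as.numeric(residuals(fit))
boot <- replicate(B, point + sample(res, h, replace = TRUE))  # h x B matrix

# Pointwise quantiles of the bootstrap forecasts give the interval bounds.
alpha <- (100 - level) / 100
lower <- apply(boot, 1, quantile, probs = alpha / 2)
upper <- apply(boot, 1, quantile, probs = 1 - alpha / 2)
cbind(lower, point, upper)
```

In the functional setting, each bootstrap draw of the factors would be mapped back through the loadings and basis functions before taking pointwise quantiles.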
forecast |
A list containing the h-step-ahead forecast functions for each population |
upper |
Upper confidence bound for each population |
lower |
Lower confidence bound for each population |
Y. Gao and H. L. Shang
Y. Gao, H. L. Shang and Y. Yang (2019) High-dimensional functional time series forecasting: An application to age-specific mortality rates, Journal of Multivariate Analysis, 170, 232-243.
    ## Not run:
    hd_model = hdfpca(hd_data, order = 2, r = 2)
    hd_model_fore = forecast.hdfpca(object = hd_model, h = 1)
    ## End(Not run)
We generate populations of functional time series. For each , the th function at time is given by
where .
    data("hd_data")
The coefficients for all populations are combined and generated, for all , by
where . Here, is an matrix, and is an vector. Furthermore, we assume that the s have mean 0 and variance 0 when , so we only construct the coefficients for .
The first set of coefficients for populations are generated with . Each element in the matrix is generated by , where .
The factors are generated using an autoregressive model of order 1, i.e., AR(1). Define the th element in vector as . Then, is generated by , where are independent random variables. We generate for all by , where are also AR(1) and follow . It is then ensured that most of the variance of can be explained by one factor. The second coefficient are constructed the same way as .
We also generate the third functional principal component scores but with small values. Moreover, is generated by , where . The factors are generated as .
The three basis functions are constructed by , and , where . Finally, the functional time series for the th population is constructed by
where denotes the th element of the vector.
Y. Gao, H. L. Shang and Y. Yang (2019) High-dimensional functional time series forecasting: An application to age-specific mortality rates, Journal of Multivariate Analysis, 170, 232-243.
    data(hd_data)
Fit a high-dimensional functional principal component analysis model to a multiple-population of functional time series data.
    hdfpca(y, order, q = sqrt(dim(y[[1]])[2]), r)
y |
A list, where each item is a population of functional time series. Each item is a data matrix of dimension p by n, where p is the number of discrete points in each function and n is the sample size |
order |
The number of principal component scores to retain in the first step dimension reduction |
q |
The tuning parameter used in the first step of dimension reduction, by default it is equal to the square root of the sample size |
r |
The number of factors to retain in the second step dimension reduction |
In the first step, dynamic functional principal component analysis is performed on each population, and then in the second step, factor models are fitted to the resulting principal component scores. The high-dimensional functional time series are thus reduced to low-dimensional factors.
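The two-step reduction can be sketched in base R on simulated data. This is a simplified illustration, not the package's code: ordinary (static) PCA via `svd` stands in for dynamic functional PCA, the factor step is a plain eigen-decomposition, and all object names (`n_scores`, `step1`, `step2`) are hypothetical. Here `n_scores` plays the role of `order`.

```r
set.seed(4)
m <- 6          # number of populations
p <- 30         # grid points per function
n <- 50         # sample size
n_scores <- 2   # scores kept per population (role of `order`)
r <- 2          # factors kept in the second step

# Hypothetical data: a list of m populations, each a p x n matrix.
y <- replicate(m, matrix(rnorm(p * n), nrow = p, ncol = n), simplify = FALSE)

# Step 1 (simplified): static PCA per population.
step1 <- lapply(y, function(pop) {
  centred <- pop - rowMeans(pop)                      # remove the mean function
  basis <- svd(centred)$u[, seq_len(n_scores), drop = FALSE]  # p x n_scores
  list(basis = basis, scores = crossprod(basis, centred))     # n_scores x n
})

# Step 2: for each component k, stack the k-th scores of all populations
# (m x n) and extract r factors from their sample covariance.
step2 <- lapply(seq_len(n_scores), function(k) {
  score_mat <- t(sapply(step1, function(s) s$scores[k, ]))    # m x n
  eig <- eigen(tcrossprod(score_mat) / n, symmetric = TRUE)
  loadings <- eig$vectors[, seq_len(r), drop = FALSE]         # m x r
  list(loadings = loadings, factors = crossprod(loadings, score_mat))  # r x n
})

dim(step2[[1]]$factors)
```

After both steps, the m functional time series are summarised by `n_scores * r` univariate factor series, which is what makes separate time series forecasting of the factors feasible.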
y |
The input data |
p |
The number of discrete points in each function |
fitted |
A list containing the fitted functions for each population |
m |
The number of populations |
model |
Model 1 includes the first step dynamic functional principal component analysis models, model 2 includes the second step high-dimensional principal component analysis models |
order |
Input order |
r |
Input r |
Y. Gao and H. L. Shang
Y. Gao, H. L. Shang and Y. Yang (2019) High-dimensional functional time series forecasting: An application to age-specific mortality rates, Journal of Multivariate Analysis, 170, 232-243.
    hd_model = hdfpca(hd_data, order = 2, r = 2)
A multilevel functional principal component analysis for performing clustering analysis
    MFPCA(y, M = NULL, J = NULL, N = NULL)
y |
A data matrix containing functional responses. Each row contains measurements from a function at a set of grid points, and each column contains measurements of all functions at a particular grid point |
M |
Number of countries |
J |
Number of functional responses in each country |
N |
Number of grid points per function |
K1 |
Number of components at level 1 |
K2 |
Number of components at level 2 |
K3 |
Number of components at level 3 |
lambda1 |
A vector containing all level 1 eigenvalues in non-increasing order |
lambda2 |
A vector containing all level 2 eigenvalues in non-increasing order |
lambda3 |
A vector containing all level 3 eigenvalues in non-increasing order |
phi1 |
A matrix containing all level 1 eigenfunctions. Each row contains an eigenfunction evaluated at the same set of grid points as the input data. The eigenfunctions are in the same order as the corresponding eigenvalues |
phi2 |
A matrix containing all level 2 eigenfunctions. Each row contains an eigenfunction evaluated at the same set of grid points as the input data. The eigenfunctions are in the same order as the corresponding eigenvalues |
phi3 |
A matrix containing all level 3 eigenfunctions. Each row contains an eigenfunction evaluated at the same set of grid points as the input data. The eigenfunctions are in the same order as the corresponding eigenvalues |
scores1 |
A matrix containing estimated level 1 principal component scores. Each row corresponds to the level 1 scores for a particular subject in a cluster. The number of rows is the same as that of the input matrix |
scores2 |
A matrix containing estimated level 2 principal component scores. Each row corresponds to the level 2 scores for a particular subject in a cluster. The number of rows is the same as that of the input matrix |
scores3 |
A matrix containing estimated level 3 principal component scores. Each row corresponds to the level 3 scores for a particular subject in a cluster. The number of rows is the same as that of the input matrix |
mu |
A vector containing the overall mean function |
eta |
A matrix containing the deviation from overall mean function to country-specific mean function. The number of rows is the number of countries |
Rj |
Common trend |
Uij |
Country-specific mean function |
Chen Tang, Yanrong Yang and Han Lin Shang
C. Tang, H. L. Shang, Y. Yang and Y. Yang (2026) Forecasting high-dimensional functional time series with dual-factor structures, Journal of the Royal Statistical Society: Series A, in press.
Clustering the multiple functional time series. The function uses the functional panel data model to cluster different time series into subgroups
    mftsc(X, alpha)
X |
A list of sets of smoothed functional time series to be clustered. Each object is a p x q matrix, where p is the sample size and q is the number of grid points of the function |
alpha |
A threshold for the adjusted Rand index, which measures the similarity between the current memberships and those of the previous iteration; can be any value greater than 0.9 |
As an initial step, conventional k-means clustering is performed on the dynamic FPC scores, then an iterative membership updating process is applied by fitting the MFPCA model.
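The stopping rule based on the adjusted Rand index can be sketched in base R. The function `adjusted_rand_index` below is a hypothetical helper written for this illustration, and the membership update is a stand-in (reassignment to the nearest centroid) rather than the MFPCA-based update the function performs; the score matrix is simulated.

```r
# Adjusted Rand index between two membership vectors (hypothetical helper).
adjusted_rand_index <- function(a, b) {
  tab <- table(a, b)
  n <- sum(tab)
  sum_ij <- sum(choose(tab, 2))
  sum_a <- sum(choose(rowSums(tab), 2))
  sum_b <- sum(choose(colSums(tab), 2))
  expected <- sum_a * sum_b / choose(n, 2)
  max_index <- (sum_a + sum_b) / 2
  (sum_ij - expected) / (max_index - expected)
}

set.seed(5)
# Hypothetical scores: 20 series, 2 scores each, two well-separated groups.
scores <- rbind(matrix(rnorm(20, mean = 0), ncol = 2),
                matrix(rnorm(20, mean = 5), ncol = 2))
alpha <- 0.99

# Initial step: conventional k-means on the scores.
membership <- list(kmeans(scores, centers = 2, nstart = 5)$cluster)

repeat {
  current <- membership[[length(membership)]]
  # Stand-in update: reassign each series to the nearest cluster centroid.
  c1 <- colMeans(scores[current == 1, , drop = FALSE])
  c2 <- colMeans(scores[current == 2, , drop = FALSE])
  d1 <- rowSums(sweep(scores, 2, c1)^2)
  d2 <- rowSums(sweep(scores, 2, c2)^2)
  updated <- ifelse(d1 <= d2, 1L, 2L)
  membership[[length(membership) + 1]] <- updated
  # Stop once successive memberships agree closely enough (or after 20 rounds).
  if (adjusted_rand_index(current, updated) > alpha || length(membership) > 20) break
}

member_final <- membership[[length(membership)]]
table(member_final)
```

The index is invariant to relabelling of the clusters, which is why it is suitable for comparing memberships across iterations.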
iteration |
the number of iterations until convergence |
memebership |
a list of all the membership matrices at each iteration |
member.final |
the final membership |
Chen Tang, Yanrong Yang and Han Lin Shang
C. Tang, H. L. Shang, Y. Yang and Y. Yang (2026) Forecasting high-dimensional functional time series with dual-factor structures, Journal of the Royal Statistical Society: Series A, in press.
    ## Not run:
    data(sim_ex_cluster)
    cluster_result <- mftsc(X = sim_ex_cluster, alpha = 0.99)
    cluster_result$member.final
    ## End(Not run)
Decomposition by one-way functional analysis of variance based on means.
    One_way_mean(data_pop1, year = 1959:2020, age = 0:100, n_prefectures = 51)
data_pop1 |
The multivariate functional data, which is a matrix with dimension n by 2p, where n is the sample size, and p is the dimensionality. |
year |
Vector with the years considered in each population. |
age |
Vector with the ages considered in each year. |
n_prefectures |
Number of prefectures. |
GE_mean |
Grand_effect, a vector of dimension p. |
FRE_mean |
Row_effect, a matrix of dimension length(row_partition_index) by p. |
Deterministic |
Deterministic component = Grand effect + Row effect |
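The decomposition into grand effect, row effect and residual can be sketched in base R. This is a minimal illustration on simulated data, not the package's code; it assumes, hypothetically, that the rows of the data are curves stacked state by state, and all object names are invented for the example.

```r
set.seed(6)
n_year <- 5
n_state <- 3
n_age <- 4

# Hypothetical stacked data: one curve per row, grouped state by state.
Y <- matrix(rnorm(n_year * n_state * n_age), ncol = n_age)
state <- rep(seq_len(n_state), each = n_year)

# Grand effect: pointwise mean over all curves.
grand_effect <- colMeans(Y)

# Row effect: pointwise state mean minus the grand effect.
state_means <- unname(rowsum(Y, state)) / n_year
row_effect <- sweep(state_means, 2, grand_effect)

# Deterministic component = grand effect + row effect (one row per state).
deterministic <- sweep(row_effect, 2, grand_effect, "+")

# Functional residuals: what remains after removing the deterministic part.
residuals <- Y - deterministic[state, ]

max(abs(Y - (deterministic[state, ] + residuals)))
```

With equal numbers of curves per state, the row effects sum to zero at every age, so the grand effect carries the overall level.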
Cristian Felipe Jimenez Varon, Han Lin Shang
C. F. Jimenez Varon, Y. Sun and H. L. Shang (2024) "Forecasting high-dimensional functional time series: Application to sub-national age-specific mortality", Journal of Computational and Graphical Statistics, 33(4), 1160-1174.
H. L. Shang (2025) "Forecasting a time series of Lorenz curves: One-way functional analysis of variance", Journal of Applied Statistics, 52(15), 2924-2940.
One_way_mean_residuals, One_way_median_polish, One_way_median_polish_residuals
    # The US mortality data 1959-2020, for one population (female)
    # and 3 states (New York, California, Illinois)
    # First define the parameters and the row partitions.
    year = 1959:2020
    age = 0:100
    n_prefectures = 3
    # Load the US data. Make sure it is a matrix.
    Y <- all_hmd_female_data
    FMP <- One_way_mean(t(Y), year=1959:2020, age=0:100, n_prefectures=3)
Decomposition of high-dimensional functional time series into deterministic and functional residuals
    One_way_mean_residuals(data_pop1, n_prefectures, n_year, n_age)
data_pop1 |
The multivariate functional data, which is a matrix with dimension n by 2p, where n is the sample size, and p is the dimensionality. |
n_prefectures |
Number of prefectures. |
n_year |
Number of years considered in each population. |
n_age |
Number of ages considered in each year. |
Residuals |
Residual component |
Reconstructed_Data |
Reconstructed data |
Reconstruction_OK |
Indicator if reconstruction equals original data |
Cristian Felipe Jimenez Varon and Han Lin Shang
C. F. Jimenez Varon, Y. Sun and H. L. Shang (2024) "Forecasting high-dimensional functional time series: Application to sub-national age-specific mortality", Journal of Computational and Graphical Statistics, 33(4), 1160-1174.
    # The US mortality data 1959-2020, for one population (female)
    # and 3 states (New York, California, Illinois)
    # First define the parameters and the row partitions.
    year = 1959:2020
    age = 0:100
    n_prefectures = 3
    # Load the US data. Make sure it is a matrix.
    Y <- all_hmd_female_data
    # Compute the functional residuals.
    FANOVA_mean_residuals <- One_way_mean_residuals(t(Y), n_prefectures=3,
                                 n_year=length(year), n_age=length(age))
Decomposition by one-way functional median polish.
    One_way_median_polish(Y, n_prefectures=51, year=1959:2020, age=0:100)
Y |
The multivariate functional data, which is a matrix with dimension n by 2p, where n is the sample size, and p is the dimensionality. |
year |
Vector with the years considered in each population. |
n_prefectures |
Number of prefectures. |
age |
Vector with the ages considered in each year. |
grand_effect |
Grand_effect, a vector of dimension p. |
row_effect |
Row_effect, a matrix of dimension length(row_partition_index) by p. |
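The mechanics of a one-way median polish can be sketched in base R. This is a simplified illustration, not the package's algorithm: it uses pointwise medians at each grid point, whereas a functional median polish may rely on a depth-based functional median. The data are simulated, the state-by-state row ordering is an assumption, and the fixed number of sweeps is chosen only for brevity.

```r
set.seed(7)
n_year <- 6
n_state <- 3
n_age <- 5

# Hypothetical stacked data: one curve per row, grouped state by state.
Y <- matrix(rnorm(n_year * n_state * n_age), ncol = n_age)
state <- rep(seq_len(n_state), each = n_year)

grand_effect <- rep(0, n_age)
row_effect <- matrix(0, n_state, n_age)
res <- Y

for (iter in 1:10) {
  # Sweep the pointwise median of each state out of the residuals.
  for (s in seq_len(n_state)) {
    idx <- state == s
    med_s <- apply(res[idx, , drop = FALSE], 2, median)
    row_effect[s, ] <- row_effect[s, ] + med_s
    res[idx, ] <- sweep(res[idx, , drop = FALSE], 2, med_s)
  }
  # Move the pointwise median of the row effects into the grand effect.
  med_r <- apply(row_effect, 2, median)
  grand_effect <- grand_effect + med_r
  row_effect <- sweep(row_effect, 2, med_r)
}

# The three parts add back up to the data.
fitted <- sweep(row_effect[state, ], 2, grand_effect, "+")
max(abs(Y - (fitted + res)))
```

Each sweep only moves quantities between the grand effect, the row effects and the residuals, so the sum of the three parts always equals the original data.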
Cristian Felipe Jimenez Varon, Ying Sun, Han Lin Shang
C. F. Jimenez Varon, Y. Sun and H. L. Shang (2024) "Forecasting high-dimensional functional time series: Application to sub-national age-specific mortality", Journal of Computational and Graphical Statistics, 33(4), 1160-1174.
Y. Sun and M. G. Genton (2012) "Functional median polish", Journal of Agricultural, Biological, and Environmental Statistics, 17(3), 354-376.
One_way_median_polish_residuals, Two_way_median_polish, Two_way_median_polish_residuals
    # The US mortality data 1959-2020, for one population (female)
    # and 3 states (New York, California, Illinois)
    # First define the parameters and the row partitions.
    year = 1959:2020
    age = 0:100
    n_prefectures = 3
    # Load the US data. Make sure it is a matrix.
    Y <- all_hmd_female_data
    # Compute the functional median polish decomposition.
    FMP <- One_way_median_polish(Y, n_prefectures=3, year=1959:2020, age=0:100)
    # The results
    ##1. The functional grand effect
    FGE <- FMP$grand_effect
    ##2. The functional row effect
    FRE <- FMP$row_effect
Decomposition of high-dimensional functional time series into deterministic (from functional median polish), and functional residuals
    One_way_median_polish_residuals(Y, n_prefectures = 51, year = 1959:2020, age = 0:100)
Y |
The multivariate functional data, which is a matrix with dimension n by 2p, where n is the sample size, and p is the dimensionality. |
n_prefectures |
Number of prefectures. |
year |
Vector with the years considered in each population. |
age |
Vector with the ages considered in each year. |
A matrix of dimension n by p.
Cristian Felipe Jimenez Varon, Ying Sun, Han Lin Shang
C. F. Jimenez Varon, Y. Sun and H. L. Shang (2024) "Forecasting high-dimensional functional time series: Application to sub-national age-specific mortality", Journal of Computational and Graphical Statistics, 33(4), 1160-1174.
Y. Sun and M. G. Genton (2012) "Functional median polish", Journal of Agricultural, Biological, and Environmental Statistics, 17(3), 354-376.
One_way_median_polish, One_way_mean, One_way_mean_residuals
    # The US mortality data 1959-2020, for one population (female)
    # and 3 states (New York, California, Illinois)
    # First define the parameters and the row partitions.
    year = 1959:2020
    age = 0:100
    n_prefectures = 3
    # Load the US data. Make sure it is a matrix.
    Y <- all_hmd_female_data
    # Compute the functional residuals.
    FMP_residuals <- One_way_median_polish_residuals(Y, n_prefectures=3,
                         year=1959:2020, age=0:100)
We generate 2 groups of functional time series. For each in {1, ..., m} in a given cluster , in {1,2}, the th function, in {1,..., T}, is given by
    data("sim_ex_cluster")
The mean functions for each of these two clusters are set to be and .
While the variates for both clusters, are generated from autoregressive of order 1 with parameter 0.7, while the variates and for both clusters, are generated from independent and identically distributed and , respectively.
The basis functions for the common-time trend for the first cluster, , for in {1,2} are and respectively; and the basis functions for the common-time trend for the second cluster, , for in {1,2} are and respectively.
The basis functions for the residual for the first cluster, , for in {1,2} are and respectively; and the basis functions for the residual for the second cluster, , for in {1,2} are and respectively.
The measurement error for each continuum x is generated from independent and identically distributed
C. Tang, H. L. Shang, Y. Yang and Y. Yang (2026) Forecasting high-dimensional functional time series with dual-factor structures, Journal of the Royal Statistical Society: Series A, in press.
    data(sim_ex_cluster)
Decomposition by functional analysis of variance fitted by means.
    Two_way_mean(data_pop1, data_pop2, year=1959:2020, age= 0:100, n_prefectures=51, n_populations=2)
data_pop1 |
A p by n matrix |
data_pop2 |
A p by n matrix |
year |
Vector with the years considered in each population. |
n_prefectures |
Number of prefectures |
age |
Vector with the ages considered in each year. |
n_populations |
Number of populations. |
FGE_mean |
FGE_mean, a vector of dimension p |
FRE_mean |
FRE_mean, a matrix of dimension length(row_partition_index) by p. |
FCE_mean |
FCE_mean, a matrix of dimension length(column_partition_index) by p. |
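A two-way decomposition by means can be sketched in base R as a grand effect, a row (state) effect and a column (population) effect. This is an illustration on simulated data under stated assumptions, not the package's implementation: the two populations are stacked by rows with a state-by-state ordering, and every object name is hypothetical.

```r
set.seed(8)
n_year <- 5
n_state <- 3
n_age <- 4
state <- rep(seq_len(n_state), each = n_year)

# Hypothetical data: one matrix of curves (rows) per population.
pop1 <- matrix(rnorm(n_year * n_state * n_age), ncol = n_age)
pop2 <- matrix(rnorm(n_year * n_state * n_age, mean = 1), ncol = n_age)

# Stack both populations and index each curve by state and population.
both <- rbind(pop1, pop2)
state_all <- rep(state, times = 2)
pop_all <- rep(1:2, each = nrow(pop1))

# Grand effect: pointwise mean over all curves.
grand <- colMeans(both)

# Row effect: state mean minus grand effect; column effect: population mean
# minus grand effect.
row_effect <- sweep(unname(rowsum(both, state_all)) / (2 * n_year), 2, grand)
col_effect <- sweep(unname(rowsum(both, pop_all)) / nrow(pop1), 2, grand)

# Deterministic part and time-varying residuals.
fixed <- sweep(row_effect[state_all, ] + col_effect[pop_all, ], 2, grand, "+")
residuals <- both - fixed

max(abs(both - (fixed + residuals)))
```

In this balanced layout the row effects and the column effects each sum to zero at every age; any state-by-population interaction stays in the residuals.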
Cristian Felipe Jimenez Varon, Ying Sun, Han Lin Shang
C. F. Jimenez Varon, Y. Sun and H. L. Shang (2024) "Forecasting high-dimensional functional time series: Application to sub-national age-specific mortality", Journal of Computational and Graphical Statistics, 33(4), 1160-1174.
Ramsay, J. and B. Silverman (2006). Functional Data Analysis. Springer Series in Statistics. Chapter 13. New York: Springer
Two_way_median_polish, Two_way_median_polish_residuals
    # The US mortality data 1959-2020 for two populations and three states
    # (New York, California, Illinois)
    # Compute the functional ANOVA decomposition fitted by means.
    FANOVA_means <- Two_way_mean(data_pop1 = t(all_hmd_male_data),
                                 data_pop2 = t(all_hmd_female_data),
                                 year = 1959:2020, age = 0:100,
                                 n_prefectures = 3, n_populations = 2)
    ##1. The functional grand effect
    FGE = FANOVA_means$FGE_mean
    ##2. The functional row effect
    FRE = FANOVA_means$FRE_mean
    ##3. The functional column effect
    FCE = FANOVA_means$FCE_mean
Decomposition of functional time series into deterministic (by functional analysis of variance fitted by means), and time-varying components (functional residuals)
    Two_way_mean_residuals(data_pop1, data_pop2, year, age, n_prefectures, n_populations)
data_pop1 |
A p by n matrix |
data_pop2 |
A p by n matrix |
year |
Vector with the years considered in each population. |
n_prefectures |
Number of prefectures |
age |
Vector with the ages considered in each year. |
n_populations |
Number of populations. |
residuals1 |
A matrix with dimension n by p. |
residuals2 |
A matrix with dimension n by p. |
rd |
A logical vector of length two indicating whether the decomposition sums to the data. |
R |
A matrix of dimension as n by 2p. This represents the time-varying component in the decomposition. |
Fixed_comp |
A matrix of dimension as n by 2p. This represents the deterministic component in the decomposition. |
Cristian Felipe Jimenez Varon, Ying Sun, Han Lin Shang
C. F. Jimenez Varon, Y. Sun and H. L. Shang (2024) "Forecasting high-dimensional functional time series: Application to sub-national age-specific mortality", Journal of Computational and Graphical Statistics, 33(4), 1160-1174.
Ramsay, J. and B. Silverman (2006). Functional Data Analysis. Springer Series in Statistics. Chapter 13. New York: Springer.
Two_way_median_polish_residuals
    # The US mortality data 1959-2020, for two populations
    # and three states (New York, California, Illinois)
    # Compute the functional ANOVA decomposition fitted by means.
    FANOVA_means_residuals <- Two_way_mean_residuals(data_pop1=t(all_hmd_male_data),
                                  data_pop2=t(all_hmd_female_data),
                                  year = 1959:2020, age = 0:100,
                                  n_prefectures = 3, n_populations = 2)
    # The results
    ##1. The functional residuals from population 1
    Residuals_pop_1=FANOVA_means_residuals$residuals1
    ##2. The functional residuals from population 2
    Residuals_pop_2=FANOVA_means_residuals$residuals2
    ##3. A logical vector whose components indicate whether the sum of deterministic
    ##   and time-varying components recovers the original FTS.
    Construct_data=FANOVA_means_residuals$rd
    ##4. Time-varying components for all the populations. The functional residuals
    All_pop_functional_residuals <- FANOVA_means_residuals$R
    ##5. The deterministic components from the functional ANOVA decomposition
    deterministic_comp <- FANOVA_means_residuals$Fixed_comp
Decomposition by two-way functional median polish
    Two_way_median_polish(Y, year=1959:2020, age=0:100, n_prefectures=51, n_populations=2)
Y |
A matrix with dimension n by 2p. The functional data. |
year |
Vector with the years considered in each population. |
n_prefectures |
Number of prefectures |
age |
Vector with the ages considered in each year. |
n_populations |
Number of populations. |
grand_effect |
grand_effect, a vector of dimension p |
row_effect |
row_effect, a matrix of dimension length(row_partition_index) by p. |
col_effect |
col_effect, a matrix of dimension length(column_partition_index) by p |
Cristian Felipe Jimenez Varon, Ying Sun, Han Lin Shang
C. F. Jimenez Varon, Y. Sun and H. L. Shang (2024) "Forecasting high-dimensional functional time series: Application to sub-national age-specific mortality", Journal of Computational and Graphical Statistics, 33(4), 1160-1174.
Y. Sun and M. G. Genton (2012) "Functional median polish", Journal of Agricultural, Biological, and Environmental Statistics, 17(3), 354-376.
# The US mortality data 1959-2020 for two populations and three states
# (New York, California, Illinois)
# Compute the functional median polish decomposition.
FMP = Two_way_median_polish(cbind(all_hmd_male_data, all_hmd_female_data),
                            n_prefectures = 3, year = 1959:2020,
                            age = 0:100, n_populations = 2)
##1. The functional grand effect
FGE = FMP$grand_effect
##2. The functional row effect
FRE = FMP$row_effect
##3. The functional column effect
FCE = FMP$col_effect
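To make the grand, row, and column effects more concrete, the following toy sketch applies base R's `stats::medpolish()` separately at each grid point of a small simulated array. It is an illustration under stated assumptions, not the 'hdftsa' implementation: the array layout (3 states by 2 populations by 5 ages), the object names, and the pointwise use of ordinary medians are all hypothetical simplifications, whereas functional median polish as described by Sun and Genton (2012) works with functional medians.

```r
# Toy illustration only: NOT the hdftsa implementation.
# Hypothetical layout: 3 states (rows) x 2 populations (columns) x 5 ages.
set.seed(1)
n_states <- 3
n_pops <- 2
p <- 5
curves <- array(rnorm(n_states * n_pops * p), dim = c(n_states, n_pops, p))

grand_effect <- numeric(p)                  # one value per age
row_effect <- matrix(0, n_states, p)        # one curve per state
col_effect <- matrix(0, n_pops, p)          # one curve per population
resid_arr <- array(0, dim = dim(curves))    # what the effects do not explain

for (j in seq_len(p)) {
  # medpolish() fits overall + row + column + residual by iterated medians
  fit <- medpolish(curves[, , j], trace.iter = FALSE)
  grand_effect[j] <- fit$overall
  row_effect[, j] <- fit$row
  col_effect[, j] <- fit$col
  resid_arr[, , j] <- fit$residuals
}

# Rebuild the toy curves from the fitted pieces and measure the error
rebuilt <- array(0, dim = dim(curves))
for (j in seq_len(p)) {
  rebuilt[, , j] <- grand_effect[j] +
    outer(row_effect[, j], col_effect[, j], "+") +
    resid_arr[, , j]
}
max_err <- max(abs(rebuilt - curves))
```

In this sketch `grand_effect`, `row_effect`, and `col_effect` only mirror the shapes described for the return value of `Two_way_median_polish()`; the numbers themselves should not be expected to match what the package computes.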
Decomposition of functional time series into deterministic components (from functional median polish) and time-varying components (functional residuals)
Two_way_median_polish_residuals(Y, n_prefectures, year, age, n_populations)
| Argument | Description |
|---|---|
| Y | A matrix with dimension n by 2p. The functional data. |
| year | Vector with the years considered in each population. |
| n_prefectures | Number of prefectures. |
| age | Vector with the ages considered in each year. |
| n_populations | Number of populations. |

| Value | Description |
|---|---|
| residuals1 | A matrix with dimension n by p. The functional residuals from population 1. |
| residuals2 | A matrix with dimension n by p. The functional residuals from population 2. |
| rd | A logical vector of length two indicating whether the deterministic and time-varying components sum to the original data. |
| R | A matrix with the same dimension as Y. This represents the time-varying component in the decomposition. |
| Fixed_comp | A matrix with the same dimension as Y. This represents the deterministic component in the decomposition. |
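The relationship between `Fixed_comp`, `R`, and the check stored in `rd` can be sketched on toy data as follows. This is a minimal illustration, not the package's code: the toy matrix, the use of a pointwise median curve as a stand-in deterministic component, the assumption that the first p columns belong to population 1, and all object names ending in `_toy` are hypothetical.

```r
# Toy illustration only: NOT the hdftsa implementation.
set.seed(42)
n <- 6   # hypothetical number of curves (rows)
p <- 4   # hypothetical number of grid points per population
Y_toy <- matrix(rnorm(n * 2 * p), nrow = n, ncol = 2 * p)

# Stand-in deterministic component: the pointwise median curve, repeated per row
fixed_toy <- matrix(apply(Y_toy, 2, median), nrow = n, ncol = 2 * p, byrow = TRUE)

# Time-varying component: what is left after removing the deterministic part
R_toy <- Y_toy - fixed_toy

# Split the residuals by population (assumed: first p columns, then last p columns)
idx1 <- 1:p
idx2 <- (p + 1):(2 * p)
res1_toy <- R_toy[, idx1]
res2_toy <- R_toy[, idx2]

# One reconstruction check per population, giving a length-two logical vector
rd_toy <- c(
  isTRUE(all.equal(fixed_toy[, idx1] + res1_toy, Y_toy[, idx1])),
  isTRUE(all.equal(fixed_toy[, idx2] + res2_toy, Y_toy[, idx2]))
)
```

Using `all.equal()` rather than `==` tolerates floating-point rounding; whether the package performs its check in the same way is not stated in this documentation.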
Cristian Felipe Jimenez Varon, Ying Sun, Han Lin Shang
C. F. Jimenez-Varon, Y. Sun and H. L. Shang (2024) Forecasting high-dimensional functional time series: Application to sub-national age-specific mortality, Journal of Computational and Graphical Statistics, 33(4), 1160-1174.
Y. Sun and M. G. Genton (2012) Functional median polish, Journal of Agricultural, Biological, and Environmental Statistics, 17(3), 354-376.
# The US mortality data 1959-2020, for two populations
# and three states (New York, California, Illinois)
# Column binds the data from both populations
Y = cbind(all_hmd_male_data, all_hmd_female_data)
# Decompose FTS into deterministic (from functional median polish)
# and time-varying components (functional residuals).
FMP_residuals <- Two_way_median_polish_residuals(Y, n_prefectures = 3, year = 1959:2020,
                                                 age = 0:100, n_populations = 2)
# The results
##1. The functional residuals from population 1
Residuals_pop_1 <- FMP_residuals$residuals1
##2. The functional residuals from population 2
Residuals_pop_2 <- FMP_residuals$residuals2
##3. A logical vector whose components indicate whether the sum of deterministic
##   and time-varying components recovers the original FTS.
Construct_data <- FMP_residuals$rd
##4. Time-varying components for all the populations. The functional residuals
All_pop_functional_residuals <- FMP_residuals$R
##5. The deterministic components from the functional median polish decomposition
deterministic_comp <- FMP_residuals$Fixed_comp