Title: | High-Dimensional Matrix Factor Analysis |
---|---|
Description: | High-dimensional matrix factor models have drawn much attention because observations in fields such as macroeconomics and finance are often naturally structured as matrices. Moreover, such data often exhibit heavy tails, so it is also important to develop robust procedures. We address this issue by replacing the least squares loss with the Huber loss and propose two algorithms for robust factor analysis. One is based on minimizing the Huber loss of the idiosyncratic error's Frobenius norm, which leads to a weighted iterative projection approach to compute and learn the parameters and is hence named Robust Matrix Factor Analysis (RMFA); see the details in He et al. (2023) <doi:10.1080/07350015.2023.2191676>. The other is based on minimizing the element-wise Huber loss, which can be solved by an iterative Huber regression (IHR) algorithm; see the details in He et al. (2023) <arXiv:2306.03317>. The package also provides the algorithm for alpha-PCA by Chen & Fan (2021) <doi:10.1080/01621459.2021.1970569> and the projected estimation (PE) method by Yu et al. (2022) <doi:10.1016/j.jeconom.2021.04.001>. In addition, methods for determining the pair of factor numbers are provided. |
Authors: | Yong He [aut], Changwei Zhao [aut], Ran Zhao [aut, cre] |
Maintainer: | Ran Zhao <[email protected]> |
License: | GPL-2 | GPL-3 |
Version: | 0.1.1 |
Built: | 2025-01-15 06:28:32 UTC |
Source: | CRAN |
This function is to fit the matrix factor model via the \eqn{\alpha}-PCA method by conducting eigen-analysis of a weighted average of the sample mean and the column (row) sample covariance matrix through a hyper-parameter \eqn{\alpha}.
alpha_PCA(X, m1, m2, alpha = 0)
X |
Input an array with dimension \eqn{T \times p_1 \times p_2}, where \eqn{T} is the sample size, and \eqn{p_1} and \eqn{p_2} are the row and column dimensions of each matrix observation. |
m1 |
A positive integer indicating the row factor numbers. |
m2 |
A positive integer indicating the column factor numbers. |
alpha |
A hyper-parameter balancing the information of the first and second moments (\eqn{\alpha \in [-1, +\infty)}). The default is 0. |
For the matrix factor models, Chen & Fan (2021) propose an estimation procedure, i.e. \eqn{\alpha}-PCA. The method aggregates the information in both first and second moments and extracts it via a spectral method. In detail, for observations \eqn{X_t, t=1,2,\cdots,T}, define
\deqn{\hat{M}_R = \frac{1}{p_1 p_2} \left( (1+\alpha) \bar{X} \bar{X}^\top + \frac{1}{T} \sum_{t=1}^{T} (X_t - \bar{X})(X_t - \bar{X})^\top \right),}
\deqn{\hat{M}_C = \frac{1}{p_1 p_2} \left( (1+\alpha) \bar{X}^\top \bar{X} + \frac{1}{T} \sum_{t=1}^{T} (X_t - \bar{X})^\top (X_t - \bar{X}) \right),}
where \eqn{\alpha \in [-1, +\infty)}, \eqn{\bar{X} = \frac{1}{T}\sum_{t=1}^{T} X_t}, and \eqn{\frac{1}{T}\sum_{t=1}^{T}(X_t-\bar{X})(X_t-\bar{X})^\top} and \eqn{\frac{1}{T}\sum_{t=1}^{T}(X_t-\bar{X})^\top(X_t-\bar{X})} are the sample row and column covariance matrices, respectively. The loading matrices \eqn{R} and \eqn{C} are estimated as \eqn{\sqrt{p_1}} times the top \eqn{k_1} eigenvectors of \eqn{\hat{M}_R} and \eqn{\sqrt{p_2}} times the top \eqn{k_2} eigenvectors of \eqn{\hat{M}_C}, respectively. For details, see Chen & Fan (2021).
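For illustration, the eigen-analysis above can be sketched directly in R. The helper alpha_pca_sketch below is hypothetical, written from the displayed formulas; the packaged alpha_PCA may differ in implementation details.

#A minimal sketch of the alpha-PCA eigen-analysis, written from the formulas above
alpha_pca_sketch <- function(X, k1, k2, alpha = 0) {
  T <- dim(X)[1]; p1 <- dim(X)[2]; p2 <- dim(X)[3]
  Xbar <- apply(X, c(2, 3), mean)              #sample mean matrix
  MR <- (1 + alpha) * Xbar %*% t(Xbar)         #first-moment information
  MC <- (1 + alpha) * t(Xbar) %*% Xbar
  for (t in 1:T) {                             #second-moment information
    D <- X[t, , ] - Xbar
    MR <- MR + D %*% t(D) / T
    MC <- MC + t(D) %*% D / T
  }
  MR <- MR / (p1 * p2); MC <- MC / (p1 * p2)
  Rhat <- sqrt(p1) * eigen(MR, symmetric = TRUE)$vectors[, 1:k1]   #sqrt(p1) x top k1 eigenvectors
  Chat <- sqrt(p2) * eigen(MC, symmetric = TRUE)$vectors[, 1:k2]   #sqrt(p2) x top k2 eigenvectors
  Fhat <- array(0, c(T, k1, k2))
  for (t in 1:T) Fhat[t, , ] <- t(Rhat) %*% X[t, , ] %*% Chat / (p1 * p2)
  list(F = Fhat, R = Rhat, C = Chat)
}

Applied to the simulated data in the Examples below, alpha_pca_sketch(X, k1, k2) should closely agree with alpha_PCA(X, k1, k2, alpha = 0) up to sign changes of the eigenvectors.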
The return value is a list containing the following components:
F |
The estimated factor matrix of dimension \eqn{T \times m_1 \times m_2}. |
R |
The estimated row loading matrix of dimension \eqn{p_1 \times m_1}, satisfying \eqn{R^\top R = p_1 I_{m_1}}. |
C |
The estimated column loading matrix of dimension \eqn{p_2 \times m_2}, satisfying \eqn{C^\top C = p_2 I_{m_2}}. |
Yong He, Changwei Zhao, Ran Zhao.
Chen, E. Y., & Fan, J. (2021). Statistical inference for high-dimensional matrix-variate factor models. Journal of the American Statistical Association, 1-18.
set.seed(11111)
T=20;p1=20;p2=20;k1=3;k2=3
R=matrix(runif(p1*k1,min=-1,max=1),p1,k1)
C=matrix(runif(p2*k2,min=-1,max=1),p2,k2)
X=array(0,c(T,p1,p2))
Y=X;E=Y
F=array(0,c(T,k1,k2))
for(t in 1:T){
  F[t,,]=matrix(rnorm(k1*k2),k1,k2)
  E[t,,]=matrix(rnorm(p1*p2),p1,p2)
  Y[t,,]=R%*%F[t,,]%*%t(C)
}
X=Y+E

#Estimate the factor matrices and loadings
fit=alpha_PCA(X, k1, k2, alpha = 0)
Rhat=fit$R
Chat=fit$C
Fhat=fit$F

#Estimate the common component
CC=array(0,c(T,p1,p2))
for (t in 1:T){
  CC[t,,]=Rhat%*%Fhat[t,,]%*%t(Chat)
}
CC
The function is to estimate the pair of factor numbers via the eigenvalue-ratio criterion corresponding to the RMFA method, or via the rank-minimization and eigenvalue-ratio criteria corresponding to Iterative Huber Regression (IHR).
KMHFA(X, W1 = NULL, W2 = NULL, kmax, method, max_iter = 100, c = 1e-04, ep = 1e-04)
X |
Input an array with dimension \eqn{T \times p_1 \times p_2}, where \eqn{T} is the sample size, and \eqn{p_1} and \eqn{p_2} are the row and column dimensions of each matrix observation. |
W1 |
Only if method = "E_RM" or method = "E_ER", the weight matrix for the row factor space used by IHR; the default is NULL, in which case it is computed internally. |
W2 |
Only if method = "E_RM" or method = "E_ER", the weight matrix for the column factor space used by IHR; the default is NULL, in which case it is computed internally. |
kmax |
The user-supplied maximum factor numbers. Here it means the upper bound of the number of row factors and column factors. |
method |
Character string, specifying the method used to estimate the pair of factor numbers.
"P": the eigenvalue-ratio criterion corresponding to the RMFA method.
"E_RM": the rank-minimization criterion corresponding to IHR.
"E_ER": the eigenvalue-ratio criterion corresponding to IHR. |
max_iter |
Only if method = "P", "E_RM" or "E_ER", the maximum number of iterations of the underlying iterative algorithm. The default is 100. |
c |
A constant to avoid vanishing denominators. The default is \eqn{10^{-4}}. |
ep |
Only if method = "P", "E_RM" or "E_ER", the stopping tolerance of the iterations. The default is \eqn{10^{-4}}. |
If method="P"
, the number of factors and
are estimated by
where is a predetermined value larger than
and
.
is the j-th largest eigenvalue of a nonnegative definitive matrix. See the function
MHFA
for the definition of and
. For details, see He et al. (2023).
Define ,
where is estimated by IHR under the number of factor is
.
If method="E_RM"
, the number of factors and
are estimated by
where is the indicator function. In practice,
is set as
,
is set as
.
If method="E_ER"
, the number of factors and
are estimated by
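As a schematic illustration of the two IHR-based criteria, the hypothetical helper below applies both rules to the eigenvalues of a generic nonnegative definite matrix M (playing the role of \eqn{\hat{W}_1} or \eqn{\hat{W}_2}); the rank-minimization threshold shown is a placeholder, not the choice recommended in He et al. (2023).

#Hypothetical sketch of the eigenvalue-ratio and rank-minimization rules
select_k <- function(M, kmax, c = 1e-4, threshold = NULL) {
  lam <- eigen(M, symmetric = TRUE, only.values = TRUE)$values
  k_er <- which.max(lam[1:kmax] / (lam[2:(kmax + 1)] + c))  #eigenvalue-ratio rule
  if (is.null(threshold)) threshold <- c * lam[1]           #placeholder threshold
  k_rm <- sum(lam[1:kmax] >= threshold)                     #rank-minimization rule
  c(E_ER = k_er, E_RM = k_rm)
}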
\eqn{k_1} |
The estimated row factor number. |
\eqn{k_2} |
The estimated column factor number. |
Yong He, Changwei Zhao, Ran Zhao.
He, Y., Kong, X., Yu, L., Zhang, X., & Zhao, C. (2023). Matrix factor analysis: From least squares to iterative projection. Journal of Business & Economic Statistics, 1-26.
He, Y., Kong, X. B., Liu, D., & Zhao, R. (2023). Robust Statistical Inference for Large-dimensional Matrix-valued Time Series via Iterative Huber Regression. <arXiv:2306.03317>.
set.seed(11111)
T=20;p1=20;p2=20;k1=3;k2=3
R=matrix(runif(p1*k1,min=-1,max=1),p1,k1)
C=matrix(runif(p2*k2,min=-1,max=1),p2,k2)
X=array(0,c(T,p1,p2))
Y=X;E=Y
F=array(0,c(T,k1,k2))
for(t in 1:T){
  F[t,,]=matrix(rnorm(k1*k2),k1,k2)
  E[t,,]=matrix(rnorm(p1*p2),p1,p2)
  Y[t,,]=R%*%F[t,,]%*%t(C)
}
X=Y+E

KMHFA(X, kmax=6, method="P")
KMHFA(X, W1 = NULL, W2 = NULL, 6, "E_RM")
KMHFA(X, W1 = NULL, W2 = NULL, 6, "E_ER")
The function is to estimate the pair of factor numbers via eigenvalue ratios corresponding to \eqn{\alpha}-PCA.
KPCA(X, kmax, alpha = 0)
X |
Input an array with dimension \eqn{T \times p_1 \times p_2}, where \eqn{T} is the sample size, and \eqn{p_1} and \eqn{p_2} are the row and column dimensions of each matrix observation. |
kmax |
The user-supplied maximum factor numbers. Here it means the upper bound of the number of row factors and column factors. |
alpha |
A hyper-parameter balancing the information of the first and second moments (\eqn{\alpha \in [-1, +\infty)}). The default is 0. |
The function KPCA uses the eigenvalue-ratio idea to estimate the number of factors. In detail, the number of row factors is estimated by
\deqn{\hat{k}_1 = \arg\max_{1 \le j \le k_{max}} \frac{\lambda_j(\hat{M}_R)}{\lambda_{j+1}(\hat{M}_R)},}
where \eqn{k_{max}} is a given upper bound. \eqn{\hat{k}_2} is defined similarly with respect to \eqn{\hat{M}_C}. See the function alpha_PCA for the definitions of \eqn{\hat{M}_R} and \eqn{\hat{M}_C}. For more details, see Chen & Fan (2021).
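For illustration, the hypothetical helper below mimics the ratio criterion for the row factor number, building \eqn{\hat{M}_R} as in alpha_PCA; the packaged KPCA may differ in details.

#Hypothetical sketch: ratio criterion applied to the alpha-PCA row matrix
kpca_k1_sketch <- function(X, kmax, alpha = 0) {
  T <- dim(X)[1]; p1 <- dim(X)[2]; p2 <- dim(X)[3]
  Xbar <- apply(X, c(2, 3), mean)
  MR <- (1 + alpha) * Xbar %*% t(Xbar)
  for (t in 1:T) MR <- MR + (X[t, , ] - Xbar) %*% t(X[t, , ] - Xbar) / T
  MR <- MR / (p1 * p2)
  lam <- eigen(MR, symmetric = TRUE, only.values = TRUE)$values
  which.max(lam[1:kmax] / lam[2:(kmax + 1)])   #maximize adjacent eigenvalue ratios
}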
\eqn{k_1} |
The estimated row factor number. |
\eqn{k_2} |
The estimated column factor number. |
Yong He, Changwei Zhao, Ran Zhao.
Chen, E. Y., & Fan, J. (2021). Statistical inference for high-dimensional matrix-variate factor models. Journal of the American Statistical Association, 1-18.
set.seed(11111)
T=20;p1=20;p2=20;k1=3;k2=3
R=matrix(runif(p1*k1,min=-1,max=1),p1,k1)
C=matrix(runif(p2*k2,min=-1,max=1),p2,k2)
X=array(0,c(T,p1,p2))
Y=X;E=Y
F=array(0,c(T,k1,k2))
for(t in 1:T){
  F[t,,]=matrix(rnorm(k1*k2),k1,k2)
  E[t,,]=matrix(rnorm(p1*p2),p1,p2)
  Y[t,,]=R%*%F[t,,]%*%t(C)
}
X=Y+E

KPCA(X, 8, alpha = 0)
The function is to estimate the pair of factor numbers via eigenvalue ratios corresponding to the PE method.
KPE(X, kmax, c = 0)
X |
Input an array with dimension \eqn{T \times p_1 \times p_2}, where \eqn{T} is the sample size, and \eqn{p_1} and \eqn{p_2} are the row and column dimensions of each matrix observation. |
kmax |
The user-supplied maximum factor numbers. Here it means the upper bound of the number of row factors and column factors. |
c |
A constant to avoid vanishing denominators. The default is 0. |
The function KPE uses the eigenvalue-ratio idea to estimate the number of factors. First, obtain the initial estimators \eqn{\hat{R}} and \eqn{\hat{C}}. Second, define
\deqn{\hat{Y}_t = \frac{1}{p_2} X_t \hat{C}, \qquad \tilde{M}_1 = \frac{1}{T p_1} \sum_{t=1}^{T} \hat{Y}_t \hat{Y}_t^\top,}
and the number of row factors is estimated by
\deqn{\hat{k}_1 = \arg\max_{1 \le j \le k_{max}} \frac{\lambda_j(\tilde{M}_1)}{\lambda_{j+1}(\tilde{M}_1) + c},}
where \eqn{k_{max}} is a predetermined upper bound for \eqn{k_1} and \eqn{c} is a constant to avoid vanishing denominators. The estimator \eqn{\hat{k}_2} is defined similarly with respect to \eqn{\tilde{M}_2}, which is built by projecting \eqn{X_t^\top} onto the initial row factor space.
For details, see Yu et al. (2022).
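For illustration, the hypothetical helper below computes \eqn{\tilde{M}_1} from an initial estimate Chat of the column loadings (e.g. obtained with kmax factors) and applies the ratio criterion; the packaged KPE computes its own initial estimators internally.

#Hypothetical sketch of the KPE ratio criterion for the row factor number
kpe_k1_sketch <- function(X, Chat, kmax, c = 0) {
  T <- dim(X)[1]; p1 <- dim(X)[2]; p2 <- dim(X)[3]
  M1 <- matrix(0, p1, p1)
  for (t in 1:T) {
    Yt <- X[t, , ] %*% Chat / p2          #project X_t onto the column factor space
    M1 <- M1 + Yt %*% t(Yt) / (T * p1)
  }
  lam <- eigen(M1, symmetric = TRUE, only.values = TRUE)$values
  which.max(lam[1:kmax] / (lam[2:(kmax + 1)] + c))
}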
\eqn{k_1} |
The estimated row factor number. |
\eqn{k_2} |
The estimated column factor number. |
Yong He, Changwei Zhao, Ran Zhao.
Yu, L., He, Y., Kong, X., & Zhang, X. (2022). Projected estimation for large-dimensional matrix factor models. Journal of Econometrics, 229(1), 201-217.
set.seed(11111)
T=20;p1=20;p2=20;k1=3;k2=3
R=matrix(runif(p1*k1,min=-1,max=1),p1,k1)
C=matrix(runif(p2*k2,min=-1,max=1),p2,k2)
X=array(0,c(T,p1,p2))
Y=X;E=Y
F=array(0,c(T,k1,k2))
for(t in 1:T){
  F[t,,]=matrix(rnorm(k1*k2),k1,k2)
  E[t,,]=matrix(rnorm(p1*p2),p1,p2)
  Y[t,,]=R%*%F[t,,]%*%t(C)
}
X=Y+E

KPE(X, 8, c = 0)
This function is to fit the matrix factor model via the Huber loss. Two algorithms for robust factor analysis are provided. One is based on minimizing the Huber loss of the idiosyncratic error's Frobenius norm, which leads to a weighted iterative projection approach to compute and learn the parameters and is hence named Robust Matrix Factor Analysis (RMFA). The other is based on minimizing the element-wise Huber loss, which can be solved by an iterative Huber regression (IHR) algorithm.
MHFA(X, W1=NULL, W2=NULL, m1, m2, method, max_iter = 100, ep = 1e-04)
X |
Input an array with dimension \eqn{T \times p_1 \times p_2}, where \eqn{T} is the sample size, and \eqn{p_1} and \eqn{p_2} are the row and column dimensions of each matrix observation. |
W1 |
Only if method = "E", the weight matrix for the row factor space used by IHR; the default is NULL, in which case it is computed internally. |
W2 |
Only if method = "E", the weight matrix for the column factor space used by IHR; the default is NULL, in which case it is computed internally. |
m1 |
A positive integer indicating the row factor numbers. |
m2 |
A positive integer indicating the column factor numbers. |
method |
Character string, specifying the type of the estimation method to be used.
"P": the weighted iterative projection approach (RMFA), minimizing the Huber loss of the Frobenius norm of the idiosyncratic errors.
"E": the iterative Huber regression (IHR), minimizing the element-wise Huber loss. |
max_iter |
Only if method = "P" or method = "E", the maximum number of iterations. The default is 100. |
ep |
Only if method = "P" or method = "E", the stopping tolerance of the iterations. The default is \eqn{10^{-4}}. |
For the matrix factor models, He et al. (2023) propose a weighted iterative projection approach to compute and learn the parameters by minimizing the Huber loss function of the idiosyncratic error's Frobenius norm. In detail, for observations \eqn{X_t, t=1,2,\cdots,T}, define the weighted covariance matrices
\deqn{\hat{M}_R^w = \frac{1}{T} \sum_{t=1}^{T} w_t X_t \hat{C} \hat{C}^\top X_t^\top, \qquad \hat{M}_C^w = \frac{1}{T} \sum_{t=1}^{T} w_t X_t^\top \hat{R} \hat{R}^\top X_t,}
up to normalizing constants that do not affect the eigenvectors, where the weight \eqn{w_t} is induced by the derivative of the Huber loss and downweights observations with a large residual Frobenius norm; see He et al. (2023) for its exact form. The estimators of the loading matrices \eqn{R} and \eqn{C} are calculated as \eqn{\sqrt{p_1}} times the leading \eqn{k_1} eigenvectors of \eqn{\hat{M}_R^w} and \eqn{\sqrt{p_2}} times the leading \eqn{k_2} eigenvectors of \eqn{\hat{M}_C^w}. And
\deqn{\hat{F}_t = \frac{1}{p_1 p_2} \hat{R}^\top X_t \hat{C}.}
For details, see He et al. (2023).
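To convey the weighting idea, the hypothetical helper below forms Huber-type weights from the residual Frobenius norms; the exact weights and normalization used inside MHFA follow He et al. (2023) and may differ.

#Huber-type weights from residual Frobenius norms (one possible convention)
rmfa_weight <- function(X, Rhat, Chat, Fhat, tau) {
  T <- dim(X)[1]
  w <- numeric(T)
  for (t in 1:T) {
    res <- norm(X[t, , ] - Rhat %*% Fhat[t, , ] %*% t(Chat), type = "F")
    w[t] <- if (res <= tau) 1 else tau / res   #large residuals are downweighted
  }
  w
}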
The other one is based on minimizing the element-wise Huber loss. Define
\deqn{\mathcal{L}(R, F_1, \ldots, F_T, C) = \frac{1}{T p_1 p_2} \sum_{t=1}^{T} \sum_{i=1}^{p_1} \sum_{j=1}^{p_2} H_\tau\left(x_{t,ij} - r_i^\top F_t c_j\right),}
where \eqn{H_\tau(\cdot)} is the Huber loss with threshold \eqn{\tau}, \eqn{r_i} is the i-th row of \eqn{R} and \eqn{c_j} is the j-th row of \eqn{C}. Optimizing over one of \eqn{R}, \eqn{\{F_t\}} and \eqn{C} while keeping the other two fixed is a Huber regression, which leads to the iterative Huber regression (IHR) algorithm.
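As an illustration of a single IHR update, the hypothetical helper below solves for one factor matrix \eqn{F_t} with \eqn{R} and \eqn{C} held fixed, using the identity \eqn{vec(R F_t C^\top) = (C \otimes R) vec(F_t)} and MASS::rlm for the Huber fit; the packaged algorithm's tuning and update order may differ.

library(MASS)   #for rlm, which fits Huber-type robust regressions
update_Ft <- function(Xt, Rhat, Chat) {
  design <- kronecker(Chat, Rhat)      #(p1*p2) x (k1*k2) design matrix
  fit <- rlm(design, as.vector(Xt), psi = psi.huber, maxit = 50)
  matrix(coef(fit), nrow = ncol(Rhat), ncol = ncol(Chat))   #reshape vec(F_t)
}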
The return value is a list containing the following components:
F |
The estimated factor matrix of dimension \eqn{T \times m_1 \times m_2}. |
R |
The estimated row loading matrix of dimension \eqn{p_1 \times m_1}, satisfying \eqn{R^\top R = p_1 I_{m_1}}. |
C |
The estimated column loading matrix of dimension \eqn{p_2 \times m_2}, satisfying \eqn{C^\top C = p_2 I_{m_2}}. |
Yong He, Changwei Zhao, Ran Zhao.
He, Y., Kong, X., Yu, L., Zhang, X., & Zhao, C. (2023). Matrix factor analysis: From least squares to iterative projection. Journal of Business & Economic Statistics, 1-26.
He, Y., Kong, X. B., Liu, D., & Zhao, R. (2023). Robust Statistical Inference for Large-dimensional Matrix-valued Time Series via Iterative Huber Regression. <arXiv:2306.03317>.
set.seed(11111)
T=20;p1=20;p2=20;k1=3;k2=3
R=matrix(runif(p1*k1,min=-1,max=1),p1,k1)
C=matrix(runif(p2*k2,min=-1,max=1),p2,k2)
X=array(0,c(T,p1,p2))
Y=X;E=Y
F=array(0,c(T,k1,k2))
for(t in 1:T){
  F[t,,]=matrix(rnorm(k1*k2),k1,k2)
  E[t,,]=matrix(rnorm(p1*p2),p1,p2)
  Y[t,,]=R%*%F[t,,]%*%t(C)
}
X=Y+E

#Estimate the factor matrices and loadings by RMFA
fit1=MHFA(X, m1=3, m2=3, method="P")
Rhat1=fit1$R
Chat1=fit1$C
Fhat1=fit1$F

#Estimate the factor matrices and loadings by IHR
fit2=MHFA(X, W1=NULL, W2=NULL, 3, 3, "E")
Rhat2=fit2$R
Chat2=fit2$C
Fhat2=fit2$F

#Estimate the common component by RMFA
CC1=array(0,c(T,p1,p2))
for (t in 1:T){
  CC1[t,,]=Rhat1%*%Fhat1[t,,]%*%t(Chat1)
}
CC1

#Estimate the common component by IHR
CC2=array(0,c(T,p1,p2))
for (t in 1:T){
  CC2[t,,]=Rhat2%*%Fhat2[t,,]%*%t(Chat2)
}
CC2
This function is to fit the matrix factor model via the PE method by projecting the observation matrix onto the row or column factor space.
PE(X, m1, m2)
X |
Input an array with dimension \eqn{T \times p_1 \times p_2}, where \eqn{T} is the sample size, and \eqn{p_1} and \eqn{p_2} are the row and column dimensions of each matrix observation. |
m1 |
A positive integer indicating the row factor numbers. |
m2 |
A positive integer indicating the column factor numbers. |
For the matrix factor models, Yu et al. (2022) propose a projection estimation method to estimate the model parameters. In detail, for observations \eqn{X_t, t=1,2,\cdots,T}, the data matrix is projected to a lower dimensional space by setting
\deqn{Y_t = \frac{1}{p_2} X_t C.}
Given \eqn{Y_t}, define
\deqn{\tilde{M}_1 = \frac{1}{T p_1} \sum_{t=1}^{T} Y_t Y_t^\top,}
and then the row factor loading matrix \eqn{R} can be estimated by \eqn{\sqrt{p_1}} times the leading \eqn{k_1} eigenvectors of \eqn{\tilde{M}_1}. However, the projection matrix \eqn{C} is unavailable in practice. A natural solution is to replace it with a consistent initial estimator. The column factor loading matrix \eqn{C} can be similarly estimated by projecting \eqn{X_t^\top} onto the space of \eqn{R} with \eqn{\frac{1}{p_1} X_t^\top R}. See Yu et al. (2022) for the detailed algorithm.
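For illustration, the hypothetical helper below performs the projection step for the row loadings, assuming a consistent initial estimate Chat of the column loadings (e.g. from alpha_PCA); the packaged PE handles the initial estimators and iterations internally.

#Sketch of the projection step: project on Chat, then eigen-analyze
pe_row_loadings <- function(X, Chat, k1) {
  T <- dim(X)[1]; p1 <- dim(X)[2]; p2 <- dim(X)[3]
  M1 <- matrix(0, p1, p1)
  for (t in 1:T) {
    Yt <- X[t, , ] %*% Chat / p2       #project X_t onto the column factor space
    M1 <- M1 + Yt %*% t(Yt) / (T * p1)
  }
  sqrt(p1) * eigen(M1, symmetric = TRUE)$vectors[, 1:k1]   #sqrt(p1) x leading k1 eigenvectors
}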
The return value is a list containing the following components:
F |
The estimated factor matrix of dimension \eqn{T \times m_1 \times m_2}. |
R |
The estimated row loading matrix of dimension \eqn{p_1 \times m_1}, satisfying \eqn{R^\top R = p_1 I_{m_1}}. |
C |
The estimated column loading matrix of dimension \eqn{p_2 \times m_2}, satisfying \eqn{C^\top C = p_2 I_{m_2}}. |
Yong He, Changwei Zhao, Ran Zhao.
Yu, L., He, Y., Kong, X., & Zhang, X. (2022). Projected estimation for large-dimensional matrix factor models. Journal of Econometrics, 229(1), 201-217.
set.seed(11111)
T=20;p1=20;p2=20;k1=3;k2=3
R=matrix(runif(p1*k1,min=-1,max=1),p1,k1)
C=matrix(runif(p2*k2,min=-1,max=1),p2,k2)
X=array(0,c(T,p1,p2))
Y=X;E=Y
F=array(0,c(T,k1,k2))
for(t in 1:T){
  F[t,,]=matrix(rnorm(k1*k2),k1,k2)
  E[t,,]=matrix(rnorm(p1*p2),p1,p2)
  Y[t,,]=R%*%F[t,,]%*%t(C)
}
X=Y+E

#Estimate the factor matrices and loadings
fit=PE(X, k1, k2)
Rhat=fit$R
Chat=fit$C
Fhat=fit$F

#Estimate the common component
CC=array(0,c(T,p1,p2))
for (t in 1:T){
  CC[t,,]=Rhat%*%Fhat[t,,]%*%t(Chat)
}
CC