Title: | High-Dimensional Robust Factor Analysis |
---|---|
Description: | Factor models have been widely applied in areas such as economics and finance, and the well-known heavy-tailedness of macroeconomic/financial data should be taken into account when conducting factor analysis. We propose two algorithms for robust factor analysis based on the Huber loss. The first minimizes the Huber loss of the L2 norm of the idiosyncratic error, which amounts to Principal Component Analysis (PCA) on a weighted sample covariance matrix and is therefore named Huber PCA. The second minimizes the element-wise Huber loss, which can be solved by an iterative Huber regression algorithm. The package also provides code for traditional PCA, the Robust Two Step (RTS) method of He et al. (2022), and the Quantile Factor Analysis (QFA) method of Chen et al. (2021) and He et al. (2023). |
Authors: | Yong He [aut], Lingxiao Li [aut], Dong Liu [aut, cre], Wenxin Zhou [aut] |
Maintainer: | Dong Liu <[email protected]> |
License: | GPL-2 | GPL-3 |
Version: | 0.1.5 |
Built: | 2024-11-20 06:50:27 UTC |
Source: | CRAN |
This function fits factor models via the Huber Principal Component Analysis (HPCA) method. Two variants are available. The first minimizes the Huber loss of the norm of the idiosyncratic error, which amounts to Principal Component Analysis (PCA) on a weighted sample covariance matrix and is therefore named Huber PCA. The second minimizes the elementwise Huber loss, which can be solved by an iterative Huber regression algorithm.
HPCA(X, r, Method = "E", tau = NULL, scale_est="MAD", L_init = NULL, F_init = NULL, maxiter_HPCA = 100, maxiter_HLM = 100, eps = 0.001)
X |
Input matrix, of dimension T × N. |
r |
A positive integer indicating the number of factors. |
Method |
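Method = "P" minimizes the Huber loss of the L2 norm of the idiosyncratic error, while Method = "E" minimizes the elementwise Huber loss. |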
tau |
Optional user-supplied parameter for the Huber loss; default is NULL. |
scale_est |
A parameter for the elementwise Huber loss. |
L_init |
User-supplied initial value of the loadings; default is the PCA estimator. |
F_init |
User-supplied initial value of the factors; default is the PCA estimator. |
maxiter_HPCA |
The maximum number of iterations in the HPCA. The default is 100. |
maxiter_HLM |
The maximum number of iterations in the iterative Huber regression algorithm. The default is 100. |
eps |
The stopping criterion parameter in the HPCA. The default is 1e-3. |
See He et al. (2023) for details.
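The Huber loss mentioned in the description can be made concrete with a small base-R sketch. The helper names `huber_loss`, `huber_weight` and `reweighted_pca_step` are illustrative assumptions, not functions exported by the package. The reweighting step only illustrates the idea of running PCA on a weighted sample covariance matrix; the exact weights and updates used by HPCA are those given in He et al. (2023).

```r
# Huber loss: quadratic for |u| <= tau, linear beyond (standard definition).
huber_loss <- function(u, tau) {
  ifelse(abs(u) <= tau, 0.5 * u^2, tau * abs(u) - 0.5 * tau^2)
}

# Weight implied by the Huber loss: 1 inside the threshold, tau / |u| outside.
huber_weight <- function(u, tau) {
  pmin(1, tau / pmax(abs(u), .Machine$double.eps))
}

# One illustrative reweighting step (an assumption, not the package's code):
# down-weight time points with large residual norms, then take the leading
# eigenvectors of the weighted sample covariance matrix.
reweighted_pca_step <- function(X, Lhat, Fhat, tau) {
  n_series <- ncol(X)
  r <- ncol(Lhat)
  res_norm <- sqrt(rowSums((X - Fhat %*% t(Lhat))^2))  # ||e_t|| for each t
  w <- huber_weight(res_norm, tau)
  S_w <- crossprod(X * sqrt(w)) / nrow(X)              # sum_t w_t x_t x_t' / T
  L_new <- sqrt(n_series) *
    eigen(S_w, symmetric = TRUE)$vectors[, 1:r, drop = FALSE]
  F_new <- X %*% L_new / n_series                      # least-squares factors
  list(Lhat = L_new, Fhat = F_new, weights = w)
}
```

Observations whose residual norm exceeds `tau` receive weights below one, which is what limits the influence of heavy-tailed errors on the estimated loadings.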
The return value is a list containing the following:
Fhat |
The estimated factor matrix of dimension T × r. |
Lhat |
The estimated loading matrix of dimension N × r. |
m |
The number of iterations. |
Yong He, Lingxiao Li, Dong Liu, Wenxin Zhou.
He, Y., Li, L., Liu, D., Zhou, W., 2023. Huber Principal Component Analysis for Large-dimensional Factor Models.
set.seed(1)
T=50;N=50;r=3
L=matrix(rnorm(N*r,0,1),N,r);F=matrix(rnorm(T*r,0,1),T,r)
E=matrix(rnorm(T*N,0,1),T,N)
X=F%*%t(L)+E
fit=HPCA(X,r,Method = "E")
fit$Fhat;fit$Lhat
fit=HPCA(X,r,Method = "P")
fit$Fhat;fit$Lhat
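Since the package is motivated by heavy-tailed data, it may help to see the same simulation with Student t errors instead of Gaussian ones. The sketch below is an illustration only. It assumes the package is installed under the name HDRFA and skips the fitting step otherwise. The variable names `n_time`, `n_series`, `C_e`, `C_p` and `C_0` are introduced here for illustration.

```r
# Simulate a three-factor model whose idiosyncratic errors are heavy-tailed
# (Student t with 3 degrees of freedom) instead of Gaussian.
set.seed(1)
n_time <- 50; n_series <- 50; r <- 3
L <- matrix(rnorm(n_series * r), n_series, r)
F <- matrix(rnorm(n_time * r), n_time, r)
E <- matrix(rt(n_time * n_series, df = 3), n_time, n_series)
X <- F %*% t(L) + E

# Assumption: the package is installed under the name "HDRFA".
if (requireNamespace("HDRFA", quietly = TRUE)) {
  fit_e <- HDRFA::HPCA(X, r, Method = "E")
  fit_p <- HDRFA::HPCA(X, r, Method = "P")
  # Estimated common components from each fit and the true one.
  C_e <- fit_e$Fhat %*% t(fit_e$Lhat)
  C_p <- fit_p$Fhat %*% t(fit_p$Lhat)
  C_0 <- F %*% t(L)
  # Mean squared error of the estimated common component for each method.
  print(c(E = mean((C_e - C_0)^2), P = mean((C_p - C_0)^2)))
}
```

Comparing common components rather than Fhat and Lhat directly avoids the rotation ambiguity of factor models.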
This function estimates the number of factors via rank minimization corresponding to Huber Principal Component Analysis (HPCA).
HPCA_FN(X, rmax, Method = "E", tau = NULL, scale_est="MAD", threshold = NULL, L_init = NULL, F_init = NULL, maxiter_HPCA = 100, maxiter_HLM = 100, eps = 0.001)
X |
Input matrix, of dimension T × N. |
rmax |
The user-supplied maximum number of factors. |
Method |
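Method = "P" minimizes the Huber loss of the L2 norm of the idiosyncratic error, while Method = "E" minimizes the elementwise Huber loss. |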
tau |
Optional user-supplied parameter for the Huber loss; default is NULL. |
scale_est |
A parameter for the elementwise Huber loss. |
threshold |
The threshold of rank minimization; default is NULL. |
L_init |
User-supplied initial value of the loadings in the HPCA; default is the PCA estimator. |
F_init |
User-supplied initial value of the factors in the HPCA; default is the PCA estimator. |
maxiter_HPCA |
The maximum number of iterations in the HPCA. The default is 100. |
maxiter_HLM |
The maximum number of iterations in the iterative Huber regression algorithm. The default is 100. |
eps |
The stopping criterion parameter in the HPCA. The default is 1e-3. |
See He et al. (2023) for details.
rhat |
The estimated factor number. |
Yong He, Lingxiao Li, Dong Liu, Wenxin Zhou.
He, Y., Li, L., Liu, D., Zhou, W., 2023. Huber Principal Component Analysis for Large-dimensional Factor Models.
set.seed(1)
T=50;N=50;r=3
L=matrix(rnorm(N*r,0,1),N,r);F=matrix(rnorm(T*r,0,1),T,r)
E=matrix(rnorm(T*N,0,1),T,N)
X=F%*%t(L)+E
HPCA_FN(X,8,Method="E")
HPCA_FN(X,8,Method="P")
This function fits the quantile factor model via the Iterative Quantile Regression (IQR) algorithm.
IQR(X, r, tau, L_init = NULL, F_init = NULL, max_iter = 100, eps = 0.001)
X |
Input matrix, of dimension T × N. |
r |
A positive integer indicating the number of factors. |
tau |
The user-supplied quantile level. |
L_init |
User-supplied initial value of the loadings; default is the PCA estimator. |
F_init |
User-supplied initial value of the factors; default is the PCA estimator. |
max_iter |
The maximum number of iterations. The default is 100. |
eps |
The stopping criterion parameter. The default is 0.001. |
See Chen et al. (2021) and He et al. (2023) for details.
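Quantile regression, on which the IQR algorithm is built, minimizes the check loss at quantile level tau. The following base-R sketch shows that loss and verifies numerically that minimizing it over a constant recovers the sample quantile. The names `check_loss`, `avg_loss` and `q_hat` are illustrative assumptions and not part of the package, and the sketch does not reproduce the alternating updates of the algorithm itself.

```r
# Check (quantile) loss: rho_tau(u) = u * (tau - 1{u < 0}).
check_loss <- function(u, tau) u * (tau - (u < 0))

# Minimizing the average check loss over a constant location parameter
# recovers the sample quantile; here tau = 0.5 gives the median.
set.seed(1)
x <- rnorm(201)
grid <- sort(x)  # candidate minimizers: the observed values
avg_loss <- sapply(grid, function(q) mean(check_loss(x - q, 0.5)))
q_hat <- grid[which.min(avg_loss)]

# Compare the minimizer with the sample median.
print(c(minimizer = q_hat, median = median(x)))
```

In a quantile factor model the constant is replaced by the product of loadings and factors, and the loss is minimized over one set of unknowns at a time.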
The return value is a list containing the following:
Fhat |
The estimated factor matrix of dimension T × r. |
Lhat |
The estimated loading matrix of dimension N × r. |
t |
The number of iterations. |
Yong He, Lingxiao Li, Dong Liu, Wenxin Zhou.
Chen, L., Dolado, J.J., Gonzalo, J., 2021. Quantile factor models. Econometrica 89, 875–910.
He, Y., Kong, X., Yu, L., Zhao, P., 2023. Quantile factor analysis for large-dimensional time series with statistical guarantee. <arXiv:2006.08214>.
set.seed(1)
T=50;N=50;r=3
L=matrix(rnorm(N*r,0,1),N,r);F=matrix(rnorm(T*r,0,1),T,r)
E=matrix(rnorm(T*N,0,1),T,N)
X=F%*%t(L)+E
tau=0.5
fit=IQR(X,r,tau)
fit$Fhat;fit$Lhat
This function estimates the number of factors via rank minimization corresponding to Iterative Quantile Regression (IQR).
IQR_FN(X, rmax, tau, threshold = NULL, L_init = NULL, F_init = NULL, max_iter = 100, eps = 10^(-6))
X |
Input matrix, of dimension T × N. |
rmax |
The user-supplied maximum number of factors. |
tau |
The user-supplied quantile level. |
threshold |
The threshold of rank minimization; default is NULL. |
L_init |
User-supplied initial value of the loadings in the IQR; default is the PCA estimator. |
F_init |
User-supplied initial value of the factors in the IQR; default is the PCA estimator. |
max_iter |
The maximum number of iterations. The default is 100. |
eps |
The stopping criterion parameter of the IQR method. The default is 1e-06. |
See Chen et al. (2021) for more details.
rhat |
The estimated factor number. |
Yong He, Lingxiao Li, Dong Liu, Wenxin Zhou.
Chen, L., Dolado, J.J., Gonzalo, J., 2021. Quantile factor models. Econometrica 89, 875–910.
set.seed(1)
T=50;N=50;r=3
L=matrix(rnorm(N*r,0,1),N,r);F=matrix(rnorm(T*r,0,1),T,r)
E=matrix(rnorm(T*N,0,1),T,N)
X=F%*%t(L)+E
tau=0.5
IQR_FN(X,8,tau)
This function fits factor models via Principal Component Analysis (PCA) methods.
PCA(X, r, constraint = "L")
X |
Input matrix, of dimension T × N. |
r |
A positive integer indicating the number of factors. |
constraint |
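The type of identification condition. If constraint = "L", the condition is imposed on the estimated loading matrix; if constraint = "F", it is imposed on the estimated factor matrix. |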
See Bai (2003) for details.
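The two identification conditions can be illustrated with a base-R sketch of the standard PCA estimator for factor models. The function name `pca_factor` and its internals are assumptions made for illustration, following the usual normalizations Lhat'Lhat/N = I and Fhat'Fhat/T = I; the package's own PCA function may differ in detail.

```r
# Sketch of the PCA estimator under two normalizations (illustrative only).
pca_factor <- function(X, r, constraint = c("L", "F")) {
  constraint <- match.arg(constraint)
  n_time <- nrow(X)
  n_series <- ncol(X)
  if (constraint == "L") {
    # Loadings: sqrt(N) times the top-r eigenvectors of X'X,
    # so that Lhat'Lhat / N is the identity.
    V <- eigen(crossprod(X), symmetric = TRUE)$vectors[, 1:r, drop = FALSE]
    Lhat <- sqrt(n_series) * V
    Fhat <- X %*% Lhat / n_series
  } else {
    # Factors: sqrt(T) times the top-r eigenvectors of XX',
    # so that Fhat'Fhat / T is the identity.
    U <- eigen(tcrossprod(X), symmetric = TRUE)$vectors[, 1:r, drop = FALSE]
    Fhat <- sqrt(n_time) * U
    Lhat <- t(X) %*% Fhat / n_time
  }
  list(Fhat = Fhat, Lhat = Lhat)
}

set.seed(1)
X <- matrix(rnorm(50 * 3), 50, 3) %*% t(matrix(rnorm(50 * 3), 50, 3)) +
  matrix(rnorm(50 * 50), 50, 50)
fit_l <- pca_factor(X, 3, "L")
fit_f <- pca_factor(X, 3, "F")
# Both normalizations give the same estimated common component.
max(abs(fit_l$Fhat %*% t(fit_l$Lhat) - fit_f$Fhat %*% t(fit_f$Lhat)))
```

The choice of constraint only changes how the scale is split between factors and loadings; the fitted common component is the same rank-r approximation of X in both cases.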
The return value is a list containing the following:
Fhat |
The estimated factor matrix of dimension T × r. |
Lhat |
The estimated loading matrix of dimension N × r. |
Yong He, Lingxiao Li, Dong Liu, Wenxin Zhou.
Bai, J., 2003. Inferential theory for factor models of large dimensions. Econometrica 71, 135–171.
set.seed(1)
T=50;N=50;r=3
L=matrix(rnorm(N*r,0,1),N,r);F=matrix(rnorm(T*r,0,1),T,r)
E=matrix(rnorm(T*N,0,1),T,N)
X=F%*%t(L)+E
fit=PCA(X,3,"L")
t(fit$Lhat)%*%fit$Lhat/N
fit=PCA(X,3,"F")
t(fit$Fhat)%*%fit$Fhat/T
This function estimates the number of factors via eigenvalue ratios corresponding to Principal Component Analysis (PCA).
PCA_FN(X, rmax)
X |
Input matrix, of dimension T × N. |
rmax |
The user-supplied maximum number of factors. |
See Ahn and Horenstein (2013) for details.
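The eigenvalue-ratio idea can be sketched in a few lines of base R: order the eigenvalues of the sample covariance matrix and pick the index at which the ratio of consecutive eigenvalues is largest. The function name `eigen_ratio_rhat` is a hypothetical helper, and the scaling by NT is an assumption that does not affect the ratios; the package's PCA_FN may handle details differently.

```r
# Eigenvalue-ratio estimate of the number of factors (illustrative sketch).
eigen_ratio_rhat <- function(X, rmax) {
  stopifnot(rmax + 1 <= min(dim(X)))
  # Eigenvalues of X'X / (NT), obtained from the singular values of X.
  lambda <- svd(X, nu = 0, nv = 0)$d^2 / (nrow(X) * ncol(X))
  # Ratios lambda_k / lambda_{k+1} for k = 1, ..., rmax.
  ratios <- lambda[1:rmax] / lambda[2:(rmax + 1)]
  which.max(ratios)
}

set.seed(1)
X <- matrix(rnorm(50 * 3), 50, 3) %*% t(matrix(rnorm(50 * 3), 50, 3)) +
  matrix(rnorm(50 * 50), 50, 50)
eigen_ratio_rhat(X, 8)
```

The ratio is large where the eigenvalues drop from the factor-driven part of the spectrum to the noise part, which is why its maximizer serves as an estimate of the number of factors.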
rhat |
The estimated factor number. |
Yong He, Lingxiao Li, Dong Liu, Wenxin Zhou.
Ahn, S.C., Horenstein, A.R., 2013. Eigenvalue ratio test for the number of factors. Econometrica 81, 1203–1227.
set.seed(1)
T=50;N=50;r=3
L=matrix(rnorm(N*r,0,1),N,r);F=matrix(rnorm(T*r,0,1),T,r)
E=matrix(rnorm(T*N,0,1),T,N)
X=F%*%t(L)+E
PCA_FN(X,8)
This function fits large-dimensional elliptical factor models via the Robust Two Step (RTS) algorithm.
RTS(X, r)
X |
Input matrix, of dimension T × N. |
r |
A positive integer indicating the number of factors. |
See He et al. (2022) for details.
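A two-step procedure of this kind can be sketched in base R under one stated assumption: that the first step takes the leading eigenvectors of the sample multivariate (spatial) Kendall's tau matrix as loadings, and the second step obtains the factors by least squares on those loadings. The function names `spatial_kendall_tau` and `two_step_sketch` are hypothetical, and He et al. (2022) should be consulted for the actual procedure.

```r
# Sample spatial (multivariate) Kendall's tau matrix: the average of
# (x_t - x_s)(x_t - x_s)' / ||x_t - x_s||^2 over all pairs t < s.
spatial_kendall_tau <- function(X) {
  n_time <- nrow(X)
  n_series <- ncol(X)
  K <- matrix(0, n_series, n_series)
  for (t in seq_len(n_time - 1)) {
    # Differences x_s - x_t for all s > t, one per row.
    D <- sweep(X[(t + 1):n_time, , drop = FALSE], 2, X[t, ])
    sq_norm <- rowSums(D^2)
    keep <- sq_norm > 0  # skip pairs of identical observations
    D <- D[keep, , drop = FALSE] / sqrt(sq_norm[keep])
    K <- K + crossprod(D)
  }
  2 * K / (n_time * (n_time - 1))
}

# Step 1: loadings from the leading eigenvectors of the Kendall's tau matrix.
# Step 2: factors by least squares, using Lhat'Lhat / N = I.
two_step_sketch <- function(X, r) {
  n_series <- ncol(X)
  K <- spatial_kendall_tau(X)
  Lhat <- sqrt(n_series) *
    eigen(K, symmetric = TRUE)$vectors[, 1:r, drop = FALSE]
  Fhat <- X %*% Lhat / n_series
  list(Fhat = Fhat, Lhat = Lhat)
}
```

Because each pairwise difference is scaled to unit length, a single extreme observation cannot dominate the matrix, which is the source of robustness without moment conditions.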
The return value is a list containing the following:
Fhat |
The estimated factor matrix of dimension T × r. |
Lhat |
The estimated loading matrix of dimension N × r. |
Yong He, Lingxiao Li, Dong Liu, Wenxin Zhou.
He, Y., Kong, X., Yu, L., Zhang, X., 2022. Large-dimensional factor analysis without moment constraints. Journal of Business & Economic Statistics 40, 302–312.
set.seed(1)
T=50;N=50;r=3
L=matrix(rnorm(N*r,0,1),N,r);F=matrix(rnorm(T*r,0,1),T,r)
E=matrix(rnorm(T*N,0,1),T,N)
X=F%*%t(L)+E
fit=RTS(X,3)
fit$Fhat;fit$Lhat
This function robustly estimates the number of factors via eigenvalue ratios of the multivariate Kendall’s tau matrix.
RTS_FN(X, rmax)
X |
Input matrix, of dimension T × N. |
rmax |
The user-supplied maximum number of factors. |
See Yu et al. (2019) for details.
rhat |
The estimated factor number. |
Yong He, Lingxiao Li, Dong Liu, Wenxin Zhou.
Yu, L., He, Y., Zhang, X., 2019. Robust factor number specification for large-dimensional elliptical factor model. Journal of Multivariate Analysis 174, 104543.
set.seed(1)
T=50;N=50;r=3
L=matrix(rnorm(N*r,0,1),N,r);F=matrix(rnorm(T*r,0,1),T,r)
E=matrix(rnorm(T*N,0,1),T,N)
X=F%*%t(L)+E
RTS_FN(X,8)