---
title: "TRMF (Temporally Regularized Matrix Factorization)"
author: "Chad Hammerquist"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{TRMF (Temporally Regularized Matrix Factorization)}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

## Temporally Regularized Matrix Factorization

This package contains a set of functions that factor a time series matrix into a set of latent time series. Given a time series matrix $A$, alternating least squares is used to estimate the solution to the following problem:

$$\left (X,F\right) = \arg \min \limits_{X_m,F_m \in \Theta} \left( ||W\circ \left(X_m F_m-A \right)||_F^2+\lambda_f^2 ||F_m||_F^2 + \sum\limits_{s} R_s(X_m)\right)$$

where $W$ is a weighting matrix the same size as $A$ with 0's where $A$ has missing values, and $||\cdot||_F$ is the Frobenius norm. $\Theta$ is a constraint set for $F_m$. Possible constraints are non-negative values for NNLS-type solutions, values in the interval $[0,1]$, or non-negative values that sum row-wise to 1 for probability-like solutions. The last term performs the temporal regularization:

$$R_s(X) = \lambda_D^2||W_s(LX_s)||_2^2+\lambda_A^2||X_s||_2^2$$

where $L$ is a graph-Laplacian matrix, $X_s$ is a subset of the columns of $X$, and $W_s$ is a diagonal weight matrix. An example of $L$ is a finite difference matrix $D_{\alpha}$ approximating a derivative of order $\alpha$. In this case, if $\alpha = 2$, the regularization prefers penalized cubic spline solutions. If $\alpha=1$, it can be used to fit a random walk.
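
To make the regularization term concrete, the following base-R sketch builds a finite difference matrix and evaluates $R_s$ for a single latent series. It assumes $W_s$ is the identity and uses `diff(diag(n), differences = alpha)` as one way to form $D_{\alpha}$ for integer orders. The helpers `make_diff_matrix` and `trend_penalty` are hypothetical names introduced only for illustration; they are not part of the package, which may construct $L$ differently.

```{r}
# Hypothetical helper: (n - alpha) x n finite difference matrix of integer order alpha.
make_diff_matrix <- function(n, alpha) {
  diff(diag(n), differences = alpha)
}

# Hypothetical helper: R_s(x) = lambdaD^2 * ||D x||^2 + lambdaA^2 * ||x||^2,
# assuming W_s is the identity matrix.
trend_penalty <- function(x, alpha, lambdaD = 1, lambdaA = 0) {
  D <- make_diff_matrix(length(x), alpha)
  lambdaD^2 * sum((D %*% x)^2) + lambdaA^2 * sum(x^2)
}

n <- 50
time_idx <- seq_len(n)
linear_trend <- 0.5 * time_idx + 2   # straight line: second differences vanish
set.seed(1)
walk <- cumsum(rnorm(n))             # random walk: first differences are white noise

D2 <- make_diff_matrix(n, 2)
dim(D2)        # n - 2 rows and n columns
D2[1, 1:3]     # second-difference stencil: 1, -2, 1

trend_penalty(linear_trend, alpha = 2)  # zero up to rounding: no curvature to penalize
trend_penalty(walk, alpha = 2)          # positive: a rough path is penalized
trend_penalty(walk, alpha = 1)          # equals the sum of squared increments
```

With $\alpha = 2$ only curvature is penalized, so linear trends pass through unpenalized; with $\alpha = 1$ the penalty is the sum of squared increments, which matches the random-walk interpretation.
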

#### TRMF plus Regression

If necessary, external regressors can be included in the matrix factorization by modifying the first term to include the external regressor $E_x$:

$$\left (X,F\right) = \arg \min \limits_{X_m,F_m \in \Theta} \left( ||W\circ \left([X_m, E_x]F_m -A \right)||_F^2+\lambda_f^2 ||F_m||_F^2 + \sum\limits_{s} R_s(X_m)\right)$$

#### References:

Yu, Hsiang-Fu, Nikhil Rao, and Inderjit S. Dhillon. "High-dimensional time series prediction with missing values." arXiv preprint arXiv:1509.08333 (2015).

## How to use

To use the TRMF package to factor a time series matrix:

1.  Create a TRMF object for your time series matrix

```{r,eval=FALSE}
obj = create_TRMF(A)
```

    a.  It is recommended to scale the matrix using one of the scaling options in `create_TRMF`
    b.  Missing values are imputed by default

```{=html}
```

2.  Add a constraint and regularization for $F_m$ to the TRMF object

```{r,eval=FALSE}
obj = TRMF_columns(obj,reg_type = "nnls",lambda=1)
```

3.  Add a temporal regularization model for $X_m$ to the TRMF object

```{r,eval=FALSE}
obj = TRMF_trend(obj,numTS = 2,order = 2,lambdaD=1)
```

4.  Optionally, add another temporal regularization model for $X_m$ to the TRMF object

```{r,eval=FALSE}
obj = TRMF_trend(obj,numTS = 3,order = 0.5,lambdaD=10)
```

5.  Optionally, add an external regressor

```{r,eval=FALSE}
obj = TRMF_regression(obj, Xreg, type = "global")
```

6.  Train the object

```{r,eval=FALSE}
out = train(obj)
```

7.  Evaluate the solution

```{r,eval=FALSE}
summary(out)
plot(out)
resid(out)
fitted(out)
```

8.  Get the solution

```{r,eval=FALSE}
impute_TRMF(out)
coef(out)
Fm = out$Factors$Fm
Xm = out$Factors$Xm
predict(out)
```
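
The steps above can be strung together on a small synthetic matrix. The sketch below is illustrative only: the simulated data, the sizes, and the names `X_true`, `F_true`, and `A_filled` are assumptions made for this example. The package calls use only the arguments shown in the steps above, and they are guarded so that the data simulation runs even when TRMF is not installed.

```{r,eval=FALSE}
set.seed(42)
n_time   <- 100   # rows: time points
n_series <- 8     # columns: observed series
tt <- seq(0, 1, length.out = n_time)

# Two smooth latent series (columns of the assumed true X).
X_true <- cbind(sin(2 * pi * tt), tt^2)

# Non-negative loadings (assumed true F), compatible with reg_type = "nnls".
F_true <- matrix(runif(2 * n_series), nrow = 2)

# Observed matrix: low-rank signal plus noise, with 10% of entries set to NA.
A <- X_true %*% F_true + matrix(rnorm(n_time * n_series, sd = 0.05), nrow = n_time)
A[sample(length(A), size = round(0.1 * length(A)))] <- NA

# Run the workflow only when the package is available.
if (requireNamespace("TRMF", quietly = TRUE)) {
  library(TRMF)
  obj <- create_TRMF(A)                                      # step 1
  obj <- TRMF_columns(obj, reg_type = "nnls", lambda = 1)    # step 2
  obj <- TRMF_trend(obj, numTS = 2, order = 2, lambdaD = 1)  # step 3
  out <- train(obj)                                          # step 6
  A_filled <- impute_TRMF(out)                               # step 8: A with missing values filled in
}
```

Here `numTS = 2` matches the two latent series used to simulate the data; on real data the number of latent series and the regularization weights are tuning choices.
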