DIFM:Dynamic ICAR Spatiotemporal Factor Models

Introduction

This package DIFM provides codes to run Dynamic ICAR Spatiotemporal Factor Models (DIFM). It includes codes to initialize parameters, run MCMC, evaluate models, and generate plots. Read (Shin and Ferreira, 2023) for more details.

The vignette presents the model description of DIFM and the assumptions for common factors and factor loadings. We provide an example of crime rates in the western states of United States.

Model Description

We assume that the region of study is partitioned into r subregions. In our motivating example, the subregions are the states in the contiguous United States. The variable of interest in each subregion is observed at n time points. Let yt be the r-dimensional vector of observations at time t (t = 1, 2, …, n.)

We assume that the spatiotemporal behavior of the r subregions can be represented by k factors, where usually k is much smaller than r. Specifically, we assume the model

where xt is the k-dimensional vector of factors at time t, B is an r × k matrix of factor loadings, and vt is the r-dimensional vector of errors at time t. We assume that the observational error vector vt, t = 1, 2, …, n is independent over time and follows a Gaussian distribution vt ∼ N(0, V), where V = diag(σ12, …, σr2). Each of the variances σ12, …, σr2 is specific to one of the r subregions, and thus they are known as idiosyncratic variances.

We assume that the vector of factors xt follows a dynamic linear model (West and Harrison, 1997; Prado et al., 2021). Specifically, we assume the general model where θt is a latent process that allows great flexibility in the description of the temporal evolution of xt. Specifically, θt may encode different types of temporal trends as well as seasonality. For example, in our application we assume a second-order polynomial DLM and specify θt as a vector of dimension 2k that contains the level and the gradient of xt at time t. In addition, the evolution matrix G describes the temporal evolution of the latent process θt. Further, ωt is a 2k-dimensional innovation vector with a dense covariance matrix W. Finally, the matrix F relates the vector of common factors xt to the appropriate elements of the latent process θt.

In the case of the second-order polynomial DLM that we consider, θt = (θt, 1, θt, 2, …, θt, 2k)T is a vector of dimension 2k where (θt, 1, θt, 3, …, θt, 2k − 1)T and (θt, 2, θt, 4, …, θt, 2k)T are respectively the level and the gradient of the vector of common factors xt. Thus, the matrix F that relates xt to θt is a k × 2k matrix of the form The evolution matrix G has dimension 2k × 2k and satisfies θt, 2j − 1 = θt − 1, 2j − 1 + θt − 1, 2j + ωt, 2j − 1 and θt, 2j = θt − 1, 2j + ωt, 2j, j = 1, …, k. Therefore, G = blockdiag(G0, …, G0) where

The specification of the factor loadings matrix B is crucial in our dynamic ICAR factor model. An important point to consider is the need for constraints on the matrix B to ensure identifiability of the model. Specifically, for any invertible k × k matrix A, substituting B and xt in Equation (1) by, respectively, B* = BA and xt* = A−1xt would lead to the same model. To ensure identifiability, we impose a hierarchical structural constraint that assumes that B is a full-rank block lower triangular matrix with diagonal elements equal to 1 (Aguilar and West, 2000). Specifically, we assume B has the form

To account for the spatial dependence among the factor loadings for neighboring subregions, we assume that each column of the matrix of factor loadings B follows an intrinsic conditional autoregressive model (Besag et al., 1991; Keefe et al., 2018). Specifically, we assume for the jth column Bj, j = 1, …, k, the density where H is a precision matrix that accounts for the spatial dependence among neighboring subregions and τj controls the strength of spatial correlation among factor loadings. Specifically, if subregions i and j are neighbors, then the corresponding element of the matrix H is hij = −gij where gij measures the strength of the association between subregions i and j. If subregions i and j are not neighbors, then gij = 0. Finally, the ith diagonal element of matrix H is hii = ∑j ≠ igij. For example, a widely used choice for H assumes gij = 1 if i and j share a border, and gij = 0 otherwise. In that case, hii is equal to the number of neighbors of subregion i. Further, we assume that there are no islands which implies that the matrix H has one eigenvalue equal to 0 and all other eigenvalues larger than zero. Note that we assume this prior for each column of B. Let B.j* = B(j + 1) : r, j be the jth column of B without the first j elements that are fixed. In addition, let Hj* = H(j + 1) : r, (j + 1) : r. Then, the conditional distribution of B.j* given B1 : j, j = (0, …, 0, 1)T is multivariate normal with mean vector hj = −Hj* − 1H(j + 1) : r, j and precision matrix Hj*.

Examples

We apply DIFM to western United States crime datasets. The data was collected from Bureau of Justice Statistics and available at disaster center website. In this vignette, we provide two datasets, Violent and Property for violent and property crime, respectively. The data was collected from 50 states of United States and District of Columbia from 1960 to 2019. In this example, we use the WestStates data included in DIFM package that contains the information of the map and polygon of the 11 western states: Arizona, California, Colorado, Idaho, Montana, Nevada, New Mexico, Oregon, Utah, Washington and Wyoming. The numbers represent the cases of crime per 100,000 people. We use the square root of the data to stabilize the variance.

Step 1: Read and explore the data

data(Violent)
data(Property)
data(WestStates)
Violent <- as.matrix(Violent)
Violent <- sqrt(Violent)
Property <- as.matrix(Property)
Property <- sqrt(Property)

After we call the datasets, we explore the data through plots.

par(mar=c(2,2,2,2))
layout(rbind(1:4, 5:8, c(9:11,0)))
for(i in 1:11){
  plot(1960:2019, Violent[,i], main = colnames(Violent)[i], type = "l", xlab = "", ylab = "")
}
Figure 1: Number of violent crimes

Figure 1: Number of violent crimes

Figure 1 shows the square root of the number of violent crimes in Western states. In most of the states, the cases of violent crimes soar by 2000 and have small changes. The trend after 2000 differ by states. Since many states share similar trend, we can assume that DIFM would be applicable in this case.

par(mar=c(2,2,2,2))
layout(rbind(1:4, 5:8, c(9:11,0)))
for(i in 1:11){
  plot(1960:2019, Property[,i], main = colnames(Property)[i], type = "l", xlab = "", ylab = "")
}
Figure 2: Number of property crimes

Figure 2: Number of property crimes

Figure 2 shows the square root of the number of property crimes in Western states. Property crimes soar by 1980 in western states, but would usually decrease from then. Many states show start of sharp decrease between 1980 and 1990. These information can be reperesented with smaller number of factors through DIFM.

Step 2: Run DIFM with range of factors.

Now we run DIFM with different number of factors. For our two examples, we try models from 1 to 4 factors. Before we start DIFM, we should permute the order of the variable to adjust the structural hierarchical constraint. We set the variable that would represent the factor well according to the eigenvectors. From Figure 1 and Figure 2, we can find that the time serires are rather non-stationary than stationary. Therefore, we consider a second order polynomial for the dynamic linear model. First, we run MCMC for the violent crime data.

n.iter <- 5000
n.save <- 10
G0 <- rbind(c(1,1), c(0,1))
Violent.permutation <- permutation.order(Violent, 4)
Violent <- Violent[,Violent.permutation]
Violent.Hlist <- buildH(WestStates, Violent.permutation)

set.seed(1101)

model.attributes1V <- difm.model.attributes(Violent, n.iter, n.factors = 1, G0)
hyp.parm1V <- difm.hyp.parm(model.attributes1V, Hlist = Violent.Hlist)
ViolentDIFM1 <- DIFMcpp(model.attributes1V, hyp.parm1V, Violent, every = n.save, verbose = FALSE)
ViolentAssess1 <- marginal_d_cpp(Violent, model.attributes1V, hyp.parm1V, ViolentDIFM1, verbose = FALSE)

model.attributes2V <- difm.model.attributes(Violent, n.iter, n.factors = 2, G0)
hyp.parm2V <- difm.hyp.parm(model.attributes2V, Hlist = Violent.Hlist)
ViolentDIFM2 <- DIFMcpp(model.attributes2V, hyp.parm2V, Violent, every = n.save, verbose = FALSE)
ViolentAssess2 <- marginal_d_cpp(Violent, model.attributes2V, hyp.parm2V, ViolentDIFM2, verbose = FALSE)

model.attributes3V <- difm.model.attributes(Violent, n.iter, n.factors = 3, G0)
hyp.parm3V <- difm.hyp.parm(model.attributes3V, Hlist = Violent.Hlist)
ViolentDIFM3 <- DIFMcpp(model.attributes3V, hyp.parm3V, Violent, every = n.save, verbose = FALSE)
ViolentAssess3 <- marginal_d_cpp(Violent, model.attributes3V, hyp.parm3V, ViolentDIFM3, verbose = FALSE)

model.attributes4V <- difm.model.attributes(Violent, n.iter, n.factors = 4, G0)
hyp.parm4V <- difm.hyp.parm(model.attributes4V, Hlist = Violent.Hlist)
ViolentDIFM4 <- DIFMcpp(model.attributes4V, hyp.parm4V, Violent, every = n.save, verbose = FALSE)
ViolentAssess4 <- marginal_d_cpp(Violent, model.attributes4V, hyp.parm4V, ViolentDIFM4, verbose = FALSE)

Now we run the MCMC for the property crime data.

Property.permutation <- permutation.order(Property, 4)
Property <- Property[,Property.permutation]
Property.Hlist <- buildH(WestStates, Property.permutation)

set.seed(1101)

model.attributes1P <- difm.model.attributes(Property, n.iter, n.factors = 1, G0)
hyp.parm1P <- difm.hyp.parm(model.attributes1P, Hlist = Property.Hlist)
PropertyDIFM1 <- DIFMcpp(model.attributes1P, hyp.parm1P, Property, every = n.save, verbose = FALSE)
PropertyAssess1 <- marginal_d_cpp(Property, model.attributes1P, hyp.parm1P, PropertyDIFM1, verbose = FALSE)

model.attributes2P <- difm.model.attributes(Property, n.iter, n.factors = 2, G0)
hyp.parm2P <- difm.hyp.parm(model.attributes2P, Hlist = Property.Hlist)
PropertyDIFM2 <- DIFMcpp(model.attributes2P, hyp.parm2P, Property, every = n.save, verbose = FALSE)
PropertyAssess2 <- marginal_d_cpp(Property, model.attributes2P, hyp.parm2P, PropertyDIFM2, verbose = FALSE)

model.attributes3P <- difm.model.attributes(Property, n.iter, n.factors = 3, G0)
hyp.parm3P <- difm.hyp.parm(model.attributes3P, Hlist = Property.Hlist)
PropertyDIFM3 <- DIFMcpp(model.attributes3P, hyp.parm3P, Property, every = n.save, verbose = FALSE)
PropertyAssess3 <- marginal_d_cpp(Property, model.attributes3P, hyp.parm3P, PropertyDIFM3, verbose = FALSE)

model.attributes4P <- difm.model.attributes(Property, n.iter, n.factors = 4, G0)
hyp.parm4P <- difm.hyp.parm(model.attributes4P, Hlist = Property.Hlist)
PropertyDIFM4 <- DIFMcpp(model.attributes4P, hyp.parm4P, Property, every = n.save, verbose = FALSE)
PropertyAssess4 <- marginal_d_cpp(Property, model.attributes4P, hyp.parm4P, PropertyDIFM4, verbose = FALSE)

We select the best model through the Metropolis-Laplace estimator of the predictive density.

PDtable <- matrix(NA, 2, 4)
PDtable[1,] <- c(ViolentAssess1$Maximum, ViolentAssess2$Maximum, ViolentAssess3$Maximum, ViolentAssess4$Maximum)
PDtable[2,] <- c(PropertyAssess1$Maximum, PropertyAssess2$Maximum, PropertyAssess3$Maximum, PropertyAssess4$Maximum)
PDtable <- as.data.frame(PDtable)
rownames(PDtable) <- c("Violent", "Property")
colnames(PDtable) <- paste("Factors =", 1:4)
kable(PDtable)
Factors = 1 Factors = 2 Factors = 3 Factors = 4
Violent -1423.441 -1404.070 -1460.197 -1530.303
Property -2082.268 -2154.098 -2197.381 -2371.112

From the table, we can find that the best models according to the evaluation method are 2-factor-model for violent crime and 1-factor-model for property crime datasets.

We present a few posterior distribution plots of the 4-factor models. The plots below are from the violent crime data.

oldpar <- par(mar = c(2,2,2,2))
plot_B.CI(ViolentDIFM4, permutation = Violent.permutation)

plot_X.CI(ViolentDIFM4)

par(oldpar)
plot_sigma2.CI(ViolentDIFM4, permutation = Violent.permutation)

plot_tau.CI(ViolentDIFM4)

plot_B.spatial(ViolentDIFM4, WestStates, layout.dim = c(2,2))

In the same manner, we present the results of the property crime data.

oldpar <- par(mar = c(2,2,2,2))
plot_B.CI(PropertyDIFM4, permutation = Property.permutation)

plot_X.CI(PropertyDIFM4)

par(oldpar)
plot_sigma2.CI(PropertyDIFM4, permutation = Property.permutation)

plot_tau.CI(PropertyDIFM4)

plot_B.spatial(PropertyDIFM4, WestStates, layout.dim = c(2,2))

Reference

Shin, H. and Ferreira, M. A. (2023). “Dynamic ICAR Spatiotemporal Factor Models.” Spatial Statistics, 56, 100763.

West, M. and Harrison, J. (1997). Bayesian Forecasting and Dynamic Models (2nd Ed.). Berlin, Heidelberg: Springer-Verlag.

Prado, R., Ferreira, M. A. R., and West, M. (2021). Time Series: Modeling, Computation, and Inference 2nd Ed. Boca Raton: Chapman & Hall/CRC.

Aguilar, O. and West, M. (2000). “Bayesian dynamic factor models and portfolio allocation.” Journal of Business and Economic Statistics, 18, 338–357.

Besag, J., York, J., and Mollie, A. (1991). “Bayesian image restoration, with two applications in spatial statistics.” Annals of the Institute of Statistical Mathematics, 43, 1

Keefe, M. J., Ferreira, M. A. R., and Franck, C. T. (2018). “On the formal specification of sum-zero constrained intrinsic conditional autoregressive models.” Spatial Statistics, 24.