DIFM:Dynamic ICAR Spatiotemporal Factor Models

Introduction

This package DIFM provides codes to run Dynamic ICAR Spatiotemporal Factor Models (DIFM). It includes codes to initialize parameters, run MCMC, evaluate models, and generate plots. Read (Shin and Ferreira, 2023) for more details.

The vignette presents the model description of DIFM and the assumptions for common factors and factor loadings. We provide an example of crime rates in the western states of United States.

Model Description

We assume that the region of study is partitioned into r subregions. In our motivating example, the subregions are the states in the contiguous United States. The variable of interest in each subregion is observed at n time points. Let y_t be the r-dimensional vector of observations at time t (t = 1, 2, …, n.)

We assume that the spatiotemporal behavior of the r subregions can be represented by k factors, where usually k is much smaller than r. Specifically, we assume the model

where x_t is the k-dimensional vector of factors at time t, B is an r × k matrix of factor loadings, and v_t is the r-dimensional vector of errors at time t. We assume that the observational error vector v_t, t = 1, 2, …, n is independent over time and follows a Gaussian distribution v_t ∼ N(0, V), where V = diag(σ₁², …, σ_r²). Each of the variances σ₁², …, σ_r² is specific to one of the r subregions, and thus they are known as idiosyncratic variances.

We assume that the vector of factors x_t follows a dynamic linear model (West and Harrison, 1997; Prado et al., 2021). Specifically, we assume the general model where θ_t is a latent process that allows great flexibility in the description of the temporal evolution of x_t. Specifically, θ_t may encode different types of temporal trends as well as seasonality. For example, in our application we assume a second-order polynomial DLM and specify θ_t as a vector of dimension 2k that contains the level and the gradient of x_t at time t. In addition, the evolution matrix G describes the temporal evolution of the latent process θ_t. Further, ω_t is a 2k-dimensional innovation vector with a dense covariance matrix W. Finally, the matrix F relates the vector of common factors x_t to the appropriate elements of the latent process θ_t.

In the case of the second-order polynomial DLM that we consider, θ_t = (θ_t, 1, θ_t, 2, …, θ_t, 2k)^T is a vector of dimension 2k where (θ_t, 1, θ_t, 3, …, θ_{t, 2k − 1})^T and (θ_t, 2, θ_t, 4, …, θ_t, 2k)^T are respectively the level and the gradient of the vector of common factors x_t. Thus, the matrix F that relates x_t to θ_t is a k × 2k matrix of the form The evolution matrix G has dimension 2k × 2k and satisfies θ_{t, 2j − 1} = θ_{t − 1, 2j − 1} + θ_{t − 1, 2j} + ω_{t, 2j − 1} and θ_t, 2j = θ_{t − 1, 2j} + ω_t, 2j, j = 1, …, k. Therefore, G = blockdiag(G₀, …, G₀) where

The specification of the factor loadings matrix B is crucial in our dynamic ICAR factor model. An important point to consider is the need for constraints on the matrix B to ensure identifiability of the model. Specifically, for any invertible k × k matrix A, substituting B and x_t in Equation (1) by, respectively, B^* = BA and x_t^* = A⁻¹x_t would lead to the same model. To ensure identifiability, we impose a hierarchical structural constraint that assumes that B is a full-rank block lower triangular matrix with diagonal elements equal to 1 (Aguilar and West, 2000). Specifically, we assume B has the form

To account for the spatial dependence among the factor loadings for neighboring subregions, we assume that each column of the matrix of factor loadings B follows an intrinsic conditional autoregressive model (Besag et al., 1991; Keefe et al., 2018). Specifically, we assume for the jth column B_j, j = 1, …, k, the density where H is a precision matrix that accounts for the spatial dependence among neighboring subregions and τ_j controls the strength of spatial correlation among factor loadings. Specifically, if subregions i and j are neighbors, then the corresponding element of the matrix H is h_ij = −g_ij where g_ij measures the strength of the association between subregions i and j. If subregions i and j are not neighbors, then g_ij = 0. Finally, the ith diagonal element of matrix H is h_ii = ∑_j ≠ ig_ij. For example, a widely used choice for H assumes g_ij = 1 if i and j share a border, and g_ij = 0 otherwise. In that case, h_ii is equal to the number of neighbors of subregion i. Further, we assume that there are no islands which implies that the matrix H has one eigenvalue equal to 0 and all other eigenvalues larger than zero. Note that we assume this prior for each column of B. Let B_.j^* = B_{(j + 1) : r, j} be the jth column of B without the first j elements that are fixed. In addition, let H_j^* = H_{(j + 1) : r, (j + 1) : r}. Then, the conditional distribution of B_.j^* given B_{1 : j, j} = (0, …, 0, 1)^T is multivariate normal with mean vector h_j = −H_j^* − 1H_{(j + 1) : r, j} and precision matrix H_j^*.

Examples

We apply DIFM to western United States crime datasets. The data was collected from Bureau of Justice Statistics and available at disaster center website. In this vignette, we provide two datasets, Violent and Property for violent and property crime, respectively. The data was collected from 50 states of United States and District of Columbia from 1960 to 2019. In this example, we use the WestStates data included in DIFM package that contains the information of the map and polygon of the 11 western states: Arizona, California, Colorado, Idaho, Montana, Nevada, New Mexico, Oregon, Utah, Washington and Wyoming. The numbers represent the cases of crime per 100,000 people. We use the square root of the data to stabilize the variance.

Step 1: Read and explore the data

data(Violent)
data(Property)
data(WestStates)
Violent <- as.matrix(Violent)
Violent <- sqrt(Violent)
Property <- as.matrix(Property)
Property <- sqrt(Property)

After we call the datasets, we explore the data through plots.

par(mar=c(2,2,2,2))
layout(rbind(1:4, 5:8, c(9:11,0)))
for(i in 1:11){
  plot(1960:2019, Violent[,i], main = colnames(Violent)[i], type = "l", xlab = "", ylab = "")
}

Figure 1: Number of violent crimes

Figure 1 shows the square root of the number of violent crimes in Western states. In most of the states, the cases of violent crimes soar by 2000 and have small changes. The trend after 2000 differ by states. Since many states share similar trend, we can assume that DIFM would be applicable in this case.

par(mar=c(2,2,2,2))
layout(rbind(1:4, 5:8, c(9:11,0)))
for(i in 1:11){
  plot(1960:2019, Property[,i], main = colnames(Property)[i], type = "l", xlab = "", ylab = "")
}

Figure 2: Number of property crimes

Figure 2 shows the square root of the number of property crimes in Western states. Property crimes soar by 1980 in western states, but would usually decrease from then. Many states show start of sharp decrease between 1980 and 1990. These information can be reperesented with smaller number of factors through DIFM.

Step 2: Run DIFM with range of factors.

Now we run DIFM with different number of factors. For our two examples, we try models from 1 to 4 factors. Before we start DIFM, we should permute the order of the variable to adjust the structural hierarchical constraint. We set the variable that would represent the factor well according to the eigenvectors. From Figure 1 and Figure 2, we can find that the time serires are rather non-stationary than stationary. Therefore, we consider a second order polynomial for the dynamic linear model. First, we run MCMC for the violent crime data.

n.iter <- 5000
n.save <- 10
G0 <- rbind(c(1,1), c(0,1))
Violent.permutation <- permutation.order(Violent, 4)
Violent <- Violent[,Violent.permutation]
Violent.Hlist <- buildH(WestStates, Violent.permutation)

set.seed(1101)

model.attributes1V <- difm.model.attributes(Violent, n.iter, n.factors = 1, G0)
hyp.parm1V <- difm.hyp.parm(model.attributes1V, Hlist = Violent.Hlist)
ViolentDIFM1 <- DIFMcpp(model.attributes1V, hyp.parm1V, Violent, every = n.save, verbose = FALSE)
ViolentAssess1 <- marginal_d_cpp(Violent, model.attributes1V, hyp.parm1V, ViolentDIFM1, verbose = FALSE)

model.attributes2V <- difm.model.attributes(Violent, n.iter, n.factors = 2, G0)
hyp.parm2V <- difm.hyp.parm(model.attributes2V, Hlist = Violent.Hlist)
ViolentDIFM2 <- DIFMcpp(model.attributes2V, hyp.parm2V, Violent, every = n.save, verbose = FALSE)
ViolentAssess2 <- marginal_d_cpp(Violent, model.attributes2V, hyp.parm2V, ViolentDIFM2, verbose = FALSE)

model.attributes3V <- difm.model.attributes(Violent, n.iter, n.factors = 3, G0)
hyp.parm3V <- difm.hyp.parm(model.attributes3V, Hlist = Violent.Hlist)
ViolentDIFM3 <- DIFMcpp(model.attributes3V, hyp.parm3V, Violent, every = n.save, verbose = FALSE)
ViolentAssess3 <- marginal_d_cpp(Violent, model.attributes3V, hyp.parm3V, ViolentDIFM3, verbose = FALSE)

model.attributes4V <- difm.model.attributes(Violent, n.iter, n.factors = 4, G0)
hyp.parm4V <- difm.hyp.parm(model.attributes4V, Hlist = Violent.Hlist)
ViolentDIFM4 <- DIFMcpp(model.attributes4V, hyp.parm4V, Violent, every = n.save, verbose = FALSE)
ViolentAssess4 <- marginal_d_cpp(Violent, model.attributes4V, hyp.parm4V, ViolentDIFM4, verbose = FALSE)

Now we run the MCMC for the property crime data.

Property.permutation <- permutation.order(Property, 4)
Property <- Property[,Property.permutation]
Property.Hlist <- buildH(WestStates, Property.permutation)

set.seed(1101)

model.attributes1P <- difm.model.attributes(Property, n.iter, n.factors = 1, G0)
hyp.parm1P <- difm.hyp.parm(model.attributes1P, Hlist = Property.Hlist)
PropertyDIFM1 <- DIFMcpp(model.attributes1P, hyp.parm1P, Property, every = n.save, verbose = FALSE)
PropertyAssess1 <- marginal_d_cpp(Property, model.attributes1P, hyp.parm1P, PropertyDIFM1, verbose = FALSE)

model.attributes2P <- difm.model.attributes(Property, n.iter, n.factors = 2, G0)
hyp.parm2P <- difm.hyp.parm(model.attributes2P, Hlist = Property.Hlist)
PropertyDIFM2 <- DIFMcpp(model.attributes2P, hyp.parm2P, Property, every = n.save, verbose = FALSE)
PropertyAssess2 <- marginal_d_cpp(Property, model.attributes2P, hyp.parm2P, PropertyDIFM2, verbose = FALSE)

model.attributes3P <- difm.model.attributes(Property, n.iter, n.factors = 3, G0)
hyp.parm3P <- difm.hyp.parm(model.attributes3P, Hlist = Property.Hlist)
PropertyDIFM3 <- DIFMcpp(model.attributes3P, hyp.parm3P, Property, every = n.save, verbose = FALSE)
PropertyAssess3 <- marginal_d_cpp(Property, model.attributes3P, hyp.parm3P, PropertyDIFM3, verbose = FALSE)

model.attributes4P <- difm.model.attributes(Property, n.iter, n.factors = 4, G0)
hyp.parm4P <- difm.hyp.parm(model.attributes4P, Hlist = Property.Hlist)
PropertyDIFM4 <- DIFMcpp(model.attributes4P, hyp.parm4P, Property, every = n.save, verbose = FALSE)
PropertyAssess4 <- marginal_d_cpp(Property, model.attributes4P, hyp.parm4P, PropertyDIFM4, verbose = FALSE)

We select the best model through the Metropolis-Laplace estimator of the predictive density.

PDtable <- matrix(NA, 2, 4)
PDtable[1,] <- c(ViolentAssess1$Maximum, ViolentAssess2$Maximum, ViolentAssess3$Maximum, ViolentAssess4$Maximum)
PDtable[2,] <- c(PropertyAssess1$Maximum, PropertyAssess2$Maximum, PropertyAssess3$Maximum, PropertyAssess4$Maximum)
PDtable <- as.data.frame(PDtable)
rownames(PDtable) <- c("Violent", "Property")
colnames(PDtable) <- paste("Factors =", 1:4)
kable(PDtable)

	Factors = 1	Factors = 2	Factors = 3	Factors = 4
Violent	-1423.441	-1404.070	-1460.197	-1530.303
Property	-2082.268	-2154.098	-2197.381	-2371.112

From the table, we can find that the best models according to the evaluation method are 2-factor-model for violent crime and 1-factor-model for property crime datasets.

We present a few posterior distribution plots of the 4-factor models. The plots below are from the violent crime data.

oldpar <- par(mar = c(2,2,2,2))
plot_B.CI(ViolentDIFM4, permutation = Violent.permutation)

plot_X.CI(ViolentDIFM4)

par(oldpar)

plot_sigma2.CI(ViolentDIFM4, permutation = Violent.permutation)

plot_tau.CI(ViolentDIFM4)

plot_B.spatial(ViolentDIFM4, WestStates, layout.dim = c(2,2))

In the same manner, we present the results of the property crime data.

oldpar <- par(mar = c(2,2,2,2))
plot_B.CI(PropertyDIFM4, permutation = Property.permutation)

plot_X.CI(PropertyDIFM4)

par(oldpar)

plot_sigma2.CI(PropertyDIFM4, permutation = Property.permutation)

plot_tau.CI(PropertyDIFM4)

plot_B.spatial(PropertyDIFM4, WestStates, layout.dim = c(2,2))

Reference

Shin, H. and Ferreira, M. A. (2023). “Dynamic ICAR Spatiotemporal Factor Models.” Spatial Statistics, 56, 100763.

West, M. and Harrison, J. (1997). Bayesian Forecasting and Dynamic Models (2nd Ed.). Berlin, Heidelberg: Springer-Verlag.

Prado, R., Ferreira, M. A. R., and West, M. (2021). Time Series: Modeling, Computation, and Inference 2nd Ed. Boca Raton: Chapman & Hall/CRC.

Aguilar, O. and West, M. (2000). “Bayesian dynamic factor models and portfolio allocation.” Journal of Business and Economic Statistics, 18, 338–357.

Besag, J., York, J., and Mollie, A. (1991). “Bayesian image restoration, with two applications in spatial statistics.” Annals of the Institute of Statistical Mathematics, 43, 1

Keefe, M. J., Ferreira, M. A. R., and Franck, C. T. (2018). “On the formal specification of sum-zero constrained intrinsic conditional autoregressive models.” Spatial Statistics, 24.