Title: | Probit with Spatial Dependence, SAR, SEM and SARAR Models |
---|---|
Description: | Fast estimation of binomial spatial probit regression models with spatial autocorrelation for big datasets. |
Authors: | Davide Martinetti [aut, cre] , Ghislain Geniaux [aut] |
Maintainer: | Davide Martinetti <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.1 |
Built: | 2024-11-27 06:39:10 UTC |
Source: | CRAN |
The ProbitSpatial package fits spatial autoregressive (SAR) and spatial error (SEM) probit models. It also provides functions to simulate spatial binary data, an empirical data set and several methods for diagnosing the estimated model.
The main function of this package is ProbitSpatialFit. It fits both SAR and SEM models for big datasets in a reasonable time. The function is based on the maximisation of the approximate likelihood function. The approximation is inspired by the Mendell and Elston algorithm for computing multivariate normal probabilities and takes advantage of the sparsity of the spatial weight matrix. Two methods are available for the estimation of the model parameters. The first one, known as the conditional method (see Case (1992)), is very fast and performs relatively well in terms of accuracy of the estimated parameters. The second method, which maximises the full log-likelihood, is slower but should be more accurate. However, Monte Carlo experiments on simulated data reported in Martinetti and Geniaux (2017) showed that the full-log-likelihood approach does not always outperform the conditional method in terms of accuracy. At the present stage, our suggestion is to use the conditional method for a first estimation and to attempt the full-likelihood approach only as a second step, when the dataset contains no more than a few thousand observations.
Another feature of the ProbitSpatialFit function is the possibility to fit the model using the precision matrix instead of the variance-covariance matrix, since it is usually sparser and hence allows faster computations (see LeSage and Pace (2009)).
The output of the ProbitSpatialFit function is an object of class ProbitSpatial, for which the methods residuals, fitted, effects, predict and coef are available.
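As an illustration of how these methods fit together, the following is a minimal sketch that simulates a small SAR data set, fits it with the conditional method and calls the extractor methods listed above. It only uses functions and argument names documented in this manual; the sample size, the seed and the coefficient values are arbitrary choices made for the example.

```r
library(ProbitSpatial)

set.seed(1)
n <- 500
W <- generate_W(n, 4, seed = 123)                  # sparse row-standardised weights
X <- cbind(1, rnorm(n, 2, 2), rnorm(n, 0, 1))      # intercept plus two covariates
y <- sim_binomial_probit(W = W, X = X, beta = c(4, -2, 1), rho = 0.5, model = "SAR")
d <- data.frame(y = as.numeric(y), X1 = X[, 2], X2 = X[, 3])

mod <- ProbitSpatialFit(y ~ X1 + X2, d, W,
                        DGP = "SAR", method = "conditional", varcov = "varcov")

coef(mod)                                          # estimated parameters
p_hat <- fitted(mod, type = "response")            # fitted values on the "response" scale
y_hat <- fitted(mod, type = "binary", cut = 0.5)   # fitted values thresholded at cut
res   <- residuals(mod)                            # generalised residuals
```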
The package also contains the function sim_binomial_probit, which simulates data samples from both SAR and SEM models. It can be used to replicate the Monte Carlo experiments reported in Martinetti and Geniaux (2017) as well as the experiment of Calabrese and Elkink (2014).
An empirical data set, Katrina, on the reopening decisions of firms in New Orleans in the aftermath of Hurricane Katrina is also available (LeSage et al. (2011)).
Other packages on CRAN on the same subject are McSpatial (McMillen (2013)) and spatialprobit (Wilhelm and Godinho de Matos (2013)).
The core functions of the present package have been coded using the Rcpp and RcppEigen libraries (Bates and Eddelbuettel (2013)), which allow direct interchange of rich R objects between R and C++.
Davide Martinetti and Ghislain Geniaux
D. Bates and D. Eddelbuettel. Fast and elegant numerical linear algebra using the RcppEigen package. Journal of Statistical Software 52, 1–24, 2013.
A. C. Case. Neighborhood Influence and Technological Change. Regional Science and Urban Economics 22, 491–508, 1992.
R. Calabrese and J.A. Elkink. Estimators of binary spatial autoregressive models: a Monte Carlo study. Journal of Regional Science 54, 664–687, 2014.
J. LeSage and R.K. Pace. Introduction to Spatial Econometrics, CRC Press, chapter 10.1.6, 2009.
J. P. LeSage, R. K. Pace, N. Lam, R. Campanella and X. Liu. New Orleans business recovery in the aftermath of Hurricane Katrina. Journal of the Royal Statistical Society A 174, 1007–1027, 2011.
D. Martinetti and G. Geniaux. Approximate likelihood estimation of spatial probit models. Regional Science and Urban Economics 64, 30–45, 2017.
D. McMillen. McSpatial: Nonparametric spatial data analysis. R package version 2.0, 2013.
N. Mendell and R. Elston. Multifactorial qualitative traits: genetic analysis and prediction of recurrence risks. Biometrics 30, 41–57, 1974.
S. Wilhelm and M. Godinho de Matos. Estimating Spatial Probit Models in R. The R Journal 5, 130–143, 2013.
Extract a slot from a ProbitSpatial class object.
## S4 method for signature 'ProbitSpatial'
x$name
x | an object of class ProbitSpatial. |
name | name of the slot. |
The content of the slot.
Returns the coefficients estimated by a ProbitSpatial model.
## S3 method for class 'ProbitSpatial'
coef(object, ...)
object | an object of class ProbitSpatial. |
... | ignored |
It returns the value of the estimated parameters.
Performs conditional estimation of SAR model with variance-covariance matrix.
conditional_SAR_UC(myenv)
myenv | an environment. |
We discourage the direct use of this function.
The log-likelihood and the estimated parameters.
Performs conditional estimation of SAR model with precision matrix.
conditional_SAR_UP(myenv)
myenv | an environment. |
We discourage the direct use of this function.
The log-likelihood and the estimated parameters.
Performs conditional estimation of SARAR model with variance-covariance matrix.
conditional_SARAR_UC(myenv)
myenv | an environment. |
We discourage the direct use of this function.
The log-likelihood and the estimated parameters.
Performs conditional estimation of SARAR model with precision matrix.
conditional_SARAR_UP(myenv)
myenv | an environment. |
We discourage the direct use of this function.
The log-likelihood and the estimated parameters.
Performs conditional estimation of SEM model with variance-covariance matrix.
conditional_SEM_UC(myenv)
myenv | an environment. |
We discourage the direct use of this function.
The log-likelihood and the estimated parameters.
Performs conditional estimation of SEM model with precision matrix.
conditional_SEM_UP(myenv)
myenv | an environment. |
We discourage the direct use of this function.
The log-likelihood and the estimated parameters.
Returns the marginal effects of a ProbitSpatial model.
effects_ProbitSpatial(object)
object | an object of class ProbitSpatial. |
The effects function has different outputs according to the DGP of the ProbitSpatial model:
"SAR"
The marginal effects of a spatial autoregressive model are more complicated than the usual impact measures for non-spatial models. Here we follow LeSage and Pace and propose the following summaries for impact measures:
Average direct impact: the average over all the observations of the effect of a change in an explanatory variable of a single observation on the choice probability of that same observation.
Average indirect impact: the average over all the observations of the effect of a change in an explanatory variable on the choice probability of the neighbouring observations.
Average total impact: the sum of direct and indirect impacts.
"SEM"
The marginal effects should be interpreted as in a standard probit model.
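The three SAR summaries can be illustrated with a short base R sketch. It is a simplified version of the LeSage and Pace impact measures, not the code used by the effects method: it assumes a dense weight matrix, ignores the rescaling by the heteroskedastic error variance, and the function name sar_probit_impacts and the toy ring-shaped W are hypothetical.

```r
# Simplified SAR probit impacts for the r-th column of X (illustrative only).
sar_probit_impacts <- function(W, X, beta, rho, r) {
  n <- nrow(W)
  A_inv <- solve(diag(n) - rho * W)          # (I_n - rho W)^{-1}
  eta <- as.vector(A_inv %*% (X %*% beta))   # linear predictor of the latent variable
  # n x n matrix of partial derivatives: row i is scaled by dnorm(eta[i])
  M <- dnorm(eta) * A_inv * beta[r]
  direct <- mean(diag(M))                    # own-observation effects
  total <- mean(rowSums(M))                  # effects summed over all observations
  c(direct = direct, indirect = total - direct, total = total)
}

# Toy example: 6 observations on a ring, two neighbours each with weight 0.5
n <- 6
idx <- seq_len(n)
W <- matrix(0, n, n)
W[cbind(idx, idx %% n + 1)] <- 0.5
W[cbind(idx, (idx - 2) %% n + 1)] <- 0.5
set.seed(10)
X <- cbind(1, rnorm(n))
imp <- sar_probit_impacts(W, X, beta = c(0.2, 1), rho = 0.4, r = 2)
imp
```

By construction the indirect impact is the total impact minus the direct one, which matches the definition of the total impact given above.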
It returns the marginal effects of the estimated ProbitSpatial model.
J. LeSage and R.K. Pace. Introduction to Spatial Econometrics, CRC Press, chapter 10.1.6, 2009.
Extract the fitted values of a ProbitSpatial model.
## S3 method for class 'ProbitSpatial'
fitted(object, type = c("link", "response", "binary"), cut = 0.5, ...)
object | an object of class ProbitSpatial. |
type | the type of output: "link", "response" or "binary". |
cut | the threshold probability for the "binary" type. Default is 0.5. |
... | ignored |
Returns the vector of fitted values of the ProbitSpatial model.
Generate a spatial weight matrix of given size and number of nearest neighbors from randomly-located observations on the unit square.
generate_W(n, nneigh, seed=123)
n | the size of the matrix. |
nneigh | the number of nearest neighbors. |
seed | an integer to set the seed for the randomly generated locations. |
The output matrix has zero diagonal and is row-standardised. The n observations are allocated randomly in the unit square. For each observation, the nneigh closest observations w.r.t. the Euclidean distance are assigned a weight equal to 1/nneigh.
A matrix of class dgCMatrix (sparse matrix).
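The construction described above can be sketched with the Matrix package as follows. This is an illustrative re-implementation under stated assumptions, not the code of generate_W: the function name knn_weights is hypothetical, the locations are drawn with runif, and the result is not expected to reproduce the output of generate_W for the same seed.

```r
library(Matrix)

# Row-standardised k-nearest-neighbour weights from random points in the unit square.
knn_weights <- function(n, nneigh, seed = 123) {
  set.seed(seed)
  coords <- cbind(runif(n), runif(n))          # random locations
  D <- as.matrix(dist(coords))                 # Euclidean distances
  diag(D) <- Inf                               # an observation is not its own neighbour
  # n x nneigh matrix: row i holds the indices of the nneigh closest observations
  nb <- t(apply(D, 1, function(d) order(d)[seq_len(nneigh)]))
  sparseMatrix(i = rep(seq_len(n), times = nneigh),
               j = as.vector(nb),
               x = rep(1 / nneigh, n * nneigh),
               dims = c(n, n))
}

W_demo <- knn_weights(100, 4, seed = 12)
class(W_demo)
range(rowSums(W_demo))
```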
W <- generate_W(100,4,seed=12)
This dataset has been used in the LeSage et al. (2011) paper entitled "New Orleans business recovery in the aftermath of Hurricane Katrina" to study the decisions of shop owners to reopen business after Hurricane Katrina. The dataset contains 673 observations on 3 streets in New Orleans and can be used to estimate the spatial probit models and to replicate the findings in the paper.
data(Katrina)
Katrina is a data frame with 673 observations on the following 15 variables:
code
a numeric vector
long
longitude coordinate of store
lat
latitude coordinate of store
street1
a numeric vector
medinc
median income
perinc
a numeric vector
elevation
a numeric vector
flood
flood depth (measured in feet)
owntype
type of store ownership: "sole proprietorship" vs. "local chain" vs. "national chain"
sesstatus
socio-economic status of clientele (1-5): 1-2 = low status customers, 3 = middle, 4-5 = high status customers
sizeemp
"small size" vs. "medium size" vs. "large size" firms
openstatus1
a numeric vector
openstatus2
a numeric vector
days
days to reopen business
street
1=Magazine Street, 2=Carrollton Avenue, 3=St. Claude Avenue
Katrina is a data frame with 673 observations on the following 13 variables.
long
longitude coordinate of store
lat
latitude coordinate of store
flood_depth
flood depth (measured in feet)
log_medinc
log median income
small_size
binary variable for "small size" firms
large_size
binary variable for "large size" firms
low_status_customers
binary variable for low socio-economic status of clientele
high_status_customers
binary variable for high socio-economic status of clientele
owntype_sole_proprietor
a binary variable indicating "sole proprietor" ownership type
owntype_national_chain
a binary variable indicating "national_chain" ownership type
y1
reopening status in the very short period 0-3 months; 1=reopened, 0=not reopened
y2
reopening status in the period 0-6 months; 1=reopened, 0=not reopened
y3
reopening status in the period 0-12 months; 1=reopened, 0=not reopened
The Katrina dataset contains the data found on the website before some of the variables are recoded. For example, the socio-economic status of clientele is coded as 1-5 in the raw data, but only 3 levels will be used in estimation: 1-2 = low status customers, 3 = middle, 4-5 = high status customers. Hence, with "middle" as the reference category, Katrina contains 2 dummy variables for low status customers and high status customers.
The dataset Katrina is the result of these recoding operations and can be directly used for model estimation.
When defining the reopening status variables y1 (0-3 months), y2 (0-6 months), and y3 (0-12 months) from the days variable, the Matlab code ignores the seven cases where days=90. To be consistent with the number of cases in the paper, we define y1, y2 and y3 in the same way: y1=sum(days < 90), y2=sum(days < 180 & days != 90), y3=sum(days < 365 & days != 90). So this is not a bug, it's a feature.
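The recoding rule can be checked on a toy example. The days vector below is made up for illustration and is not taken from the Katrina data; it only shows how an observation with days equal to 90 is excluded from all three indicators.

```r
# Hypothetical reopening delays, in days
days <- c(10, 45, 90, 120, 179, 180, 200, 364, 400)

y1 <- as.integer(days < 90)                  # reopened within 0-3 months
y2 <- as.integer(days < 180 & days != 90)    # reopened within 0-6 months
y3 <- as.integer(days < 365 & days != 90)    # reopened within 0-12 months

c(sum(y1), sum(y2), sum(y3))                 # 2 4 7
```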
The raw data was obtained from the Royal Statistical Society dataset website and brought to RData format by Wilhelm and Godinho de Matos (2013).
J. P. LeSage, R. K. Pace, N. Lam, R. Campanella and X. Liu. New Orleans business recovery in the aftermath of Hurricane Katrina. Journal of the Royal Statistical Society A 174, 1007–1027, 2011.
S. Wilhelm and M. Godinho de Matos. Estimating Spatial Probit Models in R. The R Journal 5, 130–143, 2013.
## Not run:
data(Katrina)
attach(Katrina)
table(y1) # 300 of the 673 firms reopened during 0-3 months horizon, p.1016
table(y2) # 425 of the 673 firms reopened during 0-6 months horizon, p.1016
table(y3) # 478 of the 673 firms reopened during 0-12 months horizon, p.1016
detach(Katrina)

# replicate LeSage et al. (2011), Table 3, p.1017
require(spdep)

# (a) 0-3 months time horizon
# LeSage et al. (2011) use k=11 nearest neighbors in this case
nb <- knn2nb(knearneigh(cbind(Katrina$lat, Katrina$long), k=11))
listw <- nb2listw(nb, style="W")
W1 <- as(as_dgRMatrix_listw(listw), "CsparseMatrix")

fit1_cond <- ProbitSpatialFit(y1 ~ flood_depth + log_medinc + small_size +
  large_size + low_status_customers + high_status_customers +
  owntype_sole_proprietor + owntype_national_chain,
  W=W1, data=Katrina, DGP='SAR', method="conditional", varcov="varcov")
summary(fit1_cond)

fit1_FL <- ProbitSpatialFit(y1 ~ flood_depth + log_medinc + small_size +
  large_size + low_status_customers + high_status_customers +
  owntype_sole_proprietor + owntype_national_chain,
  W=W1, data=Katrina, DGP='SAR', method="full-lik", varcov="varcov")
summary(fit1_FL)

# same conditional fit, with order of approximation iW_CL=10
fit1_cond_10nn <- ProbitSpatialFit(y1 ~ flood_depth + log_medinc + small_size +
  large_size + low_status_customers + high_status_customers +
  owntype_sole_proprietor + owntype_national_chain,
  W=W1, data=Katrina, DGP='SAR', method="conditional", varcov="varcov",
  control=list(iW_CL=10))
summary(fit1_cond_10nn)

# (b) 0-6 months time horizon
# LeSage et al. (2011) use k=15 nearest neighbors
nb <- knn2nb(knearneigh(cbind(Katrina$lat, Katrina$long), k=15))
listw <- nb2listw(nb, style="W")
W2 <- as(as_dgRMatrix_listw(listw), "CsparseMatrix")

fit2_cond <- ProbitSpatialFit(y2 ~ flood_depth + log_medinc + small_size +
  large_size + low_status_customers + high_status_customers +
  owntype_sole_proprietor + owntype_national_chain,
  W=W2, data=Katrina, DGP="SAR", method="conditional", varcov="varcov")
summary(fit2_cond)

fit2_FL <- ProbitSpatialFit(y2 ~ flood_depth + log_medinc + small_size +
  large_size + low_status_customers + high_status_customers +
  owntype_sole_proprietor + owntype_national_chain,
  W=W2, data=Katrina, DGP="SAR", method="full-lik", varcov="varcov")
summary(fit2_FL)

# (c) 0-12 months time horizon
# LeSage et al. (2011) use k=15 nearest neighbors as in 0-6 months
W3 <- W2

fit3_cond <- ProbitSpatialFit(y3 ~ flood_depth + log_medinc + small_size +
  large_size + low_status_customers + high_status_customers +
  owntype_sole_proprietor + owntype_national_chain,
  W=W3, data=Katrina, DGP="SAR", method="conditional", varcov="varcov")
summary(fit3_cond)

fit3_FL <- ProbitSpatialFit(y3 ~ flood_depth + log_medinc + small_size +
  large_size + low_status_customers + high_status_customers +
  owntype_sole_proprietor + owntype_national_chain,
  W=W3, data=Katrina, DGP="SAR", method="full-lik", varcov="varcov")
summary(fit3_FL)

# replicate LeSage et al. (2011), Table 4, p.1018
# SAR probit model effects estimates for the 0-3-month time horizon
effects(fit1_cond)

# replicate LeSage et al. (2011), Table 5, p.1019
# SAR probit model effects estimates for the 0-6-month time horizon
effects(fit2_cond)

# replicate LeSage et al. (2011), Table 6, p.1020
# SAR probit model effects estimates for the 0-12-month time horizon
effects(fit3_cond)
## End(Not run)
Extract the names of a ProbitSpatial class object.
## S3 method for class 'ProbitSpatial'
names(x, ...)
x | an object of class ProbitSpatial. |
... | ignored |
Returns the names of the ProbitSpatial object.
Predicts from a ProbitSpatial model on a set X of covariates. Works both in-sample and out-of-sample, using the BLUP formula from Goulard et al. (2017).
## S3 method for class 'ProbitSpatial'
predict(object, X, type = c("link", "response", "binary"), cut = 0.5, oos = FALSE, WSO = NULL, ...)
object | an object of class ProbitSpatial. |
X | a matrix of explanatory variables. If oos=TRUE, it may contain more observations than the dataset on which the model has been trained. |
type | the type of output: "link", "response" or "binary". |
cut | the threshold probability for the "binary" type. Default is 0.5. |
oos | logical. If TRUE, out-of-sample predictions are returned. |
WSO | W matrix containing weights of in-sample and out-of-sample data. Observations must be ordered in such a way that the first elements belong to the in-sample data and the remaining ones to the out-of-sample data. |
... | ignored |
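If oos=FALSE, the function computes the predicted values for the estimated model (same as fitted). Otherwise, it applies the BLUP formula of Goulard et al. (2017):

$$\hat{y} = (\hat{y}_S, \hat{y}_O),$$

where the sub-indexes S and O refer, respectively, to the in-sample and out-of-sample data. $\hat{y}_S$ corresponds to the fitted values, while $\hat{y}_O$ is computed as follows:

$$\hat{y}_O = \left[(I-\rho W)^{-1} X\beta\right]_O - Q_{OO}^{-1} Q_{OS}\,(y_S - \hat{y}_S),$$

where $Q$ is the precision matrix of the model (the inverse of the variance-covariance matrix $\Sigma$) and the sub-indexes OO and OS refer to the corresponding block matrices.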
Returns a vector of predicted values for the set X of covariates if oos=FALSE, or the best linear unbiased predictors of the set XOS if oos=TRUE.
M. Goulard, T. Laurent and C. Thomas-Agnan. About predictions in spatial autoregressive models: optimal and almost optimal strategies. Spatial Economic Analysis 12, 304-325, 2017.
Class of Spatial Probit Model.
beta
numeric, the estimated parameters for the covariates.
rho
numeric, the estimated spatial autocorrelation parameter.
lambda
numeric, the estimated spatial error autocorrelation parameter.
coeff
numeric, all estimated parameters.
loglik
numeric, the likelihood associated to the estimated model.
formula
formula, the model formula.
nobs
numeric, number of observations.
nvar
numeric, number of covariates.
y
numeric, vector of observed dependent variable.
X
matrix, matrix of covariates.
time
numeric, estimation time.
DGP
character, DGP of the model (SAR, SEM or SARAR).
method
character, estimation method ("conditional" or "full-lik").
varcov
character, indicates the matrix used in the algorithm ("varcov" or "precision").
W
SparseMatrix, the spatial weight matrix of y.
M
SparseMatrix, the spatial weight matrix of the disturbances.
iW_CL
numeric, the order of approximation used in the conditional method.
iW_FL
numeric, the order of approximation used inside the likelihood function for the full-lik method.
iW_FG
numeric, the order of approximation used inside the gradient functions for the full-lik method.
reltol
numeric, the relative convergence tolerance.
prune
numeric, the pruning for the gradient functions.
env
an environment containing information for use in later function calls to save time.
message
an integer giving any additional information or NULL.
Approximate likelihood estimation of the probit model with spatial autoregressive (SAR), spatial error (SEM) or spatial autoregressive with autoregressive disturbances (SARAR) specification.
ProbitSpatialFit(formula, data, W, DGP='SAR', method="conditional", varcov="varcov", M=NULL, control=list())
formula | an object of class formula. |
data | the data set containing the variables of the model. |
W | the spatial weight matrix of class "dgCMatrix". |
DGP | the data generating process of the data: "SAR", "SEM" or "SARAR". Default is "SAR". |
method | the optimisation method: "conditional" or "full-lik". Default is "conditional". See Details. |
varcov | the likelihood function is computed using the variance-covariance matrix ("varcov") or the precision matrix ("precision"). Default is "varcov". |
M | the second spatial weight matrix for SARAR models. Same class as W. |
control | a list of control parameters. See Details. |
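The estimation is based on the approximate value of the true likelihood of spatial probit models. Writing $z$ for the latent variable, with observed outcome $y_i = 1$ if $z_i > 0$ and $y_i = 0$ otherwise, the DGP of the spatial autoregressive (SAR) model is the following

$$z = \rho W z + X\beta + \epsilon,$$

where the disturbances $\epsilon$ are iid standard normally distributed, $W$ is a sparse spatial weight matrix and $\rho$ is the spatial lag parameter. The variance of the error term is equal to $\Sigma = (I_n-\rho W)^{-1}\left((I_n-\rho W)^{-1}\right)^{T}$.

The DGP of the spatial error model (SEM) is as follows

$$z = X\beta + u, \qquad u = \rho W u + \epsilon,$$

where the disturbances $\epsilon$ are iid standard normally distributed, $W$ is a sparse spatial weight matrix and $\rho$ is the spatial error parameter. The variance of the error term $u$ is equal to $\Sigma = (I_n-\rho W)^{-1}\left((I_n-\rho W)^{-1}\right)^{T}$.

The DGP of the spatial autoregressive model with autoregressive disturbances (SARAR) is as follows

$$z = \rho W z + X\beta + u, \qquad u = \lambda M u + \epsilon,$$

where the disturbances $\epsilon$ are iid standard normally distributed, $W$ and $M$ are two sparse spatial weight matrices, while $\rho$ and $\lambda$ are the spatial lag and spatial error parameters, respectively. The variance of the error term is equal to $\Sigma = (I_n-\rho W)^{-1}(I_n-\lambda M)^{-1}\left((I_n-\lambda M)^{-1}\right)^{T}\left((I_n-\rho W)^{-1}\right)^{T}$.

The approximation is inspired by the Mendell-Elston approximation of the multivariate normal probabilities (see References). It makes use of the Cholesky decomposition of the variance-covariance matrix $\Sigma$.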
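To make the role of the variance-covariance matrix concrete, here is a small base R sketch for the SAR case. It assumes unit-variance disturbances, so that the reduced-form error has covariance $(I_n-\rho W)^{-1}((I_n-\rho W)^{-1})^{T}$, and it uses a dense toy weight matrix. It only illustrates the objects involved; the package works with sparse matrices and its internal code may differ.

```r
# Toy ring-shaped weight matrix: each observation has two neighbours with weight 0.5
n <- 5
idx <- seq_len(n)
W <- matrix(0, n, n)
W[cbind(idx, idx %% n + 1)] <- 0.5
W[cbind(idx, (idx - 2) %% n + 1)] <- 0.5

rho <- 0.4
A_inv <- solve(diag(n) - rho * W)   # (I_n - rho W)^{-1}
Sigma <- tcrossprod(A_inv)          # A_inv %*% t(A_inv), covariance of the reduced-form error
L <- t(chol(Sigma))                 # lower-triangular Cholesky factor, Sigma = L %*% t(L)

round(Sigma, 3)
```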
The ProbitSpatialFit command estimates the model by maximising the approximate log-likelihood. We propose two optimisation methods:
"conditional": it relies on a standard probit estimation which applies to the model estimated conditional on $\rho$ or $\lambda$.
"full-lik": it maximises the full log-likelihood using the analytical gradient functions (only available for SAR and SEM specifications). The optimisation is performed by means of the optim function with method = "BFGS".
In both cases a "conditional" estimation is performed first. If method="conditional", then ProbitSpatialFit returns the results of this first estimation. If method="full-lik", the function tries to improve the log-likelihood by means of a further exploration around the value of the parameters found by the conditional step.
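The two-step logic can be tried on simulated data with the sketch below. It relies only on functions, arguments and slots documented in this manual (the loglik slot is read through the `$` method); the sample size, seed and parameter values are arbitrary, and no particular ordering of the two log-likelihoods is implied.

```r
library(ProbitSpatial)

set.seed(2)
n <- 300
W <- generate_W(n, 4, seed = 123)
X <- cbind(1, rnorm(n), rnorm(n))
y <- sim_binomial_probit(W = W, X = X, beta = c(0.5, -1, 1), rho = 0.5, model = "SAR")
d <- data.frame(y = as.numeric(y), X1 = X[, 2], X2 = X[, 3])

# Step 1 only: conditional estimation
fit_cond <- ProbitSpatialFit(y ~ X1 + X2, d, W,
                             DGP = "SAR", method = "conditional", varcov = "varcov")

# Conditional step followed by the full-likelihood refinement
fit_full <- ProbitSpatialFit(y ~ X1 + X2, d, W,
                             DGP = "SAR", method = "full-lik", varcov = "varcov")

# Compare the log-likelihood slots and the estimated parameters
c(conditional = fit_cond$loglik, full = fit_full$loglik)
cbind(conditional = coef(fit_cond), full = coef(fit_full))
```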
The conditional step is usually very accurate and particularly fast. The second step is more time consuming and does not always improve the results of the first step. We advise against using the full-likelihood method for sample sizes bigger than ten thousand, since the computation of the gradients is quite slow. Simulation studies reported in Martinetti and Geniaux (2017) show that the conditional estimation is highly reliable, even when compared with the full-likelihood one.
In order to reduce the computation time of the function ProbitSpatialFit, we propose a variant of the likelihood-function estimation that uses the inverse of the variance-covariance matrix (a.k.a. precision matrix). This variant applies to both the "conditional" and the "full-lik" methods and can be invoked by setting varcov="precision". Simulation studies reported in Martinetti and Geniaux (2017) suggest that the accuracy of the results obtained with the precision matrix is sometimes worse than that obtained with the true variance-covariance matrix, but the estimation time is considerably reduced.
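The reason the precision matrix is cheaper to work with can be seen on a toy SAR example. Assuming unit-variance disturbances, the precision matrix is $(I_n-\rho W)^{T}(I_n-\rho W)$, which is a product of sparse matrices and requires no inversion, whereas the variance-covariance matrix requires inverting $I_n-\rho W$. The sketch below uses the Matrix package and a hypothetical ring-shaped W; it is not the package's internal code.

```r
library(Matrix)

n <- 50
idx <- seq_len(n)
# Ring-shaped sparse W: two neighbours per observation, each with weight 0.5
W <- sparseMatrix(i = c(idx, idx),
                  j = c(idx %% n + 1, (idx - 2) %% n + 1),
                  x = rep(0.5, 2 * n),
                  dims = c(n, n))
rho <- 0.4
A <- Diagonal(n) - rho * W

Q <- t(A) %*% A                          # precision matrix, stays sparse
Sigma <- tcrossprod(solve(as.matrix(A))) # variance-covariance matrix, needs an inversion

nnzero(Q)                                # 5 non-zero entries per row on this ring
max(abs(as.matrix(Q %*% Sigma) - diag(n)))  # Q is the inverse of Sigma, up to rounding
```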
The control argument is a list that can supply any of the following components:
iW_CL
the order of approximation of $(I_n-\rho W)^{-1}$ used in the "conditional" method. Default is 6, while 0 means no approximation (it uses exact inversion of matrices, not suitable for big sample sizes). See Martinetti and Geniaux (2017) for further references.
iW_FL
the order of approximation of $(I_n-\rho W)^{-1}$ used in the computation of the likelihood function for the "full-lik" method. Default is 0, meaning no approximation.
iW_FG
the order of approximation of $(I_n-\rho W)^{-1}$ used in the computation of the gradient functions for the "full-lik" method. Default is 0, meaning no approximation.
reltol
relative convergence tolerance. It represents tol in the optimize function for method="conditional" and reltol in the optim function for method="full-lik". Default is 1e-5.
prune
the pruning value used in the gradients. Default is 0, meaning no pruning. Typical values are around 1e-3 and 1e-6. They help reduce the estimation time of the gradient functions.
silent
Default is TRUE.
Returns an object of class ProbitSpatial.
N. Mendell and R. Elston. Multifactorial qualitative traits: genetic analysis and prediction of recurrence risks. Biometrics 30, 41–57, 1974.
D. Martinetti and G. Geniaux. Approximate likelihood estimation of spatial probit models. Regional Science and Urban Economics 64, 30-45, 2017.
n <- 1000
nneigh <- 3
rho <- 0.5
beta <- c(4,-2,1)
W <- generate_W(n,nneigh,seed=123)
X <- cbind(1,rnorm(n,2,2),rnorm(n,0,1))
colnames(X) <- c("intercept","X1","X2")
y <- sim_binomial_probit(W=W,X=X,beta=beta,rho=rho,model="SAR")
d <- as.data.frame(cbind(y,X))
mod <- ProbitSpatialFit(y~X1+X2,d,W,
  DGP='SAR',method="conditional",varcov="varcov")
Compute the residuals of an estimated ProbitSpatial model.
## S3 method for class 'ProbitSpatial'
residuals(object, ...)
object | an object of class ProbitSpatial. |
... | ignored |
Returns a vector containing the generalised residuals of the ProbitSpatial model.
The function sim_binomial_probit is used to generate the dependent variable of a spatial binomial probit model, where all the data and parameters of the model can be modified by the user.
sim_binomial_probit(W, X, beta, rho, model="SAR", M=NULL, lambda=NULL, sigma2=1, ord_iW=6, seed=123)
W | the spatial weight matrix (works for |
X | the matrix of covariates. |
beta | the value of the covariates parameters. |
rho | the value of the spatial dependence parameter (works for |
model | the type of model, between "SAR", "SEM" and "SARAR". Default is "SAR". |
M | the second spatial weight matrix (only if model="SARAR"). |
lambda | the value of the spatial dependence parameter (only if model="SARAR"). |
sigma2 | the variance of the error term (Default is 1). |
ord_iW | the order of approximation of the matrix $(I_n-\rho W)^{-1}$. |
seed | to set the random generator seed of the error term. |
The sim_binomial_probit function generates a vector of dependent variables for a spatial probit model. It can simulate the following DGPs (Data Generating Processes), where $z$ is the latent variable and the returned value is 1 if $z_i > 0$ and 0 otherwise:
SAR: $z = (I_n-\rho W)^{-1}(X\beta+\epsilon)$
SEM: $z = X\beta + (I_n-\rho W)^{-1}\epsilon$
SARAR: $z = (I_n-\rho W)^{-1}\left(X\beta + (I_n-\lambda M)^{-1}\epsilon\right)$
where the $\epsilon$ are independent and normally distributed with mean zero and variance sigma2 (default is 1).
The matrix X of covariates, the corresponding parameters beta, the spatial weight matrix W and the corresponding spatial dependence parameter rho need to be passed by the user. The same applies to lambda and M for the SARAR model.
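For readers who want to see the SAR case spelled out, the following base R sketch draws one sample from the latent model $z = (I_n-\rho W)^{-1}(X\beta+\epsilon)$ with exact inversion and thresholds it at zero. It is a hand-written illustration with a hypothetical dense ring-shaped W and arbitrary parameter values, not the code of sim_binomial_probit, and it will not reproduce that function's random draws.

```r
set.seed(42)
n <- 200
idx <- seq_len(n)

# Ring-shaped row-standardised W: two neighbours per observation
W <- matrix(0, n, n)
W[cbind(idx, idx %% n + 1)] <- 0.5
W[cbind(idx, (idx - 2) %% n + 1)] <- 0.5

X <- cbind(1, rnorm(n))
beta <- c(0.5, 1)
rho <- 0.5
sigma2 <- 1

eps <- rnorm(n, mean = 0, sd = sqrt(sigma2))          # iid disturbances
z <- solve(diag(n) - rho * W, X %*% beta + eps)       # latent SAR variable
y <- as.integer(z > 0)                                # observed binary outcome

table(y)
```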
The matrix $(I_n-\rho W)^{-1}$ is computed using the ApproxiW function, which can either invert $(I_n-\rho W)$ exactly, if ord_iW=0 (not suitable for n bigger than 1000), or use the Taylor approximation

$$(I_n-\rho W)^{-1} \approx I_n + \rho W + (\rho W)^2 + \dots + (\rho W)^{k}$$

of order k = ord_iW (default is approximation of order 6).
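The quality of the truncated series can be checked numerically. The sketch below is a plain base R version of the idea, not the ApproxiW function itself: approx_inverse is a hypothetical helper in which order must be at least 1 (unlike the package convention where 0 requests exact inversion). For a row-standardised W with non-negative weights, every row of the remainder sums to rho^(order+1)/(1-rho), which gives a simple error bound.

```r
# Truncated Taylor (Neumann) series: I + rho W + ... + (rho W)^order
approx_inverse <- function(W, rho, order) {
  n <- nrow(W)
  term <- diag(n)
  total <- diag(n)
  for (k in seq_len(order)) {
    term <- rho * (term %*% W)   # (rho W)^k
    total <- total + term
  }
  total
}

n <- 20
idx <- seq_len(n)
W <- matrix(0, n, n)
W[cbind(idx, idx %% n + 1)] <- 0.5
W[cbind(idx, (idx - 2) %% n + 1)] <- 0.5
rho <- 0.5

exact <- solve(diag(n) - rho * W)
# Maximum absolute row sum of the approximation error, for orders 2, 6 and 10
err <- sapply(c(2, 6, 10), function(k) max(rowSums(abs(exact - approx_inverse(W, rho, k)))))
err
rho^(6 + 1) / (1 - rho)   # theoretical remainder for order 6
```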
a vector of zeros and ones
n <- 500
nneigh <- 3
rho <- 0.5
beta <- c(4,-2,1)
W <- generate_W(n,nneigh)
X <- cbind(1,rnorm(n,2,2),rnorm(n,0,1))
# SAR
y <- sim_binomial_probit(W,X,beta,rho,model="SAR") # SAR model
# SEM
y <- sim_binomial_probit(W,X,beta,rho,model="SEM") # SEM model
# SARAR
M <- generate_W(n,nneigh,seed=1)
lambda <- -0.5
y <- sim_binomial_probit(W,X,beta,rho,model="SARAR",M=M,lambda=lambda)
Print the results of a ProbitSpatial model.
## S3 method for class 'ProbitSpatial'
summary(object, covar = FALSE, ...)
object | an object of class ProbitSpatial. |
covar | should the statistics be computed with the variance matrix of the parameters? Default is FALSE, hence likelihood-ratio statistics are printed. |
... | further arguments |
The summary function prints:
Features of the model and dataset.
Estimation time.
Standard errors of the estimated parameters. If covar=TRUE, it uses the matrix of variance of the parameters, else the likelihood ratio test.
Confusion matrix and accuracy of the estimated model.
This function does not return any value.
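Because summary prints its results rather than returning them, the confusion matrix and accuracy can also be recomputed by hand when they are needed as R objects. The sketch below uses made-up observed outcomes and fitted probabilities, and a 0.5 threshold chosen for the example; with a fitted model, the probabilities would come from fitted(object, type = "response").

```r
# Hypothetical observed outcomes and fitted probabilities
y     <- c(1, 0, 1, 1, 0, 0, 1, 0)
p_hat <- c(0.9, 0.2, 0.4, 0.7, 0.6, 0.1, 0.8, 0.3)

y_hat <- as.integer(p_hat > 0.5)                  # binary prediction at cut = 0.5
conf_mat <- table(observed = y, predicted = y_hat)
accuracy <- sum(diag(conf_mat)) / sum(conf_mat)   # share of correct predictions

conf_mat
accuracy                                          # 0.75
```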