Title: | Econometrics of Network Data |
---|---|
Description: | Simulating and estimating peer effect models and network formation models. The class of peer effect models includes linear-in-means models (Lee, 2004; <doi:10.1111/j.1468-0262.2004.00558.x>), Tobit models (Xu and Lee, 2015; <doi:10.1016/j.jeconom.2015.05.004>), and discrete numerical data models (Houndetoungan, 2024; <doi:10.2139/ssrn.3721250>). The network formation models include pair-wise regressions with degree heterogeneity (Graham, 2017; <doi:10.3982/ECTA12679>) and exponential random graph models (Mele, 2017; <doi:10.3982/ECTA10400>). |
Authors: | Aristide Houndetoungan [cre, aut] |
Maintainer: | Aristide Houndetoungan <[email protected]> |
License: | GPL-3 |
Version: | 2.2.0 |
Built: | 2024-12-11 07:02:02 UTC |
Source: | CRAN |
The CDatanet package simulates and estimates peer effect models and network formation models. The class of peer effect models includes linear-in-means models (Lee, 2004; Lee et al., 2010),
Tobit models (Xu and Lee, 2015), and discrete numerical data models (Houndetoungan, 2024).
The network formation models include pair-wise regressions with degree heterogeneity (Graham, 2017; Yan et al., 2019) and exponential random graph models (Mele, 2017).
To make the computations faster CDatanet uses C++
through the Rcpp package (Eddelbuettel et al., 2011).
Maintainer: Aristide Houndetoungan [email protected]
Eddelbuettel, D., & Francois, R. (2011). Rcpp: Seamless R and C++
integration. Journal of Statistical Software, 40(8), 1-18, doi:10.18637/jss.v040.i08.
Houndetoungan, E. A. (2024). Count Data Models with Social Interactions under Rational Expectations. Available at SSRN 3721250, doi:10.2139/ssrn.3721250.
Lee, L. F. (2004). Asymptotic distributions of quasi-maximum likelihood estimators for spatial autoregressive models. Econometrica, 72(6), 1899-1925, doi:10.1111/j.1468-0262.2004.00558.x.
Lee, L. F., Liu, X., & Lin, X. (2010). Specification and estimation of social interaction models with network structures. The Econometrics Journal, 13(2), 145-176, doi:10.1111/j.1368-423X.2010.00310.x
Xu, X., & Lee, L. F. (2015). Maximum likelihood estimation of a spatial autoregressive Tobit model. Journal of Econometrics, 188(1), 264-280, doi:10.1016/j.jeconom.2015.05.004.
Graham, B. S. (2017). An econometric model of network formation with degree heterogeneity. Econometrica, 85(4), 1033-1063, doi:10.3982/ECTA12679.
Mele, A. (2017). A structural model of dense network formation. Econometrica, 85(3), 825-850, doi:10.3982/ECTA10400.
Yan, T., Jiang, B., Fienberg, S. E., & Leng, C. (2019). Statistical inference in a directed network model with covariates. Journal of the American Statistical Association, 114(526), 857-868, doi:10.1080/01621459.2018.1448829.
Useful links:
Report bugs at https://github.com/ahoundetoungan/CDatanet/issues
cdnet
estimates count data models with social interactions under rational expectations using the NPL algorithm (see Houndetoungan, 2024).
cdnet( formula, Glist, group, Rmax, Rbar, starting = list(lambda = NULL, Gamma = NULL, delta = NULL), Ey0 = NULL, ubslambda = 1L, optimizer = "fastlbfgs", npl.ctr = list(), opt.ctr = list(), cov = TRUE, data )
cdnet( formula, Glist, group, Rmax, Rbar, starting = list(lambda = NULL, Gamma = NULL, delta = NULL), Ey0 = NULL, ubslambda = 1L, optimizer = "fastlbfgs", npl.ctr = list(), opt.ctr = list(), cov = TRUE, data )
formula |
a class object formula: a symbolic description of the model. |
Glist |
adjacency matrix. For networks consisting of multiple subnets, |
group |
the vector indicating the individual groups. The default assumes a common group. For 2 groups; that is, |
Rmax |
an integer indicating the theoretical upper bound of |
Rbar |
an |
starting |
(optional) a starting value for |
Ey0 |
(optional) a starting value for |
ubslambda |
a positive value indicating the upper bound of |
optimizer |
is either |
npl.ctr |
a list of controls for the NPL method (see details). |
opt.ctr |
a list of arguments to be passed in |
cov |
a Boolean indicating if the covariance should be computed. |
data |
an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables
in the model. If not found in data, the variables are taken from |
The count variable take the value
with probability.
In this equation, is a vector of control variables;
is the distribution function of the standard normal distribution;
is the average of
among peers using the
s
-th network definition;
is the
r
-th cut-point in the cost group .
The following identification conditions have been introduced: ,
,
, and
for any
. The last condition implies that
for any
.
For any
, the distance between two cut-points is
As the number of cut-point can be large, a quadratic cost function is considered for
, where
.
With the semi-parametric cost-function,
.
The model parameters are: ,
, and
,
where
for
.
The number of single parameters in
depends on
and
. The components
or/and
must be removed in certain cases.
If , then
.
If (binary models), then
must be empty.
If , then
.
npl.ctr
The model parameters are estimated using the Nested Partial Likelihood (NPL) method. This approach
starts with a guess of and
and constructs iteratively a sequence
of
and
. The solution converges when the
-distance
between two consecutive
and
is less than a tolerance.
The argument npl.ctr
must include
the tolerance of the NPL algorithm (default 1e-4),
the maximal number of iterations allowed (default 500),
a boolean indicating if the estimate should be printed at each step.
the number of simulations performed use to compute integral in the covariance by important sampling.
A list consisting of:
info |
a list of general information about the model. |
estimate |
the NPL estimator. |
Ey |
|
GEy |
the average of |
cov |
a list including (if |
details |
step-by-step output as returned by the optimizer. |
Houndetoungan, E. A. (2024). Count Data Models with Social Interactions under Rational Expectations. Available at SSRN 3721250, doi:10.2139/ssrn.3721250.
set.seed(123) M <- 5 # Number of sub-groups nvec <- round(runif(M, 100, 200)) n <- sum(nvec) # Adjacency matrix A <- list() for (m in 1:M) { nm <- nvec[m] Am <- matrix(0, nm, nm) max_d <- 30 #maximum number of friends for (i in 1:nm) { tmp <- sample((1:nm)[-i], sample(0:max_d, 1)) Am[i, tmp] <- 1 } A[[m]] <- Am } Anorm <- norm.network(A) #Row-normalization # X X <- cbind(rnorm(n, 1, 3), rexp(n, 0.4)) # Two group: group <- 1*(X[,1] > 0.95) # Networks # length(group) = 2 and unique(sort(group)) = c(0, 1) # The networks must be defined as to capture: # peer effects of `0` on `0`, peer effects of `1` on `0` # peer effects of `0` on `1`, and peer effects of `1` on `1` G <- list() cums <- c(0, cumsum(nvec)) for (m in 1:M) { tp <- group[(cums[m] + 1):(cums[m + 1])] Am <- A[[m]] G[[m]] <- norm.network(list(Am * ((1 - tp) %*% t(1 - tp)), Am * ((1 - tp) %*% t(tp)), Am * (tp %*% t(1 - tp)), Am * (tp %*% t(tp)))) } # Parameters lambda <- c(0.2, 0.3, -0.15, 0.25) Gamma <- c(4.5, 2.2, -0.9, 1.5, -1.2) delta <- rep(c(2.6, 1.47, 0.85, 0.7, 0.5), 2) # Data data <- data.frame(X, peer.avg(Anorm, cbind(x1 = X[,1], x2 = X[,2]))) colnames(data) = c("x1", "x2", "gx1", "gx2") ytmp <- simcdnet(formula = ~ x1 + x2 + gx1 + gx2, Glist = G, Rbar = rep(5, 2), lambda = lambda, Gamma = Gamma, delta = delta, group = group, data = data) y <- ytmp$y hist(y, breaks = max(y) + 1) table(y) # Estimation est <- cdnet(formula = y ~ x1 + x2 + gx1 + gx2, Glist = G, Rbar = rep(5, 2), group = group, optimizer = "fastlbfgs", data = data, opt.ctr = list(maxit = 5e3, eps_f = 1e-11, eps_g = 1e-11)) summary(est)
set.seed(123) M <- 5 # Number of sub-groups nvec <- round(runif(M, 100, 200)) n <- sum(nvec) # Adjacency matrix A <- list() for (m in 1:M) { nm <- nvec[m] Am <- matrix(0, nm, nm) max_d <- 30 #maximum number of friends for (i in 1:nm) { tmp <- sample((1:nm)[-i], sample(0:max_d, 1)) Am[i, tmp] <- 1 } A[[m]] <- Am } Anorm <- norm.network(A) #Row-normalization # X X <- cbind(rnorm(n, 1, 3), rexp(n, 0.4)) # Two group: group <- 1*(X[,1] > 0.95) # Networks # length(group) = 2 and unique(sort(group)) = c(0, 1) # The networks must be defined as to capture: # peer effects of `0` on `0`, peer effects of `1` on `0` # peer effects of `0` on `1`, and peer effects of `1` on `1` G <- list() cums <- c(0, cumsum(nvec)) for (m in 1:M) { tp <- group[(cums[m] + 1):(cums[m + 1])] Am <- A[[m]] G[[m]] <- norm.network(list(Am * ((1 - tp) %*% t(1 - tp)), Am * ((1 - tp) %*% t(tp)), Am * (tp %*% t(1 - tp)), Am * (tp %*% t(tp)))) } # Parameters lambda <- c(0.2, 0.3, -0.15, 0.25) Gamma <- c(4.5, 2.2, -0.9, 1.5, -1.2) delta <- rep(c(2.6, 1.47, 0.85, 0.7, 0.5), 2) # Data data <- data.frame(X, peer.avg(Anorm, cbind(x1 = X[,1], x2 = X[,2]))) colnames(data) = c("x1", "x2", "gx1", "gx2") ytmp <- simcdnet(formula = ~ x1 + x2 + gx1 + gx2, Glist = G, Rbar = rep(5, 2), lambda = lambda, Gamma = Gamma, delta = delta, group = group, data = data) y <- ytmp$y hist(y, breaks = max(y) + 1) table(y) # Estimation est <- cdnet(formula = y ~ x1 + x2 + gx1 + gx2, Glist = G, Rbar = rep(5, 2), group = group, optimizer = "fastlbfgs", data = data, opt.ctr = list(maxit = 5e3, eps_f = 1e-11, eps_g = 1e-11)) summary(est)
homophili.data
converts the matrix of explanatory variables between directed network models and symmetric network models.
homophili.data(data, nvec, to = c("lower", "upper", "symmetric"))
homophili.data(data, nvec, to = c("lower", "upper", "symmetric"))
data |
is the |
nvec |
is a vector of the number of individuals in the networks. |
to |
indicates the direction of the conversion. For a matrix of explanatory variable |
the transformed data.frame
.
homophily.fe
implements a Logit estimator for network formation model with homophily. The model includes degree heterogeneity using fixed effects (see details).
homophily.fe( network, formula, data, symmetry = FALSE, fe.way = 1, init = NULL, opt.ctr = list(maxit = 10000, eps_f = 1e-09, eps_g = 1e-09), print = TRUE )
homophily.fe( network, formula, data, symmetry = FALSE, fe.way = 1, init = NULL, opt.ctr = list(maxit = 10000, eps_f = 1e-09, eps_g = 1e-09), print = TRUE )
network |
matrix or list of sub-matrix of social interactions containing 0 and 1, where links are represented by 1 |
formula |
an object of class formula: a symbolic description of the model. The |
data |
an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables
in the model. If not found in data, the variables are taken from |
symmetry |
indicates whether the network model is symmetric (see details). |
fe.way |
indicates whether it is a one-way or two-way fixed effect model. The expected value is 1 or 2 (see details). |
init |
(optional) either a list of starting values containing |
opt.ctr |
(optional) is a list of |
print |
Boolean indicating if the estimation progression should be printed. |
Let be a probability for a link to go from the individual
to the individual
.
This probability is specified for two-way effect models (
fe.way = 2
) as
where is the cumulative of the standard logistic distribution. Unobserved degree heterogeneity is captured by
and
. The latter are treated as fixed effects (see
homophily.re
for random effect models).
As shown by Yan et al. (2019), the estimator of
the parameter is biased. A bias correction is then necessary and is not implemented in this version. However
the estimator of
and
are consistent.
For one-way fixed effect models (fe.way = 1
), . For symmetric models, the network is not directed and the
fixed effects need to be one way.
A list consisting of:
model.info |
list of model information, such as the type of fixed effects, whether the model is symmetric, number of observations, etc. |
estimate |
maximizer of the log-likelihood. |
loglike |
maximized log-likelihood. |
optim |
returned value of the optimization solver, which contains details of the optimization. The solver used is |
init |
returned list of starting value. |
loglike(init) |
log-likelihood at the starting value. |
Yan, T., Jiang, B., Fienberg, S. E., & Leng, C. (2019). Statistical inference in a directed network model with covariates. Journal of the American Statistical Association, 114(526), 857-868, doi:10.1080/01621459.2018.1448829.
set.seed(1234) M <- 2 # Number of sub-groups nvec <- round(runif(M, 20, 50)) beta <- c(.1, -.1) Glist <- list() dX <- matrix(0, 0, 2) mu <- list() nu <- list() Emunu <- runif(M, -1.5, 0) #expectation of mu + nu smu2 <- 0.2 snu2 <- 0.2 for (m in 1:M) { n <- nvec[m] mum <- rnorm(n, 0.7*Emunu[m], smu2) num <- rnorm(n, 0.3*Emunu[m], snu2) X1 <- rnorm(n, 0, 1) X2 <- rbinom(n, 1, 0.2) Z1 <- matrix(0, n, n) Z2 <- matrix(0, n, n) for (i in 1:n) { for (j in 1:n) { Z1[i, j] <- abs(X1[i] - X1[j]) Z2[i, j] <- 1*(X2[i] == X2[j]) } } Gm <- 1*((Z1*beta[1] + Z2*beta[2] + kronecker(mum, t(num), "+") + rlogis(n^2)) > 0) diag(Gm) <- 0 diag(Z1) <- NA diag(Z2) <- NA Z1 <- Z1[!is.na(Z1)] Z2 <- Z2[!is.na(Z2)] dX <- rbind(dX, cbind(Z1, Z2)) Glist[[m]] <- Gm mu[[m]] <- mum nu[[m]] <- num } mu <- unlist(mu) nu <- unlist(nu) out <- homophily.fe(network = Glist, formula = ~ -1 + dX, fe.way = 2) muhat <- out$estimate$mu nuhat <- out$estimate$nu plot(mu, muhat) plot(nu, nuhat)
set.seed(1234) M <- 2 # Number of sub-groups nvec <- round(runif(M, 20, 50)) beta <- c(.1, -.1) Glist <- list() dX <- matrix(0, 0, 2) mu <- list() nu <- list() Emunu <- runif(M, -1.5, 0) #expectation of mu + nu smu2 <- 0.2 snu2 <- 0.2 for (m in 1:M) { n <- nvec[m] mum <- rnorm(n, 0.7*Emunu[m], smu2) num <- rnorm(n, 0.3*Emunu[m], snu2) X1 <- rnorm(n, 0, 1) X2 <- rbinom(n, 1, 0.2) Z1 <- matrix(0, n, n) Z2 <- matrix(0, n, n) for (i in 1:n) { for (j in 1:n) { Z1[i, j] <- abs(X1[i] - X1[j]) Z2[i, j] <- 1*(X2[i] == X2[j]) } } Gm <- 1*((Z1*beta[1] + Z2*beta[2] + kronecker(mum, t(num), "+") + rlogis(n^2)) > 0) diag(Gm) <- 0 diag(Z1) <- NA diag(Z2) <- NA Z1 <- Z1[!is.na(Z1)] Z2 <- Z2[!is.na(Z2)] dX <- rbind(dX, cbind(Z1, Z2)) Glist[[m]] <- Gm mu[[m]] <- mum nu[[m]] <- num } mu <- unlist(mu) nu <- unlist(nu) out <- homophily.fe(network = Glist, formula = ~ -1 + dX, fe.way = 2) muhat <- out$estimate$mu nuhat <- out$estimate$nu plot(mu, muhat) plot(nu, nuhat)
homophily.re
implements a Bayesian Probit estimator for network formation model with homophily. The model includes degree heterogeneity using random effects (see details).
homophily.re( network, formula, data, symmetry = FALSE, group.fe = FALSE, re.way = 1, init = list(), iteration = 1000, print = TRUE )
homophily.re( network, formula, data, symmetry = FALSE, group.fe = FALSE, re.way = 1, init = list(), iteration = 1000, print = TRUE )
network |
matrix or list of sub-matrix of social interactions containing 0 and 1, where links are represented by 1. |
formula |
an object of class formula: a symbolic description of the model. The |
data |
an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables
in the model. If not found in data, the variables are taken from |
symmetry |
indicates whether the network model is symmetric (see details). |
group.fe |
indicates whether the model includes group fixed effects. |
re.way |
indicates whether it is a one-way or two-way fixed effect model. The expected value is 1 or 2 (see details). |
init |
(optional) list of starting values containing |
iteration |
the number of iterations to be performed. |
print |
boolean indicating if the estimation progression should be printed. |
Let be a probability for a link to go from the individual
to the individual
.
This probability is specified for two-way effect models (
fe.way = 2
) as
where is the cumulative of the standard normal distribution. Unobserved degree heterogeneity is captured by
and
. The latter are treated as random effects (see
homophily.fe
for fixed effect models).
For one-way random effect models (fe.way = 1
), . For symmetric models, the network is not directed and the
random effects need to be one way.
A list consisting of:
model.info |
list of model information, such as the type of random effects, whether the model is symmetric, number of observations, etc. |
posterior |
list of simulations from the posterior distribution. |
init |
returned list of starting values. |
set.seed(1234) library(MASS) M <- 4 # Number of sub-groups nvec <- round(runif(M, 100, 500)) beta <- c(.1, -.1) Glist <- list() dX <- matrix(0, 0, 2) mu <- list() nu <- list() cst <- runif(M, -1.5, 0) smu2 <- 0.2 snu2 <- 0.2 rho <- 0.8 Smunu <- matrix(c(smu2, rho*sqrt(smu2*snu2), rho*sqrt(smu2*snu2), snu2), 2) for (m in 1:M) { n <- nvec[m] tmp <- mvrnorm(n, c(0, 0), Smunu) mum <- tmp[,1] - mean(tmp[,1]) num <- tmp[,2] - mean(tmp[,2]) X1 <- rnorm(n, 0, 1) X2 <- rbinom(n, 1, 0.2) Z1 <- matrix(0, n, n) Z2 <- matrix(0, n, n) for (i in 1:n) { for (j in 1:n) { Z1[i, j] <- abs(X1[i] - X1[j]) Z2[i, j] <- 1*(X2[i] == X2[j]) } } Gm <- 1*((cst[m] + Z1*beta[1] + Z2*beta[2] + kronecker(mum, t(num), "+") + rnorm(n^2)) > 0) diag(Gm) <- 0 diag(Z1) <- NA diag(Z2) <- NA Z1 <- Z1[!is.na(Z1)] Z2 <- Z2[!is.na(Z2)] dX <- rbind(dX, cbind(Z1, Z2)) Glist[[m]] <- Gm mu[[m]] <- mum nu[[m]] <- num } mu <- unlist(mu) nu <- unlist(nu) out <- homophily.re(network = Glist, formula = ~ dX, group.fe = TRUE, re.way = 2, iteration = 1e3) # plot simulations plot(out$posterior$beta[,1], type = "l") abline(h = cst[1], col = "red") plot(out$posterior$beta[,2], type = "l") abline(h = cst[2], col = "red") plot(out$posterior$beta[,3], type = "l") abline(h = cst[3], col = "red") plot(out$posterior$beta[,4], type = "l") abline(h = cst[4], col = "red") plot(out$posterior$beta[,5], type = "l") abline(h = beta[1], col = "red") plot(out$posterior$beta[,6], type = "l") abline(h = beta[2], col = "red") plot(out$posterior$sigma2_mu, type = "l") abline(h = smu2, col = "red") plot(out$posterior$sigma2_nu, type = "l") abline(h = snu2, col = "red") plot(out$posterior$rho, type = "l") abline(h = rho, col = "red") i <- 10 plot(out$posterior$mu[,i], type = "l") abline(h = mu[i], col = "red") plot(out$posterior$nu[,i], type = "l") abline(h = nu[i], col = "red")
set.seed(1234) library(MASS) M <- 4 # Number of sub-groups nvec <- round(runif(M, 100, 500)) beta <- c(.1, -.1) Glist <- list() dX <- matrix(0, 0, 2) mu <- list() nu <- list() cst <- runif(M, -1.5, 0) smu2 <- 0.2 snu2 <- 0.2 rho <- 0.8 Smunu <- matrix(c(smu2, rho*sqrt(smu2*snu2), rho*sqrt(smu2*snu2), snu2), 2) for (m in 1:M) { n <- nvec[m] tmp <- mvrnorm(n, c(0, 0), Smunu) mum <- tmp[,1] - mean(tmp[,1]) num <- tmp[,2] - mean(tmp[,2]) X1 <- rnorm(n, 0, 1) X2 <- rbinom(n, 1, 0.2) Z1 <- matrix(0, n, n) Z2 <- matrix(0, n, n) for (i in 1:n) { for (j in 1:n) { Z1[i, j] <- abs(X1[i] - X1[j]) Z2[i, j] <- 1*(X2[i] == X2[j]) } } Gm <- 1*((cst[m] + Z1*beta[1] + Z2*beta[2] + kronecker(mum, t(num), "+") + rnorm(n^2)) > 0) diag(Gm) <- 0 diag(Z1) <- NA diag(Z2) <- NA Z1 <- Z1[!is.na(Z1)] Z2 <- Z2[!is.na(Z2)] dX <- rbind(dX, cbind(Z1, Z2)) Glist[[m]] <- Gm mu[[m]] <- mum nu[[m]] <- num } mu <- unlist(mu) nu <- unlist(nu) out <- homophily.re(network = Glist, formula = ~ dX, group.fe = TRUE, re.way = 2, iteration = 1e3) # plot simulations plot(out$posterior$beta[,1], type = "l") abline(h = cst[1], col = "red") plot(out$posterior$beta[,2], type = "l") abline(h = cst[2], col = "red") plot(out$posterior$beta[,3], type = "l") abline(h = cst[3], col = "red") plot(out$posterior$beta[,4], type = "l") abline(h = cst[4], col = "red") plot(out$posterior$beta[,5], type = "l") abline(h = beta[1], col = "red") plot(out$posterior$beta[,6], type = "l") abline(h = beta[2], col = "red") plot(out$posterior$sigma2_mu, type = "l") abline(h = smu2, col = "red") plot(out$posterior$sigma2_nu, type = "l") abline(h = snu2, col = "red") plot(out$posterior$rho, type = "l") abline(h = rho, col = "red") i <- 10 plot(out$posterior$mu[,i], type = "l") abline(h = mu[i], col = "red") plot(out$posterior$nu[,i], type = "l") abline(h = nu[i], col = "red")
vec.to.mat
creates a list of square matrices from a given vector.
The elements of the generated matrices are taken from the vector and placed column-wise (ie. the first column is filled up before filling the second column) and from the first matrix of the list to the last matrix of the list.
The diagonal of the generated matrices are zeros.
mat.to.vec
creates a vector from a given list of square matrices .
The elements of the generated vector are taken from column-wise and from the first matrix of the list to the last matrix of the list, while dropping the diagonal entry.
norm.network
row-normalizes matrices in a given list.
norm.network(W) vec.to.mat(u, N, normalise = FALSE, byrow = FALSE) mat.to.vec(W, ceiled = FALSE, byrow = FALSE)
norm.network(W) vec.to.mat(u, N, normalise = FALSE, byrow = FALSE) mat.to.vec(W, ceiled = FALSE, byrow = FALSE)
W |
matrix or list of matrices to convert. |
u |
numeric vector to convert. |
N |
vector of sub-network sizes such that |
normalise |
Boolean takes |
byrow |
Boolean takes |
ceiled |
Boolean takes |
a vector of size sum(N*(N - 1))
or list of length(N)
square matrices. The sizes of the matrices are N[1], N[2], ...
# Generate a list of adjacency matrices ## sub-network size N <- c(250, 370, 120) ## rate of friendship p <- c(.2, .15, .18) ## network data u <- unlist(lapply(1: 3, function(x) rbinom(N[x]*(N[x] - 1), 1, p[x]))) W <- vec.to.mat(u, N) # Convert G into a list of row-normalized matrices G <- norm.network(W) # recover u v <- mat.to.vec(G, ceiled = TRUE) all.equal(u, v)
# Generate a list of adjacency matrices ## sub-network size N <- c(250, 370, 120) ## rate of friendship p <- c(.2, .15, .18) ## network data u <- unlist(lapply(1: 3, function(x) rbinom(N[x]*(N[x] - 1), 1, p[x]))) W <- vec.to.mat(u, N) # Convert G into a list of row-normalized matrices G <- norm.network(W) # recover u v <- mat.to.vec(G, ceiled = TRUE) all.equal(u, v)
peer.avg
computes peer average value using network data (as a list) and observable characteristics.
peer.avg(Glist, V, export.as.list = FALSE)
peer.avg(Glist, V, export.as.list = FALSE)
Glist |
the adjacency matrix or list sub-adjacency matrix. |
V |
vector or matrix of observable characteristics. |
export.as.list |
(optional) boolean to indicate if the output should be a list of matrices or a single matrix. |
the matrix product diag(Glist[[1]], Glist[[2]], ...) %*% V
, where diag()
is the block diagonal operator.
# Generate a list of adjacency matrices ## sub-network size N <- c(250, 370, 120) ## rate of friendship p <- c(.2, .15, .18) ## network data u <- unlist(lapply(1: 3, function(x) rbinom(N[x]*(N[x] - 1), 1, p[x]))) G <- vec.to.mat(u, N, normalise = TRUE) # Generate a vector y y <- rnorm(sum(N)) # Compute G%*%y Gy <- peer.avg(Glist = G, V = y)
# Generate a list of adjacency matrices ## sub-network size N <- c(250, 370, 120) ## rate of friendship p <- c(.2, .15, .18) ## network data u <- unlist(lapply(1: 3, function(x) rbinom(N[x]*(N[x] - 1), 1, p[x]))) G <- vec.to.mat(u, N, normalise = TRUE) # Generate a vector y y <- rnorm(sum(N)) # Compute G%*%y Gy <- peer.avg(Glist = G, V = y)
Summary and print methods for the class simcdEy
as returned by the function simcdEy.
## S3 method for class 'simcdEy' print(x, ...) ## S3 method for class 'simcdEy' summary(object, ...) ## S3 method for class 'summary.simcdEy' print(x, ...)
## S3 method for class 'simcdEy' print(x, ...) ## S3 method for class 'simcdEy' summary(object, ...) ## S3 method for class 'summary.simcdEy' print(x, ...)
x |
an object of class |
... |
further arguments passed to or from other methods. |
object |
an object of class |
A list of the same objects in object
.
remove.ids
optimally removes identifiers with NA from adjacency matrices. Many combinations of rows and columns can be deleted
removing many rows and column
remove.ids(network, ncores = 1L)
remove.ids(network, ncores = 1L)
network |
is a list of adjacency matrices |
ncores |
is the number of cores to be used to run the program in parallel |
List of adjacency matrices without missing values and a list of vectors of retained indeces
A <- matrix(1:25, 5) A[1, 1] <- NA A[4, 2] <- NA remove.ids(A) B <- matrix(1:100, 10) B[1, 1] <- NA B[4, 2] <- NA B[2, 4] <- NA B[,8] <-NA remove.ids(B)
A <- matrix(1:25, 5) A[1, 1] <- NA A[4, 2] <- NA remove.ids(A) B <- matrix(1:100, 10) B[1, 1] <- NA B[4, 2] <- NA B[2, 4] <- NA B[,8] <-NA remove.ids(B)
sar
computes quasi-maximum likelihood estimators for linear-in-mean models with social interactions (see Lee, 2004 and Lee et al., 2010).
sar( formula, Glist, lambda0 = NULL, fixed.effects = FALSE, optimizer = "optim", opt.ctr = list(), print = TRUE, cov = TRUE, cinfo = TRUE, data )
sar( formula, Glist, lambda0 = NULL, fixed.effects = FALSE, optimizer = "optim", opt.ctr = list(), print = TRUE, cov = TRUE, cinfo = TRUE, data )
formula |
a class object formula: a symbolic description of the model. |
Glist |
The network matrix. For networks consisting of multiple subnets, |
lambda0 |
an optional starting value of |
fixed.effects |
a Boolean indicating whether group heterogeneity must be included as fixed effects. |
optimizer |
is either |
opt.ctr |
list of arguments of nlm or optim (the one set in |
print |
a Boolean indicating if the estimate should be printed at each step. |
cov |
a Boolean indicating if the covariance should be computed. |
cinfo |
a Boolean indicating whether information is complete ( |
data |
an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables
in the model. If not found in data, the variables are taken from |
For a complete information model, the outcome is defined as:
where is the average of
among peers,
is a vector of control variables,
and
.
In the case of incomplete information models with rational expectations,
is defined as:
A list consisting of:
info |
list of general information on the model. |
estimate |
Maximum Likelihood (ML) estimator. |
cov |
covariance matrix of the estimate. |
details |
outputs as returned by the optimizer. |
Lee, L. F. (2004). Asymptotic distributions of quasi-maximum likelihood estimators for spatial autoregressive models. Econometrica, 72(6), 1899-1925, doi:10.1111/j.1468-0262.2004.00558.x.
Lee, L. F., Liu, X., & Lin, X. (2010). Specification and estimation of social interaction models with network structures. The Econometrics Journal, 13(2), 145-176, doi:10.1111/j.1368-423X.2010.00310.x
# Groups' size set.seed(123) M <- 5 # Number of sub-groups nvec <- round(runif(M, 100, 1000)) n <- sum(nvec) # Parameters lambda <- 0.4 Gamma <- c(2, -1.9, 0.8, 1.5, -1.2) sigma <- 1.5 theta <- c(lambda, Gamma, sigma) # X X <- cbind(rnorm(n, 1, 1), rexp(n, 0.4)) # Network G <- list() for (m in 1:M) { nm <- nvec[m] Gm <- matrix(0, nm, nm) max_d <- 30 for (i in 1:nm) { tmp <- sample((1:nm)[-i], sample(0:max_d, 1)) Gm[i, tmp] <- 1 } rs <- rowSums(Gm); rs[rs == 0] <- 1 Gm <- Gm/rs G[[m]] <- Gm } # data data <- data.frame(X, peer.avg(G, cbind(x1 = X[,1], x2 = X[,2]))) colnames(data) <- c("x1", "x2", "gx1", "gx2") ytmp <- simsar(formula = ~ x1 + x2 + gx1 + gx2, Glist = G, theta = theta, data = data) data$y <- ytmp$y out <- sar(formula = y ~ x1 + x2 + + gx1 + gx2, Glist = G, optimizer = "optim", data = data) summary(out)
# Groups' size set.seed(123) M <- 5 # Number of sub-groups nvec <- round(runif(M, 100, 1000)) n <- sum(nvec) # Parameters lambda <- 0.4 Gamma <- c(2, -1.9, 0.8, 1.5, -1.2) sigma <- 1.5 theta <- c(lambda, Gamma, sigma) # X X <- cbind(rnorm(n, 1, 1), rexp(n, 0.4)) # Network G <- list() for (m in 1:M) { nm <- nvec[m] Gm <- matrix(0, nm, nm) max_d <- 30 for (i in 1:nm) { tmp <- sample((1:nm)[-i], sample(0:max_d, 1)) Gm[i, tmp] <- 1 } rs <- rowSums(Gm); rs[rs == 0] <- 1 Gm <- Gm/rs G[[m]] <- Gm } # data data <- data.frame(X, peer.avg(G, cbind(x1 = X[,1], x2 = X[,2]))) colnames(data) <- c("x1", "x2", "gx1", "gx2") ytmp <- simsar(formula = ~ x1 + x2 + gx1 + gx2, Glist = G, theta = theta, data = data) data$y <- ytmp$y out <- sar(formula = y ~ x1 + x2 + + gx1 + gx2, Glist = G, optimizer = "optim", data = data) summary(out)
sart
estimates Tobit models with social interactions (Xu and Lee, 2015).
sart( formula, Glist, starting = NULL, Ey0 = NULL, optimizer = "fastlbfgs", npl.ctr = list(), opt.ctr = list(), cov = TRUE, cinfo = TRUE, data )
sart( formula, Glist, starting = NULL, Ey0 = NULL, optimizer = "fastlbfgs", npl.ctr = list(), opt.ctr = list(), cov = TRUE, cinfo = TRUE, data )
formula |
a class object formula: a symbolic description of the model. |
Glist |
The network matrix. For networks consisting of multiple subnets, |
starting |
(optional) a starting value for |
Ey0 |
(optional) a starting value for |
optimizer |
is either |
npl.ctr |
a list of controls for the NPL method (see details of the function |
opt.ctr |
a list of arguments to be passed in |
cov |
a Boolean indicating if the covariance must be computed. |
cinfo |
a Boolean indicating whether information is complete ( |
data |
an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables
in the model. If not found in data, the variables are taken from |
For a complete information model, the outcome is defined as:
where is the average of
among peers,
is a vector of control variables,
and
.
In the case of incomplete information modelswith rational expectations,
is defined as:
A list consisting of:
info |
a list of general information on the model. |
estimate |
the Maximum Likelihood (ML) estimator. |
Ey |
|
GEy |
the average of |
cov |
a list including (if |
details |
outputs as returned by the optimizer. |
Xu, X., & Lee, L. F. (2015). Maximum likelihood estimation of a spatial autoregressive Tobit model. Journal of Econometrics, 188(1), 264-280, doi:10.1016/j.jeconom.2015.05.004.
# Groups' size set.seed(123) M <- 5 # Number of sub-groups nvec <- round(runif(M, 100, 200)) n <- sum(nvec) # Parameters lambda <- 0.4 Gamma <- c(2, -1.9, 0.8, 1.5, -1.2) sigma <- 1.5 theta <- c(lambda, Gamma, sigma) # X X <- cbind(rnorm(n, 1, 1), rexp(n, 0.4)) # Network G <- list() for (m in 1:M) { nm <- nvec[m] Gm <- matrix(0, nm, nm) max_d <- 30 for (i in 1:nm) { tmp <- sample((1:nm)[-i], sample(0:max_d, 1)) Gm[i, tmp] <- 1 } rs <- rowSums(Gm); rs[rs == 0] <- 1 Gm <- Gm/rs G[[m]] <- Gm } # Data data <- data.frame(X, peer.avg(G, cbind(x1 = X[,1], x2 = X[,2]))) colnames(data) <- c("x1", "x2", "gx1", "gx2") ## Complete information game ytmp <- simsart(formula = ~ x1 + x2 + gx1 + gx2, Glist = G, theta = theta, data = data, cinfo = TRUE) data$yc <- ytmp$y ## Incomplete information game ytmp <- simsart(formula = ~ x1 + x2 + gx1 + gx2, Glist = G, theta = theta, data = data, cinfo = FALSE) data$yi <- ytmp$y # Complete information estimation for yc outc1 <- sart(formula = yc ~ x1 + x2 + gx1 + gx2, optimizer = "nlm", Glist = G, data = data, cinfo = TRUE) summary(outc1) # Complete information estimation for yi outc1 <- sart(formula = yi ~ x1 + x2 + gx1 + gx2, optimizer = "nlm", Glist = G, data = data, cinfo = TRUE) summary(outc1) # Incomplete information estimation for yc outi1 <- sart(formula = yc ~ x1 + x2 + gx1 + gx2, optimizer = "nlm", Glist = G, data = data, cinfo = FALSE) summary(outi1) # Incomplete information estimation for yi outi1 <- sart(formula = yi ~ x1 + x2 + gx1 + gx2, optimizer = "nlm", Glist = G, data = data, cinfo = FALSE) summary(outi1)
# Groups' size set.seed(123) M <- 5 # Number of sub-groups nvec <- round(runif(M, 100, 200)) n <- sum(nvec) # Parameters lambda <- 0.4 Gamma <- c(2, -1.9, 0.8, 1.5, -1.2) sigma <- 1.5 theta <- c(lambda, Gamma, sigma) # X X <- cbind(rnorm(n, 1, 1), rexp(n, 0.4)) # Network G <- list() for (m in 1:M) { nm <- nvec[m] Gm <- matrix(0, nm, nm) max_d <- 30 for (i in 1:nm) { tmp <- sample((1:nm)[-i], sample(0:max_d, 1)) Gm[i, tmp] <- 1 } rs <- rowSums(Gm); rs[rs == 0] <- 1 Gm <- Gm/rs G[[m]] <- Gm } # Data data <- data.frame(X, peer.avg(G, cbind(x1 = X[,1], x2 = X[,2]))) colnames(data) <- c("x1", "x2", "gx1", "gx2") ## Complete information game ytmp <- simsart(formula = ~ x1 + x2 + gx1 + gx2, Glist = G, theta = theta, data = data, cinfo = TRUE) data$yc <- ytmp$y ## Incomplete information game ytmp <- simsart(formula = ~ x1 + x2 + gx1 + gx2, Glist = G, theta = theta, data = data, cinfo = FALSE) data$yi <- ytmp$y # Complete information estimation for yc outc1 <- sart(formula = yc ~ x1 + x2 + gx1 + gx2, optimizer = "nlm", Glist = G, data = data, cinfo = TRUE) summary(outc1) # Complete information estimation for yi outc1 <- sart(formula = yi ~ x1 + x2 + gx1 + gx2, optimizer = "nlm", Glist = G, data = data, cinfo = TRUE) summary(outc1) # Incomplete information estimation for yc outi1 <- sart(formula = yc ~ x1 + x2 + gx1 + gx2, optimizer = "nlm", Glist = G, data = data, cinfo = FALSE) summary(outi1) # Incomplete information estimation for yi outi1 <- sart(formula = yi ~ x1 + x2 + gx1 + gx2, optimizer = "nlm", Glist = G, data = data, cinfo = FALSE) summary(outi1)
simcdpar
computes the average expected outcomes for count data models with social interactions and standard errors using the Delta method.
This function can be used to examine the effects of changes in the network or in the control variables.
simcdEy(object, Glist, data, group, tol = 1e-10, maxit = 500, S = 1000)
simcdEy(object, Glist, data, group, tol = 1e-10, maxit = 500, S = 1000)
object |
an object of class |
Glist |
adjacency matrix. For networks consisting of multiple subnets, |
data |
an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables
in the model. If not found in data, the variables are taken from |
group |
the vector indicating the individual groups (see function |
tol |
the tolerance value used in the Fixed Point Iteration Method to compute the expectancy of |
maxit |
the maximal number of iterations in the Fixed Point Iteration Method. |
S |
number of simulations to be used to compute integral in the covariance by important sampling. |
A list consisting of:
Ey |
|
GEy |
the average of |
aEy |
the sampling mean of |
se.aEy |
the standard error of the sampling mean of |
simcdnet
simulate the count data model with social interactions under rational expectations developed by Houndetoungan (2024).
simcdnet( formula, group, Glist, parms, lambda, Gamma, delta, Rmax, Rbar, tol = 1e-10, maxit = 500, data )
simcdnet( formula, group, Glist, parms, lambda, Gamma, delta, Rmax, Rbar, tol = 1e-10, maxit = 500, data )
formula |
a class object formula: a symbolic description of the model. |
group |
the vector indicating the individual groups. The default assumes a common group. For 2 groups; that is, |
Glist |
adjacency matrix. For networks consisting of multiple subnets, |
parms |
a vector defining the true value of |
lambda |
the true value of the vector |
Gamma |
the true value of the vector |
delta |
the true value of the vector |
Rmax |
an integer indicating the theoretical upper bound of |
Rbar |
an |
tol |
the tolerance value used in the Fixed Point Iteration Method to compute the expectancy of |
maxit |
the maximal number of iterations in the Fixed Point Iteration Method. |
data |
an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables
in the model. If not found in data, the variables are taken from |
The count variable take the value
with probability.
In this equation, is a vector of control variables;
is the distribution function of the standard normal distribution;
is the average of
among peers using the
s
-th network definition;
is the
r
-th cut-point in the cost group .
The following identification conditions have been introduced: ,
,
, and
for any
. The last condition implies that
for any
.
For any
, the distance between two cut-points is
As the number of cut-point can be large, a quadratic cost function is considered for
, where
.
With the semi-parametric cost-function,
.
The model parameters are: ,
, and
,
where
for
.
The number of single parameters in
depends on
and
. The components
or/and
must be removed in certain cases.
If , then
.
If (binary models), then
must be empty.
If , then
.
A list consisting of:
yst |
|
y |
the observed count variable. |
Ey |
|
GEy |
the average of |
meff |
a list includinh average and individual marginal effects. |
Rmax |
infinite sums in the marginal effects are approximated by sums up to Rmax. |
iteration |
number of iterations performed by sub-network in the Fixed Point Iteration Method. |
Houndetoungan, E. A. (2024). Count Data Models with Social Interactions under Rational Expectations. Available at SSRN 3721250, doi:10.2139/ssrn.3721250.
set.seed(123) M <- 5 # Number of sub-groups nvec <- round(runif(M, 100, 200)) n <- sum(nvec) # Adjacency matrix A <- list() for (m in 1:M) { nm <- nvec[m] Am <- matrix(0, nm, nm) max_d <- 30 #maximum number of friends for (i in 1:nm) { tmp <- sample((1:nm)[-i], sample(0:max_d, 1)) Am[i, tmp] <- 1 } A[[m]] <- Am } Anorm <- norm.network(A) #Row-normalization # X X <- cbind(rnorm(n, 1, 3), rexp(n, 0.4)) # Two group: group <- 1*(X[,1] > 0.95) # Networks # length(group) = 2 and unique(sort(group)) = c(0, 1) # The networks must be defined as to capture: # peer effects of `0` on `0`, peer effects of `1` on `0` # peer effects of `0` on `1`, and peer effects of `1` on `1` G <- list() cums <- c(0, cumsum(nvec)) for (m in 1:M) { tp <- group[(cums[m] + 1):(cums[m + 1])] Am <- A[[m]] G[[m]] <- norm.network(list(Am * ((1 - tp) %*% t(1 - tp)), Am * ((1 - tp) %*% t(tp)), Am * (tp %*% t(1 - tp)), Am * (tp %*% t(tp)))) } # Parameters lambda <- c(0.2, 0.3, -0.15, 0.25) Gamma <- c(4.5, 2.2, -0.9, 1.5, -1.2) delta <- rep(c(2.6, 1.47, 0.85, 0.7, 0.5), 2) # Data data <- data.frame(X, peer.avg(Anorm, cbind(x1 = X[,1], x2 = X[,2]))) colnames(data) = c("x1", "x2", "gx1", "gx2") ytmp <- simcdnet(formula = ~ x1 + x2 + gx1 + gx2, Glist = G, Rbar = rep(5, 2), lambda = lambda, Gamma = Gamma, delta = delta, group = group, data = data) y <- ytmp$y hist(y, breaks = max(y) + 1) table(y)
set.seed(123) M <- 5 # Number of sub-groups nvec <- round(runif(M, 100, 200)) n <- sum(nvec) # Adjacency matrix A <- list() for (m in 1:M) { nm <- nvec[m] Am <- matrix(0, nm, nm) max_d <- 30 #maximum number of friends for (i in 1:nm) { tmp <- sample((1:nm)[-i], sample(0:max_d, 1)) Am[i, tmp] <- 1 } A[[m]] <- Am } Anorm <- norm.network(A) #Row-normalization # X X <- cbind(rnorm(n, 1, 3), rexp(n, 0.4)) # Two group: group <- 1*(X[,1] > 0.95) # Networks # length(group) = 2 and unique(sort(group)) = c(0, 1) # The networks must be defined as to capture: # peer effects of `0` on `0`, peer effects of `1` on `0` # peer effects of `0` on `1`, and peer effects of `1` on `1` G <- list() cums <- c(0, cumsum(nvec)) for (m in 1:M) { tp <- group[(cums[m] + 1):(cums[m + 1])] Am <- A[[m]] G[[m]] <- norm.network(list(Am * ((1 - tp) %*% t(1 - tp)), Am * ((1 - tp) %*% t(tp)), Am * (tp %*% t(1 - tp)), Am * (tp %*% t(tp)))) } # Parameters lambda <- c(0.2, 0.3, -0.15, 0.25) Gamma <- c(4.5, 2.2, -0.9, 1.5, -1.2) delta <- rep(c(2.6, 1.47, 0.85, 0.7, 0.5), 2) # Data data <- data.frame(X, peer.avg(Anorm, cbind(x1 = X[,1], x2 = X[,2]))) colnames(data) = c("x1", "x2", "gx1", "gx2") ytmp <- simcdnet(formula = ~ x1 + x2 + gx1 + gx2, Glist = G, Rbar = rep(5, 2), lambda = lambda, Gamma = Gamma, delta = delta, group = group, data = data) y <- ytmp$y hist(y, breaks = max(y) + 1) table(y)
simnetwork
simulates adjacency matrices.
simnetwork(dnetwork, normalise = FALSE)
simnetwork(dnetwork, normalise = FALSE)
dnetwork |
is a list of sub-network matrices, where the (i, j)-th position of the m-th matrix is the probability that i be connected to j, with i and j individuals from the m-th network. |
normalise |
boolean takes |
list of (row-normalized) adjacency matrices.
# Generate a list of adjacency matrices ## sub-network size N <- c(250, 370, 120) ## distribution dnetwork <- lapply(N, function(x) matrix(runif(x^2), x)) ## network G <- simnetwork(dnetwork)
# Generate a list of adjacency matrices ## sub-network size N <- c(250, 370, 120) ## distribution dnetwork <- lapply(N, function(x) matrix(runif(x^2), x)) ## network G <- simnetwork(dnetwork)
simsar
simulates continuous variables with social interactions (see Lee, 2004 and Lee et al., 2010).
simsar(formula, Glist, theta, cinfo = TRUE, data)
simsar(formula, Glist, theta, cinfo = TRUE, data)
formula |
a class object formula: a symbolic description of the model. |
Glist |
The network matrix. For networks consisting of multiple subnets, |
theta |
a vector defining the true value of |
cinfo |
a Boolean indicating whether information is complete ( |
data |
an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables
in the model. If not found in data, the variables are taken from |
For a complete information model, the outcome is defined as:
where is the average of
among peers,
is a vector of control variables,
and
.
In the case of incomplete information models with rational expectations,
is defined as:
A list consisting of:
y |
the observed count data. |
Gy |
the average of y among friends. |
Lee, L. F. (2004). Asymptotic distributions of quasi-maximum likelihood estimators for spatial autoregressive models. Econometrica, 72(6), 1899-1925, doi:10.1111/j.1468-0262.2004.00558.x.
Lee, L. F., Liu, X., & Lin, X. (2010). Specification and estimation of social interaction models with network structures. The Econometrics Journal, 13(2), 145-176, doi:10.1111/j.1368-423X.2010.00310.x
# Groups' size set.seed(123) M <- 5 # Number of sub-groups nvec <- round(runif(M, 100, 1000)) n <- sum(nvec) # Parameters lambda <- 0.4 Gamma <- c(2, -1.9, 0.8, 1.5, -1.2) sigma <- 1.5 theta <- c(lambda, Gamma, sigma) # X X <- cbind(rnorm(n, 1, 1), rexp(n, 0.4)) # Network G <- list() for (m in 1:M) { nm <- nvec[m] Gm <- matrix(0, nm, nm) max_d <- 30 for (i in 1:nm) { tmp <- sample((1:nm)[-i], sample(0:max_d, 1)) Gm[i, tmp] <- 1 } rs <- rowSums(Gm); rs[rs == 0] <- 1 Gm <- Gm/rs G[[m]] <- Gm } # data data <- data.frame(X, peer.avg(G, cbind(x1 = X[,1], x2 = X[,2]))) colnames(data) <- c("x1", "x2", "gx1", "gx2") ytmp <- simsar(formula = ~ x1 + x2 + gx1 + gx2, Glist = G, theta = theta, data = data) y <- ytmp$y
# Groups' size set.seed(123) M <- 5 # Number of sub-groups nvec <- round(runif(M, 100, 1000)) n <- sum(nvec) # Parameters lambda <- 0.4 Gamma <- c(2, -1.9, 0.8, 1.5, -1.2) sigma <- 1.5 theta <- c(lambda, Gamma, sigma) # X X <- cbind(rnorm(n, 1, 1), rexp(n, 0.4)) # Network G <- list() for (m in 1:M) { nm <- nvec[m] Gm <- matrix(0, nm, nm) max_d <- 30 for (i in 1:nm) { tmp <- sample((1:nm)[-i], sample(0:max_d, 1)) Gm[i, tmp] <- 1 } rs <- rowSums(Gm); rs[rs == 0] <- 1 Gm <- Gm/rs G[[m]] <- Gm } # data data <- data.frame(X, peer.avg(G, cbind(x1 = X[,1], x2 = X[,2]))) colnames(data) <- c("x1", "x2", "gx1", "gx2") ytmp <- simsar(formula = ~ x1 + x2 + gx1 + gx2, Glist = G, theta = theta, data = data) y <- ytmp$y
simsart
simulates censored data with social interactions (see Xu and Lee, 2015).
simsart(formula, Glist, theta, tol = 1e-15, maxit = 500, cinfo = TRUE, data)
simsart(formula, Glist, theta, tol = 1e-15, maxit = 500, cinfo = TRUE, data)
formula |
a class object formula: a symbolic description of the model. |
Glist |
The network matrix. For networks consisting of multiple subnets, |
theta |
a vector defining the true value of |
tol |
the tolerance value used in the fixed point iteration method to compute |
maxit |
the maximal number of iterations in the fixed point iteration method. |
cinfo |
a Boolean indicating whether information is complete ( |
data |
an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables
in the model. If not found in data, the variables are taken from |
For a complete information model, the outcome is defined as:
where is the average of
among peers,
is a vector of control variables,
and
.
In the case of incomplete information modelswith rational expectations,
is defined as:
A list consisting of:
yst |
|
y |
the observed censored variable. |
Ey |
|
Gy |
the average of y among friends. |
GEy |
the average of |
meff |
a list includinh average and individual marginal effects. |
iteration |
number of iterations performed by sub-network in the Fixed Point Iteration Method. |
Xu, X., & Lee, L. F. (2015). Maximum likelihood estimation of a spatial autoregressive Tobit model. Journal of Econometrics, 188(1), 264-280, doi:10.1016/j.jeconom.2015.05.004.
# Groups' size set.seed(123) M <- 5 # Number of sub-groups nvec <- round(runif(M, 100, 200)) n <- sum(nvec) # Parameters lambda <- 0.4 Gamma <- c(2, -1.9, 0.8, 1.5, -1.2) sigma <- 1.5 theta <- c(lambda, Gamma, sigma) # X X <- cbind(rnorm(n, 1, 1), rexp(n, 0.4)) # Network G <- list() for (m in 1:M) { nm <- nvec[m] Gm <- matrix(0, nm, nm) max_d <- 30 for (i in 1:nm) { tmp <- sample((1:nm)[-i], sample(0:max_d, 1)) Gm[i, tmp] <- 1 } rs <- rowSums(Gm); rs[rs == 0] <- 1 Gm <- Gm/rs G[[m]] <- Gm } # Data data <- data.frame(X, peer.avg(G, cbind(x1 = X[,1], x2 = X[,2]))) colnames(data) <- c("x1", "x2", "gx1", "gx2") ## Complete information game ytmp <- simsart(formula = ~ x1 + x2 + gx1 + gx2, Glist = G, theta = theta, data = data, cinfo = TRUE) data$yc <- ytmp$y ## Incomplete information game ytmp <- simsart(formula = ~ x1 + x2 + gx1 + gx2, Glist = G, theta = theta, data = data, cinfo = FALSE) data$yi <- ytmp$y
# Groups' size set.seed(123) M <- 5 # Number of sub-groups nvec <- round(runif(M, 100, 200)) n <- sum(nvec) # Parameters lambda <- 0.4 Gamma <- c(2, -1.9, 0.8, 1.5, -1.2) sigma <- 1.5 theta <- c(lambda, Gamma, sigma) # X X <- cbind(rnorm(n, 1, 1), rexp(n, 0.4)) # Network G <- list() for (m in 1:M) { nm <- nvec[m] Gm <- matrix(0, nm, nm) max_d <- 30 for (i in 1:nm) { tmp <- sample((1:nm)[-i], sample(0:max_d, 1)) Gm[i, tmp] <- 1 } rs <- rowSums(Gm); rs[rs == 0] <- 1 Gm <- Gm/rs G[[m]] <- Gm } # Data data <- data.frame(X, peer.avg(G, cbind(x1 = X[,1], x2 = X[,2]))) colnames(data) <- c("x1", "x2", "gx1", "gx2") ## Complete information game ytmp <- simsart(formula = ~ x1 + x2 + gx1 + gx2, Glist = G, theta = theta, data = data, cinfo = TRUE) data$yc <- ytmp$y ## Incomplete information game ytmp <- simsart(formula = ~ x1 + x2 + gx1 + gx2, Glist = G, theta = theta, data = data, cinfo = FALSE) data$yi <- ytmp$y
Summary and print methods for the class cdnet
as returned by the function cdnet.
## S3 method for class 'cdnet' summary(object, Glist, data, S = 1000L, ...) ## S3 method for class 'summary.cdnet' print(x, ...) ## S3 method for class 'cdnet' print(x, ...)
## S3 method for class 'cdnet' summary(object, Glist, data, S = 1000L, ...) ## S3 method for class 'summary.cdnet' print(x, ...) ## S3 method for class 'cdnet' print(x, ...)
object |
an object of class |
Glist |
adjacency matrix. For networks consisting of multiple subnets, |
data |
an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables
in the model. If not found in data, the variables are taken from |
S |
number of simulations to be used to compute integral in the covariance by important sampling. |
... |
further arguments passed to or from other methods. |
x |
an object of class |
A list of the same objects in object
.
Summary and print methods for the class sar
as returned by the function sar.
## S3 method for class 'sar' summary(object, ...) ## S3 method for class 'summary.sar' print(x, ...) ## S3 method for class 'sar' print(x, ...)
## S3 method for class 'sar' summary(object, ...) ## S3 method for class 'summary.sar' print(x, ...) ## S3 method for class 'sar' print(x, ...)
object |
an object of class |
... |
further arguments passed to or from other methods. |
x |
an object of class |
A list of the same objects in object
.
Summary and print methods for the class sart
as returned by the function sart.
## S3 method for class 'sart' summary(object, Glist, data, ...) ## S3 method for class 'summary.sart' print(x, ...) ## S3 method for class 'sart' print(x, ...)
## S3 method for class 'sart' summary(object, Glist, data, ...) ## S3 method for class 'summary.sart' print(x, ...) ## S3 method for class 'sart' print(x, ...)
object |
an object of class |
Glist |
adjacency matrix or list sub-adjacency matrix. This is not necessary if the covariance method was computed in cdnet. |
data |
dataframe containing the explanatory variables. This is not necessary if the covariance method was computed in cdnet. |
... |
further arguments passed to or from other methods. |
x |
an object of class |
A list of the same objects in object
.