Title: | Ecological Inference |
---|---|
Description: | Software accompanying Gary King's book: A Solution to the Ecological Inference Problem. (1997). Princeton University Press. ISBN 978-0691012407. |
Authors: | Gary King, Molly Roberts |
Maintainer: | James Honaker |
License: | GPL (>= 2) |
Version: | 1.3-3 |
Built: | 2024-11-07 06:40:30 UTC |
Source: | CRAN |
Returns analytical bounds, derived from the accounting identity, on the unknown table quantities beta_b and beta_w, given the known, observed table marginals x and t (and the sample size n).
bounds1(x, t, n)
x |
vector of characteristics, e.g. percentage of blacks in each district |
t |
vector of characteristics, e.g. percentage of people that voted in each district |
n |
size of each observation, e.g. number of voters in each district |
Gary King and Molly Roberts
Gary King (1997). A Solution to the Ecological Inference Problem. Princeton: Princeton University Press.
data(census1910)
output <- bounds1(x = census1910$x, t = census1910$t, n = census1910$n)
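To make the accounting-identity bounds concrete, here is a minimal base-R sketch. It assumes the identity t = beta_b * x + beta_w * (1 - x) with both unknowns restricted to [0, 1], and it only handles 0 < x < 1. The helper `accounting_bounds` and the input values are hypothetical illustrations; this is not the package's implementation of `bounds1`, which also takes `n`.

```r
# Hypothetical helper (not part of the package): deterministic bounds
# implied by t = beta_b * x + beta_w * (1 - x) with both betas in [0, 1].
# Assumes 0 < x < 1 so that no division by zero occurs.
accounting_bounds <- function(x, t) {
  stopifnot(length(x) == length(t),
            all(x > 0 & x < 1),
            all(t >= 0 & t <= 1))
  data.frame(
    lower_b = pmax(0, (t - (1 - x)) / x),
    upper_b = pmin(1, t / x),
    lower_w = pmax(0, (t - x) / (1 - x)),
    upper_w = pmin(1, t / (1 - x))
  )
}

# Illustrative values only, not taken from census1910.
b <- accounting_bounds(x = c(0.2, 0.5, 0.9), t = c(0.5, 0.25, 0.95))
print(b)
```

The bounds narrow as x approaches 0 or 1 for the larger group: in the third illustrative row, beta_b is confined to a short interval near 1 while beta_w remains wide.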
A dataset of aggregate literacy rates (t) and fraction of the population that is black (x), from the 1910 US Census. Each observation represents one county.
census1910
A data frame containing 1030 observations.
Gary King, 1997, "Replication data for: A Solution to the Ecological Inference Problem: Reconstructing Individual Behavior from Aggregate Data", http://hdl.handle.net/1902.1/LWMMKUTYXS UNF:3:DRWozWd89+vNLO7lY2AHbg== IQSS Dataverse Network [Distributor] V3 [Version]
Gary King. (1997). A Solution to the Ecological Inference Problem. Princeton: Princeton University Press. Section 13.2:241-5.
Robinson, William S. (1950). “Ecological Correlation and the Behavior of Individuals.” American Sociological Review 15:351-357.
ei is the main command in the package EI. It gives observation-level estimates (and various related statistics) of $\beta_i^b$ and $\beta_i^w$ given variables $T_i$ and $X_i$ ($i = 1, \dots, n$) in this accounting identity: $T_i = \beta_i^b X_i + \beta_i^w (1 - X_i)$. Results are stored in an ei object, which can be read with summary() or eiread() and graphed with plot().
ei(formula, total = NULL, Zb = 1, Zw = 1, id = NA, data = NA,
   erho = 0.5, esigma = 0.5, ebeta = 0.5, ealphab = NA, ealphaw = NA,
   truth = NA, simulate = TRUE, covariate = NULL, lambda1 = 4, lambda2 = 2,
   covariate.prior.list = NULL, tune.list = NULL, start.list = NULL,
   sample = 1000, thin = 1, burnin = 1000, verbose = 0, ret.beta = "r",
   ret.mcmc = TRUE, usrfun = NULL)
formula |
A formula of the form t ~ x in the 2x2 case, or cbind(col1, col2, ...) ~ cbind(row1, row2, ...) in the RxC case |
total |
‘total’ is the name of the variable in the dataset that contains the number of individuals in each unit |
Zb |
|
Zw |
|
id |
‘id’ is the name of the variable in the dataset that identifies the precinct. Used for ‘movie’ and ‘movieD’ plot functions. |
data |
data frame that contains the variables that correspond to formula. If using covariates and data is specified, data should also contain |
erho |
The standard deviation of the normal prior on |
esigma |
The standard deviation of an underlying normal distribution, from which a half normal is constructed as a prior for both |
ebeta |
Standard deviation of the "flat normal" prior on |
ealphab |
cols(Zb) x 2 matrix of means (in the first column) and standard deviations (in the second) of an independent normal prior distribution on elements of |
ealphaw |
cols(Zw) x 2 matrix of means (in the first column) and standard deviations (in the second) of an independent normal prior distribution on elements of |
truth |
A length(t) x 2 matrix of the true values of the quantities of interest. |
simulate |
default = TRUE:see documentation in |
covariate |
see documentation in |
lambda1 |
default = 4:see documentation in |
lambda2 |
default = 2:see documentation in |
covariate.prior.list |
see documentation in |
tune.list |
see documentation in |
start.list |
see documentation in |
sample |
default = 1000 |
thin |
default = 1 |
burnin |
default = 1000 |
verbose |
default = 0:see documentation in |
ret.beta |
default = "r": see documentation in |
ret.mcmc |
default = TRUE: see documentation in |
usrfun |
see documentation in |
The EI algorithm is run using the ei command. A summary of the results can be seen graphically using plot(ei.object) or numerically using summary(ei.object). Quantities of interest can be calculated using eiread(ei.object).
Gary King and Molly Roberts
Gary King (1997). A Solution to the Ecological Inference Problem. Princeton: Princeton University Press.
data(sample)
form <- t ~ x
dbuf <- ei(form, total = "n", data = sample)
summary(dbuf)
Simulate EI solution via importance sampling
ei.sim(ei.object)
ei.object |
An ei object |
Gary King and Molly Roberts
Gary King (1997). A Solution to the Ecological Inference Problem. Princeton: Princeton University Press.
eiread is the command that pulls quantities of interest from the ei object. The command returns a list of the quantities of interest requested by the user.
eiread(ei.object, ...)
ei.object |
An ei object |
... |
A list of quantities of interest for |
betab |
|
betaw |
|
sbetab |
|
sbetaw |
|
phi |
Maximum posterior estimates of the CML |
psisims |
Matrix of random simulations of |
bounds |
|
abounds |
|
aggs |
Simulations of district-level quantities of interest |
maggs |
Point estimate of 2 district-level parameters, |
VCaggs |
Variance matrix of 2 district-level parameters, |
CI80b |
|
CI80w |
|
eaggbias |
Regressions of estimated |
goodman |
Goodman's Regression. See Section 3.1 |
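As a point of comparison for the goodman quantity, the following is a minimal base-R sketch of the textbook form of Goodman's regression. It assumes the accounting identity with constant group rates, so that t = beta_w + (beta_b - beta_w) * x plus noise. The data are simulated purely for illustration, all variable names are hypothetical, and this is not a description of how eiread computes its result.

```r
# Simulated, purely illustrative precinct data (not from the package).
set.seed(1)
n_prec <- 200
true_b <- 0.7   # assumed constant rate in group b
true_w <- 0.3   # assumed constant rate in group w
dat <- data.frame(x = runif(n_prec, 0.05, 0.95))
dat$t <- true_b * dat$x + true_w * (1 - dat$x) + rnorm(n_prec, sd = 0.02)

# Goodman's regression: t = beta_w + (beta_b - beta_w) * x + error,
# so the intercept estimates beta_w and intercept + slope estimates beta_b.
fit <- lm(t ~ x, data = dat)
beta_w_hat <- unname(coef(fit)[1])
beta_b_hat <- unname(coef(fit)[1] + coef(fit)[2])
print(c(beta_b = beta_b_hat, beta_w = beta_w_hat))
```

Nothing constrains these estimates to [0, 1]; with real data where group rates vary with x, they can fall outside the unit interval.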
Gary King and Molly Roberts
Gary King (1997). A Solution to the Ecological Inference Problem. Princeton: Princeton University Press.
data(sample)
formula <- t ~ x
dbuf <- ei(formula = formula, total = "n", data = sample)
eiread(dbuf, "phi")
eiread(dbuf, "betab", "betaw")
A description for this dataset
eiRxCsample
A data frame containing 93 observations.
Source
Gary King (1997). A Solution to the Ecological Inference Problem. Princeton: Princeton University Press.
Aggregated data from 289 precincts in Fulton County, Georgia. The variable t represents the fraction voting in 1994 and x the fraction voting in 1992. Beta_b is then the fraction of 1992 voters who also vote in the 1994 midterm election, and Beta_w the fraction of 1992 nonvoters who vote in 1994.
fultongen
A data frame containing 289 observations.
Gary King, 1997, "Replication data for: A Solution to the Ecological Inference Problem: Reconstructing Individual Behavior from Aggregate Data", http://hdl.handle.net/1902.1/LWMMKUTYXS UNF:3:DRWozWd89+vNLO7lY2AHbg== IQSS Dataverse Network [Distributor] V3 [Version]
Gary King (1997). A Solution to the Ecological Inference Problem. Princeton: Princeton University Press. Section 13.1:235-41.
The fraction of registered voters who are black (x) and the fraction of voter turnout (t) in each Louisiana precinct, along with the true fraction of black turnout (tb) and non-black turnout (tw).
lavoteall
A data frame containing 3262 observations.
Gary King, 1997, "Replication data for: A Solution to the Ecological Inference Problem: Reconstructing Individual Behavior from Aggregate Data", http://hdl.handle.net/1902.1/LWMMKUTYXS UNF:3:DRWozWd89+vNLO7lY2AHbg== IQSS Dataverse Network [Distributor] V3 [Version]
Gary King (1997). A Solution to the Ecological Inference Problem. Princeton: Princeton University Press. Section 1.4:22-4.
Aggregate voter registration and fraction black in counties in Florida, Louisiana, North Carolina, and South Carolina.
matproii
A data frame containing 268 observations.
Gary King, 1997, "Replication data for: A Solution to the Ecological Inference Problem: Reconstructing Individual Behavior from Aggregate Data", http://hdl.handle.net/1902.1/LWMMKUTYXS UNF:3:DRWozWd89+vNLO7lY2AHbg== IQSS Dataverse Network [Distributor] V3 [Version]
Gary King (1997). A Solution to the Ecological Inference Problem. Princeton: Princeton University Press. Chapter 10.
A description for this dataset
nj
A data frame containing 493 observations.
Gary King, 1997, "Replication data for: A Solution to the Ecological Inference Problem: Reconstructing Individual Behavior from Aggregate Data", http://hdl.handle.net/1902.1/LWMMKUTYXS UNF:3:DRWozWd89+vNLO7lY2AHbg== IQSS Dataverse Network [Distributor] V3 [Version]
Gary King (1997). A Solution to the Ecological Inference Problem. Princeton: Princeton University Press. Section 1.4:24-5.
‘plot’ method for the class ‘ei’.
## S3 method for class 'ei'
plot(x, ...)
x |
An ei object |
... |
A list of options to return in graphs. See values below. |
Returns any of a set of possible graphical objects, mirroring those in the examples in King (1997).
Graphical option lci is a logical value specifying the use of the Law of Conservation of Ink, where the implicit information in the data is represented through color gradients, i.e. the color of the line is a function of the length of the tomography line. This can be passed as an argument and is used for “tomogD” and “tomog” plots.
tomogD |
Tomography plot with the data only. See Figure 5.1, page 81. |
tomog |
Tomography plot with ML contours. See Figure 10.2, page 204. |
tomogCI |
Tomography plot with |
tomogCI95 |
Tomography plot with |
tomogE |
Tomography plot with estimated mean posterior |
tomogP |
Tomography plot with mean posterior contours. |
betab |
Density estimate (i.e., a smooth version of a histogram) of point estimates of |
betaw |
Density estimate (i.e., a smooth version of a histogram) of point estimates of |
xt |
Basic |
xtc |
Basic |
xtfit |
|
xtfitg |
|
estsims |
All the simulated |
boundXb |
|
boundXw |
|
truth |
Compares truth to estimates at the district and precinct-level. Requires |
movieD |
For each observation, one tomography plot appears with the line for the particular observation darkened. After the graph for each observation appears, the user can choose to view the next observation (hit return), jump to a specific observation number (type in the number and hit return), or stop (hit "s" and return). |
movie |
For each observation, one page of graphics appears with the posterior distribution of |
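The tomography plots listed above draw one line per observation. As a rough base-R sketch of the geometry only, and assuming the accounting identity t = beta_b * x + beta_w * (1 - x) with 0 < x < 1, each line is beta_w = t/(1 - x) - (x/(1 - x)) * beta_b, clipped to the unit square. The helper `tomography_segments` and the input values are hypothetical and are not the package's plotting code.

```r
# Hypothetical helper (not part of the package): endpoints and length of
# each observation's tomography line inside the unit square.
# Assumes 0 < x < 1 and 0 <= t <= 1.
tomography_segments <- function(x, t) {
  stopifnot(length(x) == length(t),
            all(x > 0 & x < 1),
            all(t >= 0 & t <= 1))
  intercept <- t / (1 - x)
  slope <- -x / (1 - x)
  b_lo <- pmax(0, (t - (1 - x)) / x)  # smallest feasible beta_b
  b_hi <- pmin(1, t / x)              # largest feasible beta_b
  data.frame(
    intercept = intercept,
    slope = slope,
    b0 = b_lo, w0 = intercept + slope * b_lo,
    b1 = b_hi, w1 = intercept + slope * b_hi,
    # Euclidean length of the clipped segment in the (beta_b, beta_w) plane
    length = (b_hi - b_lo) * sqrt(1 + slope^2)
  )
}

# Illustrative values only, not taken from any package dataset.
seg <- tomography_segments(x = c(0.2, 0.5, 0.9), t = c(0.5, 0.25, 0.95))
print(seg)
```

The endpoints could be drawn with base graphics, for example `segments(seg$b0, seg$w0, seg$b1, seg$w1)` on an empty unit-square plot; the `length` column is the kind of quantity a length-based color gradient would be keyed to.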
Gary King and Molly Roberts
Gary King (1997). A Solution to the Ecological Inference Problem. Princeton: Princeton University Press.
data(sample)
formula <- t ~ x
dbuf <- ei(formula = formula, total = "n", data = sample)
plot(dbuf, "tomog")
plot(dbuf, "tomog", "betab", "betaw", "xtfit")
A description for this dataset
RxCdata
A data frame containing 60 observations.
Source
Gary King (1997). A Solution to the Ecological Inference Problem. Princeton: Princeton University Press.
A description for this dataset
sample
A vector containing 141 observations.
Source
Gary King (1997). A Solution to the Ecological Inference Problem. Princeton: Princeton University Press.
‘summary’ method for the class ‘ei’.
## S3 method for class 'ei'
summary(object, ...)
object |
An ei object |
... |
A list of options to return in graphs. See values below. |
Gary King and Molly Roberts
Gary King (1997). A Solution to the Ecological Inference Problem. Princeton: Princeton University Press.
data(sample)
formula <- t ~ x
dbuf <- ei(formula = formula, total = "n", data = sample)
print(summary(dbuf))
A tomography plot for an estimated Ecological Inference model in RxC data.
tomogRxC(formula, data, total=NULL, refine=100)
formula |
A formula of the form cbind(col1, col2, ...) ~ cbind(row1, row2, ...) |
data |
data frame that contains the variables that correspond to the formula |
total |
‘total’ is the name of the variable in the dataset that contains the number of individuals in each unit |
refine |
specifies the amount of refinement for the image. Higher numbers mean better resolution. |
Gary King and Molly Roberts
Gary King (1997). A Solution to the Ecological Inference Problem. Princeton: Princeton University Press.
data(RxCdata)
formula <- cbind(turnout, noturnout) ~ cbind(white, black, hisp)
tomogRxC(formula, data = RxCdata)
A tomography plot in 3 dimensions for RxC Ecological Inference data and an estimated Ecological Inference model in RxC data.
tomogRxC3d(formula, data, total = NULL, lci = TRUE, estimates = FALSE,
           ci = FALSE, level = .95, seed = 1234,
           color = hcl(h = 30, c = 100, l = 60), transparency = .75,
           light = FALSE, rotate = TRUE)
formula |
A formula of the form cbind(col1, col2, ...) ~ cbind(row1, row2, ...) |
data |
data frame that contains the variables that correspond to the formula |
total |
‘total’ is the name of the variable in the dataset that contains the number of individuals in each unit |
lci |
logical value specifying the use of the Law of Conservation of Ink, where the implicit information in the data is represented through color gradients, i.e. the color of the plane is a function of the area of the tomography plane. |
estimates |
logical value specifying whether the point estimates of |
ci |
logical value specifying whether the estimated confidence ellipse is included on the tomography plot. |
level |
numeric value from 0 to 1 specifying the confidence level of the confidence ellipse; e.g., .95 refers to a 95% confidence ellipse. |
seed |
seed value for model estimation. |
color |
color of tomography planes if lci=FALSE. |
transparency |
numeric value from 0 to 1 specifying transparency of tomography planes; 0 is entirely transparent. |
light |
logical value specifying whether lights should be included in the rgl interface. The inclusion of lights will create shadows in the plot that may distort colors. |
rotate |
logical value specifying whether the plot will rotate for 20 seconds. |
Requires the rgl package and an rgl viewer.
Gary King; Molly Roberts; Soledad Prillaman
Gary King (1997). A Solution to the Ecological Inference Problem. Princeton: Princeton University Press.
data(RxCdata)
formula <- cbind(turnout, noturnout) ~ cbind(white, black, hisp)
tomogRxC3d(formula, RxCdata, total = NULL, lci = TRUE, estimates = TRUE,
           ci = TRUE, transparency = .5, light = FALSE, rotate = FALSE)