Title: | Generation of Artificial Binary Data |
---|---|
Description: | Generation of correlated artificial binary data. |
Authors: | Friedrich Leisch [aut] (<https://orcid.org/0000-0001-7278-1983>, maintainer up to 2024), Andreas Weingessel [aut], Kurt Hornik [aut, cre] |
Maintainer: | Kurt Hornik <[email protected]> |
License: | GPL-2 |
Version: | 0.9-21 |
Built: | 2024-10-24 06:44:49 UTC |
Source: | CRAN |
Compute a matrix of common probabilities for a binary random vector from given marginal probabilities and correlations.
bincorr2commonprob(margprob, bincorr)
margprob | vector of marginal probabilities. |
bincorr | matrix of binary correlations. |
The matrix of common probabilities. Element [i,i] holds the probability that variable i equals 1, and element [i,j] (for i ≠ j) holds the joint probability that variables i and j both equal 1.
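The conversion can be illustrated with the definition of the correlation between two 0/1 variables, which gives p_ij = p_i p_j + r_ij sqrt(p_i (1 - p_i) p_j (1 - p_j)). The following is a minimal base R sketch under that assumption; the helper name `bincorr_to_commonprob_sketch` is hypothetical and the code is not taken from the package source.

```r
# Hypothetical helper, not part of the package: converts marginal
# probabilities and binary correlations to common probabilities using
# cov(X_i, X_j) = p_ij - p_i * p_j for 0/1 variables.
bincorr_to_commonprob_sketch <- function(margprob, bincorr) {
  s <- sqrt(margprob * (1 - margprob))               # standard deviations
  cp <- bincorr * outer(s, s) + outer(margprob, margprob)
  diag(cp) <- margprob                               # diagonal holds the marginals
  cp
}

p <- c(0.5, 0.8)
r <- cbind(c(1, 0.25), c(0.25, 1))
cp <- bincorr_to_commonprob_sketch(p, r)
cp
```

The sketch does not verify that the resulting matrix is attainable by any joint distribution; that is a separate check.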
Friedrich Leisch
Friedrich Leisch, Andreas Weingessel and Kurt Hornik (1998). On the generation of correlated artificial binary data. Working Paper Series, SFB “Adaptive Information Systems and Modelling in Economics and Management Science”, Vienna University of Economics.
commonprob2sigma, simul.commonprob.
The main diagonal elements commonprob[i,i] are interpreted as probabilities that the i-th binary variable equals 1. The off-diagonal elements commonprob[i,j] are the probabilities that both the i-th and the j-th variable are 1.
This program checks some necessary conditions on these probabilities, which must be fulfilled for a joint distribution of the variables with the given probabilities to exist.
The conditions checked are
check.commonprob(commonprob)
commonprob | matrix of pairwise probabilities. |
check.commonprob returns TRUE if all conditions are fulfilled. The attribute "message" of the return value contains some information on the errors that were found.
Andreas Weingessel
Friedrich Leisch, Andreas Weingessel and Kurt Hornik (1998). On the generation of correlated artificial binary data. Working Paper Series, SFB “Adaptive Information Systems and Modelling in Economics and Management Science”, Vienna University of Economics.
simul.commonprob, commonprob2sigma
check.commonprob(cbind(c(0.5, 0.4), c(0.4, 0.8)))
check.commonprob(cbind(c(0.5, 0.25), c(0.25, 0.8)))
check.commonprob(cbind(c(0.5, 0, 0), c(0, 0.5, 0), c(0, 0, 0.5)))
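The example matrices can be read against three elementary bounds that any joint distribution of binary variables must satisfy: 0 <= p_i <= 1; max(0, p_i + p_j - 1) <= p_ij <= min(p_i, p_j) for i ≠ j; and p_i + p_j + p_k - p_ij - p_ik - p_jk <= 1 for distinct i, j, k. The base R sketch below tests these bounds. It is an illustration only: the helper name `check_commonprob_sketch` is hypothetical, and it is an assumption that check.commonprob tests this same set of conditions.

```r
# Hypothetical helper, not the package's implementation: returns TRUE when
# the three families of necessary bounds hold for a matrix cp of marginal
# (diagonal) and pairwise (off-diagonal) probabilities.
check_commonprob_sketch <- function(cp, tol = 1e-8) {
  p <- diag(cp)
  d <- length(p)
  # (1) marginals must be probabilities
  if (any(p < -tol | p > 1 + tol)) return(FALSE)
  # (2) max(0, p_i + p_j - 1) <= p_ij <= min(p_i, p_j)
  lower <- pmax(outer(p, p, "+") - 1, 0)
  upper <- outer(p, p, pmin)
  off <- row(cp) != col(cp)
  if (any(cp[off] < lower[off] - tol | cp[off] > upper[off] + tol)) return(FALSE)
  # (3) p_i + p_j + p_k - p_ij - p_ik - p_jk <= 1 for distinct i, j, k
  if (d >= 3) {
    for (idx in utils::combn(d, 3, simplify = FALSE)) {
      i <- idx[1]; j <- idx[2]; k <- idx[3]
      total <- p[i] + p[j] + p[k] - cp[i, j] - cp[i, k] - cp[j, k]
      if (total > 1 + tol) return(FALSE)
    }
  }
  TRUE
}

ok1 <- check_commonprob_sketch(cbind(c(0.5, 0.4), c(0.4, 0.8)))
ok2 <- check_commonprob_sketch(cbind(c(0.5, 0.25), c(0.25, 0.8)))
ok3 <- check_commonprob_sketch(cbind(c(0.5, 0, 0), c(0, 0.5, 0), c(0, 0, 0.5)))
c(ok1, ok2, ok3)
```

In this sketch the second matrix fails the lower pairwise bound (0.25 < 0.5 + 0.8 - 1) and the third fails the three-variable bound (1.5 > 1). The sketch does not check symmetry and, unlike check.commonprob, does not attach a "message" attribute.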
Computes a covariance matrix for a normal distribution which corresponds to a binary distribution with marginal probabilities given by diag(commonprob) and pairwise probabilities given by commonprob.
For the simulations the values of simulvals are used.
If the result is not a valid covariance matrix, the program stops with an error in the case of NA arguments and issues a warning if the matrix is not positive definite.
commonprob2sigma(commonprob, simulvals)
commonprob | matrix of pairwise probabilities. |
simulvals | array received by |
A covariance matrix is returned with the same dimensions as commonprob.
Friedrich Leisch
Friedrich Leisch, Andreas Weingessel and Kurt Hornik (1998). On the generation of correlated artificial binary data. Working Paper Series, SFB “Adaptive Information Systems and Modelling in Economics and Management Science”, Vienna University of Economics.
m <- cbind(c(1/2,1/5,1/6),c(1/5,1/2,1/6),c(1/6,1/6,1/2))
sigma <- commonprob2sigma(m)
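To make the idea of a "corresponding" normal covariance concrete, the following base R sketch finds, for a single pair of variables, the normal correlation whose thresholded components reproduce a given pairwise probability. It swaps in direct root-finding with `integrate` and `uniroot` rather than the tabulated simulvals the function is documented to use, so it should be read as an illustration under stated assumptions (unit variances, correlation strictly inside (-1, 1)), not as the package's algorithm. Both helper names are hypothetical.

```r
# Hypothetical helpers, not the package's implementation.
# Probability that both components of a bivariate normal with means
# qnorm(c(p1, p2)), unit variances and correlation rho are positive,
# written as a one-dimensional integral by conditioning on the first one.
joint_prob_sketch <- function(p1, p2, rho) {
  a <- qnorm(p1); b <- qnorm(p2)
  f <- function(z) dnorm(z) * pnorm((b + rho * z) / sqrt(1 - rho^2))
  integrate(f, lower = -a, upper = Inf)$value
}

# Solve for the normal correlation that reproduces the pairwise probability p12.
normal_corr_sketch <- function(p1, p2, p12) {
  uniroot(function(rho) joint_prob_sketch(p1, p2, rho) - p12,
          interval = c(-0.99, 0.99), tol = 1e-10)$root
}

# Pair (1, 2) of the example matrix m: marginals 1/2, pairwise probability 1/5.
rho <- normal_corr_sketch(0.5, 0.5, 0.2)
rho
```

For marginals of 1/2 the result can be compared with the closed form sin(2 * pi * (p12 - 1/4)), which is a convenient sanity check for the numerical solution.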
Returns a matrix containing the conditional probabilities P(x_i = 1 | x_j = 1), where x_i corresponds to the i-th column of x.
condprob(x)
x | matrix of binary data with rows corresponding to cases and columns corresponding to variables. |
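As a rough illustration of what such a matrix contains, the base R sketch below estimates conditional probabilities from a small binary data matrix. The helper name `condprob_sketch` is hypothetical, and the orientation (entry [i, j] estimating the probability that column i is 1 given that column j is 1) is an assumption rather than something confirmed from the package source.

```r
# Hypothetical helper, not the package's implementation. Entry [i, j]
# estimates P(x_i = 1 | x_j = 1); this orientation is an assumption.
condprob_sketch <- function(x) {
  joint <- crossprod(x) / nrow(x)      # [i, j] = relative frequency of x_i = x_j = 1
  sweep(joint, 2, diag(joint), "/")    # divide column j by the frequency of x_j = 1
}

# Four cases (rows), two variables (columns).
x <- rbind(c(1, 1), c(1, 0), c(0, 0), c(1, 1))
cp <- condprob_sketch(x)
cp
```

A column that never equals 1 leads to a division by zero in this sketch and therefore to NaN entries.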
Friedrich Leisch
Converts all values of the real-valued array x to binary values by thresholding at 0.
ra2ba(x)
x | array of arbitrary dimension |
Friedrich Leisch
x <- array(rnorm(10), dim=c(2,5))
ra2ba(x)
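The thresholding itself can be sketched in one line of base R. The helper name `ra2ba_sketch` is hypothetical, and mapping exact zeros to 0 is an assumption, since the text only says that values are thresholded at 0.

```r
# Hypothetical helper, not the package's implementation: values > 0 become 1,
# all others 0. Mapping exact zeros to 0 is an assumption.
ra2ba_sketch <- function(x) {
  (x > 0) * 1L   # the comparison keeps the dim attribute of x
}

x <- array(c(-1.5, 0.2, 3, -0.1, 0.7, -2), dim = c(2, 3))
b <- ra2ba_sketch(x)
b
```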
Creates correlated multivariate binary random variables by thresholding a normal distribution. The correlations of the components can be specified either as common probabilities, correlation matrix of the binary distribution, or covariance matrix of the normal distribution.
rmvbin(n, margprob, commonprob=diag(margprob), bincorr=diag(length(margprob)), sigma=diag(length(margprob)), colnames=NULL, simulvals=NULL)
n | number of observations. |
margprob | margin probabilities that the components are 1. |
commonprob | matrix of probabilities that components |
bincorr | matrix of binary correlations. |
sigma | covariance matrix for the normal distribution. |
colnames | vector of column names for the resulting observation matrix. |
simulvals | result from |
Only one of the arguments commonprob, bincorr and sigma may be specified. By default, the components are uncorrelated.
n samples are drawn from a multivariate normal distribution whose mean and variance are chosen to give the desired margin and common probabilities. Negative values are converted to 0, positive values to 1.
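The sampling step described above can be sketched in base R as follows. This is an illustration, not the package's code: the helper name `rmvbin_sketch` is hypothetical, the normal samples are generated with a Cholesky factor, and sigma is assumed to be positive definite with unit diagonal so that shifting the means by qnorm(margprob) gives the requested marginals.

```r
# Hypothetical helper, not the package's implementation. sigma is assumed to
# be a positive definite covariance matrix with unit diagonal.
rmvbin_sketch <- function(n, margprob, sigma = diag(length(margprob))) {
  d <- length(margprob)
  z <- matrix(rnorm(n * d), nrow = n) %*% chol(sigma)  # rows ~ N(0, sigma)
  z <- sweep(z, 2, qnorm(margprob), "+")               # now P(z[, j] > 0) = margprob[j]
  (z > 0) * 1L                                         # threshold at 0
}

set.seed(1)
sigma <- cbind(c(1, 0.5), c(0.5, 1))
b <- rmvbin_sketch(20000, margprob = c(0.3, 0.9), sigma = sigma)
colMeans(b)   # should be close to the requested marginals
```

The correlation of the binary columns is generally not equal to the normal correlation in sigma, which is why a conversion such as commonprob2sigma is needed when common probabilities or binary correlations are specified.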
Friedrich Leisch
Friedrich Leisch, Andreas Weingessel and Kurt Hornik (1998). On the generation of correlated artificial binary data. Working Paper Series, SFB “Adaptive Information Systems and Modelling in Economics and Management Science”, Vienna University of Economics.
commonprob2sigma, check.commonprob, simul.commonprob
## uncorrelated columns:
rmvbin(10, margprob=c(0.3,0.9))
## correlated columns
m <- cbind(c(1/2,1/5,1/6),c(1/5,1/2,1/6),c(1/6,1/6,1/2))
rmvbin(10,commonprob=m)
## same as the second example, but faster if the same probabilities are
## used repeatedly (commonprob2sigma rather slow)
sigma <- commonprob2sigma(m)
rmvbin(10,margprob=diag(m),sigma=sigma)
Compute common probabilities of binary random variates generated by thresholding normal variates at 0.
simul.commonprob(margprob, corr=0, method="integrate", n1=10^5, n2=10)
margprob | vector of marginal probabilities. |
corr | vector of correlation values for normal distribution. |
method | either |
n1 | number of normal variates if method is |
n2 | number of repetitions if method is |
The output of this function is used by rmvbin. For all combinations of margprob[i], margprob[j] and corr[k], the function computes the probability that both components of a normal random variable with mean qnorm(margprob[c(i,j)]) and correlation corr[k] are larger than zero.
The probabilities are either computed by numerical integration of the multivariate normal density, or by Monte Carlo simulation.
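Both approaches can be sketched for a single combination of marginals and correlation in base R. The helper names are hypothetical and the code is not the package's implementation: the integration variant reduces the bivariate integral to a one-dimensional one by conditioning on the first component (valid only for correlations strictly between -1 and 1), and unit variances are assumed.

```r
# Hypothetical helpers, not the package's implementation.
# Probability that both components of a bivariate normal with means
# qnorm(c(p1, p2)), unit variances and correlation rho exceed zero.
both_positive_integrate <- function(p1, p2, rho) {
  a <- qnorm(p1); b <- qnorm(p2)
  f <- function(z) dnorm(z) * pnorm((b + rho * z) / sqrt(1 - rho^2))
  integrate(f, lower = -a, upper = Inf)$value
}

both_positive_montecarlo <- function(p1, p2, rho, n = 1e5) {
  z1 <- rnorm(n)
  z2 <- rho * z1 + sqrt(1 - rho^2) * rnorm(n)   # corr(z1, z2) = rho
  mean(z1 + qnorm(p1) > 0 & z2 + qnorm(p2) > 0)
}

set.seed(42)
res <- c(integrate  = both_positive_integrate(0.3, 0.7, 0.5),
         montecarlo = both_positive_montecarlo(0.3, 0.7, 0.5))
res
```

The Monte Carlo estimate carries sampling error of order 1/sqrt(n), so the two values should agree only approximately.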
For normal usage of rmvbin it is not necessary to use this function; one simulation result is provided as the variable SimulVals in this package and loaded by default.
simul.commonprob returns an array of dimension c(length(margprob), length(margprob), length(corr)).
Friedrich Leisch
Friedrich Leisch, Andreas Weingessel and Kurt Hornik (1998). On the generation of correlated artificial binary data. Working Paper Series, SFB “Adaptive Information Systems and Modelling in Economics and Management Science”, Vienna University of Economics.
simul.commonprob(seq(0,1,0.5), seq(-1,1,0.5), meth="mo", n1=10^4)
data(SimulVals)
This variable provides a pre-fabricated result from simul.commonprob, so that it is normally not necessary to call this (time-consuming) function. It is used by rmvbin.
SimulVals
Friedrich Leisch
Friedrich Leisch, Andreas Weingessel and Kurt Hornik (1998). On the generation of correlated artificial binary data. Working Paper Series, SFB “Adaptive Information Systems and Modelling in Economics and Management Science”, Vienna University of Economics.