Title: | Adrian Dusa's Miscellaneous |
---|---|
Description: | Contains functions used across packages 'DDIwR', 'QCA' and 'venn'. Interprets and translates, factorizes and negates SOP - Sum of Products expressions, for both binary and multi-value crisp sets, and extracts information (set names, set values) from those expressions. Other functions perform various other checks if possibly numeric (even if all numbers reside in a character vector) and coerce to numeric, or check if the numbers are whole. It also offers, among many others, a highly versatile recoding routine and some more flexible alternatives to the base functions 'with()' and 'within()'. SOP simplification functions in this package use related minimization from package 'QCA', which is recommended to be installed despite not being listed in the Imports field, due to circular dependency issues. |
Authors: | Adrian Dusa [aut, cre, cph] |
Maintainer: | Adrian Dusa <[email protected]> |
License: | GPL (>= 3) |
Version: | 0.37 |
Built: | 2024-12-08 12:46:10 UTC |
Source: | CRAN |
Utility functions to read the names and load the objects from an .rda file, into an R list.
listRDA(.filename)
objRDA(.filename)
.filename |
The path to the file where the R object is saved. |
Files with the extension .rda are routinely created using the base function save().
The function listRDA() loads the object(s) from the .rda file into a list, preserving the object names in the list components.
The .rda file can naturally be loaded with the base load() function, but in doing so the contained objects will overwrite any existing objects with the same names.
The function objRDA() returns the names of the objects from the .rda file.
A list, containing the objects from the loaded .rda file.
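The overwrite problem described above can be illustrated with base R alone. The following is a minimal sketch, not the package's implementation: the helper names rda_as_list() and rda_names() are hypothetical, and the approach (loading into a throwaway environment) is an assumption about one reasonable way to get this behaviour.

```r
# Hypothetical helpers, base R only: load into a private environment
# so that nothing in the calling workspace is overwritten.
rda_as_list <- function(filename) {
  env <- new.env()
  load(filename, envir = env)
  as.list(env)
}

rda_names <- function(filename) {
  # load() returns the names of the loaded objects
  load(filename, envir = new.env())
}

tmp <- tempfile(fileext = ".rda")
a <- 1:3
b <- "text"
save(a, b, file = tmp)

a <- "changed"             # same name as an object stored in the file
res <- rda_as_list(tmp)    # the workspace copy of 'a' is left untouched
nms <- rda_names(tmp)
```

After running this, res holds the saved objects under their original names, while the workspace object a keeps its new value.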
Adrian Dusa
Contains functions used across packages 'DDIwR', 'QCA' and 'venn'.
Interprets and translates, factorizes and negates SOP - Sum of Products
expressions, for both binary and multi-value crisp sets, and extracts
information (set names, set values) from those expressions. Other functions
perform various checks if possibly numeric (even if all numbers reside in a
character vector) and coerce to numeric, or check if the numbers are whole. It
also offers, among many others, a highly versatile recoding routine and some
more flexible alternatives to the base functions with() and within().
SOP simplification functions in this package use related minimization from
package QCA, which is recommended to be installed despite not being
listed in the Imports field, due to circular dependency issues.
Package: | admisc |
Type: | Package |
Version: | 0.37 |
Date: | 2024-12-08 |
License: | GPL (>= 2) |
Authors:
Adrian Dusa
Department of Sociology
University of Bucharest
[email protected]
Maintainer:
Adrian Dusa
Function to extract the text between the (escaped) quotes in a string.
betweenQuotes(x)
x |
A string. |
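As a rough illustration of the idea, the text between a pair of double quotes can be pulled out with a base R regular expression. This is only a sketch under the assumption that the string holds a single quoted segment; it does not claim to reproduce how betweenQuotes() works internally.

```r
x <- "An example of \"quoted\" text."

# Capture whatever sits between the first pair of double quotes
inner <- sub('^[^"]*"([^"]*)".*$', "\\1", x)
inner
```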
Adrian Dusa
x <- "An example of \"quoted\" text."
betweenQuotes(x)
Functions to extract information from an expression written in SOP - sum of products form (or from the canonical DNF - disjunctive normal form), for multi-value causal conditions. They extract either the values within brackets, or the causal conditions' names outside the brackets.
betweenBrackets(x, type = "[", invert = FALSE, regexp = NULL)
outsideBrackets(x, type = "[", regexp = NULL)
curlyBrackets(x, outside = FALSE, regexp = NULL)
squareBrackets(x, outside = FALSE, regexp = NULL)
roundBrackets(x, outside = FALSE, regexp = NULL)
x |
A DNF/SOP expression. |
type |
Brackets type: curly, round or square. |
invert |
Logical, if activated returns whatever is not within the brackets. |
outside |
Logical, if activated returns the conditions' names outside the brackets. |
regexp |
Optional regular expression to extract information with. |
Expressions written in SOP - sum of products are used in Boolean logic, signaling a disjunction of conjunctions.
These expressions are useful in Qualitative Comparative Analysis, a social science methodology that is employed in the context of searching for causal configurations that are associated with a certain outcome.
They are also used to draw Venn diagrams with the package venn, which draws any kind of set intersection (conjunction) based on a custom SOP expression.
The functions curlyBrackets, squareBrackets and roundBrackets are just special cases of the functions betweenBrackets and outsideBrackets, using the argument type as either "{", "[" or "(".
The function outsideBrackets itself can be considered a special case of the function betweenBrackets, when it uses the argument invert = TRUE.
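The distinction between what lies inside and outside the brackets can be sketched with base R regular expressions. This is an illustrative approximation for quoted expressions with square brackets only, not the package's own code, and the object names are made up for the example.

```r
sop <- "A[1] + B[2]*C[0]"

# Values inside the square brackets
inside_vals <- regmatches(
  sop,
  gregexpr("(?<=\\[)[^\\]]*(?=\\])", sop, perl = TRUE)
)[[1]]

# Condition names: drop the bracketed parts, then split on + and *
stripped <- gsub("\\[[^\\]]*\\]", "", sop, perl = TRUE)
outside_names <- trimws(strsplit(stripped, "[+*]")[[1]])

inside_vals
outside_names
```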
SOP expressions are usually written using curly brackets for multi-value conditions, but to allow the evaluation of unquoted expressions they first need to get past R's internal parsing system. For this reason, multi-value conditions in unquoted expressions should use the square brackets notation, and conjunctions should always use the product * sign.
Sufficiency is recognized as "=>" in quoted expressions, but this does not pass R's parsing system in unquoted expressions. To overcome this problem, it is best to use the single arrow "->" notation. Necessity is recognized as either "<=" or "<-", both being valid in quoted and unquoted expressions.
Adrian Dusa
sop <- "A[1] + B[2]*C[0]"
betweenBrackets(sop)
# 1, 2, 0
betweenBrackets(sop, invert = TRUE)
# A, B, C
# unquoted (valid) SOP expressions are allowed, same result
betweenBrackets(A[1] + B[2]*C[0])
# the default type is "["
# curly brackets are also valid in quoted expressions
betweenBrackets("A{1} + B{2}*C{0}", type = "{")
# or
curlyBrackets("A{1} + B{2}*C{0}")
# and the condition names
curlyBrackets("A{1} + B{2}*C{0}", outside = TRUE)
squareBrackets(A[1] + B[2]*C[0])
# 1, 2, 0
squareBrackets(A[1] + B[2]*C[0], outside = TRUE)
# A, B, C
A generic function that applies different altering methods for different types of objects (of certain classes).
change(x, ...)
x |
An object of a particular class. |
... |
Arguments to be passed to a specific method. |
For the time being, this function is designed to change truth table objects only. Future versions will likely add class methods for other objects.
The changed object.
Adrian Dusa
## Not run: 
# An example to change a QCA truth table
library(QCA)
ttLF <- truthTable(LF, outcome = SURV, incl.cut = 0.8)
minimize(ttLF, include = "?")
# excluding contradictory simplifying assumptions
minimize(
    change(ttLF, exclude = findRows(type = 2)),
    include = "?"
)
## End(Not run)
This function verifies if an R vector is possibly numeric, and further if the numbers inside are whole numbers.
coerceMode(x)
x |
An atomic R vector |
An R vector of coerced mode.
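The two checks described above (possibly numeric, then whole numbers) can be sketched in base R as follows. The function coerce_sketch() is a hypothetical stand-in written for illustration; it is an assumption about the general logic and does not claim to match coerceMode() in every edge case.

```r
# Hypothetical illustration of the "possibly numeric, possibly whole" logic
coerce_sketch <- function(x) {
  num <- suppressWarnings(as.numeric(as.character(x)))
  # a non-missing value that failed conversion means "not numeric"
  if (any(is.na(num) & !is.na(x))) {
    return(x)
  }
  # all whole numbers: store as integer, otherwise keep as double
  if (all(is.na(num) | num == round(num))) {
    return(as.integer(num))
  }
  num
}

obj <- c("1.0", 2:5)             # a character vector holding numbers
whole <- coerce_sketch(obj)
decimals <- coerce_sketch(c("1.5", "2"))
text <- coerce_sketch(c("a", "1"))
```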
Adrian Dusa
obj <- c("1.0", 2:5)
is.integer(coerceMode(obj))
A fast function to generate all possible combinations of n numbers, taken k at a time, starting from the first k numbers or starting from a combination that contains a certain number.
combnk(n, k, ogte = 0, zerobased = FALSE)
n |
Vector of any kind, or a numerical scalar. |
k |
Numeric scalar. |
ogte |
At least one value greater than or equal to this number. |
zerobased |
Logical, zero or one based. |
When a scalar, argument n should be numeric, otherwise when a vector its length should not be less than k.
When the argument ogte is specified, the combinations will sequentially be incremented from those which contain a certain number, or a certain position from n when specified as a vector.
A matrix with k rows and choose(n, k) columns.
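The shape of the result can be checked against the base R function combn(), which produces the same kind of layout (one combination per column). This is only a comparison point; combn() has no counterpart for the ogte and zerobased arguments.

```r
# Base R counterpart: each column is one combination
m <- combn(5, 2)
dim(m)                        # k rows, choose(n, k) columns

# A vector can be supplied instead of a scalar
ml <- combn(letters[1:5], 2)
```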
Adrian Dusa
combnk(5, 2)
combnk(5, 2, ogte = 3)
combnk(letters[1:5], 2)
Set matrix row or column names without copying, especially useful for (very) large matrices.
setColnames(matrix, colnames)
setRownames(matrix, rownames)
setDimnames(matrix, nameslist)
matrix |
An R matrix |
colnames |
Character vector of column names |
rownames |
Character vector of row names |
nameslist |
A two-component list containing rownames and colnames |
Adrian Dusa
mat <- matrix(1:9, nrow = 3)
setDimnames(mat, list(LETTERS[1:3], letters[1:3]))
This is a generic function, usually a wrapper to write.table().
export(what, ...)
what |
The object to be written (matrix or dataframe) |
... |
Specific arguments to class functions. |
The default convention for write.table() is to add a blank column name for the row names, but (despite being a standard used for CSV files) that doesn't work with all spreadsheets or other programs that attempt to import the result of write.table().
This function acts as if write.table() was called, with only one difference: if row names are present in the dataframe (i.e. any of them is different from the default row numbers), the final result will display a new column called cases in the first position, except when another column called cases already exists in the data, in which case the row names are completely ignored.
If not otherwise specified, an argument sep = "," is added by default.
The argument row.names is always set to FALSE, a new column being added anyway (if possible).
Since this function pipes everything to write.table(), the argument file can also be a connection open for writing, and "" indicates output to the console.
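The behaviour described above can be approximated with a thin wrapper around write.table(). The sketch below is an assumption-laden illustration: export_sketch() is a hypothetical name, and the details of how export() detects non-default row names may differ.

```r
# Hypothetical wrapper illustrating the "cases" column convention
export_sketch <- function(what, file = "", sep = ",", ...) {
  rn <- rownames(what)
  # row names count as "present" when they differ from 1, 2, 3, ...
  has_names <- !is.null(rn) &&
    !identical(rn, as.character(seq_len(nrow(what))))
  if (has_names && !("cases" %in% colnames(what))) {
    what <- cbind(cases = rn, what)
  }
  write.table(what, file = file, sep = sep, row.names = FALSE, ...)
}

df <- data.frame(x = 1:2, row.names = c("a", "b"))
tmp <- tempfile(fileext = ".csv")
export_sketch(df, file = tmp)
written <- readLines(tmp)
```

The first line of the written file then starts with the cases column, followed by the data columns.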
Adrian Dusa
The “R Data Import/Export” manual.
This function finds all combinations of common factors in a Boolean expression written in SOP - sum of products. It makes use of the function simplify(), which uses the function minimize() from package QCA. Users are highly encouraged to install and load that package, despite it not being present in the Imports field (due to circular dependency issues).
factorize(input, snames = "", noflevels = NULL, pos = FALSE, ...)
input |
A string representing a SOP expression, or a minimization
object of class |
snames |
A string containing the sets' names, separated by commas. |
noflevels |
Numerical vector containing the number of levels for each set. |
pos |
Logical, if possible factorize using product(s) of sums. |
... |
Other arguments (mainly for backwards compatibility). |
Factorization is a process of finding common factors in a Boolean expression, written in SOP - sum of products. Whenever possible, the factorization can also be performed in a POS - product of sums form.
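That a factorized form is logically equivalent to its SOP source can be verified with a truth table in base R. The sketch below checks the identity SWG + SLW = SW(G + L), taken from the examples in this manual; it only verifies the equivalence and does not show how the factors are found.

```r
# All 16 combinations of four binary conditions
tt <- expand.grid(
  S = c(FALSE, TRUE), L = c(FALSE, TRUE),
  W = c(FALSE, TRUE), G = c(FALSE, TRUE)
)

sop <- with(tt, (S & W & G) | (S & L & W))   # SWG + SLW
fac <- with(tt, S & W & (G | L))             # SW(G + L)

same <- identical(sop, fac)
same
```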
Conjunctions should preferably be indicated with a star * sign, but this is not necessary when conditions have single letters or when the expression is expressed in multi-value notation.
The argument snames is only needed when conjunctions are not indicated by any sign, and the set names have more than one letter each (see function translate() for more details).
The number of levels in noflevels is needed only when negating multivalue conditions, and it should complement the snames argument.
If input is an object of class "qca" (the result of the function minimize() from package QCA), a factorization is performed for each of the minimized solutions.
A named list, each component containing all possible factorizations of the input expression(s), found in the name(s).
Adrian Dusa
Ragin, C.C. (1987) The Comparative Method: Moving beyond Qualitative and Quantitative Strategies. Berkeley: University of California Press.
# typical example with redundant conditions
factorize(a~b~cd + a~bc~d + a~bcd + abc~d)
# results presented in alphabetical order
factorize(~one*two*~four + ~one*three + three*~four)
# to preserve a certain order of the set names
factorize(~one*two*~four + ~one*three + three*~four, snames = c(one, two, three, four))
# using pos - products of sums
factorize(~a~c + ~ad + ~b~c + ~bd, pos = TRUE)
## Not run: 
# make sure the package QCA is loaded
library(QCA)
# using an object of class "qca" produced with function minimize()
# in package QCA
pCVF <- minimize(CVF, outcome = "PROTEST", incl.cut = 0.8, include = "?", use.letters = TRUE)
factorize(pCVF)
# using an object of class "deMorgan" produced with negate()
factorize(negate(pCVF))
## End(Not run)
Useful function to invert the values from a categorical variable, for instance a Likert response scale.
finvert(x, levels = FALSE)
x |
A categorical variable (a factor) |
levels |
Logical, invert the levels as well |
A factor of the same length as the original one.
Adrian Dusa
words <- c("ini", "mini", "miny", "moe")
variable <- factor(words, levels = words)
# inverts the value, preserving the levels
finvert(variable)
# inverts both values and levels
finvert(variable, levels = TRUE)
The base function relevel() accepts a single argument "ref", which can only be a scalar and not a vector of values. frelevel() accepts more (even all) levels and reorders them.
frelevel(variable, levels)
variable |
The categorical variable of interest |
levels |
One or more levels of the factor, in the desired order |
A factor of the same length as the initial one.
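Reordering several levels at once can be sketched in base R by rebuilding the factor with an explicit levels vector. The helper relevel_many() is hypothetical, and placing the unmentioned levels after the requested ones is an assumption made for this illustration.

```r
words <- c("ini", "mini", "miny", "moe")
variable <- factor(words, levels = words)

# Hypothetical helper: requested levels first, remaining ones after
relevel_many <- function(x, lv) {
  factor(x, levels = c(lv, setdiff(levels(x), lv)))
}

res <- relevel_many(variable, c("moe", "ini"))
levels(res)          # the order of the levels changes
as.character(res)    # the order of the values does not
```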
Adrian Dusa
words <- c("ini", "mini", "miny", "moe")
variable <- factor(words, levels = words)
# modify the order of the levels, keeping the order of the values
frelevel(variable, c("moe", "ini", "miny", "mini"))
This is a utility to be used inside a function.
getName(x, object = FALSE)
x |
String, expression to be evaluated |
object |
Logical, return the object's name |
Within a function, the argument x can be anything and it is usually evaluated as an object.
This function should be used in conjunction with the base match.call(), to obtain the original name of the object being served as an input, regardless of how it is being served.
A particular use case of this function relates to the cases when a variable within a data.frame is used. The overall name of the object (the data frame) is irrelevant, as the real object of interest is the variable.
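The role of match.call() can be seen in a small base R sketch: it recovers the text of the argument as typed by the user, from which the variable name can then be extracted. The function show_input() and the regular expression are illustrative assumptions covering only the dd$X form, not the package's parsing rules.

```r
# Recover the unevaluated text of the argument
show_input <- function(x) {
  deparse(match.call()$x)
}

dd <- data.frame(X = 1:5, Y = 1:5)
txt <- show_input(dd$X)        # the call text, not the values

# Keep only what follows the $ sign, i.e. the variable name
var <- sub("^.*\\$", "", txt)
```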
A character vector of length 1.
Adrian Dusa
foo <- function(x) {
    funargs <- sapply(match.call(), deparse)[-1]
    return(getName(funargs[1]))
}
dd <- data.frame(X = 1:5, Y = 1:5, Z = 1:5)
foo(dd)
# dd
foo(dd$X)
# X
foo(dd[["X"]])
# X
foo(dd[[c("X", "Y")]])
# X Y
foo(dd[, 1])
# X
foo(dd[, 2:3])
# Y Z
Produces colors from the HCL (Hue Chroma Luminance) spectrum, based on the number of levels from a factor.
hclr(x, starth = 25, c = 50, l = 75, alpha = 1, fixup = TRUE)
x |
Number of factor levels, or the factor itself, or a frequency distribution from a factor |
starth |
Starting point for the hue (in the interval 0 - 360) |
c |
chroma - color purity, small values produce dark and high values produce bright colors |
l |
color luminance - a number between 0 and 100 |
alpha |
color transparency, where 0 is a completely transparent color, up to 1 |
fixup |
logical, corrects the RGB values to produce a realistic color |
Any value of h outside the interval 0 - 360 is constrained to this interval using modulo values. For instance, 410 is constrained to 50 = 410 %% 360.
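The modulo constraint can be tried directly in R, together with the base function grDevices::hcl(). Spreading the hues evenly around the circle starting from 25 is an assumption made for this sketch; it is not taken from the hclr() source.

```r
410 %% 360                     # 50: the hue wraps around the circle

# Assumed scheme: n evenly spaced hues, starting at 25
n <- 5
hues <- (25 + (seq_len(n) - 1) * 360 / n) %% 360
cols <- grDevices::hcl(h = hues, c = 50, l = 75)
cols
```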
The RGB code for the corresponding HCL colors.
Adrian Dusa
aa <- sample(letters[1:5], 100, replace = TRUE)
hclr(aa)
# same with
hclr(5)
# or
hclr(table(aa))
Evaluate an R expression in an environment constructed from data.
inside(data, expr, ...)
## S3 method for class 'list'
inside(data, expr, keepAttrs = TRUE, ...)
data |
Data to use for constructing an environment a |
expr |
Expression to evaluate, often a “compound” expression, i.e., of the form { a <- somefun() b <- otherfun() ..... rm(unused1, temp) } |
keepAttrs |
For the |
... |
Arguments to be passed to (future) methods. |
This is a modified version of the base R function within(), with exactly the same arguments and functionality but one fundamental difference: instead of returning a modified copy of the input data, this function alters the data directly.
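For contrast, this is how the base function within() behaves: the input is left untouched and the result has to be assigned back explicitly. The sketch uses the built-in mtcars dataset and base R only.

```r
mt <- mtcars

# within() returns a modified copy and leaves 'mt' as it was
res <- within(mt, hwratio <- hp / wt)
ncol(mt)     # still the original number of columns
ncol(res)    # one more column

# with base R, the change only sticks after re-assignment
mt <- within(mt, hwratio <- hp / wt)
```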
Adrian Dusa
mt <- mtcars
inside(mt, hwratio <- hp/wt)
dim(mtcars)
dim(mt)
These functions interpret an expression written in sum of products (SOP) or in canonical disjunctive normal form (DNF), for both crisp and multivalue notations.
The function compute() calculates set membership scores based on a SOP expression applied to a calibrated data set (see function calibrate() from package QCA), while the function translate() translates a SOP expression into a matrix form.
The function simplify() transforms a SOP expression into a simpler equivalent, through a process of Boolean minimization. The package uses the function minimize() from package QCA, so users are highly encouraged to install and load that package, despite it not being present in the Imports field (due to circular dependency issues).
Function expand() performs a Quine expansion to the complete DNF, or a partial expansion to a SOP expression with equally complex terms.
Function asSOP() returns a SOP expression from a POS (product of sums) expression. This function is different from the function invert(), which also negates each causal condition.
Function mvSOP() coerces an expression from crisp set notation to multi-value notation.
asSOP(expression = "", snames = "", noflevels = NULL)
compute(expression = "", data = NULL, separate = FALSE, ...)
expand(expression = "", snames = "", noflevels = NULL, partial = FALSE,
       implicants = FALSE, ...)
mvSOP(expression = "", snames = "", data = NULL, keep.tilde = TRUE, ...)
simplify(expression = "", snames = "", noflevels = NULL, ...)
translate(expression = "", snames = "", noflevels = NULL, data = NULL, ...)
expression |
String, a SOP expression. |
data |
A dataset with binary cs, mv and fs data. |
separate |
Logical, perform computations on individual, separate paths. |
snames |
A string containing the sets' names, separated by commas. |
noflevels |
Numerical vector containing the number of levels for each set. |
partial |
Logical, perform a partial Quine expansion. |
implicants |
Logical, return an expanded matrix in the implicants space. |
keep.tilde |
Logical, preserves the tilde sign when coercing a factor level |
... |
Other arguments, mainly for backwards compatibility. |
An expression written in sum of products (SOP) is a "union of intersections", for example A*B + B*~C. The disjunctive normal form (DNF) is also a sum of products, with the restriction that each product has to contain all literals. The equivalent DNF expression is: A*B*~C + A*B*C + ~A*B*~C.
The same expression can be written in multivalue notation: A[1]*B[1] + B[1]*C[0].
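The equivalence between the SOP expression A*B + B*~C and its DNF expansion can be confirmed with a truth table in base R. This is a verification sketch only; it does not reproduce how expand() derives the DNF.

```r
# All 8 combinations of three binary conditions
tt <- expand.grid(
  A = c(FALSE, TRUE), B = c(FALSE, TRUE), C = c(FALSE, TRUE)
)

sop <- with(tt, (A & B) | (B & !C))
dnf <- with(tt, (A & B & !C) | (A & B & C) | (!A & B & !C))

same <- identical(sop, dnf)
covered <- sum(sop)      # number of truth table rows covered
```

Three rows are covered, one for each product term of the DNF.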
Expressions can contain multiple values for the same condition, separated by a comma. If B was a multivalue causal condition, an expression could be: A[1] + B[1,2]*C[0].
Whether crisp or multivalue, expressions are treated as Boolean. In this last example, all values in B equal to either 1 or 2 will be converted to 1, and the rest of the (multi)values will be converted to 0.
Negating a multivalue condition requires a known number of levels (see examples below). Intersections between multiple levels of the same condition are possible. For a causal condition with 3 levels (0, 1 and 2) the expression ~A[0,2]*A[1,2] is equivalent to A[1], while A[0]*A[1] results in the empty set.
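The two results above follow from plain set operations on the levels of the condition, which can be reproduced in base R. The object names are illustrative only.

```r
levels_A <- 0:2                          # a condition with 3 levels

# ~A[0,2] leaves the complement of {0, 2} within the known levels
not_A02 <- setdiff(levels_A, c(0, 2))

# ~A[0,2] * A[1,2]: intersection of {1} with {1, 2}
both <- intersect(not_A02, c(1, 2))

# A[0] * A[1]: no level can be 0 and 1 at the same time
empty <- intersect(0, 1)
```

This also shows why the number of levels has to be known: without levels_A, the complement in the first step could not be computed.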
The number of levels, as well as the set names can be automatically detected from a dataset via the argument data. When specified, arguments snames and noflevels have precedence over data.
The product operator * should always be used, but it can be omitted when the data is multivalue (where product terms are separated by curly brackets), and/or when the set names are single letters (for example AD + B~C), and/or when the set names are provided via the argument snames.
When expressions are simplified, their simplest equivalent can result in the empty set, if the conditions cancel each other out.
The function mvSOP() assumes binary crisp conditions in the expression, except for categorical data used as multi-value conditions. The factor levels are read directly from the data, and they should be unique across all conditions.
For the function compute(), a vector of set membership values.
For function simplify(), a character expression.
For the function translate(), a matrix containing the implicants on the rows and the set names on the columns, with the following codes:
0 | absence of a causal condition |
1 | presence of a causal condition |
-1 | causal condition was eliminated |
The matrix was also assigned a class "translate", to avoid printing the -1 codes when signaling a minimized condition. The mode of this matrix is character, to allow printing multiple levels in the same cell, such as "1,2".
For function expand(), a character expression or a matrix of implicants.
Adrian Dusa
Ragin, C.C. (1987) The Comparative Method: Moving beyond Qualitative and Quantitative Strategies. Berkeley: University of California Press.
# -----
# for compute()
## Not run: 
# make sure the package QCA is loaded
library(QCA)
compute(DEV*~IND + URB*STB, data = LF)
# calculating individual paths
compute(DEV*~IND + URB*STB, data = LF, separate = TRUE)
## End(Not run)
# -----
# for simplify(), also make sure the package QCA is loaded
simplify(asSOP("(A + B)(A + ~B)"))
# result is "A"
# works even without the quotes
simplify(asSOP((A + B)(A + ~B)))
# result is "A"
# but to avoid confusion POS expressions are more clear when quoted
# to force a certain order of the set names
simplify("(URB + LIT*~DEV)(~LIT + ~DEV)", snames = c(DEV, URB, LIT))
# multilevel conditions can also be specified (and negated)
simplify("(A[1] + ~B[0])(B[1] + C[0])", snames = c(A, B, C), noflevels = c(2, 3, 2))
# Ragin's (1987) book presents the equation E = SG + LW as the result
# of the Boolean minimization for the ethnic political mobilization.
# intersecting the reactive ethnicity perspective (R = ~L~W)
# with the equation E (page 144)
simplify("~L~W(SG + LW)", snames = c(S, L, W, G))
# [1] "S~L~WG"
# resources for size and wealth (C = SW) with E (page 145)
simplify("SW(SG + LW)", snames = c(S, L, W, G))
# [1] "SWG + SLW"
# and factorized
factorize(simplify("SW(SG + LW)", snames = c(S, L, W, G)))
# F1: SW(G + L)
# developmental perspective (D = L~G) and E (page 146)
simplify("L~G(SG + LW)", snames = c(S, L, W, G))
# [1] "LW~G"
# subnations that exhibit ethnic political mobilization (E) but were
# not hypothesized by any of the three theories (page 147)
# ~H = ~(~L~W + SW + L~G) = GL~S + GL~W + G~SW + ~L~SW
simplify("(GL~S + GL~W + G~SW + ~L~SW)(SG + LW)", snames = c(S, L, W, G))
# -----
# for translate()
translate(A + B*C)
# same thing in multivalue notation
translate(A[1] + B[1]*C[1])
# tilde as a standard negation (note the condition "b"!)
translate(~A + b*C)
# and even for multivalue variables
# in multivalue notation, the product sign * is redundant
translate(C[1] + T[2] + T[1]*V[0] + C[0])
# negation of multivalue sets requires the number of levels
translate(~A[1] + ~B[0]*C[1], snames = c(A, B, C), noflevels = c(2, 2, 2))
# multiple values can be specified
translate(C[1] + T[1,2] + T[1]*V[0] + C[0])
# or even negated
translate(C[1] + ~T[1,2] + T[1]*V[0] + C[0], snames = c(C, T, V), noflevels = c(2,3,2))
# if the expression does not contain the product sign *
# snames are required to complete the translation
translate(AaBb + ~CcDd, snames = c(Aa, Bb, Cc, Dd))
# to print _all_ codes from the standard output matrix
(obj <- translate(A + ~B*C))
print(obj, original = TRUE)
# also prints the -1 code
# -----
# for expand()
expand(~AB + B~C)
# S1: ~AB~C + ~ABC + AB~C
expand(~AB + B~C, snames = c(A, B, C, D))
# S1: ~AB~C~D + ~AB~CD + ~ABC~D + ~ABCD + AB~C~D + AB~CD
# In implicants form:
expand(~AB + B~C, snames = c(A, B, C, D), implicants = TRUE)
#      A B C D
# [1,] 1 2 1 1    ~AB~C~D
# [2,] 1 2 1 2    ~AB~CD
# [3,] 1 2 2 1    ~ABC~D
# [4,] 1 2 2 2    ~ABCD
# [5,] 2 2 1 1    AB~C~D
# [6,] 2 2 1 2    AB~CD
This function takes two or more SOP expressions (combinations of conjunctions and disjunctions) or even entire minimization objects, and finds their intersection.
intersection(..., snames = "", noflevels)
... |
One or more expressions, combined with / or minimization objects
of class |
snames |
A string containing the sets' names, separated by commas. |
noflevels |
Numerical vector containing the number of levels for each set. |
The initial aim of this function was to provide a software implementation of the intersection examples presented by Ragin (1987: 144-147). That type of example can also be performed with the function simplify(), while this function is now mainly used in conjunction with the modelFit() function from package QCA, to assess the intersection between theory and a QCA model.
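The intersections from Ragin's book can be cross-checked with a truth table in base R. The sketch below verifies two of the results quoted in the examples of this manual (pages 144 and 145); it only checks logical equivalence and does not show how the simplified expression is obtained.

```r
# All 16 combinations of the four conditions
tt <- expand.grid(
  S = c(FALSE, TRUE), L = c(FALSE, TRUE),
  W = c(FALSE, TRUE), G = c(FALSE, TRUE)
)

E <- with(tt, (S & G) | (L & W))           # E = SG + LW

# reactive ethnicity: ~L~W intersected with E gives S~L~WG
R <- with(tt, !L & !W)
check_R <- identical(R & E, with(tt, S & !L & !W & G))

# resources: SW intersected with E gives SWG + SLW
C <- with(tt, S & W)
check_C <- identical(C & E, with(tt, (S & W & G) | (S & L & W)))
```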
Irrespective of the input type (character expressions and / or minimization objects), this function is now a wrapper to the main simplify() function (which only accepts character expressions).
It can deal with any kind of expressions, but multi-value crisp conditions need additional information about their number of levels, via the argument noflevels.
The expressions can be formulated in terms of either lower case - upper case notation for the absence and the presence of the causal condition, or use the tilde notation (see examples below). Usage of either of these is automatically detected, as long as all expressions use the same notation.
If the snames
argument is provided, the result is sorted according to the order
of the causal conditions (set names) in the original dataset, otherwise it sorts the causal
conditions in alphabetical order.
For minimization objects of class "QCA_min", the number of levels and the set names are automatically detected.
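As a rough illustration of what the intersection of two SOP expressions means, the sketch below uses only base R to enumerate every combination of four binary conditions and to check where both expressions hold. It does not call intersection() and does not reproduce its output format; the object names (tt, theory, model) are illustrative assumptions.

```r
# all 16 combinations of presence (TRUE) / absence (FALSE) for S, L, W, G
tt <- expand.grid(S = c(FALSE, TRUE), L = c(FALSE, TRUE),
                  W = c(FALSE, TRUE), G = c(FALSE, TRUE))

theory <- with(tt, !L & !W)             # ~L~W
model  <- with(tt, (S & G) | (L & W))   # SG + LW

# the intersection holds in the rows where both expressions are TRUE
both <- theory & model

# here it coincides with the single product S~L~WG
product <- with(tt, S & !L & !W & G)
identical(both, product)
```

Enumerating the truth table grows exponentially with the number of conditions, so this is only meant to make the logic visible on a small example.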
Adrian Dusa
Ragin, Charles C. 1987. The Comparative Method: Moving beyond Qualitative and Quantitative Strategies. Berkeley: University of California Press.
# using minimization objects
## Not run:
library(QCA) # if not already loaded
ttLF <- truthTable(LF, outcome = "SURV", incl.cut = 0.8)
pLF <- minimize(ttLF, include = "?")

# for example the intersection between the parsimonious model and
# a theoretical expectation
intersection(pLF, DEV*STB)

# negating the model
intersection(negate(pLF), DEV*STB)
## End(Not run)

# -----
# in Ragin's (1987) book, the equation E = SG + LW is the result
# of the Boolean minimization for the ethnic political mobilization.

# intersecting the reactive ethnicity perspective (R = lw)
# with the equation E (page 144)
intersection(~L~W, SG + LW, snames = c(S, L, W, G))

# resources for size and wealth (C = SW) with E (page 145)
intersection(SW, SG + LW, snames = c(S, L, W, G))

# and factorized
factorize(intersection(SW, SG + LW, snames = c(S, L, W, G)))

# developmental perspective (D = L~G) and E (page 146)
intersection(L~G, SG + LW, snames = c(S, L, W, G))

# subnations that exhibit ethnic political mobilization (E) but were
# not hypothesized by any of the three theories (page 147)
# ~H = ~(~L~W + SW + L~G)
intersection(negate(~L~W + SW + L~G), SG + LW, snames = c(S, L, W, G))
Functions to negate a DNF/SOP expression, or to invert a SOP to a negated POS or a POS to a negated SOP.
negate(input, snames = "", noflevels, simplify = TRUE, ...)
invert(input, snames = "", noflevels)
input |
A string representing a SOP expression, or a minimization
object of class |
snames |
A string containing the sets' names, separated by commas. |
noflevels |
Numerical vector containing the number of levels for each set. |
simplify |
Logical, allow users to choose between the raw negation or its simplest form. |
... |
Other arguments (mainly for backwards compatibility). |
In Boolean algebra, there are two transformation rules named after the British mathematician Augustus De Morgan. These rules state that:
1. The complement of the union of two sets is the intersection of their complements.
2. The complement of the intersection of two sets is the union of their complements.
In "normal" language, these would be written as:
1. not (A or B) = (not A) and (not B)
2. not (A and B) = (not A) or (not B)
Based on these two laws, any Boolean expression written in disjunctive normal form can be transformed into its negation.
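Both laws can be checked by brute force over a small truth table. The base R sketch below is only an illustration (it does not call negate()); it also applies the laws to the expression AC + B~C and compares the raw negation with one shorter, logically equivalent form. The object names are assumptions made for the example.

```r
tt <- expand.grid(A = c(FALSE, TRUE), B = c(FALSE, TRUE), C = c(FALSE, TRUE))

# complement of a union = intersection of the complements
law1 <- with(tt, identical(!(A | B), !A & !B))
# complement of an intersection = union of the complements
law2 <- with(tt, identical(!(A & B), !A | !B))

# negating AC + B~C with the two laws gives (~A + ~C)(~B + C)
direct <- with(tt, !((A & C) | (B & !C)))
raw    <- with(tt, (!A | !C) & (!B | C))
# a shorter, logically equivalent form: ~AC + ~B~C
short  <- with(tt, (!A & C) | (!B & !C))

c(law1, law2, identical(direct, raw), identical(raw, short))
```

identical() is used instead of == because in R the unary ! has lower precedence than the comparison operators, which makes expressions such as !x == y easy to misread.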
It is also possible to negate all models and solutions from the result of a Boolean minimization from function minimize() in package QCA. The resulting object, of class "qca", is automatically recognised by this function.
In a SOP expression, the products should normally be split by using a star * sign, otherwise the sets' names will be considered the individual letters in alphabetical order, unless they are specified via snames.
To negate multilevel expressions, the argument noflevels is required.
It is entirely possible to obtain multiple negations of a single expression, since the result of the negation is passed to function simplify().
Function invert() simply transforms an expression from a sum of products (SOP) to a negated product of sums (POS), and the other way round.
A character vector when the input is a SOP expression, or a named list for minimization input objects, each component containing all possible negations of the model(s).
Adrian Dusa
Ragin, Charles C. 1987. The Comparative Method: Moving beyond Qualitative and Quantitative Strategies. Berkeley: University of California Press.
# example from Ragin (1987, p.99)
negate(AC + B~C, simplify = FALSE)

# the simplified, logically equivalent negation
negate(AC + B~C)

# with different intersection operators
negate(AB*EF + ~CD*EF)

# invert to POS
invert(a*b + ~c*d)

## Not run:
# using an object of class "qca" produced with minimize()
# from package QCA
library(QCA)
cLC <- minimize(LC, outcome = SURV)
negate(cLC)

# parsimonious solution
pLC <- minimize(LC, outcome = SURV, include = "?")
negate(pLC)
## End(Not run)
Check if one number is greater / lower than (or equal to) another.
agtb(a, b, bincat)
altb(a, b, bincat)
agteb(a, b, bincat)
alteb(a, b, bincat)
aeqb(a, b, bincat)
aneqb(a, b, bincat)
a |
Numerical vector |
b |
Numerical vector |
bincat |
Binary categorization values, an atomic vector of length 2 |
Not all numbers (especially the decimal ones) can be represented exactly in floating point arithmetic, and their arithmetic may not give the normal expected result.
This set of functions checks for the (in)equality between two numerical vectors a and b, with the following naming convention:
gt means a “greater than” b
lt means a “lower than” b
gte means a “greater than or equal to” b
lte means a “lower than or equal to” b
eq means a “equal to” b
neq means a “not equal to” b
The argument bincat is useful to replace the TRUE / FALSE values with custom categories.
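The floating point issue mentioned above is easy to reproduce in base R. The sketch below is not the implementation of these functions; approx_gte() and its tolerance are hypothetical, chosen only to show the idea of a tolerance-based comparison.

```r
x <- 0.1 + 0.2

exact  <- x == 0.3                    # FALSE, due to binary representation
approx <- isTRUE(all.equal(x, 0.3))   # TRUE, all.equal() uses a tolerance

# hypothetical "greater than or equal to" with a small tolerance
approx_gte <- function(a, b, tol = .Machine$double.eps^0.5) {
    a > b | abs(a - b) < tol
}

approx_gte(x, 0.3)      # TRUE
approx_gte(0.29, 0.3)   # FALSE
```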
Adrian Dusa
Goldberg, David (1991) "What Every Computer Scientist Should Know About Floating-point Arithmetic", ACM Computing Surveys vol.23, no.1, pp.5-48, doi:10.1145/103162.103163
Calculates the (maximum) number of decimals in a possibly numeric vector.
numdec(x, each = FALSE, na.rm = TRUE, maxdec = 15)
x |
A vector of values |
each |
Logical, return the result for each value in the vector |
na.rm |
Logical, ignore missing values |
maxdec |
Maximal number of decimals to count |
Adrian Dusa
x <- c(12, 12.3, 12.34)

numdec(x) # 2

numdec(x, each = TRUE) # 0, 1, 2

x <- c("-.1", " 2.75 ", "12", "B", NA)

numdec(x) # 2

numdec(x, each = TRUE) # 1, 2, 0, NA, NA
Coerces objects to class "numeric", and checks if an object is numeric.
asNumeric(x, ...)
possibleNumeric(x, each = FALSE)
wholeNumeric(x, each = FALSE)
x |
A vector of values |
each |
Logical, return the result for each value in the vector |
... |
Other arguments to be passed for class based methods |
Unlike the function as.numeric() from the base package, the function asNumeric() coerces to numeric without a warning if any values are not numeric. All such values are treated as missing (NA).
This is a generic function, with specific class methods for factors and objects of class “declared”. The usual way of coercing factors to numeric is meaningless, since it converts the internal storage codes. The class method of this particular function coerces the levels to numeric, via the argument levels, which is activated by default.
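The difference between the internal codes of a factor and its levels can be seen with base R alone. This is only a sketch of the underlying issue, not of the asNumeric() method itself.

```r
f <- factor(c(3, 2, "a"))

# internal storage codes: positions in the sorted levels "2", "3", "a"
codes <- as.numeric(f)
codes   # 2 1 3

# coercing the labels instead; "a" is not numeric and becomes NA
# (base R emits a warning here, which is suppressed for the example)
from_levels <- suppressWarnings(as.numeric(as.character(f)))
from_levels   # 3 2 NA
```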
For objects of class “declared”, a similar argument called na_values is activated by default, to coerce the declared missing values to numeric.
The function possibleNumeric() tests if the values in a vector are possibly numeric, irrespective of whether they are stored as character or as numbers. In the case of factors, it tests the levels representation.
Function wholeNumeric() tests if numbers in a vector are whole (round) numbers. Whole numbers are different from “integer” numbers (which have a special memory representation), and consequently the function is.integer() tests something different: how numbers are stored in memory (see the description of function double() for more details).
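The distinction between "stored as integer" and "whole number" can be sketched in base R. The helper is_whole() below is hypothetical (it follows the tolerance idea shown in the base documentation for is.integer()) and is not the code behind wholeNumeric().

```r
# storage type versus value
stored_int <- c(is.integer(1), is.integer(1L))   # FALSE TRUE

# hypothetical check for whole values, allowing a small tolerance
is_whole <- function(x, tol = .Machine$double.eps^0.5) {
    abs(x - round(x)) < tol
}

whole <- is_whole(c(1, 1.1, 2))   # TRUE FALSE TRUE
```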
The function
Adrian Dusa
x <- c("-.1", " 2.7 ", "B")
asNumeric(x) # no warning

f <- factor(c(3, 2, "a"))
asNumeric(f)
asNumeric(f, levels = FALSE)

possibleNumeric(x) # FALSE
possibleNumeric(x, each = TRUE) # TRUE TRUE FALSE
possibleNumeric(c("1", 2, 3)) # TRUE

is.integer(1) # FALSE

# Signaling an integer in R
is.integer(1L) # TRUE

wholeNumeric(1) # TRUE
wholeNumeric(c(1, 1.1), each = TRUE) # TRUE FALSE
Utility function to overwrite an object, and bypass the assignment operator.
overwrite(objname, content, environment)
objname |
Character, the name of the object to overwrite. |
content |
An R object |
environment |
The environment where to perform the overwrite procedure. |
This function does not return anything.
Adrian Dusa
foo <- function(object, x) {
    objname <- deparse(substitute(object))
    object <- x
    overwrite(objname, object, parent.frame())
}

bar <- 1
foo(bar, 2)
bar
# [1] 2

bar <- list(A = bar)
foo(bar$A, 3)
bar
Generates all possible permutations of elements from a vector.
permutations(x)
x |
Any kind of vector. |
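For readers who want to see the idea behind generating permutations, here is a minimal recursive sketch in base R. The function perms() is hypothetical and is not the algorithm used by permutations(); the row order and the shape of the result may differ from what the package returns.

```r
# hypothetical recursive helper: one permutation per row
perms <- function(x) {
    if (length(x) <= 1) return(matrix(x, nrow = 1))
    out <- NULL
    for (i in seq_along(x)) {
        # fix element i in first position, permute the remaining ones
        out <- rbind(out, cbind(x[i], perms(x[-i])))
    }
    out
}

p <- perms(1:3)
dim(p)   # 6 rows (3!) and 3 columns
```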
Adrian Dusa
permutations(1:3)
Recodes a vector (numeric, character or factor) according to a set of rules.
It is similar to the function recode() from package car, but more flexible. It also has similarities with the function findInterval() from package base.
recode(x, rules = NULL, cut = NULL, values = NULL, ...)
x |
A vector of mode numeric, character or factor. |
rules |
Character string or a vector of character strings for recoding specifications. |
cut |
A vector of one or more unique cut points. |
values |
A vector of output values. |
... |
Other parameters, for compatibility with other functions such as
|
Similar to the recode() function in package car, the recoding rules are separated by semicolons, of the form input = output, and allow for:
a single value | 1 = 0 |
a range of values | 2:5 = 1 |
a set of values | c(6,7,10) = 2 |
else | everything that is not covered by the previously specified rules |
Contrary to the recode() function in package car, this function allows the : sequence operator (even for factors), so that a rule such as c(1,3,5:7), or c(a,d,f:h) would be valid.
Actually, since all rules are specified in a string, it really doesn't matter if the c() function is used or not. For compatibility reasons it accepts it, but a simpler way to specify a set of rules is "1,3,5:7=A; else=B"
Special values lo and hi may also appear in the range of values, while else can be used with else=copy to copy all values which were not specified in the recoding rules.
In the package car, a character output would have to be quoted, like "1:2='A'", but that is not mandatory in this function: "1:2=A" would do just as well. Output values such as "NA" or "missing" are converted to NA.
Another difference from the car package: the output is not automatically converted to a factor even if the original variable is a factor. That option is left to the user, via the argument as.factor.result, which defaults to FALSE.
A major difference is the treatment of the values not present in the recoding rules. By default, package car copies all those values in the new object, whereas in this package the default values are NA and new values are added only if they are found in the rules. Users can choose to copy all other values not present in the recoding rules, by specifically adding else=copy in the rules.
Since the two functions have the same name, users loading both packages may end up using one instead of the other (depending on which package is loaded first). In order to preserve functionality and minimize possible namespace collisions with package car, special efforts have been invested to ensure perfect compatibility with the other recode() function (plus more).
The argument ... allows for more arguments specific to the car package, such as as.factor.result, as.numeric.result. In addition, it also accepts levels, labels and ordered, specific to function factor() in package base. When using the arguments levels and / or labels, the output will automatically be coerced to a factor, unless the argument values is used, as indicated below.
Blank spaces outside category labels are ignored, see the last example.
It is possible to use recode() in a similar way to function cut(), by specifying a vector of cut points. For any number c of such cut points, there should be c + 1 values.
If not otherwise specified, the argument values is automatically constructed as a sequence of numbers from 1 to c + 1.
Unlike the function cut(), arguments such as include.lowest or right are not necessary because the final outcome can be changed by tweaking the cut values.
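Since the description mentions findInterval(), the cut-point logic can be approximated with it in base R. The sketch below is an assumption about the interval convention: it treats each cut point as belonging to the lower interval, which is inferred only from the documented example output (where 55 with cut points 35 and 55 is coded 2), not from a stated rule.

```r
x <- c(45, 39, 26, 22, 55, 33, 21, 87)
cuts <- c(35, 55)

# left.open = TRUE makes the intervals (-Inf, 35], (35, 55], (55, Inf)
codes <- findInterval(x, cuts, left.open = TRUE) + 1
codes   # 2 2 1 1 2 1 1 3
```

With two cut points there are three output codes, in line with the c + 1 rule.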
If both arguments values and labels are provided, the labels are going to be stored as an attribute.
Adrian Dusa
x <- rep(1:3, 3)
# [1] 1 2 3 1 2 3 1 2 3

recode(x, "1:2 = A; else = B")
# [1] "A" "A" "B" "A" "A" "B" "A" "A" "B"

recode(x, "1:2 = 0; else = copy")
# [1] 0 0 3 0 0 3 0 0 3

set.seed(1234)
x <- sample(18:90, 20, replace = TRUE)
# [1] 45 39 26 22 55 33 21 87 31 73 79 21 21 38 57 73 84 22 83 64

recode(x, cut = "35, 55")
# [1] 2 2 1 1 2 1 1 3 1 3 3 1 1 2 3 3 3 1 3 3

set.seed(1234)
x <- factor(sample(letters[1:10], 20, replace = TRUE), levels = letters[1:10])
# [1] j f e i e f d b g f j f d h d d e h d h
# Levels: a b c d e f g h i j

recode(x, "b:d = 1; g:hi = 2; else = NA") # note the "hi" special value
# [1] 2 NA NA 2 NA NA 1 1 2 NA 2 NA 1 2 1 1 NA 2 1 2

recode(x, "a, c:f = A; g:hi = B; else = C", labels = "A, B, C")
# [1] B A A B A A A C B A B A A B A A A B A B
# Levels: A B C

recode(x, "a, c:f = 1; g:hi = 2; else = 3",
       labels = c("one", "two", "three"), ordered = TRUE)
# [1] two one one two one one one three two one
# [11] two one one two one one one two one two
# Levels: one < two < three

set.seed(1234)
categories <- c("An", "example", "that has", "spaces")
x <- factor(sample(categories, 20, replace = TRUE),
            levels = categories, ordered = TRUE)

sort(x)
# [1] An An An example example example example
# [8] example example example example that has that has that has
# [15] spaces spaces spaces spaces spaces spaces
# Levels: An < example < that has < spaces

recode(sort(x), "An : that has = 1; spaces = 2")
# [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2

# single quotes work, but are not necessary
recode(sort(x), "An : 'that has' = 1; spaces = 2")
# [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2

# same using cut values
recode(sort(x), cut = "that has")
# [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2

# modifying the output values
recode(sort(x), cut = "that has", values = 0:1)
# [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1

# more treatment of "else" values
x <- 10:20

# recoding rules don't overlap all existing values, the rest are empty
recode(x, "8:15 = 1")
# [1] 1 1 1 1 1 1 NA NA NA NA NA

# all other values copied
recode(x, "8:15 = 1; else = copy")
# [1] 1 1 1 1 1 1 16 17 18 19 20
Utility function based on substitute(), to recover an unquoted input.
recreate(x, snames = NULL, ...)
x |
A substituted input. |
snames |
A character string containing set names. |
... |
Other arguments, mainly for internal use. |
This function is especially useful when users have to provide lots of quoted inputs, such as the name of the columns from a data frame to be considered for a particular function.
This is actually one of the main uses of the base function substitute(), but here it can be employed to also detect SOP (sum of products) expressions, explained for instance in function translate().
Such SOP expressions are usually used in contexts of sufficiency and necessity, which are indicated with the usual signs -> and <-. These are both allowed by the R parser, indicating standard assignment. Due to R's internal parsing system, a sufficiency expression using -> is automatically flipped to a necessity statement <- with reversed LHS and RHS, but this function is able to determine what is the expression and what is the output.
The other necessity code <= is also recognized, but the equivalent sufficiency code => is not allowed in unquoted expressions.
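The parser behaviour described above can be observed directly in base R, without this package. The sketch only inspects what the R parser produces; how recreate() then recovers the original direction is not shown here.

```r
# the right assignment is stored by the parser as a left assignment
e <- quote(A + B -> Y)

flipped <- deparse(e)             # "Y <- A + B"
op      <- as.character(e[[1]])   # "<-"
lhs     <- deparse(e[[2]])        # "Y"
rhs     <- deparse(e[[3]])        # "A + B"

# "<=" parses as an ordinary comparison operator
le <- as.character(quote(Y <= A)[[1]])   # "<="

# "=>" is not valid R syntax, so it cannot appear unquoted
bad <- tryCatch(parse(text = "A => Y"), error = function(err) err)
inherits(bad, "error")   # TRUE
```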
A quoted, equivalent expression or a substituted object.
Adrian Dusa
recreate(substitute(A + ~B*C))

foo <- function(x, ...) recreate(substitute(list(...)))
foo(arg1 = 3, arg2 = A + ~B*C)

df <- data.frame(A = 1, B = 2, C = 3, Y = 4)

# substitute from the global environment
# the result is the builtin C() function
res <- recreate(substitute(C))
is.function(res) # TRUE

# search first within the column name space from df
recreate(substitute(C), colnames(df)) # "C"

# necessity well recognized
recreate(substitute(A <- B))

# but sufficiency is flipped
recreate(substitute(A -> B))

# more complex SOP expressions are still recovered
recreate(substitute(A + ~B*C -> Y))
Provides an improved method to replace strings, compared to function gsub() in package base.
replaceText( expression = "", target = "", replacement = "", protect = "", boolean = FALSE, ...)
expression |
Character string, usually a SOP - sum of products expression. |
target |
Character vector or a string containing the text to be replaced. |
replacement |
Character vector or a string containing the text to replace with. |
protect |
Character vector or a string containing the text to protect. |
boolean |
Treat characters in a boolean way, using upper and lower case letters. |
... |
Other arguments, from and to other functions. |
If the input expression is "J*JSR", and the task is to replace "J" with "A" and "JSR" with "B", function gsub() is not very useful since the letter "J" is found in multiple places, including the second target.
This function finds the exact location(s) of each target in the input string, starting with those having the largest number of characters, making sure the locations are unique. For instance, the target "JSR" is found on the location from 3 to 5, while the target "J" is found on two locations, 1 and 3, but 3 was already identified in the previously found location for the larger target.
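The problem with sequential gsub() calls, and the benefit of treating the longest target first, can be sketched in base R. The helper replace_longest_first() is hypothetical and uses a different technique from the location matching described above: it swaps targets for temporary placeholder characters (assumed not to occur in the input) before inserting the replacements.

```r
# naive approach: "J" is replaced inside "JSR" before "JSR" is tried
naive <- gsub("JSR", "B", gsub("J", "A", "J*JSR", fixed = TRUE), fixed = TRUE)
naive   # "A*ASR"

# hypothetical helper: longest targets first, via control-character placeholders
replace_longest_first <- function(expression, target, replacement) {
    ord <- order(nchar(target), decreasing = TRUE)
    for (i in seq_along(ord)) {
        expression <- gsub(target[ord[i]], intToUtf8(i), expression, fixed = TRUE)
    }
    for (i in seq_along(ord)) {
        expression <- gsub(intToUtf8(i), replacement[ord[i]], expression, fixed = TRUE)
    }
    expression
}

fixed <- replace_longest_first("J*JSR", c("J", "JSR"), c("A", "B"))
fixed   # "A*B"
```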
In addition, this function can also deal with target strings containing spaces.
The original string, replacing the target text with its replacement.
Adrian Dusa
replaceText("J*JSR", "J, JSR", "A, B")

# same output, on input expressions containing spaces
replaceText("J*JS R", "J, JS R", "A, B")

# works even with Boolean expressions, where lower case
# letters signal the absence of the causal condition
replaceText("DEV + urb*LIT", "DEV, URB, LIT", "A, B, C", boolean = TRUE)
Functions to read and write to the system's clipboard, for copy/paste operations.
scan.clipboard(...)
write.clipboard(x)
x |
Object to be written to the clipboard |
... |
Same arguments that are used in the base function |
Adrian Dusa
Checks and changes expressions containing set negations using a tilde.
hastilde(x)
notilde(x)
tilde1st(x)
x |
A vector of values |
Boolean expressions can be negated in various ways. For binary crisp and fuzzy sets, one of the most straightforward ways to invert the set membership scores is to subtract them from 1. This is both possible using R vectors and also often used to signal a negation in SOP (sum of products) expressions.
Some other times, SOP expressions can signal a set negation (also known as the absence of a causal condition) by using lower case letters, while upper case letters are used to signal the presence of a causal condition. SOP expressions also use a tilde to signal a set negation, immediately preceding the set name.
These functions detect whether a set present in a SOP expression contains a tilde (function hastilde) and whether the entire expression begins with a tilde (function tilde1st).
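The two ways of signalling a negation discussed above, subtracting membership scores from 1 and prefixing a set name with a tilde, can be illustrated with base R. The calls below are rough analogues chosen for the example; they are assumptions and not the implementation of hastilde(), notilde() or tilde1st().

```r
# negating membership scores by subtraction from 1
A <- c(0, 0.25, 1)
notA <- 1 - A   # 1 0.75 0

# detecting and stripping the tilde with base string functions
x <- c("~A", "A*~B", "A*B")

has   <- grepl("~", x, fixed = TRUE)    # TRUE TRUE FALSE
first <- startsWith(x, "~")             # TRUE FALSE FALSE
bare  <- gsub("~", "", x, fixed = TRUE) # "A" "A*B" "A*B"
```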
Adrian Dusa
hastilde("~A")
This function combines the base functions tryCatch() and withCallingHandlers() for the specific purpose of capturing not only errors and warnings but messages as well.
tryCatchWEM(expr, capture = FALSE)
expr |
Expression to be evaluated. |
capture |
Logical, capture the visible output. |
In some situations it might be important not only to test a function, but also to capture everything that is written in the R console, be it an error, a warning or simply a message.
For instance package QCA (version 3.4) has a Graphical User Interface that simulates an R console embedded into a web based shiny app.
It is not intended to replace function tryCatch() in any way, especially not evaluating an expression before returning or exiting; it simply captures everything that is printed on the console (the visible output).
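The general pattern of combining tryCatch() with withCallingHandlers() can be sketched in base R as follows. The function capture_wem() is hypothetical and much simpler than tryCatchWEM(): it has no capture argument and the structure of the returned list is an assumption made for illustration.

```r
capture_wem <- function(expr) {
    out <- list()
    tryCatch(
        withCallingHandlers(
            expr,
            warning = function(w) {
                # record the warning, then continue the evaluation
                out$warning <<- c(out$warning, conditionMessage(w))
                invokeRestart("muffleWarning")
            },
            message = function(m) {
                # record the message, then continue the evaluation
                out$message <<- c(out$message, conditionMessage(m))
                invokeRestart("muffleMessage")
            }
        ),
        # an error stops the evaluation; only its text is kept
        error = function(e) out$error <<- conditionMessage(e)
    )
    if (length(out) == 0) NULL else out
}

res <- capture_wem({
    warning("careful")
    message("note")
    stop("failed")
})
names(res)   # "warning" "message" "error"
```

The calling handlers let evaluation resume after a warning or a message, which is why all three conditions end up in the same result.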
A list, if anything would be printed on the screen, or an empty (NULL) object otherwise.
Adrian Dusa
A function almost identical to the base function with(), but allowing to evaluate the expression in every subset of a split file.
using(data, expr, split.by = NULL, ...)
data |
A data frame. |
expr |
Expression to evaluate |
split.by |
A factor variable from the |
... |
Other internal arguments. |
A list of results, or a matrix if each separate result is a vector.
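For comparison, the same kind of split evaluation can be written with base R only, by combining split() with with(). This is a sketch on a small made-up data frame, not the implementation of using(), and the shape of the result may differ from what using() returns.

```r
DF <- data.frame(
    Gender = c("Female", "Male", "Female", "Male"),
    Age = c(20, 30, 40, 50)
)

# evaluate the expression on the whole data frame
overall <- with(DF, mean(Age))   # 35

# evaluate the same expression in every subset defined by Gender
by_gender <- sapply(split(DF, DF$Gender), function(d) with(d, mean(Age)))
by_gender   # Female 30, Male 40
```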
Adrian Dusa
set.seed(123)
DF <- data.frame(
    Area = factor(sample(c("Rural", "Urban"), 123, replace = TRUE)),
    Gender = factor(sample(c("Female", "Male"), 123, replace = TRUE)),
    Age = sample(18:90, 123, replace = TRUE),
    Children = sample(0:5, 123, replace = TRUE)
)

# table of frequencies for Gender
table(DF$Gender)

# same with
using(DF, table(Gender))

# same, but split by Area
using(DF, table(Gender), split.by = Area)

# calculate the mean age by gender
using(DF, mean(Age), split.by = Gender)

# same, but select cases from the urban area
using(subset(DF, Area == "Urban"), mean(Age), split.by = Gender)

# mean age by gender and area
using(DF, mean(Age), split.by = Area & Gender)

# same with
using(DF, mean(Age), split.by = c(Area, Gender))

# average number of children by Area
using(DF, mean(Children), split.by = Area)

# frequency tables by Area
using(DF, table(Children), split.by = Area)