Title: | Data Structures and Algorithms for Relations |
---|---|
Description: | Data structures and algorithms for k-ary relations with arbitrary domains, featuring relational algebra, predicate functions, and fitters for consensus relations. |
Authors: | David Meyer [aut], Kurt Hornik [aut, cre] , Christian Buchta [ctb] |
Maintainer: | Kurt Hornik <[email protected]> |
License: | GPL-2 |
Version: | 0.6-13 |
Built: | 2024-11-07 06:35:47 UTC |
Source: | CRAN |
Various “relational algebra”-like operations.
relation_projection(x, margin = NULL) relation_selection(x, subset) relation_cartesian(x, y, ...) relation_complement(x, y) relation_intersection(x, y, ...) relation_union(x, y, ...) relation_symdiff(x, y) relation_division(x, y) relation_remainder(x, y) relation_join(x, y, ...) relation_semijoin(x, y, ...) relation_antijoin(x, y, ...)
relation_projection(x, margin = NULL) relation_selection(x, subset) relation_cartesian(x, y, ...) relation_complement(x, y) relation_intersection(x, y, ...) relation_union(x, y, ...) relation_symdiff(x, y) relation_division(x, y) relation_remainder(x, y) relation_join(x, y, ...) relation_semijoin(x, y, ...) relation_antijoin(x, y, ...)
x , y
|
Relation objects. |
margin |
Either a character vector of domain names, or an integer vector of domain indices. |
subset |
Expression resulting in a logical vector of length equal to the number of tuples in the graph. |
... |
Relation objects for |
These functions provide functionality similar to the corresponding operations defined in relational algebra theory as introduced by Codd (1970). Note, however, that domains in database relations, unlike the concept of relations we use here, are unordered. In fact, a database relation (“table”) is defined as a set of elements called “tuples”, where the “tuple” components are named, but unordered. So in fact, a “tuple” in this sense is a set of mappings from the attribute names into the union of the attribute domains.
The projection of a relation on a specified margin (i.e., a vector of domain names or indices) is the relation obtained when all tuples are restricted to this margin. As a consequence, duplicate tuples are removed.
The selection of a relation is the relation obtained by taking a subset of the relation graph, defined by some logical expression.
The Cartesian product of two relations is obtained by basically building the Cartesian product of all graph elements, but combining the resulting pairs into single tuples.
The union of two relations simply combines the graph elements
of both relations; the complement of two relations and
removes the tuples of
from
.
The intersection (symmetric difference) of two relations is the relation with all tuples they have (do not have) in common.
The division of relation by relation
is the
reversed Cartesian product. The result is a relation with the domain
unique to
and containing the maximum number of tuples which,
multiplied by
, are contained in
. The remainder
of this operation is the complement of
and the division of
by
. Note that for both operations, the domain of
must be contained in the domain of
.
The (natural) join of two relations is their Cartesian product,
restricted to the subset where the elements of the common attributes
do match. The left/right/full outer join of two relations
and
is
the union of
/
/
and
,
and the inner join of
and
.
The implementation uses
merge()
, and so the
left/right/full outer joins are obtained by setting
all.x
/all.y
/all
to TRUE
in
relation_join()
.
The domains to be matched are specified using by
.
The left (right) semijoin of two relations and
is the join of these, projected to the attributes of
(
). Thus, it yields all tuples of
(
) participating in the join of
and
.
The left (right) antijoin of two relations and
is the complement of
(
) and the join of both,
projected to the attributes of
(
).
Thus, it yields all tuples of
(
) not participating in the join of
and
.
The operators %><%
, %=><%
, %><=%
,
%=><=%
, %|><%
, %><|%
, %|><|%
,
%|>%
, %<|%
, and %U%
can be used for the
Cartesian product, left outer join, right outer join, full outer join,
left semi-join, right semi-join, join, left antijoin, right antijoin,
and union, respectively.
E. F. Codd (1970), A relational model of data for large shared data banks. Communications of the ACM, 13/6, 377–387. doi:10.1145/362384.362685.
relation()
## projection Person <- data.frame(Name = c("Harry", "Sally", "George", "Helena", "Peter"), Age = c(34, 28, 29, 54, 34), Weight = c(80, 64, 70, 54, 80), stringsAsFactors = FALSE) Person <- as.relation(Person) relation_table(Person) relation_table(relation_projection(Person, c("Age", "Weight"))) ## selection relation_table(R1 <- relation_selection(Person, Age < 29)) relation_table(R2 <- relation_selection(Person, Age >= 34)) relation_table(R3 <- relation_selection(Person, Age == Weight)) ## union relation_table(R1 %U% R2) ## works only for the same domains: relation_table(R2 | R3) ## complement relation_table(Person - R2) ## intersection relation_table(relation_intersection(R2, R3)) ## works only for the same domains: relation_table(R2 & R3) ## symmetric difference relation_table(relation_symdiff(R2, R3)) ## Cartesian product Employee <- data.frame(Name = c("Harry", "Sally", "George", "Harriet", "John"), EmpId = c(3415, 2241, 3401, 2202, 3999), DeptName = c("Finance", "Sales", "Finance", "Sales", "N.N."), stringsAsFactors = FALSE) Employee <- as.relation(Employee) relation_table(Employee) Dept <- data.frame(DeptName = c("Finance", "Sales", "Production"), Manager = c("George", "Harriet", "Charles"), stringsAsFactors = FALSE) Dept <- as.relation(Dept) relation_table(Dept) relation_table(Employee %><% Dept) ## Natural join relation_table(Employee %|><|% Dept) ## left (outer) join relation_table(Employee %=><% Dept) ## right (outer) join relation_table(Employee %><=% Dept) ## full outer join relation_table(Employee %=><=% Dept) ## antijoin relation_table(Employee %|>% Dept) relation_table(Employee %<|% Dept) ## semijoin relation_table(Employee %|><% Dept) relation_table(Employee %><|% Dept) ## division Completed <- data.frame(Student = c("Fred", "Fred", "Fred", "Eugene", "Eugene", "Sara", "Sara"), Task = c("Database1", "Database2", "Compiler1", "Database1", "Compiler1", "Database1", "Database2"), stringsAsFactors = FALSE) Completed <- as.relation(Completed) relation_table(Completed) DBProject <- data.frame(Task = c("Database1", "Database2"), stringsAsFactors = FALSE) DBProject <- as.relation(DBProject) relation_table(DBProject) relation_table(Completed %/% DBProject) ## division remainder relation_table(Completed %% DBProject)
## projection Person <- data.frame(Name = c("Harry", "Sally", "George", "Helena", "Peter"), Age = c(34, 28, 29, 54, 34), Weight = c(80, 64, 70, 54, 80), stringsAsFactors = FALSE) Person <- as.relation(Person) relation_table(Person) relation_table(relation_projection(Person, c("Age", "Weight"))) ## selection relation_table(R1 <- relation_selection(Person, Age < 29)) relation_table(R2 <- relation_selection(Person, Age >= 34)) relation_table(R3 <- relation_selection(Person, Age == Weight)) ## union relation_table(R1 %U% R2) ## works only for the same domains: relation_table(R2 | R3) ## complement relation_table(Person - R2) ## intersection relation_table(relation_intersection(R2, R3)) ## works only for the same domains: relation_table(R2 & R3) ## symmetric difference relation_table(relation_symdiff(R2, R3)) ## Cartesian product Employee <- data.frame(Name = c("Harry", "Sally", "George", "Harriet", "John"), EmpId = c(3415, 2241, 3401, 2202, 3999), DeptName = c("Finance", "Sales", "Finance", "Sales", "N.N."), stringsAsFactors = FALSE) Employee <- as.relation(Employee) relation_table(Employee) Dept <- data.frame(DeptName = c("Finance", "Sales", "Production"), Manager = c("George", "Harriet", "Charles"), stringsAsFactors = FALSE) Dept <- as.relation(Dept) relation_table(Dept) relation_table(Employee %><% Dept) ## Natural join relation_table(Employee %|><|% Dept) ## left (outer) join relation_table(Employee %=><% Dept) ## right (outer) join relation_table(Employee %><=% Dept) ## full outer join relation_table(Employee %=><=% Dept) ## antijoin relation_table(Employee %|>% Dept) relation_table(Employee %<|% Dept) ## semijoin relation_table(Employee %|><% Dept) relation_table(Employee %><|% Dept) ## division Completed <- data.frame(Student = c("Fred", "Fred", "Fred", "Eugene", "Eugene", "Sara", "Sara"), Task = c("Database1", "Database2", "Compiler1", "Database1", "Compiler1", "Database1", "Database2"), stringsAsFactors = FALSE) Completed <- as.relation(Completed) relation_table(Completed) DBProject <- data.frame(Task = c("Database1", "Database2"), stringsAsFactors = FALSE) DBProject <- as.relation(DBProject) relation_table(DBProject) relation_table(Completed %/% DBProject) ## division remainder relation_table(Completed %% DBProject)
A data set with 16 variables on 36 different types of cetacea.
data("Cetacea")
data("Cetacea")
A data frame with 36 observations on 16 categorical variables. The
first 15 variables relate to morphology, osteology, or behavior, and
have both self-explanatory names and levels. The last (CLASS
)
gives a common zoological classification.
G. Vescia (1985). Descriptive classification of Cetacea: Whales, porpoises, and dolphins. In: J. F. Marcotorchino, J. M. Proth, and J. Janssen (eds.), Data analysis in real life environment: ins and outs of solving problems. Elsevier Science Publishers B.V.: Amsterdam, The Netherlands.
data("Cetacea") summary(Cetacea) ## Show the cetacea types grouped by class. split(rownames(Cetacea), Cetacea$CLASS)
data("Cetacea") summary(Cetacea) ## Show the cetacea types grouped by class. split(rownames(Cetacea), Cetacea$CLASS)
Determine the characteristic function of a relation.
relation_charfun(x, components = FALSE)
relation_charfun(x, components = FALSE)
x |
an object inheriting from class |
components |
a logical indicating whether the characteristic function created should take vectors (each vector corresponding to one domain) as argument, or a data frame (with the elements in the rows). In the former case, all vectors are recycled to fit the longest vector in case of binary relations. |
relation()
## Relation 'a divides b': divides <- function(a, b) b %% a == 0 R <- relation(domain = list(1 : 10, 1 : 10), charfun = divides) R ## 'Recover' characteristic function: "%|%" <- relation_charfun(R) ## Use it. 2L %|% 6L 2:4 %|% 6L 2L %|% c(2:3, 6L) ## This also works: "%|%"(2L, 6L) ## (and more generally does for arities > 2).
## Relation 'a divides b': divides <- function(a, b) b %% a == 0 R <- relation(domain = list(1 : 10, 1 : 10), charfun = divides) R ## 'Recover' characteristic function: "%|%" <- relation_charfun(R) ## Use it. 2L %|% 6L 2:4 %|% 6L 2L %|% c(2:3, 6L) ## This also works: "%|%"(2L, 6L) ## (and more generally does for arities > 2).
Choose objects based on an ensemble of relations between these.
relation_choice(x, method = "symdiff", weights = 1, control = list(), ...)
relation_choice(x, method = "symdiff", weights = 1, control = list(), ...)
x |
an ensemble of endorelations. |
method |
a character string specifying one of the built-in methods, or a function to be taken as a user-defined method. See Details for available built-in methods. |
weights |
a numeric vector with non-negative case weights.
Recycled to the number of elements in the ensemble given by |
control |
a list of control parameters. See Details. |
... |
a list of control parameters (overruling those specified
in |
A social choice function is a rule for choosing from a set
of objects, i.e., selecting suitable subsets of
.
Voting rules used in elections are the most prominent example of such
functions, which typically aggregate individual preferences (e.g., of
voters).
Choice methods "symdiff"
, "CKS"
, "PC"
and
"euclidean"
choose a given number of objects
(“winners”) by determining a relation
minimizing
over all relations for which winners are
always strictly preferred to losers, without any further constraints
on the relations between pairs of winners or pairs of losers, where
is symmetric difference (symdiff, “Kemeny-Snell”),
Cook-Kress-Seiford (CKS), generalized paired comparison, or
Euclidean dissimilarity, respectively, and
is the case
weight given to
.
For symdiff, CKS and PC choice, the
must be crisp
endorelations, and
; for Euclidean choice, the
can be crisp or fuzzy endorelations, and
.
(Note that solving such a choice problem is different from computing
consensus preference relations.)
See
relation_dissimilarity()
for more information about
these dissimilarities.
Available control options include:
k
an integer giving the number of objects/winners to be chosen.
n
the maximal number of optimal choices to be
obtained, with NA
constants or "all"
indicating to
obtain all optimal choices. By default, only a single optimal
choice is computed.
For the general PC case, the discrepancies can be specified via the
delta
control option.
Choice method "Schulze"
implements the Schulze method for
selecting winners from (votes expressing) preferences. See e.g.
https://en.wikipedia.org/wiki/Schulze_method for details.
Currently, the Schulze heuristic is used, and the set of all possible
winners is returned.
A set with the chosen objects, or a list of such sets.
data("SVM_Benchmarking_Classification") ## Determine the three best classification learners in the above sense. relation_choice(SVM_Benchmarking_Classification, k = 3)
data("SVM_Benchmarking_Classification") ## Determine the three best classification learners in the above sense. relation_choice(SVM_Benchmarking_Classification, k = 3)
Provide class ids or classes, respectively, for an equivalence relation or the indifference relation of a weak order.
relation_class_ids(x) relation_classes(x)
relation_class_ids(x) relation_classes(x)
x |
an object inheriting from class |
For relation_class_ids()
, a numeric vector with class ids
corresponding to the classes of the equivalence relation, or the
indifference relation of the weak order with ids ordered according to
increasing preference.
For relation_classes()
, an object of class
relation_classes_of_objects
, which is a list of sets giving the
elements in the corresponding classes, named by the class ids.
## Equivalence. f <- factor(rep(c("Good", "Bad", "Ugly"), c(3, 2, 1))) R <- as.relation(f) relation_is(R, "equivalence") table(ids = relation_class_ids(R), orig = f) relation_classes(R) ## Weak order ("weak preference"). f <- ordered(f, levels = c("Ugly", "Bad", "Good")) R <- as.relation(f) relation_is(R, "weak_order") table(ids = relation_class_ids(R), orig = f) relation_classes(R)
## Equivalence. f <- factor(rep(c("Good", "Bad", "Ugly"), c(3, 2, 1))) R <- as.relation(f) relation_is(R, "equivalence") table(ids = relation_class_ids(R), orig = f) relation_classes(R) ## Weak order ("weak preference"). f <- ordered(f, levels = c("Ugly", "Bad", "Good")) R <- as.relation(f) relation_is(R, "weak_order") table(ids = relation_class_ids(R), orig = f) relation_classes(R)
Computes transitive and reflexive closure of an endorelation.
transitive_closure(x) reflexive_closure(x) ## S3 method for class 'relation' closure(x, operation = c("transitive", "reflexive"), ...)
transitive_closure(x) reflexive_closure(x) ## S3 method for class 'relation' closure(x, operation = c("transitive", "reflexive"), ...)
x |
an R object inheriting from class |
operation |
character string indicating the kind of closure. |
... |
currently not used. |
Let be an endorelation on
and
be the number of
elements in
.
The transitive closure of is the smallest transitive
relation on
that contains
. The code implements
Warshall's Algorithm which is of complexity
.
The reflexive closure of is computed by setting the
diagonal of the incidence matrix to 1.
S. Warshall (1962), A theorem on Boolean matrices. Journal of the ACM, 9/1, 11–12. doi:10.1145/321105.321107.
relation()
,
reflexive_reduction()
,
transitive_reduction()
,
closure()
.
R <- as.relation(1 : 5) relation_incidence(R) ## transitive closure/reduction RR <- transitive_reduction(R) relation_incidence(RR) R == transitive_closure(RR) ## same require("sets") # closure() and reduction() R == closure(reduction(R)) ## reflexive closure/reduction RR <- reflexive_reduction(R) relation_incidence(RR) R == reflexive_closure(RR) ## same: R == closure(reduction(R, "reflexive"), "reflexive")
R <- as.relation(1 : 5) relation_incidence(R) ## transitive closure/reduction RR <- transitive_reduction(R) relation_incidence(RR) R == transitive_closure(RR) ## same require("sets") # closure() and reduction() R == closure(reduction(R)) ## reflexive closure/reduction RR <- reflexive_reduction(R) relation_incidence(RR) R == reflexive_closure(RR) ## same: R == closure(reduction(R, "reflexive"), "reflexive")
Computes (strongly or weakly) connected components of an endorelation.
relation_connected_components(x, type = c("strongly", "weakly")) relation_condensation(x) relation_component_representation(x)
relation_connected_components(x, type = c("strongly", "weakly")) relation_condensation(x) relation_component_representation(x)
x |
an R object inheriting from class |
type |
character string indicating the kind of components sought. |
Let be the graph of an endorelation
.
A weakly connected component of some node in
is
the set of all nodes reachable from
. A strongly
connected component of some node
is the set of all nodes
reachable from
, from which
can be reached. Each
component is represented by some element, the leader.
The component representation graph of a cyclic endorelation
is composed of directed cycles, one for each strongly
connected component of
containing more than one element,
linking all corresponding elements.
The condensation of is the graph of all leaders of
.
For relation_connected_components()
, an object of class
relation_classes_of_objects
, i.e., a list of sets giving the
elements of the corresponding connected components, named by the
leaders' character representation. The list of leaders is added as a
leaders
attribute.
For relation_condensation()
, an (acyclic) endorelation.
For relation_component_representation()
, an endorelation with
same domain as x
.
S. Warshall (1962), A theorem on Boolean matrices. Journal of the ACM, 9/1, 11–12. doi:10.1145/321105.321107.
J. A. La Poutré and J. van Leeuwen (1988), Maintenance of Transitive Closures and Transitive Reductions of Graphs. Proceedings of the International Workshop of Graph-Theoretic Concepts in Computer Science, Springer, London, 106–120.
plot.relation()
,
transitive_reduction()
## example from La Poutre and van Leeuwen: require("sets") # set(), pair() etc. G <- set(pair(1L, 2L), pair(2L, 1L), pair(1L, 3L), pair(3L, 1L), pair(3L, 7L), pair(2L, 5L), pair(2L, 6L), pair(6L, 5L), pair(5L, 7L), pair(4L, 6L), pair(5L, 4L), pair(4L, 7L)) R <- endorelation(graph = G) relation_connected_components(R) relation_graph(relation_condensation(R)) relation_graph(relation_component_representation(R))
## example from La Poutre and van Leeuwen: require("sets") # set(), pair() etc. G <- set(pair(1L, 2L), pair(2L, 1L), pair(1L, 3L), pair(3L, 1L), pair(3L, 7L), pair(2L, 5L), pair(2L, 6L), pair(6L, 5L), pair(5L, 7L), pair(4L, 6L), pair(5L, 4L), pair(4L, 7L)) R <- endorelation(graph = G) relation_connected_components(R) relation_graph(relation_condensation(R)) relation_graph(relation_component_representation(R))
Compute consensus relations of a relation ensemble.
relation_consensus(x, method = NULL, weights = 1, control = list(), ...)
relation_consensus(x, method = NULL, weights = 1, control = list(), ...)
x |
an ensemble of relations (see
|
method |
a character string specifying one of the built-in
methods for computing consensus relations, or a function to be
taken as a user-defined method, or |
weights |
a numeric vector with non-negative case weights.
Recycled to the number of elements in the ensemble given by |
control |
a list of control parameters. See Details. |
... |
a list of control parameters (overruling those specified
in |
Consensus relations “synthesize” the information in the
elements of a relation ensemble into a single relation, often by
minimizing a criterion function measuring how dissimilar consensus
candidates are from the (elements of) the ensemble (the so-called
“optimization approach”), typically of the form
, where
is a suitable
dissimilarity measure (see
relation_dissimilarity()
),
is the case weight given to element
of the
ensemble, and
. Such consensus relations are called
“central relations” in Régnier (1965). For
, we
obtain (generalized) medians;
gives (generalized) means
(least squares consensus relations).
Available built-in methods are as follows. Apart from Condorcet's and the unrestricted Manhattan and Euclidean consensus methods, these are applicable to ensembles of endorelations only.
"Borda"
the consensus method proposed by Borda (1781).
For each relation and object
, one determines the
Borda/Kendall scores, i.e., the number of objects
such
that
. These are then aggregated across relations
by weighted averaging. Finally, objects are ordered according to
their aggregated scores. Note that this may result in a weak order
(i.e., with objects being tied).
One can enforce a linear order by setting the control parameter
L
to TRUE
, and obtain a relation ensemble with up to
n or all such solutions by additionally setting the control
parameter n
to some positive integer or "all"
,
respectively.
"Copeland"
the consensus method proposed by Copeland
(1951). For each relation and object
, one
determines the Copeland scores, i.e., the number of objects
such that
, minus the number of objects
such that
. Like the Borda method, these are
then aggregated across relations by weighted averaging. Finally,
objects are ordered according to their aggregated scores.
Note that this may result in a weak order
(i.e., with objects being tied).
One can enforce a linear order by setting the control parameter
L
to TRUE
, and obtain a relation ensemble with up to
n or all such solutions by additionally setting the control
parameter n
to some positive integer or "all"
,
respectively.
"Condorcet"
the consensus method proposed by Condorcet
(1785). For a given ensemble of crisp relations, this minimizes
the criterion function with
as symmetric
difference distance and
over all possible crisp
relations. In the case of endorelations, consensus is obtained by
weighting voting, such that
if the weighted number of
times that
is no less than the weighted number of
times that this is not the case. Even when aggregating linear
orders, this can lead to intransitive consensus solutions
(“effet Condorcet”).
One can obtain a relation ensemble with up to n or all such
solutions consensus relations by setting the control parameter
n
to some positive integer or "all"
, respectively.
"CS"
the consensus method of Cook and Seiford (1978)
which determines a linear order minimizing the criterion function
with
as generalized Cook-Seiford (ranking)
distance and
via solving a linear sum assignment
problem.
One can obtain a relation ensemble with up to n or all such
consensus relations by setting the control parameter n
to
some positive integer or "all"
, respectively.
"symdiff/F"
an exact solver for determining the
consensus relation of an ensemble of crisp endorelations by
minimizing the criterion function with
as
symmetric difference (“symdiff”) distance and
over a suitable class (“Family”) of crisp endorelations as
indicated by F, with values:
G
general (crisp) endorelations.
A
antisymmetric relations.
C
complete relations.
E
equivalence relations: reflexive, symmetric, and transitive.
L
linear orders: complete, reflexive, antisymmetric, and transitive.
M
matches: complete and reflexive.
O
partial orders: reflexive, antisymmetric and transitive.
S
symmetric relations.
T
tournaments: complete, irreflexive and antisymmetric (i.e., complete and asymmetric).
W
weak orders (complete preorders, preferences, “orderings”): complete, reflexive and transitive.
preorder
preorders: reflexive and transitive.
transitive
transitive relations.
Can also be referred to as "SD/F"
.
Consensus relations are determined by reformulating the consensus
problem as a binary program (for the relation incidences), see
Hornik and Meyer (2007) for details. The solver employed can be
specified via the control argument solver
, with currently
possible values "glpk"
, "lpsolve"
, "symphony"
or "cplex"
or a unique abbreviation thereof, specifying to
use the solvers from packages Rglpk (default),
lpSolve, Rsymphony, or Rcplex, respectively.
Unless control option sparse
is false, a sparse formulation
of the binary program is used, which is typically more efficient.
For fitting equivalences and weak orders (cases E
and
W
) it is possible to specify the number of classes
using the control parameter
k
. For fitting weak orders,
one can also specify the number of elements in the classes via
control parameter l
.
Additional constraints on the incidences of the consensus solution
can be given via the control parameter constraints
, in the
form of a 3-column matrix whose rows give row and column indices
and
and the corresponding incidence
.
(I.e., incidences can be constrained to be zero or one on an
object by object basis.)
One can obtain a relation ensemble with up to n or all such
consensus relations by setting the control parameter n
to
some positive integer or "all"
, respectively.
(See the examples.)
"manhattan"
the (unrestricted) median of the
ensemble, minimizing with
as Manhattan (symmetric
difference) distance and
over all (possibly fuzzy)
relations.
"euclidean"
the (unrestricted) mean of the ensemble,
minimizing with
as Euclidean distance and
over all (possibly fuzzy) relations.
"euclidean/F"
an exact solver for determining
the restricted least squares Euclidean consensus relation of an
ensemble of endorelations by minimizing the criterion function
with
as Euclidean difference distance and
over a suitable family of crisp endorelations as
indicated by F, with available families and control
parameters as for methods
"symdiff/F"
.
"majority"
a generalized majority method for which the
consensus relation contains of all tuples occurring with a
relative frequency of more than percent (of 100
percent if
). The fraction
can be specified
via the control parameter
p
. By default, is
used.
"CKS/F"
an exact solver for determining the
consensus relation of an ensemble of crisp endorelations by
minimizing the criterion function with
as
Cook-Kress-Seiford (“CKS”) distance and
over a
suitable class (“Family”) of crisp endorelations as
indicated by F, with available families and control
parameters as for methods
"symdiff/F"
.
For fitting equivalences and weak orders (cases E
and
W
) it is possible to specify the number of classes
using the control parameter
k
.
One can obtain a relation ensemble with up to n or all such
consensus relations by setting the control parameter n
to
some positive integer or "all"
, respectively.
"PC/F"
an exact solver for determining the
consensus relation of an ensemble of crisp endorelations by
minimizing the criterion function with
as
(generalized) paired comparison (“PC”) distance and
over a suitable class (“Family”) of crisp
endorelations as indicated by F, with available families and
control parameters as for methods
"symdiff/F"
, and
control option delta
for specifying the paired comparison
discrepancies.
For fitting equivalences and weak orders (cases E
and
W
) it is possible to specify the number of classes
using the control parameter
k
.
One can obtain a relation ensemble with up to n or all such
consensus relations by setting the control parameter n
to
some positive integer or "all"
, respectively.
The consensus relation(s).
J. C. Borda (1781), Mémoire sur les élections au scrutin. Histoire de l'Académie Royale des Sciences.
W. D. Cook and M. Kress (1992), Ordinal information and preference structures: decision models and applications. Prentice-Hall: New York. ISBN: 0-13-630120-7.
W. D. Cook and L. M. Seiford (1978), Priority ranking and consensus formation. Management Science, 24/16, 1721–1732. doi:10.1287/mnsc.24.16.1721.
M. J. A. de Condorcet (1785), Essai sur l'application de l'analyse à la probabilité des décisions rendues à la pluralité des voix. Paris.
A. H. Copeland (1951), A Reasonable Social Welfare Function. mimeo, University of Michigan.
E. J. Emond and D. W. Mason (2000), A new technique for high level decision support. Technical Report ORD Project Report PR2000/13, Operational Research Division, Department of National Defence, Canada.
K. Hornik and D. Meyer (2007), Deriving consensus rankings from benchmarking experiments. In R. Decker and H.-J. Lenz, Advances in Data Analysis. Studies in Classification, Data Analysis, and Knowledge Organization. Springer-Verlag: Heidelberg, 163–170.
F. Marcotorchino and P. Michaud (1982). Agrégation de similarités en classification automatique. Revue de Statistique Appliquée, 30/2, 21–44. https://eudml.org/doc/106132.
S. Régnier (1965), Sur quelques aspects mathématiques des problèmes de classification automatique. ICC Bulletin, 4, 175–191.
## Consensus equivalence. ## (I.e., in fact, consensus partition.) ## Classification of 30 felines, see Marcotorchino and Michaud (1982). data("Felines") ## Consider each variable an equivalence relation on the objects. relations <- as.relation_ensemble(Felines) ## This gives a relation ensemble of length 14 (number of variables in ## the data set). ## Now fit an equivalence relation to this: E <- relation_consensus(relations, "symdiff/E") ## And look at the equivalence classes: ids <- relation_class_ids(E) ## Or, more nicely: split(rownames(Felines), ids) ## Which is the same as in the paper ... ## Consensus linear order. ## Example from Cook and Kress, pages 48ff. ## Relation from paired comparisons. pm <- matrix(c(0, 1, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0), nrow = 5, byrow = TRUE, dimnames = list(letters[1:5], letters[1:5])) ## Note that this is a Cook and Kress "preference matrix" where entry ## (i,j) is one iff object i is preferred to object j (i > j). ## Set up the corresponding '<' relation: R <- as.relation(t(pm)) relation_incidence(R) relation_is(R, "tournament") ## Closest linear order: L <- relation_consensus(R, "symdiff/L") relation_incidence(L) ## Visualize provided that Rgraphviz is available. if(require("Rgraphviz")) plot(L) ## But note that this linear order is not unique. L <- relation_consensus(R, "symdiff/L", control = list(n = "all")) print(L) if(require("Rgraphviz")) plot(L) ## (Oh no: c is once first and once last.) ## Closest weak order relation with at most 3 indifference classes: W3 <- relation_consensus(R, "symdiff/W", control = list(k = 3)) relation_incidence(W3) ## Consensus weak orders. ## Example from Emond and Mason, pages 28f. ## The reference provides 21 partial rankings of 15 objects, ## in 3 groups of 7 rankings (corresponding to three different ## ranking criteria) with respective weights 4, 5, and 7. wei <- rep.int(c(4, 5, 7), rep(7, 3)) ## The rankings are written by listing the object labels from the ## best to the worst, with a leading minus indicating a tie with ## the previous object: EM_inputs <- c("6 1 -7 -9 10 3 8 11 5 -12 2 -4 -13", "6 10 9 3 4 -8 7 1 -5 -11 2 12 13 14 15", "6 10 3 7 8 11 5 14 15 12 1 -4 -13 2 -9", "6 9 -11 10 3 14 12 7 4 5 2 1 8 13 15", "10 6 7 1 11 -13 4 2 3 9 12 14 -15 8 5", "6 9 8 -10 11 4 1 5 7 15 2 12 14 13 3", "1 -6 -10 7 -12 9 3 4 -11 -14 -15 2 -13 8", "4 -10 1 -7 6 -9 -13 5 -14 3 12 8 11 -15 2", "4 -9 5 1 14 11 8 3 6 2 -13 10 12 7 15", "4 2 -5 8 15 7 11 -14 1 -12 -13 10 9 6", "2 -11 -12 -14 -15 6 -13 3 -4 9 8 -10 1 -5 -7", "4 14 10 2 5 3 1 13 12 7 15 8 11 6 9", "4 2 5 1 15 7 13 14 3 -12 8 11 6 9 10", "12 1 3 -4 2 11 -13 -15 9 14 6 8 7 -10 5", "5 4 9 2 -7 14 8 -11 3 1 15 12 6 10 13", "11 9 -14 15 12 3 4 13 8 6 7 10 5", "12 11 2 1 3 9 8 10 13 -14 6 4 -15 5 7", "4 -5 10 -12 3 8 -11 6 -7 -9 13 14 15", "12 5 -13 14 3 8 15 4 9 -10 11 6 7", "4 -5 -8 11 6 14 7 1 -2 -15 10 3 13 9 -12", "10 8 5 -11 6 -14 9 4 -13 -15 3 -12 2 1") ## Using the Emond-Mason paired comparison dissimilarity, there ## are three consensus rankings when using the above weights: EM_solutions <- c("4 10 5-11 1 -2-14 3-12 9 8 6 7 13-15", "4 10 5-11 1 -2 9 14 3-12 8 6 7 13-15", "4 10 5-11 2-14 1 3-12 9 8 6 7 13-15") ## We can reproduce this as follows. ## We first provide a reader for the rankings, and a maker for ## creating the (possibly partial) ranking with the appropriate ## domain: reader <- function(s) { strsplit(unlist(strsplit(gsub(" *-", "-", s), " +")), "-", fixed = TRUE) } maker <- function(s) { ranking(lapply(reader(s), as.numeric), domain = as.numeric(1 : 15)) } EM_inputs <- lapply(EM_inputs, maker) EM_solutions <- lapply(EM_solutions, maker) ## Package 'relations' uses NA for non-diagonal incidences ## featuring unranked objects. ## Following the reference, we impute these by zeroes: ens <- relation_impute(relation_ensemble(list = EM_inputs), "omit") ## We can now obtain all consensus weak orders (corresponding to ## complete rankings) as follows: con <- relation_consensus(ens, "PC/W", wei, delta = "EM", all = TRUE) ## To verify that these agree with the solutions given in the ## reference: sets::set_outer(con, relation_ensemble(list = EM_solutions), `==`)
## Consensus equivalence. ## (I.e., in fact, consensus partition.) ## Classification of 30 felines, see Marcotorchino and Michaud (1982). data("Felines") ## Consider each variable an equivalence relation on the objects. relations <- as.relation_ensemble(Felines) ## This gives a relation ensemble of length 14 (number of variables in ## the data set). ## Now fit an equivalence relation to this: E <- relation_consensus(relations, "symdiff/E") ## And look at the equivalence classes: ids <- relation_class_ids(E) ## Or, more nicely: split(rownames(Felines), ids) ## Which is the same as in the paper ... ## Consensus linear order. ## Example from Cook and Kress, pages 48ff. ## Relation from paired comparisons. pm <- matrix(c(0, 1, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0), nrow = 5, byrow = TRUE, dimnames = list(letters[1:5], letters[1:5])) ## Note that this is a Cook and Kress "preference matrix" where entry ## (i,j) is one iff object i is preferred to object j (i > j). ## Set up the corresponding '<' relation: R <- as.relation(t(pm)) relation_incidence(R) relation_is(R, "tournament") ## Closest linear order: L <- relation_consensus(R, "symdiff/L") relation_incidence(L) ## Visualize provided that Rgraphviz is available. if(require("Rgraphviz")) plot(L) ## But note that this linear order is not unique. L <- relation_consensus(R, "symdiff/L", control = list(n = "all")) print(L) if(require("Rgraphviz")) plot(L) ## (Oh no: c is once first and once last.) ## Closest weak order relation with at most 3 indifference classes: W3 <- relation_consensus(R, "symdiff/W", control = list(k = 3)) relation_incidence(W3) ## Consensus weak orders. ## Example from Emond and Mason, pages 28f. ## The reference provides 21 partial rankings of 15 objects, ## in 3 groups of 7 rankings (corresponding to three different ## ranking criteria) with respective weights 4, 5, and 7. wei <- rep.int(c(4, 5, 7), rep(7, 3)) ## The rankings are written by listing the object labels from the ## best to the worst, with a leading minus indicating a tie with ## the previous object: EM_inputs <- c("6 1 -7 -9 10 3 8 11 5 -12 2 -4 -13", "6 10 9 3 4 -8 7 1 -5 -11 2 12 13 14 15", "6 10 3 7 8 11 5 14 15 12 1 -4 -13 2 -9", "6 9 -11 10 3 14 12 7 4 5 2 1 8 13 15", "10 6 7 1 11 -13 4 2 3 9 12 14 -15 8 5", "6 9 8 -10 11 4 1 5 7 15 2 12 14 13 3", "1 -6 -10 7 -12 9 3 4 -11 -14 -15 2 -13 8", "4 -10 1 -7 6 -9 -13 5 -14 3 12 8 11 -15 2", "4 -9 5 1 14 11 8 3 6 2 -13 10 12 7 15", "4 2 -5 8 15 7 11 -14 1 -12 -13 10 9 6", "2 -11 -12 -14 -15 6 -13 3 -4 9 8 -10 1 -5 -7", "4 14 10 2 5 3 1 13 12 7 15 8 11 6 9", "4 2 5 1 15 7 13 14 3 -12 8 11 6 9 10", "12 1 3 -4 2 11 -13 -15 9 14 6 8 7 -10 5", "5 4 9 2 -7 14 8 -11 3 1 15 12 6 10 13", "11 9 -14 15 12 3 4 13 8 6 7 10 5", "12 11 2 1 3 9 8 10 13 -14 6 4 -15 5 7", "4 -5 10 -12 3 8 -11 6 -7 -9 13 14 15", "12 5 -13 14 3 8 15 4 9 -10 11 6 7", "4 -5 -8 11 6 14 7 1 -2 -15 10 3 13 9 -12", "10 8 5 -11 6 -14 9 4 -13 -15 3 -12 2 1") ## Using the Emond-Mason paired comparison dissimilarity, there ## are three consensus rankings when using the above weights: EM_solutions <- c("4 10 5-11 1 -2-14 3-12 9 8 6 7 13-15", "4 10 5-11 1 -2 9 14 3-12 8 6 7 13-15", "4 10 5-11 2-14 1 3-12 9 8 6 7 13-15") ## We can reproduce this as follows. ## We first provide a reader for the rankings, and a maker for ## creating the (possibly partial) ranking with the appropriate ## domain: reader <- function(s) { strsplit(unlist(strsplit(gsub(" *-", "-", s), " +")), "-", fixed = TRUE) } maker <- function(s) { ranking(lapply(reader(s), as.numeric), domain = as.numeric(1 : 15)) } EM_inputs <- lapply(EM_inputs, maker) EM_solutions <- lapply(EM_solutions, maker) ## Package 'relations' uses NA for non-diagonal incidences ## featuring unranked objects. ## Following the reference, we impute these by zeroes: ens <- relation_impute(relation_ensemble(list = EM_inputs), "omit") ## We can now obtain all consensus weak orders (corresponding to ## complete rankings) as follows: con <- relation_consensus(ens, "PC/W", wei, delta = "EM", all = TRUE) ## To verify that these agree with the solutions given in the ## reference: sets::set_outer(con, relation_ensemble(list = EM_solutions), `==`)
Compute the covering relation of an endorelation.
relation_cover(x)
relation_cover(x)
x |
an endorelation. |
Let be an endorelation with domain
and
be
the asymmetric part of
for which
iff
and not
. (If
is a
order relation,
is the associated strict order.) We say that
is
covered by
if
and there is no
such that
and
. One also says that
covers
, or that it is a successor of
.
The covering relation of consists of all pairs
for which
is covered by
.
Compute the dissimilarity between (ensembles of) relations.
relation_dissimilarity(x, y = NULL, method = "symdiff", ...)
relation_dissimilarity(x, y = NULL, method = "symdiff", ...)
x |
an ensemble of relations (see
|
y |
|
method |
a character string specifying one of the built-in
methods for computing dissimilarity, or a function to be taken as
a user-defined method. If a character string, its lower-cased
version is matched against the lower-cased names of the available
built-in methods using |
... |
further arguments to be passed to methods. |
Available built-in methods are as follows.
"symdiff"
symmetric difference distance.
This computes the cardinality of the symmetric difference of two
relations, i.e., the number of tuples contained in exactly one of
two relations. For preference relations, this coincides with the
Kemeny-Snell metric (Kemeny and Snell, 1962). For linear
orders, it gives Kendall's metric (Diaconis, 1988).
Can also be referred to as "SD"
.
Only applicable to crisp relations.
"manhattan"
the Manhattan distance between the incidences.
"euclidean"
the Euclidean distance between the incidences.
"CS"
Cook-Seiford distance, a generalization of the
distance function of Cook and Seiford (1978). Let the generalized
ranks of an object in the (first) domain of an
endorelation
be defined as the number of objects
dominating
(i.e., for which
and not
), plus half the number of objects
equivalent to
(i.e., for which
and
). For preference
relations, this gives the usual Kendall ranks arranged according
to decreasing preference (and averaged for ties). Then the
generalized Cook-Seiford distance is defined as the
distance between the generalized ranks. For linear orders, this
gives Spearman's footrule metric (Diaconis, 1988).
Only applicable to crisp endorelations.
"CKS"
Cook-Kress-Seiford distance, a generalization of
the distance function of Cook, Kress and Seiford (1986). For each
pair of objects and
in an endorelation
, we
can have
and not
or vice versa (cases of
“strict preference”),
and
(the case
of “indifference”), or neither
nor
(the case of “incomparability”). (Only the last two are
possible if
.) The distance by Cook, Kress and Seiford
puts indifference as the metric centroid between both preference
cases and incomparability (i.e., indifference is at distance one
from the other three, and each of the other three is at distance
two from the others). The generalized Cook-Kress-Seiford distance
is the paired comparison distance (i.e., a metric) based on these
distances between the four paired comparison cases. (Formula 3 in
the reference must be slightly modified for the generalization
from partial rankings to arbitrary endorelations.)
Only applicable to crisp endorelations.
"score"
score-based distance. This computes
for suitable score and distance functions
and
, respectively. These can be specified by
additional arguments
score
and Delta
. If
score
is a character string, it is taken as the method for
relation_scores
. Otherwise, if given it must be a
function giving the score function itself. If Delta
is a
number , the usual
distance is used.
Otherwise, it must be a function giving the distance function.
The defaults correspond to using the default relation scores and
, which for linear orders gives Spearman's footrule
distance.
Only applicable to endorelations.
"Jaccard"
Jaccard distance: 1 minus the ratio of the cardinalities of the intersection and the union of the relations.
"PC"
(generalized) paired comparison distance. This
generalizes the symdiff and CKS distances to use a general set of
discrepancies between the possible paired
comparison results with
/
incidences 0/0, 1/0,
0/1, and 1/1 numbered from 1 to 4 (in a preference context with a
encoding, these correspond to incompatibility, strict
and
preference, and indifference), with
the discrepancy between possible results
and
. The distance is then obtained as the sum of the
discrepancies from the paired comparisons of distinct objects,
plus half the sum of discrepancies from the comparisons of
identical objects (for which the only possible results are
incomparability and indifference).
The distance is a metric provided that the
satisfy the metric conditions (non-negativity and zero iff
, symmetry and sub-additivity).
The discrepancies can be specified via the additional argument
delta
, either as a numeric vector of length 6 with the
non-redundant values , or as a character string
partially matching one of the following built-in discrepancies
with corresponding parameter vector
:
"symdiff"
symmetric difference distance, with
discrepancy between distinct results two between either
opposite strict preferences or indifference and
incomparability, and one otherwise:
(default).
Can also be referred to as "SD"
.
"CKS"
Cook-Kress-Seiford distance, see above:
.
"EM"
the distance obtained from the generalization
of the Kemeny-Snell distance for complete rankings to partial
rankings introduced in Emond and Mason (2000). This uses a
discrepancy of two for opposite strict preferences, and one
for all other distinct results:
.
"JMB"
the distance with parameters as suggested by
Jabeur, Martel and Ben Khélifa (2004):
.
"discrete"
the discrete metric on the set of paired
comparison results:
.
Only applicable to crisp endorelations.
Methods "symdiff"
, "manhattan"
, "euclidean"
and
"Jaccard"
take an additional logical argument na.rm
: if
true (default: false), tuples with missing memberships are excluded in
the dissimilarity computations.
If y
is NULL
, an object of class dist
containing the dissimilarities between all pairs of elements of
x
. Otherwise, a matrix with the dissimilarities between the
elements of x
and the elements of y
.
W. D. Cook, M. Kress and L. M. Seiford (1986), Information and preference in partial orders: a bimatrix representation. Psychometrika 51/2, 197–207. doi:10.1007/BF02293980.
W. D. Cook and L. M. Seiford (1978), Priority ranking and consensus formation. Management Science, 24/16, 1721–1732. doi:10.1287/mnsc.24.16.1721.
P. Diaconis (1988), Group Representations in Probability and Statistics. Institute of Mathematical Statistics: Hayward, CA.
E. J. Emond and D. W. Mason (2000), A new technique for high level decision support. Technical Report ORD Project Report PR2000/13, Operational Research Division, Department of National Defence, Canada.
K. Jabeur, J.-M. Martel and S. Ben Khélifa (2004). A distance-based collective preorder integrating the relative importance of the groups members. Group Decision and Negotiation, 13, 327–349. doi:10.1023/B:GRUP.0000042894.00775.75.
J. G. Kemeny and J. L. Snell (1962), Mathematical Models in the Social Sciences, chapter “Preference Rankings: An Axiomatic Approach”. MIT Press: Cambridge.
Determine the domain, domain names, arity, or size of a relation or a relation ensemble.
relation_arity(x) relation_domain(x) relation_domain_names(x) relation_size(x)
relation_arity(x) relation_domain(x) relation_domain_names(x) relation_size(x)
x |
an R object inheriting from class |
For determining the domain, an object of class relation_domain
,
inheriting from tuple
.
tuple()
;
relation()
;
relation_domain<-()
and
relation_domain_names<-()
for modifying the domain and domain names of a relation, respectively.
## A simple relation: R <- as.relation(c(A = 1, B = 2, C = 3)) relation_incidence(R) relation_arity(R) relation_domain(R) relation_domain_names(R) relation_size(R)
## A simple relation: R <- as.relation(c(A = 1, B = 2, C = 3)) relation_incidence(R) relation_arity(R) relation_domain(R) relation_domain_names(R) relation_size(R)
Obtain elements of endorelation domains which have certain properties.
relation_elements(x, which, ...)
relation_elements(x, which, ...)
x |
an endorelation. |
which |
a character string specifying the property to be tested
for. Currently, one of |
... |
additional arguments to be employed in the property tests. |
Let be an endorelation with domain
and consider
elements
and
of
. We say that
is
there is no with
.
for all
.
for all
.
there is no with
.
When computing the tests for the above properties, an additional
na.rm
argument can be given to control the handling of missing
incidences. By default, these are treated as false, to the effect
that they invalidate “for all” tests (corresponding to
na.rm = FALSE
) and pass the “there is no” tests
(corresponding to na.rm = TRUE
).
A set with the elements having the specified property.
Creation and manipulation of relation ensembles.
relation_ensemble(..., list = NULL) as.relation_ensemble(x) is.relation_ensemble(x)
relation_ensemble(..., list = NULL) as.relation_ensemble(x) is.relation_ensemble(x)
... |
R objects representing relations, or coercible to such. |
list |
a list of R objects as in |
x |
for coercion with |
relation_ensemble()
creates non-empty “relation
ensembles”, i.e., collections of relations with
the same domain
and possibly different graphs
.
Such ensembles are implemented as suitably classed lists of relation
objects, making it possible to use lapply()
for computations on
the individual relations in the ensemble. Available methods for
relation ensembles include those for subscripting, c()
,
t()
, rep()
, and print()
.
data("Cetacea") ## Consider each variable an equivalence relation on the objects. ## Note that 2 variables (LACHRYMAL_AND_JUGAL_BONES and HEAD_BONES) have ## missing values, and hence are excluded. ind <- sapply(Cetacea, function(s) all(!is.na(s))) relations <- as.relation_ensemble(Cetacea[, ind]) ## This gives a relation ensemble of length 14 (number of complete ## variables in the data set). print(relations) ## Are there any duplicated relations? any(duplicated(relations)) ## Replicate and combine ... thrice <- c(rep(relations, 2), relations) ## Extract unique elements again: all.equal(unique(thrice), relations) ## Note that unique() does not preserve attributes, and hence names. ## In case we want otherwise: all.equal(thrice[!duplicated(thrice)], relations) ## Subscripting: relation_dissimilarity(relations[1 : 2], relations["CLASS"]) ## Which relation is "closest" to the classification? d <- relation_dissimilarity(relations) sort(as.matrix(d)[, "CLASS"])[-1]
data("Cetacea") ## Consider each variable an equivalence relation on the objects. ## Note that 2 variables (LACHRYMAL_AND_JUGAL_BONES and HEAD_BONES) have ## missing values, and hence are excluded. ind <- sapply(Cetacea, function(s) all(!is.na(s))) relations <- as.relation_ensemble(Cetacea[, ind]) ## This gives a relation ensemble of length 14 (number of complete ## variables in the data set). print(relations) ## Are there any duplicated relations? any(duplicated(relations)) ## Replicate and combine ... thrice <- c(rep(relations, 2), relations) ## Extract unique elements again: all.equal(unique(thrice), relations) ## Note that unique() does not preserve attributes, and hence names. ## In case we want otherwise: all.equal(thrice[!duplicated(thrice)], relations) ## Subscripting: relation_dissimilarity(relations[1 : 2], relations["CLASS"]) ## Which relation is "closest" to the classification? d <- relation_dissimilarity(relations) sort(as.matrix(d)[, "CLASS"])[-1]
A data set with 14 variables on 30 different types of felines.
data("Felines")
data("Felines")
A data frame with 30 observations on 14 categorical variables (the first 10 morphological, the last 4 behavioral), with names in French and numeric levels as in the reference. Names, descriptions in French and English and the numbers of levels are as follows.
TYPPEL
:Aspect du pelage; coat; 4.
LONGPOIL
:Fourrure; fur; 2.
OREILLES
:Oreilles; ears; 2.
TAILLE
:Taille au garrot; waist; 3.
POIDS
:Poids; weight; 3.
LONGUEUR
:Longueur du corps; body length; 3.
QUEUE
:Longueur de la queue; tail length; 3.
DENTS
:Canines développées; carnassials; 2.
LARYNX
:Os hyaoide; larynx; 2.
RETRACT
:Griffes rétractiles; retractible claws; 2.
COMPORT
:Comportement prédateur; predatory behavior; 3.
TYPPROIE
:Type de la proie; type of prey; 3.
ARBRES
:Monte ou non aux arbres; climbs trees or not; 2.
CHASSE
:Chasse (courre ou affut); chases via chivy or ambush; 2.
F. Marcotorchino and P. Michaud (1982), Agregation de similarités en classification automatique. Revue de Statistique Appliquée, 30(2), 21–44.
data("Felines") summary(Felines)
data("Felines") summary(Felines)
Determine the graph of a relation.
relation_graph(x)
relation_graph(x)
x |
an R object inheriting from class |
An object of class relation_graph
, inheriting from
set
.
set()
;
relation()
;
relation_graph<-()
for modifying the graph.
## A simple relation: R <- as.relation(c(A = 1, B = 2, C = 3)) relation_graph(R)
## A simple relation: R <- as.relation(c(A = 1, B = 2, C = 3)) relation_graph(R)
Impute missing incidences in relations by averaging all possible completions within a specified family.
relation_impute(x, method = NULL, control = list(), ...)
relation_impute(x, method = NULL, control = list(), ...)
x |
an endorelation or an ensemble of endorelations. |
method |
character string specifying the method to be used (see
Details). If |
control |
named list of control arguments. Currently, only
|
... |
named list of control arguments, overriding the ones
in |
Endorelations with missing elements (i.e., whose incidence is
NA
) are imputed using one of the methods described as follows.
"omit"
Missing incidences are replaced by zeros, i.e., the corresponding elements are removed from the graph.
"any/F"
The incidences are replaced by arbitrary values suitable for family F, with possible values:
G
General (unrestricted) relations.
L
Linear orders.
W
Weak orders.
O
Partial orders.
L
, W
, and O
can optionally be complemented by
/first
or /last
to further restrict imputed elements
to be placed on top or bottom of the given ordering.
"average/F"
Computes the relation with average
incidences, based on all possible completions as indicated for the
any/F
methods. Note that these completions are not
explicitly generated to compute the averages, and that the
resulting relation will typically be fuzzy.
If x
is an ensemble or more than one solution is requested
using the n
control argument: an ensemble
of endorelations. An endorelation otherwise.
## create a relation with a missing object R <- ranking(1:2, 1:3) print(R) R <- as.relation(R) ## find all suitable completions within L ens <- relation_impute(R, method = "any/L", n = "all") lapply(ens, as.ranking) if(require("Rgraphviz")) plot(ens) ## find 3 suitable partial orders ens <- relation_impute(R, method = "any/O", n = 3) lapply(ens, relation_incidence) if(require("Rgraphviz")) plot(ens) ## compute average completion R1 <- relation_impute(R, method = "average/O") relation_incidence(R1) ## check correctness of averaging R2 <- mean(relation_impute(R, "any/O", n = "all")) stopifnot(all.equal(R1, R2))
## create a relation with a missing object R <- ranking(1:2, 1:3) print(R) R <- as.relation(R) ## find all suitable completions within L ens <- relation_impute(R, method = "any/L", n = "all") lapply(ens, as.ranking) if(require("Rgraphviz")) plot(ens) ## find 3 suitable partial orders ens <- relation_impute(R, method = "any/O", n = 3) lapply(ens, relation_incidence) if(require("Rgraphviz")) plot(ens) ## compute average completion R1 <- relation_impute(R, method = "average/O") relation_incidence(R1) ## check correctness of averaging R2 <- mean(relation_impute(R, "any/O", n = "all")) stopifnot(all.equal(R1, R2))
Determine the incidences of a relation.
relation_incidence(x, ...)
relation_incidence(x, ...)
x |
an object inheriting from class |
... |
Further arguments passed to the labeling function used for creating the dimnames of the incidence matrix. |
For a -ary relation, a
-dimensional numeric array with
values in the unit interval inheriting from class
relation_incidence
whose elements give the memberships of the
corresponding -tuples are contained in the relation (for a
crisp relation, a binary (0/1) array with elements indicating whether
the corresponding tuples are contained in the relation or not).
relation()
;
relation_incidence<-()
for modifying the incidences.
R <- as.relation(c(A = 1, B = 2, C = 3)) relation_incidence(R)
R <- as.relation(c(A = 1, B = 2, C = 3)) relation_incidence(R)
Compute prototype-based partitions of a relation ensemble by minimizing
, the sum of the case-weighted and
membership-weighted
-th powers of the dissimilarities between
the elements
of the ensemble and the prototypes
,
for suitable dissimilarities
and exponents
.
relation_pclust(x, k, method, m = 1, weights = 1, control = list())
relation_pclust(x, k, method, m = 1, weights = 1, control = list())
x |
an ensemble of relations (see
|
k |
an integer giving the number of classes to be used in the partition. |
method |
the consensus method to be employed, see
|
m |
a number not less than 1 controlling the softness of the
partition (as the “fuzzification parameter” of the fuzzy
|
weights |
a numeric vector of non-negative case weights.
Recycled to the number of elements in the ensemble given by |
control |
a list of control parameters. See Details. |
For , a generalization of the Lloyd-Forgy variant of the
-means algorithm is used, which iterates between reclassifying
objects to their closest prototypes, and computing new prototypes as
consensus relations (generalized “central relations”, Régnier
(1965)) for the classes. This procedure was proposed in Gaul and
Schader (1988) as the “Clusterwise Aggregation of Relations”
(CAR).
For , a generalization of the fuzzy
-means recipe
is used, which alternates between computing optimal memberships for
fixed prototypes, and computing new prototypes as the consensus
relations for the classes.
This procedure is repeated until convergence occurs, or the maximal number of iterations is reached.
Consensus relations are computed using
relation_consensus()
.
Available control parameters are as follows.
maxiter
an integer giving the maximal number of iterations to be performed. Defaults to 100.
reltol
the relative convergence tolerance. Defaults
to sqrt(.Machine$double.eps)
.
control
control parameters to be used in
relation_consensus()
.
The dissimilarities and exponent
are implied by the
consensus method employed, and inferred via a registration mechanism
currently only made available to built-in consensus methods. For the
time being, all optimization-based consensus methods use the symmetric
difference dissimilarity (see
relation_dissimilarity()
)
for and
.
The fixed point approach employed is a heuristic which cannot be guaranteed to find the global minimum. Standard practice would recommend to use the best solution found in “sufficiently many” replications of the base algorithm.
An object of class cl_partition()
.
S. Régnier (1965). Sur quelques aspects mathématiques des problèmes de classification automatique. ICC Bulletin, 4, 175–191.
W. Gaul and M. Schader (1988). Clusterwise aggregation of relations. Applied Stochastic Models and Data Analysis, 4, 273–282. doi:10.1002/asm.3150040406.
Visualize certain crisp endorelations by plotting a Hasse Diagram of their transitive reduction.
## S3 method for class 'relation' plot(x, attrs = list(graph = list(rankdir = "BT"), edge = list(arrowsize = NULL), node = list(shape = "rectangle", fixedsize = FALSE)), limit = 6L, labels = NULL, main = NULL, type = c("simplified", "raw"), ...) ## S3 method for class 'relation_ensemble' plot(x, attrs = list(list(graph = list(rankdir = "BT"), edge = list(arrowsize = NULL), node = list(shape = "rectangle", fixedsize = FALSE))), type = "simplified", limit = 6L, labels = NULL, ..., layout = NULL, main = NULL)
## S3 method for class 'relation' plot(x, attrs = list(graph = list(rankdir = "BT"), edge = list(arrowsize = NULL), node = list(shape = "rectangle", fixedsize = FALSE)), limit = 6L, labels = NULL, main = NULL, type = c("simplified", "raw"), ...) ## S3 method for class 'relation_ensemble' plot(x, attrs = list(list(graph = list(rankdir = "BT"), edge = list(arrowsize = NULL), node = list(shape = "rectangle", fixedsize = FALSE))), type = "simplified", limit = 6L, labels = NULL, ..., layout = NULL, main = NULL)
x |
an R object inheriting from class |
attrs |
argument passed to the plot method for class
Note that for the |
type |
character vector of either "simplified" or "raw" strings, one for each relation plotted. (See details.) |
limit |
Argument passed to the labeling function creating default
labels for the nodes (see |
labels |
Optional list of character vectors defining unique labels for the nodes. List of such lists for relation ensembles. |
layout |
integer vector of length 2 specifying the number of rows
and columns of the screen layout. If |
... |
Other arguments passed to the
|
main |
character vector used for the main title(s). If |
Visualization requires that package Rgraphviz is available. If type is "simplified" (default), the transitive reduction is first computed to reduce visual complexity (especially for transitive relations). For partial orders, a Hasse diagram is plotted. In case of (acyclic) transitive complete relations (i.e., weak orders, preferences), the dual is plotted. For all other acyclic relations, the asymmetric part is plotted. (Note that the default settings in these cases create a diagram with nodes ordered bottom-up and with no arrows.) For cyclic relations, a raw graph (with arrows) of the corresponding transitive reduction is computed. If type is "raw", a directed graph without any transformation is plotted from the relation.
relation()
,
transitive_reduction()
require("sets") # set() etc. if(require("Rgraphviz")) { ## simple example plot(as.relation(1 : 5)) ## inclusion on a power set: ps <- 2 ^ set("a", "b", "c") inc <- set_outer(ps, set_is_subset) R <- relation(incidence = inc) plot(relation_ensemble(R, R), type = c("simplified", "raw")) }
require("sets") # set() etc. if(require("Rgraphviz")) { ## simple example plot(as.relation(1 : 5)) ## inclusion on a power set: ps <- 2 ^ set("a", "b", "c") inc <- set_outer(ps, set_is_subset) R <- relation(incidence = inc) plot(relation_ensemble(R, R), type = c("simplified", "raw")) }
Predicate functions for testing for binary relations and endorelations, and special kinds thereof.
relation_is(x, predicate, ...) relation_is_Euclidean(x, na.rm = FALSE) relation_is_Ferrers(x, na.rm = FALSE) relation_is_acyclic(x) relation_is_antisymmetric(x, na.rm = FALSE) relation_is_asymmetric(x, na.rm = FALSE) relation_is_bijective(x) relation_is_binary(x) relation_is_complete(x, na.rm = FALSE) relation_is_coreflexive(x, na.rm = FALSE) relation_is_crisp(x, na.rm = FALSE) relation_is_cyclic(x) relation_is_endorelation(x) relation_is_equivalence(x, na.rm = FALSE) relation_is_functional(x) relation_is_homogeneous(x) relation_is_injective(x) relation_is_interval_order(x, na.rm = FALSE) relation_is_irreflexive(x, na.rm = FALSE) relation_is_left_total(x) relation_is_linear_order(x, na.rm = FALSE) relation_is_match(x, na.rm = FALSE) relation_is_negatively_transitive(x, na.rm = FALSE) relation_is_partial_order(x, na.rm = FALSE) relation_is_preference(x, na.rm = FALSE) relation_is_preorder(x, na.rm = FALSE) relation_is_quasiorder(x, na.rm = FALSE) relation_is_quasitransitive(x, na.rm = FALSE) relation_is_quaternary(x) relation_is_reflexive(x, na.rm = FALSE) relation_is_right_total(x) relation_is_semiorder(x, na.rm = FALSE) relation_is_semitransitive(x, na.rm = FALSE) relation_is_strict_linear_order(x, na.rm = FALSE) relation_is_strict_partial_order(x, na.rm = FALSE) relation_is_strongly_complete(x, na.rm = FALSE) relation_is_surjective(x) relation_is_symmetric(x, na.rm = FALSE) relation_is_ternary(x) relation_is_tournament(x, na.rm = FALSE) relation_is_transitive(x, na.rm = FALSE) relation_is_trichotomous(x, na.rm = FALSE) relation_is_weak_order(x, na.rm = FALSE) relation_has_missings(x)
relation_is(x, predicate, ...) relation_is_Euclidean(x, na.rm = FALSE) relation_is_Ferrers(x, na.rm = FALSE) relation_is_acyclic(x) relation_is_antisymmetric(x, na.rm = FALSE) relation_is_asymmetric(x, na.rm = FALSE) relation_is_bijective(x) relation_is_binary(x) relation_is_complete(x, na.rm = FALSE) relation_is_coreflexive(x, na.rm = FALSE) relation_is_crisp(x, na.rm = FALSE) relation_is_cyclic(x) relation_is_endorelation(x) relation_is_equivalence(x, na.rm = FALSE) relation_is_functional(x) relation_is_homogeneous(x) relation_is_injective(x) relation_is_interval_order(x, na.rm = FALSE) relation_is_irreflexive(x, na.rm = FALSE) relation_is_left_total(x) relation_is_linear_order(x, na.rm = FALSE) relation_is_match(x, na.rm = FALSE) relation_is_negatively_transitive(x, na.rm = FALSE) relation_is_partial_order(x, na.rm = FALSE) relation_is_preference(x, na.rm = FALSE) relation_is_preorder(x, na.rm = FALSE) relation_is_quasiorder(x, na.rm = FALSE) relation_is_quasitransitive(x, na.rm = FALSE) relation_is_quaternary(x) relation_is_reflexive(x, na.rm = FALSE) relation_is_right_total(x) relation_is_semiorder(x, na.rm = FALSE) relation_is_semitransitive(x, na.rm = FALSE) relation_is_strict_linear_order(x, na.rm = FALSE) relation_is_strict_partial_order(x, na.rm = FALSE) relation_is_strongly_complete(x, na.rm = FALSE) relation_is_surjective(x) relation_is_symmetric(x, na.rm = FALSE) relation_is_ternary(x) relation_is_tournament(x, na.rm = FALSE) relation_is_transitive(x, na.rm = FALSE) relation_is_trichotomous(x, na.rm = FALSE) relation_is_weak_order(x, na.rm = FALSE) relation_has_missings(x)
x |
an object inheriting from class |
na.rm |
a logical indicating whether tuples with missing memberships are excluded in the predicate computations. |
predicate |
character vector matching one of the following (see details):
|
... |
Additional arguments passed to the predicate functions
(currently, only |
This help page documents the predicates currently available. Note that
the preferred way is to use the meta-predicate function
relation_is(x, "FOO")
instead of the individual predicates
relation_is_FOO(x)
since the latter will become deprecated in
future releases.
A binary relation is a relation with arity 2.
A relation on a set
is called
homogeneous iff
.
An endorelation is a binary homogeneous relation.
For a crisp binary relation, let us write iff
is contained in
.
A crisp binary relation is called
for all there is at least one
such that
.
for all there is at least one
such that
.
for all there is at most one
such that
.
the same as right-total.
for all there is at most one
such that
.
left-total, right-total, functional and injective.
A crisp endorelation is called
for all
.
there is no such that
.
implies
.
implies
.
implies that not
.
and
imply that
.
and
imply that
.
for all distinct and
,
or
.
for all and
,
or
(i.e., complete and reflexive).
not and not
imply that not
.
and
imply
or
.
and
imply
or
.
and not
and
and not
imply
and not
(i.e., the asymmetric part of
is transitive).
exactly one of ,
, or
holds.
and
imply
.
the transitive closure of R is antisymmetric.
R is not acyclic.
Some combinations of these basic properties have special names because of their widespread use:
reflexive and transitive.
the same as preorder.
a symmetric preorder (reflexive, symmetric, and transitive).
a complete preorder (complete, reflexive, and transitive).
the same as weak order.
an antisymmetric preorder (reflexive, antisymmetric, and transitive).
irreflexive, antisymmetric, and transitive, or equivalently: asymmetric and transitive).
a complete partial order.
a complete strict partial order.
strongly complete.
complete and asymmetric.
complete and Ferrers.
a semitransitive interval order.
If is a weak order (“(weak) preference relation”),
defined by
iff
and
is an equivalence, the indifference relation corresponding to
.
There seem to be no commonly agreed definitions for order relations: e.g., Fishburn (1972) requires these to be irreflexive.
For a fuzzy binary relation , let
denote the
membership of
in the relation. Write
and
for the fuzzy t-norm (intersection) and t-conorm (disjunction),
respectively (min and max for the “standard” Zadeh family).
Then generalizations of the above basic endorelation predicates are as
follows.
for all
.
for all
.
implies
.
for all
.
for all
.
for all
.
for all
.
for all
.
for all
.
for all
.
for all
.
for all
.
The combined predicates are obtained by combining the basic predicates as for crisp endorelations (see above).
A relation has missings iff at least one cell in the incidence matrix
is NA
. In addition to relation_has_missings()
, an
is.na
method for relations is available, returning a matrix of
logicals corresponding to the incidences tested for missingness.
P. C. Fishburn (1972), Mathematics of decision theory. Methods and Models in the Social Sciences 3. Mouton: The Hague.
H. R. Varian (2002), Intermediate Microeconomics: A Modern Approach. 6th Edition. W. W. Norton & Company.
require("sets") R <- relation(domain = c(1, 2, 3), graph = set(c(1, 2), c(2, 3))) summary(R) ## Note the possible effects of NA-handling: relation_incidence(R) relation_is(R, "transitive") ## clearly FALSE relation_incidence(R)[1, 2] <- NA relation_incidence(R) relation_is(R, "transitive") ## clearly NA ## The following gives TRUE, since NA gets replaced with 0: relation_is(R, "transitive", na.rm = TRUE)
require("sets") R <- relation(domain = c(1, 2, 3), graph = set(c(1, 2), c(2, 3))) summary(R) ## Note the possible effects of NA-handling: relation_incidence(R) relation_is(R, "transitive") ## clearly FALSE relation_incidence(R)[1, 2] <- NA relation_incidence(R) relation_is(R, "transitive") ## clearly NA ## The following gives TRUE, since NA gets replaced with 0: relation_is(R, "transitive", na.rm = TRUE)
Retrieve relation properties.
relation_properties(x) relation_property(x, which)
relation_properties(x) relation_property(x, which)
x |
A relation. |
which |
Property name (character string). |
These functions are used for the extrinsic properties of relations. Others can be retrieved using the predicate functions.
relation()
and
relation_is()
for all
predicate functions defined on relations.
x <- as.relation(1 : 3) relation_properties(x) relation_property(x, "is_endorelation")
x <- as.relation(1 : 3) relation_properties(x) relation_property(x, "is_endorelation")
Creates a ranking object.
ranking(x, domain = NULL, decreasing = TRUE, complete = FALSE) as.ranking(x, ...) is.ranking(x)
ranking(x, domain = NULL, decreasing = TRUE, complete = FALSE) as.ranking(x, ...) is.ranking(x)
x |
For |
domain |
object coercible to a set, from which the labels usable
in |
decreasing |
logical indicating whether the ranking orders objects
from the best to the worst ( |
complete |
logical specifying whether missing values should be
imputed, if any. Missing elements are those from |
... |
currently not used. |
An object of class ranking
.
relation()
## simple rankings OBJECTS <- c("Apples", "Bananas", "Oranges", "Lemons") print(R <- ranking(OBJECTS)) ranking(OBJECTS[2:4], domain = OBJECTS) ranking(OBJECTS[2:4], domain = OBJECTS, complete = TRUE) ## ranking with ties (weak orders) ranking(list(c("PhD", "MD"), "MSc", c("BSc", "BA"))) ## ranking A > B ~ C with D missing: ranking(c(A = 1, B = 2, C = 2, D = NA)) ## coercion functions identical(as.ranking(as.relation(R)), R)
## simple rankings OBJECTS <- c("Apples", "Bananas", "Oranges", "Lemons") print(R <- ranking(OBJECTS)) ranking(OBJECTS[2:4], domain = OBJECTS) ranking(OBJECTS[2:4], domain = OBJECTS, complete = TRUE) ## ranking with ties (weak orders) ranking(list(c("PhD", "MD"), "MSc", c("BSc", "BA"))) ## ranking A > B ~ C with D missing: ranking(c(A = 1, B = 2, C = 2, D = NA)) ## coercion functions identical(as.ranking(as.relation(R)), R)
Computes transitive and reflexive reduction of an endorelation.
transitive_reduction(x) reflexive_reduction(x) ## S3 method for class 'relation' reduction(x, operation = c("transitive", "reflexive"), ...)
transitive_reduction(x) reflexive_reduction(x) ## S3 method for class 'relation' reduction(x, operation = c("transitive", "reflexive"), ...)
x |
an R object inheriting from class |
operation |
character string indicating the kind of reduction. |
... |
currently not used. |
Let be an endorelation on
and
be the number of
elements in
.
The transitive reduction of is the smallest relation
on
so that the transitive closure of
is the
same than the transitive closure of
.
The transitive reduction of an acyclic relation can be obtained
by subtracting from the composition of
with its transitive closure.
The transitive reduction of a cyclic relation is the transitive
reduction of the condensation, combined with the component
representation of . (Note that the transitive reduction of a
cyclic relation is cyclic.)
The reflexive reduction of is computed by setting the
diagonal of the incidence matrix to 0.
S. Warshall (1962), A theorem on Boolean matrices. Journal of the ACM, 9/1, 11–12. doi:10.1145/321105.321107.
J. A. La Poutré and J. van Leeuwen (1988), Maintenance of Transitive Closures and Transitive Reductions of Graphs. Proceedings of the International Workshop of Graph-Theoretic Concepts in Computer Science, Springer, London, 106–120.
relation()
,reflexive_reduction()
,
transitive_reduction()
,
reduction()
,relation_condensation()
,
relation_component_representation()
.
R <- as.relation(1 : 5) relation_incidence(R) ## transitive closure/reduction RR <- transitive_reduction(R) relation_incidence(RR) R == transitive_closure(RR) ## same require("sets") # closure() and reduction() etc. R == closure(reduction(R)) ## reflexive closure/reduction RR <- reflexive_reduction(R) relation_incidence(RR) R == reflexive_closure(RR) ## same: R == closure(reduction(R, "reflexive"), "reflexive") ## transitive reduction of a cyclic relation: ## (example from La Poutre and van Leeuwen) require("sets") # set(), pair() etc. if(require("Rgraphviz")) { G <- set(pair(1L, 2L), pair(2L, 1L), pair(1L, 3L), pair(3L, 1L), pair(3L, 7L), pair(2L, 5L), pair(2L, 6L), pair(6L, 5L), pair(5L, 7L), pair(4L, 6L), pair(5L, 4L), pair(4L, 7L)) R <- endorelation(graph = G) plot(relation_ensemble(R, R), type = c("raw", "simplified"), main = c("original graph", "transitive reduction")) }
R <- as.relation(1 : 5) relation_incidence(R) ## transitive closure/reduction RR <- transitive_reduction(R) relation_incidence(RR) R == transitive_closure(RR) ## same require("sets") # closure() and reduction() etc. R == closure(reduction(R)) ## reflexive closure/reduction RR <- reflexive_reduction(R) relation_incidence(RR) R == reflexive_closure(RR) ## same: R == closure(reduction(R, "reflexive"), "reflexive") ## transitive reduction of a cyclic relation: ## (example from La Poutre and van Leeuwen) require("sets") # set(), pair() etc. if(require("Rgraphviz")) { G <- set(pair(1L, 2L), pair(2L, 1L), pair(1L, 3L), pair(3L, 1L), pair(3L, 7L), pair(2L, 5L), pair(2L, 6L), pair(6L, 5L), pair(5L, 7L), pair(4L, 6L), pair(5L, 4L), pair(4L, 7L)) R <- endorelation(graph = G) plot(relation_ensemble(R, R), type = c("raw", "simplified"), main = c("original graph", "transitive reduction")) }
Creation and manipulation of relations.
relation(domain = NULL, incidence = NULL, graph = NULL, charfun = NULL) endorelation(domain = NULL, incidence = NULL, graph = NULL, charfun = NULL) homorelation(domain = NULL, incidence = NULL, graph = NULL, charfun = NULL) as.relation(x, ...) is.relation(x)
relation(domain = NULL, incidence = NULL, graph = NULL, charfun = NULL) endorelation(domain = NULL, incidence = NULL, graph = NULL, charfun = NULL) homorelation(domain = NULL, incidence = NULL, graph = NULL, charfun = NULL) as.relation(x, ...) is.relation(x)
domain |
List (or tuple) of (possibly named) sets (or vectors)
used as the domain, recycled as needed to fit the arity of the relation.
If |
incidence |
A numeric array with values in the unit interval, or
a logical array. Note that one-dimensional incidences are also
accepted. The |
graph |
Either a set of equally sized tuples, or a list of (possibly, generic) vectors of same length where each component specifies one relation element, or a data frame where each row specifies one relation element. For the latter, the columns correspond to the domain sets, and the colnames are used as their labels if specified. |
charfun |
A characteristic function of the relation, i.e., a
predicate function taking |
x |
an R object. |
... |
Further arguments passed to |
Given sets of objects
, ...,
, a
-ary relation
on
is a
(possibly fuzzy) subset
of the Cartesian product
. We refer to
and
as the domain and the graph of the
relation, respectively (alternative notions are that of ground
and figure, respectively). We also refer to
, where each
gives the cardinality of
, as the size of the relation.
Strictly speaking, the relation is the pair ;
often, relations are identified with their graph. If
is a
crisp subset of
,
is a crisp relation. In
this case, we say that a
-tuple
is contained in
the relation
iff it is an element of
.
The characteristic function of a relation
is
the membership function of
, giving for each
-tuple
in
the membership (amount of belongingness) of
to
. In the crisp case,
is also referred
to as the indicator function of the relation, and is a binary (0/1)
function such that
is one iff
is in
.
Relations with arity 2, 3, and 4 are typically referred to as
binary, ternary, and quaternary relations,
respectively. A homorelation on is a relation with
homogeneous domain, i.e.
.
An endorelation on
(or binary relation
over
) is a binary homorelation.
See predicates for the most important
types of endorelations.
Relations with the same domain can naturally be ordered according to
their graphs. I.e., iff
is a subset of
, or equivalently, iff
for every
-tuple
(in the crisp case, iff every tuple contained in
is also contained in
). This induces a lattice
structure, with meet (greatest lower bound) and join (least upper
bound) the intersection and union of the graphs, respectively, also
known as the intersection and union of the relations.
The least element moves metric on this lattice is the
symmetric difference metric, i.e., the Manhattan distance
between the collections of membership values (incidences). In the
crisp case, this gives the cardinality of the symmetric difference of
the graphs (the number of tuples in exactly one of the relation
graphs).
The complement (or negation) of a relation
is the relation with domain
whose graph is the
complement of
(in the crisp case, containing exactly the
tuples not contained in
).
For binary crisp relations , it is customary to write
iff
is contained in
. For binary
crisp relations
and
with domains
and
, the composition
of
and
is defined by taking
iff there is an
such that
and
. The transpose (or
inverse)
of the relation
with domain
is the relation with domain
such that
iff
. The
codual (Clark (1990), also known as the ‘dual’ in the
fuzzy preference literature, e.g., Ovchinnikov (1991)) is the
complement of the transpose (equivalently, the transpose of the
complement).
For binary fuzzy relations , one often writes
for
the membership of the pair
in the relation. The above
notions need to take the fuzzy logic employed (as described by the
fuzzy t-norm (intersection)
, t-conorm (disjunction)
,
and negation
) into account. Let
,
and
be binary relations with appropriate domains. Then the
memberships for
of the complement, transpose and codual of
are
,
and
,
respectively. The membership of
for the composition of
and
is
.
Package relations implements finite relations as an S3 class
which allows for a variety of representations (even though currently,
typically dense array representations of the incidences are employed).
Other than by the generator,
relations can be obtained by coercion via the generic function
as.relation()
, which has methods for at least logical and numeric
vectors, unordered and ordered factors, arrays including matrices, and
data frames. Unordered factors are coerced to equivalence relations;
ordered factors and numeric vectors are coerced to order relations.
Logical vectors give unary relations (predicates). A (feasible)
-dimensional array is taken as the incidence of a
-ary
relation. Finally, a data frame is taken as a relation table. Note
that missing values will be propagated in the coercion.
endorelation()
is a wrapper for relation()
, trying to
guess a suitable domain from its arguments to create an
endorelation. If a domain is given, all labels are combined and the
result (as a list) recycled as needed.
Basic relation operations are available as group methods: min()
and max()
give the meet and join, and range()
a
relation ensemble with these two.
Comparison operators implement the natural ordering in the relation
lattice. Where applicable, !
gives the complement (negation),
&
and |
intersection and union, and *
composition, respectively. Finally, t()
gives the transpose
and codual()
the codual.
There is a plot()
method for certain
crisp endorelations provided that package Rgraphviz is
available.
For crisp endorelations ,
sym()
and asy()
give
the symmetric and asymmetric parts of , defined as the
intersection of
with its transpose, or, respectively, with its
codual (the complement of its transpose).
The summary()
method applies all predicates available
and returns a logical vector with the corresponding results.
S. A. Clark (1990), A folk meta-theorem in the foundations of utility theory. Mathematical Social Sciences, 19/3, 253–267. doi:10.1016/0165-4896(90)90065-F.
S. Ovchinnikov (1991), Similarity relations, fuzzy partitions, and fuzzy orderings. Fuzzy Sets and Systems, 40/1, 107–126. doi:10.1016/0165-0114(91)90048-U.
relation_incidence()
for obtaining incidences;
relation_domain()
for determining domain, arity, and
size;
relation_graph()
for determining the graph of a relation;
relation_charfun()
for determining the characteristic
function;
predicates for available predicate functions; and
algebra for further operations defined on relations.
require("sets") # set(), tuple() etc. ## A relation created by specifying the graph: R <- relation(graph = data.frame(A = c(1, 1:3), B = c(2:4, 4))) relation_incidence(R) ## extract domain relation_domain(R) ## extract graph relation_graph(R) ## both ("a pair of domain and graph" ...) as.tuple(R) ## (Almost) the same using the set specification ## (the domain labels are missing). R <- relation(graph = set(tuple(1,2), tuple(1,3), tuple(2,4), tuple(3,4))) ## equivalent to: ## relation(graph = list(c(1,2), c(1,3), c(2,4), c(3,4))) relation_incidence(R) ## Explicitly specifying the domain: R <- relation(domain = list(A = letters[1:3], B = LETTERS[1:4]), graph = set(tuple("a","B"), tuple("a","C"), tuple("b","D"), tuple("c","D"))) relation_incidence(R) ## Domains can be composed of arbitrary R objects: R <- relation(domain = set(c, "test"), graph = set(tuple(c, c), tuple(c, "test"))) relation_incidence(R) ## Characteristic function ("a divides b"): R <- relation(domain = list(1 : 10, 1 : 10), charfun = function(a, b) b %% a == 0) relation_incidence(R) ## R is a partial order: plot the Hasse diagram provided that ## Rgraphviz is available: if(require("Rgraphviz")) plot(R) ## conversions and operators x <- matrix(0, 3, 3) R1 <- as.relation(row(x) >= col(x)) R2 <- as.relation(row(x) <= col(x)) R3 <- as.relation(row(x) < col(x)) relation_incidence(max(R1, R2)) relation_incidence(min(R1, R2)) R3 < R2 relation_incidence(R1 * R2) relation_incidence(! R1) relation_incidence(t(R2)) ### endorelation s <- set(pair("a","b"), pair("c","d")) relation_incidence(relation(graph = s)) relation_incidence(endorelation(graph = s))
require("sets") # set(), tuple() etc. ## A relation created by specifying the graph: R <- relation(graph = data.frame(A = c(1, 1:3), B = c(2:4, 4))) relation_incidence(R) ## extract domain relation_domain(R) ## extract graph relation_graph(R) ## both ("a pair of domain and graph" ...) as.tuple(R) ## (Almost) the same using the set specification ## (the domain labels are missing). R <- relation(graph = set(tuple(1,2), tuple(1,3), tuple(2,4), tuple(3,4))) ## equivalent to: ## relation(graph = list(c(1,2), c(1,3), c(2,4), c(3,4))) relation_incidence(R) ## Explicitly specifying the domain: R <- relation(domain = list(A = letters[1:3], B = LETTERS[1:4]), graph = set(tuple("a","B"), tuple("a","C"), tuple("b","D"), tuple("c","D"))) relation_incidence(R) ## Domains can be composed of arbitrary R objects: R <- relation(domain = set(c, "test"), graph = set(tuple(c, c), tuple(c, "test"))) relation_incidence(R) ## Characteristic function ("a divides b"): R <- relation(domain = list(1 : 10, 1 : 10), charfun = function(a, b) b %% a == 0) relation_incidence(R) ## R is a partial order: plot the Hasse diagram provided that ## Rgraphviz is available: if(require("Rgraphviz")) plot(R) ## conversions and operators x <- matrix(0, 3, 3) R1 <- as.relation(row(x) >= col(x)) R2 <- as.relation(row(x) <= col(x)) R3 <- as.relation(row(x) < col(x)) relation_incidence(max(R1, R2)) relation_incidence(min(R1, R2)) R3 < R2 relation_incidence(R1 * R2) relation_incidence(! R1) relation_incidence(t(R2)) ### endorelation s <- set(pair("a","b"), pair("c","d")) relation_incidence(relation(graph = s)) relation_incidence(endorelation(graph = s))
Compute scores for the tuples of (ensembles of) endorelations.
## S3 method for class 'relation' relation_scores(x, method = c("ranks", "Barthelemy/Monjardet", "Borda", "Kendall", "Wei", "differential", "Copeland"), normalize = FALSE, ...) ## S3 method for class 'relation_ensemble' relation_scores(x, method = c("Borda", "Kendall", "differential", "Copeland"), normalize = FALSE, weights = 1, ...)
## S3 method for class 'relation' relation_scores(x, method = c("ranks", "Barthelemy/Monjardet", "Borda", "Kendall", "Wei", "differential", "Copeland"), normalize = FALSE, ...) ## S3 method for class 'relation_ensemble' relation_scores(x, method = c("Borda", "Kendall", "differential", "Copeland"), normalize = FALSE, weights = 1, ...)
x |
an object inheriting from class |
method |
character string indicating the method (see Details). |
normalize |
logical indicating whether the score vector should be normalized to sum up to 1. |
weights |
Numeric vector of weights used in incidence aggregation, recycled as needed. |
... |
further arguments to be passed to methods. |
In the following, consider an endorelation on
objects.
Let the in-degree
and out-degree
of an object
be defined as the numbers of objects
such
that
and, respectively,
, and let
be the differential of
(see Regenwetter
and Rykhlevskaia (2004)). Note that
and
are given by
the column sums and row sums of the incidence matrix of
. If
is a preference relation with a
interpretation,
is the difference between the numbers of objects dominated
by
(i.e.,
) and dominating
(i.e.,
),
as “ties” cancel out.
relation_score()
is generic with methods for relations and
relation ensembles. Available built-in score methods for the
relation method are as follows:
"ranks"
generalized ranks. A linear transformation of
the differential to the range from 1 to
. An
additional argument
decreasing
can be used to specify the
order of the ranks. By default, or if decreasing
is true,
objects are ranked according to decreasing differential
(“from the largest to the smallest” in the
preference context) using
.
Otherwise, if
decreasing
is false, objects are ranked via
(“from the smallest to the largest”).
See Regenwetter and Rykhlevskaia (2004) for more details on
generalized ranks.
"Barthelemy/Monjardet"
,
where
and
are the numbers of objects
such that
, and
and not
,
respectively. If
is a
preference relation, we
get the number of dominated objects plus half the number of the
equivalent objects minus 1 (the “usual” average ranks minus
one if the relation is complete). See Barthélemy and Monjardet
(1981).
"Borda"
, "Kendall"
the out-degrees. See Borda (1770) and Kendall (1955).
"Wei"
the eigenvector corresponding to the greatest
eigenvalue of the incidence matrix of the complement of .
See Wei (1952).
"differential"
, "Copeland"
the differentials, equivalent to the negative net flow of Bouyssou (1992), and also to the Copeland scores.
For relation ensembles, currently only
"differential"
/"Copeland"
and
"Borda"
/"Kendall"
are implemented. They are computed on
the aggregated incidences of the ensembles' relations.
Definitions of scores for “preference relations” are
somewhat ambiguous because
can encode
or
(or strict variants thereof) relationships (and all such variants are
used in the literature). Package relations generally assumes a
encoding, and that scores in the strict sense should
increase with preference (the most preferred get the highest scores)
whereas ranks decrease with preference (the most preferred get the
lowest ranks).
A vector of scores, with names taken from the relation domain labels.
J.-P. Barthélemy and B. Monjardet (1981), The median procedure in cluster analysis and social choice theory. Mathematical Social Sciences, 1, 235–267. doi:10.1016/0165-4896(81)90041-X.
J. C. Borda (1781), Mémoire sur les élections au scrutin. Histoire de l'Académie Royale des Sciences.
D. Bouyssou (1992), Ranking methods based on valued preference relations: A characterization of the net flow network. European Journal of Operational Research, 60, 61–67. doi:10.1016/0377-2217(92)90333-5.
M. Kendall (1955), Further contributions to the theory of paired comparisons. Biometrics, 11, 43–62. doi:10.2307/3001479.
M. Regenwetter and E. Rykhlevskaia (2004), On the (numerical) ranking associated with any finite binary relation. Journal of Mathematical Psychology, 48, 239–246. doi:10.1016/j.jmp.2004.03.003.
T. H. Wei (1952). The algebraic foundation of ranking theory. Unpublished thesis, Cambridge University.
## Example taken from Cook and Cress (1992, p.74) I <- matrix(c(0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0), ncol = 5, byrow = TRUE) R <- relation(domain = letters[1:5], incidence = I) ## Note that this is a "preference matrix", so take complement: R <- !R ## Compare Kendall and Wei scores cbind( Kendall = relation_scores(R, method = "Kendall", normalize = TRUE), Wei = relation_scores(R, method = "Wei", normalize = TRUE) ) ## Example taken from Cook and Cress (1992, p.136) ## Note that the results indicated for the Copeland scores have ## (erroneously?) been computed from the *unweighted* votes. ## Also, they report the votes as strict preferences, so we ## create the dual relations. D <- letters[1:5] X <- as.relation(ordered(D, levels = c("b", "c", "a", "d", "e"))) Y <- as.relation(ordered(D, levels = c("d", "a", "e", "c", "b"))) Z <- as.relation(ordered(D, levels = c("e", "c", "b", "a", "d"))) E <- relation_ensemble(X, Y, Z) relation_scores(E, "Copeland") relation_scores(E, "Borda", weights = c(4, 3, 2))
## Example taken from Cook and Cress (1992, p.74) I <- matrix(c(0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0), ncol = 5, byrow = TRUE) R <- relation(domain = letters[1:5], incidence = I) ## Note that this is a "preference matrix", so take complement: R <- !R ## Compare Kendall and Wei scores cbind( Kendall = relation_scores(R, method = "Kendall", normalize = TRUE), Wei = relation_scores(R, method = "Wei", normalize = TRUE) ) ## Example taken from Cook and Cress (1992, p.136) ## Note that the results indicated for the Copeland scores have ## (erroneously?) been computed from the *unweighted* votes. ## Also, they report the votes as strict preferences, so we ## create the dual relations. D <- letters[1:5] X <- as.relation(ordered(D, levels = c("b", "c", "a", "d", "e"))) Y <- as.relation(ordered(D, levels = c("d", "a", "e", "c", "b"))) Z <- as.relation(ordered(D, levels = c("e", "c", "b", "a", "d"))) E <- relation_ensemble(X, Y, Z) relation_scores(E, "Copeland") relation_scores(E, "Borda", weights = c(4, 3, 2))
Modify relations by (re)setting their domain, graph, or incidences.
relation_domain(x) <- value relation_domain_names(x) <- value relation_graph(x) <- value relation_incidence(x) <- value
relation_domain(x) <- value relation_domain_names(x) <- value relation_graph(x) <- value relation_incidence(x) <- value
x |
an R object inheriting from class |
value |
for setting the domain, a tuple (or list) as long as the
arity of the relation For setting the graph, either a set of tuples of equal lengths
(arity of the relation) or a data frame or something coercible to
this, with the values of the components of the given tuples (rows)
always elements of the corresponding elements of the domain of
For setting incidences, a numeric array with values in the unit
interval or a logical array, with dimension the size of the relation
For setting the domain names, a character vector as long as the
arity of the relation |
relation_domain()
for getting the domain of a relation;
relation_domain_names()
for getting the domain names;
relation_graph()
for getting the graph;
relation_incidence()
for getting the incidences;
relation()
for basic information.
R <- as.relation(1 : 3) print(R) relation_domain(R) ## tuple format: require("sets") # set(), pair() etc. relation_domain(R) <- pair(X = set("a","b","c"), Y = set("A","B","C")) relation_domain(R) ## the same in list format: relation_domain(R) <- list(X = letters[1:3], Y = LETTERS[1:3]) relation_domain(R) relation_domain_names(R) <- c("XX","YY") relation_domain_names(R) relation_incidence(R) relation_incidence(R) <- diag(1, 3, 3) relation_incidence(R) relation_graph(R) ## set format: relation_graph(R) <- set(pair("a","B"), pair("a","C"), pair("b","C")) relation_graph(R) ## the same in data frame format: relation_graph(R) <- data.frame(c("a", "a", "b"), c("B", "C", "C"), stringsAsFactors = FALSE) relation_graph(R)
R <- as.relation(1 : 3) print(R) relation_domain(R) ## tuple format: require("sets") # set(), pair() etc. relation_domain(R) <- pair(X = set("a","b","c"), Y = set("A","B","C")) relation_domain(R) ## the same in list format: relation_domain(R) <- list(X = letters[1:3], Y = LETTERS[1:3]) relation_domain(R) relation_domain_names(R) <- c("XX","YY") relation_domain_names(R) relation_incidence(R) relation_incidence(R) <- diag(1, 3, 3) relation_incidence(R) relation_graph(R) ## set format: relation_graph(R) <- set(pair("a","B"), pair("a","C"), pair("b","C")) relation_graph(R) ## the same in data frame format: relation_graph(R) <- data.frame(c("a", "a", "b"), c("B", "C", "C"), stringsAsFactors = FALSE) relation_graph(R)
SVM_Benchmarking_Classification
and
SVM_Benchmarking_Regression
represent the
results of a benchmark study (Meyer, Leisch and Hornik, 2003)
comparing Support Vector Machines to
other predictive methods on real and artificial data sets
involving classification and regression
methods, respectively.
In addition,
SVM_Benchmarking_Classification_Consensus
and SVM_Benchmarking_Regression_Consensus
provide consensus rankings derived from these data.
data("SVM_Benchmarking_Classification") data("SVM_Benchmarking_Regression") data("SVM_Benchmarking_Classification_Consensus") data("SVM_Benchmarking_Regression_Consensus")
data("SVM_Benchmarking_Classification") data("SVM_Benchmarking_Regression") data("SVM_Benchmarking_Classification_Consensus") data("SVM_Benchmarking_Regression_Consensus")
SVM_Benchmarking_Classification
(SVM_Benchmarking_Regression
) is an ensemble of 21 (12)
relations representing pairwise comparisons of 17 classification (10
regression) methods on 21 (12) data sets. Each relation of the
ensemble summarizes the results for a particular data set. The
relations are reflexive endorelations on the set of methods
employed, with a pair of distinct methods contained in a
relation iff both delivered results on the corresponding data set and
did not perform significantly better than
at the 5%
level. Since some methods failed on some data sets, the relations are
not guaranteed to be complete or transitive. See Meyer et al. (2003)
for details on the experimental design of the benchmark study, and
Hornik and Meyer (2007) for the pairwise comparisons.
The corresponding consensus objects are lists of ensembles of
consensus relations fitted to the benchmark results.
For each of the following three endorelation families:
SD/L
(“linear orders”),
SD/O
(“partial orders”), and SD/W
(“weak orders”), all possible consensus
relations have been computed (see relation_consensus
).
For both classification and regression,
the three relation ensembles obtained are provided as a named list of
length 3. See Hornik & Meyer (2007) for details on the meta-analysis.
D. Meyer, F. Leisch, and K. Hornik (2003), The support vector machine under test. Neurocomputing, 55, 169–186. doi:10.1016/S0925-2312(03)00431-4.
K. Hornik and D. Meyer (2007), Deriving consensus rankings from benchmarking experiments. In R. Decker and H.-J. Lenz, Advances in Data Analysis. Studies in Classification, Data Analysis, and Knowledge Organization. Springer-Verlag: Heidelberg, 163–170.
data("SVM_Benchmarking_Classification") ## 21 data sets names(SVM_Benchmarking_Classification) ## 17 methods relation_domain(SVM_Benchmarking_Classification) ## select weak orders weak_orders <- Filter(relation_is_weak_order, SVM_Benchmarking_Classification) ## only the artifical data sets yield weak orders names(weak_orders) ## visualize them using Hasse diagrams if(require("Rgraphviz")) plot(weak_orders) ## Same for regression: data("SVM_Benchmarking_Regression") ## 12 data sets names(SVM_Benchmarking_Regression) ## 10 methods relation_domain(SVM_Benchmarking_Regression) ## select weak orders weak_orders <- Filter(relation_is_weak_order, SVM_Benchmarking_Regression) ## only two of the artifical data sets yield weak orders names(weak_orders) ## visualize them using Hasse diagrams if(require("Rgraphviz")) plot(weak_orders) ## Consensus solutions: data("SVM_Benchmarking_Classification_Consensus") data("SVM_Benchmarking_Regression_Consensus") ## The solutions for the three families are not unique print(SVM_Benchmarking_Classification_Consensus) print(SVM_Benchmarking_Regression_Consensus) ## visualize the consensus weak orders classW <- SVM_Benchmarking_Classification_Consensus$W regrW <- SVM_Benchmarking_Regression_Consensus$W if(require("Rgraphviz")) { plot(classW) plot(regrW) } ## in tabular style: ranking <- function(x) rev(names(sort(relation_class_ids(x)))) sapply(classW, ranking) sapply(regrW, ranking) ## (prettier and more informative:) relation_classes(classW[[1L]])
data("SVM_Benchmarking_Classification") ## 21 data sets names(SVM_Benchmarking_Classification) ## 17 methods relation_domain(SVM_Benchmarking_Classification) ## select weak orders weak_orders <- Filter(relation_is_weak_order, SVM_Benchmarking_Classification) ## only the artifical data sets yield weak orders names(weak_orders) ## visualize them using Hasse diagrams if(require("Rgraphviz")) plot(weak_orders) ## Same for regression: data("SVM_Benchmarking_Regression") ## 12 data sets names(SVM_Benchmarking_Regression) ## 10 methods relation_domain(SVM_Benchmarking_Regression) ## select weak orders weak_orders <- Filter(relation_is_weak_order, SVM_Benchmarking_Regression) ## only two of the artifical data sets yield weak orders names(weak_orders) ## visualize them using Hasse diagrams if(require("Rgraphviz")) plot(weak_orders) ## Consensus solutions: data("SVM_Benchmarking_Classification_Consensus") data("SVM_Benchmarking_Regression_Consensus") ## The solutions for the three families are not unique print(SVM_Benchmarking_Classification_Consensus) print(SVM_Benchmarking_Regression_Consensus) ## visualize the consensus weak orders classW <- SVM_Benchmarking_Classification_Consensus$W regrW <- SVM_Benchmarking_Regression_Consensus$W if(require("Rgraphviz")) { plot(classW) plot(regrW) } ## in tabular style: ranking <- function(x) rev(names(sort(relation_class_ids(x)))) sapply(classW, ranking) sapply(regrW, ranking) ## (prettier and more informative:) relation_classes(classW[[1L]])
Returns a tabular representation of a relation like a “view” of a relational database table.
relation_table(x, memberships = TRUE)
relation_table(x, memberships = TRUE)
x |
an object inheriting from class |
memberships |
logical; should membership vector (if any) be added to the table? |
An object of class relation_table
, inheriting from class
data.frame
.
R <- data.frame(Name = c("David", "John"), Age = c(33, 66), stringsAsFactors = FALSE) R <- as.relation(R) relation_table(R)
R <- data.frame(Name = c("David", "John"), Age = c(33, 66), stringsAsFactors = FALSE) R <- as.relation(R) relation_table(R)
Compute the left or right trace of an endorelation.
relation_trace(x, which)
relation_trace(x, which)
x |
an endorelation. |
which |
one of |
Let be a crisp endorelation. The left and right trace of
contain all pairs
for which
implies
for all
(left trace) or
implies
for all
(right trace), respectively. These are
the largest (in the natural ordering of relations with the same
domain) relations such that
or
,
respectively (where
denotes composition). In the fuzzy case,
the memberships of the traces can be defined as the infima over the
corresponding fuzzy membership implications. See Chapter 2.3 in Fodor
and Roubens (1994) for more information.
J. Fodor and M. Roubens (1994), Fuzzy Preference Modelling and Multicriteria Decision Support. Kluwer Academic Publishers: Dordrecht.
Carry out transformations between incidence matrices from endorelations and other codings.
transform_incidences(x, from = c("PO", "SO", "01", "-1+1"), to = c("PO", "SO", "01", "-1+1"))
transform_incidences(x, from = c("PO", "SO", "01", "-1+1"), to = c("PO", "SO", "01", "-1+1"))
x |
An incidence matrix from an endorelation. |
from , to
|
The coding scheme (see Details). |
In the following, we consider an incidence matrix with cells
of a relation
with tuples
.
For the "PO"
(“Preference Order”) coding,
is a 0/1 matrix, and
iff
. It follows in particular
that if both
and
are 0, the corresponding pair
is not contained in R, i.e.,
and
are unrelated.
For the "SO"
(“"Strict Order"”) coding,
is a 0/1 matrix with possible
NA
values. As for "PO"
, iff
, but at most one of
and
can
be 1. If both are missing (
NA
), and
are unrelated.
For the "01"
coding, is a matrix with values 0, 1, or
0.5. The coding is similar to
"SO"
, except that NA
is
represented by 0.5.
For the "-1+1"
coding, is a matrix with values -1, 0, or 1.
The coding is similar to
"SO"
, except that NA
is
represented by 0, and if not
.
require("sets") # set(), pair() etc. x <- relation(domain = c(1,2,3,4), graph = set(pair(1,2), pair(4,2), pair(1,3), pair(1,4), pair(3,2), pair(2,1))) inc <- relation_incidence(x) print(inc) transform_incidences(inc, to = "SO") transform_incidences(inc, to = "01") transform_incidences(inc, to = "-1+1") ## transformations should be loss-free: inc2 <- transform_incidences(inc, from = "PO", to = "-1+1") inc2 <- transform_incidences(inc2, from = "-1+1", to = "SO") inc2 <- transform_incidences(inc2, from = "SO", to = "01") inc2 <- transform_incidences(inc2, from = "01", to = "PO") stopifnot(identical(inc, inc2))
require("sets") # set(), pair() etc. x <- relation(domain = c(1,2,3,4), graph = set(pair(1,2), pair(4,2), pair(1,3), pair(1,4), pair(3,2), pair(2,1))) inc <- relation_incidence(x) print(inc) transform_incidences(inc, to = "SO") transform_incidences(inc, to = "01") transform_incidences(inc, to = "-1+1") ## transformations should be loss-free: inc2 <- transform_incidences(inc, from = "PO", to = "-1+1") inc2 <- transform_incidences(inc2, from = "-1+1", to = "SO") inc2 <- transform_incidences(inc2, from = "SO", to = "01") inc2 <- transform_incidences(inc2, from = "01", to = "PO") stopifnot(identical(inc, inc2))
Computes a measure of remoteness of a relation from a specified property.
relation_violations(x, property = c("complete", "match", "reflexive", "irreflexive", "coreflexive", "symmetric", "antisymmetric", "asymmetric", "transitive", "negatively_transitive", "Ferrers", "semitransitive", "trichotomous", "Euclidean"), tuples = FALSE, na.rm = FALSE)
relation_violations(x, property = c("complete", "match", "reflexive", "irreflexive", "coreflexive", "symmetric", "antisymmetric", "asymmetric", "transitive", "negatively_transitive", "Ferrers", "semitransitive", "trichotomous", "Euclidean"), tuples = FALSE, na.rm = FALSE)
x |
an endorelation. |
property |
a character string specifying one of the properties for which the number of violations can be computed. |
tuples |
a logical indicating whether to return the amount of violations (default), or the tuples for which the property is violated. |
na.rm |
a logical indicating whether to remove tuples for which the property information is not available (due to missing memberships). |
If tuples
is false (default), the amount of violations for the
specified property: for crisp relations, the minimum number of object
tuples involved in the definition of the property (e.g., singletons
for reflexivity, pairs for antisymmetry, and triples for transitivity)
that must be modified/added/removed to make the relation satisfy the
property.
If tuples
is true, a set of tuples of objects for which the
respective property is violated.
predicates for the definitions of the properties.
## partial order: R <- as.relation(1:3) relation_incidence(R) ## R clearly is transitive, but not symmetric: relation_violations(R, "transitive") relation_violations(R, "symmetric") ## Pairs for which symmetry is violated: relation_violations(R, "symmetric", TRUE) ## create a simple relation: require("sets") # set(), pair() etc. R <- relation(domain = letters[1:2], graph = set(pair("a","b"), pair("b","a"))) relation_incidence(R) ## R is clearly symmetric, but not antisymmetric: relation_violations(R, "symmetric") relation_violations(R, "antisymmetric")
## partial order: R <- as.relation(1:3) relation_incidence(R) ## R clearly is transitive, but not symmetric: relation_violations(R, "transitive") relation_violations(R, "symmetric") ## Pairs for which symmetry is violated: relation_violations(R, "symmetric", TRUE) ## create a simple relation: require("sets") # set(), pair() etc. R <- relation(domain = letters[1:2], graph = set(pair("a","b"), pair("b","a"))) relation_incidence(R) ## R is clearly symmetric, but not antisymmetric: relation_violations(R, "symmetric") relation_violations(R, "antisymmetric")