Package 'OSDR'

Title: Finds an Optimal System of Distinct Representatives
Description: Provides routines for finding an Optimal System of Distinct Representatives (OSDR), as defined by D.Gale (1968) <doi:10.1016/S0021-9800(68)80039-0>.
Authors: Massimo Cannas [aut, cre]
Maintainer: Massimo Cannas <[email protected]>
License: GPL-3
Version: 1.1.4
Built: 2024-11-28 06:27:18 UTC
Source: CRAN

Help Index


Finds an Optimal System of Distinct Representatives

Description

Provides routines for finding an Optimal System of Distinct Representatives (OSDR), as defined by D.Gale (1968) <doi:10.1016/S0021-9800(68)80039-0>.

Details

Package: OSDR
Type: Package
Title: Finds an Optimal System of Distinct Representatives
Version: 1.1.4
Date: 2022-03-07
Authors@R: person("Massimo", "Cannas", role = c("aut", "cre"), email = "[email protected]")
Author: Massimo Cannas [aut, cre]
Maintainer: Massimo Cannas <[email protected]>
Description: Provides routines for finding an Optimal System of Distinct Representatives (OSDR), as defined by D.Gale (1968) <doi:10.1016/S0021-9800(68)80039-0>.
License: GPL-3
URL: https://cran.r-project.org/package=OSDR
NeedsCompilation: no
Packaged: 2022-03-07 15:53:48 UTC; massimo
Repository: CRAN
Date/Publication: 2022-03-08 13:30:05 UTC

Index of help topics:

OSDR                    Finds an Optimal System of Distinct
                        Representatives from a family of subsets.
OSDR-package            Finds an Optimal System of Distinct
                        Representatives
exdata                  Executive dataset (subset of)
listmat                 From cost matrix to list of feasible
                        assignments.
matlist                 From list of feasible assignments to cost
                        matrix.

The aim of this small package is to provide a function solving the assignment problem on an ordered set. This matching problem can be viewed in two ways. First, as the original ordered assignment problem, where a set of jobs, ordered by importance, must be filled by suitable applicants in the best possible way. Second, as an ordered matching problem where treated unit should be matched in order of importance with suitable controls. Suitability can be obtained by a ”caliper”: units within the caliper having zero matching cost and units outside the caliper having infinity cost. The main function OSDR exploits an algorithm suggested by I.Anderson to find an order optimal matching, as defined by D.Gale (see OSDR for details). The package includes some utilities and examples (both combinatorial and statistically oriented) to illustrate the use of OSDR.

Author(s)

Massimo Cannas [aut, cre]

Maintainer: Massimo Cannas <[email protected]>

References

Gale, D. (1968) Optimal matching in an ordered set: an application of matroid theory. Journal of Combinatorial Theory 4, pp. 176-180.

Anderson, I. (1989) A first course in Combinatorial Mathematics. Oxford University Press.

Rosenbaum, P. R. (1989). Optimal matching for observational studies. Journal of the American Statistical Association 84(408): pp. 1024-1032.

Examples

#See OSDR help for the examples.

Executive dataset (subset of)

Description

Contains executives data for an italian firm operating in the energy sector (year 2015). The complete dataset contains 302 rows (10 firms) and 180 columns (it will be released in next package version).

Usage

data("exdata")

Format

A data frame with 28 observations on the following 8 variables.

firm

a character vector

position

a numeric vector

education

a numeric vector

year_born

a numeric vector

contract

a numeric vector

part_fulltime

a numeric vector

seniority

a numeric vector

sex

a numeric vector

Details

position

is coded: 4 = top manager, 3 = medium/first line manager, 2 = supervisor

sex

is coded: 0 = M, 1 = F

education

is coded: Post-graduate = 5, Graduate = 4, High school = 3

contract type

is coded: fixed term = 4; permanent = 3

seniority

is measured by number of years in the current position

Examples

# load executive data
data(exdata)

# case study: matched samples for comparing women and men executives

table(exdata$sex)
table(exdata$position,exdata$sex)
# There are more women and more in apical position. 

# A complete matching is not possible for several choices of the caliper.
# Gap differences tend to be higher for higher ranks

# e.g. Lynn and Thompson(1997), Above the glass ceiling? A comparison of
# matched Samples of Men and Women Executives. J. of Appl. Psych. 82(3)

# so we would give higher matching priority to women in higher position.
# We can use OSDR to find a minimum cost matching 
# performing matching by decreasing hierarchical position.

From cost matrix to list of feasible assignments.

Description

Transforms the cost matrix of the assignment problem in the corresponding list of suitable applicants for each job.

Usage

listmat(x)

Arguments

x

A cost matrix where entry (i,j)(i,j) is the cost of assigning job i to applicant j.

Details

In statistical matching problems the input is usually the cost matrix while in assignment problems is the list of assignable elements. Functions matlist and listmat go back and forth these two representations of the input.

Value

A list of suitable applicants. The ithi^{th} element of the list contains the suitable applicants for job i. These are all applicants with finite assignment cost. Note that if the cost matrix contains finite non zero values there is a loss of information in the transformation.

Author(s)

Massimo Cannas <[email protected]>

See Also

See matlist, the reverse function.

Examples

# a list of feasible applicants for five jobs	
M1<-c("A","B","C")
M2<-c("A","C")
M3<-c("B")
M4<-c("A","C")
M5<-c("A","D")
M  <-list(M1,M2,M3,M4,M5)
M

# list --> cost matrix
m <- listmat(M)

# cost matrix --> list
l <- matlist(m)

From list of feasible assignments to cost matrix.

Description

Transforms the list of suitable applicants into the equivalent cost matrix.

Usage

matlist(x)

Arguments

x

An ordered list of applicants for a set of jobs. The ithi^{th} element of the list contains applicants suitable for job i. Equivalently, an ordered list of controls where the ithi^{th} element of the list contains controls matchable with the ithi^{th} treated unit.

Details

In statistical matching problems the input is usually the cost matrix while in assignment problems is the list of assignable elements. Functions matlist and listmat go back and forth these two representations of the input.

Value

The cost matrix of the assignment problem. The (i,j)(i,j) entry of the matrix is zero if the jthj^{th} applicant is assignable to the ithi^{th} job and \infty otherwise. Equivalently, the (i,j)(i,j) entry of the matrix is zero if the jthj^{th} control unit can be matched to the ithi^{th} treated unit and \infty otherwise.

Author(s)

Massimo Cannas <[email protected]>

See Also

See also the reverse function listmat.

Examples

# a list of feasible applicants for five jobs	
M1<-c("A","B","C")
M2<-c("A","C")
M3<-c("B")
M4<-c("A","C")
M5<-c("A","D")
M  <-list(M1,M2,M3,M4,M5)
M

# the corresponding cost matrix
m<-listmat(M)

# back to the list
l<-matlist(m)

Finds an Optimal System of Distinct Representatives from a family of subsets.

Description

The function finds an Optimal Set of Distinct Representatives (OSDR) from an ordered family of subsets. Optimality is order-wise in the sense specified by Gale: any other SDR cannot be larger and its elements cannot be chosen from more important sets (see details).

Usage

OSDR(M)

Arguments

M

An ordered family of subsets. In assignment problems the list specifying the feasible applicants for each job. In matching problems the list specifying the controls matchable with each treated unit. The list is assumed ordered by decreasing assignment/matching priority.

Details

Consider a set of jobs and a set of suitable applicants for each job. It was shown by D.Gale that there exists an optimally assignable subset of the jobs in the sense that if a1,ak{a_1, \ldots a_k} is optimal and b1,bh{b_1, \ldots b_h} is another assignable set of jobs then: a) khk \ge h and b) aibia_i \le b_i for all ii. The latter means that job aia_i has as higher a priority as job bib_i. From a graph perspective, if JJ is the set of jobs and AA is the set of applicants the OSDR is an order wise maximum size matching of the bipartite graph with vertex set JunionAJ union A.

Function OSDR finds the optimal matching using an algorithmic proof of Hall's theorem due to logician D.J. Shoesmith. The algorithm assigns greedily until possible and correct backward when necessary. A message is given when a correction is attempted (when it fails another message warns that Hall's condition is not satisfied).

Value

A list containing three elements: the OSDR (a zero in position i indicates impossibility of assigning job i), the index of optimally assignable jobs and the index of unassignable jobs. If the latter is not empty M does not meet Hall's condition, i.e., a complete SDR is not possible.

Note

In statistical matching problems the sets JJ and AA are the sets of treated and control units and it is usually required to find the minimum covariate distance matching of treated versus control units. When a complete matching does not exist it can be convenient to find either an order optimal matching or a minimum cost matching on the OSDR subset (see the gender gap example).

Author(s)

Massimo Cannas <[email protected]>

References

Gale, D. (1968) Optimal matching in an ordered set: an application of matroid theory. Journal of Combinatorial Theory 4: 176-180.

Anderson, I. (1989) A first course in Combinatorial Mathematics. Oxford University Press.

See Also

See also matlist, listmat

Examples

### example 1
# M is the list of suitable applicants for five jobs
M1 <- c("A" , "B")
M2 <- c("B" , "C")
M3 <- c("B")
M4 <- c("A" , "C")
M5 <- c("B" , "C" , "D")
M  <- list(M1 , M2 , M3 , M4 , M5)

OSDR(M)

# $OSDR
# [1] "A" "C" "B" "0" "D"
# $matched
# [1] 1 2 3 5
# $unmatched
# [1] 4

# job 4 unmatched so Hall's condition is not satisfied: it's impossible to fill all the jobs
# note that there are (order-\emph{suboptimal}) assignments of the same length of the optimal: 
# eg: 0CBAD , BC0AD

#### example 2: sligthly modified: more than one order optimal matching
M1<-c("A","B","C")
M2<-c("A","C")
M3<-c("B")
M4<-c("A","C")
M5<-c("A","D")
M  <-list(M1,M2,M3,M4,M5)

OSDR(M)

# note there are other order optimal matchings: ACB0D or CAB0D 
# note there are also other maximum size matchings (not order optimal):
# e.g. 0CBAD or BC0AD