| Title: | An Extended Mallows Model and Its Hierarchical Version for Ranked Data Aggregation |
|---|---|
| Description: | For multiple full or partial ranking lists, the R package 'ExtMallows' can (1) detect whether the input ranking lists are over-correlated, (2) use the Mallows model or the extended Mallows model to integrate the ranking lists, and (3) use the hierarchical extended Mallows model for rank integration if there are groups of over-correlated ranking lists. |
| Authors: | Han Li, Minxuan Xu, Jun S. Liu and Xiaodan Fan |
| Maintainer: | Han Li <[email protected]> |
| License: | GPL (>= 2) |
| Version: | 0.1.0 |
| Built: | 2026-05-26 09:06:16 UTC |
| Source: | https://github.com/cran/ExtMallows |
It calculates the p values that measure the correlation between each pair of rankings.
corrRankings(rankings)
| Argument | Description |
|---|---|
| rankings | An n by m data frame, with each column representing a ranking list, which ranks the items from the most preferred to the least preferred. For missing items, use 0 to denote them. |

| Value | Description |
|---|---|
| pair.pvalue | A symmetric matrix of p values, with the (i,j)-th element denoting the p value for the pair formed by the i-th and j-th rankings. |
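The layout expected for the rankings argument (one column per ranking list, 0 for missing items) can be built from lists of unequal length. The following is a minimal sketch in base R; the helper pad_rankings and its n_items parameter are hypothetical and not part of 'ExtMallows'.

```r
# Hypothetical helper (not part of 'ExtMallows'): pad each top-k list with 0
# so that all lists become columns of equal length.
pad_rankings <- function(lists, n_items = max(lengths(lists))) {
  out <- vapply(lists,
                function(r) c(r, rep(0, n_items - length(r))),
                numeric(n_items))
  colnames(out) <- paste0("V", seq_along(lists))
  out
}

# Three rankers order items labelled 1..5; the second and third stop early.
toy <- pad_rankings(list(c(3, 1, 4, 2, 5), c(1, 3, 2), c(3, 4)))
toy
```

Each column reads from the most preferred item downwards, and the trailing zeros mark positions the ranker left empty. Use as.data.frame(toy) where a data frame is required.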
Note that the input should contain at least 8 rankings. When constructing the samples of rescaled V distance for a given rank position, the number of samples should be at least 28, and the rankings that have items up to this position should account for at least 2/3 of the total number of rankings; otherwise the p value calculation stops at this position.
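The coverage condition in the note can be checked before calling corrRankings. The sketch below uses base R on a toy matrix and covers only the minimum of 8 rankings and the 2/3 coverage rule. It does not reproduce the 28-sample condition, because how those samples are formed is not described here. All variable names are illustrative.

```r
# Toy 6-by-9 rankings matrix: five full lists and four top-3 lists (0 = missing).
full <- c(1, 2, 3, 4, 5, 6)
top3 <- c(1, 2, 3, 0, 0, 0)
rankings <- cbind(matrix(full, 6, 5), matrix(top3, 6, 4))

m <- ncol(rankings)
depth <- colSums(rankings != 0)   # number of ranked items in each list

# For each rank position k, count the lists that have items up to position k.
covered <- sapply(seq_len(nrow(rankings)), function(k) sum(depth >= k))

enough_rankings <- m >= 8
enough_coverage <- 3 * covered >= 2 * m   # integer form of covered / m >= 2/3
data.frame(position = seq_len(nrow(rankings)), covered, enough_coverage)
```

In this toy input the coverage rule holds for positions 1 to 3 and fails from position 4 onwards, which is where, by the note, the p value calculation would stop.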
Han Li, Minxuan Xu, Jun S. Liu and Xiaodan Fan
An extended Mallows model for ranked data aggregation
    library(ExtMallows)
    data(simu3)
    pvalue <- corrRankings(rankings = simu3)

    # threshold the p values
    threshold <- 0.05
    pvalue.trunc <- ifelse(pvalue <= threshold, pvalue, 1)

    # plot the p values
    x <- y <- 1:ncol(pvalue)
    par(mfrow = c(1, 2))
    image(x, y, pvalue, xlab = NA, ylab = NA, sub = "rank coefficient")
    image(x, y, pvalue.trunc, xlab = NA, ylab = NA, sub = "rank coefficient < 0.05")
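Besides plotting, the thresholded matrix can be turned into an explicit list of ranking pairs. The sketch below assumes that small p values flag over-correlated pairs, as the thresholding in the example suggests, and uses a hand-made symmetric matrix as a stand-in for the output of corrRankings so that it runs without the package.

```r
# Stand-in for the matrix returned by corrRankings(): symmetric toy p values.
pvalue <- matrix(1, 5, 5)
pvalue[1, 2] <- pvalue[2, 1] <- 0.001
pvalue[1, 3] <- pvalue[3, 1] <- 0.020
pvalue[2, 3] <- pvalue[3, 2] <- 0.004
pvalue[4, 5] <- pvalue[5, 4] <- 0.300

threshold <- 0.05

# Keep each pair once by looking at the upper triangle only.
hits <- which(pvalue <= threshold & upper.tri(pvalue), arr.ind = TRUE)
flagged <- data.frame(i = hits[, "row"], j = hits[, "col"], p = pvalue[hits])
flagged[order(flagged$p), ]
```

Here rankings 1, 2 and 3 are flagged pairwise, while the pair (4, 5) stays above the threshold.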
It uses the extended Mallows model to aggregate multiple full/partial ranking lists.
EMM(rankings, initial.method, it.max)
| Argument | Description |
|---|---|
| rankings | An n by m matrix, with each column representing a ranking list, which ranks the items from the most preferred to the least preferred. For missing items, use 0 to denote them. |
| initial.method | The method for initializing the value of pi0, with four options: mean, median, geometric and random (the mean of three randomly sampled ranking lists). By default, initial.method="mean". |
| it.max | The maximum number of iterations. By default, it.max=20. |

| Value | Description |
|---|---|
| op.phi | optimal value of phi |
| op.omega | optimal value of omega |
| op.alpha | optimal value of alpha |
| op.pi0 | optimal value of pi0, ranking the items from the most preferred to the least preferred |
| max.logL | maximum value of log-likelihood |
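How the mean, median and geometric options of initial.method are computed is not stated here. One plausible reading, shown purely as an assumption, is that each item is scored by the mean, median or geometric mean of its rank positions across the lists, and the items are then sorted by that score. The helper init_pi0 is hypothetical and is not the package's code; the random option and partial rankings are not covered.

```r
# Toy full rankings: each column lists item labels from most to least preferred.
rankings <- cbind(c(2, 1, 3, 4), c(2, 3, 1, 4), c(1, 2, 4, 3))

# positions[i, j] = rank position of item i in list j
# (order() inverts a permutation of 1..n).
positions <- apply(rankings, 2, order)

# Hypothetical initializer: score items by an average of their positions.
init_pi0 <- function(positions, method = c("mean", "median", "geometric")) {
  method <- match.arg(method)
  score <- switch(method,
    mean      = rowMeans(positions),
    median    = apply(positions, 1, median),
    geometric = exp(rowMeans(log(positions))))
  order(score)  # item labels from most to least preferred
}

init_pi0(positions, "mean")
init_pi0(positions, "median")
init_pi0(positions, "geometric")
```

A starting value of this kind only seeds the iterations; the returned op.pi0 is the estimate after optimization.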
Han Li, Minxuan Xu, Jun S. Liu and Xiaodan Fan
An extended Mallows model for ranked data aggregation
    library(ExtMallows)
    data(simu1)
    res <- EMM(rankings = simu1, initial.method = "mean", it.max = 20)
    res$op.phi
    res$op.omega
    res$op.pi0
It uses the hierarchical extended Mallows model to aggregate multiple full/partial ranking lists.
HEMM(rankings, num.kappa, is.kappa.ranker, initial.method, it.max)
| Argument | Description |
|---|---|
| rankings | An n by m matrix, with each column representing a ranking list, which ranks the items from the most preferred to the least preferred. For missing items, use 0 to denote them. |
| num.kappa | The number of over-correlated ranking groups. |
| is.kappa.ranker | A list of over-correlated ranking groups, with the k-th element denoting the column numbers of the rankings that belong to the k-th group. |
| initial.method | The method for initializing the value of pi0, with four options: mean, median, geometric and random (the mean of three randomly sampled ranking lists). By default, initial.method="mean". |
| it.max | The maximum number of iterations. By default, it.max=20. |

| Value | Description |
|---|---|
| op.phi | optimal value of phi |
| op.phi1 | optimal value of phi1, the phi value in over-correlated ranking groups |
| op.omega | optimal value of omega |
| op.alpha | optimal value of alpha |
| op.pi0 | optimal value of pi0, ranking the items from the most preferred to the least preferred |
| op.kappa | optimal value of kappa, denoting the items from the most preferred to the least preferred |
| max.logL | maximum value of log-likelihood |
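The arguments num.kappa and is.kappa.ranker must be supplied by the user. One way to derive them from pairs of rankings judged over-correlated (for example, pairs with small p values from corrRankings) is to merge linked pairs into groups. Grouping by connected components is an assumption of this sketch, not a procedure described here, and pairs_to_groups is a hypothetical helper.

```r
# Hypothetical helper: merge flagged pairs into groups (connected components).
# 'pairs' is a two-column matrix of column indices, 'm' the number of rankings.
pairs_to_groups <- function(pairs, m) {
  label <- seq_len(m)
  for (r in seq_len(nrow(pairs))) {
    a <- label[pairs[r, 1]]
    b <- label[pairs[r, 2]]
    label[label == max(a, b)] <- min(a, b)  # relabel the larger group
  }
  groups <- split(seq_len(m), label)
  unname(groups[lengths(groups) > 1])       # drop rankings with no partner
}

flagged <- rbind(c(1, 2), c(2, 3), c(6, 8), c(7, 8))
is.kappa.ranker <- pairs_to_groups(flagged, m = 10)
num.kappa <- length(is.kappa.ranker)

is.kappa.ranker
num.kappa
# These could then be passed on, e.g.
# HEMM(rankings, num.kappa = num.kappa, is.kappa.ranker = is.kappa.ranker,
#      initial.method = "mean", it.max = 20)
```

With the toy pairs above, rankings 1 to 3 form one group and rankings 6 to 8 another, so num.kappa is 2.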
Han Li, Minxuan Xu, Jun S. Liu and Xiaodan Fan
An extended Mallows model for ranked data aggregation
    library(ExtMallows)

    data(simu3)
    res <- HEMM(rankings = simu3, num.kappa = 2, is.kappa.ranker = list(1:5, 6:10),
                initial.method = "mean", it.max = 20)
    res$op.phi
    res$op.phi1
    res$op.omega
    res$op.pi0

    data(NBArankings)
    res <- HEMM(rankings = NBArankings, num.kappa = 1, is.kappa.ranker = list(1:6),
                initial.method = "mean", it.max = 20)
    res$op.omega
    res$op.pi0
    res$op.kappa
It uses the Mallows model to aggregate multiple full/partial ranking lists.
MM(rankings, initial.method, it.max)
| Argument | Description |
|---|---|
| rankings | An n by m matrix, with each column representing a ranking list, which ranks the items from the most preferred to the least preferred. For missing items, use 0 to denote them. |
| initial.method | The method for initializing the value of pi0, with four options: mean, median, geometric and random (the mean of three randomly sampled ranking lists). By default, initial.method="mean". |
| it.max | The maximum number of iterations. By default, it.max=20. |

| Value | Description |
|---|---|
| op.phi | optimal value of phi |
| op.pi0 | optimal value of pi0, ranking the items from the most preferred to the least preferred |
| max.logL | maximum value of log-likelihood |
Han Li, Minxuan Xu, Jun S. Liu and Xiaodan Fan
Mallows, C. L. (1957). Non-null ranking models, Biometrika 44(1/2): 114-130.
    library(ExtMallows)
    data(simu1)
    res <- MM(rankings = simu1, initial.method = "mean", it.max = 20)
    res$op.phi
    res$op.pi0
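For orientation, a common textbook form of the Mallows model gives a full ranking a probability proportional to phi^d, where d is the Kendall tau distance to the central ranking pi0 and 0 < phi < 1. Whether MM uses exactly this parametrization is an assumption here. The functions below are illustrative base R, not the package's implementation, and handle full rankings only.

```r
# Kendall tau distance between two orderings of the same items:
# the number of item pairs that the two orderings rank in opposite order.
kendall_distance <- function(a, b) {
  pos <- match(a, b)                 # position in b of each item of a
  n <- length(pos)
  sum(outer(seq_len(n), seq_len(n), "<") & outer(pos, pos, ">"))
}

# Log-probability under the textbook Mallows phi model (assumed form).
# The normalizing constant is prod_{i=1}^{n} (1 - phi^i) / (1 - phi).
mallows_logprob <- function(ranking, pi0, phi) {
  n <- length(pi0)
  logZ <- sum(log((1 - phi^seq_len(n)) / (1 - phi)))
  kendall_distance(ranking, pi0) * log(phi) - logZ
}

pi0 <- c(1, 2, 3)
kendall_distance(c(3, 2, 1), pi0)      # reversed order: all 3 pairs disagree
mallows_logprob(c(1, 2, 3), pi0, phi = 0.5)
mallows_logprob(c(3, 2, 1), pi0, phi = 0.5)
```

Smaller values of phi concentrate the probability on rankings close to pi0; as phi approaches 1 the distribution approaches uniform.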
This example concerns the aggregation of multiple rankings of NBA teams, studied by Deng et al. (2014). They collected 34 rankings of the 30 NBA teams in the 2011-2012 season: 6 professional rankings and 28 amateur rankings. Missing items in the partial rankings are denoted by 0.
data("NBArankings")
A data frame with 30 observations on the following 34 variables.
V1: a factor with levels 76ers Bobcats Bucks Bulls Cavaliers Celtics Clippers Grizzlies Hawks Heat Hornets Jazz Kings Knicks Lakers Magic Mavericks Nets Nuggets Pacers Pistons Raptors Rockets Spurs Suns Thunder Timberwolves TrailBlazers Warriors Wizards
V2: a factor with levels 76ers Bobcats Bucks Bulls Cavaliers Celtics Clippers Grizzlies Hawks Heat Hornets Jazz Kings Knicks Lakers Magic Mavericks Nets Nuggets Pacers Pistons Raptors Rockets Spurs Suns Thunder Timberwolves TrailBlazers Warriors Wizards
V3: a factor with levels 76ers Bobcats Bucks Bulls Cavaliers Celtics Clippers Grizzlies Hawks Heat Hornets Jazz Kings Knicks Lakers Magic Mavericks Nets Nuggets Pacers Pistons Raptors Rockets Spurs Suns Thunder Timberwolves TrailBlazers Warriors Wizards
V4: a factor with levels 76ers Bobcats Bucks Bulls Cavaliers Celtics Clippers Grizzlies Hawks Heat Hornets Jazz Kings Knicks Lakers Magic Mavericks Nets Nuggets Pacers Pistons Raptors Rockets Spurs Suns Thunder Timberwolves TrailBlazers Warriors Wizards
V5: a factor with levels 76ers Bobcats Bucks Bulls Cavaliers Celtics Clippers Grizzlies Hawks Heat Hornets Jazz Kings Knicks Lakers Magic Mavericks Nets Nuggets Pacers Pistons Raptors Rockets Spurs Suns Thunder Timberwolves TrailBlazers Warriors Wizards
V6: a factor with levels 76ers Bobcats Bucks Bulls Cavaliers Celtics Clippers Grizzlies Hawks Heat Hornets Jazz Kings Knicks Lakers Magic Mavericks Nets Nuggets Pacers Pistons Raptors Rockets Spurs Suns Thunder Timberwolves TrailBlazers Warriors Wizards
V7: a factor with levels 0 Bulls Celtics Hawks Heat Lakers Pacers Spurs Thunder
V8: a factor with levels 0 Bulls Celtics Clippers Heat Knicks Lakers Spurs Thunder
V9: a factor with levels 0 Bulls Celtics Heat Knicks Lakers Mavericks Spurs Thunder
V10: a factor with levels 0 Bulls Celtics Clippers Heat Lakers Mavericks Spurs Thunder
V11: a factor with levels 0 Bulls Celtics Heat Knicks Lakers Nuggets Warriors Wizards
V12: a factor with levels 0 Bulls Celtics Clippers Heat Lakers Mavericks Spurs Thunder
V13: a factor with levels 0 Bulls Celtics Hornets Jazz Kings Lakers Magic Rockets
V14: a factor with levels 0 76ers Celtics Heat Kings Lakers Rockets Spurs Suns
V15: a factor with levels 0 Bulls Celtics Heat Lakers Mavericks Rockets Spurs Thunder
V16: a factor with levels 0 Celtics Hawks Heat Lakers Mavericks Raptors Spurs Thunder
V17: a factor with levels 0 76ers Celtics Heat Knicks Lakers Mavericks Nets Thunder
V18: a factor with levels 0 76ers Bulls Cavaliers Celtics Heat Lakers Mavericks Thunder
V19: a factor with levels 0 Bulls Heat Kings Lakers Rockets Spurs Suns Warriors
V20: a factor with levels 0 Bucks Celtics Heat Lakers Magic Mavericks Rockets Suns
V21: a factor with levels 0 Celtics Heat Kings Lakers Mavericks Spurs Suns Timberwolves
V22: a factor with levels 0 Celtics Heat Kings Lakers Spurs Suns Thunder Timberwolves
V23: a factor with levels 0 Bobcats Celtics Heat Lakers Mavericks Nuggets Spurs Suns
V24: a factor with levels 0 76ers Heat Knicks Lakers Pistons Rockets Spurs Wizards
V25: a factor with levels 0 76ers Celtics Hawks Heat Knicks Lakers Magic Thunder
V26: a factor with levels 0 Bulls Cavaliers Celtics Hawks Heat Knicks Lakers Rockets
V27: a factor with levels 0 76ers Clippers Lakers Magic Mavericks Pacers Raptors Warriors
V28: a factor with levels 0 76ers Bulls Celtics Heat Lakers Pistons Rockets Wizards
V29: a factor with levels 0 76ers Bulls Grizzlies Hawks Kings Knicks Nets Timberwolves
V30: a factor with levels 0 76ers Bucks Bulls Knicks Raptors Rockets Thunder Timberwolves
V31: a factor with levels 0 76ers Heat Lakers Magic Mavericks Pacers Pistons Suns
V32: a factor with levels 0 76ers Bulls Celtics Heat Knicks Lakers Magic Pacers
V33: a factor with levels 0 Clippers Heat Knicks Lakers Mavericks Nets Nuggets Wizards
V34: a factor with levels 0 Bulls Hawks Heat Jazz Knicks Nets Rockets Timberwolves
Deng, K., Han, S., Li, K. J. and Liu, J. S. (2014). Bayesian aggregation of order-based rank data, Journal of the American Statistical Association 109(507): 1023-1039.
    data(NBArankings)
    dim(NBArankings)
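Because the columns are factors in which the level 0 marks an unranked position, the depth of each ranking can be read off by counting the non-zero entries. The sketch below uses a small hand-made data frame with the same encoding so that it runs without the package; the same call could be applied to NBArankings after data(NBArankings).

```r
# Toy data frame with the same encoding: team names as factors, "0" = missing.
toy <- data.frame(
  V1 = factor(c("Heat", "Bulls", "Spurs", "Lakers")),
  V2 = factor(c("Bulls", "Heat", "0", "0")),
  V3 = factor(c("Spurs", "0", "0", "0"))
)

# Number of ranked items in each list.
depth <- vapply(toy, function(col) sum(as.character(col) != "0"), integer(1))
depth

# Columns that rank every item (full rankings).
names(depth)[depth == nrow(toy)]
```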
This data set is simulated as described in the Simulation Study 1 of the reference. It is a 30 by 6 data frame, representing 6 independent top-30 partial rankings.
data("simu1")
A data frame with 30 observations on the following 6 variables.
V1: a numeric vector
V2: a numeric vector
V3: a numeric vector
V4: a numeric vector
V5: a numeric vector
V6: a numeric vector
An extended Mallows model for ranked data aggregation
    data(simu1)
    dim(simu1)
This data set is simulated as described in the Simulation Study 2 of the reference. It is a 40 by 6 data frame, representing 6 independent top-40 partial rankings.
data("simu2")
A data frame with 40 observations on the following 6 variables.
V1: a numeric vector
V2: a numeric vector
V3: a numeric vector
V4: a numeric vector
V5: a numeric vector
V6: a numeric vector
An extended Mallows model for ranked data aggregation
    data(simu2)
    dim(simu2)
This data set is simulated as described in Simulation Study 3 of the reference. It is a 100 by 20 data frame, representing 20 full rankings. Columns 1-5 and columns 6-10 form two groups of highly correlated rankings.
data("simu3")
A data frame with 100 observations on the following 20 variables.
V1: a numeric vector
V2: a numeric vector
V3: a numeric vector
V4: a numeric vector
V5: a numeric vector
V6: a numeric vector
V7: a numeric vector
V8: a numeric vector
V9: a numeric vector
V10: a numeric vector
V11: a numeric vector
V12: a numeric vector
V13: a numeric vector
V14: a numeric vector
V15: a numeric vector
V16: a numeric vector
V17: a numeric vector
V18: a numeric vector
V19: a numeric vector
V20: a numeric vector
An extended Mallows model for ranked data aggregation
    data(simu3)
    dim(simu3)