| Title: | Computing the Extent of Agreement among Raters with Chance-Corrected Agreement Coefficient (CAC) |
|---|---|
| Description: | Contains a series of R functions for calculating various chance-corrected agreement coefficients (CAC) among 2 or more raters. Among the CAC coefficients covered are Cohen's kappa, Conger's kappa, Fleiss' kappa, Brennan-Prediger coefficient, Gwet's AC1/AC2 coefficients, and Krippendorff's alpha. Multiple sets of weights are proposed for computing weighted analyses. Also included in this package is Bangdiwala's B coefficient. |
| Authors: | Kilem L. Gwet [aut, cre] |
| Maintainer: | Kilem L. Gwet <[email protected]> |
| License: | GPL (>= 2) |
| Version: | 1.4 |
| Built: | 2026-05-28 06:39:20 UTC |
| Source: | https://github.com/cran/irrCAC |
This dataset contains rating data about 15 subjects or patients whose psychiatric condition was evaluated by 6 raters. Each column represents one condition and contains the number of raters who diagnosed the patient with that condition.
agree.cac3rd
The first column is of character type whereas the remaining columns contain integers.
Contains subject's gender (M or F)
Contains the number of raters who labeled the subject as having depression
Contains the number of raters who labeled the subject as having a personality disorder
Contains the number of raters who labeled the subject as having schizophrenia
Contains the number of raters who labeled the subject as having neurosis
Contains the number of raters who labeled the subject as having another psychiatric condition
K. Gwet, PhD.
This dataset contains a 10x10 contingency table.
agree.contingency
The first column is alphabetic whereas the remaining columns contain integers.
Contains the category name (C1, C2, ..., C10)
Column C1
Column C2
Column C3
Column C4
Column C5
Column C6
Column C7
Column C8
Column C9
Column C10
K. Gwet, PhD.
This dataset contains raw ratings of 15 subjects produced by 4 raters named RaterA, RaterB, RaterC and RaterD.
agreeCAC
All 5 columns are of alphabetic type.
Contains the subject's gender (M or F)
Contains one of the ratings a, b, or c produced by RaterA
Contains one of the ratings a, b, or c produced by RaterB
Contains one of the ratings a, b, or c produced by RaterC
Contains one of the ratings a, b, or c produced by RaterD
K. Gwet, PhD.
This dataset contains information describing the Altman scale for benchmarking chance-corrected agreement coefficients such as Gwet AC1/AC2, Kappa and many others.
altman
Each row of this dataset describes an interval and the interpretation of the magnitude it represents.
The interval lower bound
The interval upper bound
The interpretation
Altman, D.G. (1991). Practical Statistics for Medical Research. Chapman and Hall.
Computing Altman's Benchmark Scale Membership Probabilities
altman.bf(coeff, se, BenchDF = altman)
| coeff | A mandatory parameter representing the estimated value of an agreement coefficient. |
|---|---|
| se | A mandatory parameter representing the standard error of the agreement coefficient. |
| BenchDF | An optional 3-column data frame containing Altman's benchmark scale information. The 3 columns are the interval lower bound, the interval upper bound, and the interpretation. The default value is a small data file contained in the package and named altman.RData, which describes the official Altman scale intervals and their interpretation. |
A one-column matrix containing the membership probabilities (c.f. https://agreestat.com/papers/inter-rater%20reliability%20study%20design1.pdf)
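The idea behind membership probabilities can be illustrated with a short base-R sketch. It assumes that the agreement coefficient is approximately normally distributed around `coeff` with standard deviation `se`, and that its value is confined to [-1, 1]; the probability attached to each benchmark interval is then the normal probability of that interval divided by the normal probability of [-1, 1]. The function name `membership_prob` and the `bench` data frame (with the hypothetical column names `lb`, `ub` and `interp`, filled with the commonly cited Altman cut-offs) are assumptions made for this illustration. They are not taken from the package, whose own implementation and output layout may differ.

```r
# Hypothetical benchmark table: lower bound, upper bound, interpretation.
bench <- data.frame(
  lb     = c(0.8, 0.6, 0.4, 0.2, -1.0),
  ub     = c(1.0, 0.8, 0.6, 0.4,  0.2),
  interp = c("Very good", "Good", "Moderate", "Fair", "Poor")
)

# Illustrative helper (not part of irrCAC): probability that the coefficient
# falls into each interval, assuming a normal distribution truncated to [-1, 1].
membership_prob <- function(coeff, se, bench) {
  raw   <- pnorm((coeff - bench$lb) / se) - pnorm((coeff - bench$ub) / se)
  total <- pnorm((coeff + 1) / se) - pnorm((coeff - 1) / se)
  data.frame(interp = bench$interp, prob = raw / total)
}

res <- membership_prob(coeff = 0.65, se = 0.08, bench = bench)
print(res)
```

Because the hypothetical intervals cover [-1, 1] without gaps, the probabilities in `res$prob` add up to 1. Accumulating them from the top interval downward shows how far down the scale one must go before a chosen certainty level is reached.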
Bangdiwala B coefficient for 2 raters
bangdiwala.table(ratings, conflev = 0.95, N = Inf)
| ratings | A square table of ratings (assume no missing ratings). |
|---|---|
| conflev | An optional parameter that specifies the confidence level used for constructing confidence intervals. By default the function assumes the standard value of 95%. |
| N | An optional parameter representing the finite population size, if any. It is used to perform the finite population correction to the standard error. Its default value is infinity. |
A data frame containing the following 5 variables: coeff.name, coeff.val, coeff.se, coeff.ci, coeff.pval.
#The dataset "cont3x3abstractors" comes with this package. Analyze it as follows:
bangdiwala.table(cont3x3abstractors) #Yields Bangdiwala coefficient along with precision measures
Bcoeff <- bangdiwala.table(cont3x3abstractors)$coeff.val #Yields Bangdiwala's coefficient alone.
Bcoeff
q <- nrow(cont3x3abstractors) #Number of categories
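For readers who want to see what the point estimate measures, here is a minimal base-R sketch of Bangdiwala's B: the sum of the squared diagonal counts divided by the sum of the products of matching row and column totals. The helper name `bangdiwala_b` and the 3x3 table are hypothetical and serve only as an illustration. The sketch does not reproduce the standard error, confidence interval, or p-value returned by the package function.

```r
# Hypothetical 3x3 table: rows = rater 1, columns = rater 2.
tab <- matrix(c(10, 2, 0,
                 1, 8, 1,
                 0, 3, 5),
              nrow = 3, byrow = TRUE)

# Illustrative helper (not part of irrCAC): point estimate only.
bangdiwala_b <- function(tab) {
  tab <- as.matrix(tab)
  stopifnot(nrow(tab) == ncol(tab))   # the table must be square
  sum(diag(tab)^2) / sum(rowSums(tab) * colSums(tab))
}

b <- bangdiwala_b(tab)
b
```

A table with all counts on the diagonal yields B = 1, since each squared diagonal count then equals the product of its row and column totals.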
Bangdiwala B coefficient for 2 raters when input dataset is made up of 2 columns of raw data.
bangdiwala2RR.fn(fra.ratings.raw, conflev = 0.95, N = Inf)
| fra.ratings.raw | A data frame with 2 columns of raw ratings. |
|---|---|
| conflev | An optional parameter that specifies the confidence level used for constructing confidence intervals. By default the function assumes the standard value of 95%. |
| N | An optional parameter representing the finite population size, if any. It is used to perform the finite population correction to the standard error. Its default value is infinity. |
A data frame containing the following 9 variables: coeff.name, b1, b2, coeff.val, coeff.se, conf.int, p.value, n and the name of the weight used.
#The dataset cac.ben.gerry comes with this package. Analyze it as follows:
bangdi <- bangdiwala2RR.fn(cac.ben.gerry[,c(3,4)]) #using only the last 2 columns.
#The result will be the following:
c(bangdi$coeff.name,bangdi$pa,bangdi$pe,bangdi$coeff.val,bangdi$coeff.se,
bangdi$conf.int,bangdi$p.val,bangdi$tot.obs)
#1 Bangdiwala''s B 0.1322314 0.2066116 0.64 0.2158518 (0.159,1) 7.083e-03 11
bangdi$w.name
#1 Identity
Function for computing the Bipolar Weights
bipolar.weights(categ)
| categ | A mandatory parameter representing the vector of all possible ratings. |
|---|---|
A square matrix of bipolar weights to be used for calculating the weighted coefficients.
Brennan-Prediger's agreement coefficient among multiple raters (2, 3, +) when the input dataset is the distribution of raters by subject and category.
bp.coeff.dist(ratings, weights = "unweighted", categ = NULL, conflev = 0.95, N = Inf)
| ratings | An n x q matrix or data frame containing the distribution of raters by subject and category. Each cell (i,k) contains the number of raters who classified subject i into category k. |
|---|---|
| weights | An optional parameter that is either a string or a matrix. The string names one of the predefined weights and must take one of the values ("quadratic", "ordinal", "linear", "radical", "ratio", "circular", "bipolar"). If this parameter is a matrix, it must be a square q x q matrix, where q is the number of possible categories into which a subject can be classified. If some of the q possible categories are not used, it is strongly advised to specify the complete list of possible categories as a vector in the parameter categ. Otherwise, only the categories reported will be used. |
| categ | An optional parameter representing all categories available to raters during the experiment. This parameter may be useful if some categories were not used by any rater in spite of being available to the raters. |
| conflev | An optional parameter representing the confidence level associated with the confidence interval. Its default value is 0.95. |
| N | An optional parameter representing the population size (if any). It may be used to perform the finite population correction to the variance. Its default value is infinity. |
A vector containing the following information: pa (the percent agreement), pe (the percent chance agreement), coeff (the Brennan-Prediger coefficient), stderr (the standard error of the Brennan-Prediger coefficient), conf.int (the confidence interval of the Brennan-Prediger coefficient), p.value (the p-value of the Brennan-Prediger coefficient), coeff.name ("Brennan-Prediger").
Brennan, R.L., and Prediger, D. J. (1981). “Coefficient Kappa: some uses, misuses, and alternatives," Educational and Psychological Measurement, 41, 687-699.
#The dataset "distrib.6raters" comes with this package. It represents the distribution of 6 raters
#by subject and by category. Note that each row of this dataset sums to the number of raters, which
#is 6. You may analyze this dataset as follows:
distrib.6raters
bp.coeff.dist(distrib.6raters) #BP coefficient, precision measures, weights & list of categories
bp <- bp.coeff.dist(distrib.6raters)$coeff #Yields Brennan-Prediger coefficient alone.
bp
q <- ncol(distrib.6raters) #Number of categories
bp.coeff.dist(distrib.6raters,weights = quadratic.weights(1:q)) #Weighted BP with quadratic weights
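As a rough guide to what the unweighted point estimate represents, the sketch below computes it in base R from a distribution of raters by subject and category. It uses the usual definitions: the percent agreement is the average, over subjects rated by at least 2 raters, of the proportion of agreeing rater pairs, and the percent chance agreement is 1/q for q categories. The helper name `bp_from_dist` and the small data matrix are hypothetical. The sketch covers neither weights nor the precision measures that the package function reports.

```r
# Hypothetical distribution of 4 raters over 3 categories for 3 subjects.
dist <- matrix(c(4, 0, 0,
                 2, 2, 0,
                 1, 1, 2),
               nrow = 3, byrow = TRUE,
               dimnames = list(NULL, c("a", "b", "c")))

# Illustrative helper (not part of irrCAC): unweighted point estimate only.
bp_from_dist <- function(dist) {
  dist <- as.matrix(dist)
  q    <- ncol(dist)                    # number of categories
  r_i  <- rowSums(dist)                 # number of raters per subject
  keep <- r_i >= 2                      # agreement needs at least 2 raters
  d    <- dist[keep, , drop = FALSE]
  pa_i <- rowSums(d * (d - 1)) / (r_i[keep] * (r_i[keep] - 1))
  pa   <- mean(pa_i)                    # percent agreement
  pe   <- 1 / q                         # percent chance agreement
  c(pa = pa, pe = pe, coeff = (pa - pe) / (1 - pe))
}

bp_sketch <- bp_from_dist(dist)
bp_sketch
```

For this toy matrix the percent agreement is 0.5 and the chance agreement is 1/3, which gives a coefficient of 0.25.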
Brennan & Prediger's (BP) agreement coefficient for an arbitrary number of raters (2, 3, +) when the input data represent the raw ratings reported for each subject and each rater.
bp.coeff.raw(ratings, weights = "unweighted", categ.labels = NULL, conflev = 0.95, N = Inf)
| ratings | An n x r matrix or data frame of ratings where each column represents one rater and each row one subject. |
|---|---|
| weights | An optional parameter that is either a string or a matrix. The string names one of the predefined weights and must take one of the values ("quadratic", "ordinal", "linear", "radical", "ratio", "circular", "bipolar"). If this parameter is a matrix, it must be a square q x q matrix, where q is the number of possible categories into which a subject can be classified. If some of the q possible categories are not used, it is strongly advised to specify the complete list of possible categories as a vector in the parameter categ.labels. Otherwise, the program may not work. |
| categ.labels | An optional vector containing the list of all possible ratings. It may be useful when some of the possible ratings are not used by any rater, since they will still be used when calculating agreement coefficients. The default value is NULL, in which case only categories reported by the raters are used in the calculations. |
| conflev | An optional parameter representing the confidence level associated with the confidence interval. Its default value is 0.95. |
| N | An optional parameter representing the population size (if any). It may be used to perform the finite population correction to the variance. Its default value is infinity. |
A data list containing 3 objects: (1) a one-row data frame containing various statistics including the requested agreement coefficient, (2) the weight matrix used in the calculations if any, and (3) A vector of categories used in the analysis. These could be categories reported by the raters, or those available to the raters whether they used them or not. The output data frame contains the following variables: "coeff.name" (coefficient name), "pa" (the percent agreement), "pe" (the percent chance agreement), coeff.val (Brennan-Prediger coefficient estimate), "coeff.se" (standard error), "conf.int" (the confidence interval), "p.value"(Brennan-Prediger coefficient's p-value), "w.name"(the weights' identification).
Brennan, R.L., & Prediger, D. J. (1981). “Coefficient Kappa: some uses, misuses, and alternatives." Educational and Psychological Measurement, 41, 687-699.
#The dataset "cac.raw4raters" comes with this package. Analyze it as follows:
cac.raw4raters
bp.coeff.raw(cac.raw4raters) #BP coefficient, precision measures, weights & categories
bp.coeff.raw(cac.raw4raters)$est #Brennan-Prediger coefficient with precision measures
bp <- bp.coeff.raw(cac.raw4raters)$est$coeff.val #Yields Brennan-Prediger coefficient alone.
bp
bp.coeff.raw(cac.raw4raters, weights = "quadratic") #weighted Brennan-Prediger coefficient
Brennan-Prediger coefficient for 2 raters
bp2.table(ratings, weights = identity.weights(1:ncol(ratings)), conflev = 0.95, N = Inf)
| ratings | A square table of ratings (assume no missing ratings). |
|---|---|
| weights | An optional matrix that contains the weights used in the weighted analysis. By default, this parameter contains the identity weight matrix, which leads to the unweighted analysis. |
| conflev | An optional parameter that specifies the confidence level used for constructing confidence intervals. By default the function assumes the standard value of 95%. |
| N | An optional parameter representing the finite population size, if any. It is used to perform the finite population correction to the standard error. Its default value is infinity. |
A data frame containing the following 5 variables: coeff.name, coeff.val, coeff.se, coeff.ci, coeff.pval.
#The dataset "cont3x3abstractors" comes with this package. Analyze it as follows:
bp2.table(cont3x3abstractors) #Yields Brennan-Prediger's coefficient along with precision measures
bp <- bp2.table(cont3x3abstractors)$coeff.val #Yields Brennan-Prediger coefficient alone.
bp
q <- nrow(cont3x3abstractors) #Number of categories
bp2.table(cont3x3abstractors,weights = quadratic.weights(1:q)) #Weighted BP coefficient
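The role of the weight matrix can be seen in a small base-R sketch of the two-rater point estimate. It assumes the usual weighted definitions: the percent agreement is the weighted sum of the cell proportions, and the percent chance agreement is the sum of all weights divided by q squared. The helper name `bp_from_table`, the 3x3 table, and the hand-built quadratic weights `w_quad` (using the textbook formula for categories scored 1 to q) are assumptions for illustration. The package's own weight functions and precision measures are not reproduced here.

```r
# Hypothetical 3x3 table: rows = rater 1, columns = rater 2.
tab <- matrix(c(10, 2, 0,
                 1, 8, 1,
                 0, 3, 5),
              nrow = 3, byrow = TRUE)

# Illustrative helper (not part of irrCAC): point estimate only.
# The default identity matrix gives the unweighted coefficient.
bp_from_table <- function(tab, weights = diag(ncol(tab))) {
  tab <- as.matrix(tab)
  q   <- ncol(tab)
  p   <- tab / sum(tab)            # cell proportions
  pa  <- sum(weights * p)          # (weighted) percent agreement
  pe  <- sum(weights) / q^2        # (weighted) percent chance agreement
  c(pa = pa, pe = pe, coeff = (pa - pe) / (1 - pe))
}

bp_unw <- bp_from_table(tab)

# Textbook quadratic weights for categories scored 1..q (an assumption here).
q      <- ncol(tab)
w_quad <- 1 - outer(1:q, 1:q, "-")^2 / (q - 1)^2
bp_wtd <- bp_from_table(tab, weights = w_quad)

rbind(unweighted = bp_unw, quadratic = bp_wtd)
```

With quadratic weights, disagreements between adjacent categories receive partial credit, so both the percent agreement and the chance agreement rise relative to the unweighted analysis.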
This dataset contains ratings that 2 raters named Ben and Gerry assigned to 12 units distributed in 2 groups "G1" and "G2".
cac.ben.gerry
Each row of this dataset describes one unit: the group it belongs to, its unit number, and the ratings assigned to it by Ben and Gerry.
Group Name
Unit number
Ben's Ratings
Gerry's Ratings
The first 2 columns "Group" and "Units" play a descriptive role here and are not used by any function included in this package. One will typically use cac.ben.gerry[,c(3,4)] or cac.ben.gerry[,c("Ben","Gerry")] as the input dataset.
This dataset contains rating data in the form of a subject-level distribution of 4 raters by the category into which the subject was classified. A total of 4 raters had to classify 14 subjects into one of 5 categories "a", "b", "c", "d", and "e". This dataset is a different version of the more detailed "cac.raw.g1g2" dataset. While "cac.raw.g1g2" tells you the exact category into which each rater classified each subject, "cac.dist.g1g2" can only tell you how many raters classified a given subject into a particular category.
cac.dist.g1g2
This dataset contains ratings obtained from an experiment where 4 raters classified 14 subjects into 5 possible categories labeled as a, b, c, d, and e. None of the 4 raters scored all 14 units. Therefore, some missing ratings appear in each of the columns associated with the 4 raters.
Note that only the last 5 columns are to be used with the functions included in this package. The first 2 columns only play a descriptive role and are not used in any calculation.
This variable represents the group name.
This variable represents the unit number.
Number of raters who classified the subject represented by the row into category "a"
Number of raters who classified the subject represented by the row into category "b"
Number of raters who classified the subject represented by the row into category "c"
Number of raters who classified the subject represented by the row into category "d"
Number of raters who classified the subject represented by the row into category "e"
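The relationship between the raw and the distribution formats can be sketched in base R as follows. The example builds a small hypothetical data frame of raw ratings with missing values and counts, for each subject, how many raters chose each category. The names `raw`, `categ` and `raw_to_dist` are illustrative assumptions and are not objects or functions provided by the package.

```r
# Hypothetical raw ratings: one row per subject, one column per rater.
categ <- c("a", "b", "c", "d", "e")
raw <- data.frame(
  Rater1 = c("a", "b", NA,  "e"),
  Rater2 = c("a", "c", "d", "e"),
  Rater3 = c("a", NA,  "d", "e"),
  Rater4 = c("b", "b", "d", NA),
  stringsAsFactors = FALSE
)

# Illustrative helper (not part of irrCAC): count raters per category for
# each subject; missing ratings (NA) are simply not counted.
raw_to_dist <- function(raw, categ) {
  m      <- as.matrix(raw)
  counts <- t(apply(m, 1, function(x) {
    tabulate(match(x, categ), nbins = length(categ))
  }))
  colnames(counts) <- categ
  counts
}

dist_sketch <- raw_to_dist(raw, categ)
dist_sketch
```

Each row of `dist_sketch` sums to the number of raters who actually rated that subject, which is why the distribution format no longer identifies which rater gave which rating.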
This dataset summarizes the ratings assigned by 4 raters who classified 15 subjects into one of 3 categories named "a", "b", and "c".
cac.dist4cat
This dataset has 15 rows (for the 15 subjects) and 4 columns. Only the last 3 columns representing the categories into which subjects are classified are used in the calculations - unless the sub-group analysis is required.
This variable represents the subject number.
Category a
Category b
Category c
This dataset contains data from a reliability experiment where 4 raters identified as Rater1, Rater2, Rater3 and Rater4 scored 14 units on a 5-point alphabetical scale based on the values a, b, c, d and e. These 14 units are allocated to 2 groups named G1 and G2.
cac.raw.g1g2
This dataset contains ratings obtained from an experiment where 4 raters classified 14 subjects into 5 possible categories labeled as a, b, c, d, and e. None of the 4 raters scored all 14 units. Therefore, some missing ratings appear in each of the columns associated with the 4 raters.
Note that only the last 4 columns are to be used with the functions included in this package. The first 2 columns only play a descriptive role and are not used in any calculation.
This variable represents the group name.
This variable represents the unit number.
All ratings from rater 1
All ratings from rater 2
All ratings from rater 3
All ratings from rater 4
This dataset contains data from a reliability experiment where 4 raters scored 15 units on a 3-point alphabetic scale based on the values a, b, and c.
cac.raw.gender
This dataset contains ratings obtained from an experiment where 4 raters classified 15 subjects into 3 possible categories labeled as a, b, and c.
Note that only the last 4 columns are to be used with the functions included in this package. The first column only plays a descriptive role and is not to be used in any calculation.
This variable represents the unit number.
All ratings from rater 1
All ratings from rater 2
All ratings from rater 3
All ratings from rater 4
This dataset contains 12 rows and 2 columns. Each row represents a subject and the 2 columns labeled as "rater1" and "rater2" contain the ratings produced by the 2 raters.
cac.raw2raters
This dataset contains ratings obtained from an experiment where 2 raters classified 12 subjects into 3 categories A, B, and C. Two of the 12 subjects were rated by a single rater. Consequently, this dataset contains 2 missing ratings that are identified with the symbol <NA>.
All ratings from rater 1
All ratings from rater 2
Gwet, K.L. (2021) Handbook of Inter-Rater Reliability: Volume 1, 5th Edition. AgreeStat Analytics.
This dataset contains data from a reliability experiment where 4 raters scored 12 units on a 5-point numeric scale based on the values 1, 2, 3, 4 and 5.
cac.raw4raters
This dataset contains ratings obtained from an experiment where 4 raters classified 12 subjects into 5 possible categories labeled as 1, 2, 3, 4, and 5. None of the 4 raters scored all 12 units. Therefore, some missing ratings in the form of "NA" appear in each of the columns associated with the 4 raters.
Note that only the last 4 columns are to be used with the functions included in this package. The first column only plays a descriptive role and is not used in any calculation.
This variable represents the unit number.
All ratings from rater 1
All ratings from rater 2
All ratings from rater 3
All ratings from rater 4
Gwet, K.L. (2014) Handbook of Inter-Rater Reliability, 4th Edition, page #120. Advanced Analytics, LLC.
This dataset contains data from a reliability experiment where 5 observers scored 15 units on a 4-point numeric scale based on the values 0, 1, 2 and 3.
cac.raw5obser
This dataset has 15 rows (for the 15 subjects) and 6 columns. Only the last 5 columns associated with the 5 observers are used in the calculations. Of the 5 observers, only observer 3 scored all 15 units. Therefore, some missing ratings in the form of "NA" appear in the columns associated with the remaining 4 observers.
This variable represents the unit number.
All ratings from Observer 1
All ratings from Observer 2
All ratings from Observer 3
All ratings from Observer 4
All ratings from Observer 5
Gwet, K.L. (2014) Handbook of Inter-Rater Reliability, 4th Edition. Advanced Analytics, LLC. A larger version of this table can be found on page #125
Function for computing the Circular Weights
circular.weights(categ)
| categ | A mandatory parameter representing the vector of all possible ratings. |
|---|---|
A square matrix of circular weights to be used for calculating the weighted coefficients.
Conger's generalized kappa coefficient for an arbitrary number of raters (2, 3, +) when the input data represent the raw ratings reported for each subject and each rater.
conger.kappa.raw(ratings, weights = "unweighted", categ.labels = NULL, conflev = 0.95, N = Inf)
| ratings | An n x r matrix or data frame of ratings where each column represents one rater and each row one subject. |
|---|---|
| weights | An optional parameter that is either a string or a matrix. The string names one of the predefined weights and must take one of the values ("quadratic", "ordinal", "linear", "radical", "ratio", "circular", "bipolar"). If this parameter is a matrix, it must be a square q x q matrix, where q is the number of possible categories into which a subject can be classified. If some of the q possible categories are not used, it is strongly advised to specify the complete list of possible categories as a vector in the parameter categ.labels. Otherwise, the program may not work. |
| categ.labels | An optional vector containing the list of all possible ratings. It may be useful when some of the possible ratings are not used by any rater, since they will still be used when calculating agreement coefficients. The default value is NULL, in which case only categories reported by the raters are used in the calculations. |
| conflev | An optional parameter representing the confidence level associated with the confidence interval. Its default value is 0.95. |
| N | An optional parameter representing the population size (if any). It may be used to perform the finite population correction to the variance. Its default value is infinity. |
A data list containing 3 objects: (1) a one-row data frame containing various statistics including the requested agreement coefficient, (2) the weight matrix used in the calculations if any, and (3) A vector of categories used in the analysis. These could be categories reported by the raters, or those available to the raters whether they used them or not. The output data frame contains the following variables: "coeff.name" (coefficient name), "pa" (the percent agreement), "pe" (the percent chance agreement), coeff.val (Conger's Kappa estimate), "coeff.se" (standard error), "conf.int" (Conger Kappa's confidence interval), "p.value"(agreement coefficient's p-value), "w.name"(the weights' identification).
Conger, A. J. (1980), “Integration and Generalization of Kappas for Multiple Raters," Psychological Bulletin, 88, 322-328.
#The dataset "cac.raw4raters" comes with this package. Analyze it as follows:
cac.raw4raters
conger.kappa.raw(cac.raw4raters) #Conger's kappa, precision stats, weights & categories
conger.kappa.raw(cac.raw4raters)$est #Conger's kappa with precision measures
conger <- conger.kappa.raw(cac.raw4raters)$est$coeff.val #Yields Conger's kappa alone.
conger
conger.kappa.raw(cac.raw4raters, weights = "quadratic") #weighted Conger's kappa
This dataset contains pregnancy type data collected from 100 women who entered an Emergency Room with a positive pregnancy test and a second condition, which is either abdominal pain or vaginal bleeding. After reviewing their medical records, 2 reviewers (also referred to as abstractors) classified them into one of the following three pregnancy categories: Ectopic Pregnancy (Ectopic), Abnormal Intrauterine pregnancy (AIU) and Normal Intrauterine Pregnancy (NIU).
cont3x3abstractors
This dataset contains a 3x3 contingency table in which each row and each column corresponds to one of the three pregnancy categories.
Pregnancy Type. This variable is shown here for information only and is never used by any function in the irrCAC package.
Ectopic Pregnancy
Abnormal Intrauterine Pregnancy
Normal Intrauterine Pregnancy
Gwet, K.L. (2014). Handbook of Inter-Rater Reliability, 4th Edition. Advanced Analytics, LLC.
This dataset shows the distribution of 223 psychiatric patients by diagnosis category and by the method used to obtain the diagnosis. The first method named “Clinical Diagnosis" (also known as “Facility Diagnosis") is used in a service facility (e.g. public hospital, or a community unit) and does not rely on a rigorous application of research criteria. The second method known as “Research Diagnosis" is based on a strict application of research criteria. Column 1 contains the diagnosis categories into which patients are classified with Method 1. The first row on the other hand, shows categories into which patients are classified with Method 2.
cont4x4diagnosis
This dataset contains a 4x4 square table. The first column is never used in the calculations and only contains row names. Only the last 4 columns are used for computing agreement coefficients.
Pregnancy Type. This variable is shown here for information only and is never used by any function in the irrCAC package.
Ectopic Pregnancy
Abnormal Intrauterine Pregnancy
Normal Intrauterine Pregnancy
Normal Intrauterine Pregnancy
Gwet, K.L. (2014). Handbook of Inter-Rater Reliability, 4th Edition. Advanced Analytics, LLC.
This dataset summarizes the ratings assigned by 6 psychiatrists classifying 15 patients into one of five categories named "Depression", "Personal Disorder", "Schizophrenia", "Neurosis" and "Other".
distrib.6raters
This dataset has 15 rows (for the 15 subjects) and 7 columns. Only the last 6 columns representing the categories into which subjects are classified are used in the calculations.
This variable represents the subject number.
Personality disorder category
Schizophrenia Category
Neurosis category
"Other" category
Fleiss, J. L. (1971). Measuring nominal scale agreement among many raters, Psychological Bulletin, 76, 378-382.
This dataset contains information describing Fleiss' scale for benchmarking chance-corrected agreement coefficients such as Gwet AC1/AC2, Kappa and many others.
fleiss
Each row of this dataset describes an interval and the interpretation of the magnitude it represents.
The interval lower bound
The interval upper bound
The interpretation
Fleiss, J. L. (1981). Statistical Methods for Rates and Proportions. John Wiley & Sons.
Computing Fleiss Benchmark Scale Membership Probabilities
fleiss.bf(coeff, se, BenchDF = fleiss)
| coeff | A mandatory parameter representing the estimated value of an agreement coefficient. |
|---|---|
| se | A mandatory parameter representing the standard error of the agreement coefficient. |
| BenchDF | An optional 3-column data frame containing Fleiss' benchmark scale information. The 3 columns are the interval lower bound, the interval upper bound, and the interpretation. The default value is a small data file contained in the package and named fleiss.RData, which describes Fleiss' scale intervals and their interpretation. |
A one-column matrix containing the membership probabilities (c.f. https://agreestat.com/papers/inter-rater%20reliability%20study%20design1.pdf)
Fleiss' agreement coefficient among multiple raters (2, 3, +) when the input dataset is the distribution of raters by subject and category.
fleiss.kappa.dist(ratings, weights = "unweighted", categ = NULL, conflev = 0.95, N = Inf)
| ratings | An n x q matrix or data frame containing the distribution of raters by subject and category. Each cell (i,k) contains the number of raters who classified subject i into category k. |
|---|---|
| weights | An optional parameter that is either a string or a matrix. The string names one of the predefined weights and must take one of the values ("quadratic", "ordinal", "linear", "radical", "ratio", "circular", "bipolar"). If this parameter is a matrix, it must be a square q x q matrix, where q is the number of possible categories into which a subject can be classified. If some of the q possible categories are not used, it is strongly advised to specify the complete list of possible categories as a vector in the parameter categ. Otherwise, only the categories reported will be used. |
| categ | An optional parameter representing all categories available to raters during the experiment. This parameter may be useful if some categories were not used by any rater in spite of being available to the raters. |
| conflev | An optional parameter representing the confidence level associated with the confidence interval. Its default value is 0.95. |
| N | An optional parameter representing the population size (if any). It may be used to perform the finite population correction to the variance. Its default value is infinity. |
A vector containing the following information: pa(the percent agreement),pe(the percent chance agreement),coeff(Fleiss' agreement coefficient), stderr(the standard error of Fleiss' coefficient),conf.int(the confidence interval of Fleiss Kappa coefficient), p.value(the p-value of Fleiss' coefficient),coeff.name ("Fleiss").
Fleiss, J. L. (1981). Statistical Methods for Rates and Proportions. John Wiley & Sons.
#The dataset "distrib.6raters" comes with this package. It represents the distribution of 6 raters
#by subject and by category. Note that each row of this dataset sums to the number of raters, which
#is 6. You may analyze this dataset as follows:
distrib.6raters
fleiss.kappa.dist(distrib.6raters) #Fleiss' kappa, precision measures, weights & list of categories
fleiss <- fleiss.kappa.dist(distrib.6raters)$coeff #Yields Fleiss' kappa alone.
fleiss
q <- ncol(distrib.6raters) #Number of categories
fleiss.kappa.dist(distrib.6raters,weights = quadratic.weights(1:q)) #Weighted fleiss/quadratic wts
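To make the unweighted point estimate concrete, the base-R sketch below computes it from a distribution of raters by subject and category. It assumes the usual definitions: the percent agreement is the average proportion of agreeing rater pairs among subjects rated by at least 2 raters, and the percent chance agreement is the sum of squared category propensities, where each propensity is the average share of raters who chose that category. The helper name `fleiss_from_dist` and the toy matrix are hypothetical, every subject is assumed to have at least one rating, and the package's weights and precision measures are not reproduced.

```r
# Hypothetical distribution of 4 raters over 3 categories for 3 subjects.
dist <- matrix(c(4, 0, 0,
                 2, 2, 0,
                 1, 1, 2),
               nrow = 3, byrow = TRUE,
               dimnames = list(NULL, c("a", "b", "c")))

# Illustrative helper (not part of irrCAC): unweighted point estimate only.
fleiss_from_dist <- function(dist) {
  dist <- as.matrix(dist)
  r_i  <- rowSums(dist)                 # number of raters per subject
  keep <- r_i >= 2                      # agreement needs at least 2 raters
  d    <- dist[keep, , drop = FALSE]
  pa   <- mean(rowSums(d * (d - 1)) / (r_i[keep] * (r_i[keep] - 1)))
  pi_k <- colMeans(dist / r_i)          # average share of raters per category
  pe   <- sum(pi_k^2)                   # percent chance agreement
  c(pa = pa, pe = pe, coeff = (pa - pe) / (1 - pe))
}

fk <- fleiss_from_dist(dist)
fk
```

The only difference from the Brennan-Prediger sketch of the same kind is the chance-agreement term, which here depends on how often each category was actually used rather than on the number of categories alone.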
Fleiss' generalized kappa among multiple raters (2, 3, +) when the input data represent the raw ratings reported for each subject and each rater.
fleiss.kappa.raw( ratings, weights = "unweighted", categ.labels = NULL, conflev = 0.95, N = Inf )
ratings |
An nxr matrix / data frame of ratings where each column represents one rater and each row one subject. |
weights |
An optional parameter that is either a string or a matrix. The string names one of the predefined weights and must take one of the values ("quadratic", "ordinal", "linear", "radical", "ratio", "circular", "bipolar"); the default value is "unweighted". If this parameter is a matrix, it must be a square q x q matrix, where q is the number of possible categories into which a subject can be classified. If some of the q possible categories are not used, it is strongly advised to specify the complete list of possible categories as a vector in parameter categ.labels. Otherwise, the program may not work. |
categ.labels |
An optional vector containing the list of all possible ratings. It is useful when some of the possible ratings are not used by any rater: listing them here ensures they are still used when calculating agreement coefficients. The default value is NULL, in which case only the categories reported by the raters are used in the calculations. |
conflev |
An optional parameter representing the confidence level associated with the confidence interval. Its default value is 0.95. |
N |
An optional parameter representing the population size (if any). It may be used to perform the finite population correction to the variance. Its default value is infinity. |
A data list containing 3 objects: (1) a one-row data frame containing various statistics including the requested agreement coefficient, (2) the weight matrix used in the calculations if any, and (3) the categories used in the analysis. These could be categories reported by the raters, or those that were available to the raters whether they used them or not. The output data frame contains the following variables: "coeff.name" (coefficient name; here it will be "Fleiss' Kappa"), "pa" (the percent agreement), "pe" (the percent chance agreement), "coeff.val" (the agreement coefficient estimate, i.e. Fleiss' Kappa), "coeff.se" (the standard error), "conf.int" (Fleiss' Kappa confidence interval), "p.value" (Fleiss' Kappa p-value), "w.name" (the weights' identification).
Fleiss, J. L. (1981). Statistical Methods for Rates and Proportions. John Wiley & Sons.
#The dataset "cac.raw4raters" comes with this package. Analyze it as follows:
cac.raw4raters
fleiss.kappa.raw(cac.raw4raters) #Fleiss' kappa, precision measures, weights & categories
fleiss.kappa.raw(cac.raw4raters)$est #Yields Fleiss' kappa with precision measures
fleiss <- fleiss.kappa.raw(cac.raw4raters)$est$coeff.val #Yields Fleiss' kappa alone.
fleiss
fleiss.kappa.raw(cac.raw4raters, weights = "quadratic") #Weighted Fleiss' kappa/quadratic wts
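The raw-ratings layout (one row per subject, one column per rater) and the distribution layout used by the .dist functions carry the same information. The base-R sketch below shows one way to tabulate the former into the latter, and why supplying the full category list matters when a category was never used. The helper `raw_to_dist_sketch` and the toy ratings are hypothetical and are not part of the package.

```r
# Illustrative sketch only: tabulate raw ratings (subjects x raters) into a
# distribution of raters by subject and category.
raw_to_dist_sketch <- function(ratings, categ.labels = NULL) {
  ratings <- as.matrix(ratings)
  if (is.null(categ.labels)) {
    # Default: only the categories actually reported by the raters
    categ.labels <- sort(unique(as.vector(ratings[!is.na(ratings)])))
  }
  dist <- matrix(0L, nrow(ratings), length(categ.labels),
                 dimnames = list(NULL, as.character(categ.labels)))
  for (i in seq_len(nrow(ratings))) {
    # table() on a factor keeps unused levels and ignores missing ratings
    dist[i, ] <- as.integer(table(factor(ratings[i, ], levels = categ.labels)))
  }
  dist
}

# Hypothetical ratings: 2 subjects, 3 raters, one missing rating
toy <- matrix(c("a", "a", "b",
                "b", "b", NA), nrow = 2, byrow = TRUE)
d_used <- raw_to_dist_sketch(toy)                    # columns a, b
d_all  <- raw_to_dist_sketch(toy, c("a", "b", "c"))  # adds an all-zero column c
d_all
```

With the full list supplied, the unused category "c" still appears as a column of zeros, so the number of categories q is not understated.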
freq.supp.fn: This function reads a 3-variable input data file containing unique pairs of categories along with their frequency of occurrences, and outputs a similar file where all possible pairs of categories are represented, some with a frequency of occurrence of 0.
freq.supp.fn(freq.data, categories.vec)
freq.data |
The input data file containing all unique combinations of reported categories. |
categories.vec |
A vector containing the complete set of categories available to raters (e.g. "a", "b", "c", "d", "e"). The raters will not necessarily use all of these categories. |
This function returns a complete data frame containing all possible combinations of categories in the categories.vec vector. Newly added combinations of categories will have a frequency of occurrence of 0.
#The dataset "freqs.data" comes with this package. Analyze it as follows:
freq.supp.fn(freqs.data)
#Executing this command will yield the following data frame:
#  Ben   Gerry n
#  <chr> <chr> <chr>
#  a     b     1
#  a     d     1
#  b     b     2
#  c     c     3
#  d     b     1
#  d     d     1
#  e     e     1
#  a     a     0
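The idea of supplementing the observed pairs with zero-frequency pairs can be sketched in base R with `expand.grid()` and `merge()`. The helper name `complete_pairs_sketch` is hypothetical, the data frame below simply re-types the pairs and counts listed in the example output, and the row order produced by `merge()` may differ from what the package returns.

```r
# Illustrative sketch only: add the missing category pairs with frequency 0.
complete_pairs_sketch <- function(freq.data, categories.vec) {
  key <- names(freq.data)[1:2]
  all.pairs <- expand.grid(categories.vec, categories.vec,
                           stringsAsFactors = FALSE)
  names(all.pairs) <- key
  out <- merge(all.pairs, freq.data, by = key, all.x = TRUE)
  out[[3]][is.na(out[[3]])] <- 0   # pairs never observed get frequency 0
  out
}

# Pairs and counts re-typed from the example output
obs <- data.frame(Ben   = c("a", "a", "b", "c", "d", "d", "e"),
                  Gerry = c("b", "d", "b", "c", "b", "d", "e"),
                  n     = c(1, 1, 2, 3, 1, 1, 1),
                  stringsAsFactors = FALSE)
full <- complete_pairs_sketch(obs, c("a", "b", "c", "d", "e"))
nrow(full)    # 25: one row per combination of the 5 categories
sum(full$n)   # 10: the number of subjects is unchanged
```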
This dataset contains rating data collected from 10 human subjects by 2 raters named Ben and Gerry. While the 5 categories a, b, c, d, and e are available for rating, Gerry never used category a.
freqs.data
Each row of this dataset describes a unique pair of categories and the number of subjects that both Ben and Gerry classified into the 2 categories respectively.
Categories that Ben used to classify the subjects.
Categories that Gerry used to classify the subjects
Number of subjects that Ben and Gerry classified into the associated pair of categories.
N/A.
Gwet's AC1/AC2 agreement coefficient among multiple raters (2, 3, +) when the input dataset is the distribution of raters by subject and category.
gwet.ac1.dist( ratings, weights = "unweighted", categ = NULL, conflev = 0.95, N = Inf )
ratings |
An nxq matrix / data frame containing the distribution of raters by subject and category. Each cell (i,k) contains the number of raters who classified subject i into category k. |
weights |
An optional parameter that is either a string or a matrix. The string names one of the predefined weights and must take one of the values ("quadratic", "ordinal", "linear", "radical", "ratio", "circular", "bipolar"); the default value is "unweighted". If this parameter is a matrix, it must be a square q x q matrix, where q is the number of possible categories into which a subject can be classified. If some of the q possible categories are not used, it is strongly advised to specify the complete list of possible categories as a vector in parameter categ. Otherwise, only the categories reported will be used. |
categ |
An optional parameter representing all categories available to raters during the experiment. This parameter may be useful if some categories were not used by any rater in spite of being available to the raters. |
conflev |
An optional parameter representing the confidence level associated with the confidence interval. Its default value is 0.95. |
N |
An optional parameter representing the population size (if any). It may be used to perform the finite population correction to the variance. Its default value is infinity. |
A vector containing the following information: pa (the percent agreement), pe (the percent chance agreement), coeff (Gwet's AC1 or AC2, depending on whether weights are used or not), stderr (the standard error of Gwet's coefficient), conf.int (the confidence interval of Gwet's coefficient), p.value (the p-value of Gwet's coefficient), coeff.name (AC1/AC2).
Gwet, K. L. (2008). “Computing inter-rater reliability and its variance in the presence of high agreement.” British Journal of Mathematical and Statistical Psychology, 61, 29-48.
#The dataset "distrib.6raters" comes with this package. It represents the distribution of 6 raters
#by subject and by category. Note that each row of this dataset sums to the number of raters, which
#is 6. You may analyze this dataset as follows:
distrib.6raters
gwet.ac1.dist(distrib.6raters) #AC1 coefficient, precision measures, weights & list of categories
ac1 <- gwet.ac1.dist(distrib.6raters)$coeff #Yields AC1 coefficient alone.
ac1
q <- ncol(distrib.6raters) #Number of categories
gwet.ac1.dist(distrib.6raters, weights = quadratic.weights(1:q)) #AC2 with quadratic weights
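To make the difference from Fleiss' kappa concrete, the base-R sketch below computes the unweighted AC1 from a distribution matrix using the formula from Gwet (2008), where the percent chance agreement is the sum of pi_k(1 - pi_k) divided by q - 1. The function name `gwet_ac1_sketch` and the toy data are hypothetical; the sketch assumes no missing ratings and is not the package's implementation.

```r
# Illustrative sketch only: unweighted AC1, assuming every subject was rated
# by the same number of raters (no missing ratings).
gwet_ac1_sketch <- function(dist) {
  dist <- as.matrix(dist)
  r <- sum(dist[1, ])                        # raters per subject
  q <- ncol(dist)                            # number of categories
  stopifnot(all(rowSums(dist) == r), r >= 2, q >= 2)
  pa <- mean(rowSums(dist * (dist - 1)) / (r * (r - 1)))  # percent agreement
  pi_k <- colMeans(dist) / r                 # overall category proportions
  pe <- sum(pi_k * (1 - pi_k)) / (q - 1)     # AC1 percent chance agreement
  c(pa = pa, pe = pe, coeff = (pa - pe) / (1 - pe))
}

# Hypothetical data: 3 subjects, 3 raters, 2 categories
toy <- matrix(c(3, 0,
                0, 3,
                2, 1), ncol = 2, byrow = TRUE)
res <- gwet_ac1_sketch(toy)
res   # pa = 7/9, pe = 40/81, coeff = 23/41
```

The percent agreement is the same quantity used by Fleiss' kappa; only the chance-agreement term differs.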
Gwet's AC1/AC2 agreement coefficient among multiple raters (2, 3, +) when the input data represent the raw ratings reported for each subject and each rater.
gwet.ac1.raw( ratings, weights = "unweighted", categ.labels = NULL, conflev = 0.95, N = Inf )
ratings |
An nxr matrix / data frame of ratings where each column represents one rater and each row one subject. |
weights |
An optional parameter that is either a string or a matrix. The string names one of the predefined weights and must take one of the values ("quadratic", "ordinal", "linear", "radical", "ratio", "circular", "bipolar"); the default value is "unweighted". If this parameter is a matrix, it must be a square q x q matrix, where q is the number of possible categories into which a subject can be classified. If some of the q possible categories are not used, it is strongly advised to specify the complete list of possible categories as a vector in parameter categ.labels. Otherwise, the program may not work. |
categ.labels |
An optional vector containing the list of all possible ratings. It is useful when some of the possible ratings are not used by any rater: listing them here ensures they are still used when calculating agreement coefficients. The default value is NULL, in which case only the categories reported by the raters are used in the calculations. |
conflev |
An optional parameter representing the confidence level associated with the confidence interval. Its default value is 0.95. |
N |
An optional parameter representing the population size (if any). It may be used to perform the finite population correction to the variance. Its default value is infinity. |
A data list containing 3 objects: (1) a one-row data frame containing various statistics including the requested agreement coefficient, (2) the weight matrix used in the calculations if any, and (3) the categories used in the analysis. These could be categories reported by the raters, or those that were available to the raters whether they used them or not. The output data frame contains the following variables: "coeff.name" (coefficient name), "pa" (the percent agreement), "pe" (the percent chance agreement), coeff.val (the agreement coefficient estimate-AC1 or AC2), "coeff.se" (the standard error), "conf.int" (AC1/AC2 confidence interval), "p.value" (Gwet AC1/AC2 p-value), "w.name"(the weights' identification).
Gwet, K. L. (2008). “Computing inter-rater reliability and its variance in the presence of high agreement.” British Journal of Mathematical and Statistical Psychology, 61, 29-48.
#The dataset "cac.raw4raters" comes with this package. Analyze it as follows:
cac.raw4raters
gwet.ac1.raw(cac.raw4raters) #AC1 coefficient, precision measures, weights & categories
gwet.ac1.raw(cac.raw4raters)$est #Yields AC1 coefficient with precision measures
ac1 <- gwet.ac1.raw(cac.raw4raters)$est$coeff.val #Yields AC1 coefficient alone.
ac1
gwet.ac1.raw(cac.raw4raters, weights = "quadratic") #AC2 coefficient with quadratic wts
Gwet's AC1/AC2 coefficient for 2 raters
gwet.ac1.table( ratings, weights = identity.weights(1:ncol(ratings)), conflev = 0.95, N = Inf )
ratings |
A square table of ratings (assume no missing ratings). |
weights |
An optional matrix that contains the weights used in the weighted analysis. By default, this parameter contains the identity weight matrix, which leads to the unweighted analysis. |
conflev |
An optional parameter that specifies the confidence level used for constructing confidence intervals. By default the function assumes the standard value of 95%. |
N |
An optional parameter representing the finite population size if any. It is used to perform the finite population correction to the standard error. Its default value is infinity. |
A data frame containing the following 5 variables: coeff.name, coeff.val, coeff.se, coeff.ci, coeff.pval.
#The dataset "cont3x3abstractors" comes with this package. Analyze it as follows:
gwet.ac1.table(cont3x3abstractors) #Yields AC1 along with precision measures
ac1 <- gwet.ac1.table(cont3x3abstractors)$coeff.val #Yields AC1 coefficient alone.
ac1
q <- nrow(cont3x3abstractors) #Number of categories
gwet.ac1.table(cont3x3abstractors, weights = quadratic.weights(1:q)) #AC2 with quadratic weights
Function for computing the Identity Weights
identity.weights(categ)
categ |
A mandatory parameter representing the vector of all possible ratings. |
A square matrix of identity weights to be used for calculating the unweighted coefficients.
Kappa coefficient for 2 raters
kappa2.table( ratings, weights = identity.weights(1:ncol(ratings)), conflev = 0.95, N = Inf )
ratings |
A square or contingency table of ratings (assume no missing ratings). See the 2 datasets "cont3x3abstractors" and "cont4x4diagnosis" that come with this package as examples. |
weights |
An optional matrix that contains the weights used in the weighted analysis. |
conflev |
An optional confidence level for confidence intervals. The default value is the traditional 0.95. |
N |
An optional population size. The default value is infinity. |
A data frame containing the following 5 variables: coeff.name, coeff.val, coeff.se, coeff.ci, coeff.pval.
#The dataset "cont3x3abstractors" comes with this package. Analyze it as follows:
kappa2.table(cont3x3abstractors) #Yields Cohen's kappa along with precision measures
kappa <- kappa2.table(cont3x3abstractors)$coeff.val #Yields Cohen's kappa alone.
kappa
q <- nrow(cont3x3abstractors) #Number of categories
kappa2.table(cont3x3abstractors, weights = quadratic.weights(1:q)) #Weighted kappa/quadratic wts
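For readers who want to see how a weight matrix enters the two-rater calculation, the base-R sketch below computes Cohen's kappa from a square contingency table using the textbook weighted form; with identity weights it reduces to the ordinary unweighted kappa. The function name `kappa2_sketch` and the 2x2 table are hypothetical, and the sketch omits the standard error, confidence interval, and p-value that the package reports.

```r
# Illustrative sketch only: (weighted) Cohen's kappa from a square table.
kappa2_sketch <- function(tab, weights = diag(ncol(tab))) {
  tab <- as.matrix(tab)
  stopifnot(nrow(tab) == ncol(tab), all(dim(weights) == dim(tab)))
  p <- tab / sum(tab)                                  # cell proportions
  pa <- sum(weights * p)                               # (weighted) percent agreement
  pe <- sum(weights * outer(rowSums(p), colSums(p)))   # chance agreement from margins
  c(pa = pa, pe = pe, coeff = (pa - pe) / (1 - pe))
}

# Hypothetical table: rows = rater 1, columns = rater 2, 50 subjects
toy <- matrix(c(20,  5,
                10, 15), nrow = 2, byrow = TRUE)
res <- kappa2_sketch(toy)
res   # pa = 0.7, pe = 0.5, coeff = 0.4
```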
Krippendorff's agreement coefficient among multiple raters (2, 3, +) when the input dataset is the distribution of raters by subject and category.
krippen.alpha.dist( ratings, weights = "unweighted", categ = NULL, conflev = 0.95, N = Inf )
ratings |
An nxq matrix / data frame containing the distribution of raters by subject and category. Each cell (i,k) contains the number of raters who classified subject i into category k. |
weights |
An optional parameter that is either a string or a matrix. The string names one of the predefined weights and must take one of the values ("quadratic", "ordinal", "linear", "radical", "ratio", "circular", "bipolar"); the default value is "unweighted". If this parameter is a matrix, it must be a square q x q matrix, where q is the number of possible categories into which a subject can be classified. If some of the q possible categories are not used, it is strongly advised to specify the complete list of possible categories as a vector in parameter categ. Otherwise, only the categories reported will be used. |
categ |
An optional parameter representing all categories available to raters during the experiment. This parameter may be useful if some categories were not used by any rater in spite of being available to the raters. |
conflev |
An optional parameter representing the confidence level associated with the confidence interval. Its default value is 0.95. |
N |
An optional parameter representing the population size (if any). It may be used to perform the finite population correction to the variance. Its default value is infinity. |
A vector containing the following information: pa (the percent agreement), pe (the percent chance agreement), coeff (Krippendorff's alpha), stderr (the standard error of Krippendorff's coefficient), conf.int (the confidence interval of Krippendorff's alpha coefficient), p.value (the p-value of Krippendorff's alpha), coeff.name ("krippen alpha").
Gwet, K. (2014). Handbook of Inter-Rater Reliability: The Definitive Guide to Measuring the Extent of Agreement Among Multiple Raters, 4th Edition. Advanced Analytics, LLC.
Krippendorff (1970). “Bivariate agreement coefficients for reliability of data.” Sociological Methodology, 2, 139-150.
Krippendorff (1980). Content analysis: An introduction to its methodology (2nd ed.), Newbury Park, CA: Sage.
#The dataset "distrib.6raters" comes with this package. It represents the distribution of 6 raters
#by subject and by category. Note that each row of this dataset sums to the number of raters, which
#is 6. You may analyze this dataset as follows:
distrib.6raters
krippen.alpha.dist(distrib.6raters) #Krippendorff's alpha, precision measures, weights & categories
alpha <- krippen.alpha.dist(distrib.6raters)$coeff #Yields Krippendorff's alpha coefficient alone.
alpha
q <- ncol(distrib.6raters) #Number of categories
krippen.alpha.dist(distrib.6raters, weights = quadratic.weights(1:q)) #Weighted alpha
Krippendorff's alpha coefficient for an arbitrary number of raters (2, 3, +) when the input data represent the raw ratings reported for each subject and each rater.
krippen.alpha.raw( ratings, weights = "unweighted", categ.labels = NULL, conflev = 0.95, N = Inf )
ratings |
An nxr matrix / data frame of ratings where each column represents one rater and each row one subject. |
weights |
An optional parameter that is either a string or a matrix. The string names one of the predefined weights and must take one of the values ("quadratic", "ordinal", "linear", "radical", "ratio", "circular", "bipolar"); the default value is "unweighted". If this parameter is a matrix, it must be a square q x q matrix, where q is the number of possible categories into which a subject can be classified. If some of the q possible categories are not used, it is strongly advised to specify the complete list of possible categories as a vector in parameter categ.labels. Otherwise, the program may not work. |
categ.labels |
An optional vector containing the list of all possible ratings. It is useful when some of the possible ratings are not used by any rater: listing them here ensures they are still used when calculating agreement coefficients. The default value is NULL, in which case only the categories reported by the raters are used in the calculations. |
conflev |
An optional parameter representing the confidence level associated with the confidence interval. Its default value is 0.95. |
N |
An optional parameter representing the population size (if any). It may be used to perform the finite population correction to the variance. Its default value is infinity. |
A data list containing 3 objects: (1) a one-row data frame containing various statistics including the requested agreement coefficient (in this case, Krippendorff's alpha), (2) the weight matrix used in the calculations if any, and (3) the vector of categories used in the analysis. These could be categories reported by the raters, or those that were available to the raters whether they used them or not. The output data frame contains the following variables: "coeff.name" (coefficient name), "pa" (the percent agreement), "pe" (the percent chance agreement), "coeff.val" (Krippendorff's alpha estimate), "coeff.se" (standard error), "conf.int" (Krippendorff alpha's confidence interval), "p.value" (Krippendorff alpha's p-value), "w.name" (the weights' identification).
Gwet, K. (2014). Handbook of Inter-Rater Reliability: The Definitive Guide to Measuring the Extent of Agreement Among Multiple Raters, 4th Edition. Advanced Analytics, LLC.
Krippendorff (1970). “Bivariate agreement coefficients for reliability of data.” Sociological Methodology, 2, 139-150.
Krippendorff (1980). Content analysis: An introduction to its methodology (2nd ed.), Newbury Park, CA: Sage.
#The dataset "cac.raw4raters" comes with this package. Analyze it as follows:
cac.raw4raters
krippen.alpha.raw(cac.raw4raters) #Alpha coeff., precision measures, weights & categories
krippen.alpha.raw(cac.raw4raters)$est #Krippendorff's alpha with precision measures
alpha <- krippen.alpha.raw(cac.raw4raters)$est$coeff.val #Krippendorff's alpha alone.
alpha
krippen.alpha.raw(cac.raw4raters, weights = "quadratic") #Weighted alpha/quadratic wts
Krippendorff's Alpha coefficient for 2 raters
krippen2.table( ratings, weights = identity.weights(1:ncol(ratings)), conflev = 0.95, N = Inf )
ratings |
A square table of ratings (assume no missing ratings). |
weights |
An optional matrix that contains the weights used in the weighted analysis. By default, this parameter contains the identity weight matrix, which leads to the unweighted analysis. |
conflev |
An optional parameter that specifies the confidence level used for constructing confidence intervals. By default the function assumes the standard value of 95%. |
N |
An optional parameter representing the finite population size if any. It is used to perform the finite population correction to the standard error. Its default value is infinity. |
A data frame containing the following 5 variables: coeff.name, coeff.val, coeff.se, coeff.ci, coeff.pval.
#The dataset "cont3x3abstractors" comes with this package. Analyze it as follows:
krippen2.table(cont3x3abstractors) #Krippendorff's alpha along with precision measures
alpha <- krippen2.table(cont3x3abstractors)$coeff.val #Krippendorff's alpha alone.
alpha
q <- nrow(cont3x3abstractors) #Number of categories
krippen2.table(cont3x3abstractors, weights = quadratic.weights(1:q)) #Weighted alpha coefficient
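As a hedged illustration of how alpha differs from kappa for two raters, the base-R sketch below computes the unweighted (nominal) Krippendorff's alpha from a square table with no missing ratings. It uses the small-sample adjusted percent agreement and the averaged row and column margins, which is algebraically the same as 1 minus observed over expected disagreement. The function name `krippen2_sketch` and the 2x2 table are hypothetical, and the package's own computation may differ in detail.

```r
# Illustrative sketch only: unweighted Krippendorff's alpha for 2 raters,
# assuming a complete square table (no missing ratings).
krippen2_sketch <- function(tab) {
  tab <- as.matrix(tab)
  stopifnot(nrow(tab) == ncol(tab))
  n <- sum(tab)                              # number of subjects
  p <- tab / n
  eps <- 1 / (2 * n)                         # small-sample adjustment
  pa <- (1 - eps) * sum(diag(p)) + eps       # adjusted percent agreement
  pi_k <- (rowSums(p) + colSums(p)) / 2      # pooled category proportions
  pe <- sum(pi_k^2)                          # percent chance agreement
  c(pa = pa, pe = pe, coeff = (pa - pe) / (1 - pe))
}

# Hypothetical table: rows = rater 1, columns = rater 2, 50 subjects
toy <- matrix(c(20,  5,
                10, 15), nrow = 2, byrow = TRUE)
res <- krippen2_sketch(toy)
res   # pa = 0.703, pe = 0.505, coeff = 0.4
```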
This dataset contains information describing the Landis & Koch scale for benchmarking chance-corrected agreement coefficients such as Gwet AC1/AC2, Kappa and many others.
landis.koch
Each row of this dataset describes an interval and the interpretation of the magnitude it represents.
The interval lower bound
The interval upper bound
The interpretation
Landis, J.R. & Koch G. (1977). The measurement of observer agreement for categorical data, Biometrics, 33, 159-174.
Computing Landis-Koch Benchmark Scale Membership Probabilities
landis.koch.bf(coeff, se, BenchDF = landis.koch)
coeff |
A mandatory parameter representing the estimated value of an agreement coefficient. |
se |
A mandatory parameter representing the agreement coefficient standard error. |
BenchDF |
An optional parameter that is a 3-column data frame containing the Landis & Koch's benchmark scale information. The 3 columns are the interval lower bound, upper bound, and their interpretation. The default value is a small file contained in the package and named landis.koch.RData, which describes the official Landis & Koch's scale intervals and their interpretation. |
A one-column matrix containing the membership probabilities (c.f. https://agreestat.com/papers/inter-rater%20reliability%20study%20design1.pdf)
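One plausible reading of a membership probability is the chance that the agreement coefficient falls in a benchmark interval when the estimate is treated as normally distributed around coeff with standard deviation se. The base-R sketch below computes that quantity for each interval. The data frame `bench`, its column names, and the function `membership_sketch` are hypothetical stand-ins; the package function may additionally rescale or accumulate these probabilities, so its output need not match this sketch.

```r
# Illustrative sketch only: interval membership probabilities under a normal
# approximation. `bench` is a hypothetical stand-in for the landis.koch
# dataset (lower bound, upper bound, interpretation); names are assumptions.
bench <- data.frame(
  lb = c(0.8, 0.6, 0.4, 0.2, 0.0, -1.0),
  ub = c(1.0, 0.8, 0.6, 0.4, 0.2,  0.0),
  interp = c("Almost Perfect", "Substantial", "Moderate",
             "Fair", "Slight", "Poor"),
  stringsAsFactors = FALSE
)

membership_sketch <- function(coeff, se, BenchDF) {
  # P(lb <= X <= ub) for X ~ Normal(coeff, se)
  prob <- pnorm((coeff - BenchDF[[1]]) / se) - pnorm((coeff - BenchDF[[2]]) / se)
  matrix(prob, ncol = 1, dimnames = list(BenchDF[[3]], "prob"))
}

m <- membership_sketch(coeff = 0.7, se = 0.05, BenchDF = bench)
round(m, 4)
```

With coeff = 0.7 and se = 0.05, almost all of the probability falls in the 0.6 to 0.8 interval, which is two standard errors wide on each side of the estimate.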
Function for computing the Linear Weights
linear.weights(categ)
categ |
A mandatory parameter representing the vector of all possible ratings. |
A square matrix of linear weights to be used for calculating the weighted coefficients.
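A minimal base-R sketch of linear weights, assuming the usual definition in which the weight for categories k and l is 1 minus their absolute distance divided by the range of the scale. The function name `linear_weights_sketch` is hypothetical, and the fallback to ranks 1..q for non-numeric categories is an assumption of this sketch rather than documented package behavior.

```r
# Illustrative sketch only: linear agreement weights on a numeric scale.
linear_weights_sketch <- function(categ) {
  # Assumption: non-numeric categories are scored by their position 1..q
  x <- if (is.numeric(categ)) categ else seq_along(categ)
  stopifnot(length(x) >= 2, max(x) > min(x))
  1 - abs(outer(x, x, "-")) / (max(x) - min(x))
}

w <- linear_weights_sketch(1:3)
w
# Full agreement on the diagonal, half credit for adjacent categories,
# and no credit for the two extremes:
#      [,1] [,2] [,3]
# [1,]  1.0  0.5  0.0
# [2,]  0.5  1.0  0.5
# [3,]  0.0  0.5  1.0
```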
long2wide.fn: This function transforms a 3-column dataset of frequencies to a square matrix or a contingency table. This function uses the freq.supp.fn() function.
long2wide.fn(freqs.long)
freqs.long |
A 3-column data frame, where the first 2 variables represent the categories that both raters have actually used when classifying the subjects. The third and last variable is generally named "n" and represents the count of subjects that were classified into the 2 associated categories by both raters. |
A matrix that represents a contingency table showing the distribution of subjects by rater and category.
#The dataset "freqs.data" comes with this package. Analyze it as follows:
long2wide.fn(freqs.data) #Yields a 5x5 matrix
#This will produce the following 5x5 matrix:
#> long2wide.fn(freqs.data)
#  a b c d e
#a 0 1 0 1 0
#b 0 2 0 0 0
#c 0 0 3 0 0
#d 0 1 0 1 0
#e 0 0 0 0 1
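The long-to-wide transformation can be sketched in base R by filling a zero matrix indexed by category names. The helper `long_to_wide_sketch` is hypothetical and, unlike the package function, takes the full category list as an explicit second argument; the input data frame re-types the seven observed pairs behind the 5x5 matrix in the example.

```r
# Illustrative sketch only: turn (category 1, category 2, count) rows into a
# square contingency table over a given list of categories.
long_to_wide_sketch <- function(freqs.long, categories.vec) {
  q <- length(categories.vec)
  wide <- matrix(0, q, q, dimnames = list(categories.vec, categories.vec))
  for (i in seq_len(nrow(freqs.long))) {
    a <- as.character(freqs.long[[1]][i])   # first rater's category (row)
    b <- as.character(freqs.long[[2]][i])   # second rater's category (column)
    wide[a, b] <- wide[a, b] + freqs.long[[3]][i]
  }
  wide
}

# Observed pairs re-typed from the example
obs <- data.frame(Ben   = c("a", "a", "b", "c", "d", "d", "e"),
                  Gerry = c("b", "d", "b", "c", "b", "d", "e"),
                  n     = c(1, 1, 2, 3, 1, 1, 1),
                  stringsAsFactors = FALSE)
wide <- long_to_wide_sketch(obs, c("a", "b", "c", "d", "e"))
wide
```

Column a is entirely zero because the second rater never used that category, which is why the complete category list has to be supplied rather than inferred from the data.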
Function for computing the Ordinal Weights
ordinal.weights(categ)
categ |
A mandatory parameter representing the vector of all possible ratings. |
A square matrix of ordinal weights to be used for calculating the weighted coefficients.
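A minimal base-R sketch of ordinal weights, assuming the definition commonly given in Gwet's handbook: the weight depends only on the ranks of the two categories, through the number of pairs that can be formed from the categories lying between them (inclusive). The function name `ordinal_weights_sketch` is hypothetical and the sketch is not taken from the package source.

```r
# Illustrative sketch only: ordinal agreement weights based on category ranks.
ordinal_weights_sketch <- function(categ) {
  q <- length(categ)
  stopifnot(q >= 2)
  k <- seq_len(q)                               # ranks of the ordered categories
  m <- choose(abs(outer(k, k, "-")) + 1, 2)     # pairs among categories spanned
  1 - m / max(m)
}

w <- ordinal_weights_sketch(1:3)
w
# Diagonal entries are 1, adjacent categories get 2/3, and the two extreme
# categories get 0.
```

Because only ranks are used, relabeling the categories with other increasing values leaves the matrix unchanged.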
Percent agreement coefficient among multiple raters (2, 3, +) when the input dataset is the distribution of raters by subject and category.
pa.coeff.dist( ratings, weights = "unweighted", categ = NULL, conflev = 0.95, N = Inf )
ratings |
An nxq matrix / data frame containing the distribution of raters by subject and category. Each cell (i,k) contains the number of raters who classified subject i into category k. |
weights |
An optional parameter that is either a string or a matrix. The string names one of the predefined weights and must take one of the values ("quadratic", "ordinal", "linear", "radical", "ratio", "circular", "bipolar"); the default value is "unweighted". If this parameter is a matrix, it must be a square q x q matrix, where q is the number of possible categories into which a subject can be classified. If some of the q possible categories are not used, it is strongly advised to specify the complete list of possible categories as a vector in parameter categ. Otherwise, only the categories reported will be used. |
categ |
An optional parameter representing all categories available to raters during the experiment. This parameter may be useful if some categories were not used by any rater in spite of being available to the raters. |
conflev |
An optional parameter representing the confidence level associated with the confidence interval. Its default value is 0.95. |
N |
An optional parameter representing the population size (if any). It may be used to perform the finite population correction to the variance. Its default value is infinity. |
A vector containing the following information: pa (the percent agreement), pe (the percent chance agreement), coeff (the percent agreement coefficient), stderr (the standard error of the percent agreement), conf.int (the confidence interval of the percent agreement), p.value (the p-value of the percent agreement), coeff.name (the coefficient name).
Brennan, R. L., and Prediger, D. J. (1981). “Coefficient Kappa: some uses, misuses, and alternatives.” Educational and Psychological Measurement, 41, 687-699.
#The dataset "distrib.6raters" comes with this package. It represents the distribution of 6 raters
#by subject and by category. Note that each row of this dataset sums to the number of raters, which
#is 6. You may analyze this dataset as follows:
distrib.6raters
pa.coeff.dist(distrib.6raters) #Percent agreement, precision measures, weights & list of categories
pa <- pa.coeff.dist(distrib.6raters)$coeff #Yields the percent agreement coefficient alone.
pa
q <- ncol(distrib.6raters) #Number of categories
pa.coeff.dist(distrib.6raters, weights = quadratic.weights(1:q)) #Weighted percent agreement
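To show how conflev and N can enter the precision measures, the base-R sketch below computes the unweighted percent agreement together with a standard error and confidence interval. It assumes the usual finite population correction factor 1 - n/N applied to the variance and a t-based interval with n - 1 degrees of freedom; both are assumptions of this sketch, not statements about the package internals. The function name `pa_sketch` and the toy data are hypothetical.

```r
# Illustrative sketch only: percent agreement with a finite-population-
# corrected standard error, assuming no missing ratings.
pa_sketch <- function(dist, conflev = 0.95, N = Inf) {
  dist <- as.matrix(dist)
  n <- nrow(dist)                             # number of subjects
  r <- sum(dist[1, ])                         # raters per subject
  stopifnot(all(rowSums(dist) == r), r >= 2, n >= 2)
  pa_i <- rowSums(dist * (dist - 1)) / (r * (r - 1))  # per-subject agreement
  pa <- mean(pa_i)
  f <- n / N                                  # sampling fraction; 0 when N = Inf
  se <- sqrt((1 - f) / (n * (n - 1)) * sum((pa_i - pa)^2))
  half <- qt(1 - (1 - conflev) / 2, df = n - 1) * se  # assumed t-based interval
  list(pa = pa, stderr = se, conf.int = c(pa - half, pa + half))
}

# Hypothetical data: 3 subjects, 3 raters, 2 categories
toy <- matrix(c(3, 0,
                0, 3,
                2, 1), ncol = 2, byrow = TRUE)
res_inf <- pa_sketch(toy)          # infinite population: no correction
res_fin <- pa_sketch(toy, N = 6)   # half the population sampled
c(res_inf$stderr, res_fin$stderr)  # 2/9 and sqrt(2)/9
```

Sampling half of a finite population shrinks the variance by half, so the standard error drops by a factor of sqrt(2).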
Percent agreement among multiple raters (2, 3, +) when the input data represent the raw ratings reported for each subject and each rater.
pa.coeff.raw( ratings, weights = "unweighted", categ.labels = NULL, conflev = 0.95, N = Inf )
ratings |
An nxr matrix / data frame of ratings where each column represents one rater and each row one subject. |
weights |
An optional parameter that is either a string or a matrix. The string names one of the predefined weights and must take one of the values ("quadratic", "ordinal", "linear", "radical", "ratio", "circular", "bipolar"); the default value is "unweighted". If this parameter is a matrix, it must be a square q x q matrix, where q is the number of possible categories into which a subject can be classified. If some of the q possible categories are not used, it is strongly advised to specify the complete list of possible categories as a vector in parameter categ.labels. Otherwise, the program may not work. |
categ.labels |
An optional vector containing the list of all possible ratings. It is useful when some of the possible ratings are not used by any rater: listing them here ensures they are still used when calculating agreement coefficients. The default value is NULL, in which case only the categories reported by the raters are used in the calculations. |
conflev |
An optional parameter representing the confidence level associated with the confidence interval. Its default value is 0.95. |
N |
An optional parameter representing the population size (if any). It may be used to perform the finite population correction to the variance. Its default value is infinity. |
A data list containing 3 objects: (1) a one-row data frame containing the estimates, (2) the weight matrix used in the calculations, and (3) the categories used in the analysis. The data frame of estimates contains the following variables "coeff.name" (coefficient name), "pa" (the percent agreement), "pe" (percent chance-agreement-always equals 0), "coeff.val" (agreement coefficient = pa), coeff.se (the percent agreement standard error), "conf.int" (the percent agreement confidence interval), "p.value"(the percent agreement p-value), "w.name"(the weights' identification).
#The dataset "cac.raw4raters" comes with this package. Analyze it as follows:
cac.raw4raters
pa.coeff.raw(cac.raw4raters) #Percent agreement, precision measures, weights & categories
pa.coeff.raw(cac.raw4raters)$est #Yields percent agreement with precision measures
pa <- pa.coeff.raw(cac.raw4raters)$est$coeff.val #Yields percent agreement alone.
pa
pa.coeff.raw(cac.raw4raters, weights = "quadratic") #Weighted percent agreement/quadratic weights
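To give a rough idea of what an unweighted percent agreement among several raters measures, the base R sketch below takes each subject rated by at least two raters, computes the proportion of rater pairs that agree on that subject, and averages these proportions. The function name `pa_raw_sketch` and the toy data are hypothetical. This is not the package's implementation, and it leaves out weights, standard errors, and confidence intervals.

```r
# Hypothetical helper (not part of irrCAC): unweighted percent agreement
# from raw ratings, one row per subject and one column per rater.
pa_raw_sketch <- function(ratings) {
  ratings <- as.matrix(ratings)
  per_subject <- apply(ratings, 1, function(row) {
    row <- row[!is.na(row)]          # drop missing ratings
    r <- length(row)                 # number of raters who rated this subject
    if (r < 2) return(NA_real_)      # agreement is undefined with fewer than 2 ratings
    counts <- table(row)             # raters per category
    sum(counts * (counts - 1)) / (r * (r - 1))  # share of agreeing rater pairs
  })
  mean(per_subject, na.rm = TRUE)
}

# Toy data, made up for illustration only.
toy <- data.frame(
  Rater1 = c("a", "b", "c", "a"),
  Rater2 = c("a", "b", "b", NA),
  Rater3 = c("a", "c", "b", "a")
)
pa_toy <- pa_raw_sketch(toy)
print(pa_toy)
```

On the toy data the per-subject proportions are 1, 1/3, 1/3 and 1, so the sketch returns 2/3.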
Percent Agreement coefficient for 2 raters
pa2.table(ratings, weights = identity.weights(1:ncol(ratings)), conflev = 0.95, N = Inf)
| Argument | Description |
|---|---|
| ratings | A square table of ratings (assume no missing ratings). |
| weights | An optional matrix that contains the weights used in the weighted analysis. By default, this parameter contains the identity weight matrix, which leads to the unweighted analysis. |
| conflev | An optional parameter that specifies the confidence level used for constructing confidence intervals. By default the function assumes the standard value of 95%. |
| N | An optional parameter representing the finite population size, if any. It is used to apply the finite population correction to the standard error. Its default value is infinity. |

A data frame containing the following 5 variables: coeff.name, coeff.val, coeff.se, coeff.ci, coeff.pval.
#The dataset "cont3x3abstractors" comes with this package. Analyze it as follows:
pa2.table(cont3x3abstractors) #Yields percent agreement along with precision measures
pa <- pa2.table(cont3x3abstractors)$coeff.val #Yields percent agreement alone.
pa
q <- nrow(cont3x3abstractors) #Number of categories
pa2.table(cont3x3abstractors, weights = quadratic.weights(1:q)) #Weighted percent agreement
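For two raters, the percent agreement point estimate can be illustrated in a few lines of base R, assuming the usual definition: the sum of the weighted cell proportions of the square table, which reduces to the diagonal share under identity weights. The helper `pa_table_sketch` and the toy table are hypothetical and only cover the point estimate, not the standard error or confidence interval that `pa2.table` reports.

```r
# Hypothetical helper (not part of irrCAC): (weighted) percent agreement
# from a q x q table of counts for two raters.
pa_table_sketch <- function(tab, weights = diag(nrow(tab))) {
  m <- as.matrix(tab)
  p <- m / sum(m)        # cell proportions
  sum(weights * p)       # identity weights keep only the diagonal cells
}

# Toy 3 x 3 table, made up for illustration only.
toy_tab <- matrix(c(10, 2, 0,
                     1, 8, 3,
                     0, 2, 9), nrow = 3, byrow = TRUE)

pa_unweighted <- pa_table_sketch(toy_tab)

# Quadratic weights for categories 1..3, written out by hand.
w_quad <- 1 - outer(1:3, 1:3, "-")^2 / (3 - 1)^2
pa_weighted <- pa_table_sketch(toy_tab, weights = w_quad)

print(c(unweighted = pa_unweighted, weighted = pa_weighted))
```

With quadratic weights, disagreements between adjacent categories earn partial credit, so the weighted value is never below the unweighted one.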
Function for computing the Quadratic Weights
quadratic.weights(categ)
| Argument | Description |
|---|---|
| categ | A mandatory parameter representing the vector of all possible ratings. |

A square matrix of quadratic weights to be used for calculating the weighted coefficients.
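As an illustration of what such a matrix looks like, the sketch below builds quadratic weights in base R, assuming the commonly used definition w_kl = 1 - (x_k - x_l)^2 / (x_max - x_min)^2. The function name `quadratic_weights_sketch` is hypothetical, and the fallback to ranks 1..q for non-numeric categories is an assumption rather than documented behavior.

```r
# Hypothetical re-implementation, for illustration only.
quadratic_weights_sketch <- function(categ) {
  # Assumption: use numeric category values when available,
  # otherwise fall back to the ranks 1..q.
  x <- if (is.numeric(categ)) sort(categ) else seq_along(categ)
  1 - outer(x, x, "-")^2 / (max(x) - min(x))^2
}

w <- quadratic_weights_sketch(1:4)
print(round(w, 3))
```

The diagonal is 1 (full agreement), the two extreme categories get weight 0, and the weights decrease with the squared distance between categories.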
Function for computing the Radical Weights
radical.weights(categ)
| Argument | Description |
|---|---|
| categ | A mandatory parameter representing the vector of all possible ratings. |

A square matrix of radical weights to be used for calculating the weighted coefficients.
Function for computing the Ratio Weights
ratio.weights(categ)
| Argument | Description |
|---|---|
| categ | A mandatory parameter representing the vector of all possible ratings. |

A square matrix of ratio weights to be used for calculating the weighted coefficients.
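To show how radical and ratio weights differ in shape, the base R sketch below assumes the definitions commonly given for them: radical weights 1 - sqrt(|x_k - x_l|) / sqrt(x_max - x_min), and ratio weights 1 - ((x_k - x_l)/(x_k + x_l))^2 / ((x_max - x_min)/(x_max + x_min))^2. Both function names are hypothetical, the inputs are assumed to be numeric, and the ratio version additionally assumes strictly positive category values.

```r
# Hypothetical re-implementations, for illustration only.
radical_weights_sketch <- function(x) {
  1 - sqrt(abs(outer(x, x, "-"))) / sqrt(max(x) - min(x))
}

ratio_weights_sketch <- function(x) {
  # Assumes strictly positive category values so that x_k + x_l > 0.
  1 - (outer(x, x, "-") / outer(x, x, "+"))^2 /
    ((max(x) - min(x)) / (max(x) + min(x)))^2
}

x <- 1:4
w_radical <- radical_weights_sketch(x)
w_ratio <- ratio_weights_sketch(x)
print(round(w_radical, 3))
print(round(w_ratio, 3))
```

Under these definitions both matrices have 1 on the diagonal and 0 for the pair of extreme categories; ratio weights also depend on the magnitude of the category values, not only on their difference.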
Scott's coefficient for 2 raters
scott2.table(ratings, weights = identity.weights(1:ncol(ratings)), conflev = 0.95, N = Inf)
| Argument | Description |
|---|---|
| ratings | A square table of ratings (assume no missing ratings). |
| weights | An optional matrix that contains the weights used in the weighted analysis. By default, this parameter contains the identity weight matrix, which leads to the unweighted analysis. |
| conflev | An optional parameter that specifies the confidence level used for constructing confidence intervals. By default the function assumes the standard value of 95%. |
| N | An optional parameter representing the finite population size, if any. It is used to apply the finite population correction to the standard error. Its default value is infinity. |

A data frame containing the following 5 variables: coeff.name, coeff.val, coeff.se, coeff.ci, coeff.pval.
#The dataset "cont3x3abstractors" comes with this package. Analyze it as follows:
scott2.table(cont3x3abstractors) #Yields Scott's Pi coefficient along with precision measures
scott <- scott2.table(cont3x3abstractors)$coeff.val #Yields Scott's coefficient alone.
scott
q <- nrow(cont3x3abstractors) #Number of categories
scott2.table(cont3x3abstractors, weights = quadratic.weights(1:q)) #Weighted Scott's coefficient
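The unweighted Scott's Pi point estimate can be sketched in base R, assuming its standard definition: observed agreement pa is the diagonal share of the table, chance agreement pe is the sum of squared pooled category proportions (the average of the row and column marginals), and the coefficient is (pa - pe) / (1 - pe). The helper `scott_pi_sketch` and the toy table are hypothetical; the sketch does not reproduce the variance or confidence interval computed by `scott2.table`.

```r
# Hypothetical helper (not part of irrCAC): unweighted Scott's Pi
# from a q x q table of counts for two raters.
scott_pi_sketch <- function(tab) {
  m <- as.matrix(tab)
  p <- m / sum(m)                          # cell proportions
  pa <- sum(diag(p))                       # observed agreement
  pk <- (rowSums(p) + colSums(p)) / 2      # pooled category proportions
  pe <- sum(pk^2)                          # chance agreement
  (pa - pe) / (1 - pe)
}

# Toy 2 x 2 table, made up for illustration only.
toy_tab <- matrix(c(20,  5,
                    10, 15), nrow = 2, byrow = TRUE)
scott_toy <- scott_pi_sketch(toy_tab)
print(scott_toy)
```

For the toy table, pa = 0.7 and pe = 0.505, so the coefficient is 0.195 / 0.495, roughly 0.394.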
An R function for trimming leading and trailing blanks
trim(x)
| Argument | Description |
|---|---|
| x | A string variable. |

A string variable where leading and trailing blanks are trimmed.
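The same effect can be obtained with base R alone. The sketch below uses a regular expression with `gsub` and, for comparison, the base function `trimws`; the name `trim_sketch` is hypothetical and is not claimed to match how the package implements `trim`.

```r
# Hypothetical equivalent of trimming leading and trailing blanks.
trim_sketch <- function(x) gsub("^\\s+|\\s+$", "", x)

trimmed <- trim_sketch("  rater A  ")
print(trimmed)

# Base R's trimws() performs the same kind of trimming.
print(trimws("  rater A  "))
```

Only the blanks at the two ends are removed; the blank inside "rater A" is kept.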
This dataset shows how 4 raters classified 10 subjects into 5 categories labeled as q1, q2, q3, q4 and q5. Each record is associated with a subject and shows how the 4 raters are distributed across the 5 categories they assigned the subject to.
x.dist10x5
A data frame of 10 rows and 6 columns.
Patient's identifier
Number of raters who classified the subject into category q1
Number of raters who classified the subject into category q2
Number of raters who classified the subject into category q3
Number of raters who classified the subject into category q4
Number of raters who classified the subject into category q5
K. Gwet, PhD.
Dataset of 15 psychiatric patients where each row identifies a patient and shows how the 6 psychiatrists are distributed across 5 mental conditions. These 5 conditions are depression, personality disorder, schizophrenia, neurosis, and other.
x.dist6x5psy
A data frame of 15 rows and 6 columns.
Patient's identifier
Number of raters to have diagnosed the patient with depression
Number of raters to have diagnosed the patient with personality disorder
Number of raters to have diagnosed the patient with schizophrenia
Number of raters to have diagnosed the patient with neurosis
Number of raters to have diagnosed the patient with other condition
K. Gwet, PhD.
Dataset of 10 subjects and the categorical ratings assigned to them by 4 raters. The 5 categories available to raters are labeled as 1, 2, 3, 4, 5.
x.raw10x4
A data frame of 10 rows and 5 columns.
Patient's identifier
Category into which rater1 classified the subject
Category into which rater2 classified the subject
Category into which rater3 classified the subject
Category into which rater4 classified the subject
K. Gwet, PhD.
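Raw ratings of this kind (one column per rater) can be summarized into the distribution format (one column per category, holding the number of raters who chose it). The base R sketch below shows one way to do the conversion; the helper `raw_to_dist_sketch` and the toy data are hypothetical, and nothing here implies that any packaged distribution dataset was derived this way.

```r
# Hypothetical helper (not part of irrCAC): count, for each subject,
# how many raters chose each category.
raw_to_dist_sketch <- function(ratings, categ) {
  ratings <- as.matrix(ratings)
  # table() on a factor with fixed levels keeps unused categories as zeros;
  # missing ratings (NA) are simply not counted.
  dist <- t(apply(ratings, 1, function(row) table(factor(row, levels = categ))))
  colnames(dist) <- categ
  dist
}

# Toy raw ratings: 3 subjects, 4 raters, categories 1..5 (made up).
toy_raw <- data.frame(
  Rater1 = c(1, 2, 5),
  Rater2 = c(1, 3, 5),
  Rater3 = c(2, 3, 5),
  Rater4 = c(1, 3, 4)
)
toy_dist <- raw_to_dist_sketch(toy_raw, categ = 1:5)
print(toy_dist)
```

With no missing ratings, every row of the result sums to the number of raters, which is the property the distribution datasets rely on.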
Dataset of raw ratings assigned to 12 units by 4 raters. Each row is associated with a unit identifier and each column with a rater. The columns are named Rater1, Rater2, Rater3 and Rater4 and contain the categories to which each unit was assigned. Category values are 1, 2, 3, 4, 5 and some ratings are missing.
x.raw12x4
A data frame of 12 rows and 5 columns containing integer values 1, 2, 3, 4 and 5. Missing data points are represented with a dot (".").
Patient's identifier
Category into which Rater1 classified the unit
Category into which Rater2 classified the unit
Category into which Rater3 classified the unit
Category into which Rater4 classified the unit
K. Gwet, PhD.