Package 'binGroup2' reference manual

Title:	Identification and Estimation using Group Testing
Description:	Methods for the group testing identification problem: 1) Operating characteristics (e.g., expected number of tests) for commonly used hierarchical and array-based algorithms, and 2) Optimal testing configurations for these same algorithms. Methods for the group testing estimation problem: 1) Estimation and inference procedures for an overall prevalence, and 2) Regression modeling for commonly used hierarchical and array-based algorithms.
Authors:	Brianna Hitt [aut, cre], Christopher Bilder [aut], Frank Schaarschmidt [aut], Brad Biggerstaff [aut], Christopher McMahan [aut], Joshua Tebbs [aut], Boan Zhang [ctb], Michael Black [ctb], Peijie Hou [ctb], Peng Chen [ctb], Minh Nguyen [ctb]
Maintainer:	Brianna Hitt <[email protected]>
License:	GPL (>= 3)
Version:	1.3.1
Built:	2025-02-06 06:55:15 UTC
Source:	CRAN

Extract the accuracy measures from group testing results

Description

Extract the accuracy measures from objects of class "opChar" returned by operatingCharacteristics1 (opChar1) or operatingCharacteristics2 (opChar2).

Usage

Accuracy(object, individual = TRUE, ...)

Arguments

`object`	An object of class "opChar", from which the accuracy measures are to be extracted.
`individual`	A logical argument that determines whether the accuracy measures for each individual (`individual=TRUE`) are to be included.
`...`	Additional arguments to be passed to `Accuracy` (e.g., `digits` to be passed to `round` or `signif` for appropriate rounding).

Details

The Accuracy function gives the accuracy measures for each individual in object and the overall accuracy measures for the algorithm. If individual=TRUE, individual accuracy measures are provided for each individual specified in the `a` argument of the call to operatingCharacteristics1 (opChar1) or operatingCharacteristics2 (opChar2).

Accuracy measures included are the pooling sensitivity, pooling specificity, pooling positive predictive value, and pooling negative predictive value. The overall accuracy measures displayed are weighted averages of the corresponding individual accuracy measures for all individuals in the algorithm. Expressions for these averages are provided in the Supplementary Material for Hitt et al. (2019). For more information, see the 'Details' section for the operatingCharacteristics1 (opChar1) or operatingCharacteristics2 (opChar2) function.

The rows in the matrices of individual accuracy measures correspond to each unique set of accuracy measures in the algorithm. Individuals with the same set of accuracy measures are displayed together in a single row of the matrix. The columns correspond to the pooling sensitivity, pooling specificity, pooling positive predictive value, pooling negative predictive value, and the indices for the individuals in each row of the matrix. Individual accuracy measures are provided only if individual=TRUE.
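To make the four measures concrete, the following base R sketch computes them for one individual under non-informative two-stage (Dorfman) testing. It is an illustration only, not the package's implementation: the helper name dorfman_accuracy is hypothetical, and the formulas assume a common prevalence p for all individuals, independent test errors, and that an individual is declared positive only when both the group test and the individual retest are positive.

```r
# Accuracy measures for one individual under two-stage (Dorfman) testing.
# Hypothetical helper for illustration; not a binGroup2 function.
dorfman_accuracy <- function(p, n, Se, Sp) {
  # Se and Sp may have length 1 (same at both stages) or 2 (stage 1, stage 2)
  Se <- rep(Se, length.out = 2)
  Sp <- rep(Sp, length.out = 2)

  # Probability that the group tests positive given that this member is negative
  others_neg <- (1 - p)^(n - 1)
  grp_pos_given_neg <- Se[1] * (1 - others_neg) + (1 - Sp[1]) * others_neg

  PSe <- Se[1] * Se[2]                        # pooling sensitivity
  PSp <- 1 - grp_pos_given_neg * (1 - Sp[2])  # pooling specificity
  PPPV <- p * PSe / (p * PSe + (1 - p) * (1 - PSp))        # pooling PPV
  PNPV <- (1 - p) * PSp / ((1 - p) * PSp + p * (1 - PSe))  # pooling NPV

  c(PSe = PSe, PSp = PSp, PPPV = PPPV, PNPV = PNPV)
}

acc <- dorfman_accuracy(p = 0.05, n = 10, Se = 0.99, Sp = 0.99)
print(round(acc, 4))
```

Under these assumptions the pooling specificity exceeds the single-test specificity, because a truly negative individual is misclassified only if the group test is positive and the retest is a false positive.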

Value

A list containing:

`Individual`	matrix detailing the accuracy measures for each individual from `object` (for objects returned by `opChar1`).
`Disease 1 Individual`	matrix detailing the accuracy measures pertaining to disease 1 for each individual from `object` (for objects returned by `opChar2`).
`Disease 2 Individual`	matrix detailing the accuracy measures pertaining to disease 2 for each individual from `object` (for objects returned by `opChar2`).
`Overall`	matrix detailing the overall accuracy measures for the algorithm from `object`.

Author(s)

Brianna D. Hitt

Examples

config.mat <- matrix(data = c(rep(1, 10), 1:10),
                     nrow = 2, ncol = 10, byrow = TRUE)
res1 <- opChar1(algorithm = "D2", p = 0.05, Se = 0.99, Sp = 0.99,
                hier.config = config.mat)
Accuracy(res1, individual = FALSE)
Accuracy(res1, individual = TRUE)

res2 <- opChar2(algorithm = "A2M",
                p.vec = c(0.92, 0.05, 0.02, 0.01),
                Se = rep(0.95, 2), Sp = rep(0.99, 2),
                rowcol.sz = 8)
Accuracy(res2)

binGroup2: Identification and Estimation using Group Testing

Description

Methods for the group testing identification and estimation problems.

Details

Methods for identification of positive items in group testing designs: Operating characteristics (e.g., expected number of tests) are calculated for commonly used hierarchical and array-based algorithms. Optimal testing configurations for an algorithm can be found as well. Please see Hitt et al. (2019) for specific details.

Methods for estimation and inference for proportions in group testing designs: For estimating one proportion or the difference of proportions, confidence interval methods are included that account for different pool sizes. Functions for hypothesis testing of proportions, calculation of power, and calculation of the expected width of confidence intervals are also included. Furthermore, regression methods and simulation of group testing data are implemented for simple pooling (Dorfman testing with or without retests), halving, and array testing designs.

The binGroup2 package is based upon the binGroup package that was originally designed for the group testing estimation problem. Over time, additional functions for estimation and for the group testing identification problem were included. Due to the diverse styles resulting from these additions, we have created binGroup2 as a way to unify functions in a coherent structure and incorporate additional functions for identification. The binGroup2 package provides all the main functionality from the binGroup package, and can be used in place of the binGroup package. The name “binGroup” originates from the assumption in basic estimation for group testing that the number of positive groups has a binomial distribution. While more advanced estimation methods no longer make this assumption, we continue with the binGroup name for consistency.
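The binomial assumption mentioned above leads to a closed-form point estimate in the simplest setting. As a minimal sketch (assuming a perfect assay and n groups of equal size m, which the package does not require), the probability that a group tests positive is 1 - (1 - p)^m, so the maximum likelihood estimate of p follows by inverting that relation at the observed proportion of positive groups. The helper name gt_mle is hypothetical and is not part of binGroup2.

```r
# MLE of the individual-level prevalence under the basic binomial model.
# Assumes a perfect assay and equal group sizes; illustration only.
gt_mle <- function(x, n, m) {
  theta_hat <- x / n            # observed proportion of positive groups
  1 - (1 - theta_hat)^(1 / m)   # invert theta = 1 - (1 - p)^m
}

# 3 positive groups out of 24, each of size 7
p_hat <- gt_mle(x = 3, n = 24, m = 7)
print(p_hat)
```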

Bilder (2019a,b) provide introductions to group testing. These papers and additional details about group testing are available at http://chrisbilder.com/grouptesting/.

This research was supported by the National Institutes of Health under grant R01 AI121351.

Identification

The binGroup2 package focuses on the group testing identification problem using hierarchical and array-based group testing algorithms.

The OTC1 function implements a number of group testing algorithms, described in Hitt et al. (2019), which calculate the operating characteristics and find the optimal testing configuration over a range of possible initial group sizes and/or testing configurations (sets of subsequent group sizes). The OTC2 function does the same with a multiplex assay that tests for two diseases.
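The idea of searching over initial group sizes can be sketched in base R for the simplest case, non-informative two-stage (Dorfman) testing with the expected number of tests per individual as the objective. This is not how OTC1 is implemented; dorfman_et is a hypothetical helper, and the formula assumes a common prevalence p and independent test errors.

```r
# Expected number of tests per individual for Dorfman testing with group
# size n. Hypothetical helper for illustration; not a binGroup2 function.
dorfman_et <- function(n, p, Se = 1, Sp = 1) {
  # Probability that the initial group test is positive
  prob_group_pos <- Se * (1 - (1 - p)^n) + (1 - Sp) * (1 - p)^n
  # One group test plus n retests whenever the group tests positive
  (1 + n * prob_group_pos) / n
}

group_sizes <- 2:100
et <- dorfman_et(group_sizes, p = 0.01)
best <- group_sizes[which.min(et)]
cat("optimal group size:", best, "\n")
cat("expected tests per individual:", round(min(et), 4), "\n")
```

With p = 0.01 and a perfect assay, the minimum falls at a group size of 11, with roughly 0.2 expected tests per individual.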

The operatingCharacteristics1 (opChar1) and operatingCharacteristics2 (opChar2) functions calculate operating characteristics for a specified testing configuration with assays that test for one and two diseases, respectively.

These functions allow the sensitivity and specificity to differ across stages of testing. This means that the accuracy of the diagnostic test can differ for stages in a hierarchical testing algorithm or between row/column testing and individual testing in an array testing algorithm.

Estimation

The binGroup2 package also provides functions for estimation and inference for proportions in group testing designs.

The propCI function calculates the point estimate and confidence intervals for a single proportion from group testing data. The propDiffCI function does the same for the difference of proportions. A number of confidence interval methods are available for groups of equal or different sizes.
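For equal group sizes and a perfect assay, an exact interval can be obtained by computing Clopper-Pearson limits for the probability that a group is positive and transforming them to the individual scale. The sketch below illustrates that idea in base R; gt_cp_ci is a hypothetical helper, and its output is not guaranteed to match what propCI reports.

```r
# Clopper-Pearson interval for prevalence with n groups of equal size m.
# Assumes a perfect assay; illustration only, not the propCI implementation.
gt_cp_ci <- function(x, n, m, conf.level = 0.95) {
  alpha <- 1 - conf.level
  # Exact limits for the probability that a group tests positive
  lower_theta <- if (x == 0) 0 else qbeta(alpha / 2, x, n - x + 1)
  upper_theta <- if (x == n) 1 else qbeta(1 - alpha / 2, x + 1, n - x)
  # Map group-level quantities back to the individual-level prevalence
  to_p <- function(theta) 1 - (1 - theta)^(1 / m)
  c(estimate = to_p(x / n), lower = to_p(lower_theta), upper = to_p(upper_theta))
}

ci <- gt_cp_ci(x = 3, n = 24, m = 7)
print(round(ci, 4))
```

The transformation is monotone increasing in the group-level probability, so the coverage of the group-level interval carries over to the individual-level interval.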

The gtWidth function calculates the expected width of confidence intervals in group testing. The gtTest function calculates p-values for hypothesis tests of single proportions. The gtPower function calculates power to reject a hypothesis.

The designPower function iterates either the number of groups or group size in a one-parameter group testing design until a pre-specified power level is achieved. The designEst function finds the optimal group size corresponding to the minimal mean-squared error of the point estimator.

The gtReg function implements regression methods and the gtSim function simulates group testing data for simple pooling, halving, and array testing designs.

Author(s)

Maintainer: Brianna Hitt [email protected] (ORCID)

Authors:

Christopher Bilder (ORCID)
Frank Schaarschmidt (ORCID)
Brad Biggerstaff (ORCID)
Christopher McMahan (ORCID)
Joshua Tebbs (ORCID)

Other contributors:

Boan Zhang [contributor]
Michael Black [contributor]
Peijie Hou [contributor]
Peng Chen [contributor]
Minh Nguyen [contributor]

References

Altman, D., Bland, J. (1994). “Diagnostic tests 1: Sensitivity and specificity.” BMJ, 308, 1552.

Altman, D., Bland, J. (1994). “Diagnostic tests 2: Predictive values.” BMJ, 309, 102.

Biggerstaff, B. (2008). “Confidence intervals for the difference of proportions estimated from pooled samples.” Journal of Agricultural, Biological, and Environmental Statistics, 13, 478–496.

Bilder, C., Tebbs, J., Chen, P. (2010). “Informative retesting.” Journal of the American Statistical Association, 105, 942–955.

Bilder, C., Tebbs, J., McMahan, C. (2019). “Informative group testing for multiplex assays.” Biometrics, 75, 278–288.

Bilder, C. (2019a). “Group Testing for Estimation.” Wiley StatsRef: Statistics Reference Online.

Bilder, C. (2019b). “Group Testing for Identification.” Wiley StatsRef: Statistics Reference Online.

Bilder, C., Iwen, P., Abdalhamid, B., Tebbs, J., McMahan, C. (2020). “Tests in short supply? Try group testing.” Significance, 17, 15.

Black, M., Bilder, C., Tebbs, J. (2012). “Group testing in heterogeneous populations by using halving algorithms.” Journal of the Royal Statistical Society. Series C: Applied Statistics, 61, 277–290.

Black, M., Bilder, C., Tebbs, J. (2015). “Optimal retesting configurations for hierarchical group testing.” Journal of the Royal Statistical Society. Series C: Applied Statistics, 64, 693–710.

Graff, L., Roeloffs, R. (1972). “Group testing in the presence of test error; an extension of the Dorfman procedure.” Technometrics, 14, 113–122.

Hepworth, G. (1996). “Exact confidence intervals for proportions estimated by group testing.” Biometrics, 52, 1134–1146.

Hepworth, G., Biggerstaff, B. (2017). “Bias correction in estimating proportions by pooled testing.” Journal of Agricultural, Biological, and Environmental Statistics, 22, 602–614.

Hitt, B., Bilder, C., Tebbs, J., McMahan, C. (2019). “The objective function controversy for group testing: Much ado about nothing?” Statistics in Medicine, 38, 4912–4923.

Hou, P., Tebbs, J., Wang, D., McMahan, C., Bilder, C. (2021). “Array testing with multiplex assays.” Biostatistics, 21, 417–431.

Malinovsky, Y., Albert, P., Roy, A. (2016). “Reader reaction: A note on the evaluation of group testing algorithms in the presence of misclassification.” Biometrics, 72, 299–302.

McMahan, C., Tebbs, J., Bilder, C. (2012a). “Informative Dorfman Screening.” Biometrics, 68, 287–296.

McMahan, C., Tebbs, J., Bilder, C. (2012b). “Two-Dimensional Informative Array Testing.” Biometrics, 68, 793–804.

Schaarschmidt, F. (2007). “Experimental design for one-sided confidence intervals or hypothesis tests in binomial group testing.” Communications in Biometry and Crop Science, 2, 32–40. ISSN 1896-0782.

Swallow, W. (1985). “Group testing for estimating infection rates and probabilities of disease transmission.” Phytopathology, 75, 882–889.

Tebbs, J., Bilder, C. (2004). “Confidence interval procedures for the probability of disease transmission in multiple-vector-transfer designs.” Journal of Agricultural, Biological, and Environmental Statistics, 9, 75–90.

Vansteelandt, S., Goetghebeur, E., Verstraeten, T. (2000). “Regression models for disease prevalence with diagnostic tests on pools of serum samples.” Biometrics, 56, 1126–1133.

Verstraeten, T., Farah, B., Duchateau, L., Matu, R. (1998). “Pooling sera to reduce the cost of HIV surveillance: a feasibility study in a rural Kenyan district.” Tropical Medicine & International Health, 3, 747–750.

Xie, M. (2001). “Regression analysis of group testing samples.” Statistics in Medicine, 20, 1957–1969.

Examples


# 1) Identification using hierarchical and array-based
#   group testing algorithms with an assay that tests
#   for one disease.

# 1.1) Find the optimal testing configuration over a
#   range of initial group sizes, using informative
#   three-stage hierarchical testing, where
#   p denotes the overall prevalence of disease (mean
#    parameter of a beta distribution);
#   Se denotes the sensitivity of the diagnostic test;
#   Sp denotes the specificity of the diagnostic test;
#   group.sz denotes the range of initial pool sizes
#   for consideration;
#   obj.fn specifies the objective functions for which
#   to find results; and
#   alpha is the heterogeneity level.
set.seed(1002)
results1 <- OTC1(algorithm = "ID3", p = 0.025, Se = 0.95,
                 Sp = 0.95, group.sz = 3:20,
                 obj.fn = "ET", alpha = 2)
summary(results1)

# 1.2) Find the optimal testing configuration using
#   non-informative array testing without master pooling.
# The sensitivity and specificity differ for row/column
#   testing and individual testing.
results2 <- OTC1(algorithm = "A2", p = 0.05,
                 Se = c(0.95, 0.99), Sp = c(0.95, 0.98),
                 group.sz = 3:15, obj.fn = "ET")
summary(results2)

# 1.3) Calculate the operating characteristics using
#   informative two-stage hierarchical (Dorfman) testing,
#   implemented via the pool-specific optimal Dorfman
#   (PSOD) method described in McMahan et al. (2012a).
# Hierarchical testing configurations are specified by
#   a matrix in the hier.config argument. The rows of
#   the matrix correspond to the stages of the
#   hierarchical testing algorithm, the columns
#   correspond to the individuals to be tested, and the
#   cell values correspond to the group number of each
#   individual at each stage.
config.mat <- matrix(data = c(rep(1, 5), rep(2, 4), 3, 1:10),
                     nrow = 2, ncol = 10, byrow = TRUE)
set.seed(8791)
results3 <- opChar1(algorithm = "ID2", p = 0.02,
                    Se = 0.95, Sp = 0.99,
                    hier.config = config.mat, alpha = 0.5)
summary(results3)

# 1.4) Calculate the operating characteristics using
#   non-informative four-stage hierarchical testing.
config.mat <- matrix(data = c(rep(1, 15), rep(c(1, 2, 3), each = 5),
                              rep(1, 3), rep(2, 2),
                              rep(3, 3), rep(4, 2),
                              rep(5, 4), 6, 1:15),
                     nrow = 4, ncol = 15, byrow = TRUE)
results4 <- opChar1(algorithm = "D4", p = 0.008,
                    Se = 0.96, Sp = 0.98,
                    hier.config = config.mat,
                    a = c(1, 4, 6, 9, 11, 15))
summary(results4)


# 2) Identification using hierarchical and array-based
#   group testing algorithms with a multiplex assay that
#   tests for two diseases.

# 2.1) Find the optimal testing configuration using
#   non-informative two-stage hierarchical testing, given
#   p.vec, a vector of overall joint probabilities of disease;
#   Se, a vector of sensitivity values for each disease; and
#   Sp, a vector of specificity values for each disease.
# Se and Sp can also be specified as a matrix, where one
#   value is specified for each disease at each stage of
#   testing.
results5 <- OTC2(algorithm = "D2",
                 p.vec = c(0.90, 0.04, 0.04, 0.02),
                 Se = c(0.99, 0.99), Sp = c(0.99, 0.99),
                 group.sz = 3:20)
summary(results5)

# 2.2) Calculate the operating characteristics for
#   informative five-stage hierarchical testing, given
#   alpha, a vector of shape parameters for the
#   Dirichlet distribution;
#   Se, a matrix of sensitivity values; and
#   Sp, a matrix of specificity values.
Se <- matrix(data = rep(0.95, 10), nrow = 2, ncol = 5, byrow = TRUE)
Sp <- matrix(data = rep(0.99, 10), nrow = 2, ncol = 5, byrow = TRUE)
config.mat <- matrix(data = c(rep(1, 24), rep(1, 18),
                              rep(2, 6), rep(1, 9),
                              rep(2, 9), rep(3, 4), 4, 5,
                              rep(1, 6), rep(2, 3),
                              rep(3, 5), rep(4, 4),
                              rep(5, 3), 6, rep(NA, 2),
                              1:21, rep(NA, 3)),
                     nrow = 5, ncol = 24, byrow = TRUE)
results6 <- opChar2(algorithm = "ID5",
                    alpha = c(18.25, 0.75, 0.75, 0.25),
                    Se = Se, Sp = Sp,
                    hier.config = config.mat)
summary(results6)

# 3) Estimation of the overall disease prevalence and
#   calculation of confidence intervals.

# 3.1) Suppose 3 groups out of 24 test positively.
#   Each group has a size of 7.
propCI(x = 3, m = 7, n = 24, ci.method = "CP")
propCI(x = 3, m = 7, n = 24, ci.method = "Blaker")
propCI(x = 3, m = 7, n = 24, ci.method = "score")
propCI(x = 3, m = 7, n = 24, ci.method = "soc")

# 3.2) Consider the following situation:
#   0 out of 5 groups test positively with groups
#   of size 1 (individual testing),
#   0 out of 5 groups test positively with groups of size 5,
#   1 out of 5 groups test positively with groups of size 10,
#   2 out of 5 groups test positively with groups of size 50
propCI(x = c(0, 0, 1, 2), m = c(1, 5, 10, 50),
       n = c(5, 5, 5, 5), pt.method = "Gart",
       ci.method = "skew-score")

# 4) Estimate a group testing regression model.

# 4.1) Fit a group testing regression model with
#   simple pooling using the "hivsurv" dataset.
data(hivsurv)
fit1 <- gtReg(type = "sp",
              formula = groupres ~ AGE + EDUC.,
              data = hivsurv, groupn = gnum,
              sens = 0.9, spec = 0.9, method = "Xie")
summary(fit1)

# 4.2) Simulate data for the halving protocol, and
#   fit a group testing regression model.
set.seed(46)
gt.data <- gtSim(type = "halving", par = c(-6, 0.1),
                 gshape = 17, gscale = 1.4,
                 size1 = 1000, size2 = 5,
                 sens = 0.95, spec = 0.95)
fit2 <- gtReg(type = "halving", formula = gres ~ x,
              data = gt.data, groupn = groupn,
              subg = subgroup, retest = retest,
              sens = 0.95, spec = 0.95,
              start = c(-6, 0.1), trace = TRUE)
summary(fit2)



Extract coefficients from a fitted group testing model

Description

Extract coefficients from objects of class "gtReg" returned by gtReg.

Usage

## S3 method for class 'gtReg'
coef(object, digits = max(3, getOption("digits") - 3), ...)

## S3 method for class 'gtReg'
coefficients(object, digits = max(3, getOption("digits") - 3), ...)

Arguments

`object`	An object of class "gtReg", created by `gtReg`, from which the coefficients are to be extracted.
`digits`	digits for rounding.
`...`	not currently used.

Value

Model coefficients extracted from `object`.

Author(s)

Brianna D. Hitt

Examples

data(hivsurv)
fit1 <- gtReg(formula = groupres ~ AGE * EDUC.,
              data = hivsurv, groupn = gnum,
              linkf = "probit")
coefficients(object = fit1)

Compare group testing results

Description

Compare group testing results from objects of class "opChar" returned by operatingCharacteristics1 (opChar1) or operatingCharacteristics2 (opChar2).

Usage

CompareConfig(object1, object2)

Arguments

`object1`	An object of class "opChar" containing group testing results.
`object2`	A second object of class "opChar" containing group testing results.

Details

The CompareConfig function compares group testing results from two objects of class "opChar". The function creates a data frame with these comparisons.

Value

A data frame with the expected percent reduction in tests (PercentReductionTests) and the expected increase in testing capacity (PercentIncreaseTestCap) when using the second testing configuration rather than the first testing configuration. Positive values for these quantities indicate that the second testing configuration is more efficient than the first.
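The two quantities can be illustrated with a small base R calculation. The formulas below are an assumption about one natural way to define them from the expected number of tests per individual under each configuration (testing capacity is taken to be inversely proportional to that expectation); they are not copied from the package source, and compare_tests is a hypothetical helper.

```r
# Percent reduction in tests and percent increase in testing capacity when
# moving from configuration 1 to configuration 2.
# et1, et2: expected tests per individual (assumed definitions; see lead-in).
compare_tests <- function(et1, et2) {
  data.frame(
    PercentReductionTests = 100 * (et1 - et2) / et1,
    PercentIncreaseTestCap = 100 * (et1 / et2 - 1)
  )
}

# Halving the expected number of tests doubles the assumed testing capacity
cmp <- compare_tests(et1 = 0.50, et2 = 0.25)
print(cmp)
```

Under these definitions both values are positive exactly when the second configuration needs fewer expected tests than the first, which matches the sign convention described above.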

Author(s)

Brianna D. Hitt and Christopher R. Bilder

Examples

config.mat1 <- matrix(data = c(rep(1, 10), rep(1:2, each = 5), 1:10),
                      nrow = 3, ncol = 10, byrow = TRUE)
res1 <- opChar1(algorithm = "D3", p = 0.05, Se = 0.99, Sp = 0.99,
                hier.config = config.mat1)
config.mat2 <- matrix(data = c(rep(1, 10), 1:10),
                      nrow = 2, ncol = 10, byrow = TRUE)
res2 <- opChar1(algorithm = "D2", p = 0.05, Se = 0.99, Sp = 0.99,
                hier.config = config.mat2)
CompareConfig(res2, res1)

config.mat3 <- matrix(data = c(rep(1, 10), rep(1, 5),
                               rep(2, 4), 3, 1:9, NA),
                      nrow = 3, ncol = 10, byrow = TRUE)
Se <- matrix(data = rep(0.95, 6), nrow = 2, ncol = 3,
             dimnames = list(Infection = 1:2, Stage = 1:3))
Sp <- matrix(data = rep(0.99, 6), nrow = 2, ncol = 3,
             dimnames = list(Infection = 1:2, Stage = 1:3))
res3 <- opChar2(algorithm = "D3", p.vec = c(0.95, 0.02, 0.02, 0.01),
                Se = Se, Sp = Sp, hier.config = config.mat3)
config.mat4 <- matrix(data = c(rep(1, 12), rep(1, 6), rep(2, 6),
                               rep(1, 4), rep(2, 2), rep(3, 3),
                               rep(4, 3), 1:12),
                    nrow = 4, ncol = 12, byrow = TRUE)
Se <- matrix(data = rep(0.95, 8), nrow = 2, ncol = 4,
             dimnames = list(Infection = 1:2, Stage = 1:4))
Sp <- matrix(data = rep(0.99, 8), nrow = 2, ncol = 4,
             dimnames = list(Infection = 1:2, Stage = 1:4))
res4 <- opChar2(algorithm = "D4", p.vec = c(0.92, 0.05, 0.02, 0.01),
                Se = Se, Sp = Sp, hier.config = config.mat4)
CompareConfig(res4, res3)

Access the testing configurations returned from an object

Description

Config is a generic function that extracts testing configurations from an object.

Usage

Config(object, ...)

Arguments

`object`	An object from which the testing configurations are to be extracted.
`...`	Additional arguments to be passed to `Config`.

Author(s)

Christopher R. Bilder

Examples

# Find the optimal testing configuration for
#   non-informative two-stage hierarchical testing.
res1 <- OTC1(algorithm = "D2", p = 0.01, Se = 0.99, Sp = 0.99,
             group.sz = 2:100, obj.fn = c("ET", "MAR", "GR1"),
             weights = matrix(data = c(1,1), nrow = 1, ncol = 2))
Config(res1)

Extract the testing configuration from group testing results

Description

Extract the testing configuration from objects of class "opChar" returned by operatingCharacteristics1 (opChar1) or operatingCharacteristics2 (opChar2).

Usage

## S3 method for class 'opChar'
Config(object, ...)

Arguments

`object`	An object of class "opChar", from which the testing configuration is to be extracted.
`...`	currently not used.

Value

A data frame specifying elements of the testing configuration.

Author(s)

Brianna D. Hitt

Examples

config.mat <- matrix(data = c(rep(1, 10), 1:10),
                     nrow = 2, ncol = 10, byrow = TRUE)
res1 <- opChar1(algorithm = "D2", p = 0.05, Se = 0.99, Sp = 0.99,
        hier.config = config.mat)
Config(res1)

config.mat <- matrix(data = c(rep(1, 20), rep(1, 10), rep(2, 10),
                             rep(c(1, 2, 3, 4), each = 5),
                             rep(1, 3), rep(2, 2), rep(3, 3),
                             rep(4, 2), rep(5, 3), rep(6, 2),
                             rep(7, 3), rep(8, 2), 1:20),
                    nrow = 5, ncol = 20, byrow = TRUE)
Se <- matrix(data = rep(0.95, 10), nrow = 2, ncol = 5,
             dimnames = list(Infection = 1:2, Stage = 1:5))
Sp <- matrix(data = rep(0.99, 10), nrow = 2, ncol = 5,
             dimnames = list(Infection = 1:2, Stage = 1:5))
res2 <- opChar2(algorithm = "ID5",
                alpha = c(18.25, 0.75, 0.75, 0.25),
                Se = Se, Sp = Sp, hier.config = config.mat)
Config(res2)

Extract the testing configuration from group testing results

Description

Extract the testing configuration from objects of class "OTC" returned by OTC1 or OTC2.

Usage

## S3 method for class 'OTC'
Config(object, n = 5, top.overall = FALSE, ...)

Arguments

`object`	An object of class "OTC", from which the testing configuration is to be extracted.
`n`	Number of testing configurations.
`top.overall`	logical; if `TRUE`, the best overall testing configurations are returned; if `FALSE`, the best testing configurations by initial group size are returned.
`...`	currently not used.

Value

A data frame providing the best testing configurations.

Author(s)

Christopher R. Bilder

Examples

res1 <- OTC1(algorithm = "D3", p = 0.05, Se = 0.99, Sp = 0.99,
             group.sz = 3:15, obj.fn = "ET")
Config(res1)

Optimal group size determination based on minimal MSE when estimating an overall prevalence

Description

Find the group size s for a fixed number of groups n and an assumed true proportion p.tr, for which the mean squared error (MSE) of the point estimator is minimal and bias is within a restriction.

Usage

designEst(n, smax, p.tr, biasrest = 0.05)

Arguments

`n`	integer specifying the fixed number of groups.
`smax`	integer specifying the maximum group size allowed in the planning of the design.
`p.tr`	assumed true proportion of the "positive" trait in the population, specified as a value between 0 and 1.
`biasrest`	a value between 0 and 1 specifying the absolute bias maximally allowed.

Details

Swallow (1985) recommends using the upper bound of the expected range of the true proportion p.tr when optimizing the design. For further details, see Swallow (1985). Note that the specified number of groups n must be less than 1020.
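
The quantities involved in this optimization can be illustrated with a short base-R sketch. It assumes a perfect assay and the usual maximum likelihood estimator $\hat{p} = 1 - (1 - Y/n)^{1/s}$, where $Y$ is the number of positive groups. The helper name mse_group_est is hypothetical and is not part of binGroup2, and the package's internal computation may differ.

```r
# Hypothetical helper (not part of binGroup2): exact expectation, bias,
# variance, and MSE of the MLE of p under group testing with a perfect assay.
mse_group_est <- function(n, s, p.tr) {
  theta <- 1 - (1 - p.tr)^s            # probability that a group tests positive
  y <- 0:n                             # possible numbers of positive groups
  prob <- dbinom(y, size = n, prob = theta)
  p.hat <- 1 - (1 - y / n)^(1 / s)     # MLE for each possible outcome
  expected <- sum(prob * p.hat)
  bias <- expected - p.tr
  varp <- sum(prob * (p.hat - expected)^2)
  c(exp = expected, bias = bias, varp = varp, mse = varp + bias^2)
}

# Scan group sizes 1..smax for fixed n and p.tr, and pick the smallest MSE
# among the group sizes whose absolute bias stays within the restriction.
n <- 10; smax <- 100; p.tr <- 0.01; biasrest <- 0.05
res <- t(sapply(1:smax, function(s) mse_group_est(n, s, p.tr)))
ok <- abs(res[, "bias"]) <= biasrest
s.opt <- which(ok)[which.min(res[ok, "mse"])]
s.opt
```

Under these assumptions the estimator is unbiased for s = 1 and positively biased for larger group sizes, which is why a bias restriction is applied alongside the MSE criterion.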

Value

A list containing:

`call`	the function call
`result`	a data frame containing:
	`mse`	the mean squared error of the estimator.
	`sout`	the group size `s` for which the MSE of the estimator is minimal for the given `n` and `p.tr` and for which the bias restriction `biasrest` is not violated. In the case that the minimum MSE is achieved for a group size $s \geq smax$, the value of `smax` is returned.
	`exp`	the expected value of the estimator.
	`varp`	the variance of the estimator.
	`bias`	the bias of the estimator.
`bias.reached`	a logical value indicating whether the bias restriction `biasrest` was violated.
`smax.reached`	a logical value indicating whether the maximum group size allowed `smax` was reached.

Author(s)

This function was originally written by Frank Schaarschmidt as the estDesign function for the binGroup package. Minor modifications were made for inclusion in the binGroup2 package.

References

Swallow, W. (1985). “Group testing for estimating infection rates and probabilities of disease transmission.” Phytopathology, 75, 882–889.

Examples

# Compare to Table 1 in Swallow (1985):
designEst(n = 10, smax = 100, p.tr = 0.001)
designEst(n = 10, smax = 100, p.tr = 0.01)
designEst(n = 25, smax = 100, p.tr = 0.05)
designEst(n = 40, smax = 100, p.tr = 0.25)
designEst(n = 200, smax = 100, p.tr = 0.30)

Number of groups or group size needed to achieve a power level in one parameter group testing

Description

For a fixed number of groups, determine the group size needed to obtain a specified power level to reject a hypothesis for a proportion in one parameter group testing. Alternatively, for a fixed group size, determine the number of groups needed.

Usage

designPower(
  n,
  s,
  fixed = "s",
  delta,
  p.hyp,
  conf.level = 0.95,
  power = 0.8,
  alternative = "two.sided",
  method = "CP",
  biasrest = 0.05
)

Arguments

`n`	integer specifying the maximum number of groups `n` allowed when `fixed="s"` or the fixed number of groups when `fixed="n"`. When `fixed="s"`, a vector of two integers giving the range of `n` over which power shall be iterated is also allowed.
`s`	integer specifying the fixed group size (number of units per group) when `fixed="s"` or the maximum group size allowed in the planning of the design when `fixed="n"`.
`fixed`	character string specifying whether the number of groups `"n"` or the group size `"s"` is to be held at a fixed value.
`delta`	the absolute difference between the true proportion and the hypothesized proportion which shall be detectable with the specified power.
`p.hyp`	the proportion in the hypotheses, specified as a value between 0 and 1.
`conf.level`	confidence level of the decision. The default confidence level is 0.95.
`power`	level of power to be achieved, specified as a probability between 0 and 1.
`alternative`	character string defining the alternative hypothesis, either `"two.sided"`, `"less"`, or `"greater"`.
`method`	character string specifying the confidence interval method (see `propCI`) to be used.
`biasrest`	a value between 0 and 1, specifying the absolute bias maximally allowed for a point estimate.

Details

The power of a hypothesis test performed by a confidence interval is defined as the probability that a confidence interval excludes the threshold parameter (p.hyp) of the hypothesis.

When fixed="s", this function increases the number of groups until a pre-specified level of power is reached or the maximum number of groups n is reached. Since the power does not increase monotonically with increasing n for single proportions but oscillates between local maxima and minima, the simple iteration given here will generally result in selecting n for which the given confidence interval method shows a local minimum of coverage if the null hypothesis is true. Bias decreases monotonically as the number of groups increases (if other parameters are fixed). The resulting problems of choosing a number of groups which results in satisfactory power are solved in the following manner:

In the case that the pre-specified power is reached within the given range of n, the smallest n is returned for which at least this power is reached, as well as the actual power for this n.

In the case that the pre-specified power is not reached within the given range of n, the n for which maximum power is achieved is returned, along with the corresponding value of power.

In the case that the bias restriction is violated even for the largest n within the given range of n, simply that n will be returned for which power was largest in the given range.

Especially for large n, the calculation time may become large (particularly for the Blaker interval). Alternatively, the function gtPower might be used to calculate power and bias only for some particular combinations of the input arguments.

When fixed="n", this function increases the size of groups until a pre-specified level of power is reached. Since the power does not increase monotonically with increasing s for single proportions but oscillates between local maxima and minima, the simple iteration given here will generally result in selecting s for which the given confidence interval method shows a local minimum of coverage if the null hypothesis is true. Since the positive bias of the estimator in group testing increases with increasing group size, this function checks whether the bias is smaller than a pre-specified level (biasrest). If the bias violates this restriction for a given combination n, s, and delta, s will not be further increased and the actual power of the last acceptable group size s is returned.
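
The definition of power used here (the probability that the confidence interval excludes p.hyp) can be sketched in base R for one special case: alternative = "less" with a one-sided Clopper-Pearson upper limit and a perfect assay. The helper power_cp_less is hypothetical and is not part of binGroup2; the package's own calculation (see gtPower) may differ in its details.

```r
# Hypothetical helper (not part of binGroup2): exact power of a one-sided
# test of H0: p >= p.hyp based on a Clopper-Pearson upper limit, assuming a
# perfect assay.
power_cp_less <- function(n, s, delta, p.hyp, conf.level = 0.95) {
  p.tr <- p.hyp - delta                 # true proportion under the alternative
  theta <- 1 - (1 - p.tr)^s             # probability that a group tests positive
  y <- 0:n                              # possible numbers of positive groups
  # One-sided Clopper-Pearson upper limit for theta (equal to 1 when y = n)
  theta.up <- rep(1, n + 1)
  idx <- y < n
  theta.up[idx] <- qbeta(conf.level, y[idx] + 1, n - y[idx])
  p.up <- 1 - (1 - theta.up)^(1 / s)    # limit on the individual scale
  reject <- p.up < p.hyp                # the interval excludes p.hyp
  sum(dbinom(y, size = n, prob = theta)[reject])
}

# Power as a function of the number of groups for a fixed group size s = 20
n.grid <- 10:100
pow <- sapply(n.grid, power_cp_less, s = 20, delta = 0.004, p.hyp = 0.005)
range(pow)
```

Plotting pow against n.grid is one way to inspect how the power behaves as n increases, which is the reason the iteration keeps track of the power at every step rather than assuming a monotone trend.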

Value

A list containing:

`nout`	the number of groups necessary to reach the power with the specified parameters, when `fixed="s"` only.
`sout`	the group size necessary to meet the conditions, when `fixed="n"` only.
`powerout`	the power for the specified parameters and the selected number of groups `n` when `fixed="s"` or the selected group size `s` when `fixed="n"`.
`biasout`	the bias for the specified parameters and the selected number of groups `n` when `fixed="s"` or the selected group size `s` when `fixed="n"`.
`power.reached`	a logical value indicating whether the specified level of power was reached.
`bias.reached`	a logical value indicating whether the maximum allowed bias was reached.
`nit`	the number of groups for each iteration.
`sit`	the group size for each iteration.
`powerit`	the power achieved for each iteration.
`biasit`	the bias for each iteration.
`maxit`	the iteration at which the maximum power was reached, or the total number of iterations.
`alternative`	the alternative hypothesis specified by the user.
`p.hyp`	the hypothesized proportion specified by the user.
`delta`	the absolute difference between the true proportion and the hypothesized proportion specified by the user.
`power`	the desired power specified by the user.
`biasrest`	the maximum absolute bias specified by the user.

Author(s)

The nDesign and sDesign functions were originally written by Frank Schaarschmidt for the binGroup package. Minor modifications were made for inclusion in the binGroup2 package.

References

Swallow, W. (1985). “Group testing for estimating infection rates and probabilities of disease transmission.” Phytopathology, 75, 882–889.

Examples

# Assume the objective is to show that a proportion is
#   smaller than 0.005 (i.e. 0.5 percent) with a power
#   of 0.80 (i.e. 80 percent) if the unknown proportion
#   in the population is 0.003 (i.e. 0.3 percent);
#   thus, a delta of 0.002 shall be detected.
# A 95% Clopper-Pearson CI shall be used.
# The maximum group size because of limited sensitivity
#   of the diagnostic test might be s=20 and we can
#   only afford to perform maximally 100 tests:
designPower(n = 100, s = 20, delta = 0.002,
            p.hyp = 0.005, fixed = "s",
            alternative = "less", method = "CP",
            power = 0.8)

# One might accept to detect delta=0.004,
#   i.e. reject H0: p>=0.005 with power 80 percent
#   when the true proportion is 0.001:
designPower(n = 100, s = 20, delta = 0.004, p.hyp = 0.005, fixed = "s",
             alternative = "less", method = "CP", power = 0.8)

# Power for a design with a fixed group size of s = 1
#   (individual testing).
designPower(n = 200, s = 1, delta = 0.05, p.hyp = 0.10,
            fixed = "s", method = "CP", power = 0.80)

# Assume that the objective is to show that a proportion
#   is smaller than 0.005 (i.e. 0.5%) with a
#   power of 0.80 (i.e. 80%) if the unknown proportion
#   in the population is 0.003 (i.e. 0.3%); thus, a
#   delta = 0.002 shall be detected.
# A 95% Clopper-Pearson CI shall be used.
# The maximum number of groups might be 30, where the
#   overall sensitivity is not limited until group
#   size s=100.
designPower(s = 100, n = 30, delta = 0.002, p.hyp = 0.005, fixed = "n",
             alternative = "less", method = "CP", power = 0.8)

# One might accept to detect delta=0.004,
#   i.e. reject H0: p>=0.005 with power 80 percent
#   when the true proportion is 0.001:
designPower(s = 100, n = 30, delta = 0.004, p.hyp = 0.005, fixed = "n",
             alternative = "less", method = "CP", power = 0.8)
designPower(s = 100, n = 30, delta = 0.004, p.hyp = 0.005, fixed = "n",
             alternative = "less", method = "score", power = 0.8)

Determine a vector of probabilities for informative group testing algorithms

Description

Find the expected value of order statistics from a beta distribution. This function is used to provide a set of individual risk probabilities for informative group testing.

Usage

expectOrderBeta(
  p,
  alpha,
  size,
  grp.sz,
  num.sim = 10000,
  rel.tol = ifelse(alpha >= 1, .Machine$double.eps^0.25, .Machine$double.eps^0.1),
  ...
)

Arguments

`p`	overall probability of disease that will be used to determine a vector of individual risk probabilities. This is the expected value of a random variable with a beta distribution, $\frac{\alpha}{\alpha + \beta}$ .
`alpha`	a shape parameter for the beta distribution that specifies the degree of heterogeneity for the determined probability vector.
`size`	the size of the vector of individual risk probabilities to be generated. This is also the number of total individuals for which to determine risk probabilities.
`grp.sz`	the number of total individuals for which to determine risk probabilities. This argument is deprecated; the `size` argument should be used instead.
`num.sim`	the number of simulations. This argument is used only when simulation is necessary.
`rel.tol`	relative tolerance used for integration.
`...`	arguments to be passed to the `beta.dist` function written by Michael Black for Black et al. (2015).

Details

This function uses the beta.dist function from Black et al. (2015) to determine a vector of individual risk probabilities, ordered from least to greatest. Depending on the specified probability, $\alpha$ level, and overall group size, simulation may be necessary in order to determine the probabilities. For this reason, the user should set a seed in order to reproduce results. The number of simulations (default = 10,000) and relative tolerance for integration can be specified by the user. The expectOrderBeta function augments the beta.dist function by checking whether simulation is needed before attempting to determine the probabilities, and by allowing the number of simulations to be specified by the user. See Black et al. (2015) for additional details on the original beta.dist function.
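
The simulation route mentioned above can be sketched in base R. This is only an illustration of the idea (average the sorted draws from a beta distribution whose mean is p); the helper sim_expect_order_beta is hypothetical, is not part of binGroup2, and does not reproduce the integration-based calculation in beta.dist.

```r
# Hypothetical helper (not part of binGroup2): Monte Carlo approximation of
# the expected order statistics of `size` draws from a beta distribution
# with mean p and first shape parameter alpha.
sim_expect_order_beta <- function(p, alpha, size, num.sim = 10000) {
  beta <- alpha * (1 - p) / p          # second shape parameter so that the mean is p
  draws <- matrix(rbeta(num.sim * size, shape1 = alpha, shape2 = beta),
                  nrow = num.sim, ncol = size)
  sorted <- t(apply(draws, 1, sort))   # order each simulated set of risks
  colMeans(sorted)                     # average the k-th smallest value over simulations
}

set.seed(8791)
p.vec <- sim_expect_order_beta(p = 0.05, alpha = 2, size = 40)
mean(p.vec)   # should be close to p = 0.05
```

Because the result depends on random draws, setting a seed as shown is what makes the vector reproducible.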

Value

A vector of individual risk probabilities.

Author(s)

Brianna D. Hitt

Examples

set.seed(8791)
expectOrderBeta(p = 0.03, alpha = 0.5, size = 100, rel.tol = 0.0001)

expectOrderBeta(p = 0.05, alpha = 2, size = 40)

Access the expected number of tests from an object

Description

ExpTests is a generic function that extracts the expected number of tests from an object that contains information about a testing configuration.

Usage

ExpTests(object, ...)

Arguments

`object`	An object for which a summary of the expected number of tests is desired.
`...`	Additional arguments to be passed to `ExpTests`.

Value

The value returned depends on the class of the object. See the documentation for the corresponding method functions.

Author(s)

Christopher R. Bilder

Examples

# Find the optimal testing configuration for
#   non-informative two-stage hierarchical testing.
res1 <- OTC1(algorithm = "D2", p = 0.01, Se = 0.99, Sp = 0.99,
             group.sz = 2:100, obj.fn = c("ET", "MAR", "GR1"),
             weights = matrix(data = c(1,1), nrow = 1, ncol = 2))
ExpTests(res1)

Extract the expected number of tests from testing configuration results

Description

Extract the expected number of tests from objects of class "halving" returned by halving.

Usage

## S3 method for class 'halving'
ExpTests(object, ...)

Arguments

`object`	An object of class "halving", from which the expected number of tests is to be extracted.
`...`	Additional arguments to be passed to `ExpTests` (e.g., `digits` to be passed to `round` for appropriate rounding).

Value

A data frame containing the columns:

`ExpTests`	the expected number of tests required to decode all individuals in the algorithm.
`ExpTestsPerIndividual`	the expected number of tests per individual.
`PercentReductionTests`	The percent reduction in the number of tests; 100 * (1 - ExpTestsPerIndividual).
`PercentIncreaseTestCap`	The percent increase in testing capacity when the algorithm is applied to a continuous stream of specimens; 100 * (1/ExpTestsPerIndividual - 1).

Author(s)

Christopher R. Bilder

References

Bilder, C., Iwen, P., Abdalhamid, B., Tebbs, J., McMahan, C. (2020). “Tests in short supply? Try group testing.” Significance, 17, 15.

Examples

save.it1 <- halving(p = rep(0.01, 10), Sp = 1, Se = 1, stages = 2,
        order.p = TRUE)
ExpTests(save.it1)

Extract the expected number of tests from testing configuration results

Description

Extract the expected number of tests and expected number of tests per individual from objects of class "opChar" returned by operatingCharacteristics1 (opChar1) or operatingCharacteristics2 (opChar2).

Usage

## S3 method for class 'opChar'
ExpTests(object, ...)

Arguments

`object`	An object of class "opChar", from which the expected number of tests and expected number of tests per individual are to be extracted.
`...`	Additional arguments to be passed to `ExpTests` (e.g., `digits` to be passed to `round` for appropriate rounding).

Value

A data frame containing the columns:

`ExpTests`	the expected number of tests required to decode all individuals in the algorithm.
`ExpTestsPerIndividual`	the expected number of tests per individual.
`PercentReductionTests`	The percent reduction in the number of tests; 100 * (1 - ExpTestsPerIndividual).
`PercentIncreaseTestCap`	The percent increase in testing capacity when the algorithm is applied to a continuous stream of specimens; 100 * (1/ExpTestsPerIndividual - 1).
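
How the four columns relate to one another can be shown with a small base-R calculation. The sketch assumes the standard expected-tests formula for two-stage (Dorfman) testing of a single group with a common prevalence; that formula is an assumption made for illustration and is not taken from the binGroup2 source, so the numbers need not match the output of opChar1.

```r
# Expected number of tests for two-stage (Dorfman) testing of one group of
# size I with a common prevalence p (assumed formula, for illustration only).
I <- 10; p <- 0.05; Se <- 0.99; Sp <- 0.99
prob.group.pos <- Se * (1 - (1 - p)^I) + (1 - Sp) * (1 - p)^I
exp.tests <- 1 + I * prob.group.pos          # one group test plus I retests if positive
exp.tests.per.ind <- exp.tests / I

# Derive the two percentage columns from the per-individual expectation
out <- data.frame(
  ExpTests = exp.tests,
  ExpTestsPerIndividual = exp.tests.per.ind,
  PercentReductionTests = 100 * (1 - exp.tests.per.ind),
  PercentIncreaseTestCap = 100 * (1 / exp.tests.per.ind - 1)
)
out
```

Both percentages are functions of ExpTestsPerIndividual alone, so the increase in testing capacity always equals the percent reduction divided by ExpTestsPerIndividual.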

Author(s)

Brianna D. Hitt and Christopher R. Bilder

References

Bilder, C., Iwen, P., Abdalhamid, B., Tebbs, J., McMahan, C. (2020). “Tests in short supply? Try group testing.” Significance, 17, 15.

Examples

config.mat <- matrix(data = c(rep(1, 10), 1:10),
                     nrow = 2, ncol = 10, byrow = TRUE)
res1 <- opChar1(algorithm = "D2", p = 0.05, Se = 0.99, Sp = 0.99,
                hier.config = config.mat)
ExpTests(res1)

res2 <- opChar2(algorithm = "A2M", p.vec = c(0.92, 0.05, 0.02, 0.01),
                Se = rep(0.95, 2), Sp = rep(0.99, 2), rowcol.sz = 8)
ExpTests(res2)

Extract the expected number of tests from optimal testing configuration results

Description

Extract the expected number of tests and expected number of tests per individual from objects of class "OTC" returned by OTC1 or OTC2.

Usage

## S3 method for class 'OTC'
ExpTests(object, ...)

Arguments

`object`	An object of class "OTC", from which the expected number of tests and expected number of tests per individual are to be extracted.
`...`	Additional arguments to be passed to `ExpTests` (e.g., `digits` to be passed to `round` for appropriate rounding).

Value

A data frame containing the columns:

`ExpTests`	the expected number of tests required by the optimal testing configuration.
`ExpTestsPerInd`	the expected number of tests per individual for the optimal testing configuration.
`PercentReductionTests`	The percent reduction in the number of tests; 100 * (1 - ExpTestsPerIndividual).
`PercentIncreaseTestCap`	The percent increase in testing capacity when the algorithm is applied to a continuous stream of specimens; 100 * (1/ExpTestsPerIndividual - 1).

Each row of the data frame represents an objective function specified in the call to OTC1 or OTC2.

Author(s)

Brianna D. Hitt and Christopher R. Bilder

References

Bilder, C., Iwen, P., Abdalhamid, B., Tebbs, J., McMahan, C. (2020). “Tests in short supply? Try group testing.” Significance, 17, 15.

Examples

res1 <- OTC1(algorithm = "D2", p = 0.05, Se = 0.99, Sp = 0.99,
             group.sz = 2:100, obj.fn = c("ET", "MAR"),
             trace = TRUE)
ExpTests(res1)

Extract the expected number of tests from testing configuration results

Description

Extract the expected number of tests from objects of class "Sterrett" returned by Sterrett.

Usage

## S3 method for class 'Sterrett'
ExpTests(object, ...)

Arguments

`object`	An object of class "Sterrett", from which the expected number of tests is to be extracted.
`...`	Additional arguments to be passed to `ExpTests` (e.g., `digits` to be passed to `round` for appropriate rounding).

Value

A data frame containing the columns:

`ExpTests`	the expected number of tests required to decode all individuals in the algorithm.
`ExpTestsPerIndividual`	the expected number of tests per individual.
`PercentReductionTests`	The percent reduction in the number of tests; 100 * (1 - ExpTestsPerIndividual).
`PercentIncreaseTestCap`	The percent increase in testing capacity when the algorithm is applied to a continuous stream of specimens; 100 * (1/ExpTestsPerIndividual - 1).

Author(s)

Christopher R. Bilder

References

Bilder, C., Iwen, P., Abdalhamid, B., Tebbs, J., McMahan, C. (2020). “Tests in short supply? Try group testing.” Significance, 17, 15.

Examples

set.seed(1231)
p.vec1 <- rbeta(n = 8, shape1 = 1, shape2 = 10)
save.it1 <- Sterrett(p = p.vec1, Sp = 0.90, Se = 0.95)
ExpTests(save.it1)

Extract the expected number of tests from testing configuration results

Description

Extract the expected number of tests from objects of class "TOD" returned by TOD.

Usage

## S3 method for class 'TOD'
ExpTests(object, ...)

Arguments

`object`	An object of class "TOD", from which the expected number of tests is to be extracted.
`...`	Additional arguments to be passed to `ExpTests` (e.g., `digits` to be passed to `round` for appropriate rounding).

Value

A data frame containing the columns:

`ExpTests`	the expected number of tests required to decode all individuals in the algorithm.
`ExpTestsPerIndividual`	the expected number of tests per individual.
`PercentReductionTests`	The percent reduction in the number of tests; 100 * (1 - ExpTestsPerIndividual).
`PercentIncreaseTestCap`	The percent increase in testing capacity when the algorithm is applied to a continuous stream of specimens; 100 * (1/ExpTestsPerIndividual - 1).

Author(s)

Christopher R. Bilder

References

Bilder, C., Iwen, P., Abdalhamid, B., Tebbs, J., McMahan, C. (2020). “Tests in short supply? Try group testing.” Significance, 17, 15.

Examples

set.seed(1002)
p.vec <- expectOrderBeta(p = 0.01, alpha = 2, size = 20)
save.it1 <- TOD(p = p.vec, Se = 0.95, Sp = 0.95, max = 5, threshold = 0.015)
ExpTests(save.it1)

Extract the model formula from a fitted group testing model

Description

Extract the model formula from objects of class "gtReg" returned by gtReg.

Usage

## S3 method for class 'gtReg'
formula(x, ...)

Arguments

`x`	An object of class "gtReg", created by `gtReg`, from which the model formula is to be extracted.
`...`	not currently used.

Value

The model formula extracted from the object `x`.

Author(s)

Brianna D. Hitt

Examples

data(hivsurv)
fit1 <- gtReg(formula = groupres ~ AGE * EDUC.,
              data = hivsurv, groupn = gnum,
              linkf = "probit")
formula(x = fit1)

Construct a group membership matrix for hierarchical algorithms

Description

Construct a group membership matrix for two-, three-, four-, or five-stage hierarchical algorithms.

Usage

GroupMembershipMatrix(stage1, stage2 = NULL, stage3 = NULL, stage4 = NULL)

Arguments

`stage1`	the group size in stage one of testing. This also corresponds to the number of individuals to be tested and will specify the number of columns in the resulting group membership matrix.
`stage2`	a vector of group sizes in stage two of testing. The group sizes specified here should sum to the number of individuals/group size specified in `stage1`. If `NULL`, a group membership matrix will be constructed for a two-stage hierarchical algorithm. Further details are given under 'Details'.
`stage3`	a vector of group sizes in stage three of testing. The group sizes specified here should sum to the number of individuals/group size specified in `stage1`. If group sizes are provided in `stage2` and `stage3` is `NULL`, a group membership matrix will be constructed for a three-stage hierarchical algorithm. Further details are given under 'Details'.
`stage4`	a vector of group sizes in stage four of testing. The group sizes specified here should sum to the number of individuals/group size specified in `stage1`. If group sizes are provided in `stage3` and `stage4` is `NULL`, a group membership matrix will be constructed for a four-stage hierarchical algorithm. Further details are given under 'Details'.

Details

This function constructs a group membership matrix for two-, three-, four-, or five-stage hierarchical algorithms. The resulting group membership matrix has rows corresponding to the number of stages of testing and columns corresponding to each individual to be tested. The value specified in stage1 corresponds to the number of individuals to be tested.

For group membership matrices when only stage1 is specified, a two-stage hierarchical algorithm is used and the second stage will consist of individual testing. For group membership matrices when stage1 and stage2 are specified, a three-stage hierarchical algorithm is used and the third stage will consist of individual testing. Group membership matrices for four- and five-stage hierarchical algorithms follow a similar structure. There should never be group sizes specified for later stages of testing without also providing group sizes for all earlier stages of testing (i.e., to provide group sizes for stage3, group sizes must also be provided for stage1 and stage2).
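
To make the row-by-stage, column-by-individual layout concrete, the following base-R sketch builds such a matrix by hand for a three-stage algorithm with one initial group of 6 and two subgroups of 3. It follows the hier.config layout used in the examples elsewhere in this manual; the object name config.by.hand is hypothetical, and it has not been verified that GroupMembershipMatrix(stage1 = 6, stage2 = c(3, 3)) returns an identical object.

```r
# Hand-built group membership matrix for a three-stage hierarchical
# algorithm: one initial group of 6, two subgroups of 3, then individual
# testing. Each row is a stage; each column is an individual.
stage1 <- 6
stage2 <- c(3, 3)
config.by.hand <- matrix(data = c(rep(1, stage1),
                                  rep(seq_along(stage2), times = stage2),
                                  1:stage1),
                         nrow = 3, ncol = stage1, byrow = TRUE)
config.by.hand
```

Each entry gives the group to which an individual belongs at that stage, so the last row simply numbers the individuals for the final stage of individual testing.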

Value

A matrix specifying the group membership for each individual. The rows of the matrix correspond to the stages of testing and the columns of the matrix correspond to the individuals to be tested.

Author(s)

Minh Nguyen and Christopher Bilder

Examples

# Generate a group membership matrix for a two-stage
#   hierarchical algorithm, within the opChar1() function
#   and calculate operating characteristics
opChar1(algorithm = "D2", p = 0.0193, Se = 0.99, Sp = 0.99,
        hier.config = GroupMembershipMatrix(stage1 = 16),
        print.time = FALSE)

# Generate a group membership matrix for a five-stage
#   hierarchical algorithm and calculate the
#   operating characteristics for a two-disease assay
config.mat <- GroupMembershipMatrix(stage1 = 16,
                                    stage2 = c(8,8),
                                    stage3 = c(4,4,4,4),
                                    stage4 = rep(2, times = 8))
Se <- matrix(data = rep(0.95, 10), nrow = 2, ncol = 5,
             dimnames = list(Infection = 1:2, Stage = 1:5))
Sp <- matrix(data = rep(0.99, 10), nrow = 2, ncol = 5,
             dimnames = list(Infection = 1:2, Stage = 1:5))
opChar2(algorithm = "D5", p.vec = c(0.92, 0.05, 0.02, 0.01),
        Se = Se, Sp = Sp, hier.config = config.mat)

Power to reject a hypothesis for one proportion in group testing

Description

This function calculates the power to reject a hypothesis in a group testing experiment, using confidence intervals for the decision. This function also calculates the bias of the point estimator for a given $n$, $s$, and true, unknown proportion.

Usage

gtPower(
  n,
  s,
  delta,
  p.hyp,
  conf.level = 0.95,
  method = "CP",
  alternative = "two.sided"
)

Arguments

`n`	integer specifying the number of groups. A vector of integers is also allowed.
`s`	integer specifying the common group size. A vector of integers is also allowed.
`delta`	the absolute difference between the true proportion and the hypothesized proportion. A vector is also allowed.
`p.hyp`	the proportion in the hypotheses, specified as a value between 0 and 1.
`conf.level`	confidence level required for the decision on the hypotheses.
`method`	character string specifying the confidence interval method (see `propCI`) to be used.
`alternative`	character string defining the alternative hypothesis, either `"two.sided"`, `"less"`, or `"greater"`.

Details

The power of a hypothesis test performed by a confidence interval is defined as the probability that a confidence interval excludes the threshold parameter (p.hyp) of the null hypothesis, as described in Schaarschmidt (2007). Due to discreteness, the power does not increase monotonically for an increasing number of groups $n$ or group size $s$, but exhibits local maxima and minima, depending on $n$, $s$, p.hyp, and conf.level.

In addition to the power, the bias of the point estimator is calculated according to Swallow (1985). If vectors are specified for $n$, $s$, and/or delta, a matrix will be constructed and the power and bias are calculated for each row of this matrix.
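
The definition of power given above can be made concrete with a short base R sketch. It is not the code used by gtPower. It assumes a one-sided test with alternative = "less", an upper Clopper-Pearson limit, no testing error, and the usual estimator 1 - (1 - y/n)^(1/s). The helper name power_sketch is hypothetical.

```r
# Hypothetical helper (not part of binGroup2): power and bias for
#   H0: p >= p.hyp versus H1: p < p.hyp, decided by an upper
#   Clopper-Pearson limit. Assumes no testing error.
power_sketch <- function(n, s, delta, p.hyp, conf.level = 0.95) {
  p.true <- p.hyp - delta              # true individual-level proportion
  theta <- 1 - (1 - p.true)^s          # probability that a group is positive
  y <- 0:n                             # possible numbers of positive groups
  prob.y <- dbinom(y, size = n, prob = theta)

  # Upper Clopper-Pearson limit on the group scale (equal to 1 when y = n)
  theta.upper <- rep(1, n + 1)
  idx <- y < n
  theta.upper[idx] <- qbeta(conf.level, y[idx] + 1, n - y[idx])

  # Back-transform the limit and the point estimate to the individual scale
  p.upper <- 1 - (1 - theta.upper)^(1 / s)
  p.hat <- 1 - (1 - y / n)^(1 / s)

  # Power: total probability of outcomes whose interval excludes p.hyp
  # Bias: expected value of the estimator minus the true proportion
  c(power = sum(prob.y[p.upper < p.hyp]),
    bias = sum(prob.y * p.hat) - p.true)
}

res <- power_sketch(n = 24, s = 7, delta = 0.06, p.hyp = 0.1)
res
```

Because the number of positive groups is discrete, the set of outcomes that lead to rejection changes in jumps as $n$ or $s$ changes, which is the source of the non-monotone behavior noted above.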

Value

A matrix containing the following columns:

`ns`	a vector of the total sample size, $n \times s$.
`n`	a vector of the number of groups.
`s`	a vector of the group sizes.
`delta`	a vector of the delta values.
`power`	the power to reject the given null hypothesis.
`bias`	the bias of the estimator for the specified $n$, $s$, and the true proportion.

Author(s)

This function was originally written as bgtPower by Frank Schaarschmidt for the binGroup package. Minor modifications have been made for inclusion of the function in the binGroup2 package.

References

Swallow, W. (1985). “Group testing for estimating infection rates and probabilities of disease transmission.” Phytopathology, 75, 882–889.

Examples

# Calculate the power for the design
#   in the example given in Tebbs and Bilder (2004):
#   n=24 groups each containing 7 insects
#   if the true proportion of virus vectors
#   in the population is 0.04 (4 percent),
#   the power to reject H0: p>=0.1 using an
#   upper Clopper-Pearson ("CP") confidence interval
#   is calculated with the following call:
gtPower(n = 24, s = 7, delta = 0.06, p.hyp = 0.1,
        conf.level = 0.95, alternative = "less",
        method = "CP")

# Explore development of power and bias for varying n,
#   s, and delta. How much can we decrease the number of
#   groups (costly tests to be performed) by pooling the
#   same number of 320 individuals to groups of
#   increasing size without largely decreasing power?
gtPower(n = c(320, 160, 80, 64, 40, 32, 20, 10, 5),
        s = c(1, 2, 4, 5, 8, 10, 16, 32, 64),
        delta = 0.01,  p.hyp = 0.02)

# What happens to the power for increasing differences
#   between the true proportion and the threshold
#   proportion?
gtPower(n = 50, s = 10,
        delta = seq(from = 0, to = 0.01, by = 0.001),
        p.hyp = 0.01, method = "CP")

# Calculate power with a group size of 1 (individual
#   testing).
gtPower(n = 100, s = 1,
        delta = seq(from = 0, to = 0.01, by = 0.001),
        p.hyp = 0.01, method = "CP")

Fitting group testing regression models

Description

Fits the group testing regression model specified through a symbolic description of the linear predictor and descriptions of the group testing setting. This function allows for fitting regression models with simple pooling, halving, or array testing data.

Usage

gtReg(
  type = "sp",
  formula,
  data,
  groupn = NULL,
  subg = NULL,
  coln = NULL,
  rown = NULL,
  arrayn = NULL,
  retest = NULL,
  sens = 1,
  spec = 1,
  linkf = c("logit", "probit", "cloglog"),
  method = c("Vansteelandt", "Xie"),
  sens.ind = NULL,
  spec.ind = NULL,
  start = NULL,
  control = gtRegControl(...),
  ...
)

Arguments

`type`	`"sp"` for simple pooling (Dorfman testing with or without retests), `"halving"` for halving protocol, or `"array"` for array testing. See 'Details' for descriptions of the group testing algorithms.
`formula`	an object of class "formula" (or one that can be coerced to that class); a symbolic description of the model to be fitted. The details of model specification are under 'Details'.
`data`	an optional data frame, list, or environment (or object coercible by `as.data.frame` to a data frame) containing the variables in the model. If not found in data, the variables are taken from `environment(formula)`, typically the environment from which `gtReg` is called.
`groupn`	a vector, list, or data frame of the group numbers that designates individuals to groups (for use with simple pooling, `type = "sp"`, or the halving protocol, `type = "halving"`).
`subg`	a vector, list, or data frame of the group numbers that designates individuals to subgroups (for use with the halving protocol, `type = "halving"`).
`coln`	a vector, list, or data frame that specifies the column group number for each sample (for use with array testing, `type = "array"`).
`rown`	a vector, list, or data frame that specifies the row group number for each sample (for use with array testing, `type = "array"`).
`arrayn`	a vector, list, or data frame that specifies the array number for each sample (for use with array testing, `type = "array"`).
`retest`	a vector, list, or data frame of individual retest results. Default value is `NULL` for no retests. See 'Details' for details on how to specify `retest`.
`sens`	sensitivity of the test. Default value is set to 1.
`spec`	specificity of the test. Default value is set to 1.
`linkf`	a character string specifying one of the three link functions for a binomial model: `"logit"` (default), `"probit"`, or `"cloglog"`.
`method`	the method to fit the regression model. Options include `"Vansteelandt"` (default) or `"Xie"`. The `"Vansteelandt"` option finds estimates by directly maximizing the likelihood function based on the group responses, while the `"Xie"` option uses the EM algorithm to maximize the likelihood function in terms of the unobserved individual responses.
`sens.ind`	sensitivity of the individual retests. If NULL, set to be equal to `sens`.
`spec.ind`	specificity of the individual retests. If NULL, set to be equal to `spec`.
`start`	starting values for the parameters in the linear predictor.
`control`	a list of parameters for controlling the fitting process in method `"Xie"`. These parameters will be passed to the `gtRegControl` function for use.
`...`	arguments to be passed to `gtRegControl` by default. See argument `control`.

Details

With simple pooling and halving, a typical predictor has the form groupresp ~ covariates where groupresp is the (numeric) group response vector. With array testing, individual samples are placed in a matrix-like grid where samples are pooled within each row and within each column. This leads to two kinds of group responses: row and column group responses. Thus, a typical predictor has the form cbind(col.resp, row.resp) ~ covariates, where col.resp is the (numeric) column group response vector and row.resp is the (numeric) row group response vector. For all methods, covariates is a series of terms which specifies a linear predictor for individual responses. Note that it is actually the unobserved individual responses, not the observed group responses, which are modeled by the covariates. When denoting group responses (groupresp, col.resp, and row.resp), a 0 denotes a negative response and a 1 denotes a positive response, where the probability of an individual positive response is being modeled directly.

A terms specification of the form first + second indicates all the terms in first together with all the terms in second with duplicates removed. A specification of the form first:second indicates the set of terms obtained by taking the interactions of all terms in first with all terms in second. The specification first*second indicates the cross of first and second. This is the same as first + second + first:second. The terms in the formula will be re-ordered so that main effects come first, followed by the interactions, all second-order, all third-order, and so on; to avoid this, pass a terms object as the formula.

For simple pooling (type = "sp"), the functions gtreg.fit, EM, and EM.ret, where the first corresponds to Vansteelandt's method described in Vansteelandt et al. (2000) and the last two correspond to Xie's method described in Xie (2001), are called to carry out the model fitting. The gtreg.fit function uses the optim function with default method "Nelder-Mead" to maximize the likelihood function of the observed group responses. If this optimization method produces a Hessian matrix of all zero elements, the "SANN" method in optim is employed to find the coefficients and Hessian matrix. For the "SANN" method, the number of iterations in optim is set to be 10000. For the background on the use of optim, see help(optim).
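
The direct-maximization idea can be illustrated with a small base R sketch. This is not gtreg.fit; it is a simplified stand-in that assumes a logit link, simple pooling without retests, and simulated data. The names negloglik, z, and the simulation settings are hypothetical.

```r
# Simulate simple pooling data (hypothetical settings, perfect tests)
set.seed(1)
n.ind <- 600
size <- 5
x <- rgamma(n.ind, shape = 20, scale = 2)
groupn <- rep(seq_len(n.ind / size), each = size)
y.ind <- rbinom(n.ind, 1, plogis(-6 + 0.1 * x))
z <- as.numeric(tapply(y.ind, groupn, max))   # observed group responses

# Negative log-likelihood of the observed group responses.
# A group tests positive with probability
#   sens + (1 - sens - spec) * prod(1 - p_i) over its members.
negloglik <- function(beta, x, groupn, z, sens = 1, spec = 1) {
  p <- plogis(beta[1] + beta[2] * x)
  all.neg <- as.numeric(tapply(1 - p, groupn, prod))
  p.group <- sens + (1 - sens - spec) * all.neg
  # Guard against log(0)
  p.group <- pmin(pmax(p.group, 1e-10), 1 - 1e-10)
  -sum(z * log(p.group) + (1 - z) * log(1 - p.group))
}

# Nelder-Mead is the default method of optim()
fit <- optim(par = c(0, 0), fn = negloglik, x = x, groupn = groupn,
             z = z, hessian = TRUE)
fit$par
```

The covariates enter through the individual probabilities, while the likelihood is evaluated only on the group responses, which is the distinction drawn in the first paragraph of this section.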

The EM and EM.ret functions apply Xie's EM algorithm to the likelihood function written in terms of the unobserved individual responses; the functions use glm.fit to update the parameter estimates within each M step. The EM function is used when there are no retests and EM.ret is used when individual retests are available. Thus, within the retest argument, individual observations in observed positive groups are 0 (negative) or 1 (positive); the remaining individual observations are NAs, meaning that no retest is performed for them. Retests cannot be used with Vansteelandt's method; a warning message will be given in this case, and the individual retests will be ignored in the model fitting. There could be slight differences in the estimates between Vansteelandt's and Xie's methods (when retests are not available) due to different convergence criteria.
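
A stripped-down version of the EM iteration may help to fix ideas. The sketch below is not the package's EM function: it assumes perfect tests (sens = spec = 1), no retests, a logit link, and simulated data, and it uses quasibinomial() in glm.fit only so that fractional responses do not trigger warnings.

```r
# Simulate simple pooling data (hypothetical settings, perfect tests)
set.seed(2)
n.ind <- 600
size <- 5
x <- rgamma(n.ind, shape = 20, scale = 2)
groupn <- rep(seq_len(n.ind / size), each = size)
y.ind <- rbinom(n.ind, 1, plogis(-6 + 0.1 * x))
z <- as.numeric(tapply(y.ind, groupn, max))   # observed group responses

X <- cbind(1, x)      # design matrix with intercept
beta <- c(0, 0)       # starting values
for (iter in 1:500) {
  p <- as.numeric(plogis(X %*% beta))
  # Probability that each group contains at least one positive
  grp.pos <- 1 - as.numeric(tapply(1 - p, groupn, prod))
  # E step: expected individual response given the group response
  #   (0 in negative groups, p_i / P(group positive) in positive groups)
  w <- ifelse(z[groupn] == 1, p / grp.pos[groupn], 0)
  # M step: logistic fit to the expected responses
  beta.new <- unname(glm.fit(X, w, family = quasibinomial())$coefficients)
  converged <- max(abs(beta.new - beta)) < 1e-4
  beta <- beta.new
  if (converged) break
}
beta
```

With imperfect tests or retests, the E step changes because a negative group result no longer rules out positive members; that case is not covered here.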

With simple pooling (i.e., Dorfman testing, two-stage hierarchical testing), each individual appears in exactly one pool. When only the group responses are observed, the null degrees of freedom are the number of groups minus 1 and the residual degrees of freedom are the number of groups minus the number of parameters. When individual retests are also observed, what the degrees of freedom and the deviance for the null model should be remains an open research question; therefore, the degrees of freedom and null.deviance will not be displayed.

Under the halving protocol, the EM.halving function applies Xie's EM algorithm to the likelihood function written in terms of the unobserved individual responses; the functions use glm.fit to update the parameter estimates within each M step. In the halving protocol, if the initial group tests positive, it is split into two subgroups. The two subgroups are subsequently tested and if either subgroup tests positive, the third and final step is to test all individuals within the subgroup. Thus, within subg, subgroup responses in observed positive groups are 0 (negative) or 1 (positive); the remaining subgroup responses are NAs, meaning that no tests are performed for them. The individual retests are similarly coded.

With array testing (also known as matrix pooling), the EM.mp function applies Xie's EM algorithm to the likelihood function written in terms of the unobserved individual responses. In each E step, the Gibbs sampling technique is used to estimate the conditional probabilities. Because of the large number of Gibbs samples needed to achieve convergence, the model fitting process could be quite slow, especially when multiple positive rows and columns are observed. In this case, we can either increase the Gibbs sample size to help achieve convergence or loosen the convergence criteria by increasing tol at the expense of perhaps poorer estimates. If follow-up retests are performed, the retest results going into the model will help achieve convergence faster with the same Gibbs sample size and convergence criteria. In each M step, we use glm.fit to update the parameter estimates.

For simple pooling, retest provides individual retest results for Dorfman's retesting procedure. Under the halving protocol, retest provides individual retest results within a subgroup that tests positive. The retest argument provides individual retest results, where a 0 denotes negative and 1 denotes positive status. An NA denotes that no retest is performed for that individual. The default value is NULL for no retests.

For simple pooling, control provides parameters for controlling the fitting process in the "Xie" method only.

gtReg returns an object of class "gtReg". The function summary (i.e., summary.gtReg) is used to obtain or print a summary of the results. The group testing function predict (i.e., predict.gtReg) is used to make predictions on "gtReg" objects.

Value

An object of class "gtReg", a list which may include:

`coefficients`	a named vector of coefficients.
`hessian`	estimated Hessian matrix of the negative log-likelihood function. This serves as an estimate of the information matrix.
`residuals`	the response residuals. This is the difference of the observed group responses and the fitted group responses. Not included for array testing.
`fitted.values`	the fitted mean values of group responses. Not included for array testing.
`deviance`	the deviance between the fitted model and the saturated model. Not included for array testing.
`aic`	Akaike's Information Criterion. This is minus twice the maximized log-likelihood plus twice the number of coefficients. Not included for array testing.
`null.deviance`	the deviance for the null model, comparable with `deviance`. The null model will include only the intercept, if there is one in the model. Provided for simple pooling, `type = "sp"`, only.
`counts`	the number of iterations in `optim` (Vansteelandt's method) or the number of iterations in the EM algorithm (Xie's method, halving, and array testing).
`Gibbs.sample.size`	the number of Gibbs samples generated in each E step. Provided for array testing, `type = "array"`, only.
`df.residual`	the residual degrees of freedom. Provided for simple pooling, `type = "sp"`, only.
`df.null`	the residual degrees of freedom for the null model. Provided for simple pooling, `type = "sp"`, only.
`z`	the vector of group responses. Not included for array testing.
`call`	the matched call.
`formula`	the formula supplied.
`terms`	the terms object used.
`method`	the method (`"Vansteelandt"` or `"Xie"`) used to fit the model. For the halving protocol, the `"Xie"` method is used. Not included for array testing.
`link`	the link function used in the model.

Author(s)

The majority of this function was originally written as gtreg.sp, gtreg.halving, and gtreg.mp by Boan Zhang for the binGroup package. Minor modifications have been made for inclusion of the functions in the binGroup2 package.

References

Vansteelandt, S., Goetghebeur, E., Verstraeten, T. (2000). “Regression models for disease prevalence with diagnostic tests on pools of serum samples.” Biometrics, 56, 1126–1133.

Xie, M. (2001). “Regression analysis of group testing samples.” Statistics in Medicine, 20, 1957–1969.

Examples

data(hivsurv)
fit1 <- gtReg(type = "sp", formula = groupres ~ AGE + EDUC.,
              data = hivsurv, groupn = gnum, sens = 0.9,
              spec = 0.9, method = "Xie")
fit1

set.seed(46)
gt.data <- gtSim(type = "sp", par = c(-12, 0.2),
                 size1 = 700, size2 = 5)
fit2 <- gtReg(type = "sp", formula = gres ~ x, data = gt.data,
              groupn = groupn)
fit2

set.seed(21)
gt.data <- gtSim(type = "sp", par = c(-12, 0.2),
                 size1 = 700, size2 = 6, sens = 0.95, spec = 0.95,
                 sens.ind = 0.98, spec.ind = 0.98)
fit3 <- gtReg(type = "sp", formula = gres ~ x, data = gt.data,
              groupn = groupn, retest = retest, method = "Xie",
              sens = 0.95, spec = 0.95, sens.ind = 0.98,
              spec.ind = 0.98, trace = TRUE)
summary(fit3)

set.seed(46)
gt.data <- gtSim(type = "halving", par = c(-6, 0.1), gshape = 17,
                 gscale = 1.4, size1 = 5000, size2 = 5,
                 sens = 0.95, spec = 0.95)
fit4 <- gtReg(type = "halving", formula = gres ~ x,
              data = gt.data, groupn = groupn, subg = subgroup,
              retest = retest, sens = 0.95, spec = 0.95,
              start = c(-6, 0.1), trace = TRUE)
summary(fit4)

# 5x6 and 4x5 array
set.seed(9128)
sa1a <- gtSim(type = "array", par = c(-7, 0.1), size1 = c(5, 4),
              size2 = c(6, 5), sens = 0.95, spec = 0.95)
sa1 <- sa1a$dframe
fit5 <- gtReg(type = "array",
              formula = cbind(col.resp, row.resp) ~ x,
              data = sa1, coln = coln, rown = rown,
              arrayn = arrayn, sens = 0.95, spec = 0.95,
              tol = 0.005, n.gibbs = 2000, trace = TRUE)
fit5
summary(fit5)

Auxiliary for controlling group testing regression

Description

Auxiliary function to control fitting parameters of the EM algorithm used internally in gtReg for simple pooling (type = "sp") with method = "Xie" or for array testing (type = "array").

Usage

gtRegControl(
  tol = 1e-04,
  n.gibbs = 1000,
  n.burnin = 20,
  maxit = 500,
  trace = FALSE,
  time = TRUE
)

Arguments

`tol`	convergence criterion.
`n.gibbs`	the Gibbs sample size to be used in each E step of the EM algorithm, for array testing. The default is 1000.
`n.burnin`	the number of samples in the burn-in period, for array testing. The default is 20.
`maxit`	maximum number of iterations in the EM algorithm.
`trace`	a logical value indicating whether the output should be printed for each iteration. The default is `FALSE`.
`time`	a logical value indicating whether the length of time for the model fitting should be printed. The default is `TRUE`.

Value

A list with components named as the input arguments.

Author(s)

This function was originally written as the gt.control function for the binGroup package. Minor modifications have been made for inclusion in the binGroup2 package.

Examples

# The default settings:
gtRegControl()

Simulation function for group testing data

Description

Simulates data in group testing form ready to be fit by gtReg.

Usage

gtSim(
  type = "sp",
  x = NULL,
  gshape = 20,
  gscale = 2,
  par,
  linkf = c("logit", "probit", "cloglog"),
  size1,
  size2,
  sens = 1,
  spec = 1,
  sens.ind = NULL,
  spec.ind = NULL
)

Arguments

`type`	`"sp"` for simple pooling (Dorfman testing with or without retests), `"halving"` for halving protocol, and `"array"` for array testing (also known as matrix pooling).
`x`	a matrix of user-submitted covariates with which to simulate the data. Default is `NULL`, in which case a gamma distribution is used to generate the covariates automatically.
`gshape`	shape parameter for the gamma distribution. The value must be non-negative. Default value is set to 20.
`gscale`	scale parameter for the gamma distribution. The value must be strictly positive. Default value is set to 2.
`par`	the true coefficients in the linear predictor.
`linkf`	a character string specifying one of the three link functions to be used: `"logit"` (default), `"probit"`, or `"cloglog"`.
`size1`	sample size of the simulated data (for use with `"sp"` and `"halving"` methods) or a vector that specifies the number of rows in each matrix (for use with `"array"` method). If only one matrix is simulated, this value is a scalar.
`size2`	group size in pooling individual samples (for use with `"sp"` and `"halving"` methods) or a vector that specifies the number of columns in each matrix (for use with `"array"` method). If only one matrix is simulated, this value is a scalar.
`sens`	sensitivity of the group tests. Default value is set to 1.
`spec`	specificity of the group tests. Default value is set to 1.
`sens.ind`	sensitivity of the individual retests. If NULL, set to be equal to sens.
`spec.ind`	specificity of the individual retests. If NULL, set to be equal to spec.

Details

Generates group testing data in simple pooling form (type = "sp"), for the halving protocol (type = "halving"), or in array testing form (type = "array"). The covariates are either specified by the x argument or they are generated from a gamma distribution with the given gshape and gscale parameters. The individual probabilities are calculated from the covariates, the coefficients given in par, and the link function specified through linkf. The true binary individual responses are then simulated from the individual probabilities.

Under the matrix pooling protocol (type = "array"), the individuals are first organized into (by column) one or more matrices specified by the number of rows (size1) and the number of columns (size2).

Then, for all pooling protocols, the true group responses are found from the individual responses within groups or within rows/columns for matrix pooling (i.e., if at least one response is positive, the group is positive; otherwise, the group response is negative). Finally, the observed group responses (type = "sp"), group and subgroup responses (type = "halving"), or row and column responses (type = "array") are simulated using the given sens and spec.

For the simple pooling and halving protocols, individual retests are simulated from sens.ind and spec.ind for samples in observed positive groups. Note that with a given group size (specified by size2 with type = "sp" or type = "halving"), the last group may have fewer individuals. For the matrix pooling protocol, individual retests are simulated from sens.ind and spec.ind for individuals that lie on the intersection of an observed positive row and an observed positive column. In the case where no column (row) tests positive in a matrix, all the individuals in any observed positive rows (columns) will be assigned a simulated retest result. If no column or row is observed positive, NULL is returned.
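
The steps described above for simple pooling can be sketched in base R as follows. This is an illustration of the sequence of steps, not the code inside gtSim. The parameter values are arbitrary, and for brevity the same sens and spec are used for both group tests and individual retests.

```r
set.seed(46)
n.ind <- 23      # number of individuals (hypothetical)
size <- 5        # group size; the last group ends up with 3 individuals
sens <- 0.95
spec <- 0.95

# 1. Covariate from a gamma distribution, then individual probabilities
x <- rgamma(n.ind, shape = 20, scale = 2)
p.ind <- plogis(-6 + 0.1 * x)

# 2. True individual responses
ind <- rbinom(n.ind, 1, p.ind)

# 3. Group numbers and true group responses (positive if any member is)
groupn <- rep(seq_len(ceiling(n.ind / size)), each = size)[seq_len(n.ind)]
true.group <- as.numeric(tapply(ind, groupn, max))
n.grp <- length(true.group)

# 4. Observed group responses, subject to testing error
obs.group <- ifelse(true.group == 1,
                    rbinom(n.grp, 1, sens),
                    rbinom(n.grp, 1, 1 - spec))
gres <- obs.group[groupn]

# 5. Individual retests only within observed positive groups; NA elsewhere
retest <- rep(NA_real_, n.ind)
pos <- gres == 1
retest[pos] <- ifelse(ind[pos] == 1,
                      rbinom(sum(pos), 1, sens),
                      rbinom(sum(pos), 1, 1 - spec))

sim <- data.frame(gres, x, groupn, ind, retest)
head(sim)
```

The resulting columns mirror the names listed under 'Value' for simple pooling, so the coding of retest (0, 1, or NA) can be inspected directly.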

Value

A data frame for simple pooling (type = "sp") and the halving protocol (type = "halving"), or a list for array testing (type = "array"), which may include the following:

`gres`	the group response, for simple pooling and the halving protocol only.
`col.resp`	the column group response, for array testing only.
`row.resp`	the row group response, for array testing only.
`x`	the covariate.
`groupn`	the group number, for simple pooling and the halving protocol only.
`arrayn`	the array number, for array testing only.
`coln`	the column group number, for array testing only.
`rown`	the row group number, for array testing only.
`ind`	the true individual responses. For simple pooling and the halving protocol, these are included in the data frame of results. For array testing, these are included in the list of results, with individual responses presented in matrices.
`retest`	the results of individual retests.
`subgroup`	the subgroup number, for the halving protocol.
`prob`	the individual probabilities, for array testing only.

Author(s)

This function is a combination of sim.gt, sim.halving, and sim.mp written by Boan Zhang for the binGroup package. Minor modifications have been made for inclusion of the functions in the binGroup2 package.

Examples

set.seed(46)
gt.data <- gtSim(type = "sp", par = c(-12, 0.2),
                 size1 = 700, size2 = 5)

x1 <- sort(runif(100, 0, 30))
x2 <- rgamma(100, shape = 17, scale = 1.5)
gt.data <- gtSim(type = "sp", x = cbind(x1, x2),
                 par = c(-14, 0.2, 0.3), size2 = 4,
                 sens = 0.98, spec = 0.98)

set.seed(46)
gt.data <- gtSim(type = "halving", par = c(-6, 0.1),
                 gshape = 17, gscale = 1.4, size1 = 5000,
                 size2 = 5, sens = 0.95, spec = 0.95)

# 5x6 and 4x5 matrix
set.seed(9128)
sa1a <- gtSim(type = "array", par = c(-7, 0.1),
              size1 = c(5, 4), size2 = c(6, 5),
              sens = 0.95, spec = 0.95)
sa1a$dframe

Hypothesis test for one proportion in group testing

Description

Calculates p-values for hypothesis tests of single proportions estimated from group testing experiments against a threshold proportion in the hypotheses. Available methods include the exact test, score test, and Wald test.

Usage

gtTest(n, y, s, p.hyp, alternative = "two.sided", method = "exact")

Arguments

`n`	integer specifying the number of groups.
`y`	integer specifying the number of positive groups.
`s`	integer specifying the common size of groups.
`p.hyp`	the hypothetical threshold proportion against which to test, specified as a number between 0 and 1.
`alternative`	character string defining the alternative hypothesis, either `"two.sided"`, `"less"`, or `"greater"`.
`method`	character string defining the test method to be used. Options include "exact" for an exact test corresponding to the Clopper-Pearson confidence interval, "score" for a score test corresponding to the Wilson confidence interval, and "Wald" for a Wald test corresponding to the Wald confidence interval. The Wald method is not recommended. The "exact" method uses `binom.test{stats}`.

Details

This function assumes equal group sizes, no testing error (i.e., 100 percent sensitivity and specificity) to test the groups, and individual units randomly assigned to the groups with identical true probability of success.
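
Under these assumptions, a group of size $s$ tests positive with probability $1 - (1 - p)^s$, so a hypothesis about the individual proportion translates into a hypothesis about a binomial proportion for the groups. The base R sketch below illustrates this for the exact test. It is not the code inside gtTest: the helper name exact_test_sketch is hypothetical, and the form of the point estimate is an assumption.

```r
# Hypothetical helper (not part of binGroup2): exact test of an
#   individual-level proportion from group testing data.
exact_test_sketch <- function(n, y, s, p.hyp, alternative = "two.sided") {
  # Threshold on the group scale: probability that a group of size s
  #   is positive when the individual proportion equals p.hyp
  theta.hyp <- 1 - (1 - p.hyp)^s
  test <- binom.test(x = y, n = n, p = theta.hyp,
                     alternative = alternative)
  list(p.value = test$p.value,
       # Assumed estimator: back-transformed proportion of positive groups
       estimate = 1 - (1 - y / n)^(1 / s))
}

res <- exact_test_sketch(n = 10, y = 1, s = 100, p.hyp = 0.005,
                         alternative = "less")
res
```

Because the transformation from $p$ to the group-level probability is increasing, the direction of the alternative is the same on both scales.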

Value

A list containing:

`p.value`	the p-value of the test.
`estimate`	the estimated proportion.
`p.hyp`	the threshold proportion provided by the user.
`alternative`	the alternative provided by the user.
`method`	the test method provided by the user.

Author(s)

This function was originally written as bgtTest by Frank Schaarschmidt for the binGroup package. Minor modifications have been made for inclusion of the function in the binGroup2 package.

Examples

# Consider the following experiment: Tests are
#   performed on n=10 groups, each group has a size
#   of s=100 individuals. The aim is to show that less
#   than 0.5 percent (p < 0.005) of the units in
#   the population show a detrimental trait (positive test).
#   y=1 positive test and 9 negative tests are observed.
gtTest(n = 10, y = 1, s = 100, p.hyp = 0.005,
       alternative = "less", method = "exact")

# The exact test corresponds to the
#   limits of the Clopper-Pearson confidence interval
#   in the example of Tebbs & Bilder (2004):
gtTest(n = 24, y = 3, s = 7, alternative = "two.sided",
       method = "exact", p.hyp = 0.0543)

gtTest(n = 24, y = 3, s = 7, alternative = "two.sided",
       method = "exact", p.hyp = 0.0038)

# Hypothesis test with a group size of 1.
gtTest(n = 24, y = 3, s = 1, alternative = "two.sided",
       method = "exact", p.hyp = 0.1)

# Further methods:
gtTest(n = 24, y = 3, s = 7, alternative = "two.sided",
       method = "score", p.hyp = 0.0516)

gtTest(n = 24, y = 3, s = 7, alternative = "two.sided",
       method = "Wald", p.hyp = 0.0401)

Expected width of confidence intervals in group testing

Description

Calculation of the expected value of the width of confidence intervals for one proportion in group testing. Calculations are available for the confidence interval methods in propCI.

Usage

gtWidth(n, s, p, conf.level = 0.95, alternative = "two.sided", method = "CP")

Arguments

`n`	integer specifying the number of groups. A vector of integers is also allowed.
`s`	integer specifying the common size of groups. A vector of integers is also allowed.
`p`	the assumed true proportion of individuals showing the trait to be estimated. A vector is also allowed.
`conf.level`	the required confidence level of the interval.
`alternative`	character string specifying the alternative hypothesis, either `"two.sided"`, `"less"`, or `"greater"`.
`method`	character string specifying the confidence interval method. Available options include those in `propCI`.

Details

The two-sided (alternative="two.sided") option calculates the expected width between the lower and upper bound of a two-sided $conf.level*100$ percent confidence interval. See Tebbs & Bilder (2004) for the expression. The one-sided (alternative="less" or alternative="greater") options calculate the expected distance between the one-sided limit and the assumed true proportion p for a one-sided $conf.level*100$ percent confidence interval.
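
The idea behind the two-sided expected width can be sketched in base R: compute the interval width for every possible number of positive groups y = 0, ..., n, then average these widths using the binomial probabilities of each y under the assumed p. The helper expected_cp_width below is hypothetical (it is not a binGroup2 function), and it assumes a Clopper-Pearson interval on the group-level proportion that is back-transformed to the individual scale; it may not match the gtWidth implementation in every detail.

```r
# Sketch: expected width of a two-sided Clopper-Pearson interval
# for p in group testing (hypothetical helper, base R only).
expected_cp_width <- function(n, s, p, conf.level = 0.95) {
  theta <- 1 - (1 - p)^s   # probability that a group tests positive
  y <- 0:n                 # all possible numbers of positive groups

  width <- vapply(y, function(yy) {
    # Clopper-Pearson limits for the group-level proportion
    ci.theta <- binom.test(x = yy, n = n, conf.level = conf.level)$conf.int
    # Back-transform the limits to the individual-level proportion
    ci.p <- 1 - (1 - ci.theta)^(1 / s)
    ci.p[2] - ci.p[1]
  }, numeric(1))

  # Weight each width by the probability of observing y positive groups
  sum(width * dbinom(y, size = n, prob = theta))
}

w <- expected_cp_width(n = 20, s = 10, p = 0.01)
w
```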

Value

A matrix containing the columns:

`ns`	the resulting total number of units, $n*s$.
`n`	the number of groups.
`s`	the group size.
`p`	the assumed true proportion.
`expCIWidth`	the expected value of the confidence interval width as defined under the argument `alternative`.

Author(s)

This function was originally written as bgtWidth by Frank Schaarschmidt for the binGroup package. Minor modifications have been made for inclusion of the function in the binGroup2 package.

References

Examples

# Examine different group sizes to determine
#   the shortest expected width.
gtWidth(n = 20, s = seq(from = 1, to = 200, by = 10),
        p = 0.01, alternative = "less", method = "CP")

# Calculate the expected width of the confidence
#   interval with a group size of 1 (individual testing).
gtWidth(n = 20, s = 1, p = 0.005, alternative = "less", method = "CP")

Probability mass function for halving

Description

Calculate the probability mass function for the number of tests from using the halving algorithm.

Usage

halving(p, Se = 1, Sp = 1, stages = 2, order.p = TRUE)

Arguments

`p`	a vector of individual risk probabilities.
`Se`	sensitivity of the diagnostic test.
`Sp`	specificity of the diagnostic test.
`stages`	the number of stages for the halving algorithm.
`order.p`	logical; if TRUE, the vector of individual risk probabilities will be sorted.

Details

Halving algorithms involve successively splitting a positive testing group into two equal-sized halves (or as close to equal as possible) until all individuals have been identified as positive or negative. $S$-stage halving begins by testing the whole group of $I$ individuals. Positive groups are split in half until the final stage of the algorithm, which consists of individual testing. For example, consider an initial group of size $I=16$ individuals. Three-stage halving (3H) begins by testing the whole group of 16 individuals. If this group tests positive, the second stage involves splitting into two groups of size 8. If either of these groups tests positive, a third stage involves testing each individual rather than halving again. Four-stage halving (4H) would continue with halving into groups of size 4 before individual testing. Five-stage halving (5H) would continue with halving into groups of size 2 before individual testing. 3H requires more than 2 individuals, 4H requires more than 4 individuals, and 5H requires more than 8 individuals.

This function calculates the probability mass function, expected testing expenditure, and variance of the testing expenditure for halving algorithms with 3 to 5 stages.
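
The splitting scheme described above can be sketched with a small base R helper that lists the group sizes that would be tested at each stage if every group tested positive. The function halving_sizes is hypothetical and only illustrates the bookkeeping; it does not compute the probability mass function and assumes groups are large enough to be split at every halving stage.

```r
# Sketch: group sizes at each stage of an S-stage halving algorithm
# (hypothetical helper, base R only).
halving_sizes <- function(n.ind, stages) {
  sizes <- list(n.ind)  # stage 1: the whole group
  if (stages > 2) {
    for (s in 2:(stages - 1)) {
      prev <- sizes[[s - 1]]
      # Split each group into two halves, as close to equal as possible
      sizes[[s]] <- as.vector(rbind(ceiling(prev / 2), floor(prev / 2)))
    }
  }
  sizes[[stages]] <- rep(1, n.ind)  # final stage: individual testing
  sizes
}

# Four-stage halving (4H) with an initial group of 16 individuals
sizes.4H <- halving_sizes(n.ind = 16, stages = 4)
sizes.4H[[2]]  # two groups of size 8
sizes.4H[[3]]  # four groups of size 4
```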

Value

A list containing:

`pmf`	the probability mass function for the halving algorithm.
`et`	the expected testing expenditure for the halving algorithm.
`vt`	the variance of the testing expenditure for the halving algorithm.
`p`	a vector containing the probabilities of positivity for each individual.

Author(s)

This function was originally written by Michael Black for Black et al. (2012). The function was obtained from http://chrisbilder.com/grouptesting/. Minor modifications have been made for inclusion of the function in the binGroup2 package.

References

Examples

# Equivalent to Dorfman testing (two-stage hierarchical)
halving(p = rep(0.01, 10), Se = 1, Sp = 1, stages = 2,
        order.p = TRUE)

# Halving over three stages; each individual has a
#   different probability of being positive
set.seed(12895)
p.vec <- expectOrderBeta(p = 0.05, alpha = 2, size = 20)
halving(p = p.vec, Se = 0.95, Sp = 0.95, stages = 3,
        order.p = TRUE)

Data from an HIV surveillance project

Description

The hivsurv data set comes from an HIV surveillance project discussed in Verstraeten et al. (1998) and Vansteelandt et al. (2000). The purpose of the study was to estimate the HIV prevalence among pregnant Kenyan women in four rural locations of the country, using both individual and group testing responses. Blood tests were administered to each participating woman, and 4 covariates were obtained on each woman. Because the original group responses are unavailable, individuals are artificially put into groups of 5 here to form group responses. Only the 428 complete observations are given.

Usage

data(hivsurv)

Format

A data frame with 428 observations on the following 8 variables.

DATE: the date when each sample was collected.
PAR.: parity (number of children).
AGE: age (in years).
MA.ST.: marital status (1: single; 2: married (polygamous); 3: married (monogamous); 4: divorced; 5: widow).
EDUC.: highest attained education level (1: no schooling; 2: primary school; 3: secondary school; 4: higher).
HIV: individual response of HIV diagnosis (0: negative; 1: positive).
gnum: the group number that designates individuals into groups.
groupres: the group response calculated from artificially formed groups.

Source

Vansteelandt, S., Goetghebeur, E., Verstraeten, T. (2000). “Regression models for disease prevalence with diagnostic tests on pools of serum samples.” Biometrics, 56, 1126–1133.

Examples

data(hivsurv)

str(hivsurv)

Extract the individual probabilities used to calculate group testing results

Description

Extract the individual probabilities from objects of class "opChar" returned by operatingCharacteristics1 (opChar1) or operatingCharacteristics2 (opChar2).

Usage

IndProb(object, ...)

Arguments

`object`	An object of class "opChar", from which the individual probabilities are to be extracted.
`...`	Additional arguments to be passed to `IndProb` (e.g., `digits` to be passed to `signif` for appropriate rounding).

Value

Either p.vec, the sorted vector of individual probabilities (for hierarchical group testing algorithms) or p.mat, the sorted matrix of individual probabilities in gradient arrangement (for array testing algorithms). Further details are given under the 'Details' section for the operatingCharacteristics1 (opChar1) or operatingCharacteristics2 (opChar2) functions.

Author(s)

Brianna D. Hitt

Examples

config.mat <- matrix(data = c(rep(1, 10), 1:10),
                     nrow = 2, ncol = 10, byrow = TRUE)
res1 <- opChar1(algorithm = "D2", p = 0.05, Se = 0.99, Sp = 0.99,
        hier.config = config.mat)
IndProb(res1)

config.mat <- matrix(data = c(rep(1, 20), rep(1, 10), rep(2, 10),
                             rep(c(1, 2, 3, 4), each = 5),
                             rep(1, 3), rep(2, 2), rep(3, 3),
                             rep(4, 2), rep(5, 3), rep(6, 2),
                             rep(7, 3), rep(8, 2), 1:20),
                    nrow = 5, ncol = 20, byrow = TRUE)
Se <- matrix(data = rep(0.95, 10), nrow = 2, ncol = 5,
             dimnames = list(Infection = 1:2, Stage = 1:5))
Sp <- matrix(data = rep(0.99, 10), nrow = 2, ncol = 5,
             dimnames = list(Infection = 1:2, Stage = 1:5))
res2 <- opChar2(algorithm = "ID5",
                alpha = c(18.25, 0.75, 0.75, 0.25),
                Se = Se, Sp = Sp, hier.config = config.mat)
IndProb(res2)

Arrange a matrix of probabilities for informative array testing

Description

Arrange a vector of individual risk probabilities in a matrix for informative array testing without master pooling.

Usage

informativeArrayProb(prob.vec, nr, nc, method = "sd")

Arguments

`prob.vec`	vector of individual risk probabilities, of length nr * nc.
`nr`	number of rows in the array.
`nc`	number of columns in the array.
`method`	character string defining the method to be used for matrix arrangement. Options include spiral ("`sd`") and gradient ("`gd`") arrangement. See McMahan et al. (2012) for additional details.

Value

A matrix of probabilities arranged according to the specified method.

Author(s)

This function was originally written by Christopher McMahan for McMahan et al. (2012). The function was obtained from http://chrisbilder.com/grouptesting/.

References

McMahan, C., Tebbs, J., Bilder, C. (2012b). “Two-Dimensional Informative Array Testing.” Biometrics, 68, 793–804.

Examples

# Use the gradient arrangement method to create a matrix
#   of individual risk probabilities for a 10x10 array.
# Depending on the specified probability, alpha level,
#   and overall group size, simulation may be necessary
#   in order to generate the vector of individual
#   probabilities. This is done using the expectOrderBeta()
#   function and requires the user to set a seed in order
#   to reproduce results.
set.seed(1107)
p.vec1 <- expectOrderBeta(p = 0.05, alpha = 2, size = 100)
informativeArrayProb(prob.vec = p.vec1, nr = 10, nc = 10,
                     method = "gd")

# Use the spiral arrangement method to create a matrix
#   of individual risk probabilities for a 5x5 array.
set.seed(8791)
p.vec2 <- expectOrderBeta(p = 0.02, alpha = 0.5, size = 25)
informativeArrayProb(prob.vec = p.vec2, nr = 5, nc = 5,
                     method = "sd")

Calculate operating characteristics for group testing algorithms that use a single-disease assay

Description

Calculate operating characteristics, such as the expected number of tests, for a specified testing configuration using non-informative and informative hierarchical and array-based group testing algorithms. Single-disease assays are used at each stage of the algorithms.

Usage

operatingCharacteristics1(
  algorithm,
  p = NULL,
  probabilities = NULL,
  Se = 0.99,
  Sp = 0.99,
  hier.config = NULL,
  rowcol.sz = NULL,
  alpha = 2,
  a = NULL,
  print.time = TRUE,
  ...
)

opChar1(
  algorithm,
  p = NULL,
  probabilities = NULL,
  Se = 0.99,
  Sp = 0.99,
  hier.config = NULL,
  rowcol.sz = NULL,
  alpha = 2,
  a = NULL,
  print.time = TRUE,
  ...
)

Arguments

`algorithm`	character string defining the group testing algorithm to be used. Non-informative testing options include two-stage hierarchical ("`D2`"), three-stage hierarchical ("`D3`"), four-stage hierarchical ("`D4`"), square array testing without master pooling ("`A2`"), and square array testing with master pooling ("`A2M`"). Informative testing options include two-stage hierarchical ("`ID2`"), three-stage hierarchical ("`ID3`"), four-stage hierarchical ("`ID4`"), and square array testing without master pooling ("`IA2`").
`p`	overall probability of disease that will be used to generate a vector/matrix of individual probabilities. For non-informative algorithms, a homogeneous set of probabilities will be used. For informative algorithms, the `expectOrderBeta` function will be used to generate a heterogeneous set of probabilities. Further details are given under 'Details'. Either `p` or `probabilities` should be specified, but not both.
`probabilities`	a vector of individual probabilities, which is homogeneous for non-informative testing algorithms and heterogeneous for informative testing algorithms. Either `p` or `probabilities` should be specified, but not both.
`Se`	a vector of sensitivity values, where one value is given for each stage of testing (in order). If a single value is provided, sensitivity values are assumed to be equal to this value for all stages of testing. Further details are given under 'Details'.
`Sp`	a vector of specificity values, where one value is given for each stage of testing (in order). If a single value is provided, specificity values are assumed to be equal to this value for all stages of testing. Further details are given under 'Details'.
`hier.config`	a matrix specifying the configuration for a hierarchical testing algorithm. The rows correspond to the stages of testing, the columns correspond to each individual to be tested, and the cell values specify the group number of each individual at each stage. Further details are given under 'Details'. For array testing algorithms, this argument will be ignored.
`rowcol.sz`	the row/column size for array testing algorithms. For hierarchical testing algorithms, this argument will be ignored.
`alpha`	a shape parameter for the beta distribution that specifies the degree of heterogeneity for the generated probability vector (for informative testing only).
`a`	a vector containing indices indicating which individuals to calculate individual accuracy measures for. If `NULL`, individual accuracy measures will be displayed for all individuals in the algorithm.
`print.time`	a logical value indicating whether the length of time for calculations should be printed. The default is `TRUE`.
`...`	arguments to be passed to the `expectOrderBeta` function, which generates a vector of probabilities for informative testing algorithms. Further details are given under 'Details'.

Details

This function computes the operating characteristics for group testing algorithms with an assay that tests for one disease, as described in Hitt et al. (2019).

Available algorithms include two-, three-, and four-stage hierarchical testing and array testing with and without master pooling. Both non-informative and informative group testing settings are allowed for each algorithm, except informative array testing with master pooling is unavailable because this method has not appeared in the group testing literature. Operating characteristics calculated are expected number of tests, pooling sensitivity, pooling specificity, pooling positive predictive value, and pooling negative predictive value for each individual.
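
For intuition about the expected number of tests, the simplest case has a well-known closed form. In non-informative two-stage hierarchical (Dorfman) testing, one test is always used for the initial group, and every member is retested individually whenever that group tests positive. The base R sketch below evaluates this textbook expression; the variable names are illustrative, and the code is not taken from the package.

```r
# Sketch: expected number of tests for non-informative two-stage
# hierarchical (Dorfman) testing with one group (base R only).
p <- 0.05       # individual probability of disease
group.sz <- 10  # initial group size
Se <- 0.99      # sensitivity of the group test
Sp <- 0.99      # specificity of the group test

# Probability that the initial group tests positive:
#   truly positive and detected, or truly negative and a false positive
prob.all.neg <- (1 - p)^group.sz
prob.group.pos <- Se * (1 - prob.all.neg) + (1 - Sp) * prob.all.neg

# One group test, plus group.sz individual tests if the group is positive
ET <- 1 + group.sz * prob.group.pos
ET             # expected number of tests
ET / group.sz  # expected number of tests per individual
```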

For informative algorithms where the p argument is specified, the expected values of order statistics from a beta distribution are found. These values are used to represent disease risk probabilities for each individual to be tested. The beta distribution has two parameters: a mean parameter p (overall disease prevalence) and a shape parameter alpha (heterogeneity level). Depending on the specified p, alpha, and overall group size, simulation may be necessary to generate the vector of individual probabilities. This is done using expectOrderBeta and requires the user to set a seed to reproduce results.
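
The simulation approach mentioned above can be sketched in base R: draw many samples from the beta distribution, sort each sample, and average position by position. The helper sim_order_beta is hypothetical and is not the expectOrderBeta implementation. It assumes that alpha is the first shape parameter and that the second shape parameter is alpha * (1 - p) / p, which makes the mean of the distribution equal to p.

```r
# Sketch: approximate the expected values of beta order statistics
# by simulation (hypothetical helper, base R only).
sim_order_beta <- function(p, alpha, size, num.sim = 10000) {
  # Assumed parameterization: mean of Beta(alpha, beta) equals p
  beta <- alpha * (1 - p) / p

  # Each column holds one simulated group of `size` individuals
  draws <- matrix(rbeta(size * num.sim, shape1 = alpha, shape2 = beta),
                  nrow = size, ncol = num.sim)

  # Sort within each simulated group, then average each ordered position
  sorted <- apply(draws, 2, sort)
  rowMeans(sorted)
}

set.seed(1107)
p.sim <- sim_order_beta(p = 0.05, alpha = 2, size = 10, num.sim = 2000)
p.sim        # ascending vector of individual risk probabilities
mean(p.sim)  # close to the overall prevalence p
```

Because the result depends on random draws, a seed must be set beforehand to reproduce a particular vector.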

The sensitivity/specificity values are allowed to vary across stages of testing. For hierarchical testing, a different sensitivity/specificity value may be used for each stage of testing. For array testing, a different sensitivity/specificity value may be used for master pool testing (if included), row/column testing, and individual testing. The values must be specified in order of the testing performed. For example, values are specified as (stage 1, stage 2, stage 3) for three-stage hierarchical testing or (master pool testing, row/column testing, individual testing) for array testing with master pooling. A single sensitivity/specificity value may be specified instead. In this situation, sensitivity/specificity values for all stages are assumed to be equal.

The matrix specified by hier.config defines the hierarchical group testing algorithm for $I$ individuals. The rows of the matrix correspond to the stages $s=1,...,S$ in the testing algorithm, and the columns correspond to individuals $i=1,...,I$. The cell values within the matrix represent the group number of individual $i$ at stage $s$. For three-stage, four-stage, and non-informative two-stage hierarchical testing, the first row of the matrix consists of all ones. This indicates that all individuals in the algorithm are tested together in a single group in the first stage of testing. For informative two-stage hierarchical testing, the initial group (block) is not tested. Thus, the first row of the matrix consists of the group numbers for each individual in the first stage of testing. For all hierarchical algorithms, the final row of the matrix denotes individual testing. Individuals who are not tested in a particular stage are represented by "NA" (e.g., an individual tested in a group of size 1 in the second stage of testing would not be tested again in a third stage of testing). It is important to note that this matrix represents the testing that could be performed if each group tests positive at each stage prior to the last. For more details on this matrix (called a group membership matrix), see Bilder et al. (2019).
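
A group membership matrix can be inspected with base R before it is passed to the function. The sketch below rebuilds the three-stage configuration for 18 individuals that appears in the examples for this function and tabulates the group sizes implied by each row; the name group.sizes is illustrative.

```r
# Sketch: read group sizes off a group membership matrix (base R only).
# Rows are stages, columns are individuals, cells are group numbers.
config.mat <- matrix(data = c(rep(1, 18), rep(1:3, each = 5),
                              rep(4, 3), 1:18),
                     nrow = 3, ncol = 18, byrow = TRUE)

# Count the individuals assigned to each group number at each stage;
#   table() ignores NA entries (individuals not tested at that stage)
group.sizes <- lapply(seq_len(nrow(config.mat)), function(s) {
  as.vector(table(config.mat[s, ]))
})

group.sizes[[1]]  # stage 1: one group of 18
group.sizes[[2]]  # stage 2: groups of size 5, 5, 5, and 3
group.sizes[[3]]  # stage 3: 18 individual tests
```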

For array testing without master pooling, the rowcol.sz specified represents the row/column size for initial (stage 1) testing. For array testing with master pooling, the rowcol.sz specified represents the row/column size for stage 2 testing. This is because the master pool size is the overall array size, given by the square of the row/column size.

The displayed overall pooling sensitivity, pooling specificity, pooling positive predictive value, and pooling negative predictive value are weighted averages of the corresponding individual accuracy measures for all individuals within the initial group (or block) for a hierarchical algorithm, or within the entire array for an array-based algorithm. Expressions for these averages are provided in the Supplementary Material for Hitt et al. (2019). These expressions are based on accuracy definitions given by Altman and Bland (1994a, 1994b).
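
The predictive values build on the standard definitions for a diagnostic test. As a minimal sketch of those definitions only, the base R code below applies Bayes' rule to a single test with sensitivity Se and specificity Sp for an individual with disease probability p. The functions ppv and npv are hypothetical helpers; the pooling measures reported by the package come from algorithm-specific expressions and weights that are not reproduced here.

```r
# Sketch: positive and negative predictive values for a single test,
# using the standard definitions (hypothetical helpers, base R only).
ppv <- function(Se, Sp, p) {
  # P(truly positive | test positive)
  Se * p / (Se * p + (1 - Sp) * (1 - p))
}
npv <- function(Se, Sp, p) {
  # P(truly negative | test negative)
  Sp * (1 - p) / (Sp * (1 - p) + (1 - Se) * p)
}

ppv.ex <- ppv(Se = 0.99, Sp = 0.99, p = 0.05)
npv.ex <- npv(Se = 0.99, Sp = 0.99, p = 0.05)
ppv.ex
npv.ex
```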

The operatingCharacteristics1 function accepts additional arguments, namely num.sim, to be passed to the expectOrderBeta function, which generates a vector of probabilities for informative group testing algorithms. The num.sim argument specifies the number of simulations from the beta distribution when simulation is used. By default, 10,000 simulations are used.

Value

A list containing:

`algorithm`	the group testing algorithm used for calculations.
`prob`	the probability of disease or the vector of individual probabilities, as specified by the user.
`alpha`	level of heterogeneity for the generated probability vector (for informative testing only).
`Se`	the vector of sensitivity values for each stage of testing.
`Sp`	the vector of specificity values for each stage of testing.
`Config`	a list specifying elements of the specified testing configuration, which may include:
Stage1: group size for the first stage of hierarchical testing, if applicable.
Stage2: group sizes for the second stage of hierarchical testing, if applicable.
Stage3: group sizes for the third stage of hierarchical testing, if applicable.
Block.sz: the block size/initial group size for informative Dorfman testing, which is not tested.
pool.szs: group sizes for the first stage of testing for informative Dorfman testing.
Array.dim: the row/column size for array testing.
Array.sz: the overall array size for array testing (the square of the row/column size).
`p.vec`	the sorted vector of individual probabilities, if applicable.
`p.mat`	the sorted matrix of individual probabilities in gradient arrangement, if applicable. Further details are given under 'Details'.
`ET`	the expected testing expenditure to decode all individuals in the algorithm; this includes all individuals in all groups for hierarchical algorithms or in the entire array for array testing.
`value`	the value of the expected number of tests per individual.
`Accuracy`	a list containing:
Individual: a matrix of accuracy measures for each individual specified in `a`. The rows correspond to each unique set of accuracy measures in the algorithm. Individuals with the same set of accuracy measures are displayed together in a single row of the matrix. The columns correspond to the pooling sensitivity, pooling specificity, pooling positive predictive value, pooling negative predictive value, and the indices for the individuals in each row of the matrix.
Overall: a matrix of overall accuracy measures for the algorithm. The columns correspond to the pooling sensitivity, pooling specificity, pooling positive predictive value, and pooling negative predictive value for the overall algorithm. Further details are given under 'Details'.

Note

This function returns the pooling positive and negative predictive values for all individuals even though these measures are diagnostic specific; e.g., the pooling positive predictive value should only be considered for those individuals who have tested positive.

Additionally, only stage-dependent sensitivity and specificity values are allowed within the program; values that depend on the group within a stage are not allowed. See Bilder et al. (2019) for additional information.

Author(s)

Brianna D. Hitt

References

Altman, D., Bland, J. (1994a). “Diagnostic tests 1: Sensitivity and specificity.” BMJ, 308, 1552.

Altman, D., Bland, J. (1994b). “Diagnostic tests 2: Predictive values.” BMJ, 309, 102.

Bilder, C., Tebbs, J., McMahan, C. (2019). “Informative group testing for multiplex assays.” Biometrics, 75, 278–288.

Hitt, B., Bilder, C., Tebbs, J., McMahan, C. (2019). “The objective function controversy for group testing: Much ado about nothing?” Statistics in Medicine, 38, 4912–4923.

McMahan, C., Tebbs, J., Bilder, C. (2012a). “Informative Dorfman Screening.” Biometrics, 68, 287–296.

McMahan, C., Tebbs, J., Bilder, C. (2012b). “Two-Dimensional Informative Array Testing.” Biometrics, 68, 793–804.

Examples

# Calculate the operating characteristics for non-informative
#   two-stage hierarchical (Dorfman) testing.
config.mat <- matrix(data = c(rep(1, 10), 1:10),
                     nrow = 2, ncol = 10, byrow = TRUE)
opChar1(algorithm = "D2", p = 0.05, Se = 0.99, Sp = 0.99,
        hier.config = config.mat, print.time = FALSE)

# Calculate the operating characteristics for informative
#   two-stage hierarchical (Dorfman) testing.
# A vector of individual probabilities is generated using
#   the expected value of order statistics from a beta
#   distribution with p = 0.01 and a heterogeneity level
#   of alpha = 0.5.
config.mat <- matrix(data = c(rep(1:3, each = 10), 1:30),
                     nrow = 2, ncol = 30, byrow = TRUE)
set.seed(52613)
opChar1(algorithm = "ID2", p = 0.01, Se = 0.95, Sp = 0.95,
        hier.config = config.mat, alpha = 0.5, num.sim = 10000)
# Equivalent code using a heterogeneous vector of
#   probabilities
set.seed(52613)
probs <- expectOrderBeta(p = 0.01, alpha = 0.5, size = 30)
opChar1(algorithm = "ID2", probabilities = probs,
        Se = 0.95, Sp = 0.95, hier.config = config.mat)

# Calculate the operating characteristics for
#   non-informative three-stage hierarchical testing.
config.mat <- matrix(data = c(rep(1, 18), rep(1:3, each = 5),
                              rep(4, 3), 1:18),
                    nrow = 3, ncol = 18, byrow = TRUE)
opChar1(algorithm = "D3", p = 0.001, Se = 0.95, Sp = 0.95,
        hier.config = config.mat)
opChar1(algorithm = "D3", p = 0.001, Se = c(0.95, 0.95, 0.99),
        Sp = c(0.96, 0.96, 0.98), hier.config = config.mat)

# Calculate the operating characteristics for
#   informative three-stage hierarchical testing,
#   given a heterogeneous vector of probabilities.
config.mat <- matrix(data = c(rep(1, 6), rep(1:2, each = 3),
                              1:6), nrow = 3, ncol = 6,
                     byrow = TRUE)
set.seed(52613)
opChar1(algorithm = "ID3",
         probabilities = c(0.012, 0.014, 0.011, 0.012, 0.010, 0.015),
         Se = 0.99, Sp = 0.99, hier.config = config.mat,
         alpha = 0.5, num.sim = 5000)

# Calculate the operating characteristics for
#   non-informative four-stage hierarchical testing.
config.mat <- matrix(data = c(rep(1, 12), rep(1, 8),
                              rep(2, 2), 3, 4, rep(1, 5),
                              rep(2, 3), 3, 4, rep(NA, 2),
                              1:8, rep(NA, 4)), nrow = 4,
                     ncol = 12, byrow = TRUE)
opChar1(algorithm = "D4", p = 0.041, Se = 0.99, Sp = 0.90,
        hier.config = config.mat)

# Calculate the operating characteristics for
#   informative four-stage hierarchical testing.
# A vector of individual probabilities is generated using
#   the expected value of order statistics from a beta
#   distribution with p = 0.041 and a heterogeneity level
#   of alpha = 0.5.
config.mat <- matrix(data = c(rep(1, 12), rep(1, 8),
                              rep(2, 2), 3, 4, rep(1, 5),
                              rep(2, 3), 3, 4, rep(NA, 2),
                              1:8, rep(NA, 4)), nrow = 4,
                     ncol = 12, byrow = TRUE)
set.seed(5678)
opChar1(algorithm = "ID4", p = 0.041, Se = 0.99, Sp = 0.90,
        hier.config = config.mat, alpha = 0.5)

# Calculate the operating characteristics for
#   non-informative array testing without master pooling.
opChar1(algorithm = "A2", p = 0.005, Se = c(0.95, 0.99),
        Sp = c(0.95, 0.99), rowcol.sz = 8, a = 1)

# Calculate the operating characteristics for
#   informative array testing without master pooling.
# A vector of individual probabilities is generated using
#   the expected value of order statistics from a beta
#   distribution with p = 0.03 and a heterogeneity level
#   of alpha = 2.
set.seed(1002)
opChar1(algorithm = "IA2", p = 0.03, Se = 0.95, Sp = 0.95,
         rowcol.sz = 8, alpha = 2, a = 1:10)

# Calculate the operating characteristics for
#   non-informative array testing with master pooling.
opChar1(algorithm = "A2M", p = 0.02, Se = c(0.95,0.95,0.99),
        Sp = c(0.98,0.98,0.99), rowcol.sz = 5)

# Calculate the operating characteristics for
#   informative three-stage hierarchical testing,
#   given a heterogeneous vector of probabilities.
config.mat <- matrix(data = c(rep(1, 6), rep(1:2, each = 3),
                              1:6), nrow = 3, ncol = 6,
                     byrow = TRUE)
set.seed(52613)
opChar1(algorithm = "ID3",
         probabilities = c(0.012, 0.014, 0.011, 0.012, 0.010, 0.015),
         Se = 0.99, Sp = 0.99, hier.config = config.mat,
         alpha = 0.5, num.sim = 5000)

# Calculate the operating characteristics for
#   non-informative four-stage hierarchical testing.
config.mat <- matrix(data = c(rep(1, 12), rep(1, 8),
                              rep(2, 2), 3, 4, rep(1, 5),
                              rep(2, 3), 3, 4, rep(NA, 2),
                              1:8, rep(NA, 4)), nrow = 4,
                     ncol = 12, byrow = TRUE)
opChar1(algorithm = "D4", p = 0.041, Se = 0.99, Sp = 0.90,
        hier.config = config.mat)

# Calculate the operating characteristics for
#   informative four-stage hierarchical testing.
# A vector of individual probabilities is generated using
#   the expected value of order statistics from a beta
#   distribution with p = 0.041 and a heterogeneity level
#   of alpha = 0.5.
config.mat <- matrix(data = c(rep(1, 12), rep(1, 8),
                              rep(2, 2), 3, 4, rep(1, 5),
                              rep(2, 3), 3, 4, rep(NA, 2),
                              1:8, rep(NA, 4)), nrow = 4,
                     ncol = 12, byrow = TRUE)
set.seed(5678)
opChar1(algorithm = "ID4", p = 0.041, Se = 0.99, Sp = 0.90,
        hier.config = config.mat, alpha = 0.5)

# Calculate the operating characteristics for
#   non-informative array testing without master pooling.
opChar1(algorithm = "A2", p = 0.005, Se = c(0.95, 0.99),
        Sp = c(0.95, 0.99), rowcol.sz = 8, a = 1)

# Calculate the operating characteristics for
#   informative array testing without master pooling.
# A vector of individual probabilities is generated using
#   the expected value of order statistics from a beta
#   distribution with p = 0.03 and a heterogeneity level
#   of alpha = 2.
set.seed(1002)
opChar1(algorithm = "IA2", p = 0.03, Se = 0.95, Sp = 0.95,
         rowcol.sz = 8, alpha = 2, a = 1:10)

# Calculate the operating characteristics for
#   non-informative array testing with master pooling.
opChar1(algorithm = "A2M", p = 0.02, Se = c(0.95,0.95,0.99),
        Sp = c(0.98,0.98,0.99), rowcol.sz = 5)

Calculate operating characteristics for group testing algorithms that use a multiplex assay for two diseases

Description

Calculate operating characteristics, such as the expected number of tests, for a specified testing configuration using non-informative and informative hierarchical and array-based group testing algorithms. Multiplex assays for two diseases are used at each stage of the algorithms.

Usage

operatingCharacteristics2(
  algorithm,
  p.vec = NULL,
  probabilities = NULL,
  alpha = NULL,
  Se,
  Sp,
  hier.config = NULL,
  rowcol.sz = NULL,
  ordering = matrix(data = c(0, 1, 0, 1, 0, 0, 1, 1), nrow = 4, ncol = 2),
  a = NULL,
  print.time = TRUE,
  ...
)

opChar2(
  algorithm,
  p.vec = NULL,
  probabilities = NULL,
  alpha = NULL,
  Se,
  Sp,
  hier.config = NULL,
  rowcol.sz = NULL,
  ordering = matrix(data = c(0, 1, 0, 1, 0, 0, 1, 1), nrow = 4, ncol = 2),
  a = NULL,
  print.time = TRUE,
  ...
)

Arguments

`algorithm`	character string defining the group testing algorithm to be used. Non-informative testing options include two-stage hierarchical ("`D2`"), three-stage hierarchical ("`D3`"), four-stage hierarchical ("`D4`"), five-stage hierarchical ("`D5`"), square array testing without master pooling ("`A2`"), and square array testing with master pooling ("`A2M`"). Informative testing options include two-stage hierarchical ("`ID2`"), three-stage hierarchical ("`ID3`"), four-stage hierarchical ("`ID4`"), and five-stage hierarchical ("`ID5`") testing.
`p.vec`	vector of overall joint probabilities. The joint probabilities are assumed to be equal for all individuals in the algorithm (non-informative testing only). There are four joint probabilities to consider: $p_{00}$, the probability that an individual tests negative for both diseases; $p_{10}$, the probability that an individual tests positive only for the first disease; $p_{01}$, the probability that an individual tests positive only for the second disease; and $p_{11}$, the probability that an individual tests positive for both diseases. The joint probabilities must sum to 1. Only one of `p.vec`, `probabilities`, or `alpha` should be specified.
`probabilities`	matrix of joint probabilities for each individual, where rows correspond to the four joint probabilities and columns correspond to each individual in the algorithm. Only one of `p.vec`, `probabilities`, or `alpha` should be specified.
`alpha`	a vector containing positive shape parameters of the Dirichlet distribution (for informative testing only). The vector will be used to generate a heterogeneous matrix of joint probabilities for each individual. The vector must have length 4. Further details are given under 'Details'. Only one of `p.vec`, `probabilities`, or `alpha` should be specified.
`Se`	a matrix of sensitivity values, where one value is given for each disease (or infection) at each stage of testing. The rows of the matrix correspond to each disease $k=1,2$, and the columns of the matrix correspond to each stage of testing $s=1,...,S$. If a vector of 2 values is provided, the sensitivity values associated with disease $k$ are assumed to be equal to the $k$th value in the vector for all stages of testing. Further details are given under 'Details'.
`Sp`	a matrix of specificity values, where one value is given for each disease (or infection) at each stage of testing. The rows of the matrix correspond to each disease $k=1,2$, and the columns of the matrix correspond to each stage of testing $s=1,...,S$. If a vector of 2 values is provided, the specificity values associated with disease $k$ are assumed to be equal to the $k$th value in the vector for all stages of testing. Further details are given under 'Details'.
`hier.config`	a matrix specifying the configuration for a hierarchical testing algorithm. The rows correspond to the stages of testing, the columns correspond to each individual to be tested, and the cell values specify the group number of each individual at each stage. Further details are given under 'Details'. For array testing algorithms, this argument will be ignored.
`rowcol.sz`	the row/column size for array testing algorithms. For hierarchical testing algorithms, this argument will be ignored.
`ordering`	a matrix detailing the ordering for the binary responses of the diseases. The columns of the matrix correspond to each disease and the rows of the matrix correspond to each of the 4 sets of binary responses for two diseases. This ordering is used with the joint probabilities. The default ordering is ($p_{00}$, $p_{10}$, $p_{01}$, $p_{11}$).
`a`	a vector containing indices indicating which individuals to calculate individual accuracy measures for. If `NULL`, individual accuracy measures will be displayed for all individuals in the algorithm.
`print.time`	a logical value indicating whether the length of time for calculations should be printed. The default is `TRUE`.
`...`	additional arguments to be passed to functions for hierarchical testing with multiplex assays for two diseases.
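
As an illustration of how `ordering` lines up with `p.vec`, the following base-R sketch labels the rows of the default ordering matrix and recovers each disease's marginal prevalence from a joint probability vector. The object names `joint.labels` and `marginal.prev` and the probability values are illustrative assumptions, not part of the package.

```r
# Default ordering matrix from the Usage section:
#   rows = the four joint outcomes, columns = the two diseases.
ordering <- matrix(data = c(0, 1, 0, 1, 0, 0, 1, 1), nrow = 4, ncol = 2)

# Label each row by its pair of binary responses
#   (joint.labels is a hypothetical name used only in this sketch).
joint.labels <- paste0("p_", ordering[, 1], ordering[, 2])
dimnames(ordering) <- list(joint.labels, c("Disease1", "Disease2"))

# An illustrative joint probability vector in the default order.
p.vec <- c(0.90, 0.04, 0.04, 0.02)
names(p.vec) <- joint.labels

# The joint probabilities must sum to 1.
stopifnot(isTRUE(all.equal(sum(p.vec), 1)))

# Marginal prevalence of each disease: add up the joint probabilities
#   of the outcomes in which that disease is positive.
marginal.prev <- colSums(ordering * p.vec)

print(ordering)
print(marginal.prev)
```

If a different row order is supplied through `ordering`, the entries of `p.vec` would need to follow that same order.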

Details

This function computes the operating characteristics for standard group testing algorithms with a multiplex assay that tests for two diseases. Calculations for hierarchical group testing algorithms are performed as described in Bilder et al. (2019) and calculations for array-based group testing algorithms are performed as described in Hou et al. (2020).

Available algorithms include two-, three-, four-, and five-stage hierarchical testing and array testing with and without master pooling. Both non-informative and informative group testing settings are allowed for hierarchical algorithms. Only non-informative group testing settings are allowed for array testing algorithms. Operating characteristics calculated are expected number of tests, pooling sensitivity, pooling specificity, pooling positive predictive value, and pooling negative predictive value for each individual.
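
To make the expected number of tests concrete, here is a minimal sketch for non-informative two-stage hierarchical testing under the simplifying assumptions of a perfect assay (sensitivity and specificity equal to 1 for both diseases) and independent individuals. This is not the calculation performed by `opChar2`, which accounts for testing error; the function name `dorfman.et.perfect` is hypothetical.

```r
# Hypothetical helper: expected number of tests for two-stage
#   hierarchical testing with a two-disease multiplex assay,
#   ASSUMING a perfect assay and independent individuals.
dorfman.et.perfect <- function(p00, I) {
  # The group tests negative for both diseases only if every
  #   individual is negative for both diseases.
  prob.group.neg <- p00^I
  # One group test, plus I individual retests whenever the group
  #   is positive for at least one disease.
  ET <- 1 + I * (1 - prob.group.neg)
  list(ET = ET, value = ET / I)
}

# Group of 24 individuals, each with probability 0.90 of being
#   negative for both diseases.
res <- dorfman.et.perfect(p00 = 0.90, I = 24)
print(res)
```

The `value` element mirrors the per-individual quantity described under 'Value': the expected number of tests divided by the number of individuals.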

For informative algorithms where the alpha argument is specified, a heterogeneous matrix of joint probabilities for each individual is generated using the Dirichlet distribution. This is done using rBeta2009::rdirichlet and requires the user to set a seed to reproduce results. See Bilder et al. (2019) for additional details on the use of the Dirichlet distribution for this purpose.
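
The generation step can be sketched in base R as follows. The package uses rBeta2009::rdirichlet; the helper below, `rdirichlet.sketch`, is a hypothetical stand-in that uses the standard construction of normalizing independent gamma variates, so its draws will not match those of the package even with the same seed. The ordering step follows the pattern shown in the Examples.

```r
# Hypothetical stand-in for rBeta2009::rdirichlet: draw n Dirichlet
#   vectors by normalizing independent gamma variates.
rdirichlet.sketch <- function(n, shape) {
  k <- length(shape)
  # Column j of g holds n gamma draws with shape parameter shape[j].
  g <- matrix(rgamma(n * k, shape = rep(shape, each = n)),
              nrow = n, ncol = k)
  # Divide each row by its sum so that each row sums to 1.
  g / rowSums(g)
}

# Setting a seed makes the generated matrix reproducible.
set.seed(8791)
alpha <- c(18.25, 0.75, 0.75, 0.25)

# Transpose so that rows are the four joint probabilities and
#   columns are the individuals.
p.unordered <- t(rdirichlet.sketch(n = 10, shape = alpha))

# Order individuals from lowest to highest probability of being
#   positive for at least one disease (1 - p_00).
p.ordered <- p.unordered[, order(1 - p.unordered[1, ])]
print(round(p.ordered, 4))
```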

The sensitivity/specificity values are allowed to vary across stages of testing. For hierarchical testing, a different sensitivity/specificity value may be used for each stage of testing. For array testing, a different sensitivity/specificity value may be used for master pool testing (if included), row/column testing, and individual testing. The values must be specified in the order of the testing performed. For example, values are specified as (stage 1, stage 2, stage 3) for three-stage hierarchical testing or (master pool testing, row/column testing, individual testing) for array testing with master pooling. Alternatively, a vector of 2 sensitivity/specificity values may be specified, in which case the values for each disease are assumed to be equal across all stages of testing: the first value in the vector is used at each stage of testing for the first disease, and the second value is used at each stage of testing for the second disease.
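
The two ways of specifying sensitivity can be compared with a short base-R sketch. It builds the 2 x S matrix that a length-2 vector corresponds to according to the description above; how the package performs this expansion internally is not shown here, and the object names and numeric values are illustrative assumptions.

```r
# Number of stages (e.g., three-stage hierarchical testing, or
#   array testing with master pooling).
S <- 3

# Length-2 vector: one sensitivity per disease, constant over stages.
Se.vec <- c(0.95, 0.99)

# Equivalent matrix form: rows are diseases, columns are stages.
# matrix() fills column by column, so each column receives Se.vec.
Se.mat <- matrix(data = rep(Se.vec, times = S), nrow = 2, ncol = S,
                 dimnames = list(Infection = 1:2, Stage = 1:S))
print(Se.mat)

# Stage-varying sensitivities, listed in the order of testing
#   (column 1 = first stage or master pool, and so on).
Se.vary <- matrix(data = c(0.95, 0.95, 0.98, 0.98, 0.99, 0.99),
                  nrow = 2, ncol = S,
                  dimnames = list(Infection = 1:2, Stage = 1:S))
print(Se.vary)
```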

The matrix specified by hier.config defines the hierarchical group testing algorithm for $I$ individuals. The rows of the matrix correspond to the stages $s=1,...,S$ in the testing algorithm, and the columns correspond to individuals $i=1,...,I$. The cell values within the matrix represent the group number of individual $i$ at stage $s$. For three-stage, four-stage, five-stage, and non-informative two-stage hierarchical testing, the first row of the matrix consists of all ones. This indicates that all individuals in the algorithm are tested together in a single group in the first stage of testing. For informative two-stage hierarchical testing, the initial group (block) is not tested. Thus, the first row of the matrix consists of the group numbers for each individual in the first stage of testing. For all hierarchical algorithms, the final row of the matrix denotes individual testing. Individuals who are not tested in a particular stage are represented by "NA" (e.g., an individual tested in a group of size 1 in the second stage of testing would not be tested again in a third stage of testing). It is important to note that this matrix represents the testing that could be performed if each group tests positive at each stage prior to the last. For more details on this matrix (called a group membership matrix), see Bilder et al. (2019).
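
A group membership matrix can be inspected with base R before it is passed to the function. The sketch below uses a three-stage configuration for 10 individuals and tabulates the group sizes at each stage; the object names `stage.sizes`, `all.one.group`, and `untested.final` are illustrative and not part of the package.

```r
# Three-stage configuration for 10 individuals:
#   stage 1: one group of 10;
#   stage 2: groups of sizes 5, 4, and 1;
#   stage 3: individual testing (individual 10 is not retested).
config.mat <- matrix(data = c(rep(1, 10), rep(1, 5),
                              rep(2, 4), 3, 1:9, NA),
                     nrow = 3, ncol = 10, byrow = TRUE)

# Group sizes at each stage; table() ignores NA entries by default.
stage.sizes <- lapply(seq_len(nrow(config.mat)),
                      function(s) as.vector(table(config.mat[s, ])))
names(stage.sizes) <- paste0("Stage", seq_len(nrow(config.mat)))
print(stage.sizes)

# The first row is all ones: everyone starts in a single group.
all.one.group <- all(config.mat[1, ] == 1)

# Individuals marked NA in the final row are not tested at that stage.
untested.final <- which(is.na(config.mat[nrow(config.mat), ]))
print(untested.final)
```

Here individual 10 forms a group of size 1 in the second stage, so the third-stage entry is NA, as described above.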

Value

A list containing:

`algorithm`	the group testing algorithm used for calculations.
`prob.vec`	the vector of joint probabilities provided by the user, if applicable (for non-informative algorithms only).
`joint.p`	the matrix of joint probabilities for each individual provided by the user, if applicable.
`alpha.vec`	the alpha vector provided by the user, if applicable (for informative algorithms only).
`Se`	the matrix of sensitivity values for each disease at each stage of testing.
`Sp`	the matrix of specificity values for each disease at each stage of testing.
`Config`	a list specifying elements of the specified testing configuration, which may include: `Stage1`, the group size for the first stage of hierarchical testing, if applicable; `Stage2`, the group sizes for the second stage of hierarchical testing, if applicable; `Stage3`, the group sizes for the third stage of hierarchical testing, if applicable; `Stage4`, the group sizes for the fourth stage of hierarchical testing, if applicable; `Block.sz`, the block size/initial group size for informative Dorfman testing, which is not tested; `pool.szs`, the group sizes for the first stage of testing for informative Dorfman testing; `Array.dim`, the row/column size for array testing; and `Array.sz`, the overall array size for array testing (the square of the row/column size).
`p.mat`	the matrix of joint probabilities for each individual in the algorithm. Each row corresponds to one of the four joint probabilities. Each column corresponds to an individual in the testing algorithm.
`ET`	the expected testing expenditure for the specified testing configuration.
`value`	the value of the expected number of tests per individual.
`Accuracy`	a list containing: `Disease 1 Individual`: a matrix of accuracy measures, pertaining to the first disease, for each individual specified in `a`. The rows correspond to each unique set of accuracy measures in the algorithm. Individuals with the same set of accuracy measures are displayed together in a single row of the matrix. The columns correspond to the pooling sensitivity, pooling specificity, pooling positive predictive value, pooling negative predictive value, and the indices for the individuals in each row of the matrix. Individual accuracy measures are not displayed for array testing algorithms. `Disease 2 Individual`: a matrix of accuracy measures, pertaining to the second disease, for each individual specified in `a`. The rows correspond to each unique set of accuracy measures in the algorithm. Individuals with the same set of accuracy measures are displayed together in a single row of the matrix. The columns correspond to the pooling sensitivity, pooling specificity, pooling positive predictive value, pooling negative predictive value, and the indices for the individuals in each row of the matrix. Individual accuracy measures are not displayed for array testing algorithms. `Overall`: a matrix of overall accuracy measures for the algorithm. The rows correspond to each disease. The columns correspond to the pooling sensitivity, pooling specificity, pooling positive predictive value, and pooling negative predictive value for the overall algorithm. Further details are given under 'Details'.

Note

Author(s)

This function was written by Brianna D. Hitt. It calls ET.all.stages.new and PSePSpAllStages, which were originally written by Christopher Bilder for Bilder et al. (2019), and ARRAY, which was originally written by Peijie Hou for Hou et al. (2020). The functions ET.all.stages.new, PSePSpAllStages, and ARRAY were obtained from http://chrisbilder.com/grouptesting/. Minor modifications were made to the functions for inclusion in the binGroup2 package.

References

Altman, D., Bland, J. (1994). “Diagnostic tests 1: Sensitivity and specificity.” BMJ, 308, 1552.

Altman, D., Bland, J. (1994). “Diagnostic tests 2: Predictive values.” BMJ, 309, 102.

Bilder, C., Tebbs, J., McMahan, C. (2019). “Informative group testing for multiplex assays.” Biometrics, 75, 278–288.

Hitt, B., Bilder, C., Tebbs, J., McMahan, C. (2019). “The objective function controversy for group testing: Much ado about nothing?” Statistics in Medicine, 38, 4912–4923.

Hou, P., Tebbs, J., Wang, D., McMahan, C., Bilder, C. (2020). “Array testing with multiplex assays.” Biostatistics, 21, 417–431.

McMahan, C., Tebbs, J., Bilder, C. (2012a). “Informative Dorfman Screening.” Biometrics, 68, 287–296.

Examples

# Calculate the operating characteristics for
#   non-informative two-stage hierarchical
#   (Dorfman) testing.
config.mat <- matrix(data = c(rep(1, 24), 1:24),
                     nrow = 2, ncol = 24, byrow = TRUE)
Se <- matrix(data = c(0.95, 0.95, 0.95, 0.95),
             nrow = 2, ncol = 2,
             dimnames = list(Infection = 1:2, Stage = 1:2))
Sp <- matrix(data = c(0.99, 0.99, 0.99, 0.99),
             nrow = 2, ncol = 2,
             dimnames = list(Infection = 1:2, Stage = 1:2))
opChar2(algorithm = "D2", p.vec = c(0.90, 0.04, 0.04, 0.02),
        Se = Se, Sp = Sp, hier.config = config.mat, print.time = FALSE)

# Calculate the operating characteristics for informative
#   two-stage hierarchical (Dorfman) testing.
# A matrix of joint probabilities for each individual is
#   generated using the Dirichlet distribution.
config.mat <- matrix(data = c(rep(1, 5), rep(2, 4), 3, 1:9, NA),
                     nrow = 2, ncol = 10, byrow = TRUE)
Se <- matrix(data = c(0.95, 0.95, 0.99, 0.99),
             nrow = 2, ncol = 2,
             dimnames = list(Infection = 1:2, Stage = 1:2))
Sp <- matrix(data = c(0.96, 0.96, 0.98, 0.98),
             nrow = 2, ncol = 2,
             dimnames = list(Infection = 1:2, Stage = 1:2))
set.seed(8791)
opChar2(algorithm = "ID2", alpha = c(18.25, 0.75, 0.75, 0.25),
        Se = Se, Sp = Sp, hier.config = config.mat)
# Equivalent code using a heterogeneous matrix of joint
#   probabilities for each individual
set.seed(8791)
p.unordered <- t(rBeta2009::rdirichlet(n = 10,
                            shape = c(18.25, 0.75, 0.75, 0.25)))
p.ordered <- p.unordered[, order(1 - p.unordered[1,])]
opChar2(algorithm = "ID2", probabilities = p.ordered,
        Se = Se, Sp = Sp, hier.config = config.mat)

# Calculate the operating characteristics for
#   non-informative three-stage hierarchical testing.
config.mat <- matrix(data = c(rep(1, 10), rep(1, 5),
                              rep(2, 4), 3, 1:9, NA),
                     nrow = 3, ncol = 10, byrow = TRUE)
Se <- matrix(data = rep(0.95, 6), nrow = 2, ncol = 3,
             dimnames = list(Infection = 1:2, Stage = 1:3))
Sp <- matrix(data = rep(0.99, 6), nrow = 2, ncol = 3,
             dimnames = list(Infection = 1:2, Stage = 1:3))
opChar2(algorithm = "D3", p.vec = c(0.95, 0.02, 0.02, 0.01),
        Se = Se, Sp = Sp, hier.config = config.mat)
opChar2(algorithm = "D3", p.vec = c(0.95, 0.02, 0.02, 0.01),
        Se = Se, Sp = Sp, hier.config = config.mat,
        a = c(1, 6, 10))

# Calculate the operating characteristics for informative
#   three-stage hierarchical testing.
# A matrix of joint probabilities for each individual is
#   generated using the Dirichlet distribution.
config.mat <- matrix(data = c(rep(1, 15),
                              rep(c(1, 2, 3), each = 5), 1:15),
                     nrow = 3, ncol = 15, byrow = TRUE)
Se <- matrix(data = rep(0.95, 6), nrow = 2, ncol = 3,
             dimnames = list(Infection = 1:2, Stage = 1:3))
Sp <- matrix(data = rep(0.99, 6), nrow = 2, ncol = 3,
             dimnames = list(Infection = 1:2, Stage = 1:3))
opChar2(algorithm = "ID3", alpha = c(18.25, 0.75, 0.75, 0.25),
        Se = Se, Sp = Sp, hier.config = config.mat)

# Calculate the operating characteristics for
#   non-informative four-stage hierarchical testing.
config.mat <- matrix(data = c(rep(1, 12), rep(1, 6), rep(2, 6),
                              rep(1, 4), rep(2, 2), rep(3, 3),
                              rep(4, 3), 1:12),
                     nrow = 4, ncol = 12, byrow = TRUE)
Se <- matrix(data = rep(0.95, 8), nrow = 2, ncol = 4,
             dimnames = list(Infection = 1:2, Stage = 1:4))
Sp <- matrix(data = rep(0.99, 8), nrow = 2, ncol = 4,
             dimnames = list(Infection = 1:2, Stage = 1:4))
opChar2(algorithm = "D4", p.vec = c(0.92, 0.05, 0.02, 0.01),
        Se = Se, Sp = Sp, hier.config = config.mat)

# Calculate the operating characteristics for informative
#   five-stage hierarchical testing.
# A matrix of joint probabilities for each individual is
#   generated using the Dirichlet distribution.
config.mat <- matrix(data = c(rep(1, 20), rep(1, 10), rep(2, 10),
                              rep(c(1, 2, 3, 4), each = 5),
                              rep(1, 3), rep(2, 2), rep(3, 3),
                              rep(4, 2), rep(5, 3), rep(6, 2),
                              rep(7, 3), rep(8, 2), 1:20),
                     nrow = 5, ncol = 20, byrow = TRUE)
Se <- matrix(data = rep(0.95, 10), nrow = 2, ncol = 5,
             dimnames = list(Infection = 1:2, Stage = 1:5))
Sp <- matrix(data = rep(0.99, 10), nrow = 2, ncol = 5,
             dimnames = list(Infection = 1:2, Stage = 1:5))
opChar2(algorithm = "ID5", alpha = c(18.25, 0.75, 0.75, 0.25),
        Se = Se, Sp = Sp, hier.config = config.mat)

# Calculate the operating characteristics for
#   non-informative array testing without master pooling.
Se <- matrix(data = rep(0.95, 4), nrow = 2, ncol = 2,
             dimnames = list(Infection = 1:2, Stage = 1:2))
Sp <- matrix(data = rep(0.99, 4), nrow = 2, ncol = 2,
             dimnames = list(Infection = 1:2, Stage = 1:2))
opChar2(algorithm = "A2", p.vec = c(0.90, 0.04, 0.04, 0.02),
        Se = Se, Sp = Sp, rowcol.sz = 12)

# Calculate the operating characteristics for
#   non-informative array testing with master pooling.
Se <- matrix(data = rep(0.95, 6), nrow = 2, ncol = 3,
             dimnames = list(Infection = 1:2, Stage = 1:3))
Sp <- matrix(data = rep(0.99, 6), nrow = 2, ncol = 3,
             dimnames = list(Infection = 1:2, Stage = 1:3))
opChar2(algorithm = "A2M", p.vec = c(0.90, 0.04, 0.04, 0.02),
        Se = Se, Sp = Sp, rowcol.sz = 10)

Find the optimal testing configuration for group testing algorithms that use a single-disease assay

Description

Find the optimal testing configuration (OTC) using non-informative and informative hierarchical and array-based group testing algorithms. Single-disease assays are used at each stage of the algorithms.

Usage

OTC1(
  algorithm,
  p = NULL,
  probabilities = NULL,
  Se = 0.99,
  Sp = 0.99,
  group.sz,
  obj.fn = "ET",
  weights = NULL,
  alpha = 2,
  trace = TRUE,
  print.time = TRUE,
  ...
)

Arguments

`algorithm`	character string defining the group testing algorithm to be used. Non-informative testing options include two-stage hierarchical ("`D2`"), three-stage hierarchical ("`D3`"), square array testing without master pooling ("`A2`"), and square array testing with master pooling ("`A2M`"). Informative testing options include two-stage hierarchical ("`ID2`"), three-stage hierarchical ("`ID3`"), and square array testing without master pooling ("`IA2`").
`p`	overall probability of disease that will be used to generate a vector/matrix of individual probabilities. For non-informative algorithms, a homogeneous set of probabilities will be used. For informative algorithms, the `expectOrderBeta` function will be used to generate a heterogeneous set of probabilities. Further details are given under 'Details'. Either `p` or `probabilities` should be specified, but not both.
`probabilities`	a vector of individual probabilities, which is homogeneous for non-informative testing algorithms and heterogeneous for informative testing algorithms. Either `p` or `probabilities` should be specified, but not both.
`Se`	a vector of sensitivity values, where one value is given for each stage of testing (in order). If a single value is provided, sensitivity values are assumed to be equal to this value for all stages of testing. Further details are given under 'Details'.
`Sp`	a vector of specificity values, where one value is given for each stage of testing (in order). If a single value is provided, specificity values are assumed to be equal to this value for all stages of testing. Further details are given under 'Details'.
`group.sz`	a single group size or range of group sizes for which to calculate operating characteristics and/or find the OTC. The details of group size specification are given under 'Details'.
`obj.fn`	a list of objective functions which are minimized to find the OTC. The expected number of tests per individual, "`ET`", will always be calculated. Additional options include "`MAR`" (the expected number of tests divided by the expected number of correct classifications, described in Malinovsky et al. (2016)), and "`GR`" (a linear combination of the expected number of tests, the number of misclassified negatives, and the number of misclassified positives, described in Graff & Roeloffs (1972)). See Hitt et al. (2019) for additional details. The first objective function specified in this list will be used to determine the results for the top configurations. Further details are given under 'Details'.
`weights`	a matrix of up to six sets of weights for the GR function. Each set of weights is specified by a row of the matrix.
`alpha`	a shape parameter for the beta distribution that specifies the degree of heterogeneity for the generated probability vector (for informative testing only).
`trace`	a logical value indicating whether the progress of calculations should be printed for each initial group size provided by the user. The default is `TRUE`.
`print.time`	a logical value indicating whether the length of time for calculations should be printed. The default is `TRUE`.
`...`	arguments to be passed to the `expectOrderBeta` function, which generates a vector of probabilities for informative testing algorithms. Further details are given under 'Details'.

Details

This function finds the OTC for group testing algorithms with an assay that tests for one disease and computes the associated operating characteristics, as described in Hitt et al. (2019).

Available algorithms include two- and three-stage hierarchical testing and array testing with and without master pooling. Both non-informative and informative group testing settings are allowed for each algorithm, except that informative array testing with master pooling is unavailable because this method has not appeared in the group testing literature. The operating characteristics calculated are the expected number of tests and, for each individual, the pooling sensitivity, pooling specificity, pooling positive predictive value, and pooling negative predictive value.

Informative two-stage hierarchical (Dorfman) testing is implemented via the pool-specific optimal Dorfman (PSOD) method described in McMahan et al. (2012a), where the greedy algorithm proposed for PSOD is replaced by considering all possible testing configurations. Informative array testing is implemented via the gradient method (the most efficient array design), where higher-risk individuals are grouped in the left-most columns of the array. For additional details on the gradient arrangement method for informative array testing, see McMahan et al. (2012b).
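
The gradient arrangement can be pictured with a few lines of base R. This is only a sketch, assuming that "left-most columns" means filling the array column by column with the probabilities sorted in decreasing order. The names `gradient_array` and `p_vec` are hypothetical and are not part of the package.

```r
# Hypothetical helper (not a binGroup2 function): arrange individual
# probabilities so that the highest-risk individuals fill the left-most columns.
gradient_array <- function(p_vec, n) {
  stopifnot(length(p_vec) == n^2)
  # matrix() fills column by column by default, so a decreasing sort places
  # the largest probabilities in column 1, the next largest in column 2, etc.
  matrix(sort(p_vec, decreasing = TRUE), nrow = n, ncol = n)
}

# Made-up individual probabilities for a 3 x 3 array
p_vec <- c(0.01, 0.20, 0.03, 0.08, 0.02, 0.15, 0.05, 0.04, 0.10)
p_mat <- gradient_array(p_vec, n = 3)
print(p_mat)
```

In this sketch the first column holds the three largest probabilities, so the column pools that are most likely to test positive are concentrated on the left of the array.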

The value(s) specified by group.sz represent the initial (stage 1) group size for hierarchical testing and the row/column size for array testing. For informative two-stage hierarchical testing, the group.sz specified represents the block size used in the pool-specific optimal Dorfman (PSOD) method, where the initial group (block) is not tested. For more details on informative two-stage hierarchical testing implemented via the PSOD method, see Hitt et al. (2019) and McMahan et al. (2012a).

If a single value is provided for group.sz with array testing or non-informative two-stage hierarchical testing, operating characteristics will be calculated and no optimization will be performed. If a single value is provided for group.sz with three-stage hierarchical testing or informative two-stage hierarchical testing, the OTC will be found over all possible configurations for that initial group size. If a range of group sizes is specified, the OTC will be found over all group sizes in the range.
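
To illustrate what a search over a range of group sizes involves, the following base R sketch evaluates the expected number of tests per individual for non-informative two-stage hierarchical (Dorfman) testing at each candidate size and keeps the minimizer. It is not the package's implementation. The helper `dorfman_et_per_ind` is hypothetical, and `Se` and `Sp` here stand for the stage 1 values only.

```r
# Hypothetical helper: expected number of tests per individual for
# non-informative Dorfman testing with group size I and prevalence p.
# Se and Sp are the sensitivity and specificity of the stage 1 (group) test.
dorfman_et_per_ind <- function(I, p, Se = 1, Sp = 1) {
  p_all_neg <- (1 - p)^I                           # P(all members truly negative)
  p_group_pos <- Se * (1 - p_all_neg) + (1 - Sp) * p_all_neg
  (1 + I * p_group_pos) / I                        # 1 group test + I retests if positive
}

p <- 0.01
sizes <- 3:20                                      # candidate initial group sizes
et <- dorfman_et_per_ind(sizes, p = p)             # perfect assay (Se = Sp = 1)
best_size <- sizes[which.min(et)]

print(data.frame(group.sz = sizes, ET.per.individual = round(et, 4)))
print(best_size)
```

With a perfect assay and p = 0.01, this search selects a group size of 11, the classical Dorfman result. Supplying a single value for `sizes` reduces the sketch to a plain evaluation with nothing to optimize.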

In addition to the OTC, operating characteristics for some of the other configurations corresponding to each initial group size provided by the user will be displayed. These additional configurations are only determined for whichever objective function ("ET", "MAR", or "GR") is specified first in the function call. If "GR" is the objective function listed first, the first set of corresponding weights will be used. For algorithms where there is only one configuration for each initial group size (non-informative two-stage hierarchical and all array testing algorithms), results for each initial group size are provided. For algorithms where there is more than one possible configuration for each initial group size (informative two-stage hierarchical and all three-stage hierarchical algorithms), two sets of configurations are provided: 1) the best configuration for each initial group size, and 2) the top 10 configurations for each initial group size provided by the user. If a single value is provided for group.sz with array testing or non-informative two-stage hierarchical testing, operating characteristics will not be provided for configurations other than that specified by the user. Results are sorted by the value of the objective function per individual, value.
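
The three objective functions can be compared side by side in a small base R sketch for non-informative Dorfman testing. The formulas below are assumptions for the simplest case (the same `Se` and `Sp` at both stages), written from the verbal definitions of "ET", "MAR", and "GR"; the exact expressions used by the package are those in Hitt et al. (2019). The helper `dorfman_objectives` and the weight names `D1` and `D2` are hypothetical.

```r
# Hypothetical helper: ET, MAR, and GR for one group of size I under
# non-informative Dorfman testing, assuming equal Se and Sp at both stages.
dorfman_objectives <- function(I, p, Se, Sp, D1 = 1, D2 = 1) {
  # Expected number of tests: one group test plus I retests if it is positive
  p_all_neg <- (1 - p)^I
  ET <- 1 + I * (Se * (1 - p_all_neg) + (1 - Sp) * p_all_neg)

  # Pooling sensitivity: a true positive must test positive at both stages
  PSe <- Se^2

  # Pooling specificity: a true negative is misclassified only if its group
  # tests positive and its own retest is a false positive
  others_neg <- (1 - p)^(I - 1)
  p_group_pos_given_neg <- Se * (1 - others_neg) + (1 - Sp) * others_neg
  PSp <- 1 - (1 - Sp) * p_group_pos_given_neg

  correct <- I * (PSe * p + PSp * (1 - p))     # expected correct classifications
  mis_neg <- I * (1 - PSp) * (1 - p)           # expected misclassified negatives
  mis_pos <- I * (1 - PSe) * p                 # expected misclassified positives

  c(ET = ET,
    MAR = ET / correct,                        # tests per correct classification
    GR = ET + D1 * mis_neg + D2 * mis_pos)     # weighted linear combination
}

obj <- dorfman_objectives(I = 10, p = 0.01, Se = 0.99, Sp = 0.99, D1 = 1, D2 = 10)
print(round(obj, 4))
```

Because "GR" adds weighted misclassification terms to the expected number of tests, it is never smaller than "ET" for non-negative weights, and a larger `D2` penalizes missed positives more heavily.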

The OTC1 function accepts additional arguments, namely num.sim, to be passed to the expectOrderBeta function, which generates a vector of probabilities for informative group testing algorithms. The num.sim argument specifies the number of simulations from the beta distribution when simulation is used. By default, 10,000 simulations are used.
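
The role of `alpha` and of the number of simulations can be sketched in base R. This is not the `expectOrderBeta` function itself. It is a hypothetical helper, `sim_expected_order_beta`, which assumes the heterogeneous probabilities are expected order statistics of a beta distribution whose mean equals the overall prevalence, approximated purely by simulation.

```r
# Hypothetical helper (not the package's expectOrderBeta): approximate the
# expected order statistics of n draws from a beta distribution with mean p.
sim_expected_order_beta <- function(p, alpha, n, num.sim = 10000) {
  beta_shape <- alpha * (1 - p) / p            # second shape chosen so the mean is p
  draws <- matrix(rbeta(n * num.sim, alpha, beta_shape),
                  nrow = num.sim, ncol = n)
  sorted <- t(apply(draws, 1, sort))           # sort within each simulated group
  colMeans(sorted)                             # average each order statistic
}

set.seed(1)
p_vec <- sim_expected_order_beta(p = 0.05, alpha = 2, n = 10, num.sim = 2000)
print(round(p_vec, 4))
```

The resulting vector is ordered from lowest to highest risk and averages to roughly the overall prevalence. Increasing `num.sim` reduces the simulation noise at the cost of run time.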

Value

A list containing:

`algorithm`	the group testing algorithm used for calculations.
`prob`	the probability of disease or the vector of individual probabilities, as specified by the user.
`alpha`	level of heterogeneity for the generated probability vector (for informative testing only).
`Se`	the vector of sensitivity values for each stage of testing.
`Sp`	the vector of specificity values for each stage of testing.
`opt.ET`, `opt.MAR`, `opt.GR`	a list of results for each objective function specified by the user, containing: `OTC`, a list specifying elements of the optimal testing configuration, which may include `Stage1` (group size for the first stage of hierarchical testing, if applicable), `Stage2` (group sizes for the second stage of hierarchical testing, if applicable), `Block.sz` (the block size/initial group size for informative Dorfman testing, which is not tested), `pool.szs` (group sizes for the first stage of testing for informative Dorfman testing), `Array.dim` (the row/column size for array testing), `Array.sz` (the overall array size for array testing, i.e., the square of the row/column size), `p.vec` (the sorted vector of individual probabilities, if applicable), and `p.mat` (the sorted matrix of individual probabilities in gradient arrangement, if applicable; further details are given under 'Details'); `ET`, the expected testing expenditure to decode all individuals in the algorithm, which includes all individuals in all groups for hierarchical algorithms or in the entire array for array testing; `value`, the value of the objective function per individual; and `Accuracy`, a matrix of overall accuracy measures for the algorithm, whose columns correspond to the pooling sensitivity, pooling specificity, pooling positive predictive value, and pooling negative predictive value for the overall algorithm. Further details are given under 'Details'.
`Configs`	a data frame containing results for the best configuration for each initial group size provided by the user. The columns correspond to the initial group size, configuration (if applicable), overall array size (if applicable), expected number of tests, value of the objective function per individual, pooling sensitivity, pooling specificity, pooling positive predictive value, and pooling negative predictive value. No results are displayed if a single `group.sz` is provided. Further details are given under 'Details'.
`Top.Configs`	a data frame containing results for the top overall configurations across all initial group sizes provided by the user. The columns correspond to the initial group size, configuration, expected number of tests, value of the objective function per individual, pooling sensitivity, pooling specificity, pooling positive predictive value, and pooling negative predictive value. No results are displayed for non-informative two-stage hierarchical testing or for array testing algorithms. Further details are given under 'Details'.
`group.sz`	the initial group (or block) sizes examined to find the OTC.
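
The components listed above can be inspected as ordinary list elements. The sketch below assumes that the binGroup2 package is installed and that "D2" is the `algorithm` code for non-informative two-stage hierarchical testing; the prevalence, accuracy values, and group size range are arbitrary illustrative choices.

```r
# Runs only if binGroup2 is available; otherwise nothing is executed.
if (requireNamespace("binGroup2", quietly = TRUE)) {
  res <- binGroup2::OTC1(algorithm = "D2", p = 0.01, Se = 0.99, Sp = 0.99,
                         group.sz = 3:20, obj.fn = c("ET", "MAR"),
                         trace = FALSE, print.time = FALSE)

  str(res$opt.ET$OTC)          # elements of the optimal testing configuration
  print(res$opt.ET$value)      # objective function value per individual
  print(res$opt.ET$Accuracy)   # overall accuracy measures
  print(head(res$Configs))     # best configuration for each initial group size
}
```

Because "ET" is listed first in `obj.fn`, the `Configs` data frame in this call is determined by the expected number of tests.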