Package 'fugue'

Title: Sensitivity Analysis Optimized for Matched Sets of Varied Sizes
Description: As in music, a fugue statistic repeats a theme in small variations. Here, the psi-function that defines an m-statistic is slightly altered to maintain the same design sensitivity in matched sets of different sizes. The main functions in the package are sen() and senCI(). For sensitivity analyses for m-statistics, see Rosenbaum (2007) Biometrics 63 456-464 <doi:10.1111/j.1541-0420.2006.00717.x>.
Authors: Xinran Li and Paul R. Rosenbaum
Maintainer: Paul R. Rosenbaum <[email protected]>
License: GPL-2
Version: 0.1.7
Built: 2024-11-03 06:42:14 UTC
Source: CRAN


Amplification of sensitivity analysis in observational studies.

Description

Uses the method in Rosenbaum and Silber (2009) to interpret a value of the sensitivity parameter gamma. Each value of gamma amplifies to a curve (lambda,delta) in a two-dimensional sensitivity analysis, the inference being the same for all points on the curve. That is, a one-dimensional sensitivity analysis in terms of gamma has a two-dimensional interpretation in terms of (lambda,delta).

Usage

amplify(gamma, lambda)

Arguments

gamma

gamma > 1 is the value of the sensitivity parameter, for instance the parameter in senmv. length(gamma)>1 will generate an error.

lambda

lambda is a vector of values > gamma. An error will result unless lambda[i] > gamma > 1 for every i.

Details

A single value of gamma, say gamma = 2.2, corresponds to a curve of values of (lambda, delta), including (3, 7), (4, 4.33), (5, 3.57), and (7, 3). An unobserved covariate that is associated with a lambda = 3 fold increase in the odds of treatment and a delta = 7 fold increase in the odds of a positive pair difference is equivalent to gamma = 2.2.

The curve is gamma = (lambda*delta + 1)/(lambda+delta). Amplify is given one gamma and a vector of lambdas and solves for the vector of deltas. The calculation is elementary.
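Rearranging the curve for delta gives delta = (gamma*lambda - 1)/(lambda - gamma). The base R sketch below applies that formula. The name amplify_sketch is hypothetical and used only for illustration; it is not the package's own implementation.

```r
# Hypothetical helper: solve gamma = (lambda*delta + 1)/(lambda + delta) for delta.
amplify_sketch <- function(gamma, lambda) {
  # Same constraints as stated in the Arguments section
  stopifnot(length(gamma) == 1, gamma > 1, all(lambda > gamma))
  delta <- (gamma * lambda - 1) / (lambda - gamma)
  names(delta) <- lambda
  delta
}

# Points on the curve for gamma = 2.2
round(amplify_sketch(2.2, c(3, 4, 5, 7)), 2)
```

As lambda approaches gamma from above, the denominator goes to zero and delta grows without bound, which is the asymptote (gamma, Inf).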

This interpretation of gamma is developed in detail in Rosenbaum and Silber (2009), and it makes use of Wolfe's (1974) family of semiparametric deformations of an arbitrary symmetric distribution.

Strictly speaking, the amplification describes matched pairs, not matched sets. The senm function views a k-to-1 matched set with k controls matched to one treated individual as a collection of k correlated treated-minus-control matched pair differences; see Rosenbaum (2007). For matched sets, it is natural to think of the amplification as describing any one of the k matched pair differences in a k-to-1 matched set.

The curve has asymptotes that the function amplify does not compute: gamma corresponds with (lambda,delta) = (gamma, Inf) and (Inf, gamma).

A related though distinct idea is developed in Gastwirth et al. (1998). The two approaches agree when the outcome is binary, that is, for McNemar's test.

Value

Returns a vector of values of delta of length(lambda) with names lambda.

Note

The amplify function is also in the sensitivitymv package where a different example is used.

Author(s)

Paul R. Rosenbaum

References

Gastwirth, J. L., Krieger, A. M., Rosenbaum, P. R. (1998) Dual and simultaneous sensitivity analysis for matched pairs. Biometrika, 85, 907-920.

Rosenbaum, P. R. and Silber, J. H. (2009) Amplification of sensitivity analysis in observational studies. Journal of the American Statistical Association, 104, 1398-1405. <doi:10.1198/jasa.2009.tm08470>

Rosenbaum, P. R. (2015). Two R packages for sensitivity analysis in observational studies. Observational Studies, v. 1. (Free on-line.)

Wolfe, D. A. (1974) A characterization of population weighted symmetry and related results. Journal of the American Statistical Association, 69, 819-822.

Examples

attach(nh1and3)
sen(homocysteine,z,mset,gamma=1.9)
amplify(1.9,c(3,3.5,4))
detach(nh1and3)

Computes M-scores for M-tests and estimates.

Description

Of very limited interest to most users, the function mscoreInternal() computes the M-scores used by the function sen().

Usage

mscoreInternal(ymat, inner, trim)

Arguments

ymat

A matrix of outcomes scaled for use in an M-statistic; see the discussion of the parameter lambda in the documentation for the sen function. If the largest matched set has K controls, and there are I matched sets, then ymat is an I x (K+1) matrix. Each row is a matched set. The first column contains the treated individual in the matched set. The remaining columns contain the controls. If a set has fewer than K controls, then its last columns are NAs.

inner

inner is the inner[i] parameter described in the documentation for sen().

trim

trim is the trim[i] parameter described in the documentation for sen().
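The layout of ymat described above can be illustrated in base R. The sketch below only arranges raw outcomes into the I x (K+1) shape, with the treated individual first and NA padding. It does not perform the scaling mentioned for ymat. The helper name make_ymat and the toy data are hypothetical.

```r
# Hypothetical helper: arrange outcomes into an I x (K+1) matrix,
# one row per matched set, treated individual in column 1, NA padding.
make_ymat <- function(y, z, mset) {
  sets <- unique(mset)
  K <- max(table(mset)) - 1            # largest number of controls in a set
  ymat <- matrix(NA_real_, nrow = length(sets), ncol = K + 1)
  for (i in seq_along(sets)) {
    in_set <- mset == sets[i]
    controls <- y[in_set & z == 0]
    ymat[i, 1] <- y[in_set & z == 1]   # assumes exactly one treated per set
    ymat[i, 1 + seq_along(controls)] <- controls
  }
  ymat
}

# Toy data: one matched pair and one set with three controls
y    <- c(5, 3, 9, 4, 6, 2)
z    <- c(1, 0, 1, 0, 0, 0)
mset <- c(1, 1, 2, 2, 2, 2)
make_ymat(y, z, mset)
```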

Value

Generally, a matrix with the same dimensions as ymat containing the M-scores.

Author(s)

Xinran Li and Paul R. Rosenbaum

References

Rosenbaum, P. R. (2015). Two R packages for sensitivity analysis in observational studies. Observational Studies, v. 1. (Free on-line.)


Smoking Matched Sets with 1 or 3 Controls

Description

Data from NHANES 2005-2006 concerning homocysteine levels in daily smokers (z=1) and never smokers (z=0), aged 20 and older.

Usage

data("nh1and3")

Format

A data frame with 1370 observations consisting of 353 matched pairs and 166 matched sets with 3 controls.

SEQN

NHANES ID number

z

=1 for a daily smoker, =0 for a never smoker

mset

Matched set indicator, for 519 sets, 1, 2, ..., 519

homocysteine

Blood homocysteine level

cigsperday30

Cigarettes smoked per day

cotinine

Cotinine is a biomarker for exposure to nicotine

female

=1 for female, =0 for male

age

Age in years

black

=1 for black, =0 for other

education

NHANES 1-5 score. 3 is a high school degree.

povertyr

Ratio of family income to the poverty level, capped at 5.
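The counts in the format description can be checked with simple arithmetic: 353 pairs contribute 2 individuals each and 166 sets contribute 4 each. The sketch below builds a mock set indicator with that structure. The vector mset_mock and the ordering of its sets are assumptions made only for illustration; it is not the real data.

```r
# Mock matched-set indicator: 353 pairs followed by 166 sets of size 4.
# The ordering is an assumption for this illustration only.
mset_mock <- c(rep(1:353, each = 2), rep(354:519, each = 4))

length(mset_mock)         # total number of individuals
table(table(mset_mock))   # number of sets of each size
```

Applying table(table(...)) to the real mset column tabulates set sizes in the same way.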

Details

Data from NHANES 2005-2006 concerning homocysteine levels in daily smokers (z=1) and never smokers (z=0), aged 20 and older. Daily smokers smoked every day for the last 30 days, smoking an average of at least 10 cigarettes per day. Never smokers smoked fewer than 100 cigarettes in their lives, do not smoke now, and had no tobacco use in the previous 5 days.

Source

NHANES 2005-2006

References

Bazzano, L. A., He, J., Muntner, P., Vupputuri, S. and Whelton, P. K. (2003) Relationship between cigarette smoking and novel risk factors for cardiovascular disease in the United States. Annals of Internal Medicine, 138, 891-897.

Pimentel, S. D., Small, D. S. and Rosenbaum, P. R. (2016) Constructed second control groups and attenuation of unmeasured biases. Journal of the American Statistical Association, 111, 1157-1167. <doi:10.1080/01621459.2015.1076342>

Examples

data(nh1and3)
attach(nh1and3)
table(table(nh1and3$mset))
# Number of individuals in each observation's matched set:
# 2 for a matched pair, 4 for a set with 3 controls
setsize<-ave(z,mset,FUN=length)
par(mfrow=c(1,2))
boxplot(homocysteine[setsize==2]~z[setsize==2],ylim=c(0,70),main="1-1 match",
  ylab="homocysteine",names=c("Control","Smoker"))
boxplot(homocysteine[setsize==4]~z[setsize==4],ylim=c(0,70),
  main="1-3 match",ylab="homocysteine",names=c("Control","Smoker"))
detach(nh1and3)

Sensitivity Analysis for a Matched Comparison in an Observational Study.

Description

Each matched set contains one treated individual and one or more controls. Uses Huber's M-statistic as the basis for the test, for instance, a mean. Matched sets of different sizes use different ψ-functions, creating what is called a fugue statistic. Performs either a randomization test (Γ = 1) or an analysis of sensitivity to departures from random assignment (Γ > 1). For confidence intervals, use function senCI(). The method is described in Li and Rosenbaum (2019); see also Rosenbaum (2007, 2013).

Usage

sen(y, z, mset, gamma = 1, inner = NULL, trim = NULL, lambda = 1/2,
     tau = 0, alternative = "greater")

Arguments

y

A vector of responses with no missing data.

z

Treatment indicator, z=1 for treated, z=0 for control with length(z)==length(y).

mset

Matched set indicator, 1, 2, ..., sum(z) with length(mset)==length(y). Matched set indicators should be either integers or a factor.

gamma

gamma is the sensitivity parameter Γ, where Γ ≥ 1. Setting Γ = 1 is equivalent to assuming ignorable treatment assignment given the matched sets, and it performs a within-set randomization test.

inner

inner and trim together define the ψ-function for the M-statistic. If the largest matched set has k controls, then inner is either a scalar or a vector with k=length(inner). If inner is a scalar, then the same value of inner is used, regardless of the number of controls. Otherwise, inner[1] is used with one control, inner[2] is used with two controls, etc. If inner is NULL, default values of inner=c(.8,.8,.6,.4,0,0,0,...,0) are used.

trim

inner and trim together define the ψ-function for the M-statistic. If the largest matched set has k controls, then trim is either a scalar or a vector with k=length(trim). If trim is a scalar, then the same value of trim is used, regardless of the number of controls. Otherwise, trim[1] is used with one control, trim[2] is used with two controls, etc. If trim is NULL, default values of trim=c(3,3,...,3) are used. For each i, 0 <= inner[i] < trim[i] < Inf.

lambda

Before applying the ψ-function to treated-minus-control differences, the differences are scaled by dividing by the lambda quantile of all within set absolute differences. Typically, lambda = 1/2 for the median. The value of lambda has no effect if trim=Inf and inner=0. See Maritz (1979) for the paired case and Rosenbaum (2007) for matched sets.

An error will result unless 0 < lambda < 1.

tau

The null hypothesis asserts that the treatment has an additive effect, tau. By default, tau=0, so by default the null hypothesis is Fisher's sharp null hypothesis of no treatment effect.

alternative

If alternative="greater", the null hypothesis of a treatment effect of tau is tested against the alternative of a treatment effect larger than tau. If alternative="less", the null hypothesis of a treatment effect of tau is tested against the alternative of a treatment effect smaller than tau. In particular, alternative="less" is equivalent to: (i) alternative="greater", (ii) y replaced by -y, and (iii) tau replaced by -tau. See the note for discussion of two-sided sensitivity analyses.

Details

The novel element in the fugue package is the automatic use of different ψ-functions for matched sets of different sizes. These ψ-functions have been selected to approximately equate the design sensitivities in sets of unequal sizes when the errors are Normal and the additive effect is half the standard deviation of a matched pair difference; see Li and Rosenbaum (2019). If you disable this automatic feature by manually setting a single value for inner and trim, then the results will agree with senm() in the R package sensitivitymult. For instance, sen() in the fugue package and senm() in the sensitivitymult package will yield the same deviate and P-value in:

data(nh1and3)
attach(nh1and3)
sen(homocysteine,z,mset,inner=0,gamma=1.9)
senm(homocysteine,z,mset,inner=0,trim=3,gamma=1.9)

Note that the sensitivitymult package is intended to implement methods from Rosenbaum (2016, 2019) that are not implemented in the fugue package.
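To make the roles of inner and trim concrete, the sketch below codes one common form of ψ-function with inner trimming, in the spirit of Rosenbaum (2013): zero for scaled differences no larger than inner in absolute value, linear between inner and trim, and constant beyond trim. The exact form and normalization are assumptions here, not taken from the package source, and psi_sketch is a hypothetical name.

```r
# Hypothetical psi-function applied to differences that are already scaled.
# Assumed form: 0 for |d| <= inner, linear up to |d| = trim, constant beyond.
psi_sketch <- function(d, inner = 0, trim = 3) {
  stopifnot(inner >= 0, inner < trim, is.finite(trim))
  sign(d) * pmin(1, pmax(0, abs(d) - inner) / (trim - inner))
}

d <- c(-4, -1.9, -0.5, 0, 0.5, 1.9, 4)   # made-up scaled differences
psi_sketch(d, inner = 0.8, trim = 3)     # inner trimming, as in inner[1] = .8
psi_sketch(d, inner = 0, trim = 3)       # Huber-type psi, no inner trimming
```

With inner > 0, small differences contribute nothing to the statistic, which is how the psi-function can be varied by set size while keeping the same trim.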

For the given Γ, sen() computes the upper bound on the 1-sided P-value testing the null hypothesis of an additive treatment effect tau against the alternative hypothesis of a treatment effect larger than tau. By default, sen() tests the null hypothesis of no treatment effect against the alternative of a positive treatment effect. The P-value is an approximate P-value based on a Normal approximation to the null distribution; see Rosenbaum (2007).

Matched sets of unequal size are weighted using weights that would be efficient in a randomization test under a simple model with additive set and treatment effects and errors with constant variance; see Rosenbaum (2007).

The upper bound on the P-value is based on the separable approximation described in Gastwirth, Krieger and Rosenbaum (2000); see also Rosenbaum (2007, 2018).

Value

pval

The upper bound on the 1-sided P-value.

deviate

The deviate that was compared to the Normal distribution to produce pval.

statistic

The value of the M-statistic.

expectation

The maximum expectation of the M-statistic for the given Γ.

variance

The maximum variance of the M-statistic among treatment assignments that achieve the maximum expectation. Part of the separable approximation.

Note

The function sen() performs 1-sided tests. One approach to a 2-sided, α-level test does both 1-sided tests at level α/2, and rejects the null hypothesis if either 1-sided test rejects. Equivalently, a bound on the two-sided P-value is the smaller of 1 and twice the smaller of the two 1-sided P-values. This approach views a 2-sided test as two 1-sided tests with a Bonferroni correction; see Cox (1977, Section 4.2). In all cases, this approach is a valid large sample test: a true null hypothesis is falsely rejected with probability at most α if the bias in treatment assignment is at most Γ; so, this procedure is entirely safe to use. For a randomization test, Γ = 1, this Bonferroni procedure is not typically conservative. For large Γ, this Bonferroni procedure tends to be somewhat conservative.
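The two-sided bound described in this note is a one-line calculation. The sketch below assumes the two 1-sided P-value bounds have already been obtained, for instance from two calls to sen() with alternative="greater" and alternative="less". The function name and the numeric inputs are hypothetical.

```r
# Hypothetical helper: Bonferroni-style bound on the two-sided P-value
# from the two 1-sided P-value bounds.
two_sided_bound <- function(p_greater, p_less) {
  min(1, 2 * min(p_greater, p_less))
}

two_sided_bound(0.012, 0.995)   # made-up 1-sided bounds
two_sided_bound(0.60, 0.55)     # capped at 1
```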

The examples reproduce some results from Li and Rosenbaum (2019).

Author(s)

Xinran Li and Paul R. Rosenbaum.

References

Cox, D. R. (1977). The role of significance tests (with Discussion). Scand. J. Statist. 4, 49-70.

Huber, P. (1981) Robust Statistics. New York: John Wiley. (M-estimates based on M-statistics.)

Li, X. and Rosenbaum, P. R. (2019) Maintaining high constant design sensitivity in observational studies with matched sets of varying sizes. Manuscript.

Maritz, J. S. (1979). A note on exact robust confidence intervals for location. Biometrika 66 163–166. (Introduces exact permutation tests based on M-statistics by redefining the scaling parameter.)

Rosenbaum, P. R. (2007). Sensitivity analysis for m-estimates, tests and confidence intervals in matched observational studies. Biometrics 63 456-464. (R package sensitivitymv) <doi:10.1111/j.1541-0420.2006.00717.x>

Rosenbaum, P. R. (2013). Impact of multiple matched controls on design sensitivity in observational studies. Biometrics 69 118-127. (Introduces inner trimming.) <doi:10.1111/j.1541-0420.2012.01821.x>

Rosenbaum, P. R. (2014). Weighted M-statistics with superior design sensitivity in matched observational studies with multiple controls. J. Am. Statist. Assoc. 109 1145-1158. (R package sensitivitymw) <doi:10.1080/01621459.2013.879261>

Rosenbaum, P. R. (2015). Two R packages for sensitivity analysis in observational studies. Observational Studies, v. 1. (Free on-line.)

Rosenbaum, P. R. (2016) Using Scheffe projections for multiple outcomes in an observational study of smoking and periodontal disease. Annals of Applied Statistics, 10, 1447-1471. <doi:10.1214/16-AOAS942>

Rosenbaum, P. R. (2018). Sensitivity analysis for stratified comparisons in an observational study of the effect of smoking on homocysteine levels. The Annals of Applied Statistics, 12(4), 2312-2334. <doi:10.1214/18-AOAS1153>

Rosenbaum, P. R. (2019). Combining planned and discovered comparisons in observational studies. Biostatistics, to appear. <doi:10.1093/biostatistics/kxy055>

Examples

# Reproduces results from Table 3 of Li and Rosenbaum (2019)
data(nh1and3)
attach(nh1and3)
sen(homocysteine,z,mset,gamma=1)
sen(homocysteine,z,mset,gamma=1.9)
sen(homocysteine,z,mset,inner=0,gamma=1.9)
amplify(1.9,c(3,3.5,4))
detach(nh1and3)

Sensitivity Analysis for Point Estimates and Confidence Intervals in an Observational Study.

Description

Each matched set contains one treated individual and one or more controls. Uses Huber's M-statistic as the basis for the test; see Maritz (1979). Matched sets of different sizes use different ψ-functions, creating what is called a fugue statistic. Performs either a randomization test (Γ = 1) or an analysis of sensitivity to departures from random assignment (Γ > 1). For hypothesis tests, use function sen(). The method is described in Li and Rosenbaum (2019); see also Rosenbaum (2007, 2013).

Usage

senCI(y, z, mset, gamma = 1, inner = NULL, trim = NULL, lambda = 1/2,
     alpha = 0.05, alternative = "greater")

Arguments

y

A vector of responses with no missing data.

z

Treatment indicator, z=1 for treated, z=0 for control with length(z)==length(y).

mset

Matched set indicator, 1, 2, ..., sum(z) with length(mset)==length(y). Matched set indicators should be either integers or a factor.

gamma

gamma is the sensitivity parameter Γ, where Γ ≥ 1. Setting Γ = 1 is equivalent to assuming ignorable treatment assignment given the matched sets, and it performs a within-set randomization test.

inner

inner and trim together define the ψ-function for the M-statistic. If the largest matched set has k controls, then inner is either a scalar or a vector with k=length(inner). If inner is a scalar, then the same value of inner is used, regardless of the number of controls. Otherwise, inner[1] is used with one control, inner[2] is used with two controls, etc. If inner is NULL, default values of inner=c(.8,.8,.6,.4,0,0,0,...,0) are used.

trim

inner and trim together define the ψ-function for the M-statistic. If the largest matched set has k controls, then trim is either a scalar or a vector with k=length(trim). If trim is a scalar, then the same value of trim is used, regardless of the number of controls. Otherwise, trim[1] is used with one control, trim[2] is used with two controls, etc. If trim is NULL, default values of trim=c(3,3,...,3) are used. For each i, 0 <= inner[i] < trim[i] < Inf.

lambda

Before applying the ψ-function to treated-minus-control differences, the differences are scaled by dividing by the lambda quantile of all within set absolute differences. Typically, lambda = 1/2 for the median. The value of lambda has no effect if trim=Inf and inner=0. See Maritz (1979) for the paired case and Rosenbaum (2007) for matched sets.

An error will result unless 0 < lambda < 1.

alpha

The coverage rate of the confidence interval is 1-α. If the bias in treatment assignment is at most Γ, then the confidence interval covers at rate 1-α.

alternative

If alternative="greater" or alternative="less", then a one-sided confidence interval is returned. If alternative="twosided", a somewhat conservative two-sided confidence interval is returned. See the discussion of two-sided tests in the documentation for sen().

Details

The confidence interval inverts the test provided by sen(). See the documentation for sen() for more information.
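The idea of inverting a test can be sketched with a deliberately simple stand-in. Below, toy_pval is a hypothetical Normal-approximation P-value for made-up paired differences; it plays the role of sen() but is not the M-statistic or the sensitivity bound used by the package. The lower confidence limit is the value of tau at which the 1-sided P-value equals alpha.

```r
# Toy 1-sided P-value for H0: effect = tau against effect > tau.
# A stand-in for sen(); NOT the package's M-statistic.
toy_pval <- function(tau, d) {
  se <- sd(d) / sqrt(length(d))
  1 - pnorm((mean(d) - tau) / se)
}

# Lower confidence limit: the tau at which the P-value crosses alpha.
toy_lower_limit <- function(d, alpha = 0.05) {
  se <- sd(d) / sqrt(length(d))
  uniroot(function(tau) toy_pval(tau, d) - alpha,
          interval = mean(d) + c(-10, 10) * se, tol = 1e-10)$root
}

d <- c(1.2, 0.4, 2.1, -0.3, 1.7, 0.9, 1.1, 0.2)   # made-up differences
toy_lower_limit(d)
```

Values of tau below the limit are rejected at level alpha and values above it are not, so the one-sided interval is the set of tau not rejected.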

The upper bound on the P-value is based on the separable approximation described in Gastwirth, Krieger and Rosenbaum (2000); see also Rosenbaum (2007, 2018).

Value

point.estimates

For Γ > 1, an interval of point estimates is returned. For Γ = 1, the interval is a single point.

confidence.interval

The confidence interval.

Note

The examples reproduce some results from Li and Rosenbaum (2019).

Author(s)

Xinran Li and Paul R. Rosenbaum.

References

Cox, D. R. (1977). The role of significance tests (with Discussion). Scand. J. Statist. 4, 49-70.

Huber, P. (1981) Robust Statistics. New York: John Wiley. (M-estimates based on M-statistics.)

Li, X. and Rosenbaum, P. R. (2019) Maintaining high constant design sensitivity in observational studies with matched sets of varying sizes. Manuscript.

Maritz, J. S. (1979). A note on exact robust confidence intervals for location. Biometrika 66 163–166. (Introduces exact permutation tests based on M-statistics by redefining the scaling parameter.)

Rosenbaum, P. R. (2007). Sensitivity analysis for m-estimates, tests and confidence intervals in matched observational studies. Biometrics 63 456-464. (R package sensitivitymv) <doi:10.1111/j.1541-0420.2006.00717.x>

Rosenbaum, P. R. (2013). Impact of multiple matched controls on design sensitivity in observational studies. Biometrics 69 118-127. (Introduces inner trimming.) <doi:10.1111/j.1541-0420.2012.01821.x>

Rosenbaum, P. R. (2015). Two R packages for sensitivity analysis in observational studies. Observational Studies, v. 1. (Free on-line.)

Rosenbaum, P. R. (2016) Using Scheffe projections for multiple outcomes in an observational study of smoking and periodontal disease. Annals of Applied Statistics, 10, 1447-1471. <doi:10.1214/16-AOAS942>

Rosenbaum, P. R. (2018). Sensitivity analysis for stratified comparisons in an observational study of the effect of smoking on homocysteine levels. The Annals of Applied Statistics, 12(4), 2312-2334. <doi:10.1214/18-AOAS1153>

Examples

## Not run: 
# Reproduces results from Table 3 of Li and Rosenbaum (2019)
data(nh1and3)
attach(nh1and3)
senCI(homocysteine,z,mset,gamma=1)
senCI(homocysteine,z,mset,gamma=1.9)
senCI(homocysteine,z,mset,inner=0,gamma=1.9)
amplify(1.9,c(3,3.5,4))

# Relationships between confidence intervals and P-value bounds
senCI(homocysteine,z,mset,alternative="twosided",gamma=1.75)
sen(homocysteine,z,mset,alternative="less",tau=2.21721733,gamma=1.75)
senCI(homocysteine,z,mset,alternative="less",gamma=1.75)
sen(homocysteine,z,mset,alternative="less",tau=2.159342,gamma=1.75)
detach(nh1and3)

## End(Not run)

Asymptotic separable calculations internal to other functions.

Description

Of limited interest to most users, this general purpose function is internal to other functions in the package. It is the same function as in the sensitivitymv package, version 1.3. The function performs the asymptotic separable calculations described in Gastwirth, Krieger and Rosenbaum (2000) and Rosenbaum (2018), as used in section 4 of Rosenbaum (2007). See the sensitivitymv package for an example.

Usage

separable1v(ymat, gamma = 1)

Arguments

ymat

ymat is a matrix whose rows are matched sets and whose columns are matched individuals. The first column describes treated individuals. Other columns describe controls. If matched sets contain variable numbers of controls, NAs fill in empty spaces in ymat; see the documentation for senmv. In senmv, the matrix ymat is created by mscorev. Instead, if there were no NAs and ranks within rows were used in ymat, then separable1v would perform a sensitivity analysis for the stratified Wilcoxon two-sample test. Applied directly to data, it performs a sensitivity analysis for the permutational t-test.

gamma

gamma is the value of the sensitivity parameter; see the documentation for the senmv function in the sensitivitymv package. One should use a value of gamma >= 1.

Value

pval

Approximate upper bound on the one-sided P-value.

deviate

Deviate that is compared to the upper tail of the standard Normal distribution to obtain the P-value.

statistic

Value of the test statistic.

expectation

Maximum null expectation of the test statistic for the given value of gamma.

variance

Among null distributions that yield the maximum expectation, variance is the maximum possible variance for the given value of gamma. See Rosenbaum (2007, Section 4) and Gastwirth, Krieger and Rosenbaum (2000).
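For orientation, the special case gamma = 1 needs no separable approximation: each individual in a set is equally likely to be the treated one, so the sum of the first column has expectation equal to the sum of the row means and variance equal to the sum of the within-row variances (dividing by the row size). The base R sketch below covers only this case. The helper name is hypothetical, the toy ymat is made up, and the output names simply mirror the list above.

```r
# Hypothetical helper: randomization moments at gamma = 1 only.
# For gamma > 1 the separable approximation is required and is not sketched here.
randomization_moments <- function(ymat) {
  statistic   <- sum(ymat[, 1])                  # treated individuals, column 1
  row_mean    <- rowMeans(ymat, na.rm = TRUE)    # NAs pad smaller sets
  row_meansq  <- rowMeans(ymat^2, na.rm = TRUE)
  expectation <- sum(row_mean)
  variance    <- sum(row_meansq - row_mean^2)
  deviate     <- (statistic - expectation) / sqrt(variance)
  list(pval = 1 - pnorm(deviate), deviate = deviate, statistic = statistic,
       expectation = expectation, variance = variance)
}

# Toy scores: one matched pair and one set with three controls
ymat <- rbind(c(5, 3, NA, NA), c(9, 4, 6, 2))
randomization_moments(ymat)
```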

Author(s)

Paul R. Rosenbaum

References

Gastwirth, J. L., Krieger, A. M., and Rosenbaum, P. R. (2000) Asymptotic separability in sensitivity analysis. Journal of the Royal Statistical Society B, 62, 545-556. <doi:10.1111/1467-9868.00249>

Rosenbaum, P. R. (2007) Sensitivity analysis for m-estimates, tests and confidence intervals in matched observational studies. Biometrics, 63, 456-464. <doi:10.1111/j.1541-0420.2006.00717.x>

Rosenbaum, P. R. (2018). Sensitivity analysis for stratified comparisons in an observational study of the effect of smoking on homocysteine levels. The Annals of Applied Statistics, 12(4), 2312-2334. <doi:10.1214/18-AOAS1153>