Title: | Sensitivity Analysis Optimized for Matched Sets of Varied Sizes |
---|---|
Description: | As in music, a fugue statistic repeats a theme in small variations. Here, the psi-function that defines an m-statistic is slightly altered to maintain the same design sensitivity in matched sets of different sizes. The main functions in the package are sen() and senCI(). For sensitivity analyses for m-statistics, see Rosenbaum (2007) Biometrics 63 456-464 <doi:10.1111/j.1541-0420.2006.00717.x>. |
Authors: | Xinran Li and Paul R. Rosenbaum |
Maintainer: | Paul R. Rosenbaum <[email protected]> |
License: | GPL-2 |
Version: | 0.1.7 |
Built: | 2024-12-03 06:47:53 UTC |
Source: | CRAN |
Uses the method in Rosenbaum and Silber (2009) to interpret a value of the sensitivity parameter gamma. Each value of gamma amplifies to a curve (lambda,delta) in a two-dimensional sensitivity analysis, the inference being the same for all points on the curve. That is, a one-dimensional sensitivity analysis in terms of gamma has a two-dimensional interpretation in terms of (lambda,delta).
amplify(gamma, lambda)
gamma |
gamma > 1 is the value of the sensitivity parameter, for instance the parameter in senmv. length(gamma)>1 will generate an error. |
lambda |
lambda is a vector of values > gamma. An error will result unless lambda[i] > gamma > 1 for every i. |
A single value of gamma, say gamma = 2.2 in the example, corresponds to a curve of values of (lambda, delta), including (3, 7), (4, 4.33), (5, 3.57), and (7, 3) in the example. An unobserved covariate that is associated with a lambda = 3 fold increase in the odds of treatment and a delta = 7 fold increase in the odds of a positive pair difference is equivalent to gamma = 2.2.
The curve is gamma = (lambda*delta + 1)/(lambda+delta). Amplify is given one gamma and a vector of lambdas and solves for the vector of deltas. The calculation is elementary.
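Solving the curve for delta gives delta = (gamma*lambda - 1)/(lambda - gamma). The following is a minimal base R sketch of that algebra, not the package's own source; the name amplify_sketch is hypothetical, and the input checks simply mirror the argument rules stated above.

```r
# Sketch: solve gamma = (lambda*delta + 1)/(lambda + delta) for delta.
# amplify_sketch is a hypothetical name, not the package's amplify().
amplify_sketch <- function(gamma, lambda) {
  stopifnot(length(gamma) == 1, gamma > 1, all(lambda > gamma))
  delta <- (gamma * lambda - 1) / (lambda - gamma)
  names(delta) <- lambda   # label each delta by its lambda
  delta
}

delta <- amplify_sketch(2.2, c(3, 4, 5, 7))
round(delta, 2)   # 7.00 4.33 3.57 3.00, matching the curve quoted above
```

As lambda approaches gamma from above the denominator shrinks to zero and delta grows without bound, which is the asymptote the function does not compute.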
This interpretation of gamma is developed in detail in Rosenbaum and Silber (2009), and it makes use of Wolfe's (1974) family of semiparametric deformations of an arbitrary symmetric distribution.
Strictly speaking, the amplification describes matched pairs, not matched sets. The senm function views a k-to-1 matched set with k controls matched to one treated individual as a collection of k correlated treated-minus-control matched pair differences; see Rosenbaum (2007). For matched sets, it is natural to think of the amplification as describing any one of the k matched pair differences in a k-to-1 matched set.
The curve has asymptotes that the function amplify does not compute: gamma corresponds with (lambda,delta) = (gamma, Inf) and (Inf, gamma).
A related though distinct idea is developed in Gastwirth et al. (1998). The two approaches agree when the outcome is binary, that is, for McNemar's test.
Returns a vector of values of delta of length(lambda) with names lambda.
The amplify function is also in the sensitivitymv package where a different example is used.
Paul R. Rosenbaum
Gastwirth, J. L., Krieger, A. M., Rosenbaum, P. R. (1998) Dual and simultaneous sensitivity analysis for matched pairs. Biometrika, 85, 907-920.
Rosenbaum, P. R. and Silber, J. H. (2009) Amplification of sensitivity analysis in observational studies. Journal of the American Statistical Association, 104, 1398-1405. <doi:10.1198/jasa.2009.tm08470>
Rosenbaum, P. R. (2015). Two R packages for sensitivity analysis in observational studies. Observational Studies, v. 1. (Free on-line.)
Wolfe, D. A. (1974) A characterization of population weighted symmetry and related results. Journal of the American Statistical Association, 69, 819-822.
attach(nh1and3)
sen(homocysteine,z,mset,gamma=1.9)
amplify(1.9,c(3,3.5,4))
detach(nh1and3)
Of very limited interest to most users, the function mscoreInternal() computes the M-scores used by sen().
mscoreInternal(ymat, inner, trim)
ymat |
A matrix of outcomes scaled for use in an M-statistic; see the discussion of the parameter lambda in the documentation for the sen function. If the largest matched set has K controls, and there are I matched sets, then ymat is an I x (K+1) matrix. Each row is a matched set. The first column contains the treated individual in the matched set. The remaining columns contain the controls. If a set has fewer than K controls, then its last columns are NAs. |
inner |
inner is the inner[i] parameter described in the documentation for sen(). |
trim |
trim is the trim[i] parameter described in the documentation for sen(). |
Generally, a matrix with the same dimensions as ymat containing the M-scores.
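The layout of ymat described above can be illustrated with a small base R sketch. It only arranges raw outcomes into the I x (K+1) shape with NA padding; it does not perform the scaling that the package applies before computing M-scores. The helper name build_ymat and the toy data are hypothetical.

```r
# Sketch: arrange outcomes into an I x (K+1) matrix, treated unit first,
# controls after, NA in unused trailing columns. No scaling is applied.
build_ymat <- function(y, z, mset) {
  sets <- split(seq_along(y), mset)     # row indices of each matched set
  K <- max(lengths(sets)) - 1           # most controls in any set
  ymat <- matrix(NA_real_, nrow = length(sets), ncol = K + 1)
  for (i in seq_along(sets)) {
    idx <- sets[[i]]
    treated <- idx[z[idx] == 1]
    controls <- idx[z[idx] == 0]
    ymat[i, 1] <- y[treated]
    ymat[i, 1 + seq_along(controls)] <- y[controls]
  }
  ymat
}

# Toy data: one matched pair and one set with three controls
y    <- c(5, 3, 8, 6, 7, 4)
z    <- c(1, 0, 1, 0, 0, 0)
mset <- c(1, 1, 2, 2, 2, 2)
ymat <- build_ymat(y, z, mset)
ymat
```

The first row holds the pair followed by two NAs, and the second row holds the treated value followed by its three controls.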
Xinran Li and Paul R. Rosenbaum
Rosenbaum, P. R. (2015). Two R packages for sensitivity analysis in observational studies. Observational Studies, v. 1. (Free on-line.)
Data from NHANES 2005-2006 concerning homocysteine levels in daily smokers (z=1) and never smokers (z=0), aged 20 and older.
data("nh1and3")
A data frame with 1370 observations consisting of 353 matched pairs and 166 matched sets with 3 controls.
SEQN
NHANES ID number
z
=1 for a daily smoker, =0 for a never smoker
mset
Matched set indicator, for 519 sets, 1, 2, ..., 519
homocysteine
Blood homocysteine level
cigsperday30
Cigarettes smoked per day
cotinine
Cotinine is a biomarker for exposure to nicotine
female
=1 for female, =0 for male
age
Age in years
black
=1 for black, =0 for other
education
NHANES 1-5 score. 3 is a high school degree.
povertyr
Ratio of family income to the poverty level, capped at 5.
Data from NHANES 2005-2006 concerning homocysteine levels in daily smokers (z=1) and never smokers (z=0), aged 20 and older. Daily smokers smoked every day for the last 30 days, smoking an average of at least 10 cigarettes per day. Never smokers smoked fewer than 100 cigarettes in their lives, do not smoke now, and had no tobacco use in the previous 5 days.
NHANES 2005-2006
Bazzano, L. A., He, J., Muntner, P., Vupputuri, S. and Whelton, P. K. (2003) Relationship between cigarette smoking and novel risk factors for cardiovascular disease in the United States. Annals of Internal Medicine, 138, 891-897.
Pimentel, S. D., Small, D. S. and Rosenbaum, P. R. (2016) Constructed second control groups and attenuation of unmeasured biases. Journal of the American Statistical Association, 111, 1157-1167. <doi:10.1080/01621459.2015.1076342>
data(nh1and3)
attach(nh1and3)
table(table(nh1and3$mset))
par(mfrow=c(1,2))
boxplot(homocysteine[1:166]~z[1:166],ylim=c(0,70),main="1-1 match",
  ylab="homocysteine",names=c("Control","Smoker"))
boxplot(homocysteine[167:1370]~z[167:1370],ylim=c(0,70),
  main="1-3 match",ylab="homocysteine",names=c("Control","Smoker"))
detach(nh1and3)
Each matched set contains one treated
individual and one or more controls.
Uses Huber's M-statistic as the basis for
the test, for instance, a mean. Matched sets of different sizes
use different psi-functions, creating what is called a fugue statistic.
Performs either a randomization
test (Gamma=1) or an analysis of sensitivity to departures from random
assignment (Gamma>1). For confidence intervals, use function senCI().
The method is described in Li and Rosenbaum (2019); see also Rosenbaum (2007,2013).
sen(y, z, mset, gamma = 1, inner = NULL, trim = NULL, lambda = 1/2, tau = 0, alternative = "greater")
y |
A vector of responses with no missing data. |
z |
Treatment indicator, z=1 for treated, z=0 for control with length(z)==length(y). |
mset |
Matched set indicator, 1, 2, ..., sum(z) with length(mset)==length(y). Matched set indicators should be either integers or a factor. |
gamma |
gamma is the sensitivity parameter Gamma, where Gamma >= 1. Setting gamma = 1 performs a randomization test. |
inner |
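inner and trim together define the psi-function of the M-statistic. If left as NULL, values suited to each matched set size are chosen automatically, as discussed below. |
trim |
inner and trim together define the psi-function of the M-statistic. If left as NULL, values suited to each matched set size are chosen automatically, as discussed below. |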
lambda |
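Before applying the psi-function to treated-minus-control differences, the differences are scaled by dividing by the lambda quantile of all within-set absolute differences. Typically, lambda = 1/2 for the median. An error will result unless 0 < lambda < 1. |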
tau |
The null hypothesis asserts that the treatment has an additive effect, tau. By default, tau=0, so by default the null hypothesis is Fisher's sharp null hypothesis of no treatment effect. |
alternative |
If alternative="greater", the null hypothesis of a treatment effect of tau is tested against the alternative of a treatment effect larger than tau. If alternative="less", the null hypothesis of a treatment effect of tau is tested against the alternative of a treatment effect smaller than tau. In particular, alternative="less" is equivalent to: (i) alternative="greater", (ii) y replaced by -y, and (iii) tau replaced by -tau. See the note for discussion of two-sided sensitivity analyses. |
The novel element in the fugue package is the automatic use of different
psi-functions for matched sets of different sizes. These
psi-functions have been selected to approximately equate the
design sensitivities in sets of unequal sizes when the errors are
Normal and the additive effect is half the standard deviation of
a matched pair difference; see Li and Rosenbaum (2019). If you
disable this automatic feature by manually setting a single value for inner
and trim, then the results will agree with senm() in the R
package sensitivitymult. For instance, using both sen() in the
fugue package and senm() in the sensitivitymult package will
yield the same deviate and P-value in:
data(nh1and3)
attach(nh1and3)
sen(homocysteine,z,mset,inner=0,gamma=1.9)
senm(homocysteine,z,mset,inner=0,trim=3,gamma=1.9)
Note that the sensitivitymult package is intended to
implement methods from Rosenbaum (2016,2019) that are
not implemented in the fugue package.
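For readers unfamiliar with inner trimming, the sketch below shows one common form of a Huber-type psi-function with an inner and an outer cutoff, in the spirit of Rosenbaum (2013). It is an assumption that this matches the package's internal definition up to normalization; the name psi_sketch is hypothetical, and the sketch handles only a finite trim.

```r
# Sketch (assumed form): scores are 0 for |y| <= inner, rise linearly
# between inner and trim, and are capped at +/- trim beyond that.
# psi_sketch is a hypothetical name; finite trim only.
psi_sketch <- function(y, inner = 0, trim = 3) {
  stopifnot(inner >= 0, is.finite(trim), trim > inner)
  sign(y) * trim * pmin(1, pmax(0, (abs(y) - inner) / (trim - inner)))
}

y_scaled <- c(-5, -1, 0, 0.25, 1, 5)
out_huber <- psi_sketch(y_scaled, inner = 0, trim = 3)    # -3 -1 0 0.25 1 3
out_inner <- psi_sketch(y_scaled, inner = 0.5, trim = 3)  # -3 -0.6 0 0 0.6 3
```

With inner = 0 this is ordinary Huber clipping at trim; a positive inner value ignores small differences entirely. A fugue statistic, as described above, would use a different (inner, trim) pair for each matched set size.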
For the given gamma, sen() computes the upper bound on the 1-sided
P-value testing the null hypothesis
of an additive treatment effect tau against the alternative hypothesis of
a treatment effect larger than tau. By default, sen() tests the null hypothesis of
no treatment effect against the alternative of a positive treatment effect.
The P-value is an approximate P-value
based on a Normal approximation to the null distribution; see Rosenbaum (2007).
Matched sets of unequal size are weighted using weights that would be efficient in a randomization test under a simple model with additive set and treatment effects and errors with constant variance; see Rosenbaum (2007).
The upper bound on the P-value is based on the separable approximation described in Gastwirth, Krieger and Rosenbaum (2000); see also Rosenbaum (2007, 2018).
pval |
The upper bound on the 1-sided P-value. |
deviate |
The deviate that was compared to the Normal distribution to produce pval. |
statistic |
The value of the M-statistic. |
expectation |
The maximum expectation of the M-statistic for the given gamma. |
variance |
The maximum variance of the M-statistic among treatment assignments that achieve the maximum expectation. Part of the separable approximation. |
The function sen() performs 1-sided tests. One approach
to a 2-sided, alpha-level test does both 1-sided tests
at level alpha/2, and rejects the null hypothesis if either
1-sided test rejects. Equivalently, a bound on the two-sided
P-value is the smaller of 1 and twice the smaller of the two 1-sided
P-values. This approach views a 2-sided test as two 1-sided tests
with a Bonferroni correction; see Cox (1977, Section 4.2). In all
cases, this approach is a valid large sample test: a true
null hypothesis is falsely rejected with probability at most alpha
if the bias in treatment assignment is at most Gamma; so, this procedure
is entirely safe to use. For a randomization test, Gamma = 1, this
Bonferroni procedure is not typically conservative. For large Gamma,
this Bonferroni procedure tends to be somewhat conservative.
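The two-sided rule in this note is simple arithmetic on the two 1-sided bounds. A minimal sketch follows; the helper name two_sided_bound and the numeric inputs are hypothetical, and the commented lines only indicate where the inputs would come from using the documented pval component.

```r
# Sketch: Bonferroni-style two-sided bound from two one-sided P-value bounds.
two_sided_bound <- function(p_greater, p_less) {
  min(1, 2 * min(p_greater, p_less))
}

# In practice the inputs would be obtained as, for example:
#   p_greater <- sen(y, z, mset, gamma = g, alternative = "greater")$pval
#   p_less    <- sen(y, z, mset, gamma = g, alternative = "less")$pval
b1 <- two_sided_bound(0.03, 0.99)   # 0.06
b2 <- two_sided_bound(0.60, 0.70)   # capped at 1
```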
The examples reproduce some results from Li and Rosenbaum (2019).
Xinran Li and Paul R. Rosenbaum.
Cox, D. R. (1977). The role of significance tests (with Discussion). Scand. J. Statist. 4, 49-70.
Huber, P. (1981) Robust Statistics. New York: John Wiley. (M-estimates based on M-statistics.)
Li, X. and Rosenbaum, P. R. (2019) Maintaining high constant design sensitivity in observational studies with matched sets of varying sizes. Manuscript.
Maritz, J. S. (1979). A note on exact robust confidence intervals for location. Biometrika 66 163–166. (Introduces exact permutation tests based on M-statistics by redefining the scaling parameter.)
Rosenbaum, P. R. (2007). Sensitivity analysis for m-estimates, tests and confidence intervals in matched observational studies. Biometrics 63 456-64. (R package sensitivitymv) <doi:10.1111/j.1541-0420.2006.00717.x>
Rosenbaum, P. R. (2013). Impact of multiple matched controls on design sensitivity in observational studies. Biometrics 69 118-127. (Introduces inner trimming.) <doi:10.1111/j.1541-0420.2012.01821.x>
Rosenbaum, P. R. (2014). Weighted M-statistics with superior design sensitivity in matched observational studies with multiple controls. J. Am. Statist. Assoc. 109 1145-1158. (R package sensitivitymw) <doi:10.1080/01621459.2013.879261>
Rosenbaum, P. R. (2015). Two R packages for sensitivity analysis in observational studies. Observational Studies, v. 1. (Free on-line.)
Rosenbaum, P. R. (2016) Using Scheffe projections for multiple outcomes in an observational study of smoking and periodontal disease. Annals of Applied Statistics, 10, 1447-1471. <doi:10.1214/16-AOAS942>
Rosenbaum, P. R. (2018). Sensitivity analysis for stratified comparisons in an observational study of the effect of smoking on homocysteine levels. The Annals of Applied Statistics, 12(4), 2312-2334. <doi:10.1214/18-AOAS1153>
Rosenbaum, P. R. (2019). Combining planned and discovered comparisons in observational studies. Biostatistics, to appear. <doi:10.1093/biostatistics/kxy055>
# Reproduces results from Table 3 of Li and Rosenbaum (2019)
data(nh1and3)
attach(nh1and3)
sen(homocysteine,z,mset,gamma=1)
sen(homocysteine,z,mset,gamma=1.9)
sen(homocysteine,z,mset,inner=0,gamma=1.9)
amplify(1.9,c(3,3.5,4))
detach(nh1and3)
Each matched set contains one treated
individual and one or more controls.
Uses Huber's M-statistic as the basis for
the test; see Maritz (1979). Matched sets of different sizes
use different psi-functions, creating what is called a fugue statistic.
Performs either a randomization
test (Gamma=1) or an analysis of sensitivity to departures from random
assignment (Gamma>1). For hypothesis tests, use function sen().
The method is described in Li and Rosenbaum (2019); see also Rosenbaum (2007,2013).
senCI(y, z, mset, gamma = 1, inner = NULL, trim = NULL, lambda = 1/2, alpha = 0.05, alternative = "greater")
y |
A vector of responses with no missing data. |
z |
Treatment indicator, z=1 for treated, z=0 for control with length(z)==length(y). |
mset |
Matched set indicator, 1, 2, ..., sum(z) with length(mset)==length(y). Matched set indicators should be either integers or a factor. |
gamma |
gamma is the sensitivity parameter Gamma, where Gamma >= 1. Setting gamma = 1 yields a randomization-based confidence interval. |
inner |
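inner and trim together define the psi-function of the M-statistic. If left as NULL, values suited to each matched set size are chosen automatically; see the documentation for sen(). |
trim |
inner and trim together define the psi-function of the M-statistic. If left as NULL, values suited to each matched set size are chosen automatically; see the documentation for sen(). |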
lambda |
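Before applying the psi-function to treated-minus-control differences, the differences are scaled by dividing by the lambda quantile of all within-set absolute differences. Typically, lambda = 1/2 for the median. An error will result unless 0 < lambda < 1. |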
alpha |
The coverage rate of the confidence interval is 1-alpha. |
alternative |
If alternative="greater" or alternative="less", then a one-sided confidence interval is returned. If alternative="twosided", a somewhat conservative two-sided confidence interval is returned. See the discussion of two-sided tests in the documentation for sen(). |
The confidence interval inverts the test provided by sen(). See the documentation for sen() for more information.
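To make "inverts the test" concrete, here is a minimal base R sketch for matched pairs only. It uses the sum of treated-minus-control differences as the statistic rather than the package's M-statistic, so it illustrates the idea and is not senCI()'s algorithm. The names pval_bound and lower_limit and the toy differences d are hypothetical. The lower limit of a one-sided interval is the value of tau at which the upper bound on the P-value equals alpha.

```r
# Sketch: P-value bound for H0: additive effect = tau, matched pairs only.
# Under H0 each adjusted difference is +/-|d - tau|, positive with
# probability at most gamma/(1 + gamma).
pval_bound <- function(d, tau, gamma) {
  x <- d - tau
  a <- abs(x)
  statistic   <- sum(x)
  expectation <- sum(a) * (gamma - 1) / (gamma + 1)    # maximum expectation
  variance    <- sum(a^2) * 4 * gamma / (1 + gamma)^2  # variance at that maximum
  pnorm((statistic - expectation) / sqrt(variance), lower.tail = FALSE)
}

# Lower confidence limit: the tau at which the bound crosses alpha.
lower_limit <- function(d, gamma, alpha = 0.05) {
  span <- diff(range(d))
  uniroot(function(tau) pval_bound(d, tau, gamma) - alpha,
          interval = c(min(d) - 10 * span, max(d)), tol = 1e-9)$root
}

# Hypothetical treated-minus-control pair differences
d <- c(2.1, 3.4, -0.5, 1.8, 4.2, 2.9, 0.7, 3.1, 1.2, 2.6, -1.0, 3.8)
l1 <- lower_limit(d, gamma = 1)   # randomization inference
l2 <- lower_limit(d, gamma = 2)   # allows for bias in treatment assignment
c(gamma1 = l1, gamma2 = l2)
```

Allowing for a larger gamma moves the lower limit down, so the interval widens as more bias in treatment assignment is entertained.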
The upper bound on the P-value is based on the separable approximation described in Gastwirth, Krieger and Rosenbaum (2000); see also Rosenbaum (2007, 2018).
point.estimates |
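For gamma > 1, an interval of point estimates of the additive treatment effect; for gamma = 1, the interval reduces to a single point estimate. |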
confidence.interval |
The confidence interval. |
The examples reproduce some results from Li and Rosenbaum (2019).
Xinran Li and Paul R. Rosenbaum.
Cox, D. R. (1977). The role of significance tests (with Discussion). Scand. J. Statist. 4, 49-70.
Huber, P. (1981) Robust Statistics. New York: John Wiley. (M-estimates based on M-statistics.)
Li, X. and Rosenbaum, P. R. (2019) Maintaining high constant design sensitivity in observational studies with matched sets of varying sizes. Manuscript.
Maritz, J. S. (1979). A note on exact robust confidence intervals for location. Biometrika 66 163–166. (Introduces exact permutation tests based on M-statistics by redefining the scaling parameter.)
Rosenbaum, P. R. (2007). Sensitivity analysis for m-estimates, tests and confidence intervals in matched observational studies. Biometrics 63 456-64. (R package sensitivitymv) <doi:10.1111/j.1541-0420.2006.00717.x>
Rosenbaum, P. R. (2013). Impact of multiple matched controls on design sensitivity in observational studies. Biometrics 69 118-127. (Introduces inner trimming.) <doi:10.1111/j.1541-0420.2012.01821.x>
Rosenbaum, P. R. (2015). Two R packages for sensitivity analysis in observational studies. Observational Studies, v. 1. (Free on-line.)
Rosenbaum, P. R. (2016) Using Scheffe projections for multiple outcomes in an observational study of smoking and periodontal disease. Annals of Applied Statistics, 10, 1447-1471. <doi:10.1214/16-AOAS942>
Rosenbaum, P. R. (2018). Sensitivity analysis for stratified comparisons in an observational study of the effect of smoking on homocysteine levels. The Annals of Applied Statistics, 12(4), 2312-2334. <doi:10.1214/18-AOAS1153>
## Not run:
# Reproduces results from Table 3 of Li and Rosenbaum (2019)
data(nh1and3)
attach(nh1and3)
senCI(homocysteine,z,mset,gamma=1)
senCI(homocysteine,z,mset,gamma=1.9)
senCI(homocysteine,z,mset,inner=0,gamma=1.9)
amplify(1.9,c(3,3.5,4))
# Relationships between confidence intervals and P-value bounds
senCI(homocysteine,z,mset,alternative="twosided",gamma=1.75)
sen(homocysteine,z,mset,alternative="less",tau=2.21721733,gamma=1.75)
senCI(homocysteine,z,mset,alternative="less",gamma=1.75)
sen(homocysteine,z,mset,alternative="less",tau=2.159342,gamma=1.75)
detach(nh1and3)
## End(Not run)
Of limited interest to most users, this general purpose function is internal to other functions in the package. It is the same function as in the sensitivitymv package, version 1.3. The function performs the asymptotic separable calculations described in Gastwirth, Krieger and Rosenbaum (2000) and Rosenbaum (2018), as used in section 4 of Rosenbaum (2007). See the sensitivitymv package for an example.
separable1v(ymat, gamma = 1)
ymat |
ymat is a matrix whose rows are matched sets and whose columns are matched individuals. The first column describes treated individuals. Other columns describe controls. If matched sets contain variable numbers of controls, NAs fill in empty spaces in ymat; see the documentation for senmv. In senmv, the matrix ymat is created by mscorev. Instead, if there were no NAs and ranks within rows were used in ymat, then separable1v would perform a sensitivity analysis for the stratified Wilcoxon two-sample test. Applied directly to data, it performs a sensitivity analysis for the permutational t-test. |
gamma |
gamma is the value of the sensitivity parameter; see the documentation for the senmv function in the sensitivitymv package. One should use a value of gamma >= 1. |
pval |
Approximate upper bound on the one-sided P-value. |
deviate |
Deviate that is compared to the upper tail of the standard Normal distribution to obtain the P-value. |
statistic |
Value of the test statistic. |
expectation |
Maximum null expectation of the test statistic for the given value of gamma. |
variance |
Among null distributions that yield the maximum expectation, variance is the maximum possible variance for the given value of gamma. See Rosenbaum (2007, Section 4) and Gastwirth, Krieger and Rosenbaum (2000). |
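The returned quantities can be illustrated in the simplest case, matched pairs with no NAs, where the bounding distribution has a closed form. The sketch below is an illustration under that restriction and is not the source of separable1v(); the function name pairs_sensitivity and the toy matrix are hypothetical, and computing the deviate as (statistic - expectation)/sqrt(variance) is an assumption consistent with the descriptions above.

```r
# Sketch for I x 2 ymat (treated, control), no NAs. The statistic is the sum
# of the treated column; under bias at most gamma, the larger value in a pair
# belongs to the treated unit with probability at most gamma/(1 + gamma).
pairs_sensitivity <- function(ymat, gamma = 1) {
  stopifnot(ncol(ymat) == 2, !anyNA(ymat), gamma >= 1)
  hi <- pmax(ymat[, 1], ymat[, 2])
  lo <- pmin(ymat[, 1], ymat[, 2])
  statistic   <- sum(ymat[, 1])
  expectation <- sum((gamma * hi + lo) / (1 + gamma))
  variance    <- sum(gamma * (hi - lo)^2 / (1 + gamma)^2)
  deviate     <- (statistic - expectation) / sqrt(variance)
  list(pval = pnorm(deviate, lower.tail = FALSE), deviate = deviate,
       statistic = statistic, expectation = expectation, variance = variance)
}

ymat <- cbind(treated = c(5, 7, 6, 9), control = c(3, 8, 2, 4))
res1 <- pairs_sensitivity(ymat, gamma = 1)  # expectation 22, variance 11.5
res2 <- pairs_sensitivity(ymat, gamma = 2)  # expectation 24
c(res1$pval, res2$pval)
```

The bound on the P-value grows with gamma, which is the pattern a sensitivity analysis reports.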
Paul R. Rosenbaum
Gastwirth, J. L., Krieger, A. M., and Rosenbaum, P. R. (2000) Asymptotic separability in sensitivity analysis. Journal of the Royal Statistical Society B, 62, 545-556. <doi:10.1111/1467-9868.00249>
Rosenbaum, P. R. (2007) Sensitivity analysis for m-estimates, tests and confidence intervals in matched observational studies. Biometrics, 63, 456-464. <doi:10.1111/j.1541-0420.2006.00717.x>
Rosenbaum, P. R. (2018). Sensitivity analysis for stratified comparisons in an observational study of the effect of smoking on homocysteine levels. The Annals of Applied Statistics, 12(4), 2312-2334. <doi:10.1214/18-AOAS1153>