Title: | Harmonic Mean p-Values and Model Averaging by Mean Maximum Likelihood |
---|---|
Description: | The harmonic mean p-value (HMP) test combines p-values and corrects for multiple testing while controlling the strong-sense family-wise error rate. It is more powerful than common alternatives including Bonferroni and Simes procedures when combining large proportions of all the p-values, at the cost of slightly lower power when combining small proportions of all the p-values. It is more stringent than controlling the false discovery rate, and possesses theoretical robustness to positive correlations between tests and unequal weights. It is a multi-level test in the sense that a superset of one or more significant tests is certain to be significant and conversely when the superset is non-significant, the constituent tests are certain to be non-significant. It is based on MAMML (model averaging by mean maximum likelihood), a frequentist analogue to Bayesian model averaging, and is theoretically grounded in generalized central limit theorem. For detailed examples type vignette("harmonicmeanp") after installation. Version 3.0 addresses errors in versions 1.0 and 2.0 that led function p.hmp to control the familywise error rate only in the weak sense, rather than the strong sense as intended. |
Authors: | Daniel J. Wilson |
Maintainer: | Daniel Wilson <[email protected]> |
License: | GPL-3 |
Version: | 3.0.1 |
Built: | 2024-11-13 06:17:25 UTC |
Source: | CRAN |
Compute a combined p-value via the asymptotically exact harmonic mean p-value. The harmonic mean p-value (HMP) test combines p-values and corrects for multiple testing while controlling the strong-sense family-wise error rate. It is more powerful than common alternatives including Bonferroni and Simes procedures when combining large proportions of all the p-values, at the cost of slightly lower power when combining small proportions of all the p-values. It is more stringent than controlling the false discovery rate, and possesses theoretical robustness to positive correlations between tests and unequal weights. It is a multi-level test in the sense that a superset of one or more significant tests is certain to be significant and conversely when the superset is non-significant, the constituent tests are certain to be non-significant. It is based on MAMML (model averaging by mean maximum likelihood), a frequentist analogue to Bayesian model averaging, and is theoretically grounded in generalized central limit theorem.
p.hmp(p, w = NULL, L = NULL, w.sum.tolerance = 1e-6, multilevel = TRUE)
p.hmp(p, w = NULL, L = NULL, w.sum.tolerance = 1e-6, multilevel = TRUE)
p |
A numeric vector of one or more p-values to be combined. Missing values (NAs) will cause a missing value to be returned. |
w |
An optional numeric vector of weights that can be interpreted as incorporating information about prior model probabilities and power of the tests represented by the individual p-values. The sum of the weights cannot exceed one but may be less than one, which is interpreted as meaning that some of the |
L |
The number of constituent p-values. This is mandatory when using |
w.sum.tolerance |
Tolerance for checking that the weights do not exceed 1. |
multilevel |
Logical, indicating whether the test is part of a multilevel procedure involving a total of |
When multilevel==TRUE
an asymptotically exact combined p-value is returned, equivalent to sum(w)*pharmonicmeanp(hmp.stat(p,w)/sum(w),L)
. This p-value can be compared against threshold sum(w)*alpha
to control the strong-sense familywise error rate at level alpha
. A test based on this asymptotically exact harmonic mean p-value is equivalent to comparison of the ‘raw’ harmonic mean p-value calculated by hmp.stat(p,w)
to a 'harmonic mean p-value threshold' sum(w)*qharmonicmeanp(alpha,L)
.
When multilevel==FALSE
an asymptotically exact combined p-value is returned, equivalent to pharmonicmeanp(hmp.stat(p,w),length(p))
. L
is ignored and w
is normalized to sum to 1. This p-value can be compared against threshold alpha
to control the per-test error rate at level alpha
. A test based on this asymptotically exact harmonic mean p-value is equivalent to comparison of the ‘raw’ harmonic mean p-value calculated by hmp.stat(p,w)
to a 'harmonic mean p-value threshold' qharmonicmeanp(alpha,L)
.
Daniel J. Wilson
Daniel J. Wilson (2019) The harmonic mean p-value for combining dependent tests. Proceedings of the National Academy of Sciences USA 116: 1195-1200.
hmp.stat
# For detailed examples type vignette("harmonicmeanp") # Example: simulate from a non-uniform distribution mildly enriched for small \emph{p}-values. # Compare the significance of the combined p-value for Bonferroni, Benjamini-Hochberg (i.e. Simes), # HMP and (equivalently) MAMML with 2 degrees of freedom. L = 1000 p = rbeta(L,1/1.8,1) min(p.adjust(p,"bonferroni")) min(p.adjust(p,"BH")) p.hmp(p,L=L) p.mamml(1/p,2,L=L) # Multilevel test: find significant subsets of the 1000 p-values by comparing overlapping # subsets of size 100 and 10 in addition to the top-level test of all 1000, while maintaining # the strong-sense family-wise error rate. p.100 = sapply(seq(1,L-100,by=10),function(beg) p.hmp(p[beg+0:99],L=L)) plot(-log10(p.100),xlab="Test index",main="Moving average", ylab="Asymptotically exact combined -log10 p-value",type="o") # The appropriate threshold is alpha (e.g. 0.05) times the proportion of tests in each p.100 abline(h=-log10(0.05*100/1000),col=2,lty=2) p.10 = sapply(seq(1,L-10,by=5),function(beg) p.hmp(p[beg+0:9],L=L)) plot(-log10(p.10),xlab="Test index",main="Moving average", ylab="Asymptotically exact combined -log10 p-value",type="o") # The appropriate threshold is alpha (e.g. 0.05) times the proportion of tests in each p.10 abline(h=-log10(0.05*10/1000),col=2,lty=2)
# For detailed examples type vignette("harmonicmeanp") # Example: simulate from a non-uniform distribution mildly enriched for small \emph{p}-values. # Compare the significance of the combined p-value for Bonferroni, Benjamini-Hochberg (i.e. Simes), # HMP and (equivalently) MAMML with 2 degrees of freedom. L = 1000 p = rbeta(L,1/1.8,1) min(p.adjust(p,"bonferroni")) min(p.adjust(p,"BH")) p.hmp(p,L=L) p.mamml(1/p,2,L=L) # Multilevel test: find significant subsets of the 1000 p-values by comparing overlapping # subsets of size 100 and 10 in addition to the top-level test of all 1000, while maintaining # the strong-sense family-wise error rate. p.100 = sapply(seq(1,L-100,by=10),function(beg) p.hmp(p[beg+0:99],L=L)) plot(-log10(p.100),xlab="Test index",main="Moving average", ylab="Asymptotically exact combined -log10 p-value",type="o") # The appropriate threshold is alpha (e.g. 0.05) times the proportion of tests in each p.100 abline(h=-log10(0.05*100/1000),col=2,lty=2) p.10 = sapply(seq(1,L-10,by=5),function(beg) p.hmp(p[beg+0:9],L=L)) plot(-log10(p.10),xlab="Test index",main="Moving average", ylab="Asymptotically exact combined -log10 p-value",type="o") # The appropriate threshold is alpha (e.g. 0.05) times the proportion of tests in each p.10 abline(h=-log10(0.05*10/1000),col=2,lty=2)
The harmonic mean p-value (HMP) is defined as the inverse of the (possibly weighted) arithmetic mean of the inverse p-values. When the HMP is small (e.g. less than 0.05), it is approximately well-calibrated, meaning that it can be directly interpreted. However, the function p.hmp calculates an asymptotically exact p-value from the HMP and is preferred.
hmp.stat(p, w = NULL)
hmp.stat(p, w = NULL)
p |
A numeric vector of one or more p-values. Missing values (NAs) will cause a missing value to be returned. |
w |
An optional numeric vector of weights that can be interpreted as prior model probabilities for each of the alternative hypotheses represented by the individual p-values. The sum of the weights cannot exceed one but may be less than one, which is interpreted as meaning that some p-values have been excluded. |
The harmonic mean p-value is returned.
Daniel J. Wilson
Daniel J. Wilson (2019) The harmonic mean p-value for combining dependent tests. Proceedings of the National Academy of Sciences USA 116: 1195-1200.
p.hmp
# For detailed examples type vignette("harmonicmeanp") p = rbeta(1000,1/1.5,1) hmp.stat(p) p.hmp(p,L=1000)
# For detailed examples type vignette("harmonicmeanp") p = rbeta(1000,1/1.5,1) hmp.stat(p) p.hmp(p,L=1000)
The model-averaged mean maximized likelihood (MAMML) is defined as the (possibly weighted) arithmetic mean of the maximized likelihood ratios from a series of likelihood ratio tests comparing mutually exclusive alternative hypotheses with the same nested null hypothesis based on the exact same data.
mamml.stat(R, w = NULL)
mamml.stat(R, w = NULL)
R |
A numeric vector of one or more maximized likelihood ratios. Missing values (NAs) will cause a missing value to be returned. |
w |
An optional numeric vector of weights that can be interpreted as prior model probabilities for each of the alternative hypotheses represented by the individual p-values. The sum of the weights cannot exceed one but may be less than one, which is interpreted as meaning that some p-values have been excluded. |
The model-averaged mean maximized likelihood ratio is returned.
Daniel J. Wilson
Daniel J. Wilson (2019) The harmonic mean p-value for combining dependent tests. Proceedings of the National Academy of Sciences USA 116: 1195-1200.
p.mamml
# For detailed examples type vignette("harmonicmeanp") nu = 3 R = exp(0.5*rchisq(1000,nu)) mamml.stat(R) p.mamml(R,nu,L=1000)
# For detailed examples type vignette("harmonicmeanp") nu = 3 R = exp(0.5*rchisq(1000,nu)) mamml.stat(R) p.mamml(R,nu,L=1000)
Density, distribution function, quantile function and random number generation for the harmonic mean of L p-values under their null hypotheses, i.e. the harmonic mean of L standard uniform random variables, assuming L is large.
dharmonicmeanp(x, L, log=FALSE) pharmonicmeanp(x, L, log=FALSE, lower.tail=TRUE) qharmonicmeanp(p, L, log=FALSE, lower.tail=TRUE) rharmonicmeanp(n, L)
dharmonicmeanp(x, L, log=FALSE) pharmonicmeanp(x, L, log=FALSE, lower.tail=TRUE) qharmonicmeanp(p, L, log=FALSE, lower.tail=TRUE) rharmonicmeanp(n, L)
x |
The value or vector of values of the harmonic mean p-value, for example calculated from data using function |
L |
The number of constituent p-values used in calculating each value of x. |
log |
If true the log probability is output. |
lower.tail |
If true (the default) the lower tail probability is returned. Otherwise the upper tail probability. |
p |
The value or vector of values, between 0 and 1, of the probability specifying the quantile for which to return the harmonic mean p-value. |
n |
The number of values to simulate. |
dharmonicmeanp
produces the density, pharmonicmeanp
the tail probability, qharmonicmeanp
the quantile and rharmonicmeanp
random variates for the harmonic mean of L p-values when their null hypotheses are true.
Use qharmonicmeanp(alpha,L)
to calculate , the 'harmonic mean p-value threshold', as in Table 1 of Wilson (2019, corrected), where
L
is the total number of p-values under consideration and alpha
is the intended strong-sense familywise error rate.
Daniel J. Wilson
Daniel J. Wilson (2019) The harmonic mean p-value for combining dependent tests. Proceedings of the National Academy of Sciences USA 116: 1195-1200.
p.hmp
# For a detailed tutorial type vignette("harmonicmeanp") # Example: simulate from a non-uniform distribution mildly enriched for small \emph{p}-values. # Compare the significance of the combined p-value for Bonferroni, Benjamini-Hochberg (i.e. Simes), # HMP and (equivalently) MAMML with 2 degrees of freedom. L = 1000 p = rbeta(L,1/1.5,1) min(p.adjust(p,"bonferroni")) min(p.adjust(p,"BH")) x = hmp.stat(p) pharmonicmeanp(x,length(p)) p.hmp(p,L=L) p.mamml(1/p,2,L=L) # Compute critical values for the HMP from asymptotic theory and compare to direct simulations L = 100 alpha = 0.05 (hmp.crit = qharmonicmeanp(alpha,L)) nsim = 100000 p.direct = matrix(runif(L*nsim),nsim,L) hmp.direct = apply(p.direct,1,hmp.stat) (hmp.crit.sim = quantile(hmp.direct,alpha)) # Compare HMP of p-values simulated directly, and via the asymptotic distribution, # to the asymptotic density L = 30 nsim = 10000 p.direct = matrix(runif(L*nsim),nsim,L) hmp.direct = apply(p.direct,1,hmp.stat) hmp.asympt = rharmonicmeanp(nsim,L) h = hist(hmp.direct,60,col="green3",prob=TRUE,main="Distributions of harmonic mean p-values") hist(hmp.asympt,c(-Inf,h$breaks,Inf),col="yellow2",prob=TRUE,add=TRUE) hist(hmp.direct,60,col=NULL,prob=TRUE,add=TRUE) curve(dharmonicmeanp(x,L),lwd=2,col="red3",add=TRUE) legend("topright",c("Direct simulation","Asymptotic simulation","Asymptotic density"), fill=c("green3","yellow2",NA),col=c(NA,NA,"red3"),lwd=c(NA,NA,2),bty="n",border=c(1,1,NA))
# For a detailed tutorial type vignette("harmonicmeanp") # Example: simulate from a non-uniform distribution mildly enriched for small \emph{p}-values. # Compare the significance of the combined p-value for Bonferroni, Benjamini-Hochberg (i.e. Simes), # HMP and (equivalently) MAMML with 2 degrees of freedom. L = 1000 p = rbeta(L,1/1.5,1) min(p.adjust(p,"bonferroni")) min(p.adjust(p,"BH")) x = hmp.stat(p) pharmonicmeanp(x,length(p)) p.hmp(p,L=L) p.mamml(1/p,2,L=L) # Compute critical values for the HMP from asymptotic theory and compare to direct simulations L = 100 alpha = 0.05 (hmp.crit = qharmonicmeanp(alpha,L)) nsim = 100000 p.direct = matrix(runif(L*nsim),nsim,L) hmp.direct = apply(p.direct,1,hmp.stat) (hmp.crit.sim = quantile(hmp.direct,alpha)) # Compare HMP of p-values simulated directly, and via the asymptotic distribution, # to the asymptotic density L = 30 nsim = 10000 p.direct = matrix(runif(L*nsim),nsim,L) hmp.direct = apply(p.direct,1,hmp.stat) hmp.asympt = rharmonicmeanp(nsim,L) h = hist(hmp.direct,60,col="green3",prob=TRUE,main="Distributions of harmonic mean p-values") hist(hmp.asympt,c(-Inf,h$breaks,Inf),col="yellow2",prob=TRUE,add=TRUE) hist(hmp.direct,60,col=NULL,prob=TRUE,add=TRUE) curve(dharmonicmeanp(x,L),lwd=2,col="red3",add=TRUE) legend("topright",c("Direct simulation","Asymptotic simulation","Asymptotic density"), fill=c("green3","yellow2",NA),col=c(NA,NA,"red3"),lwd=c(NA,NA,2),bty="n",border=c(1,1,NA))
The harmonic mean p-value (HMP) test combines p-values and corrects for multiple testing while controlling the strong-sense family-wise error rate. It is more powerful than common alternatives including Bonferroni and Simes procedures when combining large proportions of all the p-values, at the cost of slightly lower power when combining small proportions of all the p-values. It is more stringent than controlling the false discovery rate, and possesses theoretical robustness to positive correlations between tests and unequal weights. It is a multi-level test in the sense that a superset of one or more significant tests is certain to be significant and conversely when the superset is non-significant, the constituent tests are certain to be non-significant. It is based on MAMML (model averaging by mean maximum likelihood), a frequentist analogue to Bayesian model averaging, and is theoretically grounded in generalized central limit theorem.
Package: | harmonicmeanp |
Type: | Package |
Version: | 3.0 |
Date: | 2019-08-17 |
License: | GPL-3 |
The key function is p.hmp for combining p-values using the HMP. Type vignette("harmonicmeanp") for detailed examples.
Daniel J. Wilson
Maintainer: Daniel Wilson <[email protected]>
Daniel J. Wilson (2019) The harmonic mean p-value for combining dependent tests. Proceedings of the National Academy of Sciences USA 116: 1195-1200.
Package FMStable
# For detailed examples type vignette("harmonicmeanp") # Example: simulate from a non-uniform distribution mildly enriched for small \emph{p}-values. # Compare the significance of the combined p-value for Bonferroni, Benjamini-Hochberg (i.e. Simes), # HMP and (equivalently) MAMML with 2 degrees of freedom. p = rbeta(1000,1/1.5,1) min(p.adjust(p,"bonferroni")) min(p.adjust(p,"BH")) p.hmp(p,L=1000) p.mamml(1/p,2,L=1000)
# For detailed examples type vignette("harmonicmeanp") # Example: simulate from a non-uniform distribution mildly enriched for small \emph{p}-values. # Compare the significance of the combined p-value for Bonferroni, Benjamini-Hochberg (i.e. Simes), # HMP and (equivalently) MAMML with 2 degrees of freedom. p = rbeta(1000,1/1.5,1) min(p.adjust(p,"bonferroni")) min(p.adjust(p,"BH")) p.hmp(p,L=1000) p.mamml(1/p,2,L=1000)
Density, distribution function, quantile function and random number generation for the Landau distribution with location parameter mu and scale parameter sigma.
dLandau(x, mu=log(pi/2), sigma=pi/2, log=FALSE) pLandau(x, mu=log(pi/2), sigma=pi/2, log=FALSE, lower.tail=TRUE) qLandau(p, mu=log(pi/2), sigma=pi/2, log=FALSE, lower.tail=TRUE) rLandau(n, mu=log(pi/2), sigma=pi/2)
dLandau(x, mu=log(pi/2), sigma=pi/2, log=FALSE) pLandau(x, mu=log(pi/2), sigma=pi/2, log=FALSE, lower.tail=TRUE) qLandau(p, mu=log(pi/2), sigma=pi/2, log=FALSE, lower.tail=TRUE) rLandau(n, mu=log(pi/2), sigma=pi/2)
x |
The value or vector of values of the Landau-distributed random variable. |
mu |
The location parameter of the Landau distribution. Defaults to log(pi/2) to give Landau's original distribution. |
sigma |
The scale parameter of the Landau distribution. Defaults to pi/2 to give Landau's original distribution. |
log |
If true the log probability is output. |
lower.tail |
If true (the default) the lower tail probability is returned. Otherwise the upper tail probability. |
p |
The value or vector of values, between 0 and 1, of the probability specifying the quantile for which to return the Landau random variable |
n |
The number of values to simulate. |
The density of the Landau distribution can be written
dLandau
produces the density, pLandau
the tail probability, qLandau
the quantile and rLandau
random variates for the Landau distribution.
Daniel J. Wilson
Landau LD (1944) On the energy loss of fast particles by ionization. J Phys USSR 8:201-205.
Daniel J. Wilson (2019) The harmonic mean p-value for combining dependent tests. Proceedings of the National Academy of Sciences USA 116: 1195-1200.
p.hmp
# For detailed examples type vignette("harmonicmeanp") # Example: simulate from a non-uniform distribution mildly enriched for small \emph{p}-values. # Compare the significance of the combined p-value for Bonferroni, Benjamini-Hochberg (i.e. Simes), # HMP and (equivalently) MAMML with 2 degrees of freedom. L = 1000 p = rbeta(L,1/1.5,1) min(p.adjust(p,"bonferroni")) min(p.adjust(p,"BH")) x = hmp.stat(p) pLandau(1/x,log(length(p))+(1 + digamma(1) - log(2/pi)),pi/2,lower.tail=FALSE) p.hmp(p,L=L) p.mamml(1/p,2,L=L)
# For detailed examples type vignette("harmonicmeanp") # Example: simulate from a non-uniform distribution mildly enriched for small \emph{p}-values. # Compare the significance of the combined p-value for Bonferroni, Benjamini-Hochberg (i.e. Simes), # HMP and (equivalently) MAMML with 2 degrees of freedom. L = 1000 p = rbeta(L,1/1.5,1) min(p.adjust(p,"bonferroni")) min(p.adjust(p,"BH")) x = hmp.stat(p) pLandau(1/x,log(length(p))+(1 + digamma(1) - log(2/pi)),pi/2,lower.tail=FALSE) p.hmp(p,L=L) p.mamml(1/p,2,L=L)
Density, distribution function, quantile function and random number generation for the mean of L maximized likelihood ratios under their null hypotheses, i.e. the mean of L Pareto(1,1) variables, assuming L is large.
dmamml(x, L, df, log=FALSE) pmamml(x, L, df, log=FALSE, lower.tail=TRUE) qmamml(p, L, df, log=FALSE, lower.tail=TRUE, xmin=1+1e-12, xmax=1e12) rmamml(n, L, df)
dmamml(x, L, df, log=FALSE) pmamml(x, L, df, log=FALSE, lower.tail=TRUE) qmamml(p, L, df, log=FALSE, lower.tail=TRUE, xmin=1+1e-12, xmax=1e12) rmamml(n, L, df)
x |
The value or vector of values of the mean maximized likelihood ratios, for example calculated from data using function |
L |
The number of constituent likelihood ratios used in calculating each value of x. |
df |
Degrees of freedom of the constituent likelihood ratios, assumed all equal for each value of x. |
log |
If true the log probability is output. |
lower.tail |
If true (the default) the lower tail probability is returned. Otherwise the upper tail probability. |
p |
The value or vector of values, between 0 and 1, of the probability specifying the quantile for which to return the mean maximized likelihood ratio. |
xmin , xmax
|
The range of values of the mean maximized likelihood ratio over which to search for the quantile specified by |
n |
The number of values to simulate. |
dmamml
produces the density, pmamml
the tail probability, qmamml
the quantile and rmamml
random variates for the model-averaged mean maximized likelihood ratios when their null hypotheses are true.
Daniel J. Wilson
Daniel J. Wilson (2019) The harmonic mean p-value for combining dependent tests. Proceedings of the National Academy of Sciences USA 116: 1195-1200.
p.hmp
# For a detailed tutorial type vignette("harmonicmeanp") # Example: simulate from a gamma distribution mildly enriched for large likelihood ratios. # Compare the significance of the combined p-value for Bonferroni, Benjamini-Hochberg (i.e. Simes), # MAMML with 4 degrees of freedom. L = 100 df = 4 enrich = 1.5 y = exp(0.5*rgamma(L,shape=df/2,rate=1/2/enrich)) p = pchisq(2*log(y),df,lower.tail=FALSE) min(p.adjust(p,"bonferroni")) min(p.adjust(p,"BH")) (x = mamml.stat(y)) pmamml(x,L,df,lower.tail=FALSE) p.mamml(y,df,L=L) # Compare to the HMP - MAMML may act more conservatively because of adjustments for its # poorer asymptotic approximation x = hmp.stat(p) p.hmp(p,L=L) # Compute critical values for MAMML from asymptotic theory and compare to direct simulations L = 100 df = 1 alpha = 0.05 (mamml.crit = qmamml(1-alpha,L,df)) nsim = 10000 y.direct = matrix(exp(0.5*rchisq(L*nsim,df)),nsim,L) mamml.direct = apply(y.direct,1,mamml.stat) # mamml.crit may be more conservative than mamml.crit.sim because of adjustments for its # poorer asymptotic approximation (mamml.crit.sim = quantile(mamml.direct,1-alpha)) # Compare MAMML simulated directly, and via the asymptotic distribution, to the asymptotic density # Works best for df = 2 L = 30 df = 3 nsim = 1000 y.direct = matrix(exp(0.5*rchisq(L*nsim,df)),nsim,L) mamml.direct = apply(y.direct,1,mamml.stat) xmax = quantile(mamml.direct,.95) h = hist(mamml.direct,c(-Inf,seq(0,xmax,len=60),Inf),col="green3",prob=TRUE, main="Distributions of MAMML",xlim=c(1,xmax)) # Slow because rmamml calls qmamml which uses numerical root finding # mamml.asympt = rmamml(nsim,L,df) # hist(mamml.asympt,c(-Inf,seq(0,xmax,len=60),Inf),col="yellow2",prob=TRUE,add=TRUE) hist(mamml.direct,c(-Inf,seq(0,xmax,len=60),Inf),col=NULL,prob=TRUE,add=TRUE) curve(dmamml(x,L,df),lwd=2,col="red3",add=TRUE) legend("topright",c("Direct simulation","Asymptotic density"), fill=c("green3",NA),col=c(NA,"red3"),lwd=c(NA,2),bty="n",border=c(1,NA))
# For a detailed tutorial type vignette("harmonicmeanp") # Example: simulate from a gamma distribution mildly enriched for large likelihood ratios. # Compare the significance of the combined p-value for Bonferroni, Benjamini-Hochberg (i.e. Simes), # MAMML with 4 degrees of freedom. L = 100 df = 4 enrich = 1.5 y = exp(0.5*rgamma(L,shape=df/2,rate=1/2/enrich)) p = pchisq(2*log(y),df,lower.tail=FALSE) min(p.adjust(p,"bonferroni")) min(p.adjust(p,"BH")) (x = mamml.stat(y)) pmamml(x,L,df,lower.tail=FALSE) p.mamml(y,df,L=L) # Compare to the HMP - MAMML may act more conservatively because of adjustments for its # poorer asymptotic approximation x = hmp.stat(p) p.hmp(p,L=L) # Compute critical values for MAMML from asymptotic theory and compare to direct simulations L = 100 df = 1 alpha = 0.05 (mamml.crit = qmamml(1-alpha,L,df)) nsim = 10000 y.direct = matrix(exp(0.5*rchisq(L*nsim,df)),nsim,L) mamml.direct = apply(y.direct,1,mamml.stat) # mamml.crit may be more conservative than mamml.crit.sim because of adjustments for its # poorer asymptotic approximation (mamml.crit.sim = quantile(mamml.direct,1-alpha)) # Compare MAMML simulated directly, and via the asymptotic distribution, to the asymptotic density # Works best for df = 2 L = 30 df = 3 nsim = 1000 y.direct = matrix(exp(0.5*rchisq(L*nsim,df)),nsim,L) mamml.direct = apply(y.direct,1,mamml.stat) xmax = quantile(mamml.direct,.95) h = hist(mamml.direct,c(-Inf,seq(0,xmax,len=60),Inf),col="green3",prob=TRUE, main="Distributions of MAMML",xlim=c(1,xmax)) # Slow because rmamml calls qmamml which uses numerical root finding # mamml.asympt = rmamml(nsim,L,df) # hist(mamml.asympt,c(-Inf,seq(0,xmax,len=60),Inf),col="yellow2",prob=TRUE,add=TRUE) hist(mamml.direct,c(-Inf,seq(0,xmax,len=60),Inf),col=NULL,prob=TRUE,add=TRUE) curve(dmamml(x,L,df),lwd=2,col="red3",add=TRUE) legend("topright",c("Direct simulation","Asymptotic density"), fill=c("green3",NA),col=c(NA,"red3"),lwd=c(NA,2),bty="n",border=c(1,NA))
The model averaging by mean maximum likelihood (MAMML) test combines likelihood ratio tests and corrects for multiple testing while controlling the weak-sense family-wise error rate in a way that is more powerful than common alternatives including Bonferroni and Simes procedures and possesses theoretical robustness to positive correlations between tests and unequal weights. It is a frequentist analogue to Bayesian model averaging, is theoretically grounded in generalized central limit theorem, and motivates the simpler and better-calibrated harmonic mean p-value (HMP) test. The model-averaged mean maximized likelihood (MAMML) is defined as the (possibly weighted) arithmetic mean of the maximized likelihood ratios from a series of likelihood ratio tests comparing mutually exclusive alternative hypotheses with the same nested null hypothesis based on the exact same data.
p.mamml(R, nu, w = NULL, L = NULL)
p.mamml(R, nu, w = NULL, L = NULL)
R |
A numeric vector of one or more maximized likelihood ratios. Missing values (NAs) will cause a missing value to be returned. |
nu |
A numeric scalar or vector for the degrees of freedom corresponding to all or each of the maximized likelihood ratios respectively. |
w |
An optional numeric vector of weights that can be interpreted as prior model probabilities for each of the alternative hypotheses represented by the individual p-values. The sum of the weights cannot exceed one but may be less than one, which is interpreted as meaning that some p-values have been excluded. |
L |
The number of constituent maximized likelihood ratios. If ignored, it defaults to the length of argument |
The model-averaged mean maximized likelihood ratio is returned.
Daniel J. Wilson
Daniel J. Wilson (2019) The harmonic mean p-value for combining dependent tests. Proceedings of the National Academy of Sciences USA 116: 1195-1200.
mamml.stat, hmp.stat, p.hmp
# For detailed examples type vignette("harmonicmeanp") nu = 3 R = exp(0.5*rchisq(1000,nu)) mamml.stat(R) p.mamml(R,nu,L=1000)
# For detailed examples type vignette("harmonicmeanp") nu = 3 R = exp(0.5*rchisq(1000,nu)) mamml.stat(R) p.mamml(R,nu,L=1000)