Title: | Estimates and Bounds for the False Discovery Proportion, by Permutation |
---|---|
Description: | For multiple testing. Computes estimates and confidence bounds for the False Discovery Proportion (FDP), the fraction of false positives among all rejected hypotheses. The methods in the package use permutations of the data. Doing so, they take into account the dependence structure in the data. |
Authors: | Jesse Hemerik and Jelle Goeman |
Maintainer: | Jesse Hemerik <[email protected]> |
License: | GNU General Public License |
Version: | 0.2 |
Built: | 2024-11-19 06:40:09 UTC |
Source: | CRAN |
Computes a confidence upper bound for the False Discovery Proportion (FDP). The input required is a matrix containing test statistics (or p-values) for (randomly) permuted versions of the data.
confSAM(p, PM, includes.id=TRUE, cutoff=0.01, reject="small", alpha=0.05, method="simple", ncombs=1000)
confSAM(p, PM, includes.id=TRUE, cutoff=0.01, reject="small", alpha=0.05, method="simple", ncombs=1000)
p |
A vector containing the p-values for the original (unpermuted) data. |
PM |
A matrix (with |
includes.id |
Set this to |
cutoff |
A number or a vector of length |
reject |
If |
alpha |
1-alpha is the desired confidence level of the bounds. |
method |
If |
ncombs |
Only applies when |
A vector with three values is returned. The first value is the number if rejections. The second value is a basic median unbiased estimate of the number of false positives V. This estimate coincides with the simple upper bound for alpha=0.5. The third value is a (1-alpha)-confidence upper bound for V (it depends on the argument method
which bound this is.)
#This is a fast example. It is recommended to take w and ncombs larger in practice. set.seed(423) m <- 100 #number of hypotheses n <- 10 #the amount of subjects is 2n (n cases, n controls). w <- 50 #number of random permutations. Here we take w small for computational speed X <- matrix(rnorm((2*n)*m), 2*n, m) X[1:n,1:50] <- X[1:n,1:50]+1.5 # the first 50 hypotheses are false #(increased mean for the first n individuals). y <- c(numeric(n)+1,numeric(n)-1) Y <- t(replicate(w, sample(y, size=2*n, replace=FALSE))) Y[1,] <- y #add identity permutation pvalues <- matrix(nrow=w,ncol=m) for(j in 1:w){ for(i in 1:m){ pvalues[j,i] <- t.test( X[Y[j,]==1,i], X[Y[j,]==-1,i] , alternative="two.sided" )$p.value } } ## number of rejections: confSAM(p=pvalues[1,], PM=pvalues, cutoff=0.05, alpha=0.1, method="simple")[1] ## basic median unbiased estimate of #false positives: confSAM(p=pvalues[1,], PM=pvalues, cutoff=0.05, alpha=0.1, method="simple")[2] ## basic (1-alpha)-upper bound for #false positives: confSAM(p=pvalues[1,], PM=pvalues, cutoff=0.05, alpha=0.1, method="simple")[3] ## potentially smaller (1-alpha)-upper bound for #false positives: ## (taking 'ncombs' much larger recommended) confSAM(p=pvalues[1,], PM=pvalues, cutoff=0.05, alpha=0.1, method="approx", ncombs=50)[3] ## actual number of false positives: sum(pvalues[1,51:100]<0.05)
#This is a fast example. It is recommended to take w and ncombs larger in practice. set.seed(423) m <- 100 #number of hypotheses n <- 10 #the amount of subjects is 2n (n cases, n controls). w <- 50 #number of random permutations. Here we take w small for computational speed X <- matrix(rnorm((2*n)*m), 2*n, m) X[1:n,1:50] <- X[1:n,1:50]+1.5 # the first 50 hypotheses are false #(increased mean for the first n individuals). y <- c(numeric(n)+1,numeric(n)-1) Y <- t(replicate(w, sample(y, size=2*n, replace=FALSE))) Y[1,] <- y #add identity permutation pvalues <- matrix(nrow=w,ncol=m) for(j in 1:w){ for(i in 1:m){ pvalues[j,i] <- t.test( X[Y[j,]==1,i], X[Y[j,]==-1,i] , alternative="two.sided" )$p.value } } ## number of rejections: confSAM(p=pvalues[1,], PM=pvalues, cutoff=0.05, alpha=0.1, method="simple")[1] ## basic median unbiased estimate of #false positives: confSAM(p=pvalues[1,], PM=pvalues, cutoff=0.05, alpha=0.1, method="simple")[2] ## basic (1-alpha)-upper bound for #false positives: confSAM(p=pvalues[1,], PM=pvalues, cutoff=0.05, alpha=0.1, method="simple")[3] ## potentially smaller (1-alpha)-upper bound for #false positives: ## (taking 'ncombs' much larger recommended) confSAM(p=pvalues[1,], PM=pvalues, cutoff=0.05, alpha=0.1, method="approx", ncombs=50)[3] ## actual number of false positives: sum(pvalues[1,51:100]<0.05)