Title: | True Discovery Guarantee by Combining Partial Closed Testings |
---|---|
Description: | Closed testing has been proved powerful for true discovery guarantee. The computation of closed testing is, however, quite burdensome. A general way to reduce the computational complexity is to combine partial closed testing procedures for some prespecified feature sets of interest. The partial closed testing procedures are performed at a Bonferroni-corrected alpha level, which guarantees that the lower bounds for the number of true discoveries in the prespecified sets are simultaneously valid. For any set of interest chosen post hoc, the coherence property is used to obtain the lower bound. This package implements closed testing with globaltest to calculate the lower bound for the number of true discoveries; see Ningning Xu et al. (2021) <arXiv:2001.01541> for a detailed description. |
Authors: | Ningning Xu |
Maintainer: | Ningning Xu <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.1 |
Built: | 2024-11-28 06:28:21 UTC |
Source: | CRAN |
The DESCRIPTION file:
Package: | newFocus |
Type: | Package |
Title: | True Discovery Guarantee by Combining Partial Closed Testings |
Version: | 1.1 |
Date: | 2021-06-22 |
Author: | Ningning Xu |
Maintainer: | Ningning Xu <[email protected]> |
Description: | Closed testing has been proved powerful for true discovery guarantee. The computation of closed testing is, however, quite burdensome. A general way to reduce the computational complexity is to combine partial closed testing procedures for some prespecified feature sets of interest. The partial closed testing procedures are performed at a Bonferroni-corrected alpha level, which guarantees that the lower bounds for the number of true discoveries in the prespecified sets are simultaneously valid. For any set of interest chosen post hoc, the coherence property is used to obtain the lower bound. This package implements closed testing with globaltest to calculate the lower bound for the number of true discoveries; see Ningning Xu et al. (2021) <arXiv:2001.01541> for a detailed description. |
License: | GPL (>= 2) |
Depends: | ctgt |
NeedsCompilation: | no |
Packaged: | 2021-07-05 15:18:45 UTC; nxu |
Repository: | CRAN |
Date/Publication: | 2021-07-05 15:50:06 UTC |
Index of help topics:
choosepath | A set of focus set index |
ctbab | Closed testing with branch and bound |
discov | True discoveries |
newFocus | The new focus level procedure |
newFocus-package | True Discovery Guarantee by Combining Partial Closed Testings |
pick | True discoveries for non-focus level node |
For the GO (Gene Ontology) terms chosen as focus level nodes, the newFocus function returns the minimum number of true discoveries. For GO terms that are non-focus level nodes, pick is used to count the number of true discoveries based on the result of newFocus.
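The step from focus level bounds to a bound for an arbitrary set can be illustrated with a simplified sketch. It assumes the focus sets are pairwise disjoint and that their lower bounds hold simultaneously: if a focus set F contains at least d true discoveries, then at least d - |F \ S| of them must lie in S. The helper name `coherence_bound` and the bound values used below are hypothetical, and this is not claimed to be how pick is implemented.

```r
# Hypothetical helper (not part of newFocus): a simple lower bound for a
# post hoc set, derived from lower bounds on pairwise disjoint focus sets.
coherence_bound <- function(fsets, lowdv, setofinterest) {
  stopifnot(length(fsets) == length(lowdv))
  # The summation below is only valid when no feature is in two focus sets.
  stopifnot(anyDuplicated(unlist(fsets)) == 0)
  contrib <- vapply(seq_along(fsets), function(i) {
    # Features of focus set i that fall outside the set of interest.
    outside <- length(setdiff(fsets[[i]], setofinterest))
    # At most `outside` of the guaranteed discoveries can be lost.
    max(0, lowdv[i] - outside)
  }, numeric(1))
  sum(contrib)
}

fl <- list("12" = c("x1", "x2"), "34" = c("x3", "x4"), "5" = "x5")
lowdv <- c(2, 1, 0)  # made-up lower bounds, for illustration only
coherence_bound(fl, lowdv, c("x1", "x2", "x3"))  # 2 + max(0, 1 - 1) + 0 = 2
```

The sketch ignores any information beyond the focus level bounds, so it can be more conservative than a procedure that reruns closed testing for the set of interest.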
Ningning Xu
Maintainer: Ningning Xu <[email protected]>
Ningning Xu, Aldo Solari, Jelle Goeman, Closed testing with global test, with applications on metabolomics data, arXiv:2001.01541, https://arxiv.org/abs/2001.01541
Jelle J. Goeman, Sara A. van de Geer, Floor de Kort, Hans C. van Houwelingen, A global test for groups of genes: testing association with a clinical outcome, Bioinformatics, Volume 20, Issue 1, 1 January 2004, Pages 93-99, https://doi.org/10.1093/bioinformatics/btg382
The function finds the index of the focus set with the largest number of true discoveries, together with the indices of all other focus sets that are disjoint from it.
choosepath(startingindex = 1, fsets, lowdv)
startingindex |
The index of the first focus set that has the largest number of true discoveries |
fsets |
A list of focus level gene sets, or GO (Gene Ontology) terms |
lowdv |
A non-negative integer vector giving the number of true discoveries for each focus level set; its length equals the length of the list of focus level sets |
The function will return an integer or a numeric vector.
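The selection described above can be sketched in a few lines of base R. This is a minimal illustration written from the description only: the function name `choose_disjoint` is hypothetical, it ignores the `startingindex` argument, and it is not the package's own implementation of choosepath.

```r
# Hypothetical re-creation of the idea behind choosepath (an assumption,
# not the package code): pick the focus set with the largest lower bound,
# then collect every other focus set that shares no feature with it.
choose_disjoint <- function(fsets, lowdv) {
  stopifnot(length(fsets) == length(lowdv), length(fsets) > 0)
  start <- unname(which.max(lowdv))  # first index attaining the maximum
  disjoint <- vapply(
    fsets,
    function(fs) length(intersect(fs, fsets[[start]])) == 0,
    logical(1)
  )
  c(start, unname(which(disjoint)))
}

fsets <- list(c("a", "b"), c("b", "c"), "d", c("a", "e"))
lowdv <- c(1, 3, 2, 0)
choose_disjoint(fsets, lowdv)  # 2 3 4
```

Note that the sets returned after the first index are disjoint from the starting set but, in this sketch, not necessarily from each other.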
Ningning Xu
Closed testing with branch and bound algorithm specifically for globaltest
ctbab(y, Cm, Tm, upnode, level, lownode, tmin, ctrue, lf, ls, alpha, count = 0, maxIt = 0)
y |
The response variable |
Cm |
The matrix for calculating critical values of globaltest |
Tm |
The matrix for calculating test statistics of globaltest |
upnode |
The upper node that is used to bound critical values |
level |
The level of the GO term of interest |
lownode |
The lower node that is used to bound critical values |
tmin |
The minimum test statistic |
ctrue |
The true critical value corresponding to the minimum test statistic |
lf |
The lambda vector corresponding to the upper node |
ls |
The lambda vector corresponding to the lower node |
alpha |
The significance level |
count |
An integer that stores the number of branch and bound repetitions, i.e. how many times branch and bound has been performed |
maxIt |
The maximal number of repetitions prespecified by user |
It will return the rejection indicator obtained by closed testing with the branch and bound algorithm.
Ningning Xu
Xu, N., & Goeman, J. (2020). Closed testing with Globaltest with applications on metabolomics data. arXiv preprint arXiv:2001.01541.
True discoveries calculated by the partial closed testing
discov(response, alternative, null, data, maxit = 0, alpha)
response |
The response variable |
alternative |
The alternative hypothesis, which is a character vector, i.e. a set of genes |
null |
The null hypothesis |
data |
A data frame with response and all covariates included |
maxit |
The maximal number of repetitions prespecified by user |
alpha |
The significance level |
It will return a non-negative integer: the lower bound for the number of true discoveries of the alternative gene set.
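To make concrete what a closed testing lower bound is, and why its computation grows quickly with the number of features, the following brute-force sketch enumerates all 2^m - 1 intersection hypotheses for a small m. It swaps in a Bonferroni local test on p-values purely for illustration; the package itself uses globaltest with a branch and bound shortcut. The function name `closed_testing_bound` and the p-values are hypothetical.

```r
# Brute-force closed testing for a handful of hypotheses.
# Local test: Bonferroni (an assumption made for this sketch only).
closed_testing_bound <- function(p, set, alpha = 0.05) {
  m <- length(p)
  # All non-empty subsets of 1..m as logical rows; the first row of the
  # grid is the empty set and is dropped.
  subsets <- as.matrix(expand.grid(rep(list(c(FALSE, TRUE)), m)))
  subsets <- subsets[-1, , drop = FALSE]
  # Local Bonferroni test of each intersection hypothesis.
  local_rej <- apply(subsets, 1, function(s) min(p[s]) <= alpha / sum(s))
  # Closed testing rejects I only if every superset of I is locally rejected.
  ct_rej <- vapply(seq_len(nrow(subsets)), function(i) {
    supers <- apply(subsets, 1, function(s) all(s[subsets[i, ]]))
    all(local_rej[supers])
  }, logical(1))
  # Largest subset of `set` that closed testing fails to reject.
  inside <- apply(subsets, 1, function(s) all(which(s) %in% set))
  sizes <- rowSums(subsets)[inside & !ct_rej]
  length(set) - max(c(0, sizes))
}

p <- c(0.001, 0.002, 0.6, 0.8)  # made-up p-values
closed_testing_bound(p, set = 1:2)  # 2
closed_testing_bound(p, set = 1:4)  # 2
closed_testing_bound(p, set = 3:4)  # 0
```

The nested enumeration is quadratic in the number of subsets, so it is only usable for very small m; this is the burden that partial closed testing on prespecified sets is meant to reduce.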
Ningning Xu
The new focus level procedure for calculating true discoveries for focus level nodes
newFocus(response, fsets, null, data, maxit = 0, alpha = 0.05, adj = 0)
response |
The response variable |
fsets |
A list of focus level sets |
null |
The null hypothesis |
data |
The data frame with response and all covariates included |
maxit |
The maximal number of repetitions prespecified by user |
alpha |
The significance level |
adj |
The number of focus sets that are fully rejected by partial closed testing, which is used to adjust the number of focus sets. The default value is 0. |
The function will return a focus subject with the lower bound for each focus level node.
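The Bonferroni correction applied to the partial closed testing procedures can be illustrated with a small sketch. The helper name `bonferroni_level` is hypothetical, and the way `adj` enters the calculation (reducing the number of focus sets in the denominator) is an assumption read from the argument description, not confirmed package behaviour.

```r
# Hypothetical helper (not part of newFocus): per-set significance level
# when alpha is split evenly over the focus sets still under test.
bonferroni_level <- function(alpha, n_focus, adj = 0) {
  stopifnot(alpha > 0, alpha < 1, adj >= 0, n_focus > adj)
  # Assumption: fully rejected focus sets no longer count in the correction.
  alpha / (n_focus - adj)
}

bonferroni_level(0.05, n_focus = 3)           # 0.05 / 3
bonferroni_level(0.05, n_focus = 3, adj = 1)  # 0.05 / 2
```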
Ningning Xu
Goeman, J. J., & Mansmann, U. (2008). Multiple testing on the directed acyclic graph of gene ontology. Bioinformatics, 24(4), 537-544.
## example data set
n = 100
m = 5
X = matrix(0, n, m, byrow = TRUE)
for (i in 1:n) {
  set.seed(1234 + i)
  X[i, ] = as.vector(arima.sim(model = list(order = c(1, 0, 0), ar = 0.2), n = m))
}
y = rbinom(n, 1, 0.6)
X[which(y == 1), 1:3] = X[which(y == 1), 1:3] + 0.8
xs = paste("x", seq(1, m, 1), sep = "")
colnames(X) = xs
mydata = as.data.frame(cbind(X, y))

## focus level sets
fl = list(c("x1", "x2"), c("x3", "x4"), "x5")
names(fl) = c("12", "34", "5")

## get td for focus level sets
focus_subject = newFocus(response = y, fsets = fl, data = mydata)

## get td for any set of interest given the focus subject
setofinterest = c("x1", "x2", "x3", "x4")
pick(focus_subject, setofinterest)
The number of true discoveries for the non-focus level GO terms is calculated given the focus subject.
pick(focus_obj, setofinterest)
focus_obj |
The focus subject from function newFocus |
setofinterest |
A gene set or GO term of interest |
It will return an integer: the lower bound for the number of true discoveries in the set of interest.
Ningning Xu
## example data set
n = 100
m = 5
X = matrix(0, n, m, byrow = TRUE)
for (i in 1:n) {
  set.seed(1234 + i)
  X[i, ] = as.vector(arima.sim(model = list(order = c(1, 0, 0), ar = 0.2), n = m))
}
y = rbinom(n, 1, 0.6)
X[which(y == 1), 1:3] = X[which(y == 1), 1:3] + 0.8
xs = paste("x", seq(1, m, 1), sep = "")
colnames(X) = xs
mydata = as.data.frame(cbind(X, y))

## focus level sets
fl = list(c("x1", "x2"), c("x3", "x4"), "x5")
names(fl) = c("12", "34", "5")

## get td for focus level sets
focus_subject = newFocus(response = y, fsets = fl, data = mydata)

## get td for any set of interest given the focus subject
setofinterest = c("x1", "x2", "x3", "x4")
pick(focus_subject, setofinterest)