Package 'jointest' reference manual

Title:	Multivariate Testing Through Joint Resampling-Based Tests
Description:	Runs resampling-based tests jointly, e.g., sign-flip score tests from Hemerik et al., (2020) <doi:10.1111/rssb.12369>, to allow for multivariate testing, i.e., weak and strong control of the Familywise Error Rate or True Discovery Proportion.
Authors:	Livio Finos [cre, aut] , Angela Andreella [aut], Filippo Gambarota [ctb]
Maintainer:	Livio Finos <[email protected]>
License:	GPL-2
Version:	1.0
Built:	2025-02-14 07:01:11 UTC
Source:	CRAN

Nonparametric combination of `jointest` objects

Description

Methods for combining jointest objects.

combine combines the tests derived from multiverse models.

combine_contrasts combines the tests derived from the contrasts of a factor variable to get a global test for the factor (i.e. categorical predictor). It has strong analogies with ANOVA test.

Usage

combine(mods, comb_funct = "maxT", by = NULL, by_list=NULL, tail = 0)

combine_contrasts(mods, comb_funct = "Mahalanobis", tail = 0)
combine(mods, comb_funct = "maxT", by = NULL, by_list=NULL, tail = 0)

combine_contrasts(mods, comb_funct = "Mahalanobis", tail = 0)

Arguments

`mods`	a `jointest` object.
`comb_funct`	combining function to be used. Several functions are implemented: "mean", "median", "Fisher", "Liptak", (equal to) "Stoufer", "Tippet", (equal to) "minp", "maxT", "Mahalanobis". Alternatively it can be a custom function that has a Tspace matrix as input. For `combine` the default is `comb_funct="maxT"`, while for `combines_contrasts` the default is `comb_funct="Mahalanobis"`.
`by`	if `NULL` (default), it combines all test statistics. If a characters, it refers to the column's name of `summary_table` (and printed by something like `summary(mods)`). The elements with the same value will be combined. If `by` is a vector, the values are defined by row-wise concatenation of the values of the columns in `by`. The argument is inactive if `by_list` is not `NULL`.
`by_list`	NULL (default) or a list of vectors. For each vector of the list it combines test statistics with position given by the element of the vector. If the vectors in the list are characters, these refer to names(mods$Tspace).
`tail`	direction of the alternative hypothesis. It can be "two.sided" (or 0, the default), "less" (or -1) or "greater" (or +1).

Value

The function returns a jointest-object.

Examples

#First example
library(jointest)
set.seed(123)

#Simulate data
n=20
D=data.frame(X=rnorm(n),Z1=rnorm(n),Z2=rnorm(n))
D$Y=D$Z1+D$X+rnorm(n)

# Run four glms abd combine it in a list
mod1=glm(Y~X+Z1+Z2,data=D)
mod2=glm(Y~X+poly(Z1,2)+Z2,data=D)
mod3=glm(Y~X+poly(Z1,2)+poly(Z2,2),data=D)
mod4=glm(Y~X+Z1+poly(Z2,2),data=D)
mods=list(mod1=mod1,mod2=mod2,mod3=mod3,mod4=mod4)

# Let us analyze the tests related to coefficient "X" and combine them
res=join_flipscores(mods,n_flips = 5000, seed = 1, tested_coeffs = "X")
summary(combine(res))
# Second (continued) example
# flipscores jointly on all models and all coefficients
res=join_flipscores(mods,n_flips = 2000)
summary(combine(res))
summary(combine(res, by="Model"))
summary(combine(res, by="Coeff"))
res2=combine_contrasts(res)
summary(res2)
#custom combinations:
coeffs=c("(Intercept)","X","Z1","Z2")
coeffs_ids=lapply(coeffs,grep,res2$summary_table$Coeff)
names(coeffs_ids)=coeffs
summary(combine(res2,by_list =   coeffs_ids))
#First example
library(jointest)
set.seed(123)

#Simulate data
n=20
D=data.frame(X=rnorm(n),Z1=rnorm(n),Z2=rnorm(n))
D$Y=D$Z1+D$X+rnorm(n)

# Run four glms abd combine it in a list
mod1=glm(Y~X+Z1+Z2,data=D)
mod2=glm(Y~X+poly(Z1,2)+Z2,data=D)
mod3=glm(Y~X+poly(Z1,2)+poly(Z2,2),data=D)
mod4=glm(Y~X+Z1+poly(Z2,2),data=D)
mods=list(mod1=mod1,mod2=mod2,mod3=mod3,mod4=mod4)

# Let us analyze the tests related to coefficient "X" and combine them
res=join_flipscores(mods,n_flips = 5000, seed = 1, tested_coeffs = "X")
summary(combine(res))
# Second (continued) example
# flipscores jointly on all models and all coefficients
res=join_flipscores(mods,n_flips = 2000)
summary(combine(res))
summary(combine(res, by="Model"))
summary(combine(res, by="Coeff"))
res2=combine_contrasts(res)
summary(res2)
#custom combinations:
coeffs=c("(Intercept)","X","Z1","Z2")
coeffs_ids=lapply(coeffs,grep,res2$summary_table$Coeff)
names(coeffs_ids)=coeffs
summary(combine(res2,by_list =   coeffs_ids))

flipscores 2-Stage Summary Statistics approach

Description

This function fits a model based on the provided formula and data, accounting for clusters and summary statistics within the model.

Usage

flip2sss(formula = NULL, data = NULL, cluster = NULL, 
family = "gaussian", summstats_within=NULL, ...)
flip2sss(formula = NULL, data = NULL, cluster = NULL, 
family = "gaussian", summstats_within=NULL, ...)

Arguments

`formula`	A formula or a list of formulas. It can be a complete model as.formula or a list of formulas, one for each element produced by the function.
`data`	The dataset to be used for fitting the model.
`cluster`	A vector or a formula evaluated on the data that defines the clusters.
`family`	as in `glm`, but given as a character. Not used if argument `summstats_within` is not `NULL`.
`summstats_within`	A vector of summary statistics model within the data or a function with argument data.
`...`	Other arguments passed to the `flipscores` function.

Value

A jointest object, i.e., a list containing the following objects:

Tspace: data.frame where rows represents the sign-flipping transformed (plus the identity one) test and columns the variables.
summary_table: data.frame containing for each second-step covariate the estimated parameter, score, std error, test , partial correlation and p-value.
mods: List of glm objects, i.e., first-step glm objects

Author(s)

Livio Finos, Angela Andreella

Examples

library(jointest)
set.seed(123)
# Simulate data
N=20
n=rpois(N,20)
reff=rep(rnorm(N),n)

D=data.frame(X1=rnorm(length(reff)),
             X2=rep(rnorm(N),n),
             Grp=factor(rep(rep(LETTERS[1:3],length.out=N),n)),
             Subj=rep(1:N,n))
D$Y=rbinom(n=nrow(D),prob=plogis( 2*D$X1 * (D$Grp=="B") +  2*D$X2+reff),size=1)

# model of interest formula <- Y ~ Grp * X1 + X2
# clusters structure defined by cluster <- factor(D$Subj)
# The 2-Stage Summary Statistics via flipscore: 
res <- flip2sss(Y ~ Grp * X1 + X2, data=D, 
               cluster=D$Subj, family="binomial")
summary(res)
# This is an ANOVA-like overall test:
summary(combine(res))
# This is an ANOVA-like test:
summary(combine_contrasts(res))

# An alternative and more flexible definition of the model:
# Define the summary statistics (here we propose the glm with firth correction 
# from the logistf package)
summstats_within <- 'logistf::logistf(Y ~ X1, family = binomial(link = "logit"),
control=logistf::logistf.control(maxit=100))'
# however also the classic glm function can be used:
#summstats_within <- 'glm(Y ~ X1, family = binomial(link = "logit"))'

# Then, compute the 2-Stage Summary Statistics approach
# specifying the summary statistics (within cluster/subject)
res <- flip2sss(Y ~ Grp * X1 + X2, data=D, cluster=D$Subj, 
                   summstats_within=summstats_within)
summary(res)

# We can also combine the tests:
# Overall:
summary(combine(res))
# This is similar to an ANOVA test:
summary(combine_contrasts(res))
library(jointest)
set.seed(123)
# Simulate data
N=20
n=rpois(N,20)
reff=rep(rnorm(N),n)

D=data.frame(X1=rnorm(length(reff)),
             X2=rep(rnorm(N),n),
             Grp=factor(rep(rep(LETTERS[1:3],length.out=N),n)),
             Subj=rep(1:N,n))
D$Y=rbinom(n=nrow(D),prob=plogis( 2*D$X1 * (D$Grp=="B") +  2*D$X2+reff),size=1)

# model of interest formula <- Y ~ Grp * X1 + X2
# clusters structure defined by cluster <- factor(D$Subj)
# The 2-Stage Summary Statistics via flipscore: 
res <- flip2sss(Y ~ Grp * X1 + X2, data=D, 
               cluster=D$Subj, family="binomial")
summary(res)
# This is an ANOVA-like overall test:
summary(combine(res))
# This is an ANOVA-like test:
summary(combine_contrasts(res))

# An alternative and more flexible definition of the model:
# Define the summary statistics (here we propose the glm with firth correction 
# from the logistf package)
summstats_within <- 'logistf::logistf(Y ~ X1, family = binomial(link = "logit"),
control=logistf::logistf.control(maxit=100))'
# however also the classic glm function can be used:
#summstats_within <- 'glm(Y ~ X1, family = binomial(link = "logit"))'

# Then, compute the 2-Stage Summary Statistics approach
# specifying the summary statistics (within cluster/subject)
res <- flip2sss(Y ~ Grp * X1 + X2, data=D, cluster=D$Subj, 
                   summstats_within=summstats_within)
summary(res)

# We can also combine the tests:
# Overall:
summary(combine(res))
# This is similar to an ANOVA test:
summary(combine_contrasts(res))

join tests from multiverse models

Description

The function allows hypothesis testing across all plausible multiverse models ensuring strong family-wise error rate control.

Usage

join_flipscores(mods, tested_coeffs = NULL, n_flips = 5000, 
score_type = "standardized", statistics = "t", seed=NULL, output_models =TRUE, ...)
join_flipscores(mods, tested_coeffs = NULL, n_flips = 5000, 
score_type = "standardized", statistics = "t", seed=NULL, output_models =TRUE, ...)

Arguments

`mods`	list of `glm`s or `flipscores`-object (or any other object that can be evaluated by `flipscores`)
`tested_coeffs`	list of the same length of `mods`, each element of the list being a vector of names of tested coefficients. Alternatively, it can be a vector of names of tested coefficients, in this case, the tested coefficients are attributed to all models (when present). As a last option, it can be `NULL`, if so, all coefficients are tested.
`n_flips`	number of flips, default 5000
`score_type`	any valid type for `flipscores`, `"standardized"` is the default. see `flipscores` for more datails
`statistics`	"t" is the only method implemented (yet). Any other value will not modify the score (a different statistic will only affect the multivariate inference, not the univariate one).
`seed`	`NULL` by default. Use a number if you wanto to ensure replicability of the results
`output_models`	`TRUE` by default. Should the `flipscores` model returned?
`...`	any other further parameter.

Value

A jointest object, i.e., a list containing the following objects:

Tspace: data.frame where rows represents the sign-flipping transformed (plus the identity one) test and columns the variables.
summary_table: data.frame containing for each model the estimated parameter(s), score(s), std error(s), test(s), partial correlation(s) and p-value(s).
mods: List of glms or flipscores objects.

Examples

library(jointest)
set.seed(123)


#EXAMPLE 1: Simulate data:
n=20
D=data.frame(X=rnorm(n),Z1=rnorm(n),Z2=rnorm(n))
D$Y=D$Z1+D$X+rnorm(n)

# Run four glms abd combine it in a list
mod1=glm(Y~X+Z1+Z2,data=D)
mod2=glm(Y~X+poly(Z1,2)+Z2,data=D)
mod3=glm(Y~X+poly(Z1,2)+poly(Z2,2),data=D)
mod4=glm(Y~X+Z1+poly(Z2,2),data=D)
mods=list(mod1=mod1,mod2=mod2,mod3=mod3,mod4=mod4)

# flipscores jointly on all models
res=join_flipscores(mods,n_flips = 1000)
summary(combine(res))
summary(combine(res, by="Model"))
summary(combine_contrasts(res))

#Simulate multivariate (50) bionomial responses 
set.seed(123)
n=30
D=data.frame(X=rnorm(n),Z=rnorm(n))
Y=replicate(50,rbinom(n,1,plogis(.5*D$Z+.5*D$X)))
colnames(Y)=paste0("Y",1:50)
D=cbind(D,Y)
mods=lapply(1:50,function(i)eval(parse(text=
paste(c("glm(formula(Y",i,"~X+Z),data=D,family='binomial')"),collapse=""))))
# flipscores jointly on all models
res=join_flipscores(mods,n_flips = 1000,tested_coeffs ="X")
summary(res)
res=p.adjust(res)
summary(res)
# Compute lower bound for the true discovery proportion. See packages pARI and sumSome
# install.packages("sumSome")
# install.packages("pARI")
# library(sumSome)
# library(pARI)
# pARI returns a lower bound equals 0.24, i.e., at least 24% of the models
# have a significant effect related to X
# pARI::pARI(ix = c(1:50),pvalues = t(jointest:::.t2p(res$Tspace)),family = "simes",delta = 9)$TDP
# sumSome returns a lower bound equals 0.42, i.e., at least 42% of the models
# have a significant effect related to X
# sumSome::tdp(sumSome::sumStats(G = as.matrix(res$Tspace)))
library(jointest)
set.seed(123)


#EXAMPLE 1: Simulate data:
n=20
D=data.frame(X=rnorm(n),Z1=rnorm(n),Z2=rnorm(n))
D$Y=D$Z1+D$X+rnorm(n)

# Run four glms abd combine it in a list
mod1=glm(Y~X+Z1+Z2,data=D)
mod2=glm(Y~X+poly(Z1,2)+Z2,data=D)
mod3=glm(Y~X+poly(Z1,2)+poly(Z2,2),data=D)
mod4=glm(Y~X+Z1+poly(Z2,2),data=D)
mods=list(mod1=mod1,mod2=mod2,mod3=mod3,mod4=mod4)

# flipscores jointly on all models
res=join_flipscores(mods,n_flips = 1000)
summary(combine(res))
summary(combine(res, by="Model"))
summary(combine_contrasts(res))

#Simulate multivariate (50) bionomial responses 
set.seed(123)
n=30
D=data.frame(X=rnorm(n),Z=rnorm(n))
Y=replicate(50,rbinom(n,1,plogis(.5*D$Z+.5*D$X)))
colnames(Y)=paste0("Y",1:50)
D=cbind(D,Y)
mods=lapply(1:50,function(i)eval(parse(text=
paste(c("glm(formula(Y",i,"~X+Z),data=D,family='binomial')"),collapse=""))))
# flipscores jointly on all models
res=join_flipscores(mods,n_flips = 1000,tested_coeffs ="X")
summary(res)
res=p.adjust(res)
summary(res)
# Compute lower bound for the true discovery proportion. See packages pARI and sumSome
# install.packages("sumSome")
# install.packages("pARI")
# library(sumSome)
# library(pARI)
# pARI returns a lower bound equals 0.24, i.e., at least 24% of the models
# have a significant effect related to X
# pARI::pARI(ix = c(1:50),pvalues = t(jointest:::.t2p(res$Tspace)),family = "simes",delta = 9)$TDP
# sumSome returns a lower bound equals 0.42, i.e., at least 42% of the models
# have a significant effect related to X
# sumSome::tdp(sumSome::sumStats(G = as.matrix(res$Tspace)))

Methods for `jointest` objects

Description

Methods to extract and manipulate relevant information from a jointest object.

print method for class "jointest".

summary method for class "jointest"

p.adjust method for class "jointest". Add adjusted p-values into the jointest object.

plot method for class "jointest" This plot function visualizes p-values from multiverse models, with different markers to indicate statistical significance levels as defined by the mark_signif argument (default is 0.05). Points are plotted with varying shapes based on whether the p-value is below the significance threshold, and colors are used to distinguish between different coefficients.

Usage

## S3 method for class 'jointest'
print(x, ...)

## S3 method for class 'jointest'
summary(object, ...)

p.adjust(object, method = "maxT", tail = 0, ...)

## S3 method for class 'jointest'
plot(x, ...)
## S3 method for class 'jointest'
print(x, ...)

## S3 method for class 'jointest'
summary(object, ...)

p.adjust(object, method = "maxT", tail = 0, ...)

## S3 method for class 'jointest'
plot(x, ...)

Arguments

`x`	an object of class `jointest`.
`...`	additional arguments to be passed, i.e., `mark_signif` and `p.values=c("raw","adjusted")`. See details.
`object`	an object of class `jointest`.
`method`	any method implemented in `flip::flip.adjust` or a custom function. In the last case it must be a function that uses a matrix as input and returns a vector of adjusted p-values equal to the number of columns of the inputed matrix.
`tail`	argument: expresses the tail direction of the alternative hypothesis. It can be "two.sided" (or 0, the default), "less" (or -1) or "greater" (or +1).

Details

mark_signif argument: numeric value representing the significance threshold for marking p-values. Any p-value below this threshold will be marked with a dot. The default is 0.05. p.values argument: a character vector specifying which p-values to display. It can be either "raw" for raw p-values or "adjusted" for adjusted p-values. The default is "raw".

Longitudinal MRI data in nondemented and demented older adults

Description

This dataset consists of a longitudinal collection of 150 subjects aged 60 to 96. Each subject was scanned on two or more visits, separated by at least one year for a total of 373 imaging sessions. For each subject, 3 or 4 individual T1-weighted MRI scans obtained in single scan sessions are included.

Usage

oasis
oasis

Format

A data frame with 373 rows and 15 variables:

Subject.ID: Subject identification
MRI.ID: MRI Exam Identification
Group: Class
Visit: Visit order
MR.Delay: MR Delay Time (Contrast)
Gender: Gender
Hand: All subjects are right-handed
Age: Age of the subject
EDUC: Years of Education
SES: Socioeconomic Status
MMSE: Mini Mental State Examination
CDR: Clinical Dementia Rating
eTIV: Estimated total intracranial volume
nWBV: Normalize Whole Brain Volume
ASF: Atlas Scaling Factor

References

https://www.kaggle.com/datasets/jboysen/mri-and-alzheimers

Package 'jointest'

Help Index

Nonparametric combination of `jointest` objects

Description

Usage

Arguments

Value

Examples

flipscores 2-Stage Summary Statistics approach

Description

Usage

Arguments

Value

Author(s)

See Also

Examples

join tests from multiverse models

Description

Usage

Arguments

Value

Examples

Methods for `jointest` objects

Description

Usage

Arguments

Details

See Also

Longitudinal MRI data in nondemented and demented older adults

Description

Usage

Format

References

Package 'jointest'

Help Index

Nonparametric combination of jointest objects

Description

Usage

Arguments

Value

Examples

flipscores 2-Stage Summary Statistics approach

Description

Usage

Arguments

Value

Author(s)

See Also

Examples

join tests from multiverse models

Description

Usage

Arguments

Value

Examples

Methods for jointest objects

Description

Usage

Arguments

Details

See Also

Longitudinal MRI data in nondemented and demented older adults

Description

Usage

Format

References

Nonparametric combination of `jointest` objects

Methods for `jointest` objects