Title: | Genetic Rare Variants Tests |
---|---|
Description: | A step-up test for genetic rare variants in a gene or in a pathway. The method determines an optimal grouping of rare variants analytically. The method has been described in Hoffmann TJ, Marini NJ, and Witte JS (2010) <doi:10.1371/journal.pone.0013584>. |
Authors: | Thomas Hoffmann <[email protected]> |
Maintainer: | Thomas Hoffmann <[email protected]> |
License: | GPL |
Version: | 0.4-2 |
Built: | 2024-12-15 07:33:33 UTC |
Source: | CRAN |
Runs tests for rare variants by running the "step-up" approach to choose the best set of variants (possibly signing them), and correcting by permutation. rareGeneTest
tests a single gene, and also allows for other aggregation models described in Hoffmann et al.. rarePathwayTest
provides an extension of this method, that does a step-up approach for inclusions of genes (for computional time). For larger genes, it may also be advantageous to run rarePathwayTest
which will automatically chop large genes into subpieces and run step-up on those pieces.
rareGeneTest(genotype, phenotype, use_sign=TRUE, use_weight=TRUE, nperm=1000, binary=all(phenotype==0 | phenotype==1, na.rm=TRUE), strategy="step", thresh=1) rarePathwayTest(genotype, genotype_gene, phenotype, use_sign=TRUE, use_weight=TRUE, nperm=1000, binary=all(phenotype==0 | phenotype==1, na.rm=TRUE), CUT=15)
rareGeneTest(genotype, phenotype, use_sign=TRUE, use_weight=TRUE, nperm=1000, binary=all(phenotype==0 | phenotype==1, na.rm=TRUE), strategy="step", thresh=1) rarePathwayTest(genotype, genotype_gene, phenotype, use_sign=TRUE, use_weight=TRUE, nperm=1000, binary=all(phenotype==0 | phenotype==1, na.rm=TRUE), CUT=15)
genotype |
Matrix of individuals along the rows and genotypes along the columns. Each entry in the matrix should be coded |
phenotype |
Vector of same length as the number of rows of the matrix. Either continuous, or dichotomous (case= |
use_sign |
Whether to use a sign in the model for w, or just to weight all variants the same. Uses the sign of the estimated covariance between each marker and the trait to determine the sign of the marker. |
use_weight |
Whether to use a weight in the model for w (inverse variance of allele frequency in everyone (continuous) or just the controls (dichotomous)). |
nperm |
Number of permutations to run. |
binary |
Is the phenotype dichotomous, or continuous? (TRUE/FALSE) |
strategy |
‘step’ uses the step-up routine (default, recommended), ‘afreq’ tests all allele frequencies, and ‘thresh’ uses a fixed threshold. |
thresh |
Allele frequency for inclusion, used only when |
genotype_gene |
For |
CUT |
For |
The methods here are as described in the Hoffmann et al. (PLoS ONE, to appear) paper. These methods are based on the idea of trying multiple models for rare variants, since prior information is generally not very accurate. The p-value is corrected for multiple comparisons by permutation.
Hoffmann, TJ, Marini, NJ, and Witte, JS. Comprehensive approach to analyzing rare variants. PLoS ONE, 5(11): e13584. https://doi.org/10.1371/journal/pone.0013584.
## This inefficient (for clarity) code will simulate ## a dataset in the correct format, and then run ## the different approaches described here. ## Constants for data generation SEED <- 2 NCASE <- NCONT <- 500 NGENE <- 6 FUNCTIONAL <- 1:(NGENE/2) ## Half are functional BETA0 <- -4 BETAG <- log(2) set.seed(SEED) ## Reproducible results nonfunctional <- setdiff(1:NGENE, FUNCTIONAL) afreq <- runif(NGENE, 0.001, 0.01) expit <- function(x) return(exp(x) / (1 + exp(x))) gcase <- matrix(0, nrow=NCASE, ncol=NGENE) for(indiv in 1:NCASE){ affected <- FALSE while(!affected){ ## while not affected gcase[indiv, ] <- rbinom(NGENE, 2, afreq) ## draw up genotype affected <- (expit(BETA0 + BETAG*sum(gcase[indiv, FUNCTIONAL])) > runif(1)) } } cat("\n") gcont <- matrix(0, nrow=NCONT, ncol=NGENE) for(indiv in 1:NCONT){ unaffected <- FALSE while(!unaffected){ ## while not unaffected gcont[indiv, ] <- rbinom(NGENE, 2, afreq) ## draw up genotype unaffected <- (1-expit(BETA0 + BETAG*sum(gcont[indiv, FUNCTIONAL])) > runif(1)) } } cat("\n") cat("# Rare functional variants cases =", sum(gcase[,FUNCTIONAL]), "\n") cat("# Rare functional variants controls =", sum(gcont[,FUNCTIONAL]), "\n") cat("# Rare non-functional variants cases =", sum(gcase[,nonfunctional]), "\n") cat("# Rare non-functional variants controls =", sum(gcont[,nonfunctional]), "\n") case <- c(rep(1,NCASE), rep(0,NCONT)) genotype <- rbind(gcase, gcont) cat("P-value of the test:\n") rareGeneTest(genotype, case)
## This inefficient (for clarity) code will simulate ## a dataset in the correct format, and then run ## the different approaches described here. ## Constants for data generation SEED <- 2 NCASE <- NCONT <- 500 NGENE <- 6 FUNCTIONAL <- 1:(NGENE/2) ## Half are functional BETA0 <- -4 BETAG <- log(2) set.seed(SEED) ## Reproducible results nonfunctional <- setdiff(1:NGENE, FUNCTIONAL) afreq <- runif(NGENE, 0.001, 0.01) expit <- function(x) return(exp(x) / (1 + exp(x))) gcase <- matrix(0, nrow=NCASE, ncol=NGENE) for(indiv in 1:NCASE){ affected <- FALSE while(!affected){ ## while not affected gcase[indiv, ] <- rbinom(NGENE, 2, afreq) ## draw up genotype affected <- (expit(BETA0 + BETAG*sum(gcase[indiv, FUNCTIONAL])) > runif(1)) } } cat("\n") gcont <- matrix(0, nrow=NCONT, ncol=NGENE) for(indiv in 1:NCONT){ unaffected <- FALSE while(!unaffected){ ## while not unaffected gcont[indiv, ] <- rbinom(NGENE, 2, afreq) ## draw up genotype unaffected <- (1-expit(BETA0 + BETAG*sum(gcont[indiv, FUNCTIONAL])) > runif(1)) } } cat("\n") cat("# Rare functional variants cases =", sum(gcase[,FUNCTIONAL]), "\n") cat("# Rare functional variants controls =", sum(gcont[,FUNCTIONAL]), "\n") cat("# Rare non-functional variants cases =", sum(gcase[,nonfunctional]), "\n") cat("# Rare non-functional variants controls =", sum(gcont[,nonfunctional]), "\n") case <- c(rep(1,NCASE), rep(0,NCONT)) genotype <- rbind(gcase, gcont) cat("P-value of the test:\n") rareGeneTest(genotype, case)