Title: | Testing Neighbor Effects in Marker-Based Regressions |
---|---|
Description: | To incorporate neighbor genotypic identity into genome-wide association studies, the package provides a set of functions for variation partitioning and association mapping. The theoretical background of the method is described in Sato et al. (2021) <doi:10.1038/s41437-020-00401-w>. |
Authors: | Yasuhiro Sato [aut, cre] , Eiji Yamamoto [aut], Kentaro K. Shimizu [aut] , Atsushi J. Nagano [aut] |
Maintainer: | Yasuhiro Sato <[email protected]> |
License: | GPL-3 |
Version: | 1.2.4 |
Built: | 2024-11-28 06:38:06 UTC |
Source: | CRAN |
A function to calculate PVE by neighbor effects for a series of neighbor distance using a mixed model.
calc_PVEnei( pheno, geno, smap, scale_seq, addcovar = NULL, grouping = rep(1, nrow(smap)), response = c("quantitative", "binary"), n_core = 1L )
calc_PVEnei( pheno, geno, smap, scale_seq, addcovar = NULL, grouping = rep(1, nrow(smap)), response = c("quantitative", "binary"), n_core = 1L )
pheno |
A numeric vector including phenotypes for individuals |
geno |
An individual x marker matrix. Bialleles (i.e., A or a) must be converted into -1 or 1 digit. |
smap |
A matrix showing a spatial map for individuals. The first and second column include spatial points along an x-axis and y-axis, respectively. |
scale_seq |
A numeric vector including a set of the maximum spatial distance between a focal individual and neighbors to define neighbor effects. A scalar is also allowed. |
addcovar |
An optional matrix including additional non-genetic covariates. It contains no. of individuals x no. of covariates. |
grouping |
A positive integer vector assigning each individual to a group. This argument can be useful when a "smap" contains different experimental replicates. Default setting means that all individuals are belong to a single group. |
response |
An option to select if the phenotype is a |
n_core |
No. of cores for a multi-core computation. This does not work for Windows OS. Default is a single-core computation. |
This function uses mixed models via the gaston
package (Perdry & Dandine-Roulland 2020).
If "binary"
is selected, logistic.mm.aireml()
is called via the gaston
package.
In such a case, PVEnei
below is replaced by the ratio of phenotypic variation explained (RVE) by neighbor effects as RVE_nei = /
and p-values are not provided.
A numeric matrix including a given spatial scale, PVE by neighbor effects, and p-values.
scale
Maximum neighbor distance given as an argument
PVEself
Proportion of phenotypic variation explained (PVE) by self effects. RVE is returned when response = "binary"
PVEnei
Proportion of phenotypic variation explained (PVE) by neighbor effects. RVE is returned when response = "binary"
p-value
p-value by a likelihood ratio test between models with or without neighbor effects (when s is not zero); or between a null model and model with self effects alone (when s = 0). NA when response = "binary"
Yasuhiro Sato ([email protected])
Perdry H, Dandine-Roulland C. (2020) gaston: Genetic Data Handling (QC, GRM, LD, PCA) & Linear Mixed Models. https://CRAN.R-project.org/package=gaston
set.seed(1) g <- matrix(sample(c(-1,1),100*1000,replace = TRUE),100,1000) gmap <- cbind(c(rep(1,nrow(g)/2),rep(2,nrow(g)/2)),c(1:ncol(g))) x <- runif(nrow(g),1,100) y <- runif(nrow(g),1,100) smap <- cbind(x,y) grouping <- c(rep(1,nrow(g)/2),rep(2,nrow(g)/2)) pheno <- nei_simu(geno=g,smap=smap,scale=44,grouping=grouping,n_causal=50,pveB=0.4,pve=0.8) fake_nei <- list() fake_nei[[1]] <- g fake_nei[[2]] <- gmap fake_nei[[3]] <- smap fake_nei[[4]] <- data.frame(pheno,grouping) names(fake_nei) <- c("geno","gmap","smap","pheno") min_s <- min_dist(fake_nei$smap, fake_nei$pheno$grouping) scale_seq <- c(min_s, quantile(dist(fake_nei$smap),c(0.2*rep(1:5)))) pve_out <- calc_PVEnei(geno=fake_nei$geno, pheno=fake_nei$pheno[,1], smap=fake_nei$smap, scale_seq=scale_seq, addcovar=as.matrix(fake_nei$pheno$grouping), grouping=fake_nei$pheno$grouping) delta_PVE(pve_out)
set.seed(1) g <- matrix(sample(c(-1,1),100*1000,replace = TRUE),100,1000) gmap <- cbind(c(rep(1,nrow(g)/2),rep(2,nrow(g)/2)),c(1:ncol(g))) x <- runif(nrow(g),1,100) y <- runif(nrow(g),1,100) smap <- cbind(x,y) grouping <- c(rep(1,nrow(g)/2),rep(2,nrow(g)/2)) pheno <- nei_simu(geno=g,smap=smap,scale=44,grouping=grouping,n_causal=50,pveB=0.4,pve=0.8) fake_nei <- list() fake_nei[[1]] <- g fake_nei[[2]] <- gmap fake_nei[[3]] <- smap fake_nei[[4]] <- data.frame(pheno,grouping) names(fake_nei) <- c("geno","gmap","smap","pheno") min_s <- min_dist(fake_nei$smap, fake_nei$pheno$grouping) scale_seq <- c(min_s, quantile(dist(fake_nei$smap),c(0.2*rep(1:5)))) pve_out <- calc_PVEnei(geno=fake_nei$geno, pheno=fake_nei$pheno[,1], smap=fake_nei$smap, scale_seq=scale_seq, addcovar=as.matrix(fake_nei$pheno$grouping), grouping=fake_nei$pheno$grouping) delta_PVE(pve_out)
A function to calculate PVE that estimates the effective scale of neighbor effects.
delta_PVE(res, fig = TRUE, ...)
delta_PVE(res, fig = TRUE, ...)
res |
Output results of |
fig |
TRUE/FALSE to plot the results (or not). Default is TRUE. |
... |
Arguments to be passed to |
Estimated effective scale and proportion of phenotypic variation explained by neighbor effects at that scale.
Yasuhiro Sato ([email protected])
A function convert a bed.matrix dataset to rNeighborGWAS genotype data.
gaston2neiGWAS(x)
gaston2neiGWAS(x)
x |
A bed.matrix created using the |
This function converts genotype data into -1, 0, or 1 digit as the rNeighborGWAS format. Zero indicates heterozygotes.
A list including an individual x marker matrix, a data.frame including chromosome numbers in the first column, and SNP positions in the second column, and a numeric vector including phenotypes for individuals.
geno
Individual x marker matrix
gmap
Data.frame including chromosome numbers in the first column, and SNP positions in the second column
pheno
Numeric vector including phenotypes for individuals
Yasuhiro Sato ([email protected])
Perdry H, Dandine-Roulland C. (2020) gaston: Genetic Data Handling (QC, GRM, LD, PCA) & Linear Mixed Models. https://CRAN.R-project.org/package=gaston
data("TTN", package="gaston") x <- gaston::as.bed.matrix(TTN.gen, TTN.fam, TTN.bim) g <- gaston2neiGWAS(x)
data("TTN", package="gaston") x <- gaston::as.bed.matrix(TTN.gen, TTN.fam, TTN.bim) g <- gaston2neiGWAS(x)
A function to calculate a Euclidian distance including at least one neighbor for all individuals.
min_dist(smap, grouping = rep(1, nrow(smap)))
min_dist(smap, grouping = rep(1, nrow(smap)))
smap |
A matrix showing a spatial map for individuals. The first and second column include spatial points along an x-axis and y-axis, respectively. |
grouping |
A positive integer vector assigning each individual to a group. This argument can be useful when a "smap" contains different experimental replicates. Default setting means that all individuals are belong to a single group. |
Return a scalar of the minimum Euclidian distance that allows all individuals to have at least one neighbor.
Yasuhiro Sato ([email protected])
set.seed(1) g <- matrix(sample(c(-1,1),100*1000,replace = TRUE),100,1000) gmap = cbind(c(rep(1,nrow(g)/2),rep(2,nrow(g)/2)),c(1:ncol(g))) x <- runif(nrow(g),1,100) y <- runif(nrow(g),1,100) smap <- cbind(x,y) grouping <- c(rep(1,nrow(g)/2), rep(2,nrow(g)/2)) pheno <- nei_simu(geno=g,smap=smap,scale=44,grouping=grouping,n_causal=50,pveB=0.4,pve=0.8) fake_nei <- list() fake_nei[[1]] <- g fake_nei[[2]] <- gmap fake_nei[[3]] <- smap fake_nei[[4]] <- data.frame(pheno,grouping) names(fake_nei) <- c("geno","gmap","smap","pheno") min_s <- min_dist(fake_nei$smap, fake_nei$pheno$grouping)
set.seed(1) g <- matrix(sample(c(-1,1),100*1000,replace = TRUE),100,1000) gmap = cbind(c(rep(1,nrow(g)/2),rep(2,nrow(g)/2)),c(1:ncol(g))) x <- runif(nrow(g),1,100) y <- runif(nrow(g),1,100) smap <- cbind(x,y) grouping <- c(rep(1,nrow(g)/2), rep(2,nrow(g)/2)) pheno <- nei_simu(geno=g,smap=smap,scale=44,grouping=grouping,n_causal=50,pveB=0.4,pve=0.8) fake_nei <- list() fake_nei[[1]] <- g fake_nei[[2]] <- gmap fake_nei[[3]] <- smap fake_nei[[4]] <- data.frame(pheno,grouping) names(fake_nei) <- c("geno","gmap","smap","pheno") min_s <- min_dist(fake_nei$smap, fake_nei$pheno$grouping)
A function to calculate neighbor genotypic identity, with a given reference scale and a degree of distance decay.
nei_coval( geno, smap, scale, alpha = Inf, kernel = c("exp", "gaussian"), grouping = rep(1, nrow(smap)), n_core = 1L )
nei_coval( geno, smap, scale, alpha = Inf, kernel = c("exp", "gaussian"), grouping = rep(1, nrow(smap)), n_core = 1L )
geno |
An individual x marker matrix. Bialleles (i.e., A or a) must be converted into -1 or 1 digit. |
smap |
A matrix showing a spatial map for individuals. The first and second column include spatial points along an x-axis and y-axis, respectively. |
scale |
A numeric scalar indicating the maximum spatial distance between a focal individual and neighbors to define neighbor effects. |
alpha |
An option to set a distance decay coefficient |
kernel |
An option to select either |
grouping |
A positive integer vector assigning each individual to a group. This argument can be useful when a "smap" contains different experimental replicates. Default setting means that all individuals are belong to a single group. |
n_core |
No. of cores for a multi-core computation. This does not work for Windows OS. Default is a single-core computation. |
Default setting is recommended for alpha
and kernel
arguments unless spatial distance decay of neighbor effects needs to be modeled.
If alpha
is not Inf
, output variables are weighted by a distance decay from a focal individual to scale
.
For the type of dispersal kernel in the distance decay, we can choose a negative exponential or Gaussian kernel as a fat-tailed or thin-tailed distribution, respectively.
See Nathan et al. (2012) for detailed characteristics of the two dispersal kernels.
A numeric matrix for neighbor covariates, with no. of individuals x markers.
Yasuhiro Sato ([email protected])
Nathan R, Klein E, Robledo-Arnuncio JJ, Revilla E. (2012) Dispersal kernels: review. In: Clobert J, Baguette M, Benton TG, Bullock JM (Eds.), Dispersal Ecology and Evolution. Oxford University Press, pp.186-210.
set.seed(1) g <- matrix(sample(c(-1,1),100*1000,replace = TRUE),100,1000) gmap <- cbind(c(rep(1,nrow(g)/2),rep(2,nrow(g)/2)),c(1:ncol(g))) x <- runif(nrow(g),1,100) y <- runif(nrow(g),1,100) smap <- cbind(x,y) grouping <- c(rep(1,nrow(g)/2), rep(2,nrow(g)/2)) g_nei <- nei_coval(g,smap,44,grouping = grouping)
set.seed(1) g <- matrix(sample(c(-1,1),100*1000,replace = TRUE),100,1000) gmap <- cbind(c(rep(1,nrow(g)/2),rep(2,nrow(g)/2)),c(1:ncol(g))) x <- runif(nrow(g),1,100) y <- runif(nrow(g),1,100) smap <- cbind(x,y) grouping <- c(rep(1,nrow(g)/2), rep(2,nrow(g)/2)) g_nei <- nei_coval(g,smap,44,grouping = grouping)
A function to provide coefficients and p-values of self and neighbor effects for each marker.
nei_lm( geno, g_nei, pheno, addcovar = NULL, response = c("quantitative", "binary"), n_core = 1L, asym = FALSE )
nei_lm( geno, g_nei, pheno, addcovar = NULL, response = c("quantitative", "binary"), n_core = 1L, asym = FALSE )
geno |
An individual x marker matrix. Bialleles (i.e., A or a) must be converted into -1 or 1 digit. |
g_nei |
An output of |
pheno |
A numeric vector including phenotypes for individuals |
addcovar |
An optional matrix including additional non-genetic covariates. It contains no. of individuals x no. of covariates. |
response |
An option to select if the phenotype is a |
n_core |
No. of cores for a multi-core computation. This does not work for Windows OS. Default is a single-core computation. |
asym |
If TRUE, asymmetric neighbor effects are also tested and returned as "beta_sxn" and "p_sxn". |
This function is a subset of neiGWAS()
. nei_lm()
gives detailed results when the option model="lm"
is selected in neiGWAS()
.
A data.frame including coefficients and p-values of self and neighbor effects, without the chromosome numbers and marker position.
beta_self
coefficient for self effects
beta_self
coefficient for neighbor effects
p_self
p-value for self effects by a likelihood ratio test between a null and standard linear model
p_nei
p-value for neighbor effects by a likelihood ratio test between models with or without neighbor effects
Yasuhiro Sato ([email protected])
A function to provide coefficients and p-values of self and neighbor effects for each marker.
nei_lmm( geno, g_nei, pheno, addcovar = NULL, response = c("quantitative", "binary"), n_core = 1L, asym = FALSE )
nei_lmm( geno, g_nei, pheno, addcovar = NULL, response = c("quantitative", "binary"), n_core = 1L, asym = FALSE )
geno |
An individual x marker matrix. Bialleles (i.e., A or a) must be converted into -1 or 1 digit. |
g_nei |
An output of |
pheno |
A numeric vector including phenotypes for individuals |
addcovar |
An optional matrix including additional non-genetic covariates. It contains no. of individuals x no. of covariates. |
response |
An option to select if the phenotype is a |
n_core |
No. of cores for a multi-core computation. This does not work for Windows OS. Default is a single-core computation. |
asym |
If TRUE, asymmetric neighbor effects are also tested and returned as "beta_sxn" and "p_sxn". |
This function is a subset of neiGWAS()
. nei_lmm()
gives detailed results but requires more computational time.
A data.frame including coefficients and p-values of self and neighbor effects, without the chromosome numbers and marker position.
beta_self
coefficient for self effects
beta_self
coefficient for neighbor effects
p_self
p-value for self effects by a likelihood ratio test between a null and standard GWAS model
p_nei
p-value for neighbor effects by a likelihood ratio test between models with or without neighbor effects
Yasuhiro Sato ([email protected])
A function to simulate phenotypes caused by self and neighbor effects, with the proportion of phenotypic variation explained (PVE) by fixed and random effects controlled.
nei_simu( geno, smap, scale, alpha = Inf, grouping = rep(1, nrow(smap)), kernel = c("exp", "gaussian"), n_causal, pveB, pve, b_ratio = c(1, 1) )
nei_simu( geno, smap, scale, alpha = Inf, grouping = rep(1, nrow(smap)), kernel = c("exp", "gaussian"), n_causal, pveB, pve, b_ratio = c(1, 1) )
geno |
An individual x marker matrix. Bialleles (i.e., A or a) must be converted into -1 or 1 digit. |
smap |
A matrix showing a spatial map for individuals. The first and second column include spatial points along an x-axis and y-axis, respectively. |
scale |
A numeric scalar indicating the maximum spatial distance between a focal individual and neighbors to define neighbor effects. |
alpha |
Distance decay coefficient |
grouping |
A positive integer vector assigning each individual to a group. This argument can be useful when a "smap" contains different experimental replicates. Default setting means that all individuals are belong to a single group. |
kernel |
An option to select a negative exponential |
n_causal |
No. of causal markers in a simulated phenotype |
pveB |
Proportion of phenotypic variation explained by fixed effects. |
pve |
Proportion of phenotypic variation explained by fixed and random effects. |
b_ratio |
A vector composed of two numeric scalars that control the ratio of contributions of self or neighbor effects to a phenotype. The first and second element are for self and neighbor effects, respectively. |
A vector of simulated phenotype values for all individuals
Yasuhiro Sato ([email protected])
set.seed(1) g <- matrix(sample(c(-1,1),100*1000,replace = TRUE),100,1000) gmap <- cbind(c(rep(1,nrow(g)/2),rep(2,nrow(g)/2)),c(1:ncol(g))) x <- runif(nrow(g),1,100) y <- runif(nrow(g),1,100) smap <- cbind(x,y) grouping <- c(rep(1,nrow(g)/2),rep(2,nrow(g)/2)) pheno <- nei_simu(geno=g,smap=smap,scale=44,grouping=grouping,n_causal=50,pveB=0.4,pve=0.8) fake_nei <- list() fake_nei[[1]] <- g fake_nei[[2]] <- gmap fake_nei[[3]] <- smap fake_nei[[4]] <- data.frame(pheno,grouping) names(fake_nei) <- c("geno","gmap","smap","pheno")
set.seed(1) g <- matrix(sample(c(-1,1),100*1000,replace = TRUE),100,1000) gmap <- cbind(c(rep(1,nrow(g)/2),rep(2,nrow(g)/2)),c(1:ncol(g))) x <- runif(nrow(g),1,100) y <- runif(nrow(g),1,100) smap <- cbind(x,y) grouping <- c(rep(1,nrow(g)/2),rep(2,nrow(g)/2)) pheno <- nei_simu(geno=g,smap=smap,scale=44,grouping=grouping,n_causal=50,pveB=0.4,pve=0.8) fake_nei <- list() fake_nei[[1]] <- g fake_nei[[2]] <- gmap fake_nei[[3]] <- smap fake_nei[[4]] <- data.frame(pheno,grouping) names(fake_nei) <- c("geno","gmap","smap","pheno")
A function to test neighbor effects for each marker and to calculate p-values at a given reference scale.
neiGWAS( geno, pheno, gmap, smap, scale, addcovar = NULL, grouping = rep(1, nrow(smap)), response = c("quantitative", "binary"), model = c("lmm", "lm"), n_core = 1L )
neiGWAS( geno, pheno, gmap, smap, scale, addcovar = NULL, grouping = rep(1, nrow(smap)), response = c("quantitative", "binary"), model = c("lmm", "lm"), n_core = 1L )
geno |
An individual x marker matrix. Bialleles (i.e., A or a) must be converted into -1 or 1 digit. |
pheno |
A numeric vector including phenotypes for individuals |
gmap |
A matrix or data.frame including chromosome numbers in the first column, and SNP positions in the second column. |
smap |
A matrix showing a spatial map for individuals. The first and second column include spatial points along an x-axis and y-axis, respectively. |
scale |
A numeric scalar indicating the maximum spatial distance between a focal individual and neighbors to define neighbor effects. |
addcovar |
An optional matrix including additional non-genetic covariates. It contains no. of individuals x no. of covariates. |
grouping |
A positive integer vector assigning each individual to a group. This argument can be useful when a "smap" contains different experimental replicates. Default setting means that all individuals are belong to a single group. |
response |
An option to select if the phenotype is a |
model |
An option to select linear mixed model |
n_core |
No. of cores for a multi-core computation. This does not work for Windows OS. Default is a single-core computation. |
This function calls a mixed model via the gaston
package. If "lmm"
with "binary"
is selected, p-values are based on Wald tests.
This is because the logistic mixed model is based on a pseudo-likelihood and thus likelihood ratio tests are not applicable. See Chen et al. (2016) for the theory.
A data.frame including the chromosome number, marker position, and p-values.
chr
Chromosome number
pos
Marker position
p
p-value by a likelihood ratio test between models with or without neighbor effects
Yasuhiro Sato ([email protected])
Chen H, Wang C, Conomos M. et al. (2016) Control for population structure and relatedness for binary traits in genetic association studies via logistic mixed models. The American Journal of Human Genetics 98: 653-666.
set.seed(1) g <- matrix(sample(c(-1,1),100*1000,replace = TRUE),100,1000) gmap <- cbind(c(rep(1,nrow(g)/2),rep(2,nrow(g)/2)),c(1:ncol(g))) x <- runif(nrow(g),1,100) y <- runif(nrow(g),1,100) smap <- cbind(x,y) grouping <- c(rep(1,nrow(g)/2),rep(2,nrow(g)/2)) pheno <- nei_simu(geno=g,smap=smap,scale=44,grouping=grouping,n_causal=50,pveB=0.4,pve=0.8) fake_nei <- list() fake_nei[[1]] <- g fake_nei[[2]] <- gmap fake_nei[[3]] <- smap fake_nei[[4]] <- data.frame(pheno,grouping) names(fake_nei) <- c("geno","gmap","smap","pheno") scale <- 43 gwas_out <- neiGWAS(geno=fake_nei$geno, pheno=fake_nei$pheno[,1], gmap=fake_nei$gmap, smap=fake_nei$smap, scale=scale, addcovar=as.matrix(fake_nei$pheno$grouping), grouping=fake_nei$pheno$grouping) gaston::manhattan(gwas_out) gaston::qqplot.pvalues(gwas_out$p)
set.seed(1) g <- matrix(sample(c(-1,1),100*1000,replace = TRUE),100,1000) gmap <- cbind(c(rep(1,nrow(g)/2),rep(2,nrow(g)/2)),c(1:ncol(g))) x <- runif(nrow(g),1,100) y <- runif(nrow(g),1,100) smap <- cbind(x,y) grouping <- c(rep(1,nrow(g)/2),rep(2,nrow(g)/2)) pheno <- nei_simu(geno=g,smap=smap,scale=44,grouping=grouping,n_causal=50,pveB=0.4,pve=0.8) fake_nei <- list() fake_nei[[1]] <- g fake_nei[[2]] <- gmap fake_nei[[3]] <- smap fake_nei[[4]] <- data.frame(pheno,grouping) names(fake_nei) <- c("geno","gmap","smap","pheno") scale <- 43 gwas_out <- neiGWAS(geno=fake_nei$geno, pheno=fake_nei$pheno[,1], gmap=fake_nei$gmap, smap=fake_nei$smap, scale=scale, addcovar=as.matrix(fake_nei$pheno$grouping), grouping=fake_nei$pheno$grouping) gaston::manhattan(gwas_out) gaston::qqplot.pvalues(gwas_out$p)
A function to simulate phenotype values with multiple sources of variation controlled
qtl_pheno_simu( b_self, b_nei, eigenK_self, eigenK_nei, b_ratio = c(1, 1), pveB, pve )
qtl_pheno_simu( b_self, b_nei, eigenK_self, eigenK_nei, b_ratio = c(1, 1), pveB, pve )
b_self |
A n x 1 genotype vector to design major additive genetic effect. |
b_nei |
A vector of an explanatory variable for neighbor effects |
eigenK_self |
Products of |
eigenK_nei |
Products of |
b_ratio |
Ratio for contributions of |
pveB |
Proportion of variance explained by genetic effects attributable to the fixed effects (i.e., b_.. vector). |
pve |
Proportion of variance explained by all genetic effects (i.e., b_.. and eigenK_..) |
A list of simulated phenotypes
y
Simulated phenotype values
beta_self
major self-genetic effects
beta_nei
major neighbor effects
sigma_self
self polygenic effects
sigma_nei
neighbor polygenic effects
epsilon
residuals
res_pveB
realized proportion of variation explained by major-effect genes
res_pve
realized proportion of variation explained by major-effect genes and polygenic effects
Eiji Yamamoto, and Yasuhiro Sato
A function to calculate, with a negative exponential or Gaussian dispersal kernel.
w(s, a, kernel = c("exp", "gaussian"))
w(s, a, kernel = c("exp", "gaussian"))
s |
A numeric scalar indicating spatial distance at which the distance decay is referred |
a |
A numeric scalar indicating a decay coefficient |
kernel |
An option to select a negative exponential |
A numeric scalar for a distance decay weight.
Yasuhiro Sato ([email protected])