--- title: "rNeighborQTL" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{rNeighborQTL} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, fig.width = 4, fig.height = 4, comment = "#>" ) ``` ## Overview
The "rNeighborQTL" package includes core functions to perform QTL mapping of neighbor effects. Taking conditional genotype probabilities from the "R/qtl" package (Broman et al. 2003), the "scan_neighbor()" calculates neighbor genotypic identity and performs interval mapping of neighbor QTL effects. The neighbor QTL requires spatial information, namely individual positions along the x- and y-axes, in addition to the genotype and phenotype data. See Sato, Takeda & Nagano (2021) for the theoretical background.
## Input filesFirst, let us prepare input data using the "R/qtl" package (Broman et al. 2003). Here is an example to import .csv files into a 'cross' object and calculate conditional self-genotype probabilities. In this example, we import insect herbivory data on Col x Kas recombinant inbred lines (RILs) of *Arabidopsis thaliana* (Wilson et al. 2001; Sato, Takeda & Nagano 2021).
```{r input} colkas <- qtl::read.cross(format="csvs",dir="../inst", genfile="ColKas_geno.csv", phefile = "ColKas_pheno.csv", na.strings = c("_"), estimate.map=TRUE, crosstype = "riself" ) colkas_genoprob <- qtl::calc.genoprob(colkas, step=2) ``` ## Proportion of variation explained by neighbor effectsPrior to the genome scan, we estimate the 'scale' argument. Using linear mixed models implemented in the "gaston" package (Perdry & Dandine-Roulland 2020), the "calc_pve()" computes proportion of phenotypic variation (PVE) by neighbor effects for a series of spatial scales. Based on the PVE, we calculate $\Delta$PVE metric and seek the scale $s$ that gives an argument for the maximum of $\Delta$PVE.
```{r pve} library(rNeighborQTL) x <- colkas$pheno[,2] y <- colkas$pheno[,3] smap_colkas <- data.frame(x,y) s_seq <- quantile(dist(smap_colkas),c(0.1*(1:10))) colkas_pve <- calc_pve(genoprobs=colkas_genoprob, pheno=log(colkas$pheno[,5]+1), smap=smap_colkas, s_seq=s_seq, addcovar=as.matrix(colkas$pheno[,7:9]) ) ``` ## Estimation of QTL effectsSimilar to Haley-Knott regression (Haley & Knott 1992), the additive and dominance deviation $a$ and $d$ are estimated using a linear or quadratic regression on neighbor genotypic identity. The "eff_neighbor()" estimates the coefficients for self and neighbor effects, and plots the results as follows.
```{r eff, fig.width=4, fig.height=8} colkas_eff <- eff_neighbor(genoprobs=colkas_genoprob, pheno=log(colkas$pheno[,5]+1), smap=smap_colkas, scale=7, addcovar=as.matrix(colkas$pheno[,7:9]) ) ``` ## LOD scoreLastly, we perform a genome scan to obtain LOD scores for neighbor QTL effects. The "scan_neighbor()" calculates likelihoods using the estimated QTL effects through the "eff_neighbor()". The results are drawn by "plot_nei()".
```{r LOD} colkas_scan <- scan_neighbor(genoprobs=colkas_genoprob, pheno=log(colkas$pheno[,5]+1), smap=smap_colkas, scale=7, addcovar=as.matrix(colkas$pheno[,7:9]) ) plot_nei(colkas_scan) ```In addition to the genome scan, we can perform permutation tests to estimate a genome-wide significance level. Such permutation tests better account data structure, but require much computational time. This is an example code for 3-times permutations.
```{r perm, eval=FALSE} colkas_perm <- perm_neighbor(genoprobs=colkas_genoprob, pheno=log(colkas$pheno[,5]+1), smap=smap_colkas, scale=7, addcovar=as.matrix(colkas$pheno[,6:8]), times=3, p_val=c(0.5,0.1) ) ``` ## Extensions ### *1. Self-genotype effects*The "scan_neighbor()" also provides LOD scores for self QTL effects. This gives the same results as the Haley-Knott regression of standard QTL mapping.
```{r self} plot_nei(colkas_scan, type="self") colkas_scanone <- qtl::scanone(colkas_genoprob, pheno.col=log(colkas$pheno$holes+1), addcovar=as.matrix(colkas$pheno[,7:9]), method="hk") plot(colkas_scanone) ``` ### *2. Composite interval mapping*The "addQTL" argument allows us to include non-focal QTLs as covariates. This option enables composite interval mapping (Jensen et al. 1993) that considers additional QTL effects. Here is an example code using the Col x Kas herbivory data, with the nga8 marker considered a covariate.
```{r CIM, eval=FALSE} colkas_cim <- scan_neighbor(genoprobs=colkas_genoprob, pheno=log(colkas$pheno[,5]+1), smap=smap_colkas, scale=7, addcovar=as.matrix(colkas$pheno[,7:9]), addQTL="c4_nga8" ) plot_nei(colkas_cim) ``` ### *3. Epistasis in neighbor QTL effects*For the analysis of epistasis, the "int_neighbor()" calculate LOD score of two-way interactions between a focal marker and the others. Here is an example code for the 'nga8' marker in the Col x Kas herbivory data.
```{r int, eval=FALSE} colkas_int <- int_neighbor(genoprobs=colkas_genoprob, pheno=log(colkas$pheno[,5]+1), smap=smap_colkas, scale=7, addcovar=as.matrix(colkas$pheno[,7:9]), addQTL="c4_nga8", intQTL="c4_nga8" ) plot_nei(colkas_int, type="int") ``` ### *4. Binary traits*The "response" argument allows us to analyze "binary" phenotypes as well as "quantitative" traits. This argument calls logistic (mixed) models internally (Faraway 2016; Chen et al. 2016). The "calc_pve()" yields the ratio of phenotypic variation explained (RVE) by neighbor effects as RVE~nei~ =$\sigma^2_2/\sigma^2_1$ when "binary" traits are analyzed, because the logistic mixed model does not compute $\sigma^2_e$ (Perdry & Dandine-Roulland 2020). Here is an example code for the analysis of the presence or absence of bolting in Col x Kas RILs.
```{r bin} s_seq <- quantile(dist(smap_colkas),c(0.1*(1:10))) colkas_pveBin <- calc_pve(genoprobs=colkas_genoprob, pheno=colkas$pheno[,7], smap=smap_colkas, s_seq=s_seq, response="binary", addcovar=as.matrix(colkas$pheno[,8:9]), fig=TRUE) colkas_scanBin <- scan_neighbor(genoprobs=colkas_genoprob, pheno=colkas$pheno[,7], smap=smap_colkas, scale=2.24, addcovar=as.matrix(colkas$pheno[,8:9]), response="binary") plot_nei(colkas_scanBin) ``` ### *5. Crossing design*The neighbor QTL package is able to handle AB heterozygotes. It also works even when there are only AA or AB genotypes. However, sex chromosomes are not supported currently, and should be excluded before the genome scan. This is a simulation using F2 or backcross lines implemented in the "R/qtl" package.
```{r fake} #F2 lines set.seed(1234) data("fake.f2",package="qtl") fake_f2 <- subset(fake.f2, chr=1:19) smap_f2 <- cbind(runif(qtl::nind(fake_f2),1,100),runif(qtl::nind(fake_f2),1,100)) genoprobs_f2 <- qtl::calc.genoprob(fake_f2,step=2) s_seq <- quantile(dist(smap_f2),c(0.1*(1:10))) nei_eff <- sim_nei_qtl(genoprobs_f2, a2=0.5, d2=0.5, smap=smap_f2, scale=s_seq[1], n_QTL=1 ) pve_f2 <- calc_pve(genoprobs=genoprobs_f2, pheno=nei_eff$nei_y, smap=smap_f2, s_seq=s_seq[1:5], addcovar=as.matrix(cbind(fake_f2$pheno$sex,fake_f2$pheno$pgm)), fig=FALSE) deltaPVE <- pve_f2[-1,3] - c(0,pve_f2[1:4,3]) argmax_s <- s_seq[1:5][deltaPVE==max(deltaPVE)] scan_f2 <- scan_neighbor(genoprobs=genoprobs_f2, pheno=nei_eff$nei_y, smap=smap_f2, scale=argmax_s, addcovar=as.matrix(cbind(fake_f2$pheno$sex,fake_f2$pheno$pgm)) ) plot_nei(scan_f2) ``` ```{r bc} #backcross lines set.seed(1234) data("fake.bc",package="qtl") fake_bc <- subset(fake.bc, chr=1:19) smap_bc <- cbind(runif(qtl::nind(fake_bc),1,100),runif(qtl::nind(fake_bc),1,100)) genoprobs_bc <- qtl::calc.genoprob(fake_bc,step=2) s_seq <- quantile(dist(smap_bc),c(0.1*(1:10))) nei_eff <- sim_nei_qtl(genoprobs_bc, a2=0.3, d2=-0.3, smap=smap_bc, scale=s_seq[1], n_QTL=1) pve_bc <- calc_pve(genoprobs=genoprobs_bc, pheno=nei_eff$nei_y, smap=smap_bc, s_seq=s_seq[1:5], addcovar=as.matrix(cbind(fake_bc$pheno$sex,fake_bc$pheno$age)), fig=FALSE) deltaPVE <- pve_bc[-1,3] - c(0,pve_bc[1:4,3]) argmax_s <- s_seq[1:5][deltaPVE==max(deltaPVE)] scan_bc <- scan_neighbor(genoprobs=genoprobs_bc, pheno=nei_eff$nei_y, smap=smap_bc, scale=argmax_s, addcovar=as.matrix(cbind(fake_bc$pheno$sex,fake_bc$pheno$age)) ) plot_nei(scan_bc) ``` ## References - Broman KW, Wu H, Sen S, Churchill GA. 2003. R/qtl: QTL mapping in experimental crosses. Bioinformatics 19: 889-890. - Broman KW, Sen S, 2009. Single-QTL analysis, In: A guide to QTL mapping with R/qtl. Springer New York, New York, NY, pp. 75-133. - Chen H, Wang C, Conomos M. et al. 2016. Control for population structure and relatedness for binary traits in genetic association studies via logistic mixed models. The American Journal of Human Genetics 98: 653-666. - Faraway JJ. 2016. Extending the linear model with R: generalized linear, mixed effects and nonparametric regression models. CRC press. - Haley CS, Knott SA. 1992. A simple regression method for mapping quantitative trait loci in line crosses using flanking markers. Heredity 69: 315-324. - Jansen RC. 1993. Interval mapping of multiple quantitative trait loci. Genetics 135: 205-211. - Perdry H, Dandine-Roulland C. 2020. gaston: Genetic Data Handling (QC, GRM, LD, PCA) & Linear Mixed Models. R package version 1.5.6. https://CRAN.R-project.org/package=gaston - Sato Y, Takeda K, Nagano AJ. 2021. Neighbor QTL: an interval mapping method for quantitative trait loci underlying plant neighborhood effects. G3; Genes|Genomes|Genetics 11: jkab017. - Wilson IW, Schiff CL, Hughes DE, Somerville SC. 2001. Quantitative trait loci analysis of powdery mildew disease resistance in the *Arabidopsis thaliana* accession kashmir-1. Genetics 158: 1301-1309.