Title: | Tools for Analyzing QTL Experiments |
---|---|
Description: | Analysis of experimental crosses to identify genes (called quantitative trait loci, QTLs) contributing to variation in quantitative traits. Broman et al. (2003) <doi:10.1093/bioinformatics/btg112>. |
Authors: | Karl W Broman [aut, cre], Hao Wu [aut], Gary Churchill [ctb], Saunak Sen [ctb], Danny Arends [ctb], Robert Corty [ctb], Timothee Flutre [ctb], Ritsert Jansen [ctb], Pjotr Prins [ctb], Lars Ronnegard [ctb], Rohan Shah [ctb], Laura Shannon [ctb], Quoc Tran [ctb], Aaron Wolen [ctb], Brian Yandell [ctb], R Core Team [ctb] |
Maintainer: | Karl W Broman <[email protected]> |
License: | GPL-3 |
Version: | 1.70 |
Built: | 2024-11-01 11:31:27 UTC |
Source: | CRAN |
A brief introduction to the R/qtl package, with a walk-through of an analysis.
In order to use the R/qtl package, you must type (within R) library(qtl). You may wish to include this in a .Rprofile file.
Documentation and several tutorials are available at the R archive (https://cran.r-project.org).
Use the help.start function to start the html version of the R help.
Type library(help=qtl) to get a list of the functions in R/qtl.
Use the example function to run examples of the various functions in R/qtl.
A tutorial on the use of R/qtl is distributed with the package and is also available at https://rqtl.org/rqtltour.pdf.
Download the latest version of R/qtl from the R archive or from https://rqtl.org.
Here we briefly describe the use of R/qtl to analyze an experimental cross. A more extensive tutorial on its use is distributed with the package and is also available at https://rqtl.org/rqtltour.pdf.
A difficult first step in the use of most data analysis software is the import of data. With R/qtl, one may import data in several different formats by use of the function read.cross. The internal data structure used by R/qtl is rather complicated, and is described in the help file for read.cross. We won't discuss data import any further here, except to say that the comma-delimited format ("csv") is recommended. If you have trouble importing data, send an email to Karl Broman, [email protected], perhaps attaching examples of your data files. (Such data will be kept confidential.) Also see the sample data files and code at https://rqtl.org/sampledata/.
We consider the example data hyper, an experiment on hypertension in the mouse, kindly provided by Bev Paigen and Gary Churchill. Use the data function to load the data.
data(hyper)
The hyper data set has class "cross". The function summary.cross gives summary information on the data, and checks the data for internal consistency. A number of other utility functions are available; hopefully these are self-explanatory.
summary(hyper)
nind(hyper)
nphe(hyper)
nchr(hyper)
nmar(hyper)
totmar(hyper)
The function plot.cross gives a graphical summary of the data; it calls plotMissing (to plot a matrix displaying missing genotypes) and plotMap (to plot the genetic maps), and also displays histograms or barplots of the phenotypes. The plotMissing function can plot individuals ordered by their phenotypes; you can see that for most markers, only individuals with extreme phenotypes were genotyped.
plot(hyper)
plotMissing(hyper)
plotMissing(hyper, reorder=TRUE)
plotMap(hyper)
Note that one marker (on chromosome 14) has no genotype data. The function drop.nullmarkers removes such markers from the data.
hyper <- drop.nullmarkers(hyper)
totmar(hyper)
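To make the idea of a "null marker" concrete, the sketch below finds such markers by hand. It uses ntyped to count the genotyped individuals per marker and drop.markers to remove those with a count of zero. The variable names (hyper_raw, n_typed, null_markers, cleaned) are illustrative only, and this is not claimed to be how drop.nullmarkers is implemented internally.

```r
library(qtl)

# Load a fresh copy of the example data into a local variable,
# so that any 'hyper' object already in the workspace is left untouched.
hyper_raw <- local({
  data(hyper, package = "qtl", envir = environment())
  hyper
})

# Number of individuals genotyped at each marker
n_typed <- ntyped(hyper_raw, what = "mar")

# Markers with no genotype data at all
null_markers <- names(n_typed)[n_typed == 0]
print(null_markers)

# Remove them by name; this should leave the same number of
# markers as drop.nullmarkers() does
cleaned <- drop.markers(hyper_raw, null_markers)
totmar(cleaned)
```

Looking at the marker names first can be useful when you want to know which markers are about to be discarded before dropping them.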
The function est.rf estimates the recombination fraction between each pair of markers, and calculates a LOD score for the test of $r = 1/2$. This is useful for identifying markers that are placed on the wrong chromosome. Note that since, for these data, many markers were typed only on recombinant individuals, the pairwise recombination fractions show rather odd patterns.
hyper <- est.rf(hyper)
plotRF(hyper)
plotRF(hyper, chr=c(1,4))
To re-estimate the genetic map for an experimental cross, use the function est.map. The function plotMap, in addition to plotting a single map, can plot the comparison of two genetic maps (as long as they are composed of the same numbers of chromosomes and markers per chromosome). The function replace.map may be used to replace the genetic map in a cross with a new one.
newmap <- est.map(hyper, error.prob=0.01, verbose=TRUE)
plotMap(hyper, newmap)
hyper <- replace.map(hyper, newmap)
The function calc.errorlod may be used to assist in identifying possible genotyping errors; it calculates the error LOD scores described by Lincoln and Lander (1992). The calc.errorlod function returns a modified version of the input cross, with error LOD scores included. The function top.errorlod prints the genotypes with values above a cutoff (by default, the cutoff is 4.0).
hyper <- calc.errorlod(hyper, error.prob=0.01)
top.errorlod(hyper)
The function plotGeno may be used to inspect the observed genotypes for a chromosome, with likely genotyping errors flagged.
plotGeno(hyper, chr=16, ind=c(24:34, 71:81))
Before doing QTL analyses, some intermediate calculations need to be performed. The function calc.genoprob calculates conditional genotype probabilities given the multipoint marker data. sim.geno simulates sequences of genotypes from their joint distribution, given the observed marker data.
As with calc.errorlod, these functions return a modified version of the input cross, with the intermediate calculations included. The step argument indicates the density of the grid on which the calculations will be performed, and determines the density at which LOD scores will be calculated.
hyper <- calc.genoprob(hyper, step=2.5, error.prob=0.01)
hyper <- sim.geno(hyper, step=2.5, n.draws=64, error.prob=0.01)
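The effect of the step argument can be seen by counting the positions at which genotype probabilities are stored. The following is a minimal sketch that assumes the probabilities for a chromosome are held in the prob component of that chromosome (an array of individuals by positions by genotypes, as described in the help file for read.cross). The object names hyper_demo, fine and coarse are made up for this illustration.

```r
library(qtl)

# Fresh local copy of the example data
hyper_demo <- local({
  data(hyper, package = "qtl", envir = environment())
  hyper
})

# Same calculation on a fine grid (2.5 cM) and a coarse grid (10 cM)
fine   <- calc.genoprob(hyper_demo, step = 2.5, error.prob = 0.01)
coarse <- calc.genoprob(hyper_demo, step = 10,  error.prob = 0.01)

# Second dimension of the probability array = number of positions
# (markers plus grid points) on chromosome 1
n_fine   <- dim(fine$geno[["1"]]$prob)[2]
n_coarse <- dim(coarse$geno[["1"]]$prob)[2]
c(fine = n_fine, coarse = n_coarse)
```

A smaller step gives more positions, so later genome scans evaluate LOD scores at more locations and take correspondingly longer.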
The function scanone performs a genome scan with a single QTL model. By default, it performs standard interval mapping (Lander and Botstein 1989): use of a normal model and the EM algorithm. If one specifies method="hk", Haley-Knott regression is performed (Haley and Knott 1992). These two methods require the results from calc.genoprob.
out.em <- scanone(hyper)
out.hk <- scanone(hyper, method="hk")
If one specifies method="imp", a genome scan is performed by the multiple imputation method of Sen and Churchill (2001). This method requires the results from sim.geno.
out.imp <- scanone(hyper, method="imp")
The output of scanone is a data.frame with class "scanone". The function plot.scanone may be used to plot the results, and may plot up to three sets of results against each other, as long as they conform appropriately.
plot(out.em)
plot(out.hk, col="blue", add=TRUE)
plot(out.imp, col="red", add=TRUE)
plot(out.hk, out.imp, out.em, chr=c(1,4), lty=1,
col=c("blue","red","black"))
The function summary.scanone may be used to list information on the peak LOD for each chromosome for which the LOD exceeds a specified threshold.
summary(out.em)
summary(out.em, threshold=3)
summary(out.hk, threshold=3)
summary(out.imp, threshold=3)
The function max.scanone returns the maximum LOD score, genome-wide.
max(out.em)
max(out.hk)
max(out.imp)
One may also use scanone to perform a permutation test to get a genome-wide LOD significance threshold.
operm.hk <- scanone(hyper, method="hk", n.perm=1000)
The result has class "scanoneperm". The summary.scanoneperm function may be used to calculate LOD thresholds.
summary(operm.hk, alpha=0.05)
The permutation results may also be used in the summary.scanone function to calculate LOD thresholds and genome-scan-adjusted p-values.
summary(out.hk, perms=operm.hk, alpha=0.05, pvalues=TRUE)
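To illustrate what the permutation threshold means, the sketch below recomputes it by hand as the 95th percentile of the genome-wide maximum LOD scores from the permutations. It assumes the default setting (no X-chromosome-specific permutations), so the result holds a single column of maxima. Only 50 permutations are used to keep the run short; the object names (hyper_demo, operm_demo, thr_manual, thr_pkg) are illustrative.

```r
library(qtl)

# Fresh local copy of the example data, with genotype probabilities
hyper_demo <- local({
  data(hyper, package = "qtl", envir = environment())
  hyper
})
hyper_demo <- calc.genoprob(hyper_demo, step = 2.5, error.prob = 0.01)

# A small permutation test (far fewer replicates than one would use in practice)
set.seed(20240101)
operm_demo <- scanone(hyper_demo, method = "hk", n.perm = 50, verbose = FALSE)

# One genome-wide maximum LOD score per permutation replicate
perm_max <- as.numeric(unclass(operm_demo))

# Threshold at alpha = 0.05: the 95th percentile of those maxima
thr_manual <- quantile(perm_max, probs = 0.95, names = FALSE)

# The same threshold as reported by summary.scanoneperm
thr_pkg <- as.numeric(unclass(summary(operm_demo, alpha = 0.05)))

c(manual = thr_manual, package = thr_pkg)
```

The two values should agree, which shows that the reported threshold is simply the upper alpha quantile of the permutation maxima.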
We should say at this point that the function save.image will save your workspace to disk. You'll wish you had used this if R crashes.
save.image()
The function scantwo performs a two-dimensional genome scan with a two-QTL model. Methods "em", "hk" and "imp" are all available. scantwo is considerably slower than scanone, and can require a great deal of memory. Thus, you may wish to re-run calc.genoprob and/or sim.geno with a coarser grid.
hyper <- calc.genoprob(hyper, step=10, err=0.01)
hyper <- sim.geno(hyper, step=10, n.draws=64, err=0.01)
out2.hk <- scantwo(hyper, method="hk")
out2.em <- scantwo(hyper)
out2.imp <- scantwo(hyper, method="imp")
The output is an object with class "scantwo". The function plot.scantwo may be used to plot the results. The upper triangle contains LOD scores for tests of epistasis, while the lower triangle contains LOD scores for the full model.
plot(out2.hk)
plot(out2.em)
plot(out2.imp)
The function summary.scantwo lists the interesting aspects of the output. For each pair of chromosomes $(j,k)$, it calculates the maximum LOD score for the full model, $M_f(j,k)$; a LOD score indicating evidence for a second QTL, allowing for epistasis, $M_{fv1}(j,k)$; a LOD score indicating evidence for epistasis, $M_i(j,k)$; the LOD score for the additive QTL model, $M_a(j,k)$; and a LOD score indicating evidence for a second QTL, assuming no epistasis, $M_{av1}(j,k)$.
You must provide five LOD thresholds, corresponding to the above five LOD scores, and in that order. A chromosome pair is printed if either (a) $M_f(j,k) \ge T_f$ and ($M_{fv1}(j,k) \ge T_{fv1}$ or $M_i(j,k) \ge T_i$), or (b) $M_a(j,k) \ge T_a$ and $M_{av1}(j,k) \ge T_{av1}$.
summary(out2.em, thresholds=c(6.2, 5.0, 4.6, 4.5, 2.3))
summary(out2.em, thresholds=c(6.2, 5.0, Inf, 4.5, 2.3))
In the latter case, the interaction LOD score will be ignored.
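The reporting rule for chromosome pairs can be written out as a small function. This is a sketch of the rule as stated above, not code taken from R/qtl: the helper pair_is_reported, the labels full, fv1, int, add and av1, and the LOD values in lod_example are all made up for illustration.

```r
# Hypothetical helper (not part of R/qtl): decide whether a chromosome pair
# would be reported, given its five LOD scores and the five thresholds.
pair_is_reported <- function(lod, thr) {
  # (a) full model is large, and either the second-QTL or the epistasis LOD is large
  full_ok <- lod[["full"]] >= thr[["full"]] &&
    (lod[["fv1"]] >= thr[["fv1"]] || lod[["int"]] >= thr[["int"]])
  # (b) additive model is large, and so is the additive second-QTL LOD
  add_ok <- lod[["add"]] >= thr[["add"]] && lod[["av1"]] >= thr[["av1"]]
  full_ok || add_ok
}

labels <- c("full", "fv1", "int", "add", "av1")

# The two sets of thresholds used in the summary() calls above
thr       <- setNames(c(6.2, 5.0, 4.6, 4.5, 2.3), labels)
thr_noint <- setNames(c(6.2, 5.0, Inf, 4.5, 2.3), labels)

# Made-up LOD scores for one chromosome pair
lod_example <- setNames(c(7.0, 3.0, 4.8, 2.0, 1.0), labels)

pair_is_reported(lod_example, thr)        # TRUE: passes (a) through the epistasis LOD
pair_is_reported(lod_example, thr_noint)  # FALSE: an infinite threshold disables that route
```

Setting the third threshold to Inf means the epistasis LOD can never reach it, which is why the interaction LOD score is effectively ignored in that case.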
The function max.scantwo returns the maximum joint and additive LODs for a two-dimensional genome scan.
max(out2.em)
Permutation tests may also be performed with scantwo; it may take a few days of CPU time. The output is a list containing the maxima of the above five LOD scores for each of the permutations.
operm2 <- scantwo(hyper, method="hk", n.perm=100)
summary(operm2, alpha=0.05)
To cite R/qtl in publications, use the Broman et al. (2003) reference listed below.
Karl W Broman, [email protected]
Broman, K. W. and Sen, Ś. (2009) A guide to QTL mapping with R/qtl. Springer. https://rqtl.org/book/
Broman, K. W., Wu, H., Sen, Ś. and Churchill, G. A. (2003) R/qtl: QTL mapping in experimental crosses. Bioinformatics 19, 889–890.
Haley, C. S. and Knott, S. A. (1992) A simple regression method for mapping quantitative trait loci in line crosses using flanking markers. Heredity 69, 315–324.
Lander, E. S. and Botstein, D. (1989) Mapping Mendelian factors underlying quantitative traits using RFLP linkage maps. Genetics 121, 185–199.
Lincoln, S. E. and Lander, E. S. (1992) Systematic detection of errors in genetic linkage data. Genomics 14, 604–610.
Sen, Ś. and Churchill, G. A. (2001) A statistical framework for quantitative trait mapping. Genetics 159, 371–387.
Add dots at the locations of the selected marker covariates, for a plot of composite interval mapping results.
add.cim.covar(cimresult, chr, gap=25, ...)
cimresult | Composite interval mapping results, as output from cim. |
chr | Optional vector specifying which chromosomes to plot. (The chromosomes must be specified by name.) This should be identical to that used in the call to plot.scanone. |
gap | Gap separating chromosomes (in cM). This should be identical to that used in the call to plot.scanone. |
... | Additional plot arguments, passed to the function points. |
One must first have used the function plot.scanone to plot the composite interval mapping results. The arguments chr and gap must be identical to the values used in the call to plot.scanone.
Dots indicating the locations of the selected marker covariates are displayed on the x-axis. (By default, solid red circles are plotted; this may be modified by specifying the graphics parameters pch and col.)
A data frame indicating the marker covariates that were plotted.
Karl W Broman, [email protected]
## Not run:
data(hyper)
hyper <- calc.genoprob(hyper, step=2.5)
out <- scanone(hyper)
out.cim <- cim(hyper, n.marcovar=3)
plot(out, out.cim, chr=c(1,4,6,15), col=c("blue", "red"))
add.cim.covar(out.cim, chr=c(1,4,6,15))
## End(Not run)
Add a significance threshold to a plot created by plot.scanone, using the permutation results.
add.threshold(out, chr, perms, alpha=0.05, lodcolumn=1, gap=25, ...)
out | An object of class "scanone", as output by scanone. |
chr | Optional vector specifying which chromosomes to plot. If a selected subset of chromosomes were plotted, they must be specified here. |
perms | Permutation results from scanone. |
alpha | Significance level of the threshold. |
lodcolumn | An integer indicating which column in the permutation results should be used. |
gap | Gap separating chromosomes (in cM). This must be identical to what was used in the call to plot.scanone. |
... | Passed to the function abline. |
This function allows you to add a horizontal line at the significance threshold to genome scan results plotted by plot.scanone.
The arguments out, chr, and gap must match what was used in the call to plot.scanone.
The argument perms must be specified. If X-chromosome-specific permutations were performed (via the argument perm.Xsp in the call to scanone), separate thresholds will be plotted for the autosomes and the X chromosome. These are calculated via the summary.scanoneperm function.
None.
Karl W Broman, [email protected]
scanone, plot.scanone, summary.scanoneperm, xaxisloc.scanone
data(hyper)
hyper <- calc.genoprob(hyper)
out <- scanone(hyper, method="hk")
operm <- scanone(hyper, method="hk", n.perm=100, perm.Xsp=TRUE)
plot(out, chr=c(1,4,6,15,"X"))
add.threshold(out, chr=c(1,4,6,15,"X"), perms=operm, alpha=0.05)
add.threshold(out, chr=c(1,4,6,15,"X"), perms=operm, alpha=0.1, col="green", lty=2)
Try adding all QTL x covariate interactions, one at a time, to a multiple QTL model, for a given set of covariates.
addcovarint(cross, pheno.col=1, qtl, covar=NULL, icovar, formula, method=c("imp","hk"), model=c("normal", "binary"), verbose=TRUE, pvalues=TRUE, simple=FALSE, tol=1e-4, maxit=1000, require.fullrank=FALSE)
cross | An object of class cross. |
pheno.col | Column number in the phenotype matrix which should be used as the phenotype. One may also give a character string matching a phenotype name. Finally, one may give a numeric vector of phenotypes, in which case it must have length equal to the number of individuals in the cross, and there must be either non-integers or values < 1 or > no. phenotypes; this last case may be useful for studying transformations. |
qtl | An object of class qtl, as output from makeqtl. |
covar | A matrix or data.frame of covariates. These must be strictly numeric. |
icovar | Vector of character strings indicating the columns in covar to be considered for QTL x covariate interactions. |
formula | An object of class formula indicating the model to be fit. |
method | Indicates whether to use multiple imputation or Haley-Knott regression. |
model | The phenotype model: the usual model or a model for binary traits. |
verbose | If TRUE, will print a message if there are no interactions to test. |
pvalues | If FALSE, p-values will not be included in the results. |
simple | If TRUE, don't include p-values or sums of squares in the summary. |
tol | Tolerance for convergence for the binary trait model. |
maxit | Maximum number of iterations for fitting the binary trait model. |
require.fullrank | If TRUE, give LOD=0 when covariate matrix in the linear regression is not of full rank. |
The formula is used to specify the model to be fit. In the formula, use Q1, Q2, etc., or q1, q2, etc., to represent the QTLs, and the column names in the covariate data frame to represent the covariates.
We enforce a hierarchical structure on the model formula: if a QTL or covariate is involved in an interaction, its main effect must also be included.
An object of class addcovarint, with results as in the drop-one-term analysis from fitqtl. This is a data frame (given class "addcovarint"), with the following columns: degrees of freedom (df), Type III sum of squares (Type III SS), LOD score (LOD), percentage of variance explained (%var), F statistics (F value), and P values for chi square (Pvalue(chi2)) and F distribution (Pvalue(F)).
Note that the degrees of freedom, Type III sum of squares, the LOD score and the percentage of variance explained are the values comparing the full to the sub-model with the term dropped. Also note that for the imputation method, the percentage of variance explained, the F values and the P values are approximations calculated from the LOD score.
QTL x covariate interactions already included in the input formula are not tested.
Karl W Broman, [email protected]
Haley, C. S. and Knott, S. A. (1992) A simple regression method for mapping quantitative trait loci in line crosses using flanking markers. Heredity 69, 315–324.
Sen, Ś. and Churchill, G. A. (2001) A statistical framework for quantitative trait mapping. Genetics 159, 371–387.
addint, fitqtl, makeqtl, scanqtl, refineqtl, addqtl, addpair
data(fake.f2)
# take out several QTLs and make QTL object
qc <- c(1, 8, 13)
qp <- c(26, 56, 28)
fake.f2 <- subset(fake.f2, chr=qc)
fake.f2 <- calc.genoprob(fake.f2, step=2, err=0.001)
qtl <- makeqtl(fake.f2, qc, qp, what="prob")
# use the sex phenotype as the covariate
covar <- data.frame(sex=fake.f2$pheno$sex)
# try all possible QTL x sex interactions, one at a time
addcovarint(fake.f2, pheno.col=1, qtl, covar, "sex", y~Q1+Q2+Q3, method="hk")
Try adding all possible pairwise interactions, one at a time, to a multiple QTL model.
addint(cross, pheno.col=1, qtl, covar=NULL, formula, method=c("imp","hk"), model=c("normal", "binary"), qtl.only=FALSE, verbose=TRUE, pvalues=TRUE, simple=FALSE, tol=1e-4, maxit=1000, require.fullrank=FALSE)
cross | An object of class cross. |
pheno.col | Column number in the phenotype matrix to be used as the phenotype. One may also give a character string matching a phenotype name. Finally, one may give a numeric vector of phenotypes, in which case it must have length equal to the number of individuals in the cross, and there must be either non-integers or values < 1 or > no. phenotypes; this last case may be useful for studying transformations. |
qtl | An object of class qtl, as output from makeqtl. |
covar | A matrix or data.frame of covariates. These must be strictly numeric. |
formula | An object of class formula indicating the model to be fit. |
method | Indicates whether to use multiple imputation or Haley-Knott regression. |
model | The phenotype model: the usual model or a model for binary traits. |
qtl.only | If TRUE, only test QTL:QTL interactions (and not interactions with covariates). |
verbose | If TRUE, will print a message if there are no interactions to test. |
pvalues | If FALSE, p-values will not be included in the results. |
simple | If TRUE, don't include p-values or sums of squares in the summary. |
tol | Tolerance for convergence for the binary trait model. |
maxit | Maximum number of iterations for fitting the binary trait model. |
require.fullrank | If TRUE, give LOD=0 when covariate matrix in the linear regression is not of full rank. |
The formula is used to specify the model to be fit. In the formula, use Q1, Q2, etc., or q1, q2, etc., to represent the QTLs, and the column names in the covariate data frame to represent the covariates.
We enforce a hierarchical structure on the model formula: if a QTL or covariate is involved in an interaction, its main effect must also be included.
An object of class addint, with results as in the drop-one-term analysis from fitqtl. This is a data frame (given class "addint"), with the following columns: degrees of freedom (df), Type III sum of squares (Type III SS), LOD score (LOD), percentage of variance explained (%var), F statistics (F value), and P values for chi square (Pvalue(chi2)) and F distribution (Pvalue(F)).
Note that the degrees of freedom, Type III sum of squares, the LOD score and the percentage of variance explained are the values comparing the full to the sub-model with the term dropped. Also note that for the imputation method, the percentage of variance explained, the F values and the P values are approximations calculated from the LOD score.
Pairwise interactions already included in the input formula are not tested.
Karl W Broman, [email protected]
Haley, C. S. and Knott, S. A. (1992) A simple regression method for mapping quantitative trait loci in line crosses using flanking markers. Heredity 69, 315–324.
Sen, Ś. and Churchill, G. A. (2001) A statistical framework for quantitative trait mapping. Genetics 159, 371–387.
addcovarint, fitqtl, makeqtl, scanqtl, refineqtl, addqtl, addpair
data(fake.f2)
# take out several QTLs and make QTL object
qc <- c(1, 8, 13)
qp <- c(26, 56, 28)
fake.f2 <- subset(fake.f2, chr=qc)
fake.f2 <- calc.genoprob(fake.f2, step=2, err=0.001)
qtl <- makeqtl(fake.f2, qc, qp, what="prob")
# try all possible pairwise interactions, one at a time
addint(fake.f2, pheno.col=1, qtl, formula=y~Q1+Q2+Q3, method="hk")
Add phenotype location(s) into a cross object (with eQTL/pQTL studies)
addloctocross(cross, locations=NULL, locfile="locations.txt", verbose=FALSE)
cross | An object of class cross. |
locations | R variable holding location information. |
locfile | File from which to load the locations; see the details section for the layout of the file. |
verbose | If TRUE, give verbose output. |
inputfile layout:
Num Name Chr cM
1 X3.Hydroxypropyl 4 50.0
Num is the number of the phenotype in the cross object.
Name is the name of the phenotype (will be checked against the name already in the cross object at position Num).
Chr is the chromosome.
cM is the position from the start of the chromosome, in cM.
The input cross object, with the locations added as an additional component locations
Ritsert C Jansen; Danny Arends; Pjotr Prins; Karl W Broman [email protected]
mqmplot.cistrans - Cis/trans plot
The MQM tutorial: https://rqtl.org/tutorials/MQM-tour.pdf
MQM - MQM description and references
mqmscan - Main MQM single trait analysis
mqmscanall - Parallelized traits analysis
mqmaugment - Augmentation routine for estimating missing data
mqmautocofactors - Set cofactors using marker density
mqmsetcofactors - Set cofactors at fixed locations
mqmpermutation - Estimate significance levels
scanone - Single QTL scanning
## Not run:
data(multitrait)
data(locations)
multiloc <- addloctocross(multitrait,locations)
results <- scanall(multiloc)
mqmplot.cistrans(results, multiloc, 5, FALSE, TRUE)
## End(Not run)
Add a marker to a cross object.
addmarker(cross, genotypes, markername, chr, pos)
cross | An object of class cross. |
genotypes | Vector of numeric genotypes. |
markername | Marker name as character string. |
chr | Chromosome ID as character string. |
pos | Position of marker, as numeric value. |
Use this function with caution. It would be best to incorporate new data into a single file to be imported with read.cross.
But if you have genotypes on one or two additional markers that you want to add, you might load them with read.csv and incorporate them with this function.
The input cross object with the single marker added.
Karl W Broman, [email protected]
data(fake.f2)
# genotypes for new marker
gi <- pull.geno(fill.geno(fake.f2))[,"D5M197"]
# add marker to cross
fake.f2 <- addmarker(fake.f2, gi, "D5M197imp", "5", 11)
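The read.csv route mentioned in the details might look roughly like the sketch below. Everything about the file is an assumption made for illustration: the temporary file, the column name newmar, the randomly generated genotypes and the chosen position of 25 cM on chromosome 5. In real use the file would hold measured genotypes, coded numerically and listed in the same order as the individuals in the cross.

```r
library(qtl)

# Fresh local copy of the example data
f2_demo <- local({
  data(fake.f2, package = "qtl", envir = environment())
  fake.f2
})

# Stand-in for a real genotype file: one numeric genotype (1, 2 or 3)
# per individual, written to a temporary CSV file
set.seed(1)
csv_path <- tempfile(fileext = ".csv")
write.csv(data.frame(newmar = sample(1:3, nind(f2_demo), replace = TRUE)),
          csv_path, row.names = FALSE)

# Read the genotypes back and check there is one per individual
new_geno <- read.csv(csv_path)$newmar
stopifnot(length(new_geno) == nind(f2_demo))

# Add the marker to chromosome 5 at an assumed position of 25 cM
f2_plus <- addmarker(f2_demo, new_geno, "newmar", "5", 25)

totmar(f2_plus) - totmar(f2_demo)
```

Checking the length of the genotype vector against nind before calling addmarker guards against the most likely mistake, a file that does not line up with the individuals in the cross.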
Scan for an additional pair of QTL in the context of a multiple QTL model.
addpair(cross, chr, pheno.col=1, qtl, covar=NULL, formula, method=c("imp","hk"), model=c("normal", "binary"), incl.markers=FALSE, verbose=TRUE, tol=1e-4, maxit=1000, forceXcovar=FALSE)
cross | An object of class cross. |
chr | Optional vector indicating the chromosomes to be scanned. If missing, all chromosomes are scanned. Refer to chromosomes by name. Refer to chromosomes with a preceding - to exclude them. |
pheno.col | Column number in the phenotype matrix to be used as the phenotype. One may also give a character string matching a phenotype name. Finally, one may give a numeric vector of phenotypes, in which case it must have length equal to the number of individuals in the cross, and there must be either non-integers or values < 1 or > no. phenotypes; this last case may be useful for studying transformations. |
qtl | An object of class qtl, as output from makeqtl. |
covar | A matrix or data.frame of covariates. These must be strictly numeric. |
formula | An object of class formula indicating the model to be fit. |
method | Indicates whether to use multiple imputation or Haley-Knott regression. |
model | The phenotype model: the usual model or a model for binary traits. |
incl.markers | If FALSE, do calculations only at points on an evenly spaced grid. |
verbose | If TRUE, display information about the progress of calculations. |
tol | Tolerance for convergence for the binary trait model. |
maxit | Maximum number of iterations for fitting the binary trait model. |
forceXcovar | If TRUE, force inclusion of X-chr-related covariates (like sex and cross direction). |
The formula is used to specify the model to be fit. In the formula, use Q1, Q2, etc., or q1, q2, etc., to represent the QTLs, and the column names in the covariate data frame to represent the covariates.
We enforce a hierarchical structure on the model formula: if a QTL or covariate is involved in an interaction, its main effect must also be included.
If neither of the two new QTL are indicated in the formula, we perform a two-dimensional scan as in scantwo. That is, for each pair of QTL positions, we fit two models: two additive QTL added to the formula, and two interacting QTL added to the formula.
If both of the new QTL are indicated in the formula, that particular model is fit, with the positions of the new QTL allowed to vary across the genome. If just one of the QTL is indicated in the formula, a main effect for the other is added, and that particular model is fit, again with the positions of both QTL varying. Note that in this case the LOD scores are not analogous to those produced by scantwo. Thus, there are slightly modified forms for the plots (produced by plot.scantwo) and summaries (produced by summary.scantwo and max.scantwo). In the plot, the x-axis is to be interpreted as the position of the first of the new QTL, and the y-axis is to be interpreted as the position of the second of the new QTL. In the summaries, we give the single best pair of positions on each pair of chromosomes, and give LOD scores comparing that pair of positions to the base model (without each of these QTL), and to the base model plus one additional QTL on one or the other of the chromosomes.
An object of class scantwo, as produced by scantwo.
If neither of the new QTL were indicated in the formula, the result is just as in scantwo, though with LOD scores relative to the base model (omitting the new QTL).
Otherwise, the results are contained in what would ordinarily be in the full and additive LOD scores, with the additive LOD scores corresponding to the case that the first of the new QTL is to the left of the second of the new QTL, and the full LOD scores corresponding to the case that the first of the new QTL is to the right of the second of the new QTL. Because the structure of the LOD scores in this case is different from those output by scantwo, we include, in this case, an attribute "addpair"=TRUE. (We also require results of single-dimensional scans, omitting each of the two new QTL from the formula, one at a time; these are included as attributes "lod.minus1" and "lod.minus2".) The results are then treated somewhat differently by summary.scantwo, max.scantwo, and plot.scantwo. See the Details section.
Karl W Broman, [email protected]
Haley, C. S. and Knott, S. A. (1992) A simple regression method for mapping quantitative trait loci in line crosses using flanking markers. Heredity 69, 315–324.
Sen, Ś. and Churchill, G. A. (2001) A statistical framework for quantitative trait mapping. Genetics 159, 371–387.
addint, addqtl, fitqtl, makeqtl, scanqtl, refineqtl, scantwo, addtoqtl
# A totally contrived example to show some of what you can do

# simulate backcross data with 3 chromosomes (names "17", "18", "19")
#   one QTL on chr 17 at 40 cM
#   one QTL on chr 18 at 30 cM
#   two QTL on chr 19, at 10 and 40 cM
data(map10)
model <- rbind(c(1,40,0), c(2,30,0), c(3,10,0), c(3,40,0))
fakebc <- sim.cross(map10[17:19], model=model, type="bc", n.ind=250)

# het at QTL on 17 and 1st QTL on 19 increases phenotype by 1 unit
# het at QTL on 18 and 2nd QTL on 19 decreases phenotype by 1 unit
qtlgeno <- fakebc$qtlgeno
phe <- rnorm(nind(fakebc))
w <- qtlgeno[,1]==2 & qtlgeno[,3]==2
phe[w] <- phe[w] + 1
w <- qtlgeno[,2]==2 & qtlgeno[,4]==2
phe[w] <- phe[w] - 1
fakebc$pheno[,1] <- phe

fakebc <- calc.genoprob(fakebc, step=2, err=0.001)

# base model has QTLs on chr 17 and 18
qtl <- makeqtl(fakebc, chr=c("17", "18"), pos=c(40,30), what="prob")

# scan for an additional pair of QTL, one interacting with the locus
# on 17 and one interacting with the locus on 18
out.ap <- addpair(fakebc, qtl=qtl, formula = y~Q1*Q3 + Q2*Q4, method="hk")
max(out.ap)
summary(out.ap)
plot(out.ap)
Scan for an additional QTL in the context of a multiple QTL model.
addqtl(cross, chr, pheno.col=1, qtl, covar=NULL, formula,
       method=c("imp","hk"), model=c("normal", "binary"),
       incl.markers=TRUE, verbose=FALSE, tol=1e-4, maxit=1000,
       forceXcovar=FALSE, require.fullrank=FALSE)
cross |
An object of class |
chr |
Optional vector indicating the chromosomes to be scanned. If
missing, all chromosomes are scanned. Refer to chromosomes by
name. Refer to chromosomes with a preceding |
pheno.col |
Column number in the phenotype matrix to be used as the phenotype. One may also give a character string matching a phenotype name. Finally, one may give a numeric vector of phenotypes, in which case it must have the length equal to the number of individuals in the cross, and there must be either non-integers or values < 1 or > no. phenotypes; this last case may be useful for studying transformations. |
qtl |
An object of class |
covar |
A matrix or data.frame of covariates. These must be strictly numeric. |
formula |
An object of class |
method |
Indicates whether to use multiple imputation or Haley-Knott regression. |
model |
The phenotype model: the usual model or a model for binary traits |
incl.markers |
If FALSE, do calculations only at points on an
evenly spaced grid. If |
verbose |
If TRUE, display information about the progress of
calculations. If |
tol |
Tolerance for convergence for the binary trait model. |
maxit |
Maximum number of iterations for fitting the binary trait model. |
forceXcovar |
If TRUE, force inclusion of X-chr-related covariates (like sex and cross direction). |
require.fullrank |
If TRUE, give LOD=0 when covariate matrix in the linear regression is not of full rank. |
The formula is used to specify the model to be fit. In the formula, use Q1, Q2, etc., or q1, q2, etc., to represent the QTLs, and the column names in the covariate data frame to represent the covariates.
We enforce a hierarchical structure on the model formula: if a QTL or covariate is involved in an interaction, its main effect must also be included.
If one wishes to scan for QTL that interact with another QTL, include it in the formula (with an index of one more than the number of QTL in the input qtl object).
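The hierarchy rule can be checked mechanically on a formula. The following is a minimal base-R sketch, not code from R/qtl: the helper name is_hierarchical is hypothetical, and it only tests that every variable appearing in an interaction term also appears as a main effect.

```r
# Hypothetical helper (not part of R/qtl): TRUE if every variable that
# appears in an interaction term also appears as a main effect.
is_hierarchical <- function(formula) {
  labels <- attr(terms(formula), "term.labels")
  is_int <- grepl(":", labels, fixed = TRUE)
  main <- labels[!is_int]
  involved <- unique(unlist(strsplit(labels[is_int], ":", fixed = TRUE)))
  all(involved %in% main)
}

# With three QTL in the qtl object, the scanned QTL is Q4.
# Q1*Q4 expands to Q1 + Q4 + Q1:Q4, so the main effects are present.
is_hierarchical(y ~ Q1*Q4 + Q2 + Q3)

# Q1:Q4 alone gives the interaction without the Q1 and Q4 main effects.
is_hierarchical(y ~ Q1:Q4 + Q2 + Q3)
```

Writing interactions with * rather than : is the simplest way to stay within the rule, since * always brings the main effects along.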
An object of class scanone, as produced by the scanone function. LOD scores are relative to the base model (with any terms that include the new QTL omitted).
Karl W Broman, [email protected]
Haley, C. S. and Knott, S. A. (1992) A simple regression method for mapping quantitative trait loci in line crosses using flanking markers. Heredity 69, 315–324.
Sen, Ś. and Churchill, G. A. (2001) A statistical framework for quantitative trait mapping. Genetics 159, 371–387.
scanone, fitqtl, scanqtl, refineqtl, makeqtl, addtoqtl, addpair, addint
data(fake.f2)

# take out several QTLs and make QTL object
qc <- c(1, 8, 13)
qp <- c(26, 56, 28)

fake.f2 <- subset(fake.f2, chr=c(1,2,3,8,13))

fake.f2 <- calc.genoprob(fake.f2, step=2, err=0.001)
qtl <- makeqtl(fake.f2, qc, qp, what="prob")

# scan for an additional QTL
out1 <- addqtl(fake.f2, qtl=qtl, formula=y~Q1+Q2+Q3, method="hk")
max(out1)

# scan for an additional QTL that interacts with the locus on chr 1
out2 <- addqtl(fake.f2, qtl=qtl, formula=y~Q1*Q4+Q2+Q3, method="hk")
max(out2)

# plot interaction LOD scores
plot(out2-out1)
Add a QTL or multiple QTL to a qtl object.
addtoqtl(cross, qtl, chr, pos, qtl.name, drop.lod.profile=TRUE)
cross |
An object of class |
qtl |
The qtl object to which additional QTL are to be added. |
chr |
Vector indicating the chromosome for each new QTL. (These should be character strings referring to the chromosomes by name.) |
pos |
Vector (of same length as |
qtl.name |
Optional user-specified name for each new QTL, used in the
drop-one-term ANOVA table in |
drop.lod.profile |
If TRUE, remove any LOD profiles from the object. |
An object of class qtl, just like the input qtl object, but with additional QTL added. See makeqtl for details.
Karl W Broman, [email protected]
makeqtl, fitqtl, dropfromqtl, replaceqtl, reorderqtl
data(fake.f2)

# take out several QTLs and make QTL object
qc <- c(1, 6, 13)
qp <- c(25.8, 33.6, 18.63)
fake.f2 <- calc.genoprob(fake.f2, step=2, err=0.001)
qtl <- makeqtl(fake.f2, qc, qp, what="prob")

qtl <- addtoqtl(fake.f2, qtl, 14, 35)
In order to assess the support for a linkage group, this function splits the linkage groups into two pieces at each interval and in each case calculates a LOD score comparing the combined linkage group to the two pieces.
allchrsplits(cross, chr, error.prob=0.0001,
             map.function=c("haldane","kosambi","c-f","morgan"),
             m=0, p=0, maxit=4000, tol=1e-6, sex.sp=TRUE,
             verbose=TRUE)
cross |
An object of class |
chr |
A vector specifying which chromosomes to study.
This should be a vector of character
strings referring to chromosomes by name; numeric values are
converted to strings. Refer to chromosomes with a preceding |
error.prob |
Assumed genotyping error rate used in the calculation of the penetrance Pr(observed genotype | true genotype). |
map.function |
Indicates whether to use the Haldane, Kosambi, Carter-Falconer, or Morgan map function when converting genetic distances into recombination fractions. (Ignored if m > 0.) |
m |
Interference parameter for the chi-square model for interference; a non-negative integer, with m=0 corresponding to no interference. This may be used only for a backcross or intercross. |
p |
Proportion of chiasmata from the NI mechanism, in the Stahl model; p=0 gives a pure chi-square model. This may be used only for a backcross or intercross. |
maxit |
Maximum number of EM iterations to perform. |
tol |
Tolerance for determining convergence. |
sex.sp |
Indicates whether to estimate sex-specific maps; this is used only for the 4-way cross. |
verbose |
If TRUE, print information on progress. |
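The map.function argument determines how a genetic distance is converted into a recombination fraction. As a rough illustration, the sketch below writes out the standard textbook forms of the Haldane, Kosambi, and Morgan map functions in base R. The helper names are hypothetical and this is not the package's internal code. The Carter-Falconer function is left out because it is usually stated in the inverse direction (distance as a function of recombination fraction) and would need numerical inversion.

```r
# Convert a genetic distance d (in cM) to a recombination fraction.
# Hypothetical helper names; standard textbook forms of the map functions.
haldane_rf <- function(d) 0.5 * (1 - exp(-2 * d / 100))  # no interference
kosambi_rf <- function(d) 0.5 * tanh(2 * d / 100)        # moderate interference
morgan_rf  <- function(d) pmin(d / 100, 0.5)             # complete interference

d <- c(1, 10, 50)
round(cbind(d,
            haldane = haldane_rf(d),
            kosambi = kosambi_rf(d),
            morgan  = morgan_rf(d)), 4)
```

For short distances the three agree closely; they diverge as the distance grows, with Haldane giving the smallest recombination fraction and Morgan the largest.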
A data frame (actually, an object of class "scanone", so that one may use plot.scanone, summary.scanone, etc.) with each row being an interval at which a split is made. The first two columns are the chromosome ID and midpoint of the interval. The third column is a LOD score comparing the combined linkage group to the split into two linkage groups. A fourth column (gap) indicates the length of each interval.
The row names indicate the flanking markers for each interval.
Karl W Broman, [email protected]
est.map, ripple, est.rf, switch.order, movemarker
data(fake.bc)
allchrsplits(fake.bc, 7, error.prob=0, verbose=FALSE)
Uses the Viterbi algorithm to identify the most likely sequence of underlying genotypes, given the observed multipoint marker data, with possible allowance for genotyping errors.
argmax.geno(cross, step=0, off.end=0, error.prob=0.0001,
            map.function=c("haldane","kosambi","c-f","morgan"),
            stepwidth=c("fixed", "variable", "max"))
cross |
An object of class |
step |
Maximum distance (in cM) between positions at which the
genotypes are reconstructed, though for |
off.end |
Distance (in cM) past the terminal markers on each chromosome to which the genotype reconstructions will be carried. |
error.prob |
Assumed genotyping error rate used in the calculation of the penetrance Pr(observed genotype | true genotype). |
map.function |
Indicates whether to use the Haldane, Kosambi, Carter-Falconer or Morgan map function when converting genetic distances into recombination fractions. |
stepwidth |
Indicates whether the intermediate points should be placed with
fixed or variable step sizes. We recommend using
|
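We use the Viterbi algorithm to calculate $\arg\max_v \Pr(g = v \mid O)$, where $g$ is the underlying sequence of genotypes and $O$ is the observed marker genotypes.
This is done by calculating $\gamma_k(v_k) = \max_{v_1, \ldots, v_{k-1}} \Pr(g_1 = v_1, \ldots, g_k = v_k, O_1, \ldots, O_k)$ for $k = 1, \ldots, n$ and then tracing back through the sequence.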
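To make the recursion and trace-back concrete, here is a minimal base-R sketch of the Viterbi algorithm for a backcross (two genotypes, coded 1 and 2). It is an illustration under stated assumptions, not the package's implementation: the function name viterbi_bc is hypothetical, rf holds the recombination fractions between adjacent positions, missing genotypes are coded NA, and the error model simply assigns probability error_prob to observing the wrong genotype.

```r
# Hypothetical sketch: most likely genotype sequence for one backcross
# individual on one chromosome.
#   obs: observed genotypes (1, 2, or NA) at each position
#   rf:  recombination fractions between adjacent positions
viterbi_bc <- function(obs, rf, error_prob = 1e-4) {
  n <- length(obs)
  stopifnot(length(rf) == n - 1)

  # log Pr(observed | true genotype = 1 or 2); missing data are uninformative
  emit <- function(o) {
    if (is.na(o)) return(c(0, 0))
    log(ifelse(1:2 == o, 1 - error_prob, error_prob))
  }

  gamma <- matrix(-Inf, 2, n)        # best log-probability of a path ending in v at k
  back  <- matrix(NA_integer_, 2, n) # genotype at k-1 on that best path
  gamma[, 1] <- log(0.5) + emit(obs[1])

  for (k in seq_len(n - 1) + 1) {
    for (v in 1:2) {
      trans <- log(ifelse(1:2 == v, 1 - rf[k - 1], rf[k - 1]))
      cand <- gamma[, k - 1] + trans
      back[v, k] <- which.max(cand)
      gamma[v, k] <- max(cand) + emit(obs[k])[v]
    }
  }

  # trace back from the best final state
  path <- integer(n)
  path[n] <- which.max(gamma[, n])
  for (k in rev(seq_len(n - 1))) path[k] <- back[path[k + 1], k + 1]
  path
}

viterbi_bc(c(1, NA, NA, 2, 2), rf = rep(0.05, 4))
```

When several paths are equally likely, as with the location of the single recombination event in this example, which.max picks one of them; the returned sequence is then one of several tied maxima.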
The input cross object is returned with a component, argmax, added to each component of cross$geno. The argmax component is a matrix of size [n.ind x n.pos], where n.pos is the number of positions at which the reconstructed genotypes were obtained, containing the most likely sequences of underlying genotypes. Attributes "error.prob", "step", and "off.end" are set to the values of the corresponding arguments, for later reference.
The Viterbi algorithm can behave badly when step is small but positive. One may observe quite different results for different values of step.
The problem is that, in the presence of data like A----H, the sequences AAAAAA and HHHHHH may be more likely than any one of the sequences AAAAAH, AAAAHH, AAAHHH, AAHHHH, AHHHHH. The Viterbi algorithm produces a single "most likely" sequence of underlying genotypes.
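A small calculation shows how this can happen. The numbers below are assumptions chosen for illustration: an A observed at the first position, an H at the last, four unobserved positions in between, the same recombination fraction r between adjacent positions, and a genotyping error probability e.

```r
# Assumed values for illustration only
r <- 0.002   # recombination fraction between adjacent positions (about 0.2 cM)
e <- 0.01    # genotyping error probability

# Joint probability of the data and the path AAAAAA: no recombination in
# five intervals, and the H seen at the last position is a genotyping error.
# (HHHHHH has the same probability, with the error at the first position.)
p_no_recomb <- 0.5 * (1 - e) * (1 - r)^5 * e

# Joint probability of the data and any one path with a single
# recombination (for example AAAAAH): both observed genotypes are correct.
p_one_recomb <- 0.5 * (1 - e) * (1 - r)^4 * r * (1 - e)

p_no_recomb > p_one_recomb   # TRUE whenever e > r
5 * p_one_recomb             # total over the five single-recombinant paths
```

Each single-recombinant path is individually less likely than AAAAAA, even though the five of them together carry about as much probability. Because the comparison turns on whether r is below e, changing step (and hence r) can flip the result.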
Karl W Broman, [email protected]
Lange, K. (1999) Numerical analysis for statisticians. Springer-Verlag. Sec 23.3.
Rabiner, L. R. (1989) A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE 77, 257–286.
sim.geno, calc.genoprob, fill.geno
data(fake.f2)
fake.f2 <- argmax.geno(fake.f2, step=2, off.end=5, err=0.01)
Add or subtract LOD scores in results from scanone or scantwo.
scan1+scan2
scan1-scan2
scan1, scan2 |
Genome scan results on the same set of chromosomes
and markers, as output by |
This is used to calculate the sum or difference of LOD scores of two genome scan results. It is particularly useful for calculating the LOD scores for QTL-by-covariate interactions (see the example, below). Note that the degrees of freedom are also added or subtracted.
The same type of data structure as the input objects, with LOD scores added or subtracted.
Karl W Broman, [email protected]
data(fake.bc)
fake.bc <- calc.genoprob(fake.bc, step=2.5)

# covariates
ac <- pull.pheno(fake.bc, c("sex","age"))
ic <- pull.pheno(fake.bc, "sex")

# scan with additive but not the interactive covariate
out.acovar <- scanone(fake.bc, addcovar=ac)

# scan with interactive covariate
out.icovar <- scanone(fake.bc, addcovar=ac, intcovar=ic)

# plot the difference of with and without the interactive covariate
#     This is a LOD score for a test of QTL x covariate interaction
plot(out.icovar-out.acovar)
Add or subtract LOD scores in permutation results from scanone or scantwo.
perm1+perm2
perm1-perm2
perm1, perm2 |
Permutation results from
|
This is used to calculate the sum or difference of LOD scores of two sets of permutation results from scanone or scantwo. One must be careful to ensure that the permutations are perfectly linked, which will require the use of set.seed.
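The point about linked permutations rests on the fact that resetting the random seed reproduces the same sequence of shuffles. A minimal base-R sketch of that idea follows; the seed value and the number of individuals are arbitrary, and sample() stands in for whatever shuffling the permutation test performs internally.

```r
seed <- 20240101   # arbitrary fixed seed
n <- 10            # hypothetical number of individuals

set.seed(seed)
perm_run1 <- replicate(3, sample(n))   # shuffles drawn for the first run

set.seed(seed)
perm_run2 <- replicate(3, sample(n))   # shuffles drawn for the second run

# Same seed, same sequence of shuffles: replicate i of one run uses the
# same permutation as replicate i of the other.
identical(perm_run1, perm_run2)
```

Only when the two sets of permutations line up replicate by replicate does their difference describe the null distribution of the interaction LOD score.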
The same data structure as the input objects, with LOD scores added or subtracted.
Karl W Broman, [email protected]
data(fake.bc)
fake.bc <- calc.genoprob(fake.bc, step=2.5)

# covariates
ac <- pull.pheno(fake.bc, c("sex","age"))
ic <- pull.pheno(fake.bc, "sex")

# set seed
theseed <- round(runif(1, 1, 10^8))
set.seed(theseed)

# permutations with additive but not the interactive covariate
## Not run:
operm.acovar <- scanone(fake.bc, addcovar=ac, n.perm=1000)
## End(Not run)

# re-set the seed
set.seed(theseed)

# permutations with interactive covariate
## Not run:
operm.icovar <- scanone(fake.bc, addcovar=ac, intcovar=ic,
                        n.perm=1000)
## End(Not run)

# permutation results for the QTL x covariate interaction
operm.gxc <- operm.icovar - operm.acovar

# LOD thresholds
summary(operm.gxc)
Simulated data for an intercross with some markers out of order.
data(badorder)
An object of class cross. See read.cross for details.
There are 250 F2 individuals typed at a total of 36 markers on four chromosomes. The data were simulated with QTLs at the center of chromosomes 1 and 3.
The order of several markers on chromosome 1 is incorrect. Markers on chromosomes 2 and 3 are switched.
Karl W Broman, [email protected]
est.rf, ripple, est.map, sim.cross
data(badorder)

# estimate recombination fractions
badorder <- est.rf(badorder)
plotRF(badorder)

# re-estimate map
newmap <- est.map(badorder)
plotMap(badorder, newmap)

# assess marker order on chr 1
rip3 <- ripple(badorder, chr=1, window=3)
summary(rip3)
Calculate an approximate Bayesian credible interval for a particular chromosome, using output from scanone.
bayesint(results, chr, qtl.index, prob=0.95, lodcolumn=1,
         expandtomarkers=FALSE)
results |
Output from |
chr |
A chromosome ID (if input |
qtl.index |
Numeric index for a QTL (if input |
prob |
Probability coverage of the interval. |
lodcolumn |
An integer indicating which
of the LOD score columns should be considered (if input
|
expandtomarkers |
If TRUE, the interval is expanded to the nearest flanking markers. |
We take $10^{\mathrm{LOD}}$, rescale it to have area 1, and then calculate the connected interval with density above some threshold and having coverage matching the target probability.
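The calculation can be sketched in a few lines of base R. This is an illustration only, not the package's code: the helper name approx_bayes_int is hypothetical, it assumes the rescaled quantity is 10^LOD, and it approximates the area around each position by the width of the bin running halfway to each neighbouring position.

```r
# Hypothetical sketch of an approximate Bayesian credible interval.
#   pos: positions (cM) along one chromosome, in increasing order
#   lod: LOD scores at those positions
approx_bayes_int <- function(pos, lod, prob = 0.95) {
  n <- length(pos)
  mid <- (pos[-1] + pos[-n]) / 2
  width <- diff(c(pos[1], mid, pos[n]))   # bin width around each position
  area <- 10^lod * width
  area <- area / sum(area)                # rescale to total area 1

  o <- order(lod, decreasing = TRUE)      # add positions from highest LOD down
  n_keep <- which(cumsum(area[o]) >= prob)[1]
  keep <- range(o[seq_len(n_keep)])       # connected interval spanning them

  c(lower = pos[keep[1]], peak = pos[o[1]], upper = pos[keep[2]])
}

# Made-up LOD profile for illustration
pos <- seq(0, 50, by = 5)
lod <- c(0.2, 0.5, 1.1, 2.4, 4.0, 3.1, 1.5, 0.6, 0.3, 0.1, 0.0)
approx_bayes_int(pos, lod)
```

Because 10^LOD falls off quickly away from the peak, most of the area sits near the maximum and the interval is often narrow relative to the chromosome; a finer grid of positions gives a better approximation.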
An object of class scanone indicating the estimated QTL position and the approximate endpoints for the Bayesian credible interval.
Karl W Broman, [email protected]
data(hyper)
hyper <- calc.genoprob(hyper, step=0.5)
out <- scanone(hyper, method="hk")
bayesint(out, chr=1)
bayesint(out, chr=4)
bayesint(out, chr=4, prob=0.99)
bayesint(out, chr=4, expandtomarkers=TRUE)
Data from bristle number in chromosome 3 recombinant isogenic lines of Drosophila melanogaster.
data(bristle3)
An object of class cross. See read.cross for details.
There are 66 chromosome 3 recombinant isogenic lines, derived from inbred lines that were selected for low (A) and high (B) abdominal bristle numbers. A recombinant chromosome 3 was placed in an isogenic low background.
There are eight phenotypes: the average and SD of the number of abdominal and sternopleural bristles in males and females for each line.
Each line is typed at 29 genetic markers on chromosome 3.
Long, A. D., Mullaney, S. L., Reid, L. A., Fry, J. D., Langley, C. H. and MacKay, T. F. C. (1995) High resolution mapping of genetic factors affecting abdominal bristle number in Drosophila melanogaster. Genetics 139, 1273–1291.
bristleX, listeria, fake.bc, fake.f2, fake.4way, hyper
data(bristle3)

# Summaries
summary(bristle3)
plot(bristle3)

# genome scan for each of the average phenotypes
bristle3 <- calc.genoprob(bristle3, step=2)
out <- scanone(bristle3, pheno.col=c(1,3,5,7))

# Plot the results
# maximum LOD score among four phenotypes
ym <- max(apply(out[,-(1:2)], 2, max))
plot(out, lod=1:3, ylim=c(0,ym))
plot(out, lod=4, add=TRUE, col="green")
Data from bristle number in chromosome X recombinant isogenic lines of Drosophila melanogaster.
data(bristleX)
An object of class cross. See read.cross for details.
There are 92 chromosome X recombinant isogenic lines, derived from inbred lines that were selected for low (A) and high (B) abdominal bristle numbers. A recombinant chromosome X was placed in an isogenic low background.
There are eight phenotypes: the average and SD of the number of abdominal and sternopleural bristles in males and females for each line.
Each line is typed at 17 genetic markers on chromosome X.
Long, A. D., Mullaney, S. L., Reid, L. A., Fry, J. D., Langley, C. H. and MacKay, T. F. C. (1995) High resolution mapping of genetic factors affecting abdominal bristle number in Drosophila melanogaster. Genetics 139, 1273–1291.
bristle3, listeria, fake.bc, fake.f2, fake.4way, hyper
data(bristleX)

# Summaries
summary(bristleX)
plot(bristleX)

# genome scan for each of the average phenotypes
bristleX <- calc.genoprob(bristleX, step=2)
out <- scanone(bristleX, pheno.col=c(1,3,5,7))

# Plot the results
# maximum LOD score among four phenotypes
ym <- max(apply(out[,-(1:2)], 2, max))
plot(out, lod=1:3, ylim=c(0,ym))
plot(out, lod=4, add=TRUE, col="green")
Concatenate the data for multiple QTL experiments.
## S3 method for class 'cross'
c(...)
... |
A set of objects of class |
The concatenated input, as a cross object. Additional columns are added to the phenotype data indicating which cross an individual comes from; another column indicates cross type (0=BC, 1=intercross), if there are crosses of different types. The crosses are not required to have exactly the same set of phenotypes; phenotypes with the same names are assumed to be the same.
If the crosses have different sets of markers, we interpolate marker order, but the cM positions of markers that are in common between crosses must be precisely the same in the different crosses.
Karl W Broman, [email protected]
data(fake.f2)
junk <- fake.f2
junk <- c(fake.f2,junk)
Concatenate the columns from different runs of scanone.
## S3 method for class 'scanone'
c(..., labels)

## S3 method for class 'scanone'
cbind(..., labels)
... |
A set of objects of class |
labels |
A vector of character strings, of length 1 or of the same length as the input, to be appended to the column names in the output. |
The aim of this function is to concatenate the results from multiple runs of scanone, generally for different phenotypes and/or methods, to be used in parallel with summary.scanone.
The concatenated input, as a scanone object.
Karl W Broman, [email protected]
summary.scanone, scanone, cbind.scanoneperm
data(fake.f2)
fake.f2 <- calc.genoprob(fake.f2)
out.hk <- scanone(fake.f2, method="hk")
out.np <- scanone(fake.f2, model="np")
out <- c(out.hk, out.np, labels=c("hk","np"))
plot(out, lod=1:2, col=c("blue", "red"))
Concatenate the data for multiple runs of scanone with n.perm > 0.
## S3 method for class 'scanoneperm'
c(...)

## S3 method for class 'scanoneperm'
rbind(...)
... |
A set of objects of class |
The aim of this function is to concatenate the results from multiple runs of a permutation test with scanone, to assist with the case that such permutations are done on multiple processors in parallel.
The concatenated input, as a scanoneperm object.
Karl W Broman, [email protected]
summary.scanoneperm, scanone, cbind.scanoneperm, c.scantwoperm
data(fake.f2)
fake.f2 <- calc.genoprob(fake.f2)
operm1 <- scanone(fake.f2, method="hk", n.perm=100, perm.Xsp=TRUE)
operm2 <- scanone(fake.f2, method="hk", n.perm=50, perm.Xsp=TRUE)
operm <- c(operm1, operm2)
Concatenate the columns from different runs of scantwo.
## S3 method for class 'scantwo'
c(...)

## S3 method for class 'scantwo'
cbind(...)
... |
A set of objects of class |
The aim of this function is to concatenate the results from multiple runs of scantwo, generally for different phenotypes and/or methods.
The concatenated input, as a scantwo object.
Karl W Broman, [email protected]
summary.scantwo, scantwo, c.scanone
data(fake.bc)
fake.bc <- calc.genoprob(fake.bc)
out2a <- scantwo(fake.bc, method="hk")
out2b <- scantwo(fake.bc, pheno.col=2, method="hk")

out2 <- c(out2a, out2b)
Concatenate the data for multiple runs of scantwo with n.perm > 0.
## S3 method for class 'scantwoperm'
c(...)

## S3 method for class 'scantwoperm'
rbind(...)
... |
A set of objects of class |
The aim of this function is to concatenate the results from multiple runs of a permutation test with scantwo, to assist with the case that such permutations are done on multiple processors in parallel.
The concatenated input, as a scantwoperm object.
Karl W Broman, [email protected]
summary.scantwoperm, scantwo, cbind.scantwoperm
data(fake.f2)
fake.f2 <- calc.genoprob(fake.f2)
## Not run:
operm1 <- scantwo(fake.f2, method="hk", n.perm=50)
operm2 <- scantwo(fake.f2, method="hk", n.perm=50)
## End(Not run)

operm <- c(operm1, operm2)
Calculates a LOD score for each genotype, measuring the evidence for genotyping errors.
calc.errorlod(cross, error.prob=0.01,
              map.function=c("haldane","kosambi","c-f","morgan"),
              version=c("new","old"))
cross |
An object of class |
error.prob |
Assumed genotyping error rate used in the calculation of the penetrance Pr(observed genotype | true genotype) |
map.function |
Indicates whether to use the Haldane, Kosambi, Carter-Falconer, or Morgan map function when converting genetic distances into recombination fractions. |
version |
Specifies whether to use the original version of this function or the current (preferred) version. |
Calculates, for each individual at each marker, a LOD score measuring the strength of evidence for a genotyping error, as described by Lincoln and Lander (1992).
In the latest version, evidence for a genotype being in error is considered assuming that all other genotypes (for that individual, on that chromosome) are correct. The argument version allows one to specify whether this new version is used, or whether the original (old) version of the calculation is performed.
Note that values below 4 are generally not interesting. Also note that if markers are extremely tightly linked, recombination events can give large error LOD scores. The error LOD scores should not be trusted blindly, but should be viewed as a tool for identifying genotypes deserving further study.
Use top.errorlod to print all genotypes with error LOD scores above a specified threshold, plotErrorlod to plot the error LOD scores for specified chromosomes, and plotGeno to view the observed genotype data with likely errors flagged.
The input cross object is returned with a component, errorlod, added to each component of cross$geno. The errorlod component is a matrix of size (n.ind x n.mar). An attribute "error.prob" is set to the value of the corresponding argument, for later reference.
Karl W Broman, [email protected]
Lincoln, S. E. and Lander, E. S. (1992) Systematic detection of errors in genetic linkage data. Genomics 14, 604–610.
plotErrorlod, top.errorlod, cleanGeno
data(hyper)
hyper <- calc.errorlod(hyper,error.prob=0.01)

# print those above a specified cutoff
top.errorlod(hyper, cutoff=4)

# plot genotype data, flagging genotypes with error LOD > cutoff
plotGeno(hyper, chr=1, ind=160:200, cutoff=7, min.sep=2)
Uses the hidden Markov model technology to calculate the probabilities of the true underlying genotypes given the observed multipoint marker data, with possible allowance for genotyping errors.
calc.genoprob(cross, step=0, off.end=0, error.prob=0.0001,
              map.function=c("haldane","kosambi","c-f","morgan"),
              stepwidth=c("fixed", "variable", "max"))
cross |
An object of class |
step |
Maximum distance (in cM) between positions at which the
genotype probabilities are calculated, though for |
off.end |
Distance (in cM) past the terminal markers on each chromosome to which the genotype probability calculations will be carried. |
error.prob |
Assumed genotyping error rate used in the calculation of the penetrance Pr(observed genotype | true genotype). |
map.function |
Indicates whether to use the Haldane, Kosambi, Carter-Falconer, or Morgan map function when converting genetic distances into recombination fractions. |
stepwidth |
Indicates whether the intermediate points should be placed with
fixed or variable step sizes. We recommend using
|
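Let $O_k$ denote the observed marker genotype at position $k$, and $g_k$ denote the corresponding true underlying genotype.
We use the forward-backward equations to calculate $\alpha_{kv} = \log \Pr(O_1, \ldots, O_k, g_k = v)$ and $\beta_{kv} = \log \Pr(O_{k+1}, \ldots, O_n \mid g_k = v)$.
We then obtain $\Pr(g_k = v \mid O_1, \ldots, O_n) = \exp(\alpha_{kv} + \beta_{kv}) / s$, where $s = \sum_v \exp(\alpha_{kv} + \beta_{kv})$.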
In the case of the 4-way cross, with a sex-specific map, we assume a constant ratio of female:male recombination rates within the inter-marker intervals.
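The forward-backward calculation described above can be sketched in base R for the simplest case, a backcross with two genotypes (coded 1 and 2). This is an illustration under stated assumptions rather than the package's implementation: the function name genoprob_bc is hypothetical, rf holds the recombination fractions between adjacent positions, missing genotypes are NA, and the computation is done on the probability scale for brevity (working with logs avoids underflow on long chromosomes).

```r
# Hypothetical sketch: genotype probabilities for one backcross individual.
#   obs: observed genotypes (1, 2, or NA) at each position
#   rf:  recombination fractions between adjacent positions
# Returns a 2 x n matrix; column k holds Pr(g_k = v | all observations).
genoprob_bc <- function(obs, rf, error_prob = 1e-4) {
  n <- length(obs)
  stopifnot(length(rf) == n - 1)

  # emission probabilities Pr(O_k | g_k = v); missing data are uninformative
  emit <- sapply(obs, function(o) {
    if (is.na(o)) c(1, 1) else ifelse(1:2 == o, 1 - error_prob, error_prob)
  })
  # transition matrix across an interval with recombination fraction r
  tmat <- function(r) matrix(c(1 - r, r, r, 1 - r), 2, 2)

  alpha <- beta <- matrix(0, 2, n)

  # forward pass: alpha[v, k] = Pr(O_1, ..., O_k, g_k = v)
  alpha[, 1] <- 0.5 * emit[, 1]
  for (k in seq_len(n - 1) + 1)
    alpha[, k] <- emit[, k] * (t(tmat(rf[k - 1])) %*% alpha[, k - 1])

  # backward pass: beta[v, k] = Pr(O_{k+1}, ..., O_n | g_k = v)
  beta[, n] <- 1
  for (k in rev(seq_len(n - 1)))
    beta[, k] <- tmat(rf[k]) %*% (emit[, k + 1] * beta[, k + 1])

  post <- alpha * beta
  sweep(post, 2, colSums(post), "/")   # normalise each position to sum to 1
}

round(genoprob_bc(c(1, NA, 2), rf = c(0.1, 0.1)), 4)
```

In this example the middle position is untyped and sits symmetrically between a 1 and a 2, so its two genotypes come out equally likely, while the typed positions stay close to their observed genotypes.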
The input cross object is returned with a component, prob, added to each component of cross$geno. prob is an array of size [n.ind x n.pos x n.gen] where n.pos is the number of positions at which the probabilities were calculated and n.gen = 3 for an intercross, = 2 for a backcross, and = 4 for a 4-way cross. Attributes "error.prob", "step", "off.end", and "map.function" are set to the values of the corresponding arguments, for later reference (especially by the function calc.errorlod).
Karl W Broman, [email protected]
Lange, K. (1999) Numerical analysis for statisticians. Springer-Verlag. Sec 23.3.
Rabiner, L. R. (1989) A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE 77, 257–286.
sim.geno, argmax.geno, calc.errorlod
data(fake.f2)
fake.f2 <- calc.genoprob(fake.f2, step=2, off.end=5)

data(fake.bc)
fake.bc <- calc.genoprob(fake.bc, step=0, off.end=0, err=0.01)
Derive penalties for the penalized LOD scores (used by stepwiseqtl) on the basis of permutation results from a two-dimensional, two-QTL scan (obtained by scantwo).
calc.penalties(perms, alpha=0.05, lodcolumn)
perms |
Permutation results from |
alpha |
Significance level. |
lodcolumn |
If the scantwo permutation results contain LOD scores for multiple phenotypes, this argument indicates which to use in the summary. This may be a vector. If missing, penalties for all phenotypes are calculated. |
Thresholds derived from scantwo permutations (that is, for a two-dimensional, two-QTL genome scan) are used to calculate penalties on main effects and interactions.
The main effect penalty is the 1-alpha quantile of the null distribution of the genome-wide maximum LOD score from a single-QTL genome scan (as with scanone).
The "heavy" interaction penalty is the 1-alpha quantile of the null distribution of the maximum interaction LOD score (that is, the likelihood ratio comparing the best model with two interacting QTL to the best model with two additive QTL) from a two-dimensional, two-QTL genome scan (as with scantwo).
The "light" interaction penalty is the difference between the "fv1" threshold from the scantwo permutations (that is, the 1-alpha quantile of the LOD score comparing the best model with two interacting QTL to the best single-QTL model) and the main effect penalty.
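The three definitions above amount to two quantiles and one difference. The base-R sketch below illustrates the arithmetic only. The vectors perm_one, perm_int, and perm_fv1 are simulated stand-ins for the per-replicate maximum LOD scores, not real permutation results, and their names are hypothetical.

```r
set.seed(1)
n_perm <- 1000

# Simulated stand-ins (NOT real permutation results) for the genome-wide
# maximum LOD score in each permutation replicate:
perm_one <- 1.5 + rexp(n_perm)   # single-QTL scan
perm_int <- 2.5 + rexp(n_perm)   # interaction LOD (interactive vs additive)
perm_fv1 <- 4.0 + rexp(n_perm)   # two interacting QTL vs best single-QTL model

alpha <- 0.05
main  <- quantile(perm_one, 1 - alpha, names = FALSE)
heavy <- quantile(perm_int, 1 - alpha, names = FALSE)
light <- quantile(perm_fv1, 1 - alpha, names = FALSE) - main

c(main = main, heavy = heavy, light = light)
```

Note that the light penalty is a difference of two thresholds rather than a quantile in its own right.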
If the permutation results were obtained with perm.Xsp=TRUE, to give X-chr-specific results, six penalties are calculated: main effect for autosomes, main effect for X chr, heavy penalty on A:A interactions, light penalty on A:A interactions, penalty on A:X interactions, and penalty on X:X interactions.
Vector of three values indicating the penalty on main effects and heavy and light penalties on interactions, or a matrix of such results, with each row corresponding to a different phenotype.
If the input permutations are X-chromosome-specific, the result has six values: main effect for autosomes, main effect for X chr, heavy penalty on A:A interactions, light penalty on A:A interactions, penalty on A:X interactions, and penalty on X:X interactions.
Karl W Broman, [email protected]
Manichaikul, A., Moon, J. Y., Sen, Ś, Yandell, B. S. and Broman, K. W. (2009) A model selection approach for the identification of quantitative trait loci in experimental crosses, allowing epistasis. Genetics, 181, 1077–1086.
data(fake.f2)
fake.f2 <- calc.genoprob(fake.f2, step=5)
out.2dim <- scantwo(fake.f2, method="hk")

# permutations
## Not run: 
permo.2dim <- scantwo(fake.f2, method="hk", n.perm=1000)

summary(permo.2dim, alpha=0.05)

# penalties
calc.penalties(permo.2dim)
Concatenate the columns from different runs of scanone with n.perm > 0.
## S3 method for class 'scanoneperm'
cbind(..., labels)
... |
A set of objects of class |
labels |
A vector of character strings, of length 1 or of the same
length as the input |
The aim of this function is to concatenate the results from multiple runs of a permutation test with scanone, generally for different phenotypes and/or methods, to be used in parallel with c.scanone.
The concatenated input, as a scanoneperm object. If different numbers of permutation replicates were used, those columns with fewer replicates are padded with missing values (NA).
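The NA padding can be illustrated on plain numeric vectors. This is only a sketch of the idea: the vectors perm.hk and perm.em and the helper pad are hypothetical, and real scanoneperm objects carry additional structure.

```r
# Two hypothetical sets of permutation maxima with different numbers of replicates
perm.hk <- c(2.1, 3.4, 2.8, 3.0)   # 4 replicates
perm.em <- c(2.5, 3.1)             # 2 replicates

# Pad a vector with NA up to a target length
pad <- function(x, n) c(x, rep(NA, n - length(x)))

n.max <- max(length(perm.hk), length(perm.em))
combined <- cbind(hk = pad(perm.hk, n.max), em = pad(perm.em, n.max))
combined
```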
Karl W Broman, [email protected]
summary.scanoneperm, scanone, c.scanoneperm, c.scanone
data(fake.f2)
fake.f2 <- calc.genoprob(fake.f2)
operm1 <- scanone(fake.f2, method="hk", n.perm=10, perm.Xsp=TRUE)
operm2 <- scanone(fake.f2, method="em", n.perm=5, perm.Xsp=TRUE)
operm <- cbind(operm1, operm2, labels=c("hk","em"))
summary(operm)
Column-bind permutation results from scantwo for multiple phenotypes or models.
## S3 method for class 'scantwoperm'
cbind(...)
... |
A set of objects of class |
The column-bound input, as a scantwoperm object.
Karl W Broman, [email protected]
scantwo, c.scantwoperm, summary.scantwoperm
data(fake.bc)
fake.bc <- calc.genoprob(fake.bc)
## Not run: 
operm1 <- scantwo(fake.bc, pheno.col=1, method="hk", n.perm=50)
operm2 <- scantwo(fake.bc, pheno.col=2, method="hk", n.perm=50)
## End(Not run)
operm <- cbind(operm1, operm2)
Identify markers whose alleles might have been switched by comparing the LOD score for linkage to all other autosomal markers with the original data to that when the alleles have been switched.
checkAlleles(cross, threshold=3, verbose)
cross |
An object of class |
threshold |
Only an increase in maximum 2-point LOD of at least this amount will lead to a marker being flagged. |
verbose |
If TRUE and there are no markers above the threshold, print a message. |
For each marker, we compare the maximum LOD score for the cases where the estimated recombination fraction > 0.5 to those where r.f. < 0.5. The function est.rf must first be run.
Note: Markers that are tightly linked to a marker whose alleles are switched are likely to also be flagged by this method. The real problem markers are likely those with the biggest difference in LOD scores.
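The comparison can be sketched on toy pairwise matrices. The marker names, the rf and lod values, and the helper max.or.zero are all hypothetical, and the layout differs from what est.rf stores; the sketch only shows the flagging rule under those assumptions.

```r
# Hypothetical pairwise results for 4 markers (symmetric toy matrices)
markers <- c("M1", "M2", "M3", "M4")
rf <- matrix(c(NA,   0.10, 0.90, 0.15,
               0.10, NA,   0.85, 0.20,
               0.90, 0.85, NA,   0.92,
               0.15, 0.20, 0.92, NA), 4, 4, dimnames = list(markers, markers))
lod <- matrix(c(NA,   12.0, 9.0, 10.0,
                12.0, NA,   8.0, 7.5,
                9.0,  8.0,  NA,  9.5,
                10.0, 7.5,  9.5, NA), 4, 4, dimnames = list(markers, markers))

# For each marker: max LOD among pairs with rf > 0.5
# minus max LOD among pairs with rf < 0.5
max.or.zero <- function(x) if (length(x) > 0) max(x) else 0
lod.diff <- sapply(markers, function(m) {
  r <- rf[m, ]
  l <- lod[m, ]
  ok <- !is.na(r)
  max.or.zero(l[ok & r > 0.5]) - max.or.zero(l[ok & r < 0.5])
})

threshold <- 3
flagged <- lod.diff[lod.diff >= threshold]
flagged
```

Here only M3 shows strong linkage at recombination fractions above 0.5 with no competing evidence below 0.5, so it is the only marker flagged.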
A data frame containing the flagged markers, having four columns: the marker name, chromosome ID, numeric index within chromosome, and the difference between the maximum two-point LOD score with the alleles switched and that from the original data.
Karl W Broman, [email protected]
est.rf, geno.crosstab, switchAlleles
data(fake.f2)

# switch homozygotes at marker D5M391
fake.f2 <- switchAlleles(fake.f2, "D5M391")

fake.f2 <- est.rf(fake.f2)
checkAlleles(fake.f2)
Obtain the chromosome lengths in a cross or map object.
chrlen(object)
object |
An object of class |
Returns a vector of chromosome lengths. If the cross has sex-specific maps, it returns a 2-row matrix with the two lengths for each chromosome.
Karl W Broman, [email protected]
summaryMap, pull.map, summary.cross
data(fake.f2)
chrlen(fake.f2)

map <- pull.map(fake.f2)
chrlen(map)
Pull out the chromosome names from a cross object as one big vector.
chrnames(cross)
cross |
An object of class |
A vector of character strings (the chromosome names).
Karl W Broman, [email protected]
data(listeria)
chrnames(listeria)
Composite interval mapping by a scheme from QTL Cartographer: forward selection at the markers (here, with filled-in genotype data) to a fixed number, followed by interval mapping with the selected markers as covariates, dropping marker covariates if they are within some fixed window size of the location under test.
cim(cross, pheno.col=1, n.marcovar=3, window=10,
    method=c("em", "imp", "hk", "ehk"),
    imp.method=c("imp", "argmax"), error.prob=0.0001,
    map.function=c("haldane", "kosambi", "c-v", "morgan"),
    addcovar=NULL, n.perm)
cross |
An object of class |
pheno.col |
Column number in the phenotype matrix which should be used as the phenotype. One may also give a character string matching a phenotype name. Finally, one may give a numeric vector of phenotypes, in which case it must have the length equal to the number of individuals in the cross, and there must be either non-integers or values < 1 or > no. phenotypes; this last case may be useful for studying transformations. |
n.marcovar |
Number of marker covariates to use. |
window |
Window size, in cM. |
method |
Indicates whether to use the EM algorithm, imputation, Haley-Knott regression, or the extended Haley-Knott method. |
imp.method |
Method used to impute any missing marker genotype data. |
error.prob |
Genotyping error probability assumed when imputing the missing marker genotype data. |
map.function |
Map function used when imputing the missing marker genotype data. |
addcovar |
Optional numeric matrix of additional covariates to include. |
n.perm |
If specified, a permutation test is performed rather than an analysis of the observed data. This argument defines the number of permutation replicates. |
We first use fill.geno to impute any missing marker genotype data, either via a simple random imputation or using the Viterbi algorithm.
We then perform forward selection to a fixed number of markers. These will be used (again, with any missing data filled in) as covariates in the subsequent genome scan.
The function returns an object of the same form as the function scanone:
If n.perm is missing, the function returns the scan results as a data.frame with three columns: chromosome, position, LOD score. Attributes indicate the names and positions of the chosen marker covariates.
If n.perm > 0, the function returns the results of a permutation test: a vector giving the genome-wide maximum LOD score in each of the permutations.
Karl W Broman, [email protected]
Jansen, R. C. (1993) Interval mapping of multiple quantitative trait loci. Genetics, 135, 205–211.
Jansen, R. C. and Stam, P. (1994) High resolution of quantitative traits into multiple loci via interval mapping. Genetics, 136, 1447–1455.
Zeng, Z. B. (1993) Theoretical basis for separation of multiple linked gene effects in mapping quantitative trait loci. Proc. Natl. Acad. Sci. USA, 90, 10972–10976.
Zeng, Z. B. (1994) Precision mapping of quantitative trait loci. Genetics, 136, 1457–1468.
add.cim.covar, scanone, summary.scanone, plot.scanone, fill.geno
data(hyper)
hyper <- calc.genoprob(hyper, step=2.5)
out <- scanone(hyper)
out.cim <- cim(hyper, n.marcovar=3)
plot(out, out.cim, chr=c(1,4,6,15), col=c("blue", "red"))
add.cim.covar(out.cim, chr=c(1,4,6,15))
Remove any intermediate calculations from a cross object.
## S3 method for class 'cross'
clean(object, ...)
object |
An object of class |
... |
Ignored at this point. |
The input object, with any intermediate calculations (such as those produced by calc.genoprob, argmax.geno and sim.geno) removed.
Karl W Broman, [email protected]
drop.nullmarkers, drop.markers, clean.scantwo
data(fake.f2)
names(fake.f2$geno)
fake.f2 <- calc.genoprob(fake.f2)
names(fake.f2$geno)
fake.f2 <- clean(fake.f2)
names(fake.f2$geno)
In an object output from scantwo, replaces negative and missing LOD scores with 0, and replaces LOD scores for pairs of positions that are not separated by n.mar markers, or that are less than distance cM apart, with 0. Further, if the LOD for the full model is less than the LOD for the additive model, the additive LOD is pasted over the full LOD.
## S3 method for class 'scantwo'
clean(object, n.mar=1, distance=0, ...)
object |
An object of class |
n.mar |
Pairs of positions not separated by at least this many markers have LOD scores set to 0. |
distance |
Pairs of positions not separated by at least this distance have LOD scores set to 0. |
... |
Ignored at this point. |
The input scantwo object, with any negative or missing LOD scores replaced by 0, and with LOD scores for pairs of positions separated by fewer than n.mar markers, or less than distance cM, set to 0.
Also, if the LOD for the full model is less than the LOD for the additive model, the additive LOD is used in place of the full LOD.
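Two of these rules can be sketched on plain matrices. The matrices full and add below are hypothetical stand-ins for the full-model and additive-model LOD scores; the n.mar and distance rules are left out because they depend on the map stored in a real scantwo object.

```r
# Toy full-model and additive-model LOD matrices for three positions
full <- matrix(c(0.0, 2.5, NA,
                 2.5, 0.0, 1.0,
                 NA,  1.0, 0.0), 3, 3)
add  <- matrix(c( 0.0, 1.5, -0.2,
                  1.5, 0.0,  1.8,
                 -0.2, 1.8,  0.0), 3, 3)

# Step 1: negative or missing LOD scores become 0
full[is.na(full) | full < 0] <- 0
add[is.na(add) | add < 0] <- 0

# Step 2: where the full LOD is below the additive LOD, use the additive LOD
low <- full < add
full[low] <- add[low]
full
```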
Karl W Broman, [email protected]
data(fake.f2)
fake.f2 <- calc.genoprob(fake.f2, step=5)
out2 <- scantwo(fake.f2, method="hk")
out2 <- clean(out2)
out2cl2 <- clean(out2, n.mar=2, distance=5)
Delete genotypes from a cross that are indicated to be possibly in error, as they result in apparent tight double-crossovers.
cleanGeno(cross, chr, maxdist=2.5, maxmark=2, verbose=TRUE)
cross |
An object of class |
chr |
Optional vector indicating the chromosomes to consider.
This should be a vector of character
strings referring to chromosomes by name; numeric values are
converted to strings. Refer to chromosomes with a preceding |
maxdist |
A vector specifying the maximum distance between two crossovers. |
maxmark |
A vector specifying the maximum number of typed markers between two crossovers. |
verbose |
If TRUE, print information on the numbers of genotypes omitted from each chromosome. |
We first use locateXO to identify crossover locations. If a pair of adjacent crossovers are separated by no more than maxdist and contain no more than maxmark genotyped markers, the intervening genotypes are omitted (that is, changed to NA).
The arguments maxdist and maxmark may be vectors. (If both have length greater than 1, they must have the same length.) If they are vectors, genotypes are omitted if they satisfy any one of the (maxdist, maxmark) pairs.
The input cross object with suspect genotypes omitted.
Karl W Broman, [email protected]
locateXO, countXO, calc.errorlod
data(hyper)
sum(ntyped(hyper))
hyperc <- cleanGeno(hyper, chr=4, maxdist=c(2.5, 10), maxmark=c(2, 1))
sum(ntyped(hyperc))
Verify that two objects of class cross have identical classes, chromosomes, markers, genotypes, genetic maps, and phenotypes.
comparecrosses(cross1, cross2, tol=1e-5)
cross1 |
An object of class |
cross2 |
An object of class |
tol |
Tolerance value for comparing genetic map positions and numeric phenotypes. |
None.
Karl W Broman, [email protected]
data(listeria)
comparecrosses(listeria, listeria)
Count proportion of matching genotypes between all pairs of individuals, to look for unusually closely related individuals.
comparegeno(cross, what=c("proportion","number","both"))
cross |
An object of class |
what |
Indicates whether to return the proportion or number of matching genotypes (or both). |
A matrix whose (i,j)th element is the proportion or number of matching genotypes for individuals i and j.
If called with what="both", the lower triangle contains the proportion and the upper triangle contains the number.
If called with what="proportion", the diagonal contains missing values. Otherwise, the diagonal contains the number of typed markers for each individual.
The output is given class "comparegeno" so that appropriate summary and plot functions may be used.
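The proportion matrix can be sketched on a small genotype matrix. The data are hypothetical, and restricting each comparison to markers typed in both individuals is an assumption of this sketch.

```r
# Hypothetical genotype matrix: 3 individuals (rows) by 5 markers (columns)
g <- rbind(ind1 = c(1, 2, 2, 1, NA),
           ind2 = c(1, 2, 2, 1, 2),
           ind3 = c(2, 1, NA, 1, 2))

n <- nrow(g)
prop <- matrix(NA_real_, n, n, dimnames = list(rownames(g), rownames(g)))
for (i in seq_len(n - 1)) {
  for (j in (i + 1):n) {
    both <- !is.na(g[i, ]) & !is.na(g[j, ])   # markers typed in both individuals
    prop[i, j] <- prop[j, i] <- mean(g[i, both] == g[j, both])
  }
}
prop
```

A pair such as ind1 and ind2, which agree at every marker typed in both, would stand out as unusually closely related; the diagonal is left as NA, as described for what="proportion".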
Karl W Broman, [email protected]
nmissing, summary.comparegeno, plot.comparegeno
data(listeria)
cg <- comparegeno(listeria)

summary(cg, 0.7)
plot(cg)
Compare the likelihood of an alternative order for markers on a chromosome to the current order.
compareorder(cross, chr, order, error.prob=0.0001,
             map.function=c("haldane","kosambi","c-f","morgan"),
             maxit=4000, tol=1e-6, sex.sp=TRUE)
cross |
An object of class |
chr |
The chromosome to investigate. Only one chromosome is allowed. (This should be a character string referring to the chromosomes by name.) |
order |
The alternate order of markers on the chromosome: a numeric vector that is a permutation of the integers from 1 to the number of markers on the chromosome. |
error.prob |
Assumed genotyping error rate used in the calculation of the penetrance Pr(observed genotype | true genotype). |
map.function |
Indicates whether to use the Haldane, Kosambi, Carter-Falconer, or Morgan map function when converting genetic distances into recombination fractions. |
maxit |
Maximum number of EM iterations to perform. |
tol |
Tolerance for determining convergence. |
sex.sp |
Indicates whether to estimate sex-specific maps; this is used only for the 4-way cross. |
A data frame with two rows: the current order in the input cross object, and the revised order. The first column is the log10 likelihood of the new order relative to the original one (positive values indicate that the new order is better supported). The second column is the estimated genetic length of the chromosome for each order. In the case of sex-specific maps, there are separate columns for the female and male genetic lengths.
Karl W Broman, [email protected]
ripple, switch.order, movemarker
data(badorder)
compareorder(badorder, chr=1, order=c(1:8,11,10,9,12))
Produces a very condensed version of the output of scantwo.
## S3 method for class 'scantwo'
condense(object)
object |
An object of class |
This produces a very reduced version of the output of scantwo, for which a summary may still be created via summary.scantwo, though plots can no longer be made.
An object of class scantwocondensed, containing just the maximum full, additive and interactive LOD scores, and the positions where they occurred, on each pair of chromosomes.
Karl W Broman, [email protected]
scantwo, summary.scantwo, max.scantwo
data(fake.f2)
fake.f2 <- calc.genoprob(fake.f2)
out2 <- scantwo(fake.f2, method="hk")
out2c <- condense(out2)
summary(out2c, allpairs=FALSE)
max(out2c)
Convert a genetic map from using one map function to another.
## S3 method for class 'map'
convert(object, old.map.function=c("haldane", "kosambi", "c-f", "morgan"),
        new.map.function=c("haldane", "kosambi", "c-f", "morgan"), ...)
object |
A genetic map object, of class |
old.map.function |
The map function used in forming the map in
|
new.map.function |
The new map function to be used. |
... |
Ignored at this point. |
The location of the first marker on each chromosome is left unchanged. Inter-marker distances are converted to recombination fractions with the inverse of the old.map.function, and then back to distances with the new.map.function.
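As an illustration of this two-step conversion, the following base R sketch goes from Haldane to Kosambi distances for one chromosome. The functions haldane.inv and kosambi are written out here from the standard map-function formulas and the marker positions are hypothetical; they are not the package's internal helpers.

```r
# Map functions with distances in cM, written out for this illustration
haldane.inv <- function(d) 0.5 * (1 - exp(-d / 50))             # distance -> rec. fraction
kosambi     <- function(r) 25 * log((1 + 2 * r) / (1 - 2 * r))  # rec. fraction -> distance

pos.h <- c(5, 15, 40, 45)                      # hypothetical positions, Haldane cM
r     <- haldane.inv(diff(pos.h))              # inter-marker recombination fractions
pos.k <- pos.h[1] + c(0, cumsum(kosambi(r)))   # first marker stays where it was
pos.k
```

For the same recombination fraction the Kosambi distance is never larger than the Haldane distance, so the converted map is somewhat shorter.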
The same as the input, but with inter-marker distances changed to reflect a different map function.
Karl W Broman, [email protected]
data(listeria)
map <- pull.map(listeria)
map <- convert(map, "haldane", "kosambi")
listeria <- replace.map(listeria, map)
Convert the output from scanone from the format used in R/qtl version 0.97 and earlier to that used in version 0.98 and later.
## S3 method for class 'scanone'
convert(object, ...)
object |
Output from the function |
... |
Ignored at this point. |
Previously, inter-marker locations were named as, for example, loc7.5.c3; these were changed to c3.loc7.5.
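The renaming can be mimicked with a regular expression. The names below are hypothetical, and the pattern is an illustrative guess at the old naming scheme rather than the code used by the package.

```r
old.names <- c("D1M1", "loc7.5.c3", "loc12.c10", "D3M2")   # hypothetical row names

# Move the chromosome suffix (".c3") to the front ("c3."); marker names are untouched
new.names <- sub("^(loc[0-9.]+)\\.c(.+)$", "c\\2.\\1", old.names)
new.names
```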
The same scanone output, but revised for use with R/qtl version 0.98 and later.
Karl W Broman, [email protected]
## Not run: 
out.new <- convert(out.old)
Convert the output from scantwo from the format used in R/qtl version 1.03 and earlier to that used in version 1.04 and later.
## S3 method for class 'scantwo'
convert(object, ...)
object |
Output from the function |
... |
Ignored at this point. |
Previously, the output from scantwo contained the full and interaction LOD scores. In R/qtl version 1.04 and later, the output contains the LOD scores from the full and additive QTL models.
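The relationship between the two formats can be sketched with plain vectors, assuming the interaction LOD is the full-model LOD minus the additive-model LOD (the comparison of the full model to the additive model). The values are hypothetical.

```r
lod.full <- c(6.2, 4.8, 7.5)    # hypothetical full-model LOD scores
lod.int  <- c(1.1, 0.3, 2.0)    # hypothetical interaction LOD scores (old format)

# Additive-model LOD scores (new format), under the stated assumption
lod.add <- lod.full - lod.int
lod.add
```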
The same scantwo output, but revised for use with R/qtl version 1.04 and later.
Karl W Broman, [email protected]
## Not run: 
out2.new <- convert(out2.old)
Convert a cross to type "riself" (RIL by selfing).
convert2riself(cross)
cross |
An object of class |
If there are more genotypes with code 3 (BB) than code 2 (AB), we omit the genotypes with code==2 and call those with code==3 the BB genotypes.
If, instead, there are more genotypes with code 2 than code 3, we omit the genotypes with code==3 and call those with code==2 the BB genotypes.
Any chromosomes with class "X" (X chromosome) are changed to class "A" (autosomal).
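The genotype recoding rule described above can be sketched on a single vector of codes. The data are hypothetical, and storing the BB genotype as code 2 in the resulting two-genotype cross is an assumption of this sketch.

```r
geno <- c(1, 3, 3, 2, 1, 3, NA, 1, 3)   # hypothetical codes: 1=AA, 2=AB, 3=BB

n2 <- sum(geno == 2, na.rm = TRUE)
n3 <- sum(geno == 3, na.rm = TRUE)

ril <- geno
if (n3 > n2) {
  ril[which(geno == 2)] <- NA   # omit the less common code
  ril[which(geno == 3)] <- 2    # assumed: BB is stored as code 2 in the RIL
} else {
  ril[which(geno == 3)] <- NA   # omit code 3; code 2 is kept and treated as BB
}
ril
```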
The input cross object, with genotype codes possibly changed and cross type changed to "riself".
Karl W Broman, [email protected]
data(hyper)
hyper.as.riself <- convert2riself(hyper)
Convert a cross to type "risib" (RIL by sib mating).
convert2risib(cross)
cross |
An object of class |
If there are more genotypes with code 3 (BB) than code 2 (AB), we omit the genotypes with code==2 and call those with code==3 the BB genotypes.
If, instead, there are more genotypes with code 2 than code 3, we omit the genotypes with code==3 and call those with code==2 the BB genotypes.
The input cross object, with genotype codes possibly changed and cross type changed to "risib".
Karl W Broman, [email protected]
data(hyper)
hyper.as.risib <- convert2risib(hyper)
Convert a sex-specific map to a sex-averaged one, assuming that the female and male maps are actually the same (that is, that the map was estimated assuming a common recombination rate in females and males).
convert2sa(map, tol=1e-4)
map |
A map object with sex-specific locations (but assuming that
the female and male maps are the same), as output by the function
|
tol |
Tolerance value for inspecting the differences between the female and male maps; if they differ by more than this tolerance, a warning is issued. |
We pull out just the female marker locations, and give a warning if there are large differences between the female and male maps.
A map object, with sex-averaged distances.
Karl W Broman, [email protected]
data(fake.4way)
## Not run: 
fake.4way <- subset(fake.4way, chr="-X")
nm <- est.map(fake.4way, sex.sp=FALSE)
plot(convert2sa(nm))
Count the number of obligate crossovers for each individual in a cross, either by chromosome or overall.
countXO(cross, chr, bychr=FALSE)
cross |
An object of class |
chr |
Optional vector indicating the chromosomes to investigate.
This should be a vector of character strings referring to chromosomes
by name; numeric values are converted to strings. Refer to
chromosomes with a preceding |
bychr |
If TRUE, return counts for each individual chromosome; if FALSE, return the overall number across the selected chromosomes. |
For each individual we count the minimal number of crossovers that explain the observed genotype data.
If bychr=TRUE, a matrix of counts is returned, with rows corresponding to individuals and columns corresponding to chromosomes.
If bychr=FALSE, a vector of counts (the total number of crossovers across all selected chromosomes) is returned.
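For a backcross, the minimal number of crossovers on a chromosome is the number of genotype changes between consecutive typed markers. The sketch below uses hypothetical genotypes for one individual and covers only the two-genotype case; in an intercross a switch between the two homozygotes would imply more than one crossover.

```r
# Hypothetical backcross genotypes along one chromosome (NA = missing)
geno <- c(1, 1, NA, 2, 2, NA, NA, 1, 1, 2)

typed <- geno[!is.na(geno)]      # missing genotypes carry no information
n.xo  <- sum(diff(typed) != 0)   # each change of genotype requires one crossover
n.xo
```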
Karl W Broman, [email protected]
data(hyper)
plot(countXO(hyper))
Drop markers with duplicate names, retaining the first of each set, with consensus genotypes.
drop.dupmarkers(cross, verbose=TRUE)
cross |
An object of class |
verbose |
If TRUE, print information on the numbers of genotypes and markers omitted. If > 1, give more detailed information on genotypes omitted. |
The input cross object, with any duplicate markers omitted (except for one). The marker retained will have consensus genotypes; if multiple versions of a marker have different genotypes for an individual, they will be replaced by NA.
Any derived data (such as produced by calc.genoprob) will be stripped off.
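The consensus rule can be sketched for one set of duplicated markers. The matrix dup is hypothetical: rows are individuals and columns are the copies of a single marker.

```r
# Hypothetical genotypes for 4 individuals at three copies of the same marker
dup <- cbind(copy1 = c(1, NA, 2, 3),
             copy2 = c(1, 2, 2, 1),
             copy3 = c(NA, 2, 2, 3))

consensus <- apply(dup, 1, function(g) {
  g <- unique(g[!is.na(g)])
  if (length(g) == 1) g else NA   # disagreement (or no data at all) gives NA
})
consensus
```

The fourth individual has conflicting calls (3, 1, 3), so its consensus genotype is NA; a missing call in one copy is filled in from the other copies.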
Karl W Broman, [email protected]
drop.nullmarkers, pull.markers, drop.markers, summary.cross, clean.cross
data(listeria)
listeria <- drop.dupmarkers(listeria)
Drop a vector of markers from the data matrices and genetic maps.
drop.markers(cross, markers)
cross |
An object of class |
markers |
A character vector of marker names. |
The input object, with any markers in the vector markers removed from the genotype data matrices, genetic maps, and, if applicable, any derived data (such as produced by calc.genoprob). (It might be a good idea to re-derive such things after using this function.)
Karl W Broman, [email protected]
drop.nullmarkers, pull.markers, geno.table, clean.cross
data(listeria)
listeria2 <- drop.markers(listeria, c("D10M44","D1M3","D1M75"))
Drop markers, from the data matrices and genetic maps, that have no genotype data.
drop.nullmarkers(cross)
cross |
An object of class |
The input object, with any markers lacking genotype data removed from the genotype data matrices, genetic maps, and, if applicable, any derived data (such as produced by calc.genoprob). (It might be a good idea to re-derive such things after using this function.)
Karl W Broman, [email protected]
nullmarkers, drop.markers, clean.cross, geno.table
# removes one marker from hyper
data(hyper)
hyper <- drop.nullmarkers(hyper)

# shouldn't do anything to listeria
data(listeria)
listeria <- drop.nullmarkers(listeria)
Drop a QTL or multiple QTL from a QTL object
dropfromqtl(qtl, index, chr, pos, qtl.name, drop.lod.profile=TRUE)
qtl |
A qtl object, as created by |
index |
Vector specifying the numeric indices of the QTL to be dropped. |
chr |
Vector indicating the chromosome for each QTL to drop. |
pos |
Vector (of same length as |
qtl.name |
Vector specifying the names of the QTL to be dropped. |
drop.lod.profile |
If TRUE, remove any LOD profiles from the object. |
Provide either chr and pos, or one of qtl.name or index.
The input qtl object with the specified QTL omitted. See makeqtl for details on the format.
Karl W Broman, [email protected]
makeqtl, fitqtl, addtoqtl, replaceqtl, reorderqtl
data(fake.f2)

# take out several QTLs and make QTL object
qc <- c(1, 6, 13)
qp <- c(25.8, 33.6, 18.63)
fake.f2 <- subset(fake.f2, chr=qc)
fake.f2 <- calc.genoprob(fake.f2, step=2, err=0.001)
qtl <- makeqtl(fake.f2, qc, qp, what="prob")

newqtl <- dropfromqtl(qtl, chr=1, pos=25.8)
altqtl <- dropfromqtl(qtl, index=1)
Drop one marker at a time from a genetic map and calculate the change in log likelihood and in the chromosome length, in order to identify problematic markers.
droponemarker(cross, chr, error.prob=0.0001,
              map.function=c("haldane","kosambi","c-f","morgan"),
              m=0, p=0, maxit=4000, tol=1e-6, sex.sp=TRUE,
              verbose=TRUE)
cross |
An object of class |
chr |
A vector specifying which chromosomes to test for the
position of the marker. This should be a vector of character
strings referring to chromosomes by name; numeric values are
converted to strings. Refer to chromosomes with a preceding |
error.prob |
Assumed genotyping error rate used in the calculation of the penetrance Pr(observed genotype | true genotype). |
map.function |
Indicates whether to use the Haldane, Kosambi, Carter-Falconer, or Morgan map function when converting genetic distances into recombination fractions. (Ignored if m > 0.) |
m |
Interference parameter for the chi-square model for interference; a non-negative integer, with m=0 corresponding to no interference. This may be used only for a backcross or intercross. |
p |
Proportion of chiasmata from the NI mechanism, in the Stahl model; p=0 gives a pure chi-square model. This may be used only for a backcross or intercross. |
maxit |
Maximum number of EM iterations to perform. |
tol |
Tolerance for determining convergence. |
sex.sp |
Indicates whether to estimate sex-specific maps; this is used only for the 4-way cross. |
verbose |
If TRUE, print information on progress; if > 1, print even more information. |
A data frame (actually, an object of class "scanone", so that one may use plot.scanone, summary.scanone, etc.) with each row being a marker. The first two columns are the chromosome ID and position. The third column is a LOD score comparing the hypothesis that the marker is not linked to the hypothesis that it belongs at that position.
In the case of a 4-way cross, with sex.sp=TRUE, there are two additional columns with the change in the estimated female and male genetic lengths of the respective chromosome, upon deleting that marker. With sex.sp=FALSE, or for other types of crosses, there is one additional column, with the change in estimated genetic length of the respective chromosome, when the marker is omitted.
A well behaved marker will have a negative LOD score and a small change in estimated genetic length. A poorly behaved marker will have a large positive LOD score and a large change in estimated genetic length. But note that dropping the first or last marker on a chromosome could result in a large change in estimated length, even if they are not badly behaved; for these markers one should focus on the LOD scores, with a large positive LOD score being bad.
Karl W Broman, [email protected]
tryallpositions, est.map, ripple, est.rf, switch.order, movemarker, drop.markers
data(fake.bc)
droponemarker(fake.bc, 7, error.prob=0, verbose=FALSE)
Plot the phenotype means for each group defined by the genotypes at one or two markers (or the values at a discrete covariate).
effectplot(cross, pheno.col=1, mname1, mark1, geno1, mname2, mark2,
           geno2, main, ylim, xlab, ylab, col, add.legend=TRUE,
           legend.lab, draw=TRUE, var.flag=c("pooled","group"))
cross |
An object of class |
pheno.col |
Column number in the phenotype matrix to be drawn in the plot. One may also give a character string matching a phenotype name. Finally, one may give a numeric vector of phenotypes, in which case it must have the length equal to the number of individuals in the cross, and there must be either non-integers or values < 1 or > no. phenotypes; this last case may be useful for studying transformations. |
mname1 |
Name for the first marker or pseudomarker.
Pseudomarkers (that is, non-marker positions on the imputation grid)
may be referred to in a form like |
mark1 |
Genotype data for the first marker. If unspecified,
genotypes will be taken from the data in the input cross object,
using the name specified in |
geno1 |
Optional labels for the genotypes (or classes in a covariate). |
mname2 |
Name for the second marker or pseudomarker (optional). |
mark2 |
Like |
geno2 |
Optional labels for the genotypes (or classes in a covariate). |
main |
Optional figure title. |
ylim |
Optional y-axis limits. |
xlab |
Optional x-axis label. |
ylab |
Optional y-axis label. |
col |
Optional vector of colors for the different line segments. |
add.legend |
A logical value to indicate whether to add a legend. |
legend.lab |
Optional title for the legend. |
draw |
A logical value indicating whether to generate the plot. If FALSE, no figure will be plotted and this function can be used to calculate the group means and standard errors. |
var.flag |
The method used to calculate the group variance. "pooled" means to use the pooled variance and "group" means to calculate the variance separately for each group. |
In the plot, the y-axis is the phenotype. In the case of one marker, the x-axis is the genotype for that marker. In the case of two markers, the x-axis is for the different genotypes of the second marker, and the genotypes of the first marker are represented by lines in different colors. Error bars are plotted at ± 1 SE.
The results of sim.geno are used; if they are not available, sim.geno is run with n.draws=16. The average phenotype for each genotype group takes account of missing genotype data by averaging across the imputations. The SEs take account of both the residual phenotype variation and the imputation error.
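The difference between the "pooled" and "group" choices for var.flag can be illustrated outside the package. The following is a minimal sketch in base R, assuming fully observed genotypes (so there is no averaging across imputations and no imputation error in the SEs). The helper group_means_se is hypothetical, not an R/qtl function, and the pooled variance is assumed to be the usual one-way ANOVA residual variance.

```r
# Hypothetical helper (not part of R/qtl): phenotype means and SEs by genotype
# group, for fully observed genotypes only.
group_means_se <- function(pheno, geno, var.flag = c("pooled", "group")) {
  var.flag <- match.arg(var.flag)
  keep <- !is.na(pheno) & !is.na(geno)
  pheno <- pheno[keep]
  geno <- factor(geno[keep])
  n <- tapply(pheno, geno, length)
  means <- tapply(pheno, geno, mean)
  if (var.flag == "pooled") {
    # one residual variance shared by all groups (one-way ANOVA)
    resid <- pheno - means[as.character(geno)]
    s2 <- sum(resid^2) / (length(pheno) - nlevels(geno))
    se <- sqrt(s2 / n)
  } else {
    # each group uses its own sample variance
    se <- sqrt(tapply(pheno, geno, var) / n)
  }
  data.frame(group = names(means), mean = as.numeric(means), se = as.numeric(se))
}

pheno <- c(10, 12, 11, 15, 17, 16, 20, 22)
geno  <- c("AA", "AA", "AA", "AB", "AB", "AB", "BB", "BB")
res_pooled <- group_means_se(pheno, geno, "pooled")
res_group  <- group_means_se(pheno, geno, "group")
```

With the pooled variance, groups of equal size get equal SEs; with group-specific variances, each SE reflects only the spread within that group.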
A data.frame containing the phenotype means and standard errors for each group.
Hao Wu; Karl W Broman, [email protected]
plotPXG, find.marker, effectscan, find.pseudomarker
data(fake.f2)

# impute genotype data
## Not run: fake.f2 <- sim.geno(fake.f2, step=5, n.draws=64)

########################################
# one marker plots
########################################

### plot of genotype-specific phenotype means for 1 marker
mname <- find.marker(fake.f2, 1, 37) # marker D1M437
effectplot(fake.f2, pheno.col=1, mname1=mname)

### output of the function contains the means and SEs
output <- effectplot(fake.f2, mname1=mname)
output

### plot a phenotype
# Plot of sex-specific phenotype means,
# note that "sex" must be a phenotype name here
effectplot(fake.f2, mname1="sex", geno1=c("F","M"))
# alternatively:
sex <- pull.pheno(fake.f2, "sex")
effectplot(fake.f2, mname1="Sex", mark1=sex, geno1=c("F","M"))

########################################
# two markers plots
########################################

### plot two markers
# plot of genotype-specific phenotype means for 2 markers
mname1 <- find.marker(fake.f2, 1, 37) # marker D1M437
mname2 <- find.marker(fake.f2, 13, 24) # marker D13M254
effectplot(fake.f2, mname1=mname1, mname2=mname2)

### plot two pseudomarkers
##### refer to pseudomarkers by their positions
effectplot(fake.f2, mname1="1@35", mname2="13@25")

##### alternatively, find their names via find.pseudomarker
pmnames <- find.pseudomarker(fake.f2, chr=c(1, 13), c(35, 25))
effectplot(fake.f2, mname1=pmnames[1], mname2=pmnames[2])

### Plot of sex- and genotype-specific phenotype means
mname <- find.marker(fake.f2, 13, 24) # marker D13M254
# sex and a marker
effectplot(fake.f2, mname1=mname, mname2="Sex", mark2=sex, geno2=c("F","M"))

# Same as above, switch role of sex and the marker
# sex and marker
effectplot(fake.f2, mname1="Sex", mark1=sex, geno1=c("F","M"), mname2=mname)

# X chromosome marker
mname <- find.marker(fake.f2, "X", 14) # marker DXM66
effectplot(fake.f2, mname1=mname)

# Two markers, including one on the X
mnames <- find.marker(fake.f2, c(13, "X"), c(24, 14))
effectplot(fake.f2, mname1=mnames[1], mname2=mnames[2])
This function is used to plot the estimated QTL effects along selected chromosomes. For a backcross, there will be only one line, representing the additive effect. For an intercross, there will be two lines, representing the additive and dominance effects.
effectscan(cross, pheno.col=1, chr, get.se=FALSE, draw=TRUE, gap=25, ylim, mtick=c("line","triangle"), add.legend=TRUE, alternate.chrid=FALSE, ...)
cross |
An object of class cross. See read.cross for details. |
pheno.col |
Column number in the phenotype matrix to be drawn in the plot. One may also give a character string matching a phenotype name. |
chr |
Optional vector indicating the chromosomes to be drawn in the plot. This should be a vector of character strings referring to chromosomes by name; numeric values are converted to strings. Refer to chromosomes with a preceding - to have all chromosomes but those considered. |
get.se |
If TRUE, estimated standard errors are calculated. |
draw |
If TRUE, draw the figure. |
gap |
Gap separating chromosomes (in cM). |
ylim |
Y-axis limits (optional). |
mtick |
Tick mark type for markers. |
add.legend |
If TRUE, add a legend. |
alternate.chrid |
If TRUE and more than one chromosome is plotted, alternate the placement of chromosome axis labels, so that they may be more easily distinguished. |
... |
Passed to the function |
The results of sim.geno are required for taking account of missing genotype information.
For a backcross, the additive effect is estimated as the difference between the phenotypic averages for heterozygotes and homozygotes.
For recombinant inbred lines, the additive effect is estimated as half the difference between the phenotypic averages for the two homozygotes.
For an intercross, the additive and dominance effects are estimated from linear regression on $a$ and $d$, with $a$ = -1, 0, 1 for the AA, AB and BB genotypes, respectively, and $d$ = 0, 1, 0 for the AA, AB and BB genotypes, respectively.
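The intercross coding can be tried on simulated data. This is a minimal sketch in base R at a single, fully typed locus; the simulated effects (2 and 0.5) and all variable names are arbitrary choices for illustration, and it does not reproduce the package's handling of missing genotypes via sim.geno.

```r
# Illustrative only: simulated intercross genotypes at one fully typed locus
set.seed(1)
n <- 300
geno <- sample(c("AA", "AB", "BB"), n, replace = TRUE, prob = c(0.25, 0.5, 0.25))
a <- c(AA = -1, AB = 0, BB = 1)[geno]   # additive coding
d <- c(AA = 0,  AB = 1, BB = 0)[geno]   # dominance coding
# arbitrary true effects: additive 2, dominance 0.5
y <- 10 + 2 * a + 0.5 * d + rnorm(n, sd = 1)
fit <- lm(y ~ a + d)
est <- coef(fit)
est
```

Because the model has one parameter per genotype group, the estimate for a equals half the difference between the BB and AA phenotype means, and the estimate for d equals the AB mean minus the midpoint of the two homozygote means.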
As usual, the X chromosome is a bit more complicated. We estimate separate additive effects for the two sexes, and for the two directions within females.
There is an internal function plot.effectscan that creates the actual plot by calling plot.scanone. In the case get.se=TRUE, colored regions indicate ± 1 SE.
The results are returned silently, as an object of class "effectscan", which is the same as the form returned by the function scanone, though with estimated effects where LOD scores might be. That is, it is a data frame with the first two columns being chromosome ID and position (in cM), and subsequent columns being estimated effects, and (if get.se=TRUE) standard errors.
Karl W. Broman, [email protected]
Sen, Ś. and Churchill, G. A. (2001) A statistical framework for quantitative trait mapping. Genetics 159, 371–387.
data(fake.f2)
fake.f2 <- sim.geno(fake.f2, step=2.5, n.draws=16)

# allelic effect on whole genome
effectscan(fake.f2)

# on chromosome 13, include standard errors
effectscan(fake.f2, chr="13", mtick="triangle", get.se=TRUE)
Uses the Lander-Green algorithm (i.e., the hidden Markov model technology) to re-estimate the genetic map for an experimental cross.
est.map(cross, chr, error.prob=0.0001, map.function=c("haldane","kosambi","c-f","morgan"), m=0, p=0, maxit=10000, tol=1e-6, sex.sp=TRUE, verbose=FALSE, omit.noninformative=TRUE, offset, n.cluster=1)
cross |
An object of class cross. See read.cross for details. |
chr |
Optional vector indicating the chromosomes to consider. This should be a vector of character strings referring to chromosomes by name; numeric values are converted to strings. Refer to chromosomes with a preceding - to have all chromosomes but those considered. |
error.prob |
Assumed genotyping error rate used in the calculation of the penetrance Pr(observed genotype | true genotype). |
map.function |
Indicates whether to use the Haldane, Kosambi, Carter-Falconer, or Morgan map function when converting genetic distances into recombination fractions. (Ignored if m > 0.) |
m |
Interference parameter for the chi-square model for interference; a non-negative integer, with m=0 corresponding to no interference. This may be used only for a backcross or intercross. |
p |
Proportion of chiasmata from the NI mechanism, in the Stahl model; p=0 gives a pure chi-square model. This may be used only for a backcross or intercross. |
maxit |
Maximum number of EM iterations to perform. |
tol |
Tolerance for determining convergence. |
sex.sp |
Indicates whether to estimate sex-specific maps; this is used only for the 4-way cross. |
verbose |
If TRUE, print tracing information. |
omit.noninformative |
If TRUE, on each chromosome, omit individuals with fewer than two typed markers, since they are not informative for linkage. |
offset |
Defines the starting position for each chromosome. If missing, we use the starting positions that are currently present in the input cross object. This should be a single value (to be used for all chromosomes) or a vector with length equal to the number of chromosomes, defining individual starting positions for each chromosome. For a sex-specific map (as in a 4-way cross), we use the same offset for both the male and female maps. |
n.cluster |
If the package |
By default, the map is estimated assuming no crossover interference, but a map function is used to derive the genetic distances (though, by default, the Haldane map function is used).
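To make the role of the map function concrete, here is a small base-R sketch of three of the standard map functions, converting a genetic distance in cM into a recombination fraction. The helper names are hypothetical and are not R/qtl functions; the Carter-Falconer function is omitted because it has no simple closed form in this direction.

```r
# Hypothetical helpers (not R/qtl functions): cM distance -> recombination fraction
mf_haldane <- function(d) 0.5 * (1 - exp(-2 * d / 100))  # no interference
mf_kosambi <- function(d) 0.5 * tanh(2 * d / 100)        # moderate interference
mf_morgan  <- function(d) pmin(d / 100, 0.5)             # complete interference

d <- c(0, 1, 10, 50)
rf <- data.frame(cM = d,
                 haldane = mf_haldane(d),
                 kosambi = mf_kosambi(d),
                 morgan  = mf_morgan(d))
rf
```

For short intervals the three functions nearly agree; they diverge as the distance grows, which is why the choice matters mainly for sparse maps.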
For a backcross or intercross, inter-marker distances may be estimated using the Stahl model for crossover interference, of which the chi-square model is a special case.
In the chi-square model, points are tossed down onto the four-strand bundle according to a Poisson process, and every $(m+1)$st point is a chiasma. With the assumption of no chromatid interference, crossover locations on a random meiotic product are obtained by thinning the chiasma process. The parameter $m$ (a non-negative integer) governs the strength of crossover interference, with $m=0$ corresponding to no interference.
In the Stahl model, chiasmata on the four-strand bundle are a superposition of chiasmata from two mechanisms, one following a chi-square model and one exhibiting no interference. An additional parameter, $p$, gives the proportion of chiasmata from the no interference mechanism.
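The two models can be simulated in a few lines. The sketch below is a hypothetical base-R simulator, not an R/qtl function. It assumes the usual normalization in which a meiotic product carries on average one crossover per Morgan (so chiasmata occur at rate 2 per Morgan on the four-strand bundle), and it starts the chi-square process at a random offset so that the process is stationary.

```r
# Hypothetical simulator (not an R/qtl function); L is chromosome length in Morgans.
sim_stahl_xo <- function(L, m = 0, p = 0) {
  # chi-square mechanism: Poisson points at rate 2*(1-p)*(m+1) per Morgan;
  # from a random starting offset, every (m+1)st point is a chiasma
  n_pts <- rpois(1, 2 * (1 - p) * (m + 1) * L)
  pts <- sort(runif(n_pts, 0, L))
  first <- sample(m + 1, 1)
  chi <- if (n_pts >= first) pts[seq(first, n_pts, by = m + 1)] else numeric(0)
  # no-interference mechanism: chiasmata from a Poisson process at rate 2*p
  ni <- runif(rpois(1, 2 * p * L), 0, L)
  chiasmata <- sort(c(chi, ni))
  # no chromatid interference: keep each chiasma with probability 1/2
  chiasmata[runif(length(chiasmata)) < 0.5]
}

set.seed(42)
n_xo <- replicate(2000, length(sim_stahl_xo(L = 1, m = 3, p = 0.1)))
mean(n_xo)  # close to 1 crossover per Morgan under the assumed normalization
```

Setting m = 0 (or p = 1) gives crossovers from a plain Poisson process, the no-interference case; increasing m makes crossovers more evenly spaced.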
A map object; a list whose components (corresponding to chromosomes) are either vectors of marker positions (in cM) or matrices with two rows of sex-specific marker positions. The maximized log likelihood for each chromosome is saved as an attribute named loglik. In the case that estimation was under an interference model (with m > 0), allowed only for a backcross or intercross, m and p are also included as attributes.
Karl W Broman, [email protected]
Armstrong, N. J., McPeek, M. J. and Speed, T. P. (2006) Incorporating interference into linkage analysis for experimental crosses. Biostatistics 7, 374–386.
Lander, E. S. and Green, P. (1987) Construction of multilocus genetic linkage maps in humans. Proc. Natl. Acad. Sci. USA 84, 2363–2367.
Lange, K. (1999) Numerical analysis for statisticians. Springer-Verlag. Sec 23.3.
Rabiner, L. R. (1989) A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE 77, 257–286.
Zhao, H., Speed, T. P. and McPeek, M. S. (1995) Statistical analysis of crossover interference using the chi-square model. Genetics 139, 1045–1056.
map2table, plotMap, replace.map, est.rf, fitstahl
data(fake.f2)
newmap <- est.map(fake.f2)
logliks <- sapply(newmap, attr, "loglik")
plotMap(fake.f2, newmap)
fake.f2 <- replace.map(fake.f2, newmap)
Estimate the sex-averaged recombination fraction between all pairs of genetic markers.
est.rf(cross, maxit=10000, tol=1e-6)
cross |
An object of class cross. See read.cross for details. |
maxit |
Maximum number of iterations for the EM algorithm (not used with backcrosses). |
tol |
Tolerance for determining convergence (not used with backcrosses). |
For a backcross, one can simply count recombination events. For an intercross or 4-way cross, a version of the EM algorithm must be used to estimate recombination fractions. (For example, for an intercross individual that is heterozygous at two loci, it is not known whether there were 0 or 2 recombination events.) Note that, for the 4-way cross, we estimate sex-averaged recombination fractions.
The input cross object is returned with a component, rf, added. This is a matrix of size (tot.mar x tot.mar). The diagonal contains the number of typed meioses per marker, the lower triangle contains the estimated recombination fractions, and the upper triangle contains the LOD scores (testing rf = 0.5).
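For the backcross case, the counting estimate and the LOD score for the test of rf = 0.5 are simple enough to write out. The following base-R sketch uses a hypothetical helper, bc_rf_lod, on two vectors of genotype codes; it is not the package's implementation and ignores any special handling the package may apply (for example, to estimates above 0.5).

```r
# Hypothetical helper: two-point analysis for a backcross.
# g1, g2: genotype codes (e.g. 1 = AA, 2 = AB) at two markers; NA = missing
bc_rf_lod <- function(g1, g2) {
  both <- !is.na(g1) & !is.na(g2)
  n <- sum(both)                       # meioses typed at both markers
  n_rec <- sum(g1[both] != g2[both])   # recombinants
  rf <- n_rec / n
  # log10 likelihood at recombination fraction r; 0 * log10(0) is taken as 0
  ll <- function(r) {
    term <- function(k, q) if (k == 0) 0 else k * log10(q)
    term(n_rec, r) + term(n - n_rec, 1 - r)
  }
  c(n = n, rf = rf, lod = ll(rf) - ll(0.5))
}

g1 <- c(1, 1, 2, 2, 1, 2, 1, 2, NA, 1)
g2 <- c(1, 1, 2, 2, 1, 2, 2, 2, 1, NA)
out <- bc_rf_lod(g1, g2)
out
```

Here 8 individuals are typed at both markers and 1 is recombinant, so the estimated recombination fraction is 1/8.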
Karl W Broman, [email protected]
plotRF, pull.rf, plot.rfmatrix, est.map, badorder, checkAlleles
data(badorder)
badorder <- est.rf(badorder)
plotRF(badorder)
Simulated data for a phase-known 4-way cross, obtained using sim.cross.
data(fake.4way)
An object of class cross. See read.cross for details.
There are 250 individuals typed at 157 markers, including 8 on the X chromosome.
There are two phenotypes (including sex, for which 0=female and 1=male). The quantitative phenotype is affected by three QTLs: two on chromosome 2 at positions 10 and 25 cM on the female genetic map, and one on chromosome 7 at position 40 cM on the female map.
Karl W Broman, [email protected]
sim.cross, fake.bc, fake.f2, listeria, hyper, bristle3, bristleX
data(fake.4way)
plot(fake.4way)
summary(fake.4way)

# estimate recombination fractions
fake.4way <- est.rf(fake.4way)
plotRF(fake.4way)

# estimate genetic maps
ssmap <- est.map(fake.4way, verbose=TRUE)
samap <- est.map(fake.4way, sex.sp=FALSE, verbose=TRUE)
plot(ssmap, samap)

# error lod scores
fake.4way <- calc.genoprob(fake.4way, err=0.01)
fake.4way <- calc.errorlod(fake.4way, err=0.01)
top.errorlod(fake.4way, cutoff=2.5)

# genome scan
fake.4way <- calc.genoprob(fake.4way, step=2.5)
out.hk <- scanone(fake.4way, method="hk")
out.em <- scanone(fake.4way, method="em")
plot(out.em,out.hk,chr=c(2,7))
Simulated data for a backcross, obtained using sim.cross.
data(fake.bc)
An object of class cross. See read.cross for details.
There are 400 backcross individuals typed at 91 markers and with two phenotypes and two covariates (sex and age).
The two phenotypes are due to four QTLs, with no epistasis. There is one on chromosome 2 (at 30 cM), two on chromosome 5 (at 10 and 50 cM), and one on chromosome 10 (at 30 cM). The QTL on chromosome 2 has an effect only in the males (sex=1); the two QTLs on chromosome 5 have effect in coupling for the first phenotype and in repulsion for the second phenotype. Age has an effect of increasing the phenotypes.
Karl W Broman, [email protected]
sim.cross, fake.4way, fake.f2, listeria, hyper, bristle3, bristleX
data(fake.bc)
summary(fake.bc)
plot(fake.bc)

# genome scans without covariates
fake.bc <- calc.genoprob(fake.bc, step=2.5)
out.nocovar <- scanone(fake.bc, pheno.col=1:2)

# genome scans with covariates
ac <- pull.pheno(fake.bc, c("sex","age"))
ic <- pull.pheno(fake.bc, "sex")
out.covar <- scanone(fake.bc, pheno.col=1:2, addcovar=ac, intcovar=ic)

# summaries
summary(out.nocovar, thr=3, format="allpeaks")
summary(out.covar, thr=3, format="allpeaks")

# plots
plot(out.nocovar, out.covar, chr=c(2,5,10), lod=1, col="blue", lty=1:2, ylim=c(0,13))
plot(out.nocovar, out.covar, chr=c(2,5,10), lod=2, col="red", lty=1:2, add=TRUE)
Simulated data for an F2 intercross, obtained using sim.cross.
data(fake.f2)
An object of class cross. See read.cross for details.
There are 200 F2 individuals typed at 94 markers, including 3 on the X chromosome. There is one quantitative phenotype, along with an indication of sex (0=female, 1=male) and the direction of the cross (pgm = paternal grandmother, 0=A, meaning the cross was (AxB)x(AxB), and 1=B, meaning the cross was (AxB)x(BxA)).
Note that the X chromosome genotypes are coded in a special way (see read.cross). For the individuals with pgm=0, sex=0, 1=AA and 2=AB; for individuals with pgm=0, sex=1, 1=A and 2=B (hemizygous); for individuals with pgm=1, sex=0, 1=BB and 2=AB; for individuals with pgm=1, sex=1, 1=A and 2=B. This requires special care!
The data were simulated using an additive model with three QTLs on chromosome 1 (at 30, 50 and 70 cM), one QTL on chromosome 13 (at 30 cM), and one QTL on the X chromosome (at 10 cM).
Karl W Broman, [email protected]
sim.cross, fake.bc, fake.4way, listeria, hyper, bristle3, bristleX
data(fake.f2)
summary(fake.f2)
plot(fake.f2)
Replace the genotype data for a cross with a version imputed either by simulation with sim.geno, by the Viterbi algorithm with argmax.geno, or simply filling in genotypes between markers that have matching genotypes.
fill.geno(cross, method=c("imp","argmax", "no_dbl_XO", "maxmarginal"), error.prob=0.0001, map.function=c("haldane","kosambi","c-f","morgan"), min.prob=0.95)
cross |
An object of class cross. See read.cross for details. |
method |
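Indicates whether to impute using a single simulation replicate from sim.geno ("imp"), the Viterbi algorithm via argmax.geno ("argmax"), filling in of non-recombinant intervals ("no_dbl_XO"), or the most probable genotype at each marker ("maxmarginal"); see the details below. |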
error.prob |
Assumed genotyping error rate used in the calculation of the penetrance Pr(observed genotype | true genotype). |
map.function |
Indicates whether to use the Haldane, Kosambi, Carter-Falconer, or Morgan map function when converting genetic distances into recombination fractions. |
min.prob |
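For method="maxmarginal", the probability that the most probable genotype must exceed in order to be used as the imputed genotype. |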
This function is written so that one may perform rough genome scans by marker regression without having to drop individuals with missing genotype data. We must caution the user that little trust should be placed in the results.
With method="imp", a single random imputation is performed, using sim.geno.
With method="argmax", for each individual the most probable sequence of genotypes, given the observed data (via argmax.geno), is used.
With method="no_dbl_XO", non-recombinant intervals are filled in; recombinant intervals are left missing. For example, a sequence of genotypes like A---A---H---H---A (with A and H corresponding to genotypes AA and AB, respectively, and with - being a missing value) will be filled in as AAAAA---HHHHH---A.
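The filling rule can be reproduced on the example sequence with a few lines of base R. The helper fill_no_dbl_xo below is hypothetical and works on a single individual's genotype vector on one chromosome; it is a sketch of the idea, not the code used by fill.geno.

```r
# Hypothetical helper: fill missing genotypes lying between two typed
# genotypes that match; leave intervals flanked by different genotypes missing
fill_no_dbl_xo <- function(g) {
  typed <- which(!is.na(g))
  for (i in seq_len(max(length(typed) - 1, 0))) {
    left <- typed[i]
    right <- typed[i + 1]
    if (right - left > 1 && g[left] == g[right]) {
      g[(left + 1):(right - 1)] <- g[left]
    }
  }
  g
}

x <- strsplit("A---A---H---H---A", "")[[1]]
x[x == "-"] <- NA
filled <- fill_no_dbl_xo(x)
filled[is.na(filled)] <- "-"
result <- paste(filled, collapse = "")
result
```

Genotypes outside the first and last typed positions are never filled by this rule, since they are not flanked on both sides.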
With method="maxmarginal", the conditional genotype probabilities are calculated with calc.genoprob, and then at each marker, the most probable genotype is determined. This is taken as the imputed genotype if it has probability greater than min.prob; otherwise it is made missing.
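The thresholding step can be sketched as follows, assuming the conditional genotype probabilities at one marker are already available as a matrix (individuals in rows, genotypes in columns). The helper impute_maxmarginal and the example probabilities are hypothetical; in the package the probabilities would come from calc.genoprob.

```r
# Hypothetical helper: most probable genotype per individual at one marker,
# set to NA unless its probability exceeds min.prob
impute_maxmarginal <- function(prob, min.prob = 0.95) {
  best <- max.col(prob, ties.method = "first")
  best_p <- prob[cbind(seq_len(nrow(prob)), best)]
  ifelse(best_p > min.prob, best, NA_integer_)
}

prob <- rbind(c(0.99, 0.01, 0.00),
              c(0.50, 0.45, 0.05),
              c(0.02, 0.97, 0.01))
imputed <- impute_maxmarginal(prob)
imputed
```

The second individual is left missing because no genotype reaches the threshold, which is also how an observed genotype can end up missing under this method.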
With method="no_dbl_XO" and method="maxmarginal", some missing genotypes likely remain. With method="maxmarginal", some observed genotypes may be made missing.
The input cross object with the genotype data replaced by an imputed version. Any intermediate calculations (such as those produced by calc.genoprob, argmax.geno and sim.geno) are removed.
Karl W Broman, [email protected]
data(hyper)
out.mr <- scantwo(fill.geno(hyper,method="argmax"), method="mr")
plot(out.mr)
Find large inter-marker intervals in a map.
find_large_intervals(map, min_length=35)
map |
A list of numeric vectors; each component is a chromosome with the positions of markers on that chromosome. Can also be an object of class cross, as in the example. |
min_length |
Minimum length of interval to be flagged. |
Data frame with chromosome, left and right markers and interval length.
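The computation amounts to differencing the marker positions on each chromosome. Below is a hypothetical base-R re-implementation on a plain list-of-vectors map; the output column names and the use of >= for the threshold are assumptions for illustration, not taken from the package.

```r
# Hypothetical re-implementation for a plain list of named position vectors
large_intervals <- function(map, min_length = 35) {
  rows <- lapply(names(map), function(chr) {
    pos <- map[[chr]]
    len <- unname(diff(pos))
    idx <- which(len >= min_length)   # assumption: ">=" rather than ">"
    data.frame(chr = rep(chr, length(idx)),
               left = names(pos)[idx],
               right = names(pos)[idx + 1],
               interval_length = len[idx],
               stringsAsFactors = FALSE)
  })
  do.call(rbind, rows)
}

map <- list("1" = c(m1 = 0, m2 = 12, m3 = 55, m4 = 60),
            "2" = c(m5 = 0, m6 = 40))
res <- large_intervals(map, min_length = 35)
res
```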
Karl W Broman, [email protected]
data(fake.f2)
find_large_intervals(fake.f2, 30)
Find the genetic markers flanking a specified position on a chromosome, as well as the marker that is closest to the specified position.
find.flanking(cross, chr, pos)
cross |
An object of class cross. See read.cross for details. |
chr |
A vector of chromosome identifiers, or a single such. |
pos |
A vector of cM positions. |
A data.frame, each row corresponding to one of the input positions. The first column contains the left-flanking markers, the second column contains the right-flanking markers, and the third column contains the markers closest to the specified positions.
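As a rough illustration of what the three columns mean, here is a hypothetical base-R helper operating on a single chromosome's named vector of marker positions. The column names and the treatment of positions beyond the ends of the chromosome (the end marker is reused) are assumptions, not a description of find.flanking itself.

```r
# Hypothetical helper for one chromosome; pos_map is a named vector of cM positions
flanking <- function(pos_map, pos) {
  left_i  <- max(which(pos_map <= pos), 1)                 # last marker at or before pos
  right_i <- min(which(pos_map >= pos), length(pos_map))   # first marker at or after pos
  close_i <- which.min(abs(pos_map - pos))                 # nearest marker
  data.frame(left = names(pos_map)[left_i],
             right = names(pos_map)[right_i],
             close = names(pos_map)[close_i],
             stringsAsFactors = FALSE)
}

chr5 <- c(a = 0, b = 10, c = 30, d = 45)
fl <- flanking(chr5, 28)
fl
```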
Brian Yandell
find.marker, plotPXG, find.markerpos, find.pseudomarker
data(listeria)
find.flanking(listeria, 5, 28)
find.flanking(listeria, c(1, 5, 13), c(81, 28, 26))
Find the genetic marker closest to a specified position on a chromosome.
find.marker(cross, chr, pos, index)
cross |
An object of class cross. See read.cross for details. |
chr |
A vector of chromosome identifiers, or a single such. |
pos |
A vector of cM positions. |
index |
A vector of numeric indices of the markers within chromosomes. |
Provide one of pos or index.
If the input chr has length one, it is expanded to the same length as the input pos or index.
If pos is specified and multiple markers are exactly the same distance from the specified position, one is chosen at random from among those with the most genotype data.
For a cross with sex-specific maps, positions specified by pos are assumed to correspond to the female genetic map.
A vector of marker names (of the same length as the input pos), corresponding to the markers nearest to the specified chromosomes/positions (if pos is specified) or to the input numeric indices (if index is specified).
Karl W Broman, [email protected]
find.flanking, plotPXG, find.pseudomarker, effectplot, find.markerpos
data(listeria)
find.marker(listeria, 5, 28)
find.marker(listeria, 5, index=6)
find.marker(listeria, c(1, 5, 13), c(81, 28, 26))
Determine the numeric index for a marker in a cross object, when all markers on all chromosomes are pasted together.
find.markerindex(cross, name)
cross |
An object of class cross. See read.cross for details. |
name |
A vector of marker names. |
A vector of numeric indices, from 1, 2, ..., totmar(cross), with NA for markers not found.
Danny Arends; Karl W Broman [email protected]
data(hyper)
mar <- find.marker(hyper, 4, 30)
find.markerindex(hyper, mar)
Find the chromosome and cM position of a set of genetic markers.
find.markerpos(cross, marker)
cross |
An object of class cross. See read.cross for details. |
marker |
A vector of marker names. |
A data frame with two columns: the chromosome and position of the markers.
Karl W Broman, [email protected]
find.flanking, find.marker, find.pseudomarker
data(hyper)
find.markerpos(hyper, "D4Mit164")
find.markerpos(hyper, c("D4Mit164", "D1Mit94"))
Find the column number corresponding to a particular phenotype name.
find.pheno(cross, pheno)
cross |
An object of class cross. See read.cross for details. |
pheno |
Vector of phenotype names (as character strings). |
A vector of numbers, corresponding to the column numbers of the phenotype in the input cross with the specified names.
Brian Yandell
data(fake.bc)
find.pheno(fake.bc, "sex")
Find the pseudomarker closest to a specified position on a chromosome.
find.pseudomarker(cross, chr, pos, where=c("draws", "prob"), addchr=TRUE)
cross |
An object of class cross. See read.cross for details. |
chr |
A vector of chromosome identifiers, or a single such. |
pos |
A vector of cM positions. |
where |
Indicates whether to look in the draws or prob components of the input cross. |
addchr |
If TRUE, include something like |
If the input chr has length one, it is expanded to the same length as the input pos.
If multiple markers are exactly the same distance from the specified position, one is chosen at random from among those with the most genotype data.
For a cross with sex-specific maps, the input positions are assumed to correspond to the female genetic map.
A vector of pseudomarker names (of the same length as the input pos), corresponding to the markers nearest to the specified chromosomes/positions.
Karl W Broman, [email protected]
find.flanking, plotPXG, effectplot, find.marker, find.markerpos
data(listeria)
listeria <- calc.genoprob(listeria, step=2.5)
find.pseudomarker(listeria, 5, 28, "prob")
find.pseudomarker(listeria, c(1, 5, 13), c(81, 28, 26), "prob")
Identify sets of markers with identical genotype data.
findDupMarkers(cross, chr, exact.only=TRUE, adjacent.only=FALSE)
cross |
An object of class cross. See read.cross for details. |
chr |
Optional vector specifying which chromosomes to consider. This may be a logical, numeric, or character string vector. |
exact.only |
If TRUE, look only for markers that have matching genotypes and the same pattern of missing data; if FALSE, also look for cases where the observed genotypes at one marker match those at another, and where the first marker has missing genotype whenever the genotype for the second marker is missing. |
adjacent.only |
If TRUE, look only for sets of markers that are adjacent to each other. |
If exact.only=TRUE, we look only for groups of markers whose pattern of missing data and observed genotypes match exactly. One marker (chosen at random) is selected as the name of the group (in the output of the function).
If exact.only=FALSE, we look also for markers whose observed genotypes are contained in the observed genotypes of another marker. We use a pair of nested loops, working from the markers with the most observed genotypes to the markers with the fewest observed genotypes.
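The exact-match case can be sketched in base R by building a string key for each marker's genotype column, with missing values encoded explicitly, and grouping markers that share a key. The helper dup_marker_groups is hypothetical; unlike the function described above, it names each group by its first marker rather than by a randomly chosen one, and it does not cover exact.only=FALSE.

```r
# Hypothetical helper: group columns of a genotype matrix that are identical,
# including the pattern of missing values (NA)
dup_marker_groups <- function(geno) {
  key <- apply(geno, 2, function(col) paste(ifelse(is.na(col), "-", col), collapse = "|"))
  groups <- split(colnames(geno), key)
  groups <- groups[lengths(groups) > 1]
  # name each group by its first marker; the remaining markers are the duplicates
  out <- lapply(groups, function(g) g[-1])
  names(out) <- unname(vapply(groups, function(g) g[1], character(1)))
  out
}

geno <- cbind(M1 = c(1, 2, NA, 1),
              M2 = c(1, 2, NA, 1),
              M3 = c(1, 2, 2, 1),
              M4 = c(1, 2, NA, 1))
dups <- dup_marker_groups(geno)
dups
```

M3 is not grouped with the others: its observed genotypes agree with theirs, but its missing-data pattern differs, which is the situation exact.only=FALSE is meant to catch.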
A list of marker names; each component is a set of markers whose genotypes match one other marker, and the name of the component is the name of the marker that they match.
Karl W Broman, [email protected]
drop.nullmarkers, drop.markers, pickMarkerSubset
data(hyper)

hyper <- drop.nullmarkers(hyper)

dupmar <- findDupMarkers(hyper) # finds 4 pairs
dupmar.adjonly <- findDupMarkers(hyper, adjacent.only=TRUE) # finds 4 pairs
dupmar.nexact <- findDupMarkers(hyper, exact.only=FALSE, adjacent.only=TRUE) # finds 6 pairs

# one might consider dropping the extra markers
totmar(hyper) # 173 markers
hyper <- drop.markers(hyper, unlist(dupmar.adjonly))
totmar(hyper) # 169 markers
Fits a user-specified multiple-QTL model. If specified, a drop-one-term analysis will be performed.
fitqtl(cross, pheno.col=1, qtl, covar=NULL, formula, method=c("imp", "hk"), model=c("normal", "binary"), dropone=TRUE, get.ests=FALSE, run.checks=TRUE, tol=1e-4, maxit=1000, forceXcovar=FALSE)
cross |
An object of class cross. See read.cross for details. |
pheno.col |
Column number in the phenotype matrix which should be used as the phenotype. One may also give a character string matching a phenotype name. Finally, one may give a numeric vector of phenotypes, in which case it must have length equal to the number of individuals in the cross, and it must contain non-integers or values < 1 or > the number of phenotypes (so that it is not mistaken for column numbers); this last case may be useful for studying transformations. |
qtl |
An object of class qtl, as produced by makeqtl. |
covar |
A matrix or data.frame of covariates. These must be strictly numeric. |
formula |
An object of class formula indicating the model to be fit; see the details below. |
method |
Indicates whether to use multiple imputation or Haley-Knott regression. |
model |
The phenotype model: the usual normal model or a model for binary traits. |
dropone |
If TRUE, do drop-one-term analysis. |
get.ests |
If TRUE, return estimated QTL effects and their estimated variance-covariance matrix. |
run.checks |
If TRUE, check the input formula and check for individuals with missing phenotypes or covariates. |
tol |
Tolerance for convergence for the binary trait model. |
maxit |
Maximum number of iterations for fitting the binary trait model. |
forceXcovar |
If TRUE, force inclusion of X-chr-related covariates (like sex and cross direction). |
The formula is used to specify the model to be fit. In the formula, use Q1, Q2, etc., or q1, q2, etc., to represent the QTLs, and the column names in the covariate data frame to represent the covariates.
We enforce a hierarchical structure on the model formula: if a QTL or covariate is involved in an interaction, its main effect must also be included.
In the drop-one-term analysis, for a given QTL/covariate model, all submodels will be analyzed. For each term in the input formula, when it is dropped, all higher order terms that contain it will also be dropped. The comparison between the new model and the full (input) model will be output.
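The rule for which terms disappear when one term is dropped can be illustrated with base R's formula tools. The helper drop_one_submodels below is hypothetical and only lists the terms remaining in each sub-model; it does not fit anything and is not how fitqtl is implemented.

```r
# Hypothetical illustration using base R formula tools (not R/qtl internals)
drop_one_submodels <- function(formula) {
  labs <- attr(terms(formula), "term.labels")
  parts <- strsplit(labs, ":", fixed = TRUE)
  out <- lapply(seq_along(labs), function(i) {
    # drop term i and every higher-order term that contains all of its factors
    contains <- vapply(parts, function(p) all(parts[[i]] %in% p), logical(1))
    labs[!contains]
  })
  names(out) <- labs
  out
}

sub <- drop_one_submodels(y ~ Q1 * Q2 + Q3 + sex)
sub[["Q1"]]      # Q1 and Q1:Q2 are both removed
sub[["Q1:Q2"]]   # only the interaction is removed
```

Each sub-model would then be compared with the full model, giving one row of the drop-one-term table per term.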
The estimated percent variances explained for the QTL are simply transformations of the conditional LOD scores by the formula $h^2 = 1 - 10^{-2\,\mathrm{LOD}/n}$. While these may be reasonable for unlinked, additive QTL, they can be completely wrong in the case of linked QTL, but we don't currently have any alternative.
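As a quick numerical illustration of this transformation, the following base-R one-liner converts a LOD score and a sample size n into a percent variance explained, using $1 - 10^{-2\,\mathrm{LOD}/n}$. The function name pve_from_lod is hypothetical, and the sample size of 200 is an arbitrary choice.

```r
# Hypothetical helper: percent variance explained from a LOD score and sample size
pve_from_lod <- function(lod, n) 100 * (1 - 10^(-2 * lod / n))

pve <- pve_from_lod(lod = c(0, 3, 10), n = 200)
pve
```

For a fixed LOD score, the implied percent variance shrinks as n grows, so the same LOD corresponds to a smaller effect in a larger cross.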
For model="binary", a logistic regression model is used.
The part to get estimated QTL effects is not complete for the case of the X chromosome and 4-way crosses. The values returned in these cases are based on a design matrix that is convenient for calculations but not easily interpreted.
The estimated QTL effects for a backcross are derived by the coding scheme -1/2 and +1/2 for AA and AB, so that the additive effect corresponds to the difference between phenotype averages for the two genotypes. For doubled haploids and RIL, the coding scheme is -1 and +1 for AA and BB, so that the additive effect corresponds to half the difference between the phenotype averages for the two homozygotes.
For an intercross, the additive effect is derived from the coding scheme -1/0/+1 for genotypes AA/AB/BB, and so is half the difference between the phenotype averages for the two homozygotes. The dominance deviation is derived from the coding scheme 0/+1/0 for genotypes AA/AB/BB, and so is the difference between the phenotype average for the heterozygotes and the midpoint between the phenotype averages for the two homozygotes.
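The intercross coding can be checked with ordinary least squares in base R. This is a minimal sketch on simulated data (the genotype frequencies and phenotype means are made up), not the design matrix code used by fitqtl.

```r
set.seed(1)
geno <- sample(c("AA", "AB", "BB"), 200, replace = TRUE, prob = c(1, 2, 1) / 4)
y <- rnorm(200, mean = c(AA = 10, AB = 13, BB = 14)[geno])

a <- unname(c(AA = -1, AB = 0, BB = 1)[geno])  # additive column: -1/0/+1
d <- unname(c(AA = 0, AB = 1, BB = 0)[geno])   # dominance column: 0/+1/0

fit <- lm(y ~ a + d)
est <- unname(coef(fit)[c("a", "d")])

# Compare with the genotype-specific phenotype averages
mu <- tapply(y, geno, mean)
additive <- (mu[["BB"]] - mu[["AA"]]) / 2
dominance <- mu[["AB"]] - (mu[["AA"]] + mu[["BB"]]) / 2

rbind(estimate = est, from.means = c(additive, dominance))
```

Because the model has three parameters for three genotype groups, the fitted values equal the group means, so the two rows agree up to rounding error.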
Epistatic effects and QTL x covariate interaction effects are obtained through the products of the corresponding additive/dominant effect columns.
An object of class fitqtl. It may contain as many as four components:
result.full is the ANOVA table as a matrix for the full model result. It contains the degrees of freedom (df), sum of squares (SS), mean square (MS), LOD score (LOD), percentage of variance explained (%var) and P value (Pvalue).
lod is the LOD score from the fit of the full model.
result.drop is a drop-one-term ANOVA table as a matrix. It contains degrees of freedom (df), Type III sum of squares (Type III SS), LOD score (LOD), percentage of variance explained (%var), F statistics (F value), and P values for chi square (Pvalue(chi2)) and F distribution (Pvalue(F)). Note that the degrees of freedom, Type III sum of squares, the LOD score and the percentage of variance explained are the values comparing the full model to the sub-model with the term dropped. Also note that for the imputation method, the percentage of variance explained, the F values and the P values are approximations calculated from the LOD score.
ests contains the estimated QTL effects and standard errors.
When model="normal", residuals are saved as an attribute of the output, named "residuals" and accessible via the attr function.
The part to get estimated QTL effects is fully working only for the case of autosomes in a backcross, intercross, RIL or doubled haploids. In other cases the values returned are based on a design matrix that is convenient for calculations but not easily interpreted.
Hao Wu; Karl W Broman, [email protected]
Haley, C. S. and Knott, S. A. (1992) A simple regression method for mapping quantitative trait loci in line crosses using flanking markers. Heredity 69, 315–324.
Sen, Ś. and Churchill, G. A. (2001) A statistical framework for quantitative trait mapping. Genetics 159, 371–387.
summary.fitqtl, makeqtl, scanqtl, refineqtl, addtoqtl, dropfromqtl, replaceqtl, reorderqtl
data(fake.f2)

# take out several QTLs and make QTL object
qc <- c(1, 8, 13)
qp <- c(26, 56, 28)
fake.f2 <- subset(fake.f2, chr=qc)

fake.f2 <- calc.genoprob(fake.f2, step=2, err=0.001)
qtl <- makeqtl(fake.f2, qc, qp, what="prob")

# fit model with 3 interacting QTLs
# (performing a drop-one-term analysis)
lod <- fitqtl(fake.f2, pheno.col=1, qtl, formula=y~Q1*Q2*Q3, method="hk")
summary(lod)

## Not run: 
# fit an additive QTL model
lod.add <- fitqtl(fake.f2, pheno.col=1, qtl, formula=y~Q1+Q2+Q3, method="hk")
summary(lod.add)

# fit the model including sex as an interacting covariate
Sex <- data.frame(Sex=pull.pheno(fake.f2, "sex"))
lod.sex <- fitqtl(fake.f2, pheno.col=1, qtl, formula=y~Q1*Q2*Q3*Sex,
                  cov=Sex, method="hk")
summary(lod.sex)

# fit the same with an additive model
lod.sex.add <- fitqtl(fake.f2, pheno.col=1, qtl, formula=y~Q1+Q2+Q3+Sex,
                      cov=Sex, method="hk")
summary(lod.sex.add)

# residuals
residuals <- attr(lod.sex.add, "residuals")
plot(residuals)
## End(Not run)
Fit the Stahl model for crossover inference (or the chi-square model, which is a special case).
fitstahl(cross, chr, m, p, error.prob=0.0001, maxit=4000, tol=1e-4, maxm=15, verbose=TRUE)
cross |
An object of class |
chr |
Optional vector indicating the chromosomes to consider.
This should be a vector of character strings referring to chromosomes
by name; numeric values are converted to strings. Refer to
chromosomes with a preceding |
m |
Interference parameter (a non-negative integer); if unspecified, this is estimated. |
p |
The proportion of chiasmata coming from the no interference mechanism in the Stahl model (0 <= p <= 1). p=0 gives the chi-square model. If unspecified, this is estimated. |
error.prob |
The genotyping error probability. If = NULL, it is estimated. |
maxit |
Maximum number of iterations to perform. |
tol |
Tolerance for determining convergence. |
maxm |
Maximum value of m to consider, if m is unspecified. |
verbose |
Logical; indicates whether to print tracing information. |
This function is currently only available for backcrosses and intercrosses.
The Stahl model of crossover interference (of which the chi-square model is a special case) is fit. In the chi-square model, points are tossed down onto the four-strand bundle according to a Poisson process, and every (m+1)st point is a chiasma. With the assumption of no chromatid interference, crossover locations on a random meiotic product are obtained by thinning the chiasma process. The parameter m (a non-negative integer) governs the strength of crossover interference, with m=0 corresponding to no interference.
In the Stahl model, chiasmata on the four-strand bundle are a superposition of chiasmata from two mechanisms, one following a chi-square model and one exhibiting no interference. An additional parameter, p, gives the proportion of chiasmata from the no interference mechanism.
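The two mechanisms can be illustrated by simulating crossover locations on a single meiotic product in base R. This is a sketch under stated assumptions, not the simulation code in R/qtl: the function name sim_xo_stahl is hypothetical, the chromosome length L is in cM, and the rates are chosen so that the expected number of crossovers is L/100.

```r
# Simulate crossover locations (in cM) on one meiotic product under the
# Stahl model; p = 0 gives the chi-square model, m = 0 gives no interference.
sim_xo_stahl <- function(L, m = 0, p = 0) {
  Lm <- L / 100                                  # length in Morgans

  # Interference mechanism: Poisson points, every (m+1)st one is a chiasma,
  # with a random starting point
  n_pts <- rpois(1, 2 * (1 - p) * (m + 1) * Lm)
  pts <- sort(runif(n_pts, 0, L))
  first <- sample.int(m + 1, 1)
  idx <- seq_len(n_pts)
  chi_int <- pts[idx >= first & (idx - first) %% (m + 1) == 0]

  # No-interference mechanism: chiasmata follow a Poisson process
  chi_ni <- runif(rpois(1, 2 * p * Lm), 0, L)

  # No chromatid interference: keep each chiasma with probability 1/2
  chiasmata <- sort(c(chi_int, chi_ni))
  chiasmata[runif(length(chiasmata)) < 0.5]
}

set.seed(1)
sim_xo_stahl(200, m = 3, p = 0.1)
```

With larger m, crossovers on the same product tend to be more evenly spaced, while the average number of crossovers stays the same.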
If all of m, p, and error.prob are specified, those with length > 1 must all have the same length.
If m is unspecified, we do a grid search starting at 0 and stop when the likelihood decreases (thus assuming a single mode), or maxm is reached.
A matrix with four columns: m, p, error.prob, and the log likelihood.
If specific values for m, p, error.prob are provided, the log likelihood for each set is given.
If some are left unspecified, the maximum likelihood estimates are provided in the results.
Karl W Broman, [email protected]
Armstrong, N. J., McPeek, M. J. and Speed, T. P. (2006) Incorporating interference into linkage analysis for experimental crosses. Biostatistics 7, 374–386.
Zhao, H., Speed, T. P. and McPeek, M. S. (1995) Statistical analysis of crossover interference using the chi-square model. Genetics 139, 1045–1056.
# Simulate genetic map: one chromosome of length 200 cM with
# a 2 cM marker spacing
mymap <- sim.map(200, 51, anchor.tel=TRUE, include.x=FALSE,
                 sex.sp=FALSE, eq.spacing=TRUE)

# Simulate data under the chi-square model, no errors
mydata <- sim.cross(mymap, n.ind=250, type="bc",
                    error.prob=0, m=3, p=0)

# Fit the chi-square model for specified m's
## Not run: output <- fitstahl(mydata, m=1:5, p=0, error.prob=0)
plot(output$m, output$loglik, lwd=2, type="b")

# Find the MLE of m in the chi-square model
## Not run: mle <- fitstahl(mydata, p=0, error.prob=0)

## Not run: 
# Simulate data under the Stahl model, no errors
mydata <- sim.cross(mymap, n.ind=250, type="bc",
                    error.prob=0, m=3, p=0.1)

# Find MLE of m for the Stahl model with known p
mle.stahl <- fitstahl(mydata, p=0.1, error.prob=0)

# Fit the Stahl model with unknown p and m,
# get results for m=0, 1, 2, ..., 8
output <- fitstahl(mydata, m=0:8, error.prob=0)
plot(output$m, output$loglik, type="b", lwd=2)
## End(Not run)
Flip the order of markers on a specified set of chromosomes, so that the markers will be in the reverse order.
flip.order(cross, chr)
cross |
An object of class |
chr |
Vector indicating the chromosomes to flip. This should be a vector of character strings referring to chromosomes by name. A logical (TRUE/FALSE) vector may also be used. |
If the cross contains results from calc.genoprob, sim.geno, argmax.geno, or calc.errorlod, those results are also updated. Results of est.rf and markerlrt are deleted.
The input cross object, but with the marker order on the specified chromosomes flipped.
Karl W Broman, [email protected]
data(fake.f2)
fake.f2 <- flip.order(fake.f2, c(1, 5, 13))
Use pairwise linkage information between markers (as calculated by est.rf) to partition markers into linkage groups.
formLinkageGroups(cross, max.rf=0.25, min.lod=3, reorgMarkers=FALSE, verbose=FALSE)
cross |
An object of class |
max.rf |
Maximum recombination fraction for placing two markers in the same linkage group (see Details). |
min.lod |
Minimum LOD score for placing two markers in the same linkage group (see Details). |
reorgMarkers |
If TRUE, the output is a cross object, like the input, but with the markers organized into the inferred linkage groups. If FALSE, the output is a table indicating the initial chromosome assignments and the inferred linkage group partitions. |
verbose |
If TRUE, display information about the progress of the calculations. |
Two markers are placed in the same linkage group if the estimated recombination fraction between them is <= max.rf and the LOD score (for the test of the rec. frac. = 1/2) is >= min.lod. The transitive property (if A is linked to B and B is linked to C then A is linked to C) is used to close the groups.
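The grouping rule amounts to finding connected components of the graph in which two markers are joined when they pass both thresholds. The sketch below, in base R, assumes separate symmetric matrices rf and lod (a simplification of how est.rf stores its results); the function name link_groups is hypothetical and this is not the package's implementation.

```r
# Partition markers into linkage groups by transitive closure of the
# pairwise rule: rf <= max.rf and lod >= min.lod.
link_groups <- function(rf, lod, max.rf = 0.25, min.lod = 3) {
  n <- nrow(rf)
  linked <- rf <= max.rf & lod >= min.lod
  linked[is.na(linked)] <- FALSE

  group <- rep(NA_integer_, n)
  g <- 0L
  for (i in seq_len(n)) {
    if (!is.na(group[i])) next
    g <- g + 1L
    group[i] <- g
    queue <- i
    while (length(queue) > 0) {        # breadth-first search from marker i
      j <- queue[1]
      queue <- queue[-1]
      nb <- which(linked[j, ] & is.na(group))
      group[nb] <- g
      queue <- c(queue, nb)
    }
  }

  # Relabel so that groups are ordered from largest to smallest
  sizes <- tabulate(group, nbins = g)
  match(group, order(-sizes))
}

# Five markers: 1-2 and 2-3 are linked (so 1 and 3 share a group), as are 4-5
rf <- matrix(0.5, 5, 5)
lod <- matrix(0, 5, 5)
for (pr in list(c(1, 2), c(2, 3), c(4, 5))) {
  rf[pr[1], pr[2]] <- rf[pr[2], pr[1]] <- 0.1
  lod[pr[1], pr[2]] <- lod[pr[2], pr[1]] <- 5
}
link_groups(rf, lod)
```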
If reorgMarkers=FALSE (the default), the output is a data frame with rows corresponding to the markers and with two columns: the initial chromosome assignment and the inferred linkage group. Linkage groups are ordered by the number of markers they contain (from largest to smallest).
If reorgMarkers=TRUE, the output is a cross object, like the input, but with the markers reorganized into the inferred linkage groups. The marker order and marker positions within the linkage groups are arbitrary.
Karl W Broman, [email protected]
data(listeria)
listeria <- est.rf(listeria)
result <- formLinkageGroups(listeria)
tab <- table(result[,1], result[,2])
apply(tab, 1, function(a) sum(a!=0))
apply(tab, 2, function(a) sum(a!=0))
Pull out a matrix of genotypes or genotype probabilities to use markers as covariates in QTL analysis.
formMarkerCovar(cross, markers, method=c("prob", "imp", "argmax"), ...)
cross |
An object of class |
markers |
A vector of character strings of marker or pseudomarker
names. Pseudomarker names may be of the form |
method |
If |
... |
Passed to |
A matrix containing genotype probabilities or genotype indicators, suitable for use as covariates in scanone.
Karl W Broman, [email protected]
pull.geno, pull.genoprob, fill.geno, scanone
data(hyper)
hyper <- calc.genoprob(hyper, step=0)
peakMarker <- "D4Mit164"
X <- formMarkerCovar(hyper, peakMarker)
out <- scanone(hyper, addcovar=X)
Create a cross tabulation of the genotypes at a pair of markers.
geno.crosstab(cross, mname1, mname2, eliminate.zeros=TRUE)
cross |
An object of class |
mname1 |
The name of the first marker (as a character
string). (Alternatively, a vector with the two character strings, in
which case |
mname2 |
The name of the second marker (as a character string). |
eliminate.zeros |
If TRUE, don't show the rows and columns that have no data. |
A matrix containing the number of individuals having each possible pair of genotypes. Genotypes for the first marker are in the rows; genotypes for the second marker are in the columns.
Karl W Broman, [email protected]
data(hyper)
geno.crosstab(hyper, "D1Mit123", "D1Mit156")
geno.crosstab(hyper, "DXMit22", "DXMit16")
geno.crosstab(hyper, c("DXMit22", "DXMit16"))
Plot a grid showing the genotype data in a cross.
geno.image(x, chr, reorder=FALSE, main="Genotype data", alternate.chrid=FALSE, col=NULL, ...)
x |
An object of class |
chr |
Optional vector indicating the chromosomes to be drawn in
the plot. This should be a vector of character strings referring to
chromosomes by name; numeric values are converted to strings. Refer
to chromosomes with a preceding |
reorder |
Specify whether to reorder individuals according to their phenotypes.
|
main |
Title to place on plot. |
alternate.chrid |
If TRUE and more than one chromosome is plotted, alternate the placement of chromosome axis labels, so that they may be more easily distinguished. |
col |
Vector of colors. The first is for missing genotypes,
followed by colors for each of the genotypes. If |
... |
Passed to |
Uses image to plot a grid with the genotype data. The genotypes AA, AB, BB are displayed in the colors red, blue, and green, respectively. In an intercross, if there are genotypes "not BB" and "not AA", these are displayed in purple and orange, respectively. White pixels indicate missing data.
None.
Karl W Broman, [email protected]
plot.cross, plotMissing, plotGeno, image
data(listeria)
geno.image(listeria)
Create table showing the observed numbers of individuals with each genotype at each marker, including P-values from chi-square tests for Mendelian segregation.
geno.table(cross, chr, scanone.output=FALSE)
cross |
An object of class |
chr |
Optional vector indicating the chromosomes to consider.
This should be a vector of character strings referring to chromosomes
by name; numeric values are converted to strings. Refer to
chromosomes with a preceding |
scanone.output |
If TRUE, give result in the form output by
|
The P-values are obtained from chi-square tests of Mendelian segregation. In the case of the X chromosome, the sexes and cross directions are tested separately, and the chi-square statistics combined, and so the test is of whether any of the groups show deviation from Mendel's rules.
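For an autosomal marker in an intercross, the test compares the observed genotype counts with the 1:2:1 Mendelian expectation. A minimal base R sketch follows; the function name mendel_test and the counts are made up for illustration, and the X chromosome handling described above is not covered.

```r
# Chi-square test of 1:2:1 segregation for AA/AB/BB counts at one marker.
# Returns the P-value and -log10(P).
mendel_test <- function(counts, expected = c(AA = 1, AB = 2, BB = 1) / 4) {
  p <- chisq.test(counts, p = expected)$p.value
  c(P.value = p, neglog10P = -log10(p))
}

mendel_test(c(AA = 30, AB = 60, BB = 30))   # counts match the expectation exactly
mendel_test(c(AA = 60, AB = 40, BB = 20))   # strong excess of AA
```

For a backcross, the same function could be called with two counts and expected = c(1, 1) / 2.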
If scanone.output=FALSE, the output is a matrix containing, for each marker, the number of individuals with each possible genotype, as well as the number that were not typed. The first column gives the chromosome ID, and the last column gives P-values from chi-square tests of Mendelian segregation.
If scanone.output=TRUE, the output is of the form produced by scanone, with the first two columns being chromosome IDs and cM positions of the markers. The third column is -log10(P) from chi-square tests of Mendelian segregation. The fourth column is the proportion of missing data. The remaining columns are the proportions of the different genotypes (among typed individuals).
Karl W Broman, [email protected]
summary.cross, drop.markers, drop.nullmarkers
data(listeria)
geno.table(listeria)

geno.table(listeria, chr=13)

gt <- geno.table(listeria)
gt[gt$P.value < 0.01,]

out <- geno.table(listeria, scanone.output=TRUE)
plot(out)
plot(out, lod=2)
Pull out the individual identifiers from a cross object.
getid(cross)
cross |
An object of class |
A vector of individual identifiers, pulled from the phenotype data (a column named id or ID).
If there are no such identifiers in the cross, the function returns NULL.
Karl W Broman, [email protected]
data(fake.f2)

# create an ID column
fake.f2$pheno$id <- paste("ind", sample(nind(fake.f2)), sep="")

getid(fake.f2)
Retrieving groups of clustered traits from the output of mqmplot.clusteredheatmap.
groupclusteredheatmap(cross, clusteredheatmapresult, height)
cross |
An object of class |
clusteredheatmapresult |
Resulting dendrogram object from |
height |
Height at which to 'cut' the dendrogram; a higher cut-off gives fewer but larger groups.
Height represents the maximum distance between two traits clustered together using hclust. The 'normal'
behaviour of bigger groups when using a higher height cut-off depends on the tree structure and the number
of traits clustered using |
A list containing groups of traits which were clustered together with a distance less than height.
Danny Arends [email protected]
The MQM tutorial: https://rqtl.org/tutorials/MQM-tour.pdf
MQM - MQM description and references
mqmscan - Main MQM single trait analysis
mqmscanall - Parallelized traits analysis
mqmaugment - Augmentation routine for estimating missing data
mqmautocofactors - Set cofactors using marker density
mqmsetcofactors - Set cofactors at fixed locations
mqmpermutation - Estimate significance levels
scanone - Single QTL scanning
data(multitrait)
multitrait <- fill.geno(multitrait) # impute missing genotype data
result <- mqmscanall(multitrait, logtransform=TRUE)
cresults <- mqmplot.clusteredheatmap(multitrait,result)
groupclusteredheatmap(multitrait,cresults,10)
Data from an experiment on hypertension in the mouse.
data(hyper)
An object of class cross. See read.cross for details.
There are 250 male backcross individuals typed at 174 markers (actually one contains only missing values), including 4 on the X chromosome, with one phenotype.
The phenotype is the blood pressure. See the reference below. Note that, for most markers, genotypes are available on only the individuals with extreme phenotypes. At many markers, only recombinant individuals were typed.
Bev Paigen and Gary Churchill (The Jackson Laboratory, Bar Harbor, Maine) https://phenome.jax.org/projects/Sugiyama2
Sugiyama, F., Churchill, G. A., Higgens, D. C., Johns, C., Makaritsis, K. P., Gavras, H. and Paigen, B. (2001) Concordance of murine quantitative trait loci for salt-induced hypertension with rat and human loci. Genomics 71, 70–77.
fake.bc, fake.f2, fake.4way, listeria, bristle3, bristleX
data(hyper)
summary(hyper)
plot(hyper)

# Note the selective genotyping
## Not run: plotMissing(hyper, reorder=TRUE)

# A marker on c14 has no data; remove it
hyper <- drop.nullmarkers(hyper)
Uses groups of adjacent markers to infer the founder haplotypes in SNP data on multi-parent recombinant inbred lines.
inferFounderHap(cross, chr, max.n.markers=15)
cross |
An object of class |
chr |
Indicator of chromosome to consider. If multiple chromosomes are selected, only the first is used. |
max.n.markers |
Maximum number of adjacent markers to consider. |
We omit SNPs for which any of the founders are missing.
We then consider groups of adjacent SNPs, looking for founder haplotypes that are unique; RIL sharing such a unique haplotype are then inferred to have that founder's DNA.
We consider each marker as the center of a haplotype, and consider haplotypes of size 1, 3, 5, ..., max.n.markers. We end the extension of the haplotypes when all founders have a unique haplotype.
A matrix of dimension nind(cross) x no. markers, with the inferred founder origin for each line at each marker.
Karl W Broman, [email protected]
sim.geno, calc.genoprob, fill.geno, argmax.geno
map <- sim.map(100, n.mar=101, include.x=FALSE, eq.spacing=TRUE)
founderGeno <- simFounderSnps(map, "8")
ril <- sim.cross(map, n.ind=10, type="ri8sib", founderGeno=founderGeno)

h <- inferFounderHap(ril, max.n.markers=11)
mean(!is.na(h)) # proportion inferred
plot(map[[1]], h[1,], ylim=c(0.5, 8.5), xlab="Position", ylab="Genotype")
Identify the inferred partitions for a chromosome from the results of scanPhyloQTL.
inferredpartitions(output, chr, lodthreshold, probthreshold=0.9)
output |
An object output by the function
|
chr |
A character string indicating the chromosome to consider. (It can also be a number, but it's then converted to a character string.) |
lodthreshold |
LOD threshold; if maximum LOD score is less than this, the null model is considered. |
probthreshold |
Threshold on posterior probabilities. See Details below. |
We consider a single chromosome, and take the maximum LOD score for each partition on that chromosome. The presence of a QTL is inferred if at least one partition has LOD score greater than lodthreshold. In this case, we then convert the LOD scores for the partitions to approximate posterior probabilities by taking 10^LOD and then rescaling them to sum to 1. These are sorted from largest to smallest, and we then take as the inferred partitions the smallest set whose posterior probabilities cumulatively add up to at least probthreshold.
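The conversion from LOD scores to an inferred set of partitions can be sketched in base R. The function name infer_partitions and the LOD scores are made up for illustration; this is not the code of inferredpartitions itself, and it takes the per-partition maximum LOD scores as given.

```r
# maxlod: named vector of maximum LOD scores, one per partition
infer_partitions <- function(maxlod, lodthreshold, probthreshold = 0.9) {
  if (max(maxlod) < lodthreshold) return("null")

  # 10^LOD rescaled to sum to 1 (subtracting the maximum avoids overflow)
  post <- 10^(maxlod - max(maxlod))
  post <- sort(post / sum(post), decreasing = TRUE)

  # smallest set whose cumulative probability reaches the threshold
  k <- which(cumsum(post) >= probthreshold)[1]
  if (is.na(k)) k <- length(post)
  names(post)[seq_len(k)]
}

maxlod <- c("AB|CD" = 5, "A|BCD" = 4, "AC|BD" = 2)
infer_partitions(maxlod, lodthreshold = 3)
infer_partitions(maxlod, lodthreshold = 3, probthreshold = 0.95)
infer_partitions(maxlod, lodthreshold = 6)
```

Here the top partition has approximate posterior probability 0.908, so it alone suffices at the default threshold, while a threshold of 0.95 requires the second partition as well.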
A vector of character strings. If the null model (no QTL) is inferred, the output is "null". Otherwise, it is the set of inferred partitions.
Karl W Broman, [email protected]
Broman, K. W., Kim, S., Ané, C. and Payseur, B. A. Mapping quantitative trait loci to a phylogenetic tree. In preparation.
scanPhyloQTL, plot.scanPhyloQTL, summary.scanPhyloQTL, max.scanPhyloQTL, simPhyloQTL
# example map; drop X chromosome
data(map10)
map10 <- map10[1:19]

# simulate data
x <- simPhyloQTL(4, partition="AB|CD", crosses=c("AB", "AC", "AD"),
                 map=map10, n.ind=150,
                 model=c(1, 50, 0.5, 0))

# run calc.genoprob on each cross
## Not run: x <- lapply(x, calc.genoprob, step=2)

# scan genome, at each position trying all possible partitions
out <- scanPhyloQTL(x, method="hk")

# inferred partitions
inferredpartitions(out, chr=3, lodthreshold=3)

# inferred partitions with prob'y threshold = 0.95
inferredpartitions(out, chr=3, lodthreshold=3, probthreshold=0.95)
On the basis of a pair of marker maps with common markers, take positions along one map and interpolate (or, past the terminal markers on a chromosome, extrapolate) their positions on the second map.
interpPositions(oldpositions, oldmap, newmap)
oldpositions |
A data frame with two columns: |
oldmap |
An object of class |
newmap |
An object of class |
In this explanation, take oldmap and newmap to be the physical and genetic maps, respectively.
We use linear interpolation within each interval, assuming a constant recombination rate within the interval. Past the terminal markers, we use linear extrapolation, using the chromosome-wide average recombination rate.
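For a single chromosome, the interpolation and extrapolation can be sketched in base R with approx. The function name interp_pos and the marker positions are hypothetical; oldmap and newmap here are plain numeric vectors of positions for the same markers, not map objects.

```r
# Map positions from one marker map to another for a single chromosome:
# linear interpolation between markers, and extrapolation past the terminal
# markers using the chromosome-wide average rate.
interp_pos <- function(pos, oldmap, newmap) {
  stopifnot(length(oldmap) == length(newmap), length(oldmap) >= 2)
  n <- length(oldmap)
  rate <- (newmap[n] - newmap[1]) / (oldmap[n] - oldmap[1])

  out <- approx(oldmap, newmap, xout = pos)$y   # NA outside the marker range
  left <- pos < oldmap[1]
  right <- pos > oldmap[n]
  out[left] <- newmap[1] - (oldmap[1] - pos[left]) * rate
  out[right] <- newmap[n] + (pos[right] - oldmap[n]) * rate
  out
}

# Three markers at 0, 10, 30 on the old map and 0, 5, 25 on the new map
interp_pos(c(-6, 5, 20, 40), oldmap = c(0, 10, 30), newmap = c(0, 5, 25))
```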
The input data frame, oldpositions, with an additional column newpos with the interpolated positions along newmap.
Karl W Broman, [email protected]
shiftmap, rescalemap, pull.map
data(hyper)

# hyper genetic map
gmap <- pull.map(hyper)

# a fake physical map, with each chromosome starting at 0.
pmap <- shiftmap(rescalemap(gmap, 2))

# positions on pmap to determine location on gmap
tofind <- data.frame(chr=c(1, 5, 17, "X"), pos=c(220, 20, 105, 10))
rownames(tofind) <- paste("loc", 1:nrow(tofind), sep="")

interpPositions(tofind, pmap, gmap)
Jitter the marker positions in a genetic map so that no two markers are on top of each other.
jittermap(object, amount=1e-6)
object |
Either a cross (an object of class |
amount |
The amount by which markers should be moved. |
Either the input cross object or the input map, but with marker positions slightly jittered. If the input was a cross, the function clean is run to strip off any intermediate calculations.
Karl W Broman, [email protected]
pull.map, replace.map, summary.cross
data(hyper)
hyper <- jittermap(hyper)
Data from an experiment on susceptibility to Listeria monocytogenes infection in the mouse.
data(listeria)
An object of class cross. See read.cross for details.
There are 120 F2 individuals typed at 133 markers, including 2 on the X chromosome, with one phenotype.
The phenotype is the survival time (in hours) following infection. Mice with phenotype 264 hours may be considered to have recovered from the infection. See the references below.
Victor Boyartchuk and William Dietrich (Department of Genetics, Harvard Medical School and Howard Hughes Medical Institute)
Boyartchuk, V. L., Broman, K. W., Mosher, R. E., D'Orazio S. E. F., Starnbach, M. N. and Dietrich, W. F. (2001) Multigenic control of Listeria monocytogenes susceptibility in mice. Nature Genetics 27, 259–260.
Broman, K. W. (2003) Mapping quantitative trait loci in the case of a spike in the phenotype distribution. Genetics 163, 1169–1175.
fake.bc, fake.f2, fake.4way, hyper, bristle3, bristleX
data(listeria)

# Summaries
summary(listeria)
plot(listeria)

# Take log of phenotype
listeria$pheno[,1] <- log2(listeria$pheno[,1])
plot(listeria)

# Genome scan with a two-part model, using log survival
listeria <- calc.genoprob(listeria, step=2)
out <- scanone(listeria, model="2part", method="em", upper=TRUE)

# Summary of the results
summary(out, thr=c(5,3,3), format="allpeaks")

# Plot LOD curves for interesting chromosomes
# (The two-part model gives three LOD scores)
plot(out, chr=c(1,5,6,13,15), lodcolumn=1:3, lty=1, col=c("black","red","blue"))
Estimate the locations of crossovers for each individual on a given chromosome.
locateXO(cross, chr, full.info=FALSE)
cross |
An object of class |
chr |
Chromosome to investigate (if unspecified, the first chromosome is considered). This should be a character string referring to a chromosome by name; numeric values are converted to strings. |
full.info |
If TRUE, output will include information on the left and right endpoints of the intervals to which recombination events are known, as well as the corresponding marker indices. |
For each individual we determine the locations of obligate crossovers, and estimate their location to be at the midpoint between the nearest flanking typed markers.
The function currently only works for a backcross, intercross, or recombinant inbred line.
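The midpoint rule can be sketched in base R for one individual in a backcross. The function name locate_xo is hypothetical, genotypes are coded 1/2 with NA for untyped markers, and this is an illustration of the idea rather than the package's code.

```r
# Estimated crossover locations for one individual: an obligate crossover
# lies between each pair of consecutive typed markers with different
# genotypes, and is placed at the midpoint of that interval.
locate_xo <- function(geno, pos) {
  typed <- which(!is.na(geno))
  if (length(typed) < 2) return(numeric(0))
  g <- geno[typed]
  p <- pos[typed]
  change <- which(g[-1] != g[-length(g)])
  (p[change] + p[change + 1]) / 2
}

# Markers every 10 cM; untyped markers are skipped when finding flanking markers
locate_xo(c(1, NA, 1, 2, NA, 2, 1), pos = seq(0, 60, by = 10))
```

In this example the crossovers are placed at 25 and 55 cM, the midpoints of the intervals 20-30 and 50-60.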
A list with one component per individual. Each component is either NULL or is a numeric vector with the estimated crossover locations.
If full.info=TRUE, in place of a numeric vector with estimated locations, there is a matrix that includes those locations, the left and right endpoints of the intervals to which crossovers can be placed, the marker indices corresponding to those endpoints, and genotype codes for the genotypes to the left and right of each crossover. The final column indicates the number of typed markers between the current crossover and the next one (useful for identifying potential genotyping errors).
Karl W Broman, [email protected]
data(hyper)
xoloc <- locateXO(hyper, chr=4)
table(sapply(xoloc, length))
A table with genetic locations of the traits in the multitrait dataset.
data(locations)
Each row is a trait with the following information:
Name, Name of the trait (will be checked against the name in the cross object)
Chr, Chromosome of the trait
cM, Location in cM from the start of the chromosome
Additional information from the Arabidopsis RIL selfing experiment with Landsberg erecta (Ler) and Cape Verde Islands (Cvi) with 162 individuals scored (with errors at) 117 markers. Dataset obtained from GBIC - Groningen BioInformatics Centre
Keurentjes JJB, Fu J, de Vos CHR, Lommen A, Jansen RC et al (2006), The genetics of plant metabolism. Nature Genetics 38, 842–849.
Alonso-Blanco C., Peeters, A. J. and Koornneef, M. (2006) Development of an AFLP based linkage map of Ler, Col and Cvi Arabidopsis thaliana ecotypes and construction of a Ler/Cvi recombinant inbred line population. Plant J. 14(2), 259–271.
## Not run: 
data(multitrait)
data(locations)
multiloc <- addloctocross(multitrait,locations)
results <- scanall(multiloc)
mqmplot.cistrans(results,multiloc, 5, FALSE, TRUE)
## End(Not run)
Calculate a LOD support interval for a particular chromosome, using output from scanone.
lodint(results, chr, qtl.index, drop=1.5, lodcolumn=1, expandtomarkers=FALSE)
results |
Output from |
chr |
A chromosome ID (if input |
qtl.index |
Numeric index for a QTL (if input |
drop |
LOD units to drop to form the interval. |
lodcolumn |
An integer indicating which
of the LOD score columns should be considered (if input
|
expandtomarkers |
If TRUE, the interval is expanded to the nearest flanking markers. |
An object of class scanone
indicating the
estimated QTL position and the approximate endpoints
for the LOD support interval.
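As a rough illustration of what a LOD support interval is, the base-R sketch below applies a 1.5-LOD drop to a made-up LOD curve. The endpoint convention used here (stepping one position beyond the region that stays within the drop) is an assumption made for illustration and may differ in detail from what lodint does; pos, lod, lo and hi are hypothetical names.

```r
# Toy LOD curve on one chromosome (positions in cM; values are made up)
pos <- seq(0, 50, by = 5)
lod <- c(0.4, 1.1, 2.3, 3.9, 5.2, 4.6, 3.8, 3.1, 1.9, 0.8, 0.3)
drop <- 1.5

peak <- which.max(lod)                    # estimated QTL position
within <- which(lod > max(lod) - drop)    # positions within `drop` of the peak

# Step one position beyond the region on each side (clamped to the ends),
# so the interval endpoints are the first positions outside the drop
lo <- max(min(within) - 1, 1)
hi <- min(max(within) + 1, length(pos))

data.frame(pos = pos[c(lo, peak, hi)], lod = lod[c(lo, peak, hi)],
           row.names = c("lower", "peak", "upper"))
```

A larger drop (for example drop = 2) can only widen the interval, which is why the drop argument trades precision for coverage.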
Karl W Broman, [email protected]
data(hyper)
hyper <- calc.genoprob(hyper, step=0.5)
out <- scanone(hyper, method="hk")
lodint(out, chr=1)
lodint(out, chr=4)
lodint(out, chr=4, drop=2)
lodint(out, chr=4, expandtomarkers=TRUE)
This function takes a cross object and specified chromosome numbers
and positions and pulls out the genotype probabilities or imputed
genotypes at the nearest pseudomarkers, for later use by the function
fitqtl
.
makeqtl(cross, chr, pos, qtl.name, what=c("draws","prob"))
cross |
An object of class |
chr |
Vector indicating the chromosome for each QTL. (These should be character strings referring to the chromosomes by name.) |
pos |
Vector (of same length as |
qtl.name |
Optional user-specified name for each QTL, used in the
drop-one-term ANOVA table in |
what |
Indicates whether to pull out the imputed genotypes or the genotype probabilities. |
This function will take out the genotype probabilities and imputed genotypes if they are present in the input cross object. If both fields are missing in the input object, the function will report an error. Before running this function, the user must have first run either sim.geno (for what="draws") or calc.genoprob (for what="prob").
An object of class qtl
with the following elements (though only
one of geno
and prob
will be included, according to
whether what
is given as "draws"
or "prob"
):
geno |
Imputed genotypes. |
prob |
Genotype probabilities. |
name |
User-defined name for each QTL, or a name of the
form |
altname |
QTL names of the form |
chr |
Input vector of chromosome numbers. |
pos |
Input vector of chromosome positions. |
n.qtl |
Number of QTLs. |
n.ind |
Number of individuals. |
n.gen |
A vector indicating the number of genotypes for each QTL. |
Hao Wu; Karl W Broman, [email protected]
fitqtl
, calc.genoprob
,
sim.geno
, dropfromqtl
,
replaceqtl
, addtoqtl
, summary.qtl
,
reorderqtl
data(fake.f2)

# take out several QTLs and make QTL object
qc <- c("1", "6", "13")
qp <- c(25.8, 33.6, 18.63)
fake.f2 <- subset(fake.f2, chr=qc)
fake.f2 <- sim.geno(fake.f2, n.draws=8, step=2, err=0.001)
qtl <- makeqtl(fake.f2, qc, qp, what="draws")
summary(qtl)
A genetic map corresponding approximately to the mouse genome with a 10 cM marker spacing.
data(map10)
An object of class map
: a list whose components are vectors of
marker locations. This map approximates the mouse genome, with 20
chromosomes (including the X chromosome) and 187 markers at an
approximately 10 cM spacing. The markers are equally spaced on each
chromosome, but the spacings are a bit above or below 10 cM, so that
the lengths match those in the Mouse Genome Database.
data(map10)
plot(map10)
mycross <- sim.cross(map10, type="f2", n.ind=100)
Convert a map object (as a list) to a table (as a data frame).
map2table(map, chr)
map |
A |
chr |
Optional vector indicating the chromosomes to consider.
This should be a vector of character strings referring to chromosomes
by name; numeric values are converted to strings. Refer to
chromosomes with a preceding |
A data frame with two or three columns: chromosome and sex-averaged position, or chromosome, female position, and male position.
The row names are the marker names.
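To make the list-to-table conversion concrete, here is a minimal base-R sketch for the sex-averaged case. The toy map, the marker names and the column names chr and pos are assumptions chosen for illustration; this is not the package's code.

```r
# A toy sex-averaged map: a list with one named position vector per chromosome
map <- list("1" = c(D1M1 = 0, D1M2 = 12.5, D1M3 = 30),
            "X" = c(DXM1 = 0, DXM2 = 18))

# One row per marker: chromosome name repeated for each of its markers,
# positions flattened in order, marker names used as row names
tab <- data.frame(chr = rep(names(map), lengths(map)),
                  pos = unlist(map, use.names = FALSE),
                  row.names = unlist(lapply(map, names), use.names = FALSE))
tab
```

A sex-specific map would need two position columns (female and male) instead of one, as described above.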
Karl W Broman, [email protected]
data(fake.f2)
map <- pull.map(fake.f2)
map_as_tab <- map2table(map)
Simulated data for an F2 intercross, obtained using
sim.cross
, useful for illustrating the process of
constructing a genetic map.
data(mapthis)
An object of class cross
. See read.cross
for details.
These are simulated data, consisting of 300 F2 individuals typed at 100 markers on five chromosomes. There are no real phenotypes, just a set of individual identifiers. The data were simulated for the purpose of illustrating the process of constructing a genetic map. The markers are all assigned to a single chromosome and in a random order, and there are a number of problematic markers and individuals.
See https://rqtl.org/tutorials/geneticmaps.pdf for a tutorial on how to construct a genetic map with these data.
Karl W Broman, [email protected]
Broman, K. W. (2010) Genetic map construction with R/qtl. Technical report #214, Department of Biostatistics and Medical Informatics, University of Wisconsin–Madison
fake.f2
, est.rf
,
est.map
, formLinkageGroups
,
orderMarkers
data(mapthis)
summary(mapthis)
plot(mapthis)
Calculate a LOD score for a general likelihood ratio test for each pair of markers, to assess their association.
markerlrt(cross)
cross |
An object of class |
The input cross
object is returned with a component, rf
,
added. This is a matrix of size (tot.mar x tot.mar). The diagonal
contains the number of typed meioses per marker, the upper and lower triangles
each contain the LOD scores.
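For intuition about a LOD score from a general likelihood ratio test of association between two markers, the following base-R sketch compares observed two-way genotype counts with those expected under independence. The genotype vectors are made up, and the calculation is a simplified stand-in that may differ from how markerlrt handles missing or partially informative genotypes.

```r
# Made-up genotypes at two markers for 12 F2 individuals (codes 1, 2, 3)
m1 <- c(1, 1, 2, 2, 2, 3, 3, 1, 2, 3, 2, 2)
m2 <- c(1, 1, 2, 2, 3, 3, 3, 1, 2, 3, 2, 1)

obs  <- table(m1, m2)                                   # observed two-way counts
expd <- outer(rowSums(obs), colSums(obs)) / sum(obs)    # expected under independence

# Log10 likelihood ratio: sum of obs * log10(obs / expected),
# skipping empty cells (their contribution is zero)
nz  <- obs > 0
lod <- sum(obs[nz] * log10(obs[nz] / expd[nz]))
lod
```

Strongly associated (tightly linked) markers concentrate counts on a few cells and give a large LOD, while independent markers give a value near zero.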
Karl W Broman, [email protected]
data(badorder)
badorder <- markerlrt(badorder)
plotRF(badorder)
Pull out the marker names from a cross object as one big vector.
markernames(cross, chr)
cross |
An object of class |
chr |
Optional vector indicating the chromosomes to consider.
This should be a vector of character strings referring to chromosomes
by name; numeric values are converted to strings. Refer to
chromosomes with a preceding |
A vector of character strings (the marker names).
Karl W Broman, [email protected]
data(listeria)
markernames(listeria, chr=5)
Print the row of the output from scanone
that
corresponds to the maximum LOD, genome-wide.
## S3 method for class 'scanone'
max(object, chr, lodcolumn=1, na.rm=TRUE, ...)
object |
An object of the form output by the function
|
chr |
Optional vector indicating the chromosomes to consider.
This should be a vector of character strings referring to chromosomes
by name; numeric values are converted to strings. Refer to
chromosomes with a preceding |
lodcolumn |
An integer, indicating which of the LOD score columns should be considered in pulling out the peak (these are indexed 1, 2, ...). |
na.rm |
A logical indicating whether missing values should be removed. |
... |
Ignored. |
An object of class summary.scanone
, to be printed by
print.summary.scanone
. This is a data.frame with one row,
corresponding to the maximum LOD peak either genome-wide or for the
particular chromosome specified.
Karl W Broman, [email protected]
scanone
, plot.scanone
,
summary.scanone
data(listeria)
listeria <- calc.genoprob(listeria, step=2.5)
out <- scanone(listeria, model="2part", upper=TRUE)

# Maximum peak for LOD(p,mu)
max(out)

# Maximum peak for LOD(p,mu) on chr 5
max(out,chr=5)

# Maximum peak for LOD(p,mu) on chromosomes other than chr 13
max(out,chr="-13")

# Maximum peak for LOD(p)
max(out, lodcolumn=2)

# Maximum peak for LOD(mu)
max(out, lodcolumn=3)
Print the chromosome with the maximum LOD score across partitions,
from the results of scanPhyloQTL
.
## S3 method for class 'scanPhyloQTL'
max(object, chr, format=c("postprob", "lod"), ...)
object |
An object output by the function
|
chr |
Optional vector indicating the chromosomes to consider.
This should be a vector of character strings referring to chromosomes
by name; numeric values are converted to strings. Refer to
chromosomes with a preceding |
format |
Indicates whether to provide LOD scores or approximate
posterior probabilities; see the help file for |
... |
Ignored at this point. |
The output, and the use of the argument format
, is as in
summary.scanPhyloQTL
.
An object of class summary.scanPhyloQTL
, to be printed by
print.summary.scanPhyloQTL
.
Karl W Broman, [email protected]
Broman, K. W., Kim, S., Ané, C. and Payseur, B. A. Mapping quantitative trait loci to a phylogenetic tree. In preparation.
scanPhyloQTL
, plot.scanPhyloQTL
,
summary.scanPhyloQTL
, max.scanone
,
inferredpartitions
,
simPhyloQTL
## Not run: 
# example map; drop X chromosome
data(map10)
map10 <- map10[1:19]

# simulate data
x <- simPhyloQTL(4, partition="AB|CD", crosses=c("AB", "AC", "AD"),
                 map=map10, n.ind=150,
                 model=c(1, 50, 0.5, 0))

# run calc.genoprob on each cross
x <- lapply(x, calc.genoprob, step=2)

# scan genome, at each position trying all possible partitions
out <- scanPhyloQTL(x, method="hk")

# maximum peak
max(out, format="lod")

# approximate posterior probabilities at peak
max(out, format="postprob")

# all peaks above a threshold for LOD(best) - LOD(2nd best)
summary(out, threshold=1, format="lod")

# all peaks above a threshold for LOD(best), showing approx post'r prob
summary(out, format="postprob", threshold=3)

# plot of results
plot(out)
## End(Not run)
Print the pair of loci with the largest LOD score in the results of
scantwo
.
## S3 method for class 'scantwo'
max(object, lodcolumn=1, what=c("best", "full", "add", "int"),
    na.rm=TRUE, ...)
object |
An object of class |
lodcolumn |
If the scantwo results contain LOD scores for multiple phenotypes, this argument indicates which to use. |
what |
Indicates for which LOD score the maximum should be reported. |
na.rm |
Ignored. |
... |
Ignored. |
This is very similar to the summary.scantwo
function, though this pulls out one pair of positions.
If what="best"
, we find the pair of positions at which the LOD
score for the full model (2 QTL + interaction) is maximized, and then
also print the positions on that same pair of chromosomes at which the
additive LOD score is maximized.
In the other cases, we pull out the pair of positions with the largest
LOD score; which LOD score is considered is indicated by the
what
argument.
An object of class summary.scantwo
, to be printed by
print.summary.scantwo
, with the pair of positions with the
maximum LOD score. (Which LOD score is considered is indicated by the
what
argument.)
Note that, for output from addpair
in which the
new loci are indicated explicitly in the formula, the summary provided
by max.scantwo
is somewhat special.
All arguments (except, of course, the input
object
) are ignored.
If the formula is symmetric in the two new QTL, the output has just two LOD
score columns: lod.2v0
comparing the full model to the model
with neither of the new QTL, and lod.2v1
comparing the full
model to the model with just one new QTL.
If the formula is not symmetric in the two new QTL, the output
has three LOD score columns: lod.2v0
comparing the full model
to the model with neither of the new QTL, lod.2v1b
comparing
the full model to the model in which the first of the new QTL is
omitted, and lod.2v1a
comparing the full model to the model
with the second of the new QTL omitted.
Karl W Broman, [email protected]
scantwo
, plot.scantwo
,
summary.scantwo
data(fake.f2)
fake.f2 <- calc.genoprob(fake.f2, step=10)
out.2dim <- scantwo(fake.f2, method="hk")
max(out.2dim)
Move a specified marker to a different chromosome.
movemarker(cross, marker, newchr, newpos)
cross |
An object of class |
marker |
The name of the marker to be moved (a character string). |
newchr |
The chromosome to which the marker should be moved. |
newpos |
The position (in cM) at which the marker should be placed. If missing, the marker is placed at the end of the chromosome. |
The input cross
object, but with the specified marker moved to
the specified chromosome.
All intermediate calculations (such as the results of
calc.genoprob
and est.rf
) are
removed.
Karl W Broman, [email protected]
data(badorder)
badorder <- movemarker(badorder, "D2M937", 3, 48.15)
badorder <- movemarker(badorder, "D3M160", 2, 28.83)
Overview of the MQM mapping functions
Multiple QTL Mapping (MQM) provides a sensitive approach for mapping quantitative trait loci (QTL) in experimental populations. MQM offers higher statistical power than many other methods. The theoretical framework of MQM was introduced and explored by Ritsert Jansen, explained in the ‘Handbook of Statistical Genetics’ (see references), and used effectively in practical research with the commercial ‘mapqtl’ software package. Here we present the first free and open source implementation of MQM, with extra features like high performance parallelization on multi-CPU computers, new plots and significance testing.
MQM is an automatic three-stage procedure in which, in the first stage, missing data is ‘augmented’. In other words, rather than guessing one likely genotype, multiple genotypes are modeled with their estimated probabilities. In the second stage important markers are selected by multiple regression and backward elimination. In the third stage a QTL is moved along the chromosomes using these pre-selected markers as cofactors, except for the markers in the window around the interval under study. QTL are (interval) mapped using the most ‘informative’ model through maximum likelihood. A refined and automated procedure for cases with large numbers of marker cofactors is included. The method internally controls false discovery rates (FDR) and lets users test different QTL models by elimination of non-significant cofactors.
R/qtl-MQM has the following advantages:
Higher power to detect linked as well as unlinked QTL, as long as the QTL explain a reasonable amount of variation
Protection against overfitting, because it fixes the residual variance from the full model. For this reason more parameters (cofactors) can be used compared to, for example, CIM
Prevention of ghost QTL (between two QTL in coupling phase)
Detection of negating QTL (QTL in repulsion phase)
The current implementation of R/qtl-MQM has the following limitations: (1) MQM is limited to experimental crosses F2, BC, and selfed RIL, (2) MQM does not treat sex chromosomes differently from autosomal chromosomes - though one can introduce sex as a cofactor. Future versions of R/qtl-MQM may improve on these points. Check the website and change log (https://github.com/kbroman/qtl/blob/main/NEWS.md) for updates.
Ritsert C Jansen; Danny Arends; Pjotr Prins; Karl W Broman [email protected]
Arends D, Prins P, Jansen RC. R/qtl: High-throughput multiple QTL mapping. Bioinformatics, to appear
Jansen RC, (2007) Quantitative trait loci in inbred lines. Chapter 18 of Handbook of Stat. Genetics 3rd edition. John Wiley & Sons, Ltd.
Jansen RC, Nap JP (2001), Genetical genomics: the added value from segregation. Trends in Genetics, 17, 388–391.
Jansen RC, Stam P (1994), High resolution of quantitative traits into multiple loci via interval mapping. Genetics, 136, 1447–1455.
Jansen RC (1993), Interval mapping of multiple quantitative trait loci. Genetics, 135, 205–211.
Swertz MA, Jansen RC. (2007), Beyond standardization: dynamic software infrastructures for systems biology. Nat Rev Genet. 3, 235–243.
Dempster, A. P., Laird, N. M. and Rubin, D. B. (1977) Maximum likelihood from incomplete data via the EM algorithm. J. Roy. Statist. Soc. B, 39, 1–38.
The MQM tutorial: https://rqtl.org/tutorials/MQM-tour.pdf
MQM - MQM description and references
mqmscan - Main MQM single trait analysis
mqmscanall - Parallelized traits analysis
mqmaugment - Augmentation routine for estimating missing data
mqmautocofactors - Set cofactors using marker density
mqmsetcofactors - Set cofactors at fixed locations
mqmpermutation - Estimate significance levels
scanone - Single QTL scanning
data(map10)                    # Genetic map modeled after mouse

# simulate a cross (autosomes 1-10)
qtl <- c(3,15,1,0)             # QTL model: chr, pos'n, add've & dom effects
cross <- sim.cross(map10[1:10],qtl,n=100,missing.prob=0.01)

# MQM
crossaug <- mqmaugment(cross)  # Augmentation
cat(crossaug$mqm$Nind,'real individuals retained in dataset',
    crossaug$mqm$Naug,'individuals augmented\n')

result <- mqmscan(crossaug)    # Scan

# show LOD interval of the QTL on chr 3
lodint(result,chr=3)
Fill in missing genotypes for MQM mapping. For each missing or incomplete marker it fills in (or ‘augments’) all possible genotypes, thus creating new candidate ‘individuals’. The probability of each individual is calculated using information on neighbouring markers and recombination frequencies. When an augmented genotype is less likely than the minprob parameter, it is dropped from the dataset. The augmented list of individuals is returned in a new cross object. For a full discussion of augmentation see the MQM tutorial online.
mqmaugment(cross, maxaugind=82, minprob=0.1, strategy=c("default","impute","drop"), verbose=FALSE)
cross |
An object of class |
maxaugind |
Maximum number of augmentations per individual. The default of 82
allows for six missing markers for an individual in a BC cross
( |
minprob |
Return individuals with augmented genotypes that have at least this probability of
occurring. |
strategy |
When individuals have too much missing data and augmentation fails three
options are provided:
1. |
verbose |
If TRUE, give verbose output |
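To see why the number of augmented genotypes has to be capped, the base-R sketch below enumerates every fill-in for a hypothetical backcross individual and tabulates how the count grows with the number of untyped markers. The marker names and genotype labels are made up, and the sketch ignores the probability filtering that minprob applies.

```r
# Hypothetical backcross individual with three untyped markers.
# Each untyped marker can be one of two genotypes in a backcross.
missing_markers <- c("M2", "M5", "M7")   # made-up marker names
bc_genotypes <- c("AA", "AB")

# Every combination of fill-ins is one candidate augmented 'individual'
candidates <- expand.grid(rep(list(bc_genotypes), length(missing_markers)),
                          stringsAsFactors = FALSE)
names(candidates) <- missing_markers
nrow(candidates)    # 2^3 candidates

# The count grows geometrically with the number of untyped markers
n_missing <- 1:7
data.frame(n_missing, backcross = 2^n_missing, intercross = 3^n_missing)
```

With two genotypes per marker, six untyped markers give 2^6 = 64 combinations, which fits under the default maxaugind of 82, whereas seven give 128. With three fully informative genotypes per marker, 3^4 = 81 is the largest power of three below that cap.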
Returns the cross object with augmented individuals (many individuals from the data set will be repeated multiple times). Some individuals may have been dropped completely when the probability falls below minprob. An added component to the cross object named mqm contains information on exactly which individuals are retained and repeated.
The sex chromosome 'X' is treated like autosomes during augmentation.
With an F2 the sex chromosome is not considered. This will change in
a future version of MQM.
Run with verbose=TRUE to verify how many individuals are augmented versus moved to the second augmentation round, since this could have an effect on the resulting dataset; alternatively, check the returned cross$mqm values. Compare results by using minprob=1.
Ritsert C Jansen; Danny Arends; Pjotr Prins; Karl W Broman [email protected]
fill.geno - Alternative routine for estimating missing data
The MQM tutorial: https://rqtl.org/tutorials/MQM-tour.pdf
MQM - MQM description and references
mqmscan - Main MQM single trait analysis
mqmscanall - Parallelized traits analysis
mqmaugment - Augmentation routine for estimating missing data
mqmautocofactors - Set cofactors using marker density
mqmsetcofactors - Set cofactors at fixed locations
mqmpermutation - Estimate significance levels
scanone - Single QTL scanning
data(map10)                    # Genetic map modeled after mouse

# simulate a cross (autosomes 1-10)
qtl <- c(3,15,1,0)             # QTL model: chr, pos'n, add've & dom effects
cross <- sim.cross(map10[1:10],qtl,n=100,missing.prob=0.01)

# MQM
crossaug <- mqmaugment(cross)  # Augmentation
cat(crossaug$mqm$Nind,'real individuals retained in dataset',
    crossaug$mqm$Naug,'individuals augmented\n')

result <- mqmscan(crossaug)    # Scan

# show LOD interval of the QTL on chr 3
lodint(result,chr=3)
Sets cofactors, taking underlying marker density into account. Together with mqmscan, cofactors are selected through backward elimination.
mqmautocofactors(cross, num=50, distance=5, dominance=FALSE, plot=FALSE, verbose=FALSE)
cross |
An object of class |
num |
Number of cofactors to set (warns when setting too many cofactors). |
distance |
Minimal distance between two cofactors, in cM. |
dominance |
If TRUE, create a cofactor list that is safe to use
with the dominance scan mode of MQM. See |
plot |
If TRUE, plots a genetic map displaying the selected markers as cofactors. |
verbose |
If TRUE, give verbose output. |
A list of cofactors to be used with mqmscan
.
Ritsert C Jansen; Danny Arends; Pjotr Prins; Karl W Broman [email protected]
The MQM tutorial: https://rqtl.org/tutorials/MQM-tour.pdf
MQM - MQM description and references
mqmscan - Main MQM single trait analysis
mqmscanall - Parallelized traits analysis
mqmaugment - Augmentation routine for estimating missing data
mqmautocofactors - Set cofactors using marker density
mqmsetcofactors - Set cofactors at fixed locations
mqmpermutation - Estimate significance levels
scanone - Single QTL scanning
data(hyper)                                     # hyper dataset
hyperfilled <- fill.geno(hyper)
cofactors <- mqmautocofactors(hyperfilled,15)   # Set 15 Cofactors
result <- mqmscan(hyperfilled,cofactors)        # Backward model selection
mqmgetmodel(result)
Extract the real markers from a cross object that includes pseudo markers
mqmextractmarkers(mqmresult)
mqmresult |
result from |
Returns a scanone object with the pseudo markers removed
Ritsert C Jansen; Danny Arends; Pjotr Prins; Karl W Broman [email protected]
The MQM tutorial: https://rqtl.org/tutorials/MQM-tour.pdf
MQM - MQM description and references
mqmscan - Main MQM single trait analysis
mqmscanall - Parallelized traits analysis
mqmaugment - Augmentation routine for estimating missing data
mqmautocofactors - Set cofactors using marker density
mqmsetcofactors - Set cofactors at fixed locations
mqmpermutation - Estimate significance levels
scanone - Single QTL scanning
data(multitrait)
multitrait <- fill.geno(multitrait)
result <- mqmscan(multitrait)
newresult <- mqmextractmarkers(result)
Fetch significant markers after permutation analysis. These markers can be used as cofactors for model selection in a forward stepwise approach.
mqmfind.marker(cross, mqmscan = NULL, perm = NULL, alpha = 0.05, verbose=FALSE)
cross |
An object of class |
mqmscan |
|
perm |
a |
alpha |
Threshold value, everything with significance < alpha is reported |
verbose |
Display more output on verbose=TRUE |
Returns a matrix with one row per significant marker (determined from the scanoneperm object) and with columns: markername, chr and pos (cM).
Ritsert C Jansen; Danny Arends; Pjotr Prins; Karl W Broman [email protected]
mqmprocesspermutation - Function called to convert results from an mqmpermutation into a scanoneperm object
The MQM tutorial: https://rqtl.org/tutorials/MQM-tour.pdf
MQM - MQM description and references
mqmscan - Main MQM single trait analysis
mqmscanall - Parallelized traits analysis
mqmaugment - Augmentation routine for estimating missing data
mqmautocofactors - Set cofactors using marker density
mqmsetcofactors - Set cofactors at fixed locations
mqmpermutation - Estimate significance levels
scanone - Single QTL scanning
# Use the multitrait dataset
data(multitrait)

# Set cofactors at every 3rd marker
cof <- mqmsetcofactors(multitrait,3)

# impute missing genotypes
multitrait <- fill.geno(multitrait)

# log transform the 7th phenotype
multitrait <- transformPheno(multitrait, 7)

# Run 50 permutations in batches of 10
## Not run: 
result <- mqmpermutation(multitrait,scanfunction=mqmscan,cofactors=cof,
                         pheno.col=7,n.perm=50,batchsize=10)
## End(Not run)

# Create a permutation object
f2perm <- mqmprocesspermutation(result)

# What LOD score is considered significant?
summary(f2perm)

# Find markers with a significant QTL effect (First run is original phenotype data)
marker <- mqmfind.marker(multitrait,result[[1]],f2perm)

# Print it to the screen
marker
Retrieves the QTL model used for scanning from the output of an MQM scan. The model only contains the selected cofactors significant at the specified cofactor.significance from the results of an mqm scan
mqmgetmodel(scanresult)
scanresult |
An object returned by |
The function returns the multiple QTL model created, which consists of the cofactors selected during the modeling phase of the algorithm.
This model was used when scanning for additional QTL in the mqmscan function. The format of the model is compatible with the
makeqtl
function. For more information about the format of the model see the makeqtl
page.
When no cofactor was selected in the modeling phase, no model was created and this function returns NULL.
Ritsert C Jansen; Danny Arends; Pjotr Prins; Karl W Broman [email protected]
mqmsetcofactors - Setting multiple cofactors for backward elimination
makeqtl - Make a qtl object
The MQM tutorial: https://rqtl.org/tutorials/MQM-tour.pdf
MQM - MQM description and references
mqmscan - Main MQM single trait analysis
mqmscanall - Parallelized traits analysis
mqmaugment - Augmentation routine for estimating missing data
mqmautocofactors - Set cofactors using marker density
mqmsetcofactors - Set cofactors at fixed locations
mqmpermutation - Estimate significance levels
scanone - Single QTL scanning
data(hyper)
hyperfilled <- fill.geno(hyper)
cofactors <- mqmsetcofactors(hyperfilled,4)
result <- mqmscan(hyperfilled,cofactors)
mqmgetmodel(result)
plot(mqmgetmodel(result))
Two randomization approaches to obtain estimates of QTL significance:
Random redistribution of traits (method='permutation')
Random redistribution of simulated trait values (method='simulation')
Calculations can be parallelized using the SNOW package.
mqmpermutation(cross, scanfunction=scanone, pheno.col=1, multicore=TRUE, n.perm=10, file="MQM_output.txt", n.cluster=1, method=c("permutation","simulation"), cofactors=NULL, plot=FALSE, verbose=FALSE, ...)
cross |
An object of class |
scanfunction |
Function to use when mapping QTL (either scanone, cim or mqmscan) |
pheno.col |
Column number in the phenotype matrix which should be used as the phenotype. This can be a vector of integers. |
multicore |
Use multicore (if available) |
n.perm |
Number of permutations to perform (DEFAULT=10, should be 1000, or higher, for publications) |
file |
Name of the intermediate output file used |
n.cluster |
Number of child processes to split the job into |
method |
What kind of randomization should be performed: permutation or simulation |
cofactors |
cofactors, only used when scanfunction is mqm.
List of cofactors to be analysed in the QTL model. To set cofactors use |
.
plot |
If TRUE, make a plot |
verbose |
If TRUE, print tracing information |
... |
Parameters passed through to the
|
Analysis of scanone, cim or mqmscan to scan for QTL in shuffled/randomized data. It is recommended to also install the snow library. The snow library allows calculations to run on multiple cores, or even to scale up to an entire cluster, thus speeding up calculation.
Returns a mqmmulti object. This object is a list of scanone objects that can be plotted using plot.scanone(result[[trait]])
Ritsert C Jansen; Danny Arends; Pjotr Prins; Karl W Broman [email protected]
Bruno M. Tesson, Ritsert C. Jansen (2009) Chapter 3.7. Determining the significance threshold eQTL Analysis in Mice and Rats 1, 20–25
Churchill, G. A. and Doerge, R. W. (1994) Empirical threshold values for quantitative trait mapping. Genetics 138, 963–971.
Rossini, A., Tierney, L., and Li, N. (2003), Simple parallel statistical computing. R. UW Biostatistics working paper series University of Washington. 193
Tierney, L., Rossini, A., Li, N., and Sevcikova, H. (2004), The snow Package: Simple Network of Workstations. Version 0.2-1.
The MQM tutorial: https://rqtl.org/tutorials/MQM-tour.pdf
MQM
- MQM description and references
mqmscan
- Main MQM single trait analysis
mqmscanall
- Parallelized traits analysis
mqmaugment
- Augmentation routine for estimating missing data
mqmautocofactors
- Set cofactors using marker density
mqmsetcofactors
- Set cofactors at fixed locations
mqmpermutation
- Estimate significance levels
scanone
- Single QTL scanning
# Use the multitrait dataset
data(multitrait)
multitrait <- calc.genoprob(multitrait)
result <- mqmpermutation(multitrait,pheno.col=7, n.perm=2, batchsize=2)

## Not run:
#Set 50 cofactors
cof <- mqmautocofactors(multitrait,50)
## End(Not run)

multitrait <- fill.geno(multitrait)
result <- mqmpermutation(multitrait,scanfunction=mqmscan,cofactors=cof,
                         pheno.col=7, n.perm=2,batchsize=2,verbose=FALSE)

#Create a permutation object
f2perm <- mqmprocesspermutation(result)

#Get Significant LOD thresholds
summary(f2perm)
Circular genome plot - shows QTL locations and relations.
mqmplot.circle(cross,result,highlight=0,spacing=25, interactstrength=2,
               axis.legend=TRUE, col.legend=FALSE, verbose=FALSE,
               transparency=FALSE)
cross |
An object of class |
result |
An object of class |
highlight |
With a mqmmulti object, highlight this phenotype (value between one and the number of results in the mqmmulti object) |
interactstrength |
When highlighting a trait, consider interactions significant if they have a change of more than interactstrength*SEs. A higher value will show fewer interactions. However, the interactions reported at higher interactstrength values will generally be more reliable. |
spacing |
User defined spacing between chromosomes in cM |
axis.legend |
When set to FALSE, suppresses the legends (defaults to plotting legends beside the axis). |
col.legend |
With a mqmmulti object, plots a legend for the non-highlighted version |
transparency |
Use transparency when drawing the plots (defaults to no transparency) |
verbose |
Be verbose |
Depending on whether the result is a scanone or a mqmmulti object, a different plot is drawn.
If model information is present from mqmscan (by setting cofactors), this will be highlighted in red (see example).
If phenotypes have genetic locations (e.g. eQTL), they will be plotted on the genome; otherwise phenotypes will be plotted in the middle of the circle (with a small offset). Locations can be added by using the addloctocross function.
Plotting routine, no return
Danny Arends [email protected]
The MQM tutorial: https://rqtl.org/tutorials/MQM-tour.pdf
MQM
- MQM description and references
mqmscan
- Main MQM single trait analysis
mqmscanall
- Parallelized traits analysis
mqmaugment
- Augmentation routine for estimating missing data
mqmautocofactors
- Set cofactors using marker density
mqmsetcofactors
- Set cofactors at fixed locations
mqmpermutation
- Estimate significance levels
scanone
- Single QTL scanning
data(multitrait)
data(locations)
multifilled <- fill.geno(multitrait)                   # impute missing genotypes
multicof <- mqmsetcofactors(multitrait,10)             # create cofactors
multiloc <- addloctocross(multifilled,locations)       # add phenotype information to cross
multires <- mqmscanall(multifilled,cofactors=multicof) # run mqmscan for all phenotypes

#Basic mqmmulti, color = trait, round circle = significant
mqmplot.circle(multifilled,multires)
#mqmmulti with locations of traits in multiloc
mqmplot.circle(multiloc,multires)
#mqmmulti with highlighting
mqmplot.circle(multitrait,multires,highlight=3)
#mqmmulti with locations of traits in multiloc and highlighting
mqmplot.circle(multiloc,multires,highlight=3)
Plot results for a genome scan using a multiple-QTL model. With genetic locations for the traits it is possible to show cis- and trans- locations, and detect trans-bands.
mqmplot.cistrans(result, cross, threshold=5, onlyPEAK=TRUE, highPEAK=FALSE,
                 cisarea=10, pch=22, cex=0.5, verbose=FALSE, ...)
result |
An object of class |
cross |
An object of class |
threshold |
Threshold value in LOD. Markers that have a
LOD score above this threshold are plotted as small squares
(see |
onlyPEAK |
Plot only the peak markers? (TRUE/FALSE)
(Peak markers are markers that have a QTL likelihood above
|
highPEAK |
Highlight peak markers? (TRUE/FALSE). When using this option, peak markers (the marker with the highest LOD score in a region above the threshold) get a 25% increase in size and are displayed in red. |
cisarea |
Adjust the two green lines around the line y=x |
pch |
What kind of character is used in plotting of the figure (Default: 22, small square) |
cex |
Size of the points plotted (defaults to 0.5, half of the original size) |
verbose |
If TRUE, give verbose output |
... |
Extra parameters will be passed to points |
Plotting routine, so no return
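To make the cis/trans distinction concrete, here is a small base-R sketch that labels each trait's peak QTL as cis or trans. It is not taken from mqmplot.cistrans: the toy positions, the variable names, and the rule that "cis" means same chromosome and within cisarea cM of the trait's own location are all assumptions for illustration.

```r
# Hypothetical trait locations and peak QTL locations (chromosome, cM)
trait_chr <- c(1, 1, 2, 3)
trait_pos <- c(10, 50, 30, 5)
qtl_chr   <- c(1, 2, 2, 3)
qtl_pos   <- c(14, 48, 70, 9)

# Assumed reading of cisarea: maximum distance (cM) from the trait location
cisarea <- 10

# cis: QTL on the trait's own chromosome and close to the trait location
is_cis <- qtl_chr == trait_chr & abs(qtl_pos - trait_pos) <= cisarea
label  <- ifelse(is_cis, "cis", "trans")
data.frame(trait_chr, trait_pos, qtl_chr, qtl_pos, label)
```

In the plot, cis QTLs would fall close to the line y=x, while a vertical stack of trans QTLs at one genome position would suggest a trans-band.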
Danny Arends [email protected]
The MQM tutorial: https://rqtl.org/tutorials/MQM-tour.pdf
MQM
- MQM description and references
mqmscan
- Main MQM single trait analysis
mqmscanall
- Parallelized traits analysis
mqmaugment
- Augmentation routine for estimating missing data
mqmautocofactors
- Set cofactors using marker density
mqmsetcofactors
- Set cofactors at fixed locations
mqmpermutation
- Estimate significance levels
scanone
- Single QTL scanning
data(multitrait)
data(locations)
multiloc <- addloctocross(multitrait,locations)
multiloc <- calc.genoprob(multiloc)
results <- scanall(multiloc, method="hk")
mqmplot.cistrans(results, multiloc, 5, FALSE, TRUE)
Plot the results from a MQM scan on multiple phenotypes.
mqmplot.clusteredheatmap(cross, mqmresult, directed=TRUE, legend=FALSE,
                         Colv=NA, scale="none", verbose=FALSE,
                         breaks = c(-100,-10,-3,0,3,10,100),
                         col = c("darkblue","blue","lightblue","yellow",
                                 "orange","red"), ...)
cross |
An object of class |
mqmresult |
Result object from mqmscanall, the object needs to be of class |
directed |
Take direction of QTLs into account (takes more time because of QTL direction calculations) |
legend |
If TRUE, add a legend to the plot |
Colv |
Cluster only the rows; the columns (markers) should not be clustered |
scale |
Character indicating if the values should be centered and scaled in either the row direction or the column direction, or none. The default is "none". |
verbose |
If TRUE, give verbose output. |
breaks |
Color break points for the LOD scores |
col |
Colors used between breaks |
... |
Additional arguments passed to |
Danny Arends [email protected]
The MQM tutorial: https://rqtl.org/tutorials/MQM-tour.pdf
MQM
- MQM description and references
mqmscan
- Main MQM single trait analysis
mqmscanall
- Parallelized traits analysis
mqmaugment
- Augmentation routine for estimating missing data
mqmautocofactors
- Set cofactors using marker density
mqmsetcofactors
- Set cofactors at fixed locations
mqmpermutation
- Estimate significance levels
scanone
- Single QTL scanning
data(multitrait)
multitrait <- fill.geno(multitrait) # impute missing genotype data
result <- mqmscanall(multitrait, logtransform=TRUE)
cresults <- mqmplot.clusteredheatmap(multitrait,result)
groupclusteredheatmap(multitrait,cresults,10)
Plots cofactors as created by mqmsetcofactors or mqmautocofactors on the genetic map.
mqmplot.cofactors(cross,cofactors, ...)
cross |
An object of class |
cofactors |
List of cofactors to be analysed in the QTL model. To set cofactors use |
.
... |
Passed to |
Danny Arends [email protected]
The MQM tutorial: https://rqtl.org/tutorials/MQM-tour.pdf
MQM
- MQM description and references
mqmscan
- Main MQM single trait analysis
mqmscanall
- Parallelized traits analysis
mqmaugment
- Augmentation routine for estimating missing data
mqmautocofactors
- Set cofactors using marker density
mqmsetcofactors
- Set cofactors at fixed locations
mqmpermutation
- Estimate significance levels
scanone
- Single QTL scanning
data(multitrait)
cof1 <- mqmsetcofactors(multitrait,20)
cof2 <- mqmsetcofactors(multitrait,10)
op <- par(mfrow=c(2,1))
mqmplot.cofactors(multitrait,cof1,col="blue")
mqmplot.cofactors(multitrait,cof2,col="blue")
op <- par(mfrow=c(1,1))
Plot the LOD*Effect curve for a genome scan with a multiple-QTL model (the output of mqmscan).
mqmplot.directedqtl(cross, mqmresult, pheno.col=1, draw = TRUE)
cross |
An object of class |
mqmresult |
Results from mqmscan of type |
pheno.col |
The phenotype in the cross object from which the results were calculated |
draw |
If TRUE, draw the figure. |
Returns a scanone object, with the effect sign added, as calculated internally by the function effect.scan. For more info on the scanone object see: scanone
Danny Arends [email protected]
The MQM tutorial: https://rqtl.org/tutorials/MQM-tour.pdf
MQM
- MQM description and references
mqmscan
- Main MQM single trait analysis
mqmscanall
- Parallelized traits analysis
mqmaugment
- Augmentation routine for estimating missing data
mqmautocofactors
- Set cofactors using marker density
mqmsetcofactors
- Set cofactors at fixed locations
mqmpermutation
- Estimate significance levels
scanone
- Single QTL scanning
#Simulated F2 Population
f2qtl <- c(3,15,1,0)                               # QTL at chromosome 3
data(map10)                                        # Mouse genetic map
f2cross <- sim.cross(map10,f2qtl,n=100,type="f2")  # Simulate a F2 Cross
f2cross <- fill.geno(f2cross)                      # Fill in missing genotypes
f2result <- mqmscan(f2cross)                       # Do a MQM scan of the genome
mqmplot.directedqtl(f2cross,f2result)
Plotting routine to display a heatmap of results obtained from a multiple-QTL model on multiple phenotypes (the output of mqmscanall).
mqmplot.heatmap(cross, result, directed=TRUE, legend=FALSE,
                breaks = c(-100,-10,-3,0,3,10,100),
                col = c("darkblue","blue","lightblue","yellow","orange","red"),
                ...)
cross |
An object of class |
result |
Result object from mqmscanall, the object needs to be of class |
directed |
Take direction of QTLs into account (takes more time because of QTL direction calculations) |
legend |
If TRUE, add a legend to the plot |
breaks |
Color break points for the LOD scores |
col |
Colors used between breaks |
... |
Additional arguments passed to the |
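The relationship between breaks and col can be sketched in base R: seven break points define six intervals, one per colour. The example below uses cut() to bin some made-up signed LOD values (negative values standing in for QTLs with a negative effect when directed=TRUE). The values and variable names are assumptions for illustration, and the plotting routine itself may treat interval boundaries differently.

```r
# Default break points and colours from the usage above
breaks <- c(-100, -10, -3, 0, 3, 10, 100)
cols   <- c("darkblue", "blue", "lightblue", "yellow", "orange", "red")

# Hypothetical signed LOD scores: sign = effect direction, size = LOD
signed_lod <- c(-12.5, -4, -1, 0.5, 3.2, 25)

# cut() returns the index of the interval each value falls into
bin    <- cut(signed_lod, breaks = breaks, labels = FALSE)
colour <- cols[bin]
data.frame(signed_lod, colour)
```

When supplying custom values, col should have one element fewer than breaks.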
Danny Arends [email protected]
The MQM tutorial: https://rqtl.org/tutorials/MQM-tour.pdf
MQM
- MQM description and references
mqmscan
- Main MQM single trait analysis
mqmscanall
- Parallelized traits analysis
mqmaugment
- Augmentation routine for estimating missing data
mqmautocofactors
- Set cofactors using marker density
mqmsetcofactors
- Set cofactors at fixed locations
mqmpermutation
- Estimate significance levels
scanone
- Single QTL scanning
data(multitrait)
multitrait <- fill.geno(multitrait) # impute missing genotype data
result <- mqmscanall(multitrait, logtransform=TRUE)
mqmplot.heatmap(multitrait,result)
Plotting routine to display the results from a multiple-QTL model on multiple phenotypes. It supports four different visualizations: a contour map, a heatmap, a 3D graph or a multiple QTL plot created by using plot.scanone on the mqmmulti object.
mqmplot.multitrait(result, type=c("lines","image","contour","3Dplot"),
                   group=NULL, meanprofile=c("none","mean","median"),
                   theta=30, phi=15, ...)
result |
Result object from |
type |
Selection of the plot method to visualize the data: "lines" (default plotting option), "image", "contour" and "3Dplot" |
group |
A numeric vector indicating which traits to plot. NULL means no grouping |
meanprofile |
Plot a mean/median profile from the group selected |
theta |
Horizontal axis rotation in a 3D plot |
phi |
Vertical axis rotation in a 3D plot |
... |
Additional arguments passed to |
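What a mean or median profile over a group of traits amounts to can be sketched in base R. The matrix of toy LOD profiles below (traits in rows, genome positions in columns) and all variable names are assumptions for illustration; this is not the code used inside mqmplot.multitrait.

```r
# Toy LOD profiles: 5 traits (rows) scanned at 8 positions (columns)
set.seed(3)
lod <- matrix(rexp(5 * 8), nrow = 5,
              dimnames = list(paste0("trait", 1:5), NULL))

# Numeric vector selecting the traits that form the group
group <- c(1, 3, 4)
sub   <- lod[group, , drop = FALSE]

# One summary value per genome position
mean_profile   <- colMeans(sub)
median_profile <- apply(sub, 2, median)
rbind(mean_profile, median_profile)
```

The median profile is less sensitive than the mean to a single trait with a very high LOD score at a given position.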
Danny Arends [email protected]
The MQM tutorial: https://rqtl.org/tutorials/MQM-tour.pdf
MQM
- MQM description and references
mqmscan
- Main MQM single trait analysis
mqmscanall
- Parallelized traits analysis
mqmaugment
- Augmentation routine for estimating missing data
mqmautocofactors
- Set cofactors using marker density
mqmsetcofactors
- Set cofactors at fixed locations
mqmpermutation
- Estimate significance levels
scanone
- Single QTL scanning
data(multitrait)
multitrait <- fill.geno(multitrait) # impute missing genotype data
result <- mqmscanall(multitrait, logtransform=TRUE)
mqmplot.multitrait(result,"lines")
mqmplot.multitrait(result,"contour")
mqmplot.multitrait(result,"image")
mqmplot.multitrait(result,"3Dplot")
Plotting routine to display the results from a permutation QTL scan (the output of mqmpermutation).
mqmplot.permutations(permutationresult, ...)
permutationresult |
|
... |
Extra arguments passed to |
No value returned (plotting routine)
Danny Arends [email protected] , Rutger Brouwer
The MQM tutorial: https://rqtl.org/tutorials/MQM-tour.pdf
MQM
- MQM description and references
mqmscan
- Main MQM single trait analysis
mqmscanall
- Parallelized traits analysis
mqmaugment
- Augmentation routine for estimating missing data
mqmautocofactors
- Set cofactors using marker density
mqmsetcofactors
- Set cofactors at fixed locations
mqmpermutation
- Estimate significance levels
scanone
- Single QTL scanning
# Simulated F2 Population
# QTL at chromosome 3
f2qtl <- c(3,15,1,0)
# Mouse genetic map
data(map10)
# Simulate a F2 Cross
f2cross <- sim.cross(map10,f2qtl,n=100,type="f2")
f2cross <- calc.genoprob(f2cross)

## Not run:
# Permutations to obtain significance threshold
f2result <- mqmpermutation(f2cross, n.perm=1000, method="permutation")
## End(Not run)

# Plot results
mqmplot.permutations(f2result)
Plot the LOD curve for a genome scan for a single trait, with a multiple-QTL model (the output of mqmscan).
mqmplot.singletrait(result, extended = 0 ,...)
result |
|
extended |
Extended plotting of the information content |
... |
Extra arguments passed to |
Danny Arends [email protected]
The MQM tutorial: https://rqtl.org/tutorials/MQM-tour.pdf
MQM
- MQM description and references
mqmscan
- Main MQM single trait analysis
mqmscanall
- Parallelized traits analysis
mqmaugment
- Augmentation routine for estimating missing data
mqmautocofactors
- Set cofactors using marker density
mqmsetcofactors
- Set cofactors at fixed locations
mqmpermutation
- Estimate significance levels
scanone
- Single QTL scanning
#Simulated F2 Population
f2qtl <- c(3,15,1,0)                               # QTL at chromosome 3
data(map10)                                        # Mouse genetic map
f2cross <- sim.cross(map10,f2qtl,n=100,type="f2")  # Simulate a F2 Cross
f2cross <- mqmaugment(f2cross)
f2result <- mqmscan(f2cross)                       # Do a MQM scan of the genome
mqmplot.singletrait(f2result)                      # Use our fancy plotting routine
Function to convert mqmmulti objects into a scanoneperm object. This allows the use of R/qtl methods for permutation analysis that do not support the output structure of a multiple QTL scan using mqm.
mqmprocesspermutation(mqmpermutationresult = NULL)
mqmpermutationresult |
|
Output of the algorithm is a scanoneperm object. See also: summary.scanoneperm
Ritsert C Jansen; Danny Arends; Pjotr Prins; Karl W Broman [email protected]
The MQM tutorial: https://rqtl.org/tutorials/MQM-tour.pdf
MQM
- MQM description and references
mqmscan
- Main MQM single trait analysis
mqmscanall
- Parallelized traits analysis
mqmaugment
- Augmentation routine for estimating missing data
mqmautocofactors
- Set cofactors using marker density
mqmsetcofactors
- Set cofactors at fixed locations
mqmpermutation
- Estimate significance levels
scanone
- Single QTL scanning
# QTL at chromosome 3
f2qtl <- c(3,15,1,0)
# Mouse genetic map
data(map10)
# Simulate a F2 Cross
f2cross <- sim.cross(map10,f2qtl,n=100,type="f2")

## Not run:
# Bootstrap MQM mapping on the f2cross
f2result <- mqmpermutation(f2cross,scanfunction=mqmscan)
## End(Not run)

# Create a permutation object
f2perm <- mqmprocesspermutation(f2result)
# What LOD score is considered significant?
summary(f2perm)
Genome scan with a multiple QTL model.
mqmscan(cross, cofactors=NULL, pheno.col = 1,
        model=c("additive","dominance"), forceML=FALSE,
        cofactor.significance=0.02, em.iter=1000,
        window.size=25.0, step.size=5.0,
        logtransform = FALSE, estimate.map = FALSE,
        plot=FALSE, verbose=FALSE, outputmarkers=TRUE,
        multicore=TRUE, batchsize=10, n.clusters=1,
        test.normality=FALSE, off.end=0)
cross |
An object of class |
cofactors |
List of cofactors to be analysed as cofactors in backward elimination
procedure when building the QTL model. See |
pheno.col |
Column number in the phenotype matrix which should be used as the phenotype. This can be a vector of integers; one may also give character strings matching the phenotype names. Finally, one may give a numeric vector of phenotype IDs. This should consist of integers with 0 < value < no. phenotypes. |
model |
When scanning for QTLs, should haplotype dominance be considered in an F2 intercross? Using the dominance model we scan for additive effects but also allow additional effects contrasting AA+AB versus BB and AA versus AB+BB. This setting is ignored for BC and RIL populations. |
forceML |
Specify which statistical method to use to estimate variance components during QTL modeling and mapping. The default is restricted maximum likelihood (REML). With this option a user can disable REML and use maximum likelihood. |
cofactor.significance |
Significance level at which a cofactor is considered significant. This is estimated using an analysis of deviance, and compared to the level specified by the user. The cofactors that do not reach this level of statistical significance are NOT used in the mapping stage. Value between 0 and 1. |
em.iter |
Maximum number of iterations for the EM algorithm to converge |
window.size |
Window size for mapping QTL locations, this parameter is used in the interval mapping stage. When calculating LOD scores at a genomic position all cofactors within window.size are dropped to estimate the (unbiased) effect of the location under interest. |
step.size |
Step size used in interval mapping. A lower step.size parameter increases the number of output points, this creates a smoother QTL profile |
off.end |
Distance (in cM) past the terminal markers on each chromosome to which the genotype simulations will be carried. |
logtransform |
Indicate if the algorithm should do a log transformation on the trait data in the pheno.col |
estimate.map |
Should re-estimation of the marker locations on the genetic map occur before mapping QTLs? This method is deprecated; rather use the |
plot |
plot the results (default FALSE) |
verbose |
verbose output |
outputmarkers |
If TRUE (the default), the results include the marker locations as well as along a grid of pseudomarkers; if FALSE, the results include only the grid positions. |
multicore |
Use multicore (if available) |
batchsize |
Number of traits being analyzed as a batch. |
n.clusters |
Number of child processes to split the job into. |
test.normality |
If TRUE, test whether the phenotype follows a
normal distribution via |
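The roles of step.size and window.size can be pictured with a short base-R sketch. It is only a conceptual illustration, not the MQM implementation: the chromosome length, the cofactor positions, the variable names, and the reading of "within window.size" as an absolute distance in cM from the tested position are all assumptions.

```r
# Assumed toy chromosome of 60 cM with the default settings from the usage
chr_length  <- 60
step.size   <- 5
window.size <- 25

# Positions at which LOD scores would be evaluated along the chromosome
grid <- seq(0, chr_length, by = step.size)

# Hypothetical cofactor positions (cM) on the same chromosome
cofactor_pos <- c(2, 32, 58)

# When testing position 30, cofactors closer than window.size are dropped
pos  <- 30
kept <- cofactor_pos[abs(cofactor_pos - pos) > window.size]
kept
```

A smaller step.size gives a denser grid and a smoother profile at the cost of more computation; a larger window.size removes more nearby cofactors when a position is tested.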
When scanning a single phenotype the function returns a scanone object. The object contains a matrix of three columns for LOD scores, information content and LOD*information content, with pseudomarkers sorted in increasing order. For more information on the scanone object see: scanone
The resulting scanone object itself can be visualized using the standard R/qtl plotting routines (plot.scanone) or a specialized function to show the mqm model (mqmplot.singletrait) and QTL profile. If cofactors were specified, the QTL model used in scanning is also returned as a named attribute of the scanone object called mqmmodel. It can be extracted from the resulting scanone object by using the mqmgetmodel function or the attr function.
Also note that the estimate.map parameter does not return its re-estimated genetic map, although it is used internally. When scanning multiple phenotypes a mqmmulti object is created. This object is just a list composed of scanone objects. The results for a single trait can be obtained from the mqmmulti object, in scanone format.
Ritsert C Jansen; Danny Arends; Pjotr Prins; Karl W Broman [email protected]
The MQM tutorial: https://rqtl.org/tutorials/MQM-tour.pdf
MQM
- MQM description and references
mqmscan
- Main MQM single trait analysis
mqmscanall
- Parallelized traits analysis
mqmaugment
- Augmentation routine for estimating missing data
mqmautocofactors
- Set cofactors using marker density
mqmsetcofactors
- Set cofactors at fixed locations
mqmpermutation
- Estimate significance levels
scanone
- Single QTL scanning
data(map10)                    # Genetic map modeled after mouse

# simulate a cross (autosomes 1-10)
qtl <- c(3,15,1,0)             # QTL model: chr, pos'n, add've & dom effects
cross <- sim.cross(map10[1:10],qtl,n=100,missing.prob=0.01)

# MQM
crossaug <- mqmaugment(cross)  # Augmentation
cat(crossaug$mqm$Nind,'real individuals retained in dataset',
    crossaug$mqm$Naug,'individuals augmented\n')

result <- mqmscan(crossaug)    # Scan

# show LOD interval of the QTL on chr 3
lodint(result,chr=3)
Parallelized QTL analysis using MQM on multiple phenotypes in a cross object (uses SNOW)
mqmscanall(cross, multicore=TRUE, n.clusters = 1,batchsize=10,cofactors=NULL, ...)
cross |
An object of class |
multicore |
Use multiple cores (only if the package SNOW is available, otherwise this setting will be ignored) |
n.clusters |
Number of parallel processes to spawn, recommended is setting this lower than the number of cores in the computer |
batchsize |
Batch size. The entire set is split into jobs to reduce memory load per core. Each job contains batchsize traits. |
cofactors |
cofactors, only used when scanfunction is mqmscan.
List of cofactors to be analysed in the QTL model. To set cofactors use |
.
... |
Parameters passed through to the |
Uses mqmscan to scan for QTLs for each phenotype in the cross object. It is recommended that the package SNOW is installed before using this function on large numbers of phenotypes.
Returns a mqmmulti object. This object is a list of scanone objects that can be plotted using plot.scanone(result[[trait]]) or using mqmplot.multitrait(result)
Ritsert C Jansen; Danny Arends; Pjotr Prins; Karl W Broman [email protected]
Rossini, A., Tierney, L., and Li, N. (2003), Simple parallel statistical computing. R. UW Biostatistics working paper series University of Washington. 193
Tierney, L., Rossini, A., Li, N., and Sevcikova, H. (2004), The snow Package: Simple Network of Workstations. Version 0.2-1.
The MQM tutorial: https://rqtl.org/tutorials/MQM-tour.pdf
MQM
- MQM description and references
mqmscan
- Main MQM single trait analysis
mqmscanall
- Parallelized traits analysis
mqmaugment
- Augmentation routine for estimating missing data
mqmautocofactors
- Set cofactors using marker density
mqmsetcofactors
- Set cofactors at fixed locations
mqmpermutation
- Estimate significance levels
scanone
- Single QTL scanning
#Doing a multitrait analysis
data(multitrait)
multitrait <- calc.genoprob(multitrait)
cof <- mqmsetcofactors(multitrait,3)
multitrait <- fill.geno(multitrait)
result <- mqmscanall(multitrait,cofactors=cof,batchsize=5)
mqmplot.multitrait(result,"lines")
Estimate the false discovery rate (FDR) for multiple trait analysis
mqmscanfdr(cross, scanfunction=mqmscanall,
           thresholds=c(1,2,3,4,5,7,10,15,20),
           n.perm=10, verbose=FALSE, ...)
cross |
An object of class |
scanfunction |
QTL mapping function. Note: must use scanall or mqmscanall, otherwise this will not produce useful results. Reason: we need a function that maps all traits, because the correlation structure between traits is not changed during permutation (valid options: scanall or mqmscanall). |
thresholds |
False discovery rate (FDR) is calculated for peaks above these LOD thresholds (DEFAULT=range from 1 to 20, using 9 thresholds). The parameter is a list of LOD scores at which FDR is calculated. |
n.perm |
Number of permutations (DEFAULT=10 for quick analysis, however for publications use 1000, or higher) |
verbose |
verbose output |
... |
Parameters passed to the mapping function |
This function wraps the analysis of scanone, cim and mqmscan to scan for QTL in shuffled/randomized data. It is recommended to also install the snow library for parallelization of calculations. The snow library allows calculations to run on multiple cores or even to scale up to an entire cluster, thus speeding up calculation by up to the number of computers used.
Returns a data.frame with 3 columns: FalsePositives, FalseNegatives and False Discovery Rates. Each row corresponds to one of the user-specified thresholds and gives the values of the 3 columns at that threshold.
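One common way to estimate a permutation-based FDR at a set of LOD thresholds is to compare the number of scores above each threshold in the real data with the average number above it in permuted data. The base-R sketch below illustrates that idea on simulated numbers only. It is an assumption about the general approach, not a description of what mqmscanfdr computes internally, and all variable names and distributions are made up for illustration (it also does not compute false negatives).

```r
set.seed(42)
thresholds <- c(1, 2, 3, 4, 5)

# Hypothetical observed LOD scores: mostly null plus 20 genuine signals
observed_lod <- c(rexp(200, rate = 1), runif(20, 4, 8))

# Hypothetical null LOD scores from 10 permutations (one column each)
n_perm   <- 10
perm_lod <- replicate(n_perm, rexp(220, rate = 1))

# Number of scores above each threshold
count_above <- function(lod, thr) sapply(thr, function(t) sum(lod > t))

observed_hits  <- count_above(observed_lod, thresholds)
expected_false <- rowMeans(apply(perm_lod, 2, count_above, thr = thresholds))

# Estimated FDR: expected false hits divided by observed hits, capped at 1
fdr <- pmin(1, expected_false / pmax(observed_hits, 1))
data.frame(threshold = thresholds, observed_hits, expected_false, fdr)
```

With only a handful of permutations the estimates are noisy, which is consistent with the advice above to use 1000 or more permutations for publication.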
Ritsert C Jansen; Danny Arends; Pjotr Prins; Karl W Broman [email protected]
Bruno M. Tesson, Ritsert C. Jansen (2009) Chapter 3.7. Determining the significance threshold eQTL Analysis in Mice and Rats 1, 20–25
Churchill, G. A. and Doerge, R. W. (1994) Empirical threshold values for quantitative trait mapping. Genetics 138, 963–971.
Rossini, A., Tierney, L., and Li, N. (2003), Simple parallel statistical computing. R. UW Biostatistics working paper series University of Washington. 193
Tierney, L., Rossini, A., Li, N., and Sevcikova, H. (2004), The snow Package: Simple Network of Workstations. Version 0.2-1.
The MQM tutorial: https://rqtl.org/tutorials/MQM-tour.pdf
MQM
- MQM description and references
mqmscan
- Main MQM single trait analysis
mqmscanall
- Parallelized traits analysis
mqmaugment
- Augmentation routine for estimating missing data
mqmautocofactors
- Set cofactors using marker density
mqmsetcofactors
- Set cofactors at fixed locations
mqmpermutation
- Estimate significance levels
scanone
- Single QTL scanning
data(multitrait)
# impute missing genotype data
multitrait <- fill.geno(multitrait)
## Not run: 
# Calculate the thresholds
result <- mqmscanfdr(multitrait, threshold=10.0, n.perm=1000)
## End(Not run)
Set cofactors, at fixed marker intervals. Together
with mqmscan
cofactors are selected through backward elimination.
mqmsetcofactors(cross, each = NULL, cofactors=NULL, sexfactors=NULL, verbose=FALSE)
cross |
An object of class |
each |
Every 'each' marker will be used as a cofactor, when each is used the |
cofactors |
List of cofactors to be analysed in the QTL model. To set cofactors use |
sexfactors |
list of markers which should be treated as dominant cofactors (sexfactors), when |
verbose |
If TRUE, print tracing information. |
A list of cofactors to be passed into mqmscan
.
Ritsert C Jansen; Danny Arends; Pjotr Prins; Karl W Broman [email protected]
The MQM tutorial: https://rqtl.org/tutorials/MQM-tour.pdf
MQM
- MQM description and references
mqmscan
- Main MQM single trait analysis
mqmscanall
- Parallelized traits analysis
mqmaugment
- Augmentation routine for estimating missing data
mqmautocofactors
- Set cofactors using marker density
mqmsetcofactors
- Set cofactors at fixed locations
mqmpermutation
- Estimate significance levels
scanone
- Single QTL scanning
data(hyper) # Hyper dataset
hyperfilled <- fill.geno(hyper)
# Automatic cofactors every third marker
cofactors <- mqmsetcofactors(hyperfilled,3)
result <- mqmscan(hyperfilled,cofactors) # Backward model selection
mqmgetmodel(result)
#Manual cofactors at markers 3,6,9,12,40 and 60
cofactors <- mqmsetcofactors(hyperfilled,cofactors=c(3,6,9,12,40,60))
result <- mqmscan(hyperfilled,cofactors) # Backward model selection
mqmgetmodel(result)
Wraps the Shapiro-Wilk normality test (shapiro.test, in the stats package). This function is used in MQM to test the normality of the trait under investigation.
mqmtestnormal(cross, pheno.col = 1,significance=0.05, verbose=FALSE)
cross |
An object of class |
pheno.col |
Column number in the phenotype matrix which should be used as the phenotype. This can be a vector of integers. |
significance |
Significance level used in the normality test. Lower significance levels will accept larger deviations from normality. |
verbose |
If TRUE, print result as well as return it. |
For augmented data (as from mqmaugment
), the cross
is first reduced to distinct individuals. Furthermore, the Shapiro-Wilk test used to
test normality works only for 3 <= nind(cross) <= 5000.
Boolean indicating normality of the trait in pheno.col. (FALSE when not normally distributed.)
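The idea of turning a Shapiro-Wilk test into a TRUE/FALSE answer can be sketched in base R as follows. The helper is_normal_sketch is a hypothetical name and works on a plain numeric vector rather than a cross object; it only mirrors the description above (a significance cutoff and the 3 to 5000 sample size limit) and is not the package's own code.

```r
# Minimal sketch, assuming the phenotype is available as a numeric vector.
is_normal_sketch <- function(x, significance = 0.05) {
  x <- x[!is.na(x)]                 # drop missing phenotypes
  n <- length(x)
  if (n < 3 || n > 5000) {
    stop("shapiro.test needs between 3 and 5000 non-missing values")
  }
  # TRUE when normality is not rejected at the given significance level
  shapiro.test(x)$p.value > significance
}

# deterministic toy data: evenly spaced quantiles of two distributions
symmetric <- qnorm(ppoints(200))    # close to a perfect normal sample
skewed    <- qexp(ppoints(200))     # strongly right-skewed

is_normal_sketch(symmetric)
is_normal_sketch(skewed)
```

A lower significance value makes the check more permissive, as noted for the significance argument.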
Danny Arends [email protected]
shapiro.test
- Function wrapped by our mqmtestnormal
The MQM tutorial: https://rqtl.org/tutorials/MQM-tour.pdf
MQM
- MQM description and references
mqmscan
- Main MQM single trait analysis
mqmscanall
- Parallelized traits analysis
mqmaugment
- Augmentation routine for estimating missing data
mqmautocofactors
- Set cofactors using marker density
mqmsetcofactors
- Set cofactors at fixed locations
mqmpermutation
- Estimate significance levels
scanone
- Single QTL scanning
data(multitrait)
# test normality of 7th phenotype
mqmtestnormal(multitrait, pheno.col=7)
# take log
multitrait <- transformPheno(multitrait, pheno.col=7, transf=log)
# test again
mqmtestnormal(multitrait, pheno.col=7)
Cross object from R/qtl: an object of class cross. See read.cross for details.
data(multitrait)
Cross object from R/QTL
Arabidopsis recombinant inbred lines by selfing. There are 162 lines, 24 phenotypes, and 117 markers on 5 chromosomes.
Part of the Arabidopsis RIL selfing experiment with Landsberg erecta (Ler) and Cape Verde Islands (Cvi) with 162 individuals scored (with errors at) 117 markers. Dataset obtained from GBIC - Groningen BioInformatics Centre
Keurentjes, J. J. and Fu, J. and de Vos, C. H. and Lommen, A. and Hall, R. D. and Bino, R. J. and van der Plas, L. H. and Jansen, R. C. and Vreugdenhil, D. and Koornneef, M. (2006), The genetics of plant metabolism. Nature Genetics. 38-7, 842–849.
Alonso-Blanco, C. and Peeters, A. J. and Koornneef, M. and Lister, C. and Dean, C. and van den Bosch, N. and Pot, J. and Kuiper, M. T. (1998), Development of an AFLP based linkage map of Ler, Col and Cvi Arabidopsis thaliana ecotypes and construction of a Ler/Cvi recombinant inbred line population. Plant J. 14(2), 259–271.
data(multitrait) # Load dataset
multitrait <- fill.geno(multitrait) # impute missing genotype data
result <- mqmscanall(multitrait, logtransform=TRUE) # Analyse all 24 traits
Determine the number of chromosomes in a cross or map object.
nchr(object)
object |
An object of class |
The number of chromosomes in the input.
Karl W Broman, [email protected]
read.cross
, plot.cross
,
summary.cross
,
nind
,
totmar
,
nmar
,
nphe
data(fake.f2)
nchr(fake.f2)
map <- pull.map(fake.f2)
nchr(map)
Determine the number of individuals in a cross object.
nind(object)
object |
An object of class |
The number of individuals in the input cross object.
Karl W Broman, [email protected]
read.cross
, plot.cross
,
summary.cross
,
nmar
,
nchr
,
totmar
,
nphe
data(fake.f2)
nind(fake.f2)
Determine the number of markers on each chromosome in a cross or map object.
nmar(object)
object |
An object of class |
A vector with the numbers of markers on each chromosome in the input.
Karl W Broman, [email protected]
read.cross
, plot.cross
,
summary.cross
,
nind
,
nchr
,
totmar
,
nphe
data(fake.f2)
nmar(fake.f2)
map <- pull.map(fake.f2)
nmar(map)
Count the number of missing genotypes for each individual or each marker in a cross.
nmissing(cross, what=c("ind","mar"))
cross |
An object of class |
what |
Indicates whether to count missing genotypes for each individual or each marker. |
A vector containing the number of missing genotypes for each individual or for each marker.
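The counts described here can be illustrated on a plain genotype matrix in base R. This sketch assumes individuals in rows, markers in columns and missing genotypes coded as NA; the toy matrix geno is made up, and the code does not touch a real cross object.

```r
# Toy genotype matrix (hypothetical data): 3 individuals x 3 markers
geno <- matrix(c( 1, 2, NA,
                 NA, 2, NA,
                  1, 1,  2),
               nrow = 3, byrow = TRUE,
               dimnames = list(paste0("ind", 1:3), paste0("mar", 1:3)))

# missing genotypes per individual (analogous to what="ind")
n_missing_ind <- rowSums(is.na(geno))
# missing genotypes per marker (analogous to what="mar")
n_missing_mar <- colSums(is.na(geno))
# typed genotypes are the complement of the missing ones
n_typed_ind <- ncol(geno) - n_missing_ind

print(n_missing_ind)
print(n_missing_mar)
print(n_typed_ind)
```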
Karl W Broman, [email protected]
ntyped
, summary.cross
,
nind
, totmar
data(listeria)
# plot number of missing genotypes for each individual
plot(nmissing(listeria))
# plot number of missing genotypes for each marker
plot(nmissing(listeria, what="mar"))
Determine the number of phenotypes in a cross object.
nphe(object)
object |
An object of class |
The number of phenotypes in the input cross object.
Karl W Broman, [email protected]
read.cross
, plot.cross
,
summary.cross
,
nmar
,
nchr
,
totmar
,
nind
data(fake.f2)
nphe(fake.f2)
Transform a vector of quantitative values to the corresponding normal quantiles (preserving the mean and SD).
nqrank(x, jitter)
x |
A numeric vector |
jitter |
If TRUE, randomly jitter the values to break ties. |
A numeric vector; the input x
is converted to ranks and then to
normal quantiles.
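A rank-based transformation to normal quantiles that keeps the mean and SD of the input can be sketched in base R as below. The name nq_sketch and the plotting position (rank - 0.5)/n are assumptions made for illustration; nqrank itself may use a different quantile formula and also offers the jitter option for ties.

```r
# Minimal sketch of a normal-quantile transform preserving mean and SD.
nq_sketch <- function(x) {
  y <- x
  ok <- !is.na(x)
  r <- rank(x[ok])                    # ties receive average ranks
  z <- qnorm((r - 0.5) / sum(ok))     # assumed plotting position
  z <- (z - mean(z)) / sd(z)          # standardize the normal scores
  y[ok] <- z * sd(x[ok]) + mean(x[ok])  # restore original mean and SD
  y
}

# toy phenotype with an outlier and a missing value
x <- c(3.2, 150, 4.1, NA, 2.7, 9.9, 5.0)
y <- nq_sketch(x)
print(round(y, 2))
```

The ordering of the values is unchanged, so an extreme observation such as 150 keeps its rank but loses its leverage.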
Karl W Broman, [email protected]
data(hyper)
hyper <- transformPheno(hyper, pheno.col=1, transf=nqrank)
Determine the number of QTL in a QTL object.
nqtl(qtl)
qtl |
An object of class |
The number of QTL in the input QTL object.
Karl W Broman, [email protected]
makeqtl
, fitqtl
,
dropfromqtl
, replaceqtl
,
addtoqtl
, summary.qtl
,
reorderqtl
data(fake.f2)
# take out several QTLs and make QTL object
qc <- c("1", "6", "13")
qp <- c(25.8, 33.6, 18.63)
fake.f2 <- subset(fake.f2, chr=qc)
fake.f2 <- calc.genoprob(fake.f2, step=2, err=0)
qtl <- makeqtl(fake.f2, qc, qp, what="prob")
nqtl(qtl)
Count the number of genotypes for each individual or each marker in a cross.
ntyped(cross, what=c("ind","mar"))
cross |
An object of class |
what |
Indicates whether to count genotypes for each individual or each marker. |
A vector containing the number of genotypes for each individual or for each marker.
Karl W Broman, [email protected]
nmissing
, summary.cross
,
nind
, totmar
data(listeria)
# plot number of genotypes for each individual
plot(ntyped(listeria))
# plot number of genotypes for each marker
plot(ntyped(listeria, what="mar"))
Identify markers in a cross that have no genotype data.
nullmarkers(cross)
cross |
An object of class |
Marker names (a vector of character strings) with no genotype data.
Karl W Broman, [email protected]
# one marker with no data
data(hyper)
nullmarkers(hyper)
# nothing in listeria
data(listeria)
nullmarkers(listeria)
Establish initial orders for markers within chromosomes by a greedy algorithm, adding one marker at a time with locations of previous markers fixed, in the position giving the minimum number of obligate crossovers.
orderMarkers(cross, chr, window=7, use.ripple=TRUE, error.prob=0.0001,
             map.function=c("haldane","kosambi","c-f","morgan"),
             maxit=4000, tol=1e-4, sex.sp=TRUE, verbose=FALSE)
cross |
An object of class |
chr |
Optional vector indicating the chromosomes to consider.
This should be a vector of character strings referring to chromosomes
by name; numeric values are converted to strings. Refer to
chromosomes with a preceding |
window |
If |
use.ripple |
If TRUE, the initial order is refined by a call to
the function |
error.prob |
Assumed genotyping error rate used in the final estimated map. |
map.function |
Indicates the map function to use in the final estimated map. |
maxit |
Maximum number of EM iterations to perform in the final estimated map. |
tol |
Tolerance for determining convergence in the final estimated map. |
sex.sp |
Indicates whether to estimate sex-specific maps in the final estimated map; this is used only for the 4-way cross. |
verbose |
If TRUE, information about the progress of the calculations is displayed; if > 1, even more information is given. |
Markers within a linkage group are considered in order of decreasing number of genotyped individuals. The first two markers are placed in an arbitrary order. Additional markers are considered one at a time, and each possible placement of a marker is compared (with the order of the previously placed markers taken as fixed) via the number of obligate crossovers (that is, the minimal number of crossovers that would explain the observed data). The marker is placed in the position giving the minimal number of obligate crossovers. If multiple positions give the same number of obligate crossovers, a single location (among those positions) is chosen at random.
If use.ripple=TRUE, the final order is passed to ripple with method="countxo" to refine the marker order. If use.ripple=TRUE and the number of markers on a chromosome is <= the argument window, the initial greedy algorithm is skipped and all possible marker orders are compared via ripple.
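The greedy placement step described above can be sketched in base R for the simplest case of two-genotype (backcross-like) data, where the number of obligate crossovers in an individual equals the number of genotype changes between consecutive typed markers. The helpers count_xo and place_marker and the toy matrix are hypothetical; ties are broken here by taking the first best position, whereas the text above says a position is chosen at random, and this is not the package's implementation.

```r
# Count obligate crossovers for a given marker order.
# geno: individuals x markers matrix with genotypes 1/2 and NA for missing
count_xo <- function(geno, ord) {
  sum(apply(geno[, ord, drop = FALSE], 1, function(g) {
    g <- g[!is.na(g)]          # skip untyped markers
    sum(diff(g) != 0)          # each genotype change needs one crossover
  }))
}

# Insert marker `new` into the already placed order at the position
# that minimizes the number of obligate crossovers.
place_marker <- function(geno, placed, new) {
  candidates <- lapply(0:length(placed),
                       function(k) append(placed, new, after = k))
  n_xo <- sapply(candidates, count_xo, geno = geno)
  candidates[[which.min(n_xo)]]
}

# toy data: 4 individuals, 3 markers (columns 1, 2, 3)
geno <- matrix(c(1, 1, 2,
                 1, 2, 2,
                 2, 2, 2,
                 2, 1, 1), nrow = 4, byrow = TRUE)

# markers 1 and 3 are placed first; where does marker 2 fit best?
best_order <- place_marker(geno, placed = c(1, 3), new = 2)
print(best_order)
print(count_xo(geno, best_order))
```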
The output is a cross object, as in the input, with orders of markers on selected chromosomes revised.
Karl W Broman, [email protected]
formLinkageGroups
,
ripple
, est.map
, countXO
data(listeria)
pull.map(listeria, chr=3)
revcross <- orderMarkers(listeria, chr=3, use.ripple=FALSE)
pull.map(revcross, chr=3)
Pull out the phenotype names from a cross object as a vector.
phenames(cross)
cross |
An object of class |
A vector of character strings (the phenotype names).
Karl W Broman, [email protected]
data(listeria)
phenames(listeria)
Identify the largest subset of markers for which no two adjacent markers are separated by less than some specified distance; if weights are provided, find the marker subset for which the sum of the weights is maximized.
pickMarkerSubset(locations, min.distance, weights)
locations |
A vector of marker locations. |
min.distance |
Minimum distance between adjacent markers in the chosen subset. |
weights |
(Optional) vector of weights for the markers. If
missing, we take |
Let $d_i$ be the location of marker $i$, for $i \in 1, \dots, M$. We use the dynamic programming algorithm of Broman and Weber (1999) to identify the subset of markers $i_1, \dots, i_k$ for which $d_{i_{j+1}} - d_{i_j} \ge$ min.distance and $\sum_j w_{i_j}$ is maximized.
If there are multiple optimal subsets, we pick one at random.
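A dynamic programming search of this kind can be sketched in base R as follows. The function pick_subset_sketch is a hypothetical illustration, not the Broman and Weber (1999) code or the package's implementation: it returns marker indices rather than names, defaults to equal weights, and resolves ties deterministically instead of at random.

```r
# For each marker i (sorted by location), best[i] is the largest total
# weight of a valid subset that ends at marker i.
pick_subset_sketch <- function(locations, min.distance,
                               weights = rep(1, length(locations))) {
  o <- order(locations)
  d <- locations[o]
  w <- weights[o]
  n <- length(d)
  best <- numeric(n)
  prev <- integer(n)            # 0 means "no earlier marker in the subset"
  for (i in seq_len(n)) {
    best[i] <- w[i]
    for (j in seq_len(i - 1)) {
      if (d[i] - d[j] >= min.distance && best[j] + w[i] > best[i]) {
        best[i] <- best[j] + w[i]
        prev[i] <- j
      }
    }
  }
  # trace back from the best end point
  i <- which.max(best)
  chosen <- integer(0)
  while (i > 0) {
    chosen <- c(i, chosen)
    i <- prev[i]
  }
  o[chosen]                     # indices into the original vector
}

loc <- c(0, 2, 5, 6, 10)        # toy marker locations in cM
unweighted <- pick_subset_sketch(loc, 5)
weighted   <- pick_subset_sketch(loc, 5, weights = c(1, 1, 1, 5, 1))
print(unweighted)
print(weighted)
```

In the weighted call the heavily weighted fourth marker displaces its close neighbours, which is the effect the weights argument is meant to have.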
A vector of marker names.
Karl W Broman, [email protected]
Broman, K. W. and Weber, J. L. (1999) Method for constructing confidently ordered linkage maps. Genet. Epidemiol., 16, 337–343.
drop.markers
, pull.markers
,
findDupMarkers
data(hyper)
# subset of markers on chr 4 spaced >= 5 cM
pickMarkerSubset(pull.map(hyper)[[4]], 5)
# no. missing genotypes at each chr 4 marker
n.missing <- nmissing(subset(hyper, chr=4), what="mar")
# weight by -log(prop'n missing), but don't let 0 missing go to +Inf
wts <- -log( (n.missing+1) / (nind(hyper)+1) )
# subset of markers on chr 4 spaced >= 5 cM, with weights = -log(prop'n missing)
pickMarkerSubset(pull.map(hyper)[[4]], 5, wts)
Plot the results of the comparison of all pairs of individuals'
genotypes. A histogram of the proportion of matching genotypes, with
tick marks at individual values below, via rug
.
## S3 method for class 'comparegeno'
plot(x, breaks=NULL, main="",
     xlab="Proportion matching genotypes", ...)
x |
An object of class |
breaks |
Passed to |
main |
Title for the plot. |
xlab |
x-axis label for the plot. |
... |
Passed to |
Creates a histogram with hist
with ticks at
individual values using rug
.
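The quantity being plotted can be illustrated with a small base R sketch that computes, for every pair of individuals, the proportion of matching genotypes among markers typed in both, and then draws the histogram with a rug. The helper prop_match and the toy matrix are hypothetical and stand in for the output of comparegeno; they are not the package's code.

```r
# Proportion of matching genotypes for all pairs of individuals.
# geno: individuals x markers matrix, NA for missing genotypes
prop_match <- function(geno) {
  n <- nrow(geno)
  out <- matrix(NA_real_, n, n)
  for (i in seq_len(n - 1)) {
    for (j in (i + 1):n) {
      both <- !is.na(geno[i, ]) & !is.na(geno[j, ])  # typed in both
      out[i, j] <- out[j, i] <- mean(geno[i, both] == geno[j, both])
    }
  }
  out
}

# toy data: 3 individuals, 4 markers
geno <- rbind(c(1, 2, 1, NA),
              c(1, 2, 1, 2),
              c(2, 2, 1, 2))
pm <- prop_match(geno)
vals <- pm[upper.tri(pm)]       # one value per pair

hist(vals, breaks = seq(0, 1, by = 0.1), main = "",
     xlab = "Proportion matching genotypes")
rug(vals)                       # tick marks at the individual values
```

Pairs with a proportion near 1 are the ones worth inspecting as possible duplicate samples.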
None.
Karl W Broman, [email protected]
comparegeno
, summary.comparegeno
data(fake.f2)
cg <- comparegeno(fake.f2)
plot(cg)
Plots a grid of the missing genotypes, genetic map, and histograms or barplots of phenotypes for the data from an experimental cross.
## S3 method for class 'cross'
plot(x, auto.layout=TRUE, pheno.col,
     alternate.chrid=TRUE, ...)
x |
An object of class |
auto.layout |
If TRUE, |
pheno.col |
Vector of numbers or character strings corresponding to phenotypes that should be plotted. If unspecified, all phenotypes are plotted. |
alternate.chrid |
If TRUE and more than one chromosome is plotted, alternate the placement of chromosome axis labels, so that they may be more easily distinguished. |
... |
Ignored at this point. |
Calls plotMissing
, plotMap
and
plotPheno
to plot the missing genotypes, genetic
map, and histograms or barplots of all phenotypes.
If auto.layout=TRUE
, par(mfrow)
is used with
ceiling(sqrt(n.phe+2))
rows and the minimum number of columns
so that all plots fit on the plotting device.
Numeric phenotypes are displayed as histograms or barplots by calling
plotPheno
.
None.
Karl W Broman, [email protected]; Brian Yandell
plotMissing
, plotMap
,
plotPheno
data(fake.bc)
plot(fake.bc)
Plot the locations of the QTL against a genetic map.
## S3 method for class 'qtl'
plot(x, chr, horizontal=FALSE, shift=TRUE,
     show.marker.names=FALSE, alternate.chrid=FALSE,
     justdots=FALSE, col="red", ...)
x |
An object of class |
chr |
Optional vector indicating the chromosomes to plot.
This should be a vector of character strings referring to chromosomes
by name; numeric values are converted to strings. Refer to
chromosomes with a preceding |
horizontal |
Specifies whether the chromosomes should be plotted horizontally. |
shift |
If TRUE, shift the first marker on each chromosome to be at 0 cM. |
show.marker.names |
If TRUE, marker names are included. |
alternate.chrid |
If TRUE and more than one chromosome is plotted, alternate the placement of chromosome axis labels, so that they may be more easily distinguished. |
justdots |
If TRUE, just plot dots at the QTL, rather than arrows and QTL names. |
col |
Color used to plot indications of QTL |
... |
Passed to |
Creates a plot, via plotMap
, and indicates the
locations of the QTL in the input QTL object, x
.
None.
Karl W Broman, [email protected]
data(fake.f2)
# take out several QTLs and make QTL object
qc <- c("1", "6", "13")
qp <- c(25.8, 33.6, 18.63)
fake.f2 <- subset(fake.f2, chr=qc)
fake.f2 <- calc.genoprob(fake.f2, step=2, err=0.001)
qtl <- makeqtl(fake.f2, qc, qp, what="prob")
plot(qtl)
plot(qtl, justdots=TRUE, col="seagreen")
Plot a slice (corresponding to a single marker) through the pairwise
recombination fractions or LOD scores calculated by
est.rf
and extracted with pull.rf
.
## S3 method for class 'rfmatrix'
plot(x, marker, ...)
x |
An object of class |
marker |
A single marker name, as a character string. |
... |
Optional arguments passed to |
An object of class "scanone"
(as output by scanone
,
and which may be summarized by summary.scanone
or plotted
with plot.scanone
), containing the estimated recombination
fractions or LOD scores for the input marker against all others.
Karl W Broman, [email protected]
data(fake.f2)
fake.f2 <- est.rf(fake.f2)
marker <- markernames(fake.f2, chr=5)[6]
lod <- pull.rf(fake.f2, "lod")
plot(lod, marker, bandcol="gray70")
Plot the LOD curve for a genome scan with a single-QTL model (the
output of scanone
).
## S3 method for class 'scanone'
plot(x, x2, x3, chr, lodcolumn=1, incl.markers=TRUE,
     xlim, ylim, lty=1, col=c("black","blue","red"), lwd=2,
     add=FALSE, gap=25, mtick = c("line", "triangle"),
     show.marker.names=FALSE, alternate.chrid=FALSE,
     bandcol=NULL, type="l", cex=1, pch=1, bg="transparent",
     bgrect=NULL, ...)
x |
An object of class |
x2 |
Optional second |
x3 |
Optional third |
chr |
Optional vector indicating the chromosomes to plot.
This should be a vector of character strings referring to chromosomes
by name; numeric values are converted to strings. Refer to
chromosomes with a preceding |
lodcolumn |
An integer, or vector of 3 integers, indicating which of the LOD score columns should be plotted (generally this is 1). |
incl.markers |
Indicate whether to plot line segments at the marker locations. |
xlim |
Limits for x-axis (optional). |
ylim |
Limits for y-axis (optional). |
lty |
Line types; a vector of length 1 or 3. |
col |
Line colors; a vector of length 1 or 3. |
lwd |
Line widths; a vector of length 1 or 3. |
add |
If TRUE, add to a current plot. |
gap |
Gap separating chromosomes (in cM). |
mtick |
Tick mark type for markers (line segments or upward-pointing triangles). |
show.marker.names |
If TRUE, show the marker names along the x axis. |
alternate.chrid |
If TRUE and more than one chromosome is plotted, alternate the placement of chromosome axis labels, so that they may be more easily distinguished. |
bandcol |
Optional color for alternating bands to indicate
chromosomes. If NULL (the default), no bands are plotted. A good
choice might be |
type |
Type of plot (see |
cex |
Point size expansion, for example if |
pch |
Point type, for example if |
bg |
Background color for points, for example if |
bgrect |
Optional background color for the rectangular plotting region. |
... |
Passed to the function |
This function allows you to plot the results of up to three genome scans against one another. Such objects must conform with each other.
One may alternatively use the argument add
to add the plot of
an additional genome scan to the current figure, but some care is
required: the same chromosomes should be selected, and the results
must concern crosses with the same genetic maps.
If a single scanone
object containing multiple LOD score
columns (for example, from different phenotypes) is input, up to three
LOD curves may be plotted, by providing a vector in the argument
lodcolumn
. If multiple scanone
objects are input (via
x
, x2
and x3
), the LOD score columns to be
plotted are chosen from the corresponding element of the
lodcolumn
argument.
None.
Karl W Broman, [email protected]
scanone
,
summary.scanone
, par
,
colors
, add.threshold
, xaxisloc.scanone
data(fake.f2)
fake.f2 <- calc.genoprob(fake.f2,step=2.5)
out.mr <- scanone(fake.f2, method="mr")
out.em <- scanone(fake.f2, method="em")
plot(out.mr)
plot(out.mr, out.em, chr=c(1,13), lty=1, col=c("violetred","black"))
out.hk <- scanone(fake.f2, method="hk")
plot(out.hk, chr=c(1,13), add=TRUE, col="slateblue")
plot(out.hk, chr=13, show.marker.names=TRUE)
plot(out.hk, bandcol="gray70")
# plot points rather than lines
plot(out.hk, bandcol="gray70", type="p", cex=0.3, pch=21, bg="slateblue")
Plot a histogram of the results of a nonparametric bootstrap to assess uncertainty in QTL position.
## S3 method for class 'scanoneboot'
plot(x, ...)
x |
An object of class |
... |
Passed to the function |
The function plots a histogram of the bootstrap results obtained by
scanoneboot
. Genetic marker locations are
displayed by vertical lines at the bottom of the plot.
None.
Karl W Broman, [email protected]
data(fake.f2)
fake.f2 <- calc.genoprob(fake.f2, step=1)
## Not run: out.boot <- scanoneboot(fake.f2, chr=13, method="hk")
summary(out.boot)
plot(out.boot)
Plot a histogram of the permutation results from a single-QTL genome scan.
## S3 method for class 'scanoneperm'
plot(x, lodcolumn=1, ...)
x |
An object of class |
lodcolumn |
This indicates the LOD score column to plot. This should be a single number between 1 and the number of LOD columns in the object input. |
... |
Passed to the function |
The function plots a histogram of the permutation results obtained by
scanone
when n.perm
is specified. If
separate permutations were performed for the autosomes and the X
chromosome (using perm.Xsp=TRUE
), separate histograms are given.
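To show what such a histogram is typically used for, the following base R sketch derives a 5% genome-wide threshold as the 95th percentile of the permutation maxima and marks it on the histogram. The values in perm_max are simulated stand-ins (maxima of chi-square statistics converted to the LOD scale), not real scanone output, and the variable names are hypothetical.

```r
set.seed(1)
# Stand-in for permutation results: one maximum LOD score per permutation.
# Simulated here as the max of 50 chi-square(1) statistics divided by
# 2*log(10), which converts a likelihood ratio statistic to a LOD score.
perm_max <- replicate(100, max(rchisq(50, df = 1)) / (2 * log(10)))

# 5% significance threshold: 95th percentile of the permutation maxima
thr <- quantile(perm_max, 0.95)

hist(perm_max, breaks = 20, main = "", xlab = "maximum LOD score")
abline(v = thr, lty = 2)        # dashed line at the threshold
```

With real data, the same quantile is what summary methods for permutation results report, so the histogram mainly serves as a visual check on the shape of the null distribution.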
None.
Karl W Broman, [email protected]
data(fake.bc)
fake.bc <- calc.genoprob(fake.bc)
operm <- scanone(fake.bc, method="hk", n.perm=100)
plot(operm)
Plot the LOD curves for each partition for a genome scan with a single
diallelic QTL (the
output of scanPhyloQTL
).
## S3 method for class 'scanPhyloQTL'
plot(x, chr, incl.markers=TRUE,
     col, xlim, ylim, lwd=2, gap=25, mtick=c("line", "triangle"),
     show.marker.names=FALSE, alternate.chrid=FALSE, legend=TRUE, ...)
x |
An object of class |
chr |
Optional vector indicating the chromosomes to plot.
This should be a vector of character strings referring to chromosomes
by name; numeric values are converted to strings. Refer to
chromosomes with a preceding |
incl.markers |
Indicate whether to plot line segments at the marker locations. |
col |
Optional vector of colors to use for each partition. |
xlim |
Limits for x-axis (optional). |
ylim |
Limits for y-axis (optional). |
lwd |
Line width. |
gap |
Gap separating chromosomes (in cM). |
mtick |
Tick mark type for markers (line segments or upward-pointing triangles). |
show.marker.names |
If TRUE, show the marker names along the x axis. |
alternate.chrid |
If TRUE and more than one chromosome is plotted, alternate the placement of chromosome axis labels, so that they may be more easily distinguished. |
legend |
Indicates whether to include a legend in the plot. |
... |
Passed to the function |
None.
Karl W Broman, [email protected]
Broman, K. W., Kim, S., Ané, C. and Payseur, B. A. Mapping quantitative trait loci to a phylogenetic tree. In preparation.
scanPhyloQTL
, max.scanPhyloQTL
,
summary.scanPhyloQTL
, plot.scanone
,
inferredpartitions
,
simPhyloQTL
,
par
, colors
## Not run: 
# example map; drop X chromosome
data(map10)
map10 <- map10[1:19]
# simulate data
x <- simPhyloQTL(4, partition="AB|CD", crosses=c("AB", "AC", "AD"),
                 map=map10, n.ind=150, model=c(1, 50, 0.5, 0))
# run calc.genoprob on each cross
x <- lapply(x, calc.genoprob, step=2)
# scan genome, at each position trying all possible partitions
out <- scanPhyloQTL(x, method="hk")
# maximum peak
max(out, format="lod")
# approximate posterior probabilities at peak
max(out, format="postprob")
# all peaks above a threshold for LOD(best) - LOD(2nd best)
summary(out, threshold=1, format="lod")
# all peaks above a threshold for LOD(best), showing approx post'r prob
summary(out, format="postprob", threshold=3)
# plot of results
plot(out)
## End(Not run)
Plot the results of a two-dimensional, two-QTL genome scan.
## S3 method for class 'scantwo'
plot(x, chr, incl.markers=FALSE, zlim, lodcolumn=1,
     lower = c("full", "add", "cond-int", "cond-add", "int"),
     upper = c("int", "cond-add", "cond-int", "add", "full"),
     nodiag=TRUE, contours=FALSE, main, zscale=TRUE, point.at.max=FALSE,
     col.scheme = c("viridis", "redblue","cm","gray","heat","terrain","topo"),
     gamma=0.6, allow.neg=FALSE, alternate.chrid=FALSE, ...)
x |
An object of class |
chr |
Optional vector indicating the chromosomes to plot.
This should be a vector of character strings referring to chromosomes
by name; numeric values are converted to strings. Refer to
chromosomes with a preceding |
incl.markers |
If FALSE, plot LOD scores on an evenly spaced grid (not including the results at the markers). |
zlim |
A vector of length 2 (optional), indicating the z limits
for the lower-right and upper-left triangles, respectively. If one
number is given, the same limits are used for both triangles. If
|
lodcolumn |
If the scantwo results contain LOD scores for multiple phenotypes, this argument indicates which to use in the plot. |
lower |
Indicates which LOD scores should be plotted in the lower triangle. See the details below. |
upper |
Indicates which LOD scores should be plotted in the upper triangle. See the details below. |
nodiag |
If TRUE, suppress the plot of the scanone output (which is normally along the diagonal.) |
contours |
If TRUE, add a contour to the plot at 1.5-LOD below
its maximum, using a call to |
main |
An optional title for the plot. |
zscale |
If TRUE, a color scale is plotted at the right. |
point.at.max |
If TRUE, plot an X at the maximum LOD. |
col.scheme |
Name of color palette. The default is "viridis"; see Option D at https://bids.github.io/colormap/ |
gamma |
Parameter affecting range of colors when
|
allow.neg |
If TRUE, allow the plot of negative LOD scores; in
this case, the z-limits are symmetric about 0. This option is
chiefly to allow a plot of difference between LOD scores from
different methods, calculated via |
alternate.chrid |
If TRUE and more than one chromosome is plotted, alternate the placement of chromosome axis labels, so that they may be more easily distinguished. |
... |
Ignored at this point. |
Uses image
to plot a grid of LOD scores. The
particular LOD scores plotted in the upper-left and lower-right
triangles are selected via upper
and lower
,
respectively. By default, the upper-left triangle contains the
epistasis LOD
scores ("int"
), and the lower-right triangle contains the LOD
scores for the full model ("full"
).
The diagonal contains either all zeros or the main effects LOD scores
(from scanone
).
The scantwo function calculates, for each pair of putative QTLs, $(q_1, q_2)$, the likelihood under the null model, $L_0$, the likelihood under each of the single-QTL models, $L(q_1)$ and $L(q_2)$, the likelihood under an additive QTL model, $L_a(q_1, q_2)$, and the likelihood under a full QTL model (including QTL-QTL interaction), $L_f(q_1, q_2)$.
The five possible LOD scores that may be plotted are the following.
The epistasis LOD scores ("int") are $LOD_i = \log_{10} L_f(q_1, q_2) - \log_{10} L_a(q_1, q_2)$.
The full LOD scores ("full") are $LOD_f = \log_{10} L_f(q_1, q_2) - \log_{10} L_0$.
The additive LOD scores ("add") are $LOD_a = \log_{10} L_a(q_1, q_2) - \log_{10} L_0$.
In addition, we may calculate, for each pair of
chromosomes, the difference between the full LOD score and the
maximum single-QTL LOD scores for that pair of chromosomes
("cond-int"
).
Finally, we may calculate, for each pair of
chromosomes, the difference between the additive LOD score and the
maximum single-QTL LOD scores for that pair of chromosomes
("cond-add"
).
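As a rough illustration of how the five plotted quantities relate to one another, the following base-R sketch starts from made-up log10 likelihoods for a single pair of positions. All variable names and numbers are hypothetical, and this is not the code used inside scantwo or plot.scantwo; in particular, the maximum single-QTL LOD for the pair of chromosomes is simply supplied as a constant.

```r
# Hypothetical log10 likelihoods for one pair of positions (q1, q2);
# the numbers are invented for illustration only.
l0 <- -250.0   # null model
la <- -244.0   # additive two-QTL model
lf <- -242.8   # full two-QTL model (with interaction)

# Hypothetical maximum single-QTL LOD score over the two chromosomes
max_single <- 3.5

lods <- c(
  full       = lf - l0,                 # full model vs. null
  add        = la - l0,                 # additive model vs. null
  int        = lf - la,                 # full vs. additive (epistasis)
  "cond-int" = (lf - l0) - max_single,  # full LOD minus best single-QTL LOD
  "cond-add" = (la - l0) - max_single   # additive LOD minus best single-QTL LOD
)
print(round(lods, 2))
```

Note that the full LOD is the sum of the additive and epistasis LODs, which is why plotting "int" in one triangle and "full" in the other is an informative default.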
If a color scale is plotted (zscale=TRUE
), the axis on the
left indicates the scale for the upper-left triangle,
while the axis on the right indicates the scale for the
lower-right triangle. Note that the axis labels may become misaligned
if you resize the figure window; if that happens, redo the
plot.
None.
Note that, for output from addpair
in which the
new loci are indicated explicitly in the formula, the summary provided
by plot.scantwo
is somewhat special. In particular, the
lower
and upper
arguments are ignored.
In the case that the formula used in addpair
was
not symmetric in the two new QTL, the x-axis in the plot corresponds
to the first of the new QTL and the y-axis corresponds to the second
of the new QTL.
Hao Wu; Karl W Broman, [email protected]; Brian Yandell
scantwo
,
summary.scantwo
, plot.scanone
,
-.scantwo
data(hyper)
hyper <- calc.genoprob(hyper, step=5)

# 2-d scan by EM and by Haley-Knott regression
out2.em <- scantwo(hyper, method="em")
out2.hk <- scantwo(hyper, method="hk")

# plot epistasis and full LOD scores
plot(out2.em)

# plot cond-int in upper triangle and full in lower triangle
# for chromosomes 1, 4, 6, 15
plot(out2.em, upper="cond-int", chr=c(1,4,6,15))

# plot cond-add in upper triangle and add in lower triangle
# for chromosomes 1, 4
plot(out2.em, upper="cond-add", lower="add", chr=c(1,4))

# plot the differences between the LOD scores from Haley-Knott
# regression and the EM algorithm
plot(out2.hk - out2.em, allow.neg=TRUE)
Plot a histogram of the permutation results from a two-dimensional, two-QTL genome scan.
## S3 method for class 'scantwoperm'
plot(x, lodcolumn=1, include_rug=TRUE, ...)
x |
An object of class |
lodcolumn |
This indicates the LOD score column to plot. This should be a single number between 1 and the number of LOD columns in the object input. |
include_rug |
If TRUE, include a call to |
... |
Passed to the function |
The function plots a histogram of the permutation results obtained by
scantwo
when n.perm
is specified. Separate
histograms are provided for the five LOD scores, full
,
fv1
, int
, add
, and av1
.
None.
Karl W Broman, [email protected]
data(fake.bc)
fake.bc <- calc.genoprob(fake.bc)
operm2 <- scantwo(fake.bc, method="hk", n.perm=10)
plot(operm2)
Plot a grid of the LOD scores indicating which genotypes are likely to be in error.
plotErrorlod(x, chr, ind, breaks=c(-Inf,2,3,4.5,Inf), col=c("white","gray85","hotpink","purple3"), alternate.chrid=FALSE, ...)
x |
An object of class |
chr |
Optional vector indicating the chromosomes to be drawn in
the plot. This should be a vector of character strings referring to
chromosomes by name; numeric values are converted to strings. Refer
to chromosomes with a preceding |
ind |
Indicates the individuals for which the error LOD scores
should be plotted (passed to |
breaks |
A set of breakpoints for the colors; must give one more breakpoint than color. Intervals are open on the left and closed on the right, except for the lowest interval. |
col |
A vector of colors to appear in the image. |
alternate.chrid |
If TRUE and more than one chromosome is plotted, alternate the placement of chromosome axis labels, so that they may be more easily distinguished. |
... |
Ignored at this point. |
Uses image
to plot a grid with different shades
of pixels to indicate which genotypes are likely to be in error.
Darker pixels have higher error LOD scores:
$LOD \le 2$ in white;
$2 < LOD \le 3$ in gray;
$3 < LOD \le 4.5$ in pink;
$LOD > 4.5$ in purple.
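The mapping from error LOD scores to colors under the default breaks and col values can be mimicked in base R. This is only a sketch of the documented interval convention (open on the left, closed on the right), using made-up scores; it is not the code that plotErrorlod itself runs.

```r
# Default breakpoints and colors from the plotErrorlod usage
breaks <- c(-Inf, 2, 3, 4.5, Inf)
cols   <- c("white", "gray85", "hotpink", "purple3")

# Made-up error LOD scores, including values that sit exactly on a breakpoint
errlod <- c(0, 2, 2.5, 3, 4.5, 6)

# cut() uses intervals of the form (a, b] by default, so a score equal to a
# breakpoint falls in the lower color class
pixel_col <- cols[as.integer(cut(errlod, breaks = breaks))]
print(data.frame(errlod, pixel_col))
```

With these defaults, only scores strictly above 4.5 appear in purple, so raising the last finite breakpoint is a simple way to highlight fewer genotypes.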
None.
Karl W Broman, [email protected]
Lincoln, S. E. and Lander, E. S. (1992) Systematic detection of errors in genetic linkage data. Genomics 14, 604–610.
calc.errorlod
,
top.errorlod
, image
,
subset.cross
, plotGeno
data(hyper)

# Calculate error LOD scores
hyper <- calc.errorlod(hyper,error.prob=0.01)

# plot the error LOD scores; print those above a specified cutoff
plotErrorlod(hyper)
plotErrorlod(hyper,chr=1)
Plot the genotypes on a particular chromosome for a set of individuals, flagging likely errors.
plotGeno(x, chr, ind, include.xo=TRUE, horizontal=TRUE, cutoff=4, min.sep=2, cex=1.2, ...)
x |
An object of class |
chr |
The chromosome to plot. Only one chromosome is allowed. (This should be a character string referring to the chromosomes by name.) |
ind |
Vector of individuals to plot (passed to |
include.xo |
If TRUE, plot X's in intervals having a crossover. Not available for a 4-way cross. |
horizontal |
If TRUE, chromosomes are plotted horizontally. |
cutoff |
Genotypes with error LOD scores above this value are flagged as possible errors. |
min.sep |
Markers separated by less than this value (as a percent of the chromosome length) are pulled apart, so that they may be distinguished in the picture. |
cex |
Character expansion for the size of points in the plot.
Larger numbers give larger points; see |
... |
Ignored at this point. |
Plots the genotypes for a set of individuals. Likely errors are indicated by red squares. In a backcross, genotypes AA and AB are indicated by white and black circles, respectively. In an intercross, genotypes AA, AB and BB are indicated by white, gray, and black circles, respectively, and the partially missing genotypes "not BB" (D in mapmaker) and "not AA" (C in mapmaker) are indicated by green and orange circles, respectively.
For the X chromosome in a backcross or intercross, hemizygous males are plotted as if they were homozygous (that is, with white and black circles).
For a 4-way cross, two lines are plotted for each individual. The left or upper line indicates the allele A (white) or B (black); the right or lower line indicates the allele C (white) or D (black). For the case that genotype is known to be only AC/BD or AD/BC, we use green and orange, respectively.
None.
Karl W Broman, [email protected]
calc.errorlod
,
top.errorlod
, subset.cross
data(hyper)

# Calculate error LOD scores
hyper <- calc.errorlod(hyper,error.prob=0.01)

# print those above a specified cutoff
top.errorlod(hyper,cutoff=4)

# plot genotype data, flagging genotypes with error LOD > cutoff
plotGeno(hyper, chr=1, ind=160:200, cutoff=7, min.sep=2)
Plot a measure of the proportion of missing information in the genotype data.
plotInfo(x, chr, method=c("entropy","variance","both"), step=1, off.end=0, error.prob=0.001, map.function=c("haldane","kosambi","c-f","morgan"), alternate.chrid=FALSE, fourwaycross=c("all", "AB", "CD"), include.genofreq=FALSE, ...)
x |
An object of class |
chr |
Optional vector indicating the chromosomes to plot.
This should be a vector of character strings referring to chromosomes
by name; numeric values are converted to strings. Refer to
chromosomes with a preceding |
method |
Indicates whether to plot the entropy version of the information, the variance version, or both. |
step |
Maximum distance (in cM) between positions at which the
missing information is calculated, though for |
off.end |
Distance (in cM) past the terminal markers on each chromosome to which the genotype probability calculations will be carried. |
error.prob |
Assumed genotyping error rate used in the calculation of the penetrance Pr(observed genotype | true genotype). |
map.function |
Indicates whether to use the Haldane, Kosambi, Carter-Falconer, or Morgan map function when converting genetic distances into recombination fractions. |
alternate.chrid |
If TRUE and more than one chromosome is plotted, alternate the placement of chromosome axis labels, so that they may be more easily distinguished. |
fourwaycross |
For a phase-known four-way cross, measure missing
genotype information overall ( |
include.genofreq |
If TRUE, estimated genotype frequencies (from
the results of
|
... |
Passed to |
The entropy version of the missing information: for a single individual at a single genomic position, we measure the missing information as $H = -\sum_g p_g \log p_g / \log n$, where $p_g$ is the probability of the genotype $g$, and $n$ is the number of possible genotypes, defining $0 \log 0 = 0$. This takes values between 0 and 1, assuming the value 1 when the genotypes (given the marker data) are equally likely and 0 when the genotypes are completely determined. We calculate the missing information at a particular position as the average of $H$ across individuals. For an intercross, we don't scale by $\log n$ but by the entropy in the case of genotype probabilities (1/4, 1/2, 1/4).
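The entropy measure for a single individual at a single position can be sketched in a few lines of base R. The function names entropy and missing_info and the ref argument are hypothetical, chosen for this illustration; the package performs the actual calculation in C, and this sketch only follows the description above (entropy of the genotype probabilities, scaled by the entropy of a reference distribution).

```r
# Entropy of a vector of genotype probabilities, taking 0 * log(0) = 0
entropy <- function(p) {
  p <- p[p > 0]
  -sum(p * log(p))
}

# Missing information for one individual at one position.
# ref is the distribution whose entropy sets the scale: uniform over the
# n genotypes by default (entropy log n); for an intercross one would pass
# c(1/4, 1/2, 1/4) instead.
missing_info <- function(p, ref = rep(1 / length(p), length(p))) {
  entropy(p) / entropy(ref)
}

missing_info(c(0.5, 0.5))   # backcross, genotypes equally likely
missing_info(c(1, 0))       # backcross, genotype fully determined
missing_info(c(0.25, 0.5, 0.25), ref = c(0.25, 0.5, 0.25))  # intercross, no marker information

# Position-level value: average over individuals (rows of a made-up matrix)
probs <- rbind(c(0.5, 0.5), c(1, 0), c(0.9, 0.1))
mean(apply(probs, 1, missing_info))
```

The first three calls illustrate the two extremes described above: a value of 1 when the marker data say nothing about the genotype, and 0 when the genotype is known.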
The variance version of the missing information: we calculate the average, across individuals, of the variance of the genotype distribution (conditional on the observed marker data) at a particular locus, and scale by the maximum such variance.
Calculations are done in C (for the sake of speed in the presence of
little thought about programming efficiency) and the plot is created
by a call to plot.scanone
.
Note that summary.scanone
may be used to display
the maximum missing information on each chromosome.
An object with class scanone
: a data.frame with columns the
chromosome IDs and cM positions followed by the entropy and/or
variance version of the missing information.
Karl W Broman, [email protected]
plot.scanone
,
plotMissing
, calc.genoprob
,
geno.table
data(hyper)
plotInfo(hyper,chr=c(1,4))

# save the results and view maximum missing info on each chr
info <- plotInfo(hyper)
summary(info)

plotInfo(hyper, bandcol="gray70")
Use the results of refineqtl
to plot
one-dimensional LOD profiles for each QTL.
plotLodProfile(qtl, chr, incl.markers=TRUE, gap=25, lwd=2, lty=1, col="black", qtl.labels=TRUE, mtick=c("line", "triangle"), show.marker.names=FALSE, alternate.chrid=FALSE, add=FALSE, showallchr=FALSE, labelsep=5, ...)
qtl |
An object of class |
chr |
Optional vector indicating the chromosomes to plot.
This should be a vector of character strings referring to chromosomes
by name; numeric values are converted to strings. Refer to
chromosomes with a preceding |
incl.markers |
Indicate whether to plot line segments at the marker locations. |
gap |
Gap separating chromosomes (in cM). |
lwd |
Line widths for each QTL trace (length 1 or the number of QTL). |
lty |
Line types for each QTL trace (length 1 or the number of QTL). |
col |
Line colors for each QTL trace (length 1 or the number of QTL). |
qtl.labels |
If TRUE, place a label on each QTL trace. |
mtick |
Tick mark type for markers (line segments or upward-pointing triangles). |
show.marker.names |
If TRUE, show the marker names along the x axis. |
alternate.chrid |
If TRUE and more than one chromosome is plotted, alternate the placement of chromosome axis labels, so that they may be more easily distinguished. |
add |
If TRUE, add curves to a current plot. |
showallchr |
If FALSE (the default), only show the chr with a QTL |
labelsep |
If |
... |
Passed to the function |
The function plots LOD profiles in the context of a multiple QTL model, using a scheme best described in Zeng et al. (2000). The position of each QTL is varied, keeping all other loci fixed. If a QTL is isolated on a chromosome, the entire chromosome is scanned; if there are additional linked QTL, the position of a QTL is scanned over the largest interval possible without allowing the order of QTLs along a chromosome to change. At each position for the QTL being scanned, we calculate a LOD score comparing the full model, with the QTL of interest at that particular position (and all others at their fixed positions) to the model with the QTL of interest (and any interactions that include that QTL) omitted.
Care should be taken regarding the arguments lwd
, lty
,
and col
; if vectors are given, they should be in the order of
the QTL within the object, which may be different than the order in
which they are plotted. (The LOD profiles are sorted by chromosome
and position.)
None.
Karl W Broman, [email protected]
Zeng Z.-B., Liu, J., Stam, L. F., Kao, C.-H., Mercer, J. M. and Laurie, C. C. (2000) Genetic architecture of a morphological shape difference between two Drosophila species. Genetics 154, 299–310.
data(fake.bc)
fake.bc <- calc.genoprob(fake.bc, step=2)
qtl <- makeqtl(fake.bc, chr=c(2,5), pos=c(32.5, 17.5), what="prob")
out <- scanone(fake.bc, method="hk")

# refine QTL positions and keep LOD profiles
rqtl <- refineqtl(fake.bc, qtl=qtl, method="hk", keeplodprofile=TRUE)

# plot the LOD profiles
plotLodProfile(rqtl)

# add the initial scan results, for comparison
plot(out, add=TRUE, chr=c(2,5), col="red")
Plot genetic map of marker locations for all chromosomes.
## S3 method for class 'map'
plot(x, map2, chr, horizontal=FALSE, shift=TRUE, show.marker.names=FALSE,
     alternate.chrid=FALSE, ...)

plotMap(x, map2, chr, horizontal=FALSE, shift=TRUE, show.marker.names=FALSE,
        alternate.chrid=FALSE, ...)
x |
A list whose components are vectors of marker locations. A
|
map2 |
An optional second genetic map with the same number (and
names) of chromosomes. As with the first argument, a
|
chr |
Optional vector indicating the chromosomes to plot.
This should be a vector of character strings referring to chromosomes
by name; numeric values are converted to strings. Refer to
chromosomes with a preceding |
horizontal |
Specifies whether the chromosomes should be plotted horizontally. |
shift |
If TRUE, shift the first marker on each chromosome to be at 0 cM. |
show.marker.names |
If TRUE, marker names are included. |
alternate.chrid |
If TRUE and more than one chromosome is plotted, alternate the placement of chromosome axis labels, so that they may be more easily distinguished. |
... |
Passed to |
Plots the genetic map for each chromosome, or a comparison of the genetic maps if two maps are given.
For a comparison of two maps, the first
map is on the left (or, if horizontal=TRUE
, on the top). Lines
are drawn to connect markers. Markers that exist in just one map and
not the other are indicated by short line segments, on one side or the
other, that are not connected across.
For a sex-specific map, female and male maps are plotted against one another. For two sex-specific maps, the two female maps are plotted against one another and the two male maps are plotted against one another.
None.
Karl W Broman, [email protected]
data(fake.bc)
plotMap(fake.bc)
plotMap(fake.bc,horizontal=TRUE)

newmap <- est.map(fake.bc)
plot(newmap)
plotMap(fake.bc, newmap)

plotMap(fake.bc, show.marker.names=TRUE)
Plot a grid showing which genotypes are missing.
plotMissing(x, chr, reorder=FALSE, main="Missing genotypes", alternate.chrid=FALSE, ...)
x |
An object of class |
chr |
Optional vector indicating the chromosomes to plot.
This should be a vector of character strings referring to chromosomes
by name; numeric values are converted to strings. Refer to
chromosomes with a preceding |
reorder |
Specify whether to reorder individuals according to their phenotypes.
|
main |
Title to place on plot. |
alternate.chrid |
If TRUE and more than one chromosome is plotted, alternate the placement of chromosome axis labels, so that they may be more easily distinguished. |
... |
Ignored at this point. |
Uses image
to plot a grid with black pixels where the
genotypes are missing. For intercross and 4-way cross data, gray
pixels are plotted for the partially missing genotypes (for example,
"not AA").
None.
Karl W Broman, [email protected]
data(fake.f2)
plotMissing(fake.f2)
Plot a graphical representation of a QTL model, with nodes representing QTL and line segments representing pairwise interactions.
plotModel(qtl, formula, circrad.rel=0.25, circrad.abs, cex.name=1, chronly=FALSE, order, ...)
qtl |
A QTL object (as created by |
formula |
Optional formula defining the QTL model. If missing,
we look for an attribute |
circrad.rel |
Radius of the circles that indicate the QTL, relative to the distance between the circles. |
circrad.abs |
Optional radius of the circles that indicate the QTL; note that the plotting region will have x- and y-axis limits spanning 3 units. |
cex.name |
Character expansion for the QTL names. |
chronly |
If TRUE and a formal QTL object is given, only the chromosome IDs are used to identify the QTL. |
order |
Optional vector indicating a permutation of the QTL to define where they are to appear in the plot. QTL are placed around a circle, starting at the top and going clockwise. |
... |
Passed to the function |
None.
Karl W Broman, [email protected]
# plot a QTL model, using a vector of character strings to define the QTL
plotModel(c("1","4","6","15"), formula=y~Q1+Q2+Q3*Q4)

# plot an additive QTL model
data(hyper)
hyper <- calc.genoprob(hyper)
qtl <- makeqtl(hyper, chr=c(1,4,6,15), pos=c(68.3,30,60,18), what="prob")
plotModel(qtl)

# include an interaction
plotModel(qtl, formula=y~Q1+Q2+Q3*Q4)

# alternatively, include the formula as an attribute to the QTL object
attr(qtl, "formula") <- y~Q1+Q2+Q3*Q4
plotModel(qtl)

# if formula given, the attribute within the object is ignored
plotModel(qtl, y~Q1+Q2+Q3+Q4)

# NULL formula indicates additive QTL model
plotModel(qtl, NULL)

# reorder the QTL in the figure
plotModel(qtl, order=c(1,3,4,2))

# show just the chromosome numbers
plotModel(qtl, chronly=TRUE)
Plots a histogram or barplot of the data for a phenotype from an experimental cross.
plotPheno(x, pheno.col=1, ...)
x |
An object of class |
pheno.col |
The phenotype column to plot: a numeric index, or the phenotype name as a character string. Alternatively, one may give a numeric vector of phenotypes, in which case it must have the length equal to the number of individuals in the cross, and there must be either non-integers or values < 1 or > no. phenotypes; this last case may be useful for studying transformations. |
... |
Numeric phenotypes are displayed as histograms with approximately $2\sqrt{n}$ bins. Phenotypes that are factors or that
have very few unique values are displayed as barplots.
None.
Karl W Broman, [email protected]
plot.cross
, plotMap
,
plotMissing
,
hist
, barplot
data(fake.bc)
plotPheno(fake.bc, pheno.col=1)
plotPheno(fake.bc, pheno.col=3)
plotPheno(fake.bc, pheno.col="age")
Plot the phenotype values versus the genotypes at a marker or markers.
plotPXG(x, marker, pheno.col=1, jitter=1, infer=TRUE, pch, ylab, main, col, ...)
x |
An object of class |
marker |
Marker name (a character string; can be a vector). |
pheno.col |
Column number in the phenotype matrix which should be used as the phenotype. One may also give a character string matching a phenotype name. Finally, one may give a numeric vector of phenotypes, in which case it must have the length equal to the number of individuals in the cross, and there must be either non-integers or values < 1 or > no. phenotypes; this last case may be useful for studying transformations. |
jitter |
A positive number indicating how much to spread out the points horizontally. (Larger numbers correspond to greater spread.) |
infer |
If TRUE, missing genotypes are filled in with a single random imputation and plotted in red; if FALSE, only individuals typed at the specified marker are plotted. |
pch |
Plot symbol. |
ylab |
Label for y-axis. |
main |
Main title for the plot. If missing, the names of the markers are used. |
col |
A vector of colors to use for the confidence intervals (optional). |
... |
Passed to |
Plots the phenotype data against the genotypes at the specified
marker. If infer=TRUE, the genotypes of individuals that were not
typed are inferred based on the genotypes at linked markers via a single
imputation from sim.geno
; these points are plotted
in red. For each genotype, the phenotypic mean is plotted, with error
bars at 1 SE.
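The per-genotype summary drawn on the plot (a mean with error bars at 1 SE) can be approximated in base R as follows. The data are made up, and computing the standard error as the group standard deviation divided by the square root of the group size is an assumption of this sketch; the function's own calculation may differ.

```r
# Made-up phenotypes and marker genotypes for seven backcross individuals
pheno <- c(10.1, 9.8, 10.5, 12.0, 12.4, 11.7, 11.9)
geno  <- factor(c("AA", "AA", "AA", "AB", "AB", "AB", "AB"))

# Phenotypic mean and standard error within each genotype group
# (SE = sd / sqrt(n) is an assumption made for this illustration)
groups <- split(pheno, geno)
means  <- sapply(groups, mean)
ses    <- sapply(groups, function(y) sd(y) / sqrt(length(y)))

# Endpoints of the error bars
summary_tab <- data.frame(mean = means, lower = means - ses, upper = means + ses)
print(round(summary_tab, 3))
```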
A data.frame with initial columns the marker genotypes, then the phenotype data, then a column indicating whether any of the marker genotypes were inferred (1=at least one genotype inferred, 0=none were inferred).
Karl W Broman, [email protected]; Brian Yandell
find.marker
, effectplot
,
find.flanking
, effectscan
data(listeria)
mname <- find.marker(listeria, 5, 28) # marker D5M357
plotPXG(listeria, mname)

mname2 <- find.marker(listeria, 13, 26) # marker D13Mit147
plotPXG(listeria, c(mname, mname2))
plotPXG(listeria, c(mname2, mname))

# output of the function contains the raw data
output <- plotPXG(listeria, mname)
head(output)

# another example
data(fake.f2)
mname <- find.marker(fake.f2, 1, 37) # marker D1M437
plotPXG(fake.f2, mname)

mname2 <- find.marker(fake.f2, "X", 14) # marker DXM66
plotPXG(fake.f2, mname2)

plotPXG(fake.f2, c(mname,mname2))
plotPXG(fake.f2, c(mname2,mname))
Plot a grid showing the recombination fractions for all pairs of markers, and/or the LOD scores for tests of linkage between pairs of markers.
plotRF(x, chr, what=c("both","lod","rf"), alternate.chrid=FALSE, zmax=12, mark.diagonal=FALSE, col.scheme=c("viridis", "redblue"), ...)
x |
An object of class |
chr |
Optional vector indicating the chromosomes to plot.
This should be a vector of character strings referring to chromosomes
by name; numeric values are converted to strings. Refer to
chromosomes with a preceding |
what |
Indicate whether to plot LOD scores, recombination fractions or both. |
alternate.chrid |
If TRUE and more than one chromosome is plotted, alternate the placement of chromosome axis labels, so that they may be more easily distinguished. |
zmax |
Maximum LOD score plotted; values above this are all thresholded at this value. |
mark.diagonal |
If TRUE, include black line segments around the pixels along the diagonal, to better separate the upper left triangle from the lower right triangle. |
col.scheme |
The color palette. The default is "viridis"; see Option D at https://bids.github.io/colormap/ |
... |
Generally ignored, but you can include |
Uses image
to plot a grid showing the
recombination fractions and/or LOD scores for all pairs of markers.
(The LOD scores are for a test of $r = 1/2$.)
If both are plotted, the recombination fractions are in the upper left
triangle while the LOD scores are in the lower right triangle.
With col.scheme="viridis"
(the default), purple corresponds to
a large LOD score or a small recombination fraction, while yellow is
the reverse. With col.scheme="redblue"
, red corresponds to a
large LOD or a small recombination fraction, while blue is the
reverse. Note that missing values appear in light gray.
Recombination fractions are transformed by $-4(\log_2 r + 1)$ to put them on the same sort of scale as LOD
scores. Values of LOD or the transformed recombination fraction that
are above 12 are set to 12.
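To see how recombination fractions land on a LOD-like scale, the following base-R sketch applies the transformation $-4(\log_2 r + 1)$ and then caps the result at 12. The function name rf_to_lod_scale is hypothetical and the recombination fractions are made up; this illustrates the description above rather than reproducing the plotting code.

```r
# Transform recombination fractions to a LOD-like scale and cap at zmax
rf_to_lod_scale <- function(rf, zmax = 12) {
  z <- -4 * (log2(rf) + 1)
  pmin(z, zmax)
}

# Made-up recombination fractions, from unlinked (0.5) to tightly linked (0.01)
rf <- c(0.5, 0.25, 0.125, 1/16, 0.01)
print(rf_to_lod_scale(rf))
```

A recombination fraction of 1/2 maps to 0 and 1/16 maps to 12, so anything more tightly linked than 1/16 is shown at the capped value.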
None.
Karl W Broman, [email protected]
est.rf
, pull.rf
, plot.rfmatrix
,
image
,
badorder
, ripple
data(badorder)
badorder <- est.rf(badorder)
plotRF(badorder)

# plot just chr 1
plotRF(badorder, chr=1)

# plot just the recombination fractions
plotRF(badorder, what="rf")

# plot just the LOD scores, and just for chr 2 and 3
plotRF(badorder, chr=2:3, what="lod")
Pull out the results of argmax.geno
from a cross as a matrix.
pull.argmaxgeno(cross, chr, include.pos.info=FALSE, rotate=FALSE)
cross |
An object of class |
chr |
Optional vector indicating the chromosomes to consider.
This should be a vector of character strings referring to chromosomes
by name; numeric values are converted to strings. Refer to
chromosomes with a preceding |
include.pos.info |
If TRUE, include columns with marker name,
chromosome ID, and cM position. (If |
rotate |
If TRUE, return matrix with individuals as columns and positions as rows. If FALSE, rows correspond to individuals. |
A matrix containing numeric indicators of the inferred genotypes. Multiple chromosomes are pasted together.
Karl W Broman, [email protected]
pull.geno
, pull.genoprob
,
pull.draws
, argmax.geno
data(listeria)
listeria <- argmax.geno(listeria, step=1, stepwidth="max")
amg <- pull.argmaxgeno(listeria, chr=c(5,13), include.pos.info=TRUE, rotate=TRUE)
amg[1:5,1:10]
Pull out the results of sim.geno
from a cross as an array.
pull.draws(cross, chr)
cross |
An object of class |
chr |
Optional vector indicating the chromosomes to consider.
This should be a vector of character strings referring to chromosomes
by name; numeric values are converted to strings. Refer to
chromosomes with a preceding |
An array containing numeric indicators of the imputed genotypes. Multiple chromosomes are pasted together. The dimensions are individuals by positions by imputations.
Karl W Broman, [email protected]
pull.geno
, pull.genoprob
,
pull.argmaxgeno
, sim.geno
data(listeria)
listeria <- sim.geno(listeria, step=5, stepwidth="max", n.draws=8)
dr <- pull.draws(listeria, chr=c(5,13))
dr[1:20,1:10,1]
Pull out the genotype data from a cross object, as a single big matrix.
pull.geno(cross, chr)
cross |
An object of class |
chr |
Optional vector indicating the chromosomes to consider.
This should be a vector of character strings referring to chromosomes
by name; numeric values are converted to strings. Refer to
chromosomes with a preceding |
A matrix of size n.ind x tot.mar. The raw genotype data in the input cross object, with the chromosomes pasted together.
Karl W Broman, [email protected]
pull.pheno
, pull.map,
pull.draws
, pull.genoprob
,
pull.argmaxgeno
data(listeria)
dat <- pull.geno(listeria)

# image of the genotype data
image(1:ncol(dat),1:nrow(dat),t(dat),ylab="Individuals",xlab="Markers",
      col=c("red","yellow","blue","green","violet"))
abline(v=cumsum(c(0,nmar(listeria)))+0.5)
abline(h=nrow(dat)+0.5)
Pull out the results of calc.genoprob from a cross as a matrix.
pull.genoprob(cross, chr, omit.first.prob=FALSE, include.pos.info=FALSE, rotate=FALSE)
cross |
An object of class |
chr |
Optional vector indicating the chromosomes to consider.
This should be a vector of character strings referring to chromosomes
by name; numeric values are converted to strings. Refer to
chromosomes with a preceding |
omit.first.prob |
If TRUE, omit the probabilities for the first genotype at each position (since they sum to 1). |
include.pos.info |
If TRUE, include columns with marker name,
genotype, chromosome ID, and cM position. (If
|
rotate |
If TRUE, return matrix with individuals as columns and positions/genotypes as rows. If FALSE, rows correspond to individuals. |
A matrix containing genotype probabilities. Multiple chromosomes and the multiple genotypes at each position are pasted together.
Karl W Broman, [email protected]
pull.geno, pull.argmaxgeno, pull.draws, calc.genoprob
data(listeria)
listeria <- calc.genoprob(listeria, step=1, stepwidth="max")
pr <- pull.genoprob(listeria, chr=c(5,13), omit.first.prob=TRUE, include.pos.info=TRUE, rotate=TRUE)
pr[1:5,1:10]
Pull out the map portion of a cross object.
pull.map(cross, chr, as.table=FALSE)
cross |
An object of class |
chr |
Optional vector indicating the chromosomes to consider.
This should be a vector of character strings referring to chromosomes
by name; numeric values are converted to strings. Refer to
chromosomes with a preceding |
as.table |
If TRUE, return the genetic map as a table with
chromosome assignments and marker names. If FALSE, return the map as a
|
The genetic map: a list with each component containing the marker positions (in cM) for a chromosome. Each component has class A or X according to whether it is an autosome or the X chromosome. The components are either vectors of marker positions or, for a sex-specific map, 2-row matrices containing the female and male marker locations. The map itself is given class map.
Karl W Broman, [email protected]
replace.map, plotMap, map2table
data(fake.f2)
map <- pull.map(fake.f2)
plot(map)
Drop all but a selected set of markers from the data matrices and genetic maps.
pull.markers(cross, markers)
cross |
An object of class |
markers |
A character vector of marker names. |
The input object, with any markers not specified in the vector markers removed from the genotype data matrices, genetic maps, and, if applicable, any derived data (such as produced by calc.genoprob). (It might be a good idea to re-derive such things after using this function.)
Karl W Broman, [email protected]
drop.nullmarkers, drop.markers, geno.table, clean.cross
data(listeria)
listeria2 <- pull.markers(listeria, c("D10M44","D1M3","D1M75"))
Pull out selected phenotype data from a cross object, as a data frame or vector.
pull.pheno(cross, pheno.col)
cross |
An object of class |
pheno.col |
A vector specifying which phenotypes to keep or discard. This may be a logical vector, a numeric vector, or a vector of character strings (for the phenotype names). If missing, the entire set of phenotypes is output. |
A data.frame with columns specifying phenotypes and rows specifying individuals. If there is just one phenotype, a vector (rather than a data.frame) is returned.
Karl W Broman, [email protected]
data(listeria)
pull.pheno(listeria, "sex")
Pull out either the pairwise recombination fractions or the LOD scores, as calculated by est.rf, from a cross object.
pull.rf(cross, what=c("rf", "lod"), chr)
cross |
An object of class |
what |
Indicates whether to pull out a matrix of estimated recombination fractions or a matrix of LOD scores. |
chr |
Optional vector indicating the chromosomes to consider.
This should be a vector of character strings referring to chromosomes
by name; numeric values are converted to strings. Refer to
chromosomes with a preceding |
An object of class "rfmatrix", which is a matrix of either estimated recombination fractions between all marker pairs or of LOD scores (for the test of rf=1/2) for all marker pairs. The genetic map is included as an attribute.
Karl W Broman, [email protected]
data(fake.f2)
fake.f2 <- est.rf(fake.f2)
rf <- pull.rf(fake.f2)
lod <- pull.rf(fake.f2, "lod")
plot(rf[1,], lod[1,], xlab="rec frac", ylab="LOD score")
marker <- markernames(fake.f2, chr=5)[6]
par(mfrow=c(2,1))
plot(rf, marker, bandcol="gray70")
plot(lod, marker, bandcol="gray70")
Print the version number of the currently installed version of R/qtl.
qtlversion()
A character string with the version number of the currently installed version of R/qtl.
Karl W Broman, [email protected]
qtlversion()
Data for a QTL experiment is read from a set of files and converted into an object of class cross. The comma-delimited format (csv) is recommended. All formats require chromosome assignments for the genetic markers, and assume that markers are in their correct order.
read.cross(format=c("csv", "csvr", "csvs", "csvsr", "mm", "qtx", "qtlcart", "gary", "karl", "mapqtl", "tidy"), dir="", file, genfile, mapfile, phefile, chridfile, mnamesfile, pnamesfile, na.strings=c("-","NA"), genotypes=c("A","H","B","D","C"), alleles=c("A","B"), estimate.map=FALSE, convertXdata=TRUE, error.prob=0.0001, map.function=c("haldane", "kosambi", "c-f", "morgan"), BC.gen=0, F.gen=0, crosstype, ...)
format |
Specifies the format of the data file or files. Details on the various file formats are provided below. |
dir |
Directory in which the data files will be found. In
Windows, use forward slashes ( |
file |
The main input file for formats |
genfile |
File with genotype data (formats |
mapfile |
File with marker position information (all
except the |
phefile |
File with phenotype data (formats |
chridfile |
File with chromosome ID for each marker ( |
mnamesfile |
File with marker names ( |
pnamesfile |
File with phenotype names ( |
na.strings |
A vector of strings which are to be interpreted as
missing values ( |
genotypes |
A vector of character strings specifying the genotype
codes ( If you are trying to read 4-way cross data, your file must have
genotypes coded as described below, and you need to set
|
alleles |
A vector of two one-letter character strings (or four, for the four-way cross), to be used as labels for the two alleles. |
estimate.map |
For all formats but |
convertXdata |
If TRUE, any X chromosome genotype data is
converted to the internal standard, using columns |
error.prob |
In the case that the marker map must be estimated: Assumed genotyping error rate used in the calculation of the penetrance Pr(observed genotype | true genotype). |
map.function |
In the case that the marker map must be estimated: Indicates whether to use the Haldane, Kosambi, Carter-Falconer, or Morgan map function when converting genetic distances into recombination fractions. (Ignored if m > 0.) |
BC.gen |
Used only for cross type |
F.gen |
Used only for cross type |
crosstype |
Optional character string to force a particular cross type. |
... |
Additional arguments, passed to the function
|
The available formats are comma-delimited (csv), rotated comma-delimited (csvr), comma-delimited with separate files for genotype and phenotype data (csvs), rotated comma-delimited with separate files for genotype and phenotype data (csvsr), Mapmaker (mm), Map Manager QTX (qtx), QTL Cartographer (qtlcart), Gary Churchill's format (gary), Karl Broman's format (karl), MapQTL/JoinMap (mapqtl), and a set of three simple CSV files (tidy). The required files and their specifications for each format appear below. The comma-delimited formats are recommended. Note that most of these formats work only for backcross and intercross data.
The sampledata directory in the package distribution contains sample data files in multiple formats. Also see https://rqtl.org/sampledata/.
The ... argument enables additional arguments to be passed to the function read.table in the case of csv and csvr formats. In particular, one may use the argument sep to specify the field separator (the default is a comma), dec to specify the character used for the decimal point (the default is a period), and comment.char to specify a character to indicate comment lines.
An object of class cross, which is a list with two components:
geno |
This is a list with elements corresponding to
chromosomes. There are two components for each chromosome: The genotype data gets converted into numeric codes, as follows. The genotype data for a backcross is coded as NA = missing, 1 = AA, 2 = AB. For an F2 intercross, the coding is NA = missing, 1 = AA, 2 = AB, 3 = BB, 4 = not BB (i.e. AA or AB; D in Mapmaker/qtl), 5 = not AA (i.e. AB or BB; C in Mapmaker/qtl). For a 4-way cross, the mother and father are assumed to have genotypes AB and CD, respectively. The genotype data for the progeny is assumed to be phase-known, with the following coding scheme: NA = missing, 1 = AC, 2 = BC, 3 = AD, 4 = BD, 5 = A = AC or AD, 6 = B = BC or BD, 7 = C = AC or BC, 8 = D = AD or BD, 9 = AC or BD, 10 = AD or BC, 11 = not AC, 12 = not BC, 13 = not AD, 14 = not BD. |
pheno |
data.frame of size ( |
While the data format is complicated, there are a number of functions, such as subset.cross, to assist in pulling out portions of the data.
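As a quick illustration of this structure, the sketch below loads the listeria example data shipped with the package and tabulates the numeric genotype codes. It assumes the qtl package is installed; the variable name g is chosen only for illustration.

```r
library(qtl)

# Example intercross data distributed with R/qtl
data(listeria)

# Top-level components of the cross object
names(listeria)

# Cross type, followed by the general "cross" class
class(listeria)

# Genotypes for all chromosomes pasted together: individuals x markers
g <- pull.geno(listeria)
dim(g)

# Counts of each numeric genotype code, including missing values (NA)
table(g, useNA = "ifany")
```

The table makes it easy to spot unexpected codes before running an analysis.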
The genotypes for the X chromosome require special care!
The X chromosome should be given chromosome identifier X or x. If it is labeled by a number or by Xchr, it will be interpreted as an autosome.
The phenotype data should contain a column named "sex" which indicates the sex of each individual, either coded as 0=female and 1=male, or as a factor with levels female/male or f/m. Case will be ignored both in the name and in the factor levels. If no such phenotype column is included, it will be assumed that all individuals are of the same sex.
In the case of an intercross, the phenotype data may also contain a column named "pgm" (for "paternal grandmother") indicating the direction of the cross. It should be coded as 0/1 with 0 indicating the cross (AxB)x(AxB) or (BxA)x(AxB) and 1 indicating the cross (AxB)x(BxA) or (BxA)x(BxA). If no such phenotype column is included, it will be assumed that all individuals come from the same direction of cross.
The internal storage of X chromosome data is quite different from that of autosomal data. Males are coded 1=AA and 2=BB; females with pgm==0 are coded 1=AA and 2=AB; and females with pgm==1 are coded 1=BB and 2=AB. If the argument convertXdata is TRUE, conversion to this format is made automatically; if FALSE, no conversion is done, summary.cross will likely return a warning, and most analyses will not work properly.
Use of convertXdata=FALSE (in which case the X chromosome genotypes will not be converted to our internal standard) can be useful for diagnosing problems in the data, but will require some serious mucking about in the internal data structure.
For the csv format, the input file is a comma-delimited text file. A different field separator may be specified via the argument sep (which will be passed to the function read.table). For example, in Europe, it is common to use a comma in place of the decimal point in numbers and so a semi-colon in place of a comma as the field separator; such data may be read by using sep=";" and dec=",".
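A minimal sketch of reading such a semi-colon-delimited file, assuming the qtl package is installed. The file name and all data values are made up for illustration; the comments describe the purpose of each line in the file.

```r
library(qtl)

# Hypothetical intercross data with ";" as the field separator and
# "," as the decimal point
eu_lines <- c(
  "pheno;m1;m2;m3;m4",   # phenotype name, then marker names
  ";1;1;2;2",            # blank for the phenotype, then chromosome IDs
  ";0;10,5;0;15",        # blank for the phenotype, then cM positions
  "10,2;A;H;B;A",        # one line per individual from here on
  "11,5;H;H;A;A",
  "9,8;B;H;H;-",
  "12,1;A;A;B;H",
  "10,9;H;B;H;H",
  "11,0;B;B;A;B"
)
writeLines(eu_lines, file.path(tempdir(), "tiny_cross_eu.csv"))

# sep and dec are passed along through the ... argument
cross_eu <- read.cross("csv", dir = tempdir(), file = "tiny_cross_eu.csv",
                       sep = ";", dec = ",")

# The phenotype should have been read as a number
cross_eu$pheno$pheno
```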
The first line should contain the phenotype names followed by the marker names. At least one phenotype must be included; for example, include a numerical index for each individual.
The second line should contain blanks in the phenotype columns, followed by chromosome identifiers for each marker in all other columns. If a chromosome has the identifier X or x, it is assumed to be the X chromosome; otherwise, it is assumed to be an autosome.
An optional third line should contain blanks in the phenotype columns, followed by marker positions, in cM.
Marker order is taken from the cM positions, if provided; otherwise, it is taken from the column order.
Subsequent lines should give the data, with one line for each individual, and with phenotypes followed by genotypes. If possible, phenotypes are made numeric; otherwise they are converted to factors.
The genotype codes must be the same across all markers. For example, you can't have one marker coded AA/AB/BB and another coded A/H/B. This includes genotypes for the X chromosome, for which hemizygous individuals should be coded as if they were homozygous.
The cross is determined to be a backcross if only the first two elements of the genotypes string are found; otherwise, it is assumed to be an intercross.
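The layout described above can be sketched by writing a tiny file and reading it back. This assumes the qtl package is installed; the file name and all data values are made up for illustration.

```r
library(qtl)

# A tiny, made-up intercross: one phenotype, four markers on two autosomes
csv_lines <- c(
  "pheno,m1,m2,m3,m4",   # line 1: phenotype names, then marker names
  ",1,1,2,2",            # line 2: blank for phenotypes, then chromosome IDs
  ",0,10,0,15",          # line 3 (optional): marker positions in cM
  "10.2,A,H,B,A",        # subsequent lines: one per individual
  "11.5,H,H,A,A",
  "9.8,B,H,H,-",         # "-" is one of the default missing-value codes
  "12.1,A,A,B,H",
  "10.9,H,B,H,H",
  "11.0,B,B,A,B"
)
writeLines(csv_lines, file.path(tempdir(), "tiny_cross.csv"))

cross <- read.cross("csv", dir = tempdir(), file = "tiny_cross.csv",
                    genotypes = c("A", "H", "B", "D", "C"))

nind(cross)    # number of individuals
nmar(cross)    # number of markers on each chromosome
class(cross)   # cross type, followed by "cross"
```

Because the genotype code B appears in the data, the cross is taken to be an intercross rather than a backcross.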
The csvr format is just like the csv format, but rotated (or really transposed), so that rows are columns and columns are rows.
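To make the transposition concrete, the sketch below builds the lines of a small csv-format file, transposes the cells with base R, and reads the result in the csvr format. It assumes the qtl package is installed; the file name and data values are made up for illustration.

```r
library(qtl)

# Made-up data in csv layout: names, chromosome IDs, cM positions, individuals
csv_lines <- c(
  "pheno,m1,m2,m3,m4",
  ",1,1,2,2",
  ",0,10,0,15",
  "10.2,A,H,B,A",
  "11.5,H,H,A,A",
  "9.8,B,H,H,-",
  "12.1,A,A,B,H",
  "10.9,H,B,H,H",
  "11.0,B,B,A,B"
)

# Split each line into cells, stack into a matrix, and transpose it
cells <- do.call(rbind, strsplit(csv_lines, ",", fixed = TRUE))
rotated <- t(cells)

csvr_file <- file.path(tempdir(), "tiny_cross_rot.csv")
write.table(rotated, csvr_file, sep = ",", quote = FALSE,
            row.names = FALSE, col.names = FALSE)

# Each row is now a phenotype or a marker; individuals are columns
readLines(csvr_file)

cross_r <- read.cross("csvr", dir = tempdir(), file = "tiny_cross_rot.csv")
nind(cross_r)
nmar(cross_r)
```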
The csvs format is like the csv format, but with separate files for the genotype and phenotype data.
The first column in the genotype data must specify individuals' identifiers, and there must be a column in the phenotype data with precisely the same information (and with the same name). These IDs will be included in the data as a phenotype. If the name id or ID is used, these identifiers will be used in top.errorlod, plotErrorlod, and plotGeno as identifiers for the individual.
The first row in each file contains the column names. For the phenotype file, these are the names of the phenotypes. For the genotype file, the first cell will be the name of the identifier column (id or ID) and the subsequent fields will be the marker names.
In the genotype data file, the second row gives the chromosome IDs. The cell in the second row, first column, must be blank. A third row giving cM positions of markers may be included, in which case the cell in the third row, first column, must be blank.
There need be no blank rows in the phenotype data file.
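A minimal sketch of a genotype file and a phenotype file laid out as described, assuming the qtl package is installed. The file names and all data values are made up for illustration.

```r
library(qtl)

# Genotype file: id column, then markers; rows 2 and 3 give the
# chromosome IDs and cM positions, with a blank first cell
gen_lines <- c(
  "id,m1,m2,m3,m4",
  ",1,1,2,2",
  ",0,10,0,15",
  "1,A,H,B,A",
  "2,H,H,A,A",
  "3,B,H,H,-",
  "4,A,A,B,H",
  "5,H,B,H,H",
  "6,B,B,A,B"
)

# Phenotype file: an id column with the same name and values, no blank rows
phe_lines <- c(
  "id,pheno",
  "1,10.2",
  "2,11.5",
  "3,9.8",
  "4,12.1",
  "5,10.9",
  "6,11.0"
)

writeLines(gen_lines, file.path(tempdir(), "tiny_gen.csv"))
writeLines(phe_lines, file.path(tempdir(), "tiny_phe.csv"))

cross_s <- read.cross("csvs", dir = tempdir(),
                      genfile = "tiny_gen.csv", phefile = "tiny_phe.csv")

nind(cross_s)
names(cross_s$pheno)   # the identifiers are kept as a phenotype
```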
The csvsr format is just like the csvs format, but with each file rotated (or really transposed), so that rows are columns and columns are rows.
The Mapmaker format (mm) requires two files. The so-called rawfile, specified by the argument file, contains the genotype and phenotype data. Rows beginning with the symbol # are ignored. The first line should be either data type f2 intercross or data type f2 backcross. The second line should begin with three numbers indicating the numbers of individuals, markers and phenotypes in the file. This line may include the word symbols followed by symbol assignments (see the documentation for mapmaker, and cross your fingers). The rest of the lines give genotype data followed by phenotype data, with marker and phenotype names always beginning with the * symbol.
A second file contains the genetic map information, specified with the argument mapfile. The map file may be in one of two formats. The function will determine which format of map file is presented.
The simplest format for the map file is not standard for the Mapmaker software, but is easy to create. The file contains two or three columns separated by white space and with no header row. The first column gives the chromosome assignments. The second column gives the marker names, with markers listed in the order along the chromosomes. An optional third column lists the map positions of the markers.
Another possible format for the map file is the .maps format, which is produced by Mapmaker. The code for reading this format was written by Brian Yandell.
Marker order is taken from the map file, either by the order they are presented or by the cM positions, if specified.
The Map Manager QTX format (qtx) requires a single file (that produced by the Map Manager QTX program).
The QTL Cartographer format (qtlcart) requires two files: the .cro and .map files for QTL Cartographer (produced by the QTL Cartographer sub-programs Rmap and Rcross).
Note that the QTL Cartographer cross types are converted as follows: RF1 to riself, RF2 to risib, RF0 (doubled haploids) to bc, B1 or B2 to bc, RF2 or SF2 to f2.
The tidy format requires three simple CSV files, separating the genotype, phenotype, and marker map information so that each file may be of a simple form.
The gary format requires six files. All files have default names, and so the file names need not be specified if the default names are used.
genfile (default = "geno.dat") contains the genotype data. The file contains one line per individual, with genotypes for the set of markers separated by white space. Missing values are coded as 9, and genotypes are coded as 0/1/2 for AA/AB/BB.
mapfile (default = "markerpos.txt") contains two columns with no header row: the marker names in the first column and their cM position in the second column. If marker positions are not available, use mapfile=NULL, and a dummy map will be inserted.
phefile (default = "pheno.dat") contains the phenotype data, with one row for each mouse and one column for each phenotype. There should be no header row, and missing values are coded as "-".
chridfile (default = "chrid.dat") contains the chromosome identifier for each marker.
mnamesfile (default = "mnames.txt") contains the marker names.
pnamesfile (default = "pnames.txt") contains the names of the phenotypes. If a phenotype names file is not available, use pnamesfile=NULL; arbitrary phenotype names will then be assigned.
The karl format requires three files; all files have default names, and so need not be specified if the default name is used.
genfile (default = "gen.txt") contains the genotype data. The file contains one line per individual, with genotypes separated by white space. Missing values are coded 0; genotypes are coded as 1/2/3/4/5 for AA/AB/BB/not BB/not AA.
mapfile (default = "map.txt") contains the map information, in the following complicated format:
n.chr
n.mar(1) rf(1,1) rf(1,2) ... rf(1,n.mar(1)-1)
mar.name(1,1)
mar.name(1,2)
...
mar.name(1,n.mar(1))
n.mar(2)
...
etc.
phefile (default = "phe.txt") contains a matrix of phenotypes, with one individual per line. The first line in the file should give the phenotype names.
The mapqtl format requires three files, described in the manual of the MapQTL program (same as JoinMap).
genfile corresponds to the loc file containing the genotype data. Each marker and its genotypes should be on a single line.
mapfile corresponds to the map file containing the linkage group assignment, marker names and their map positions.
phefile corresponds to the qua file containing the phenotypes.
For the moment, only 4-way crosses are supported (CP population type in MapQTL).
Karl W Broman, [email protected]; Brian S. Yandell; Aaron Wolen
Broman, K. W. and Sen, Ś. (2009) A guide to QTL mapping with R/qtl. Springer. https://rqtl.org/book/
subset.cross, summary.cross, plot.cross, c.cross, clean.cross, write.cross, sim.cross, read.table.
## Not run: 
# CSV format
dat1 <- read.cross("csv", dir="Mydata", file="mydata.csv")

# CSVS format
dat2 <- read.cross("csvs", dir="Mydata", genfile="mydata_gen.csv",
                   phefile="mydata_phe.csv")

# you can read files directly from the internet
datweb <- read.cross("csv", "https://rqtl.org/sampledata", "listeria.csv")

# Mapmaker format
dat3 <- read.cross("mm", dir="Mydata", file="mydata.raw", mapfile="mydata.map")

# Map Manager QTX format
dat4 <- read.cross("qtx", dir="Mydata", file="mydata.qtx")

# QTL Cartographer format
dat5 <- read.cross("qtlcart", dir="Mydata", file="qtlcart.cro",
                   mapfile="qtlcart.map")

# Gary format
dat6 <- read.cross("gary", dir="Mydata", genfile="geno.dat",
                   mapfile="markerpos.txt", phefile="pheno.dat",
                   chridfile="chrid.dat", mnamesfile="mnames.txt",
                   pnamesfile="pnames.txt")

# Karl format
dat7 <- read.cross("karl", dir="Mydata", genfile="gen.txt",
                   phefile="phe.txt", mapfile="map.txt")
## End(Not run)
Data for a set of 4- or 8-way recombinant inbred lines (RIL) is read from a pair of comma-delimited files and converted into an object of class cross. We require chromosome assignments for the genetic markers, and assume that markers are in their correct order.
readMWril(dir="", rilfile, founderfile, type=c("ri4self", "ri4sib", "ri8self", "ri8selfIRIP1", "ri8sib", "bgmagic16"), na.strings=c("-","NA"), rotate=FALSE, ...)
dir |
Directory in which the data files will be found. In
Windows, use forward slashes ( |
rilfile |
Comma-delimited file for the RIL, in the |
founderfile |
File with founder strains' genotypes, in the same
orientation as the |
type |
The type of RIL. |
na.strings |
A vector of strings which are to be interpreted as
missing values. For the
|
rotate |
If TRUE, the |
... |
Additional arguments, passed to the function
|
The rilfile should include a phenotype cross containing character strings of the form ABCDEFGH, indicating the cross used to generate each RIL. The genotypes should be coded as integers (e.g., 1 and 2).
The founder strains in the founderfile should be the strains A, B, C, ..., as indicated in the cross phenotype.
The default arrangement of the files is to have markers as columns and individuals/founders as rows. If rotate=TRUE, do the opposite: markers as rows and individuals/founders as columns.
An object of class cross; see the help file for read.cross for details.
An additional component crosses is included; this is a matrix indicating the crosses used to generate the RIL.
Karl W Broman, [email protected]
## Not run: 
ril <- readMWril("../Data", "ril_data.csv", "founder_geno.csv", "ri4self",
                 rotate=TRUE)
## End(Not run)
For high-density marker data, rather than run scanone at both the markers and at a set of pseudomarkers, we reduce to just a set of evenly-spaced pseudomarkers.
reduce2grid(cross)
cross |
An object of class |
Genotype probabilities (from calc.genoprob) and/or imputations (from sim.geno) are subset to a grid of pseudomarkers.
This is so that, in the case of high-density markers, we can do the genome scan calculations at a smaller set of points (on an evenly-spaced grid, but not at the markers) to save computation time.
You need to first have run calc.genoprob and/or sim.geno, and you must use stepwidth="fixed".
When plotting results with plot.scanone, use incl.markers=FALSE, as the output of scanone won't include information about the marker locations and so will plot tick marks only at the first marker on each chromosome.
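To see the effect of the reduction, the sketch below counts the positions at which genotype probabilities are stored on each chromosome before and after calling reduce2grid. It assumes the qtl package is installed and looks directly at the prob component stored by calc.genoprob; the names n_before and n_after are chosen only for illustration.

```r
library(qtl)

data(hyper)

# Genotype probabilities at the markers and on a fixed 2 cM grid
hyper <- calc.genoprob(hyper, step = 2, stepwidth = "fixed")

# Keep only the grid positions
hypersub <- reduce2grid(hyper)

# Number of positions with genotype probabilities, by chromosome
n_before <- sapply(hyper$geno, function(chr) dim(chr$prob)[2])
n_after  <- sapply(hypersub$geno, function(chr) dim(chr$prob)[2])

rbind(before = n_before, after = n_after)
```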
The input cross object with included genotype probabilities or imputations subset to an evenly-spaced grid.
Karl W Broman, [email protected]
calc.genoprob, sim.geno, scanone, plot.scanone
data(hyper)
hyper <- calc.genoprob(hyper, step=2)
hypersub <- reduce2grid(hyper)
## Not run: 
out <- scanone(hypersub)
plot(out, incl.markers=FALSE)
## End(Not run)
Iteratively scan the positions for QTL in the context of a multiple QTL model, to try to identify the positions with maximum likelihood, for a fixed QTL model.
refineqtl(cross, pheno.col=1, qtl, chr, pos, qtl.name, covar=NULL, formula, method=c("imp","hk"), model=c("normal", "binary"), verbose=TRUE, maxit=10, incl.markers=TRUE, keeplodprofile=TRUE, tol=1e-4, maxit.fitqtl=1000, forceXcovar=FALSE)
cross |
An object of class |
pheno.col |
Column number in the phenotype matrix to be used as the phenotype. One may also give a character string matching the phenotype name. Finally, one may give a numeric vector of phenotypes, in which case it must have the length equal to the number of individuals in the cross, and there must be either non-integers or values < 1 or > no. phenotypes; this last case may be useful for studying transformations. |
qtl |
A QTL object, as produced by |
chr |
Vector indicating the chromosome for each QTL; if |
pos |
Vector indicating the positions for each QTL; if |
qtl.name |
Optional user-specified name for each QTL. If
|
covar |
A matrix or data.frame of covariates. These must be strictly numeric. |
formula |
An object of class |
method |
Indicates whether to use multiple imputation or Haley-Knott regression. |
model |
The phenotype model: the usual model or a model for binary traits |
verbose |
If TRUE, give feedback about progress. If
|
maxit |
Maximum number of iterations. |
incl.markers |
If FALSE, do calculations only at points on an evenly spaced grid. |
keeplodprofile |
If TRUE, keep the LOD profiles from the last iteration as attributes to the output. |
tol |
Tolerance for convergence for the binary trait model. |
maxit.fitqtl |
Maximum number of iterations for fitting the binary trait model. |
forceXcovar |
If TRUE, force inclusion of X-chr-related covariates (like sex and cross direction). |
QTL positions are optimized, within the context of a fixed QTL model, by a scheme described in Zeng et al. (1999). Each QTL is considered one at a time (in a random order), and a scan is performed, allowing the QTL to vary across its chromosome, keeping the positions of all other QTL fixed. If there is another QTL on the chromosome, the position of the floating QTL is scanned from the end of the chromosome to the position of the flanking QTL. If the floating QTL is between two QTL on a chromosome, its position is scanned between those two QTL positions. Each QTL is moved to the position giving the highest likelihood, and the entire process is repeated until no further improvement in likelihood can be obtained.
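The one-at-a-time search can be sketched in a few lines of base R. This is a toy illustration only, not the code used by refineqtl: the function refine_positions, the objective toy_loglik, and the grids are all hypothetical, and the restriction of a floating QTL to the interval between flanking QTL is not modelled.

```r
# Hypothetical stand-in for the log-likelihood of a two-QTL model,
# as a function of the two QTL positions
toy_loglik <- function(pos) {
  a <- pos[1] - 30
  b <- pos[2] - 12
  -(a^2 + b^2 + 0.5 * a * b)
}

# Move one position at a time to its best grid value, with the others
# held fixed, and repeat until a full pass gives no improvement
refine_positions <- function(start, grid, loglik, maxit = 10) {
  pos <- start
  best <- loglik(pos)
  n_iter <- 0
  for (it in seq_len(maxit)) {
    n_iter <- it
    improved <- FALSE
    for (j in sample(seq_along(pos))) {   # visit positions in random order
      vals <- vapply(grid[[j]], function(g) {
        trial <- pos
        trial[j] <- g
        loglik(trial)
      }, numeric(1))
      k <- which.max(vals)
      if (vals[k] > best) {
        pos[j] <- grid[[j]][k]
        best <- vals[k]
        improved <- TRUE
      }
    }
    if (!improved) break
  }
  list(pos = pos, loglik = best, iterations = n_iter)
}

set.seed(1)
grid <- list(seq(0, 60, by = 2.5), seq(0, 40, by = 2.5))
fit <- refine_positions(c(32.5, 17.5), grid, toy_loglik)
fit
```

Each accepted move increases the objective, so the search stops either at a point where no single position can be improved or when maxit passes have been made.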
One may provide either a qtl object (as produced by makeqtl), or vectors chr and pos (and, optionally, qtl.name) indicating the positions of the QTL.
If a qtl object is provided, QTL that do not appear in the model formula are ignored, but they remain part of the QTL object that is output.
An object of class qtl, with QTL placed in their new positions.
If keeplodprofile=TRUE, LOD profiles from the last pass through the refinement algorithm are retained as an attribute, "lodprofile", to the object. These may be plotted with plotLodProfile.
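A short sketch of retrieving and plotting the stored profiles, assuming the qtl package is installed. It uses the fake.bc example data with the default additive model; the name lodprof is chosen only for illustration.

```r
library(qtl)

data(fake.bc)
fake.bc <- calc.genoprob(fake.bc, step = 2)

# Starting positions for two QTL
qtl <- makeqtl(fake.bc, chr = c(2, 5), pos = c(32.5, 17.5), what = "prob")

# Refine the positions, keeping the LOD profiles (the default)
rqtl <- refineqtl(fake.bc, qtl = qtl, method = "hk",
                  keeplodprofile = TRUE, verbose = FALSE)

# The profiles are stored as an attribute of the returned QTL object
lodprof <- attr(rqtl, "lodprofile")
length(lodprof)

# Plot the profiles
plotLodProfile(rqtl)
```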
Karl W Broman, [email protected]
Zeng, Z.-B., Kao, C.-H., and Basten, C. J. (1999) Estimating the genetic architecture of quantitative traits. Genet. Res. 74, 279–289.
Haley, C. S. and Knott, S. A. (1992) A simple regression method for mapping quantitative trait loci in line crosses using flanking markers. Heredity 69, 315–324.
Sen, Ś. and Churchill, G. A. (2001) A statistical framework for quantitative trait mapping. Genetics 159, 371–387.
fitqtl, makeqtl, scanqtl, addtoqtl, dropfromqtl, replaceqtl, plotLodProfile
data(fake.bc)
fake.bc <- calc.genoprob(fake.bc, step=2)
qtl <- makeqtl(fake.bc, chr=c(2,5), pos=c(32.5, 17.5), what="prob")
rqtl <- refineqtl(fake.bc, qtl=qtl, method="hk")
This function changes the order of the QTL in a QTL object.
reorderqtl(qtl, neworder)
qtl |
A qtl object, as created by |
neworder |
A vector containing the positive integers up to the
number of QTL in |
Everything in the input qtl is reordered except the altname component, which contains names of the form Q1, Q2, etc.
The input qtl object, with the loci reordered.
Karl W Broman, [email protected]
makeqtl, fitqtl, dropfromqtl, addtoqtl, replaceqtl
data(fake.f2)

# take out several QTLs and make QTL object
qc <- c(1, 6, 13)
qp <- c(25.8, 33.6, 18.63)
fake.f2 <- subset(fake.f2, chr=qc)
fake.f2 <- calc.genoprob(fake.f2)
qtl <- makeqtl(fake.f2, qc, qp, what="prob")
qtl <- reorderqtl(qtl, c(2,3,1))
qtl
qtl <- reorderqtl(qtl)
qtl
Replace the map portion of a cross object.
replace.map(cross, map)

## S3 method for class 'cross'
replacemap(object, map)
cross |
An object of class |
object |
Same as |
map |
A list containing the new genetic map. This must be the
same length and with the same marker names as that contained in
|
The input cross object with the genetic map replaced by the input map. Maps for results from calc.genoprob, sim.geno and argmax.geno are also replaced, using interpolation if necessary.
Karl W Broman, [email protected]
data(fake.f2)
newmap <- est.map(fake.f2)
plotMap(fake.f2, newmap)
fake.f2 <- replace.map(fake.f2, newmap)
Replace the positions of LOD scores in output from scanone with values based on an alternative map (such as a physical map), with pseudomarker locations determined by linear interpolation.
## S3 method for class 'scanone'
replacemap(object, map)
object |
An object of class |
map |
A list containing the alternative genetic map. All
chromosomes in |
The positions of pseudomarkers are determined by linear interpolation between markers. In the case of pseudomarkers beyond the ends of the terminal markers on chromosomes, we use the overall lengths of the chromosome in object and map to determine the new spacing.
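The interpolation between markers can be sketched with the base R function approx. This is an illustration with made-up positions rather than the package's internal code, and it does not cover pseudomarkers beyond the terminal markers.

```r
# Marker positions on one chromosome in the original map (cM)
old_markers <- c(0, 10, 30)

# The same markers in the alternative map (for example, a physical map)
new_markers <- c(0, 20, 40)

# Positions in the original map: the markers plus pseudomarkers at 5 and 20
old_pos <- c(0, 5, 10, 20, 30)

# Linear interpolation between the flanking markers
new_pos <- approx(x = old_markers, y = new_markers, xout = old_pos)$y
new_pos
```

The pseudomarker at 5 cM lies halfway between the first two markers and so moves to 10 in the new map; the one at 20 cM lies halfway between the last two markers and moves to 30.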
The input object with the positions of LOD scores revised to match those in the input map.
Karl W Broman, [email protected]
replacemap.cross, est.map, replacemap.scantwo
data(fake.f2)

origmap <- pull.map(fake.f2)
newmap <- est.map(fake.f2)
fake.f2 <- replacemap(fake.f2, newmap)
fake.f2 <- calc.genoprob(fake.f2, step=2.5)
out <- scanone(fake.f2, method="hk")
out.rev <- replacemap(out, origmap)
Replace the positions of LOD scores in output from scantwo with values based on an alternative map (such as a physical map), with pseudomarker locations determined by linear interpolation.
## S3 method for class 'scantwo'
replacemap(object, map)
object |
An object of class |
map |
A list containing the alternative genetic map. All
chromosomes in |
The positions of pseudomarkers are determined by linear interpolation between markers. In the case of pseudomarkers beyond the ends of the terminal markers on chromosomes, we use the overall lengths of the chromosome in object and map to determine the new spacing.
The input object with the positions of LOD scores revised to match those in the input map.
Karl W Broman, [email protected]
replacemap.cross, est.map, replacemap.scanone
data(hyper)

origmap <- pull.map(hyper)
newmap <- est.map(hyper)
hyper <- replacemap(hyper, newmap)
hyper <- calc.genoprob(hyper, step=0)
out <- scantwo(hyper, method="hk")
out.rev <- replacemap(out, origmap)
This function replaces a QTL or QTLs in a qtl object with a different position.
replaceqtl(cross, qtl, index, chr, pos, qtl.name, drop.lod.profile=TRUE)
cross |
An object of class |
qtl |
A qtl object, as created by |
index |
Numeric index indicating the QTL to be replaced. |
chr |
Vector (of same length as |
pos |
Vector (of same length as |
qtl.name |
Optional vector (of same length as |
drop.lod.profile |
If TRUE, remove any LOD profiles from the object. |
The input qtl object, but with some QTL replaced by new ones. See makeqtl for details on the format.
Karl W Broman, [email protected]
makeqtl, fitqtl, dropfromqtl, addtoqtl, reorderqtl
data(fake.f2)

# take out several QTLs and make QTL object
qc <- c(1, 6, 13)
qp <- c(25.8, 33.6, 18.63)
fake.f2 <- subset(fake.f2, chr=qc)

fake.f2 <- calc.genoprob(fake.f2, step=2, err=0.001)
qtl <- makeqtl(fake.f2, qc, qp, what="prob")

qtl <- replaceqtl(fake.f2, qtl, 2, 6, 48.1)
Rescale a genetic map by multiplying all positions by a constant.
rescalemap(object, scale=1e-6)
object |
An object of class |
scale |
Scale factor by which all positions will be multiplied. |
This function is included particularly for the case that map positions in a cross object were provided in basepairs and one wishes to quickly convert them to Mbp or some other approximation of cM distances. (In the mouse, 1 cM is approximately 2 Mbp, so one might use scale=5e-7 in this function.)
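The arithmetic behind these two scale factors can be checked with plain vectors. The basepair positions below are made up for illustration; only the factors 1e-6 (the default, basepairs to Mbp) and 5e-7 (the mouse rule of thumb quoted above) come from the text.

```r
# Made-up marker positions in basepairs on one chromosome
pos_bp <- c(3.2e6, 18.5e6, 41.0e6, 77.9e6)

pos_Mbp <- pos_bp * 1e-6   # default scale: basepairs -> Mbp
pos_cM  <- pos_bp * 5e-7   # rough cM, assuming 1 cM is about 2 Mbp

print(cbind(pos_bp, pos_Mbp, pos_cM))
```

Because 1 cM is taken to be about 2 Mbp, the approximate cM positions are simply half the Mbp positions.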
If the input is a map object, a map object is returned; if the input is a cross object, a cross object is returned. In either case, the positions of markers are simply multiplied by scale.
Karl W Broman, [email protected]
data(hyper)
rescaled <- rescalemap(hyper, scale=2)
plotMap(hyper, rescaled)
Investigate different marker orders for a given chromosome, comparing all possible permutations of a sliding window of markers.
ripple(cross, chr, window=4, method=c("countxo","likelihood"),
       error.prob=0.0001, map.function=c("haldane","kosambi","c-f","morgan"),
       maxit=4000, tol=1e-6, sex.sp=TRUE, verbose=TRUE, n.cluster=1)
cross |
An object of class |
chr |
The chromosome to investigate. Only one chromosome is allowed. (This should be a character string referring to the chromosomes by name.) |
window |
Number of markers to include in the sliding window of permuted markers. Larger numbers result in the comparison of a greater number of marker orders, but will require a considerable increase in computation time. |
method |
Indicates whether to compare orders by counting the number of obligate crossovers, or by a likelihood analysis. |
error.prob |
Assumed genotyping error rate used in the calculation of the penetrance Pr(observed genotype | true genotype). |
map.function |
Indicates whether to use the Haldane, Kosambi, Carter-Falconer, or Morgan map function when converting genetic distances into recombination fractions. |
maxit |
Maximum number of EM iterations to perform. |
tol |
Tolerance for determining convergence. |
sex.sp |
Indicates whether to estimate sex-specific maps; this is used only for the 4-way cross. |
verbose |
If TRUE, information about the number of orders (and, if
|
n.cluster |
If the package |
For method="likelihood", calculations are done by first constructing a matrix of marker orders and then making repeated calls to the R function est.map. Of course, it would be faster to do everything within C, but this was a lot easier to code.
For method="countxo", calculations are done within C.
A matrix, given class "ripple"; the first set of columns are marker indices describing the order. In the case of method="countxo", the last column is the number of obligate crossovers for each particular order. In the case of method="likelihood", the last two columns are LOD scores (log base 10 likelihood ratios) comparing each order to the initial order and the estimated chromosome length for the given order. Positive LOD scores indicate that the alternate order has more support than the original.
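To see why larger windows quickly become expensive, the set of candidate orders produced by permuting a sliding window can be enumerated in base R. This is a sketch of the idea only, not how ripple builds its matrix of orders: the helpers perms and window_orders are hypothetical names, and the package may treat duplicate or reversed orders differently.

```r
# All permutations of a vector (simple recursive helper)
perms <- function(v) {
  if (length(v) <= 1) return(matrix(v, nrow = 1))
  do.call(rbind, lapply(seq_along(v), function(i)
    cbind(v[i], perms(v[-i]))))
}

# Hypothetical helper: distinct marker orders obtained by permuting
# every window of `window` adjacent markers among n.mar markers
window_orders <- function(n.mar, window) {
  stopifnot(window >= 2, window <= n.mar)
  orders <- list()
  for (start in seq_len(n.mar - window + 1)) {
    idx <- start:(start + window - 1)
    p <- perms(idx)
    for (i in seq_len(nrow(p))) {
      ord <- seq_len(n.mar)
      ord[idx] <- p[i, ]           # permute only the markers in the window
      orders[[length(orders) + 1]] <- ord
    }
  }
  unique(do.call(rbind, orders))   # drop orders reached from several windows
}

n_orders <- sapply(2:5, function(w) nrow(window_orders(8, w)))
names(n_orders) <- paste0("window=", 2:5)
print(n_orders)
```

Before duplicates are removed, each window position contributes window! orders, which is the source of the rapid growth in computation time.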
Karl W Broman, [email protected]
summary.ripple, switch.order, est.map, est.rf
data(badorder)
rip1 <- ripple(badorder, chr=1, window=3)
summary(rip1)

## Not run: 
rip2 <- ripple(badorder, chr=1, window=2, method="likelihood")
summary(rip2)
## End(Not run)

badorder <- switch.order(badorder, 1, rip1[2,])
Genome scan with a single QTL model, with possible allowance for covariates, using any of several possible models for the phenotype and any of several possible numerical methods.
scanone(cross, chr, pheno.col=1, model=c("normal","binary","2part","np"),
        method=c("em","imp","hk","ehk","mr","mr-imp","mr-argmax"),
        addcovar=NULL, intcovar=NULL, weights=NULL,
        use=c("all.obs", "complete.obs"), upper=FALSE,
        ties.random=FALSE, start=NULL, maxit=4000, tol=1e-4,
        n.perm, perm.Xsp=FALSE, perm.strata=NULL, verbose,
        batchsize=250, n.cluster=1, ind.noqtl)
cross |
An object of class |
chr |
Optional vector indicating the chromosomes for which LOD
scores should be calculated. This should be a vector of character
strings referring to chromosomes by name; numeric values are
converted to strings. Refer to chromosomes with a preceding |
pheno.col |
Column number in the phenotype matrix which should be
used as the phenotype. This can be a vector of integers; for methods
|
model |
The phenotype model: the usual normal model, a model for binary traits, a two-part model or non-parametric analysis |
method |
Indicates whether to use the EM algorithm,
imputation, Haley-Knott regression, the extended Haley-Knott method,
or marker regression. Not all methods are available for all models.
Marker regression is performed either by dropping individuals with
missing genotypes ( |
addcovar |
Additive covariates; allowed only for the normal and binary models. |
intcovar |
Interactive covariates (interact with QTL genotype); allowed only for the normal and binary models. |
weights |
Optional weights of individuals. Should be either NULL
or a vector of length n.ind containing positive weights. Used only
in the case |
use |
In the case that multiple phenotypes are selected to be scanned, this argument indicates whether to use all individuals, including those missing some phenotypes, or just those individuals that have data on all selected phenotypes. |
upper |
Used only for the two-part model; if true, the "undefined" phenotype is the maximum observed phenotype; otherwise, it is the smallest observed phenotype. |
ties.random |
Used only for the non-parametric "model"; if TRUE, ties in the phenotypes are ranked at random. If FALSE, average ranks are used and a corrected LOD score is calculated. |
start |
Used only for the EM algorithm with the normal model and
no covariates. If |
maxit |
Maximum number of iterations for methods |
tol |
Tolerance value for determining convergence for methods
|
n.perm |
If specified, a permutation test is performed rather than an analysis of the observed data. This argument defines the number of permutation replicates. |
perm.Xsp |
If |
perm.strata |
If |
verbose |
In the case |
batchsize |
The number of phenotypes (or permutations) to be run
as a batch; used only for methods |
n.cluster |
If the package |
ind.noqtl |
Indicates individuals who should not be allowed a QTL effect (used rarely, if at all); this is a logical vector of same length as there are individuals in the cross. |
Use of the EM algorithm, Haley-Knott regression, and the extended Haley-Knott method require that multipoint genotype probabilities are first calculated using calc.genoprob. The imputation method uses the results of sim.geno.
Individuals with missing phenotypes are dropped.
In the case that n.perm>0, so that a permutation test is performed, the R function scanone is called repeatedly.
If perm.Xsp=TRUE, separate permutations are performed for the autosomes and the X chromosome, so that an X-chromosome-specific threshold may be calculated. In this case, n.perm specifies the number of permutations used for the autosomes; for the X chromosome, n.perm $\times \, L_A/L_X$ permutations will be run, where $L_A$ and $L_X$ are the total genetic lengths of the autosomes and X chromosome, respectively. More permutations are needed for the X chromosome in order to obtain thresholds of similar accuracy.
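The logic of a permutation test (shuffle the phenotypes, repeat the scan, record the genome-wide maximum LOD) can be sketched in base R without the package. Everything here is an assumption made for illustration: the simulated genotypes are independent rather than linked, the scan is a bare marker regression, and the helper names lod_one and scan_markers are hypothetical. It is not how scanone implements its permutations.

```r
set.seed(1)
n <- 100; n.mar <- 20
# Made-up backcross-like genotypes (1 or 2) at independent markers
geno  <- matrix(sample(1:2, n * n.mar, replace = TRUE), nrow = n)
pheno <- rnorm(n) + 0.5 * (geno[, 5] == 2)   # marker 5 has a simulated effect

# LOD at one marker by marker regression: (n/2) * log10(RSS0 / RSS1)
lod_one <- function(y, g) {
  rss0 <- sum((y - mean(y))^2)       # null: single grand mean
  rss1 <- sum((y - ave(y, g))^2)     # alternative: one mean per genotype
  (length(y) / 2) * log10(rss0 / rss1)
}
scan_markers <- function(y, geno) apply(geno, 2, function(g) lod_one(y, g))

obs <- scan_markers(pheno, geno)

# Permutation test: shuffle phenotypes, rescan, keep the genome-wide maximum
n.perm  <- 200
max_lod <- replicate(n.perm, max(scan_markers(sample(pheno), geno)))
thr <- quantile(max_lod, 0.95)       # estimated 5% genome-wide threshold
print(thr)
print(which(obs > thr))
```

With perm.Xsp=TRUE the same idea is applied separately to the autosomes and to the X chromosome, giving two sets of maxima and two thresholds.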
For further details on the models, the methods and the use of covariates, see below.
If n.perm is missing, the function returns a data.frame whose first two columns contain the chromosome IDs and cM positions. Subsequent columns contain the LOD scores for each phenotype. In the case of the two-part model, there are three LOD score columns for each phenotype: LOD($p,\mu$), LOD($p$) and LOD($\mu$). The result is given class "scanone" and has attributes "model", "method", and "type" (the latter is the type of cross analyzed).
If n.perm is specified, the function returns the results of a permutation test and the output has class "scanoneperm". If perm.Xsp=FALSE, the function returns a matrix with n.perm rows, each row containing the genome-wide maximum LOD score for each of the phenotypes. In the case of the two-part model, there are three columns for each phenotype, corresponding to the three different LOD scores. If perm.Xsp=TRUE, the result contains separate permutation results for the autosomes and the X chromosome respectively, and an attribute indicates the lengths of the chromosomes and an indicator of which chromosome is X.
The normal model is the standard model for QTL mapping (see Lander and Botstein 1989). The residual phenotypic variation is assumed to follow a normal distribution, and analysis is analogous to analysis of variance.
The binary model is for the case of a binary phenotype, which must have values 0 and 1. The proportions of 1's in the different genotype groups are compared. Currently only methods em, hk, and mr are available for this model. See Xu and Atchley (1996) and Broman (2003).
The two-part model is appropriate for the case of a spike in the phenotype distribution (for example, metastatic density when many individuals show no metastasis, or survival time following an infection when individuals may recover from the infection and fail to die). The two-part model was described by Boyartchuk et al. (2001) and Broman (2003). Individuals with QTL genotype $g$ have probability $p_g$ of having an undefined phenotype (the spike), while if their phenotype is defined, it comes from a normal distribution with mean $\mu_g$ and common standard deviation $\sigma$. Three LOD scores are calculated: LOD($p,\mu$) is for the test of the hypothesis that $p_g = p$ and $\mu_g = \mu$. LOD($p$) is for the test that $p_g = p$ while the $\mu_g$ may vary. LOD($\mu$) is for the test that $\mu_g = \mu$ while the $p_g$ may vary.
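A minimal base-R sketch of the three LOD scores at a single, fully genotyped position follows. It rests on several assumptions that are not from the package: the data are simulated, NA is used to mark the undefined phenotype, the helper loglik_binom is a hypothetical name, and genotypes are treated as known. With known genotypes the likelihood factors into a binary part and a normal part, so the joint LOD is the sum of the other two; between markers scanone works with genotype probabilities and this identity need not hold exactly.

```r
set.seed(2)
n <- 200
g <- sample(1:2, n, replace = TRUE)                  # made-up QTL genotypes
undefined <- rbinom(n, 1, ifelse(g == 1, 0.2, 0.5)) == 1
y <- rnorm(n, mean = ifelse(g == 1, 10, 11))
y[undefined] <- NA                                   # NA marks the spike

# log10 likelihood of a set of Bernoulli outcomes at the MLE of p
loglik_binom <- function(z) {                        # z logical, TRUE = undefined
  k <- sum(z); m <- length(z); p <- k / m
  ll <- 0                                            # treat 0 * log(0) as 0
  if (k > 0) ll <- ll + k * log10(p)
  if (k < m) ll <- ll + (m - k) * log10(1 - p)
  ll
}

# LOD(p): genotype-specific p_g versus a common p
lod_p <- sum(tapply(undefined, g, loglik_binom)) - loglik_binom(undefined)

# LOD(mu): genotype-specific means versus a common mean, common sigma,
# using only individuals with a defined phenotype
yd <- y[!undefined]; gd <- g[!undefined]
rss0 <- sum((yd - mean(yd))^2)
rss1 <- sum((yd - ave(yd, gd))^2)
lod_mu <- (length(yd) / 2) * log10(rss0 / rss1)

# LOD(p, mu): both parts tested jointly (sum, under the factorization above)
lod_pmu <- lod_p + lod_mu
print(c(lod.p.mu = lod_pmu, lod.p = lod_p, lod.mu = lod_mu))
```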
With the non-parametric "model", an extension of the Kruskal-Wallis test is used; this is similar to the method described by Kruglyak and Lander (1995). In the case of incomplete genotype information (such as at locations between genetic markers), the Kruskal-Wallis statistic is modified so that the rank for each individual is weighted by the genotype probabilities, analogous to Haley-Knott regression. For this method, if the argument ties.random is TRUE, ties in the phenotypes are assigned random ranks; if it is FALSE, average ranks are used and a corrected LOD score is calculated. Currently the method argument is ignored for this model.
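At a fully genotyped marker the non-parametric analysis reduces to an ordinary Kruskal-Wallis test, which base R provides as kruskal.test. The sketch below uses simulated data and assumes the usual conversion of a chi-square-scale statistic to the LOD scale (division by 2 ln 10); the probability-weighted extension used between markers is not shown.

```r
set.seed(3)
g <- sample(1:3, 150, replace = TRUE)      # made-up F2-like genotypes
y <- rexp(150) + 0.4 * (g == 3)            # skewed, made-up phenotype

# Kruskal-Wallis statistic H (chi-square scale)
H <- unname(kruskal.test(y, factor(g))$statistic)

# Assumed conversion to the LOD scale
lod_np <- H / (2 * log(10))
print(c(H = H, LOD = lod_np))
```

Note that kruskal.test applies a tie correction when tied phenotypes are present, which corresponds in spirit to the average-rank option rather than to random tie-breaking.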
em: maximum likelihood is performed via the EM algorithm (Dempster et al. 1977), first used in this context by Lander and Botstein (1989).
imp: multiple imputation is used, as described by Sen and Churchill (2001).
hk: Haley-Knott regression is used (regression of the phenotypes on the multipoint QTL genotype probabilities), as described by Haley and Knott (1992).
ehk: the extended Haley-Knott method is used (like H-K, but taking account of the variances), as described in Feenstra et al. (2006).
mr: Marker regression is used. Analysis is performed only at the genetic markers, and individuals with missing genotypes are discarded. See Soller et al. (1976).
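The idea behind Haley-Knott regression, regressing the phenotype on genotype probabilities rather than on observed genotypes, can be shown at a single position in base R. The probabilities below are simulated stand-ins for what calc.genoprob would produce for a backcross; the variable names are made up, and this is not the package's implementation.

```r
set.seed(4)
n <- 120
true_g <- rbinom(n, 1, 0.5)                       # unobserved genotype (0/1)
# Made-up Pr(genotype 2 | marker data) at one position between markers
p2 <- ifelse(true_g == 1, runif(n, 0.6, 1), runif(n, 0, 0.4))
y  <- rnorm(n) + 0.8 * true_g                     # simulated phenotype

fit0 <- lm(y ~ 1)      # null: no QTL
fit1 <- lm(y ~ p2)     # H-K: regress phenotype on the genotype probability

# LOD on the usual scale for a normal model: (n/2) * log10(RSS0 / RSS1)
lod_hk <- (n / 2) * log10(sum(resid(fit0)^2) / sum(resid(fit1)^2))
print(lod_hk)
```

For an intercross there are three genotypes, so two probability columns would enter the regression instead of one.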
Covariates are allowed only for the normal and binary models. The normal model is $y = \beta_q + A \gamma + Z \delta_q + \epsilon$, where $q$ is the unknown QTL genotype, $A$ is a matrix of additive covariates, and $Z$ is a matrix of covariates that interact with the QTL genotype. The columns of $Z$ are forced to be contained in the matrix $A$. The binary model is the logistic regression analog.
The LOD score is calculated comparing the likelihood of the above model to that of the null model $y = \mu + A \gamma + \epsilon$.
Covariates must be numeric matrices. Individuals with any missing covariates are discarded.
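For a position where the QTL genotype is known, the comparison between a model with additive and interactive covariates and a null model with additive covariates only can be written with lm. The data are simulated and the variable names are made up; the full model is taken to be a genotype effect plus additive covariates plus genotype-by-covariate interactions, with the interactive covariate (sex) also appearing among the additive ones.

```r
set.seed(5)
n <- 150
q   <- factor(sample(c("AA", "AB"), n, replace = TRUE))  # genotype, assumed known
sex <- rbinom(n, 1, 0.5)
age <- rnorm(n, 10, 2)
y <- 1 + 0.5 * (q == "AB") + 0.3 * sex + 0.1 * age +
     0.6 * (q == "AB") * sex + rnorm(n)

# Additive covariates: sex and age. Interactive covariate: sex.
full <- lm(y ~ q + sex + age + q:sex)   # QTL + additive + QTL-by-sex terms
null <- lm(y ~ sex + age)               # additive covariates only, no QTL

# LOD comparing the two fits: (n/2) * log10(RSS_null / RSS_full)
lod <- (n / 2) * log10(sum(resid(null)^2) / sum(resid(full)^2))
print(lod)
```

Because the null model keeps the additive covariates, a covariate effect on its own does not inflate the LOD score; only the genotype terms do.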
The X chromosome must be treated specially in QTL mapping. See Broman et al. (2006).
If both males and females are included, male hemizygotes are allowed to be different from female homozygotes. Thus, in a backcross, we will fit separate means for the genotype classes AA, AB, AY, and BY. In such cases, sex differences in the phenotype could cause spurious linkage to the X chromosome, and so the null hypothesis must be changed to allow for a sex difference in the phenotype.
Numerous special cases must be considered, as detailed in the following table.
Cross | Direction | Sexes | Null | Alternative | df |
---|---|---|---|---|---|
BC | | both sexes | sex | AA/AB/AY/BY | 2 |
BC | | all female | grand mean | AA/AB | 1 |
BC | | all male | grand mean | AY/BY | 1 |
F2 | Both | both sexes | femaleF/femaleR/male | AA/ABf/ABr/BB/AY/BY | 3 |
F2 | Both | all female | pgm | AA/ABf/ABr/BB | 2 |
F2 | Both | all male | grand mean | AY/BY | 1 |
F2 | Forward | both sexes | sex | AA/AB/AY/BY | 2 |
F2 | Forward | all female | grand mean | AA/AB | 1 |
F2 | Forward | all male | grand mean | AY/BY | 1 |
F2 | Backward | both sexes | sex | AB/BB/AY/BY | 2 |
F2 | Backward | all female | grand mean | AB/BB | 1 |
F2 | Backward | all male | grand mean | AY/BY | 1 |
In the case that the number of degrees of freedom for the linkage test for the X chromosome is different from that for autosomes, a separate X-chromosome LOD threshold is recommended. Autosome- and X-chromosome-specific LOD thresholds may be estimated by permutation tests with scanone by setting n.perm>0 and using perm.Xsp=TRUE.
Karl W Broman, [email protected]; Hao Wu
Boyartchuk, V. L., Broman, K. W., Mosher, R. E., D'Orazio, S. E. F., Starnbach, M. N. and Dietrich, W. F. (2001) Multigenic control of Listeria monocytogenes susceptibility in mice. Nature Genetics 27, 259–260.
Broman, K. W. (2003) Mapping quantitative trait loci in the case of a spike in the phenotype distribution. Genetics 163, 1169–1175.
Broman, K. W., Sen, Ś., Owens, S. E., Manichaikul, A., Southard-Smith, E. M. and Churchill, G. A. (2006) The X chromosome in quantitative trait locus mapping. Genetics 174, 2151–2158.
Churchill, G. A. and Doerge, R. W. (1994) Empirical threshold values for quantitative trait mapping. Genetics 138, 963–971.
Dempster, A. P., Laird, N. M. and Rubin, D. B. (1977) Maximum likelihood from incomplete data via the EM algorithm. J. Roy. Statist. Soc. B, 39, 1–38.
Feenstra, B., Skovgaard, I. M. and Broman, K. W. (2006) Mapping quantitative trait loci by an extension of the Haley-Knott regression method using estimating equations. Genetics, 173, 2111–2119.
Haley, C. S. and Knott, S. A. (1992) A simple regression method for mapping quantitative trait loci in line crosses using flanking markers. Heredity 69, 315–324.
Kruglyak, L. and Lander, E. S. (1995) A nonparametric approach for mapping quantitative trait loci. Genetics 139, 1421–1428.
Lander, E. S. and Botstein, D. (1989) Mapping Mendelian factors underlying quantitative traits using RFLP linkage maps. Genetics 121, 185–199.
Sen, Ś. and Churchill, G. A. (2001) A statistical framework for quantitative trait mapping. Genetics 159, 371–387.
Soller, M., Brody, T. and Genizi, A. (1976) On the power of experimental designs for the detection of linkage between marker loci and quantitative loci in crosses between inbred lines. Theor. Appl. Genet. 47, 35–39.
Xu, S. and Atchley, W. R. (1996) Mapping quantitative trait loci for complex binary diseases using line crosses. Genetics 143, 1417–1424.
plot.scanone, summary.scanone, scantwo, calc.genoprob, sim.geno, max.scanone, summary.scanoneperm, -.scanone, +.scanone
###################
# Normal Model
###################
data(hyper)

# Genotype probabilities for EM and H-K
## Not run: hyper <- calc.genoprob(hyper, step=2.5)
out.em <- scanone(hyper, method="em")
out.hk <- scanone(hyper, method="hk")

# Summarize results: peaks above 3
summary(out.em, thr=3)
summary(out.hk, thr=3)

# An alternate method of summarizing:
#     patch them together and then summarize
out <- c(out.em, out.hk)
summary(out, thr=3, format="allpeaks")

# Plot the results
plot(out.hk, out.em)
plot(out.hk, out.em, chr=c(1,4), lty=1, col=c("blue","black"))

# Imputation; first need to run sim.geno
# Do just chromosomes 1 and 4, to save time
## Not run: 
hyper.c1n4 <- sim.geno(subset(hyper, chr=c(1,4)),
                       step=2.5, n.draws=8)
## End(Not run)
out.imp <- scanone(hyper.c1n4, method="imp")
summary(out.imp, thr=3)

# Plot all three results
plot(out.imp, out.hk, out.em, chr=c(1,4), lty=1,
     col=c("red","blue","black"))

# extended Haley-Knott
out.ehk <- scanone(hyper, method="ehk")
plot(out.hk, out.em, out.ehk, chr=c(1,4))

# Permutation tests
## Not run: permo <- scanone(hyper, method="hk", n.perm=1000)

# Threshold from the permutation test
summary(permo, alpha=c(0.05, 0.10))

# Results above the 0.05 threshold
summary(out.hk, perms=permo, alpha=0.05)

####################
# scan with square-root of phenotype
#   (Note that pheno.col can be a vector of phenotype values)
####################
out.sqrt <- scanone(hyper, pheno.col=sqrt(pull.pheno(hyper, 1)))
plot(out.em - out.sqrt, ylim=c(-0.1,0.1),
     ylab="Difference in LOD")
abline(h=0, lty=2, col="gray")

####################
# Stratified permutations
####################
extremes <- (nmissing(hyper)/totmar(hyper) < 0.5)

## Not run: 
operm.strat <- scanone(hyper, method="hk", n.perm=1000,
                       perm.strata=extremes)
## End(Not run)
summary(operm.strat)

####################
# X-specific permutations
####################
data(fake.f2)
## Not run: fake.f2 <- calc.genoprob(fake.f2, step=2.5)

# genome scan
out <- scanone(fake.f2, method="hk")

# X-chr-specific permutations
## Not run: operm <- scanone(fake.f2, method="hk", n.perm=1000, perm.Xsp=TRUE)

# thresholds
summary(operm)

# scanone summary with p-values
summary(out, perms=operm, alpha=0.05, pvalues=TRUE)

###################
# Non-parametric
###################
out.np <- scanone(hyper, model="np")
summary(out.np, thr=3)

# Plot with previous results
plot(out.np, chr=c(1,4), lty=1, col="green")
plot(out.imp, out.hk, out.em, chr=c(1,4), lty=1,
     col=c("red","blue","black"), add=TRUE)

###################
# Two-part Model
###################
data(listeria)
## Not run: listeria <- calc.genoprob(listeria,step=2.5)
out.2p <- scanone(listeria, model="2part", upper=TRUE)
summary(out.2p, thr=c(5,3,3), format="allpeaks")

# Plot all three LOD scores together
plot(out.2p, out.2p, out.2p, lodcolumn=c(2,3,1), lty=1, chr=c(1,5,13),
     col=c("red","blue","black"))

# Permutation test
## Not run: 
permo <- scanone(listeria, model="2part", upper=TRUE,
                 n.perm=1000)
## End(Not run)

# Thresholds
summary(permo)

###################
# Binary model
###################
binphe <- as.numeric(pull.pheno(listeria,1)==264)
out.bin <- scanone(listeria, pheno.col=binphe, model="binary")
summary(out.bin, thr=3)

# Plot LOD for binary model with LOD(p) from 2-part model
plot(out.bin, out.2p, lodcolumn=c(1,2), lty=1, col=c("black", "red"),
     chr=c(1,5,13))

# Permutation test
## Not run: 
permo <- scanone(listeria, pheno.col=binphe, model="binary",
                 n.perm=1000)
## End(Not run)

# Thresholds
summary(permo)

###################
# Covariates
###################
data(fake.bc)
## Not run: fake.bc <- calc.genoprob(fake.bc, step=2.5)

# genome scans without covariates
out.nocovar <- scanone(fake.bc)

# genome scans with covariates
ac <- pull.pheno(fake.bc, c("sex","age"))
ic <- pull.pheno(fake.bc, "sex")

out.covar <- scanone(fake.bc, pheno.col=1,
                     addcovar=ac, intcovar=ic)
summary(out.nocovar, thr=3)
summary(out.covar, thr=3)
plot(out.covar, out.nocovar, chr=c(2,5,10))
Nonparametric bootstrap to get an estimated confidence interval for the location of a QTL, in the context of a single-QTL model.
scanoneboot(cross, chr, pheno.col=1, model=c("normal","binary","2part","np"),
            method=c("em","imp","hk","ehk","mr","mr-imp","mr-argmax"),
            addcovar=NULL, intcovar=NULL, weights=NULL,
            use=c("all.obs", "complete.obs"), upper=FALSE,
            ties.random=FALSE, start=NULL, maxit=4000, tol=1e-4,
            n.boot=1000, verbose=FALSE)
cross |
An object of class |
chr |
The chromosome to investigate. Only one chromosome is allowed. (This should be a character string referring to the chromosomes by name.) |
pheno.col |
Column number in the phenotype matrix which should be used as the phenotype. One may also give a character string matching a phenotype name. Finally, one may give a numeric vector of phenotypes, in which case it must have the length equal to the number of individuals in the cross, and there must be either non-integers or values < 1 or > no. phenotypes; this last case may be useful for studying transformations. |
model |
The phenotypic model: the usual normal model, a model for binary traits, a two-part model or non-parametric analysis |
method |
Indicates whether to use the EM algorithm,
imputation, Haley-Knott regression, the extended Haley-Knott method,
or marker regression. Not all methods are available for all models.
Marker regression is performed either by dropping individuals with
missing genotypes ( |
addcovar |
Additive covariates; allowed only for the normal and binary models. |
intcovar |
Interactive covariates (interact with QTL genotype); allowed only for the normal and binary models. |
weights |
Optional weights of individuals. Should be either NULL
or a vector of length n.ind containing positive weights. Used only
in the case |
use |
In the case that multiple phenotypes are selected to be scanned, this argument indicates whether to use all individuals, including those missing some phenotypes, or just those individuals that have data on all selected phenotypes. |
upper |
Used only for the two-part model; if true, the "undefined" phenotype is the maximum observed phenotype; otherwise, it is the smallest observed phenotype. |
ties.random |
Used only for the non-parametric "model"; if TRUE, ties in the phenotypes are ranked at random. If FALSE, average ranks are used and a corrected LOD score is calculated. |
start |
Used only for the EM algorithm with the normal model and
no covariates. If |
maxit |
Maximum number of iterations for methods |
tol |
Tolerance value for determining convergence for methods
|
n.boot |
Number of bootstrap replicates. |
verbose |
If TRUE, display information about the progress of the bootstrap. |
We recommend against the use of the bootstrap to derive a confidence interval for the location of a QTL; see Manichaikul et al. (2006). Use lodint or bayesint instead.
The bulk of the arguments are the same as for the scanone function. A single chromosome should be indicated with the chr argument; otherwise, we focus on the first chromosome in the input cross object.
A single-dimensional scan on the relevant chromosome is performed. We further perform a nonparametric bootstrap (sampling individuals with replacement from the available data, to create a new data set with the same size as the input cross; some individuals will be duplicated and some omitted). The same scan is performed with the resampled data; for each bootstrap replicate, we store only the location with maximum LOD score.
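The resampling scheme can be sketched in base R with a bare marker-regression scan standing in for scanone. All data and names here are made up for illustration (independent markers at evenly spaced positions, a simulated effect at the fourth marker, the hypothetical helper lod_scan); the sketch shows the procedure only and is not the package's code.

```r
set.seed(6)
n <- 100
pos  <- seq(0, 90, by = 10)                      # made-up marker positions (cM)
geno <- matrix(sample(1:2, n * length(pos), replace = TRUE), nrow = n)
y <- rnorm(n) + 0.8 * (geno[, 4] == 2)           # simulated QTL at marker 4

# LOD at each marker: (n/2) * log10(RSS0 / RSS1)
lod_scan <- function(y, geno) {
  rss0 <- sum((y - mean(y))^2)
  apply(geno, 2, function(g) {
    rss1 <- sum((y - ave(y, g))^2)
    (length(y) / 2) * log10(rss0 / rss1)
  })
}

n.boot <- 500
boot_loc <- replicate(n.boot, {
  idx <- sample(n, replace = TRUE)               # resample individuals
  lod <- lod_scan(y[idx], geno[idx, , drop = FALSE])
  pos[which.max(lod)]                            # keep only the peak location
})
print(table(boot_loc))
```

The tabulated peak locations tend to pile up at marker positions, which is one of the reasons given by Manichaikul et al. (2006) for the poor behaviour of bootstrap intervals.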
Use summary.scanoneboot to obtain the desired confidence interval.
A vector of length n.boot, giving the estimated QTL locations in the bootstrap replicates. The results for the original data are included as an attribute, "results".
Karl W Broman, [email protected]
Manichaikul, A., Dupuis, J., Sen, Ś. and Broman, K. W. (2006) Poor performance of bootstrap confidence intervals for the location of a quantitative trait locus. Genetics 174, 481–489.
Visscher, P. M., Thompson, R. and Haley, C. S. (1996) Confidence intervals in QTL mapping by bootstrap. Genetics 143, 1013–1020.
scanone, summary.scanoneboot, plot.scanoneboot, lodint, bayesint
data(fake.f2)
fake.f2 <- calc.genoprob(fake.f2, step=1, err=0.001)
## Not run: bootoutput <- scanoneboot(fake.f2, chr=13, method="hk")

plot(bootoutput)
summary(bootoutput)
Genome scan with a single QTL model for loci that can affect the variance as well as the mean.
scanonevar(cross, pheno.col=1, mean_covar=NULL, var_covar=NULL, maxit=25, tol=1e-6, quiet=TRUE)
cross |
An object of class |
pheno.col |
Column number in the phenotype matrix which should be used as the phenotype. This must be a single value (integer index or phenotype name) or a numeric vector of phenotype values, in which case it must have the length equal to the number of individuals in the cross, and there must be either non-integers or values < 1 or > no. phenotypes; this last case may be useful for studying transformations. |
mean_covar |
Numeric matrix with covariates affecting the mean. |
var_covar |
Numeric matrix with covariates affecting the variances. |
maxit |
Maximum number of iterations in the algorithm to fit the model at a given position. |
tol |
Tolerance for convergence. |
quiet |
If |
A data frame with class "scanone" (in the form output by scanone),
with four columns: chromosome, position, the -log10 P-value for
the mean effect, and the -log10 P-value for the effect on the variance.
Lars Ronnegard and Karl Broman
Ronnegard, L. and Valdar W. (2011) Detecting major genetic loci controlling phenotypic variability in experimental crosses. Genetics 188:435-447
Ronnegard, L. and Valdar W. (2012) Recent developments in statistical methods for detecting genetic loci affecting phenotypic variability. BMC Genetics 13:63
scanone
,
summary.scanone
, calc.genoprob
,
summary.scanoneperm
data(fake.bc)
fake.bc <- calc.genoprob(fake.bc, step=2.5)
out <- scanonevar(fake.bc)
color <- c("slateblue", "violetred")
plot(out, lod=1:2, col=color, bandcol="gray80")
legend("topright", lwd=2, c("mean", "variance"), col=color)

# use format="allpeaks" to get summary for each of mean and variance
# also consider format="tabByCol" or format="tabByChr"
summary(out, format="allpeaks")

# with sex and age as covariates
covar <- fake.bc$pheno[,c("sex", "age")]
out.cov <- scanonevar(fake.bc, mean_covar=covar, var_covar=covar)
Executes permutations of the genotypes in the mean-effect part of scanonevar
scanonevar.meanperm(cross, pheno.col=1, mean_covar=NULL, var_covar=NULL, maxit=25, tol=1e-6, n.mean.perm = 2, seed = 27517, quiet=TRUE)
cross |
An object of class |
pheno.col |
Column number in the phenotype matrix which should be used as the phenotype. This must be a single value (integer index or phenotype name) or a numeric vector of phenotype values, in which case it must have the length equal to the number of individuals in the cross, and there must be either non-integers or values < 1 or > no. phenotypes; this last case may be useful for studying transformations. |
mean_covar |
Numeric matrix with covariates affecting the mean. |
var_covar |
Numeric matrix with covariates affecting the variances. |
maxit |
Maximum number of iterations in the algorithm to fit the model at a given position. |
tol |
Tolerance for convergence. |
n.mean.perm |
Numeric vector of length one indicating the number of permutations to execute. |
seed |
Numeric vector of length one indicating the random seed used to start the permutations. |
quiet |
If |
A vector of length n.mean.perm
of the maximum negative log10 p-value that resulted from each permutation.
Executes permutations of the genotypes in the variance-effect part of scanonevar
scanonevar.varperm(cross, pheno.col=1, mean_covar=NULL, var_covar=NULL, maxit=25, tol=1e-6, n.var.perm = 2, seed = 27517, quiet=TRUE)
cross |
An object of class |
pheno.col |
Column number in the phenotype matrix which should be used as the phenotype. This must be a single value (integer index or phenotype name) or a numeric vector of phenotype values, in which case it must have the length equal to the number of individuals in the cross, and there must be either non-integers or values < 1 or > no. phenotypes; this last case may be useful for studying transformations. |
mean_covar |
Numeric matrix with covariates affecting the mean. |
var_covar |
Numeric matrix with covariates affecting the variances. |
maxit |
Maximum number of iterations in the algorithm to fit the model at a given position. |
tol |
Tolerance for convergence. |
n.var.perm |
Numeric vector of length one indicating the number of permutations to execute. |
seed |
Numeric vector of length one indicating the random seed used to start the permutations. |
quiet |
If |
A vector of length n.var.perm
of the maximum negative log10 p-value that resulted from each permutation.
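A vector of permutation maxima is typically turned into a genome-wide significance threshold by taking an upper quantile. The sketch below shows that step in base R. The vector perm_max is a hypothetical stand-in for the output of scanonevar.meanperm or scanonevar.varperm (here simulated from uniform P-values), and observed_peak is an invented value.

```r
# Hypothetical stand-in for the returned vector: one maximum -log10 P-value
# per permutation replicate (simulated, not produced by R/qtl).
set.seed(27517)
perm_max <- -log10(replicate(1000, min(runif(200))))

# Genome-wide 5% threshold: the 95th percentile of the permutation maxima
threshold <- quantile(perm_max, 0.95, names = FALSE)

# Permutation P-value for a hypothetical observed peak
observed_peak <- 4.2
p_value <- mean(perm_max >= observed_peak)
```

With the default of only two permutations, such quantiles are not meaningful; many more replicates are needed for a stable estimate.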
Jointly consider multiple intercrosses with a single diallelic QTL model, considering all possible partitions of the strains into the two QTL allele groups.
scanPhyloQTL(crosses, partitions, chr, pheno.col=1, model=c("normal", "binary"), method=c("em", "imp", "hk"), addcovar, maxit=4000, tol=0.0001, useAllCrosses=TRUE, verbose=FALSE)
crosses |
A list with each component being an intercross, as an object of class
|
partitions |
A vector of character strings of the form "AB|CD" or "A|BCD" indicating the set of partitions of the strains into two allele groups. If missing, all partitions are considered. |
chr |
Optional vector indicating the chromosomes for which LOD
scores should be calculated. This should be a vector of character
strings referring to chromosomes by name; numeric values are
converted to strings. Refer to chromosomes with a preceding |
pheno.col |
Column number in the phenotype matrix which should be
used as the phenotype. This can be a vector of integers; for methods
|
model |
The phenotype model: the usual normal model or a model for binary traits |
method |
Indicates whether to use the EM algorithm, imputation, or Haley-Knott regression. |
addcovar |
Optional set of additive covariates to include in the
analysis, as a list with the same length as |
maxit |
Maximum number of iterations for method |
tol |
Tolerance value for determining convergence for method
|
useAllCrosses |
If TRUE, use all crosses in the analysis of all partitions, with crosses not segregating the QTL included in the estimation of the residual variance. |
verbose |
If TRUE, print information about progress. |
The aim is to jointly consider multiple intercrosses to not just map QTL but to also, under the assumption of a single diallelic QTL, identify the set of strains with each QTL allele.
For each partition (of the strains into two groups) that is under consideration, we pull out the set of crosses that are segregating the QTL, re-code the alleles, and combine the crosses into one large cross. Crosses not segregating the QTL are also used, though with no QTL effects.
Additive covariate indicators for the crosses are included in the analysis, to allow for the possibility that there are overall shifts in the phenotypes between crosses.
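To make "all possible partitions" concrete, the following base R sketch enumerates every split of a set of strains into two non-empty allele groups, written in the "AB|CD" form. The helper two_group_partitions is hypothetical and is not part of R/qtl; the order and orientation in which the package itself lists partitions may differ.

```r
# Enumerate all partitions of the strains into two non-empty groups.
two_group_partitions <- function(strains) {
  n <- length(strains)
  stopifnot(n >= 2)
  parts <- character(0)
  # Keep the first strain in the left-hand group so each split is listed
  # once; the mask that would put every strain on the left is skipped.
  for (mask in 0:(2^(n - 1) - 2)) {
    bits <- (mask %/% 2^(0:(n - 2))) %% 2 == 1
    left <- c(TRUE, bits)
    parts <- c(parts,
               paste0(paste(strains[left], collapse = ""), "|",
                      paste(strains[!left], collapse = "")))
  }
  parts
}

parts <- two_group_partitions(c("A", "B", "C", "D"))
parts
```

For k strains there are 2^(k-1) - 1 such partitions, so four strains give seven.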
A data frame, as for the output of scanone
, though with
LOD score columns for each partition that is considered. The result
is given class "scanPhyloQTL"
.
Karl W Broman
Broman, K. W., Kim, S., Ané, C. and Payseur, B. A. Mapping quantitative trait loci to a phylogenetic tree. In preparation.
plot.scanPhyloQTL
,
summary.scanPhyloQTL
, max.scanPhyloQTL
,
inferredpartitions
,
simPhyloQTL
# example map; drop X chromosome
data(map10)
map10 <- map10[1:19]

# simulate data
x <- simPhyloQTL(4, partition="AB|CD", crosses=c("AB", "AC", "AD"),
                 map=map10, n.ind=150,
                 model=c(1, 50, 0.5, 0))

# run calc.genoprob on each cross
## Not run:
x <- lapply(x, calc.genoprob, step=2)

# scan genome, at each position trying all possible partitions
out <- scanPhyloQTL(x, method="hk")

# maximum peak
max(out, format="lod")

# approximate posterior probabilities at peak
max(out, format="postprob")

# all peaks above a threshold for LOD(best) - LOD(2nd best)
summary(out, threshold=1, format="lod")

# all peaks above a threshold for LOD(best), showing approx post'r prob
summary(out, format="postprob", threshold=3)

# plot results
plot(out)
Performs a multiple QTL scan for specified chromosomes and positions or intervals, with the possible inclusion of QTL-QTL interactions and/or covariates.
scanqtl(cross, pheno.col=1, chr, pos, covar=NULL, formula, method=c("imp","hk"), model=c("normal", "binary"), incl.markers=FALSE, verbose=TRUE, tol=1e-4, maxit=1000, forceXcovar=FALSE)
cross |
An object of class |
pheno.col |
Column number in the phenotype matrix to be used as the phenotype. One may also give a character string matching a phenotype name. Finally, one may give a numeric vector of phenotypes, in which case it must have the length equal to the number of individuals in the cross, and there must be either non-integers or values < 1 or > no. phenotypes; this last case may be useful for studying transformations. |
chr |
Vector indicating the chromosome for each QTL. (These should be character strings referring to the chromosomes by name.) |
pos |
List indicating the positions or intervals on the chromosome to be scanned. Each element should be either a single number (for a specific position) or a pair of numbers (for an interval). |
covar |
A matrix or data.frame of covariates. These must be strictly numeric. |
formula |
An object of class |
method |
Indicates whether to use multiple imputation or Haley-Knott regression. |
model |
The phenotype model: the usual normal model or a model for binary traits. |
incl.markers |
If FALSE, do calculations only at points on an
evenly spaced grid. If |
verbose |
If TRUE, give feedback about progress. |
tol |
Tolerance for convergence for the binary trait model. |
maxit |
Maximum number of iterations for fitting the binary trait model. |
forceXcovar |
If TRUE, force inclusion of X-chr-related covariates (like sex and cross direction). |
The formula is used to specify the model to be fit. In the
formula, use Q1, Q2, etc., or q1, q2, etc., to represent the QTLs,
and the column names in the covariate data frame to represent the
covariates.
We enforce a hierarchical structure on the model formula: if a QTL or covariate is involved in an interaction, its main effect is also included.
Only the interaction terms need to be specified in the formula. The
main effects of all input QTLs (as specified by chr and pos) and
covariates (as specified by covar) will be included by default. For
example, if the formula is y~Q1*Q2*Sex, and there are three
elements in input chr and pos and Sex is one of the column names for
input covariates, the formula used in the genome scan will be
y ~ Q1 + Q2 + Q3 + Sex + Q1:Q2 + Q1:Sex + Q2:Sex + Q1:Q2:Sex.
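The expansion rule can be imitated with base R formula tools. This is a sketch only: expand_formula is a hypothetical helper, not the code scanqtl uses, and it assumes QTLs are written with an upper-case Q and that the response is named y.

```r
# Add the main effects of all QTLs and covariates to a user formula,
# keeping whatever interaction terms the user wrote.
expand_formula <- function(formula, n_qtl, covar_names = character(0)) {
  labels <- attr(terms(formula), "term.labels")
  mains <- c(paste0("Q", seq_len(n_qtl)), covar_names)
  inter <- labels[grepl(":", labels, fixed = TRUE)]
  reformulate(c(mains, inter), response = "y")
}

full <- expand_formula(y ~ Q1*Q2*Sex, n_qtl = 3, covar_names = "Sex")
full
```

Here terms() expands Q1*Q2*Sex into its main effects and interactions, and Q3 is added because three QTLs were supplied.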
The input pos is a list or vector specifying the positions or ranges
to be scanned on the input chromosomes. If it is a vector, it gives the
precise positions of the QTL on the chromosomes. If it is a list, each
element is either a precise position or a range on the chromosome. For
example, consider the case that the input chr = c(1, 6, 13).
If pos = c(9.8, 34.0, 18.6), it means to fit a model with QTL on
chromosome 1 at 9.8cM, chromosome 6 at 34cM and chromosome 13 at 18.6cM.
If pos = list(c(5,15), c(30,36), 18), it means to scan chromosome 1
from 5cM to 15cM and chromosome 6 from 30cM to 36cM, and to fix the QTL
on chromosome 13 at 18cM.
An object of class scanqtl. It is a multi-dimensional
array of LOD scores, with the number of dimensions equal to the number
of QTLs specified.
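Locating the peak in such an array is a matter of ordinary array indexing. The sketch below uses a small hypothetical matrix in place of real scanqtl output; the dimnames are invented for illustration, chosen only to resemble the "name@position" labels that the example code for this function splits on.

```r
# Hypothetical 2-dimensional LOD array (values and labels are made up).
lod <- matrix(c(1.2, 2.5, 3.1,
                2.0, 4.8, 3.3),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("Chr8@45", "Chr8@50"),
                              c("Chr1@15", "Chr1@20", "Chr1@25")))

# Row and column index of the maximum LOD score
peak <- which(lod == max(lod), arr.ind = TRUE)
peak_row <- rownames(lod)[peak[1, "row"]]
peak_col <- colnames(lod)[peak[1, "col"]]
```

The same which(..., arr.ind = TRUE) call works for arrays with more than two dimensions, returning one column per dimension.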
Hao Wu
Haley, C. S. and Knott, S. A. (1992) A simple regression method for mapping quantitative trait loci in line crosses using flanking markers. Heredity 69, 315–324.
Sen, Ś. and Churchill, G. A. (2001) A statistical framework for quantitative trait mapping. Genetics 159, 371–387.
data(fake.f2)

# take out several QTLs
qc <- c(1, 8, 13)
fake.f2 <- subset(fake.f2, chr=qc)

# calculate genotype probabilities
fake.f2 <- calc.genoprob(fake.f2, step=5, err=0.001)

# 2-dimensional genome scan with additive 3-QTL model
pos <- list(c(15,35), c(45,65), 28)
result <- scanqtl(fake.f2, pheno.col=1, chr=qc, pos=pos,
                  formula=y~Q1+Q2+Q3, method="hk")

# image of the results
# chr locations
chr1 <- as.numeric(matrix(unlist(strsplit(colnames(result),"@")),
                          ncol=2,byrow=TRUE)[,2])
chr8 <- as.numeric(matrix(unlist(strsplit(rownames(result),"@")),
                          ncol=2,byrow=TRUE)[,2])
# image plot
image(chr1, chr8, t(result), las=1, col=rev(rainbow(256,start=0,end=2/3)))

# do the same, allowing the QTLs on chr 1 and 13 to interact
result2 <- scanqtl(fake.f2, pheno.col=1, chr=qc, pos=pos,
                   formula=y~Q1+Q2+Q3+Q1:Q3, method="hk")
# image plot
image(chr1, chr8, t(result2), las=1, col=rev(rainbow(256,start=0,end=2/3)))
Perform a two-dimensional genome scan with a two-QTL model, with possible allowance for covariates.
scantwo(cross, chr, pheno.col=1, model=c("normal","binary"), method=c("em","imp","hk","mr","mr-imp","mr-argmax"), addcovar=NULL, intcovar=NULL, weights=NULL, use=c("all.obs", "complete.obs"), incl.markers=FALSE, clean.output=FALSE, clean.nmar=1, clean.distance=0, maxit=4000, tol=1e-4, verbose=TRUE, n.perm, perm.Xsp=FALSE, perm.strata=NULL, assumeCondIndep=FALSE, batchsize=250, n.cluster=1)
cross |
An object of class |
chr |
Optional vector indicating the chromosomes for which LOD
scores should be calculated. This should be a vector of character
strings referring to chromosomes by name; numeric values are
converted to strings. Refer to chromosomes with a preceding |
pheno.col |
Column number in the phenotype matrix which should be
used as the phenotype. This can be a vector of integers; for methods
|
model |
The phenotype model: the usual normal model or a model for binary traits. |
method |
Indicates whether to use
the EM algorithm, imputation, Haley-Knott regression, or marker
regression. Marker regression is performed either by dropping
individuals with missing genotypes ( |
addcovar |
Additive covariates. |
intcovar |
Interactive covariates (interact with QTL genotype). |
weights |
Optional weights of individuals. Should be either NULL
or a vector of length n.ind containing positive weights. Used only
in the case |
use |
In the case that multiple phenotypes are selected to be scanned, this argument indicates whether to use all individuals, including those missing some phenotypes, or just those individuals that have data on all selected phenotypes. |
incl.markers |
If FALSE, do calculations only at points on an
evenly spaced grid. If |
clean.output |
If TRUE, clean the output with
|
clean.nmar |
If |
clean.distance |
If |
maxit |
Maximum number of iterations; used
only with method |
tol |
Tolerance value for determining convergence; used only with
method |
verbose |
If TRUE, display information about the progress of
calculations. For method |
n.perm |
If specified, a permutation test is performed rather than an analysis of the observed data. This argument defines the number of permutation replicates. |
perm.Xsp |
If |
perm.strata |
If |
assumeCondIndep |
If TRUE, assume conditional independence of QTL genotypes given marker genotypes. This is an approximation, but it may speed things up. |
batchsize |
The number of phenotypes (or permutations) to be run
as a batch; used only for methods |
n.cluster |
If the package |
Standard interval mapping (method="em"
) and Haley-Knott
regression (method="hk"
) require that multipoint genotype probabilities are
first calculated using calc.genoprob
. The
imputation method uses the results of sim.geno
.
The method "em"
is standard interval mapping by the EM algorithm
(Dempster et al. 1977; Lander and Botstein 1989). Marker regression
(method="mr"
) is simply linear regression of phenotypes on
marker genotypes (individuals with missing genotypes are
discarded). Haley-Knott regression (method="hk"
) uses the
regression of phenotypes on multipoint genotype probabilities. The
imputation method (method="imp"
) uses the pseudomarker
algorithm described by Sen and Churchill (2001).
Individuals with missing phenotypes are dropped.
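In the presence of covariates, the full model is
$y = \mu + \beta_{q_1} + \beta_{q_2} + \beta_{q_1 \times q_2} + A \gamma + Z \delta_{q_1} + Z \delta_{q_2} + Z \delta_{q_1 \times q_2} + \epsilon$
where $q_1$ and $q_2$ are the unknown QTL genotypes at two
locations, A is a matrix of covariates, and Z is a
matrix of covariates that interact with QTL genotypes. The columns of
Z are forced to be contained in the matrix A.
The above full model is compared to the additive QTL model,
$y = \mu + \beta_{q_1} + \beta_{q_2} + A \gamma + Z \delta_{q_1} + Z \delta_{q_2} + \epsilon$
and also to the null model, with no QTL,
$y = \mu + A \gamma + \epsilon$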
In the case that n.perm
is specified, the R function
scantwo
is called repeatedly.
For model="binary"
, a logistic regression model is used.
If n.perm
is missing, the function returns a list with class
"scantwo"
and containing three components. The first component
is a matrix of dimension [tot.pos x tot.pos]; the upper triangle
contains the LOD scores for the additive model, and the lower triangle
contains the LOD scores for the full model. The diagonal contains the
results of scanone
. The second component of the
output is a data.frame indicating the locations at which the two-QTL
LOD scores were calculated. The first column is the chromosome
identifier, the second column is the position in cM, the third column
is a 1/0 indicator for ease in later pulling out only the equally
spaced positions, and the fourth column indicates whether the position
is on the X chromosome or not. The final component is a version of
the results of scanone
including sex and/or cross
direction as additive covariates, which is needed for a proper
calculation of conditional LOD scores.
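The triangular layout of the LOD matrix can be unpacked with base R indexing. The sketch below uses a small hypothetical 3 x 3 matrix in place of the first component of real scantwo output; the numbers are invented. For a pair of positions (i, j) with i < j, the entry [i, j] is read as the additive LOD and [j, i] as the full LOD, following the description of the output.

```r
# Hypothetical LOD matrix: upper triangle = additive model,
# lower triangle = full model, diagonal = single-QTL results.
lod <- matrix(c(3.1, 4.0, 5.2,
                6.5, 2.7, 4.4,
                7.0, 5.1, 3.9),
              nrow = 3, byrow = TRUE)

add_lod  <- lod[upper.tri(lod)]       # pairs (1,2), (1,3), (2,3)
full_lod <- t(lod)[upper.tri(lod)]    # same pairs, read from the lower triangle
int_lod  <- full_lod - add_lod        # LOD for the interaction term
```

In practice summary.scantwo and max.scantwo do this bookkeeping; the sketch is only meant to show where each value lives.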
If n.perm
is specified, the function returns a list with six
different LOD scores from each of the permutation replicates.
First, the maximum LOD score for the full model (two QTLs plus an
interaction). Second, for each pair of
chromosomes, we take the difference between the full LOD and the
maximum single-QTL LOD for those two chromosomes, and then maximize
this across chromosome pairs. Third, for each pair of chromosomes we
take the difference between the maximum full LOD and the maximum
additive LOD, and then maximize this across chromosome pairs. Fourth,
the maximum LOD score for the additive QTL model. Fifth, for each
pair of chromosomes, we take the difference between the additive LOD
and the maximum single-QTL LOD for those two chromosomes, and then
maximize this across chromosome pairs. Finally, the maximum
single-QTL LOD score (that is, from a single-QTL scan). The latter is
not used in summary.scantwo
, but does get
calculated at each permutation, so we include it for the sake of
completeness.
If n.perm is specified and perm.Xsp=TRUE, the result is
a list with the permutation results for the regions A:A, A:X, and X:X,
each of which is a list with the six different LOD scores. Independent
permutations are performed in each region; n.perm is the number
of permutations for the A:A region, and additional permutations are
used for the A:X and X:X parts, as estimates of quantiles farther out
into the tails are needed.
The X chromosome must be treated specially in QTL mapping.
As in scanone
, if both males and females are
included, male hemizygotes are allowed to be different from female
homozygotes, and the null hypothesis must be changed in order to ensure
that sex- or pgm-differences in the phenotype do not result in spurious
linkage to the X chromosome. (See the help file for
scanone
.)
If n.perm
is specified and perm.Xsp=TRUE
,
X-chromosome-specific permutations are performed, to obtain separate
thresholds for the regions A:A, A:X, and X:X.
Karl W Broman; Hao Wu
Churchill, G. A. and Doerge, R. W. (1994) Empirical threshold values for quantitative trait mapping. Genetics 138, 963–971.
Dempster, A. P., Laird, N. M. and Rubin, D. B. (1977) Maximum likelihood from incomplete data via the EM algorithm. J. Roy. Statist. Soc. B, 39, 1–38.
Haley, C. S. and Knott, S. A. (1992) A simple regression method for mapping quantitative trait loci in line crosses using flanking markers. Heredity 69, 315–324.
Lander, E. S. and Botstein, D. (1989) Mapping Mendelian factors underlying quantitative traits using RFLP linkage maps. Genetics 121, 185–199.
Sen, Ś. and Churchill, G. A. (2001) A statistical framework for quantitative trait mapping. Genetics 159, 371–387.
Soller, M., Brody, T. and Genizi, A. (1976) On the power of experimental designs for the detection of linkage between marker loci and quantitative loci in crosses between inbred lines. Theor. Appl. Genet. 47, 35–39.
plot.scantwo
, summary.scantwo
,
scanone
, max.scantwo
,
summary.scantwoperm
,
c.scantwoperm
data(fake.f2)
fake.f2 <- calc.genoprob(fake.f2, step=5)
out.2dim <- scantwo(fake.f2, method="hk")
plot(out.2dim)

# permutations
## Not run:
permo.2dim <- scantwo(fake.f2, method="hk", n.perm=1000)
summary(permo.2dim, alpha=0.05)

# summary with p-values
summary(out.2dim, perms=permo.2dim, pvalues=TRUE,
        alphas=c(0.05, 0.10, 0.10, 0.05, 0.10))

# covariates
data(fake.bc)
fake.bc <- calc.genoprob(fake.bc, step=10)
ac <- pull.pheno(fake.bc, c("sex","age"))
ic <- pull.pheno(fake.bc, "sex")
out <- scantwo(fake.bc, method="hk", pheno.col=1,
               addcovar=ac, intcovar=ic)
plot(out)
Perform a permutation test with a two-dimensional genome scan with a two-QTL model, with possible allowance for additive covariates, by Haley-Knott regression.
scantwopermhk(cross, chr, pheno.col=1, addcovar=NULL, weights=NULL, n.perm=1, batchsize=1000, perm.strata=NULL, perm.Xsp=NULL, verbose=FALSE, assumeCondIndep=FALSE)
cross |
An object of class |
chr |
Optional vector indicating the chromosomes for which LOD
scores should be calculated. This should be a vector of character
strings referring to chromosomes by name; numeric values are
converted to strings. Refer to chromosomes with a preceding |
pheno.col |
Column number in the phenotype matrix which should be used as the phenotype. This should be a single value (numeric index or character string for a phenotype name), but it may also be a vector of numeric values with length equal to the number of individuals in the cross, in which case it is taken to be a vector of individuals' phenotypes. |
addcovar |
Additive covariates. |
weights |
Optional weights of individuals. Should be either NULL
or a vector of length n.ind containing positive weights. Used only
in the case |
n.perm |
Number of permutation replicates. |
batchsize |
If |
perm.strata |
Used to perform a stratified permutation test. This should be a vector with the same number of individuals as in the cross data. Unique values indicate the individual strata, and permutations will be performed within the strata. |
perm.Xsp |
If TRUE, run separate permutations for A:A, A:X, and
X:X. In this case, |
verbose |
If TRUE, display information about the progress of calculations. |
assumeCondIndep |
If TRUE, assume conditional independence of QTL genotypes given marker genotypes. This is an approximation, but it may speed things up. |
This is a scaled-back version of the permutation test provided by
scantwo: only for a normal model with Haley-Knott
regression, and not allowing interactive covariates.
This is an attempt to speed things up and attenuate the memory usage
problems in scantwo.
In the case of perm.Xsp=TRUE
(X-chr-specific thresholds), we
use a stratified permutation test, stratified by sex and
cross-direction.
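A stratified permutation shuffles phenotypes only among individuals in the same stratum. The base R sketch below shows the idea on toy data; the strata and phenotype values are hypothetical, and this is not the code used inside scantwopermhk.

```r
set.seed(1)
strata <- c("F", "F", "M", "M", "M", "F")        # hypothetical strata (e.g. sex)
pheno  <- c(10.1, 9.7, 12.3, 11.8, 12.9, 10.4)   # hypothetical phenotypes

# Permute indices within each stratum. Indexing with sample.int() avoids
# the behaviour of sample(x) when a stratum holds a single individual.
perm <- ave(seq_along(strata), strata,
            FUN = function(i) i[sample.int(length(i))])

# Phenotypes are reassigned, but never across strata
pheno_perm <- pheno[perm]
```

Because each individual receives a phenotype from its own stratum, differences between strata (such as a sex effect) are preserved in every permuted data set.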
A list with six
different LOD scores from each of the permutation replicates.
First, the maximum LOD score for the full model (two QTLs plus an
interaction). Second, for each pair of
chromosomes, we take the difference between the full LOD and the
maximum single-QTL LOD for those two chromosomes, and then maximize
this across chromosome pairs. Third, for each pair of chromosomes we
take the difference between the maximum full LOD and the maximum
additive LOD, and then maximize this across chromosome pairs. Fourth,
the maximum LOD score for the additive QTL model. Fifth, for each
pair of chromosomes, we take the difference between the additive LOD
and the maximum single-QTL LOD for those two chromosomes, and then
maximize this across chromosome pairs. Finally, the maximum
single-QTL LOD score (that is, from a single-QTL scan). The latter is
not used in summary.scantwoperm
, but does get
calculated at each permutation, so we include it for the sake of
completeness.
If perm.Xsp=TRUE
, this is a list of lists, for the A:A, A:X,
and X:X sections, each being a list as described above.
Karl W Broman; Hao Wu
Churchill, G. A. and Doerge, R. W. (1994) Empirical threshold values for quantitative trait mapping. Genetics 138, 963–971.
Haley, C. S. and Knott, S. A. (1992) A simple regression method for mapping quantitative trait loci in line crosses using flanking markers. Heredity 69, 315–324.
scantwo
, plot.scantwoperm
,
summary.scantwoperm
,
c.scantwoperm
data(fake.f2)
fake.f2 <- calc.genoprob(fake.f2, step=5)
operm <- scantwopermhk(fake.f2, n.perm=2)
summary(operm, alpha=0.05)
Shift starting points in a genetic map to a set of defined positions
shiftmap(object, offset=0)
object |
An object of class |
offset |
Defines the starting position for each chromosome. This should be a single value (to be used for all chromosomes) or a vector with length equal to the number of chromosomes, defining individual starting positions for each chromosome. For a sex-specific map (as in a 4-way cross), we use the same offset for both the male and female maps. |
If the input is a map
object, a map
object is returned; if
the input is a cross
object, a cross
object is returned.
In either case, the positions of markers are shifted so that the
starting positions are as in offset
.
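The shift itself is simple arithmetic on each chromosome's marker positions. The sketch below applies it to a plain list of position vectors. The helper shift_positions and the toy map are hypothetical; the real shiftmap also handles cross objects and sex-specific maps, which this sketch does not.

```r
# Shift each chromosome so that its first marker sits at the given offset.
shift_positions <- function(map, offset = 0) {
  offset <- rep_len(offset, length(map))   # a single value is reused for all chromosomes
  Map(function(pos, off) pos - min(pos) + off, map, offset)
}

toy_map <- list("1" = c(5, 12.5, 30), "2" = c(2, 20))
shifted <- shift_positions(toy_map, offset = c(0, 10))
shifted
```

Inter-marker distances are unchanged; only the starting position of each chromosome moves.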
Karl W Broman
data(hyper)
shiftedhyper <- shiftmap(hyper, offset=0)
par(mfrow=c(1,2))
plotMap(hyper, shift=FALSE, alternate.chrid=TRUE)
plotMap(shiftedhyper, shift=FALSE, alternate.chrid=TRUE)
Simulates data for a QTL experiment using a model in which QTLs act additively.
sim.cross(map, model=NULL, n.ind=100, type=c("f2", "bc", "4way", "risib", "riself", "ri4sib", "ri4self", "ri8sib", "ri8self", "bcsft"), error.prob=0, missing.prob=0, partial.missing.prob=0, keep.qtlgeno=TRUE, keep.errorind=TRUE, m=0, p=0, map.function=c("haldane","kosambi","c-f","morgan"), founderGeno, random.cross=TRUE, ...)
map |
A list whose components are vectors containing the marker locations on each of the chromosomes. |
model |
A matrix where each row corresponds to a different QTL, and gives the chromosome number, cM position and effects of the QTL. |
n.ind |
Number of individuals to simulate. |
type |
Indicates whether to simulate an intercross ( |
error.prob |
The genotyping error rate. |
missing.prob |
The rate of missing genotypes. |
partial.missing.prob |
When simulating an intercross or 4-way cross, this gives the rate at which markers will be incompletely informative (i.e., dominant or recessive). |
keep.qtlgeno |
If TRUE, genotypes for the simulated QTLs will be included in the output. |
keep.errorind |
If TRUE, and if |
m |
Interference parameter; a non-negative integer. 0 corresponds to no interference. |
p |
Probability that a chiasma comes from the no-interference mechanism |
map.function |
Indicates whether to use the Haldane, Kosambi, Carter-Falconer, or Morgan map function when converting genetic distances into recombination fractions. |
founderGeno |
For 4- or 8-way RIL, the genotype data of the founder strains, as a list whose components are numeric matrices (no. markers x no. founders), one for each chromosome. |
random.cross |
For 4- or 8-way RIL, indicates whether the order of the founder strains should be randomized, independently for each RIL, or whether all RIL should be derived from a common cross. In the latter case, for a 4-way RIL, the cross would be (AxB)x(CxD). |
... |
For |
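The map.function argument controls how a genetic distance is converted to a recombination fraction. As a rough guide to what that conversion looks like, the standard Haldane, Kosambi, and Morgan formulas are sketched below in base R, with distances in cM. The function names are hypothetical and these are not the package's internal routines; the Carter-Falconer function is omitted because it has no closed form in this direction.

```r
# Recombination fraction as a function of genetic distance d (in cM)
haldane_rf <- function(d_cM) 0.5 * (1 - exp(-2 * d_cM / 100))
kosambi_rf <- function(d_cM) 0.5 * tanh(2 * d_cM / 100)
morgan_rf  <- function(d_cM) pmin(d_cM / 100, 0.5)

d <- c(0, 10, 50)
rf <- cbind(haldane = haldane_rf(d),
            kosambi = kosambi_rf(d),
            morgan  = morgan_rf(d))
rf
```

For small distances all three give nearly the same recombination fraction; they diverge as the distance grows.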
Meiosis is assumed to follow the Stahl model for crossover interference (see the references, below), of which the no-interference model and the chi-square model are special cases. Chiasmata on the four-strand bundle are a superposition of chiasmata from two different mechanisms. With probability p, they arise by a mechanism exhibiting no interference; the remainder come from a chi-square model with interference parameter m. Note that m=0 corresponds to no interference, and with p=0, one gets a pure chi-square model.
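The two-mechanism description above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the function name sim_crossovers is hypothetical, chiasmata occur at an average rate of 2 per Morgan on the four-strand bundle, and each chiasma is passed to a given meiotic product with probability 1/2.

```r
# Sketch (not the package's code): crossover locations on one meiotic product
# under the Stahl model. L_cM = chromosome length in cM; m and p as in the text.
sim_crossovers <- function(L_cM, m = 0, p = 0) {
  L <- L_cM / 100  # length in Morgans

  # Interference pathway (chi-square model): chiasmata are every (m+1)th point
  # of a Poisson process, starting from a randomly chosen one of the first m+1.
  n_pts <- rpois(1, 2 * (1 - p) * (m + 1) * L)
  pts <- sort(runif(n_pts, 0, L))
  start <- sample.int(m + 1, 1)
  chi <- if (n_pts >= start) pts[seq(start, n_pts, by = m + 1)] else numeric(0)

  # No-interference pathway: chiasmata from a plain Poisson process.
  noint <- runif(rpois(1, 2 * p * L), 0, L)

  # Each chiasma on the four-strand bundle appears as a crossover on a
  # randomly chosen product with probability 1/2.
  chiasmata <- sort(c(chi, noint))
  keep <- runif(length(chiasmata)) < 0.5
  100 * chiasmata[keep]  # back to cM
}

set.seed(1)
n_xo_noint <- replicate(2000, length(sim_crossovers(100, m = 0, p = 0)))
n_xo_stahl <- replicate(2000, length(sim_crossovers(100, m = 10, p = 0.1)))
c(mean(n_xo_noint), mean(n_xo_stahl))  # average crossovers per 100 cM
```

Under these assumptions the average number of crossovers per product is about one per 100 cM regardless of m and p; interference changes their spacing, not their mean count.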
If a chromosome has class X, it is assumed to be the X chromosome, and is assumed to be segregating in the cross. Thus, in an intercross, it is segregating like a backcross chromosome. In a 4-way cross, a second phenotype, sex, will be generated.
QTLs are assumed to act additively, and the residual phenotypic variation is assumed to be normally distributed with variance 1.
For a backcross, the effect of a QTL is a single number corresponding to the difference between the homozygote and the heterozygote.
For an intercross, the effect of a QTL is a pair of numbers, (a, d), where a is the additive effect (half the difference between the homozygotes) and d is the dominance deviation (the difference between the heterozygote and the midpoint between the homozygotes).
For a four-way cross, the effect of a QTL is a set of three numbers, (a, b, c), where, in the case of one QTL, the mean phenotype, conditional on the QTL genotype being AC, BC, AD or BD, is a, b, c or 0, respectively.
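To make the intercross parameterization concrete, here is a minimal base-R sketch of the genotype means implied by an additive effect and a dominance deviation, followed by a phenotype simulation with residual variance 1. Centering the homozygote midpoint at 0 and the genotype labels AA, AB and BB are assumptions made for the illustration.

```r
# Sketch: genotype means implied by intercross effects (a, d), assuming the
# midpoint between the two homozygotes is taken as 0 (an assumption).
a <- 1.0   # additive effect
d <- 0.5   # dominance deviation
geno_means <- c(AA = -a, AB = d, BB = a)

# Simulate phenotypes for 250 F2 individuals at a single QTL (1:2:1 segregation),
# with normally distributed residuals of variance 1.
set.seed(42)
qtl_geno <- sample(names(geno_means), 250, replace = TRUE,
                   prob = c(0.25, 0.5, 0.25))
pheno <- geno_means[qtl_geno] + rnorm(250, mean = 0, sd = 1)

# Group means should be roughly -a, d and a
round(tapply(pheno, qtl_geno, mean), 2)
```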
An object of class cross. See read.cross for details.
If keep.qtlgeno is TRUE, the cross object will contain a component qtlgeno which is a matrix containing the QTL genotypes (with complete data and no errors), coded as in the genotype data.
If keep.errorind is TRUE and errors were simulated, each component of geno will contain a matrix errors, with 1's indicating simulated genotyping errors.
In the simulation of recombinant inbred lines (RIL), we simulate a single individual from each line, and no phenotypes are simulated (so the argument model is ignored).
The types riself and risib are the usual two-way RIL. The types ri4self, ri4sib, ri8self, and ri8sib are RIL by selfing or sib-mating derived from four or eight founding parental strains.
For the 4- and 8-way RIL, one must include the genotypes of the founding individuals; these may be simulated with simFounderSnps. Also, the output cross will contain a component cross, which is a matrix with rows corresponding to RIL and columns corresponding to the founders, indicating order of the founder strains in the crosses used to generate the RIL.
The coding of genotypes in 4- and 8-way RIL is rather complicated. It is a binary encoding of which founder strains' genotypes match the RIL's genotype at a marker, and note that this is specific to the order of the founders in the crosses used to generate the RIL. For example, if an RIL generated from 4 founders has the 1 allele at a SNP, and the four founders have SNP alleles 0, 1, 0, 1, then the RIL allele matches that of founders B and D. If the RIL was derived by the cross (AxB)x(CxD), then the RIL genotype would be encoded 0*2^0 + 1*2^1 + 0*2^2 + 1*2^3 = 10. If the cross was derived by the cross (DxA)x(CxB), then the RIL genotype would be encoded 1*2^0 + 0*2^1 + 0*2^2 + 1*2^3 = 9. These get reorganized after calls to calc.genoprob, sim.geno, or argmax.geno, and this approach simplifies the hidden Markov model (HMM) code.
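The binary encoding can be sketched in a few lines of base R. The helper encode_ril_geno is hypothetical, and the bit order (the first founder in cross order is the lowest-order bit) is an assumption made for this illustration.

```r
# Sketch of the binary genotype encoding for multi-way RIL (hypothetical helper).
# founder_alleles: named SNP alleles of the founders
# cross_order:     founders in the order used in the cross generating the RIL
# ril_allele:      the RIL's allele at this SNP
encode_ril_geno <- function(founder_alleles, cross_order, ril_allele) {
  matches <- founder_alleles[cross_order] == ril_allele
  # the k-th founder in cross order contributes 2^(k-1) if its allele matches
  sum(2^(seq_along(matches) - 1) * matches)
}

founders <- c(A = 0, B = 1, C = 0, D = 1)
encode_ril_geno(founders, c("A", "B", "C", "D"), ril_allele = 1)  # (AxB)x(CxD): 2 + 8 = 10
encode_ril_geno(founders, c("D", "A", "C", "B"), ril_allele = 1)  # (DxA)x(CxB): 1 + 8 = 9
```

The same founder alleles thus give different codes depending on the cross order, which is why the cross component of the output is needed to interpret the genotypes.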
For the 4- and 8-way RIL, genotyping errors are simulated only if the founder genotypes are 0/1 SNPs.
Karl W Broman, [email protected]
Copenhaver, G. P., Housworth, E. A. and Stahl, F. W. (2002) Crossover interference in Arabidopsis. Genetics 160, 1631–1639.
Foss, E., Lande, R., Stahl, F. W. and Steinberg, C. M. (1993) Chiasma interference as a function of genetic distance. Genetics 133, 681–691.
Zhao, H., Speed, T. P. and McPeek, M. S. (1995) Statistical analysis of crossover interference using the chi-square model. Genetics 139, 1045–1056.
Broman, K. W. (2005) The genomes of recombinant inbred lines. Genetics 169, 1133–1146.
Teuscher, F. and Broman, K. W. (2007) Haplotype probabilities for multiple-strain recombinant inbred lines. Genetics 175, 1267–1274.
sim.map, read.cross, fake.f2, fake.bc, fake.4way, simFounderSnps
# simulate a genetic map
map <- sim.map()

### simulate 250 intercross individuals with 2 QTLs
fake <- sim.cross(map, type="f2", n.ind=250,
                  model = rbind(c(1,45,1,1),c(5,20,0.5,-0.5)))

### simulate 100 backcross individuals with 3 QTL
# a 10-cM map model after the mouse
data(map10)
fakebc <- sim.cross(map10, type="bc", n.ind=100,
                    model=rbind(c(1,45,1), c(5,20,1), c(5,50,1)))

### simulate 8-way RIL by sibling mating
# get lengths from the above 10-cM map
L <- ceiling(sapply(map10, max))
# simulate a 1 cM map
themap <- sim.map(L, n.mar=L+1, eq.spacing=TRUE)
# simulate founder genotypes
pg <- simFounderSnps(themap, "8")
# simulate the 8-way RIL by sib mating (256 lines)
ril <- sim.cross(themap, n.ind=256, type="ri8sib", founderGeno=pg)
Uses the hidden Markov model technology to simulate from the joint distribution Pr(g | O) where g is the underlying genotype vector and O is the observed multipoint marker data, with possible allowance for genotyping errors.
sim.geno(cross, n.draws=16, step=0, off.end=0, error.prob=0.0001, map.function=c("haldane","kosambi","c-f","morgan"), stepwidth=c("fixed", "variable", "max"))
cross |
An object of class |
n.draws |
Number of simulation replicates to perform. |
step |
Maximum distance (in cM) between positions at which the
simulated genotypes will be drawn, though for |
off.end |
Distance (in cM) past the terminal markers on each chromosome to which the genotype simulations will be carried. |
error.prob |
Assumed genotyping error rate used in the calculation of the penetrance Pr(observed genotype | true genotype). |
map.function |
Indicates whether to use the Haldane, Kosambi, Carter-Falconer, or Morgan map function when converting genetic distances into recombination fractions. |
stepwidth |
Indicates whether the intermediate points should be placed with
fixed or variable step sizes. We recommend using
|
After performing the forward-backward equations, we draw from Pr(g_1 = v | O) and then Pr(g_{k+1} = v | O, g_k = u).
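The sequential drawing step can be illustrated for a single backcross individual with two possible genotypes. This is a sketch under simplifying assumptions, not the package's code: the function draw_geno and its arguments are hypothetical, only the backward variables are computed, and a missing genotype (NA) is treated as uninformative.

```r
# Sketch (not the package's code): draw genotypes from Pr(g | O) for one
# backcross individual. obs = observed marker genotypes (1, 2 or NA),
# rf = recombination fractions between adjacent markers.
draw_geno <- function(obs, rf, error.prob = 1e-4) {
  n <- length(obs)
  init <- c(0.5, 0.5)
  emit <- function(o) {  # Pr(observed o | true genotype 1 or 2)
    if (is.na(o)) return(c(1, 1))
    ifelse(1:2 == o, 1 - error.prob, error.prob)
  }
  trans <- function(r) matrix(c(1 - r, r, r, 1 - r), 2, 2)

  # backward variables: beta[k, u] = Pr(O_{k+1}, ..., O_n | g_k = u)
  beta <- matrix(1, n, 2)
  if (n > 1) {
    for (k in (n - 1):1) {
      beta[k, ] <- trans(rf[k]) %*% (emit(obs[k + 1]) * beta[k + 1, ])
    }
  }

  g <- integer(n)
  # first position: Pr(g_1 = v | O)
  g[1] <- sample(1:2, 1, prob = init * emit(obs[1]) * beta[1, ])
  # later positions: Pr(g_{k+1} = v | O, g_k = u)
  if (n > 1) {
    for (k in 1:(n - 1)) {
      pr <- trans(rf[k])[g[k], ] * emit(obs[k + 1]) * beta[k + 1, ]
      g[k + 1] <- sample(1:2, 1, prob = pr)
    }
  }
  g
}

set.seed(3)
obs <- c(1, NA, 1, 2, NA, 2)
rf <- rep(0.1, 5)
draws <- replicate(8, draw_geno(obs, rf))
draws  # positions in rows, draws in columns
```

Each column is one imputation; positions with observed genotypes rarely change across draws, while the missing positions vary according to the flanking markers.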
In the case of the 4-way cross, with a sex-specific map, we assume a constant ratio of female:male recombination rates within the inter-marker intervals.
The input cross object is returned with a component, draws, added to each component of cross$geno. This is an array of size [n.ind x n.pos x n.draws] where n.pos is the number of positions at which the simulations were performed and n.draws is the number of replicates. Attributes "error.prob", "step", and "off.end" are set to the values of the corresponding arguments, for later reference.
Karl W Broman, [email protected]
data(fake.f2)
fake.f2 <- sim.geno(fake.f2, step=2, n.draws=8)
Simulate the positions of markers on a genetic map.
sim.map(len=rep(100,20), n.mar=10, anchor.tel=TRUE, include.x=TRUE, sex.sp=FALSE, eq.spacing=FALSE)
len |
A vector specifying the chromosome lengths (in cM) |
n.mar |
A vector specifying the number of markers per chromosome. |
anchor.tel |
If true, markers at the two telomeres will always be
included, so if |
include.x |
Indicates whether the last chromosome should be considered the X chromosome. |
sex.sp |
Indicates whether to create sex-specific maps, in which case the output will be a vector of 2-row matrices, with rows corresponding to the maps for the two sexes. |
eq.spacing |
If TRUE, markers will be equally spaced. |
Aside from the telomeric markers, marker positions are simulated as iid Uniform(0, L), where L is the chromosome length. If len or n.mar has just one element, it is expanded to the length of the other argument. If they both have just one element, only one chromosome is simulated.
If eq.spacing is TRUE, markers are equally spaced between 0 and L. If anchor.tel is FALSE, telomeric markers are not included.
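The marker placement for a single chromosome can be sketched as follows. The helper sim_positions is hypothetical and covers only the default case with telomeric markers included (anchor.tel=TRUE); how the package handles the other cases is not shown here.

```r
# Sketch of marker placement on one chromosome of length L (in cM), with
# markers anchored at both telomeres (hypothetical helper, not sim.map itself).
sim_positions <- function(L, n.mar, eq.spacing = FALSE) {
  stopifnot(n.mar >= 2)
  if (eq.spacing) {
    return(seq(0, L, length.out = n.mar))  # equally spaced between 0 and L
  }
  # two telomeric markers, the rest iid Uniform(0, L)
  c(0, sort(runif(n.mar - 2, 0, L)), L)
}

set.seed(10)
pos <- sim_positions(100, 10)
eq <- sim_positions(100, 11, eq.spacing = TRUE)
round(pos, 1)
eq
```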
A list of vectors, each specifying the locations of the markers. Each component of the list is given class A or X, according to whether it is autosomal or the X chromosome.
Karl W Broman, [email protected]
sim.cross, plotMap, replace.map, pull.map
# simulate 4 autosomes, each with 10 markers
map <- sim.map(c(100,90,80,40), 10, include.x=FALSE)
plotMap(map)

# equally spaced markers
map2 <- sim.map(c(100,90,80,40), 10, include.x=FALSE, eq.spacing=TRUE)
plot(map2)
Simulate genotype data for the founding strains for a panel of multiple-strain RIL.
simFounderSnps(map, n.str=c("4","8"), pat.freq)
map |
A list whose components are vectors containing the marker locations on each of the chromosomes. |
n.str |
Number of founding strains (4 or 8). |
pat.freq |
Frequency of SNP genotype patterns in the founder (a
vector of length |
The SNPs are simulated to be in linkage equilibrium.
A vector of the same length as there are chromosomes in map, with each component being a matrix of 0's and 1's, of dim n.str x n.mar.
Karl W Broman, [email protected]
data(map10)
x <- simFounderSnps(map10, "8", c(0, 0.5, 0.2, 0.2, 0.1))
Simulate a set of intercrosses with a single diallelic QTL.
simPhyloQTL(n.taxa=3, partition, crosses, map, n.ind=100, model, error.prob=0, missing.prob=0, partial.missing.prob=0, keep.qtlgeno=FALSE, keep.errorind=TRUE, m=0, p=0, map.function=c("haldane","kosambi","c-f","morgan"))
n.taxa |
Number of taxa (i.e., strains). |
partition |
A vector of character strings of the form "AB|CD" or "A|BCD" indicating, for each QTL, which taxa have which allele. If missing, simulate under the null hypothesis of no QTL. |
crosses |
A vector of character strings indicating the crosses to do (for the form "AB", "AC", etc.). These will be sorted and then only unique ones used. If missing, all crosses will be simulated. |
map |
A list whose components are vectors containing the marker locations on each of the chromosomes. |
n.ind |
The number of individuals in each cross. If length 1, all
crosses will have the same number of individuals; otherwise the length
should be the same as |
model |
A matrix where each row corresponds to a different QTL, and gives the chromosome number, cM position and effects of the QTL (assumed to be the same in each cross in which the QTL is segregating). |
error.prob |
The genotyping error rate. |
missing.prob |
The rate of missing genotypes. |
partial.missing.prob |
When simulating an intercross or 4-way cross, this gives the rate at which markers will be incompletely informative (i.e., dominant or recessive). |
keep.qtlgeno |
If TRUE, genotypes for the simulated QTLs will be included in the output. |
keep.errorind |
If TRUE, and if |
m |
Interference parameter; a non-negative integer. 0 corresponds to no interference. |
p |
Probability that a chiasma comes from the no-interference mechanism |
map.function |
Indicates whether to use the Haldane, Kosambi, Carter-Falconer, or Morgan map function when converting genetic distances into recombination fractions. |
Meiosis is assumed to follow the Stahl model for crossover interference (see the references, below), of which the no-interference model and the chi-square model are special cases. Chiasmata on the four-strand bundle are a superposition of chiasmata from two different mechanisms. With probability p, they arise by a mechanism exhibiting no interference; the remainder come from a chi-square model with interference parameter m. Note that m=0 corresponds to no interference, and with p=0, one gets a pure chi-square model.
QTLs are assumed to act additively, and the residual phenotypic variation is assumed to be normally distributed with variance 1.
The effect of a QTL is a pair of numbers, (a, d), where a is the additive effect (half the difference between the homozygotes) and d is the dominance deviation (the difference between the heterozygote and the midpoint between the homozygotes).
A list with each component being an object of class cross. See read.cross for details. The names (e.g. "AB", "AC", "BC") indicate the crosses.
If keep.qtlgeno is TRUE, each cross object will contain a component qtlgeno which is a matrix containing the QTL genotypes (with complete data and no errors), coded as in the genotype data.
If keep.errorind is TRUE and errors were simulated, each component of geno in each cross will contain a matrix errors, with 1's indicating simulated genotyping errors.
Karl W Broman, [email protected]
Broman, K. W., Kim, S., Ané, C. and Payseur, B. A. Mapping quantitative trait loci to a phylogenetic tree. In preparation.
scanPhyloQTL, inferredpartitions, summary.scanPhyloQTL, max.scanPhyloQTL, plot.scanPhyloQTL, sim.cross, read.cross
## Not run:
# example map; drop X chromosome
data(map10)
map10 <- map10[1:19]

# simulate data
x <- simPhyloQTL(4, partition="AB|CD", crosses=c("AB", "AC", "AD"),
                 map=map10, n.ind=150,
                 model=c(1, 50, 0.5, 0))

# run calc.genoprob on each cross
x <- lapply(x, calc.genoprob, step=2)

# scan genome, at each position trying all possible partitions
out <- scanPhyloQTL(x, method="hk")

# maximum peak
max(out, format="lod")

# approximate posterior probabilities at peak
max(out, format="postprob")

# all peaks above a threshold for LOD(best) - LOD(2nd best)
summary(out, threshold=1, format="lod")

# all peaks above a threshold for LOD(best), showing approx post'r prob
summary(out, format="postprob", threshold=3)

# plot of results
plot(out)
## End(Not run)
Simulate missing genotype data by removing some genotype data from the cross object
simulatemissingdata(cross, percentage = 5)
cross |
An object of class |
percentage |
How much of the genotype data do we need to randomly drop? |
An object of class cross
with percentage
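The idea of randomly dropping a percentage of the genotype data can be sketched on a plain genotype matrix. The helper drop_genotypes is hypothetical; whether the package removes an exact count or applies a per-genotype probability is not stated here, so the exact-count behavior below is an assumption.

```r
# Sketch: set a given percentage of the entries of a genotype matrix to NA
# (hypothetical helper; drops an exact count of entries, chosen at random).
drop_genotypes <- function(geno, percentage = 5) {
  n_drop <- round(length(geno) * percentage / 100)
  geno[sample(length(geno), n_drop)] <- NA
  geno
}

set.seed(7)
# 200 individuals x 50 markers, genotypes coded 1/2/3
geno <- matrix(sample(1:3, 200 * 50, replace = TRUE), nrow = 200)
geno5 <- drop_genotypes(geno, percentage = 5)
100 * mean(is.na(geno5))  # percentage of genotypes now missing
```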
Danny Arends [email protected]
The MQM tutorial: https://rqtl.org/tutorials/MQM-tour.pdf
MQM - MQM description and references
mqmscan - Main MQM single trait analysis
mqmscanall - Parallelized traits analysis
mqmaugment - Augmentation routine for estimating missing data
mqmautocofactors - Set cofactors using marker density
mqmsetcofactors - Set cofactors at fixed locations
mqmpermutation - Estimate significance levels
scanone - Single QTL scanning
data(multitrait)
multitrait <- fill.geno(multitrait)
multimissing5 <- simulatemissingdata(multitrait,perc=5)
perc <- (sum(nmissing(multimissing5))/sum(ntyped(multimissing5)))
Performs forward/backward selection to identify a multiple QTL model, with model choice made via a penalized LOD score, with separate penalties on main effects and interactions.
stepwiseqtl(cross, chr, pheno.col=1, qtl, formula, max.qtl=10, covar=NULL, method=c("imp", "hk"), model=c("normal", "binary"), incl.markers=TRUE, refine.locations=TRUE, additive.only=FALSE, scan.pairs=FALSE, penalties, keeplodprofile=TRUE, keeptrace=FALSE, verbose=TRUE, tol=1e-4, maxit=1000, require.fullrank=FALSE)
cross |
An object of class |
chr |
Optional vector indicating the chromosomes to consider in
search for QTL. This should be a vector of character strings
referring to chromosomes by name; numeric values are converted to
strings. Refer to chromosomes with a preceding |
pheno.col |
Column number in the phenotype matrix which should be used as the phenotype. One may also give character strings matching the phenotype names. Finally, one may give a numeric vector of phenotypes, in which case it must have the length equal to the number of individuals in the cross, and there must be either non-integers or values < 1 or > no. phenotypes; this last case may be useful for studying transformations. |
qtl |
Optional QTL object (of class |
formula |
Optional formula to define the QTL model to be used as a starting point. |
max.qtl |
Maximum number of QTL to which forward selection should proceed. |
covar |
Data frame of additive covariates. |
method |
Indicates whether to use multiple imputation or Haley-Knott regression. |
model |
The phenotype model: the usual model or a model for binary traits |
incl.markers |
If FALSE, do calculations only at points on an evenly spaced grid. |
refine.locations |
If TRUE, use |
additive.only |
If TRUE, allow only additive QTL models; if FALSE, consider also pairwise interactions among QTL. |
scan.pairs |
If TRUE, perform a two-dimensional, two-QTL scan at each step of forward selection. |
penalties |
Vector of three (or six) values indicating the penalty on the number of QTL terms. If three values, these are the penalties on main effects and heavy and light penalties on interactions. If six values, these include X-chr-specific penalties, and the values are: main effect for autosomes, main effect for X chr, heavy penalty on A:A interactions, light penalty on A:A interactions, penalty on A:X interactions, and penalty on X:X interactions. See the Details below. If missing, default values are used that are based on simulations of backcrosses and intercrosses with genomes modeled after that of the mouse. |
keeplodprofile |
If TRUE, keep the LOD profiles from the last iteration as attributes to the output. |
keeptrace |
If TRUE, keep information on the sequence of models visited through the course of forward and backward selection as an attribute to the output. |
verbose |
If TRUE, give feedback about progress. If
|
tol |
Tolerance for convergence for the binary trait model. |
maxit |
Maximum number of iterations for fitting the binary trait model. |
require.fullrank |
If TRUE, give LOD=0 when covariate matrix in the linear regression is not of full rank. |
We seek to identify the model with maximal penalized LOD score. The penalized LOD score, defined in Manichaikul et al. (2009), is the LOD score for the model (the likelihood ratio comparing the model to the null model with no QTL) with penalties on the number of QTL and QTL:QTL interactions.
We consider QTL models allowing pairwise interactions among QTL but with an enforced hierarchy in which inclusion of a pairwise interaction requires the inclusion of both of the corresponding main effects. Additive covariates may be included, but currently we do not explore QTL:covariate interactions. Also, the penalized LOD score criterion is currently defined only for autosomal loci, and results with the X chromosome should be considered with caution.
The penalized LOD score is of the form LOD(γ) - T_m p_m - T_h p_h - T_l p_l, where γ denotes a model, p_m is the number of QTL in the model ("main effects"), p_h is the number of pairwise interactions that will be given a heavy interaction penalty, p_l is the number of pairwise interactions that will be given a light interaction penalty, T_m is the penalty on main effects, T_h is the heavy interaction penalty, and T_l is the light interaction penalty. The penalties argument is the vector (T_m, T_h, T_l). If T_l is missing (penalties is a vector of length 2), we assume T_l = T_h, and so all pairwise interactions are assigned the same penalty.
The "heavy" and "light" interaction penalties can be a bit confusing. Consider the clusters of QTL that are connected via one or more pairwise interactions. To each such cluster, we assign at most one "light" interaction penalty, and give all other pairwise interactions the heavy interaction penalty. In other words, if p_i is the total number of pairwise interactions for a QTL model, we let p_l be the number of clusters of connected QTL with at least one pairwise interaction, and then let p_h = p_i - p_l.
Let us give an explicit example. Consider a model with 6 QTL, and with interactions between QTL 2 and 3, QTL 4 and 5 and QTL 4 and 6 (so we have the model formula y ~ Q1 + Q2 + Q3 + Q4 + Q5 + Q6 + Q2:Q3 + Q4:Q5 + Q4:Q6). There are three clusters of connected QTL: (1), (2,3) and (4,5,6). We would assign 6 main effect penalties (p_m = 6), 2 light interaction penalties (p_l = 2), and 1 heavy interaction penalty (p_h = 1).
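The counting of main-effect, heavy and light penalty terms, and the resulting penalized LOD score, can be sketched in base R. The helper names count_penalty_terms and penalized_lod are hypothetical, and the LOD and penalty values used at the end are made up purely for illustration.

```r
# Sketch: count penalty terms for a QTL model and compute the penalized LOD.
# n_qtl: number of QTL; interactions: two-column matrix of interacting QTL pairs.
count_penalty_terms <- function(n_qtl, interactions) {
  cluster <- seq_len(n_qtl)           # each QTL starts in its own cluster
  for (i in seq_len(nrow(interactions))) {
    from <- cluster[interactions[i, 1]]
    to <- cluster[interactions[i, 2]]
    cluster[cluster == from] <- to    # merge the two clusters
  }
  # number of clusters that contain at least one pairwise interaction
  n_light <- length(unique(cluster[as.vector(interactions)]))
  c(main = n_qtl, heavy = nrow(interactions) - n_light, light = n_light)
}

# penalties = c(main-effect penalty, heavy penalty, light penalty)
penalized_lod <- function(lod, terms, penalties) {
  unname(lod - sum(terms[c("main", "heavy", "light")] * penalties))
}

# the 6-QTL example from the text: Q2:Q3, Q4:Q5, Q4:Q6
ints <- rbind(c(2, 3), c(4, 5), c(4, 6))
terms <- count_penalty_terms(6, ints)
terms

# illustrative (made-up) LOD score and penalties
penalized_lod(lod = 30, terms = terms, penalties = c(3, 4.5, 2))
```

For the example model this gives 6 main-effect terms, 1 heavy and 2 light interaction terms, matching the counts in the text.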
Manichaikul et al. (2009) described a system for deriving the three penalties on the basis of permutation results from a two-dimensional, two-QTL genome scan (as calculated with scantwo). These may be calculated with the function calc.penalties.
A forward/backward search method is used, with the aim of optimizing the penalized LOD score criterion. That is, we seek to identify the model with the maximal penalized LOD score. The search algorithm was based closely on an algorithm described by Zeng et al. (1999).
We use forward selection to a model of moderate size (say 10 QTL), followed by backward elimination all the way to the null model. The chosen model is that which optimizes the penalized LOD score criterion, among all models visited. The detailed algorithm is as follows. Note that if additive.only=TRUE, no pairwise interactions are considered.
1. Start at the null model, and perform a single-QTL genome scan, and choose the position giving the largest LOD score. If scan.pairs=TRUE, start with a two-dimensional, two-QTL genome scan instead. If an initial QTL model was defined through the arguments qtl and formula, start with this model and jump immediately to step 2.
2. With a fixed QTL model in hand:
   - Scan for an additional additive QTL.
   - For each QTL in the current model, scan for an additional interacting QTL.
   - If there are at least 2 QTL in the current model, consider adding one of the possible pairwise interactions.
   - If scan.pairs=TRUE, perform a two-dimensional, two-QTL scan, seeking to add a pair of novel QTL, either additive or interacting.
   - Step to the model that gives the largest value for the model comparison criterion, among those considered at the current step.
3. Refine the locations of the QTL in the current model (if refine.locations=TRUE).
4. Repeat steps 2 and 3 up to a model with some pre-determined number of loci.
5. Perform backward elimination, all the way back to the null model. At each step, consider dropping one of the current main effects or interactions; move to the model that maximizes the model comparison criterion, among those considered at this step. Follow this with a refinement of the locations of the QTL.
6. Finally, choose the model having the largest model comparison criterion, among all models visited.
In this forward/backward algorithm, it is likely best to build up to an overly large model and then prune it back. Note that there is no "stopping rule"; the chosen model is that which optimizes the model comparison criterion, among all models visited. The search can be time consuming, particularly if a two-dimensional scan is performed at each forward step. Such two-dimensional scans may be useful for identifying QTL linked in repulsion (having effects of opposite sign) or interacting QTL with limited marginal effects, but our limited experience suggests that they are not necessary; important linked or interacting QTL pairs can be picked up in the forward selection to a large model, and will be retained in the backward elimination phase.
The output is a representation of the best model, as measured by the penalized LOD score (see Details), among all models visited. This is a QTL object (of class "qtl", as produced by makeqtl), with attributes "formula", indicating the model formula, and "pLOD", indicating the penalized LOD score.
If keeplodprofile=TRUE, LOD profiles from the last pass through the refinement algorithm are retained as an attribute, "lodprofile", to the object. These may be plotted with plotLodProfile.
If keeptrace=TRUE, the output will contain an attribute "trace" containing information on the best model at each step of forward and backward elimination. This is a list of objects of class "compactqtl", which is similar to a QTL object (as produced by makeqtl) but containing just a vector of chromosome IDs and positions for the QTL. Each will also have attributes "formula" (containing the model formula) and "pLOD" (containing the penalized LOD score).
imp: multiple imputation is used, as described by Sen and Churchill (2001).
hk: Haley-Knott regression is used (regression of the phenotypes on the multipoint QTL genotype probabilities), as described by Haley and Knott (1992).
Karl W Broman, [email protected]
Manichaikul, A., Moon, J. Y., Sen, Ś, Yandell, B. S. and Broman, K. W. (2009) A model selection approach for the identification of quantitative trait loci in experimental crosses, allowing epistasis. Genetics, 181, 1077–1086.
Broman, K. W. and Speed, T. P. (2002) A model selection approach for the identification of quantitative trait loci in experimental crosses (with discussion). J Roy Stat Soc B 64, 641–656, 731–775.
Haley, C. S. and Knott, S. A. (1992) A simple regression method for mapping quantitative trait loci in line crosses using flanking markers. Heredity 69, 315–324.
Sen, Ś. and Churchill, G. A. (2001) A statistical framework for quantitative trait mapping. Genetics 159, 371–387.
Zeng, Z.-B., Kao, C.-H. and Basten, C. J. (1999) Estimating the genetic architecture of quantitative traits. Genetical Research, 74, 279–289.
calc.penalties, plotModel, makeqtl, fitqtl, refineqtl, addqtl, addpair
data(fake.bc)
## Not run:
fake.bc <- calc.genoprob(fake.bc, step=2.5)
outsw <- stepwiseqtl(fake.bc, max.qtl=3, method="hk", keeptrace=TRUE)

# best model
outsw
plotModel(outsw)

# path through model space
thetrace <- attr(outsw, "trace")

# plot of these
par(mfrow=c(3,3))
for(i in seq(along=thetrace))
  plotModel(thetrace[[i]], main=paste("pLOD =",round(attr(thetrace[[i]],"pLOD"), 2)))
## End(Not run)
Replace all partially informative genotypes (e.g., dominant markers in an intercross) with missing values.
strip.partials(cross, verbose=TRUE)
cross |
An object of class |
verbose |
If TRUE, print the number of genotypes removed. |
The same class cross object as in the input, but with partially informative genotypes made missing.
Karl W Broman, [email protected]
data(listeria)
sum(nmissing(listeria))
listeria <- strip.partials(listeria)
sum(nmissing(listeria))
Pull out a specified set of chromosomes and/or individuals from a cross object.
## S3 method for class 'cross'
subset(x, chr, ind, ...)
## S3 method for class 'cross'
x[chr, ind]
x |
An object of class |
chr |
Optional vector specifying which chromosomes to keep or discard. This may be a logical, numeric, or character string vector. See Details, below. |
ind |
Optional vector specifying which individuals to keep or discard. This may be a logical, numeric or character string vector. See Details, below. |
... |
Ignored at this point. |
The chr argument may be a logical vector with length equal to the number of chromosomes in the input cross x. Alternatively, it should be a vector of character strings referring to chromosomes by name. Numeric values are converted to strings. Refer to chromosomes with a preceding - to have all chromosomes but those considered.
If the ind argument is a logical vector (TRUE/FALSE), it should have length equal to the number of individuals in the input cross x. The individuals with corresponding TRUE values are retained.
If the ind argument is numeric, it should have values either between 1 and the number of individuals in the input cross x (in which case these individuals will be retained), or it should have values between -1 and -n, where n is the number of individuals in the input cross x, in which case all except these individuals will be retained.
If the input cross object x contains individual identifiers (a phenotype column labeled "id" or "ID"), and if the ind argument contains character strings, then these will be matched against the individual identifiers. If all values in ind are preceded by a -, we omit those individuals whose IDs match those in ind. Otherwise, we retain those individuals whose IDs match those in ind.
The input cross object, but with only the specified subset of the data.
Karl W Broman, [email protected]
pull.map, drop.markers, subset.map
data(fake.f2)
fake.f2.A <- subset(fake.f2, chr=c("5","13"))
fake.f2.B <- subset(fake.f2, ind = -c(1,5,10))
fake.f2.C <- subset(fake.f2, chr=1:5, ind=1:50)

data(listeria)
y <- pull.pheno(listeria, 1)
listeriaB <- subset(listeria, ind = (!is.na(y) & y < 264))

# individual identifiers
listeria$pheno$ID <- paste("mouse", 1:nind(listeria), sep="")
listeriaC <- subset(listeria, ind=c("mouse1","mouse11","mouse21"))
listeriaD <- subset(listeria, ind=c("-mouse1","-mouse11","-mouse21"))

# you can also use brackets (like matrix with rows=chromosomes and columns=individuals)
temp <- listeria[c("5","13"),]  # chr 5 and 13
temp <- listeria[ , 1:10]       # first ten individuals
temp <- listeria[5, 1:10]       # chr 5 for first ten individuals
Pull out a specified set of chromosomes from a
map
object.
## S3 method for class 'map'
subset(x, ...)

## S3 method for class 'map'
x[...]
x |
A list whose components are vectors of marker locations. |
... |
Vector of chromosome indices. |
The input map
object, but with only the specified subset
of chromosomes.
Karl W Broman
data(map10)
map10 <- subset(map10, chr=1:5)

# you can also use brackets
map10 <- map10[2:3]
Pull out a specified set of chromosomes and/or LOD columns from
scanone
output.
## S3 method for class 'scanone'
subset(x, chr, lodcolumn, ...)
x |
An object of class scanone. |
chr |
Optional vector specifying which chromosomes to keep.
This should be a vector of character strings referring to
chromosomes by name; numeric values are converted to strings. Refer to
chromosomes with a preceding |
lodcolumn |
A vector specifying which LOD columns to keep or (if
negative) omit. These should be between 1 and the number of LOD
columns in the input |
... |
Ignored at this point. |
The input scanone
object, but with only the specified
subset of the data.
Karl W Broman
data(fake.bc)
fake.bc <- calc.genoprob(fake.bc, step=2.5)
out <- scanone(fake.bc, method="hk", pheno.col=1:2)

summary(subset(out, chr=18:19), format="allpeaks")
Pull out results for a specified set of LOD columns from
permutation results from scanone
.
## S3 method for class 'scanoneperm'
subset(x, repl, lodcolumn, ...)

## S3 method for class 'scanoneperm'
x[repl, lodcolumn]
x |
Permutation results from
|
repl |
A vector specifying which permutation replicates to keep or (if negative) omit. |
lodcolumn |
A vector specifying which LOD columns to keep or (if
negative) omit. These should be between 1 and the number of LOD
columns in the input |
... |
Ignored at this point. |
The input scanone
permutation results, but with only the specified
subset of the data.
Karl W Broman
summary.scanoneperm
,
scanone
, c.scanoneperm
,
cbind.scanoneperm
,
rbind.scanoneperm
data(fake.bc)
fake.bc <- calc.genoprob(fake.bc, step=5)
operm <- scanone(fake.bc, method="hk", pheno.col=1:2, n.perm=25)

operm2 <- subset(operm, lodcolumn=2)

# alternatively
operm2alt <- operm[,2]
Pull out a specified set of chromosomes and/or LOD columns from
scantwo
output.
## S3 method for class 'scantwo'
subset(x, chr, lodcolumn, ...)
x |
An object of class scantwo. |
chr |
Optional vector specifying which chromosomes to keep.
This should be a vector of character strings referring to
chromosomes by name; numeric values are converted to strings. Refer to
chromosomes with a preceding |
lodcolumn |
A vector specifying which LOD columns to keep or (if
negative) omit. These should be between 1 and the number of LOD
columns in the input |
... |
Ignored at this point. |
The input scantwo
object, but with only the specified
subset of the data.
Karl W Broman
data(fake.bc)
fake.bc <- calc.genoprob(fake.bc)
out <- scantwo(fake.bc, method="hk", pheno.col=1:2)

summary(subset(out, chr=18:19))
Pull out results for a specified set of LOD columns from
permutation results from scantwo
.
## S3 method for class 'scantwoperm'
subset(x, repl, lodcolumn, ...)

## S3 method for class 'scantwoperm'
x[repl, lodcolumn]
x |
Permutation results from
|
repl |
A vector specifying which permutation replicates to keep or (if negative) omit. Ignored in the case of X-chromosome-specific permutations. |
lodcolumn |
A vector specifying which LOD columns to keep or (if
negative) omit. These should be between 1 and the number of LOD
columns in the input |
... |
Ignored at this point. |
The input scantwo
permutation results, but with only the specified
subset of the data.
Karl W Broman
summary.scantwoperm
,
scantwo
, c.scantwoperm
,
rbind.scantwoperm
data(fake.bc)
fake.bc <- calc.genoprob(fake.bc, step=0)
operm <- scantwo(fake.bc, method="hk", pheno.col=1:2, n.perm=5)

operm2 <- subset(operm, lodcolumn=2)

# alternatively
operm2alt <- operm[,2]
Prints a summary of the output from comparegeno
that
includes pairs of individuals whose proportion of matching genotypes
is above a chosen threshold.
## S3 method for class 'comparegeno'
summary(object, thresh=0.9, ...)
object |
An object of class comparegeno. |
thresh |
Threshold on the proportion of matching genotypes. |
... |
Ignored at this point. |
A data frame with each row being a pair of individuals and columns
including the individual identifiers (via getid
, or just as
numeric indexes) along with the proportion of matching genotypes.
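As a rough illustration of the kind of table described above, the following base-R sketch pulls out pairs from a symmetric matrix of matching proportions. The matrix values are made up, `pairs_above` is a hypothetical helper rather than the package's code, and treating "above" as a strict `>` is an assumption.

```r
# Simulated stand-in for comparegeno() output: entry (i, j) is the proportion
# of matching genotypes for individuals i and j (values are made up).
prop <- matrix(c(  NA, 0.95, 0.40,
                 0.95,   NA, 0.55,
                 0.40, 0.55,   NA), nrow = 3, byrow = TRUE)

# Hypothetical helper: one row per pair whose proportion is above `thresh`
pairs_above <- function(prop, thresh = 0.9) {
  # use only the upper triangle so that each pair is reported once
  idx <- which(upper.tri(prop) & !is.na(prop) & prop > thresh, arr.ind = TRUE)
  data.frame(ind1 = idx[, "row"], ind2 = idx[, "col"],
             prop_match = prop[idx])
}

res <- pairs_above(prop, 0.9)
res
```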
Karl W Broman
data(fake.f2)
cg <- comparegeno(fake.f2)
summary(cg, 0.7)
Print summary information about a cross
object.
## S3 method for class 'cross'
summary(object, ...)
object |
An object of class cross. |
... |
Ignored at this point. |
An object of class summary.cross
containing a variety of summary information about the cross (this is
generally printed automatically).
Karl W Broman
read.cross
, plot.cross
,
nind
,
nmar
,
nchr
,
totmar
,
nphe
data(fake.f2)
summary(fake.f2)
Print summary information about the results of fitqtl
.
## S3 method for class 'fitqtl'
summary(object, pvalues=TRUE, simple=FALSE, ...)
object |
Output from fitqtl. |
pvalues |
If FALSE, don't include p-values in the summary. |
simple |
If TRUE, don't include p-values or sums of squares in the summary. |
... |
Ignored at this point. |
An object of class summary.fitqtl
, which is not all that
different from the input, but when printed gives summary information
about the results.
Hao Wu; Karl W Broman
data(fake.f2)

# take out several QTLs and make QTL object
qc <- c(1, 8, 13)
qp <- c(26, 56, 28)
fake.f2 <- subset(fake.f2, chr=qc)

fake.f2 <- calc.genoprob(fake.f2, step=2, err=0.001)
qtl <- makeqtl(fake.f2, qc, qp, what="prob")

# fit model with 3 interacting QTLs
# (performing a drop-one-term analysis)
lod <- fitqtl(fake.f2, pheno.col=1, qtl, formula=y~Q1*Q2*Q3, method="hk")
summary(lod)
Print summary information about a qtl
object.
## S3 method for class 'qtl'
summary(object, ...)
object |
An object of class qtl. |
... |
Ignored at this point. |
An object of class summary.qtl
, which is just a data.frame
containing the chromosomes, positions, and number of possible
genotypes for each QTL.
Karl W Broman
data(fake.f2)

# take out several QTLs and make QTL object
qc <- c(1, 6, 13)
qp <- c(25.8, 33.6, 18.63)
fake.f2 <- subset(fake.f2, chr=qc)

fake.f2 <- calc.genoprob(fake.f2, step=2, err=0.001)
qtl <- makeqtl(fake.f2, qc, qp, what="prob")

summary(qtl)
Print marker orders, from the output of the function ripple
,
for which the log10 likelihood relative to the initial order is above
a specified cutoff.
## S3 method for class 'ripple'
summary(object, lod.cutoff = -1, ...)
object |
An object of class ripple. |
lod.cutoff |
Only marker orders with LOD score (relative to the
initial order) above this cutoff will be displayed. For output of
|
... |
Ignored at this point. |
An object of class summary.ripple
, whose rows correspond to
marker orders with likelihood (or number of obligate crossovers)
within some cutoff of the initial order. If no marker order, other
than the initial one, has likelihood within the specified range, the
initial and next-best orders are returned.
Karl W Broman
## Not run:
data(badorder)
rip1 <- ripple(badorder, 1, 7)
summary(rip1)

rip2 <- ripple(badorder, 1, 2, method="likelihood")
summary(rip2)

badorder <- switch.order(badorder, 1, rip2[2,])

## End(Not run)
Print the rows of the output from scanone
that
correspond to the maximum LOD for each chromosome, provided that they
exceed some specified thresholds.
## S3 method for class 'scanone'
summary(object, threshold,
        format=c("onepheno", "allpheno", "allpeaks", "tabByCol", "tabByChr"),
        perms, alpha, lodcolumn=1, pvalues=FALSE,
        ci.function=c("lodint", "bayesint"), ...)
object |
An object output by the function
|
threshold |
LOD score thresholds. Only peaks with LOD score above
this value will be returned. This could be a single number or (for
formats other than |
format |
Format for the output. See Details, below. |
perms |
Optional permutation results used to derive thresholds or
to calculate genome-scan-adjusted p-values. This must be consistent
with the |
alpha |
If perms are included, this is the significance level used
to calculate thresholds for determining which peaks to pull out.
If |
lodcolumn |
If |
pvalues |
If TRUE, include columns with genome-scan-adjusted
p-values in the results. This requires that |
ci.function |
For formats |
... |
For formats |
This function is used to report loci deemed interesting from a one-QTL
genome scan (by scanone
).
For format="onepheno"
, we focus on a single LOD score column,
indicated by lodcolumn
. The single largest LOD score peak on
each chromosome is extracted. If threshold
is specified, only
those peaks with LOD meeting the threshold will be
returned. If perms
and alpha
are specified, a threshold
is calculated based on the permutation results in perms
for the
significance level alpha
. If neither threshold
nor
alpha
is specified, the peak on each chromosome is returned.
Again note that with this format, only the LOD score column indicated
by lodcolumn
is considered in deciding which chromosomes to
return, but the LOD scores from other columns, at the position with
maximum LOD score in the lodcolumn
column, are also returned.
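The "onepheno" logic described above can be sketched in base R on a small made-up table. The data frame, its column names, and the helper `peaks_onepheno` are all hypothetical; this is not the package's implementation, only an illustration of picking one peak per chromosome by a single LOD column and carrying the other columns along.

```r
# Simulated scan results (hypothetical values, not real scanone output)
scan <- data.frame(chr  = c("1", "1", "2", "2", "3"),
                   pos  = c(10, 20, 5, 15, 30),
                   lod  = c(1.2, 3.4, 0.8, 0.5, 4.1),
                   lod2 = c(0.3, 0.9, 2.2, 0.1, 1.0))

# Hypothetical helper: largest peak per chromosome in one LOD column
peaks_onepheno <- function(scan, lodcolumn = "lod", threshold = 0) {
  by_chr <- split(scan, factor(scan$chr, levels = unique(scan$chr)))
  # for each chromosome, keep the row with the largest LOD in `lodcolumn`
  peaks <- do.call(rbind, lapply(by_chr, function(d) d[which.max(d[[lodcolumn]]), ]))
  # keep only peaks meeting the threshold; the other LOD columns ride along
  peaks[peaks[[lodcolumn]] >= threshold, ]
}

res <- peaks_onepheno(scan, threshold = 3)
res
```

Note that `lod2` plays no part in choosing rows here; its value is simply reported at the position chosen by `lod`.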
For format="allpheno"
, we consider all LOD score columns, and
pull out the position, on each chromosome, showing the largest LOD
score. The output thus may contain multiple rows for a chromosome.
Here threshold
may be a vector of LOD score thresholds, one for
each LOD score column, in which case only those positions for which a
LOD score column exceeded its threshold are given. If
threshold
is a single number, it is applied to all of the LOD
score columns. If alpha
is specified, it must be a single
significance level, applied for all LOD score columns, and again
perms
must be specified, and these are used to calculate the
LOD score threshold for the significance level alpha
.
For format="allpeaks"
, the output will contain, for each
chromosome, the maximum LOD score for each LOD score column, at the
position at which it achieved its maximum. Thus, the output will
contain no more than one row per chromosome, but will contain the
position and maximum LOD score for each of the LOD score columns.
The arguments threshold
and alpha
may be specified as
for the "allpheno"
format. The results for a chromosome are
returned if at least one of the LOD score columns exceeded its
threshold.
For format="tabByCol"
, there will be a separate table for each
LOD score column, with a single peak per chromosome. Included are
columns indicating chromosome, peak position, lower and upper limits
of the confidence interval calculated via lodint
or
bayesint
, and LOD score.
The output for format="tabByChr"
is similar to that of
format="tabByCol"
, but with results organized by chromosome
rather than by LOD score column.
If pvalues=TRUE
, and perms
is specified,
genome-scan-adjusted p-values are calculated for each LOD score
column, and there are additional columns in the output containing
these p-values.
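A minimal sketch of a genome-scan-adjusted p-value, assuming the usual permutation-test definition (the proportion of permutation replicates whose genome-wide maximum LOD is at least as large as the observed peak). The permutation maxima and observed LOD scores below are simulated, and the package's exact computation may differ.

```r
set.seed(42)
perm_max <- rexp(1000)          # simulated genome-wide maximum LODs, one per permutation
observed <- c(0.5, 2.0, 5.0)    # simulated observed peak LOD scores

# adjusted p-value: share of permutation maxima at least as large as the peak
adj_p <- sapply(observed, function(lod) mean(perm_max >= lod))
adj_p
```

Larger observed LOD scores give smaller adjusted p-values, and the resolution is limited to 1 / (number of permutations).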
In the case that X-chromosome specific permutations were performed
(with perm.Xsp=TRUE
in scanone
), autosome-
and X-chromosome specific thresholds and p-values are calculated by
the method in Broman et al. (2006).
An object of class summary.scanone
, to be printed by
print.summary.scanone
.
Karl W Broman
Broman, K. W., Sen, Ś., Owens, S. E., Manichaikul, A., Southard-Smith, E. M. and Churchill, G. A. (2006) The X chromosome in quantitative trait locus mapping. Genetics, 174, 2151–2158.
scanone
, plot.scanone
,
max.scanone
, subset.scanone
,
c.scanone
, summary.scanoneperm,
c.scanoneperm
data(fake.bc)
fake.bc <- calc.genoprob(fake.bc, step=5)

# genome scan by Haley-Knott regression
out <- scanone(fake.bc, method="hk")

# permutation tests
## Not run: operm <- scanone(fake.bc, method="hk", n.perm=1000)

# peaks for all chromosomes
summary(out)

# results with LOD >= 3
summary(out, threshold=3)

# the same, but also showing the p-values
summary(out, threshold=3, perms=operm, pvalues=TRUE)

# results with LOD meeting the 0.05 threshold from the permutation results
summary(out, perms=operm, alpha=0.05)

# the same, also showing the p-values
summary(out, perms=operm, alpha=0.05, pvalues=TRUE)

##### summary with multiple phenotype results
out2 <- scanone(fake.bc, pheno.col=1:2, method="hk")

# permutations
## Not run: operm2 <- scanone(fake.bc, pheno.col=1:2, method="hk", n.perm=1000)

# results with LOD >= 2 for the 1st phenotype and >= 1 for the 2nd phenotype
#     using format="allpheno"
summary(out2, thr=c(2, 1), format="allpheno")

# The same with format="allpeaks"
summary(out2, thr=c(2, 1), format="allpeaks")

# The same with p-values
summary(out2, thr=c(2, 1), format="allpeaks", perms=operm2, pvalues=TRUE)

# results with LOD meeting the 0.05 significance level by the permutations
#     using format="allpheno"
summary(out2, format="allpheno", perms=operm2, alpha=0.05)

# The same with p-values
summary(out2, format="allpheno", perms=operm2, alpha=0.05, pvalues=TRUE)

# The same with format="allpeaks"
summary(out2, format="allpeaks", perms=operm2, alpha=0.05, pvalues=TRUE)

# format="tabByCol"
summary(out2, format="tabByCol", perms=operm2, alpha=0.05, pvalues=TRUE)

# format="tabByChr", but using bayes intervals
summary(out2, format="tabByChr", perms=operm2, alpha=0.05, pvalues=TRUE,
        ci.function="bayesint")

# format="tabByChr", but using 99% bayes intervals
summary(out2, format="tabByChr", perms=operm2, alpha=0.05, pvalues=TRUE,
        ci.function="bayesint", prob=0.99)
Calculates a bootstrap confidence interval for QTL location, using the
bootstrap results from scanoneboot
.
## S3 method for class 'scanoneboot'
summary(object, prob=0.95, expandtomarkers=FALSE, ...)
object |
Output from scanoneboot. |
prob |
Desired coverage. |
expandtomarkers |
If TRUE, the interval is expanded to the nearest flanking markers. |
... |
Ignored at this point. |
An object of class scanone
, indicating the
position with the maximum LOD, and indicating endpoints
for the estimated bootstrap confidence interval.
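One common way to form such an interval is the percentile method: take the central `prob` share of the bootstrap peak positions. The sketch below assumes that approach and uses simulated positions; whether the package computes its interval in exactly this way is an assumption.

```r
set.seed(7)
boot_pos <- rnorm(1000, mean = 25, sd = 3)   # simulated bootstrap peak positions (cM)
prob <- 0.95

# percentile interval: cut off (1 - prob)/2 in each tail
ci <- unname(quantile(boot_pos, c((1 - prob) / 2, 1 - (1 - prob) / 2)))
ci
```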
Karl W Broman
scanoneboot
, plot.scanoneboot
,
lodint
, bayesint
## Not run:
data(fake.f2)
fake.f2 <- calc.genoprob(fake.f2, step=1, err=0.001)
bootoutput <- scanoneboot(fake.f2, chr=13, method="hk")

summary(bootoutput)

## End(Not run)
Print the estimated genome-wide LOD thresholds on the basis of
permutation results from scanone
(with
n.perm
> 0).
## S3 method for class 'scanoneperm'
summary(object, alpha=c(0.05, 0.10), controlAcrossCol=FALSE, ...)
object |
Output from the function |
alpha |
Genome-wide significance levels. |
controlAcrossCol |
If TRUE, control error rate not just across the genome but also across the columns of LOD scores. |
... |
Ignored at this point. |
If there were autosomal data only or scanone
was
run with perm.Xsp=FALSE
, genome-wide LOD thresholds are given;
these are the $1-\alpha$ quantiles of the genome-wide maximum LOD
scores from the permutations.
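In base R, a threshold of this kind is just an upper quantile of the permutation maxima. The sketch below uses simulated maxima in place of real permutation output, and R's default quantile interpolation, which is an assumption about the details.

```r
set.seed(1)
perm_max <- rexp(1000)      # simulated genome-wide maximum LODs, one per permutation
alpha <- c(0.05, 0.10)

# LOD thresholds: the 1 - alpha quantiles of the permutation maxima
thr <- quantile(perm_max, 1 - alpha)
thr
```

A smaller significance level gives a higher threshold, and roughly a share `alpha` of the permutation maxima lie above it.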
If there were autosomal and X chromosome data and
scanone
was run with perm.Xsp=TRUE
,
autosome- and X-chromosome-specific LOD thresholds are given, by the
method described in Broman et al. (2006). Let $L_A$ and $L_X$
be the total genetic lengths of the autosomes and X
chromosome, respectively, and let $L_T = L_A + L_X$.
Then in place of $\alpha$, we use
$\alpha_A = 1 - (1 - \alpha)^{L_A/L_T}$
as the significance level for the autosomes and
$\alpha_X = 1 - (1 - \alpha)^{L_X/L_T}$
as the significance level for the X chromosome. The result is a list with two matrices, one for the autosomes and one for the X chromosome.
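A small sketch of this adjustment, assuming the form given in Broman et al. (2006): the overall level is split between the autosomes and the X chromosome in proportion to genetic length, on the scale of 1 minus the significance level. The helper name `xsp_alpha` and the lengths used are hypothetical.

```r
# Hypothetical helper: autosome- and X-specific significance levels.
# L_A and L_X are total genetic lengths (cM) of the autosomes and X chromosome.
xsp_alpha <- function(alpha, L_A, L_X) {
  L_T <- L_A + L_X
  c(A = 1 - (1 - alpha)^(L_A / L_T),
    X = 1 - (1 - alpha)^(L_X / L_T))
}

a <- xsp_alpha(0.05, L_A = 1400, L_X = 100)   # made-up lengths
a

# the two levels combine back to the overall level (up to rounding)
1 - prod(1 - a)
```

Because the X chromosome is usually a small share of the genome, its significance level is much smaller than that of the autosomes, so its LOD threshold is derived from a more extreme quantile of the X-specific permutation results.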
If controlAcrossCol=TRUE
, we use a trick to control the error
rate not just across the genome but also across the LOD score
columns. Namely, we convert each column of permutation results to
ranks, and then for each permutation replicate we find the maximum
rank across the columns. We then find the appropriate quantile of the
maximized ranks, and then backtrack to the corresponding LOD score
within each of the columns. See Burrage et al. (2010),
right column on page 118.
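The rank-based procedure can be sketched as follows. This is an illustration only: the permutation matrix is simulated, `thresholds_across_cols` is a hypothetical helper, and the handling of ties and of non-integer ranks (linear interpolation here) is an assumption rather than the package's exact code.

```r
set.seed(1)
perms <- matrix(rexp(200), ncol = 2)   # simulated: 100 replicates x 2 LOD columns

# Hypothetical helper: thresholds controlling error across genome and columns
thresholds_across_cols <- function(perms, alpha = 0.05) {
  r      <- apply(perms, 2, rank)              # ranks within each column
  rmax   <- apply(r, 1, max)                   # maximum rank across columns, per replicate
  q      <- unname(quantile(rmax, 1 - alpha))  # quantile of the maximized ranks
  sorted <- apply(perms, 2, sort)
  lo <- floor(q)
  hi <- ceiling(q)
  # backtrack: the LOD score at rank q within each column (interpolating)
  sorted[lo, ] + (q - lo) * (sorted[hi, ] - sorted[lo, ])
}

joint    <- thresholds_across_cols(perms, 0.05)
marginal <- apply(perms, 2, quantile, 0.95)    # ordinary per-column thresholds
rbind(joint, marginal)
```

The joint thresholds are never smaller than the per-column ones, which is the price paid for controlling the error rate across columns as well.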
An object of class summary.scanoneperm
, to be printed by
print.summary.scanoneperm
. If there were X chromosome data and
scanone
was run with perm.Xsp=TRUE
, there are two
matrices in the results, for the autosome and X-chromosome LOD
thresholds.
Karl W Broman
Broman KW, Sen Ś, Owens SE, Manichaikul A, Southard-Smith EM, Churchill GA (2006) The X chromosome in quantitative trait locus mapping. Genetics, 174, 2151–2158.
Burrage LC, Baskin-Hill AE, Sinasac DS, Singer JB, Croniger CM, Kirby A, Kulbokas EJ, Daly MJ, Lander ES, Broman KW, Nadeau JH (2010) Genetic resistance to diet-induced obesity in chromosome substitution strains of mice. Mamm Genome, 21, 115–129.
Churchill GA, Doerge RW (1994) Empirical threshold values for quantitative trait mapping. Genetics 138, 963–971.
scanone
,
summary.scanone
,
plot.scanoneperm
data(fake.f2)
fake.f2 <- calc.genoprob(fake.f2, step=2.5)

operm1 <- scanone(fake.f2, n.perm=100, method="hk")
summary(operm1)

operm2 <- scanone(fake.f2, n.perm=100, method="hk", perm.Xsp=TRUE)
summary(operm2)

# Add noise column
fake.f2$pheno$noise <- rnorm(nind(fake.f2))
operm3 <- scanone(fake.f2, pheno.col=c("phenotype", "noise"), n.perm=10, method="hk")
summary(operm3)
summary(operm3, controlAcrossCol=TRUE, alpha=c(0.05, 0.36))
Print the maximum LOD scores for each partition on each chromosome,
from the results of scanPhyloQTL
.
## S3 method for class 'scanPhyloQTL'
summary(object, format=c("postprob", "lod"), threshold, ...)
object |
An object output by the function
|
format |
Indicates whether to provide LOD scores or approximate posterior probabilities; see Details below. |
threshold |
A threshold determining which chromosomes should be output; see Details below. |
... |
Ignored at this point. |
This function is used to report chromosomes deemed interesting from a one-QTL
genome scan to map QTL to a phylogenetic tree (by scanPhyloQTL
).
For format="lod"
, the output contains the maximum LOD score for
each partition on each chromosome (which do not necessarily occur at
the same position). The position corresponds to the peak location for
the partition with the largest LOD score on that chromosome. The
last column is the overall maximum LOD (across partitions) on that
chromosome. The second-to-last column is the inferred partition
(i.e., that with the largest LOD
score). The third-to-last column is the difference between the LOD score for
the best partition and that for the second-best.
For format="postprob"
, the final column contains the maximum
LOD score across partitions. But instead of providing the LOD
scores for each partition, these are converted to approximate
posterior probabilities under the assumption of a single diallelic QTL
on that chromosome: on each chromosome, we take $10^{\mathrm{LOD}}$
for the partitions and rescale them to sum to 1.
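As a worked illustration of this conversion, assuming that the quantity rescaled is 10 raised to the LOD score for each partition, with made-up LOD scores for three partitions on one chromosome:

```r
# Hypothetical LOD scores for three partitions on a single chromosome
lod <- c("A|BCD" = 1.2, "AB|CD" = 4.5, "AC|BD" = 0.7)

# approximate posterior probabilities: 10^LOD, rescaled to sum to 1
postprob <- 10^lod / sum(10^lod)
round(postprob, 4)
```

Because the LOD score is on a log10 scale, a difference of a few LOD units between the best and second-best partition already puts nearly all of the probability on the best one.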
The threshold
argument is applied to the last column (the
maximum LOD score across partitions).
An object of class summary.scanPhyloQTL
, to be printed by
print.summary.scanPhyloQTL
.
Karl W Broman
Broman, K. W., Kim, S., Ané, C. and Payseur, B. A. Mapping quantitative trait loci to a phylogenetic tree. In preparation.
scanPhyloQTL
, plot.scanPhyloQTL
,
max.scanPhyloQTL
, summary.scanone
,
inferredpartitions
,
simPhyloQTL
## Not run:
# example map; drop X chromosome
data(map10)
map10 <- map10[1:19]

# simulate data
x <- simPhyloQTL(4, partition="AB|CD", crosses=c("AB", "AC", "AD"),
                 map=map10, n.ind=150,
                 model=c(1, 50, 0.5, 0))

# run calc.genoprob on each cross
x <- lapply(x, calc.genoprob, step=2)

# scan genome, at each position trying all possible partitions
out <- scanPhyloQTL(x, method="hk")

# maximum peak
max(out, format="lod")

# approximate posterior probabilities at peak
max(out, format="postprob")

# all peaks above a threshold for LOD(best) - LOD(2nd best)
summary(out, threshold=1, format="lod")

# all peaks above a threshold for LOD(best), showing approx post'r prob
summary(out, format="postprob", threshold=3)

# plot of results
plot(out)

## End(Not run)
Summarize the interesting aspects of the results of scantwo
.
## S3 method for class 'scantwo'
summary(object, thresholds,
        what=c("best", "full", "add", "int"),
        perms, alphas, lodcolumn=1,
        pvalues=FALSE, allpairs=TRUE, ...)
object |
An object of class scantwo. |
thresholds |
A vector of length 5, giving LOD thresholds for the full, conditional-interactive, interaction, additive, and conditional-additive LOD scores. See Details, below. |
what |
Indicates for which LOD score the maximum should be reported. See Details, below. |
perms |
Optional permutation results used to derive thresholds or
to calculate genome-scan-adjusted p-values. This must be consistent
with the |
alphas |
If perms are included, these are the significance levels used
to calculate thresholds for determining which peaks to pull out. It
should be a vector of length 5, giving significance levels
for the full, conditional-interactive, interaction, additive, and
conditional-additive LOD scores. (It can also be a single number, in
which case it is assumed that the same value is used for all five LOD
scores.) If |
lodcolumn |
If the scantwo results contain LOD scores for multiple phenotypes, this argument indicates which to use in the summary. Only one LOD score column may be considered at a time. |
pvalues |
If TRUE, include columns with genome-scan-adjusted
p-values in the results. This requires that |
allpairs |
If TRUE, all pairs of chromosomes are considered. If FALSE, only self-self pairs are considered, so that one may more conveniently check for possible linked QTL. |
... |
Ignored at this point. |
If what="best"
, we calculate, for each pair of chromosomes, the
maximum LOD score for the full model (two QTL plus interaction) and
the maximum LOD score for the additive model. The difference between
these is a LOD score for a test for interaction. We also calculate
the difference between the maximum full LOD and the maximum single-QTL LOD
score for the two chromosomes; this is the LOD score for a test for a
second QTL, allowing for epistasis, which we call either the
conditional-interactive or "fv1" LOD score. Finally,
we calculate the difference between the maximum additive LOD score and
the maximum single-QTL LOD score for the two chromosomes; this is the
LOD score for a test for a second QTL, assuming that the two QTL act
additively, which we call either the conditional-additive or "av1" LOD
score. Note that the maximum full LOD and additive LOD are allowed to
occur in different places.
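The arithmetic behind the five LOD scores can be written out directly. The numbers below are made up for a single pair of chromosomes; only the differences mirror the definitions above.

```r
# Hypothetical maxima for one pair of chromosomes (made-up numbers)
lod_full <- 9.5   # maximum LOD, full model (two QTL plus interaction)
lod_add  <- 7.8   # maximum LOD, additive model
lod_one  <- 5.1   # maximum single-QTL LOD for the two chromosomes

lod_int <- lod_full - lod_add   # interaction
lod_fv1 <- lod_full - lod_one   # conditional-interactive: second QTL, allowing epistasis
lod_av1 <- lod_add  - lod_one   # conditional-additive: second QTL, acting additively

five <- c(full = lod_full, fv1 = lod_fv1, int = lod_int, add = lod_add, av1 = lod_av1)
five
```

The vector is ordered as full, conditional-interactive, interaction, additive, conditional-additive, which is the order expected for `thresholds` and `alphas`.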
If what="full"
, we find the maximum full LOD and extract the
additive LOD at the corresponding pair of positions; we derive
the other three LOD scores for that fixed pair of positions.
If what="add"
, we find the maximum additive LOD and extract the
full LOD at the corresponding pair of positions; we derive
the other three LOD scores for that fixed pair of positions.
If what="int"
, we find the pair of positions for which the
difference between the full and additive LOD scores is largest, and
then calculate the five LOD scores at that pair of positions.
If thresholds
or alphas
is provided (and note that when
alphas
is provided, perms
must also), we extract just
those pairs of chromosomes for which either (a) the full LOD score
exceeds its threshold and either the conditional-interactive LOD or
the interaction LOD exceeds its threshold, or (b) the additive LOD
score exceeds its threshold and the conditional-additive LOD exceeds
its threshold. The thresholds or alphas must be given in the order
full, cond-int, int, add, cond-add.
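The selection rule can be sketched as a small predicate. `keep_pair` is a hypothetical helper, the LOD scores are made up, and treating "exceeds" as greater-than-or-equal is an assumption; the thresholds are the mouse intercross values suggested in this help page.

```r
# Hypothetical helper: does a chromosome pair pass rule (a) or rule (b)?
# `lod` and `thr` are named vectors in the order full, fv1, int, add, av1.
keep_pair <- function(lod, thr) {
  rule_a <- lod[["full"]] >= thr[["full"]] &&
    (lod[["fv1"]] >= thr[["fv1"]] || lod[["int"]] >= thr[["int"]])
  rule_b <- lod[["add"]] >= thr[["add"]] && lod[["av1"]] >= thr[["av1"]]
  rule_a || rule_b
}

thr  <- c(full = 9.1, fv1 = 7.1, int = 6.3, add = 6.3, av1 = 3.3)
lod1 <- c(full = 9.5,  fv1 = 4.4, int = 1.7, add = 7.8, av1 = 2.7)  # made-up values
lod2 <- c(full = 10.0, fv1 = 7.5, int = 0.5, add = 9.5, av1 = 7.0)  # made-up values

keep_pair(lod1, thr)   # fails both (a) and (b)
keep_pair(lod2, thr)   # passes (a)
```

Setting a threshold to `Inf` removes that LOD score from the rule, which is how the interaction LOD can be ignored.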
Thresholds may be obtained by a permutation test with
scantwo
, but these are extremely time-consuming.
For a mouse backcross, we suggest the thresholds (6.0, 4.7, 4.4, 4.7,
2.6) for the full, conditional-interactive, interaction, additive, and
conditional-additive LOD scores, respectively.
For a mouse intercross, we suggest the thresholds (9.1, 7.1, 6.3, 6.3,
3.3) for the full, conditional-interactive, interaction, additive, and
conditional-additive LOD scores, respectively. These were obtained by
10,000 simulations of crosses with 250 individuals, markers at a 10 cM
spacing, and analysis by Haley-Knott regression.
An object of class summary.scantwo
, to be printed by
print.summary.scantwo
.
Note that, for output from addpair
in which the
new loci are indicated explicitly in the formula, the summary provided
by summary.scantwo
is somewhat special.
All arguments except allpairs
and thresholds
(and, of
course, the input object
) are ignored.
If the formula is symmetric in the two new QTL, the output has just two LOD
score columns: lod.2v0
comparing the full model to the model
with neither of the new QTL, and lod.2v1
comparing the full
model to the model with just one new QTL.
If the formula is not symmetric in the two new QTL, the output
has three LOD score columns: lod.2v0
comparing the full model
to the model with neither of the new QTL, lod.2v1b
comparing
the full model to the model in which the first of the new QTL is
omitted, and lod.2v1a
comparing the full model to the model
with the second of the new QTL omitted.
The thresholds
argument should have length 1 or 2, rather than
the usual 5. Rows will be retained if lod.2v0
is greater than
thresholds[1]
and lod.2v1
(or either of lod.2v1a
or lod.2v1b
) is greater than thresholds[2]
. (If a
single threshold is given, we assume that thresholds[2]==0
.)
The previous version of this function is still available, though it is
now named summaryScantwoOld
.
We much prefer the revised function. However, while we are confident
that this function and the permutations in scantwo are calculating
the relevant statistics, the appropriate significance levels for this
relatively complex series of statistical tests are not yet completely
clear.
Karl W Broman, [email protected]
scantwo
, plot.scantwo
,
max.scantwo
, condense.scantwo
data(fake.f2)
fake.f2 <- calc.genoprob(fake.f2, step=5)
out.2dim <- scantwo(fake.f2, method="hk")

# All pairs of chromosomes
summary(out.2dim)

# Chromosome pairs meeting specified criteria
summary(out.2dim, thresholds=c(9.1, 7.1, 6.3, 6.3, 3.3))

# Similar, but ignoring the interaction LOD score in the rule
summary(out.2dim, thresholds=c(9.1, 7.1, Inf, 6.3, 3.3))

# Pairs having largest interaction LOD score, if it's > 4
summary(out.2dim, thresholds=c(0, Inf, 4, Inf, Inf), what="int")

# permutation test to get thresholds; run in two batches
# and then combined with c.scantwoperm
## Not run: 
operm.2dimA <- scantwo(fake.f2, method="hk", n.perm=500)
operm.2dimB <- scantwo(fake.f2, method="hk", n.perm=500)
operm.2dim <- c(operm.2dimA, operm.2dimB)
## End(Not run)

# estimated LOD thresholds
summary(operm.2dim)

# Summary, citing significance levels and so estimating thresholds
# from the permutation results
summary(out.2dim, perms=operm.2dim, alpha=rep(0.05, 5))

# Similar, but ignoring the interaction LOD score in the rule
summary(out.2dim, perms=operm.2dim, alpha=c(0.05, 0.05, 0, 0.05, 0.05))

# Similar, but also getting genome-scan-adjusted p-values
summary(out.2dim, perms=operm.2dim, alpha=c(0.05, 0.05, 0, 0.05, 0.05),
        pvalues=TRUE)
Print the estimated genome-wide LOD thresholds on the basis of
permutation results from scantwo (with n.perm > 0).
## S3 method for class 'scantwoperm'
summary(object, alpha=c(0.05, 0.10), ...)
object |
Output from the function |
alpha |
Genome-wide significance levels. |
... |
Ignored at this point. |
We take the $1-\alpha$ quantiles of the individual LOD scores.
In the case of X-chr-specific permutations, we use the combined length
of the autosomes, $L_A$, and the length of the X chromosome, $L_X$,
and calculate the area of the A:A, A:X, and X:X regions as
$L_A^2/2$, $L_A L_X$, and $L_X^2/2$, and then use the
nominal significance levels of $1 - (1-\alpha)^p$, where $p$
is the proportional area for that region.
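The adjustment for X-chr-specific permutations can be sketched in base R. This is a minimal illustration, assuming the region areas are L_A^2/2, L_A*L_X, and L_X^2/2 and that the nominal level for a region with proportional area p is 1 - (1 - alpha)^p. The function name nominal_alpha and the chromosome lengths are hypothetical.

```r
# Nominal significance levels for the A:A, A:X and X:X regions.
# L_A, L_X: combined autosome length and X chromosome length (cM);
# the values used below are invented for illustration.
nominal_alpha <- function(alpha, L_A, L_X) {
    area <- c("A:A" = L_A^2 / 2, "A:X" = L_A * L_X, "X:X" = L_X^2 / 2)
    p <- area / sum(area)   # proportional area of each region
    1 - (1 - alpha)^p       # region-specific nominal level
}

nom <- nominal_alpha(alpha = 0.05, L_A = 1400, L_X = 100)
nom
```

Under these assumptions the three regions together retain the overall level, since the product of (1 - nom) across regions equals 1 - alpha.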
An object of class summary.scantwoperm
, to be printed by
print.summary.scantwoperm
.
Karl W Broman, [email protected]
Churchill, G. A. and Doerge, R. W. (1994) Empirical threshold values for quantitative trait mapping. Genetics 138, 963–971.
scantwo
,
summary.scantwo
,
plot.scantwoperm
data(fake.f2)
fake.f2 <- calc.genoprob(fake.f2, step=0)

## Not run: 
operm <- scantwo(fake.f2, n.perm=100, method="hk")
summary(operm)
## End(Not run)
Print summary information about a map
object.
## S3 method for class 'map'
summary(object, ...)

summaryMap(object, ...)
object |
An object of class |
... |
Ignored at this point. |
An object of class summary.map
, which is just a data.frame
containing the number of markers, length, the average inter-marker
spacing, and the maximum distance between markers, for each chromosome
and overall. An attribute sexsp
indicates whether the map was
sex-specific.
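The per-chromosome quantities listed above can be sketched in base R for a plain list of marker positions. This is not the package's implementation: the function summarize_map, the toy map, and the column names are assumptions for illustration, and sex-specific maps are not handled.

```r
# A toy map: a list of named marker-position vectors (cM), one per chromosome
toy_map <- list("1" = c(m1 = 0, m2 = 10, m3 = 25),
                "2" = c(m4 = 0, m5 = 5))

summarize_map <- function(map) {
    n.mar   <- vapply(map, length, integer(1))
    len     <- vapply(map, function(pos) diff(range(pos)), numeric(1))
    max.gap <- vapply(map, function(pos) max(diff(pos)), numeric(1))
    # one row per chromosome (assumes at least two markers on each)
    out <- data.frame(n.mar = n.mar, length = len,
                      ave.spacing = len / (n.mar - 1),
                      max.spacing = max.gap)
    # a final row for the whole genome
    overall <- data.frame(n.mar = sum(n.mar), length = sum(len),
                          ave.spacing = sum(len) / sum(n.mar - 1),
                          max.spacing = max(max.gap),
                          row.names = "overall")
    rbind(out, overall)
}

res <- summarize_map(toy_map)
res
```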
Karl W Broman, [email protected]
chrlen
, pull.map
,
summary.cross
data(map10)
summary(map10)
Summarize the interesting aspects of the results of
scantwo
; this is the version of
summary.scantwo
that was included in R/qtl version
1.03 and earlier.
summaryScantwoOld(object, thresholds = c(0, 0, 0), lodcolumn=1, type = c("joint","interaction"), ...)
object |
An object of class |
thresholds |
A vector of length three, giving LOD thresholds for the joint LOD, interaction LOD and single-QTL conditional LOD. Negative threshold values are taken relative to the maximum joint, interaction, or individual QTL LOD, respectively. |
lodcolumn |
If the scantwo results contain LOD scores for multiple phenotypes, this argument indicates which to use in the summary. |
type |
Indicates whether to pick peaks with maximal joint or interaction LOD. |
... |
Ignored at this point. |
For each pair of chromosomes, the pair of loci for which the
LOD score (either joint or interaction LOD, according to the argument
type) is a maximum is considered. The pair is printed only if
its joint LOD score exceeds the joint threshold and either (a) the
interaction LOD score exceeds its threshold or (b) both of the loci have
conditional LOD scores that are above the conditional LOD threshold,
where the conditional LOD score for locus $q_1$,
$\mathrm{LOD}(q_1 \mid q_2)$, is the $\log_{10}$ likelihood ratio
comparing the model with $q_1$ and $q_2$ acting additively to the
model with $q_2$ alone.
If the results of scanone are not available, the maximum locus pair
for each chromosome is printed whenever its joint LOD exceeds the
joint LOD threshold.
The criterion used in this summary is due to Gary Churchill and Śaunak Sen, and deserves careful consideration and possible revision.
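The printing criterion can be written as a small predicate in base R. This is a sketch of the rule as stated in the text, not the package's code: print_pair is a hypothetical name, the LOD values are invented, and negative (relative) thresholds are not handled.

```r
# Decide whether a locus pair would be printed, per the rule in the text.
# joint, int: joint and interaction LOD; cond1, cond2: conditional LODs.
# thresholds: c(joint, interaction, conditional), as in the argument list.
print_pair <- function(joint, int, cond1, cond2, thresholds) {
    joint > thresholds[1] &&
        (int > thresholds[2] ||
         (cond1 > thresholds[3] && cond2 > thresholds[3]))
}

print_pair(8.0, 1.0, 3.5, 3.2, c(7, 3, 3))  # TRUE: both conditional LODs pass
print_pair(8.0, 1.0, 3.5, 2.0, c(7, 3, 3))  # FALSE: one conditional LOD fails
print_pair(6.0, 4.0, 3.5, 3.2, c(7, 3, 3))  # FALSE: joint LOD too small
```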
An object of class summary.scantwo.old, to be printed by
print.summary.scantwo.old. Pairs of loci meeting
the specified criteria are printed, with their joint LOD, interaction
LOD, and the conditional LOD for each locus, along with single-point
P-values calculated by the $\chi^2$ approximation.
P-values are printed as $-\log_{10}(P)$.
If the input scantwo
object does not include the results of
scanone
, the interaction and conditional LOD thresholds are
ignored, and all pairs of loci for which the joint LOD exceeds its
threshold are printed, though without their conditional LOD scores.
Hao Wu; Karl W Broman, [email protected]; Brian Yandell
summary.scantwo
,
scantwo
, plot.scantwo
,
max.scantwo
data(fake.f2)
fake.f2 <- calc.genoprob(fake.f2, step=5)
out.2dim <- scantwo(fake.f2, method="hk")

# All pairs of loci
summaryScantwoOld(out.2dim)

# Pairs meeting specified criteria
summaryScantwoOld(out.2dim, c(7, 3, 3))

# Pairs with both conditional LODs > 2
summaryScantwoOld(out.2dim, c(0, 1000, 2))

# Pairs with interaction LOD above 3
summaryScantwoOld(out.2dim, c(0, 3, 1000))
Switch the order of markers on a specified chromosome to a specified new order.
switch.order(cross, chr, order, error.prob=0.0001,
             map.function=c("haldane","kosambi","c-f","morgan"),
             maxit=4000, tol=1e-6, sex.sp=TRUE)
cross |
An object of class |
chr |
The chromosome for which the marker order is to be switched. Only one chromosome is allowed. (This should be a character string referring to the chromosomes by name.) |
order |
A vector of numeric indices defining the new marker
order. The vector may have length two more than the number of
markers, for ease of use with the output of the function
|
error.prob |
Assumed genotyping error rate (passed to
|
map.function |
Map function to be used (passed to
|
maxit |
Maximum number of EM iterations to perform. |
tol |
Tolerance for determining convergence. |
sex.sp |
Indicates whether to estimate sex-specific maps; this is used only for the 4-way cross. |
The input cross
object, but with the marker order on the
specified chromosome updated, and with any derived data removed
(except for recombination fractions, if present, which are not
removed); the genetic map for the relevant chromosome is
re-estimated.
Karl W Broman, [email protected]
flip.order
, ripple
, clean.cross
data(fake.f2)
fake.f2 <- switch.order(fake.f2, 1, c(1,3,2,4:7))
Switch alleles at selected markers in a cross object.
switchAlleles(cross, markers, switch=c("AB", "CD", "ABCD", "parents"))
cross |
An object of class |
markers |
Names of markers whose alleles are to be switched. |
switch |
For a 4-way cross, indicates how to switch the alleles (A
for B, C for D, both A for B and C for D), or both A for C and B for D ( |
For a backcross, we exchange homozygotes (AA) and heterozygotes (AB).
For doubled haploids and recombinant inbred lines, we exchange the two homozygotes.
For an intercross, we exchange the two homozygotes, and exchange C (i.e., not AA) and D (i.e., not BB). (The heterozygotes in an intercross are left unchanged.)
For a 4-way cross, we consider the argument switch
, and the
exchanges among the genotypes are more complicated.
The input cross object, with alleles at selected markers switched.
Karl W Broman, [email protected]
checkAlleles
, est.rf
, geno.crosstab
data(fake.f2)
geno.crosstab(fake.f2, "D5M391", "D5M81")

# switch homozygotes at marker D5M391
fake.f2 <- switchAlleles(fake.f2, "D5M391")
geno.crosstab(fake.f2, "D5M391", "D5M81")

## Not run: 
fake.f2 <- est.rf(fake.f2)
checkAlleles(fake.f2)
## End(Not run)
Convert a data frame with marker positions to a map object.
table2map(tab)
tab |
A data frame with two columns: chromosome and position. The row names are the marker names. |
A map
object: a list whose components (corresponding
to chromosomes) are vectors of marker positions.
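The shape of the result can be illustrated in base R by splitting the position column by chromosome. This sketch only mimics the list structure described above; it is an assumption-laden stand-in, not the function itself, and it does not set the class of the result or of its chromosomes. The names pos and sketch_map are hypothetical.

```r
tab <- data.frame(chr = c(1, 1, 2, 2, 2),
                  pos = c(0, 5, 0, 3, 9))
rownames(tab) <- paste0("marker", 1:nrow(tab))

# Attach marker names to the positions, then split by chromosome,
# keeping chromosomes in their order of first appearance
pos <- setNames(tab$pos, rownames(tab))
sketch_map <- split(pos, factor(tab$chr, levels = unique(tab$chr)))

sketch_map
```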
Karl W Broman, [email protected]
tab <- data.frame(chr=c(1,1,1,1,2,2,2,2,3,3,3,3),
                  pos=c(0,2,4,8,0,2,4,8,0,2,4,8))
rownames(tab) <- paste0("marker", 1:nrow(tab))
map <- table2map(tab)
Prints those genotypes with error LOD scores above a specified cutoff.
top.errorlod(cross, chr, cutoff=4, msg=TRUE)
cross |
An object of class |
chr |
Optional vector indicating the chromosomes to consider.
This should be a vector of character strings referring to chromosomes
by name; numeric values are converted to strings. Refer to
chromosomes with a preceding |
cutoff |
Only those genotypes with error LOD scores above this cutoff will be listed. |
msg |
If TRUE, print a message if there are no apparent errors. |
A data.frame with 4 columns, whose rows correspond to the genotypes that are possibly in error. The four columns give the chromosome number, individual number, marker name, and error LOD score.
Karl W Broman, [email protected]
calc.errorlod
, plotGeno
, plotErrorlod
data(hyper)

# Calculate error LOD scores
hyper <- calc.errorlod(hyper, error.prob=0.01)

# Print those above a specified cutoff
top.errorlod(hyper, cutoff=4)
Determine the total number of markers in a cross or map object.
totmar(object)
object |
An object of class |
The total number of markers in the input.
Karl W Broman, [email protected]
read.cross
, plot.cross
,
summary.cross
,
nind
,
nchr
,
nmar
,
nphe
data(fake.f2)
totmar(fake.f2)
map <- pull.map(fake.f2)
totmar(map)
Transform phenotypes in a cross object; by default use a logarithmic transformation, though any function may be used.
transformPheno(cross, pheno.col=1, transf=log, ...)
cross |
An object of class |
pheno.col |
A vector of numeric indices or character strings (indicating phenotypes by name) of phenotypes to be transformed. |
transf |
The function to use in the transformation. |
... |
Additional arguments, to be passed to |
The input cross object with the transformed phenotypes.
Danny Arends [email protected]
data(multitrait)

# Log transformation of all phenotypes
multitrait.log <- transformPheno(multitrait, pheno.col=1:nphe(multitrait))

# Square-root transformation of all phenotypes
multitrait.sqrt <- transformPheno(multitrait, pheno.col=1:nphe(multitrait),
                                  transf=sqrt)
Try all possible positions for a marker, keeping all other markers fixed, and evaluate the log likelihood and estimate the chromosome length.
tryallpositions(cross, marker, chr, error.prob=0.0001,
                map.function=c("haldane","kosambi","c-f","morgan"),
                m=0, p=0, maxit=4000, tol=1e-6, sex.sp=TRUE,
                verbose=TRUE)
cross |
An object of class |
marker |
Character string with name of the marker to move about. |
chr |
A vector specifying which chromosomes to test for the
position of the marker. This should be a vector of character
strings referring to chromosomes by name; numeric values are
converted to strings. Refer to chromosomes with a preceding |
error.prob |
Assumed genotyping error rate used in the calculation of the penetrance Pr(observed genotype | true genotype). |
map.function |
Indicates whether to use the Haldane, Kosambi, Carter-Falconer, or Morgan map function when converting genetic distances into recombination fractions. (Ignored if m > 0.) |
m |
Interference parameter for the chi-square model for interference; a non-negative integer, with m=0 corresponding to no interference. This may be used only for a backcross or intercross. |
p |
Proportion of chiasmata from the NI mechanism, in the Stahl model; p=0 gives a pure chi-square model. This may be used only for a backcross or intercross. |
maxit |
Maximum number of EM iterations to perform. |
tol |
Tolerance for determining convergence. |
sex.sp |
Indicates whether to estimate sex-specific maps; this is used only for the 4-way cross. |
verbose |
If TRUE, print information on progress. |
A data frame (actually, an object of class "scanone"
, so that
one may use plot.scanone
,
summary.scanone
, etc.) with each row being a
possible position for the marker.
The first two columns are the chromosome ID and position. The third
column is a LOD score comparing the hypothesis that the marker is in that
position versus the hypothesis that it is not linked to that chromosome.
In the case of a 4-way cross, with sex.sp=TRUE
, there are two
additional columns with the estimated female and male genetic lengths
of the respective chromosome, when the marker is in that position.
With sex.sp=FALSE
, or for other types of crosses, there is one
additional column, with the estimated genetic length of the respective
chromosome, when the marker is in that position.
The row names indicate the nearest flanking markers for each interval.
Karl W Broman, [email protected]
droponemarker
, est.map
, ripple
,
est.rf
, switch.order
,
movemarker
data(fake.bc)
tryallpositions(fake.bc, "D7M301", 7, error.prob=0, verbose=FALSE)
Calculates, for each individual on each chromosome, the maximum distance between genotyped markers.
typingGap(cross, chr, terminal=FALSE)
cross |
An object of class |
chr |
Optional vector indicating the chromosomes to consider.
This should be a vector of character
strings referring to chromosomes by name; numeric values are
converted to strings. Refer to chromosomes with a preceding |
terminal |
If TRUE, just look at terminal typing gaps (from the terminal markers to the first typed marker). |
We consider not just the distances between internal genotypes, but
also distances from the beginning of the chromosome to the first typed
marker, and similarly for the end of the chromosome. (The start and end
of a chromosome are taken to be the locations of the initial and final
markers.) If terminal=TRUE
, we look only at those beginning
and end distances.
A matrix with rows corresponding to individuals and columns corresponding to chromosomes. (If there is just one chromosome, it is a numeric vector rather than a matrix.)
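The calculation for a single individual on a single chromosome can be sketched in base R as follows. This is a minimal illustration of the description above, not the package's code: max_typing_gap is a hypothetical helper, the data are invented, and the handling of an individual with no typed markers is an assumption.

```r
# pos:  marker positions (cM) along one chromosome, in increasing order
# geno: one individual's genotypes at those markers (NA = not typed)
max_typing_gap <- function(pos, geno, terminal = FALSE) {
    typed <- pos[!is.na(geno)]
    # assumption: with no typed markers, report the whole chromosome length
    if (length(typed) == 0) return(diff(range(pos)))
    # chromosome start and end are the first and last marker positions
    ends <- c(min(typed) - min(pos), max(pos) - max(typed))
    if (terminal) return(max(ends))
    max(c(ends, diff(typed)))
}

pos  <- c(0, 10, 25, 40, 60)
geno <- c(NA, 1, NA, NA, 2)
max_typing_gap(pos, geno)                   # 50: between the markers at 10 and 60
max_typing_gap(pos, geno, terminal = TRUE)  # 10: chromosome start to first typed marker
```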
Karl W Broman, [email protected]
data(hyper)
plot(typingGap(hyper, chr=5), ylab="Maximum gap between typed markers (cM)",
     ylim=c(0, diff(range(pull.map(hyper,chr=5)[[1]]))))

plot(typingGap(hyper, chr=4), ylab="Maximum gap between typed markers (cM)",
     ylim=c(0, diff(range(pull.map(hyper,chr=4)[[1]]))))

plot(typingGap(hyper, chr=4, terminal=TRUE),
     ylab="Maximum gap between chr end and typed marker (cM)",
     ylim=c(0, diff(range(pull.map(hyper,chr=4)[[1]]))))
Data for a QTL experiment is written to a file (or files).
write.cross(cross, format=c("csv", "csvr", "csvs", "csvsr", "mm", "qtlcart",
            "gary", "qtab", "mapqtl", "tidy"), filestem="data", chr,
            digits=NULL, descr)
cross |
An object of class |
format |
Specifies whether to write the data in comma-delimited, rotated comma-delimited, Mapmaker, QTL Cartographer, Gary Churchill's, QTAB, MapQTL, or tidy format. |
filestem |
A character string giving the first part of the output
file names (the bit before the dot). In Windows, use forward
slashes ( |
chr |
A vector specifying for which chromosomes genotype data
should be written. This should be a vector of character strings
referring to chromosomes by name; numeric values are converted to
strings. Refer to chromosomes with a preceding |
digits |
Number of digits to which phenotype values and genetic map positions should be rounded. If NULL (the default), they are not rounded. |
descr |
Character string description; used only with |
Comma-delimited formats: a single csv file is created in the formats
"csv" or "csvr". Two files are created (one for the
genotype data and one for the phenotype data) for the formats
"csvs" and "csvsr"; if filestem="file", the two
files will be named "file_gen.csv" and "file_phe.csv".
See the help file for read.cross
for details on these formats.
Mapmaker format: Data is written to two files.
Suppose filestem="file"
. Then "file.raw"
will contain
the genotype and phenotype data, and "file.prep"
will contain
the necessary code for defining the chromosome assignments, marker
order, and inter-marker distances.
QTL Cartographer format: Data is written to two files. Suppose
filestem="file"
. Then "file.cro"
will contain
the genotype and phenotype data, and "file.map"
will contain
the genetic map information. Note that cross types are converted to
QTL Cartographer cross types as follows: riself to RF1, risib to RF2,
bc to B1 and f2 to RF2.
Gary's format: Data is written to six files. They are: "geno.data"
- genotype data; "pheno.data" - phenotype data; "chrid.dat" - the
chromosome identifier for each marker; "mnames.txt" - the marker
names; "markerpos.txt" - the marker positions; "pnames.txt" - the
phenotype names.
QTAB format: See documentation.
MapQTL format: See documentation.
Tidy format: Data is written to three files, "stem_gen.csv",
"stem_phe.csv", and "stem_map.csv" (where stem is
taken from the filestem argument).
Karl W Broman, [email protected]; Hao Wu; Brian S. Yandell; Danny Arends; Aaron Wolen
## Not run: 
data(fake.bc)

# comma-delimited format
write.cross(fake.bc, "csv", "Data/fakebc", c(1,5,13))

# rotated comma-delimited format
write.cross(fake.bc, "csvr", "Data/fakebc", c(1,5,13))

# split comma-delimited format
write.cross(fake.bc, "csvs", "Data/fakebc", c(1,5,13))

# split and rotated comma-delimited format
write.cross(fake.bc, "csvsr", "Data/fakebc", c(1,5,13))

# Mapmaker format
write.cross(fake.bc, "mm", "Data/fakebc", c(1,5,13))

# QTL Cartographer format
write.cross(fake.bc, "qtlcart", "Data/fakebc", c(1,5,13))

# Gary's format (chr is named here, since the third positional
# argument is filestem)
write.cross(fake.bc, "gary", chr=c(1,5,13))
## End(Not run)
Get x-axis locations for given cM positions on given chromosomes in a
plot from plot.scanone.
xaxisloc.scanone(out, thechr, thepos, chr, gap=25)
out |
An object of class |
thechr |
Chromosome IDs at which x-axis locations are to be determined. |
thepos |
Chromosome positions at which x-axis locations are to be determined. |
chr |
Optional vector specifying which chromosomes were plotted.
This must be identical to what was used in the call to
|
gap |
Gap separating chromosomes (in cM). This must be identical
to what was used in the call to |
This function allows you to identify the x-axis locations in a plot of
genome scan results, produced by
plot.scanone
. This is useful for adding
annotations, such as text or arrows.
The arguments out
, chr
, and gap
must match what
was used in the call to plot.scanone
.
The arguments thechr
and thepos
indicate the genomic
positions for which x-axis locations are desired. If they both have
length > 1, they must have the same length. If one has length > 1 and
one has length 1, the one with length 1 is expanded to match.
A numeric vector of x-axis locations.
Karl W Broman, [email protected]
data(hyper)
hyper <- calc.genoprob(hyper)
out <- scanone(hyper, method="hk")
plot(out, chr=c(1, 4, 6, 15))

# add arrow and text to indicate peak LOD score
mxout <- max(out)
x <- xaxisloc.scanone(out, mxout$chr, mxout$pos, chr=c(1,4,6,15))
arrows(x+30, mxout$lod, x+5, mxout$lod, len=0.1, col="blue")
text(x+35, mxout$lod, "the peak", col="blue", adj=c(0, 0.5))