Title: | Simulation of Chromosomal Regions Shared by Family Members |
---|---|
Description: | Simulation of segments shared identical-by-descent (IBD) by pedigree members. Using sex specific recombination rates along the human genome (Halldorsson et al. (2019) <doi:10.1126/science.aau1043>), phased chromosomes are simulated for all pedigree members. Applications include calculation of realised relatedness coefficients and IBD segment distributions. 'ibdsim2' is part of the 'pedsuite' collection of packages for pedigree analysis. A detailed presentation of the 'ped suite', including a separate chapter on 'ibdsim2', is available in the book 'Pedigree analysis in R' (Vigeland, 2021, ISBN:9780128244302). A 'Shiny' app for visualising and comparing IBD distributions is available at <https://magnusdv.shinyapps.io/ibdsim2-shiny/>. |
Authors: | Magnus Dehli Vigeland [aut, cre] |
Maintainer: | Magnus Dehli Vigeland <[email protected]> |
License: | GPL-3 |
Version: | 2.1.1 |
Built: | 2025-01-07 06:45:07 UTC |
Source: | CRAN |
Convert between physical position (in megabases) and genetic position (centiMorgan) given a chromosome map. Linear extrapolation is used to convert positions between map points.
convertPos( chrom = NULL, Mb = NULL, cM = NULL, map = "decode19", sex = c("average", "male", "female") )
convertPos( chrom = NULL, Mb = NULL, cM = NULL, map = "decode19", sex = c("average", "male", "female") )
chrom |
(Optional) A vector of chromosome labels. |
Mb |
A vector of physical positions (in Mb), or NULL. |
cM |
A vector of genetic positions (in cM), or NULL. |
map |
A |
sex |
Either "average", "male" or "female". |
A vector of the same length as the input.
# Chromosome 1 of the built-in recombination map map = loadMap(chrom = 1)[[1]] head(map$male) # Conversion Mb -> cM phys = 1:5 gen = convertPos(Mb = phys, map = map, sex = "male") gen # Convert back (note the first position, which was outside of map) convertPos(cM = gen, map = map, sex = "male")
# Chromosome 1 of the built-in recombination map map = loadMap(chrom = 1)[[1]] head(map$male) # Conversion Mb -> cM phys = 1:5 gen = convertPos(Mb = phys, map = map, sex = "male") gen # Convert back (note the first position, which was outside of map) convertPos(cM = gen, map = map, sex = "male")
Create custom recombination maps for use in ibdsim()
.
customMap(x)
customMap(x)
x |
A data frame or matrix. See details for format specifications. |
The column names of x
must include either
chrom
, mb
and cm
(sex-averaged map)
or
chrom
, mb
, male
and female
(sex-specific map)
Upper-case letters are allowed in these names. The mb
column should contain
physical positions in megabases, while cm
, male
, female
give the
corresponding genetic position in centiMorgans.
An object of class genomeMap
.
# A map including two chromosomes. df1 = data.frame(chrom = c(1, 1, 2, 2), mb = c(0, 2, 0, 5), cm = c(0, 3, 0, 6)) map1 = customMap(df1) map1 # Use columns "male" and "female" to make sex specific maps df2 = data.frame(chrom = c(1, 1, 2, 2), mb = c(0, 2, 0, 5), male = c(0, 3, 0, 6), female = c(0, 4, 0, 7)) map2 = customMap(df2) map2
# A map including two chromosomes. df1 = data.frame(chrom = c(1, 1, 2, 2), mb = c(0, 2, 0, 5), cm = c(0, 3, 0, 6)) map1 = customMap(df1) map1 # Use columns "male" and "female" to make sex specific maps df2 = data.frame(chrom = c(1, 1, 2, 2), mb = c(0, 2, 0, 5), male = c(0, 3, 0, 6), female = c(0, 4, 0, 7)) map2 = customMap(df2) map2
Estimate by simulation various relatedness coefficients, and two-locus
versions of the same coefficients, for a given recombination rate. The
current implementation covers inbreeding coefficients, kinship coefficients,
IBD (kappa) coefficients between noninbred individuals, and condensed
identity coefficients. These functions are primarily meant as tools for
validating exact algorithms, e.g., as implemented in the ribd
package.
estimateInbreeding(x, id, Nsim, Xchrom = FALSE, verbose = FALSE, ...) estimateTwoLocusInbreeding( x, id, rho = NULL, cM = NULL, Nsim, Xchrom = FALSE, verbose = FALSE, ... ) estimateKinship(x, ids, Nsim, Xchrom = FALSE, verbose = FALSE, ...) estimateTwoLocusKinship( x, ids, rho = NULL, cM = NULL, Nsim, Xchrom = FALSE, verbose = FALSE, ... ) estimateKappa(x, ids, Nsim, Xchrom = FALSE, verbose = FALSE, ...) estimateTwoLocusKappa( x, ids, rho = NULL, cM = NULL, Nsim, Xchrom = FALSE, verbose = FALSE, ... ) estimateIdentity(x, ids, Nsim, Xchrom = FALSE, verbose = FALSE, ...) estimateTwoLocusIdentity( x, ids, rho = NULL, cM = NULL, Nsim, Xchrom = FALSE, verbose = FALSE, ... )
estimateInbreeding(x, id, Nsim, Xchrom = FALSE, verbose = FALSE, ...) estimateTwoLocusInbreeding( x, id, rho = NULL, cM = NULL, Nsim, Xchrom = FALSE, verbose = FALSE, ... ) estimateKinship(x, ids, Nsim, Xchrom = FALSE, verbose = FALSE, ...) estimateTwoLocusKinship( x, ids, rho = NULL, cM = NULL, Nsim, Xchrom = FALSE, verbose = FALSE, ... ) estimateKappa(x, ids, Nsim, Xchrom = FALSE, verbose = FALSE, ...) estimateTwoLocusKappa( x, ids, rho = NULL, cM = NULL, Nsim, Xchrom = FALSE, verbose = FALSE, ... ) estimateIdentity(x, ids, Nsim, Xchrom = FALSE, verbose = FALSE, ...) estimateTwoLocusIdentity( x, ids, rho = NULL, cM = NULL, Nsim, Xchrom = FALSE, verbose = FALSE, ... )
x |
A pedigree in the form of a |
id , ids
|
A vector of one or two ID labels. |
Nsim |
The number of simulations. |
Xchrom |
A logical indicating if the loci are X-linked or autosomal. |
verbose |
A logical. |
... |
Further arguments passed on to |
rho |
A scalar in the interval |
cM |
A non-negative number: the genetic distance between the two loci,
given in centiMorgans. Either |
In the following, let L1 and L2 denote two arbitrary autosomal loci with
recombination rate , and let A and B be members of the pedigree
x
.
The two-locus inbreeding coefficient of A is defined as the
probability that A is autozygous at both L1 and L2 simultaneously.
The two-locus kinship coefficient of A and B is defined
as the probability that a random gamete emitted from A, and a random gamete
emitted from B, contain IBD alleles at both L1 and L2.
The two-locus kappa coefficient , for
, of noninbred A and B, is the probability that A and B share exactly
i
alleles IBD at L1, and exactly j
alleles IBD at L2.
The two-locus identity coefficient ,
is defined for any (possibly inbred) A and B, as the probability that A and B
are in identity state
i
at L1, and state j
at L2. This uses the
conventional ordering of the nine condensed identity states. For details, see
for instance the GitHub page of the ribd
package.
estimateInbreeding()
: a single probability.
estimateTwoLocusInbreeding()
: a single probability.
estimateKappa()
: a numeric vector of length 3, with the estimated
coefficients.
estimateTwoLocusKappa()
: a symmetric, numerical 3*3 matrix, with the
estimated values of , for
.
estimateIdentity()
: a numeric vector of length 9, with the estimated
identity coefficients.
estimateTwoLocusIdentity()
: a symmetric, numerical 9*9 matrix, with the
estimated values of , for
.
############################ ### Two-locus inbreeding ### ############################ x = cousinPed(0, child = TRUE) rho = 0.25 Nsim = 10 # Increase! estimateTwoLocusInbreeding(x, id = 5, rho = rho, Nsim = Nsim, seed = 123) # Exact: ribd::twoLocusInbreeding(x, id = 5, rho = rho) ######################################## ### Two-locus kappa: ### ### Grandparent vs half sib vs uncle ### ######################################## # These are indistinguishable with unlinked loci, see e.g. # pages 182-183 in Egeland, Kling and Mostad (2016). # In the following, each simulation approximation is followed # by its exact counterpart. rho = 0.25; R = .5 * (rho^2 + (1-rho)^2) Nsim = 10 # Should be increased to at least 10000 # Grandparent/grandchild G = linearPed(2); G.ids = c(1,5); # plot(G, hatched = G.ids) estimateTwoLocusKappa(G, G.ids, rho = rho, Nsim = Nsim, seed = 123)[2,2] .5*(1-rho) # exact # Half sibs H = halfSibPed(); H.ids = c(4,5); # plot(H, hatched = H.ids) estimateTwoLocusKappa(H, H.ids, rho = rho, Nsim = Nsim, seed = 123)[2,2] R # exact # Uncle U = avuncularPed(); U.ids = c(3,6); # plot(U, hatched = U.ids) estimateTwoLocusKappa(U, U.ids, rho = rho, Nsim = Nsim, seed = 123)[2,2] (1-rho) * R + rho/4 # exact # Exact calculations by ribd: # ribd::twoLocusIBD(G, G.ids, rho = rho, coefs = "k11") # ribd::twoLocusIBD(H, H.ids, rho = rho, coefs = "k11") # ribd::twoLocusIBD(U, U.ids, rho = rho, coefs = "k11") ########################## ### Two-locus Jacquard ### ########################## x = fullSibMating(1) rho = 0.25 Nsim = 10 # (increase to at least 10000) estimateTwoLocusIdentity(x, ids = 5:6, rho = rho, Nsim = Nsim, seed = 123) # Exact by ribd: # ribd::twoLocusIdentity(x, ids = 5:6, rho = rho)
############################ ### Two-locus inbreeding ### ############################ x = cousinPed(0, child = TRUE) rho = 0.25 Nsim = 10 # Increase! estimateTwoLocusInbreeding(x, id = 5, rho = rho, Nsim = Nsim, seed = 123) # Exact: ribd::twoLocusInbreeding(x, id = 5, rho = rho) ######################################## ### Two-locus kappa: ### ### Grandparent vs half sib vs uncle ### ######################################## # These are indistinguishable with unlinked loci, see e.g. # pages 182-183 in Egeland, Kling and Mostad (2016). # In the following, each simulation approximation is followed # by its exact counterpart. rho = 0.25; R = .5 * (rho^2 + (1-rho)^2) Nsim = 10 # Should be increased to at least 10000 # Grandparent/grandchild G = linearPed(2); G.ids = c(1,5); # plot(G, hatched = G.ids) estimateTwoLocusKappa(G, G.ids, rho = rho, Nsim = Nsim, seed = 123)[2,2] .5*(1-rho) # exact # Half sibs H = halfSibPed(); H.ids = c(4,5); # plot(H, hatched = H.ids) estimateTwoLocusKappa(H, H.ids, rho = rho, Nsim = Nsim, seed = 123)[2,2] R # exact # Uncle U = avuncularPed(); U.ids = c(3,6); # plot(U, hatched = U.ids) estimateTwoLocusKappa(U, U.ids, rho = rho, Nsim = Nsim, seed = 123)[2,2] (1-rho) * R + rho/4 # exact # Exact calculations by ribd: # ribd::twoLocusIBD(G, G.ids, rho = rho, coefs = "k11") # ribd::twoLocusIBD(H, H.ids, rho = rho, coefs = "k11") # ribd::twoLocusIBD(U, U.ids, rho = rho, coefs = "k11") ########################## ### Two-locus Jacquard ### ########################## x = fullSibMating(1) rho = 0.25 Nsim = 10 # (increase to at least 10000) estimateTwoLocusIdentity(x, ids = 5:6, rho = rho, Nsim = Nsim, seed = 123) # Exact by ribd: # ribd::twoLocusIdentity(x, ids = 5:6, rho = rho)
Extract ID labels from simulation output
extractIds(sim)
extractIds(sim)
sim |
Output from |
A character vector
s = ibdsim(nuclearPed(2), N=1, ids = 3:4) stopifnot(all(extractIds(s) == c("3", "4")))
s = ibdsim(nuclearPed(2), N=1, ids = 3:4) stopifnot(all(extractIds(s) == c("3", "4")))
Find segments satisfying a particular pattern of IBD sharing, in a list of IBD simulations.
findPattern(sims, pattern, merge = TRUE, cutoff = 0, unit = "mb")
findPattern(sims, pattern, merge = TRUE, cutoff = 0, unit = "mb")
sims |
A |
pattern |
A named list of vectors containing ID labels. Allowed names
are |
merge |
A logical, indicating if adjacent segments should be merged. Default: TRUE. |
cutoff |
A non-negative number. Segments shorter than this are excluded from the output. Default: 0. |
unit |
The unit of |
For each simulation, this function extracts the subset of rows satisfying the
allele sharing specified by pattern
. That is, segments where, for some allele,
all of pattern$autozygous
are autozygous
all of pattern$heterozygous
have exactly one copy
all of pattern$carriers
have at least one copy
none of pattern$noncarriers
carry the allele.
A matrix (if sims
is a single genomeSim
object), or a list of
matrices.
x = nuclearPed(3) s = ibdsim(x, N = 1, map = uniformMap(M = 1), seed = 1729) # Segments where some allele is shared by 3 and 4, but not 5 pattern = list(carriers = 3:4, noncarriers = 5) findPattern(s, pattern) # Exclude segments less than 7 cM findPattern(s, pattern, cutoff = 7) # Visual confirmation: haploDraw(x, s)
x = nuclearPed(3) s = ibdsim(x, N = 1, map = uniformMap(M = 1), seed = 1729) # Segments where some allele is shared by 3 and 4, but not 5 pattern = list(carriers = 3:4, noncarriers = 5) findPattern(s, pattern) # Exclude segments less than 7 cM findPattern(s, pattern, cutoff = 7) # Visual confirmation: haploDraw(x, s)
Visualise the IBD pattern of a single chromosome, by drawing haplotypes onto the pedigree.
haploDraw( x, ibd, chrom = NULL, ids = NULL, unit = "mb", L = NULL, pos = 1, cols = NULL, height = 4, width = 0.75, sep = 0.75, dist = 1, keep.par = FALSE, ... )
haploDraw( x, ibd, chrom = NULL, ids = NULL, unit = "mb", L = NULL, pos = 1, cols = NULL, height = 4, width = 0.75, sep = 0.75, dist = 1, keep.par = FALSE, ... )
x |
A |
ibd |
A |
chrom |
A chromosome number, needed if |
ids |
A vector indicating for which pedigree members haplotypes should
be drawn. If NULL (default), all individuals in |
unit |
Either "mb" (default) or "cm". |
L |
A positive number: the chromosome length. By default derived from
|
pos |
A vector recycled to |
cols |
A colour vector corresponding to the alleles in |
height |
The height of the haplotype rectangles in units of the pedigree symbol height. Default: 4. |
width |
The width of the haplotype rectangles in units of the pedigree symbol width. Default: 0.75. |
sep |
The separation between haplotypes within a pair, measured in pedigree symbol widths. |
dist |
The distance between pedigree symbols and the closest haplotype, measured in pedigree symbol widths. |
keep.par |
A logical, by default FALSE; passed on to |
... |
Further arguments passed on to |
None.
############################### # Example 1: A family quartet # ############################### x = nuclearPed(2) map = uniformMap(M = 1) s = ibdsim(x, map = map, seed = 4276) haploDraw(x, s) # Custom colours and placements haploDraw(x, s, cols = c(3,7,2,4), pos = c(2,4,2,4)) # Standard plot options apply haploDraw(x, s, margins = 3, cex = 1.5, title = "Full sibs") ########################### # Example 2: Autozygosity # ########################### x = halfCousinPed(0, child = TRUE) map = uniformMap(M = 1) s = ibdsim(x, map = map, skipRecomb = c(1,3), seed = 2) # Only include relevant individuals (skip 1 and 3) haploDraw(x, s, ids = c(2,4,5,6), pos = c(1,2,4,4)) ############################### # Example 3: X-chromosomal sims ############################### x = nuclearPed(2, sex = 2:1) s = ibdsim(x, N = 1, map = uniformMap(M = 1, chrom = "X"), seed = 123) haploDraw(x, s)
############################### # Example 1: A family quartet # ############################### x = nuclearPed(2) map = uniformMap(M = 1) s = ibdsim(x, map = map, seed = 4276) haploDraw(x, s) # Custom colours and placements haploDraw(x, s, cols = c(3,7,2,4), pos = c(2,4,2,4)) # Standard plot options apply haploDraw(x, s, margins = 3, cex = 1.5, title = "Full sibs") ########################### # Example 2: Autozygosity # ########################### x = halfCousinPed(0, child = TRUE) map = uniformMap(M = 1) s = ibdsim(x, map = map, skipRecomb = c(1,3), seed = 2) # Only include relevant individuals (skip 1 and 3) haploDraw(x, s, ids = c(2,4,5,6), pos = c(1,2,4,4)) ############################### # Example 3: X-chromosomal sims ############################### x = nuclearPed(2, sex = 2:1) s = ibdsim(x, N = 1, map = uniformMap(M = 1, chrom = "X"), seed = 123) haploDraw(x, s)
This is the main function of the package, simulating the recombination process in each meioses of a pedigree. The output summarises the IBD segments between all or a subset of individuals.
ibdsim( x, N = 1, ids = labels(x), map = "decode", model = c("chi", "haldane"), skipRecomb = NULL, simplify1 = TRUE, seed = NULL, verbose = TRUE )
ibdsim( x, N = 1, ids = labels(x), map = "decode", model = c("chi", "haldane"), skipRecomb = NULL, simplify1 = TRUE, seed = NULL, verbose = TRUE )
x |
A |
N |
A positive integer indicating the number of simulations. |
ids |
A subset of pedigree members whose IBD sharing should be analysed. If NULL, all members are included. |
map |
The genetic map to be used in the simulations: Allowed values are:
Default: "decode19". |
model |
Either "chi" or "haldane", indicating the statistical model for recombination (see details). Default: "chi". |
skipRecomb |
A vector of ID labels indicating individuals whose meioses
should be simulated without recombination. (Each child will then receive a
random strand of each chromosome.) The default action is to skip
recombination in founders who are uninformative for IBD sharing in the
|
simplify1 |
A logical, by default TRUE, removing the outer list layer when N = 1. See Value. |
seed |
An integer to be passed on to |
verbose |
A logical. |
Each simulation starts by unique alleles (labelled 1, 2, ...) being
distributed to the pedigree founders. In each meiosis, homologue chromosomes
are made to recombine according to the value of model
:
model = "haldane"
: In this model, crossover events are modelled as a
Poisson process along each chromosome.
model = "chi"
(default): This uses a renewal process along the
four-strand bundle, with waiting times following a chi square distribution.
Recombination rates along each chromosome are determined by the map
parameter. The default value ("decode19") loads a thinned version of the
recombination map of the human genome published by Halldorsson et al (2019).
In many applications, the fine-scale default map is not necessary, and should
be replaced by simpler maps with constant recombination rates. See
uniformMap()
and loadMap()
for ways to produce such maps.
A list of N
objects of class genomeSim
.
If N = 1 the outer list layer is removed by default, which is typically
desired in interactive use (especially when piping). To enforce a list
output, add simplify1 = FALSE
.
A genomeSim
object is essentially a numerical matrix describing the
allele flow through the pedigree in a single simulated. Each row
corresponds to a chromosomal segment. The first 3 columns (chrom, startMB,
endMB) describe the physical location of the segment. Next, the genetic
coordinates (startCM, endCM), which are computed from map
by averaging
the male and female values. Then follow the allele columns, two for each
individual in ids
, suffixed by ":p" and ":m" signifying the paternal and
maternal alleles, respectively.
If ids
has length 1, a column named Aut
is added, whose entries are 1
for autozygous segments and 0 otherwise.
If ids
has length 2, two columns are added:
IBD
: The IBD status of each segment (= number of alleles shared
identical by descent). For a given segment, the IBD status is either 0, 1,
2 or NA. If either individual is autozygous in a segment, the IBD status is
reported as NA. With inbred individuals the Sigma
column (see below) is
more informative than the IBD
column.
Sigma
: The condensed identity ("Jacquard") state of each segment,
given as an integer in the range 1-9. The numbers correspond to the
standard ordering of the condensed states. In particular, for non-inbred
individuals, the states 9, 8, 7 correspond to IBD status 0, 1, 2
respectively.
Halldorsson et al. Characterizing mutagenic effects of recombination through a sequence-level genetic map. Science 363, no. 6425 (2019).
### Example 1: Half siblings ### hs = halfSibPed() sim = ibdsim(hs, map = uniformMap(M = 1), seed = 10) sim # Plot haplotypes haploDraw(hs, sim) #' ### Example 2: Full sib mating ### x = fullSibMating(1) sim = ibdsim(x, ids = 5:6, map = uniformMap(M = 10), seed = 1) head(sim) # All 9 identity states are present stopifnot(setequal(sim[, 'Sigma'], 1:9))
### Example 1: Half siblings ### hs = halfSibPed() sim = ibdsim(hs, map = uniformMap(M = 1), seed = 10) sim # Plot haplotypes haploDraw(hs, sim) #' ### Example 2: Full sib mating ### x = fullSibMating(1) sim = ibdsim(x, ids = 5:6, map = uniformMap(M = 10), seed = 1) head(sim) # All 9 identity states are present stopifnot(setequal(sim[, 'Sigma'], 1:9))
Show chromosomal segments in a diploid karyogram
karyoDiploid( paternal, maternal, chrom = 1:22, col = c(paternal = "lightblue", maternal = "orange"), alpha = 1, bgcol = "gray95", title = NULL )
karyoDiploid( paternal, maternal, chrom = 1:22, col = c(paternal = "lightblue", maternal = "orange"), alpha = 1, bgcol = "gray95", title = NULL )
paternal , maternal
|
data.frames (or objects coercible to data.frames) containing the segments to be shown on the paternal and maternal strands of the karyogram. The first three columns must contain chromosome (with or without "chr" prefix), start position and stop position (in Mb). Column names are ignored, as well as any further columns. |
chrom |
The (autosomal) chromosomes to be included in the plot, given as a subset of the integers 1, 2,..., 22. |
col |
A vector of two colours (in any form recognisable by R). If only one colour is given it is recycled. If the vector is named, a colour legend is included in the plot, using the names as labels. |
alpha |
A single numeric in |
bgcol |
The background colour of the chromosomes. |
title |
Plot title. |
The plot object is returned invisibly, so that additional ggplot layers may be added if needed.
## Not run: pat = data.frame(chrom = c(1,4,5,5,10,10), start = c(100,50,20,80,10,80), end = c(120,100,25,100,70,120)) mat = data.frame(chrom = c(2,4,5,5,10), start = c(80,50,10,80,50), end = c(120,100,35,100,120)) karyoDiploid(pat, mat) ## End(Not run)
## Not run: pat = data.frame(chrom = c(1,4,5,5,10,10), start = c(100,50,20,80,10,80), end = c(120,100,25,100,70,120)) mat = data.frame(chrom = c(2,4,5,5,10), start = c(80,50,10,80,50), end = c(120,100,35,100,120)) karyoDiploid(pat, mat) ## End(Not run)
Functions for visualising IBD segments in karyograms. The karyogram1()
and
karyogram2()
functions produces karyograms illustrating the output of
ibdsim()
for one or two specified individuals. The actual plotting is done
by functions karyoHaploid()
and karyoDiploid()
.
karyogram2(sim, ids = NULL, verbose = TRUE, ...)
karyogram2(sim, ids = NULL, verbose = TRUE, ...)
sim |
A |
ids |
A vector of one or two ID labels. |
verbose |
A logical. |
... |
Further arguments passed on to |
A plot object returned invisibly.
x = quadHalfFirstCousins() s = ibdsim(x, seed = 1729) # karyogram2(s, ids = leaves(x), title = "QHFC")
x = quadHalfFirstCousins() s = ibdsim(x, seed = 1729) # karyogram2(s, ids = leaves(x), title = "QHFC")
Show chromosomal segments in a haploid karyogram
karyoHaploid( segments, chrom = 1:22, colBy = NULL, col = NULL, separate = TRUE, alpha = 1, bgcol = "gray92", title = NULL, legendTitle = NULL, base_size = 16 )
karyoHaploid( segments, chrom = 1:22, colBy = NULL, col = NULL, separate = TRUE, alpha = 1, bgcol = "gray92", title = NULL, legendTitle = NULL, base_size = 16 )
segments |
A data.frame (or an object coercible to data.frame)
containing the segments to be shown on the karyogram. The first three
columns must contain chromosome (with or without "chr" prefix), start
position and stop position (in Mb). Any further columns are ignored, except
possibly a column indicated by |
chrom |
A vector indicating which chromosomes to include. |
colBy |
A character vector naming the columns to be used for colouring. If NULL (default), all segments have the same colour. |
col |
A single fill colour for all the segments, or (if |
separate |
A logical; relevant only if the |
alpha |
A single numeric in |
bgcol |
The background colour of the chromosomes. |
title |
Plot title. |
legendTitle |
Legend title. |
base_size |
Font size, passed onto |
The plot object is returned invisibly, so that additional ggplot
layers may be added if needed.
## Not run: segs = data.frame(chrom = c(1,4,5,5,10,10), start = c(100,50,20,80,10,50), end = c(120,100,25,100,70,120), IBD = c("paternal","maternal")) cols = c(paternal = "blue", maternal = "red") karyoHaploid(segs, col = "cyan") karyoHaploid(segs, colBy = "IBD", col = cols) # Note difference if `separate = FALSE` karyoHaploid(segs, colBy = "IBD", col = cols, separate = FALSE) # Reduce alpha to see the overlaps: karyoHaploid(segs, colBy = "IBD", col = cols, separate = FALSE, alpha = 0.7) ## End(Not run)
## Not run: segs = data.frame(chrom = c(1,4,5,5,10,10), start = c(100,50,20,80,10,50), end = c(120,100,25,100,70,120), IBD = c("paternal","maternal")) cols = c(paternal = "blue", maternal = "red") karyoHaploid(segs, col = "cyan") karyoHaploid(segs, colBy = "IBD", col = cols) # Note difference if `separate = FALSE` karyoHaploid(segs, colBy = "IBD", col = cols, separate = FALSE) # Reduce alpha to see the overlaps: karyoHaploid(segs, colBy = "IBD", col = cols, separate = FALSE, alpha = 0.7) ## End(Not run)
This launches the Shiny app for simulating IBD segment distributions.
launchApp()
launchApp()
No return value, called for side effects.
## Not run: launchApp() ## End(Not run)
## Not run: launchApp() ## End(Not run)
This function loads one of the built-in genetic maps. Currently, the only option is a detailed human recombination map, based on the publication by Halldorsson et al. (2019).
loadMap(map = "decode19", chrom = 1:22, uniform = FALSE, sexAverage = FALSE)
loadMap(map = "decode19", chrom = 1:22, uniform = FALSE, sexAverage = FALSE)
map |
The name of the wanted map, possibly abbreviated. Currently, the only valid choice is "decode19" (default). |
chrom |
A vector containing a subset of the numbers 1,2,...,23,
indicating which chromosomes to load. As a special case, |
uniform |
A logical. If FALSE (default), the complete inhomogeneous map is used. If TRUE, a uniform version of the same map is produced, i.e., with the correct physical range and genetic lengths, but with constant recombination rates along each chromosome. |
sexAverage |
A logical, by default FALSE. If TRUE, a sex-averaged map is returned, with equal recombination rates for males and females. |
For reasons of speed and efficiency, the map published by map Halldorsson et al. (2019) has been thinned down to around 60 000 data points.
By setting uniform = TRUE
, a uniform version of the map is returned, in
which each chromosome has the same genetic lengths as in the original, but
with constant recombination rates. This gives much faster simulations and may
be preferable in some applications.
An object of class genomeMap
, which is a list of chromMap
objects. A chromMap
is a list of two matrices, named "male" and "female",
with various attributes:
physStart
: The first physical position (Mb) on the chromosome covered
by the map
physEnd
: The last physical position (Mb) on the chromosome covered by
the map
physRange
: The physical map length (Mb), equal to physEnd - physStart
mapLen
: A vector of length 2, containing the centiMorgan lengths of the
male and female strands
chrom
: A chromosome label
Xchrom
: A logical. This is checked by ibdsim()
and other function, to
select mode of inheritance
Halldorsson et al. Characterizing mutagenic effects of recombination through a sequence-level genetic map. Science 363, no. 6425 (2019).
# By default, the complete map of all 22 autosomes is returned loadMap() # Uniform version m = loadMap(uniform = TRUE) m # Check chromosome 1 m1 = m[[1]] m1 m1$male m1$female # The X chromosome loadMap(chrom = "X")[[1]]
# By default, the complete map of all 22 autosomes is returned loadMap() # Uniform version m = loadMap(uniform = TRUE) m # Check chromosome 1 m1 = m[[1]] m1 m1$male m1$female # The X chromosome loadMap(chrom = "X")[[1]]
Utility functions for extracting the physical or genetic length of chromosome maps and genome maps.
mapLen(x, ...) ## S3 method for class 'chromMap' mapLen(x, sex = c("male", "female"), ...) ## S3 method for class 'genomeMap' mapLen(x, sex = c("male", "female"), ...) physRange(x, ...) ## S3 method for class 'chromMap' physRange(x, ...) ## S3 method for class 'genomeMap' physRange(x, ...)
mapLen(x, ...) ## S3 method for class 'chromMap' mapLen(x, sex = c("male", "female"), ...) ## S3 method for class 'genomeMap' mapLen(x, sex = c("male", "female"), ...) physRange(x, ...) ## S3 method for class 'chromMap' physRange(x, ...) ## S3 method for class 'genomeMap' physRange(x, ...)
x |
A |
... |
Not used. |
sex |
Either "male", "female" or both. |
mapLen()
returns a numeric of the same length as sex
, with the
genetic length(s) in centiMorgan.
physRange()
returns the physical length (in Mb) of the chromosome/genome
covered by the map. For example, for a chromosome map starting at 2 Mb and
ending at 8 Mb, the output is 6.
m = loadMap(chrom = 1:2) m # Applied to `genomeMap` object: physRange(m) mapLen(m) # Applied to `chromMap` object: physRange(m[[1]]) mapLen(m[[1]])
m = loadMap(chrom = 1:2) m # Applied to `genomeMap` object: physRange(m) mapLen(m) # Applied to `chromMap` object: physRange(m[[1]]) mapLen(m[[1]])
Visualise and compare count/length distributions of IBD segments. Two types are currently implemented: Segments of autozygosity (for a single person) and segments with (pairwise) IBD state 1.
plotSegmentDistribution( ..., type = c("autozygosity", "ibd1"), ids = NULL, unit = "cm", labels = NULL, col = NULL, shape = 1, alpha = 1, ellipses = TRUE, title = NULL, xlab = NULL, ylab = NULL, legendInside = TRUE )
plotSegmentDistribution( ..., type = c("autozygosity", "ibd1"), ids = NULL, unit = "cm", labels = NULL, col = NULL, shape = 1, alpha = 1, ellipses = TRUE, title = NULL, xlab = NULL, ylab = NULL, legendInside = TRUE )
... |
One or several objects of class |
type |
A string indicating which segments should be plotted. Currently, the allowed entries are "autozygosity" and "ibd1". |
ids |
A list of the same length as Two other short-cuts are possible: If a single vector is given, it is
repeated for all pedigrees. Finally, if |
unit |
Length unit; either "cm" (centiMorgan) or "mb" (megabases). |
labels |
An optional character vector of labels used in the legend. If
NULL, the labels are taken from |
col |
An optional colour vector of the same length as |
shape |
A vector with point shapes, of the same length as |
alpha |
A transparency parameter for the scatter points. |
ellipses |
A logical: Should confidence ellipses be added to the plot? |
title , xlab , ylab
|
Title and axis labels. |
legendInside |
A logical controlling the legend placement. |
This function takes as input one or several complete outputs from the
ibdsim()
, and produces a scatter plot of the number and average length of
IBD segments from each.
Contour curves are added to plot, corresponding to the
theoretical/pedigree-based values: either inbreeding coefficients (if type = "autozygosity"
) or (if
type = "ibd1"
).
# Simulation parameters used in the below examples. map = uniformMap(M = 10) # recombination map N = 5 # number of sims # For more realistic results, replace with e.g.: # map = loadMap("decode19") # N = 1000 ################################################################# # EXAMPLE 1 # Comparison of IBD segment distributions # between paternal and maternal half siblings. ################################################################# # Define the pedigrees xPat = halfSibPed() xMat = swapSex(xPat, 1) simPat = ibdsim(xPat, N = N, map = map) simMat = ibdsim(xMat, N = N, map = map) # By default, the IBD segments of the "leaves" are computed and plotted plotSegmentDistribution(simPat, simMat, type = "ibd1", ids = 4:5, labels = c("HSpat", "HSmat")) ################################################################# # EXAMPLE 2 # Half siblings vs half uncle vs grandparent/grandchild ################################################################# # Only one pedigree needed here x = addSon(halfSibPed(), 5) s = ibdsim(x, N = N, map = map) # Indicate the pairs explicitly this time. ids = list(GR = c(2,7), HS = 4:5, HU = c(4,7)) # List names are used as labels in the plot plotSegmentDistribution(s, type = "ibd1", ids = ids, shape = 1:3) ################################################################# # EXAMPLE 3 # Comparison of autozygosity distributions in various individuals # with the same expected inbreeding coefficient (f = 1/8) ################################################################# G = linearPed(2) |> swapSex(5) |> addSon(c(1,5)) # grandfath/granddaughter HSpat = halfSibPed(sex2 = 2) |> addSon(4:5) # paternal half sibs HSmat = swapSex(HSpat, 2) # maternal half sibs QHFC = quadHalfFirstCousins() # quad half first cousins QHFC = addSon(QHFC, 9:10) peds = list(G = G, HSpat = HSpat, HSmat = HSmat, QHFC = QHFC) plotPedList(peds, newdev = TRUE) dev.off() # Simulations s = lapply(peds, function(p) ibdsim(p, N = N, ids = leaves(p), verbose = FALSE, map = map)) # Plot distributions plotSegmentDistribution(s, type = "autoz", title = "Autozygous segments")
# Simulation parameters used in the below examples. map = uniformMap(M = 10) # recombination map N = 5 # number of sims # For more realistic results, replace with e.g.: # map = loadMap("decode19") # N = 1000 ################################################################# # EXAMPLE 1 # Comparison of IBD segment distributions # between paternal and maternal half siblings. ################################################################# # Define the pedigrees xPat = halfSibPed() xMat = swapSex(xPat, 1) simPat = ibdsim(xPat, N = N, map = map) simMat = ibdsim(xMat, N = N, map = map) # By default, the IBD segments of the "leaves" are computed and plotted plotSegmentDistribution(simPat, simMat, type = "ibd1", ids = 4:5, labels = c("HSpat", "HSmat")) ################################################################# # EXAMPLE 2 # Half siblings vs half uncle vs grandparent/grandchild ################################################################# # Only one pedigree needed here x = addSon(halfSibPed(), 5) s = ibdsim(x, N = N, map = map) # Indicate the pairs explicitly this time. ids = list(GR = c(2,7), HS = 4:5, HU = c(4,7)) # List names are used as labels in the plot plotSegmentDistribution(s, type = "ibd1", ids = ids, shape = 1:3) ################################################################# # EXAMPLE 3 # Comparison of autozygosity distributions in various individuals # with the same expected inbreeding coefficient (f = 1/8) ################################################################# G = linearPed(2) |> swapSex(5) |> addSon(c(1,5)) # grandfath/granddaughter HSpat = halfSibPed(sex2 = 2) |> addSon(4:5) # paternal half sibs HSmat = swapSex(HSpat, 2) # maternal half sibs QHFC = quadHalfFirstCousins() # quad half first cousins QHFC = addSon(QHFC, 9:10) peds = list(G = G, HSpat = HSpat, HSmat = HSmat, QHFC = QHFC) plotPedList(peds, newdev = TRUE) dev.off() # Simulations s = lapply(peds, function(p) ibdsim(p, N = N, ids = leaves(p), verbose = FALSE, map = map)) # Plot distributions plotSegmentDistribution(s, type = "autoz", title = "Autozygous segments")
This function simulates genotypes for a set of markers conditional on a
specific underlying IBD pattern (typically produced with ibdsim()
).
profileSimIBD( x, ibdpattern, ids = NULL, markers = NULL, seed = NULL, verbose = TRUE )
profileSimIBD( x, ibdpattern, ids = NULL, markers = NULL, seed = NULL, verbose = TRUE )
x |
A |
ibdpattern |
A |
ids |
A vector of ID labels. If NULL, extracted from |
markers |
A vector with names or indices of markers attached to |
seed |
An integer seed for the random number generator. |
verbose |
A logical, by default TRUE. |
It should be noted that the only random part of this function is the sampling of founder alleles for each marker. Given those, all other genotypes in the pedigree are determined by the underlying IBD pattern.
A copy of x
where marker genotypes have been simulated conditional
on ibdpattern
.
ibdsim()
, forrel::profileSim()
.
# Brother-sister pedigree ped = nuclearPed(2, sex = 1:2) # Alleles als = letters[1:10] ### Autosomal simulation x = ped |> addMarker(alleles = als, chrom = 1, posMb = 20) |> addMarker(alleles = als, chrom = 1, posMb = 50) |> addMarker(alleles = als, chrom = 1, posMb = 70) # Simulate the underlying IBD pattern in the pedigree sim = ibdsim(x, map = uniformMap(M = 1, chrom = 1), seed = 123) # Simulate genotypes for the sibs conditional on the given IBD pattern profileSimIBD(x, sim, ids = 3:4, seed = 123) # With a different seed profileSimIBD(x, sim, ids = 3:4, seed = 124) ### X chromosomal simulation y = ped |> addMarker(alleles = als, chrom = "X", posMb = 1) |> addMarker(alleles = als, chrom = "X", posMb = 50) |> addMarker(alleles = als, chrom = "X", posMb = 100) simy = ibdsim(y, map = loadMap("decode19", chrom = 23), seed = 11) profileSimIBD(y, simy, seed = 12)
# Brother-sister pedigree ped = nuclearPed(2, sex = 1:2) # Alleles als = letters[1:10] ### Autosomal simulation x = ped |> addMarker(alleles = als, chrom = 1, posMb = 20) |> addMarker(alleles = als, chrom = 1, posMb = 50) |> addMarker(alleles = als, chrom = 1, posMb = 70) # Simulate the underlying IBD pattern in the pedigree sim = ibdsim(x, map = uniformMap(M = 1, chrom = 1), seed = 123) # Simulate genotypes for the sibs conditional on the given IBD pattern profileSimIBD(x, sim, ids = 3:4, seed = 123) # With a different seed profileSimIBD(x, sim, ids = 3:4, seed = 124) ### X chromosomal simulation y = ped |> addMarker(alleles = als, chrom = "X", posMb = 1) |> addMarker(alleles = als, chrom = "X", posMb = 50) |> addMarker(alleles = als, chrom = "X", posMb = 100) simy = ibdsim(y, map = loadMap("decode19", chrom = 23), seed = 11) profileSimIBD(y, simy, seed = 12)
Compute the realised values of various pedigree coefficients, from simulated data. The current implementation covers inbreeding coefficients for single pedigree members, and kinship, kappa and condensed identity coefficients for pairwise relationships.
realisedInbreeding(sims, id = NULL, unit = "cm") realisedKinship(sims, ids = NULL, unit = "cm") realisedKappa(sims, ids = NULL, unit = "cm") realisedIdentity(sims, ids = NULL, unit = "cm")
realisedInbreeding(sims, id = NULL, unit = "cm") realisedKinship(sims, ids = NULL, unit = "cm") realisedKappa(sims, ids = NULL, unit = "cm") realisedIdentity(sims, ids = NULL, unit = "cm")
sims |
A list of genome simulations, as output by |
id , ids
|
A vector with one or two ID labels. |
unit |
Either "mb" (megabases) or "cm" (centiMorgan); the length unit for genomic segments. Default is "cm", which normally gives lower variance. |
The inbreeding coefficient of a pedigree member is defined as the
probability of autozygosity (homozygous for alleles that are identical by
descent) in a random autosomal locus. Equivalently, the inbreeding
coefficient is the expected autozygous proportion of the autosomal
chromosomes.
The realised inbreeding coefficient in a given individual is the
actual fraction of the autosomes covered by autozygous segments. Because of
the stochastic nature of meiotic recombination, this may deviate
substantially from the pedigree-based expectation.
Similarly, the pedigree-based IBD coefficients of noninbred pairs of individuals have realised counterparts. For
any given pair of individuals we define
to be the actual fraction
of the autosome where the individuals share exactly
alleles IBD,
where
.
Finally, we can do the same thing for each of the nine condensed identity
coefficients of Jacquard. For each we define
the
be the fraction of the autosome where a given pair of individuals are in
identity state
. This uses the conventional ordering of the nine
condensed identity states; see for instance the
ribd
GitHub page.
# Realised IBD coefficients between full siblings x = nuclearPed(2) s = ibdsim(x, N = 2) # increase N realisedKappa(s, ids = 3:4) ########### # Realised inbreeding coefficients, child of first cousins x = cousinPed(1, child = TRUE) s = ibdsim(x, N = 2) # increase N realisedInbreeding(s, id = 9) # Same data: realised kinship coefficients between the parents realisedKinship(s, ids = parents(x, 9)) ########### # Realised identity coefficients after full sib mating x = fullSibMating(1) s = ibdsim(x, N = 2) # increase N realisedIdentity(s, ids = 5:6)
# Realised IBD coefficients between full siblings x = nuclearPed(2) s = ibdsim(x, N = 2) # increase N realisedKappa(s, ids = 3:4) ########### # Realised inbreeding coefficients, child of first cousins x = cousinPed(1, child = TRUE) s = ibdsim(x, N = 2) # increase N realisedInbreeding(s, id = 9) # Same data: realised kinship coefficients between the parents realisedKinship(s, ids = parents(x, 9)) ########### # Realised identity coefficients after full sib mating x = fullSibMating(1) s = ibdsim(x, N = 2) # increase N realisedIdentity(s, ids = 5:6)
Compute summary statistics for segments identified by findPattern()
.
segmentStats( x, quantiles = c(0.025, 0.5, 0.975), returnAll = FALSE, unit = "mb" )
segmentStats( x, quantiles = c(0.025, 0.5, 0.975), returnAll = FALSE, unit = "mb" )
x |
A list of matrices produced with |
quantiles |
A vector of quantiles to include in the summary. |
returnAll |
A logical, by default FALSE. If TRUE, the output includes a
vector |
unit |
Either "mb" (megabases) or "cm" (centiMorgan); the length unit for genomic segments. |
A list containing a data frame perSim
, a matrix summary
and (if
returnAll
is TRUE) a vector allSegs
.
Variables used in the output:
Count
: The total number of segments in a simulation
Total
: The total sum of the segment lengths in a simulation
Average
: The average segment lengths in a simulation
Shortest
: The length of the shortest segment in a simulation
Longest
: The length of the longest segment in a simulation
Overall
(only in summary
): A summary of all segments from all
simulations
x = nuclearPed(3) sims = ibdsim(x, N = 2, map = uniformMap(M = 2), model = "haldane", seed = 1729) # Segments where all siblings carry the same allele segs = findPattern(sims, pattern = list(carriers = 3:5)) # Summarise segmentStats(segs, unit = "mb") # The unit does not matter in this case (since the map is trivial) segmentStats(segs, unit = "cm")
x = nuclearPed(3) sims = ibdsim(x, N = 2, map = uniformMap(M = 2), model = "haldane", seed = 1729) # Segments where all siblings carry the same allele segs = findPattern(sims, pattern = list(carriers = 3:5)) # Summarise segmentStats(segs, unit = "mb") # The unit does not matter in this case (since the map is trivial) segmentStats(segs, unit = "cm")
Create a uniform recombination map of a given length.
uniformMap(Mb = NULL, cM = NULL, M = NULL, cmPerMb = 1, chrom = 1)
uniformMap(Mb = NULL, cM = NULL, M = NULL, cmPerMb = 1, chrom = 1)
Mb |
Map length in megabases. |
cM |
Map length in centiMorgan. |
M |
Map length in Morgan. |
cmPerMb |
A positive number; the cM/Mb ratio. |
chrom |
A chromosome label, which may be any string. The values "X" and
"23" have a special meaning, both resulting in the |
An object of class chromMap
. See loadMap()
for details.
m = uniformMap(Mb = 1, cM = 2:3) m m$male m$female mx = uniformMap(M = 1, chrom = "X") mx mx$male mx$female
m = uniformMap(Mb = 1, cM = 2:3) m m$male m$female mx = uniformMap(M = 1, chrom = "X") mx mx$male mx$female
Estimate the probability of no IBD sharing in a pairwise relationship.
zeroIBD(sims, ids = NULL, threshold = 0, unit = "cm")
zeroIBD(sims, ids = NULL, threshold = 0, unit = "cm")
sims |
A list of genome simulations, as output by |
ids |
A vector with two ID labels. If NULL (default), these are deduced
from the |
threshold |
A nonnegative number (default:0). Only IBD segments longer than this are included in the computation. |
unit |
The unit of measurement for |
A list with the following two entries:
zeroprob
: The fraction of sims
in which ids
have no IBD sharing
stErr
: The standard error of zeroprob
### # The following example computes the probability of # no IBD sharing between a pair of fourth cousins. # We also show how the probability is affected by # truncation, i.e., ignoring short segments. ### # Define the pedigree x = cousinPed(4) cous = leaves(x) # Simulate (increase N!) s = ibdsim(x, N = 10) # Probability of zero ibd segments. (By default all segs are used) zeroIBD(s, ids = cous) # Re-compute with nonzero threshold zeroIBD(s, ids = cous, threshold = 1, unit = "cm") zeroIBD(s, ids = cous, threshold = 1, unit = "mb")
### # The following example computes the probability of # no IBD sharing between a pair of fourth cousins. # We also show how the probability is affected by # truncation, i.e., ignoring short segments. ### # Define the pedigree x = cousinPed(4) cous = leaves(x) # Simulate (increase N!) s = ibdsim(x, N = 10) # Probability of zero ibd segments. (By default all segs are used) zeroIBD(s, ids = cous) # Re-compute with nonzero threshold zeroIBD(s, ids = cous, threshold = 1, unit = "cm") zeroIBD(s, ids = cous, threshold = 1, unit = "mb")