Ease
aims to
implement in a simple and efficient way in R the possibility to perform
population genetics simulations considering multiple loci whose
epistasis is fully customizable. Specifically suited to the modelling of
multilocus nucleocytoplasmic systems, it is nevertheless possible to
simulate purely nucleic, i.e. diploid (or purely cytoplasmic,
i.e. haploid) genetic models. The simulations are not individual-centred
in that the transition from one generation to the next is done
matrix-wise on the basis of deterministic equations. Instead of each
individual being described separately, the simulations only handle the
genotype frequencies within the population. All possible genotype
frequencies considering the loci and alleles defined by the user are
explicitly tracked. The simulations are therefore fast only if the
number of loci and alleles is not too large.
The consideration of genetic drift and a specific population size is
nevertheless introduced as a multinomial draw each generation, which
adds to the realism of the simulations by adding randomisation. One can
also simulate a simple demography, i.e. an initial population size, a
growth rate and a carrying capacity. The size of the population is thus
drawn each generation in a Poisson distribution. Varying the carrying
capacity is not (yet) included in Ease
, but it is possible
to run different models in succession, giving the next one as the
initial genotype frequencies the frequencies of the previous one.
The population considered can be structured and connected by migration. Migration is only possible at the individual stage and not at the gamete stage, which future updates may add. It is also not possible to define different migration rates between different genotypes.
The consideration of genetic drift and thus a specific population
size is nevertheless introduced as a multinomial draw each generation,
which adds to the realism of the simulations by adding randomisation. In
the Ease
package, the life cycle of the simulated
population is standard ([migration] - [selection on gamete production] -
[gametogenesis (recombination + meiosis + mutation)] - [selection on
gametes] - [syngamy] - [selection on individuals] - [demography] -
[drift]) and may consider the population dioecious or
hermaphroditic.
A genome is defined by the set of loci to which lists of alleles are attached. Each loci and each allele is defined by a unique name, which allows it to be unequivocally identified.
There are two types of loci: diploid and haploid. A genotype is
defined as an allelic combination of all the alleles of an individual’s
loci and a haplotype as only those alleles that have been inherited
together from a single parent. A genotype is therefore made up of two
haplotypes. A distinction is also made between diploid (resp. haploid)
haplotypes which correspond to allelic combinations taking into account
only diploid (resp. haploid) loci. The loci are defined by a list of
vectors that enumerates their respective alleles. The order in which the
loci are placed is not important in the case of haploid loci. It does
matter in the case of diploid loci because recombination is likely to
affect the haplotypes. In the Ease
package, diploid loci
are
In the case of diploid loci, however, if several are defined, the
order of the diploid loci in the list is not trivial. The rates of
two-to-one combinations between them must indeed be defined by a vector
of recombination rates. For example, if three diploid loci are defined,
this vector must be of length 2, the first of its values defining the
recombination rate between the first and second loci, the second of its
values the recombination rate between the second and third loci. For
example, if we want to define two groups of two loci that are linked to
each other but are on two different chromosomes, we can define the
recombination rate vector as c(0.1, 0.5, 0.1)
. The first
two loci are thus relatively linked (recombination rate of
0.1
), as are the last two loci. On the other hand, the
recombination rate of 0.5
between the second and third loci
ensures that the two groups are independent.
To create a haplotype ID, we concatenate all diploid alleles and all
haploid alleles separately, then concatenate these two strings by
separating them with "||"
. For example
"Ab||CD"
corresponds to a haplotype with four loci, two
diploid with alleles A
and b
, and two haploid
with alleles C
and D
. The principle is the
same for the genotypes, but the second diploid haplotype is added by
separating it from the first by a "/"
, for example
"Ab/ab||CD"
.
Each loci is represented by a name and a factor vector that lists its
alleles. If one wish to consider a system with two loci, a diploid and a
haploid, each of which has two alleles, A
and
a
, and B
and b
respectively, the
construction of the genome is done as follows:
DL <- list(dl = c("A", "a"))
HL <- list(hl = c("B", "b"))
genomeObj <- setGenome(listHapLoci = HL, listDipLoci = DL)
The haplotypes and genotypes have been generated automatically, their
numbers can be retrieved by simply displaying the Genome
object created:
genomeObj
#> -=-=-=-=-=-= GENOME OBJECT =-=-=-=-=-=-
#> # 1 haploid locus, with 2 allele(s)
#> # 1 diploid locus, with 2 allele(s)
#> # 4 haplotypes
#> # 6 genotypes
#> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
#> (use print for a list of haplotypes and genotypes)
and an exhaustive list can be displayed using the print
method:
print(genomeObj)
#> -=-=-=-=-=-= GENOME OBJECT =-=-=-=-=-=-
#> in details
#>
#> # 1 haploid loci:
#> - 'hl' with B and b alleles
#>
#> # 1 diploid loci:
#> - 'dl' with A and a alleles
#>
#> # 4 alleles:
#> [1] 1) B 2) b 3) A 4) a
#>
#> # 4 haplotypes:
#> [1] 1) A||B 2) a||B 3) A||b 4) a||b
#>
#> # 6 genotypes:
#> [1] 1) A/A||B 2) A/a||B 3) a/a||B 4) A/A||b 5) A/a||b
#> [6] 6) a/a||b
#>
#> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
The haplotypes and genotypes are numbered, and these numberings will be important in defining the different types of fitness, as we shall now see.
A genome necessarily has a mutation matrix attached to it. This mutation matrix is haplotypic: it is a square probability matrix (the sum of the rows of which is equal to 1), of size equal to the number of haplotypes defined in the genome. This mutation matrix is not provided as is by the user, in which case it would be too tedious to define. Instead the user is asked to either :
mutation
functionNOTE: In practice, the mutation matrix is not used as such in the simulations. It is associated with the recombination matrix and the meiosis matrix which associates to each genotype the probability that they produce each haplotype by chromosomal segregation. It is with a matrix product Recombination matrix x Meiosis matrix x Mutation matrix that a single gametogenesis matrix is produced and used for the simulations.
Definition of the haplotypic mutation matrix by filling in the allelic mutation matrices :
mutMatrixObj <- setMutationMatrix(
genomeObj = genomeObj,
mutHapLoci = list(matrix(c(
0.95, 0.05,
0.03, 0.97
), 2, byrow = TRUE)),
mutDipLoci = list(matrix(c(
0.9, 0.1,
0.09, 0.91
), 2, byrow = TRUE))
)
mutMatrixObj
#> -=-=-=- MUTATION MATRIX OBJECT -=-=-=-
#> # Haplotypic mutation matrix:
#> A||B a||B A||b a||b
#> A||B 0.8550 0.0950 0.0450 0.0050
#> a||B 0.0855 0.8645 0.0045 0.0455
#> A||b 0.0270 0.0030 0.8730 0.0970
#> a||b 0.0027 0.0273 0.0873 0.8827
#> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
#> (use print to access the allelic mutation matrices)
Definition of the haplotypic mutation matrix by filling in the forward and backward mutation rates:
mutMatrixObj <- setMutationMatrix(genomeObj = genomeObj, forwardMut = 0.01)
mutMatrixObj
#> -=-=-=- MUTATION MATRIX OBJECT -=-=-=-
#> # Haplotypic mutation matrix:
#> A||B a||B A||b a||b
#> A||B 0.9801 0.0099 0.0099 1e-04
#> a||B 0.0000 0.9900 0.0000 1e-02
#> A||b 0.0000 0.0000 0.9900 1e-02
#> a||b 0.0000 0.0000 0.0000 1e+00
#> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
#> (use print to access the allelic mutation matrices)
Definition of the haplotypic mutation matrix by making a mutation list:
mutMatrixObj <- setMutationMatrix(
genomeObj = genomeObj,
mutations = list(
mutation(from = "A", to = "a", rate = 0.01),
mutation(from = "B", to = "b", rate = 0.01)
)
)
mutMatrixObj
#> -=-=-=- MUTATION MATRIX OBJECT -=-=-=-
#> # Haplotypic mutation matrix:
#> A||B a||B A||b a||b
#> A||B 0.9801 0.0099 0.0099 1e-04
#> a||B 0.0000 0.9900 0.0000 1e-02
#> A||b 0.0000 0.0000 0.9900 1e-02
#> a||b 0.0000 0.0000 0.0000 1e+00
#> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
#> (use print to access the allelic mutation matrices)
Selection can be defined at three stages of this cycle: on mature individuals directly, on their gamete production or on the gametes. For individuals and gamete production, a fitness value is associated with each genotype. For gametes, it is with each haplotype. A fitness value is any positive or zero real. Fitness values are relative, so if all genotypes have a fitness value of 3, there will be no effect on the dynamics of the model. By default, as is customary, the fitness of individuals is set to 1, and the additional fitness effects are added multiplicatively.
In all cases, the construction of a Selection
object is
done using a genome class object (which is used to check the
compatibility between the constructed genome and the desired selection
parameters). There are two ways of defining selection : * give a vector
of length equal to the number of genotypes. By defining the fitness
vector as such, it is therefore necessary to know the order of
haplotypes and genotypes (use the method print
on an object
of class Genome
). This way is often impractical and the
second way is often preferred. * a simpler and more readable way to
define the selection is to create a list of selection formulas, as they
will be called here. The principle is explained below.
A selection formula is a formula (in the sense of R) that associates allelic combinations with a fitness effect. The fitness effect is the left-hand member of the formula, while the right-hand member corresponds to the allelic combinations.
The fitness effect is any literal or numerical expression. If it is literal, it will be evaluated at the time of the definition of the object selection
The definition of allelic combinations is based on several rules.
First, each allelic combination is described using the keywords
:
and h()
. The first one, :
, is
used to separate the alleles from each other in the enumeration of
allele combinations and h()
is used to specify whether the
allele must be in the homozygous state (if this keyword is absent, the
allele is in the heterozygous state). For example, in a model with two
diploid loci and two alleles each, respectively A and
a, B and b, we note h(a):b
to
mean that the fitness effect must apply to any genotype that has the
a allele in the homozygous state and the b allele in
the heterozygous state. Other examples are: a:b
(a
heterozygous and b heterozygous), h(A):h(B)
(A and B homozygous). We can therefore construct the
formula: 1 + s ~ a:b
, which means that the presence of
a and b in the heterozygous state in a genotype has an
effect (1+s
) on the fitness of that genotype. To add other
allelic combinations to which the same fitness effect will apply, the
operator +
must be used. Using the previous example,
1 + s ~ a:b + h(a)
means that the fitness effect should
also apply to genotypes that are homozygous for a.
To define a selection regime, one has to give as input a list of
selection formula that describe how, depending on their allelic
combinations, genotypes suffer or benefit from fitness effects, for
example of the form
list(1 + s ~ h(a):h(b), 1 + h*s ~ a:b + a:h(b) + h(a):b)
.
An important thing to note is that
list(1 + s ~ a + b)
is not equivalent to
list(1 + s ~ a, 1 + s ~ b)
In the first situation, if a genotype has a OR b in the heterozygous state, the fitness effect will apply. In the second situation the effect will apply twice (multiplicatively) to the same genotype because it fits two different formulas.
NOTE: This way of defining fitness does not allow to distinguish between cis and trans effects on fitness.
Then it is for example possible to define no selection (neutral
model) with the function setSelectNeutral
to construct a
selection
object where the fitnesses are all identical
(equal to 1):
We can then check that no selection has been defined:
selectionObj
#> -=-=-=-=-=- SELECTION OJBECT =-=-=-=-=-
#> # On individuals: NO
#> # On gametes: NO
#> # On gamete production: NO
#> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
#> (use print to access the fitness values)
or with :
Using the example given in the Genome section,
one might want to simulate a system of genetic incompatibility where
when the derived alleles a
and b
are put
together within the same genotype, they induce a fitness cost through
negative epistasis. This cost, which we will call s
, is
associated with h
dominance which reduces this cost when
the a
nuclear allele is in the heterozygous state. Thus
individuals A/A||B
, A/a||B
,
a/a||B
and A/A||b
do not suffer any fitness
cost (because they have only one of the two incompatible alleles), their
fitness is equal to 1. The genotype A/a||b
undergoing the
reduced cost of incompatibility has a fitness of 1-h*s
and
the genotype a/a||b
undergoing the full cost of
incompatibility has a fitness of 1 - s
. To define selection
it is then sufficient to specify that if only one allele a
is present with allele b
(a:b
), the cost is
only 1 - h*s
, and if allele a
is homozygous
with allele b
, the cost is 1 - s
.
indFit <- list(
1 - h * s ~ a:b,
1 - s ~ h(a):b
)
s <- 0.8
h <- 0.5
selectionObj <- setSelectOnInds(
genomeObj = genomeObj,
indFit = indFit
)
We can then check that selection has been defined:
selectionObj
#> -=-=-=-=-=- SELECTION OJBECT =-=-=-=-=-
#> # On individuals: YES
#> # On gametes: NO
#> # On gamete production: NO
#> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
#> (use print to access the fitness values)
Regarding selection on individuals, it is necessary to understand
that it will potentially not be identical if the modelled population is
hermaphroditic or dioecious. In the case of hermaphroditism there is no
distinction between female and male fitness, and so the
indFit
parameter will govern their fitness. If the sexes
are separated, however, one can either define a fitness in individuals
indFit
that will apply to both males and females, or
specify separately for males and females with the parameters
femaleFit
and maleFit
.
In any case it is good to check with the print
method
that the fitnesses are those wanted:
print(selectionObj)
#> -=-=-=-=-=- SELECTION OJBECT =-=-=-=-=-
#> in details
#>
#> Individuals Female Male
#> A/A||B 1.0 1.0 1.0
#> A/a||B 1.0 1.0 1.0
#> a/a||B 1.0 1.0 1.0
#> A/A||b 1.0 1.0 1.0
#> A/a||b 0.6 0.6 0.6
#> a/a||b 0.2 0.2 0.2
#>
#> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Selection can also be defined on gamete production:
indProdFit <- list(
1 - h * s ~ a:b,
1 - s ~ h(a):b
)
s <- 0.8
h <- 0.5
selectionObj <- setSelectOnGametesProd(
genomeObj = genomeObj,
indProdFit = indProdFit
)
or on the gametes directly:
gamFit <- list(
1 - s ~ a:b
)
s <- 0.8
h <- 0.5
selectionObj <- setSelectOnGametes(
genomeObj = genomeObj,
gamFit = gamFit
)
For the latter two selection modes, it is possible to specify the fitness effect according to sex or to define the same effect regardless of sex, as desired.
Last but not least, it is obviously possible to combine these
different layers of selections. This is done using the
selectionObj
parameter that each of the
setSelect...
functions has (except
setSelectNeutral
), it is then unnecessary to recall the
genome to which the selection refers. For example, if we want to combine
the three types of selections presented here :
indFit <- list(
1 - h * s ~ a:b,
1 - s ~ h(a):b
)
indProdFit <- list(
1 - h * s ~ a:b,
1 - s ~ h(a):b
)
gamFit <- list(
1 - s ~ a:b
)
s <- 0.8
h <- 0.5
selectionObj <- setSelectOnInds(genomeObj = genomeObj, indFit = indFit)
selectionObj <- setSelectOnGametesProd(indProdFit = indProdFit, selectionObj = selectionObj)
selectionObj <- setSelectOnGametes(femaleFit = gamFit, selectionObj = selectionObj)
print(selectionObj)
#> -=-=-=-=-=- SELECTION OJBECT =-=-=-=-=-
#> in details
#>
#> Individuals Female Male
#> A/A||B 1.0 1.0 1.0
#> A/a||B 1.0 1.0 1.0
#> a/a||B 1.0 1.0 1.0
#> A/A||b 1.0 1.0 1.0
#> A/a||b 0.6 0.6 0.6
#> a/a||b 0.2 0.2 0.2
#>
#> # On gametes:
#> Female gamete Male gamete
#> A||B 1.0 1
#> a||B 1.0 1
#> A||b 1.0 1
#> a||b 0.2 1
#>
#> # On gamete production:
#> Female gamete Male gamete
#> A/A||B 1.0 1.0
#> A/a||B 1.0 1.0
#> a/a||B 1.0 1.0
#> A/A||b 1.0 1.0
#> A/a||b 0.6 0.6
#> a/a||b 0.2 0.2
#>
#> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Once the previous steps have been completed, i.e. definition of the genome, mutation and selection, the population can be constructed. A population is defined strictly by a name, a size, a sexual system (dioecy or hermaphodite), and the three objects defined previously: genome, mutation matrix and selection. In addition to that, it is possible to define
Two demographic regimes are possible: no demography, i.e. a fixed
population size, or demography, i.e. a population where the size
fluctuates stochastically. The boolean argument demography
is used to define whether there should be stochasticity. For a fixed
population size, it is therefore sufficient to define that
demography = FALSE
(default) and to set the desired
population size with the popSize
parameter.
For a fluctuating demography, demography
must be
TRUE
and three other parameters are then needed: the
initial population size (initPopSize
), the population
growth rate (growthRate
) and the carrying capacity of the
population (the population size, popSize
).
It is also possible to avoid defining a population size altogether,
by setting off the genetic drift (drift
parameter). This
will allow the model to be simulated deterministically.
Definition of a population in its simplest form:
DL <- list(dl = c("A", "a"))
HL <- list(hl = c("B", "b"))
mutations <- list(
mutation(from = "A", to = "a", rate = 1e-3),
mutation(from = "B", to = "b", rate = 1e-3)
)
genomeObj <- setGenome(listHapLoci = HL, listDipLoci = DL)
pop <- setPopulation(
name = "A",
size = 1000,
dioecy = TRUE,
genomeObj = genomeObj,
selectionObj = setSelectNeutral(genomeObj),
mutMatrixObj = setMutationMatrix(genomeObj, mutations = mutations)
)
pop
#> -=-=-=-=-= Population OBJECT =-=-=-=-=-
#> Population 'A' of 1000 dioecious individuals
#> There is no demography.
#> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
#> (use print for details on the genome)
More information on the population can be accessed using the
print
method;
print(pop)
#> -=-=-=-=-= Population OBJECT =-=-=-=-=-
#> in details
#> Population 'A' of 1000 dioecious individuals
#> There is no demography.
#> The initial genotypes frequency are:
#> A/A||B A/a||B a/a||B A/A||b A/a||b a/a||b
#> 1 0 0 0 0 0
#> Selection:
#> No selection defined.
#> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
To simulate it is then necessary to go through the metapopulation class, even if only one population is simulated. More information below.
A metapopulation is a set of population(s) (from 1) that are simulated with potential migration between them. Only genotypes can migrate, i.e. adult individuals.
The construction of a Metapopulation
object requires
only two arguments (one optional). The first is a population(s) list,
defined from the population class. The second is a migration matrix,
which connects the populations together. This matrix is a probability
matrix (square with the sum of the rows equal to 1, whose size is equal
to the number of populations) where each value corresponds to the
proportion of individuals (genotypes) that disperse from their source
population (row) to their target population (column).
The nameOutFunct
(optionnal) argument is the name of a
function that allow to produce a custom output for a simulation. It is
called each generation in each population of a simulation and
systematically returns a list with the first element being a logical
that indicates whether something should be saved. If so, the second
element of this list will be saved (no need to add a second element the
first is FALSE
). The custom output function must take only
one argument, pop
. This argument is a list of :
customOutput
: list of all previous savingsgen
: generationfreqGeno
: list of genotypic frequency matrices (matrix
1 x # genotypes). The list is constructed as follows: if the population
is hermaphroditic it has only one element “ind”, if the population is
dioecious it has three elements, “female”, “male” and “ind” which
correspond respectively to the genotypic frequencies of the females, the
males and the average of the two (assuming a sex ratio of 50:50).freqHaplo
: list of genotypic frequency matrices
(matrix 1 x # haplotypes). The list is constructed in the same way as
for genotypic frequencies (see above).freqAlleles
: list of allelic frequency matrices
(matrix 1 x # alleles). The list is constructed in the same way as for
genotypic frequencies (see above).An example of such a function could be :
customOutFunct <- function(pop) {
if (pop$freqAlleles[4] < 0.1 | pop$freqAlleles[4] > 0.9) {
return(list(TRUE, a = list(gen = pop$gen, freq = pop$freqAlleles[4])))
}
return(list(FALSE))
}
Here the function returns orders to save the frequency of the fourth
allele of the model if its frequency is between 0 and 0.1 or between 0.9
and 1. After defining it in this way, one will have to give
"customOutFunct"
as the value of the
nameOutFunct
parameter in the simulate
method
of the class Metapopulation
for it to be taken into
account.
For example, consider a model with a single locus with two alleles, A and a, where the a allele has a small positive fitness effect.
The genome:
DL <- list(dl = c("A", "a"))
genomeObj <- setGenome(listDipLoci = DL)
#> Warning in .local(.Object, ...): No haploid locus has been set. By construction
#> it is necessary that there is at least 1 haploid locus, so it has been defined
#> with only one allele (this will not affect the simulations).
print(genomeObj)
#> -=-=-=-=-=-= GENOME OBJECT =-=-=-=-=-=-
#> in details
#>
#> # 1 haploid loci:
#> - 'NoneHL' with NoneH alleles
#>
#> # 1 diploid loci:
#> - 'dl' with A and a alleles
#>
#> # 3 alleles:
#> [1] 1) NoneH 2) A 3) a
#>
#> # 2 haplotypes:
#> [1] 1) A||NoneH 2) a||NoneH
#>
#> # 3 genotypes:
#> [1] 1) A/A||NoneH 2) A/a||NoneH 3) a/a||NoneH
#>
#> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
This warning is normal when working only with diploid (or only haploid) loci.
The mutation matrix:
mutMatrixObj <- setMutationMatrix(
genomeObj = genomeObj,
forwardMut = 1e-3,
backwardMut = 1e-3
)
print(mutMatrixObj)
#> -=-=-=- MUTATION MATRIX OBJECT -=-=-=-
#> in details
#>
#> # 1 haploid locus allelic matrix:
#>
#> NoneHL :
#> NoneH
#> NoneH 1
#>
#> # 1 diploid locus allelic matrix:
#>
#> dl :
#> A a
#> A 0.999 0.001
#> a 0.001 0.999
#>
#> # Haplotypic mutation matrix:
#> A||NoneH a||NoneH
#> A||NoneH 0.999 0.001
#> a||NoneH 0.001 0.999
#>
#> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
The object of selection is then created which describes that allele
A
is beneficial in population 1 while allele a
is beneficial in population 2. This implements a selection differential
between the two populations, a form of local selection.
# Selection in population 1
indFit <- list(
1 + h * s ~ a,
1 + s ~ h(a)
)
h <- 0.5
s <- 0.05
selectionObj1 <- setSelectOnInds(
genomeObj = genomeObj,
indFit = indFit
)
print(selectionObj1)
#> -=-=-=-=-=- SELECTION OJBECT =-=-=-=-=-
#> in details
#>
#> Individuals Female Male
#> A/A||NoneH 1.000 1.000 1.000
#> A/a||NoneH 1.025 1.025 1.025
#> a/a||NoneH 1.050 1.050 1.050
#>
#> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
# Selection in population 2
indFit <- list(
1 + h * s ~ A,
1 + s ~ h(A)
)
h <- 0.5
s <- 0.05
selectionObj2 <- setSelectOnInds(
genomeObj = genomeObj,
indFit = indFit
)
print(selectionObj2)
#> -=-=-=-=-=- SELECTION OJBECT =-=-=-=-=-
#> in details
#>
#> Individuals Female Male
#> A/A||NoneH 1.050 1.050 1.050
#> A/a||NoneH 1.025 1.025 1.025
#> a/a||NoneH 1.000 1.000 1.000
#>
#> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
We can now define a migration rate of 0.005 from each of the two populations to the other.
And finally create the Metapopulation
object
metapop <- setMetapopulation(
populations = list(
setPopulation(
name = "pop1", size = 500, dioecy = F, genomeObj = genomeObj,
selectionObj = selectionObj1, mutMatrixObj = mutMatrixObj, selfRate = 0.5
),
setPopulation(
name = "pop2", size = 500, dioecy = F, genomeObj = genomeObj,
selectionObj = selectionObj2, mutMatrixObj = mutMatrixObj, selfRate = 0.5
)
),
migMat = migMat
)
metapop
#> -=-=-=-= Metapopulation OBJECT =-=-=-=-
#> 2 populations of size 500 and 500,
#> forming a metapopulation of 1000 individuals.
#> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
#> (use print to get population details)
The print method
allows access to all the
characteristics of an object of class Metapopulation
print(metapop)
#> -=-=-=-= Metapopulation OBJECT =-=-=-=-
#> in details
#>
#> # Populations:
#>
#> Population 'pop1' of 500 hermaphroditic individuals
#> with a 50% selfing rate.
#> There is no demography.
#> The initial genotypes frequency are:
#> A/A||NoneH A/a||NoneH a/a||NoneH
#> 1 0 0
#> Selection:
#> - On individuals:
#> Fitness
#> A/A||NoneH 1.000
#> A/a||NoneH 1.025
#> a/a||NoneH 1.050
#> ~-~-~
#> Population 'pop2' of 500 hermaphroditic individuals
#> with a 50% selfing rate.
#> There is no demography.
#> The initial genotypes frequency are:
#> A/A||NoneH A/a||NoneH a/a||NoneH
#> 1 0 0
#> Selection:
#> - On individuals:
#> Fitness
#> A/A||NoneH 1.050
#> A/a||NoneH 1.025
#> a/a||NoneH 1.000
#>
#> # Migration matrix:
#>
#> pop1 pop2
#> pop1 0.995 0.005
#> pop2 0.005 0.995
#>
#> # Haploptypes:
#>
#> 1 2
#> "A||NoneH" "a||NoneH"
#>
#> # Genotypes:
#>
#> 1 2 3
#> "A/A||NoneH" "A/a||NoneH" "a/a||NoneH"
#>
#> # Matrices involved in gametogenesis:
#>
#> - Mutation matrix:
#> A||NoneH a||NoneH
#> A||NoneH 0.999 0.001
#> a||NoneH 0.001 0.999
#>
#> - Recombination matrix:
#> A/A||NoneH A/a||NoneH a/a||NoneH
#> A/A||NoneH 1 0 0
#> A/a||NoneH 0 1 0
#> a/a||NoneH 0 0 1
#>
#> - Meiosis matrix:
#> A||NoneH a||NoneH
#> A/A||NoneH 1.0 0.0
#> A/a||NoneH 0.5 0.5
#> a/a||NoneH 0.0 1.0
#>
#> - Final gametogenesis matrix:
#> A||NoneH a||NoneH
#> A/A||NoneH 0.999 0.001
#> A/a||NoneH 0.500 0.500
#> a/a||NoneH 0.001 0.999
#>
#> # Haplotypes crossing matrix:
#>
#> A||NoneH(Mal) a||NoneH(Mal)
#> A||NoneH(Fem) A/A||NoneH A/a||NoneH
#> a||NoneH(Fem) A/a||NoneH a/a||NoneH
#>
#> # Allele frequencies from genotype frequencies matrix:
#>
#> NoneH A a
#> A/A||NoneH 1 1.0 0.0
#> A/a||NoneH 1 0.5 0.5
#> a/a||NoneH 1 0.0 1.0
#>
#> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
The model is then simulated 100 times recording the population state
every 50 generations up to 800 generations. The seed
is
specified to ensure reproducibility of the results obtained.
nsim <- 100
metapop <- simulate(metapop,
nsim = nsim, seed = 123,
recording = TRUE, recordGenGap = 50,
threshold = 800
)
metapopDeterminist <- simulate(metapop,
drift = FALSE, seed = 123,
recording = TRUE, threshold = 800
)
Then we sum up the recording data frames of each simulation and divide them by the number of simulations (which results in an overall average of the data frames):
rec <- getRecords(metapop)
recMean <- Reduce(
function(x, y) {
x$pop1 <- x$pop1 + y$pop1
x$pop2 <- x$pop2 + y$pop2
x
},
rec
)
recMean$pop1 <- recMean$pop1 / nsim
recMean$pop2 <- recMean$pop2 / nsim
recDeterminist <- getRecords(metapopDeterminist)$s1
This creates a mutation - selection - drift - migration equilibrium visible on these graphs. Dashed lines indicate deterministic predictions (without drift).
par(mfrow = c(2, 1))
### Allelic frequency
plot(c(),
xlim = c(0, 800), ylim = c(0, 1), col = "blue",
xlab = "Generation\n(blue for A, red for B)",
ylab = "Frequency of a"
)
# Stochastic
for (i in seq_len(nsim)) {
lines(rec[[i]]$pop1$gen, rec[[i]]$pop1$a, lty = "longdash", col = "lightblue")
lines(rec[[i]]$pop2$gen, rec[[i]]$pop2$a, lty = "longdash", col = "lightpink")
}
lines(recMean$pop1$gen, recMean$pop1$a, lty = "longdash", col = "blue")
lines(recMean$pop2$gen, recMean$pop2$a, lty = "longdash", col = "red")
# Deterministic
lines(recDeterminist$pop1$gen, recDeterminist$pop1$a, lty = "longdash", col = "blue", lwd = 2)
lines(recDeterminist$pop2$gen, recDeterminist$pop2$a, lty = "longdash", col = "red", lwd = 2)
# Mean stochastic
lines(recMean$pop1$gen, recMean$pop1$a, col = "blue")
lines(recMean$pop2$gen, recMean$pop2$a, col = "red")
### Mean fitness
plot(c(),
xlim = c(0, 800), ylim = c(1, 1 + s), col = "blue",
xlab = "Generation\n(blue for A, red for B)",
ylab = "Frequency of a"
)
# Stochastic
for (i in seq_len(nsim)) {
lines(rec[[i]]$pop1$gen, rec[[i]]$pop1$indMeanFit, lty = "longdash", col = "lightblue")
lines(rec[[i]]$pop2$gen, rec[[i]]$pop2$indMeanFit, lty = "longdash", col = "lightpink")
}
lines(recMean$pop1$gen, recMean$pop1$indMeanFit, lty = "longdash", col = "blue")
lines(recMean$pop2$gen, recMean$pop2$indMeanFit, lty = "longdash", col = "red")
# Deterministic
lines(recDeterminist$pop1$gen, recDeterminist$pop1$indMeanFit, lty = "longdash", col = "blue", lwd = 2)
lines(recDeterminist$pop2$gen, recDeterminist$pop2$indMeanFit, lty = "longdash", col = "red", lwd = 2)
# Mean stochastic
lines(recMean$pop1$gen, recMean$pop1$indMeanFit, col = "blue")
lines(recMean$pop2$gen, recMean$pop2$indMeanFit, col = "red")
The average fitnesses in the populations equalise while the allelic frequencies are maintained at intermediate values but still as close as possible to the optimum in both populations.