Package 'purgeR' reference manual

Title:	Inbreeding-Purging Estimation in Pedigreed Populations
Description:	Inbreeding-purging analysis of pedigreed populations, including the computation of the inbreeding coefficient, partial, ancestral and purged inbreeding coefficients, and measures of the opportunity of purging related to the individual reduction of inbreeding load. In addition, functions to calculate the effective population size and other parameters relevant to population genetics are included. See López-Cortegano E. (2021) <doi:10.1093/bioinformatics/btab599>.
Authors:	Eugenio López-Cortegano [aut, cre]
Maintainer:	Eugenio López-Cortegano <elcortegano@gmail.com>
License:	GPL-2
Version:	1.8.2
Built:	2025-03-23 07:41:27 UTC
Source:	CRAN

Individuals to be evaluated in purging analyses

Description

Returns a boolean vector indicating which individuals are suitable for purging analyses, given a measure of fitness. Individuals that have NA fitness values and no descendants with non-NA fitness values are excluded.

Usage

ancestors(ped, reference, rp_idx, nboot = 10000L, seed = NULL, skip_Ng = FALSE)

Arguments

`ped`	A dataframe containing the pedigree. Individual (id), maternal (dam), and paternal (sire) identities are mandatory columns.
`reference`	A string naming a column indicating whether individuals belong to the reference population or not. Column must be boolean or coercible to boolean type.
`rp_idx`	Vector containing the indexes of individuals in the reference population (RP).
`nboot`	Number of bootstrap iterations (for computing Ng).
`seed`	Sets a seed for the random number generator.
`skip_Ng`	Skip Ng computation or not (FALSE by default).

Value

Boolean vector indicating which individuals will be evaluated.
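
The exclusion rule in the Description can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the helper `evaluable()` and the toy pedigree are hypothetical, ids are assumed to equal the row index (1 to N), parents are assumed to precede their offspring, and unknown parents are assumed to be coded as 0.

```r
# Hypothetical helper (not part of purgeR): TRUE for individuals with
# non-NA fitness, or with at least one descendant with non-NA fitness.
# Assumes ids are 1..N, parents precede offspring, unknown parents are 0.
evaluable <- function(ped, fitness) {
  keep <- !is.na(ped[[fitness]])
  # Walk from youngest to oldest so that flags propagate up the pedigree
  for (i in rev(seq_len(nrow(ped)))) {
    if (keep[i]) {
      parents <- c(ped$dam[i], ped$sire[i])
      parents <- parents[parents > 0]   # drop unknown parents
      keep[parents] <- TRUE
    }
  }
  keep
}

# Toy pedigree: only individual 3 has a fitness record
toy <- data.frame(
  id   = 1:5,
  dam  = c(0, 0, 1, 1, 0),
  sire = c(0, 0, 2, 2, 0),
  w    = c(NA, NA, 1, NA, NA)
)
evaluable(toy, "w")
```

In this toy pedigree, individuals 1 and 2 are kept because their offspring 3 has a fitness record, while 4 and 5 have neither a record nor descendants with one.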

Arrui pedigree

Description

This data set contains the pedigree of the arrui (Ammotragus lervia), also known as barbary sheep. A total of 380 individuals is included, as well as measurements of biological fitness and other factors (see reference below for details).

Usage

arrui

Format

A data frame with records from 380 individuals (in rows), and 10 variables:

id - Individual identity.
dam - Maternal identity.
sire - Paternal identity.
survival15 - 15-days survival.
prod - Female productivity.
sex - Individual sex.
yob - Year of birth.
pom - Period of management.
target - Individual in the target population.
eeza_id - Individual identity (as recorded in the original studbook)

Source

The original studbook containing the complete and updated pedigree can be found at: http://www.eeza.csic.es/en/programadecria.aspx.

References

López-Cortegano E et al. 2021. Genetic purging in captive endangered ungulates with extremely low effective population sizes. *Heredity*, https://www.nature.com/articles/s41437-021-00473-2.

Cuvier's gazelle pedigree

Description

This data set contains the pedigree of Cuvier's gazelle (Gazella atlas). A total of 948 individuals is included, as well as measurements of biological fitness and other factors (see reference below for details).

Usage

atlas

Format

A data frame with records from 948 individuals (in rows), and 10 variables:

id - Individual identity.
dam - Maternal identity.
sire - Paternal identity.
survival15 - 15-days survival.
prod - Female productivity.
sex - Individual sex.
yob - Year of birth.
pom - Period of management.
target - Individual in the target population.
eeza_id - Individual identity (as recorded in the original studbook)

Source

The original studbook containing the complete and updated pedigree can be found at: http://www.eeza.csic.es/en/programadecria.aspx.

References

López-Cortegano E et al. 2021. Genetic purging in captive endangered ungulates with extremely low effective population sizes. *Heredity*, https://www.nature.com/articles/s41437-021-00473-2.

Check ancestor individuals

Description

Takes a column name, and checks its use as target. It should name a boolean vector (or coercible to it), with at least one TRUE value.

Usage

check_ancestors(id, ancestors)

Arguments

`id`	Vector of individual ids.
`ancestors`	Vector of ancestor ids.

Value

No return value. Prints an error message if the check fails.

Check basic

Description

This function groups several other checking functions that should be run every time functions in this package are used, to avoid unexpected errors.

Usage

check_basic(
  ped,
  id_name = "id",
  dam_name = "dam",
  sire_name = "sire",
  when_rename = FALSE,
  when_sort = FALSE
)

Arguments

`ped`	A dataframe containing the pedigree. Individual (id), maternal (dam), and paternal (sire) identities are mandatory columns.
`id_name`	Column name for individual id.
`dam_name`	Column name for dam.
`sire_name`	Column name for sire.
`when_rename`	TRUE when called from the ped_rename function. It relaxes checks on the individual ID column name and types.
`when_sort`	TRUE when called from the ped_sort function. It relaxes checks on pedigree sorting.

Value

No return value. Prints an error message if the check fails.

Check if a variable is boolean or not

Description

Can be used to test arguments that need to be of logical (boolean) type

Usage

check_bool(variable)

Arguments

`variable`	Variable to test

Value

No return value. Prints an error message if the check fails.

Check that optional column is included

Description

Some functions require additional columns. Check that they are named in the pedigree.

Usage

check_col(names, name)

Arguments

`names`	Column names (all)
`name`	Column name to check.

Value

No return value. Prints an error message if the check fails.

Check purging coefficient

Description

The purging coefficient must be a number between 0 and 0.5

Usage

check_d(d)

Arguments

`d`	Purging coefficient (taking values between 0.0 and 0.5).

Value

No return value. Prints an error message if the check fails.

Check pedigree class

Description

The pedigree must be of object class 'data.frame'.

Usage

check_df(obj)

Arguments

`obj`	Object to test

Value

No return value. Prints an error message if the check fails.

Check columns with inbreeding values

Description

Takes a column name, and checks its use as inbreeding coefficient. It should name a numeric vector, with values in the range [0,1]

Usage

check_Fcol(ped, Fcol, compute = TRUE)

Arguments

`ped`	A dataframe containing the pedigree. Individual (id), maternal (dam), and paternal (sire) identities are mandatory columns.
`Fcol`	Name of column with inbreeding coefficient values. If none is used, inbreeding will be computed.
`compute`	Compute inbreeding if Fcol is NULL

Value

Vector of inbreeding values (if checks are successful)

Check individual index

Description

Renamed individuals must be named by their index (from 1 to N)

Usage

check_index(id)

Arguments

`id`	Column of individual ids.

Value

No return value. Prints an error message if the check fails.

Check if a variable is a positive integer or not

Description

Can be used to test arguments that need to be integers

Usage

check_int(variable)

Arguments

`variable`	Variable to test

Value

No return value. Prints an error message if the check fails.

Check if a variable has length >1

Description

Used to test arguments that need to be of length 1

Usage

check_length(variable, message = "Expected value of length 1")

Arguments

`variable`	Variable to test
`message`	Error message to display

Value

No return value. Prints an error message if the check fails.

Check if a vector contains NA values

Description

Returns a warning when NA values are present.

Usage

check_na(variable)

Arguments

`variable`	Variable to test

Value

No return value. Prints an error message if the check fails.

Check that mandatory column names are included

Description

Columns for id, dam and sire are mandatory. This function checks that they are named in the pedigree. The function works with arbitrary column names (not 'id', 'dam' and 'sire') to work with ped_rename()

Usage

check_names(ped, id_name = "id", dam_name = "dam", sire_name = "sire")

Arguments

`ped`	A dataframe containing the pedigree. Individual (id), maternal (dam), and paternal (sire) identities are mandatory columns.
`id_name`	Column name for individual id.
`dam_name`	Column name for dam.
`sire_name`	Column name for sire.

Value

No return value. Prints an error message if the check fails.

Check Ne

Description

The effective population size (Ne) must be a number higher than 0

Usage

check_Ne(Ne)

Arguments

`Ne`	Effective population size

Value

No return value. Prints an error message if the check fails.

Check if optional column is included

Description

Some functions require additional columns. Check if they are already named in the pedigree.

Usage

check_not_col(names, name)

Arguments

`names`	Column names (all)
`name`	Column name to check.

Value

No return value. Prints an error message if the check fails.

Check observed and expected number of rows

Description

Expected and observed number of rows must be equal.

Usage

check_nrows(df, exp, message = "Expected value of length 1")

Arguments

`df`	Dataframe to test
`exp`	Expected number of rows
`message`	Error message to display

Value

No return value. Prints an error message if the check fails.

Check individual order

Description

Individuals must be sorted from older to younger

Usage

check_order(id, dam, sire, soft_sorting = FALSE)

Arguments

`id`	Vector of individual ids.
`dam`	Vector of dam ids.
`sire`	Vector of sire ids.
`soft_sorting`	If TRUE checking is relaxed, allowing descendants to be declared before ancestors

Value

No return value. Prints an error message if the check fails.
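
The ordering requirement (older individuals first) can be illustrated with a short base R sketch. The helper `is_sorted_ped()` is hypothetical and is not the package's check. It assumes that unknown parents are coded as 0 and only covers the strict case (`soft_sorting = FALSE`).

```r
# Hypothetical helper (not part of purgeR): TRUE if every known parent
# appears in an earlier row than its offspring.
# Assumes unknown parents are coded as 0.
is_sorted_ped <- function(id, dam, sire) {
  row <- seq_along(id)
  pos_dam  <- match(dam, id)    # row of each dam (NA if not found)
  pos_sire <- match(sire, id)   # row of each sire (NA if not found)
  ok_dam  <- dam == 0 | (!is.na(pos_dam) & pos_dam < row)
  ok_sire <- sire == 0 | (!is.na(pos_sire) & pos_sire < row)
  all(ok_dam & ok_sire)
}

# Parents declared before offspring
is_sorted_ped(id = 1:4, dam = c(0, 0, 1, 3), sire = c(0, 0, 2, 2))
# Individual 1 names individual 3 as dam, which is declared later
is_sorted_ped(id = 1:4, dam = c(3, 0, 0, 3), sire = c(0, 0, 2, 2))
```

The first call returns TRUE and the second returns FALSE.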

Check columns with reference individuals

Description

Takes a column name, and checks its use as reference. It should name a boolean vector (or coercible to it), with at least one TRUE value.

Takes a column name, and checks its use as target. It should name a boolean vector (or coercible to it), with at least one TRUE value.

Usage

check_reference(ped, reference)

check_target(ped, reference, target, variable)

Arguments

`ped`	A dataframe containing the pedigree. Individual (id), maternal (dam), and paternal (sire) identities are mandatory columns.
`reference`	A string naming a column indicating whether individuals belong to the reference population or not. Column must be boolean or coercible to boolean type.
`target`	Target column
`variable`	To be used in printed messages

Value

Vector of reference numbers (if checks are successful)

Vector of target numbers (if checks are successful)

Check repeated ids

Description

Individual ids must be unique and cannot be repeated.

Usage

check_repeat_id(id)

Arguments

`id`	Vector of individual ids.

Value

No return value. Prints an error message if the check fails.

Check columns with generation numbers

Description

Takes a column name, and checks its use as generation numbers. It should name a numeric vector, with values >= 0.

Usage

check_tcol(ped, tcol, compute = TRUE, force_int = FALSE)

Arguments

`ped`	A dataframe containing the pedigree. Individual (id), maternal (dam), and paternal (sire) identities are mandatory columns.
`tcol`	Name of column with individual generation times. If none is used, the number of equivalent complete generations is computed.
`compute`	Compute generation numbers if tcol is NULL
`force_int`	Generation numbers must be integers (disabled by default)

Value

Vector of generation numbers (if checks are successful)

Check that mandatory column names are of type int

Description

Columns for id, dam and sire are mandatory, and required to be of type integer

Usage

check_types(id, dam, sire)

Arguments

`id`	Vector of individual ids.
`dam`	Vector of dam ids.
`sire`	Vector of sire ids.

Value

No return value. Prints an error message if the check fails.

Check individuals named zero

Description

An individual id cannot be zero (0). This value is reserved for dams and sires.

Usage

check_zero_id(id)

Arguments

`id`	Vector of individual ids.

Value

No return value. Prints an error message if the check fails.

Dama gazelle pedigree

Description

This data set contains the pedigree of the dama gazelle (Nanger dama). A total of 1316 individuals is included, as well as measurements of biological fitness and other factors (see reference below for details).

Usage

dama

Format

A data frame with records from 1316 individuals (in rows), and 10 variables:

id - Individual identity.
dam - Maternal identity.
sire - Paternal identity.
survival15 - 15-days survival.
prod - Female productivity.
sex - Individual sex.
yob - Year of birth.
pom - Period of management.
target - Individual in the target population.
eeza_id - Individual identity (as recorded in the original studbook)

Source

The original studbook containing the complete and updated pedigree can be found at: http://www.eeza.csic.es/en/programadecria.aspx.

References

López-Cortegano E et al. 2021. Genetic purging in captive endangered ungulates with extremely low effective population sizes. *Heredity*, https://www.nature.com/articles/s41437-021-00473-2.

Darwin/Wedgwood pedigree

Description

This data set contains the pedigree of the Darwin/Wedgwood dynasty. It comprises a total of 63 individuals, including Charles R. Darwin and Francis Galton.

Usage

darwin

Format

A data frame with records from 63 individuals (in rows), and 3 variables:

Individual - Individual identity.
Mother - Mother's identity.
Father - Father's identity.

Source

The pedigree is adapted from Berra et al. (2010)

References

Berra TM et al. 2010. Was the Darwin/Wedgwood dynasty adversely affected by consanguinity?. BioScience 60(5): 376-383.

Individual inbreeding variation

Description

Computes the increase in inbreeding coefficient for a given individual

Usage

delta_Fi(Fi, t)

Arguments

`Fi`	Individual inbreeding coefficient.
`t`	Individual generation number.

Value

Individual variation in inbreeding.

Dorcas gazelle pedigree

Description

This data set contains the pedigree of dorcas gazelle (Gazella dorcas). A total of 1279 individuals is included, as well as measurements of biological fitness and other factors (see reference below for details).

Usage

dorcas

Format

A data frame with records from 1279 individuals (in rows), and 10 variables:

id - Individual identity.
dam - Maternal identity.
sire - Paternal identity.
survival15 - 15-days survival.
prod - Female productivity.
sex - Individual sex.
yob - Year of birth.
pom - Period of management.
target - Individual in the target population.
eeza_id - Individual identity (as recorded in the original studbook)

Source

The original studbook containing the complete and updated pedigree can be found at: http://www.eeza.csic.es/en/programadecria.aspx.

References

López-Cortegano E et al. 2021. Genetic purging in captive endangered ungulates with extremely low effective population sizes. *Heredity*, https://www.nature.com/articles/s41437-021-00473-2.

Expected inbreeding coefficient

Description

Estimates the expected inbreeding coefficient (F) as a function of the effective population size and generation number

Usage

exp_F(Ne, t)

Arguments

`Ne`	Effective population size
`t`	Generation number

Details

Computation of the inbreeding coefficient uses the classical formula:

F(t) = 1 - (1 - 1/(2N))^t
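
As a minimal sketch, the formula can be written directly in base R. The function name `expected_F()` is hypothetical and this is not the package source. It assumes that N in the formula is the effective population size Ne.

```r
# Hypothetical re-implementation of the classical formula (not purgeR code).
# Expected inbreeding after t generations at effective population size Ne.
expected_F <- function(Ne, t) {
  1 - (1 - 1 / (2 * Ne))^t
}

expected_F(Ne = 50, t = 0)           # no inbreeding at generation zero
expected_F(Ne = 50, t = 1)           # equals 1 / (2 * Ne)
expected_F(Ne = c(10, 50), t = 50)   # smaller Ne gives higher F
```

Because the expression is vectorized, both `Ne` and `t` may be vectors.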

Value

The inbreeding coefficient

References

Falconer DS, Mackay TFC. 1996. Introduction to Quantitative Genetics. 4th edition. Longman, Essex, U.K.

Examples

exp_F(Ne = 50, t = 0)
exp_F(Ne = 50, t = 50)
exp_F(Ne = 10, t = 50)

Expected ancestral inbreeding coefficient

Description

Estimates the expected ancestral inbreeding coefficient (Fa) as a function of the effective population size and generation number

Usage

exp_Fa(Ne, t)

Arguments

`Ne`	Effective population size
`t`	Generation number

Details

Computation of the ancestral inbreeding coefficient uses the adaptation from Ballou's (1997) formula, as in López-Cortegano et al. (2018):

Fa(t) = 1 - (1 - 1/(2N))^(t(t-1)/2)
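
A short base R sketch of this expression follows. The function names are hypothetical and this is not the package source. As a consistency check, the closed form is compared with the recursion Fa(t) = Fa(t-1) + [1 - Fa(t-1)] F(t-1). That recursion is an assumption of this sketch: it applies Ballou's formula to two parents with equal coefficients, with F(t) = 1 - (1 - 1/(2N))^t and Fa(0) = 0.

```r
# Hypothetical re-implementation of the closed form (not purgeR code)
expected_Fa <- function(Ne, t) {
  1 - (1 - 1 / (2 * Ne))^(0.5 * (t - 1) * t)
}

# Same quantity by recursion, assuming Fa(0) = 0 and scalar inputs
expected_Fa_rec <- function(Ne, t) {
  Fa <- 0
  for (k in seq_len(t)) {
    F_prev <- 1 - (1 - 1 / (2 * Ne))^(k - 1)   # F(k - 1)
    Fa <- Fa + (1 - Fa) * F_prev
  }
  Fa
}

expected_Fa(Ne = 50, t = 2)   # equals 1 / (2 * Ne)
all.equal(expected_Fa(50, 10), expected_Fa_rec(50, 10))
```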

Value

The ancestral inbreeding coefficient

References

Ballou JD. 1997. Ancestral inbreeding only minimally affects inbreeding depression in mammalian populations. J Hered. 88:169–178.
López-Cortegano E et al. 2018. Detection of genetic purging and predictive value of purging parameters estimated in pedigreed populations. Heredity 121(1): 38-51.

Examples

exp_Fa(Ne = 50, t = 0)
exp_Fa(Ne = 50, t = 50)
exp_Fa(Ne = 10, t = 50)

Expected purged inbreeding coefficient

Description

Estimates the expected purged inbreeding coefficient (g) as a function of the effective population size, generation number, and purging coefficient

Usage

exp_g(Ne, t, d)

Arguments

`Ne`	Effective population size
`t`	Generation number
`d`	Purging coefficient (taking values between 0.0 and 0.5).

Details

Computation of the purged inbreeding coefficient is calculated as in García-Dorado (2012):

g(t) = [(1 - 1/(2N)) g(t-1) + 1/(2N)] * [1 - 2d F(t-1)]

When convergence is reached, the asymptotic value g(a) is returned:

g(a) = (1 - 2d) / (1 + 2d (2N-1))
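
The recursion and its asymptote can be sketched in base R as follows. The function names are hypothetical and this is not the package source. The sketch assumes scalar inputs, a starting value g(0) = 0, and F(t) = 1 - (1 - 1/(2N))^t for the standard inbreeding term.

```r
# Hypothetical re-implementation of the recursion (not purgeR code).
# Assumes scalar inputs and g(0) = 0.
expected_g <- function(Ne, t, d) {
  g <- 0
  for (k in seq_len(t)) {
    F_prev <- 1 - (1 - 1 / (2 * Ne))^(k - 1)   # F(k - 1)
    g <- ((1 - 1 / (2 * Ne)) * g + 1 / (2 * Ne)) * (1 - 2 * d * F_prev)
  }
  g
}

# Asymptotic value reached when the recursion converges
asymptotic_g <- function(Ne, d) {
  (1 - 2 * d) / (1 + 2 * d * (2 * Ne - 1))
}

expected_g(Ne = 50, t = 1, d = 0.15)   # equals 1 / (2 * Ne)
# After many generations the recursion approaches the asymptote
all.equal(expected_g(Ne = 10, t = 2000, d = 0.15), asymptotic_g(10, 0.15))
```

With `d = 0` the second factor equals 1, so the recursion reduces to the expected standard inbreeding F(t).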

Value

The purged inbreeding coefficient

References

García-Dorado. 2012. Understanding and predicting the fitness decline of shrunk populations: Inbreeding, purging, mutation, and standard selection. Genetics 190: 1-16.

Examples

exp_g(Ne = 50, t = 0, d = 0.15)
exp_g(Ne = 50, t = 50, d = 0.15)
exp_g(Ne = 10, t = 50, d = 0.15)

Inbreeding coefficient

Description

Computes the standard inbreeding coefficient (F). This is the probability that two alleles on a locus are identical by descent (Falconer and Mackay 1996, Wright 1922), calculated from the genealogical coancestry matrix (Malécot 1948).

Usage

F(ped, name_to)

Arguments

`ped`	A dataframe containing the pedigree. Individual (id), maternal (dam), and paternal (sire) identities are mandatory columns.
`name_to`	A string naming the new output column.

Value

The input dataframe, plus an additional column named "F" with individual inbreeding coefficient values.

References

Falconer DS, Mackay TFC. 1996. Introduction to Quantitative Genetics. 4th edition. Longman, Essex, U.K.
Malécot G, 1948. Les Mathématiques de l’hérédité. Masson & Cie., Paris.
Wright S. 1922. Coefficients of inbreeding and relationship. The American Naturalist 56: 330-338.
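
To illustrate how F follows from the coancestry matrix, here is a minimal base R sketch of the classical tabular method. It is not the package's implementation. The helper `tabular_F()` is hypothetical and assumes that ids equal the row index (1 to N), that parents precede their offspring, and that unknown parents are coded as 0.

```r
# Hypothetical helper (not part of purgeR): inbreeding coefficients from
# the coancestry matrix, built row by row (tabular method).
# Assumes ids are 1..N, parents precede offspring, unknown parents are 0.
tabular_F <- function(dam, sire) {
  n <- length(dam)
  f <- matrix(0, n, n)   # coancestry matrix
  Fi <- numeric(n)
  for (i in seq_len(n)) {
    dm <- dam[i]
    sr <- sire[i]
    # Inbreeding of i is the coancestry between its parents
    if (dm > 0 && sr > 0) Fi[i] <- f[dm, sr]
    f[i, i] <- 0.5 * (1 + Fi[i])
    # Coancestry with each older individual j is the average of j's
    # coancestry with the two parents of i
    for (j in seq_len(i - 1)) {
      f_dam  <- if (dm > 0) f[j, dm] else 0
      f_sire <- if (sr > 0) f[j, sr] else 0
      f[i, j] <- f[j, i] <- 0.5 * (f_dam + f_sire)
    }
  }
  Fi
}

# Two founders (1, 2), two full sibs (3, 4), and their offspring (5)
tabular_F(dam = c(0, 0, 1, 1, 3), sire = c(0, 0, 2, 2, 4))
```

Only individual 5, the offspring of a full-sib mating, is inbred, with F = 0.25.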

Ancestral inbreeding coefficient

Description

Computes the ancestral inbreeding coefficient (Fa). This is the probability that an allele has been in homozygosity in at least one ancestor (Ballou 1997). A genedrop approach is included to compute unbiased estimates of Fa (Baumung et al. 2015).

Usage

Fa(ped, Fi, name_to, genedrop = 0L, seed = NULL)

Arguments

`ped`	A dataframe containing the pedigree. Individual (id), maternal (dam), and paternal (sire) identities are mandatory columns.
`Fi`	Vector of inbreeding coefficient values
`name_to`	A string naming the new output column.
`genedrop`	Number of genedrop iterations to run. If set to zero (as default), Ballou's Fa is computed.
`seed`	Sets a seed for the random number generator.

Value

The input dataframe, plus an additional column named "Fa" with individual ancestral inbreeding coefficient values.

References

Ballou JD. 1997. Ancestral inbreeding only minimally affects inbreeding depression in mammalian populations. J Hered. 88:169–178.
Baumung et al. 2015. GRAIN: A computer program to calculate ancestral and partial inbreeding coefficients using a gene dropping approach. Journal of Animal Breeding and Genetics 132: 100-108.
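
A minimal base R sketch of Ballou's Fa (the `genedrop = 0` case) is given below. It is not the package's implementation. The helper `ballou_Fa()` is hypothetical. It assumes Ballou's (1997) recursion, in which each parent p contributes Fa(p) + [1 - Fa(p)] F(p) and the two parental contributions are averaged. It also assumes that ids equal the row index, that parents precede their offspring, and that unknown parents are coded as 0.

```r
# Hypothetical helper (not part of purgeR): Ballou's ancestral inbreeding.
# Assumes ids are 1..N, parents precede offspring, unknown parents are 0.
ballou_Fa <- function(dam, sire, Fi) {
  n <- length(dam)
  Fa <- numeric(n)
  # Contribution of one parent: its own ancestral inbreeding, plus its
  # inbreeding for the fraction not already exposed in its ancestors
  parent_term <- function(p) {
    if (p > 0) Fa[p] + (1 - Fa[p]) * Fi[p] else 0
  }
  for (i in seq_len(n)) {
    Fa[i] <- 0.5 * (parent_term(dam[i]) + parent_term(sire[i]))
  }
  Fa
}

# Individual 5 is the offspring of full sibs (F = 0.25), individual 6 is an
# unrelated founder, and individual 7 is the offspring of 5 and 6
dam  <- c(0, 0, 1, 1, 3, 0, 5)
sire <- c(0, 0, 2, 2, 4, 0, 6)
Fi   <- c(0, 0, 0, 0, 0.25, 0, 0)
ballou_Fa(dam, sire, Fi)
```

Individual 7 is not inbred itself, but it has Fa = 0.125 because one of its parents is inbred.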

Partial inbreeding coefficient (core function)

Description

Computes partial inbreeding coefficients, Fi(j). A coefficient Fi(j) can be read as the probability of individual i being homozygous for alleles derived from ancestor j

Usage

Fij_core(ped, ancestors, ancestors_idx, Fi, mapa, ncores = 1, genedrop, seed)

Arguments

`ped`	A dataframe containing the pedigree. Individual (id), maternal (dam), and paternal (sire) identities are mandatory columns.
`ancestors`	Vector of the identities to be assumed as founder ancestors.
`ancestors_idx`	Index of ancestors.
`Fi`	Vector of inbreeding coefficients.
`mapa`	Map of ancestors
`ncores`	Number of cores to use for parallel computing (default = 1)
`genedrop`	Enable genedrop simulation
`seed`	Sets a seed for the random number generator.

Value

A matrix of partial inbreeding coefficients. Fi(j) values can thus be read from row i and column j.

Partial inbreeding coefficient (core function)

Description

Computes partial inbreeding coefficients, Fi(j). A coefficient Fi(j) can be read as the probability of individual i being homozygous for alleles derived from ancestor j

Usage

Fij_core_i_cpp(dam, sire, anc_idx, mapa, Fi, genedrop = 0L, seed = NULL)

Arguments

`dam`	Vector of dam ids.
`sire`	Vector of sire ids.
`anc_idx`	Index of ancestors.
`mapa`	Map of ancestors
`Fi`	Vector of inbreeding coefficients.
`genedrop`	Enable genedrop simulation
`seed`	Sets a seed for the random number generator.

Value

A matrix of partial inbreeding coefficients. Fi(j) values can thus be read from row i and column j.

Purged inbreeding coefficient

Description

Computes the purged inbreeding coefficient (g). This is the probability that two alleles on a locus are identical by descent, but relative to deleterious recessive alleles (García-Dorado 2012). The reduction in g relative to standard inbreeding (F) is given by an effective purging coefficient (d), that measures the strength of the deleterious recessive component in the genome. The coefficient g is computed following the methods for pedigrees in García-Dorado (2012) and García-Dorado et al. (2016).

Usage

g(ped, d, Fi, name_to)

Arguments

`ped`	A dataframe containing the pedigree. Individual (id), maternal (dam), and paternal (sire) identities are mandatory columns.
`d`	Purging coefficient (taking values between 0.0 and 0.5).
`Fi`	Vector of inbreeding coefficient values
`name_to`	A string naming the new output column.

Value

The input dataframe, plus an additional column named "g" followed by the purging coefficient, containing purged inbreeding coefficient values.

References

García-Dorado. 2012. Understanding and predicting the fitness decline of shrunk populations: Inbreeding, purging, mutation, and standard selection. Genetics 190: 1-16.
García-Dorado et al. 2016. Predictive model and software for inbreeding-purging analysis of pedigreed populations. G3 6: 3593-3601.

Deviation from Hardy-Weinberg equilibrium

Description

Computes the deviation from Hardy-Weinberg equilibrium following Caballero and Toro (2000).

Usage

hwd(ped, reference = NULL)

Arguments

`ped`	A dataframe containing the pedigree. Individual (id), maternal (dam), and paternal (sire) identities are mandatory columns.
`reference`	A string naming a column indicating whether individuals belong to the reference population or not. Column must be boolean or coercible to boolean type.

Value

A numeric value indicating the deviation from Hardy-Weinberg equilibrium.

References

Caballero A, Toro M. 2000. Interrelations between effective population size and other pedigree tools for the management of conserved populations. Genet. Res. 75: 331-343.

Index ancestors

Description

Creates a vector of length N (the number of individuals). Only coordinates for valid ancestors will be given.

Usage

idx_ancestors(ids, N)

Arguments

`ids`	Ancestor identities
`N`	Total number of individuals

Value

A logical matrix.

Inbreeding coefficient

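Description

Computes the standard inbreeding coefficient (F). This is the probability that two alleles on a locus are identical by descent (Falconer and Mackay 1996, Wright 1922), calculated from the genealogical coancestry matrix (Malécot 1948).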

Usage

ip_F(ped, name_to = "Fi")

Arguments

`ped`	A dataframe containing the pedigree. Individual (id), maternal (dam), and paternal (sire) identities are mandatory columns.
`name_to`	A string naming the new output column.

Value

The input dataframe, plus an additional column with individual inbreeding coefficient values (named "Fi" by default).

References

Falconer DS, Mackay TFC. 1996. Introduction to Quantitative Genetics. 4th edition. Longman, Essex, U.K.
Malécot G, 1948. Les Mathématiques de l’hérédité. Masson & Cie., Paris.
Wright S. 1922. Coefficients of inbreeding and relationship. The American Naturalist 56: 330-338.

Examples

data(dama)
dama <- ip_F(dama)
tail(dama)

Ancestral inbreeding coefficient

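Description

Computes the ancestral inbreeding coefficient (Fa). This is the probability that an allele has been in homozygosity in at least one ancestor (Ballou 1997). A genedrop approach is included to compute unbiased estimates of Fa (Baumung et al. 2015).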

Usage

ip_Fa(ped, name_to = "Fa", genedrop = 0, seed = NULL, Fcol = NULL)

Arguments

`ped`	A dataframe containing the pedigree. Individual (id), maternal (dam), and paternal (sire) identities are mandatory columns.
`name_to`	A string naming the new output column.
`genedrop`	Number of genedrop iterations to run. If set to zero (as default), Ballou's Fa is computed.
`seed`	Sets a seed for the random number generator.
`Fcol`	Name of column with inbreeding coefficient values. If none is used, inbreeding will be computed.

Value

The input dataframe, plus an additional column with individual ancestral inbreeding coefficient values (named "Fa" by default).

References

Ballou JD. 1997. Ancestral inbreeding only minimally affects inbreeding depression in mammalian populations. J Hered. 88:169–178.
Baumung et al. 2015. GRAIN: A computer program to calculate ancestral and partial inbreeding coefficients using a gene dropping approach. Journal of Animal Breeding and Genetics 132: 100-108.

Examples

data(dama)
# dama <- ip_Fa(dama) # Compute F on the go (won't be kept in the pedigree).
dama <- ip_F(dama)
dama <- ip_Fa(dama, Fcol = 'Fi') # If F is computed in advance.
tail(dama)

Partial inbreeding coefficient

Description

Computes partial inbreeding coefficients, Fi(j). A coefficient Fi(j) can be read as the probability of individual i being homozygous for alleles derived from ancestor j. It is calculated following the tabular method described by Gulisija & Crow (2007). Optionally, it can be estimated via genedrop simulation.

Usage

ip_Fij(
  ped,
  mode = "founders",
  ancestors = NULL,
  Fcol = NULL,
  genedrop = 0,
  seed = NULL,
  ncores = 1L
)

Arguments

`ped`	A dataframe containing the pedigree. Individual (id), maternal (dam), and paternal (sire) identities are mandatory columns.
`mode`	Defines the set of ancestors considered when computing partial inbreeding. It can be set as: "founders" for inbreeding conditional on founders only (default), "all" for all individuals in the pedigree (it may take a long time to compute in large pedigrees), and "custom" for individual identities given in an integer vector (see 'ancestors' argument).
`ancestors`	Under the "custom" run mode, it defines a vector of ancestors that will be considered when computing partial inbreeding values.
`Fcol`	Name of column with inbreeding coefficient values. If none is used, inbreeding will be computed.
`genedrop`	Number of genedrop iterations to run. If set to zero (as default), exact coefficients are computed.
`seed`	Sets a seed for the random number generator (only if genedrop is enabled).
`ncores`	Number of cores to use for parallel computing (default = 1)

Value

A matrix of partial inbreeding coefficients. Fi(j) values can thus be read from row i and column j. In the resultant matrix, there are as many rows as individuals in the pedigree, and as many columns as ancestors used. Columns will be named and sorted by ancestor identity.

References

Gulisija D, Crow JF. 2007. Inferring purging from pedigree data. Evolution 61(5): 1043-1051.

Examples

# Original pedigree file in Gulisija & Crow (2007)
pedigree <- tibble::tibble(
  id = c("M", "K", "J", "a", "c", "b", "e", "d", "I"),
  dam = c("0", "0", "0", "K", "M", "a", "c", "c", "e"),
  sire = c("0", "0", "0", "J", "a", "J", "b", "b", "d")
)
pedigree <- purgeR::ped_rename(pedigree, keep_names = TRUE)

# Partial inbreeding relative to founder ancestors
m <- ip_Fij(pedigree)
# Note that in the example above, the sum of the values in
# rows will equal the vector of inbreeding coefficients
# i.e. base::rowSums(m) equals purgeR::ip_F(pedigree)$Fi

# Compute partial inbreeding relative to an arbitrary ancestor
# with id = 3 (i.e. individual named "J")
anc <- as.integer(c(3))
m <- ip_Fij(pedigree, mode = "custom", ancestors = anc)

Purged inbreeding coefficient

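Description

Computes the purged inbreeding coefficient (g). This is the probability that two alleles on a locus are identical by descent, but relative to deleterious recessive alleles (García-Dorado 2012). The reduction in g relative to standard inbreeding (F) is given by an effective purging coefficient (d), that measures the strength of the deleterious recessive component in the genome. The coefficient g is computed following the methods for pedigrees in García-Dorado (2012) and García-Dorado et al. (2016).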

Usage

ip_g(ped, d, name_to = "g<d>", Fcol = NULL)

Arguments

`ped`	A dataframe containing the pedigree. Individual (id), maternal (dam), and paternal (sire) identities are mandatory columns.
`d`	Purging coefficient (taking values between 0.0 and 0.5).
`name_to`	A string naming the new output column.
`Fcol`	Name of column with inbreeding coefficient values. If none is used, inbreeding will be computed.

Value

The input dataframe, plus an additional column containing purged inbreeding coefficient values (named "g" and followed by the purging coefficient value by default).

References

García-Dorado. 2012. Understanding and predicting the fitness decline of shrunk populations: Inbreeding, purging, mutation, and standard selection. Genetics 190: 1-16.
García-Dorado et al. 2016. Predictive model and software for inbreeding-purging analysis of pedigreed populations. G3 6: 3593-3601.

Examples

data(dama)
dama <- ip_g(dama, d = 0.23)
tail(dama)

Opportunity of purging

Description

The potential reduction in individual inbreeding load can be estimated by means of the opportunity of purging (O) and expressed opportunity of purging (Oe) parameters described by Gulisija and Crow (2007). Whereas O relates to the total potential reduction of the inbreeding load in an individual, as a consequence of it having inbred ancestors, Oe relates to the expressed potential reduction of the inbreeding load. Only Oe is computed by default. Estimates of O and Oe need to be corrected in complex pedigrees (see Details below). Both corrected (named "O" and "Oe" by default), and non-corrected (suffixed with "_raw") are returned.

Usage

ip_op(
  ped,
  name_Oe = "Oe",
  compute_O = FALSE,
  name_O = "O",
  Fcol = NULL,
  ncores = 1L,
  genedrop = 0,
  seed = NULL,
  complex = NULL
)

Arguments

`ped`	A dataframe containing the pedigree. Individual (id), maternal (dam), and paternal (sire) identities are mandatory columns.
`name_Oe`	A string naming the new output column for the expressed opportunity of purging (defaults to "Oe")
`compute_O`	Enable computation of total opportunity of purging (disabled by default)
`name_O`	A string naming the new output column for total opportunity of purging (defaults to "O")
`Fcol`	Name of column with inbreeding coefficient values. If none is used, inbreeding will be computed.
`ncores`	Number of cores to use for parallel computing (default = 1)
`genedrop`	Number of genedrop iterations run to compute partial inbreeding. If set to zero (the default), exact coefficients are computed.
`seed`	Sets a seed for the random number generator (only if genedrop is enabled).
`complex`	Enable correction for complex pedigrees (deprecated in v1.3, both raw and corrected measures of "Oe" are returned now).

Details

The model used here assumes fully recessive alleles of large effect (Gulisija and Crow, 2007).

In simple pedigrees, the opportunity of purging (O) and the expressed opportunity of purging (Oe) are estimated as in Gulisija and Crow (2007). For complex pedigrees involving more than one autozygous individual per path from an individual to an ancestor, O and Oe in the closer ancestors need to be discounted for what was already accounted for in their predecessors. To solve this problem, Gulisija and Crow (2007) provide expressions to correct O and Oe (see equations 21 and 22 in the manuscript).

Here, a heuristic approach is used to prevent the inflation of O and Oe while avoiding additional looped expressions that may result in an excessive computational cost. To do so, only the contribution of the most recent ancestors in a path is considered. Specifically, the method skips contributions from "far" ancestors k such that Fj(k) > 0, where j is an intermediate ancestor, both relative to an individual i of interest. Fj(k) refers to the partial inbreeding of j for alleles derived from k (see ip_Fij). This may not provide exact values of O and Oe, but little bias is expected, since more distant ancestors also contribute less to O and Oe.

Both types of estimates (corrected and non-corrected) are returned; non-corrected estimates are suffixed with "_raw".

Value

The input dataframe, plus additional columns containing Oe and Oe_raw estimates (additional columns for O can be appended by setting compute_O = TRUE).

References

Gulisija D, Crow JF. 2007. Inferring purging from pedigree data. Evolution 61(5): 1043-1051.

Examples

# Original pedigree file in Gulisija & Crow (2007)
pedigree <- tibble::tibble(
  id = c("M", "K", "J", "a", "c", "b", "e", "d", "I"),
  dam = c("0", "0", "0", "K", "M", "a", "c", "c", "e"),
  sire = c("0", "0", "0", "J", "a", "J", "b", "b", "d")
)
pedigree <- purgeR::ped_rename(pedigree, keep_names = TRUE)
ip_op(pedigree, compute_O = TRUE)

Map ancestors

Description

Creates a logical matrix that indicates whether an individual i (in columns) is an ancestor of another individual j (in rows). For example, matrix[, 1] indicates the descendants of id = 1, and matrix[1, ] indicates the ancestors of id = 1.

Usage

map_ancestors(ped, idx)

Arguments

`ped`	A dataframe containing the pedigree. Individual (id), maternal (dam), and paternal (sire) identities are mandatory columns.
`idx`	Index of ancestors to map

Value

A logical matrix.
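
The layout of such a matrix can be illustrated with a minimal base R sketch. It assumes a pedigree already sorted with ancestors before descendants, integer identities from 1 to N, and unknown parents coded as 0. The helper name ancestor_matrix is hypothetical and is not part of purgeR; the package's own implementation may differ.

```r
# Hypothetical helper (not purgeR's implementation): m[j, i] is TRUE
# when individual i is an ancestor of individual j.
ancestor_matrix <- function(dam, sire) {
  n <- length(dam)
  m <- matrix(FALSE, nrow = n, ncol = n)
  for (j in seq_len(n)) {
    for (p in c(dam[j], sire[j])) {
      if (p > 0) {
        m[j, p] <- TRUE            # the parent itself
        m[j, ] <- m[j, ] | m[p, ]  # plus all ancestors of the parent
      }
    }
  }
  m
}

# Gulisija & Crow (2007) pedigree, renamed as
# M = 1, K = 2, J = 3, a = 4, c = 5, b = 6, e = 7, d = 8, I = 9
dam  <- c(0, 0, 0, 2, 1, 4, 5, 5, 7)
sire <- c(0, 0, 0, 3, 4, 3, 6, 6, 8)
m <- ancestor_matrix(dam, sire)
which(m[9, ])  # ancestors of individual 9 ("I")
which(m[, 3])  # descendants of individual 3 ("J")
```

Rows are read as "ancestors of" and columns as "descendants of", matching the description above.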

Realized effective population size (mean)

Description

Computes the mean realized effective population size. Note that this function expects a mean delta_F value for all individuals in the reference population.

Computes the standard error of the realized effective population size. Note this function expects the mean and standard deviation of delta F, as well as the total number of individuals in the reference population

Usage

Ne_delta(delta)

se_Ne_delta(delta)

Arguments

`delta`	Vector of individual variations in inbreeding.

Value

Mean effective population size.

Standard error of the effective population size.
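
As an illustration of these two quantities, the following base R sketch assumes the expressions usually attributed to Gutiérrez et al. (2008, 2009): Ne = 1 / (2 * mean(delta)) and se(Ne) = 2 * Ne^2 * sd(delta) / sqrt(N). The function names ne_mean and ne_se are hypothetical, and it is an assumption that Ne_delta and se_Ne_delta follow exactly these formulas.

```r
# Hypothetical helpers (assumed formulas, not purgeR's code).
ne_mean <- function(delta) {
  1 / (2 * mean(delta))
}

ne_se <- function(delta) {
  ne <- ne_mean(delta)
  2 * ne^2 * stats::sd(delta) / sqrt(length(delta))
}

# Individual increases in inbreeding for four individuals
delta <- c(0.01, 0.02, 0.03, 0.02)
ne_mean(delta)  # mean(delta) = 0.02, so Ne = 25
ne_se(delta)
```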

Opportunity of purging

Description

The potential reduction in individual inbreeding load can be estimated by means of the opportunity of purging (O) and expressed opportunity of purging (Oe) parameters described by Gulisija and Crow (2007). Whereas O relates to the total potential reduction of the inbreeding load in an individual, as a consequence of it having inbred ancestors, Oe relates to the expressed potential reduction of the inbreeding load. In both cases, these measures refer to fully recessive alleles of large effect (e.g. lethals). For complex pedigrees, involving more than one autozygous individual per path from a reference individual to an ancestor, these measures are estimated following a heuristic approach (see Details below).

Usage

op(ped, pi, Fi, name_O, name_Oe, sufix, compute_O = FALSE)

Arguments

`ped`	A dataframe containing the pedigree. Individual (id), maternal (dam), and paternal (sire) identities are mandatory columns.
`pi`	Partial inbreeding matrix
`Fi`	Vector of inbreeding coefficient values
`name_O`	A string naming the new output column for total opportunity of purging (defaults to "O")
`name_Oe`	A string naming the new output column for the expressed opportunity of purging (defaults to "Oe")
`sufix`	A string naming the suffix for non-corrected O and Oe measures
`compute_O`	Enable computation of total opportunity of purging (false by default)

Details

Here, a heuristic approach is used to prevent the inflation of O and Oe while avoiding additional looped expressions that may result in an excessive computational cost. To do so, when using ip_op(complex = TRUE), only the contribution of the most recent ancestors in a path is considered. This may not provide exact values of O and Oe, but little bias is expected, since more distant ancestors also contribute less to O and Oe.

Value

The input dataframe, plus two additional columns named "O" and "Oe", containing total and expressed opportunity of purging measures.

References

Gulisija D, Crow JF. 2007. Inferring purging from pedigree data. Evolution 61(5): 1043-1051.

Remove individuals not used for purging analyses

Description

Remove individuals that are not necessary for purging analyses involving fitness. This will reduce the size of the pedigree, and speed up the computation of inbreeding parameters. Individuals removed include those with unknown (NA) values of a given parameter, as long as they do not have any descendant in the pedigree with known values of that parameter. Cleaned pedigrees will automatically have individual identities renamed (see ped_rename).

Usage

ped_clean(ped, value_from)

Arguments

`ped`	A dataframe containing the pedigree. Individual (id), maternal (dam), and paternal (sire) identities are mandatory columns.
`value_from`	Name of the column of interest.

Value

A dataframe with the pedigree cleaned for the specified parameter (column).

Examples

data(arrui)
nrow(arrui)
arrui <- ped_clean(arrui, "survival15")
nrow(arrui)

Input for igraph

Description

Processes a pedigree into a list with two objects, one dataframe of edges, and a dataframe of vertices, which can be used as input for functions of the igraph package.

Usage

ped_graph(ped)

Arguments

`ped`	A dataframe containing the pedigree. Individual (id), maternal (dam), and paternal (sire) identities are mandatory columns.

Value

A list with one dataframe 'edges' and another 'vertices', each following igraph format.

The 'edges' dataframe will contain two columns in addition to the defaults "from" and "to": 1) 'from_parent' indicates whether the vertex from which the edge originates represents a mother ("dam") or a father ("sire"). 2) 'to_parent' indicates whether the vertex to which the edge is directed represents a mother ("dam"), father ("sire") or none ("NA").

Examples

data(atlas)
atlas_graph <- ped_graph(atlas)
G <- igraph::graph_from_data_frame(d = atlas_graph$edges,
                                   vertices = atlas_graph$vertices,
                                   directed = TRUE)

Maternal effects

Description

Assigns to every individual in the pedigree the maternal (or paternal) value of an observed variable of interest.

Usage

ped_maternal(ped, value_from, name_to, use_dam = TRUE, set_na = NULL)

Arguments

`ped`	A dataframe containing the pedigree. Individual (id), maternal (dam), and paternal (sire) identities are mandatory columns.
`value_from`	Name of the column of interest.
`name_to`	A string naming the new output column.
`use_dam`	Extract maternal values. If false, paternal values are returned.
`set_na`	When maternal values are unknown, NA values are generated by default. This option allows setting a different value.

Value

The input dataframe, plus an additional column with maternal (or paternal) values of a variable of interest.
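
The lookup described here can be sketched in base R as follows. The helper parent_value and the toy column weight are hypothetical, and unknown parents are assumed to be coded as 0; this is an illustration of the idea rather than the code used by ped_maternal.

```r
# Hypothetical helper (not purgeR's implementation): returns, for each
# individual, the value observed in its dam (or sire).
parent_value <- function(ped, value_from, use_dam = TRUE, set_na = NA) {
  parent <- if (use_dam) ped$dam else ped$sire
  idx <- match(parent, ped$id)   # NA when the parent is unknown (coded 0)
  out <- ped[[value_from]][idx]
  out[is.na(idx)] <- set_na      # replacement value for unknown parents
  out
}

ped <- data.frame(
  id     = 1:5,
  dam    = c(0, 0, 1, 1, 3),
  sire   = c(0, 0, 2, 2, 4),
  weight = c(10, 12, 11, 13, 9)
)
ped$w_dam  <- parent_value(ped, "weight")
ped$w_sire <- parent_value(ped, "weight", use_dam = FALSE, set_na = 0)
ped
```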

Examples

# To assign maternal inbreeding as a new variable, we can do as follows:
data(dama)
dama <- ip_F(dama)
dama <- ped_maternal(dama, value_from = "Fi", name_to = "Fdam")
tail(dama)

Rename individuals in a pedigree from 1 to N

Description

Functions in purgeR require individuals to be named with integers from 1 to N. This function takes a dataframe containing a pedigree and renames individuals, whose names may be in any format, to the format required by other functions in purgeR. The process also checks that the pedigree format is suitable for other functions in the package.

Usage

ped_rename(ped, id = "id", dam = "dam", sire = "sire", keep_names = FALSE)

Arguments

`ped`	A dataframe containing the pedigree. Individual (id), maternal (dam), and paternal (sire) identities are mandatory columns.
`id`	A string naming the column with individual identities. It will be renamed to its default value 'id'.
`dam`	A string naming the column with maternal identities. It will be renamed to its default value 'dam'.
`sire`	A string naming the column with paternal identities. It will be renamed to its default value 'sire'.
`keep_names`	A boolean value indicating whether the original identity values should be kept on a separate column (named 'names'), or not.

Value

A dataframe with the pedigree's identities renamed.

Examples

data(darwin)
darwin <- ped_rename(darwin, id = "Individual", dam = "Mother", sire = "Father", keep_names = TRUE)
head(darwin)

Sort individuals (with ancestors on top of descendants)

Description

Individuals can be sorted according to the pedigree structure, without need of birth dates. In the sorted pedigree, descendants will always be placed in rows with higher index number than that of their ancestors. This way, individuals born first will tend to be in the top of the pedigree. Younger individuals, and individuals with no descendants will tend to be placed at the bottom. This function uses the sorting algorithm developed by Zhang et al (2009). After sorting, individuals will be renamed from 1 to N using ped_rename.

Usage

ped_sort(ped, id = "id", dam = "dam", sire = "sire", keep_names = FALSE)

Arguments

`ped`	A dataframe containing the pedigree. Individual (id), maternal (dam), and paternal (sire) identities are mandatory columns.
`id`	A string naming the column with individual identities. It will be renamed to its default value 'id'.
`dam`	A string naming the column with maternal identities. It will be renamed to its default value 'dam'.
`sire`	A string naming the column with paternal identities. It will be renamed to its default value 'sire'.
`keep_names`	A boolean value indicating whether the original identity values should be kept on a separate column (named 'names'), or not.

Value

A sorted pedigree dataframe (with ancestors on top of descendants).
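
The ordering property (every known parent sits in a row above its offspring) can be illustrated with a simple depth-based sort in base R. This sketch is not the Zhang et al. (2009) algorithm used by ped_sort; the helper sort_by_depth is hypothetical, and unknown parents are assumed to be coded as "0".

```r
# Hypothetical helper: order rows by pedigree depth, where founders have
# depth 0 and any other individual has 1 + the largest depth of its parents.
sort_by_depth <- function(ped) {
  depth <- rep(0L, nrow(ped))
  # At most nrow(ped) passes are needed in a pedigree without cycles
  for (k in seq_len(nrow(ped))) {
    d_dam  <- depth[match(ped$dam,  ped$id)]
    d_sire <- depth[match(ped$sire, ped$id)]
    new_depth <- pmax(d_dam + 1L, d_sire + 1L, 0L, na.rm = TRUE)
    if (identical(new_depth, depth)) break
    depth <- new_depth
  }
  ped[order(depth), , drop = FALSE]
}

# Gulisija & Crow (2007) pedigree with rows in reverse order
ped <- data.frame(
  id   = c("I", "e", "d", "c", "b", "a", "M", "K", "J"),
  dam  = c("e", "c", "c", "M", "a", "K", "0", "0", "0"),
  sire = c("d", "b", "b", "a", "J", "J", "0", "0", "0"),
  stringsAsFactors = FALSE
)
sorted <- sort_by_depth(ped)

# Check the property: known parents always appear in earlier rows
pos <- seq_len(nrow(sorted))
dam_above  <- match(sorted$dam,  sorted$id) < pos
sire_above <- match(sorted$sire, sorted$id) < pos
all(dam_above, na.rm = TRUE) && all(sire_above, na.rm = TRUE)
```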

References

Zhang Z, Li C, Todhunter RJ, Lust G, Goonewardene L, Wang Z. 2009. An algorithm to sort complex pedigrees chronologically without birthdates. J Anim Vet Adv. 8 (1): 177-182.

Examples

data(darwin)
# Here we reshuffle rows in the pedigree. It won't be usable for other functions in the package
darwin <- darwin[sample(1:nrow(darwin)), ]
# Below, we sort the pedigree again. The order might not be the same as before.
# But ancestors will always be placed on top of descendants,
# making the pedigree usable for other functions in the package.
darwin <- ped_sort(darwin, id = "Individual", dam = "Mother", sire = "Father", keep_names = TRUE)

Sorting steps

Description

Recursive function that computes steps for sorting algorithm described by Zhang et al (2009).

Usage

sort_step(p, id, dam, sire, t, S, G, t_G)

Arguments

`p`	Pedigree to sort (used as template)
`id`	A string naming the column with individual identities. It will be renamed to its default value 'id'.
`dam`	A string naming the column with maternal identities. It will be renamed to its default value 'dam'.
`sire`	A string naming the column with paternal identities. It will be renamed to its default value 'sire'.
`t`	Template for the new sorted pedigree
`S`	Vector of assumed parent individuals
`G`	Vector of generation numbers (0 identifies the youngest)
`t_G`	Vector G for the new sorted pedigree

Value

Filled template for the sorted pedigree. Once recursion ends, the sorted pedigree is returned.

References

Zhang Z, Li C, Todhunter RJ, Lust G, Goonewardene L, Wang Z. 2009. An algorithm to sort complex pedigrees chronologically without birthdates. J Anim Vet Adv. 8 (1): 177-182.

Deviation from Hardy-Weinberg equilibrium

Description

Computes the deviation from Hardy-Weinberg equilibrium following Caballero and Toro (2000).

Usage

pop_hwd(ped, reference = NULL)

Arguments

`ped`	A dataframe containing the pedigree. Individual (id), maternal (dam), and paternal (sire) identities are mandatory columns.
`reference`	A string naming a column indicating whether individuals belong to the reference population or not. Column must be boolean or coercible to boolean type.

Value

A numeric value indicating the deviation from Hardy-Weinberg equilibrium.

References

Caballero A, Toro M. 2000. Interrelations between effective population size and other pedigree tools for the management of conserved populations. Genet. Res. 75: 331-343.

Examples

data(atlas)
pop_hwd(atlas)

Population founders and ancestors

Description

Estimates the total and effective number of founders and ancestors in a pedigree, as well as the number of founder genome equivalents (see Details for these parameters). Note that a reference population (RP) must be defined, so that founders and ancestors refer to the set of individuals belonging to that RP. The RP is set by means of a boolean column of the pedigree, whose name is passed as the reference argument.

Usage

pop_Nancestors(ped, reference, nboot = 10000L, seed = NULL, skip_Ng = FALSE)

Arguments

`ped`	A dataframe containing the pedigree. Individual (id), maternal (dam), and paternal (sire) identities are mandatory columns.
`reference`	A string naming a column indicating whether individuals belong to the reference population or not. Column must be boolean or coercible to boolean type.
`nboot`	Number of bootstrap iterations (for computing Ng).
`seed`	Sets a seed for the random number generator.
`skip_Ng`	Skip Ng computation or not (FALSE by default).

Details

The total numbers of founders (Nf) and ancestors (Na) are calculated simply as the counts of founders and ancestors of individuals belonging to the reference population (RP). Founders here are defined as individuals with both parents unknown.

The effective number of founders (Nfe) is the number of equally contributing founders, that would account for observed genetic diversity in the RP, while the effective number of ancestors (Nae) is defined as the minimum number of ancestors, founders or not, required to account for the genetic diversity observed in the RP. These parameters are computed from the probability of gene origin, following methods in Tahmoorespur and Sheikhloo (2011).
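
As a worked illustration of Nfe, the base R sketch below computes expected founder contributions p to a reference population and then Nfe = 1 / sum(p^2), the usual expression based on probabilities of gene origin. It assumes a sorted pedigree with identities 1 to N, unknown parents coded as 0, and individuals with either both parents known or both unknown. The helper founder_contributions is hypothetical and is not the routine used by pop_Nancestors.

```r
# Hypothetical helper: expected contribution of each founder to the
# individuals listed in rp (row indexes of the reference population).
founder_contributions <- function(dam, sire, rp) {
  n <- length(dam)
  founders <- which(dam == 0 & sire == 0)
  contrib <- matrix(0, nrow = n, ncol = length(founders))
  contrib[cbind(founders, seq_along(founders))] <- 1
  for (i in setdiff(seq_len(n), founders)) {
    # An individual receives half of each parent's founder contributions
    contrib[i, ] <- (contrib[dam[i], ] + contrib[sire[i], ]) / 2
  }
  p <- colMeans(contrib[rp, , drop = FALSE])
  names(p) <- founders
  p
}

# Gulisija & Crow (2007) pedigree (M = 1, K = 2, J = 3, ..., I = 9),
# taking individual 9 as the reference population
dam  <- c(0, 0, 0, 2, 1, 4, 5, 5, 7)
sire <- c(0, 0, 0, 3, 4, 3, 6, 6, 8)
p <- founder_contributions(dam, sire, rp = 9)
p                      # contributions of founders 1, 2 and 3
nfe <- 1 / sum(p^2)    # effective number of founders
nfe
```

In this toy pedigree the three founders contribute unequally (0.25, 0.25 and 0.5), so Nfe falls below the total number of founders.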

While the previous parameters account for diversity loss due to bottlenecks at the level of founders or ancestors, other sources of random loss of alleles (such as drift) can be accounted for by means of the number of founder genome equivalents (Ng, Caballero and Toro 2000). This parameter is estimated via Monte Carlo simulation of allele segregation, following Boichard et al. (1997).

Value

A dataframe containing population size estimates for founders and ancestors:

Nr - Total number of individuals in the RP
Nf - Total number of founders
Nfe - Effective number of founders
Na - Total number of ancestors
Nae - Effective number of ancestors
Ng - Number of founder genome equivalents
se_Ng - Standard error of Ng

If one of the auxiliary functions is used (e.g. pop_Nr), only the corresponding numerical estimate is returned. In the case of pop_Ng, a list is returned, with the number of founder genome equivalents (Ng) and its standard error (se_Ng).

References

Boichard D, Maignel L, Verrier E. 1997. The value of using probabilities of gene origin to measure genetic variability in a population. Genet. Sel. Evol. 29: 5-23.
Caballero A, Toro M. 2000. Interrelations between effective population size and other pedigree tools for the management of conserved populations. Genet. Res. 75: 331-343.
Tahmoorespur M, Sheikhloo M. 2011. Pedigree analysis of the closed nucleus of Iranian Baluchi sheep. Small Rumin. Res. 99: 1-6.

Examples

data(arrui)
pop_Nancestors(arrui, reference = "target", skip_Ng = TRUE)

Effective population size

Description

Estimate the effective population size (Ne). This is computed from the increase in individual inbreeding, following the method described by Gutiérrez et al (2008, 2009).

Usage

pop_Ne(ped, Fcol, tcol)

Arguments

`ped`	A dataframe containing the pedigree. Individual (id), maternal (dam), and paternal (sire) identities are mandatory columns.
`Fcol`	Name of column with inbreeding coefficient values.
`tcol`	Name of column with generation numbers.

Value

A list with the effective population size (Ne) and its standard error (se_Ne).
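
The "increase in individual inbreeding" mentioned in the description can be sketched in base R. The expression below, delta_F = 1 - (1 - F)^(1 / (t - 1)), is the form commonly attributed to Gutiérrez et al. (2009); whether pop_Ne uses exactly this expression, and how it treats individuals with t <= 1, are assumptions here. The helper name delta_F is hypothetical.

```r
# Hypothetical helper (assumed formula): individual increase in inbreeding
# from the inbreeding coefficient Fi and the number of equivalent complete
# generations t. Individuals with t <= 1 are returned as NA.
delta_F <- function(Fi, t) {
  out <- rep(NA_real_, length(Fi))
  ok <- t > 1
  out[ok] <- 1 - (1 - Fi[ok])^(1 / (t[ok] - 1))
  out
}

Fi <- c(0, 0.25, 0.125)
t  <- c(0, 2, 3)
d  <- delta_F(Fi, t)
d
```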

References

Gutiérrez JP, Cervantes I, Molina A, Valera M, Goyache F. 2008. Individual increase in inbreeding allows estimating effective sizes from pedigrees. Genet. Sel. Evol. 40: 359-378.
Gutiérrez JP, Cervantes I, Goyache F. 2009. Improving the estimation of realized effective population sizes in farm animals. J. Anim. Breed. Genet. 126: 327-332.

Examples

data(atlas)
atlas <- ip_F(atlas) # compute inbreeding, appending column "Fi"
atlas <- pop_t(atlas) # compute generations, appending column "t"
pop_Ne(atlas, Fcol = "Fi", tcol = "t")

Number of equivalent complete generations

Description

Computes the number of equivalent complete generations (t), as defined by Boichard et al (1997).

Usage

pop_t(ped, name_to = "t")

Arguments

`ped`	A dataframe containing the pedigree. Individual (id), maternal (dam), and paternal (sire) identities are mandatory columns.
`name_to`	A string naming the new output column.

Value

The input dataframe, plus an additional column corresponding to the number of equivalent complete generations of every individual (named "t" by default).
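
The number of equivalent complete generations is the sum of (1/2)^n over all known ancestors of an individual, where n is the number of generations separating the individual from each ancestor. A minimal base R sketch of the equivalent recursion follows, assuming a sorted pedigree with identities 1 to N and unknown parents coded as 0. The helper equiv_generations is hypothetical and not taken from purgeR.

```r
# Hypothetical helper: each known parent adds (t_parent + 1) / 2,
# which is equivalent to summing (1/2)^n over all known ancestors.
equiv_generations <- function(dam, sire) {
  n <- length(dam)
  tg <- numeric(n)
  for (i in seq_len(n)) {
    if (dam[i] > 0)  tg[i] <- tg[i] + (tg[dam[i]] + 1) / 2
    if (sire[i] > 0) tg[i] <- tg[i] + (tg[sire[i]] + 1) / 2
  }
  tg
}

# Gulisija & Crow (2007) pedigree (M = 1, K = 2, J = 3, ..., I = 9)
dam  <- c(0, 0, 0, 2, 1, 4, 5, 5, 7)
sire <- c(0, 0, 0, 3, 4, 3, 6, 6, 8)
tg <- equiv_generations(dam, sire)
tg
```

For example, individual 5 has two known parents (1/2 each) and two known grandparents (1/4 each), giving t = 1.5.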

References

Boichard D, Maignel L, Verrier E. 1997. The value of using probabilities of gene origin to measure genetic variability in a population. Genet. Sel. Evol., 29: 5-23.

Examples

data(dama)
dama <- pop_t(dama)
tail(dama)

purgeR: Estimation of inbreeding-purging genetic parameters in pedigreed populations

Description

The purgeR package includes functions for the computation of parameters related to inbreeding and genetic purging in pedigreed populations, including standard, ancestral and purged inbreeding coefficients, among other measures of inbreeding and purging. In addition, functions to compute the effective population size and other parameters relevant to population genetics and structure are included.

Details

A complete user's guide with examples is provided as vignettes, introducing functions in this package and providing examples of use. Navigate these vignettes from R with:

browseVignettes("purgeR")

There are currently two vignettes available:

purgeR-tutorial: A complete overview of all functions in the package, including easy to follow examples.
ip: A more advanced guide showing examples of inbreeding purging analyses.

Functions

Preprocessing

ped_rename: Rename individuals in a pedigree from 1 to N
ped_sort: Sort individuals (with ancestors on top of descendants)
ped_clean: Remove individuals not used for purging analyses
ped_maternal: Maternal effects
ped_graph: Input for igraph

Inbreeding and purging

ip_F: Inbreeding coefficient
ip_Fa: Ancestral inbreeding coefficient
ip_Fij: Partial inbreeding coefficient
ip_g: Purged inbreeding coefficient
ip_op: Opportunity of purging
exp_F: Expected inbreeding coefficient
exp_Fa: Expected ancestral inbreeding coefficient
exp_g: Expected purged inbreeding coefficient

Population parameters

pop_hwd: Deviation from Hardy-Weinberg equilibrium
pop_t: Number of equivalent complete generations
pop_Ne: Effective population size
pop_Nancestors: Population founders and ancestors
pop_Na: Total number of ancestors
pop_Nae: Effective number of ancestors
pop_Nf: Total number of founders
pop_Nfe: Effective number of founders
pop_Ng: Number of founder genome equivalents

Fitness

w_grandoffspring: Grandoffspring
w_offspring: Offspring
w_reproductive_value: Reproductive value

Author(s)

Eugenio López-Cortegano <elcortegano@gmail.com> (ORCID)

References

López-Cortegano E. 2022. purgeR: Inbreeding and purging in pedigreed populations. Bioinformatics, https://doi.org/10.1093/bioinformatics/btab599.

Reproductive value

Description

Computes the reproductive value

Usage

reproductive_value(
  ped,
  reference,
  name_to,
  target = NULL,
  enable_correction = TRUE
)

Arguments

`ped`	A dataframe containing the pedigree. Individual (id), maternal (dam), and paternal (sire) identities are mandatory columns.
`reference`	A string naming a column indicating whether individuals belong to the reference population or not. Column must be boolean or coercible to boolean type.
`name_to`	A string naming the new output column.
`target`	A string naming a column indicating whether individuals belong to the target population or not. Column must be boolean or coercible to boolean type. By default, all descendants of the reference population are used.
`enable_correction`	Correct reproductive values.

Value

The input dataframe, plus an additional column with reproductive values for the reference and target populations assumed.

References

Hunter DC et al. 2019. Pedigree-based estimation of reproductive value. Journal of Heredity 110 (4): 433-444

Sample dam or sire inherited allele

Description

Given two alleles (one from dam, the other from sire), it samples one at random.

Arguments

`dam_al`	Dam allele.
`sire_al`	Sire allele.

Value

The sampled allele.
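
To show where such a sampling step fits, here is a minimal single-locus gene-dropping sketch in base R. Both sample_allele and gene_drop are hypothetical re-implementations written for illustration; the pedigree is assumed sorted, with identities 1 to N and unknown parents coded as 0.

```r
# Hypothetical helper: pick one of the two alleles carried by a parent
sample_allele <- function(dam_al, sire_al) {
  sample(c(dam_al, sire_al), 1)
}

# Hypothetical helper: drop alleles through the pedigree once.
# Column 1 holds the maternally inherited allele, column 2 the paternal one.
gene_drop <- function(dam, sire) {
  n <- length(dam)
  alleles <- matrix(NA_integer_, nrow = n, ncol = 2)
  next_allele <- 1L
  for (i in seq_len(n)) {
    for (k in 1:2) {
      parent <- c(dam[i], sire[i])[k]
      if (parent == 0) {
        # Unknown parent: assign a new, unique founder allele
        alleles[i, k] <- next_allele
        next_allele <- next_allele + 1L
      } else {
        alleles[i, k] <- sample_allele(alleles[parent, 1], alleles[parent, 2])
      }
    }
  }
  alleles
}

set.seed(1)
dam  <- c(0, 0, 0, 2, 1, 4, 5, 5, 7)
sire <- c(0, 0, 0, 3, 4, 3, 6, 6, 8)
alleles <- gene_drop(dam, sire)
alleles
```

Repeating the drop many times gives a Monte Carlo picture of allele segregation, which is the kind of simulation referred to for Ng.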

Search an individual's ancestors

Description

Recursive function that gathers all founders and ancestors for a given individual

Arguments

`dam`	Vector of dams.
`sire`	Vector of sires.
`i`	Reference individual (its index, not id).
`fnd`	Vector of founders (to be returned as reference).
`anc`	Vector of ancestors (to be returned as reference).

Value

The sampled allele.

Grandoffspring

Description

Counts the number of grandoffspring for individuals in the pedigree.

Usage

w_grandoffspring(ped, name_to)

Arguments

`ped`	A dataframe containing the pedigree. Individual (id), maternal (dam), and paternal (sire) identities are mandatory columns.
`name_to`	A string naming the new output column.

Value

The input dataframe, plus an additional column indicating the total number of grandoffspring.

Examples

data(arrui)
arrui <- w_grandoffspring(arrui, name_to = "G")
head(arrui)

Offspring

Description

Counts the number of offspring for individuals in the pedigree.

Usage

w_offspring(ped, name_to, dam_offspring = TRUE, sire_offspring = TRUE)

Arguments

`ped`	A dataframe containing the pedigree. Individual (id), maternal (dam), and paternal (sire) identities are mandatory columns.
`name_to`	A string naming the new output column.
`dam_offspring`	Compute dam offspring (TRUE by default).
`sire_offspring`	Compute sire offspring (TRUE by default).

Value

The input dataframe, plus an additional column indicating the total number of offspring.
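
The counting can be sketched in base R as below. The helper count_offspring is hypothetical (it is not the code behind w_offspring) and assumes unknown parents are coded as 0; its two flags mirror the dam_offspring and sire_offspring arguments.

```r
# Hypothetical helper: number of offspring per individual, counted
# through the dam column, the sire column, or both.
count_offspring <- function(ped, dam_offspring = TRUE, sire_offspring = TRUE) {
  # tabulate() ignores NA, so unknown parents (no match) are not counted
  n_dam  <- tabulate(match(ped$dam,  ped$id), nbins = nrow(ped))
  n_sire <- tabulate(match(ped$sire, ped$id), nbins = nrow(ped))
  dam_offspring * n_dam + sire_offspring * n_sire
}

# Gulisija & Crow (2007) pedigree (M = 1, K = 2, J = 3, ..., I = 9)
ped <- data.frame(
  id   = 1:9,
  dam  = c(0, 0, 0, 2, 1, 4, 5, 5, 7),
  sire = c(0, 0, 0, 3, 4, 3, 6, 6, 8)
)
ped$P <- count_offspring(ped)
ped$P_dam <- count_offspring(ped, sire_offspring = FALSE)
ped
```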

Examples

data(arrui)
arrui <- w_offspring(arrui, name_to = "P")
head(arrui)

Reproductive value

Description

Computes the reproductive value following the method of Hunter et al. (2019). This is a measure of how well a gene originating in a set of 'reference' individuals is represented in a different set of 'target' individuals. By default, fitness is computed for individuals in the reference population, using all of their descendants as target. A generation-wise mode can also be enabled, to compute fitness contributions consecutively from one generation to the next.

Usage

w_reproductive_value(
  ped,
  reference,
  name_to,
  target = NULL,
  enable_correction = TRUE,
  generation_wise = FALSE
)

Arguments

`ped`	A dataframe containing the pedigree. Individual (id), maternal (dam), and paternal (sire) identities are mandatory columns.
`reference`	A string naming a column indicating whether individuals belong to the reference population or not. Column must be boolean or coercible to boolean type.
`name_to`	A string naming the new output column.
`target`	A string naming a column indicating whether individuals belong to the target population or not. Column must be boolean or coercible to boolean type. By default, all descendants of the reference population are used.
`enable_correction`	Correct reproductive values (enabled by default).
`generation_wise`	Assume that the reference population is a vector of integers indicating generation numbers. Reproductive values will be computed generation by generation independently (except for the last one).

Details

A reference population must be defined, which represents the set of individuals whose reproductive value is to be calculated. By default, genetic contributions to all remaining individuals in the pedigree are considered, but a target population can also be defined, restricting the set of individuals accounted for when computing the reproductive value. This could represent, for example, a cohort of living individuals.

Value

The input dataframe, plus an additional column with reproductive values for the reference and target populations assumed.

References

Hunter DC et al. 2019. Pedigree-based estimation of reproductive value. Journal of Heredity 110(4): 433-444.

Examples

library(dplyr)
library(magrittr)
# Pedigree used in Hunter et al. (2019)
id <- c("A1", "A2", "A3", "A4", "A5", "A6",
        "B1", "B2", "B3", "B4",
        "C1", "C2", "C3", "C4")
dam <- c("0", "0", "0", "0", "0", "0",
         "A2", "A2", "A2", "A4",
         "B2", "B2", "A4", "A6")
sire <- c("0", "0", "0", "0", "0", "0",
          "A1", "A1", "A1", "A5",
          "B1", "B3", "B3", "A5")
t <- c(0, 0, 0, 0, 0, 0,
       1, 1, 1, 1,
       2, 2, 2, 2)
ped <- tibble::tibble(id, dam, sire, t)
ped <- purgeR::ped_rename(ped, keep_names = TRUE) %>%
 dplyr::mutate(reference = ifelse(t == 1, TRUE, FALSE))
purgeR::w_reproductive_value(ped, reference = "reference", name_to = "R")
