Package 'BREADR' reference manual

Title:	Estimates Degrees of Relatedness (Up to the Second Degree) for Extreme Low-Coverage Data
Description:	The goal of the package is to provide an easy-to-use method for estimating degrees of relatedness (up to the second degree) for extreme low-coverage data. The package also allows users to quantify and visualise the level of confidence in the estimated degrees of relatedness.
Authors:	Jono Tuke [aut, cre], Adam B. Rohrlach [aut], Wolfgang Haak [aut], Divyaratan Popli [aut]
Maintainer:	Jono Tuke
License:	MIT + file LICENSE
Version:	1.0.2
Built:	2025-01-08 06:54:28 UTC
Source:	CRAN

callRelatedness

Description

A function that takes PMR observations, and (given a prior distribution for degrees of relatedness) returns the posterior probabilities of all pairs of individuals being (a) the same individual/twins, (b) first-degree related, (c) second-degree related or (d) "unrelated" (third-degree or higher). The highest posterior probability degree of relatedness is also returned as a hard classification. Options include setting the background relatedness (or using the sample median), a minimum number of overlapping SNPs if one uses the sample median for background relatedness, and a minimum number of overlapping SNPs for including pairs in the analysis.

Usage

callRelatedness(
  pmr_tibble,
  class_prior = rep(0.25, 4),
  average_relatedness = NULL,
  median_co = 500,
  filter_n = 1
)

Arguments

`pmr_tibble`	a tibble that is the output of the processEigenstrat function.
`class_prior`	the prior probabilities for same/twin, 1st-degree, 2nd-degree, unrelated, respectively.
`average_relatedness`	a single numeric value, or a vector of numeric values, to use as the average background relatedness. If NULL, the sample median is used.
`median_co`	the minimum number of overlapping SNPs a pair must have to be included in the median calculation. Only used if average_relatedness is left NULL. Default is 500.
`filter_n`	the minimum number of overlapping SNPs a pair must have to be kept; pairs below this threshold are removed from the entire analysis. If NULL, the default of 1 is used.
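
The interaction between `average_relatedness` and `median_co` can be sketched in base R. This is only an illustration: it assumes the sample median is taken over the PMR values of pairs with at least `median_co` overlapping SNPs. The data frame `pairs` and its values are hypothetical, and the package's internal code may differ.

```r
# Hypothetical PMR table (pair names and counts are made up for illustration).
pairs <- data.frame(
  pair     = c("A - B", "A - C", "B - C", "A - D"),
  nsnps    = c(1200, 800, 300, 950),
  mismatch = c(150, 176, 75, 228),
  stringsAsFactors = FALSE
)
pairs$pmr <- pairs$mismatch / pairs$nsnps

median_co <- 500

# Assumed rule: only pairs with enough overlapping SNPs inform the median.
informative <- pairs$nsnps >= median_co
background  <- median(pairs$pmr[informative])
print(background)
```

Here the pair with only 300 overlapping SNPs does not contribute to the background estimate.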

Value

results_tibble: A tibble containing 13 columns:

row: The row number
pair: the pair of individuals that are compared.
relationship: the highest posterior probability estimate of the degree of relatedness.
pmr: the pairwise mismatch rate (mismatch/nsnps).
sd: the estimated standard deviation of the pmr.
mismatch: the number of sites which did not match for each pair.
nsnps: the number of overlapping snps that were compared for each pair.
ave_rel: the value for the background relatedness used for normalisation.
Same_Twins: the posterior probability associated with a same individual/twins classification.
First_Degree: the posterior probability associated with a first-degree classification.
Second_Degree: the posterior probability associated with a second-degree classification.
Unrelated: the posterior probability associated with an unrelated classification.
BF: A strength of confidence in the Bayes Factor associated with the highest posterior probability classification compared to the 2nd highest. (No longer included)

Examples

callRelatedness(counts_example,
  class_prior=rep(0.25,4),
  average_relatedness=NULL,
  median_co=5e2,filter_n=1
)
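
The step from a prior and an observed PMR to posterior probabilities and a hard classification can be sketched in base R. This is a minimal illustration, not the package's code. It assumes a binomial likelihood for the mismatch count, and it assumes expected PMRs of 1/2, 3/4, 7/8 and 1 times the background relatedness for the four classes. All input values are hypothetical.

```r
# Hypothetical observation for one pair.
nsnps       <- 2000
mismatch    <- 360
ave_rel     <- 0.24          # background relatedness (assumed value)
class_prior <- rep(0.25, 4)

# Assumed expected PMR per class, as fractions of the background value.
expected_pmr <- ave_rel * c(Same_Twins = 1/2, First_Degree = 3/4,
                            Second_Degree = 7/8, Unrelated = 1)

# Assumed binomial likelihood of the observed mismatch count under each class.
likelihood <- dbinom(mismatch, size = nsnps, prob = expected_pmr)

# Bayes' rule: prior times likelihood, normalised to sum to one.
posterior <- class_prior * likelihood / sum(class_prior * likelihood)
names(posterior) <- names(expected_pmr)

# Hard classification: the class with the highest posterior probability.
relationship <- names(posterior)[which.max(posterior)]
print(round(posterior, 4))
print(relationship)
```

A non-uniform `class_prior` enters the same calculation unchanged, provided its four entries follow the order same/twin, 1st-degree, 2nd-degree, unrelated.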

counts_example

Description

An example of the tibble produced by processEigenstrat().

Usage

counts_example

Format

`counts_example`

A data frame with 15 rows and 4 columns:

pair: the pair of individuals that are compared
nsnps: the number of overlapping snps that were compared for each pair.
mismatch: the number of sites which did not match for each pair.
pmr: the pairwise mismatch rate (mismatch/nsnps).

get column

Description

Returns a single column of a geno file.

Usage

get_column_new(genofile, col = 1)

Arguments

`genofile`	genofile
`col`	column to return

Value

column of numbers
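
The idea of pulling one column out of a geno file can be sketched in base R. The sketch assumes the usual EIGENSTRAT geno layout (one line per SNP, one character per individual, with 9 for a missing call) and that "column" means one individual. The helper `get_column_sketch` and the input lines are hypothetical and are not the package's implementation.

```r
# Hypothetical geno lines: 4 SNPs (rows) by 3 individuals (characters).
geno_lines <- c("029", "202", "920", "002")

# Take the col-th character of every line and convert it to a number.
get_column_sketch <- function(lines, col = 1) {
  as.integer(substr(lines, col, col))
}

second_individual <- get_column_sketch(geno_lines, col = 2)
print(second_individual)
```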

plotLOAF

Description

Plots all (sorted by increasing value) observed PMR values with maximum posterior probability classifications represented by colour and shape. Options include a cut off for the minimum number of overlapping SNPs, the max number of pairs to plot and x-axis font size.

Usage

plotLOAF(in_tibble, nsnps_cutoff = NULL, N = NULL, fntsize = 7, verbose = TRUE)

Arguments

`in_tibble`	a tibble that is the output of the callRelatedness() function.
`nsnps_cutoff`	the minimum number of overlapping SNPs a pair must have to be plotted; pairs below this threshold are removed from the plot. If NULL, the default of 500 is used.
`N`	the number of (sorted by increasing PMR) pairs to plot. Avoids plotting all pairs (many of which are unrelated).
`fntsize`	the fontsize for the x-axis names.
`verbose`	if TRUE, then information about the plotting process is sent to the console

Value

a ggplot object

Examples

relatedness_example
plotLOAF(relatedness_example)
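
The effect of `nsnps_cutoff` and `N` on which pairs are shown can be sketched in base R. This is an illustration of the selection described above (filter by overlap, sort by increasing PMR, keep the first N). The data frame `res` is hypothetical, and the package's own filtering may treat boundary cases differently.

```r
# Hypothetical subset of a callRelatedness()-style result.
res <- data.frame(
  pair  = c("A - B", "A - C", "B - C", "A - D", "B - D"),
  nsnps = c(1200, 800, 300, 950, 2000),
  pmr   = c(0.125, 0.22, 0.25, 0.24, 0.18),
  stringsAsFactors = FALSE
)

nsnps_cutoff <- 500
N <- 3

kept    <- res[res$nsnps >= nsnps_cutoff, ]   # drop low-overlap pairs
sorted  <- kept[order(kept$pmr), ]            # increasing PMR
to_plot <- head(sorted, N)                    # the N lowest-PMR pairs
print(to_plot$pair)
```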

plotSLICE

Description

A function for plotting the diagnostic information when classifying a specific pair (defined by the row number or pair name) of individuals. Output includes the PDFs for each degree of relatedness (given the number of overlapping SNPs) in panel A, and the normalised posterior probabilities for each possible degree of relatedness in panel B.

Usage

plotSLICE(
  in_tibble,
  row,
  title = NULL,
  class_prior = rep(1/4, 4),
  showPlot = TRUE,
  which_plot = 0,
  labels = NULL
)

Arguments

`in_tibble`	a tibble that is the output of the callRelatedness() function.
`row`	either the row number or pair name for which the posterior distribution is to be plotted.
`title`	an optional title for the plot. If NULL, the pair from the user-defined row is used.
`class_prior`	the prior probabilities for same/twin, 1st-degree, 2nd-degree, unrelated, respectively.
`showPlot`	If TRUE, display plot. If FALSE, just pass plot as a variable.
`which_plot`	if 1, returns just the plot of the posterior distributions, if 2 returns just the normalised posterior values. Anything else returns both plots.
`labels`	a length two character vector of labels for plots. Default is no labels.

Value

a two-panel diagnostic ggplot object

Examples

plotSLICE(relatedness_example, row = 1)

process Eigenstrat data - alternative version

Description

A function that takes paths to an eigenstrat trio (ind, snp and geno file) and returns the pairwise mismatch rate for all pairs on a thinned set of SNPs. Options include choosing thinning parameter, subsetting by population names, and filtering out SNPs for which deamination is possible.

Usage

processEigenstrat(
  indfile,
  genofile,
  snpfile,
  filter_length = NULL,
  pop_pattern = NULL,
  filter_deam = FALSE,
  outfile = NULL,
  chromosomes = NULL,
  verbose = TRUE
)

Arguments

`indfile`	path to eigenstrat ind file
`genofile`	path to eigenstrat geno file.
`snpfile`	path to eigenstrat snp file.
`filter_length`	the minimum distance between sites to be compared (to reduce the effect of LD).
`pop_pattern`	a character vector of population names used to filter the ind file if only some populations are to be compared.
`filter_deam`	a logical (TRUE/FALSE) indicating whether C->T and G->A sites should be ignored.
`outfile`	(OPTIONAL) a path and filename to which the output of the function is saved as a TSV. If NULL, no backup is saved and a tibble is returned.
`chromosomes`	the chromosome to filter the data on.
`verbose`	controls printing of messages to console

Value

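out_tibble: A tibble containing four columns:

pair: the pair of individuals that are compared.
nsnps: the number of overlapping SNPs that were compared for each pair.
mismatch: the number of sites which did not match for each pair.
pmr: the pairwise mismatch rate (mismatch/nsnps).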

Examples

# Use internal files to the package as an example
indfile <- system.file("extdata", "example.ind.txt", package = "BREADR")
genofile <- system.file("extdata", "example.geno.txt", package = "BREADR")
snpfile <- system.file("extdata", "example.snp.txt", package = "BREADR")
processEigenstrat(
indfile, genofile, snpfile,
filter_length=1e5,
pop_pattern=NULL,
filter_deam=FALSE
)
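
The processing steps described above (deamination filter, thinning by `filter_length`, pairwise mismatch counting) can be sketched in base R. This is a simplified illustration rather than the package's implementation. It assumes pseudo-haploid 0/2 calls with 9 for missing, a greedy thinning rule, the deamination filter applied before thinning, and that `site` holds the physical position. All data, the `thin` helper and the pair labels are hypothetical.

```r
# Hypothetical SNP table and geno lines (one line per SNP, one char per individual).
snps <- data.frame(
  chr  = c(1, 1, 1, 1, 2, 2),
  site = c(1000, 50000, 120000, 60000, 500, 250000),
  anc  = c("A", "C", "G", "A", "T", "C"),
  der  = c("G", "T", "A", "C", "C", "G"),
  stringsAsFactors = FALSE
)
geno <- c("020", "220", "002", "902", "200", "229")
filter_length <- 1e5
filter_deam   <- TRUE

# Greedy thinning: keep a site only if it is at least min_dist away from
# the previously kept site on the same chromosome.
thin <- function(chr, site, min_dist) {
  keep <- logical(length(site))
  last_chr <- NA
  last_site <- -Inf
  for (i in order(chr, site)) {
    if (!identical(chr[i], last_chr) || site[i] - last_site >= min_dist) {
      keep[i] <- TRUE
      last_chr <- chr[i]
      last_site <- site[i]
    }
  }
  keep
}

keep <- rep(TRUE, nrow(snps))
if (filter_deam) {
  # Drop sites where deamination damage could mimic the derived allele.
  keep <- keep & !((snps$anc == "C" & snps$der == "T") |
                   (snps$anc == "G" & snps$der == "A"))
}
idx <- which(keep)
idx <- idx[thin(snps$chr[idx], snps$site[idx], filter_length)]

# Matrix of calls: rows are kept SNPs, columns are individuals.
calls <- do.call(rbind, strsplit(geno[idx], ""))
calls[calls == "9"] <- NA

pairs <- t(combn(ncol(calls), 2))
counts <- data.frame(pair = paste(pairs[, 1], pairs[, 2], sep = " - "),
                     nsnps = NA_integer_, mismatch = NA_integer_,
                     stringsAsFactors = FALSE)
for (k in seq_len(nrow(pairs))) {
  a <- calls[, pairs[k, 1]]
  b <- calls[, pairs[k, 2]]
  both <- !is.na(a) & !is.na(b)          # sites called in both individuals
  counts$nsnps[k]    <- sum(both)
  counts$mismatch[k] <- sum(a[both] != b[both])
}
counts$pmr <- counts$mismatch / counts$nsnps
print(counts)
```

The resulting `counts` table has the same four columns as the tibble documented for counts_example.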

process Eigenstrat data

Description

Usage

processEigenstrat_old(
  indfile,
  genofile,
  snpfile,
  filter_length = NULL,
  pop_pattern = NULL,
  filter_deam = FALSE,
  outfile = NULL,
  chromosomes = NULL,
  verbose = TRUE
)

Arguments

`indfile`	path to eigenstrat ind file
`genofile`	path to eigenstrat geno file.
`snpfile`	path to eigenstrat snp file.
`filter_length`	the minimum distance between sites to be compared (to reduce the effect of LD).
`pop_pattern`	a character vector of population names used to filter the ind file if only some populations are to be compared.
`filter_deam`	a logical (TRUE/FALSE) indicating whether C->T and G->A sites should be ignored.
`outfile`	(OPTIONAL) a path and filename to which the output of the function is saved as a TSV. If NULL, no backup is saved and a tibble is returned.
`chromosomes`	the chromosome to filter the data on.
`verbose`	controls printing of messages to console

Value

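out_tibble: A tibble containing four columns:

pair: the pair of individuals that are compared.
nsnps: the number of overlapping SNPs that were compared for each pair.
mismatch: the number of sites which did not match for each pair.
pmr: the pairwise mismatch rate (mismatch/nsnps).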

Examples

# Use internal files to the package as an example
indfile <- system.file("extdata", "example.ind.txt", package = "BREADR")
genofile <- system.file("extdata", "example.geno.txt", package = "BREADR")
snpfile <- system.file("extdata", "example.snp.txt", package = "BREADR")
processEigenstrat_old(
indfile, genofile, snpfile,
filter_length=1e5,
pop_pattern=NULL,
filter_deam=FALSE
)

read_ind

Description

Reads an IND text file into a tibble.

Usage

read_ind(filename)

Arguments

`filename`	an IND text file.

Value

tibble with column headings: ind (CHR), sex (CHR), pop (CHR)

Examples

ind_snpfile <- system.file("extdata", "example.ind.txt", package = "BREADR")
read_ind(ind_snpfile)
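
For readers without the package at hand, reading a three-column IND file can be approximated in base R. This is a sketch that assumes a whitespace-delimited file with individual, sex and population columns. The file contents are hypothetical, and the result is a plain data frame rather than a tibble.

```r
# Write a small hypothetical IND file to a temporary location.
ind_path <- tempfile(fileext = ".ind.txt")
writeLines(c("Ind1 M PopA", "Ind2 F PopA", "Ind3\tU\tPopB"), ind_path)

# colClasses = "character" stops R from converting a sex column made up of
# "F" and "T" values into logicals.
ind <- read.table(ind_path, header = FALSE,
                  col.names = c("ind", "sex", "pop"),
                  colClasses = "character")
print(ind)
```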

read_snp

Description

Reads a SNP text file into a tibble.

Usage

read_snp(filename)

Arguments

`filename`	a SNP text file.

Value

tibble with column headings: snp (CHR), chr (DBL), pos (DBL), site (DBL), anc (CHR), and der (CHR).

Examples

std_snpfile <- system.file("extdata", "example.snp.txt", package = "BREADR")
broken_snpfile <- system.file("extdata", "broken.snp.txt", package = "BREADR")
read_snp(std_snpfile)
read_snp(broken_snpfile)

relatedness_example

Description

An example of the tibble produced by callRelatedness().

Usage

relatedness_example

Format

`relatedness_example`

A data frame with 15 rows and 13 columns:

row: The row number
pair: the pair of individuals that are compared.
relationship: the highest posterior probability estimate of the degree of relatedness.
pmr: the pairwise mismatch rate (mismatch/nsnps).
sd: the estimated standard deviation of the pmr.
mismatch: the number of sites which did not match for each pair.
nsnps: the number of overlapping snps that were compared for each pair.
ave_rel: the value for the background relatedness used for normalisation.
Same_Twins: the posterior probability associated with a same individual/twins classification.
First_Degree: the posterior probability associated with a first-degree classification.
Second_Degree: the posterior probability associated with a second-degree classification.
Unrelated: the posterior probability associated with an unrelated classification.
BF: A strength of confidence in the Bayes Factor associated with the highest posterior probability classification compared to the 2nd highest.
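
One way to read the `sd` column is as a binomial standard error of the PMR. The sketch below computes that quantity in base R under this assumption. The formula is an illustration and is not confirmed as the one used by the package, and the input values are hypothetical.

```r
# Hypothetical PMR values and overlap counts for three pairs.
pmr   <- c(0.125, 0.22, 0.24)
nsnps <- c(1200, 800, 950)

# Binomial standard error of a proportion (assumed formula).
sd_pmr <- sqrt(pmr * (1 - pmr) / nsnps)
print(signif(sd_pmr, 3))
```

Under this reading, pairs with fewer overlapping SNPs have a larger `sd` and therefore a less certain classification.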

saveSLICES

Description

Saves the diagnostic plots produced by plotSLICE, one for every pair in a tibble output by callRelatedness, to a folder. Options include the width and height of the output files, and the units in which these dimensions are measured.

Usage

saveSLICES(
  in_tibble,
  outFolder = NULL,
  width = 297,
  height = 210,
  units = "mm",
  verbose = TRUE
)

Arguments

`in_tibble`	a tibble that is the output of the callRelatedness() function.
`outFolder`	the folder into which all diagnostic plots will be saved
`width`	the width of the output PDFs.
`height`	the height of the output PDFs.
`units`	the units for the height and width of the output PDFs.
`verbose`	Controls the printing of progress to console.

Value

nothing

Examples


saveSLICES(relatedness_example[1:3, ], outFolder = tempdir())


sim_geno

Description

Simulates a geno file in eigenstrat format.

Usage

sim_geno(n_ind, n_snp, filename)

Arguments

`n_ind`	number of individuals
`n_snp`	number of SNPs
`filename`	filename of export

Value

NULL. A file is written to filename as a side effect.

Examples

## Not run: 
sim_geno(10, 5, "geno.txt")

## End(Not run)
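
What such a simulated file might look like can be sketched in base R. This sketch assumes pseudo-haploid calls (0 or 2) with 9 for missing, drawn with equal probability. Those choices, and writing to a temporary file, are assumptions for illustration and need not match what sim_geno does.

```r
set.seed(1)
n_ind <- 10
n_snp <- 5

# One row per SNP, one column per individual.
calls <- matrix(sample(c(0, 2, 9), n_ind * n_snp, replace = TRUE),
                nrow = n_snp, ncol = n_ind)

# Collapse each row into a single string, as in a geno file.
geno_lines <- apply(calls, 1, paste, collapse = "")

out <- tempfile(fileext = ".geno.txt")
writeLines(geno_lines, out)
written <- readLines(out)
print(written)
```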

split line

Description

Takes a line from a SNP file and splits it into its parts.

Usage

split_line(x)

Arguments

`x`	line from SNP file

Value

tibble with 6 columns.

Examples

split_line("1_14.570829090394763     1        0.000000              14 A X")
split_line("rs3094315	1	0.0	752566	G	A")
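
The two examples differ only in their delimiters (runs of spaces versus tabs). A base R sketch of splitting on any whitespace is shown below. The helper `split_line_sketch` is hypothetical, returns a data frame rather than a tibble, and borrows its column names from the read_snp output as an assumption.

```r
split_line_sketch <- function(x) {
  # Trim the ends, then split on one or more whitespace characters.
  parts <- strsplit(trimws(x), "[[:space:]]+")[[1]]
  stopifnot(length(parts) == 6)
  data.frame(
    snp  = parts[1],
    chr  = as.numeric(parts[2]),
    pos  = as.numeric(parts[3]),
    site = as.numeric(parts[4]),
    anc  = parts[5],
    der  = parts[6],
    stringsAsFactors = FALSE
  )
}

spaced <- split_line_sketch("1_14.570829090394763     1        0.000000              14 A X")
tabbed <- split_line_sketch("rs3094315\t1\t0.0\t752566\tG\tA")
print(rbind(spaced, tabbed))
```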

test_degree

Description

Test if a degree of relatedness is consistent with an observed PMR

Usage

test_degree(in_tibble, row, degree, verbose = TRUE)

Arguments

`in_tibble`	a tibble that is the output of the callRelatedness() function.
`row`	either the row number or pair name of the pair to be tested.
`degree`	the degree of relatedness to be tested.
`verbose`	a logical (boolean) for whether all test output should be printed to screen.

Value

the associated p-value for the test

Examples

test_degree(relatedness_example, 1, 1)
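
The idea of testing whether an observed PMR is consistent with a given degree can be sketched with an exact binomial test in base R. This sketch is not the package's test. It assumes expected PMRs of 1/2, 3/4, 7/8 and 1 times the background relatedness and uses `binom.test` as a stand-in, with hypothetical input values.

```r
# Hypothetical observation for one pair.
nsnps    <- 2000
mismatch <- 360
ave_rel  <- 0.24   # background relatedness (assumed value)

# Assumed expected PMR for each candidate class.
expected_pmr <- ave_rel * c(same_twins = 1/2, first = 3/4,
                            second = 7/8, unrelated = 1)

# Two-sided exact binomial test of the observed count against each hypothesis.
p_first <- binom.test(mismatch, nsnps, p = expected_pmr[["first"]])$p.value
p_unrel <- binom.test(mismatch, nsnps, p = expected_pmr[["unrelated"]])$p.value
print(c(first = p_first, unrelated = p_unrel))
```

A large p-value means the observed PMR is compatible with the tested degree. A very small one, as for the unrelated hypothesis here, means it is not.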
