revert is an R package for detecting reversions from next-generation DNA sequencing data. It analyses reads aligned to the locus of a given pathogenic mutation and reports reversion events where secondary mutations have restored or undone the deleterious effect of the original pathogenic mutation, e.g., secondary indels complement to a frameshift pathogenic mutation converting the orignal frameshift mutation into inframe mutaions, deletions or SNVs that replaced the original pathogenic mutation restoring the open reading frame, SNVs changing the stop codon caused by the original nonsense SNV into an amino acid, etc. The revert package is designed to be applicable to most types of DNA sequencing data. The current version works for whole genome sequencing (WGS) and targeted genomic sequencing data such as whole exome sequencing (WES) and targeted amplicon sequecing (TAS) data. To start using revert quickly, see the Examples section.
samtools >= 1.11
R >= 4.1.0
revert performs reversion detection based on the provided BAM file
and the reads alignments are crucial for identifying the reversion
mutations. Many state-of-art NGS aligners enable clipping modes to
improve the accuracy of reads alignment by focusing on the
high-confidence and well-aligned parts of a read and discarding
(hard-clipping) or ignoring (soft-clipping) the non-aligned parts caused
by adapters, indels or translocations. The indels in clipped reads are
important because they might be potential reversions for the pathogenic
mutation if they convert the pathogenic mutation into inframe variants
and restore the open reading frame. Consequently, soft-clipping or
hard-clipping of reads alignment will result in loss of some important
reversions. One practical solution to this limitation is to align the
original reads without clipping by using an aligner capable of
performing end-to-end read alignment, e.g., “bowtie2” with the parameter
--end-to-end
. Furthermore, relaxing gap opening and
extension penalty scores for read alignment will result in more mapped
reads by allowing more frequent and larger deletions, e.g., setting
parameters --rdg 1,1
for “bowtie2”. However, it is to be
noted that relaxing too much on alignment penalty scores can be
detrimental to the overall quality of the alignment.
The main function getReversions()
outputs a list object
containing two tables summarizing the reversion detection:
A table showing frequencies of different types of events including:
A table including the details of reversion mutations detected from the input bam file for the pathogenic mutation, with the following columns:
rev_id: Unique ID for reversion event
rev_freq: Frequency of reversion event
rev_type: Type of reversion event, i.e., complement reversion to pathogenic mutation, replacement reversion of pathogenic mutation, or alternative reversion to pathogenic mutation
rev_mut_number: Index of each mutation in a reversion event
mut_id: Unique ID for reversion mutation
chr: Chromosome
mut_start_pos: Start position of reversion mutation
mut_type: Type of reversion mutation, i.e., SNV, DEL or INS
mut_seq: Sequence changes of mutation, i.e., inserted or deleted sequences for indels, or reference and alternative alleles for SNVs
mut_length: Length of mut_seq, 0 for SNV
mut_hgvs: HGVS Genomic DNA ID of reversion mutation
pathog_mut_hgvs: Original pathogenic reversion mutation
dist_to_pathog_mut: Distance between original pathogenic mutation and reversion mutation
For example, to detect reversions for a pathogenic deletion variant “chr17:g.43082434G>A” in BRCA1, run revert as follows:
library(revert)
bam.file1 <- system.file('extdata', 'toy_alignments_1.bam', package = 'revert')
reversions <- getReversions(
bam.file = bam.file1,
genome.version = "BSgenome.Hsapiens.UCSC.hg38",
chromosome = "chr17",
pathog.mut.start = 43082434,
pathog.mut.type = "SNV",
snv.reference.allele = "G",
snv.alternative.allele = "A",
flanking.window = 100,
minus.strand = TRUE
)
For example, to detect reversions for a pathogenic deletion variant “chr13:g.32338763-32338764delAT” in BRCA2, run revert as follows:
bam.file2 <- system.file('extdata', 'toy_alignments_2.bam', package = 'revert')
reversions <- getReversions(
bam.file = bam.file2,
genome.version = "BSgenome.Hsapiens.UCSC.hg38",
chromosome = "chr13",
pathog.mut.start = 32338763,
pathog.mut.type = "DEL",
deletion.sequence = "AT",
deletion.length = 2,
flanking.window = 100
)
For example, to detect reversions for a pathogenic deletion variant “chr17:g.43092689_43092690insT” in BRCA1, run revert as follows:
bam.file3 <- system.file('extdata', 'toy_alignments_3.bam', package = 'revert')
reversions <- getReversions(
bam.file = bam.file3,
genome.version = "BSgenome.Hsapiens.UCSC.hg38",
chromosome = "chr17",
pathog.mut.start = 43092689,
pathog.mut.type = "INS",
insertion.sequence = "T",
flanking.window = 100,
minus.strand = TRUE
)
Development of revert was supported by Breast Cancer Now.