ImprintCapASM-workflow

library(ImprintCapASM)

Overview

ImprintCapASM is a three-step pipeline for SNP-phased allele-specific methylation (ASM) analysis across the 41 known imprinted differentially methylated regions (DMRs). It accepts a VCF SNP file, a bisulfite methylation table, and an aligned BAM file, and returns per-allele methylation fractions, summary tables, and diagnostic plots per DMR.

Installation

install.packages("ImprintCapASM")

Requirements

  • R >= 4.1.0
  • samtools must be installed and on your system PATH
  • R packages: data.table, vcfR, readxl, writexl, ggplot2, Rsamtools (Rsamtools is distributed through Bioconductor rather than CRAN)
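
The requirements above can be checked from R before running anything. The following is a minimal sketch using only base R; check_requirements() is a hypothetical helper written for this example and is not part of ImprintCapASM.

```r
# Hypothetical helper (not part of ImprintCapASM): reports whether the
# requirements listed above are met on this machine.
check_requirements <- function(
    pkgs = c("data.table", "vcfR", "readxl", "writexl", "ggplot2", "Rsamtools"),
    min_r = "4.1.0") {
  # requireNamespace() returns FALSE instead of raising an error
  # when a package is not installed.
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  list(
    r_ok         = getRversion() >= min_r,
    # Sys.which() returns "" when the executable is not on the PATH.
    samtools_ok  = unname(nzchar(Sys.which("samtools"))),
    missing_pkgs = pkgs[!installed]
  )
}

status <- check_requirements()
str(status)
```

If samtools_ok is FALSE or missing_pkgs is not empty, fix that first; the pipeline steps depend on these tools.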

Input files

Place the following files in your input/ folder, where SAMPLEID stands for your sample identifier:

File                       Description
SAMPLEID_all.SNPs.out      VCF SNP calls from bisulfite sequencing
SAMPLEID_all.CGmeth.txt    Bisulfite methylation table
SAMPLEID_all_markdup.bam   Aligned, duplicate-marked BAM file

The reference file data/filter_Cpgs.xlsx is bundled with the package.
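
Before starting a run, it can help to confirm that all three input files are present under the names listed above. The sketch below assumes only those file-name suffixes; check_inputs() is a hypothetical helper, not a package function.

```r
# Hypothetical helper: builds the three expected input paths for one
# sample and reports which of them exist on disk.
check_inputs <- function(sample_id, input_dir = "input") {
  suffixes <- c(
    snp  = "_all.SNPs.out",     # VCF SNP calls
    meth = "_all.CGmeth.txt",   # bisulfite methylation table
    bam  = "_all_markdup.bam"   # aligned, duplicate-marked BAM
  )
  paths <- file.path(input_dir, paste0(sample_id, suffixes))
  data.frame(
    role   = names(suffixes),
    path   = paths,
    exists = file.exists(paths),
    stringsAsFactors = FALSE
  )
}

check_inputs("SAMPLEID")
```

Any row with exists equal to FALSE points to a file that is missing or misnamed.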

Running the full pipeline

source("run_pipeline.R")
# You will be prompted:
# Enter sample type [control / patient]: control
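
When the individual steps are called from a script instead of through the prompt, the sample type has to be supplied as a string. The sketch below shows one way to validate such a value against the two options the prompt offers. normalize_sample_type() is a hypothetical helper; whether run_pipeline.R performs a similar check internally is not documented here.

```r
# Hypothetical helper: trims and lower-cases a typed answer, then checks
# it against the two sample types offered by the prompt.
normalize_sample_type <- function(x, valid = c("control", "patient")) {
  x <- tolower(trimws(x))
  if (length(x) != 1 || !x %in% valid) {
    stop("sample type must be one of: ", paste(valid, collapse = ", "))
  }
  x
}

sample_type <- normalize_sample_type("  Control ")
```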

Step 1 — prepare_cpg_snp_input()

prepare_cpg_snp_input(
  snp_file     = "input/SAMPLE_all.SNPs.out",
  meth_file    = "input/SAMPLE_all.CGmeth.txt",
  cpg_ref_file = "data/filter_Cpgs.xlsx",
  output_file  = "asm_results/cpg_snps_control_SAMPLE.xlsx",
  sample_type  = "control"
)

Step 2 — extract_bam_regions()

extract_bam_regions(
  bam_file    = "input/SAMPLE_all_markdup.bam",
  bed_file    = "asm_results/cpg_snps_control_SAMPLE.bed",
  output_dir  = "bam_asm/",
  sample_type = "control"
)

Step 3 — ASM()

ASM(
  cpg_snp_file     = "asm_results/cpg_snps_control_SAMPLE.xlsx",
  sam_file         = "bam_asm/control_SAMPLE_all_wide.bam",
  filter_cpgs_file = "data/filter_Cpgs.xlsx",
  output_file      = "asm_results/asm_control_SAMPLE.xlsx",
  sample_type      = "control"
)
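
The three calls above pass files from one step to the next: the .xlsx written in Step 1 is read again in Step 3, and Step 2 reads a .bed file with the same stem. The sketch below derives these paths from a sample ID and sample type so they stay consistent across steps. asm_paths() is a hypothetical helper, and its naming pattern is inferred from the example paths only; it is not a documented guarantee of the package.

```r
# Hypothetical helper: derives the intermediate and output paths used in
# Steps 1-3, following the naming pattern of the example calls.
asm_paths <- function(sample_id, sample_type,
                      results_dir = "asm_results", bam_dir = "bam_asm") {
  stem <- paste0("cpg_snps_", sample_type, "_", sample_id)
  list(
    # Step 1 output_file, reused as cpg_snp_file in Step 3
    cpg_snp_file = file.path(results_dir, paste0(stem, ".xlsx")),
    # Step 2 bed_file (assumed to sit next to the Step 1 output)
    bed_file     = file.path(results_dir, paste0(stem, ".bed")),
    # BAM passed as sam_file in Step 3
    wide_bam     = file.path(bam_dir,
                             paste0(sample_type, "_", sample_id, "_all_wide.bam")),
    # Step 3 output_file
    asm_file     = file.path(results_dir,
                             paste0("asm_", sample_type, "_", sample_id, ".xlsx"))
  )
}

p <- asm_paths("SAMPLE", "control")
str(p)
```

The elements of p can then be supplied to the corresponding arguments, for example cpg_snp_file = p$cpg_snp_file and sam_file = p$wide_bam in the ASM() call.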

Output files

File                            Contents
asm_<type>_<id>.xlsx            Full read-level ASM table
snp_cpg_<type>_<id>.xlsx        Per SNP×CpG methylation fractions
meth_summary_<type>_<id>.xlsx   Allele methylation summary per DMR
dmr_plots_<type>_<id>.pdf       Diagnostic line plots per DMR