--- title: "vcfR workflow" author: "Brian J. Knaus" date: "`r Sys.Date()`" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{vcfR workflow} %\VignetteEngine{knitr::rmarkdown} --- ## Workflow Work begins with a variant call format file (VCF). A reference file (FASTA) and an annotation file are also suggested. These later two files are not necessary, but I feel they contribute substantially. The below flowchart outlines the major steps involved with vcfR use. ![flowchart image](vcfR_flowchart.png) The green rectangles contain functions that the user will want to become familiar with. There are effectively three phases in the workflow: reading in data, processing the objects in memory and plotting the data graphically. Reading in the data frequently presents a bottleneck. Unfortunately, this is typically due to the time it takes to read from a drive and therefore may not be anything I can improve on. Once the data are in memory we can manipulate it. VcfR uses Rcpp to implement functions written in C++ to try to improve the performance of these functions. Lastly, we visualize the objects. Visualization relies on R's base graphics package. This is also something that I am not likely to be able to improve on if it becomes a bottleneck. By dividing the analysis into these three phases I've divided that workflow into things I may be able to improve upon through my code writing, and things which I have little control over.