---
title: "Workflow"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Workflow}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

# Introduction

This vignette is a workflow template for data import and downstream analysis with **mpwR**. It covers the number of identifications, data completeness, missed cleavages, retention time precision, quantitative precision and inter-software comparisons, and it walks through the main steps and the functions used for each of them.

## Loading R packages

```{r setup, message = FALSE, warning = FALSE}
library(mpwR)
library(flowTraceR)
library(magrittr)
library(dplyr)
library(tidyr)
library(stringr)
library(tibble)
library(ggplot2)
library(flextable)
```

# Import

## Import your data

The output files of each software can be imported with `prepare_mpwR`. Put all output files in one folder and follow the file naming guidelines. The folder must not contain any other files or subfolders. Details are provided in the vignette [Import](https://okdll.github.io/mpwR/articles/Import.html).

```{r import, eval = FALSE}
files <- prepare_mpwR(path = "Path_to_Folder_with_files")
```

## Examples

Example data for exploring the workflow can be generated with `create_example`.

```{r get-example-data}
files <- create_example()
```

# Number of Identifications

## Report

The number of identifications can be determined with `get_ID_Report`.

```{r ID-Report}
ID_Reports <- get_ID_Report(input_list = files)
```

 

For each analysis, an ID Report is generated and stored in a list. Each ID Report can be accessed by the name of its analysis:

```{r show-ID-Report}
flextable::flextable(ID_Reports[["DIA-NN"]])
```
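Because the reports are stored in a named list, the usual base R list tools apply. The following is a minimal sketch on a toy list, not on real **mpwR** output: the analysis names (`SoftwareA`, `SoftwareB`), the columns and the numbers are assumptions made for illustration only.

```{r sketch-report-list}
# Toy stand-in for a list of reports (hypothetical names, columns and values).
toy_reports <- list(
  SoftwareA = data.frame(Run = c("R01", "R02"), ProteinGroup.IDs = c(5000L, 5100L)),
  SoftwareB = data.frame(Run = c("R01", "R02"), ProteinGroup.IDs = c(4800L, 4950L))
)

# Which analyses are available?
names(toy_reports)

# Access a single report by name.
toy_reports[["SoftwareB"]]

# Median number of identifications per analysis.
toy_medians <- vapply(
  toy_reports,
  function(r) median(r$ProteinGroup.IDs),
  numeric(1)
)
toy_medians

# Stack all reports into one data frame with an added Analysis column.
toy_combined <- do.call(
  rbind,
  Map(function(r, nm) cbind(Analysis = nm, r), toy_reports, names(toy_reports))
)
rownames(toy_combined) <- NULL
toy_combined
```

A single stacked data frame like `toy_combined` is often the more convenient shape for custom plots or exports.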

 

## Plot

### Individual

Each ID Report can be plotted with `plot_ID_barplot` from precursor- to proteingroup-level. The generated barplots are stored in a list.

```{r plot-ID-barplot}
ID_Barplots <- plot_ID_barplot(input_list = ID_Reports, level = "ProteinGroup.IDs")
```

 

The individual barplots can be easily accessed:

```{r show-ID-barplot}
ID_Barplots[["DIA-NN"]]
```

 

### Summary

As a visual summary, a boxplot can be generated with `plot_ID_boxplot`.

```{r plot-ID-boxplot}
plot_ID_boxplot(input_list = ID_Reports, level = "ProteinGroup.IDs")
```

 

# Data Completeness

## Report

Data completeness can be determined with `get_DC_Report`, either in absolute numbers or as a percentage.

```{r DC-Report}
DC_Reports <- get_DC_Report(input_list = files, metric = "absolute")
DC_Reports_perc <- get_DC_Report(input_list = files, metric = "percentage")
```

 

For each analysis, a DC Report is generated and stored in a list. Each DC Report can be accessed by the name of its analysis:

```{r show-DC-Report}
flextable::flextable(DC_Reports[["DIA-NN"]])
```
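To make the idea of data completeness concrete, the base R sketch below counts in how many runs each identification was observed and tabulates those counts in absolute numbers and as a percentage. The toy data frame and its column names (`run`, `id`) are assumptions for illustration; this is not a description of how `get_DC_Report` is implemented.

```{r sketch-data-completeness}
# Toy long-format data: one row per identification per run (hypothetical).
toy_ids <- data.frame(
  run = c("R01", "R01", "R01", "R02", "R02", "R03", "R03", "R03"),
  id  = c("A",   "B",   "C",   "A",   "B",   "A",   "C",   "D")
)

# Number of runs in which each identification was observed.
# unique() guards against an identification listed twice in the same run.
runs_per_id <- table(unique(toy_ids)$id)

# Absolute profile: how many identifications were seen in 1, 2, ... runs.
toy_n_runs <- length(unique(toy_ids$run))
dc_absolute <- table(factor(as.vector(runs_per_id), levels = seq_len(toy_n_runs)))

# The same profile as a percentage of all identifications.
dc_percentage <- round(100 * dc_absolute / sum(dc_absolute), 1)

dc_absolute
dc_percentage
```

In this toy example the last category (observed in all three runs) corresponds to full profiles.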

 

## Plot

### Individual

#### Absolute

Each DC Report can be plotted with `plot_DC_barplot` from precursor- to proteingroup-level. The generated barplots are stored in a list.

```{r plot-DC-barplot}
DC_Barplots <- plot_DC_barplot(input_list = DC_Reports, level = "ProteinGroup.IDs", label = "absolute")
```

 

The individual barplots can be easily accessed:

```{r show-DC-barplot}
DC_Barplots[["DIA-NN"]]
```

 

#### Percentage

```{r show-DC-barplot-percentage}
plot_DC_barplot(input_list = DC_Reports_perc, level = "ProteinGroup.IDs", label = "percentage")[["DIA-NN"]]
```

 

### Summary

As a visual summary, a stacked barplot can be generated with `plot_DC_stacked_barplot`.

#### Absolute

```{r plot-DC-stacked-barplot}
plot_DC_stacked_barplot(input_list = DC_Reports, level = "ProteinGroup.IDs", label = "absolute")
```

 

#### Percentage

```{r plot-DC-stacked-barplot-percentage}
plot_DC_stacked_barplot(input_list = DC_Reports_perc, level = "ProteinGroup.IDs", label = "percentage")
```

 

# Missed Cleavages

## Report

A report on missed cleavages can be generated with `get_MC_Report`, either in absolute numbers or as a percentage.

```{r MC-Report}
MC_Reports <- get_MC_Report(input_list = files, metric = "absolute")
MC_Reports_perc <- get_MC_Report(input_list = files, metric = "percentage")
```

 

For each analysis, an MC Report is generated and stored in a list. Each MC Report can be accessed by the name of its analysis:

```{r show-MC-Report}
flextable::flextable(MC_Reports[["Spectronaut"]])
```
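As a rough illustration of what a missed cleavage count is, the base R sketch below counts internal lysine (K) and arginine (R) residues in a few made-up peptide sequences. It assumes a simple tryptic rule without any proline exception; the sequences are hypothetical, and the sketch makes no claim about how `get_MC_Report` derives its numbers from the software outputs.

```{r sketch-missed-cleavages}
# Hypothetical stripped peptide sequences.
toy_peptides <- c("PEPTIDEK", "ACDKEFGR", "LMNKPQRSTK", "GHILMNP")

# Drop the C-terminal residue, then count the remaining K/R residues.
# Assumption: every internal K or R counts as one missed cleavage.
toy_internal <- substr(toy_peptides, 1, nchar(toy_peptides) - 1)
toy_hits <- gregexpr("[KR]", toy_internal)

# gregexpr() returns -1 when nothing matches, so count positive positions only.
toy_missed <- vapply(toy_hits, function(h) sum(h > 0), integer(1))
toy_missed

# Distribution of missed cleavages in absolute numbers and as a percentage.
mc_absolute <- table(toy_missed)
mc_percentage <- round(100 * prop.table(mc_absolute), 1)

mc_absolute
mc_percentage
```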

 

## Plot

### Individual

#### Absolute

Each MC Report can be plotted with `plot_MC_barplot` from precursor- to proteingroup-level. The generated barplots are stored in a list.

```{r plot-MC-barplot}
MC_Barplots <- plot_MC_barplot(input_list = MC_Reports, label = "absolute")
```

 

The individual barplots can be easily accessed:

```{r show-MC-barplot}
MC_Barplots[["Spectronaut"]]
```

 

#### Percentage

```{r show-MC-barplot-percentage}
plot_MC_barplot(input_list = MC_Reports_perc, label = "percentage")[["Spectronaut"]]
```

 

### Summary

As a visual summary, a stacked barplot can be generated with `plot_MC_stacked_barplot`.

#### Absolute

```{r plot-MC-stacked-barplot}
plot_MC_stacked_barplot(input_list = MC_Reports, label = "absolute")
```

 

#### Percentage

```{r plot-MC-stacked-barplot-percentage}
plot_MC_stacked_barplot(input_list = MC_Reports_perc, label = "percentage")
```

 

# Retention Time Precision

## Preparation

The coefficient of variation (CV) of the retention time can be calculated with `get_CV_RT`. Only complete profiles are used.

```{r CV-RT}
CV_RT <- get_CV_RT(input_list = files)
```
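The two ideas in this step, restricting to complete profiles and computing a CV, can be sketched in base R on toy data. The data frame, its column names (`precursor`, `run`, `rt`) and the choice to express the CV in percent are assumptions for illustration; the sketch does not reproduce the internals of `get_CV_RT`.

```{r sketch-cv}
# Toy retention times for three precursors across three runs (hypothetical).
toy_rt <- data.frame(
  precursor = rep(c("P1", "P2", "P3"), each = 3),
  run       = rep(c("R01", "R02", "R03"), times = 3),
  rt        = c(10.0, 10.2, 9.8, 25.0, NA, 25.4, 40.0, 40.0, 40.0)
)

# Keep only precursors with a value in every run (complete profiles).
rt_n_runs <- length(unique(toy_rt$run))
rt_n_obs <- tapply(!is.na(toy_rt$rt), toy_rt$precursor, sum)
rt_complete_ids <- names(rt_n_obs)[rt_n_obs == rt_n_runs]
toy_rt_complete <- toy_rt[toy_rt$precursor %in% rt_complete_ids, ]

# CV in percent = standard deviation / mean * 100, per precursor.
toy_cv_rt <- tapply(
  toy_rt_complete$rt,
  toy_rt_complete$precursor,
  function(x) sd(x) / mean(x) * 100
)
toy_cv_rt
```

Here `P2` is dropped because it has a missing value in one run, so only `P1` and `P3` contribute a CV.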

 

## Plot

As a visual summary, a density plot for all analyses can be generated with `plot_CV_density`.

```{r CV-RT-plot}
plot_CV_density(input_list = CV_RT, cv_col = "RT")
```

 

# Quantitative Precision

## Peptide-level

### Preparation

The CV of the peptide-level quantification can be calculated with `get_CV_LFQ_pep`. Only complete profiles are used.

```{r CV-Pep}
CV_LFQ_Pep <- get_CV_LFQ_pep(input_list = files)
```

 

### Plot

As a visual summary, a density plot for all analyses can be generated with `plot_CV_density`.

```{r CV-Pep-plot}
plot_CV_density(input_list = CV_LFQ_Pep, cv_col = "Pep_quant")
```

## Proteingroup-level

### Preparation

The CV of the proteingroup-level quantification can be calculated with `get_CV_LFQ_pg`. Only complete profiles are used.

```{r CV-PG}
CV_LFQ_PG <- get_CV_LFQ_pg(input_list = files)
```

 

### Plot

As a visual summary, a density plot for all analyses can be generated with `plot_CV_density`.

```{r CV-PG-plot}
plot_CV_density(input_list = CV_LFQ_PG, cv_col = "PG_quant")
```

 

# Upset Plot

Common identifications and intersections between analyses can be highlighted with an Upset plot.

## Preparation

Use `get_Upset_list` to prepare the data for Upset plotting.

```{r prepare-Upset}
Upset_prepared <- get_Upset_list(input_list = files, level = "ProteinGroup.IDs")
```

## Plot

The Upset plot can be generated with `plot_Upset`.

```{r plot-Upset}
plot_Upset(input_list = Upset_prepared, label = "ProteinGroup.IDs")
```

## Inter-software Comparison - flowTraceR

Functions of the package [flowTraceR](https://CRAN.R-project.org/package=flowTraceR) are incorporated in mpwR for inter-software comparisons. With them, software outputs are standardized and become directly comparable.

### Precursor-level without flowTraceR

Without standardizing the precursor-level information, the software outputs only form software-dependent clusters.

```{r Upset-flowTraceR-off}
get_Upset_list(input_list = files, level = "Precursor.IDs") %>% # prepare Upset
  plot_Upset(label = "Precursor.IDs") # plot
```

### Precursor-level with flowTraceR

With flowTraceR enabled, the precursor-level information is standardized and common identifications can be inferred.

```{r Upset-flowTraceR-on}
get_Upset_list(input_list = files, level = "Precursor.IDs", flowTraceR = TRUE) %>% # prepare Upset
  plot_Upset(label = "Precursor.IDs") # plot
```
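The kind of set overlap that an Upset plot visualizes can be computed by hand for a small example. The following base R sketch uses made-up identifier sets for three hypothetical analyses (`SoftwareA`, `SoftwareB`, `SoftwareC`); it only illustrates the set logic and is not the code behind `get_Upset_list` or `plot_Upset`.

```{r sketch-intersections}
# Toy identification sets per analysis (hypothetical names and identifiers).
toy_sets <- list(
  SoftwareA = c("PG1", "PG2", "PG3", "PG4"),
  SoftwareB = c("PG2", "PG3", "PG5"),
  SoftwareC = c("PG3", "PG4", "PG5", "PG6")
)

# Identifications shared by all analyses.
toy_shared_all <- Reduce(intersect, toy_sets)
toy_shared_all

# Identifications found exclusively by SoftwareA.
toy_only_A <- setdiff(
  toy_sets$SoftwareA,
  union(toy_sets$SoftwareB, toy_sets$SoftwareC)
)
toy_only_A

# Size of the overlap for every pair of analyses.
toy_pairs <- combn(names(toy_sets), 2, simplify = FALSE)
toy_overlap <- vapply(
  toy_pairs,
  function(p) length(intersect(toy_sets[[p[1]]], toy_sets[[p[2]]])),
  integer(1)
)
names(toy_overlap) <- vapply(toy_pairs, paste, character(1), collapse = " & ")
toy_overlap
```

If the identifiers of two analyses follow different naming conventions, such comparisons find no overlap even for the same analyte, which is why a standardization step matters for inter-software comparisons.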

 

# Summary

**mpwR** offers functions to summarize the downstream analysis.

## Report

A summary report can be generated with `get_summary_Report`.

```{r summary-report, eval = FALSE}
Summary_Report <- get_summary_Report(input_list = files)
```

## Plot

As a visual summary, a radar chart for all analyses can be generated with `plot_radarchart`.

### Overview

```{r plot-radarchart, eval = FALSE}
plot_radarchart(input_df = Summary_Report)
```

### Details

To highlight individual categories, subset the columns of the summary report before plotting.

```{r plot-radarchart-DC, eval = FALSE}
# Focus on Data Completeness
Summary_Report %>%
  dplyr::select(Analysis, contains("Full")) %>% # the Analysis column and at least one category column are required
  plot_radarchart()
```