---
title: "Data Transfers"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Data Transfers}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
library(strollur)
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

The *strollur* package stores the data associated with your amplicon sequence analysis. This tutorial explains how to save, load, copy, export, and import your `strollur` object. If you haven't reviewed the [Getting Started](http://mothur.org/strollur/articles/strollur.html) tutorial, we recommend you start there.

Let's use the `miseq_sop_example()` function to create a strollur object from the [MiSeq SOP Example](https://mothur.org/wiki/miseq_sop/).

```{r}
miseq <- miseq_sop_example()
miseq
```

## Saving and Loading

The strollur package provides `save_dataset()` to save a dataset object as an *.rds* file and `load_dataset()` to create a dataset from an *.rds* file. Let's use the `miseq` data object to learn how to do that.

```{r}
file_name <- file.path(tempdir(), "miseq_sop.rds")

save_dataset(miseq, file = file_name)

miseq_from_rds <- load_dataset(file = file_name)
miseq_from_rds

unlink(file_name)
```

We can see that the summaries of `miseq` and `miseq_from_rds` are identical. Let's modify `miseq_from_rds` to verify that the two names do not refer to the same object. We will add clusters created by [mothur](https://mothur.org) using [vsearch's](https://github.com/torognes/vsearch) distance-based greedy clustering (dgc) algorithm.

```{r}
dgc_data <- read_mothur_list(list = strollur_example("final.dgc.list.gz"))
assign(miseq_from_rds, table = dgc_data, bin_type = "dgc")

miseq_from_rds
miseq
```

We can see from the summary that 361 'dgc' bins were added to `miseq_from_rds` and not to `miseq`.

## Export and Import

The *.rds* file is in binary format and is not human-readable. You can use the `export_dataset()` function to see a human-readable form of the raw data stored in the dataset.
Let's export `miseq` and look at the table created.

```{r}
table <- export_dataset(miseq)
str(table)
```

Just as `load_dataset()` creates a dataset from an *.rds* file, the `import_dataset()` function creates a new dataset object from the exported table.

```{r}
miseq_import <- import_dataset(table = table)
miseq_import
```

Again, we can see that the summary of `miseq_import` is identical to the summary of `miseq`.

## Copy

Lastly, you can make a deep copy of your dataset using the `copy_dataset()` function. Note that copying with an assignment operator produces only a shallow copy. The dataset is an R6 object, which uses reference semantics to keep memory usage low, so assignment creates a second name for the same object rather than a new dataset. First let's learn how to use the `copy_dataset()` function, then we will take a closer look at how deep and shallow copying differ.

```{r}
miseq_deep_copy <- copy_dataset(miseq)
miseq_shallow_copy <- miseq
```

Let's add the `dgc_data` to `miseq_shallow_copy` and then compare `miseq`, `miseq_deep_copy`, and `miseq_shallow_copy`.

```{r}
assign(miseq_shallow_copy, table = dgc_data, bin_type = "dgc")

miseq
miseq_shallow_copy
miseq_deep_copy
```

You can see from the summaries that the `dgc_data` was added to both `miseq` and `miseq_shallow_copy` because they reference the same object, but `miseq_deep_copy` was not modified.
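
The same behavior can be reproduced without strollur. R6 objects are built on environments, which R passes by reference. The sketch below uses a plain environment as a stand-in for a dataset. The names `original`, `shallow`, `deep`, and `bin_types`, along with the "otu" and "dgc" labels, are illustrative assumptions only, and the copy made with `list2env()` is not how `copy_dataset()` is implemented.

```{r}
# A plain environment stands in for an R6 object here; this is not the
# strollur dataset class, and the field and labels are hypothetical.
original <- new.env()
original$bin_types <- c("otu")

# Assignment copies the reference, not the contents.
shallow <- original

# Build an independent environment holding copies of the same values.
# Nested environments, if there were any, would still be shared.
deep <- list2env(as.list(original, all.names = TRUE))

# Modify the data through the shallow alias.
shallow$bin_types <- c(shallow$bin_types, "dgc")

identical(original, shallow) # both names point at one environment
original$bin_types           # changed through the alias
deep$bin_types               # independent copy, unchanged
```

For a real dataset, prefer `copy_dataset()` whenever you need an object that later modifications should not affect.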