---
title: "Projects, Files, Apps and Tasks execution with Seven Bridges API R Client"
date: "`r Sys.Date()`"
output:
  rmarkdown::html_document:
    toc: true
    toc_float: true
    toc_depth: 4
    number_sections: false
    theme: "flatly"
    highlight: "textmate"
    css: "sevenbridges.css"
vignette: >
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteIndexEntry{Projects, Files, Apps and Tasks execution with Seven Bridges API R Client}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  eval = FALSE
)
```

# Projects

Projects are the core building blocks of the platform. Each project corresponds to a distinct scientific investigation, serving as a container for its data, analysis tools, results, and collaborators.

All project-related operations can be accessed through the `projects` path from the `Auth` object. `Projects` is also a `Resource` R6 class which contains implementations of the `query()`, `get()` and `delete()` methods for listing projects, fetching a single project and deleting a specific project. Besides those, there is also a custom method to create projects.

When you fetch a single project, it is represented as an object of the `Project` class containing all project information and additional methods that can be executed directly on the project, such as updating the project, managing project members, and listing project files, apps and tasks.

## List all projects

The following call returns a `Collection` with a list of all projects you are a member of. Each project's `project_id` and name will be printed. For full project information, you can access the `items` field in the `Collection` object and preview the list of projects.

```{r}
# List and view your projects
all_my_projects <- a$projects$query()
View(all_my_projects$items)
```

If you want to list the projects owned by and accessible to a particular user, specify the `owner` argument as follows.
```{r}
# List projects of particular user
a$projects$query(owner = "")
a$projects$query(owner = "")
```

## Partial match project name

For a more friendly interface and convenient search, the `sevenbridges2` package supports _partial name matching_. Set the `name` parameter in the `query()` method:

```{r}
# List projects whose name contains 'demo'
a$projects$query(name = "demo")
```
## Filter by project creation date, modification date, and creator

Project creation date, modification date, and creator information is useful for quickly locating the project you need, especially when you want to follow the life cycle of a large number of projects and distinguish recent projects from old ones.

To facilitate such needs, the fields `created_by`, `created_on`, and `modified_on` are returned in the project query calls. Since these fields cannot be passed to the `query()` function as parameters, you can filter the returned items locally with the helper code below:

```{r}
# Return all projects matching the name "wgs"
wgs_projects <- a$projects$query(name = "wgs")

# Filter by project creators
creators <- sapply(wgs_projects$items, "[[", "created_by")
wgs_projects$items[which(creators == "")]

# Filter by project creation date
create_date <- as.Date(sapply(wgs_projects$items, "[[", "created_on"))
wgs_projects$items[which(create_date < as.Date("2019-01-01"))]

# Filter by project modification date
modify_date <- as.Date(sapply(wgs_projects$items, "[[", "modified_on"))
wgs_projects$items[which(modify_date < as.Date("2019-01-01"))]
```
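The same filtering logic can be wrapped in a small reusable function. The sketch below is only an illustration: `filter_before()` and the mock records are hypothetical and not part of `sevenbridges2`, and it assumes the date fields are ISO 8601 strings such as `2023-04-11T10:04:50Z`, as shown elsewhere in this document.

```r
# Mock project records carrying the fields returned by project queries
# (hypothetical values; real items are accessed the same way with `[[`).
mock_items <- list(
  list(name = "wgs-old", created_by = "user1", created_on = "2018-06-01T09:00:00Z"),
  list(name = "wgs-new", created_by = "user2", created_on = "2021-03-15T12:30:00Z")
)

# Hypothetical helper: keep the items whose date field is earlier than `before`
filter_before <- function(items, field, before) {
  # vapply() always returns a character vector, even for an empty list
  stamps <- vapply(items, function(x) x[[field]], character(1))
  # as.Date() reads the leading YYYY-MM-DD part and ignores the time of day
  items[as.Date(stamps) < as.Date(before)]
}

old_projects <- filter_before(mock_items, "created_on", "2019-01-01")
```

With real data, `wgs_projects$items` could be passed in place of `mock_items`. Using `vapply()` rather than `sapply()` is a defensive choice: it fails loudly if a field is missing instead of silently returning a list.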
## Create a new project

To create a new project, use the `create()` method on the `Projects` path. You need to specify the following:

- `name` (required)
- `billing_group` (required)

Other parameters and settings are optional. You can find more information in the `create()` function documentation on `?Projects`.

```{r}
# Get billing group
billing_groups <- a$billing_groups$query()
billing_group <- a$billing_groups$get("")

# Create a project named 'API Testing'
a$projects$create(
  name = "API Testing",
  billing_group = billing_group,
  description = "Test for API"
)
```
## Get a single project

Let's fetch the project we've just created by its ID. For this purpose, we can use the `get()` method of `Projects`. This method accepts only the project ID, which consists of:

- the user's username or division name (for Seven Bridges Platform users who are part of a division) and
- the project's short name in lowercase with spaces replaced by dashes,

in the form of `/`. This ID can also be seen in the URL of the project on the UI.

```{r}
# Fetch previously created project
p <- a$projects$get(id = "/api-testing")
```

To print all details about the project, use the `detailed_print()` method directly on the `Project` object:

```{r}
# Print all project info
p$detailed_print()
```

## Delete a project

There are two ways to delete a project. One is from the `projects` path on the authentication object and the other is to call the `delete()` method directly on the `Project` object you want to delete:

```{r}
# Delete project using Auth$projects path
a$projects$delete(project = "")

# Delete project directly from the project object
p$delete()
```

Please be careful when using this method: calling it permanently deletes the project from the platform.

## Edit an existing project

If you want to edit an existing project, you can do so by using the `update()` method on the `Project` object. As a project admin you can use it to change the name, description, settings, tags or billing group of the project.

For example, if you want to change the name and description of the project, you can do it in the following way:

```{r}
# Update project
p$update(
  name = "Project with modified name",
  description = "This is the modified description."
)
```

Keep in mind that this modifies only the name of the project, not its short name. Therefore, after calling this method, the ID of the project will remain the same.
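To make the ID format from the "Get a single project" section concrete, here is a minimal sketch of how an ID could be derived from the name a project was given at creation time. `expected_project_id()` is a hypothetical helper, not a `sevenbridges2` function, and it assumes that only lowercasing and space-to-dash replacement are applied; the Platform may treat other characters differently, so the ID shown in the project URL remains the authoritative value.

```r
# Hypothetical helper: build the expected project ID from an owner
# (username or division name) and the project's original name.
# Assumption: only lowercasing and space-to-dash replacement are applied.
expected_project_id <- function(owner, name) {
  short_name <- gsub(" ", "-", tolower(name), fixed = TRUE)
  paste(owner, short_name, sep = "/")
}

expected_project_id("user1", "API Testing")
# [1] "user1/api-testing"
```

Because the ID is based on the short name rather than the display name, renaming the project with `update()` does not change the value you pass to `get()`.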
If something changes in the project in the Platform UI, you can refresh your `Project` object to fetch the changes by reloading it with:

```{r}
# Reload project object
p$reload()
```

## Project members management

### List project members

This call returns a `Collection` with a list of members of the specified project. For each member, the response is wrapped into a `Member` class object containing:

- the member's username, email, id, and type and
- the member's permissions in the specified project.

```{r}
# List project members
p$list_members()
```

### Add a member to a project

This call adds a new user to the specified project. It can only be made by a user who has admin permissions in the project.

Requests to add a project member must include the key `permissions`. However, if you do not include a value, the member's permissions will be set to the default, which is read-only (only the `read` value will be set to TRUE).

Set permissions by creating a named list with `copy`, `write`, `execute`, `admin`, or `read` names and assigning TRUE or FALSE values to them.

Note: `read` is implicit and set by default. You cannot be a project member without having `read` permissions.

```{r}
# Add project member
p$add_member(
  user = "",
  permissions = list(write = TRUE, execute = TRUE)
)
```

```
── Member ─────────────────────────────────────────────────────────────────────
• type: USER
• email: new_user@velsera.com
• username:
• id:
• href: https://api.sbgenomics.com/v2/projects//api-testing/members/
• permissions:
  • write: TRUE
  • read: TRUE
  • copy: FALSE
  • execute: TRUE
  • admin: FALSE
```

### Get and modify a project member's permissions

Sometimes you may just want to investigate a member's permissions within a specified project or update them, and you can do that by calling the `modify_member_permissions()` method. For this method to work, the user calling it must have admin permissions in the project.
For example, you may want to give `copy` permissions to a project member:

```{r}
# Modify project member's permissions
p$modify_member_permissions(
  user = "",
  permissions = list(copy = TRUE)
)
```

### Remove a project member

You can remove a member from the project in a similar way with the `remove_member()` operation:

```{r}
# Remove a project member
p$remove_member(user = "")
```

## List project files

To list all files and folders (a special type of file) within the specified project object, you can use the `list_files()` method of the `Project` object.

```{r}
# List project files
p$list_files()
```

It will return a `Collection` object with the `items` field containing a list of returned `File` objects, along with pagination options.

## Create a folder within project Files

You are also able to create a folder within a project's root Files directory using the `create_folder()` method. You have to specify the folder name, which should not start with '__' or contain spaces.

```{r}
# Create a folder within project files
p$create_folder(name = "My_new_folder")
```

## Get a project's root folder object

Lastly, the project's root directory with all your files is a folder itself, so you can get this folder as a `File` object too, using `get_root_folder()`.

```{r}
# Get a project's root folder object
p$get_root_folder()
```

## List project's apps, tasks and import jobs

You can also list all of a project's apps, tasks and import jobs (created for Volume imports) directly on the `Project` object. These topics are explained in more detail in the upcoming chapters:

```{r}
# List project's apps
p$list_apps()

# List project's tasks
p$list_tasks()

# List project's imports
p$list_imports()
```

## Create a new app and task within a project

Another shortcut available on the `Project` object is the creation of apps and tasks. More details about this topic will be provided in the next chapters.

# Files, folders and metadata

All file-related operations can be accessed through the `files` path from the `Auth` object. `Files` also inherits from the `Resource` R6 class, which contains implementations of the `query()`, `get()` and `delete()` methods for listing files, fetching a single file/folder, and deleting a specific file/folder. Besides those, there are also custom methods to copy files/folders and create folders.

When you fetch a single file/folder, it is represented as an object of the `File` class. Note that the class of both `files` and `subdirectories` is `File`. The difference between them is in the `type` field, which is:

- `file` for `files`
- `folder` for `subdirectories`.

The `File` object contains all file/folder information and additional methods that can be executed directly on the object, such as updating, adding tags, setting metadata, copying or moving files, and exporting to volumes.

## List files

This call lists `files` and `subdirectories` in a specified **project** or **directory** within a project, with specified properties that you can access. The project or directory whose contents you want to list is specified as a parameter in the call.

The result will be a `Collection` class containing a list of `File` objects in the `items` field.
```{r}
# List files in the project root directory
api_testing_files <- a$files$query(project = "project_object_or_id")
api_testing_files
```

```
[[1]]
── File ────────────────────────────────────────────────────────────────────────
• type: file
• parent: 61f3f9c6e6aad23247516bf30
• url: NA
• modified_on: 2023-04-15T08:54:32Z
• created_on: 2023-04-11T10:04:50Z
• project: /api-testing
• size: 56 bytes
• name: Drop-seq_small_example.bam
• id: 643530c28345522d97313d17
• href: https://api.sbgenomics.com/v2/files/643530c28345522d97313d17

[[2]]
── File ────────────────────────────────────────────────────────────────────────
• type: file
• parent: 61f3f9c6e6aae54367516bf30
• url: NA
• modified_on: 2023-04-11T10:29:13Z
• created_on: 2023-04-11T10:29:13Z
• project: /api-testing
• size: 56 bytes
• name: G20479.HCC1143.2_1Mreads.tar.gz
• id: 6435367943r4456ecb66cfb2
• href: https://api.sbgenomics.com/v2/files/6435367943r4456ecb66cfb2
```

Note that this call lists both `files` and `subdirectories` in the specified project or directory within a project, but **not the contents of the subdirectories**. To list the contents of a subdirectory, make a new call and specify the subdirectory as the `parent` parameter.

```{r}
# List files in a subdirectory
a$files$query(parent = "")
```

You can also search for files by a specific:

1. **Name** - List the file with the specified name. Note that the name must be an exact complete string for the results to match.
2. **Metadata** - List only files that have the specified value in a metadata field. Note that multiple instances of the same metadata field are implicitly combined with the OR operation. Conversely, different metadata fields are implicitly combined with the AND operation.
3. **Tag** - List files containing the specified tag. Note that the tag must be an exact complete string for the results to match. The OR operation is performed between multiple tags.
4. **Origin task** - List only files produced by the task specified by the ID in this field.

```{r}
# List files with these names
a$files$query(
  project = "",
  name = c("", "")
)

# List files with metadata fields sample_id and library values set
a$files$query(
  project = "",
  metadata = list(
    "sample_id" = "",
    "library" = ""
  )
)

# List files with this tag
a$files$query(project = "", tag = c(""))

# List files from this task
a$files$query(project = "", task = "")
```

To combine everything in a more realistic example, the following code returns all files in the `user1/api-testing` project that have the `sample_id` metadata field set to "Sample1" __OR__ "Sample2", __AND__ the `library_id` set to "EXAMPLE", __AND__ have either the "hello" __OR__ the "world" tag:

```{r}
# Query project files according to described criteria
my_files <- a$files$query(
  project = "user1/api-testing",
  metadata = list(
    sample_id = "Sample1",
    sample_id = "Sample2",
    library_id = "EXAMPLE"
  ),
  tag = c("hello", "world")
)
```

### List public data

To list publicly available files on the Seven Bridges Platform, set the `project` parameter to `admin/sbg-public-data`.

```{r}
# Query public files
public_files <- a$files$query(project = "admin/sbg-public-data")
```

## Get a single file/folder

To return a specific file or folder when you know its ID, you can use the `get()` method, the same as for other resources. The file ID can also be extracted from the URL in the Platform's visual interface.

```{r}
# Get a single File object by ID
a$files$get(id = "")
```

```
── File ────────────────────────────────────────────────────────────────────────
• type: file
• parent: 61f3f9c6e6aad8667516rf543
• url: NA
• modified_on: 2023-04-11T10:29:13Z
• created_on: 2023-04-11T10:29:13Z
• project: /api-testing
• size: 56 bytes
• name: G20479.HCC1143.2_1Mreads.tar.gz
• id: 6435367997d934334fb66cfb2
• href: https://api.sbgenomics.com/v2/files/6435367997d934334fb66cfb2
```

## Delete a file

The `delete` action only works for one file at a time.
It can be called from the `Auth$files` path and accepts the `File` object or ID of the file you want to delete.

```{r}
# Delete a file
a$files$delete(file = "")
```

## Copy files

The `copy()` method allows you to copy multiple files between projects at a time. It can also be called from the `Auth$files` path and accepts a list of `File` objects or their IDs within the `files` parameter. In addition, you have to specify the destination project.

The result will contain a printed response with information about the copied files: their destination names and IDs.

```{r}
# Fetch files by id to copy into the api-testing project
file1 <- a$files$get(id = "6435367997d9446ecb66cfb2")
file2 <- a$files$get(id = "6435367997d9446ecb66cgr2")

# Copy files to the project
a$files$copy(
  files = list(file1, file2),
  destination_project = "/api-testing"
)
```

## Get details of multiple files

The `bulk_get()` method allows you to retrieve details for multiple files efficiently, in a single API call. This method accepts an argument, `files`, which can be either a list of `File` objects or a list of strings representing file IDs. The file ID can also be extracted from the URL in the Platform's visual interface.
```{r}
# Get details of multiple files by providing their IDs
a$files$bulk_get(files = list("", ""))
```

```
── 1 ──
── File ────────────────────────────────────────────────────────────────────────
• type: file
• parent: 52e0ed0de4b069f418bc13c7
• modified_on: 2022-01-11T11:41:17Z
• created_on: 2016-06-17T16:43:52Z
• project: admin/sbg-public-data
• size: 2780048573 bytes
• name: mouse_mm10_ucsc.fasta
• id: 5772b6dc507c1752674486eb
• href: https://api.sbgenomics.com/v2/files/5772b6dc507c1752674486eb

── 2 ──
── File ────────────────────────────────────────────────────────────────────────
• type: file
• parent: 52e0ed0de4b069f418bc13c7
• modified_on: 2022-01-11T11:41:17Z
• created_on: 2016-06-17T16:42:50Z
• project: admin/sbg-public-data
• size: 3189750467 bytes
• name: human_g1k_v37_decoy.fasta
• id: 5772b6d8507c1752674486e6
• href: https://api.sbgenomics.com/v2/files/5772b6d8507c1752674486e6
```

## Update details of multiple files

The `bulk_update()` method updates the details for multiple specified files. It requires a single argument, `files`, which should be a list of `File` objects.

Use this call to set new information for the files, thus replacing all existing information and erasing omitted parameters. For each of the specified files, the call sets a new name, new tags, and metadata.

When editing fields in the `File` objects you wish to update, keep the following in mind:

- The `name` field should be a string representing the new name of the file.
- The `metadata` field should be a named list of key-value pairs. The keys and values should be strings.
- The `tags` field should be an unnamed list of values.

The maximum number of files you can update the details for per call is 100.
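Because a single call accepts at most 100 files, a longer list has to be split into batches first. The sketch below shows one way to do that in base R; `batch_files()` is a hypothetical helper (not part of `sevenbridges2`) and the IDs are placeholders generated only for illustration.

```r
# Placeholder IDs standing in for real file IDs or File objects
file_ids <- sprintf("file_id_%03d", 1:250)

# Hypothetical helper: split a vector or list into chunks of at most `size`
batch_files <- function(files, size = 100) {
  # Elements 1..size fall into group 1, the next `size` into group 2, etc.
  unname(split(files, ceiling(seq_along(files) / size)))
}

batches <- batch_files(file_ids)
lengths(batches)
# [1] 100 100  50
```

Each element of `batches` could then be passed in turn as the `files` argument of a bulk call, for example inside a `for` loop or `lapply()`.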
```{r}
# Get files
file_obj_1 <- a$files$get(id = "")
file_obj_2 <- a$files$get(id = "")

# Edit file_obj_1 fields
file_obj_1$name <- "new_file_1_name.txt"
file_obj_1$metadata <- list("new_metadata_field" = "123")
file_obj_1$tags <- list("bulk_update_tag")

# Edit file_obj_2 fields
file_obj_2$name <- "new_file_2_name.txt"
file_obj_2$metadata <- list("new_metadata_field" = "123")
file_obj_2$tags <- list("bulk_update_tag")

# Bulk update
a$files$bulk_update(files = list(file_obj_1, file_obj_2))
```

## Edit details of multiple files

The `bulk_edit()` method edits the details for multiple specified files. It requires a single argument, `files`, which should be a list of `File` objects.

Use this call to modify the existing information for the files or add new information while preserving omitted parameters. For each of the specified files, the call edits its name, tags, and metadata.

When editing fields in the `File` objects you wish to update, keep the following in mind:

- The `name` field should be a string representing the new name of the file.
- The `metadata` field should be a named list of key-value pairs. The keys and values should be strings.
- The `tags` field should be an unnamed list of values.

The maximum number of files you can edit the details for per call is 100.

```{r}
# Get files
file_obj_1 <- a$files$get(id = "")
file_obj_2 <- a$files$get(id = "")

# Edit file_obj_1 fields
file_obj_1$name <- "new_file_1_name.txt"
file_obj_1$metadata <- list("new_metadata_field" = "123")
file_obj_1$tags <- list("bulk_edit_tag")

# Edit file_obj_2 fields
file_obj_2$name <- "new_file_2_name.txt"
file_obj_2$metadata <- list("new_metadata_field" = "123")
file_obj_2$tags <- list("bulk_edit_tag")

# Bulk edit
a$files$bulk_edit(files = list(file_obj_1, file_obj_2))
```

## Create a folder within the destination project or parent folder

To create a new folder on the Platform, use the `Auth$files` method `create_folder()`.
It allows you to create a new folder on the Platform within the root folder of a specified destination project or the provided parent folder. Remember that you should provide either the destination project (as the `project` parameter) or the destination folder (as the `parent` parameter), not both.

```{r}
# Option 1 - Using the project parameter

# Option 1.a (providing a Project object as the project parameter)
my_project <- a$projects$get(id = "/api-testing")

demo_folder <- a$files$create_folder(
  name = "my_new_folder",
  project = my_project
)

# Option 1.b (providing a project's ID as the project parameter)
demo_folder <- a$files$create_folder(
  name = "my_new_folder",
  project = "/api-testing"
)
```

Alternatively, you can provide the `parent` parameter to specify the destination where the new folder is going to be created. The `parent` parameter can be either a `File` object (which must be of type `folder`) or the ID of the parent destination folder.

```{r}
# Option 2 - Using the parent parameter

# Option 2.a (providing a File (must be a folder) object as the parent parameter)
my_parent_folder <- a$files$get(id = "")

demo_folder <- a$files$create_folder(
  name = "my_new_folder",
  parent = my_parent_folder
)

# Option 2.b (providing a file's (folder's) ID as the parent parameter)
demo_folder <- a$files$create_folder(
  name = "my_new_folder",
  parent = ""
)
```

## File object operations

Let's now look at all the operations that can be called on `File` objects.

### File print

The `File` object has a regular `print()` method which gives you the most important information about the file:

```{r}
# Get some file
demo_file <- a$files$get(id = "")

# Regular file print
demo_file$print()
```

```
── File ────────────────────────────────────────────────────────────────────────
• type: file
• parent: 61f3f9c6e6aad86675453ff30
• url: NA
• modified_on: 2023-04-15T08:54:32Z
• created_on: 2023-04-11T10:04:50Z
• project: /api-testing
• size: 56 bytes
• name: Drop-seq_small_example.bam
• id: 643530c286c9522d9222213d17
• href: https://api.sbgenomics.com/v2/files/643530c286c9522d9222213d17
```

But if you want to see all the details about a file in a specific format, you can use the `detailed_print()` method:

```{r}
# Pretty print
demo_file$detailed_print()
```

```
── File ────────────────────────────────────────────────────────────────────────
• type: file
• parent: 61f3f9c6e6aad86675453ff30
• url: NA
• modified_on: 2023-04-15T08:54:32Z
• created_on: 2023-04-11T10:04:50Z
• project: /api-testing
• size: 56 bytes
• name: Drop-seq_small_example.bam
• id: 643530c286c9522d9222213d17
• href: https://api.sbgenomics.com/v2/files/643530c286c9522d9222213d17
• tags
  • tag_1: TEST
  • tag_2: SEQ
• metadata
  • reference_genome: GSM1629193_hg19_ERCC
  • investigation: GSM1629193
  • md5_sum: 6294fee8200b29e03d3dc464f9c46a9c
  • sbg_public_files_category: test
• storage
  • type: PLATFORM
  • hosted_on_locations: list("aws:us-east-1", "aws:us-west-2")
```

### Update file details

You can call the `update()` function on the `File` object. With this call, the following can be updated:

- The file's `name`,
- The file's `metadata`,
- The file's `tags`.

Read more details about this method in our [API documentation](https://docs.sevenbridges.com/reference/update-file-details).
```{r}
# Update file name
demo_file$update(name = "")

# Update file metadata
demo_file$update(
  metadata = list("" = "")
)
```

### Add tags to a file

You can tag your files with keywords or strings to make it easier to identify and organize files. Tags are different from metadata and are more convenient and visible from the files list in the visual interface.

You can tag your files using the `add_tag()` method. By default, this method adds the new tag to the list of already existing ones, but you also have the option to set the `overwrite` parameter, which erases the old tags and sets the new one.

```{r}
# Add a new tag to a file
demo_file$add_tag(tags = list("new_tag"))

# Add a new tag to a file and overwrite old ones
demo_file$add_tag(tags = list("new_tag"), overwrite = TRUE)

# Delete all tags - just set tags to NULL
demo_file$add_tag(tags = NULL, overwrite = TRUE)
```

### Copy a single file between projects

This call copies the specified file to a new project. Files retain their metadata when copied, but may be assigned new names in their target project. If you don't specify a new name, the file will retain its old name in the new project.

To make this call, you should have the [copy permission](https://docs.sevenbridges.com/docs/set-permissions) within the project you are copying from.

This call returns the `File` object of the newly copied file.

```{r}
# Copy a file to a new project and set a new name
demo_file$copy_to(
  project = "",
  name = ""
)
```

### Get downloadable URL for a file

To get a URL that you can use to download the specified file, you can use the `get_download_url()` method. This will set the `url` field in the `File` object, which can later be used to download the file.

```{r}
# Get downloadable URL for a file
demo_file$get_download_url()
```

### Get a file's metadata

Files from curated datasets on Seven Bridges environments have a defined set of metadata which is visible in the visual interface of each environment.
The `File` object has the `get_metadata()` method, which returns the metadata values for the specified file. This will pull and reload the file's metadata from the platform.

```{r}
# Get file metadata
demo_file$get_metadata()
```

### Modify file metadata

You can also pass additional metadata for each file, which is stored with your copy of the file in your project.

To modify a file's metadata, use the `set_metadata()` method. Here you can also use the `overwrite` parameter if you want to erase previous metadata fields and add a new one (by default it's set to `FALSE`).

```{r}
# Set file metadata
demo_file$set_metadata(
  metadata_fields = list("" = "metadata_field_value"),
  overwrite = TRUE
)
```

### List folder contents

Directories can have multiple `files`/`subdirectories` inside. You can see them using the `list_contents()` method. Note that this operation works only on `File` objects whose type is `folder`.

The result will also be a `Collection` class object containing a list of `File` objects in the `items` field.

```{r}
# List folder contents
demo_folder$list_contents()
```

### Move a file into a folder

This call moves a file from one folder to another. Moving folders is not allowed by the API, and files can only be moved within the same project.

The `parent` parameter must be a folder ID or a `File` object whose type is `folder`. A file can also be renamed at the destination by setting the `name` argument.

```{r}
# Move a file to a folder
demo_file$move_to_folder(
  parent = "",
  name = "Moved_file.txt"
)
```

### Download a file

The `File` object has a `download()` method, which allows you to download that file to your local computer. You should provide the `directory_path` parameter, which specifies the destination directory to which your file will be downloaded. By default, this parameter is set to your current working directory.

You can also set a new name for the downloaded file by providing the `filename` parameter.
Otherwise, the default name (the one stored in the `name` field of your `File` object) will be used.

```{r}
# Download a file
demo_file$download(directory_path = "/path/to/your/destination/folder")
```

### Get a file's parent directory

Sometimes it's convenient to get the parent folder ID for a file or folder. This information is stored in the `parent` field of the `File` object.

```{r}
# Get a file's parent directory
demo_file$parent
```

```
[1] "5bd7c53ee4b04b8fb1a9f454x"
```

This is essentially the root folder ID. Alternatively, to get the parent folder as an object, use:

```{r}
# Get a folder object
parent_folder <- a$files$get(demo_file$parent)
```

### Delete a file/folder

You can delete files and folders using the `delete()` method directly on the `File` object. Please be aware that a `folder` can only be deleted if it's empty.

```{r}
# Delete a file
demo_file$delete()

# Delete a folder
demo_folder$delete()
```

### Delete multiple files/folders

To delete multiple files in a single API call, use the `bulk_delete()` method. This method accepts either a list of `File` objects or a vector of strings (IDs) representing the files you intend to delete. The method also works with `folders`. However, please note that a `folder` can only be deleted if it is empty.

```{r}
# Delete two files by providing their IDs
a$files$bulk_delete(files = list("", ""))

# Delete two files by providing a list of File objects
file_object_1 <- a$files$get(id = "")
file_object_2 <- a$files$get(id = "")
a$files$bulk_delete(files = list(file_object_1, file_object_2))
```

### Reload a file

To keep your local `File` object up to date with the file on the platform, you can always call the `reload()` function:

```{r}
# Reload file/folder objects
demo_file$reload()
demo_folder$reload()
```

# Apps

Following the same logic as with other `Resource` classes, all app-related operations are grouped under the `Apps` class, which can be accessed within `Auth` objects on the `Auth$apps` path.
From here you can call operations to list all apps, fetch a single app by its ID, copy an app, or create a new app.

When you operate on a single app, it is represented as an object of the `App` class. The `App` object contains almost all app information and additional methods that can be executed directly on the object, such as getting or creating new app revisions, copying, syncing with the latest revision, or creating tasks with this app.

We say almost all information because not all fields are returned by default for apps: the raw CWL field is excluded because of its size and its impact on execution speed. Therefore, if you wish to fetch the raw CWL of an app, there is a separate method for this purpose that you can call on the `App` object (`get_raw_cwl()`).

## List apps

You can list all apps available to you by calling the `apps$query()` method from the authentication object. The method has several parameters that allow you to search for apps in various places and by specified search terms.

Note that you can see all of the publicly available apps on the Seven Bridges Platform by setting the `visibility` parameter to `public`. If you omit this parameter, the default value `private` is used and you will see all your private apps, i.e. those in projects that you can access. Learn more about public apps in our documentation.

```{r}
# Query public apps - set visibility parameter to "public"
a$apps$query(visibility = "public", limit = 10)
```

The same can be done for private apps. The following call will return all the apps available to you, i.e. all the apps that you have in your projects:

```{r}
# Query private apps
my_apps <- a$apps$query()
```

Keep in mind that not all of the available apps are returned at once, because the `limit` parameter is set to 50 by default. Since the result is a `Collection` object, you can navigate through results by calling `next_page()` and `prev_page()`, or call `all()` to return all results.
```{r}
# Load next 50 apps
my_apps$next_page()
```

Alternatively, you can query all the apps in a specific project by providing the project of interest using the `project` parameter. You can either use the `Project` object, or a project ID (string).

```{r}
# Query apps within your project - set limit to 10
a$apps$query(project = "", limit = 10)
```

You can also use one or more search terms via the `query_terms` parameter to query all apps that are available to you. Search terms should relate to the following app details:

* name
* label
* toolkit
* toolkit version
* category
* tagline
* description

For example, to get public apps that contain the term **"VCFtools"** anywhere in the app details, you can make a call similar to this one:

```{r}
# List public apps containing the term "VCFtools" in app's details
a$apps$query(visibility = "public", query_terms = list("VCFtools"), limit = 10)
```

For the query to return results, each term must match at least one of the fields that describe an app. For example, the first term can match the app's name while the second one can match the app description. However, if any part of the search fails to match app details, the call will return an empty list.

Another useful option is to query apps by ID. You can do so either for public apps, or for private apps (apps available to you). The following example illustrates how this can be done for public apps:

```{r}
# Query a public app by its ID
a$apps$query(
  visibility = "public",
  id = "admin/sbg-public-data/vcftools-convert"
)
```

### List project apps

All available apps in a specific project can also be listed by calling the `list_apps()` method directly on the `Project` object. This method has the `project` and `visibility` arguments predefined, while all other parameters are identical to those presented in the `apps$query()` function.

```{r}
# Get project
p <- a$projects$get("/api-testing")

# List apps in the specified project
p$list_apps(limit = 10)
```

## Get app information

If you need information about a specific app, you can get it using the `apps$get()` method. Keep in mind that the app should be in a project that you can access. This could be an app that has been uploaded to the Seven Bridges Platform by a project member, or a publicly available app.

You should provide the `id` of the app of interest, and optionally its `revision`. If no revision is specified, the latest one will be used.

```{r}
# Get a public App object
bcftools_app <- a$apps$get(id = "admin/sbg-public-data/bcftools-call-1-15-1")
```

## Copy an app

To copy an app to a specified destination project, you can use the `apps$copy()` method. Keep in mind that the app should be in a project that you can access. This could be an app that has been uploaded to the Seven Bridges Platform by a project member, or a publicly available app.

The destination project (`project` parameter) should be provided either as an object of the `Project` class, or as an ID of the target project of interest. To give the app a new name in the target project, use the `name` parameter. If the app's name should not change, omit the `name` parameter.

Keep in mind that there are different strategies for copying apps on the platform:

* `clone`: copy all revisions; get updates from the same app as the copied app (default)
* `direct`: copy latest revision; get updates from the copied app
* `clone_direct`: copy all revisions; get updates from the copied app
* `transient`: copy latest revision; get updates from the same app as the copied app

Learn more about copy strategies in our public [API documentation](https://docs.sevenbridges.com/reference/copy-an-app).

The following example demonstrates how you can copy the previously fetched `bcftools_app` to a project:

```{r}
# Copy an app to a project
app_copy <- a$apps$copy(bcftools_app,
  project = "",
  name = "New_app_name"
)
```

## Create new app

The `apps$create()` method allows you to add an app using raw CWL. The raw CWL can be provided either through the `raw` parameter, or by using the `from_path` parameter. Keep in mind that these two parameters should not be used together.

If you choose to use the `raw` parameter, make sure to provide a list containing raw CWL for the app you are about to create. To generate such a list, you might want to load an existing `JSON` / `YAML` file. If your CWL file is in JSON format, please use the `read_json` function from the `jsonlite` package to minimize potential problems with parsing the JSON file. If you want to load a CWL file in YAML format, it is highly recommended to use the `read_yaml` function from the `yaml` package.

Make sure to set the `raw_format` parameter to match the type of the provided raw CWL file (`JSON` / `YAML`). By default, this parameter is set to `JSON`.

```{r}
# Load the JSON file
file_json <- jsonlite::read_json("/path/to/your/raw_cwl_in_json_format.cwl")

# Create app from raw CWL (JSON)
new_app_json <- a$apps$create(
  project = "",
  raw = file_json,
  name = "New_app_json",
  raw_format = "JSON"
)
```

If you opt for the `from_path` parameter instead, you should provide a path to a file containing the raw CWL for the app (`JSON` or `YAML`).

```{r}
# Create an app from raw CWL (YAML)
new_app_yaml <- a$apps$create(
  project = "",
  from_path = "/path/to/your/raw_cwl_in_yaml_format.cwl",
  name = "New_app_yaml",
  raw_format = "YAML"
)
```

### Create an app in a project

The app can also be created directly on a `Project` object by invoking `create_app()`. Except for the predefined `project` parameter, `create_app()` takes the same parameters as `apps$create()`.

```{r}
# Load the JSON file
file_json <- jsonlite::read_json("/path/to/your/raw_cwl_in_json_format.cwl")

# Get project
p <- a$projects$get("/api-testing")

# Create app from raw CWL (JSON) in specified project
p$create_app(
  raw = file_json,
  name = "New_app_json",
  raw_format = "JSON"
)
```

## App object operations

Once you've fetched the `App` object, you'll see that it also has various useful methods of its own. The following actions are available for an `App` object:

* print
* input_matrix
* output_matrix
* get_revision
* create_revision
* copy
* sync
* create_task
* reload

### Print an app

The `print` method prints the app details to the console.

```{r}
# Fetch the first app from project's apps
p <- a$projects$get("/api-testing")
my_apps <- p$list_apps()
my_new_app <- my_apps$items[[1]]

# Print app's details
my_new_app$print()
```

```
── App ──────────────────────────────────────────────────────────────────────────────────────────────────────
• revision: 0
• name: BCFtools Call
• project: /api-testing
• id: /api-testing/new_app_json
• href: https://api.sbgenomics.com/v2/apps//api-testing/new_app_json/0
```

### Get an app's raw CWL

If the app's `raw` field is empty, call the `reload()` method to fetch the app's raw CWL.

### Preview app's inputs and expected outputs

Most tasks need some inputs to be defined, as required by the app. Information about which inputs are required and which are optional is stored in the app's CWL. To save you from reading the CWL directly, the `App` object provides a utility function, `input_matrix()`, that parses this information and returns the app's input matrix. This way, you will know how to construct the list of inputs (how to name them and which type each one expects) when creating the task.

**NOTE** that the `id` field in the data frame is the name you should use when specifying task inputs.

```{r}
# Get app's inputs details
my_new_app$input_matrix()
```

```
id            label                   required  type
in_variants   Input Mpileup VCF file  TRUE      File
regions_file  Regions from file       FALSE     File?
output_name   Output file name        FALSE     string?
output_type   Output type             FALSE     enum
regions       Regions for processing  FALSE     string[]?
...
```

Besides the id and label describing the input, you can see whether the input is required and which type is expected. For most inputs, a '?' in the `type` field means that the input is optional.

There is another utility operation on the `App` object to list expected outputs of an app or task. This information can be retrieved by calling the `output_matrix()` method:

```{r}
# Get app's outputs details
my_new_app$output_matrix()
```

```
  id                     label                type
1 summary_metrics        Summary Metrics      File
2 out_filtered_variants  Output filtered VCF  File?
3 html_report            HTML report          File?
...
```

### Get an app revision

To obtain a particular revision of an app, use the `get_revision()` method and set the `revision` parameter to the number of the revision you want to get.

Keep in mind that there is another important parameter that can be set for this method. If the `in_place` parameter is set to `TRUE`, the current app object will be replaced with the one for the specified app revision. By default, this parameter is set to `FALSE`.

```{r}
# Get an app and print its details
my_app <- a$apps$get(id = "/api-testing/new_app_json/0")
my_app$print()
```

```
── App ──────────────────────────────────────────────────────────────────────────────────────────────────────
• latest_revision: 1
• copy_of: admin/sbg-public-data/bcftools-call-1-15-1/0
• revision: 0
• name: BCFtools Call
• project: /api-testing
• id: /api-testing/new_app_json
• href: https://api.sbgenomics.com/v2/apps//api-testing/new_app_json/0
```

```{r}
# Get an app revision
my_app$get_revision(revision = 1)
```

```{r}
# Get an app revision and update the object
my_app$get_revision(revision = 1, in_place = TRUE)
```

```
── App ──────────────────────────────────────────────────────────────────────────────────────────────────────
• latest_revision: 1
• copy_of: admin/sbg-public-data/bcftools-call-1-15-1/0
• revision: 1
• name: BCFtools Call
• project: /api-testing
• id: /api-testing/new_app_json
• href: https://api.sbgenomics.com/v2/apps//api-testing/new_app_json/1
```

### Create an app revision

The `create_revision()` method allows you to create a new revision for an existing app. The raw CWL can be provided either through the `raw` parameter, or by using the `from_path` parameter. Keep in mind that these two parameters should not be used together.

If you choose to use the `raw` parameter, make sure to provide a list containing raw CWL for the app revision you are about to create. To generate such a list, you might want to load an existing `JSON` / `YAML` file. If your CWL file is in JSON format, please use the `read_json` function from the `jsonlite` package to minimize potential problems with parsing the JSON file. If you want to load a CWL file in YAML format, it is highly recommended to use the `read_yaml` function from the `yaml` package.

Make sure to set the `raw_format` parameter to match the type of the provided raw CWL file (`JSON` / `YAML`). By default, this parameter is set to `JSON`.
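
As a rough sketch of the YAML route described above, the snippet below writes a tiny placeholder CWL document to a temporary file, loads it with `yaml::read_yaml()`, and passes the resulting list as `raw` with `raw_format = "YAML"`. The CWL content and the temporary file are made up for illustration, and `my_app` is assumed to be an `App` object fetched earlier; the API call is skipped when no such object exists.

```{r}
library(yaml)

# Placeholder CWL document, written to a temporary file for illustration only
cwl_path <- tempfile(fileext = ".cwl")
writeLines(c(
  "cwlVersion: v1.2",
  "class: CommandLineTool",
  "baseCommand: echo",
  "inputs: {}",
  "outputs: {}"
), cwl_path)

# Load the YAML file into an R list
raw_cwl_yaml <- yaml::read_yaml(cwl_path)

# Assumes `my_app` is an App object fetched earlier; skipped otherwise
if (exists("my_app")) {
  my_app$create_revision(
    raw = raw_cwl_yaml,
    raw_format = "YAML",
    in_place = TRUE
  )
}
```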

Setting the `in_place` parameter to `TRUE` will overwrite the current app object with the new app revision information.

```{r}
# Create an app revision from raw CWL loaded from a JSON file
raw_cwl_as_list <- jsonlite::read_json(
  path = "/path/to/your/raw_cwl_in_json_format.cwl"
)
my_app$create_revision(raw = raw_cwl_as_list, in_place = TRUE)
```

If you opt for the `from_path` parameter instead, you should provide a path to a file containing the raw CWL for the app (`JSON` or `YAML`).

```{r}
# Create a new revision for an existing app from a file path
my_app$create_revision(
  from_path = "/path/to/your/raw_cwl_in_json_format.cwl",
  in_place = TRUE
)
```

### Copy an app

An app can also be copied to a specified destination project directly from the `App` object, by calling its own `copy()` method.

The destination project (`project` parameter) should be provided either as an object of the `Project` class, or as an ID of the target project of interest. You can set the new name that the app will have in the target project with the `name` parameter.

Keep in mind that there are different strategies for copying apps on the platform:

* `clone`: copy all revisions; get updates from the same app as the copied app (default)
* `direct`: copy latest revision; get updates from the copied app
* `clone_direct`: copy all revisions; get updates from the copied app
* `transient`: copy latest revision; get updates from the same app as the copied app

Learn more about copy strategies in our public [API documentation](https://docs.sevenbridges.com/reference/copy-an-app).

```{r}
# Copy app
copied_app <- my_app$copy(
  project = "",
  name = "New_app_name"
)
```

### Sync a copied app

To synchronize a copied app with the source app from which it has been copied, so that it uses the latest revision, you can call the `sync()` method. The `App` object will be overwritten with the latest app.

```{r}
# Sync a copied app to the latest revision created
copied_app$sync()
```

### Reload an app

To keep your local `App` object up to date with the app on the platform, you can always call the `reload()` function:

```{r}
# Reload an app object
my_app$reload()
```

# Tasks

All task related operations are grouped under the `Tasks` class within the authentication object. This class also inherits the `Resource` class and implements `query()`, `get()` and `delete()` operations for listing tasks, fetching a single task and deleting tasks. Besides these, users are able to create new tasks with the `create()` operation from this `Auth$tasks` path. Furthermore, users can retrieve details for multiple tasks with a single API call using the `bulk_get()` method, also available from `Auth$tasks`.

When you operate with a single task, it is represented as an object of the `Task` class. The `Task` object contains all task information and additional methods that can be executed directly on the object, such as running, aborting, cloning, updating, or deleting the task.

## List tasks

As mentioned above, you can list your tasks by calling the `tasks$query()` method from the authentication object. The method has many additional query parameters that allow you to search for tasks by specific criteria such as: `status`, `parent`, `project`, `created_from`, `created_to`, `started_from`, `started_to`, `ended_from`, `ended_to`, `order_by`, `order`, `origin_id`.

Let's list all tasks, and then only those that were completed:

```{r}
# Query all tasks
a$tasks$query()

# Query tasks by their status
a$tasks$query(status = "COMPLETED", limit = 5)
```

To list all the tasks in a project, use the following.

```{r}
# Fetch the project and pass it in the project parameter
p <- a$projects$get(id = "")
a$tasks$query(project = p)

# Alternatively you can list all tasks directly from the Project object
p$list_tasks()
```

Similar to previous query methods, here you will also get a `Collection` object where the resulting tasks are stored in the `items` field, and you can use pagination to navigate through results.

## Get single task information

To retrieve information about a single task of interest, use the `tasks$get()` method with the task id as a parameter.

```{r}
# Get specific task by ID
a$tasks$get(id = "")
```

## Get details of multiple tasks

To retrieve details of multiple tasks in a single API call, use the `tasks$bulk_get()` method. This method accepts a single argument, `tasks`, which can be either a list of `Task` objects or a list of strings representing task IDs. A task ID can be extracted from the URL in the Platform's visual interface.

```{r}
# Get details of multiple tasks by providing their IDs
a$tasks$bulk_get(tasks = list("", "task_2_id"))

# Get details of multiple tasks by providing Task objects
task_obj_1 <- a$tasks$get("")
task_obj_2 <- a$tasks$get("")
a$tasks$bulk_get(tasks = list(task_obj_1, task_obj_2))
```

## Create a draft task

To create a new draft task, you can use the `tasks$create()` method. The method accepts various arguments, such as the project in which to create the task, the app and revision to use, the task name and description, the inputs it requires, batching options, and execution settings.

However, a draft task can be created by defining only the project and the app that will be run, since all other parameters are optional:

```{r}
# Create a draft task
draft_task <- a$tasks$create(
  project = "",
  app = ""
)
```

This will create an empty task, without any parameter defined.
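
Before filling in `inputs` for a draft task, it can help to check which input ids the app requires. The sketch below works on a hand-written data frame that mimics the `input_matrix()` output shown earlier (the `id`, `label`, `required` and `type` columns are taken from that printed example); with a real app you would use the result of `input_matrix()` instead. The planned input value is hypothetical.

```{r}
# Stand-in for the result of input_matrix(); values copied from the example output
input_matrix <- data.frame(
  id = c("in_variants", "regions_file", "output_name"),
  label = c("Input Mpileup VCF file", "Regions from file", "Output file name"),
  required = c(TRUE, FALSE, FALSE),
  type = c("File", "File?", "string?"),
  stringsAsFactors = FALSE
)

# Ids of inputs that must be supplied when creating the task
required_ids <- input_matrix$id[input_matrix$required]

# Inputs you plan to pass to tasks$create(); names must match the ids
planned_inputs <- list(output_name = "calls.vcf")

# Required inputs that are still missing from the planned list
missing_ids <- setdiff(required_ids, names(planned_inputs))
print(missing_ids)
```

An empty `missing_ids` vector would suggest that every required input has a value in the planned list.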

You can set execution settings with the `execution_settings` parameter, and control the use of interruptible instances with the `use_interruptible_instances` parameter.

```{r}
# Create task with execution settings and without interruptible instances
execution_settings <- list(
  "instance_type" = "c4.2xlarge;ebs-gp2;2000",
  "max_parallel_instances" = 2,
  "use_memoization" = TRUE,
  "use_elastic_disk" = FALSE
)

task_exec_settings <- a$tasks$create(
  project = "",
  app = "",
  execution_settings = execution_settings,
  use_interruptible_instances = FALSE
)
```

To run the task immediately after it is created, use the `action` parameter. When set to `run`, it starts the analysis as soon as the task is created.

```{r}
# Create and run task
task_exec_settings <- a$tasks$create(
  project = "",
  app = "",
  inputs = "",
  action = "run"
)
```

## Create a batch task

To run tasks in batch mode, use the `batch`, `batch_input` and `batch_by` parameters. The `batch` parameter defines whether to run a batch task or not, while `batch_input` and `batch_by` define the input by which the task will be batched and the batching criteria, respectively.

The example below shows how to create a batch task for an input named `reads`, with the batch criteria set to the `sample_id` metadata field:

```{r}
# Create a batch task
batch_task <- a$tasks$create(
  project = "",
  app = "",
  inputs = list(
    "reads" = "",
    "reference" = ""
  ),
  batch = TRUE,
  batch_input = "reads",
  batch_by = list(
    type = "CRITERIA",
    criteria = list("metadata.sample_id")
  )
)
```

## Task operations

Once you've fetched the `Task` object, you can execute various operations directly on it.

### Print task

To print all task details, call the `print()` method directly on the `Task` object:

```{r}
# Print task details
draft_task$print()
```

```
── Task ─────────────────────────────────────────────────────────────────────
• batch: FALSE
• end_time: 2023-11-22T16:58:16Z
• start_time: 2023-11-22T16:51:35Z
• executed_by:
• created_by:
• app: /api-testing/rna-seq-alignment-star/0
• project: /api-testing
• description: STAR test 2
• status: COMPLETED
• name: Star-alignment-task
• id: 66f7a639-85fb-4594-aa93-d435ra37fb1b
• href: https://api.sbgenomics.com/v2/tasks/66f7a639-85fb-4594-aa93-d435ra37fb1b
```

### Run a task

To actually start the execution of a created draft task, use the task object's `run()` function. You can modify the values of the following parameters:

* `in_place` - set to `FALSE` if you wish to store the response in a new task object.
* `batch` - used for tasks that are already batch tasks; this option allows you to switch the batch mode off.
* `use_interruptible_instances` - can be `TRUE` or `FALSE`. Set it to `TRUE` to allow the use of spot instances.

Only tasks with a `DRAFT` status may be run.

```{r}
# Run a task
draft_task$run(in_place = TRUE)
```

### Abort a task

Users can abort the task execution by calling the `abort()` function. It immediately stops the execution and puts the task into the `ABORTED` status. Only tasks whose status is `RUNNING` may be aborted.

```{r}
# Abort a task
draft_task$abort()
```

### Clone a task

To copy a task, the user can clone it. Once cloned, the task can either stay in `DRAFT` mode or be run immediately, by setting the `run` parameter to `TRUE`.

```{r}
# Clone a task
cloned_task <- draft_task$clone_task()
```

### Get execution details

If users would like to explore or debug the logs of task execution, they can use the `get_execution_details()` function. It returns execution details of the specified task and breaks down the information into the task's distinct jobs.
A job is a single subprocess carried out in a task. The information returned by this call is broadly similar to the one that can be found in the task stats and logs provided on the Platform.

Task execution details include the following information:

* The name of the command line job that executed
* The start time of the job
* End time of the job (if it completed)
* The status of the job (`DONE`, `FAILED`, or `RUNNING`)
* Information on the computational instance that the job was run on, including the provider ID, the type of instance used and the cloud service provider
* A link that can be used to download the standard error logs for the job
* SHA hash of the Docker image ('checksum').

```{r}
# Get execution details of the task
details <- draft_task$get_execution_details()
```

### List batch children

This operation retrieves child tasks for a batch task. It works just like the `tasks$query()` function, so you can set query parameters such as `status`, `created_from`, `created_to`, `started_from`, `started_to`, `ended_from`, `ended_to`, `origin`, and `order` to narrow down the search.

```{r}
# List batch children
children_tasks <- batch_task$list_batch_children()
```

### Update task

Users can use the `update()` method to change the details of the specified task, including its name, description, and inputs. Note that you can only modify tasks with a task status of `DRAFT`. Tasks which are `RUNNING`, `QUEUED`, `ABORTED`, `COMPLETED` or `FAILED` cannot be modified in order to enable the reproducibility of analyses.

There are two things to note if you are editing a batch task:

* If you want to change the input on which to batch and the batch criteria, you need to specify the `batch_input` and `batch_by` parameters together in the same function call.
* If you want to disable batching on a task, set `batch` to `FALSE`. Alternatively, you can set the parameters `batch_input` and `batch_by` to `NULL`.

```{r}
# Update task
draft_task$update(
  description = "New description",
  batch_by = list(
    type = "CRITERIA",
    criteria = list("metadata.diagnosis")
  ),
  inputs = list("in_reads" = "")
)
```

### Rerun a task

Users can also rerun a task, which clones the original task and starts the execution immediately.

```{r}
# Rerun task
draft_task$rerun()
```

### Reload task

To refresh the `Task` object and get up-to-date information about its status, you can always call the `reload()` function:

```{r}
# Reload task object
draft_task$reload()
```

### Delete task

Lastly, the task can be deleted using the `delete()` method directly on the `Task` object:

```{r}
# Delete task
draft_task$delete()
```
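
The `run()` and `reload()` operations described above can be combined to wait for a task to finish. The helper below is a hypothetical sketch and not part of the package: it assumes that the `Task` object exposes a `status` field (as in the printed task details) and that `reload()` refreshes it. A small mock object stands in for a real task so the snippet runs without API access.

```{r}
# Hypothetical helper: poll a task until it is no longer queued or running
wait_for_task <- function(task, interval = 30, max_checks = 120) {
  active <- c("QUEUED", "RUNNING")
  for (i in seq_len(max_checks)) {
    task$reload()
    if (!(task$status %in% active)) {
      break
    }
    Sys.sleep(interval)
  }
  task$status
}

# Mock task: reports RUNNING twice, then COMPLETED
mock_task <- new.env()
mock_task$status <- "RUNNING"
mock_task$reloads <- 0
mock_task$reload <- function() {
  mock_task$reloads <- mock_task$reloads + 1
  if (mock_task$reloads >= 3) mock_task$status <- "COMPLETED"
  invisible(mock_task)
}

final_status <- wait_for_task(mock_task, interval = 0, max_checks = 10)
print(final_status)
```

With a real task, you would pass the `Task` object instead of the mock and choose an `interval` that keeps the number of API calls reasonable.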