FireCloud Service Selector (FISS) is a utility module that lets you make API (Application Programming Interface) calls from a notebook to your workspace. Scripting with FISS is much like scripting on a local machine, but with Terra's built-in security and cloud integration. This article covers the basics of scripting on top of the Terra and Google Cloud environments using FISS to access Terra's APIs.
Why use scripting/FISS APIs?
Why would you want to use Terra through an API when you can use Terra’s interface instead? It comes down to flexibility, automation and scalability, and personal preference. You can use FISS commands from a Jupyter notebook or from the command line.
Flexible data access
Interacting directly with back-end Application Programming Interfaces (APIs) gives greater flexibility when manipulating data and setting up data tables. It lets you scale up your analysis by automating how you configure and run workflows. To do this, you can use FISS.
Automation and scalability
Maybe you’re hoping to streamline your analysis while avoiding human errors. Using the APIs that power Terra programmatically means you can automate much of the setup process, which lets you standardize and scale up your work.
There may be times when you want to upload data and run workflows without using Terra's graphical interface. Maybe you are simply more comfortable with scripting than with clicking buttons. Either way, you can script instead.
Accessing data from a workspace table in a notebook is only possible with FISS.
Accessing controlled data in an interactive analysis (for example, a notebook) is only possible with FISS.
Collecting data references from multiple tables into a single table, for example, is easier with FISS. Using the standard interface for this involves three manual steps: downloading the TSV file, editing it by hand, and uploading it back to the workspace.
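As a rough illustration of the last point, the sketch below merges rows that share an ID across several table exports into one load file. It is plain Python and makes no FISS calls. The function names `read_table` and `combine_tables` are hypothetical, and it assumes each input is TSV text whose first column is the row ID and that the output should use the `entity:<table>_id` header that Terra load files use.

```python
import csv
import io


def read_table(tsv_text):
    """Parse a data-table TSV into {row_id: {column: value}}."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    id_column = reader.fieldnames[0]  # e.g. "entity:sample_id"
    return {
        row[id_column]: {k: v for k, v in row.items() if k != id_column}
        for row in reader
    }


def combine_tables(new_table, tsv_texts):
    """Merge rows that share an ID across several tables into one load file."""
    combined = {}
    columns = []
    for text in tsv_texts:
        for row_id, values in read_table(text).items():
            combined.setdefault(row_id, {}).update(values)
            for column in values:
                if column not in columns:
                    columns.append(column)  # keep first-seen column order
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow([f"entity:{new_table}_id"] + columns)
    for row_id in sorted(combined):
        # Rows missing a column get an empty cell.
        writer.writerow([row_id] + [combined[row_id].get(c, "") for c in columns])
    return out.getvalue()
```

The returned string could then be uploaded as a new table, which replaces the download, hand-edit, and re-upload cycle with a single scripted step.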
What is FISS?
- Python client API and command-line interface to FireCloud/Terra API
- Callable from R
Two library levels
Pre-installed in Terra cloud environment
High-level FISS functions from the command line
Use the command-line interface for high-level FISS functions
- List available commands:
- Display command information:
FISS config file available
- Default values for commonly used parameters
- Ex: billing project, workspace, etc.
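To show how such defaults might be consumed from a script, here is a minimal sketch that uses Python's standard `configparser`. The INI layout, the `[DEFAULT]` section, and the key names `project` and `workspace` are assumptions made for illustration, so check the FISS documentation for the actual file name and keys. The helpers `load_defaults` and `resolve` are hypothetical.

```python
import configparser

# Hypothetical file contents: the section and key names are assumptions.
EXAMPLE_CONFIG = """\
[DEFAULT]
project = my-billing-project
workspace = my-workspace
"""


def load_defaults(config_text):
    """Return the default values from INI-style config text as a dict."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    return dict(parser.defaults())


def resolve(value, key, defaults):
    """Prefer an explicit argument and fall back to the configured default."""
    if value is not None:
        return value
    if key not in defaults:
        raise KeyError(f"no value given for {key!r} and no default configured")
    return defaults[key]
```

With this pattern a script only needs to name the billing project or workspace when it differs from the usual one.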
Useful subcommands for working with data tables
Copy entities from one workspace to another:
Delete specific entity(s)/row(s) in a data table:
List all the entities/rows by name and id in a workspace:
Return data table entities/rows in TSV format (limited scalability):
Return the names of the entity types/tables in a workspace:
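The same kinds of lookups can also be made from Python rather than the command line. The sketch below assumes the `firecloud.api` functions `list_entity_types(namespace, workspace)` and `get_entities(namespace, workspace, etype)`, each returning an HTTP response object. The helper names `table_sizes` and `row_names` are hypothetical, and the API module is passed in as `api` so the helpers can be exercised without credentials.

```python
def _checked(response, expected=200):
    """Raise if the API response does not carry the expected status code."""
    if response.status_code != expected:
        raise RuntimeError(f"API call failed ({response.status_code}): {response.text}")
    return response


def table_sizes(api, namespace, workspace):
    """Return {table_name: row_count} for every data table in a workspace."""
    response = _checked(api.list_entity_types(namespace, workspace))
    return {name: info["count"] for name, info in response.json().items()}


def row_names(api, namespace, workspace, table):
    """Return the row names (entity IDs) in one data table."""
    response = _checked(api.get_entities(namespace, workspace, table))
    return [entity["name"] for entity in response.json()]
```

In a Terra notebook, `api` would typically be the module imported with `from firecloud import api as fapi`, for example `table_sizes(fapi, "my-billing-project", "my-workspace")`, where both names are placeholders.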
Use case: Managing data with the FISS API - beyond the UI
Managing a small collection of samples by manually creating a workspace table to track them and their metadata is fairly straightforward. You can do it with a spreadsheet and the "Upload TSV" feature of the Terra UI. However, large projects can produce data on hundreds, thousands, or even hundreds of thousands of samples. Uploading this data to a Google bucket, or referencing the files in your workspace data model, takes significant time, and creating a workspace, uploading data, and tracking potentially hundreds of fields of extra metadata by hand is infeasible.
Instead, when projects grow, the best approach is to script data management - both uploading data to your workspace bucket and creating a workspace data table to track the file locations along with all their extra metadata. This allows you to deal with workspaces that contain thousands of samples (or many more) while minimizing errors and saving time.
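A small sketch of the table-building half of that approach: given a list of bucket paths, derive one row per sample and emit a load file. The file-naming convention (sample ID before the first dot), the helper names `sample_rows` and `to_load_tsv`, and the bucket name in the usage example are all assumptions for illustration.

```python
import posixpath


def sample_rows(gs_paths):
    """Group bucket paths by sample ID (the file name up to the first dot)."""
    rows = {}
    for path in gs_paths:
        filename = posixpath.basename(path)
        sample_id, _, extension = filename.partition(".")
        # "cram.crai" becomes the column name "cram_crai".
        column = extension.replace(".", "_") or "file"
        rows.setdefault(sample_id, {})[column] = path
    return rows


def to_load_tsv(table, rows):
    """Render {row_id: {column: value}} as a TSV load file for one table."""
    columns = sorted({column for values in rows.values() for column in values})
    lines = ["\t".join([f"entity:{table}_id"] + columns)]
    for row_id in sorted(rows):
        cells = [rows[row_id].get(column, "") for column in columns]
        lines.append("\t".join([row_id] + cells))
    return "\n".join(lines) + "\n"
```

For example, `to_load_tsv("sample", sample_rows(["gs://my-bucket/NA12878.cram", "gs://my-bucket/NA12878.cram.crai"]))` yields a two-line file with `cram` and `cram_crai` columns. Extra metadata columns could be merged into `rows` before rendering.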
It's important to remember that you are ultimately responsible for abiding by the data use agreements for data you are authorized to use - whether you are using FISS or the UI to manipulate that data. You should never copy controlled-access data to a workspace where it can be accessed by someone not authorized to access it.
Use case: Automating workflows with the FISS API
Manually setting up and running five workflows is trivial, but if you had to run the same workflow on 100,000 whole genome samples, manual setup would be impractical. Scripting the process makes the task tractable and saves you time and effort, especially when you want to repeat the same analysis over and over. You can use this approach to build command-line scripts written in Python, or even Jupyter Notebooks, that help with your day-to-day work.
With the FISS software library you can automate setting up a workspace, moving and managing data, and performing an analysis, all through Python scripts that you run on your computer or in a Jupyter Notebook. The library lets you perform many of the same actions you can do in the graphical Terra interface, and working through scripts makes those actions easier to automate and scale.
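As one possible shape for such a script, the sketch below launches one submission per row of a data table. It assumes the `firecloud.api` function `create_submission(wnamespace, workspace, cnamespace, config, entity=..., etype=...)`, a `201` status on success, and a `submissionId` field in the response body. The helper name `launch_per_entity` is hypothetical, and the API module is passed in as `api`.

```python
def launch_per_entity(api, namespace, workspace, config_namespace, config_name,
                      entity_type, entity_names):
    """Start one submission per entity.

    Returns two dicts: {entity_name: submission_id} for launches that
    succeeded and {entity_name: status_code} for those that did not.
    """
    launched = {}
    failed = {}
    for name in entity_names:
        response = api.create_submission(
            namespace, workspace, config_namespace, config_name,
            entity=name, etype=entity_type,
        )
        if response.status_code == 201:
            launched[name] = response.json()["submissionId"]
        else:
            # Record the failure and keep going so one bad row
            # does not stop the rest of the batch.
            failed[name] = response.status_code
    return launched, failed
```

For very large cohorts, grouping rows into a set and launching a single submission on that set is likely a better fit than one submission per row. The loop above is only meant to show the pattern.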
FISS API tutorial workspace
Click here to go to the FISS tutorial workspace.
Want to know when the tutorial will be updated? Click the "Follow" button at the top right of the article.
The tutorial workspace will include use cases with sample code for times when you might want to go beyond Terra's graphical interface. Notebooks within the workspace will highlight different examples of when you might find scripting particularly helpful.
Notebook #1: Create a workspace and add data in a Terra workspace with FISS
A Python script in a notebook will help you:
- Create the workspace
- Upload data to your workspace Google bucket
- Upload data pointers and metadata to the workspace
- Use the data model in a Python script or notebook
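The first and third steps above might look roughly like this in a script. The sketch assumes the `firecloud.api` functions `create_workspace(namespace, name)` and `upload_entities(namespace, workspace, entity_data, model=...)`, where `entity_data` is the TSV text of a load file. The helper name `create_and_populate` is hypothetical. Copying the files themselves into the workspace bucket is a separate step that is not shown here.

```python
def _require_success(response, action):
    """Raise unless the response has a 2xx status code."""
    if not 200 <= response.status_code < 300:
        raise RuntimeError(f"{action} failed with status {response.status_code}")


def create_and_populate(api, namespace, workspace, load_tsv, model="flexible"):
    """Create a workspace, then upload one data table from TSV text."""
    created = api.create_workspace(namespace, workspace)
    _require_success(created, "create_workspace")
    uploaded = api.upload_entities(namespace, workspace, load_tsv, model=model)
    _require_success(uploaded, "upload_entities")
    return uploaded.status_code
```

In a Terra notebook, `api` would typically be `fapi` from `from firecloud import api as fapi`. The `model` argument is exposed because the flexible model is what allows table names other than the built-in ones.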
Notebook #2: Automating and scaling workflows with the FISS API
A Python script in a notebook will help you:
- Import and configure a workflow
- Launch and monitor the workflow analysis
- Debug your analysis
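For the monitoring step, a polling loop is one simple option. The sketch below assumes the `firecloud.api` function `get_submission(namespace, workspace, submission_id)`, a response body with a top-level `status` and a `workflows` list whose items each carry a `status`, and that `Done` and `Aborted` are the terminal submission states. The helper name `wait_for_submission` is hypothetical.

```python
import time
from collections import Counter

# Assumed terminal states for a submission.
FINISHED = {"Done", "Aborted"}


def wait_for_submission(api, namespace, workspace, submission_id,
                        poll_seconds=60, sleep=time.sleep):
    """Poll a submission until it finishes and tally its workflow statuses."""
    while True:
        response = api.get_submission(namespace, workspace, submission_id)
        if response.status_code != 200:
            raise RuntimeError(f"get_submission failed ({response.status_code})")
        submission = response.json()
        if submission["status"] in FINISHED:
            # For example Counter({"Succeeded": 98, "Failed": 2}).
            return Counter(w["status"] for w in submission.get("workflows", []))
        sleep(poll_seconds)
```

A non-zero count of failed workflows in the result is a natural starting point for debugging, since it tells you which submission to inspect more closely.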
Resources: Using the FISS library
Using the `model='flexible'` parameter is required for Gen3 data in Terra! Otherwise you are constrained to `sample` and `participant` data tables.