How to manage data with the FISS API

FireCloud Service Selector (FISS) is a utility command module that allows API (Application Programming Interface) calls from the notebook to the workspace. Scripting with FISS is much like running on a local machine, but with Terra's built-in security and cloud integration. This article covers the basics of scripting on top of the Terra and Google Cloud environments using FISS to access Terra's APIs.

Why use scripting/APIs?

Why would you want to use Terra through an API when you can work directly in Terra? It comes down to flexibility, automation and scalability, and personal preference. You can use FISS commands from a Jupyter Notebook or from the command line.

Flexible data access

Interacting directly with back-end Application Programming Interfaces (APIs) gives greater flexibility when manipulating data and setting up data tables. It lets you scale up your analysis by automating how you configure and run workflows.

Automation and scalability

Maybe you hope to streamline your analysis while avoiding human errors. Using the APIs that power Terra programmatically means you can automate much of the setup process, which lets you standardize and scale up your work.

Personal preference

Sometimes you may want to upload data and run workflows without using Terra's graphical interface. Maybe you are more comfortable with scripting than clicking buttons. If you just prefer scripting, you can do that as well.

Tasks that are easier - or even only possible - using FISS

Accessing data from a workspace table in a notebook is only possible with FISS.

Accessing controlled data in an interactive analysis (i.e., notebook) is only possible with FISS.

Collecting data references from multiple tables into a single table is easier with FISS. Using the standard interface for this involves three manual steps: downloading the TSV file, editing it by hand, and uploading it back to the workspace.
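As a rough illustration of the scripted alternative, the Python sketch below merges rows from several tables into a single load-file TSV. The `combine_tables` function, the table names, and the `gs://` paths are made up for this example. The record shape (`name` plus `attributes`) is an assumption about how entity records come back from the API, so check it against the Swagger documentation before relying on it.

```python
import csv
import io


def combine_tables(tables, target_type):
    """Merge entity records from several tables into one load-file TSV string.

    `tables` maps a source table name to a list of records shaped like
    {"name": <row id>, "attributes": {<column>: <value>}} (assumed shape).
    Rows that share an id across tables are merged into a single row.
    """
    merged = {}   # row id -> combined attributes
    columns = []  # column names in first-seen order
    for records in tables.values():
        for record in records:
            row = merged.setdefault(record["name"], {})
            for column, value in record["attributes"].items():
                row[column] = value
                if column not in columns:
                    columns.append(column)

    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    # Load files identify the target table in the first header cell.
    writer.writerow([f"entity:{target_type}_id"] + columns)
    for row_id in sorted(merged):
        writer.writerow([row_id] + [merged[row_id].get(c, "") for c in columns])
    return buffer.getvalue()


# Hypothetical input: two tables that both describe sample "s1".
tables = {
    "sample": [{"name": "s1", "attributes": {"bam": "gs://bucket/s1.bam"}}],
    "qc": [{"name": "s1", "attributes": {"qc_report": "gs://bucket/s1.html"}}],
}
tsv = combine_tables(tables, "combined")
print(tsv)
```

You could write the resulting text to a file and load it with the "Upload TSV" feature or the `entity_import` subcommand.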

What is FISS?

FISS stands for (FI)reCloud (S)ervice (S)elector. It is a Python-based API and command-line interface to the FireCloud/Terra APIs. It is also callable from R.

Two library levels

Low-level API: corresponds closely to the FireCloud/Terra API
(calling the low-level/API layer is easy from Python/R using standard function call syntax)
Higher-level: provides additional support for chunking/batching for scalability, etc.
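A minimal sketch of a low-level call from Python might look like the following. It assumes that the `firecloud.api` module exposes `list_entity_types(namespace, workspace)`, that the call returns a `requests`-style response, and that the JSON body is keyed by table name with a `count` field. The `table_summary` helper and the project and workspace names are hypothetical. Verify all of this against the FISS code and the Swagger `Entities` section.

```python
def table_summary(api, namespace, workspace):
    """Return {table name: row count} for a workspace using the low-level layer.

    `api` is expected to behave like the `firecloud.api` module (assumption):
    its functions return a response with `status_code`, `text`, and `json()`.
    """
    response = api.list_entity_types(namespace, workspace)
    if response.status_code != 200:
        raise RuntimeError(
            f"request failed ({response.status_code}): {response.text}"
        )
    return {name: info["count"] for name, info in response.json().items()}


# In a Terra notebook this might be called as (names are placeholders):
#   from firecloud import api as fapi
#   table_summary(fapi, "my-billing-project", "my-workspace")
```

Passing the API module in as an argument keeps the helper easy to test with a stand-in object, without contacting Terra.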

Preinstalled in Terra cloud environment

Manually install on other systems using `pip install firecloud`

Technical details

Refer to the FISS code and FireCloud/Terra API (Swagger) for more details
Data table operations listed in the Swagger `Entities` section

High-level FISS functions from the command-line

Calling the high-level FISS layer requires creating a “parameter object” (e.g., a Python named tuple), so it is often easier to “shell out” (using the `!` prefix) from the notebook to the FISS command line than to call the high-level functions directly.

Use command-line interface for high-level FISS functions

List available commands: `fissfc -l`
Display command information: `fissfc --help`
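If you prefer to stay in Python rather than use the notebook `!` prefix, one option is to call the command line through the standard `subprocess` module. The sketch below is only an illustration: `fissfc_argv` and `run_fissfc` are hypothetical helpers, and whether a given subcommand accepts a given option is something to confirm with `fissfc --help`.

```python
import subprocess


def fissfc_argv(subcommand=None, *options):
    """Build an argument list for fissfc; with no subcommand, list commands."""
    if subcommand is None:
        return ["fissfc", "-l"]
    return ["fissfc", subcommand] + [str(option) for option in options]


def run_fissfc(argv, runner=subprocess.run):
    """Run fissfc and return its standard output as a list of lines."""
    completed = runner(argv, capture_output=True, text=True, check=False)
    if completed.returncode != 0:
        raise RuntimeError(f"{' '.join(argv)} failed: {completed.stderr.strip()}")
    return completed.stdout.splitlines()


# Example (requires fissfc to be installed and on the PATH):
#   for line in run_fissfc(fissfc_argv()):
#       print(line)
```

Passing the arguments as a list avoids shell quoting problems when workspace names or file paths contain spaces.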

FISS config file available

Default values for commonly used parameters
Ex: billing project, workspace, etc.

Useful subcommands for working with data tables

Scalable/chunked upload of TSV file: entity_import

Copy entities from one workspace to another: entity_copy

Delete specific entity(s)/row(s) in a data table: entity_delete

List all the entities/rows by name and id in a workspace: entity_list

Return data table entities/rows in TSV format (limited scalability): entity_tsv

Return the names of the entity types/tables in a workspace: entity_types
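To give a feel for what "chunked" upload means, here is a small Python sketch that splits a load-file TSV into smaller pieces, each repeating the header row so it can be uploaded on its own. This is not the `entity_import` implementation, only an illustration of the idea. The `chunk_tsv` name, the chunk size, and the sample rows are invented for the example.

```python
def chunk_tsv(tsv_text, chunk_size):
    """Split a load-file TSV into smaller TSVs that each repeat the header row."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    lines = [line for line in tsv_text.splitlines() if line.strip()]
    if not lines:
        return []
    header, rows = lines[0], lines[1:]
    return [
        "\n".join([header] + rows[start:start + chunk_size]) + "\n"
        for start in range(0, len(rows), chunk_size)
    ]


# Hypothetical table with five rows, split into chunks of at most two rows.
tsv = "entity:sample_id\tbam\n" + "".join(
    f"s{i}\tgs://bucket/s{i}.bam\n" for i in range(5)
)
chunks = chunk_tsv(tsv, 2)
print(len(chunks))  # 3
```

Smaller requests are less likely to time out, and a failed chunk can be retried without resending the whole table.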

Use case: Managing data with the FISS API - beyond the UI

Managing a small collection of samples by manually creating a workspace table to track them and their metadata is fairly straightforward. It can be accomplished using a spreadsheet and the "Upload TSV" feature of the Terra UI. However, large projects can produce data on hundreds, thousands, even hundreds of thousands of samples. The time it takes to upload this data to a Google bucket or reference the files in your workspace data model is significant, and creating a workspace, uploading data, and tracking potentially hundreds of fields of extra metadata by hand quickly becomes infeasible.

Instead, when projects grow, the best approach is to script data management - both uploading data to your workspace bucket and creating a workspace data table to track the file locations along with all their extra metadata. This allows you to deal with workspaces that contain thousands of samples (or many more) while minimizing errors and saving time.
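As one possible starting point, the Python sketch below builds a data table from a list of file locations plus a dictionary of extra metadata. Everything specific here is an assumption made for illustration: the `build_sample_table` helper, the `cram_path` column name, the bucket paths, and the rule that the sample ID is the file name up to the first dot.

```python
import csv
import io


def build_sample_table(file_urls, metadata):
    """Create a load-file TSV with one row per file, plus any known metadata.

    `metadata` maps a sample id to a dict of extra columns for that sample.
    """
    columns = sorted({key for fields in metadata.values() for key in fields})
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["entity:sample_id", "cram_path"] + columns)
    for url in sorted(file_urls):
        # Assumed naming rule: sample id is the file name up to the first dot.
        sample_id = url.rsplit("/", 1)[-1].split(".")[0]
        fields = metadata.get(sample_id, {})
        writer.writerow([sample_id, url] + [fields.get(c, "") for c in columns])
    return buffer.getvalue()


# Hypothetical inputs: two files in a bucket and metadata for one of them.
urls = [
    "gs://my-bucket/data/sampleB.cram",
    "gs://my-bucket/data/sampleA.cram",
]
extra = {"sampleA": {"sex": "female"}}
table = build_sample_table(urls, extra)
print(table)
```

In practice the file list would come from listing your workspace bucket and the metadata from your own records, so the same script can be rerun as new samples arrive.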

Be careful when managing controlled-access data with FISS

It's important to remember that you are ultimately responsible for abiding by the data use agreements for data you are authorized to use - whether you are using FISS or Terra to manipulate that data. Never copy controlled-access data to a workspace where it can be accessed by someone not authorized to access it.

Comments

4 comments

Priyanka Srivastava
- Edited March 15, 2021 22:01
Allie Hajian, when can we expect the FISS API tutorial to be out? Keen to know how the authorization would work if we try to call the APIs from an external app. I couldn't find that mentioned anywhere.

Allie Hajian
- March 17, 2021 16:21
Hi Priyanka Srivastava! User Ed is working on the FISS API tutorials, though I am not sure they will answer your specific question. I have submitted a ticket to Frontline on your behalf and you should hear from them, soon!

Samantha (she/her)
- March 17, 2021 20:10
Hi Priyanka Srivastava,

To authorize your account when calling the FISS APIs from external apps, you will just need to run `gcloud auth login --update-adc`. Please let me know if you have any other questions.

Best,

Samantha

Priyanka Srivastava
- Edited March 25, 2021 20:11
Thanks Allie and Samantha,

I have a couple of further questions.
I want to be able to invoke the FireCloud APIs from an external app to import the TSV in order to populate the data tables, and also to invoke some other APIs such as getWorkspaces. How can I do that?
Want to confirm my understanding from this doc:
Is using FISS the only way to make a call to the FireCloud APIs, or could we invoke them directly through our Java app? If yes, how will the user be authorized?
Does FISS only allow invoking the APIs through the notebooks in the Terra workspace?
