How to manage data with the FISS API

Allie Hajian

FireCloud Service Selector (FISS) is a utility command module that allows API (Application Programming Interface) calls from the notebook to the workspace. Scripting with FISS is much like running on a local machine, but with Terra's built-in security and cloud integration. This article covers the basics of scripting on top of the Terra and Google Cloud environments using FISS to access Terra's APIs.

Why use scripting/APIs?

Why would you want to use Terra through an API when you can work directly in Terra? It comes down to flexibility, automation and scalability, and personal preference. You can use FISS commands from a Jupyter Notebook or from the command line.

Flexible data access
Interacting directly with back-end Application Programming Interfaces (APIs) gives greater flexibility when manipulating data and setting up data tables. It lets you scale up your analysis by automating how you configure and run workflows. 

Automation and scalability
Maybe you hope to streamline your analysis while avoiding human errors. Using the APIs that power Terra programmatically means you can automate much of the setup process, which lets you standardize and scale up your work.

Personal preference
Sometimes you may want to upload data and run workflows without using Terra's graphical interface, perhaps because you are more comfortable with scripting than with clicking buttons. FISS lets you work that way.

Tasks that are easier - or even only possible - using FISS

Accessing data from a workspace table in a notebook is only possible with FISS.

Accessing controlled data in an interactive analysis (i.e., notebook) is only possible with FISS.

Collecting data references from multiple tables into a single table is easier with FISS. Using the standard interface for this involves three manual steps: downloading the TSV file, editing it by hand, and uploading it back to the workspace.
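As a rough sketch of the third task, the snippet below merges rows from two tables into a single load-file TSV. The table contents, column names, and `gs://` paths are made up for illustration, and it assumes the rows have already been fetched into plain dictionaries keyed by entity name. The `entity:<type>_id` header in the first column is the load-file convention for Terra data tables.

```python
import csv
import io

def merge_tables(entity_type, *tables):
    """Merge several {entity_name: {column: value}} tables into one TSV string."""
    merged = {}
    columns = []
    for table in tables:
        for name, attrs in table.items():
            row = merged.setdefault(name, {})
            for col, value in attrs.items():
                row[col] = value
                if col not in columns:
                    columns.append(col)  # keep first-seen column order
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow([f"entity:{entity_type}_id"] + columns)
    for name in sorted(merged):
        # Rows missing a column get an empty cell.
        writer.writerow([name] + [merged[name].get(c, "") for c in columns])
    return out.getvalue()

# Hypothetical rows from two tables in the same workspace
bams = {"s1": {"bam": "gs://example-bucket/s1.bam"},
        "s2": {"bam": "gs://example-bucket/s2.bam"}}
vcfs = {"s1": {"vcf": "gs://example-bucket/s1.vcf.gz"}}

combined_tsv = merge_tables("sample", bams, vcfs)
print(combined_tsv)
```

The resulting string has one row per entity name and one column per attribute seen in any input table, so it replaces the download, hand-edit, and re-upload cycle with a single scripted step.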

What is FISS?

FISS stands for (FI)reCloud (S)ervice (S)elector. It is a Python-based API and command-line interface to the FireCloud/Terra APIs. It is also callable from R. 

Two library levels

  • Low-level API: corresponds closely to the FireCloud/Terra API (calling the low-level API layer is easy from Python or R using standard function-call syntax)
  • Higher-level: provides additional support for chunking/batching for scalability, etc.
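To illustrate the low-level layer, here is a minimal sketch that reads one data table into plain Python dictionaries. It assumes the `api` module of the `firecloud` package and its `get_entities(namespace, workspace, etype)` call, which returns an HTTP response whose JSON body is a list of entities. The helper name `fetch_table` and the `sample` table are hypothetical, and the environment variables in the usage comment are ones Terra notebooks typically set. The API module is passed in as an argument so the helper can be tried without network access.

```python
def fetch_table(api, namespace, workspace, entity_type):
    """Return {entity_name: attributes} for one data table via the low-level API."""
    response = api.get_entities(namespace, workspace, entity_type)
    if response.status_code != 200:
        raise RuntimeError(
            f"get_entities failed: {response.status_code} {response.text}"
        )
    # Each entity is a dict with "name", "entityType", and "attributes" keys.
    return {entity["name"]: entity["attributes"] for entity in response.json()}

# Usage in a Terra notebook (assumes FISS is installed and the variables are set):
#   import os
#   from firecloud import api as fapi
#   samples = fetch_table(fapi,
#                         os.environ["WORKSPACE_NAMESPACE"],
#                         os.environ["WORKSPACE_NAME"],
#                         "sample")  # hypothetical table name
```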

Preinstalled in Terra cloud environment

  • Manually install on other systems using `pip install firecloud`

Technical details

  • Refer to the FISS code and FireCloud/Terra API (Swagger) for more details
  • Data table operations listed in the Swagger `Entities` section

High-level FISS functions from the command-line

Calling the high-level FISS layer requires creating a “parameter object” (e.g., a Python named tuple), so it is often easier to “shell out” (!) from the notebook to the FISS command line than to call the high-level functions directly.

Use command-line interface for high-level FISS functions

  • List available commands: `fissfc -l`
  • Display command information: `fissfc --help`
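In a notebook cell, shelling out can be as short as `!fissfc -l`. The sketch below does the same from plain Python with `subprocess`, which also works outside a notebook. The helper names `fissfc_command` and `run_fissfc` are hypothetical, and the code assumes the `fissfc` executable is on the PATH (it returns `None` when it is not).

```python
import shutil
import subprocess

def fissfc_command(*args):
    """Build the argument list for a fissfc call, e.g. fissfc_command("-l")."""
    return ["fissfc", *args]

def run_fissfc(*args):
    """Run fissfc and return its standard output, or None if it is not installed."""
    if shutil.which("fissfc") is None:
        return None
    result = subprocess.run(
        fissfc_command(*args),
        capture_output=True,  # collect stdout/stderr instead of printing them
        text=True,            # decode bytes to str
    )
    return result.stdout
```

Calling `run_fissfc("-l")` should return the same command listing that `fissfc -l` prints, and passing arguments as a list avoids shell quoting problems with workspace names that contain spaces.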

FISS config file available

  • Default values for commonly used parameters
  • Ex: billing project, workspace, etc.

Useful subcommands for working with data tables

  • Scalable/chunked upload of a TSV file: `entity_import`
  • Copy entities from one workspace to another: `entity_copy`
  • Delete specific entities/rows in a data table: `entity_delete`
  • List all the entities/rows by name and ID in a workspace: `entity_list`
  • Return data table entities/rows in TSV format (limited scalability): `entity_tsv`
  • Return the names of the entity types/tables in a workspace: `entity_types`
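To make the idea of a chunked upload concrete, the sketch below splits a load-file TSV into batches and sends each one through the low-level `upload_entities(namespace, workspace, entity_data)` call of the `firecloud` `api` module. This is not how `entity_import` is implemented internally; it only illustrates the technique. The helper names and the default chunk size of 500 are assumptions, and the API module is passed in so the logic can be exercised offline.

```python
def chunk_tsv(tsv_text, chunk_size):
    """Yield TSV strings of at most chunk_size data rows, each repeating the header."""
    lines = tsv_text.strip("\n").split("\n")
    header, rows = lines[0], lines[1:]
    for start in range(0, len(rows), chunk_size):
        yield "\n".join([header] + rows[start:start + chunk_size]) + "\n"

def upload_in_chunks(api, namespace, workspace, tsv_text, chunk_size=500):
    """Upload a TSV chunk by chunk; raise on the first failed request."""
    uploaded = 0
    for chunk in chunk_tsv(tsv_text, chunk_size):
        response = api.upload_entities(namespace, workspace, chunk)
        if response.status_code >= 400:
            raise RuntimeError(
                f"upload failed after {uploaded} chunks: {response.text}"
            )
        uploaded += 1
    return uploaded  # number of chunks sent successfully
```

Repeating the header in every chunk keeps each request a valid load file on its own, so a failure partway through leaves the earlier rows in place and the upload can resume from the failed chunk.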

Use case: Managing data with the FISS API - beyond the UI

Managing a small collection of samples by manually creating a workspace table to track them and their metadata is fairly straightforward. It can be accomplished using a spreadsheet and the "Upload TSV" feature of the Terra UI. However, large projects can produce data on hundreds, thousands, even hundreds of thousands of samples. The time it takes to upload this data to a Google bucket or reference the files in your workspace data model is significant, and creating a workspace, uploading data, and tracking potentially hundreds of fields of extra metadata by hand is not feasible.

Instead, when projects grow, the best approach is to script data management - both uploading data to your workspace bucket and creating a workspace data table to track the file locations along with all their extra metadata. This allows you to deal with workspaces that contain thousands of samples (or many more) while minimizing errors and saving time.
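A scripted version of the table-building step might look like the sketch below, which turns a list of file locations plus per-sample metadata into a load-file TSV. The bucket name, sample IDs, and `tissue` column are invented for illustration, and the rule that a sample ID is the file name up to its first dot is an assumption that real projects would replace with their own naming scheme.

```python
import csv
import io
import posixpath

def build_sample_table(file_urls, metadata):
    """Build a load-file TSV with one row per file, keyed by the file's base name."""
    extra_cols = sorted({key for attrs in metadata.values() for key in attrs})
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["entity:sample_id", "file"] + extra_cols)
    for url in sorted(file_urls):
        # Assumed convention: sample ID is the file name before the first dot.
        sample_id = posixpath.basename(url).split(".")[0]
        attrs = metadata.get(sample_id, {})
        writer.writerow([sample_id, url] + [attrs.get(c, "") for c in extra_cols])
    return out.getvalue()

# Hypothetical inputs
urls = ["gs://example-bucket/data/sampleB.bam",
        "gs://example-bucket/data/sampleA.bam"]
meta = {"sampleA": {"tissue": "blood"}, "sampleB": {"tissue": "saliva"}}

table_tsv = build_sample_table(urls, meta)
print(table_tsv)
```

Because the table is generated from the file listing rather than typed by hand, rerunning the script after new files arrive keeps the data table and the bucket contents in step.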

Be careful when managing controlled-access data with FISS

It's important to remember that you are ultimately responsible for abiding by the data use agreements for data you are authorized to use - whether you are using FISS or Terra to manipulate that data. Never copy controlled-access data to a workspace where it can be accessed by someone not authorized to access it.


Comments

4 comments

  • Priyanka Srivastava

    Allie Hajian, when can we expect the FISS API tutorial to be out? I'm keen to know how authorization would work if we call the APIs from an external app. I couldn't find that mentioned anywhere.

  • Allie Hajian

    Hi Priyanka Srivastava! User Ed is working on the FISS API tutorials, though I am not sure they will answer your specific question. I have submitted a ticket to Frontline on your behalf and you should hear from them soon!

  • Samantha (she/her)

    Hi Priyanka Srivastava,

    To authorize your account when calling the FISS APIs from external apps, you will just need to run `gcloud auth login --update-adc`. Please let me know if you have any other questions.

    Best,

    Samantha

  • Priyanka Srivastava

    Thanks Allie and Samantha, 

    I have a couple of further questions.
    I want to invoke the FireCloud APIs from an external app to import a TSV in order to populate the data tables, and also to invoke some other APIs such as getWorkspaces. How can I do that?
    I also want to confirm my understanding from this doc:
    Is using FISS the only way to call the FireCloud APIs, or could we invoke them directly from our Java app? If so, how will the user be authorized?
    Does FISS only allow invoking the APIs from notebooks in the Terra workspace?
