Workflow setup: VM and other options

Allie Hajian

This article describes the runtime options available when you run a workflow, including the cost-saving options, and explains how to set them. If this is your first time running a workflow, the default runtime options are usually adequate.

Workflow options overview

In Terra, you can choose which version of the workflow to run and whether to use cost-saving features such as call caching, deleting intermediate outputs, reference disks, and retrying with more memory. You configure all of these runtime options in the workflow submission form.

The configuration form displays default values provided by the workflow author. 

Running your first workflow? Use the defaults! If you are just getting familiar with running a workflow, you can always use the default runtime options. These are set up to be the easiest to use and to save money for most users.

You can skim below for a description of each option. 

1. Workflow information (version, source and synopsis)

You will see all available versions in this dropdown. You can choose to use the most up-to-date version of the workflow, or a previous version (if you need to maintain consistency, for example). Terra will automatically run the version you choose.

The form also lists the workflow tools repository (source) and a synopsis (if available). 

2. Money-saving options

There are several features in Terra designed to help save money when running a workflow. Scroll down for a brief description of these as well as some tips around using them and links to more details. 

2.1. The Use call caching and Delete intermediate outputs options

Call caching allows Terra's execution engine (aka Cromwell) to detect when a job has been run in the past so that it doesn't have to re-compute results. The call caching feature in Terra can save you time and money when you are repeating all or parts of a workflow analysis. 
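The idea behind call caching can be illustrated with a minimal Python sketch. This is a conceptual illustration only, not how Cromwell implements the feature: the function names, the in-memory cache, and the choice of what goes into the fingerprint are all assumptions made for this example.

```python
import hashlib
import json

# Hypothetical in-memory cache: call fingerprint -> outputs of a past run.
_call_cache = {}


def call_fingerprint(task_name, command, inputs, runtime):
    """Hash the things assumed (for this sketch) to determine a call's result."""
    payload = json.dumps(
        {"task": task_name, "command": command, "inputs": inputs, "runtime": runtime},
        sort_keys=True,  # stable ordering so equal calls give equal hashes
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def run_with_call_cache(task_name, command, inputs, runtime, execute):
    """Return (outputs, cache_hit). `execute` is only called on a cache miss."""
    key = call_fingerprint(task_name, command, inputs, runtime)
    if key in _call_cache:
        # An identical call ran before: reuse its outputs, skip the compute.
        return _call_cache[key], True
    outputs = execute(inputs)
    _call_cache[key] = outputs
    return outputs, False
```

In this sketch, repeating a call with the same task, command, inputs, and runtime settings returns the stored outputs without running `execute` again, while changing any one of them causes a fresh run.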

Deleting intermediate outputs allows you to save storage costs by automatically deleting outputs from intermediate steps upon successful completion of the workflow. This feature is most valuable when you only need the final results and are unlikely to revisit the intermediate outputs.

Note that complex workflows can generate many large intermediate files, which can dramatically increase the storage costs of a project. For example, one large-scale project recently discovered that as much as 85% of its storage cost went to intermediate files that no one ever accessed or used.

These two options save money in different ways, and cannot be combined. To learn more about call caching and when to use it, see this article.

To learn how to save storage costs by deleting intermediate outputs, see this article.

2.2. Select Use reference disks if your workflow uses HG19/HG38 reference files

To save time localizing large reference inputs, Terra can automatically attach a disk containing HG19/HG38 references to your Google virtual machine (VM). If the checkbox labeled ‘Use Reference Disks’ is selected, the execution engine examines the job inputs to see whether any of them correspond to reference inputs available on a reference disk image.
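The input check described above can be pictured with a short Python sketch. It is an illustration under stated assumptions, not the execution engine's actual logic: the manifest contents, the bucket and mount paths, and the function name are all hypothetical, and the real engine may match files differently (for example, by more than the path alone).

```python
# Hypothetical manifest: cloud path of a reference file -> its path on the
# attached reference disk. Real manifests are listed in the Terra documentation.
REFERENCE_DISK_MANIFEST = {
    "gs://example-references/hg38/genome.fasta": "/mnt/references/hg38/genome.fasta",
    "gs://example-references/hg38/genome.fasta.fai": "/mnt/references/hg38/genome.fasta.fai",
}


def plan_localization(job_inputs, manifest=REFERENCE_DISK_MANIFEST):
    """Split job inputs into those found on the reference disk and the rest.

    Returns (from_disk, to_download):
      from_disk   maps input name -> path on the reference disk
      to_download maps input name -> cloud path that still must be copied
    """
    from_disk = {}
    to_download = {}
    for name, cloud_path in job_inputs.items():
        if cloud_path in manifest:
            from_disk[name] = manifest[cloud_path]
        else:
            to_download[name] = cloud_path
    return from_disk, to_download
```

In this sketch, only inputs that are missing from the manifest need to be copied to the VM, which is where the time saving comes from.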

For more details, including the full reference disk manifests, see Reference Disks in Terra.

2.3. Optional Retry with more memory feature

When this option is selected and a task with a maxRetries value greater than zero fails because it ran out of memory, Terra automatically retries the task with more memory.
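The retry behavior can be sketched in Python as follows. This is a simplified model, not Terra's implementation: the `TaskOutOfMemory` exception, the function name, and the multiplier of 2.0 are assumptions chosen for illustration, and the sketch does not model retries for other kinds of failure.

```python
class TaskOutOfMemory(Exception):
    """Hypothetical signal that a task attempt ran out of memory."""


def run_with_memory_retry(run_task, memory_gb, max_retries, memory_factor=2.0):
    """Run `run_task(memory_gb)`, retrying with more memory after an OOM failure.

    `memory_factor` is an illustrative multiplier, not a documented default.
    Returns (result, memory_used_per_attempt).
    """
    attempts = []
    for attempt in range(max_retries + 1):
        attempts.append(memory_gb)
        try:
            return run_task(memory_gb), attempts
        except TaskOutOfMemory:
            if attempt == max_retries:
                raise  # no retries left (always the case when max_retries is 0)
            memory_gb = memory_gb * memory_factor  # ask for more memory next time
```

With `max_retries` set to zero the first out-of-memory failure is final, which mirrors the requirement that maxRetries be greater than zero for this option to have any effect.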

For more details, see the Out of Memory Retry documentation. 

Video and tutorial workflow resources 

To learn more about using data tables to organize your data and scale your analysis, see Managing data with workspace tables.

To understand how to adjust data tables, see Making, modifying, and deleting tables.

For hands-on practice with data tables, try the Data Tables QuickStart.

To learn more about how to update workflows to the latest version, see this article.

To watch a video tutorial on configuring a workflow, see the video walkthrough of the Workflows Quickstart - Part II.

Hands-on practice setting up and running a workflow analysis (Note: To run the exercises you will need to clone the workspace to your own billing project) 
To practice setting up and running workflows, work through the Terra-
 workspace. It should take about half an hour to complete the
hands-on tutorial and cost less than a dime (GCP costs).
