Workflows cost estimating notebook

Allie Cliffe
  • Updated

If you need granular (task-level) cost reporting, you can estimate the cost of running your workflow with a Workflow Cost Estimator notebook. Follow the instructions below to find the workflow cost estimating python code to run in a notebook.  

See Monitoring GCP resources used in a workflow for another way to track resources used in each task of a workflow. 

Cost-estimate notebook overview

You can find the Workflow Cost Estimator.ipynb notebook in the NIH Cloud Platforms - Notebook collection workspace

Notebooks allow cost reporting for workflows in progress

You can run the cost-estimating notebook at any point in the workflow execution and it will list how much the workflow run has cost thus far. You can use as an early indicator of whether you may exceed your budget.

Notebooks give more granular cost details

Unlike Terra's cost report, which only provides the total job cost, the notebook estimates task-level and instance-level costs (i.e., the cost of running a particular task within a workflow). This is useful when you want to know what task(s) are responsible for most of the cost.

The notebook gives an estimate; built-in reporting is a final cost. However (see below), this estimate is usually fairly accurate. 

What to expect (notebook output)

The notebook uses FireCloud Service Selector (FISS) to request information on all the submitted jobs associated with the workspace.

The notebook will list all the submission IDs on the screen, and you'll choose which submission to process for cost estimates. The notebook will then use FISS to obtain metadata information about the particular submission - such as how many VMs were used, the number of CPUs used, the duration for each VM, etc. A cost formula uses this information to calculate the cost of running the workflow. The cost formula is based on the GCP's price estimate per resource.

How accurate are notebook-generated workflow cost estimates?

These costs do not come directly from Google billing. Instead, the notebook calculates an estimate based on metadata from Terra. The estimates are (usually) very close to the real cost, though they could be slightly off. Below are descriptions of what's accounted for in the calculations, the difference between the notebook results, and what Terra's built-in results showed in a benchmark.

Included GCP costs

Note that estimates will be lower than actual costs, because the notebook cost formula does not account for all possible GCP resources. The table below lists each parameter available in a WDL runtime block (i.e., what type of GCP resource is used for a task) and whether it's included in the cost formula. Note, in particular, that egress costs are not included in the estimate (if your data is stored in a different GCP region than the notebook VM region, for example).

WDL Runtime Parameters
Accounted for in Formula?
CPU/GPU*
No
Memory
Yes
Preemptibles
Yes
Disk
Yes
Data Egress
No
noAddress (rarely used)
No
cpuPlatform (rarely used)
No
zones (rarely used)
No


* Currently, the cost formula assumes all instances are of type N1 listed here, which uses the least expensive type of CPU instance - even if GPUs are being used in the workflow execution. 

Benchmark: Terra cost report versus notebook cost estimates

The two spend report options were benchmarked by running the Cram-to-Bam workflow N times on different CRAM samples. The notebook cost-estimates and built-in cost reporting showed an average difference of $0.02 per sample/run.

Note that when running larger sample sets, there will be a larger differences as minor differences accumulate.

CRAM_to_BAM - Sample Number Terra Cost Report Notebook Cost Estimator
1 $0.22 $0.23
50 $14.87 $13.95
100 $30.39 $26.59
200 $55.62 $53.41

Screenshot comparing Terra built-in cost report in green and cost estinmating notebook values in organge. The two costs are equal when processing a single sample with the cram-to-bam workflow. For 200 samples, the built-in cost reporting is about $58 and the cost estimating notebook is about (estimating notebook values as about $56

 

Was this article helpful?

0 out of 0 found this helpful

Comments

0 comments

Please sign in to leave a comment.