How to retrieve detailed workflow cost information on GCP

Allie Hajian
  • Updated

Below is an example of how to access the time and cost from Google's BigQuery database after a submitted workflow on Terra has completed. Querying your billing data export directly is a way to surface more granular cost details. 

Preliminary Step: Enable BigQuery Exports

Before you begin, BigQuery's exporting feature must be enabled for your Google Project. To do this, follow the steps in Google Cloud's documentation.

Once exports have been enabled, you will obtain a BigQuery_Billing_Export_ID - you will use this in Step 5 below.

Step 1: Run a workflow

Wait 24 hours before checking your workflow's costWorkflow data on BigQuery is not available until some hours after the workflow has completed. It is therefore best to allow a day to pass before querying the database.

Step 2: Retrieve the workflow's time and cost

  1. Once the status of the workflow is done and you have given BigQuery a few hours to export the workflow information into its database (~24 hours), open the Google Cloud console for your project.

  2. Click on the Monitor tab.
  3. Click on "View" next to your workflow.

  4. You'll be taken to a new page with some light info about the submission. Click on View again for more in depth information about the run.

  5. Now you can view the details on the Submit, Start, and End time of the workflow. To determine the actual run time of the workflow use the Start and End time to calculate the duration. The workflow ID is also present on this page, this ID can be used in BigQuery to identify the run's resource usage to calculate the cost. Copy the workflow ID and and proceed to the BigQuery website in the next step.

  6. Visit the Bigquery website : https://bigquery.cloud.google.com/ Click on "Compose Query":

  7. You will be provided with a text box where you need to enter query using the workflow ID along with particular details of the submitted job you would like to search.
     
    Below is a template of a SQL query that can be used in the text box, be sure to substitute in the BigQuery_Billing_Export_ID ,Billing_Project_ID, and Workflow_ID.

BigQuery_Billing_Export_ID: Available in BigQuery on the left hand side, once enabling Billing Exports. Composed of three parts, separated by a colon and period. Note: The newer version may include “v1”. Example: my-gcp-billing:gcp_billing_export.gcp_billing_export_123456_12345_12345

Billing_Project_ID: Available in Workspace title before the the forward slash Example: Billing-Project-ID/Workspace-Title Workflow_ID: Available in Workspace Monitoring for a submitted workflow Example: b112b781-9bf4-4a8b-9aac-905b78316987


Query Template:

SELECT project.id, 
       GROUP_CONCAT(labels.key) WITHIN RECORD AS labels_key,  
       GROUP_CONCAT(labels.value) WITHIN RECORD labels_value,  
       cost
FROM [<BIGQUERY_BILLING_EXPORT_ID>]  
WHERE project.id in ('<BILLING_PROJECT_ID>')  
        AND  
      labels.key IN ("cromwell-workflow-id",  
                     "cromwell-workflow-name",  
                     "cromwell-sub-workflow-name",  
                     "wdl-task-name",  
                     "wdl-call-alias") 
        AND  
      cost > 0 
HAVING  
   labels_value LIKE "%<WORKFLOW_ID>%" 

 

  1. Once you've entered your query in the text box click "Run Query".

Step 3: Calculate the workflow's cost

  1. You'll be provided with the results in the form of a tsv. It's possible to download this to your computer or you can also have Google copy the tsv to you Google drive as a Google sheet by clicking "Save to Google Sheet".
  2. After clicking the icon to save as a sheets file click "Click to View" to open the sheets doc.

  3. Sum the cost column to calculate the total cost of the workflow.

Was this article helpful?

0 out of 0 found this helpful

Comments

0 comments

Please sign in to leave a comment.