Below is an example of how the time and cost is obtained after a submitted workflow on Terra has completed.
Note that workflow data on BigQuery is not available until some hours after the workflow has complete. Its best to allow a day to pass before querying the database.
Preliminary step: Before you begin. BigQuery exporting feature must be enabled. The steps are simple available here. Once created you will obtain a BigQuery_Billing_Export_ID, which will be used in step 5.
Once the status of the workflow is done and you have given BigQuery a few hours to export the workflow information into its database (~24 hours) click on "View".
You'll be taken to a new page with some light info about the submission. Click on View again for more in depth information about the run.
Now you can view the details on the Submit, Start, and End time of the workflow. To determine the actual run time of the workflow use the Start and End time to calculate the duration. The workflow ID is also present on this page, this ID can be used in BigQuery to identify the run's resource usage to calculate the cost. Copy the workflow ID and and proceed to the BigQuery website in the next step.
Visit the Bigquery website : https://bigquery.cloud.google.com/ Click on "Compose Query":
- You will be provided with a text box where you need to enter query using the workflow ID along with particular details of the submitted job you would like to search.
Below is a template of a SQL query that can be used in the text box, be sure to substitute in the
BigQuery_Billing_Export_ID: Available in BigQuery on the left hand side, once enabling Billing Exports. Composed of three parts, separated by a colon and period. Note: The newer version may include â€œv1â€. Example:
Billing_Project_ID: Available in Workspace title before the the forward slash Example:
Billing-Project-ID/Workspace-Title Workflow_ID: Available in Workspace Monitoring for a submitted workflow Example:
SELECT project.id, GROUP_CONCAT(labels.key) WITHIN RECORD AS labels_key, GROUP_CONCAT(labels.value) WITHIN RECORD labels_value, cost FROM [<BIGQUERY_BILLING_EXPORT_ID>] WHERE project.id in ('<BILLING_PROJECT_ID>') AND labels.key IN ("cromwell-workflow-id", "cromwell-workflow-name", "cromwell-sub-workflow-name", "wdl-task-name", "wdl-call-alias") AND cost > 0 HAVING labels_value LIKE "%<WORKFLOW_ID>%"
Once you've entered your query in the text box click "Run Query".
- You'll be provided with the results in the form of a tsv. It's possible to download this to your computer or you can also have Google copy the tsv to you Google drive as a Google sheet by clicking "Save to Google Sheet".
After clicking the icon to save as a sheets file click "Click to View" to open the sheets doc.
- Now simply calculate the sum of the cost column to retrieve the total cost of the workflow.
Please sign in to leave a comment.