Welcome to your workspace operations dashboard, where you can check the status of past and current pipeline runs, drill down into various views of what's going on, and find direct links to all input and output files involved.
Your analyses are organized and reported on following this hierarchical structure:
- Submission: a collection of workflows that you have submitted in one batch.
- Workflow: in this context, the term workflow refers to a particular run of a method on a specific dataset.
- Task: the lowest level of analysis reporting, representing the individual calls/jobs made during workflow execution.
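The three-level hierarchy above can be sketched as a simple data model. This is an illustrative sketch only; the class and field names are assumptions, not Terra's actual internal representation.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical model of the Submission > Workflow > Task hierarchy.
@dataclass
class Task:
    name: str
    status: str = "Queued"   # individual call/job within a workflow

@dataclass
class Workflow:
    method: str              # the method being run
    entity: str              # the specific dataset/entity it runs on
    tasks: List[Task] = field(default_factory=list)

@dataclass
class Submission:
    workflows: List[Workflow] = field(default_factory=list)

# One submission can batch several workflows; each workflow has its own tasks.
sub = Submission(workflows=[
    Workflow(method="my_method", entity="participant_1",
             tasks=[Task("align"), Task("call_variants")]),
])
print(len(sub.workflows))  # 1
```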
By default, the Job History page lists all submissions that have been made so far in a workspace, along with their status. It is possible to filter the list, but not to delete any submissions; likewise, it is not possible to delete workflows within a submission. For guidance on deleting files that belong to past submissions, please see the forum.
After you’ve launched a new submission, the Job History page is loaded for you automatically so you can see the list of one or more workflows contained in the submission, along with their current status and links to further information. You can always go up one level to view all submissions by clicking on the Job History tab. For now, let’s assume we just launched an analysis and want to drill down into the status of a particular workflow.
The workflows can be in the following states: Queued, Launching, Submitted, Running, Aborting, Succeeded, Failed, or Aborted. Queued, Launching, and Submitted are the first three sequential statuses; while a workflow is in one of these states, it is being handed off from Terra to Cromwell. More on this later…
Submitted is followed by Running, which means that the commands specified in the method’s WDL script are being executed on virtual machines.
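The status progression described above can be sketched as a tiny state machine. The function name and structure here are illustrative assumptions; only the status names and their happy-path order come from the documentation.

```python
# Happy-path order of workflow statuses, as listed above.
HAPPY_PATH = ["Queued", "Launching", "Submitted", "Running", "Succeeded"]

# Terminal statuses: once reached, the workflow does not change again.
# (Aborting is an in-flight, off-path state and is not handled here.)
TERMINAL = {"Succeeded", "Failed", "Aborted"}

def next_status(current: str) -> str:
    """Return the next happy-path status, or the same status if terminal."""
    if current in TERMINAL:
        return current
    return HAPPY_PATH[HAPPY_PATH.index(current) + 1]

print(next_status("Submitted"))  # Running
```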
The Happy Path diagram below shows how each task status affects the workflow and analysis submission status in the best-case scenario, where all of your tasks and workflows succeed.
Note: the Aborted and Aborting statuses will display for analysis submissions, workflows, and tasks if you have requested that a workflow be aborted. These states are not pictured here.
When all the tasks reach Done successfully, the workflow status updates to Succeeded, and the Job History page shows the submission as Done. If you configured outputs to write to the data table (this.X), you can look for them in the data table via the Data tab. Remember that “this” refers to the data entity you are running your method on. If you chose participant and named an output this.participant_file, a new column called participant_file will display in the participant table with a link to the output, while the actual output file is saved to the workspace’s bucket.
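The naming convention just described can be sketched as a one-line mapping from an output expression to a data-table column. This is an illustrative sketch of the behavior described above; the helper name is hypothetical.

```python
# Hypothetical helper: map a "this.<column>" output expression to the
# data-table column it populates, per the convention described above.
def output_column(expression: str) -> str:
    """'this.participant_file' -> 'participant_file'."""
    prefix = "this."
    if not expression.startswith(prefix):
        raise ValueError("output expressions are written as this.<column>")
    return expression[len(prefix):]

print(output_column("this.participant_file"))  # participant_file
```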
You can also view outputs from the Job Manager page by:
- Clicking on the succeeded submission (far left) in the Job History page
- Clicking "View" (far left) to drill down for more details
- Clicking on the icon in the "Outputs" column
Behind the scenes
Under the hood, quite a lot is happening when you launch an analysis: various system components kick into gear to ensure that your submission of one or more workflows gets properly assembled and, when that’s done, that the individual tasks get dispatched to Google Compute Engine for execution. Meanwhile, on the surface, Terra automatically takes you to the Job History page, where you can view the status of your workflow(s) and monitor how the work is progressing. If systems could talk, it would look something like this:
- Terra takes the workflow specified in the WDL and asks Cromwell to run it.
- Cromwell asks the Google Pipelines API (PAPI) to launch each task in the workflow when the inputs become available. Cromwell is responsible for managing the sequence of the tasks/jobs.
- PAPI starts a virtual machine (VM) for each task and provides the inputs; the WDL specifies what the task should do and the environment to run it in (the Docker image), and PAPI collects the outputs when the task is done. Each VM’s requirements (RAM, disk space, number of CPUs) can be specified in the task. Once the task is done, PAPI shuts down the VM.
- The Docker image required for each task is pulled to the virtual machine along with any inputs from Google buckets. When the output is produced, it is put in the Google bucket of the workspace where the analysis was launched, and links to the outputs are written back to that workspace’s data table.
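The dispatch sequence above can be sketched as a toy loop. All names here are illustrative assumptions; this is not the real Terra, Cromwell, or PAPI API, just the flow of responsibilities described in the list.

```python
# Toy sketch of the dispatch sequence described above:
# Cromwell orders the tasks; "PAPI" runs each on a fresh VM;
# outputs land in the workspace bucket and links go to the data table.
def run_workflow(tasks, bucket):
    """Run tasks in sequence and return a name -> output-link mapping."""
    data_table_links = {}
    for task in tasks:                        # Cromwell manages the sequence
        vm = {"docker": task["docker"],       # PAPI starts one VM per task,
              "cpus": task.get("cpus", 1)}    # sized per the task's requirements
        # Inputs are localized from the bucket, the command runs inside the
        # Docker image, and outputs are delocalized back to the bucket.
        output_path = f"gs://{bucket}/{task['name']}/output"
        data_table_links[task["name"]] = output_path  # link for the data table
        del vm                                # PAPI shuts the VM down when done
    return data_table_links

outs = run_workflow([{"name": "align", "docker": "ubuntu:20.04"}],
                    bucket="my-workspace-bucket")
print(outs["align"])  # gs://my-workspace-bucket/align/output
```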