The Job History tab is your workspace operations dashboard, where you can check the status of past and current workflow submissions, drill down to see what’s going on (i.e., troubleshoot), and find direct links to all input and output files. This article walks through functions you need to know.
Job History reporting structure
Workflow information is organized in a hierarchy from submissions to tasks. As you click into each level, you get increasingly granular details.
Submission: a collection of workflows submitted in one batch
Workflow: a particular run of a workflow/method on a specific dataset
Task: the lowest level of analysis reporting, representing individual calls/jobs made during workflow execution
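As a rough mental model, the three levels nest like the sketch below. The class and field names (`Submission`, `Workflow`, `Task`, `entity`, and so on) are hypothetical and chosen only for illustration; they are not Terra's or Cromwell's actual data model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Task:
    """Lowest level: one call/job made during workflow execution."""
    name: str
    status: str


@dataclass
class Workflow:
    """One run of a workflow/method on a specific dataset (entity)."""
    entity: str
    tasks: List[Task] = field(default_factory=list)


@dataclass
class Submission:
    """A collection of workflows submitted in one batch."""
    submission_id: str
    workflows: List[Workflow] = field(default_factory=list)


def count_tasks(submission: Submission) -> int:
    """Total number of tasks across every workflow in the submission."""
    return sum(len(wf.tasks) for wf in submission.workflows)


# Hypothetical example: one submission that ran on two samples.
example = Submission(
    submission_id="example-submission",
    workflows=[
        Workflow("sample_1", [Task("align", "Done"), Task("call_variants", "Done")]),
        Workflow("sample_2", [Task("align", "Running")]),
    ],
)
```

Clicking from a submission into a workflow and then into a task corresponds to walking down this nesting one level at a time.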
Workspace submissions (Job History top level)
Here is a list of all submissions, along with their current status (e.g., queued, submitted, running, completed, or failed), and links to further information. Each submission is one row, no matter how many entities that submission ran on.
A note about deleting information in Job History
It's possible to filter the list, but you can't delete any submissions. Similarly, it's not possible to delete workflows within a submission.
For guidance about deleting files that belong to past submissions, please see the forum.
Workflow submissions can be in the following states, which will be listed (and updated in real time) in the top-level Job History page.
Queued, Launching or Submitted
In these states, the workflows are being handed off from Terra to Cromwell (see Overview: How the workflow system works).
In the Running state, the commands specified in the WDL script are being executed on virtual machines.
Aborted and aborting
These states display for analysis submissions, workflows, and tasks if you requested that a workflow be aborted.
When all the tasks reach Done successfully, the workflow is updated to Succeeded, and the Job History page shows the submission as Done.
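The roll-up described above can be sketched as follows. This is a minimal illustration that covers only the success path stated here (tasks "Done", workflow "Succeeded", submission "Done"). The function names are hypothetical, and every other case returns `None` because this sketch makes no claim about how Terra reports it.

```python
from typing import List, Optional


def workflow_status(task_statuses: List[str]) -> Optional[str]:
    """Return "Succeeded" when every task is "Done"; otherwise None.

    None means "not covered by this sketch" (still running, failed, aborted...).
    """
    if task_statuses and all(status == "Done" for status in task_statuses):
        return "Succeeded"
    return None


def submission_status(workflows: List[List[str]]) -> Optional[str]:
    """Return "Done" when every workflow in the submission has Succeeded.

    `workflows` holds one list of task statuses per workflow.
    """
    statuses = [workflow_status(tasks) for tasks in workflows]
    if statuses and all(status == "Succeeded" for status in statuses):
        return "Done"
    return None
```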
Workflow-level details (Job History submission page)
Clicking on a particular submission opens the next level in the Job History. At the top of the page is information about the submission as a whole.
- Workflow statuses (all workflows in the submission)
- Workflow configuration
- What input data was processed
- Who submitted the workflow
- Total run cost (see how to enable this feature in How much does my workflow cost?)
- Workflow options (learn more about available options in Workflow setup: VM and other options)
The submission page lists each workflow within the submission and its status (failed, queued, etc.). If you ran on two samples, for example, there would be two rows, one for each sample.
Task-level details (Job Manager or Workflow Dashboard)
From the submission page you can access task-level details by selecting one of the three icons at the right of the particular workflow you're interested in.
If you don't see these icons
If your job failed because it never started (e.g., if Terra could not find your input files to localize), you won't see these options.
Clicking the icon at left opens the Job Manager, your go-to location for a more thorough breakdown of your workflow. Here you can find information about each individual task in the workflow, including
- Failure messages
- Log files, links to Google Cloud executions directories, and compute details
Note: The Job Manager will open in a new tab and is outside of your workspace.
If Job Manager won’t load
Job Manager may fail to load if your job produced huge amounts of metadata. In these cases, skip to the Workflow Dashboard (described below).
Backend task log
If it's not immediately obvious what failed, the best sources of information are often log files. These files are generated by Cromwell when executing any task and are placed in the task's folder along with its output. In Terra, we add quick links to these files to make troubleshooting easier.
The backend task log gives a step-by-step report of actions during the execution of the task. These details include information about Docker setup, localization (the step of copying files from your Google bucket into the Docker container), stdout from tools run within the command block of the task, and finally, the delocalization and Docker shutdown steps.
You can also see this in Google Cloud console by clicking the link at the bottom.
If your log stopped abruptly
Some log files seem to stop abruptly, not yet having reached the delocalization stage. This is almost certainly because the task has run out of memory. We recommend retrying with more memory to see if your job gets further. See Out Of Memory Retry to learn more about how to configure your workflow to immediately retry certain tasks if the only error was running out of memory.
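If you have downloaded a backend task log, a quick check like the one below can tell you whether it ever reached the delocalization stage. This is only a sketch: the marker string `"delocaliz"` and the sample log lines are assumptions made for illustration, and the exact wording in real logs may differ, so adjust the marker to match what you see in your own files.

```python
def reached_delocalization(log_text: str, marker: str = "delocaliz") -> bool:
    """Return True if the log mentions the delocalization stage.

    The default marker is an assumed substring, matched case-insensitively
    (it would match both "Delocalizing" and "delocalization").
    """
    return marker.lower() in log_text.lower()


def stopped_before_delocalization(log_text: str) -> bool:
    """True when the log ends without any sign of delocalization.

    Per the guidance above, this pattern usually points to the task
    running out of memory.
    """
    return not reached_delocalization(log_text)


# Made-up log excerpts, for illustration only.
complete_log = "Localizing input files\nRunning user command\nDelocalizing outputs\n"
truncated_log = "Localizing input files\nRunning user command\n"
```

If `stopped_before_delocalization` returns `True` for a failed task, retrying with more memory is a reasonable first step.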
Clicking on this icon will redirect you to the exact folder/directory in your workspace Google bucket where you can find your stderr, stdout, and backend logs. From there, you can open those files to view their contents or download them. If your task generates outputs, this is where you will find them as well.
A log file tracking the events that occurred in performing the task, such as downloading Docker, localizing files, etc. This is the same log mentioned in the previous section. Occasionally a workflow will fail without stderr and stdout files, leaving you with only a task log.
2. stderr and stdout
Standard Error (stderr)
A file containing error messages produced by the commands executed in the task. A good place to start for a failed task, as many common task-level errors are indicated in the stderr file.
Standard Output (stdout)
A file containing log outputs generated by commands in the task. Not all commands generate log outputs and so this file may be empty.
This section displays information on the workflow at the Google Pipelines worker level, including timestamps for the execution of worker tasks and virtual machine (VM) configuration information. You can use this section to understand or validate the configuration of your worker VM (memory, disk size, machine type, etc.). You can also check this section if you suspect your workflow failed due to a transient Google issue.
This information is available for 42 days from when the pipeline (VM) started; after that, it is no longer accessible. This is a Google lifecycle policy and there's no workaround to retrieve the data after 42 days.
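To work out when the details for a given run will stop being available, you can add 42 days to the VM start time. The sketch below does just that; the function names are hypothetical, and the only fact taken from this article is the 42-day window.

```python
from datetime import datetime, timedelta, timezone

# Retention window stated above: 42 days from when the pipeline (VM) started.
RETENTION = timedelta(days=42)


def details_expire_at(vm_start: datetime) -> datetime:
    """Moment at which the compute details stop being accessible."""
    return vm_start + RETENTION


def details_available(vm_start: datetime, now: datetime) -> bool:
    """True while `now` is still inside the 42-day window."""
    return now < details_expire_at(vm_start)


# Example: a VM that started at midnight UTC on 1 January 2024.
start = datetime(2024, 1, 1, tzinfo=timezone.utc)
expiry = details_expire_at(start)  # 12 February 2024, midnight UTC
```

If you expect to need the VM configuration later, copy it somewhere else before the expiry date.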
The workflow dashboard includes some of the details in the Job Manager, but as part of your workspace.
Execution directory (icon at right)
The Execution directory, which is on Google Cloud console, includes a wealth of details on the API side of things.
For more information about what goes on under the hood, see What happens when you launch a workflow?