Workflow delays and upcoming improvements
If you've noticed slower queue times or workflow delays recently, we have some context and a few improvements in the works.
What's going on
During periods of high platform activity, workflows can sit in the queue longer than usual, and status updates for completed jobs may lag. This is expected behavior under elevated load on a shared cluster; things are processing normally, just potentially slower than you're used to.
What we're doing about it
Improved queue visibility
The Job Submission UI will be updated to show whether your workflow is waiting behind others, so you'll always know where your submission stands.
Limits on extremely large workflows
To protect performance for everyone on shared infrastructure, we'll be introducing limits on very large workflow requests that can cause broader slowdowns. Our first limit will automatically abort workflows that exceed a metadata-writing threshold and return a descriptive error message that includes your workflow ID and the relevant metadata size. We expect this to affect only a very small number of users; if Terra Support has not reached out to you about large workflow submissions in the past, we do not expect you to be impacted. If you are impacted and have questions, please write to us at support@terra.bio.
We'll share updates as these changes roll out, and appreciate your patience.
Comments
We have released both improvements: improved queue visibility and the limit on extremely large workflows!
Improved queue visibility comes with a collection of more informative statuses:
Optimizing metadata writes in your workflow
If you're interested in optimizing your workflow's metadata output, there are two main areas to consider:
1. Total number of tasks run by the root workflow
This applies regardless of subworkflow structure. Depending on your setup, you might consider:
2. Number of metadata records created per task
Task inputs and outputs are stored in metadata and are a common source of high metadata volume. Complex input/output types (arrays, structs, maps) are worth revisiting. For example, writing an array of 100,000 file paths as a task output generates 100,000 metadata rows. A more efficient approach would be to use a FOFN (file of file names) to pass lists of files between tasks, rather than passing them directly in WDL inputs/outputs.
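As a rough illustration of the FOFN idea, here is a minimal Python sketch that a task's command could use to write a list of paths to a single file, and that a downstream task could use to read the list back. The function names `write_fofn` and `read_fofn` are hypothetical helpers, not part of Terra or WDL, and the one-path-per-line layout is an assumed convention. The sketch also assumes the downstream task can reach the listed files on its own (for example, as cloud storage paths), since they are no longer declared individually as task inputs.

```python
from typing import Iterable, List


def write_fofn(paths: Iterable[str], fofn_path: str) -> int:
    """Write one path per line to fofn_path and return how many were written.

    Hypothetical helper: the task would then declare only this single
    file as its output, instead of an array with one entry per path.
    """
    count = 0
    with open(fofn_path, "w", encoding="utf-8") as handle:
        for path in paths:
            handle.write(f"{path}\n")
            count += 1
    return count


def read_fofn(fofn_path: str) -> List[str]:
    """Read a FOFN back into a list of paths, skipping blank lines."""
    with open(fofn_path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]
```

With this pattern, the upstream task passes along one file and the downstream task calls something like `read_fofn` on it, so the list of 100,000 paths travels inside the file's contents rather than as individual input/output values.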