Workflow delays and upcoming improvements

Post author
Jason Cerrato

If you've noticed slower queue times or workflow delays recently, we have some context and a few improvements in the works.

What's going on

During periods of high platform activity, workflows can sit in the queue longer than usual, and status updates for completed jobs may lag. This is expected behavior under elevated load on a shared cluster; things are processing normally, just potentially slower than you're used to.

What we're doing about it

Improved queue visibility

The Job Submission UI will be updated to show whether your workflow is waiting behind others, so you'll always know where your submission stands.

Limits on extremely large workflows

To protect performance for everyone on shared infrastructure, we'll be introducing limits on very large workflow requests that can cause broader slowdowns. Our first limit will automatically abort workflows that exceed a metadata-writing threshold, returning a descriptive error message with your workflow ID and the relevant metadata size. This is expected to affect only a very small number of users; if Terra Support has not reached out to you about large workflow submissions in the past, we do not expect you to be impacted. If you are impacted and have questions, please write to us at support@terra.bio.

 

We'll share updates as these changes roll out, and appreciate your patience.

Comments

  • Comment author
    Jason Cerrato
    • Edited
    • Official comment

    We have released both improvements: improved queue visibility and the limit on extremely large workflows!

    Improved queue visibility comes with a collection of more informative statuses:

     

  • Comment author
    Beth Sheets

    Optimizing metadata writes in your workflow

    If you're interested in optimizing your workflow's metadata output, there are two main areas to consider:

    1. Total number of tasks run by the root workflow
    This applies regardless of subworkflow structure. Depending on your setup, you might consider:

    • Reducing the width of any scatter operations
    • Consolidating tasks (e.g., running multiple tools within a single task rather than as separate tasks)
    • Breaking the workflow into smaller workflows that run in sequence

    2. Number of metadata records created per task
    Task inputs and outputs are stored in metadata and are a common source of high metadata volume. Complex input/output types (arrays, structs, maps) are worth revisiting. For example, writing an array of 100,000 file paths as a task output generates 100,000 metadata rows. A more efficient approach would be to use a FOFN (file of file names) to pass lists of files between tasks, rather than passing them directly in WDL inputs/outputs.
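
As a rough illustration of the FOFN idea, here is a minimal Python sketch of the kind of helper a task's command might run: it writes a list of paths to a single text file, one per line, and reads them back in a downstream step. The function names, the `inputs.fofn` filename, and the `gs://example-bucket/...` paths are all hypothetical and used only for illustration; this is not Terra's or Cromwell's API.

```python
from pathlib import Path
import tempfile


def write_fofn(paths, fofn_path):
    """Write one file path per line and return the FOFN's path."""
    fofn = Path(fofn_path)
    fofn.write_text("".join(f"{p}\n" for p in paths))
    return fofn


def read_fofn(fofn_path):
    """Read the paths back in order, skipping any blank lines."""
    lines = Path(fofn_path).read_text().splitlines()
    return [line.strip() for line in lines if line.strip()]


# Hypothetical example: 100,000 paths collapsed into a single file.
paths = [f"gs://example-bucket/sample_{i}.bam" for i in range(100_000)]

with tempfile.TemporaryDirectory() as tmp:
    # The upstream task would expose this one file as its output
    # instead of a 100,000-element array.
    fofn = write_fofn(paths, Path(tmp) / "inputs.fofn")
    # The downstream task takes the FOFN as input and recovers the list.
    round_tripped = read_fofn(fofn)
```

One trade-off to keep in mind: the entries inside a FOFN are plain strings, so a downstream task would likely need to fetch the listed files itself rather than relying on them being staged the way direct file inputs are.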
