Workflow auto-retry based on error message

Could we get the ability in Terra to automatically retry a failed workflow when the failure matches specific error messages? For example, if the error message in Terra contains the string "Execution failed: generic::unknown: installing drivers: installing GPU drivers", I would like the job to be retried automatically.
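To make the request concrete, here is a minimal sketch of the behavior being asked for, written as a client-side loop. It is not Terra's API: `run_once` is a hypothetical callable standing in for "submit the workflow and wait for the result", and `RETRYABLE_PATTERNS`, `is_retryable`, and `run_with_retry` are names invented for illustration. The only detail taken from this post is the error string itself.

```python
import re
from typing import Callable, List, Pattern, Tuple

# Hypothetical allow-list of transient failures worth retrying.
# The single pattern below is the substring quoted in this post.
RETRYABLE_PATTERNS: List[Pattern[str]] = [
    re.compile(r"installing drivers: installing GPU drivers"),
]


def is_retryable(error_message: str,
                 patterns: List[Pattern[str]] = RETRYABLE_PATTERNS) -> bool:
    """Return True if the error message matches any retryable pattern."""
    return any(p.search(error_message) for p in patterns)


def run_with_retry(run_once: Callable[[], Tuple[bool, str]],
                   max_attempts: int = 3) -> Tuple[bool, str, int]:
    """Call run_once until it succeeds, fails with a non-retryable
    message, or max_attempts is reached.

    run_once is an assumed stand-in that returns (succeeded, error_message).
    Returns (succeeded, last_error_message, attempts_used).
    """
    message = ""
    for attempt in range(1, max_attempts + 1):
        succeeded, message = run_once()
        if succeeded:
            return True, "", attempt
        if not is_retryable(message):
            # Unknown failure: surface it immediately instead of retrying.
            return False, message, attempt
    return False, message, max_attempts
```

Matching on a substring rather than the full message means longer variants of the same failure would also be caught, while any unrecognized error still fails on the first attempt.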

Comments

  • Anthony DiCi

    Hi Stephen,

    Thank you for writing in! I've sent this request to our development team for consideration, and I'll be happy to follow up with you if this feature gets built.

    Best,

    Anthony

  • Stephen Fleming

    I am still seeing a lot of this:

    Task ... failed. The job was stopped before the command finished. PAPI error code 2. Execution failed: generic::unknown: installing drivers: installing GPU drivers and /proc/driver/nvidia does not exist: signal: terminated

    It seems to be becoming more and more common.

