Neverending jobs
Hi,
We are observing some super long runs for a few tasks that normally take minutes to an hour to complete. Could you investigate what's going on?
Workspace rjxmicrobiome/rjxmicrobiome, workflow id e4c73b68-684d-45d7-b396-996112d8766c.
Damian
Comments
Hi Damian -
I have contacted the Cromwell team to get an answer for you. I am also creating a Zendesk ticket for this issue.
Adelaide
Damian,
Khalid suggested running:
`gsutil cat gs://fc-secure-802bf880-16b1-4a10-ad89-da98f79919b8/01f0aeea-3d94-4a2c-ad22-057c82724c0e/workflowBiobakery/e4c73b68-684d-45d7-b396-996112d8766c/call-combineMetaphlan/attempt-3/combineMetaphlan.log`
You may see that the job is _very slowly_ downloading the 2,405 inputs.
If that's the case, this is the issue tracked at https://broadworkbench.atlassian.net/browse/BA-5666. The downloads are moving very slowly but should eventually finish.
NOTE: That issue is being actively worked on, and the ticket also has a comment describing a workaround. You can create a Jira account to read the comment directly. It also mentions that if you upgrade your WDL to version 1.0, you can add the workaround _and_ keep (future) call caching.
All that said, if there is empirical evidence within the logs that the downloads are stuck (not a lack of logs, but instead hundreds of downloads then nothing for a day) then we should kill the job.
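One rough way to tell "slow" from "stuck" is to count how many downloads the log has recorded so far and compare that number between two checks. The sketch below is only an illustration: the helper names `count_localized` and `check_progress` are made up for this example, and the pattern `Localizing input` is an assumption about how the log reports each download, so adjust it to whatever the log actually prints.

```shell
# Log path copied from the gsutil command earlier in this thread.
LOG='gs://fc-secure-802bf880-16b1-4a10-ad89-da98f79919b8/01f0aeea-3d94-4a2c-ad22-057c82724c0e/workflowBiobakery/e4c73b68-684d-45d7-b396-996112d8766c/call-combineMetaphlan/attempt-3/combineMetaphlan.log'

# Hypothetical helper: count log lines on stdin that mention a download.
# ASSUMPTION: each download is logged on a line containing "Localizing input".
count_localized() {
  # grep -c exits non-zero when nothing matches; "|| true" keeps the "0" result usable.
  grep -c 'Localizing input' || true
}

# Hypothetical helper: show the log object's size and timestamp, then the download count.
check_progress() {
  gsutil ls -l "$LOG"
  gsutil cat "$LOG" | count_localized
}
```

Running `check_progress` twice, a few hours apart, should show the count creeping toward 2,405 if the job is merely slow. If the count and the timestamp have not changed for a day, that matches the "stuck" case described above.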
Let me know if we should kill it.
Thanks,
Adelaide