Jobs say they're running but they do not appear to be
My job was queued for 1 hour today and has been in a running state for over 3.5 hours since then. However, the Google bucket is completely empty, so it appears that my tasks -- which Terra tells me are "2 currently being processed" -- are not actually being processed.
Comments
Hi all,
I've created a Known Issues post for this issue. All relevant updates will be posted there.
https://support.terra.bio/hc/en-us/community/posts/360058314871-Queued-submissions-and-inaccurate-job-status
Kind regards,
Jason
Between yesterday and today, 4 or 5 people in our lab have encountered this problem when using cumulus/bcl2fastq and cumulus/dropseq_workflow. It seems that some tasks complete delocalization successfully (judging from the logs that we are able to access) but then hang indefinitely until we abort them. Relaunching the jobs has worked with mixed success (provided call caching is off). I haven't had time to delve into the compute details, so I don't really know what's going on. Following this.
My submission is also not being processed. It has shown "QueuedInCromwell" for a long time.
My running job was last changed 4 hours ago.
I have a few jobs that still show as running in the Job Manager, but at the task level, all of the tasks show as complete. The data model is also not being populated with the final results.
My job failed, though the status is still running:
The error says that this task was actually not executed, so there was nothing to restart...
I am glad to see that the Terra team is working on the queuing and wrong status problem. Does that mean the call caching problem is also being addressed? My failed jobs run (or try to run) from scratch instead of using call caching.
Hi Edyta,
I can follow up with the team once the hotfix is in place to see whether this call caching issue is related. Would you be willing to create a new General Discussion post about it so we can take a look? Please include details about when the behavior started, submission and workflow IDs, a link to the workspace where you're experiencing this issue, and any screenshots that may be helpful. Please also share the workspace with GROUP_FireCloud-Support@firecloud.org.
Many thanks,
Jason
Hi Jason, I am also seeing this issue with call caching not working. In particular, the few jobs I ran in the last 24 hours are not using call caching and are rerunning from scratch.
Thanks,
Sarah
Hi Jason,
In the meantime, my job did use call caching (I can see data in the bucket), and the only problem is the wrong job status. As soon as I see another instance of the call caching problem, I will start a new ticket with all the details.
Thank you!
Edyta
Hi Sarah and Edyta,
Thank you for the information. It sounds likely that this is related to the ongoing issue, but if the issue persists after the hotfix goes out, please provide the details mentioned previously and we'll be happy to investigate.
Many thanks,
Jason
Hi all,
A hotfix for this issue is now in place and the issue should be resolved. If you have any further issues you would like us to take a look at, please create a new post detailing your issue and we will be happy to take a look.
Kind regards,
Jason