Jobs say they're running but they do not appear to be

Post author
Sarah Walker

My job was queued for 1 hour today and has been in a running state for over 3.5 hours since. However, the Google bucket is completely empty, so it appears that my tasks, which Terra tells me are "2 currently being processed", are not actually being processed.

Comments

12 comments

  • Comment author
    Jason Cerrato
    • Official comment

    Hi all,

    I've created a Known Issues post for this issue. All relevant updates will be posted there.

    https://support.terra.bio/hc/en-us/community/posts/360058314871-Queued-submissions-and-inaccurate-job-status

    Kind regards,

    Jason

  • Comment author
    James Gatter
    • Edited

    From yesterday and today, we've had 4 or 5 people in our lab encounter this problem when using cumulus/bcl2fastq and cumulus/dropseq_workflow. It seems that some tasks complete delocalization successfully (from the logs that we are able to access) but then hang indefinitely until we abort them. Relaunching the jobs has worked with mixed success (provided call-caching is off). I haven't really had time to delve into the compute details so I don't really know what's going on. Following this.

  • Comment author
    jtsuji

    My submission is also not being processed. It has shown "QueuedInCromwell" for a long time.

  • Comment author
    Edyta Malolepsza

    My running job was last changed 4 hours ago.

  • Comment author
    Sabrina Camp

    I have a few jobs that still show as running in the Job Manager, but at the task level all of the tasks show as complete. The data model is also not being populated with the final results.

  • Comment author
    Edyta Malolepsza
    • Edited

    My job failed, though the status is still running:


    The error says that this task was actually not executed, so there was nothing to restart...

  • Comment author
    Edyta Malolepsza

    I am glad to see that the Terra team is working on the queuing and wrong status problem. Does this mean that the call caching problem is also being taken into account? My failed jobs run (or try to run) from scratch instead of using call caching.

  • Comment author
    Jason Cerrato

    Hi Edyta,

    I can follow up with the team once the hotfix is in place to see whether this call caching issue is related. Would you be willing to create a new General Discussion post about it so we can take a look? Please include details about when the behavior started, submission and workflow IDs, a link to the workspace where you're experiencing this issue, and any screenshots that may be helpful. Please also share the workspace with GROUP_FireCloud-Support@firecloud.org.

    Many thanks,

    Jason

  • Comment author
    Sarah Walker

    Hi Jason, I am also seeing this issue where call caching isn't working. In particular, the few jobs I ran in the last 24 hours are not call caching and are rerunning from scratch.

    Thanks,

    Sarah

  • Comment author
    Edyta Malolepsza

    Hi Jason,

    In the meantime, my job did use call caching (I can see data in the bucket), and the only problem is the wrong job status. As soon as I see another instance of the call caching problem, I will start a new ticket with all the details.

    Thank you!
    Edyta

  • Comment author
    Jason Cerrato

    Hi Sarah and Edyta,

    Thank you for the information. It sounds likely that this is related to the ongoing issue, but if the issue persists after the hotfix goes out, please provide us with the details mentioned previously and we'll be happy to investigate.

    Many thanks,

    Jason

  • Comment author
    Jason Cerrato

    Hi all,

    A hotfix for this issue is now in place and the issue should be resolved. If you have any further issues you would like us to take a look at, please create a new post detailing your issue and we will be happy to take a look.

    Kind regards,

    Jason
