very, very long lags in retrieving cached results
I have a multi-step workflow in Dockstore and am currently working on improving some final output metrics. I am using call-caching while making iterative improvements to the final outputs. I assume the call-caching is working because the outputs are in file paths such as the one below:
gs://fc-secure-bdfd208d-6339-4d60-b52c-084c11cf863a/8b8783cb-0e69-45a3-8e86-8cec794f043e/Post_Merge_SV/873d9a47-509a-4af6-b1dc-f8049c09b69d/call-Summarize_Variant_Counts_DEL/cacheCopy/1kg_lumpy_manta.vcf.gz.variants.summary.del.txt.gz
However, it is taking many, many hours to retrieve the cached results. This is just a very small sample workflow with a couple of hundred samples; it is at least 3 orders of magnitude smaller than what we would intend to run 'in real life'. My experience with standalone Cromwell is that it does take some time to retrieve cached results, but this seems way beyond what I would expect.
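As a rough illustration of the check described above (inferring a cache hit from a `cacheCopy` segment in the output path), here is a minimal sketch. The function name `describe_output` is hypothetical, and the `call-<name>/cacheCopy/<file>` layout is assumed only from the example path in this post. It inspects the path string and does not query Cromwell or the bucket.

```python
from urllib.parse import urlparse


def describe_output(gs_path: str) -> dict:
    """Split a gs:// output path and report whether it looks like a cache copy.

    Assumption: the layout .../call-<name>/cacheCopy/<file> seen in the
    example path above. Only the string is inspected.
    """
    parsed = urlparse(gs_path)
    # Drop empty segments produced by the leading slash.
    segments = [s for s in parsed.path.split("/") if s]
    # First segment that starts with "call-" names the task, if present.
    call = next(
        (s[len("call-"):] for s in segments if s.startswith("call-")),
        None,
    )
    return {
        "bucket": parsed.netloc,
        "call": call,
        "cache_copy": "cacheCopy" in segments,
        "file": segments[-1] if segments else None,
    }


if __name__ == "__main__":
    example = (
        "gs://fc-secure-bdfd208d-6339-4d60-b52c-084c11cf863a/"
        "8b8783cb-0e69-45a3-8e86-8cec794f043e/Post_Merge_SV/"
        "873d9a47-509a-4af6-b1dc-f8049c09b69d/"
        "call-Summarize_Variant_Counts_DEL/cacheCopy/"
        "1kg_lumpy_manta.vcf.gz.variants.summary.del.txt.gz"
    )
    print(describe_output(example))
```

A path with `cacheCopy` only shows that the output was copied from a cached call; it says nothing about how long the copy took, which is the actual problem here.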
Comments
For example, this entire workflow (up to and beyond the current point) should have been cached but for some reason the 'Sort_Index_VCF_DEL' step (which has already been run and cached) has taken over 5 hours.
Hi Haley,
Thank you for writing in about this. It looks like Cromwell started experiencing some slowness around this time yesterday. I'm going to chat with the engineers to get a better sense of what's going on and get back to you ASAP.
Can you share the submission and workflow IDs for this affected job (and any others you notice)?
Kind regards,
Jason
I can find some workflow IDs if needed, but the problem seems improved now.
Hi Haley,
If things are looking better, no need to share. Many thanks for letting me know!
If things start looking weird again please feel free to flag it up.
Kind regards,
Jason
Hi Jason,
My jobs have been experiencing a similar problem since yesterday evening. Here is an example of a job that finished at 4:03 pm and is still shown as "running":
Thanks for your help.
Best regards,
Edyta
An update: my jobs that finished around 4 pm are still shown as running, and the data model has not yet been populated with these results.
My jobs are in the same spot as yesterday, in running mode.
Hi Edyta,
Thank you for writing in about this. We are still investigating the delays in submission and results retrieval. You can follow this article for the latest updates: https://support.terra.bio/hc/en-us/articles/360042960371-Service-Incident-April-30-2020
This is the same article that links from the banner in the Terra interface. I will also update this thread when I receive word on resolution.
Kind regards,
Jason
Hi Jason,
Thank you and good luck with solving the problem.
Best,
Edyta
Hi Edyta,
We are seeing that queues have returned to normal. I've sent you a message directly to follow up on your issue in case it persists.
Kind regards,
Jason
Hi Jason,
I shared my workspace right after receiving your direct message. My jobs are still in running mode. I will abort them.
Best,
Edyta