One shard keeps failing to load a BAM file
Hi Terra team,
I would like to ask for help solving a mystery: one shard in my run keeps failing to load a BAM file. I have restarted this job multiple times in the hope of being assigned to a less busy server, in case the problem is caused by overloading, but the run fails every time.
Please take a look at shard 40 in https://app.terra.bio/#workspaces/broadtagteam/tag_626_Roche_Rautanen_Mouse_WGS30X_variant_calling/job_history/7ea74da1-8224-4e77-8c09-deed25ab1ae5
I have seen many shards fail in this job, but restarts solved the problem for all shards except one.
I have granted read access to GROUP_FireCloud-Support@firecloud.org.
Thank you for your help.
Best regards,
Edyta
Comments
Hi Edyta,
Thanks for writing in. We can confirm we have access to the workspace. We'll take a closer look and get back to you as soon as we can!
Kind regards,
Jason
Hi Edyta,
I took a look at your linked job and a few of the other runs to compare the logs. I'm seeing a consistent failure to fully copy the BAM file into the VM, and it seems to fail at the same point every time.
If you take a look at our Troubleshooting article for PAPI Error Code 10 you can see that this failure can sometimes occur due to inadequate memory or disk space. Because this shard's log file stops so abruptly, with no error messages or delocalization, I'm led to believe this might be a memory issue.
You can see this in the log for this shard across multiple submissions:
7ea74da1-8224-4e77-8c09-deed25ab1ae5
ddf48fde-bd4e-43af-9548-f5fb8cab825e
ea844851-bb6a-4d14-825a-f3ee4c276643
I would recommend trying a rerun of the job with more memory to see if it gets you farther in the job, or to completion. If that doesn't have an effect, you may want to try a slightly higher disk size to check for the same improvements.
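As a rough illustration of how the new values for a rerun could be worked out, here is a minimal Python sketch. It assumes the task's resources are set through the Cromwell runtime attributes `memory` and `disks`; the helper name `scale_runtime`, the scaling factors, and the starting values are all hypothetical, so substitute the numbers from your own workflow.

```python
import math


def scale_runtime(memory_gb: float, disk_gb: float,
                  memory_factor: float = 2.0,
                  disk_factor: float = 2.0) -> dict:
    """Return Cromwell-style runtime attribute strings with scaled resources.

    The factors are assumptions for illustration; pick whatever increase
    makes sense for the failing task.
    """
    # Round up so the request is never smaller than the scaled value.
    new_memory = math.ceil(memory_gb * memory_factor)
    new_disk = math.ceil(disk_gb * disk_factor)
    return {
        "memory": f"{new_memory} GB",
        "disks": f"local-disk {new_disk} HDD",
    }


# Hypothetical starting values; the real ones come from the task's runtime block.
print(scale_runtime(memory_gb=8, disk_gb=100))
# prints {'memory': '16 GB', 'disks': 'local-disk 200 HDD'}
```

The resulting strings would then go into the failing task's runtime section (or the matching workflow inputs, if the workflow exposes them) before resubmitting.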
Let me know how it goes!
Kind regards,
Jason
Hi Jason,
Thank you very much for looking at this run. I will request more memory and see if it helps.
Best regards,
Edyta
Hi Edyta,
Those links for the logs may not work. Try these if so:
7ea74da1-8224-4e77-8c09-deed25ab1ae5
ddf48fde-bd4e-43af-9548-f5fb8cab825e
ea844851-bb6a-4d14-825a-f3ee4c276643
Kind regards,
Jason
Hi Jason,
Yes, I cleaned some failed buckets and keep a record of it. I just restarted the run with more memory.
Best regards,
Edyta
Hi Edyta,
Sounds good! Let us know how it goes.
Kind regards,
Jason
Hi Jason,
The first trial with increased memory failed as well. After requesting double the amount of memory and almost double the amount of disk space, the run finished successfully.
Thank you for your help!
Best regards,
Edyta
Hi Edyta,
Great to hear! If we can be of further assistance, please let us know.
Kind regards,
Jason