one shard keeps failing to load a BAM file
Hi Terra team,
I would like to ask for help with a mystery: one shard in my run keeps failing to load a BAM file. I have restarted this job multiple times in the hope of being assigned to a less busy server, in case the problem is caused by overloading, but the run fails every time.
Please take a look at shard 40 in https://app.terra.bio/#workspaces/broadtagteam/tag_626_Roche_Rautanen_Mouse_WGS30X_variant_calling/job_history/7ea74da1-8224-4e77-8c09-deed25ab1ae5
I have seen many shards fail in this job, but restarts solved the problem for all shards except one.
I have granted read access to GROUP_FireCloud-Support@firecloud.org.
Thank you for your help.
Best regards,
Edyta
-
Hi Edyta,
I took a look at your linked job and a few of the other runs to compare the logs. I'm seeing a consistent failure to fully copy the BAM file onto the VM, and the copy appears to fail at the same point each time.
If you take a look at our Troubleshooting article for PAPI Error Code 10 you can see that this failure can sometimes occur due to inadequate memory or disk space. Because this shard's log file stops so abruptly, with no error messages or delocalization, I'm led to believe this might be a memory issue.
You can see this in the log for this shard across multiple submissions:
7ea74da1-8224-4e77-8c09-deed25ab1ae5
ddf48fde-bd4e-43af-9548-f5fb8cab825e
ea844851-bb6a-4d14-825a-f3ee4c276643
I would recommend rerunning the job with more memory to see whether it gets further, or runs to completion. If that has no effect, you may want to try a slightly larger disk size and check for the same improvement.
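As a rough illustration of that retry strategy, here is a minimal Python sketch for picking the next memory and disk values to try. The function name `bump_resources`, the doubling factor, and the 20 GB padding are all assumptions made for the example, not Terra defaults; the resulting numbers would still need to be entered in the workflow's own memory and disk settings.

```python
import math


def bump_resources(memory_gb, disk_gb, bam_size_bytes,
                   memory_factor=2.0, disk_padding_gb=20):
    """Suggest (memory_gb, disk_gb) for the next attempt.

    Hypothetical helper: the factor and padding are illustrative
    assumptions, not values recommended by Terra.
    """
    bam_gb = bam_size_bytes / 1024 ** 3
    # Scale memory up by the chosen factor, rounded up to a whole GB.
    new_memory = math.ceil(memory_gb * memory_factor)
    # The disk must hold the whole BAM plus some headroom;
    # never suggest less than the current disk size.
    new_disk = max(disk_gb, math.ceil(bam_gb) + disk_padding_gb)
    return new_memory, new_disk


# Example: a shard with 4 GB memory and a 50 GB disk, loading a 60 GiB BAM.
print(bump_resources(4, 50, 60 * 1024 ** 3))
```

Changing one setting at a time (memory first, then disk) makes it easier to tell which one was actually limiting the shard.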
Let me know how it goes!
Kind regards,
Jason
-
Hi Edyta,
The log links in my previous message may not work. If so, try these instead:
7ea74da1-8224-4e77-8c09-deed25ab1ae5
ddf48fde-bd4e-43af-9548-f5fb8cab825e
ea844851-bb6a-4d14-825a-f3ee4c276643
Kind regards,
Jason