one shard keeps failing to load a BAM file

Hi Terra team,

I would like to ask for help with solving the mystery of why one shard in my run keeps failing to load a BAM file. I have restarted this job multiple times in the hope of being assigned to a less busy server, in case the problem is caused by overloading, but the run fails every time.

Please take a look at shard 40 in https://app.terra.bio/#workspaces/broadtagteam/tag_626_Roche_Rautanen_Mouse_WGS30X_variant_calling/job_history/7ea74da1-8224-4e77-8c09-deed25ab1ae5

I have seen many shards fail in this job, but restarts solved the problem in all shards except one.

I have granted read access to GROUP_FireCloud-Support@firecloud.org.

Thank you for your help.

Best regards,

Edyta

Comments

  • Comment author
    Jason Cerrato

    Hi Edyta,

    Thanks for writing in. We can confirm we have access to the workspace. We'll take a closer look and get back to you as soon as we can!

    Kind regards,

    Jason

  • Comment author
    Jason Cerrato

    Hi Edyta,

    I took a look at your linked job and a few of the other runs to compare the logs. I'm seeing a consistent failure to fully copy the BAM file into the VM, and it seems to fail at the same point each time.

    If you take a look at our Troubleshooting article for PAPI Error Code 10 you can see that this failure can sometimes occur due to inadequate memory or disk space. Because this shard's log file stops so abruptly, with no error messages or delocalization, I'm led to believe this might be a memory issue.

    You can see this in the log for this shard across multiple submissions:

    7ea74da1-8224-4e77-8c09-deed25ab1ae5

    ddf48fde-bd4e-43af-9548-f5fb8cab825e

    ea844851-bb6a-4d14-825a-f3ee4c276643

    I would recommend trying a rerun of the job with more memory to see if it gets you farther in the job, or to completion. If that doesn't have an effect, you may want to try a slightly higher disk size to check for the same improvements.

    Let me know how it goes!

    Kind regards,

    Jason
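
For readers applying the advice above, here is a minimal sketch of how one might compute larger values for the Cromwell `memory` and `disks` runtime attributes before a rerun. The function name `scaled_runtime`, the baseline values (8 GB of memory, 100 GB of disk), and the scale factors are hypothetical illustrations; the thread does not state the actual values used in this workflow.

```python
import math


def scaled_runtime(memory_gb, disk_gb, memory_factor=2.0, disk_factor=2.0, disk_type="HDD"):
    """Return Cromwell-style runtime attribute strings with scaled memory and disk.

    All inputs are assumptions supplied by the caller; nothing here reads
    the real workflow configuration.
    """
    # Round up so the request never falls below the scaled target.
    new_memory = math.ceil(memory_gb * memory_factor)
    new_disk = math.ceil(disk_gb * disk_factor)
    return {
        "memory": f"{new_memory} GB",
        "disks": f"local-disk {new_disk} {disk_type}",
    }


# Hypothetical baseline: 8 GB memory and 100 GB disk, both doubled.
print(scaled_runtime(8, 100))
```

The resulting strings would then be placed in the failing task's `runtime` section, or passed through workflow inputs if the workflow exposes memory and disk size as parameters.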

  • Comment author
    Edyta Malolepsza

    Hi Jason,

    Thank you very much for looking at this run. I will request more memory and see if it helps.

    Best regards,

    Edyta

  • Comment author
    Jason Cerrato

    Hi Edyta,

    Those links for the logs may not work. Try these if so:

    7ea74da1-8224-4e77-8c09-deed25ab1ae5

    ddf48fde-bd4e-43af-9548-f5fb8cab825e

    ea844851-bb6a-4d14-825a-f3ee4c276643

    Kind regards,

    Jason

  • Comment author
    Edyta Malolepsza

    Hi Jason,

    Yes, I cleaned some failed buckets and keep a record of it. I just restarted the run with more memory.

    Best regards,

    Edyta

  • Comment author
    Jason Cerrato

    Hi Edyta,

    Sounds good! Let us know how it goes.

    Kind regards,

    Jason

  • Comment author
    Edyta Malolepsza

    Hi Jason,

    The first trial with increased memory failed as well. After requesting double the amount of memory and almost double the amount of disk space, the run finished successfully.

    Thank you for your help!

    Best regards,

    Edyta

  • Comment author
    Jason Cerrato

    Hi Edyta,

    Great to hear! If we can be of further assistance, please let us know.

    Kind regards,

    Jason
