shard-9 consistently failing on Mutect2 PON creation

July 22, 2019 19:34
3 comments

Hello,

I've been trying to run the recommended Mutect2-PON workflow on 10 WGS samples, and the job is failing ONLY on the last shard (shard-9) of every M2 call, for all 10 M2 calls spawned from Mutect2.

Screenshot of an example of the M2 job status:

The stderr logs are mostly uninformative: most of them truncate abruptly, although one that reached attempt-12 did hit a delocalization error.

I shared the workspace with GROUP_FireCloud-Support@firecloud.org. The workspace I'm working in is called "pd-wgs-workspace", and can be found in the project "pd-wgs-project". The submission ID for this job is 4ba3b45d-d8d2-49a7-a7a6-c32079f09aea, and similar errors also occurred in earlier runs.

Help would be much appreciated! :) This has been tough to debug because of the relatively long run time for this tool.

Comments


Sushma Chaluvadi
- July 22, 2019 21:54
Hi Aspen,

I am going to take a look at the workspace and get back to you!

Sushma

Sushma Chaluvadi
- July 23, 2019 21:06
Hi Aspen,

We think that something in shard-9 specifically may be more computationally expensive than in the other shards, since it fails every time. Can you check whether the interval in shard-9 is much larger than the others, or whether it contains regions/chromosomes that differ from the contents of the other intervals?
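
One rough way to make that comparison is to total the bases covered by each shard's interval file. The sketch below assumes the shard intervals are available as Picard-style interval_list text (header lines starting with "@", then tab-separated contig, start, end, strand, name, with 1-based inclusive coordinates). The function names and the shard labels are made up for illustration and are not part of the workflow.

```python
def interval_list_bases(lines):
    """Total bases covered by one interval_list, given its lines as strings."""
    total = 0
    for line in lines:
        line = line.strip()
        # Skip blank lines and the "@"-prefixed sequence dictionary header.
        if not line or line.startswith("@"):
            continue
        fields = line.split("\t")
        start, end = int(fields[1]), int(fields[2])
        # Coordinates are 1-based and inclusive, hence the +1.
        total += end - start + 1
    return total


def shard_sizes(shards):
    """Map each shard label (hypothetical) to the bases its intervals cover."""
    return {label: interval_list_bases(lines) for label, lines in shards.items()}
```

Passing the lines of each shard's file (for example via open(path).read().splitlines()) gives one number per shard. A shard whose total is far above the rest, or whose contig names look different from the others, would be the first thing to look at.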

Aspen Neuro
- July 24, 2019 17:46
Hi Sushma,

I solved my issue while thinking about your suggestion that the last shard was computationally intensive. I looked back at a Mutect2 PON workflow I ran last year and realized that the default scatter_count parameter back then was 50, whereas now it is set to 10. Changing scatter_count back to 50 allowed the PON creation to run successfully, most likely because each shard then has a smaller load.
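
As a rough illustration of the difference, here is a back-of-the-envelope calculation. It assumes the intervals are split evenly across shards (the real split may not be exactly even) and uses an approximate genome size of 3.1 billion bases; both the function name and that figure are assumptions for the sketch, not values taken from the workflow.

```python
def bases_per_shard(total_bases, scatter_count):
    """Upper bound on bases per shard if the territory is split evenly."""
    # Ceiling division, so a remainder still counts toward the largest shard.
    return -(-total_bases // scatter_count)


APPROX_GENOME_BASES = 3_100_000_000  # assumed round figure for a human genome

for count in (10, 50):
    print(count, bases_per_shard(APPROX_GENOME_BASES, count))
```

Under those assumptions each shard covers about 310 million bases with scatter_count 10 and about 62 million with scatter_count 50, so each M2 call has roughly a fifth of the work.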

Thanks!
