PAPI Error Code 13
Hi,
Looking for some direction as to how to address the error below. Thanks for the help!
The job was stopped before the command finished. PAPI error code 13. Execution failed: generic::internal: action 14: waiting for container: container is still running, possibly due to low system resources
Comments
Jason Cerrato here's the submission ID: 8e84a6d4-9dd0-4c14-a4ec-e17d29ed1e62 and workflow ID: 1c4b5fee-864d-43b7-8af0-ab35586ef936
Thanks,
Jake
Jason Cerrato
I ran a different workflow, vanallenlab/facets, instead of my version of the same method, jakec/FACETS, and the vanallenlab version ran successfully. However, the vanallenlab version is built off of my version, so there should be very little difference in either the WDLs or the Docker images. I haven't updated the Docker image or WDL for jakec/FACETS since July 2019, and it ran fine up until late April. I also had that odd interval earlier this month where about half of the samples ran successfully.
When I ran my version immediately after the successful runs with the vanallenlab version, I still got the error (which now reports insufficient disk space, per our last email). Is this a Docker-related issue? Does Google Cloud flag or limit the number of times a particular Docker image can be pulled?
Thanks,
Jake
Hi Jake,
That's very interesting. I don't believe there's a limit but I'll be happy to confirm with my colleagues.
Can you share both of those methods with jcerrato@broadinstitute.org and/or send me the WDL files?
Kind regards,
Jason
Hi Jake,
Thank you for the WDLs. Would you be able to provide the workflow IDs for both the job run with vanallenlab/facets and the job run with jakec/FACETS? If the error for jakec/FACETS is the exact same as the previously provided workflow 1c4b5fee-864d-43b7-8af0-ab35586ef936, which it sounds like it is, then providing just the vanallenlab/facets one should work for comparison.
If you are able to share the workspace where you ran these jobs with GROUP_FireCloud-Support@firecloud.org, that would make investigation easier, but I understand if the workspace is protected by an authorization domain and thus cannot be shared.
Many thanks,
Jason
Hey Jason,
I gave workspace access to that support group. It's just TCGA data.
The workflow run on May 10, 2020, 6:52 PM includes successful and failed runs using jakec/FACETS.
Thanks,
Jake
Hi Jake,
Perfect. Can you share the name of the workspace or provide a link? Since it's a group, we don't get individual notifications for being added.
Many thanks,
Jason
Hi Jake Conway,
Looking at your latest jobs, am I correct in understanding that you are only running into failures when using DRS paths? Can you confirm that your NIH account and framework services are connected and active in your profile?
Kind regards,
Jason
Hey Jason,
I'm not sure whether only the DRS paths are failing. The majority of TCGA BAMs have been moved to DRS paths. All of my account services and frameworks are active, and they were active when I ran those jobs. I just ran the vanallenlab version of the tool on the entire cohort, and all of the runs succeeded.
Thanks,
Jake
Hi Jake,
Understood. Based on a quick glance at your jobs using jakec/FACETS, it looks like the one with only gs:// inputs succeeded, whereas those with drs:// inputs failed.
It could be a coincidence, though. I'll dig deeper.
Kind regards,
Jason