Task can't find localized file.
Hi! I think my file is localized properly, but the actual run can't find or recognize it. I'm not sure how to resolve this. I've pasted the log below. Thanks!
2020/01/28 07:28:20 Starting container setup.
2020/01/28 07:28:24 Done container setup.
2020/01/28 07:28:26 Starting localization.
2020/01/28 07:28:31 Localization script execution started...
2020/01/28 07:28:31 Localizing input gs://fc-02b29faa-2a28-4e51-88db-a8a55cf72c48/4ef7953e-c333-4b30-82a5-4966adb8969d/M1_PON/1d146c61-2b52-47b7-985b-a4c622325663/call-CombineVariants/normals.merged.min5.vcf -> /cromwell_root/fc-02b29faa-2a28-4e51-88db-a8a55cf72c48/4ef7953e-c333-4b30-82a5-4966adb8969d/M1_PON/1d146c61-2b52-47b7-985b-a4c622325663/call-CombineVariants/normals.merged.min5.vcf
2020/01/28 07:28:42 Localizing input gs://fc-02b29faa-2a28-4e51-88db-a8a55cf72c48/4ef7953e-c333-4b30-82a5-4966adb8969d/M1_PON/1d146c61-2b52-47b7-985b-a4c622325663/call-htslib/script -> /cromwell_root/script
2020/01/28 07:28:43 Localization script execution complete.
2020/01/28 07:28:46 Done localization.
2020/01/28 07:28:47 Running user action: docker run -v /mnt/local-disk:/cromwell_root --entrypoint= miguelpmachado/htslib@sha256:3a7f4d4f972a6ea598622a51337835aad8833af370876cae4a86ad03334eb101 /bin/bash /cromwell_root/script
[bgzip] No such file or directory: /cromwell_root/fc-c162cdc9-0d45-4fab-9d0b-5a5ef80ec733/27e32dc1-9849-4b7d-af5b-227d17cae43c/M1_PON/a6ecf052-587b-41d1-a04f-36ec60d6fe2c/call-CombineVariants/normals.merged.min5.vcf
tbx_index_build failed: /cromwell_root/fc-c162cdc9-0d45-4fab-9d0b-5a5ef80ec733/27e32dc1-9849-4b7d-af5b-227d17cae43c/M1_PON/a6ecf052-587b-41d1-a04f-36ec60d6fe2c/call-CombineVariants/normals.merged.min5.vcf.gz
2020/01/28 07:28:48 Starting delocalization.
2020/01/28 07:28:49 Delocalization script execution started...
2020/01/28 07:28:49 Delocalizing output /cromwell_root/memory_retry_rc -> gs://fc-02b29faa-2a28-4e51-88db-a8a55cf72c48/4ef7953e-c333-4b30-82a5-4966adb8969d/M1_PON/1d146c61-2b52-47b7-985b-a4c622325663/call-htslib/memory_retry_rc
2020/01/28 07:28:49 Delocalizing output /cromwell_root/rc -> gs://fc-02b29faa-2a28-4e51-88db-a8a55cf72c48/4ef7953e-c333-4b30-82a5-4966adb8969d/M1_PON/1d146c61-2b52-47b7-985b-a4c622325663/call-htslib/rc
2020/01/28 07:28:50 Delocalizing output /cromwell_root/stdout -> gs://fc-02b29faa-2a28-4e51-88db-a8a55cf72c48/4ef7953e-c333-4b30-82a5-4966adb8969d/M1_PON/1d146c61-2b52-47b7-985b-a4c622325663/call-htslib/stdout
2020/01/28 07:28:52 Delocalizing output /cromwell_root/stderr -> gs://fc-02b29faa-2a28-4e51-88db-a8a55cf72c48/4ef7953e-c333-4b30-82a5-4966adb8969d/M1_PON/1d146c61-2b52-47b7-985b-a4c622325663/call-htslib/stderr
2020/01/28 07:28:53 Delocalizing output /cromwell_root/fc-02b29faa-2a28-4e51-88db-a8a55cf72c48/4ef7953e-c333-4b30-82a5-4966adb8969d/M1_PON/1d146c61-2b52-47b7-985b-a4c622325663/call-CombineVariants/normals.merged.min5.vcf.gz -> gs://fc-02b29faa-2a28-4e51-88db-a8a55cf72c48/4ef7953e-c333-4b30-82a5-4966adb8969d/M1_PON/1d146c61-2b52-47b7-985b-a4c622325663/call-htslib/fc-02b29faa-2a28-4e51-88db-a8a55cf72c48/4ef7953e-c333-4b30-82a5-4966adb8969d/M1_PON/1d146c61-2b52-47b7-985b-a4c622325663/call-CombineVariants/normals.merged.min5.vcf.gz
Required file output '/cromwell_root/fc-02b29faa-2a28-4e51-88db-a8a55cf72c48/4ef7953e-c333-4b30-82a5-4966adb8969d/M1_PON/1d146c61-2b52-47b7-985b-a4c622325663/call-CombineVariants/normals.merged.min5.vcf.gz' does not exist.
Comments
Hi Sehyun,
Is this also happening in pilot-study-credit/CNVworkflow_TCGA_LUAD? If not, can you share the workspace with GROUP_FireCloud-Support@firecloud.org and let us know the workspace name, as well as the relevant submission and workflow ID(s)?
Kind regards,
Jason
Hi Jason,
This error is from a different workspace. I added the support account to this workspace.
workspace : pilot-study-credit/Tumor_Only_CNV_rerunSynData
workflow : 1_MuTect1_PON
submission ID : 98802fc1-48fe-44bf-b6fa-1f98f9404f0b
- Sehyun
Hi Jason,
Is there any update on this issue?
- Sehyun
Hi Sehyun,
Looking at the log file associated with the submission, we've identified some funkiness.
If you take a look at the lines from the log file above, specifically the bucket IDs in the paths, you will see that localization and delocalization are happening in the Google bucket associated with the workspace, as expected (fc-02b29faa-2a28-4e51-88db-a8a55cf72c48). However, the bgzip and tbx_index_build errors refer to normals.merged.min5.vcf and normals.merged.min5.vcf.gz under a different bucket (fc-c162cdc9-0d45-4fab-9d0b-5a5ef80ec733).
Do you have any idea why the workflow would be expecting to find the file in a different workspace bucket rather than its own, where the file should theoretically be getting generated as output from the CombineVariants task?
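For anyone who wants to check a log for this kind of mismatch, here is a minimal Python sketch. It assumes workspace buckets are always named "fc-" followed by a UUID, as they are in the log above. The function name buckets_by_line and the shortened sample lines are made up for illustration and are not part of Terra or Cromwell.

```python
import re
from collections import defaultdict

# Assumption: workspace bucket names look like "fc-" followed by a UUID,
# as in the log above.
BUCKET_RE = re.compile(r"fc-[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12}")


def buckets_by_line(log_text):
    """Map each bucket ID in the log to the 1-based line numbers mentioning it."""
    seen = defaultdict(list)
    for lineno, line in enumerate(log_text.splitlines(), start=1):
        # set() so a bucket that appears twice on one line is recorded once.
        for bucket in sorted(set(BUCKET_RE.findall(line))):
            seen[bucket].append(lineno)
    return dict(seen)


# Hypothetical sample: two lines modeled on the log above, with the long
# paths shortened for readability.
sample_log = "\n".join([
    "Localizing input gs://fc-02b29faa-2a28-4e51-88db-a8a55cf72c48/normals.merged.min5.vcf",
    "[bgzip] No such file or directory: /cromwell_root/fc-c162cdc9-0d45-4fab-9d0b-5a5ef80ec733/normals.merged.min5.vcf",
])

found = buckets_by_line(sample_log)
if len(found) > 1:
    print("More than one bucket referenced:")
    for bucket, lines in found.items():
        print(f"  {bucket} on line(s) {lines}")
```

If more than one bucket ID shows up, the line numbers show which step (localization, the task command, or delocalization) refers to which bucket.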
Kind regards,
Jason
Hi Jason,
The funkiness you pointed out is exactly what I'm curious about too. I don't know why localization succeeds but the task then looks for the file in a different bucket. This exact workflow even worked fine before; I'm just running it again as a test.
- Sehyun
Hi Sehyun,
I'm running some tests to see if I can drill down to the specific issue you're facing here. I was able to run a successful submission using the samples in the case sample set. I'm doing a run of the neutral set now to see if I run into the same exact error.
Jason
Hi Sehyun,
I was able to get a successful run with the neutral set using what appear to be the exact same inputs as your run. I'm taking a look to see if I can identify any differences I've overlooked.
I've shared the workspace with you so you can take a look too, if you're interested: https://app.terra.bio/#workspaces/broad-firecloud-dsde/Tumor_Only_CNV_rerunSynData_jcerrato/job_history/62c27bd3-3823-4c25-bf27-47d63c4a2df0
Kind regards,
Jason
Hi Jason!
Can you share the output.json of your successful run above? Thanks!
- Sehyun
Hi Sehyun,
I see that you were able to run a successful submission on the neutral set. Do you still require the output.json?
Our developers have taken a look at the differences between your failed and successful submissions and believe this to be a bug. Do these workflows contain any sensitive data? They would like to include the metadata, but we would like to confirm the absence of sensitive data before proceeding.
Jason
Hi Jason,
Yes, I still need the output.json. The successful run you see is not actually successful: I kind of force-fed the mis-localized input to check the other parts of the workflow.
Sehyun
Hi Jason,
One unrelated question: how/why did your job history update 'run cost' information?
- Sehyun
Hi Sehyun,
Can you provide a little more information about what you mean?
Kind regards,
Jason
Hi Sehyun,
Ah, I see what you are referring to. Run cost information is currently made available to members of the Broad Institute after 24 hours due to the billing account structure of the organization. This visibility is currently being worked on for Terra users outside of the Broad.
Jason