Task can't find localized file.

Post author
Sehyun Oh

Hi! I think my file is localized properly, but the actual run can't find or recognize it, and I'm not sure how to resolve this. I've pasted the log below. Thanks!

 

2020/01/28 07:28:20 Starting container setup.
2020/01/28 07:28:24 Done container setup.
2020/01/28 07:28:26 Starting localization.
2020/01/28 07:28:31 Localization script execution started...
2020/01/28 07:28:31 Localizing input gs://fc-02b29faa-2a28-4e51-88db-a8a55cf72c48/4ef7953e-c333-4b30-82a5-4966adb8969d/M1_PON/1d146c61-2b52-47b7-985b-a4c622325663/call-CombineVariants/normals.merged.min5.vcf -> /cromwell_root/fc-02b29faa-2a28-4e51-88db-a8a55cf72c48/4ef7953e-c333-4b30-82a5-4966adb8969d/M1_PON/1d146c61-2b52-47b7-985b-a4c622325663/call-CombineVariants/normals.merged.min5.vcf
2020/01/28 07:28:42 Localizing input gs://fc-02b29faa-2a28-4e51-88db-a8a55cf72c48/4ef7953e-c333-4b30-82a5-4966adb8969d/M1_PON/1d146c61-2b52-47b7-985b-a4c622325663/call-htslib/script -> /cromwell_root/script
2020/01/28 07:28:43 Localization script execution complete.
2020/01/28 07:28:46 Done localization.
2020/01/28 07:28:47 Running user action: docker run -v /mnt/local-disk:/cromwell_root --entrypoint= miguelpmachado/htslib@sha256:3a7f4d4f972a6ea598622a51337835aad8833af370876cae4a86ad03334eb101 /bin/bash /cromwell_root/script
[bgzip] No such file or directory: /cromwell_root/fc-c162cdc9-0d45-4fab-9d0b-5a5ef80ec733/27e32dc1-9849-4b7d-af5b-227d17cae43c/M1_PON/a6ecf052-587b-41d1-a04f-36ec60d6fe2c/call-CombineVariants/normals.merged.min5.vcf
tbx_index_build failed: /cromwell_root/fc-c162cdc9-0d45-4fab-9d0b-5a5ef80ec733/27e32dc1-9849-4b7d-af5b-227d17cae43c/M1_PON/a6ecf052-587b-41d1-a04f-36ec60d6fe2c/call-CombineVariants/normals.merged.min5.vcf.gz
2020/01/28 07:28:48 Starting delocalization.
2020/01/28 07:28:49 Delocalization script execution started...
2020/01/28 07:28:49 Delocalizing output /cromwell_root/memory_retry_rc -> gs://fc-02b29faa-2a28-4e51-88db-a8a55cf72c48/4ef7953e-c333-4b30-82a5-4966adb8969d/M1_PON/1d146c61-2b52-47b7-985b-a4c622325663/call-htslib/memory_retry_rc
2020/01/28 07:28:49 Delocalizing output /cromwell_root/rc -> gs://fc-02b29faa-2a28-4e51-88db-a8a55cf72c48/4ef7953e-c333-4b30-82a5-4966adb8969d/M1_PON/1d146c61-2b52-47b7-985b-a4c622325663/call-htslib/rc
2020/01/28 07:28:50 Delocalizing output /cromwell_root/stdout -> gs://fc-02b29faa-2a28-4e51-88db-a8a55cf72c48/4ef7953e-c333-4b30-82a5-4966adb8969d/M1_PON/1d146c61-2b52-47b7-985b-a4c622325663/call-htslib/stdout
2020/01/28 07:28:52 Delocalizing output /cromwell_root/stderr -> gs://fc-02b29faa-2a28-4e51-88db-a8a55cf72c48/4ef7953e-c333-4b30-82a5-4966adb8969d/M1_PON/1d146c61-2b52-47b7-985b-a4c622325663/call-htslib/stderr
2020/01/28 07:28:53 Delocalizing output /cromwell_root/fc-02b29faa-2a28-4e51-88db-a8a55cf72c48/4ef7953e-c333-4b30-82a5-4966adb8969d/M1_PON/1d146c61-2b52-47b7-985b-a4c622325663/call-CombineVariants/normals.merged.min5.vcf.gz -> gs://fc-02b29faa-2a28-4e51-88db-a8a55cf72c48/4ef7953e-c333-4b30-82a5-4966adb8969d/M1_PON/1d146c61-2b52-47b7-985b-a4c622325663/call-htslib/fc-02b29faa-2a28-4e51-88db-a8a55cf72c48/4ef7953e-c333-4b30-82a5-4966adb8969d/M1_PON/1d146c61-2b52-47b7-985b-a4c622325663/call-CombineVariants/normals.merged.min5.vcf.gz
Required file output '/cromwell_root/fc-02b29faa-2a28-4e51-88db-a8a55cf72c48/4ef7953e-c333-4b30-82a5-4966adb8969d/M1_PON/1d146c61-2b52-47b7-985b-a4c622325663/call-CombineVariants/normals.merged.min5.vcf.gz' does not exist.

Comments

13 comments

  • Comment author
    Jason Cerrato

    Hi Sehyun,

    Is this also happening in pilot-study-credit/CNVworkflow_TCGA_LUAD? If not, can you share the workspace with GROUP_FireCloud-Support@firecloud.org and let us know the workspace name, as well as the relevant submission and workflow ID(s)?

    Kind regards,

    Jason

  • Comment author
    Sehyun Oh

    Hi Jason,

    This error is from a different workspace. I've added the support account to that workspace.

    workspace : pilot-study-credit/Tumor_Only_CNV_rerunSynData

    workflow : 1_MuTect1_PON

    submission ID : 98802fc1-48fe-44bf-b6fa-1f98f9404f0b

    - Sehyun

  • Comment author
    Sehyun Oh

    Hi Jason,

    Is there any update on this issue?

    - Sehyun

  • Comment author
    Jason Cerrato

    Hi Sehyun,

    Looking at the log file associated with the submission, we've identified some funkiness.

    2020/01/28 07:28:42 Localizing input gs://fc-02b29faa-2a28-4e51-88db-a8a55cf72c48/4ef7953e-c333-4b30-82a5-4966adb8969d/M1_PON/1d146c61-2b52-47b7-985b-a4c622325663/call-htslib/script -> /cromwell_root/script
    2020/01/28 07:28:43 Localization script execution complete.
    2020/01/28 07:28:46 Done localization.
    2020/01/28 07:28:47 Running user action: docker run -v /mnt/local-disk:/cromwell_root --entrypoint= miguelpmachado/htslib@sha256:3a7f4d4f972a6ea598622a51337835aad8833af370876cae4a86ad03334eb101 /bin/bash /cromwell_root/script
    [bgzip] No such file or directory: /cromwell_root/fc-c162cdc9-0d45-4fab-9d0b-5a5ef80ec733/27e32dc1-9849-4b7d-af5b-227d17cae43c/M1_PON/a6ecf052-587b-41d1-a04f-36ec60d6fe2c/call-CombineVariants/normals.merged.min5.vcf
    tbx_index_build failed: /cromwell_root/fc-c162cdc9-0d45-4fab-9d0b-5a5ef80ec733/27e32dc1-9849-4b7d-af5b-227d17cae43c/M1_PON/a6ecf052-587b-41d1-a04f-36ec60d6fe2c/call-CombineVariants/normals.merged.min5.vcf.gz
    2020/01/28 07:28:48 Starting delocalization.
    2020/01/28 07:28:49 Delocalization script execution started...
    2020/01/28 07:28:49 Delocalizing output /cromwell_root/memory_retry_rc -> gs://fc-02b29faa-2a28-4e51-88db-a8a55cf72c48/4ef7953e-c333-4b30-82a5-4966adb8969d/M1_PON/1d146c61-2b52-47b7-985b-a4c622325663/call-htslib/memory_retry_rc

    If you take a look at the lines from the log file above, specifically the bucket names in the paths, you will see that localization and delocalization are happening in the Google bucket associated with the workspace, as expected (fc-02b29faa-2a28-4e51-88db-a8a55cf72c48). However, the bgzip and tbx_index_build errors refer to normals.merged.min5.vcf and normals.merged.min5.vcf.gz in a different bucket (fc-c162cdc9-0d45-4fab-9d0b-5a5ef80ec733).

    Do you have any idea why the workflow would be expecting to find the file in a different workspace bucket rather than its own, where the file should theoretically be getting generated as output from the CombineVariants task?
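
The mismatch described above can be checked mechanically. What follows is only a minimal Python sketch, assuming the task log is available as plain text and that workspace buckets always appear as "fc-" followed by a UUID, as they do in this thread. The helper name `buckets_by_line` is hypothetical and is not part of Terra or Cromwell; the three sample lines are copied from the log in this thread.

```python
import re
from collections import defaultdict

# Assumption: workspace buckets appear in paths as "fc-" followed by a UUID.
BUCKET_RE = re.compile(
    r"fc-[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
)


def buckets_by_line(log_text):
    """Map each bucket name to the 1-based log line numbers that mention it."""
    seen = defaultdict(list)
    for number, line in enumerate(log_text.splitlines(), start=1):
        # A line may mention the same bucket twice; record it once per line.
        for bucket in sorted(set(BUCKET_RE.findall(line))):
            seen[bucket].append(number)
    return dict(seen)


# Three lines copied from the log in this thread.
sample_log = "\n".join([
    "2020/01/28 07:28:42 Localizing input gs://fc-02b29faa-2a28-4e51-88db-a8a55cf72c48/4ef7953e-c333-4b30-82a5-4966adb8969d/M1_PON/1d146c61-2b52-47b7-985b-a4c622325663/call-htslib/script -> /cromwell_root/script",
    "[bgzip] No such file or directory: /cromwell_root/fc-c162cdc9-0d45-4fab-9d0b-5a5ef80ec733/27e32dc1-9849-4b7d-af5b-227d17cae43c/M1_PON/a6ecf052-587b-41d1-a04f-36ec60d6fe2c/call-CombineVariants/normals.merged.min5.vcf",
    "2020/01/28 07:28:49 Delocalizing output /cromwell_root/rc -> gs://fc-02b29faa-2a28-4e51-88db-a8a55cf72c48/4ef7953e-c333-4b30-82a5-4966adb8969d/M1_PON/1d146c61-2b52-47b7-985b-a4c622325663/call-htslib/rc",
])

found = buckets_by_line(sample_log)
for bucket, lines in found.items():
    print(bucket, "-> log lines", lines)
if len(found) > 1:
    print("More than one workspace bucket is referenced in this log.")
```

On these sample lines, the localization and delocalization entries share one bucket, while the bgzip error points at the other, which is the discrepancy in question.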

    Kind regards,

    Jason

  • Comment author
    Sehyun Oh

    Hi Jason,

    The funkiness you pointed out is exactly what I'm curious about too. I don't know why localization happens successfully but the task then looks for the file in a different bucket. This exact workflow even worked fine before; I'm just running it again as a test.

    - Sehyun

  • Comment author
    Jason Cerrato

    Hi Sehyun,

    I'm running some tests to see if I can drill down to the specific issue you're facing here. I was able to run a successful submission using the samples in the case sample set. I'm doing a run of the neutral set now to see if I run into the same exact error.

    Jason

  • Comment author
    Jason Cerrato

    Hi Sehyun,

    I was able to get a successful run with the neutral set using what appear to be the exact same inputs as your run. I'm taking a look to see if I can identify any differences I've overlooked.

    I've shared the workspace with you so you can take a look too, if you're interested: https://app.terra.bio/#workspaces/broad-firecloud-dsde/Tumor_Only_CNV_rerunSynData_jcerrato/job_history/62c27bd3-3823-4c25-bf27-47d63c4a2df0

    Kind regards,

    Jason

  • Comment author
    Sehyun Oh

    Hi Jason!

    Can you share the output.json of your successful run above? Thanks!

    - Sehyun

  • Comment author
    Jason Cerrato

    Hi Sehyun,

    I see that you were able to run a successful submission on the neutral set. Do you still require the output.json?

    Our developers have taken a look at the differences between your failed and successful submissions and believe this to be a bug. Do these workflows contain any sensitive data? They would like to include the metadata, but we would like to confirm the absence of sensitive data before proceeding.

    Jason

  • Comment author
    Sehyun Oh

    Hi Jason,

    Yes, I still need the output.json. The successful run you see is not actually successful: I kind of force-fed the mis-localized input to check the other part of the workflow.

    Sehyun

     

  • Comment author
    Sehyun Oh

    Hi Jason,

    One unrelated question: how/why did your job history update the 'run cost' information?

    - Sehyun

  • Comment author
    Jason Cerrato

    Hi Sehyun,

    Can you provide a little more information about what you mean?

    Kind regards,

    Jason

  • Comment author
    Jason Cerrato

    Hi Sehyun,

    Ah, I see what you are referring to. Run cost information is currently made available to members of the Broad Institute after 24 hours due to the billing account structure of the organization. Work is under way to make it visible to Terra users outside of the Broad.

    Jason
