File localization not working
Hi, I've checked the other related articles on issues with file localization, and my problem doesn't seem to be amongst those. I've written a WDL to use samtools on a bam and a ref fasta.
1. Problem: The bai does not localize, although all other files are localized:
2021/11/22 19:12:43 Starting container setup.
2021/11/22 19:12:45 Done container setup.
2021/11/22 19:12:50 Starting localization.
2021/11/22 19:13:14 Localization script execution started...
2021/11/22 19:13:14 Localizing input gs://fc-secure-faf8c8cd-0082-4cee-84a8-76472106ceed/test_positions.bed -> /cromwell_root/fc-secure-faf8c8cd-0082-4cee-84a8-76472106ceed/test_positions.bed
2021/11/22 19:13:18 Localizing input gs://fc-secure-faf8c8cd-0082-4cee-84a8-76472106ceed/18f90bdc-92fc-433c-93c9-0882d57dad55/callMpileup/ce577710-2cb5-4fc9-9dde-ca5f011484b9/call-Mpileup/script -> /cromwell_root/script
2021/11/22 19:13:20 Localizing input gs://gcp-public-data--broad-references/hg19/v0/Homo_sapiens_assembly19.fasta -> /cromwell_root/gcp-public-data--broad-references/hg19/v0/Homo_sapiens_assembly19.fasta
2021/11/22 19:13:59 Localizing input gs://gcp-public-data--broad-references/hg19/v0/Homo_sapiens_assembly19.fasta.fai -> /cromwell_root/gcp-public-data--broad-references/hg19/v0/Homo_sapiens_assembly19.fasta.fai
Copying gs://gcp-public-data--broad-references/hg19/v0/Homo_sapiens_assembly19.fasta.fai...
/ [0 files][ 0.0 B/ 2.7 KiB] / [1 files][ 2.7 KiB/ 2.7 KiB]
Operation completed over 1 objects/2.7 KiB.
2021/11/22 19:14:04 Localizing input gs://fc-7320b057-72ec-4702-9c4f-662efc9af9e1/Brastianos_BrainTumor_Sample_set_2015/RP-328/Exome/BMM-13S/v4/BMM-13S.bam -> /cromwell_root/fc-7320b057-72ec-4702-9c4f-662efc9af9e1/Brastianos_BrainTumor_Sample_set_2015/RP-328/Exome/BMM-13S/v4/BMM-13S.bam
2021/11/22 19:23:23 Localization script execution complete.
2021/11/22 19:23:39 Done localization.
2021/11/22 19:23:42 Running user action: docker run -v /mnt/local-disk:/cromwell_root -v /mnt/d-c74a541aa27f13cfe59c2f998a664729:/mnt/d9e025138b28caa42dd4006fc3636661:ro --entrypoint=/bin/bash us.gcr.io/broad-gotc-prod/genomes-in-the-cloud@sha256:4fca8ca945c17fd86e31eeef1c02983e091d4f2cb437199e74b164d177d5b2d1 /cromwell_root/script
[mpileup] fail to load index for /cromwell_root/fc-7320b057-72ec-4702-9c4f-662efc9af9e1/Brastianos_BrainTumor_Sample_set_2015/RP-328/Exome/BMM-13S/v4/BMM-13S.bam
Why is the bai not localized? It's an input into the task.
2. Problem: If I make localization optional for bam, bai, fasta, and fai, samtools cannot open the fasta.
2021/11/22 19:16:43 Starting container setup.
2021/11/22 19:16:45 Done container setup.
2021/11/22 19:16:49 Starting localization.
2021/11/22 19:17:20 Localization script execution started...
2021/11/22 19:17:20 Localizing input gs://fc-secure-faf8c8cd-0082-4cee-84a8-76472106ceed/test_positions.bed -> /cromwell_root/fc-secure-faf8c8cd-0082-4cee-84a8-76472106ceed/test_positions.bed
2021/11/22 19:17:23 Localizing input gs://fc-secure-faf8c8cd-0082-4cee-84a8-76472106ceed/140b7701-b79e-4dad-96b3-78ac9a9cb36a/callMpileup/8fda3d75-38fd-4cd8-938a-815085847433/call-Mpileup/script -> /cromwell_root/script
2021/11/22 19:17:25 Localization script execution complete.
2021/11/22 19:17:35 Done localization.
2021/11/22 19:17:36 Running user action: docker run -v /mnt/local-disk:/cromwell_root --entrypoint=/bin/bash us.gcr.io/broad-gotc-prod/genomes-in-the-cloud@sha256:4fca8ca945c17fd86e31eeef1c02983e091d4f2cb437199e74b164d177d5b2d1 /cromwell_root/script
[fai_load] build FASTA index.
[fai_build] fail to open the FASTA file gs://gcp-public-data--broad-references/hg19/v0/Homo_sapiens_assembly19.fasta
It's also beyond me why it needs to rebuild the fasta index as it's supplied.
I'm happy to share access with the testing workspace for the support team to have a look.
Best,
Philipp
Comments
Hi Philipp,
Thank you for writing in about this issue. Can you share the workspace where you are seeing this issue with GROUP_FireCloud-Support@firecloud.org by clicking the Share button in your workspace? The Share option is in the three-dots menu at the top-right.
Please provide us with
We’ll be happy to take a closer look as soon as we can!
Kind regards,
Jason
Hi Jason,
thanks for coming back to me!
I've shared the workspace with the email. The link is https://app.terra.bio/#workspaces/carterlabtest/test_mpileup
I've set up this workspace to exactly showcase the two problems, so there is only one workflow present (mpileup) and two job submissions. Let me know if you also need me to copy the bam over to this workspace so you can run the workflow.
Best,
Philipp
Hi Philipp,
Thanks for sharing that workspace. Would you be willing to add my account (jcerrato@broadinstitute.org) to the authorization domain carterlab temporarily for the purposes of troubleshooting?
Kind regards,
Jason
My PI is admin of that group. He'll do that as soon as he can.
Great! Let me know when I've been added and I'll be happy to take a closer look.
All right, in the interest of time, I've re-created the testing workspace without authorization domain. https://app.terra.bio/#workspaces/carterlabtest/test
Let me know if you can't access it!
Hi Philipp,
I can access that one! I'll take a closer look and get back to you as soon as I can.
Kind regards,
Jason
Hi Philipp,
Would you be able to give my account (jcerrato@broadinstitute.org) access to this WDL, or share the .wdl file?
Kind regards,
Jason
I gave you full access to the WDL
Hi Philipp,
I took a look at your two submissions and noticed something interesting. In your first submission (957f51c9-c60a-4396-a2b9-bfb8ca650fe1, workflow ID 9b4c9b17-74d3-478a-b0e3-4c49faa214de), I opened the Inputs for your Mpileup task, and it looks like the same .bam file is being fed in for both the bam and bai inputs.
This appears to be because of the way the task is called in the WDL: the workflow's bam input is passed to both the bam and bai inputs of the Mpileup task.
Changing this so that the Mpileup task bai gets the workflow's bai input should fix this issue.
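As a rough sketch of what I mean (I'm guessing at the input names and the samtools command line here, so treat everything other than the callMpileup and Mpileup names and the docker image from your logs as hypothetical), the corrected call would look something like this:

```wdl
version 1.0

# Sketch only: input names and the samtools command line are assumptions,
# not copied from the workspace's WDL.
workflow callMpileup {
  input {
    File bam
    File bai
    File ref_fasta
    File ref_fasta_index
    File positions_bed
  }

  call Mpileup {
    input:
      bam = bam,
      bai = bai,  # the failing run effectively had `bai = bam` here
      ref_fasta = ref_fasta,
      ref_fasta_index = ref_fasta_index,
      positions_bed = positions_bed
  }

  output {
    File pileup = Mpileup.pileup
  }
}

task Mpileup {
  input {
    File bam
    File bai              # declared as an input so that it gets localized
    File ref_fasta
    File ref_fasta_index  # likewise for the .fai
    File positions_bed
  }

  command <<<
    samtools mpileup -f ~{ref_fasta} -l ~{positions_bed} ~{bam} > output.pileup
  >>>

  runtime {
    docker: "us.gcr.io/broad-gotc-prod/genomes-in-the-cloud@sha256:4fca8ca945c17fd86e31eeef1c02983e091d4f2cb437199e74b164d177d5b2d1"
  }

  output {
    File pileup = "output.pileup"
  }
}
```

This also matches the first log: the bam shows up only once in the localization step and no .bai appears at all, which is what you would expect if both inputs pointed at the same file. Note that the command never names the bai explicitly. It relies on the index sitting next to the bam in the bucket, so that the two land in the same directory under /cromwell_root and samtools can find the index by its file name.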
I'll see if I can find out what's going on with the fasta file.
Kind regards,
Jason
Hi Jason,
thanks a lot for spotting this embarrassing bug! I'm curious to hear your verdict on the fasta file too!
Best,
Philipp
Hi Philipp,
I see you've set localization_optional for the ref_fasta file in your WDL. I'm wondering if this is what's causing the failure with the file, because it's being told not to localize the file to the disk.
Some tools, like certain GATK tools, can stream files into the command, which means they don't need the file to be localized in order to work. Do you know if samtools is capable of streaming in files given only the gs:// path? I wonder if turning localization_optional off will allow this to work. What do you think?
Kind regards,
Jason
To be clear, you may need to remove localization_optional for all files if samtools isn't capable of streaming in the files when given their gs:// paths.
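For reference, here is a minimal sketch of where that setting lives, assuming a task shaped roughly like yours (the input names and command line are my guesses, not your actual WDL). With these parameter_meta entries in place, the command receives the gs:// paths as plain strings instead of local file paths, which would explain the fai_build error in your second log:

```wdl
version 1.0

# Sketch only: input names and the samtools command line are assumptions.
task Mpileup {
  input {
    File bam
    File bai
    File ref_fasta
    File ref_fasta_index
    File positions_bed
  }

  # Each entry below tells Cromwell it may skip copying that file to disk.
  # Deleting an entry (or the whole block) restores normal localization.
  parameter_meta {
    bam: { localization_optional: true }
    bai: { localization_optional: true }
    ref_fasta: { localization_optional: true }
    ref_fasta_index: { localization_optional: true }
  }

  command <<<
    # With localization skipped, ~{ref_fasta} and ~{bam} expand to gs:// URLs.
    samtools mpileup -f ~{ref_fasta} -l ~{positions_bed} ~{bam} > output.pileup
  >>>

  runtime {
    docker: "us.gcr.io/broad-gotc-prod/genomes-in-the-cloud@sha256:4fca8ca945c17fd86e31eeef1c02983e091d4f2cb437199e74b164d177d5b2d1"
  }

  output {
    File pileup = "output.pileup"
  }
}
```

It probably also accounts for the index rebuild you mentioned: if samtools cannot read a .fai next to the path it was given, it tries to build one, and that is the step that fails when it cannot open the gs:// fasta.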
That's a good point. Looking at the Mutect2 WDL, one of the tasks is CramToBam, which uses samtools, and does not make localization optional.
Do you know where I can look into how GATK streams files into the command? I'm wondering if it's easy enough to write a wrapper for samtools to make localization optional ... but that's just curious me.
For now, I'll use the working localization option.
Many thanks for taking your time to look into these issues!
Best,
Philipp
Hey Philipp,
I'm not privy to how GATK streams files but I can try to find out next week and get back to you! For now localizing the files definitely seems like the right route.
I'll follow up and let you know what I find, if anything.
Have a great rest of your week!
Kind regards,
Jason
Hey Philipp,
I've learned that GATK streams in files using Java's NIO package. You can search the gatk git repo for issue tickets and PRs that mention "nio" to find examples.
I hope this helps you find what you need!
Kind regards,
Jason
Thanks, Jason! I'll take a look.
I am not familiar with how GATK streams files, but I can try to find out next week. I would suggest localizing the files for now.
I'll follow up and let you know if I find anything.
Enjoy the rest of your week!
Kind regards!!!
Maika