Failure in RecalibrateGqTask of the GATK-SV single-sample pipeline
Hi everyone,
I’m trying to run a test of the GATK-Structural-Variants-Single-Sample pipeline (v1.1) on Terra using the demo dataset NA12878, but the workflow fails with an error in the RecalibrateGqTask task.
I’ve attached a screenshot of the error message for reference. Could someone help take a look and advise what might be causing this issue? Please just let me know what other information is needed.
Thank you very much for your help!
Best,
Qiushi
Comments
Hi Qiushi,
Thank you for writing in about this issue. Can you share the workspace where you are seeing this issue with Terra Support by clicking the Share button in your workspace? The Share option is in the three-dots menu at the top-right.
Please provide us with the workspace link, the relevant submission ID, and the relevant workflow ID.
We’ll be happy to take a closer look as soon as we can!
Kind regards,
Samantha
Hi Samantha,
Thank you for your response and for offering to take a closer look.
I have shared the workspace with Support. Below are the requested details:
• Workspace link:
https://app.terra.bio/#workspaces/WGS-BakerLab/GATK-Structural-Variants-Single-Sample%2002252026
• Relevant submission ID:
c12dac7a-6190-40cf-8b6d-6cad567b710e
• Relevant workflow ID:
https://doi.org/10.5281/zenodo.17055982 (GATK-SV-v1.1)
Please let me know if you need any additional information from my end.
Best regards,
Qiushi
Hi Qiushi,
I'm part of the GATK-SV team looking into your issue. We noticed that a number of workflow inputs are missing in the submission you linked, including genome_tracks, which is an input to the task that failed. Because so many inputs were missing, we have not yet been able to test whether this is the cause of the failure, but it is what we suspect so far. Missing inputs will affect the functionality of the pipeline regardless, so we would like to address this first. The inputs may have disappeared when you previously tried a few different versions of the pipeline, since inputs can differ across versions of the code. To avoid compatibility issues, we recommend always using the workflow versions, workspace data, and input configurations present in the featured workspace.
To test this out, could you please re-try the workflow on NA12878 with the original input configuration from the featured workspace? To make sure you have the right input configuration, you can either make a fresh clone of the featured workspace, or you can copy the workflow from the featured workspace into your existing workspace by clicking on the three dots at the bottom right corner of the workflow and selecting "Copy to another workspace." Then, launch the updated workflow on NA12878 without making any edits to the inputs.
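If it helps to confirm which inputs went missing before re-running, the comparison can be sketched in a few lines of Python. This is only an illustration, not part of the GATK-SV tooling: it assumes both input configurations (the featured workspace's and your own) have been saved as plain mappings of input name to value, and apart from genome_tracks the input names and all values below are made up for the example.

```python
def missing_inputs(reference, current):
    """Return names of inputs that are set in `reference` but absent or empty in `current`."""
    missing = []
    for name, value in reference.items():
        if value in (None, ""):
            continue  # not set in the reference configuration either
        if current.get(name) in (None, ""):
            missing.append(name)
    return sorted(missing)


# Hypothetical configurations: only "genome_tracks" is a real input name from this thread.
featured_config = {
    "genome_tracks": "this.genome_tracks",
    "example_sample_id": "this.sample_id",
    "example_optional_input": "",
}
my_config = {
    "example_sample_id": "this.sample_id",
}

print(missing_inputs(featured_config, my_config))
```

Any name this reports is an input that the featured configuration sets and yours does not, which is the situation described above for genome_tracks.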
Once you try this, let us know if the workflow succeeds or if you are still experiencing issues.
Best,
Emma
Hi Emma,
I wanted to follow up and thank you for your help earlier. The test job using the demo dataset (NA12878) now runs successfully in a freshly cloned workspace.
However, when I submitted a job using our own data, I encountered the following error in the backend log:
“ERROR: (gcloud.storage.cp) HTTPError 403: pet-27649560779295c688860@terra-6ca4d305.iam.gserviceaccount.com does not have storage.objects.get access to the Google Cloud Storage object. Permission ‘storage.objects.get’ denied on resource (or it may not exist). This command is authenticated as pet-27649560779295c688860@terra-6ca4d305.iam.gserviceaccount.com which is the active account specified by the [core/account] property.”
The CRAM and CRAI files are stored in another workspace within our lab under the same billing project. It seems like this might be a permissions issue between workspaces.
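For anyone scanning a long backend log for this kind of failure, here is a minimal Python sketch that pulls out the account and the denied permission from a message shaped like the one above. The regular expression is my own assumption based on this single message, not an official gcloud log format, and the function name is hypothetical.

```python
import re

# Pattern written against the one error message quoted in this thread (an assumption,
# not a documented format).
DENIED_PATTERN = re.compile(
    r"HTTPError 403: (?P<account>\S+@\S+\.iam\.gserviceaccount\.com) "
    r"does not have (?P<permission>[\w.]+) access"
)


def parse_denied_access(log_text):
    """Return (service_account, permission) from a 403 message, or None if no match."""
    match = DENIED_PATTERN.search(log_text)
    if match is None:
        return None
    return match.group("account"), match.group("permission")


sample_log = (
    "ERROR: (gcloud.storage.cp) HTTPError 403: "
    "pet-27649560779295c688860@terra-6ca4d305.iam.gserviceaccount.com "
    "does not have storage.objects.get access to the Google Cloud Storage object."
)

print(parse_denied_access(sample_log))
```

The account it reports is the identity the copy command was authenticated as, so that is the identity that would need read access to the files.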
Could you advise how to grant the necessary access so that the workflow in this GATK-SV Single-Sample workspace can read those files?
• Workspace link:
https://app.terra.bio/#workspaces/WGS-BakerLab/GATK-Structural-Variants-Single-Sample%2003162026
• Relevant submission ID:
0f6f209d-4171-426b-9000-596f907c4751
• Relevant workflow ID:
https://doi.org/10.5281/zenodo.17055982 (GATK-SV-v1.1)
Thank you again for your support.
Qiushi
Hi Qiushi Li,
Thank you for sharing those links. I can see the input files are stored in the Terra Data Repository. Does your account have access to the relevant snapshots? Or, if the intention is for the permissions to sync with the workspace permissions, do you know whether the snapshot has the "Add workspace policy groups to snapshot readers" configuration enabled (assuming you are a member of the workspace where that snapshot was exported)?
Kind regards,
Jason
Hi Jason,
I can see the file paths in the workspace data table, but I cannot access the data and I also do not see the corresponding snapshot in Terra.
It seems I do not have access to the Terra Data Repository snapshot. I've forwarded this post to the owner of the data in the lab. Hopefully he can figure that out.
Thanks
Qiushi
Hi Jason,
The account manager has added me as a steward to the snapshot, and the workspace policy groups were already included in the snapshot readers. However, the newly submitted job still failed with the same error (submission ID: ad2a1a79-7dac-4459-9846-155d54c6d123).
Do you have any suggestions on what might be causing this?
Also, do you have a support email address? The account manager would like to be looped into the discussion to help troubleshoot more efficiently.
Kind regards,
Qiushi
Hi Qiushi Li,
You can send a message to support@terra.bio and we'll follow up with you via email.
Best,
Samantha