gatk4-exome-analysis-pipeline BamProcessing.SortSam fails with Error Code 10 Answered
Beri
I'm trying to run your ExomeGermlineSingleSample workflow (Source: github.com/gatk-workflows/gatk4-exome-analysis-pipeline/ExomeGermlineSingleSample:update2.0v)
I was able to successfully use this workflow to analyze the sample dataset (SRR099969.unmapped.bam), as described in the documentation here.
When I try to analyze my own exome dataset, the first several steps of the analysis succeed, but the job fails during the BamProcessing.SortSam step. The workflow log contains the error message: "Job Failed with Error Code 10 for a machine where Preemptible is set to false". I have tried this twice using version 1.1 and once using version 2.0; all three runs failed with Error Code 10 during SortSam. I have provided some more information here.
I would appreciate any advice you can give me to enable this workflow to succeed on my dataset.
Thanks in advance!
Comments
4 comments
Beri
I was able to resolve this issue by increasing the runtime memory for the SortSam call in BamProcessing.wdl from 5000 MiB to 10000 MiB, like this:
runtime {
  docker: "us.gcr.io/broad-gotc-prod/genomes-in-the-cloud:2.4.3-1564508330"
  disks: "local-disk " + disk_size + " HDD"
  cpu: "1"
  memory: "10000 MiB"
  preemptible: preemptible_tries
}
The steps I followed to do this are:
1. Fork the git repo from https://github.com/gatk-workflows/gatk4-exome-analysis-pipeline
2. Edit the tasks/BamProcessing.wdl to increase the allocated memory for the SortSam call.
3. Edit the absolute imports to point to my git repo, for example:
import "https://raw.githubusercontent.com/rogerdettloff/gatk4-exome-analysis-pipeline//1.2.1/tasks/UnmappedBamToAlignedBam.wdl"
4. Create a new workflow using the FireCloud portal described here, and export it into my Terra Workspace.
I don't know whether 10000 MiB is the optimal size for the runtime memory; however, this experiment has shown that the current value of 5000 MiB is too low for the SortSam task on some datasets.
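One way to avoid hardcoding a new value in a forked copy is to expose the memory size as a task input with a default. This is a sketch only; the `memory_mb` input name and the surrounding task skeleton are my own, not the pipeline's actual code:

```wdl
task SortSam {
  input {
    File input_bam
    String output_bam_basename
    Int disk_size
    Int preemptible_tries
    Int memory_mb = 10000   # hypothetical input: default raised from 5000
  }
  command {
    # ... SortSam command unchanged ...
  }
  runtime {
    docker: "us.gcr.io/broad-gotc-prod/genomes-in-the-cloud:2.4.3-1564508330"
    disks: "local-disk " + disk_size + " HDD"
    cpu: "1"
    memory: memory_mb + " MiB"
    preemptible: preemptible_tries
  }
}
```

A caller that hits Error Code 10 could then raise the value for that one call (e.g. `memory_mb = 10000`) without editing the task body again.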
Beri
I've noticed that the latest version of this workflow in the git repo now has relative imports for several WDL files. For example: `import "./tasks/UnmappedBamToAlignedBam.wdl" as ToBam`.
How can I use a WDL script with relative imports like that in my own workflows? Using the FireCloud portal described here to upload a workflow seems to allow only a single script pasted into the textbox. That simple interface does not seem suitable for this more complex workflow, which is composed of several interdependent scripts. For example, if I were to fork your latest version and edit one of the scripts, I would need some way to upload all of the interdependent scripts as a package or bundle for my new workflow to execute. Is there a different interface that would allow me to do that?
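For reference, here are the two import styles side by side; the comments are my own reading of how each is resolved:

```wdl
# Absolute import: the full URL is self-contained, so a host such as
# the FireCloud method repo can fetch the file directly.
import "https://raw.githubusercontent.com/rogerdettloff/gatk4-exome-analysis-pipeline//1.2.1/tasks/UnmappedBamToAlignedBam.wdl" as ToBam

# Relative import: resolvable only by a host that has the whole repo
# layout available, e.g. a platform pointing at a registered git repo.
import "./tasks/UnmappedBamToAlignedBam.wdl" as ToBam
```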
I had one more issue with this gatk4-exome-analysis-pipeline:
I also needed to increase the runtime memory for the QC.CollectHsMetrics step. This was easy to do, since that task takes a "memory_multiplier" parameter as input. I just needed to add the memory_multiplier parameter and set it to 2 (the default is 1). In the top level wdl file, I changed the QC.CollectHsMetrics call to be like this:
call QC.CollectHsMetrics as CollectHsMetrics {
  input:
    input_bam = UnmappedBamToAlignedBam.output_bam,
    input_bam_index = UnmappedBamToAlignedBam.output_bam_index,
    metrics_filename = sample_and_unmapped_bams.base_file_name + ".hybrid_selection_metrics",
    ref_fasta = references.reference_fasta.ref_fasta,
    ref_fasta_index = references.reference_fasta.ref_fasta_index,
    target_interval_list = target_interval_list,
    bait_interval_list = bait_interval_list,
    preemptible_tries = papi_settings.agg_preemptible_tries,
    memory_multiplier = 2
}
This workflow completed successfully on my dataset with these changes.
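A memory_multiplier input like this presumably scales a base memory value inside the task. Here is a sketch of how such a task can be written; the base of 7500 MB and the rounding are illustrative assumptions, not the pipeline's actual numbers:

```wdl
task CollectHsMetrics {
  input {
    File input_bam
    Int preemptible_tries
    Float memory_multiplier = 1.0   # pass 2 to double the request
  }
  Int base_memory_mb = 7500         # illustrative base value, not the real one
  Int memory_mb = ceil(base_memory_mb * memory_multiplier)
  runtime {
    memory: memory_mb + " MiB"
    preemptible: preemptible_tries
  }
}
```

With memory_multiplier = 2, this sketch would request 15000 MiB instead of 7500 MiB.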
I'm happy to hear that you were able to resolve the issue, and thanks for posting the steps you used to resolve it.
Regarding relative imports, this is a feature offered by Dockstore and isn't available in the FireCloud method repo. If you would like to use this feature, you can register your forked repo with Dockstore and import those workflows into your Terra workspace. A step-by-step procedure can be found here: Importing-a-Dockstore-workflow-into-Terra