IO Error while running MergeBamAlignment on Terra
Hello,
I'm currently running a sample through snapshot 8 of the "processing-for-variant-discovery-gatk4" workflow on Terra, and I'm consistently hitting the following error on the MergeBamAlignment step:
Picked up _JAVA_OPTIONS: -Djava.io.tmpdir=/cromwell_root/tmp.3e98f934
01:09:27.823 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/gatk/gatk-package-4.1.0.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
[Tue Jul 02 01:09:27 UTC 2019] MergeBamAlignment --UNMAPPED_BAM /cromwell_root/fc-2c60522a-326b-4267-94c5-f21b05441fc1/SQ_ROUND_1/uBAMs/HDF111iPS501_D0_r1/HDF111iPS501_D0_r1_H7VLWALXX_1.unmapped.bam --ALIGNED_BAM /cromwell_root/fc-2c60522a-326b-4267-94c5-f21b05441fc1/32928054-6abf-4275-ae63-c1f84fca4f51/PreProcessingForVariantDiscovery_GATK4/01d357c5-80e7-48d1-b164-94f61c9e6832/call-SamToFastqAndBwaMem/shard-0/HDF111iPS501_D0_r1_H7VLWALXX_1.unmapped.unmerged.bam --OUTPUT HDF111iPS501_D0_r1_H7VLWALXX_1.unmapped.aligned.unsorted.bam --PROGRAM_RECORD_ID bwamem --PROGRAM_GROUP_VERSION 0.7.15-r1140 --PROGRAM_GROUP_COMMAND_LINE bwa mem -K 100000000 -p -v 3 -t 16 -Y /cromwell_root/broad-references/hg38/v0/Homo_sapiens_assembly38.fasta --PROGRAM_GROUP_NAME bwamem --PAIRED_RUN true --CLIP_ADAPTERS false --IS_BISULFITE_SEQUENCE false --ALIGNED_READS_ONLY false --MAX_INSERTIONS_OR_DELETIONS -1 --ATTRIBUTES_TO_RETAIN X0 --EXPECTED_ORIENTATIONS FR --ALIGNER_PROPER_PAIR_FLAGS true --SORT_ORDER unsorted --PRIMARY_ALIGNMENT_STRATEGY MostDistant --ADD_MATE_CIGAR true --UNMAP_CONTAMINANT_READS true --UNMAPPED_READ_STRATEGY COPY_TO_TAG --VALIDATION_STRINGENCY SILENT --MAX_RECORDS_IN_RAM 2000000 --REFERENCE_SEQUENCE /cromwell_root/broad-references/hg38/v0/Homo_sapiens_assembly38.fasta --ADD_PG_TAG_TO_READS true --ATTRIBUTES_TO_REVERSE OQ --ATTRIBUTES_TO_REVERSE U2 --ATTRIBUTES_TO_REVERSE_COMPLEMENT E2 --ATTRIBUTES_TO_REVERSE_COMPLEMENT SQ --READ1_TRIM 0 --READ2_TRIM 0 --CLIP_OVERLAPPING_READS true --INCLUDE_SECONDARY_ALIGNMENTS true --MIN_UNCLIPPED_BASES 32 --MATCHING_DICTIONARY_TAGS M5 --MATCHING_DICTIONARY_TAGS LN --VERBOSITY INFO --QUIET false --COMPRESSION_LEVEL 5 --CREATE_INDEX false --CREATE_MD5_FILE false --GA4GH_CLIENT_SECRETS client_secrets.json --help false --version false --showHidden false --USE_JDK_DEFLATER false --USE_JDK_INFLATER false
[Tue Jul 02 01:09:28 UTC 2019] Executing as root@79edb330a5ca on Linux 4.14.111+ amd64; OpenJDK 64-Bit Server VM 1.8.0_191-8u191-b12-0ubuntu0.16.04.1-b12; Deflater: Intel; Inflater: Intel; Provider GCS is available; Picard version: Version:4.1.0.0
INFO 2019-07-02 01:09:28 SamAlignmentMerger Processing SAM file(s): [/cromwell_root/fc-2c60522a-326b-4267-94c5-f21b05441fc1/32928054-6abf-4275-ae63-c1f84fca4f51/PreProcessingForVariantDiscovery_GATK4/01d357c5-80e7-48d1-b164-94f61c9e6832/call-SamToFastqAndBwaMem/shard-0/HDF111iPS501_D0_r1_H7VLWALXX_1.unmapped.unmerged.bam]
INFO 2019-07-02 01:09:57 AbstractAlignmentMerger Merged 1,000,000 records. Elapsed time: 00:00:29s. Time for last 1,000,000: 27s. Last read position: chr18:58,937,624
INFO 2019-07-02 01:09:57 AbstractAlignmentMerger 64939 Reads have been unmapped due to being suspected of being Cross-species contamination.
INFO 2019-07-02 01:10:23 AbstractAlignmentMerger Merged 2,000,000 records. Elapsed time: 00:00:55s. Time for last 1,000,000: 26s. Last read position: chr2:13,856,463
INFO 2019-07-02 01:10:23 AbstractAlignmentMerger 130164 Reads have been unmapped due to being suspected of being Cross-species contamination.
...
<truncated for readability>
...
INFO 2019-07-02 03:35:03 AbstractAlignmentMerger Merged 371,000,000 records. Elapsed time: 02:25:34s. Time for last 1,000,000: 23s. Last read position: chr3:196,442,210
INFO 2019-07-02 03:35:03 AbstractAlignmentMerger 24191587 Reads have been unmapped due to being suspected of being Cross-species contamination.
INFO 2019-07-02 03:35:26 AbstractAlignmentMerger Merged 372,000,000 records. Elapsed time: 02:25:57s. Time for last 1,000,000: 22s. Last read position: chr1:106,898,739
INFO 2019-07-02 03:35:26 AbstractAlignmentMerger 24257995 Reads have been unmapped due to being suspected of being Cross-species contamination.
INFO 2019-07-02 03:35:50 AbstractAlignmentMerger Merged 373,000,000 records. Elapsed time: 02:26:21s. Time for last 1,000,000: 24s. Last read position: chr17:45,882,095
INFO 2019-07-02 03:35:50 AbstractAlignmentMerger 24324507 Reads have been unmapped due to being suspected of being Cross-species contamination.
INFO 2019-07-02 03:36:12 AbstractAlignmentMerger Merged 374,000,000 records. Elapsed time: 02:26:43s. Time for last 1,000,000: 21s. Last read position: */*
INFO 2019-07-02 03:36:12 AbstractAlignmentMerger 24390885 Reads have been unmapped due to being suspected of being Cross-species contamination.
INFO 2019-07-02 03:36:35 AbstractAlignmentMerger Merged 375,000,000 records. Elapsed time: 02:27:06s. Time for last 1,000,000: 23s. Last read position: chr8:137,530,448
INFO 2019-07-02 03:36:35 AbstractAlignmentMerger 24457599 Reads have been unmapped due to being suspected of being Cross-species contamination.
[Tue Jul 02 03:36:52 UTC 2019] picard.sam.MergeBamAlignment done. Elapsed time: 147.42 minutes.
Runtime.totalMemory()=7337410560
To get help, see http://broadinstitute.github.io/picard/index.html#GettingHelp
htsjdk.samtools.util.RuntimeIOException: Write error; BinaryCodec in writemode; streamed file (filename not available)
at htsjdk.samtools.util.BinaryCodec.writeBytes(BinaryCodec.java:222)
at htsjdk.samtools.util.BlockCompressedOutputStream.writeGzipBlock(BlockCompressedOutputStream.java:429)
at htsjdk.samtools.util.BlockCompressedOutputStream.deflateBlock(BlockCompressedOutputStream.java:392)
at htsjdk.samtools.util.BlockCompressedOutputStream.write(BlockCompressedOutputStream.java:291)
at htsjdk.samt
A few notes:
1. I've tried increasing the java_opt and mem_size inputs of MergeBamAlignment to "-Xms7000m" and "7500 MB" respectively. Unsurprisingly, this didn't work, as the error seems to be in writing the output of MergeBamAlignment.
2. I've shared the workspace "pd-wgs-workspace" in the project "pd-wgs-project" with the FireCloud support group. The offending sample is "HDF111iPS501_D0_r1".
Thank you!
Lee
Comments
Hi Aspen Neuro,
I am taking a look at this and will get back to you shortly!
Hi Aspen Neuro,
Is the relevant submission ID 08836390-ebf4-4cd3-84cf-94b6e4ed4b28, which was run on June 27th?
Update: Just so you know, a newer version of this tool can be found in Dockstore and exported to your Terra workspace. You can also find the most up-to-date versions of tools and their configurations here.
Thanks!
Hello Tiffany,
That submission ID works. You can also look at 32928054-6abf-4275-ae63-c1f84fca4f51.
I ran a diff on the newest version of the tool, and the WDL file is identical to the one I'm using right now (with the exception of the maxRetries parameter I added to SortAndFixTags). The configuration settings are almost identical as well (the differences are due to me upping the memory on some tasks).
Thanks!
Hello Tiffany,
I fixed the issue. It turns out the stderr file was truncated for many of the jobs; the full stderr file can be found in some of the later failed runs.
The full stderr showed a "No Space Left On Device" error, so I increased the "flowcell_medium_disk" setting, and MergeBamAlignment then ran without error.
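For anyone hitting the same write error, here is a minimal Python sketch of the two checks that would have shortened the debugging: scanning a downloaded stderr file for the disk-full message, and making a rough guess at how much disk the task needs. The function names, the 1.5x output multiplier, and the 20 GB headroom are my own illustrative assumptions, not values taken from the workflow or from Terra.

```python
import math
import re

# The exact wording of a disk-full message varies by tool; matching this one
# phrase case-insensitively is an assumption, not an exhaustive check.
DISK_FULL_PATTERN = re.compile(r"no space left on device", re.IGNORECASE)


def find_disk_full_lines(stderr_text):
    """Return (line_number, line) pairs in the log text that mention a full disk."""
    return [
        (number, line)
        for number, line in enumerate(stderr_text.splitlines(), start=1)
        if DISK_FULL_PATTERN.search(line)
    ]


def estimate_disk_gb(input_sizes_gb, output_multiplier=1.5, headroom_gb=20):
    """Rough disk estimate in GB for a task that reads inputs and writes one output.

    Assumes the output is about output_multiplier times the combined input size
    and adds fixed headroom. Both defaults are illustrative guesses.
    """
    total_inputs = sum(input_sizes_gb)
    return math.ceil(total_inputs * (1 + output_multiplier) + headroom_gb)
```

To use it, read the full stderr file from a failed run into a string and pass it to find_disk_full_lines; an empty result on a truncated log does not rule out a disk problem. The estimate from estimate_disk_gb (given the sizes of the unmapped and aligned BAMs) is only a starting point for choosing a value for "flowcell_medium_disk".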
Hello Aspen,
Thanks for letting us know that you were able to resolve this!
Sushma
Hi Aspen Neuro and Sushma Chaluvadi, is there a tutorial or walkthrough for running the "processing-for-variant-discovery-gatk4" workflow on Terra with different read groups?
thanks
sam