Warnings from GATK MarkDuplicatesSpark
I ran GATK MarkDuplicatesSpark on our cluster and got the following warnings. It looks like the job finished and the marked_duplicates.bam output was created. Given the warnings, I wonder whether there are any concerns for downstream BQSR analysis using the marked_duplicates.bam output.
Thank you!
Duan
Warning 1:
Runtime.totalMemory()=8747220992
Using GATK jar /home/software/gatk/gatk-4.1.2.0/gatk-package-4.1.2.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /home/software/gatk/gatk-4.1.2.0/gatk-package-4.1.2.0-local.jar MarkDuplicatesSpark -I SRR5134749_sorted_reads.bam -O SRR5134749_marked_duplicates.bam
03:45:57.003 WARN SparkContextFactory - Environment variables HELLBENDER_TEST_PROJECT and HELLBENDER_JSON_SERVICE_ACCOUNT_KEY must be set or the GCS hadoop connector will not be configured properly
03:45:57.132 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/home/software/gatk/gatk-4.1.2.0/gatk-package-4.1.2.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
May 03, 2022 3:45:58 AM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
03:45:58.852 INFO MarkDuplicatesSpark - ------------------------------------------------------------
03:45:58.853 INFO MarkDuplicatesSpark - The Genome Analysis Toolkit (GATK) v4.1.2.0
03:45:58.853 INFO MarkDuplicatesSpark - For support and documentation go to https://software.broadinstitute.org/gatk/
03:45:58.853 INFO MarkDuplicatesSpark - Executing as duan@b16.private on Linux v3.10.0-1127.19.1.el7.x86_64 amd64
03:45:58.854 INFO MarkDuplicatesSpark - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_262-b10
03:45:58.854 INFO MarkDuplicatesSpark - Start Date/Time: May 3, 2022 3:45:57 AM EDT
Warning 2:
WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Warning 3:
WARN HadoopFileSystemWrapper: Concat not supported, merging serially
Comments
Hi Duan,
Thanks for writing in. It sounds like you are running a GATK tool on your own cluster and not on Terra, is that correct? If so, a better venue for your question would be the GATK forum since this page is focused on issues related to the Terra platform. GATK support staff or a member of the community will be able to assist you.
Best,
Samantha