FASTQ file format for Paired-FASTQ-to-unmapped-BAM pipeline
Hello,
I am using zipped FASTQ files as inputs for this pipeline and getting an error. Here is the exception.
htsjdk.samtools.SAMException: Sequence header must start with @: [unreadable binary data]
I just want to check whether the pipeline can process files named *.fastq.gz.1 and *.fastq.gz.2.
Do I need to use additional INPUT params to handle gzip files?
Regards,
Vithal
Comments
Hi Vithal,
Happy to see if I can help here. Can you point me to the pipeline you are referring to? I do not see it listed as a featured workspace or workflow.
Kind regards,
Jason
Hi Jason,
The workspace name is: Sequence-Format-Conversion
The tools you need to convert various sequencing file formats to GATK analysis ready input formats. Plus a validation tool to confirm that SAM or BAM files are in the proper format.
1) Interleaved FASTQ to paired FASTQ
2) Paired FASTQ to unmapped BAM
3) BAM to unmapped BAM
4) CRAM to BAM files from sequencer output for use in GATK analysis tools.
The Validate BAM tool is also added to confirm proper formatting of SAM or BAM files.
I am trying #2 Paired FASTQ to unmapped BAM by passing FASTQ files in gzip format.
Thanks,
Vithal
Hi Vithal,
If you look further down the page for the workspace description, you can see this paragraph:
For more details on input types typically used by GATK please review the following article: What Input Files does the GATK Accept/Require?
If you follow this link, it brings you to a post that describes what types of files are accepted.
Based on this information, it appears that gzipped FASTQ files will not work, as you are currently experiencing. I recommend following that link for more information; it also covers preparing FASTA reference sequences for use with GATK.
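If the files do need to be uncompressed first, a small script can do that before you upload them. Below is a minimal sketch in Python. It is not part of the workspace, and the function names and file names are hypothetical. It checks for the gzip magic bytes rather than trusting the file extension, writes an uncompressed copy, and confirms that the result starts with @, which is what the exception above complains about.

```python
import gzip
import shutil
from pathlib import Path

GZIP_MAGIC = b"\x1f\x8b"  # the first two bytes of every gzip stream


def is_gzipped(path):
    """Return True if the file starts with the gzip magic bytes."""
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC


def to_plain_fastq(src, dst):
    """Write an uncompressed copy of src to dst (hypothetical helper).

    Raises ValueError if the result does not start with '@', the
    character a FASTQ sequence header is expected to begin with.
    """
    src, dst = Path(src), Path(dst)
    if is_gzipped(src):
        # Decompress in a streaming fashion so large files fit in memory.
        with gzip.open(src, "rb") as fin, open(dst, "wb") as fout:
            shutil.copyfileobj(fin, fout)
    else:
        shutil.copyfile(src, dst)
    with open(dst, "rb") as handle:
        if handle.read(1) != b"@":
            raise ValueError(f"{dst} does not start with '@'")
    return dst
```

For example, to_plain_fastq("sample.fastq.gz.1", "sample_1.fastq") would produce a plain-text file for the first read of a pair; the file names here are placeholders for your own.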
Kind regards,
Jason
Thank you, Jason. I will read the documentation and make the necessary changes. It would be nice if the tool accepted zipped files; I hope future versions will allow this.
Hi Vithal,
If you are interested, you can make a feature request for GATK here: https://github.com/broadinstitute/gatk/issues
Kind regards,
Jason