Running `GATK Best Practices Germline SNPs and INDELS` on a custom genome
Hello,
This question concerns the `GATK Best Practices Germline SNPs and INDELS` workspace, specifically how to run it on a custom genome for a non-model species.
Our real dataset (~40 TB in total) is too big for our cluster, and we want to test whether Terra would be a suitable platform for running GATK variant calling. I'm currently testing it on a downsized sample (just 6 lines, <120 GB in total). I was able to successfully run "paired fastq to unmapped bam" and "preprocessing for variant discovery" after replacing the reference genome with mine, but I'm stuck at "joint genotyping". When running this locally, I normally skip the VQSR step because we do not have any information about golden SNPs/indels for our reference, but this workflow does not seem to offer an option to do so. I understand that it is designed specifically for the human reference, but if you can help me change this workflow to run joint genotyping without known SNPs/indels, I would greatly appreciate it.
If not, could you please point me to any examples where this workflow has been modified for a non-human reference?
Thanks very much!
Comments
Hi Arun Seetharam,
Thank you for writing in about this! We will take a look and get back to you as soon as we can.
Kind Regards,
Anika
Hi Arun Seetharam,
How do you run joint genotyping locally? Do you have your own WDL, or are you just running GATK commands and skipping `ApplyVQSR`? If you have your own WDL, you can simply bring it onto Terra. If not, you would have to create your own copy of the joint genotyping WDL and edit it. I am not aware of any workflows modified for a non-human reference, but you could check Dockstore or the Broad Methods Repository.
Hope that is helpful!
Kind Regards,
Anika
Hi @Anika Das
Thank you for pointing me in the right direction. I was able to find a version of joint genotyping identical to what I used to run locally (a simple bash script). In case you are interested, this is the one I'm currently using: gatk4-basic-joint-genotyping
Thanks,
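For readers landing on this thread, a VQSR-free joint genotyping run of the kind discussed above might look roughly like the sketch below. It is not the linked gatk4-basic-joint-genotyping workflow, nor the original bash script. The file names (`reference.fasta`, `line1.g.vcf.gz`, and so on), the `cohort` output prefix, the `run` and `joint_genotype_cmds` helpers, and the choice of `CombineGVCFs` to merge the per-sample GVCFs are all assumptions made for illustration. By default the script only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch: joint genotyping without VQSR (no VariantRecalibrator / ApplyVQSR).
# All file names below are placeholders (assumptions), not taken from the thread.
set -euo pipefail

REF="reference.fasta"        # hypothetical custom (non-human) reference
OUT_PREFIX="cohort"          # hypothetical output prefix
# Hypothetical per-sample GVCFs, e.g. produced by HaplotypeCaller in GVCF mode.
GVCFS=("line1.g.vcf.gz" "line2.g.vcf.gz" "line3.g.vcf.gz")

# DRY_RUN=1 (the default) only prints each command; DRY_RUN=0 executes it.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

joint_genotype_cmds() {
  # Build the repeated -V arguments, one per input GVCF.
  local variant_args=()
  local g
  for g in "${GVCFS[@]}"; do
    variant_args+=(-V "$g")
  done

  # Step 1: merge the per-sample GVCFs into one multi-sample GVCF.
  run gatk CombineGVCFs -R "$REF" "${variant_args[@]}" -O "${OUT_PREFIX}.g.vcf.gz"

  # Step 2: joint-genotype the merged GVCF. The output holds raw, unfiltered
  # calls, because no known-sites resources are used anywhere in this sketch.
  run gatk GenotypeGVCFs -R "$REF" -V "${OUT_PREFIX}.g.vcf.gz" -O "${OUT_PREFIX}.vcf.gz"
}

joint_genotype_cmds
```

Setting `DRY_RUN=0` would execute the commands instead of printing them, assuming `gatk` is on the `PATH` and the reference has its index and sequence dictionary. For larger cohorts, `GenomicsDBImport` is commonly used in place of `CombineGVCFs`.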
Hi Arun Seetharam,
Great, glad to hear! If we can be of any further assistance, please let us know.
Kind Regards,
Anika