Large WGS cohort genotyping with 'GATK Best Practices Germline SNPs and INDELS'

Comments


  • Beri Shifaw

    Hi dsilencio,

    I've forwarded your question to the workflow authors and we'll get back to you with an answer.

  • Beri Shifaw

    The workflow authors suggested that if you plan to work with a large sample set, you try enabling GnarlyGenotyper instead of GenotypeGVCFs in the workflow. This requires reblocking the GVCFs before feeding them to the pipeline, as mentioned in the workspace notes.

    1) Please disregard the note about disk size guidelines; it was leftover wording from a previous workflow and has now been removed. There are no suggested changes to the disk size. If you run into problems, simply rerun the workflow with a larger disk size for the task that failed. With Terra's call caching, the rerun can start from the failed task of the previous submission.

    2) There is no guidance on the ideal number of shards per sample to optimize cost for a cohort of this size; the authors have not benchmarked this.
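
To illustrate the reblocking step mentioned in the reply, here is a minimal sketch (not taken from the workflow authors) that builds one `gatk ReblockGVCF` command per input GVCF. It uses only the standard GATK `-R`, `-V`, and `-O` arguments; the file names, the `reblocked` output directory, and the `.rb.g.vcf.gz` suffix are hypothetical, and the workspace notes may call for additional ReblockGVCF options that are not shown here.

```python
from pathlib import PurePosixPath


def reblock_command(gvcf, reference, out_dir="reblocked"):
    """Build (but do not run) a `gatk ReblockGVCF` command for one GVCF.

    Assumptions: the input ends in .g.vcf or .g.vcf.gz, and the output
    naming scheme (<sample>.rb.g.vcf.gz) is purely illustrative.
    """
    name = PurePosixPath(gvcf).name
    for suffix in (".g.vcf.gz", ".g.vcf"):
        if name.endswith(suffix):
            sample = name[: -len(suffix)]
            break
    else:
        raise ValueError(f"not a GVCF file name: {gvcf}")

    output = str(PurePosixPath(out_dir) / f"{sample}.rb.g.vcf.gz")
    # -R reference FASTA, -V input GVCF, -O reblocked output GVCF
    return ["gatk", "ReblockGVCF", "-R", reference, "-V", gvcf, "-O", output]


def reblock_commands(gvcfs, reference, out_dir="reblocked"):
    """Return one command (as an argument list) per input GVCF."""
    return [reblock_command(g, reference, out_dir) for g in gvcfs]
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` on a machine where `gatk` is installed, or joined with spaces to paste into a task's command block. The reblocked outputs, not the original GVCFs, would then be supplied to the joint-genotyping workflow when GnarlyGenotyper is enabled.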

