Mitochondria-SNPs-Indels-hg38 copy
Hi Beri,
Where can I find the most recent mitochondrial analysis pipeline? Is it in this Terra workspace: https://app.terra.bio/#workspaces/gatk-workflows/Mitochondria-SNPs-Indels-hg38%20copy ? If so, is this pipeline sufficient to run the full analysis as performed in the paper (https://www.biorxiv.org/content/10.1101/2021.07.23.453510v1.full)?
I first came across this pipeline, https://github.com/gatk-workflows/gatk4-mitochondria-pipeline, from gatk-workflows on GitHub, and I ran it locally.
The paper describes the full analysis, including filtering and annotation, as follows: "The Mutect2 pipeline is available through GATK at https://github.com/broadinstitute/gatk/blob/master/scripts/mitochondria_m2_wdl/MitochondriaPipeline.wdl (the data available in gnomAD v3.1 was generated using https://portal.firecloud.org/?return=terra#methods/mitochondria/MitochondriaPipeline/25), and the Hail scripts used for combining the VCFs, filtering samples and variants, adding annotations, and performing analyses can be found at https://github.com/broadinstitute/gnomad-mitochondria."
Could you please point me to the right pipeline workflow? If I run the workflow in Terra, can its outputs be used to run the Hail scripts?
Comments
Hi Halima Alachram,
Thanks for writing in. The gatk-workflows/gatk4-mitochondria-pipeline repo has been archived and is no longer being updated.
The GitHub repo that the paper references (broadinstitute/gatk) is the correct place to find the most up to date version of the Mitochondria pipeline (https://github.com/broadinstitute/gatk/blob/master/scripts/mitochondria_m2_wdl/MitochondriaPipeline.wdl). The Mitochondria-SNPs-Indels-hg38 featured workspace also pulls from the broadinstitute/gatk repo so it uses the same workflow.
Lastly, yes, if you run the workflow on Terra, the outputs can be used to run the Hail scripts in https://github.com/broadinstitute/gnomad-mitochondria.
Please let me know if you have any other questions.
Best,
Samantha
Hi Samantha (she/her),
Thank you for the clarification. It is clearer now.
I am trying to run the Hail scripts locally (as far as I can tell, this is the only way to use them), but I have issues with the inputs.
1. I am not able to download the following resources from the Google Cloud bucket.
RESOURCES = {
    "variant_context": "gs://gnomad-public-requester-pays/resources/mitochondria/variant_context/chrM_pos_ref_alt_context_categories.txt",
    "phylotree": "gs://gnomad-public-requester-pays/resources/mitochondria/phylotree/rCRS-centered_phylo_vars_final_update.txt",
    "pon_mt_trna": "gs://gnomad-public-requester-pays/resources/mitochondria/trna_predictions/pon_mt_trna_predictions_08_27_2020.txt",
    "mitotip": "gs://gnomad-public-requester-pays/resources/mitochondria/trna_predictions/mitotip_scores_08_27_2020.txt",
}
I found the last three files in https://github.com/broadinstitute/gnomad-mitochondria/tree/main/gnomad_mitochondria/resources but not the variant_context file.
2. Is there any documentation on how to get the "freemix_percentage: Uploaded to Terra by user, can be calculated with VerifyBamIDb)" to use it properly in the script? What genotype information should be used to apply VerifyBamIDb?
Best,
Halima
Hi Halima Alachram,
Sorry for the delayed response.
1. Those files are stored in a Google bucket, so you should be able to download them to local storage using the gsutil cp command. For more information on using gsutil cp, see the following article: https://support.terra.bio/hc/en-us/articles/360024056512. It looks like the bucket has requester pays enabled, so you'll need to provide a billing project to charge for the download by passing the -u flag in your gsutil cp command (see https://cloud.google.com/storage/docs/using-requester-pays#using).
2. I would suggest taking a look at the gnomAD documentation linked in the README section (https://gnomad.broadinstitute.org/news/2020-11-gnomad-v3-1-mitochondrial-dna-variants/) or reaching out to the gnomAD team directly (gnomad@broadinstitute.org).
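For point 1, here is a minimal sketch of how the requester-pays download could be scripted. It is only an illustration: it assumes gsutil is installed and authenticated, "my-billing-project" is a placeholder for your own Google Cloud billing project, and the helper names (build_gsutil_command, download_all) are made up for this example rather than taken from the gnomad-mitochondria repo. The resource paths are the ones you listed above.

```python
import shlex
import subprocess

# Resource URIs exactly as listed earlier in this thread.
RESOURCES = {
    "variant_context": "gs://gnomad-public-requester-pays/resources/mitochondria/variant_context/chrM_pos_ref_alt_context_categories.txt",
    "phylotree": "gs://gnomad-public-requester-pays/resources/mitochondria/phylotree/rCRS-centered_phylo_vars_final_update.txt",
    "pon_mt_trna": "gs://gnomad-public-requester-pays/resources/mitochondria/trna_predictions/pon_mt_trna_predictions_08_27_2020.txt",
    "mitotip": "gs://gnomad-public-requester-pays/resources/mitochondria/trna_predictions/mitotip_scores_08_27_2020.txt",
}


def build_gsutil_command(uri, billing_project, dest_dir="."):
    """Return the argv list that copies one requester-pays object locally."""
    # -u is a top-level gsutil option, so it goes before the cp subcommand.
    return ["gsutil", "-u", billing_project, "cp", uri, dest_dir]


def download_all(billing_project, dest_dir=".", dry_run=True):
    """Print (dry_run=True) or execute the copy command for every resource."""
    commands = [
        build_gsutil_command(uri, billing_project, dest_dir)
        for uri in RESOURCES.values()
    ]
    for argv in commands:
        if dry_run:
            # Show the shell command without contacting Google Cloud.
            print(shlex.join(argv))
        else:
            # Raises CalledProcessError if gsutil exits with an error.
            subprocess.run(argv, check=True)
    return commands
```

Calling download_all("my-billing-project") only prints the four commands so you can check them first; passing dry_run=False actually runs gsutil, and the download is then billed to the project you supply.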
Best,
Samantha