Where did all my credits go?
I decided to finally give cloud computing a try using the $300 "trial use" credit that Google offers for new users. I wanted to see how far I could get with the GATK4 joint genotyping workflow. This is a summary of what I did (all workflows run via the Terra GUI):
1. Created a Google storage bucket.
2. Uploaded 469 BAM files (totaling ~3 TB) and my reference genome (mosquito) to the bucket.
3. Launched a few small tests of HaplotypeCaller on some tiny test intervals of the genome.
4. Ran HaplotypeCaller successfully on one BAM file (completed in ~14 hours).
5. Ran HaplotypeCaller successfully on a batch of 10 more BAM files (completed in <24 hours).
6. Downloaded the resulting g.vcf.gz files from the Google bucket to my local file system.
Now I find that I have used $263 of my $300 credit. That rather surprised me. My question is: did I do something wrong, or is that really how much cloud computing costs? Since my ultimate goal was to do joint genotyping on the full set of 469 genomes, this doesn't scale very well for me. Extrapolating linearly, the full analysis would end up costing me over $10,000. Comments? Suggestions?
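For reference, the linear extrapolation mentioned above can be sketched as follows. It uses only the figures quoted in the post ($263, 11 genomes processed, 469 genomes in total); the function name is illustrative. It also assumes the whole amount is attributable to the 11 HaplotypeCaller runs, which overstates the per-genome cost, since storage and the small test runs are charged against the same credit.

```python
def extrapolate_cost(spent_usd, genomes_done, genomes_total):
    """Scale the observed spend linearly to the full cohort."""
    return spent_usd / genomes_done * genomes_total


# Figures quoted in the post: $263 for 1 + 10 = 11 genomes, 469 genomes overall.
estimate = extrapolate_cost(spent_usd=263, genomes_done=11, genomes_total=469)
print(f"Projected HaplotypeCaller cost: ${estimate:,.0f}")  # $11,213
```

This covers per-sample HaplotypeCaller runs only; the joint genotyping step itself is not part of the estimate.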
Comments
5 comments
Hi Marc Crepeau,
Thanks for writing in. Can you share your workspace with GROUP_FireCloud-Support@firecloud.org by clicking the Share button in your workspace? The Share option is in the three-dots menu at the top-right.
Let us know the workspace name, as well as the relevant submission and workflow IDs. We’ll be happy to take a closer look as soon as we can.
Best,
Samantha
Okay, I did that, thanks! Any help understanding where the bulk of my expense lies would be appreciated. Unfortunately the Billing console doesn't give any itemized details about how credit is spent. It looks like I'd get plenty of information if it were my own money, but for the trial-offer credit, all I get is how much has been spent, without any details.
Hi Marc Crepeau,
Can you please let us know the name of your workspace?
Thanks,
Samantha
Oh sorry, yes, you asked for that in your first message. The workspace is HaplotypeCaller_test, and I had a little workflow called HaplotypeCallerBatch that I was testing. The runs that worked were 3bac8b64-4980-46d7-985a-9377292712f0 for a single genome on Feb. 16th and d5e3f428-401b-420f-a4cd-61447e0533c8 for a batch of 10 genomes on Feb. 17th.
Okay, it looks like this was my bad. I was misinterpreting the info on the billing page. On running some further workflows, I see that the dollar amount is going down, not up. So the $263 was how much credit I had *remaining*, not how much I had spent. I had spent $300 - $263 = $37, which is much more reasonable for haplotype calling 11 mosquito genomes. So I think this ticket can be closed. I would still comment that it would be nice if the billing page gave itemized details for how the credit is consumed, since one of the factors people will be considering during a trial period is the eventual cost of their anticipated analyses.
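As a rough sketch of what the corrected figure implies for the full cohort, the snippet below converts the remaining balance into spend and rescales it to 469 genomes. The constant and function names are illustrative, and the same caveats apply: the $37 also covers storage and the small test runs, and the projection assumes cost grows linearly with the number of BAM files.

```python
TRIAL_CREDIT_USD = 300  # initial trial credit, as stated in the post


def spend_from_balance(remaining_usd, initial_usd=TRIAL_CREDIT_USD):
    """The billing page showed the remaining credit, so spend is the difference."""
    return initial_usd - remaining_usd


spent = spend_from_balance(263)   # $37 actually spent
per_genome = spent / 11           # 11 genomes: 1 single run + 1 batch of 10
projected = per_genome * 469      # naive linear scaling to all 469 BAM files

print(f"Spent so far: ${spent}")
print(f"Approx. cost per genome: ${per_genome:.2f}")
print(f"Projected HaplotypeCaller cost for 469 genomes: ${projected:,.0f}")
```

Under these assumptions the per-sample calling comes out at roughly $1,600 rather than over $10,000; joint genotyping and long-term storage of the BAM files would add to that.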