In the Google Cloud Platform (GCP), quotas specify how many resources, such as central processing units (CPUs) and persistent disks (PDs), can be used by a Google Project at any given time. Quotas prevent unforeseen spikes in usage so resources are available to the community at all times. For information on quotas, including how to increase your quota, read on.
- How do I request a quota increase?
- Why do quotas matter in Terra?
- When would I need to change quotas?
- How do I check my quotas?
- How much quota will I need?
How do I request a quota increase?
You can make quota requests by sending an email to firstname.lastname@example.org.
Please be sure to provide the following information:
- How many CPUs you will need at any given time
- How much persistent disk you will need
- How much (if any) SSD persistent disk you will need
Why do quotas matter in Terra?
Each computational task defines the tool used for analysis as well as the number of CPUs and the amount of persistent disk (PD) needed to compute results. Workflows (also called methods) are run from within workspaces, which are paid for by a Terra Google billing project.
All workflows (or "methods") run from a workspace are affected by quotas on the Google project. The Google project is tied to a Google billing account, which is charged for data storage and compute costs. Google enforces default Compute Engine quotas for Terra billing projects based on a user's billing reputation.
When would I need to change quotas?
You may experience one of the following after launching a workflow analysis if there is not enough quota:
- Tasks within your workflow will wait on quota availability. For example, if you requested 1,000 tasks with eight CPUs each, and your quotas allow 24 CPUs at once, you can only run three tasks at a time. Each subsequent task is queued.
- A task within your workflow may fail because it requested more resources than your quotas allow. For example, if a task requests 60 CPUs and your quota is capped at 24 CPUs at once, that task may fail to launch.
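The queuing arithmetic above can be sketched in Python. This is only an illustration of how quota limits throttle concurrency; the quota and task numbers are the examples from this section, not values read from your project:

```python
import math

def concurrent_tasks(cpu_quota, cpus_per_task):
    """How many tasks can run at once under a CPU quota."""
    return cpu_quota // cpus_per_task

def queue_waves(total_tasks, cpu_quota, cpus_per_task):
    """How many 'waves' of tasks it takes to finish them all,
    assuming each wave fills the quota before the next starts."""
    at_once = concurrent_tasks(cpu_quota, cpus_per_task)
    return math.ceil(total_tasks / at_once)

# Example from above: 1,000 tasks of 8 CPUs each under a 24-CPU quota.
print(concurrent_tasks(24, 8))   # 3 tasks run at a time
print(queue_waves(1000, 24, 8))  # the rest queue up across 334 waves
```

Everything past the first three tasks simply waits, so the workflow still finishes -- it just takes roughly 334 times as long as it would with unlimited quota.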
Please note that unless you are seeing errors, you do not need to update quotas - your analysis will simply run more slowly. If your analysis runs more slowly than you expect, or if you see errors/messages related to quota in your logs, you may want to request an increase.
How do I check my quotas?
If you are a Terra billing project owner, you can check your Google project's quotas at a URL like this: https://console.cloud.google.com/iam-admin/quotas?project=project, where "project" is the name of your Terra billing project.
Here you'll see a long list of quotas for the project. In this example, the CPU quota in region "us-central1" is maxed out (the orange bar near 100%), and you'll need to request more quota (see steps above).
Note that quotas are defined per region. To run your analysis across multiple regions (e.g. us-east1 and us-central1), you need to request larger quota in both.
Terra cares about the following Google Compute Engine API quotas:
- CPUs: how many CPUs you can use at once across all of your tasks
- Preemptible CPUs: the pool of CPUs used only by preemptible instances
- Persistent disk standard (GB): how much total non-SSD disk you can have attached at once to your task VMs
- Persistent disk SSD (GB): how much total SSD disk you can have attached at once to your task VMs
- Local SSD (GB): how much SSD is attached directly to the server running the task VMs; only applicable if your task uses local SSD
How much quota will I need?
The right amount of quota is a function of the number of workflows being launched, the number of concurrent tasks running within each workflow, and the resources being requested by those tasks.
To calculate the quota needed for the workflows, you need to do a bit of diving into your WDL to examine what it is doing.
For example, say we have a three-task WDL that will run on one to many samples. We need to look across these tasks to determine what the maximum amount of CPU and PD we expect to need at any given time.
- Task 1: uses 10 CPUs and 10GB of PD
- Task 2: uses 1 CPU, 1GB of PD and scatters 10-ways wide
- Task 3: uses 10 CPU, 10GB of PD and scatters 10-ways wide
In this example, tasks 1 and 2 use the same total resources (10 CPUs and 10 GB of PD each), because task 2's 10-way scatter multiplies its per-shard request by ten. Task 3, however, uses more resources than task 1 or 2.
When task 3 runs on a single sample, its 10-way scatter requests a total of 100 CPUs and 100 GB of PD. Running it on ten samples at once multiplies those resources by ten: 1,000 CPUs and 1 TB of persistent disk. If our current quota is set at 24 CPUs and 100 GB of PD and we want this workflow to run as quickly as possible, we will need to request at least 1,000 CPUs and 1 TB of PD.
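The peak-usage calculation above can be sketched as a few lines of Python. The task numbers come from the three-task example; the dictionary layout and the `peak_quota` helper are illustrative, not WDL syntax or a Terra API:

```python
# Each task: (CPUs per shard, GB of PD per shard, scatter width).
tasks = {
    "task1": (10, 10, 1),
    "task2": (1, 1, 10),
    "task3": (10, 10, 10),
}

def peak_quota(tasks, samples):
    """Worst-case CPUs and PD (GB) if every shard of the widest
    task runs at once across all samples."""
    peak_cpus = max(cpu * width for cpu, _, width in tasks.values()) * samples
    peak_pd = max(pd * width for _, pd, width in tasks.values()) * samples
    return peak_cpus, peak_pd

cpus, pd_gb = peak_quota(tasks, samples=10)
print(cpus, pd_gb)  # 1000 CPUs, 1000 GB (1 TB) of PD
```

Note this assumes all samples hit the widest task simultaneously -- the true worst case. If you are willing to let some tasks queue, you can request less quota and simply wait longer.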