Are resource quotas slowing your analysis down?

Allie Cliffe

Does your analysis seem "stuck" - running or progressing unusually slowly? Did you get an error message that includes the word "quota" when trying to run a large analysis (workflow or interactive)? Find out if you've exceeded your Google Cloud resource quotas and how they can affect your work on Terra. 

Overview: Resource quotas and why they matter in Terra

Google Cloud resource quotas limit how many resources a single Google project can use at any given time. The Google Cloud resources consumed in Terra include processing units (CPUs and GPUs) and persistent disks (PDs).

Why does Google have resource quotas?

Quotas prevent unforeseen spikes in usage, making sure resources are available to the community at all times. To learn more, see Google's resource quotas documentation.

How resource quotas impact your analyses 

Resource quotas affect your ability to spin up a large virtual machine (VM) to run a workflow or interactive analysis. They can also impact the speed of your analysis since WDL tasks will slow down or pause altogether as you run up against a compute or disk quota.

What is affected by resource quotas?

All workflows (or "methods") and Cloud Environments that run in Terra are affected by Google Cloud compute and disk quotas. If you are close to or over a resource quota, Terra cannot secure the CPUs, GPUs, or PDs you requested. The quotas that matter are:

  • CPUs: how many CPUs you can use at once across all tasks
  • GPUs: how many GPUs you can use at once across all tasks
  • Preemptible CPUs: the pool of CPUs that can only be used by preemptible instances. To learn more, see Google's documentation on this quota and on preemptible instances.
  • Persistent disk standard (GB): how much total standard (non-SSD) disk you can have attached to your task VMs at once
  • Persistent disk SSD (GB): how much total SSD disk you can have attached to your task VMs at once
  • Local SSD (GB): how much SSD is attached directly to the server running the task VMs. This quota only applies if your task uses local SSD. You can learn more in Google's documentation.
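To see which of these quotas you are close to, you can compare usage against the limit for each one. The sketch below is illustrative only: it assumes you already have the quota entries as a list of records with `metric`, `limit`, and `usage` fields (the shape Compute Engine uses when it reports quotas). The sample numbers, the 80% threshold, and the function name `quotas_near_limit` are made up for the example.

```python
from typing import Dict, List


def quotas_near_limit(quotas: List[Dict], threshold: float = 0.8) -> List[str]:
    """Return the metric names whose usage is at or above `threshold` of the limit."""
    flagged = []
    for quota in quotas:
        limit = quota.get("limit", 0)
        if limit <= 0:
            # Skip entries with no positive limit; a usage ratio is undefined for them.
            continue
        if quota.get("usage", 0) / limit >= threshold:
            flagged.append(quota["metric"])
    return flagged


# Hypothetical sample data, not real quota values.
sample = [
    {"metric": "CPUS", "limit": 24.0, "usage": 24.0},
    {"metric": "PREEMPTIBLE_CPUS", "limit": 0.0, "usage": 0.0},
    {"metric": "DISKS_TOTAL_GB", "limit": 4096.0, "usage": 1000.0},
    {"metric": "SSD_TOTAL_GB", "limit": 500.0, "usage": 450.0},
]

print(quotas_near_limit(sample))  # ['CPUS', 'SSD_TOTAL_GB']
```

In this sample, the CPU quota is fully used and the SSD quota is 90% used, so new tasks that request either resource would likely wait or fail.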

What decides your quota?

Google enforces default resource quotas (i.e., limits on how much of a resource can be used by a single Google project) based on the Google Cloud billing reputation of the Cloud Billing account owner.

What does your quota cover?

In Terra, these limits apply per workspace (for workspaces created after September 27, 2021) or per Terra Billing project (for workspaces created before September 27, 2021).

If you (your Google ID) are new, you will have the default quota.

As you use (and pay for) Google Cloud resources, your quota will increase. You can also request an increase (see instructions in How to troubleshoot and fix stalled workflows).

Symptoms of bumping up against a resource quota

Quota limits are not always easy to diagnose! Below are some behaviors and error messages you may see after launching a workflow if your quota does not have enough resources (i.e., VM compute or disk capacity) available:

  • Tasks within your workflow progress slowly (i.e., take a long time to go from queued to running) while they wait for quota to become available.
    For example, if you requested 1,000 tasks with eight CPUs each, and your quota allows 24 CPUs at once, you can only run three tasks at a time. Every other task is queued until a running task finishes.

  • A task in your workflow fails when it requests more resources than your quota allows.
    For example, if you requested 60 CPUs in your task and your quota is capped at 24 CPUs at once, your workflow may fail to launch.
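The arithmetic behind both symptoms can be sketched in a few lines. This is a simplified model, not how Terra schedules work: it assumes every task requests the same resources and that tasks run in equal-length "waves". The function names and resource keys are hypothetical.

```python
import math


def max_concurrent_tasks(quota: dict, per_task: dict) -> int:
    """Return how many identical tasks fit in the quota at once.

    A result of 0 means a single task requests more than the quota allows.
    `per_task` must name at least one resource with a positive request.
    """
    return min(
        int(quota[resource] // amount)
        for resource, amount in per_task.items()
        if amount > 0
    )


def waves_needed(total_tasks: int, concurrent: int) -> int:
    """Return how many rounds are needed to run all tasks, `concurrent` at a time."""
    if concurrent <= 0:
        raise ValueError("a single task does not fit in the quota")
    return math.ceil(total_tasks / concurrent)


quota = {"cpus": 24}

# First symptom: 1,000 tasks with eight CPUs each run three at a time.
slots = max_concurrent_tasks(quota, {"cpus": 8})
print(slots)                      # 3
print(waves_needed(1000, slots))  # 334

# Second symptom: one 60-CPU task never fits in a 24-CPU quota.
print(max_concurrent_tasks(quota, {"cpus": 60}))  # 0
```

Under these assumptions, a larger quota shortens the first analysis roughly in proportion to the number of tasks that can run at once, while the second analysis cannot start at all until the quota is raised or the task requests fewer CPUs.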

Check your quota and ask for an increase

1. Confirm whether a resource quota is keeping your analysis from running efficiently by following the instructions in How to troubleshoot and fix stalled workflows.

When to request more resource quota

If you are seeing errors
If your workflow fails because a task requested more resources than your quota allows, you will see quota errors or messages in your logs. In that case, you will need to increase your resource quota.

If you need to see results faster
In many cases, if you exceed your resource quota, your analysis will simply run more slowly. This may be acceptable if you are not pressed for time. If your workflow is stalled and you need it to progress, you may want to request an increase.

2. Request a quota increase in the Google Cloud console following the instructions in How to troubleshoot and fix stalled workflows (step 4).

Curious what is happening behind the scenes? See How the workflow system works.
