'Quota exceeded' error for GCR pulls (Workflows)
We've received multiple reports of users getting an error similar to the following when running workflows on Terra:
Execution failed: generic::unknown: pulling image: docker pull: running ["docker" "pull" "gcr.io/google.com/cloudsdktool/cloud-sdk:276.0.0-slim"]: exit status 1 (standard error: "Error response from daemon: Head \"https://gcr.io/v2/google.com/cloudsdktool/cloud-sdk/manifests/276.0.0-slim\": toomanyrequests: Quota exceeded for quota metric 'Requests per project in the US multi-region' and limit 'Requests per project in the US multi-region per minute' of service 'artifactregistry.googleapis.com' for consumer 'project_number:32555940559'.\n")
The quota limit mentioned in the error belongs to a Google-owned project, so we do not have access to increase it. Our engineers are aware of this issue and have filed a case with Google. We will post updates here as soon as we have them.
As a workaround, users can resubmit failed jobs, or add maxRetries to the task runtime so the failed task is retried automatically.
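For illustration, here is a minimal sketch of where maxRetries could go in a WDL task. The task name, the command, and the retry count of 3 are placeholders chosen for this example, not recommended values; the image is the one from the error message above.

```wdl
version 1.0

# Hypothetical task, shown only to illustrate where maxRetries is placed.
task example_task {
  command <<<
    gcloud --version
  >>>

  runtime {
    docker: "gcr.io/google.com/cloudsdktool/cloud-sdk:276.0.0-slim"
    # Assumed value: retry the task up to 3 times if it fails.
    maxRetries: 3
  }
}
```

Each retry is a fresh attempt at the task, so a retried task may still hit the quota if many jobs are pulling the same image at the same moment.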
Our engineers have now declared this issue an incident. Updates will be posted to the following Service Incident article: https://support.terra.bio/hc/en-us/articles/22211781195291.
Maybe time to implement in Cromwell/Terra a system to cache docker images when running large arrays of tasks? (similarly to what is done when running Cromwell with HPC shared filesystem environments) 😀