'Quota exceeded' error for GCR pulls (Workflows)
We've received multiple reports of users getting an error similar to the following when running workflows on Terra:
Execution failed: generic::unknown: pulling image: docker pull: running ["docker" "pull" "gcr.io/google.com/cloudsdktool/cloud-sdk:276.0.0-slim"]: exit status 1 (standard error: "Error response from daemon: Head \"https://gcr.io/v2/google.com/cloudsdktool/cloud-sdk/manifests/276.0.0-slim\": toomanyrequests: Quota exceeded for quota metric 'Requests per project in the US multi-region' and limit 'Requests per project in the US multi-region per minute' of service 'artifactregistry.googleapis.com' for consumer 'project_number:32555940559'.\n")
The quota named in the error belongs to a Google-owned project, so we cannot increase it ourselves. Our engineers are aware of this issue and have filed a case with Google. We will post updates here as soon as we have them.
As a workaround, users can resubmit failed jobs, or add maxRetries to the task runtime so that a failed task is retried automatically.
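For reference, here is a minimal sketch of where maxRetries goes in a WDL task. The task name, the command, and the retry count of 3 are illustrative assumptions, not values recommended by Terra; only the maxRetries line is the workaround itself, and the image is simply the one from the error above.

```wdl
version 1.0

# Hypothetical task for illustration only.
task example_task {
  command <<<
    echo "hello"
  >>>

  runtime {
    docker: "gcr.io/google.com/cloudsdktool/cloud-sdk:276.0.0-slim"
    # Cromwell runtime attribute: retry a failed task up to this many times.
    # The value 3 is an assumption; pick what suits your workflow.
    maxRetries: 3
  }
}
```

Each retry counts as a new attempt of the task, so a higher value can add cost when a task fails for reasons other than the image pull.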
Comments
Our engineers have now declared this issue an incident. Updates will be posted to the following Service Incident article: https://support.terra.bio/hc/en-us/articles/22211781195291.
Maybe it is time to implement a system in Cromwell/Terra that caches Docker images when running large arrays of tasks? (Similar to what is done when running Cromwell in HPC environments with a shared filesystem.) 😀