Docker Image Publishers Tips

Allie Hajian

Learn how to minimize network data transfer out (formerly "egress") charges when sharing tools with publicly available Docker images.

Source material for this article was contributed by Matt Bookman and the Verily Life Sciences solutions team as part of the design and engineering rollout of Terra support for data regionality.

Overview

Sharing tools as Docker images can reduce the time and effort it takes the research community to build and configure software. Google Container Registry and Google Artifact Registry are great places to share your Docker images, particularly for workflows run on Google Cloud and in Terra.

However, when you share your images via Google Container Registry or Google Artifact Registry, you can incur network data transfer charges. Although end users trigger these charges when they pull your image, the charges are billed to you as the Docker image owner. Unlike Google Cloud Storage, neither Google Container Registry nor Google Artifact Registry has a requester pays option.

These tips were motivated by the ongoing addition of regionality capabilities to Terra. For more information on regionality, see US Regional or Multiregional US buckets: tradeoffs and Customizing where your data are stored and analyzed.

Scope of charges - example

As an example of the data transfer cost risk, consider a 1 GB Docker image. This is a fairly large image, but not uncommon for toolbox-style images or those not optimized for size.

Data moves between different locations on the same continent

| Size* | Price** (per GB) | Data transfer charge for 1,000 tasks | Data transfer charge for 10,000 tasks | Data transfer charge for 100,000 tasks |
|---|---|---|---|---|
| 1.0 GB | $0.01 | $10 | $100 | $1,000 |

Data moves between different continents and neither is Australia

| Size* | Price** (per GB) | Data transfer charge for 1,000 tasks | Data transfer charge for 10,000 tasks | Data transfer charge for 100,000 tasks |
|---|---|---|---|---|
| 1.0 GB | $0.08 | $80 | $800 | $8,000 |

Data moves between different continents and one is Australia

| Size* | Price** (per GB) | Data transfer charge for 1,000 tasks | Data transfer charge for 10,000 tasks | Data transfer charge for 100,000 tasks |
|---|---|---|---|---|
| 1.0 GB | $0.15 | $150 | $1,500 | $15,000 |

Consider whether your organization and budget can absorb occasional accidental data transfer charges whenever anyone you share your Docker image with pulls it across regions or, worse, a systematic recurrence of such charges.
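The arithmetic behind these figures is image size times per-GB price times number of tasks. The sketch below reproduces it, assuming each task pulls the full image exactly once with no caching. The function name and the `PRICE_PER_GB` dictionary are illustrative, and the prices are copied from this article, so check current Google Cloud pricing before relying on them.

```python
def transfer_charge(image_size_gb: float, price_per_gb: float, task_count: int) -> float:
    """Estimate the data transfer charge, assuming every task pulls the image once."""
    return round(image_size_gb * price_per_gb * task_count, 2)


# Per-GB prices as quoted in this article; treat them as illustrative only.
PRICE_PER_GB = {
    "same continent": 0.01,
    "different continents, neither is Australia": 0.08,
    "different continents, one is Australia": 0.15,
}

if __name__ == "__main__":
    image_size_gb = 1.0  # the 1 GB example image
    for scenario, price in PRICE_PER_GB.items():
        for tasks in (1_000, 10_000, 100_000):
            charge = transfer_charge(image_size_gb, price, tasks)
            print(f"{scenario}: {tasks:,} tasks -> ${charge:,.2f}")
```

Changing `image_size_gb` shows how directly image size drives the charge: halving the image halves every figure.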

Options for making Docker images available

When making Docker images available, you have several choices, each with different benefits and costs to you and the community. This article looks at the following options:

  • Publish a Dockerfile
  • Publish to a non-Google Docker container registry
  • Publish to a Google Docker container registry (experimental)

Publish a Dockerfile

To avoid building new Docker images and managing hosting and access controls yourself, one option is to publish a Dockerfile that contains the instructions for building the image your analysis uses. With this Dockerfile, members of the community can build the Docker image themselves and then follow the instructions in Publish a Docker container image to Google Container Registry (GCR).
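As a rough illustration of what a community member would run, the sketch below assembles the two Docker CLI commands involved: `docker build` and `docker push`. It only constructs and prints the commands rather than executing them. The project ID `my-project`, image name `my-tool`, and tag are hypothetical placeholders, and the sketch assumes Docker is already authenticated to the registry.

```python
import shlex


def build_and_push_commands(project_id: str, image_name: str,
                            tag: str = "latest", context_dir: str = ".") -> list:
    """Return the docker commands that build an image from a Dockerfile and push it to GCR."""
    # GCR image references take the form gcr.io/PROJECT_ID/IMAGE:TAG.
    image_ref = f"gcr.io/{project_id}/{image_name}:{tag}"
    return [
        ["docker", "build", "-t", image_ref, context_dir],  # build from the Dockerfile in context_dir
        ["docker", "push", image_ref],                      # upload the image to the registry
    ]


if __name__ == "__main__":
    # Hypothetical project and image names, for illustration only.
    for command in build_and_push_commands("my-project", "my-tool", tag="1.0"):
        print(shlex.join(command))
```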

Pros

  • Low overhead for you, the code owner/publisher
  • No risk of unexpected Cloud charges

Cons

  • (Likely) community member learning curve for building and publishing Docker images
  • Community member needs to create infrastructure for publishing, such as creating a non-Terra Google-native Cloud project (for Google Container Registry)
  • Distributed cost across the community for storing multiple Docker images

Publish to a non-Google Docker container registry

Several Docker image repositories are available with varying costs associated with image building, storage, and serving.

Some popular registries (many more exist)

Container registry services apply varying levels of rate limiting to handle surges in Docker image requests. For example, Docker Hub introduced rate limiting in November 2020 for anonymous and free authenticated use (see Understanding Docker Hub Rate Limiting). When this change went into effect, popular workflows run at very large scale on Terra started failing and needed to be retried.

If you're interested in using a non-Google container registry, please review its cost structure and rate limits.

If you put your Docker image in a rate-limited repository and members of the community see failures when running at scale, they can copy your image by following the instructions in Publish a Docker container image to Google Container Registry (GCR).
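Copying an image into a user's own registry typically comes down to three Docker CLI steps: pull, tag, and push. The sketch below only constructs and prints those commands. The source image `example/my-tool:1.0` and the project ID `my-project` are hypothetical placeholders, and the sketch assumes Docker is already authenticated to the target registry.

```python
import shlex


def copy_image_commands(source_ref: str, project_id: str,
                        image_name: str, tag: str) -> list:
    """Return the docker commands that copy an existing image into GCR."""
    # GCR image references take the form gcr.io/PROJECT_ID/IMAGE:TAG.
    target_ref = f"gcr.io/{project_id}/{image_name}:{tag}"
    return [
        ["docker", "pull", source_ref],             # fetch the image from the original registry
        ["docker", "tag", source_ref, target_ref],  # give the local image its new name
        ["docker", "push", target_ref],             # upload it to the user's own registry
    ]


if __name__ == "__main__":
    # Hypothetical source image and project, for illustration only.
    for command in copy_image_commands("example/my-tool:1.0", "my-project", "my-tool", "1.0"):
        print(shlex.join(command))
```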

Pros

  • No risk of unexpected Cloud charges

Cons

  • Potential need for community members running at scale to copy the image

Publish to a Google Docker container registry (experimental)

If you want to publish Docker images to Google Container Registry or Artifact Registry for the best integration and performance within Google Cloud, you can use VPC Service Controls within a Cloud Organization to put a service perimeter around the Cloud project that contains your Docker image registry.

Please refer to Configure GCR/Artifact Registry to prevent data transfer charges for the most recent recommendations.
