This is a step-by-step guide for 1) building and publishing a custom Docker image and 2) running a Jupyter Notebook on Terra using a Docker image modified to include additional packages.
Step 1. Clone the Git repository with the base images
First, you need to download all our base images by cloning the Terra GitHub repository. You can grab them all at once.
1.1. Select the green Code button on the GitHub repo page and copy the URL.
1.2. Open a local terminal and execute the following command, replacing LINK with the URL you copied in the previous step.
git clone LINK
Upon executing this command, you’ll see Git print progress messages as it downloads the repository, starting with Cloning into 'terra-docker'...
Now, you should have our entire collection of Docker base images on your local machine in a new directory called terra-docker. Inside this directory, you should see a folder called terra-jupyter-r. This is the image we will modify in this tutorial.
Step 2. Modify a Docker file to meet your needs
The next step is to modify one of the base Docker files and “build an instance” of your desired Docker image (this involves just one command but can take your computer some time to accomplish).
2.1. Find the folder terra-jupyter-r (for example, by typing “terra-jupyter-r” in your Finder search bar) and open the Docker file (conveniently called Dockerfile) in your favorite text editor.
If you scroll through this file, at the bottom, you should see a list of R packages, mostly installed with BiocManager. This is where you will add a new package to create your custom Docker image.
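2.2. Under the line containing
&& R -e 'BiocManager::install(c( \
add a new line containing "edgeR", punctuated like the entries around it (each entry in the list needs a separating comma, and each line needs a trailing backslash to continue the command). This will add the edgeR package, a popular Bioconductor package for the analysis of digital gene expression data, to your Docker image.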
2.3. Once you add this to the code, just click Save! No need to Save as - you shouldn't rename the file in any way.
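If you prefer the command line to a text editor, the edit from step 2.2 can be sketched with awk. This is a minimal sketch rather than part of the official workflow: the function name add_bioc_package is hypothetical, and it assumes the install list holds one quoted package per line, each ending in a comma and a backslash. Check the result by eye before you build.

```shell
#!/usr/bin/env bash
# Hypothetical helper: insert a package name on the line right after the
# first BiocManager::install(c( line of a Dockerfile.
# Assumption: list entries look like   "pkg", \   (comma plus backslash).
add_bioc_package() {
  local dockerfile=$1 pkg=$2 tmp
  tmp=$(mktemp) || return 1
  awk -v pkg="$pkg" '
    { print }
    # index() does a literal substring match, so the parentheses in the
    # marker need no regex escaping.
    !inserted && index($0, "BiocManager::install(c(") {
      printf "    \"%s\", \\\n", pkg
      inserted = 1
    }
  ' "$dockerfile" > "$tmp" || { rm -f "$tmp"; return 1; }
  # Write back in place so the file keeps its name and permissions.
  cat "$tmp" > "$dockerfile"
  rm -f "$tmp"
}
```

For example, running add_bioc_package terra-jupyter-r/Dockerfile edgeR from inside terra-docker would add the edgeR line without renaming the file.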
Step 3. Remove half-finished Docker builds on your machine
Before you build!
You are almost ready to build and push your custom Docker image! Before you execute the build command, you may need to remove any half-finished Docker builds from your machine and set up a repository in DockerHub or Google Container Registry (GCR).
We'll walk you through these steps, but if you have some Docker experience, you might not need to worry about these and can skip to Step 5. Build and push your custom Docker image (assuming you already have a Docker repository with a name and tag matching the image you are about to build).
If you have never used Docker images on the machine you’re using for this exercise, you probably don’t need to do this part. But if you have played around with Docker, you may need to follow the pruning steps below. If you skip these steps and have trouble down the road, come back to see if this helps when troubleshooting.
3.1. In your local terminal, execute the following command to see if there are any other images on your machine. Conveniently, this command can be executed while in any directory.
docker image ls
If you come up with an empty list, skip to Step 4. Set a destination for your Docker image.
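3.2. If your list ISN'T empty (and you don’t need the images listed), execute the following command. Be aware that it removes all stopped containers, unused networks, unused images, and the build cache, not just half-finished builds.
docker system prune -a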
3.3. Execute the docker image ls command again to check that the pruning worked. Now the list should be empty.
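The check in steps 3.1 and 3.3 can also be scripted. The sketch below is only an illustration: count_images is a hypothetical helper, and the script treats a Docker daemon that is not running the same as an empty image list. It deliberately prints the prune command instead of running it, so nothing is deleted without your say-so.

```shell
#!/usr/bin/env bash
# Count non-empty lines on stdin. Fed with `docker image ls -q`, which
# prints one image ID per line, this gives the number of local images.
count_images() {
  grep -c . || true   # grep exits non-zero when the count is 0
}

# Only talk to Docker if the CLI is installed on this machine.
if command -v docker >/dev/null 2>&1; then
  n=$(docker image ls -q 2>/dev/null | count_images)
  if [ "$n" -eq 0 ]; then
    echo "No local images: skip to Step 4."
  else
    echo "$n local image(s) found. If you don't need them, run: docker system prune -a"
  fi
fi
```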
Step 4. Set a destination for your Docker image
You must set up a destination for your Docker image, so there is a place to push it to.
Where to store your image
Terra accepts Docker images stored in the following registries:
- Google Cloud Container Registry (GCR)
- GitHub Container Registry (GHCR)
- DockerHub
The advantage of using GCR is the ability to use private buckets. Terra can only access DockerHub images in public repositories, while GCR buckets can give Terra convenient access to private resources.
At this time, Quay is not a supported registry for custom cloud environments. You can, however, use Quay images for workflow submissions.
Note: It's important to put in the same image name (and tag) you intend to use in your build command.
Follow the instructions below for setting up the destination for your image using either DockerHub or Google Container Registry (GCR).
DockerHub
4.1. If you haven't already, sign up for a Docker account and install Docker locally by following these instructions for Mac, Windows, or Linux.
4.2. Go to your DockerHub account and Create Repository.
4.3. Give your repository a Name and make sure the Visibility is set to Public so Terra can access your Docker image.
4.4. Create your repository by clicking on the blue Create Repository button.
Google Container Registry (GCR)
4.5. Create a bucket in Google Cloud Storage.
4.6. Give Terra access to a private GCR bucket by adding your individual, personal Terra group as a member of the bucket.
Why we recommend using Personal Terra groups
Terra groups are a way to harness Terra's security structure and avoid giving permission to a user ID while keeping members easy to identify. To learn how and why to make a personal Terra group to access external resources, see Best practices for accessing external resources.
Alternatively, if you want a group of collaborators to have access to your private Docker container, you can add the @firecloud.org email address for that group (found in the Groups section of your Terra profile).
Step 5. Build and push your custom Docker image
The build command must be executed from within the directory with the modified Docker file.
Before you start: Make note of these common mistakes
1. Make sure the repository name and image name match what you’ve set up in DockerHub or GCR.
2. The docker build command builds the image from the Docker file in the current directory, so don’t forget the period (“.”) at the end of the build command!
You MUST run your command from the directory containing the dockerfile
By default, Docker only recognizes dockerfiles named simply Dockerfile (no extensions), so you can have as many dockerfiles as you want on your computer, but they need to be in separate folders, with only one dockerfile per folder. When you execute the docker build command as written in this guide, it will look for a dockerfile in the directory you're looking at in your terminal. There must be a single file simply named Dockerfile in that directory, or the command will fail.
Follow the instructions below to build and push your custom Docker image.
5.1. From inside the terra-docker directory, change into your terra-jupyter-r directory using the following command.
cd terra-jupyter-r
If you're following this tutorial exactly, the contents of the folders you cloned from Git should be right.
If you're trying to use these instructions for your own Docker adventures, you may want to use the ls command to list the contents of the directory to make sure the necessary dockerfile is present. If you just made your own dockerfile from scratch and you're having trouble getting rid of an extension (such as .txt), you can get rid of it by renaming the file with this command.
mv Dockerfile.txt Dockerfile
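To confirm the naming rule before you build, a small check like the one below may help. It is a sketch under assumptions: check_dockerfile is a hypothetical helper name, and the near-miss listing relies on the -maxdepth and -iname options found in common GNU and BSD versions of find.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if DIR (default: current directory)
# contains a regular file named exactly "Dockerfile".
check_dockerfile() {
  local dir=${1:-.}
  if [ -f "$dir/Dockerfile" ]; then
    echo "OK: $dir/Dockerfile found"
    return 0
  fi
  echo "No file named exactly Dockerfile in $dir" >&2
  # List near misses such as Dockerfile.txt to show what needs renaming.
  find "$dir" -maxdepth 1 -type f -iname 'dockerfile*' >&2
  return 1
}
```

Calling check_dockerfile with no argument from inside terra-jupyter-r checks the directory you are about to build from.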
5.2. Execute the build command below.
docker build -t REPOSITORY_NAME/DOCKER_IMAGE_NAME:TAG_NAME .
The building process should take about 10 minutes.
5.3. Execute the push command to upload your custom image to your repo.
docker push REPOSITORY_NAME/DOCKER_IMAGE_NAME:TAG_NAME
This step may also take up to 10 minutes.
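Because the build and push commands must use exactly the same name and tag, one way to avoid a mismatch is to compose the reference once and reuse it. The sketch below assumes nothing beyond the two commands above; the function names image_ref and build_and_push are hypothetical, and nothing runs until you call build_and_push yourself.

```shell
#!/usr/bin/env bash
# Compose REPOSITORY_NAME/DOCKER_IMAGE_NAME:TAG_NAME from its three parts.
image_ref() {
  printf '%s/%s:%s\n' "$1" "$2" "$3"
}

# Build from the Dockerfile in the current directory, then push.
# The push only runs if the build succeeds.
build_and_push() {
  local ref
  ref=$(image_ref "$1" "$2" "$3")
  docker build -t "$ref" . && docker push "$ref"
}
```

From inside terra-jupyter-r, you might call it as build_and_push REPOSITORY_NAME DOCKER_IMAGE_NAME TAG_NAME, substituting the values you set up in Step 4.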
How to find your Docker container's digest
Sometimes you need to know a Docker container's digest - a unique content-addressable identifier - to be certain that all nodes are running the correct version of the container.
There are two ways to get the digest depending on where your image is stored. In both cases, you'll look for something with the format sha256:SOMETHING_LONG, where the SOMETHING_LONG bit is the digest.
Follow the instructions below, depending on whether your image is stored on your local machine or not.
- If the image is on your local machine, type the following at the prompt in the terminal.
docker inspect MY_REPO/MY_IMAGE:TAG
Note: The output is more complicated. There are two things that look like sha256:SOMETHING_LONG. The one you want is the "RepoDigests" one, not the "Id":
~ $ docker inspect MY_REPO/MY_IMAGE:TAG
[
    {
        "Id": "sha256:a98acb9802cbf46eb71e28c652f58026c027d9580ff390c6fa9ae4dec07ae13d",
        "RepoTags": [
            "MY_REPO/MY_IMAGE:TAG"
        ],
        "RepoDigests": [
            "MY_REPO/MY_IMAGE@sha256:96bf2261d3ac54c30f38935d46f541b16af7af6ee3232806a2910cf19f9611ce"
        ],
        ...and a lot of other details we don't care about right now.
- If the image is not on your local machine, type the following at the prompt in the terminal.
docker pull MY_REPO/MY_IMAGE:TAG
The digest will be displayed in the output as:
Digest: sha256:96bf2261d3ac54c30f38935d46f541b16af7af6ee3232806a2910cf19f9611ce
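For a local image, you can skip reading through the full inspect output by asking Docker for the "RepoDigests" entry directly with the --format option. The sketch below is an illustration, not the only way to do it: both function names are hypothetical, and it assumes the image has been pushed or pulled at least once, since RepoDigests is empty for an image that has only ever been built locally.

```shell
#!/usr/bin/env bash
# Strip everything up to and including the "@" from a RepoDigests entry,
# leaving only sha256:SOMETHING_LONG.
digest_from_repo_digest() {
  printf '%s\n' "${1#*@}"
}

# Ask Docker for the first RepoDigests entry of a local image.
# Assumption: RepoDigests is not empty (image was pushed or pulled).
local_repo_digest() {
  docker inspect --format '{{index .RepoDigests 0}}' "$1"
}
```

Combined, digest_from_repo_digest "$(local_repo_digest MY_REPO/MY_IMAGE:TAG)" prints just the sha256 part.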
Launching a Notebook with your custom Docker image
You should now be ready to launch a Notebook Cloud Environment based on your custom Docker image!
1. Navigate to the workspace Analyses page with the notebook you want to run using the custom Docker image.
2. Select the Environment Configuration button (cloud icon) in the side panel on the right side of the screen.
3. Select Environment Settings under the Jupyter section of the panel that opens up.
4. If you already have a Jupyter Cloud Environment in the workspace, select the Custom Environment option at the bottom of the Application Configuration dropdown.
If you are creating a new Cloud Environment, select the option to Customize the Cloud Environment, and select the Custom Environment option at the bottom of the Application Configuration dropdown.
5. Fill in the required field with the name and location of the image in your repository.
6. Select Create/Replace at the bottom right of the form to create or update the Environment, depending on whether one already existed for your workspace. It will take about 10 minutes for the new virtual machine (VM) to spin up.
7. Open any notebook (or create a new one) in the same workspace.
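8. Test to see if the new packages have been installed on your virtual machine, for example by running library(edgeR) in an R notebook cell if you added edgeR as in this tutorial.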
9. Don't forget to save the image identifier and URL right in your notebook to keep track of which image the notebook is intended to use.
Add a custom Docker image to your WDL
In your WDL, you should include MY_REPO/MY_IMAGE@sha256:SOMETHING_LONG as the docker value in the task's runtime section. Note: The tag isn't there at all; it's been replaced by the digest, which is a more specific identifier.
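If you have the tagged name and the digest in hand, the digest-pinned form can be assembled in the shell before you paste it into your WDL. This is a minimal sketch with a hypothetical helper name, pin_to_digest. It assumes the tag, if present, follows the last colon in the final path component, so a registry host with a port number is left intact.

```shell
#!/usr/bin/env bash
# Hypothetical helper: turn MY_REPO/MY_IMAGE:TAG plus a digest into
# MY_REPO/MY_IMAGE@sha256:SOMETHING_LONG.
pin_to_digest() {
  local ref=$1 digest=$2 name prefix
  name=${ref##*/}          # last path component, e.g. MY_IMAGE:TAG
  prefix=${ref%"$name"}    # everything before it, e.g. MY_REPO/
  # Drop the tag from the last component and append the digest instead.
  printf '%s%s@%s\n' "$prefix" "${name%%:*}" "$digest"
}
```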