This step-by-step guide shows how to build and publish a custom Docker image, and then run a Jupyter Notebook on Terra using that image, which is modified to include additional packages.
In this tutorial you will learn:
- How to modify one of our base Docker images by adding packages to its Dockerfile
- How to push that Docker image to a repository so that Terra can access it
- How to launch a Notebook Runtime in Terra using your custom Docker image
Clone the Git repository with the base images
To get started, download all of our base images at once by cloning the Terra GitHub repository.
- To do this, start by finding the green “Clone or download” button on the GitHub repo page and copying the URL
- Open your terminal and run the command “git clone [link]”, using the URL you copied in the previous step. When the command runs, Git reports that it is cloning the repository into a new “terra-docker” directory.
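The clone step can be sketched as follows. The URL here is a hypothetical placeholder (substitute the one you copied from GitHub), and the clone itself is wrapped in a function so that nothing is downloaded until you call it.

```shell
# Hypothetical placeholder: replace with the URL copied from the GitHub page.
REPO_URL="https://github.com/example-org/terra-docker.git"

# By default, git names the new directory after the last path component
# of the URL, without the ".git" suffix.
CLONE_DIR="$(basename "$REPO_URL" .git)"
echo "Repository will be cloned into: $CLONE_DIR"

# Clone the repository and list the base-image folders it contains.
clone_base_images() {
  git clone "$REPO_URL" && ls "$CLONE_DIR"
}
```

Calling clone_base_images performs the actual download and then lists the image folders, which should include “terra-jupyter-r”.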
Modify a Dockerfile
The next step is to modify one of the base Dockerfiles and build your custom Docker image from it (the build is a single command, but it takes your computer some time to complete).
You should now have our entire collection of Docker base images on your local machine in a new directory called “terra-docker”. Inside of this directory you can find a folder titled “terra-jupyter-r” – this is the image we will modify in this tutorial.
Find this folder (for example, by typing “terra-jupyter-r” into your Finder search bar) and open the file called “Dockerfile” inside it in your favorite text editor.
If you scroll through this file, at the bottom you can see a list of R packages, mostly installed with BiocManager. This is where you can add our package:
- Under the line containing && R -e 'BiocManager::install(c( add the package name “edgeR”, quoted and comma-separated like the existing entries. edgeR is a popular Bioconductor package for the analysis of digital gene expression data
- Once you’ve added this to the code, just save the file. There is no need to “save as”, and you should not rename the file in any way.
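A quick way to confirm that the edit landed is to search the file for the new package name. The sketch below runs against a hypothetical stand-in for the Dockerfile: the two example package names are invented for illustration, and the real file's list will differ.

```shell
# Hypothetical stand-in for terra-jupyter-r/Dockerfile. Only the "edgeR"
# line mirrors the edit described above; the other names are made up.
workdir="$(mktemp -d)"
cat > "$workdir/Dockerfile" <<'EOF'
    && R -e 'BiocManager::install(c( \
        "edgeR", \
        "examplePackageA", \
        "examplePackageB"))'
EOF

# Print the matching line with its line number.
# No output would mean the package name is missing from the file.
match="$(grep -n '"edgeR"' "$workdir/Dockerfile")"
echo "$match"

# Remove the temporary stand-in file.
rm -rf "$workdir"
```

On your own machine, the same grep command can be pointed at the real Dockerfile inside the “terra-jupyter-r” folder.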
Building and pushing your Docker image
You are almost ready to build and push your custom Docker image! Before you execute the build command, you may need to (a) make sure that there are no half-finished Docker builds on your machine that could interfere with building and pushing, and (b) set up your Docker Hub or Google Container Registry (GCR) account so that you have a place to push your custom image. If you have some Docker experience, you may not need to worry about these and can skip to step (c), assuming you already have a Docker repository with a name and tag matching the image you are about to build.
(a) If you’ve never used Docker on the machine you’re using for this exercise, you can probably skip this part. If you have been experimenting with Docker, you may need to follow the pruning steps outlined below (and if you skip them and run into trouble down the road, come back to see if they help):
- In your terminal, run the command “docker image ls” to see whether there are any other images on your machine. This command can be executed from any directory. If the list comes back empty, skip to step (b).
- If your list is not empty (and you don’t need the images listed), execute the following command. Note that it removes all unused images, stopped containers, unused networks, and the build cache:
docker system prune -a
- Run “docker image ls” again to check that the pruning worked
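The pruning check in step (a) can be combined into one small helper. This is a minimal sketch: prune_if_needed is a hypothetical function name, and it assumes Docker is installed and running.

```shell
# Prune only when the local image list is non-empty.
prune_if_needed() {
  # `docker image ls -q` prints one image ID per line, and nothing when
  # there are no local images.
  if [ -z "$(docker image ls -q)" ]; then
    echo "No local images; nothing to prune."
    return 0
  fi
  # WARNING: removes all unused images, stopped containers, unused
  # networks, and the build cache. Docker asks for confirmation first.
  docker system prune -a
  # List what is left so you can check that the pruning worked.
  docker image ls
}
```

Run prune_if_needed from any directory. If you still need some of the listed images, do not confirm the prompt.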
(b) You must now set up a destination for your Docker image. The custom environment input field described at the end of this tutorial accepts images from both Docker Hub and Google Container Registry (GCR). It is important to use the same image name (and tag) here that you intend to use in your build command.
For Docker Hub users:
- To install Docker, first sign up for a Docker account and then follow Docker's installation instructions.
- Go to your Docker Hub account and create a new repository. Make sure that Terra can access your Docker image by making the repository public.
For GCR users:
- Create a bucket in the Google Container Registry. The advantage of using GCR is the ability to use private buckets: Docker Hub users are limited to public repositories, while GCR buckets have a convenient way to give Terra access to private resources.
- For your Terra workspace to have access to a private GCR bucket, you will need to add your @firecloud.org proxy email address (which you can find in your profile on Terra) as a member of that bucket.
Alternatively, if you want a group of collaborators to have access to your private Docker container, you can add the @firecloud.org email address for that group (found in the Groups section of your Terra profile) instead of an individual user's proxy address.
(c) This is the crucial step: building and pushing your custom image! The build command must be executed from within the directory that contains the modified Dockerfile. Important:
- Make sure that the repository name and image name match what you’ve set up in your Docker Hub or GCR account.
- The period (“.”) at the end of the build command tells Docker to use the current directory, and the Dockerfile in it, as the build context, so don’t forget it!
- Now cd into your “terra-jupyter-r” directory.
- Execute the build command (the build should take about 10 minutes). The docker command and the repository name must both be lowercase:
docker build -t repository_name/docker_image_name:tag1 .
- Execute the push command to upload your custom image to your repo (this may also take up to 10 minutes):
docker push repository_name/docker_image_name:tag1
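Because the build and push commands must use exactly the same image reference, it can help to assemble that reference once and reuse it. The sketch below does this with the placeholder names from the commands above. The build_and_push function is a hypothetical helper, and it assumes the repository was cloned into ./terra-docker. For GCR, the repository part would typically take the form gcr.io/<project-id>.

```shell
# Placeholder names: replace with your own repository, image and tag.
# Docker requires repository names to be lowercase.
REPOSITORY="repository_name"
IMAGE_NAME="docker_image_name"
TAG="tag1"
IMAGE_REF="${REPOSITORY}/${IMAGE_NAME}:${TAG}"
echo "Image reference: $IMAGE_REF"

build_and_push() {
  # Assumption: the repository was cloned into ./terra-docker.
  # Note that this changes the current directory of your shell.
  cd terra-docker/terra-jupyter-r || return 1
  # The trailing "." makes the current directory the build context.
  docker build -t "$IMAGE_REF" . || return 1
  # Pushing requires that you are already logged in to the registry
  # (for Docker Hub, via `docker login`).
  docker push "$IMAGE_REF"
}
```

Calling build_and_push runs both steps in order and stops if the build fails, so a broken image is never pushed.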
Launch a Notebook with your custom Docker image
You should now be ready to launch a Notebook Runtime Environment based on your custom Docker image! To do this, go into a workspace where you are able to create notebooks, click on the “Runtime Environment” button in the upper right corner of the screen, select the “Custom Environment” option, and fill in the required field with the location of the image in your repository:
- Enter the Docker image location and name as described above
- Click “Create” or “Replace” and wait about 10 minutes
- Open any notebook (or create a new one) in the same workspace
- Test whether the new packages are installed on your virtual machine, for example by loading edgeR in a notebook cell
- Don't forget to save the image identifier and URL right in your notebook in order to keep track of which image the notebook is intended to use!
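The package test can also be run from a terminal instead of a notebook cell. The sketch below assumes that a terminal is available on the notebook VM and that Rscript is on its PATH. check_r_package is a hypothetical helper, not a Terra command.

```shell
# Report whether an R package can be loaded in the current environment.
check_r_package() {
  pkg="$1"
  # requireNamespace() returns FALSE rather than raising an error when
  # the package is missing, so the result maps onto the exit status.
  if Rscript -e "quit(status = if (requireNamespace('$pkg', quietly = TRUE)) 0 else 1)"; then
    echo "$pkg is installed"
  else
    echo "$pkg is NOT installed"
    return 1
  fi
}
```

For the image built in this tutorial, check_r_package edgeR is the relevant check.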