In Terra, you can set up a workspace virtual environment by selecting a Terra Jupyter-base Docker image, creating and linking to your own custom Docker image (based on an existing Jupyter-base image), or selecting a project-specific image. This document describes the currently available project-specific environments and how to use them in Terra.
- Project-specific environments versus Terra Jupyter-base Docker environments
- Current project-specific environments
- Accessing project-specific environments in Terra
- Saving data generated using project-specific environments
Project-specific environments versus Terra Jupyter-base Docker environments
Terra supports multiple genomic research projects, each of which requires a different computational environment to analyze its genomic datasets. These project-specific environments allow project teams to customize their Terra workspace so that it can run the applications they need (Jupyter Notebooks, Bioconductor, R, etc.). Like all Terra environments, project-specific environments are created using a Docker image, but they are not based on one of the existing Terra base images, which currently support only Jupyter Notebook applications.
**Warning: Code developed in project-specific environments may not be saved to the workspace Google bucket.**
If a project-specific image does not use a Jupyter Notebook or Terra base image, work performed in the image is not saved to your workspace Google bucket. Your code is saved on the runtime environment, but if you delete the runtime (or if the runtime becomes unresponsive), you will lose that code. To avoid losing work, back up your code. The end of this guide provides steps for saving data or code generated with project-specific images.
The following section lists projects that have custom, Terra-supported environments; this list will be updated as new project-specific environments become available.
**Current projects with custom, Terra-supported environments**
- AnVIL (see the AnVIL Docker documentation for details).
- Pegasus, a Python package for analyzing and visualizing large single-cell transcriptomes (find the Pegasus Jupyter Image here)
Accessing a project-specific environment in Terra
You can access a project-specific environment by selecting "Project-Specific Environment" from the Environment drop-down menu of the workspace Notebook Runtime box. The following steps guide you through this process.
1. Go to your Terra workspace and select the Notebook Runtime box in the upper right corner of the workspace.
2. From the environment drop-down menu, select "Project-specific Environment".
3. In the URL text box of the same Notebook Runtime box, paste the path to the project-specific image. For example, for the Bioconductor image, paste "us.gcr.io/broad-dsp-gcr-public/terra-jupyter-r:0.0.7".
4. Select "Create" at the bottom of the Notebook Runtime window.
5. A warning box will appear to inform you that the Docker image is unverified. Select "Create". The application will then take a few minutes to load.
6. When your application is ready, a screen will appear stating that your new runtime is ready to use. Select "Apply".
7. A message will appear in the upper right corner asking if you would like to launch. Select "Launch Application" to open the application.
8. The application will open in the workspace. Your Notebook Runtime box will indicate that the application is running.
9. To stop the application, press the pause button on the Notebook Runtime box.
Saving work generated in a project-specific environment
Work performed in a project-specific environment that does not use a Jupyter Notebook or Terra base image is not saved to your workspace Google bucket. Your code is saved on the runtime environment, but if you delete the runtime (or if the runtime becomes unresponsive), you will lose that code.
You can save files and code generated using a project-specific image by 1) copying them to your workspace Google bucket, 2) downloading them from your workspace bucket to your local computer, and 3) checking the code into GitHub.
1. Copying work to a workspace Google bucket
Use the gsutil tool to copy files to your workspace Google bucket. Before copying files, you must first identify the URL for the workspace Google bucket. This information is found in the Dashboard's Workspace Information panel.
Next, to copy all files in the current directory after completing your work in any application, use the bash command shown below in either the application console or the terminal. If you want to copy an individual file, replace `*` with the name of the file. If you are using the workspace terminal, first navigate to the folder containing the files.
```shell
gsutil cp ./* gs://<WORKSPACE_BUCKET>
```
For example, if the Google bucket ID is 'fc-7da2e5e7-d21f-4f78-90c2-27fb2414086e', type the following command:
```shell
gsutil cp ./* gs://fc-7da2e5e7-d21f-4f78-90c2-27fb2414086e
```
More details on how to copy files into a Google bucket using Python or R commands can be found in the article "Copying notebook output to a Google bucket".
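The `./*` pattern matches only the files at the top level of the current folder, and `gsutil cp` skips directories unless it is given the `-r` option. If your work is spread across subfolders, a recursive copy may be more convenient. The following is a minimal sketch, not an official Terra utility: `backup_dir` and the `DRY_RUN` switch are hypothetical names introduced here for illustration, and the sketch assumes gsutil is installed and authenticated in your application.

```shell
#!/usr/bin/env bash
# Sketch only: backup_dir is a hypothetical helper, not a Terra or gsutil command.
# Usage: backup_dir <local_directory> <workspace_bucket_id>
# Set DRY_RUN=1 to print the gsutil command instead of running it.
backup_dir() {
  if [ "$#" -ne 2 ]; then
    echo "usage: backup_dir <local_directory> <workspace_bucket_id>" >&2
    return 2
  fi
  local src="$1" bucket="$2"
  # -m runs the transfer in parallel; cp -r descends into subdirectories.
  ${DRY_RUN:+echo} gsutil -m cp -r "$src" "gs://${bucket}"
}
```

Running `DRY_RUN=1 backup_dir ./results <WORKSPACE_BUCKET>` prints the command that would be executed, which lets you check the source and destination before copying anything. Omit `DRY_RUN=1` to perform the copy.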
2. Downloading files from a workspace Google bucket
Once your files are copied to a workspace Google bucket, you can access them by selecting the Data tab of the workspace and choosing the Files option on the bottom left. This displays the files available in your Google bucket. Selecting a file downloads it directly. Additionally, this Terra support document details alternative techniques you can use to download data files.
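Because `gsutil cp` works in both directions, one possible command-line alternative is to download a file from the bucket to your local computer. The sketch below assumes that gsutil is installed on your local computer and authenticated with an account that can read the workspace bucket; `fetch_from_bucket` and `DRY_RUN` are hypothetical names used only for this illustration.

```shell
#!/usr/bin/env bash
# Sketch only: fetch_from_bucket is a hypothetical helper, not a Terra or gsutil command.
# Usage: fetch_from_bucket <workspace_bucket_id> <object_name> [local_destination]
# Set DRY_RUN=1 to print the gsutil command instead of running it.
fetch_from_bucket() {
  if [ "$#" -lt 2 ]; then
    echo "usage: fetch_from_bucket <workspace_bucket_id> <object_name> [local_destination]" >&2
    return 2
  fi
  # The destination defaults to the current directory.
  local bucket="$1" object="$2" dest="${3:-.}"
  ${DRY_RUN:+echo} gsutil cp "gs://${bucket}/${object}" "$dest"
}
```

For example, `fetch_from_bucket <WORKSPACE_BUCKET> results.csv ~/Downloads` would copy a file named `results.csv` (a placeholder name) from the bucket into your local Downloads folder.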
3. Checking code into GitHub
You can install Git on your application and use it to check code into GitHub.
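As a rough illustration of that workflow, the sketch below commits everything in the current folder and pushes it to a GitHub repository. It assumes that Git is already installed in the application, that the GitHub repository already exists, and that you have set up credentials for it (for example, an SSH key or a personal access token). `save_to_github`, `_run`, and `DRY_RUN` are hypothetical names introduced for this example.

```shell
#!/usr/bin/env bash
# Sketch only: save_to_github and _run are hypothetical helper names.
# Set DRY_RUN=1 to print each git command instead of running it.
_run() { ${DRY_RUN:+echo} "$@"; }

# Usage: save_to_github <remote_url> [commit_message]
save_to_github() {
  if [ "$#" -lt 1 ]; then
    echo "usage: save_to_github <remote_url> [commit_message]" >&2
    return 2
  fi
  local remote_url="$1" message="${2:-Back up work from Terra}"
  # Create a repository in the current directory if there is none yet.
  [ -d .git ] || _run git init
  # Stage and commit every new, changed, or deleted file.
  _run git add --all
  _run git commit -m "$message"
  # Register the remote only if "origin" is not already configured.
  git remote get-url origin >/dev/null 2>&1 || _run git remote add origin "$remote_url"
  # Push the current branch and set it as the upstream.
  _run git push -u origin HEAD
}
```

From the folder that holds your code, you might first run `DRY_RUN=1 save_to_github <REPOSITORY_URL> "Add analysis code"` to preview the commands, where `<REPOSITORY_URL>` stands for the address of your own GitHub repository. Consider adding large data files to a `.gitignore` file beforehand so that only code is committed.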