The prepackaged Cloud Environment configurations (in the Application Configuration dropdown menu) may not include every package you regularly need. If you want a more efficient way to launch a Jupyter Notebook, RStudio, or Galaxy analysis without having to install each package manually, you can use a startup script to streamline the process. This article describes why and how to use startup scripts.
Why use a startup script?
Standardize your Cloud Environment
Using the same software application configuration ensures that everyone has the same computational environment and gets the same results (when inputting the same data and using the same analysis tools, of course!). If none of the preconfigured application options meets your needs, you can make your own custom application configuration (i.e., preinstall software and dependencies in the virtual machine [VM]) with a startup script (see additional Docker documentation for another option).
Startup script versus custom Docker
Startup scripts are good for installing packages and making environment changes that typically require sudo. This makes them an efficient alternative to creating custom Docker images. (If you're curious about doing that, see this tutorial, Custom cloud environments for Jupyter Notebooks.)
A custom Docker image is a great way to take a snapshot of a set of package versions to keep your environment consistent. However, a startup script is a quick way to add anything you need, including updated packages, to whatever Docker image you're working with, whether that is a custom image or one of our preconfigured ones. Currently, startup scripts are supported for both Jupyter and RStudio environments.
Startup script tutorial
In this short tutorial, you'll see an example of a startup script and learn how to upload it, find the file path you'll need to provide as a link, and use that link to launch your custom Jupyter or RStudio Cloud Environment.
Below is the example startup script we'll use in this tutorial. This startup script installs the multtest package, which is part of Bioconductor but is not included in the default R/Bioconductor environment in Terra.
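A minimal sketch of what such a startup script might look like is shown here. It assumes that Rscript and the BiocManager package are already present in the image and that the script runs with enough privileges to write to the R library; neither is confirmed by this article, so treat the script as an illustration rather than the exact file used in the tutorial.

```shell
#!/usr/bin/env bash
# Hypothetical startup script: installs one Bioconductor package at VM startup.
set -uo pipefail

# Package named in this tutorial.
PACKAGE="multtest"

# R expression: install through BiocManager only when the package is missing.
# Assumption: BiocManager itself is already installed in the image.
R_EXPR="if (!requireNamespace('${PACKAGE}', quietly = TRUE)) BiocManager::install('${PACKAGE}', ask = FALSE, update = FALSE)"

if command -v Rscript >/dev/null 2>&1; then
  Rscript -e "${R_EXPR}"
else
  # No R in this image: report it and leave the environment unchanged.
  echo "Rscript not found; skipping install of ${PACKAGE}" >&2
fi
```

Checking for the package first keeps the script safe to rerun, since a second run on an environment that already has multtest does nothing.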
Step 1: Check if the package is already in a preconfigured environment
Start with a quick confidence check to make sure the package you want isn't already part of one of the preconfigured environments in Terra.
1.1. Go to the Application Configuration section of the Cloud Environment customization pane.
1.2. Start by selecting the default R/Bioconductor image in the Application Configuration dropdown menu.
1.3. Before adding a custom startup script, try to import the package by typing library(multtest) into a code block and running the cell.
You should see an error saying that no such package is currently installed on your virtual machine (in R, the message reads: there is no package called 'multtest').
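The same check can also be sketched from a terminal in the Cloud Environment. The helper below is hypothetical (the name r_package_installed is not part of Terra or R) and assumes Rscript is on the PATH; it reports a package as not installed if R itself is missing.

```shell
#!/usr/bin/env bash

# Return 0 if the named R package can be loaded, non-zero otherwise
# (including when Rscript itself is not available).
r_package_installed() {
  local pkg="$1"
  command -v Rscript >/dev/null 2>&1 || return 2
  Rscript -e "quit(status = if (requireNamespace('${pkg}', quietly = TRUE)) 0 else 1)" >/dev/null 2>&1
}

if r_package_installed "multtest"; then
  echo "multtest is already installed"
else
  echo "multtest is not installed"
fi
```

Running it again after the environment has been recreated with the startup script gives a quick way to confirm that the package is now available.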
Step 2: Store script in Google bucket
To add a startup script to a Terra Cloud Environment, you need to give Terra the script's URI (Uniform Resource Identifier, similar to a URL) so that Terra can access the script.
2.1. Store the script in workspace storage (Google bucket)
You can upload the startup script to any Google bucket that your workspace can access. For this tutorial, upload the script file to workspace storage (i.e., the workspace's Google bucket): 1) go to the Data tab of the workspace, 2) select the Files icon at the bottom of the left-hand menu, and 3) click the Upload button in the bottom right.
2.2. Copy the URI of the script
Once the file is in workspace storage, go to Google Cloud console, where you can copy the URI of your script file. You can get there either with the link to your bucket (the Open in browser link at the right-hand side of your workspace dashboard) or by clicking the file link in the Data tab.
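For orientation, a Cloud Storage URI has the form gs://bucket-name/object-path. The sketch below builds one from a bucket name and a file name and prints the gsutil command that would upload the file from a terminal instead of the Upload button. The bucket and file names are hypothetical placeholders, and it is an assumption that the Startup Script field expects this gs:// form.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Build the gs:// URI for an object stored in a bucket.
gcs_uri() {
  local bucket="$1" object="$2"
  printf 'gs://%s/%s\n' "$bucket" "$object"
}

# Hypothetical names; substitute your workspace bucket and script file.
BUCKET="my-workspace-bucket"
SCRIPT="startup_script.sh"

URI="$(gcs_uri "$BUCKET" "$SCRIPT")"
echo "Startup script URI: ${URI}"

# Command-line alternative to the Upload button (printed here, not executed).
echo "gsutil cp ${SCRIPT} ${URI}"
```

The URI printed by this sketch should match the one shown for the file in Google Cloud console.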
Figure: URI in Google Cloud console
2.3. Once you have the URI, paste it into the Startup Script field under the Cloud Compute Profile section of the Jupyter Cloud Environment customization pane.
2.4. Choose any other compute profile customizations, then click the Update or Create button to spin up your Jupyter or RStudio Cloud Environment.
What to expect
Once your environment is ready, you can launch a notebook or RStudio analysis. If you run the same command we tried during the confidence check at the beginning of this tutorial, the package should now import successfully.