Pre-configure a Cloud Environment with a startup script

Anton Kovalsky

If the pre-configured Cloud Environment applications (in the Application Configuration dropdown menu) don't include all of the packages you regularly need, you can use a startup script to install them automatically when the environment is created. This lets you launch a Jupyter notebook, RStudio, or Galaxy analysis without installing each package manually. This article describes why and how to use startup scripts.

Why use a startup script?

Standardize your Cloud Environment

Using the same software application configuration ensures that everyone has the same computational environment and gets the same results (when inputting the same data and using the same analysis tools, of course!). If none of the pre-configured application options meets your needs, you can make your own custom application configuration (i.e. pre-install software and dependencies in the VM) with a startup script (see the additional Docker documentation for another option).

Startup script versus custom Docker

Startup scripts are good for more than installing packages: you can also use them to make environment changes that typically require sudo. This makes them an efficient alternative to creating custom Docker images (if you're curious about doing that, check out this tutorial, Custom cloud environments for Jupyter notebooks).

While a custom Docker image is a great way to take a snapshot of a set of package versions to keep your environment consistent, a startup script is a quick way to add anything you need, including updated packages, to whatever Docker image you're working with, whether that's a custom image or one of our pre-configured ones. Currently, startup scripts are supported for both Jupyter and RStudio environments.

Startup script tutorial

In this short tutorial, you'll see an example of a startup script and learn how to upload it, find the file path you'll need to provide as a link, and use that link to launch your custom Jupyter or RStudio Cloud Environment.

Below is the example startup script we'll be using in this tutorial. This startup script installs the multtest package, which is part of Bioconductor but is not included in Terra's default R/Bioconductor environment.

Screen_Shot_2021-03-17_at_2.04.10_PM.png
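For readers who can't see the screenshot, here is a minimal sketch of what a startup script that installs multtest might look like. It is not necessarily the exact script shown above. It assumes that Rscript is on the PATH of the environment's image and that the machine can reach CRAN and Bioconductor; the helper name bioc_install_expr is made up for this example.

```shell
#!/usr/bin/env bash
# Sketch of a startup script that installs one Bioconductor package.
# Assumptions: Rscript is on the PATH and the VM has network access.

PACKAGE="multtest"

# Hypothetical helper: print the R code that installs a Bioconductor package,
# installing BiocManager first if it is missing.
bioc_install_expr() {
  local pkg="$1"
  printf "if (!requireNamespace('BiocManager', quietly = TRUE)) install.packages('BiocManager', repos = 'https://cloud.r-project.org'); BiocManager::install('%s', update = FALSE, ask = FALSE)\n" "$pkg"
}

if command -v Rscript >/dev/null 2>&1; then
  # Run the installation non-interactively; warn instead of aborting on failure.
  Rscript -e "$(bioc_install_expr "$PACKAGE")" \
    || echo "Warning: installing ${PACKAGE} failed" >&2
else
  echo "Rscript not found; skipping ${PACKAGE} installation" >&2
fi
```

Passing ask = FALSE and update = FALSE keeps the install from waiting on a prompt, which matters because nobody is present to answer it while the environment is starting up.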

Step 1: Check if packages are already in a pre-configured environment

We can start with a quick confidence check - let's make sure the package we're interested in isn't already part of one of the pre-configured environments available in Terra.

1.1. Go to the Application Configuration section of the Cloud Environment customization pane.

1.2. Start by selecting the default R/Bioconductor image in the Application Configuration dropdown menu.
Jupyter-R_Bioconductor-image_Screen_shot.png

1.3. Before adding a custom startup script, try to load the package by typing library(multtest) into a code cell and running it.

You should see an error saying that no such package is currently on your virtual machine:
Screen_Shot_2021-03-17_at_1.46.45_PM.png
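If you prefer a terminal to a notebook cell, the same check can be sketched in shell. This is an illustrative alternative, not part of the original tutorial. It assumes Rscript is on the PATH, and the function name r_package_installed is made up for this example.

```shell
#!/usr/bin/env bash
# Sketch: check from a terminal whether an R package is installed.

# Hypothetical helper. Exit status: 0 = installed, 1 = not installed,
# 2 = Rscript is not available on this machine.
r_package_installed() {
  local pkg="$1"
  command -v Rscript >/dev/null 2>&1 || return 2
  Rscript -e "quit(status = if (requireNamespace('${pkg}', quietly = TRUE)) 0 else 1)" \
    >/dev/null 2>&1
}

if r_package_installed "multtest"; then
  echo "multtest is already installed"
else
  echo "multtest is not available (or Rscript is missing)"
fi
```

Running the same check after the environment has been recreated with the startup script is a quick way to confirm that the installation worked.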

Step 2: Store script in Google bucket

To add a startup script to a Terra Cloud Environment, you'll first need to give the script a URI (Uniform Resource Identifier, similar to a URL) so Terra can access the script.

2.1. Store the script in workspace storage (Google bucket)
You can upload the startup script to any Google bucket, provided your workspace can access that bucket. For this tutorial, we've uploaded the startup script file to the workspace storage (i.e. Google bucket) by 1) going to the Data tab of the workspace, 2) selecting the Files icon at the bottom of the left-hand menu, and 3) clicking the Upload button in the bottom right.

Start-up-script_How-to-upload-to-workspace-storage_Screen_shot.png

2.2. Copy the URI of the script
Once the file is in workspace storage, go to Google Cloud Console, where you can copy the URI of your script file. You can get there either through the link to your bucket (the Open in browser link at the right-hand side of your workspace dashboard) or by clicking the file link in the Data tab.

rstudio2.gif

URI in GCP console (below)
Startup-script_Get-the-URI-from-GCP-console_Screen_shot.png

2.3. Once you have the URI, paste it into the Startup Script field under the Cloud Compute Profile section of the Jupyter Cloud Environment customization pane.
Start-up-script_Enter-URI-in-startup-script-field_Screen_shot_.png

2.4. Choose any other compute profile customizations, then click the Update or Create button to spin up your Jupyter or RStudio Cloud Environment.
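For reference, the URI copied in step 2.2 follows the Google Cloud Storage pattern gs://<bucket-name>/<path-to-file>. The sketch below builds such a URI and, as an alternative to uploading through the Data tab, copies a local script to the bucket with gsutil. The bucket name, the script filename, the gcs_uri helper, and the use of a WORKSPACE_BUCKET environment variable are all assumptions made for illustration; substitute your own values.

```shell
#!/usr/bin/env bash
# Sketch: build the gs:// URI for a startup script and optionally upload it.

# Hypothetical helper: join a bucket and an object path into a gs:// URI.
# Accepts the bucket with or without the gs:// prefix or a trailing slash.
gcs_uri() {
  local bucket="${1#gs://}"
  bucket="${bucket%/}"
  local object="${2#/}"
  printf 'gs://%s/%s\n' "$bucket" "$object"
}

# Assumed values: replace them with your own bucket and script filename.
BUCKET="${WORKSPACE_BUCKET:-gs://my-workspace-bucket}"
SCRIPT="startup_script.sh"

URI="$(gcs_uri "$BUCKET" "$SCRIPT")"
echo "Startup script URI: ${URI}"

# Upload and list the object only if gsutil is installed and the file exists.
if command -v gsutil >/dev/null 2>&1 && [ -f "$SCRIPT" ]; then
  gsutil cp "$SCRIPT" "$URI" && gsutil ls "$URI"
fi
```

The printed URI is the value to paste into the Startup Script field in step 2.3.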

What to expect

Once your environment is ready, you can launch a notebook or RStudio analysis. If you run the same command we tried during the confidence check at the beginning of this tutorial, the package should now load successfully:
Screen_Shot_2021-03-17_at_1.55.53_PM.png
