How to customize and launch JupyterLab

Derek Caetano-Anolles

If you're interested in using Terra on Azure, please email terra-enterprise@broadinstitute.org.

Jupyter Notebooks run on virtual machines (VMs) or clusters of machines in your Jupyter Cloud Environment. You can adjust the configuration of your Jupyter app to fit your computational needs. This article gives step-by-step instructions for customizing your Jupyter Cloud Environment's virtual machine and installed software.

Accessing DRS URI data (AnVIL users)

Note that resolving AnVIL DRS URIs with terra-notebook-utils requires Python 3.10, which is not currently included by default. For step-by-step instructions on selecting Python 3.10 instead of the default when you create your JupyterLab cloud environment, see How to add Python 3.10 to your JupyterLab kernel.
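To check which Python version a notebook kernel is running before you try to resolve DRS URIs, you can run a quick sketch like the one below in a notebook cell. It assumes that Python 3.10 or newer satisfies the requirement; if an exact 3.10 match is needed, adjust the comparison. The helper name `meets_requirement` is illustrative only.

```python
import sys

# Minimum version stated in this article for resolving AnVIL DRS URIs.
REQUIRED = (3, 10)


def meets_requirement(version_info=sys.version_info, required=REQUIRED):
    """Return True if the major.minor version is at least the required one."""
    return tuple(version_info[:2]) >= required


current = ".".join(str(part) for part in sys.version_info[:3])
if meets_requirement():
    print(f"Python {current}: meets the 3.10 requirement")
else:
    print(f"Python {current}: older than 3.10, so a Python 3.10 kernel is needed")
```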

Caveat re. data generated in JupyterLab

Note that data generated in an interactive (JupyterLab) analysis is stored on the Cloud Environment's persistent disk, not in the workspace blob container storage. Generated data will be deleted when the Jupyter VM is deleted or recreated, so move anything you want to keep to your workspace storage.
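Before deleting or recreating a VM, it can help to see what is actually sitting on the persistent disk. The sketch below lists every file under a directory with its size, using only the Python standard library. The function names are illustrative, and the choice of the current working directory as the starting point is an assumption; point it at wherever your analysis writes its output.

```python
from pathlib import Path


def list_files_with_sizes(root):
    """Return (relative_path, size_in_bytes) for each regular file under root."""
    root = Path(root)
    entries = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            entries.append((path.relative_to(root).as_posix(), path.stat().st_size))
    return entries


def total_bytes(entries):
    """Sum the sizes of all listed files."""
    return sum(size for _, size in entries)


if __name__ == "__main__":
    # Assumption: generated data lives under the current working directory.
    files = list_files_with_sizes(Path.cwd())
    for name, size in files:
        print(f"{size:>12d}  {name}")
    print(f"{len(files)} files, {total_bytes(files)} bytes in total")
```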

Starting a Jupyter virtual machine (VM)

Follow the step-by-step instructions below to launch your Jupyter Cloud Environment VM in a Terra workspace. 

1. Start in the Analyses tab of your workspace.

2. Click the cloud icon in the right sidebar.
ToA-Launch-Jupyter-1.2_Cloud-icon-in-sidebar.png

You can also get to the (Jupyter) Cloud Environment pane by clicking the notebook name.

3. In the Cloud Environment Details pane, click the gear icon (Environment settings) under the Jupyter logo.
ToA-Launch-Jupyter_Gear-icon-in-Cloud-Environment-Details-pane_Screenshot.png

4. In the Azure Cloud Environment configuration pane, click the Create button (at the bottom right) to start a VM with the default settings.
ToA-Launch-Jupyter-1.4_Create-default-Jupyter-Cloud-Environment_Screenshot.png

5. Click on the name of a Jupyter notebook in the Analyses tab.

6. In Preview mode, click the OPEN button at the top. 

ToA-Launch-Jupyter_Select-OPEN-in-preview_Screenshot.png

It will take 3-5 minutes for the Cloud Environment to be created, during which you can read the notebook content in preview mode.

ToA-Launch-Jupyter_Cloud-Environment-creating_Screenshot.png

What you'll see when your Cloud Environment is ready:

ToA-Launch-Jupyter_Cloud-environment-is-ready_Screenshot.png

7. Click OPEN at the top (again) to launch JupyterLab.

ToA-Launch-Jupyter_Jupyter-Lab-in-browser_Screenshot.png

8. In the left sidebar, click the name of the notebook you want to open in JupyterLab.

ToA-Open-Jupyter_Notebook-in-Jupyter-Lab_Screenshot.png

Note that the green dot by the Jupyter logo in the sidebar (circled) indicates that the VM is running (i.e., costing money). 

How to run a terminal in an Azure Cloud Environment

Follow steps 1-7 above to open an Azure Cloud Environment. In the launcher pane, scroll down to click on the terminal icon (highlighted below).

JupyterLab-Launcher-in-Azure-Cloud-Environment_Screenshot.png

Note that you can also click the blue button with the "+" at the top left at any time to get to the launcher page. 

This will launch a bash terminal running in your Jupyter Cloud Environment VM.

Terminal-in-Azure-Cloud-Environment_Screenshot.png 

How to delete or pause your VM

Option 1: Pause Jupyter Environment

Pausing a Cloud Environment saves money while preserving all generated data and VM configurations. 

1.1. Click the Jupyter icon in the right sidebar.

1.2. Click the Pause button in the JupyterLab Environment Details pane. 

What to expect

It will take a few minutes to pause the VM, after which you'll see the reduced paused cost in the Environment Details pane and in the sidebar.

ToA_Screenshot-of-paused-Cloud-Environment.png

Option 2: Delete Jupyter Environment

Deleting a Cloud Environment will also delete generated data, so be sure to copy anything you want to keep to workspace storage.

2.1. Click the Jupyter icon in the right sidebar.

2.2. Click the gear icon (Environment settings) in the JupyterLab Environment Details pane.

ToA-JupyterLab-Environment-Details-pane_Screenshot.png

2.3. Click the Delete Environment button (at the bottom of the form).

ToA-Delete-Azure-Cloud-Environment_Screenshot.png

2.4. Click the blue Delete button to confirm. 

If your Cloud Environment is stuck in an error state

You can delete (or pause) Cloud Environments, even if they are in an error state, from your Cloud Environments page (https://app.terra.bio/#clusters), which you can reach from the main navigation menu (top left of any page in Terra).

ToA-Deleting-cloud-environments-in-error-state_Screenshot.png

Customizing your Jupyter Cloud Environment

Currently, you can select from four pre-configured Cloud compute profiles and adjust the disk size of your Jupyter VM in Terra. There is also an adjustable autopause setting with a default value of 30 minutes. Autopause prevents you from accidentally leaving your Jupyter VM running (and costing money).
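To get a feel for what the autopause setting changes, here is a minimal sketch of a simplified model. It assumes the VM runs while you work and then keeps running while idle until the autopause threshold is reached. The function name and the session numbers are illustrative, and the real autopause behavior may differ in detail.

```python
DEFAULT_AUTOPAUSE_MINUTES = 30  # default value stated in this article


def running_minutes(active_minutes, idle_minutes,
                    autopause_minutes=DEFAULT_AUTOPAUSE_MINUTES):
    """Minutes the VM spends in the running state for one session.

    Simplified model (an assumption): active time always counts, and idle
    time counts only until the autopause threshold is reached.
    """
    if min(active_minutes, idle_minutes, autopause_minutes) < 0:
        raise ValueError("durations must be non-negative")
    return active_minutes + min(idle_minutes, autopause_minutes)


# Hypothetical session: 60 minutes of work, then the tab is left open for 8 hours.
for threshold in (30, 120):
    minutes = running_minutes(60, 8 * 60, autopause_minutes=threshold)
    print(f"autopause at {threshold} min: VM running for {minutes} min")
```

In this model, a shorter threshold only reduces the idle tail of a session; the time you spend actively working is unaffected.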

When can you change your Jupyter Cloud Environment? You can modify your Jupyter Cloud Environment at any time, even if you've already started working in a notebook. 

Most updates that increase Cloud Environment resources will preserve any previous work. This is why we recommend starting with the minimum resources you think you need and scaling up if they prove insufficient.

1. Specify what you want by selecting from the Cloud compute profile dropdown and filling in the Disk size (GB) field in the Cloud Environment customization pane.

ToA-Customize-Jupyter_Select-Cloud-compute-profile-and-disk-size_Screenshot.png

2. Click the Create button and Terra will recreate your cloud environment with the new specifications. Scroll down for more details about each customization option.

Cost-saving recommendations

Size your compute power appropriately
You pay a fixed amount while a notebook is running, whether or not you are doing active calculations. The cost is based on the compute power of your virtual machine, not how much computation is being done. So choose enough power to do your computations in a reasonable amount of time, but not excessive power that you pay for and don't use.
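The trade-off can be sketched as simple arithmetic: cost is the hourly rate multiplied by the hours the VM is running, independent of how busy the CPU is. The profile names, rates, and runtimes below are hypothetical placeholders, not Terra or Azure prices.

```python
def session_cost(hourly_rate_usd, hours_running):
    """Cost of a running VM: rate times wall-clock hours, regardless of CPU load."""
    if hourly_rate_usd < 0 or hours_running < 0:
        raise ValueError("rate and hours must be non-negative")
    return hourly_rate_usd * hours_running


# Hypothetical profiles: (name, hourly rate in USD, estimated hours to finish the job).
profiles = [
    ("small (hypothetical)", 0.20, 6.0),
    ("large (hypothetical)", 0.80, 2.0),
]

for name, rate, hours in profiles:
    print(f"{name}: ${session_cost(rate, hours):.2f} for {hours:g} h at ${rate:.2f}/h")

cheapest = min(profiles, key=lambda p: session_cost(p[1], p[2]))
print("Cheapest for this job:", cheapest[0])
```

With these made-up numbers the larger machine finishes sooner but costs more overall, because its speedup does not make up for its higher hourly rate. With real rates and runtimes the comparison can go either way.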

Start small and scale up
Generally, you don't lose data if you increase resources (e.g., CPUs or disk size), so it's best to start small and increase as needed.
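One way to decide when it is time to scale up the disk is to check how full it is from inside a notebook. The sketch below uses `shutil.disk_usage` from the Python standard library. Using the home directory as the path to inspect is an assumption; substitute the location where your persistent disk is mounted.

```python
import shutil
from pathlib import Path

GIB = 1024 ** 3  # bytes per gibibyte


def disk_usage_summary(path="."):
    """Return total, used, and free space (GiB) and percent used for the disk holding path."""
    usage = shutil.disk_usage(path)
    percent_used = 100.0 * usage.used / usage.total if usage.total else 0.0
    return {
        "total_gib": usage.total / GIB,
        "used_gib": usage.used / GIB,
        "free_gib": usage.free / GIB,
        "percent_used": percent_used,
    }


# Assumption: the home directory lives on the disk you want to inspect.
summary = disk_usage_summary(Path.home())
print(f"{summary['used_gib']:.1f} of {summary['total_gib']:.1f} GiB used "
      f"({summary['percent_used']:.0f}%), {summary['free_gib']:.1f} GiB free")
```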

Updating your Jupyter VM in real time (without losing data)

Data generated in a Jupyter analysis is stored in the Cloud Environment VM by default, and will be lost when you delete or recreate your Jupyter Cloud Environment VM. Make sure to copy any data you want to keep to workspace blob storage.

There are many changes you can make - even while your Jupyter Cloud Environment is running - without worrying about losing data.

Changes that don't put data at risk

Below are all the changes you can make to the virtual environment where your notebook analysis runs without losing data stored on the disk.

  • Increase or decrease the number of CPUs or the amount of VM memory
    During this update, the Cloud Environment will pause, update, and then restart. The update will take a couple of minutes to complete, and you cannot edit or run the notebook until it finishes.

Avoid losing data: Jupyter Cloud Environment considerations

If you delete the Cloud Environment, files generated or stored in the application memory will be lost when Terra re-creates the Cloud Environment. To avoid losing data, make sure to copy any important data to workspace storage and only increase resources.

See How to save data from an interactive analysis to workspace storage (blob storage).
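As one possible approach (the linked article is the authoritative guide), the sketch below builds an `azcopy copy` command that would upload a local folder to blob storage. It assumes the AzCopy command-line tool is available in your environment and that you have a SAS URL for your workspace storage container. The helper name, the folder name, and the destination URL are placeholders, and the script only prints the command without running it.

```python
import shlex


def build_azcopy_command(local_path, destination_sas_url, recursive=True):
    """Build the argument list for an `azcopy copy` upload (does not run it)."""
    command = ["azcopy", "copy", str(local_path), destination_sas_url]
    if recursive:
        # --recursive copies the directory and everything beneath it.
        command.append("--recursive")
    return command


# Placeholder destination: replace with the SAS URL for your own workspace container.
destination = "https://<storage-account>.blob.core.windows.net/<container>?<sas-token>"
command = build_azcopy_command("results", destination)
print(shlex.join(command))
```

To actually run the copy, you could pass the list to `subprocess.run(command, check=True)` after replacing the placeholders.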
