How to set up persistent disk storage for your analysis app

Allie Cliffe

Learn how to set up detachable persistent disk (PD) storage when running an interactive analysis app (Jupyter Notebook, RStudio, or Galaxy). To learn more about detachable persistent disks, see the overview article.

You can control how your persistent disk is set up when creating, modifying, or deleting your Cloud Environment.

Creating a new Cloud Environment

When you click the Cloud Environment button, a popup shows the configuration options for your environment. At the bottom is a box for choosing the type and size of your persistent disk (PD).

Screenshot showing part of the Cloud Environment configuration options menu. An orange rectangle highlights the options in the Disk Type drop-down menu: Standard, Balanced, and Solid State Drive (SSD). An orange arrow highlights the 'Create' button, used to start a Cloud Environment with the selected configuration.

Modifying an existing Cloud Environment

If you modify the configuration of an existing Cloud Environment, the Update button at the bottom of the Cloud Environment configuration menu turns blue (active). After you click this button, your Cloud Environment will be briefly unavailable while Terra applies the update. Unless otherwise noted, your data will be preserved during this process.

  • Shrinking your PD can result in lost work

    Decreasing the size of your persistent disk will remove active code and any files on the PD. You could lose work if you decrease the PD size in the middle of an analysis. Updating the PD with a smaller disk size will trigger a warning message to this effect:

    Screenshot of the warning message that Terra will show if you attempt to decrease the size of an existing Persistent Disk.
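
Before shrinking a disk, it can help to see how much data is on it so you know what to copy elsewhere first. The following is a minimal sketch in Python, not a Terra feature. The function name summarize_directory is hypothetical, and the assumption that the PD is mounted at /home/jupyter is based only on the example path used later in this article.

```python
from pathlib import Path


def summarize_directory(root):
    """Return (file_count, total_bytes) for all regular files under root."""
    file_count = 0
    total_bytes = 0
    for path in Path(root).rglob("*"):
        # Count only regular files; skip directories and symlinks.
        if path.is_file() and not path.is_symlink():
            file_count += 1
            total_bytes += path.stat().st_size
    return file_count, total_bytes
```

For example, summarize_directory("/home/jupyter") would report the number of files and the total bytes under that directory, assuming that is where your PD is mounted.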

Deleting a Cloud Environment

When you delete a Cloud Environment, you can choose whether to delete the PD as well.

Clicking Delete Environment in the Cloud Environment configuration menu will reveal the options shown below.

Screenshot of the options that appear when deleting a Cloud Environment. The default option is to keep the Persistent Disk, while deleting the application's configuration and compute profile. The second option is to delete everything, including the Persistent Disk. An orange arrow highlights the Delete button at the bottom of this menu.

Default: Keep persistent disk

Selecting the default option, "Keep persistent disk, delete application and compute profile", will delete the VM after detaching the persistent disk. The files on the PD will be saved and you will continue to be charged for the PD.
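
Because a kept PD continues to accrue charges, a rough estimate can help you decide whether keeping it is worthwhile. The sketch below is plain arithmetic in Python. The function name is hypothetical, and the price per GB is a value you must look up for your disk type and region; it is not given in this article.

```python
def estimate_monthly_pd_cost(size_gb, price_per_gb_month):
    """Estimate the monthly storage charge for a saved persistent disk.

    size_gb: provisioned size of the disk in GB.
    price_per_gb_month: NOT from this article; look it up for your
    disk type and region.
    """
    if size_gb < 0 or price_per_gb_month < 0:
        raise ValueError("size and price must be non-negative")
    return size_gb * price_per_gb_month


# Illustrative numbers only: a 50 GB disk at a placeholder price of 0.04 per GB per month.
print(f"{estimate_monthly_pd_cost(50, 0.04):.2f}")
```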

The next time you spin up a Cloud Environment in this workspace, the PD may automatically be reattached, depending on the kind of VM you choose in the Cloud compute profile section of the Cloud Environment configuration menu.

Screenshot of the Cloud Environment configuration menu that appears when creating a new Cloud Environment when the Persistent Disk has been saved from the last Cloud Environment. An orange rectangle highlights the options in the Compute Type drop-down menu, which is used to select the type of VM for the new Cloud Environment.

  • If you choose the standard VM, the saved disk will automatically reattach.
  • If you choose a Spark mode, the saved disk will NOT reattach to that Cloud Environment because Spark and Hail application configurations don't support the persistent disk feature. The PD will, however, be saved until the next time you choose the standard VM option and click “Create”.

Delete everything

If you don't want to save the contents of your detachable persistent disk, select the "Delete everything, including persistent disk" option. Just make sure you've moved anything you wish to keep from the Cloud Environment virtual machine (VM) to another location, such as your workspace bucket.

See How to transfer data between your Cloud Environment PD and workspace storage for instructions on how to move your files off of the PD before deleting it.  
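
As a rough illustration of moving files off the PD, the sketch below builds and runs a gsutil cp command from Python. It assumes that gsutil is available in your app's environment, that the workspace bucket URL is exposed in a WORKSPACE_BUCKET environment variable, and that your files live under /home/jupyter. None of these is stated in this article, so verify them in your environment and treat the linked instructions as the supported procedure. The function names and the pd-backup folder are hypothetical.

```python
import os
import subprocess


def build_backup_command(source_dir, bucket_url, dest_folder="pd-backup"):
    """Build a gsutil command that recursively copies source_dir to the bucket."""
    if not bucket_url.startswith("gs://"):
        raise ValueError("bucket_url must start with gs://")
    destination = f"{bucket_url.rstrip('/')}/{dest_folder}/"
    # -m copies in parallel; cp -r recurses into subdirectories.
    return ["gsutil", "-m", "cp", "-r", source_dir, destination]


def backup_to_bucket(source_dir):
    """Run the copy. Assumes WORKSPACE_BUCKET is set (an assumption, see above)."""
    bucket_url = os.environ["WORKSPACE_BUCKET"]
    command = build_backup_command(source_dir, bucket_url)
    subprocess.run(command, check=True)  # raises CalledProcessError on failure
```

Calling backup_to_bucket("/home/jupyter/my-analysis") would copy that folder (a hypothetical path) into the bucket. Confirm that the copy succeeded before you choose "Delete everything, including persistent disk".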

Delete the Cloud Environment, then the PD

Once you've deleted the Cloud Environment, you might decide that you no longer need to keep the files and data stored on the PD. If so, you can delete the PD in two ways:

  • Go to your Cloud Environments page (see the section below). This will show you a summary table of your persistent disks and Cloud Environments. Each PD will have a Delete option in the Actions column. 
  • If you're spinning up a new Cloud Environment, you can click “Delete persistent disk” in the Cloud Environment configuration menu.

Note: You can't delete a persistent disk that's attached to a Cloud Environment without first deleting that environment. The option to delete the PD through either of these menus becomes available only after you've detached the disk by deleting the Cloud Environment.

Viewing PD details in the Cloud Environment page

For persistent disk information (name in Google Cloud console, billing ID, workspace ID), go to your Cloud Environments page: open the main menu in the top left corner of any page in Terra, click your name, then select "Cloud Environment". Then click the "View" link in the "Details" column for the disk you're interested in.

Screenshot showing how to see details about a Persistent Disk from the Cloud Environment page. An orange rectangle and an orange arrow highlight the information for an example disk.

You can use this information when copying data from your Cloud Environment to another location, so that you don't lose work when deleting or modifying your persistent disk. For detailed instructions, see Saving data from an interactive environment to your workspace bucket.

A note about auto-syncing behavior

Auto-syncing, in which Terra frequently autosaves your notebook back to workspace storage, may affect files stored on the VM's persistent disk. When you use a notebook in a Terra workspace, the VM creates subdirectories named after the workspace in the directory where the PD is mounted, and the Terra auto-syncing feature regularly interacts with the notebooks in these subdirectories.

If you're storing anything on the VM's persistent disk that you don't want auto-syncing to affect (for example, notebooks that you would like to keep private), we recommend keeping it in a subdirectory that is not named after a workspace, such as /home/jupyter/no-sync/.
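
One way to follow this recommendation is sketched below in Python. It creates the subdirectory if needed and moves a file into it. The default directory comes from the example above; the function name and any notebook path you pass in are hypothetical.

```python
import shutil
from pathlib import Path


def move_to_no_sync(file_path, no_sync_dir="/home/jupyter/no-sync"):
    """Move a file into a directory that is not named after a workspace."""
    target_dir = Path(no_sync_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    destination = target_dir / Path(file_path).name
    if destination.exists():
        # Refuse to overwrite an existing file of the same name.
        raise FileExistsError(f"{destination} already exists")
    shutil.move(str(file_path), str(destination))
    return destination
```

The function refuses to overwrite a file that already exists in the target directory, so a private notebook is not silently replaced.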
