Getting started with GPUs in a Cloud Environment

Anton Kovalsky

Large-scale analyses, such as training a machine learning model, are often more efficient when run on a Cloud Environment that uses a graphics processing unit (GPU). Terra supports the use of GPUs for Jupyter Notebook Cloud Environments. This feature is currently in beta; this article outlines some general limitations and gives a breakdown of cost estimates for available configurations.

To learn more about GPUs on Terra, check out Speed up your machine learning work with GPUs.

Background

Graphics processing units (GPUs) are a useful tool when running large-scale machine learning analyses in a Terra notebook. All of Terra's interactive Cloud Environment Docker images support GPUs, because they extend a base image built on Google's Deep Learning platform. That platform includes packages for the CUDA parallel computing platform, the TensorFlow machine learning platform, and the PyTorch machine learning framework, and it also installs the NVIDIA drivers necessary for GPU support.
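Because the base image installs the NVIDIA drivers, you can quickly confirm they are present from a notebook cell by shelling out to the `nvidia-smi` command-line tool. The sketch below is illustrative (the helper name `gpu_driver_report` is not part of Terra's images); it returns the driver report as a string, or None when the command isn't available:

```python
import subprocess

def gpu_driver_report():
    """Return the output of `nvidia-smi` if the NVIDIA driver is
    installed, or None when the command is unavailable (e.g., on a
    Cloud Environment created without GPUs)."""
    try:
        result = subprocess.run(
            ["nvidia-smi"], capture_output=True, text=True, check=True
        )
        return result.stdout
    except (FileNotFoundError, subprocess.CalledProcessError):
        return None
```

In a Jupyter cell you can also simply run `!nvidia-smi` to see the same report.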

How to add GPUs to your cloud environment

To take advantage of GPUs in Terra, add them to your cloud environment's compute configuration.

1. Navigate to a workspace where you have Compute access.

2. Open your workspace's Cloud Environment Configuration menu.

If you're starting up a new Cloud Environment, do this by clicking the cloud icon with a lightning bolt inside it on the right-hand panel and then selecting Settings in the Jupyter section. GPUs are available for use with Jupyter notebooks, but not RStudio or Galaxy analyses.

If you're modifying an existing environment, click on the Jupyter icon on the right-hand panel.

3. Select your desired cloud environment settings. To learn more about these settings, read Your Interactive Analysis VM (Cloud Environment).

4. Enable GPUs: click the Enable GPUs checkbox in the "Cloud compute profile" section.

 

Animated gif showing a user clicking the 'Enable GPUs' checkbox for their Cloud Compute Profile.

To add GPUs to an existing environment, delete the environment first. If you already have an environment, the Enable GPUs checkbox will be unavailable until you delete it manually. You can do this either by clicking the "delete" button at the bottom of the Cloud Environment widget or from the section of your profile that lists your cloud environments. Once the environment is deleted, the GPU checkbox becomes available when creating a new one.

This is also true if you want to modify the GPU configuration of an existing environment. For instance, if you want to increase GPU power or change the GPU type, you need to delete the existing environment and re-create it with the desired configuration.

How to check that you successfully enabled GPUs

You can check that you successfully enabled GPUs, and that the libraries relevant to many machine learning analyses are installed, by running the code snippets below in a Jupyter notebook. Note that these commands will only succeed if you checked the Enable GPUs checkbox.

  • Note: this code will only work if you've set up your Cloud Environment using an Application Configuration that includes PyTorch (e.g., Pegasus). Check whether PyTorch is included in your Application Configuration by clicking What's installed on this environment? under the Application Configuration drop-down menu in the Cloud Environment setup menu.

    import torch
    print(torch.cuda.is_available())
    print(torch.version.cuda)
    print(torch.cuda.current_device())
    print(torch.cuda.get_device_name())

    If PyTorch is installed, you should see something like this:
    Screenshot showing an example of the output you can expect from running the code to check whether PyTorch has been successfully installed on your Cloud Environment.

  • Note: this code will only work if you've set up your Cloud Environment using an Application Configuration that includes TensorFlow. Check whether TensorFlow is included in your Application Configuration by clicking What's installed on this environment? under the Application Configuration drop-down menu in the Cloud Environment setup menu.
    import tensorflow as tf
    print(tf.config.list_physical_devices('GPU'))
    from tensorflow.python.client import device_lib
    print(device_lib.list_local_devices())

    If TensorFlow is installed, you should see something like this:
    Screenshot showing an example of the output you can expect from running the code to check whether TensorFlow has been successfully installed in your Cloud Environment.
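Both checks above require the corresponding framework to be installed. As a sketch (the helper functions below are illustrative, not part of Terra's images), you can first detect which frameworks your Application Configuration includes, then run a small computation that prefers the GPU when PyTorch reports CUDA support:

```python
import importlib.util

def installed_frameworks():
    """Report which deep-learning frameworks are importable in this
    environment, without actually importing them (imports can be slow)."""
    return {
        name: importlib.util.find_spec(name) is not None
        for name in ("torch", "tensorflow")
    }

def gpu_matmul_device():
    """Run a small matrix multiply, preferring the GPU.

    Returns the device the computation ran on, or a note when
    PyTorch is not installed at all."""
    if not installed_frameworks()["torch"]:
        return "pytorch-not-installed"
    import torch
    device = "cuda" if torch.cuda.is_available() else "cpu"
    a = torch.rand(64, 64, device=device)
    b = torch.rand(64, 64, device=device)
    return str((a @ b).device)
```

On a GPU-enabled environment with a PyTorch image, the second helper should report a `cuda` device; on a CPU-only environment it falls back to `cpu` rather than raising an error.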

GPU Limitations

  • GPUs can be used on Terra with Jupyter Notebooks (e.g., for TensorFlow), but not with Galaxy or RStudio.
  • As with other interactive analysis compute resources in Terra, only the n1 family of machines is supported.
  • Terra does not support updating an existing machine's GPU configuration. If you need to modify your GPU-enabled machine, you need to delete and recreate the Cloud Environment.
  • You may experience a runtime creation failure if your configuration isn't supported. In particular, Terra only supports GPU use with the standard VM, so make sure you don't select a Spark compute type (or a Hail image).

Screenshot of the drop-down menu used to select a VM type. An orange rectangle highlights the Standard VM option, which is the only type of VM that is compatible with GPUs.

How to estimate the cost of your GPU configuration

Each GPU type has limitations as to how many GPUs are available for a given quantity of CPUs and memory. As a result, each GPU configuration has a different cost. To estimate the cost of your configuration, read more about GPUs on Compute Engine and GPU pricing in the Google Cloud Compute Engine documentation.
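As a back-of-the-envelope sketch, the hourly cost of a configuration is roughly the VM rate plus the per-GPU rate times the number of GPUs attached. The rates below are hypothetical placeholders, not Terra or Google quotes; substitute current rates from the GCP pricing pages:

```python
def hourly_cost(vm_per_hour, gpu_per_hour, num_gpus):
    """Rough hourly cost: VM rate plus GPU rate times GPU count.
    Ignores disk, network egress, and sustained-use discounts."""
    return vm_per_hour + gpu_per_hour * num_gpus

# Hypothetical example rates (NOT real prices -- check GCP pricing):
# an n1-standard-8 VM at $0.38/hr with two GPUs at $0.35/hr each.
print(f"${hourly_cost(0.38, 0.35, 2):.2f} per hour")
```

Remember that the environment accrues cost whenever it is running, so pausing or deleting it when idle keeps the bill down.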
