Getting started with GPUs in a Cloud Environment

Anton Kovalsky

A graphics processing unit (GPU) is optimized for linear algebra computations, such as matrix multiplication. Terra now supports GPUs in Jupyter Notebook cloud environments. This feature is currently in beta and has some general limitations. This article outlines those limitations and breaks down cost estimates for the available configurations.

To learn more about GPUs on Terra, check out "Speed up your machine learning work with GPUs."

Background

Terra's interactive cloud environment images all extend a base Docker image that is built on Google's Deep Learning platform. This platform includes packages for the CUDA parallel processing platform, the TensorFlow machine learning platform, and the PyTorch machine learning framework. The base image also installs the NVIDIA drivers necessary for GPU support. Because all of Terra's interactive cloud environment Docker images extend Terra's base Docker image, all are intended to support GPUs.

Warning: There have been widely reported issues with the Deep Learning base image, where internal version conflicts arise involving CUDA and TensorFlow, which tends to update automatically to an unsupported version. For this reason, we want to point out that the GATK base image supports TensorFlow 2.4.2 and CUDA 11.0. Images that automatically install TensorFlow 2.5.0 and CUDA 11.2 are prone to compatibility issues.

terra-jupyter-base 1.0.+
terra-jupyter-bioconductor 2.0.+
terra-jupyter-gatk 2.0.+
terra-jupyter-hail 1.0.+
terra-jupyter-r 2.0.+
terra-jupyter-python 1.0.+
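
If you want to confirm which versions your notebook actually has, a check along the following lines may help. This is only a sketch: the helper names parse_version and matches_supported are hypothetical, the "supported" versions are simply the TensorFlow 2.4.2 and CUDA 11.0 pair quoted in the warning above, and the "cuda_version" key is assumed to be present in the build info (it can be absent on CPU-only builds, which the code treats as a mismatch).

```python
import re

# Versions quoted in the warning above for the GATK base image.
SUPPORTED = {"tensorflow": (2, 4, 2), "cuda": (11, 0)}


def parse_version(text):
    """Turn '2.4.2' or '2.5.0rc1' into a tuple of ints, stopping at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def matches_supported(tf_version, cuda_version):
    """Return True only if both versions equal the pair quoted in the warning."""
    tf_ok = parse_version(tf_version)[:3] == SUPPORTED["tensorflow"]
    cuda_ok = parse_version(cuda_version)[:2] == SUPPORTED["cuda"]
    return tf_ok and cuda_ok


if __name__ == "__main__":
    try:
        import tensorflow as tf
    except ImportError:
        print("TensorFlow is not installed in this environment.")
    else:
        # get_build_info() reports the CUDA version TensorFlow was built against.
        build = tf.sysconfig.get_build_info()
        cuda = str(build.get("cuda_version", ""))
        print(tf.__version__, cuda, matches_supported(tf.__version__, cuda))
```

A result of False does not necessarily mean the environment is broken. It only means the installed versions differ from the pair named in the warning.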

How to add GPUs to your cloud environment

To add GPUs to your cloud environment's compute configuration, navigate to a workspace where you have compute access, click on the "Cloud Environment" button at the top right of your screen, then click the "Enable GPUs" check box in the "Cloud compute profile" section:

 

[Animation: selecting the "Enable GPUs" check box in the Cloud Environment configuration]

If you already have an environment, the checkbox is unavailable, and you must first delete the environment manually. You can do this either by clicking the "delete" button at the bottom of the Cloud Environment widget or from the section of your profile that lists your cloud environments. The GPU checkbox then becomes available when you create a new environment. The same applies if you want to modify the GPU configuration of an existing environment. For instance, to increase the number of GPUs or change the GPU type, you'll need to delete the existing environment and re-create it with the desired configuration.

How to check that you've successfully enabled GPUs

You can check that the relevant libraries are installed and can see the GPU by running some of the code snippets below in a notebook:

PyTorch

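import torch

torch.cuda.is_available()        # True if PyTorch can see a CUDA-capable GPU
print(torch.version.cuda)        # CUDA version PyTorch was built against
torch.cuda.current_device()      # index of the currently selected GPU
torch.cuda.get_device_name(0)    # name of GPU 0; replace 0 with another device index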

TensorFlow

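import tensorflow as tf
from tensorflow.python.client import device_lib

tf.config.list_physical_devices('GPU')    # list of GPUs visible to TensorFlow
print(device_lib.list_local_devices())    # all local devices, including CPUs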
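
The checks above can also be combined into a single report. The following is a minimal sketch, not an official Terra utility: the function name gpu_report is hypothetical, and it assumes only the torch.cuda.is_available() and tf.config.list_physical_devices() calls already shown. A library that is not installed is reported as None rather than raising an error.

```python
def gpu_report():
    """Report whether each framework can see a GPU (None means the library is not installed)."""
    report = {"pytorch": None, "tensorflow": None}

    try:
        import torch
        report["pytorch"] = torch.cuda.is_available()
    except ImportError:
        pass  # PyTorch is not installed in this environment

    try:
        import tensorflow as tf
        report["tensorflow"] = len(tf.config.list_physical_devices("GPU")) > 0
    except ImportError:
        pass  # TensorFlow is not installed in this environment

    return report


print(gpu_report())
```

If a framework reports False in an environment where you enabled GPUs, the version conflicts described in the Background section are one possible cause.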

GPU Limitations

  • GPUs can be used on Terra with Jupyter Notebooks (e.g., for TensorFlow), but not with Galaxy or RStudio.
  • As with other interactive analysis compute resources in Terra, only the n1 family of machines is supported.
  • Terra does not support updating an existing machine's GPU configuration. If you need to modify your GPU-enabled machine, you'll need to delete and recreate the cloud environment.
  • You may experience a runtime creation failure in one of the following circumstances:
      • You select a Spark compute type (or a Hail image). Terra only supports GPU use with the standard VM.

[Screenshot: compute type selection]
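
As a rough illustration, the limitations above can be expressed as a pre-flight check on a planned configuration. This is a sketch under stated assumptions: the function gpu_config_problems and the string values it accepts ("Jupyter", "Standard VM", and so on) are hypothetical and are not part of any Terra API. Only the rules themselves come from the list above.

```python
def gpu_config_problems(tool, machine_type, compute_type):
    """Return the reasons a planned configuration conflicts with the GPU limitations."""
    problems = []

    # GPUs work with Jupyter Notebooks, but not with Galaxy or RStudio.
    if tool.lower() != "jupyter":
        problems.append("GPUs are only available with Jupyter, not " + tool)

    # Only the n1 family of machines is supported.
    if not machine_type.lower().startswith("n1-"):
        problems.append("only n1 machine types are supported, got " + machine_type)

    # GPUs require the standard VM, not a Spark compute type.
    if compute_type.lower() != "standard vm":
        problems.append("GPUs require the standard VM compute type, got " + compute_type)

    return problems


print(gpu_config_problems("Jupyter", "n1-standard-4", "Standard VM"))
print(gpu_config_problems("RStudio", "n1-standard-4", "Spark cluster"))
```

An empty list means the configuration does not conflict with any of the listed limitations. It does not guarantee that the environment will be created successfully.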

Cost estimates for available configurations

Each GPU type limits how many GPUs are available for a given number of CPUs and amount of memory. Below is a set of tables, one for each GPU type, with cost estimates for launching cloud environments in every available GPU configuration. You can read more about GPUs on Google Compute Engine and about GPU pricing in the Google Cloud Compute Engine documentation.

NVIDIA Tesla T4

# of CPUs Memory size (GB) # of GPUs Cost estimate (USD per hour)
1 3.75 GB 1 $0.41
1 3.75 GB 2 $0.76
1 3.75 GB 4 $1.46
2 7.5 GB 1 $0.45
2 7.5 GB 2 $0.80
2 7.5 GB 4 $1.50
2 13 GB 1 $0.48
2 13 GB 2 $0.83
2 13 GB 4 $1.53
4 15 GB 1 $0.55
4 15 GB 2 $0.90
4 15 GB 4 $1.60
4 26 GB 1 $0.60
4 26 GB 2 $0.95
4 26 GB 4 $1.65
8 7.2 GB 1 $0.64
8 7.2 GB 2 $0.99
8 7.2 GB 4 $1.69
8 30 GB 1 $0.74
8 30 GB 2 $1.09
8 30 GB 4 $1.79
8 52 GB 1 $0.83
8 52 GB 2 $1.18
8 52 GB 4 $1.88
16 14.4 GB 1 $0.93
16 14.4 GB 2 $1.28
16 14.4 GB 4 $1.98
16 60 GB 1 $1.12
16 60 GB 2 $1.47
16 60 GB 4 $2.17
16 104 GB 1 $1.31
16 104 GB 2 $1.66
16 104 GB 4 $2.36
32 28.8 GB 2 $1.84
32 28.8 GB 4 $2.54
32 120 GB 2 $2.23
32 120 GB 4 $2.93
32 208 GB 2 $2.60
32 208 GB 4 $3.30
64 57.6 GB 4 $3.68
64 240 GB 4 $4.45
64 416 GB 4 $5.20
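
Within the T4 table, the estimate for a fixed CPU and memory combination rises by the same amount for each additional GPU. The sketch below works this out for the 4 CPU, 15 GB rows. The rates are copied from the table above, while the names T4_RATES and per_gpu_increment are hypothetical and used only for illustration.

```python
from decimal import Decimal

# (CPUs, memory in GB, GPUs) -> USD per hour, copied from the T4 table above.
T4_RATES = {
    (4, 15, 1): Decimal("0.55"),
    (4, 15, 2): Decimal("0.90"),
    (4, 15, 4): Decimal("1.60"),
}


def per_gpu_increment(rates, cpus, memory_gb):
    """Average hourly cost added by each extra GPU for one CPU/memory combination."""
    points = sorted(
        (gpus, cost)
        for (c, m, gpus), cost in rates.items()
        if (c, m) == (cpus, memory_gb)
    )
    (gpus_lo, cost_lo), (gpus_hi, cost_hi) = points[0], points[-1]
    return (cost_hi - cost_lo) / (gpus_hi - gpus_lo)


print(per_gpu_increment(T4_RATES, 4, 15))  # prints 0.35
```

The same calculation can be repeated with rows from the other tables to compare how much each GPU type adds per hour.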

NVIDIA Tesla K80

# of CPUs Memory size (GB) # of GPUs Cost estimate (USD per hour)
1 3.75 GB 1 $0.51
1 3.75 GB 2 $0.96
1 3.75 GB 4 $1.41
1 3.75 GB 8 $1.86
2 7.5 GB 1 $0.55
2 7.5 GB 2 $1.00
2 7.5 GB 4 $1.46
2 7.5 GB 8 $1.90
2 13 GB 1 $0.58
2 13 GB 2 $1.03
2 13 GB 4 $1.48
2 13 GB 8 $1.93
4 15 GB 1 $0.65
4 15 GB 2 $1.10
4 15 GB 4 $1.55
4 15 GB 8 $2.00
4 26 GB 1 $0.70
4 26 GB 2 $1.15
4 26 GB 4 $1.60
4 26 GB 8 $2.05
8 7.2 GB 1 $0.74
8 7.2 GB 2 $1.19
8 7.2 GB 4 $1.64
8 7.2 GB 8 $2.09
8 30 GB 1 $0.84
8 30 GB 2 $1.29
8 30 GB 4 $1.74
8 30 GB 8 $2.19
8 52 GB 1 $0.93
8 52 GB 2 $1.38
8 52 GB 4 $1.83
8 52 GB 8 $2.28
16 14.4 GB 2 $1.48
16 14.4 GB 4 $1.93
16 14.4 GB 8 $2.38
16 60 GB 2 $1.67
16 60 GB 4 $2.12
16 60 GB 8 $2.57
16 104 GB 2 $1.86
16 104 GB 4 $2.31
16 104 GB 8 $2.76
32 28.8 GB 4 $2.49
32 28.8 GB 8 $2.94
32 120 GB 4 $2.88
32 120 GB 8 $3.33
32 208 GB 4 $3.25
32 208 GB 8 $3.70
64 57.6 GB 8 $4.08/3.63

NVIDIA Tesla P4

# of CPUs Memory size (GB) # of GPUs Cost estimate (USD per hour)
1 3.75 GB 1 $0.66
1 3.75 GB 2 $1.26
1 3.75 GB 4 $1.86
2 7.5 GB 1 $0.70
2 7.5 GB 2 $1.30
2 7.5 GB 4 $1.90
2 13 GB 1 $0.73
2 13 GB 2 $1.33
2 13 GB 4 $1.93
4 15 GB 1 $0.80
4 15 GB 2 $1.40
4 15 GB 4 $2.00
4 26 GB 1 $0.85
4 26 GB 2 $1.45
4 26 GB 4 $2.05
8 7.2 GB 1 $0.89
8 7.2 GB 2 $1.49
8 7.2 GB 4 $2.09
8 30 GB 1 $0.99
8 30 GB 2 $1.59
8 30 GB 4 $2.19
8 52 GB 1 $1.08
8 52 GB 2 $1.68
8 52 GB 4 $2.28
16 14.4 GB 1 $1.18
16 14.4 GB 2 $1.78
16 14.4 GB 4 $2.38
16 60 GB 1 $1.37
16 60 GB 2 $1.97
16 60 GB 4 $2.57
16 104 GB 1 $1.56
16 104 GB 2 $2.16
16 104 GB 4 $2.76
32 28.8 GB 2 $2.34
32 28.8 GB 4 $2.94
32 120 GB 2 $2.73
32 120 GB 4 $3.33
32 208 GB 2 $3.10
32 208 GB 4 $3.70
64 57.6 GB 4 $4.08
64 240 GB 4 $4.85
64 416 GB 4 $5.60

NVIDIA Tesla V100

# of CPUs Memory size (GB) # of GPUs Cost estimate (USD per hour)
1 3.75 GB 1 $2.54
1 3.75 GB 2 $5.02
1 3.75 GB 4 $9.98
1 3.75 GB 8 $19.90
2 7.5 GB 1 $2.58
2 7.5 GB 2 $5.06
2 7.5 GB 4 $10.02
2 7.5 GB 8 $19.94
2 13 GB 1 $2.61
2 13 GB 2 $5.09
2 13 GB 4 $10.05
2 13 GB 8 $19.97
4 15 GB 1 $2.68
4 15 GB 2 $5.16
4 15 GB 4 $10.12
4 15 GB 8 $20.04
4 26 GB 1 $2.73
4 26 GB 2 $5.21
4 26 GB 4 $10.17
4 26 GB 8 $20.09
8 7.2 GB 1 $2.77
8 7.2 GB 2 $5.25
8 7.2 GB 4 $10.21
8 7.2 GB 8 $20.13
8 30 GB 1 $2.87
8 30 GB 2 $5.35
8 30 GB 4 $10.31
8 30 GB 8 $20.23
8 52 GB 1 $2.96
8 52 GB 2 $5.44
8 52 GB 4 $10.40
8 52 GB 8 $20.32
16 14.4 GB 2 $5.54
16 14.4 GB 4 $10.50
16 14.4 GB 8 $20.42
16 60 GB 2 $5.73
16 60 GB 4 $10.69
16 60 GB 8 $20.61
16 104 GB 2 $5.92
16 104 GB 4 $10.88
16 104 GB 8 $20.80
32 28.8 GB 4 $11.06
32 28.8 GB 8 $20.98
32 120 GB 4 $11.45
32 120 GB 8 $21.37
32 208 GB 4 $11.82
32 208 GB 8 $21.74
64 57.6 GB 8 $22.12
64 240 GB 8 $22.89
64 416 GB 8 $23.64
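
Because the tables list hourly rates, the estimated cost of a working session is the rate multiplied by the number of hours the environment runs. The following sketch compares a six-hour session on two configurations with 8 CPUs and 30 GB of memory. The rates are taken from the tables above, the function name session_cost is hypothetical, and the result is only an estimate based on those rates, not a billing figure.

```python
from decimal import Decimal


def session_cost(rate_per_hour, hours):
    """Estimated cost in USD: hourly rate times hours, rounded to cents."""
    return (Decimal(rate_per_hour) * Decimal(hours)).quantize(Decimal("0.01"))


# 8 CPUs, 30 GB, 1 GPU: $0.74 per hour (T4) versus $2.87 per hour (V100).
print("T4:  ", session_cost("0.74", 6))   # 4.44
print("V100:", session_cost("2.87", 6))   # 17.22
```

Passing the rate as a string keeps the arithmetic exact to the cent, which avoids the small rounding errors that binary floating-point values can introduce.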
