Best practices for accessing external buckets, GCP VMs, and machine-learning tools

Anton Kovalsky

Learn best practices for accessing external GCP resources (using data stored in external buckets, or running Google Cloud virtual machines or machine-learning tools in a notebook) from within Terra. Working in Terra lets you take advantage of Terra's built-in FedRAMP and FISMA-moderate security perimeter.

If you are looking for step-by-step directions, see How to access external Google Cloud resources.

If you are running automation or a service outside of Terra that needs to call Terra APIs, see When and how to use a service account in Terra.

Overview - Accessing external resources

Terra is designed to remove some of the barriers to moving to the cloud, so you can focus on your work instead of wrangling Google Cloud configuration. Behind the scenes, Terra interfaces with Google on your behalf using a special kind of Google account, called a service account, to perform tasks like accessing data in external Google buckets or requesting other Google Cloud resources (such as the VMs that power Cloud Environments and workflows). This lets you work with data stored in Google buckets and run tools on Google VMs right in your workspace without having to worry about many of the technical details.

Service account format

PROXY_<long-number>@firecloud.org

Every Terra user has one or more of these "pet" service accounts (one for each billing project) for interfacing with the cloud outside Terra.

When Terra uses a service account (examples)

  • Accessing a non-Terra Google Cloud Storage (GCS) bucket, BigQuery (BQ) dataset, Google Container Registry (GCR) Docker image, etc.
  • Running workflows or interactive analyses on virtual machines (VMs)

Service accounts protect your identity

Terra assumes the identity of the service account, rather than your user ID credentials, to call Google application programming interfaces (APIs). Using an anonymous service account is required for data and workspace security, but it means the backend contains many details that are not human-friendly (like the PROXY service accounts above).

Best practices for individuals accessing external resources

In theory, you could use these predefined service/proxy groups if you need to interface with Google resources directly. We don't recommend that, because the long string of numbers is hard for people to manage. Imagine you're a resource owner trying to identify who has access to the data in your external bucket: it's tough when the list is made up of random strings of numbers and letters like PROXY_18340hruhgouhb1foy34g<long-number>@firecloud.org.
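To illustrate the resource owner's problem, here is a minimal Python sketch that splits a bucket's member list into proxy-style addresses and human-readable ones. The pattern is an assumption based only on the PROXY_<long-number>@firecloud.org format shown above, and both addresses in the sample list are made up for illustration.

```python
import re

# Assumption: proxy addresses follow the PROXY_<...>@firecloud.org shape
# shown in this article. This pattern is illustrative, not an official rule.
PROXY_PATTERN = re.compile(r"^PROXY_[^@\s]+@firecloud\.org$")


def split_members(members):
    """Separate proxy-style addresses from human-readable ones."""
    proxies, readable = [], []
    for email in members:
        if PROXY_PATTERN.match(email):
            proxies.append(email)
        else:
            readable.append(email)
    return proxies, readable


# Hypothetical member list a bucket owner might be reviewing.
members = [
    "PROXY_123456789@firecloud.org",
    "my_lab_at_someplace_org@firecloud.org",
]
proxies, readable = split_members(members)
print("Hard to identify:", proxies)
print("Easy to identify:", readable)
```

Only the second entry tells the owner who actually holds access, which is the motivation for the group-based approach described next.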

Use a human-friendly group to represent a user (including their service account)

An alternative that maintains Terra's built-in data and workspace security is to create a Terra user group with a human-friendly name for interfacing with Google Cloud (e.g., granting access to external buckets). Terra-managed groups can include you and any other users who need access to the same workspace or billing project. Groups are designed to streamline resource management by making it easy to share with one entity, rather than several individuals.

Since Terra-managed groups include each user's non-human-friendly proxy by default, they can be used as a proxy to the service account. This is an easy way to put a human-friendly label on this back-end tool.

Always use Terra groups for accessing external resources, even for one user! 

To learn more about Terra groups, see Managing access to shared resources.

Example: Terra group for a single user (user ID: j_doe@someplace.org)

  1. Create a Terra Group: j_doe_at_someplace_org
  2. Don't add anyone else to this group
  3. The group includes your Terra ID, and your service account proxy
  4. Make grants to j_doe_at_someplace_org@firecloud.org
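The naming pattern in the example above can be sketched in a few lines of Python. The function names here are hypothetical, and the substitution rule (replace "@" with "_at_" and "." with "_") is inferred from this single example rather than taken from a documented Terra requirement.

```python
def terra_group_name(user_id: str) -> str:
    """Turn a user ID into a group name, mirroring the example above."""
    # Assumption: "@" becomes "_at_" and "." becomes "_", as in the example.
    return user_id.replace("@", "_at_").replace(".", "_")


def terra_group_email(group_name: str) -> str:
    """Address used when making grants to a Terra group."""
    return f"{group_name}@firecloud.org"


name = terra_group_name("j_doe@someplace.org")
email = terra_group_email(name)
print(name)
print(email)
```

The resulting address is the one a resource owner would see in their access list, in place of the user's PROXY service account.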

Best practices for groups to access external resources

Managed groups are the best way to share resources (workspaces and billing as well as external resources) within a group of individuals, such as everyone in a lab. Sharing with a managed group, instead of a long list of individuals, saves time and avoids errors. The groups can be updated in Terra when people are added to - or leave - the lab or project.

Recommendations

Create a Terra user group with a human-friendly name that includes all team members and collaborators who need access to the external resource. Use it for interfacing with Google Cloud (e.g., granting access to external buckets) as well as managing access to shared team resources.

Example: Terra group for a lab (user ID: my_lab@someplace.org)

  1. Set up a Terra Group for all collaborators: my_lab_at_someplace_org
  2. Include the Terra user ID of everyone in the lab
  3. Make grants to my_lab_at_someplace_org@firecloud.org
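As one way to carry out the last step for an external bucket, the owner of that bucket could add an IAM binding for the group address. The sketch below only assembles a gcloud command line and does not run it. The bucket name is hypothetical, the helper function is not part of Terra, and the read-only role is an assumption; choose whatever role fits the access you intend to grant.

```python
import shlex
from typing import List


def grant_command(bucket: str, group_email: str,
                  role: str = "roles/storage.objectViewer") -> List[str]:
    """Build a gcloud command that grants a group a role on a bucket."""
    return [
        "gcloud", "storage", "buckets", "add-iam-policy-binding",
        f"gs://{bucket}",
        # The "group:" prefix tells IAM the member is a group address.
        f"--member=group:{group_email}",
        f"--role={role}",
    ]


# Hypothetical bucket name; the group address comes from the example above.
cmd = grant_command("my-external-bucket",
                    "my_lab_at_someplace_org@firecloud.org")
print(shlex.join(cmd))
```

Because the grant targets the group rather than individual accounts, adding or removing lab members in Terra changes who has access without touching the bucket's policy again.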
