Accessing GCP features that are not in the Terra UI

Allie Hajian

Do you want to perform Google Cloud Platform (GCP) operations that are not currently available in the Terra UI? You can do many of these things in Terra already! This article explains how to use Terra notebooks and workflows to access additional GCP features, including how to:
   - Write to BigQuery
   - Interact with Cloud Storage buckets other than the workspace bucket
   - Run dsub jobs
   - Run Cloud Dataflow jobs
   - Run Cloud ML Engine jobs

Getting Started with advanced GCP features

The Terra platform is designed to remove some of the barriers of moving to the cloud: Terra interfaces directly with Google so you don't have to. However, there are many GCP features that are not included in the platform. Some are on the horizon, others are niche capabilities that may never be integrated with the Terra UI.

Just because they aren't in the UI doesn't mean you cannot use them, however. You can access these advanced features through a GCP project, which you will set up on the GCP console and connect to Terra with a human-friendly personal Terra group following the steps below. Once you follow these three setup steps, you'll be able to use the GCP project to leverage advanced GCP features by running notebooks and workflows on Terra.

Accessing_advanced-GCP-features_Steps-diagram.png

Before you start: Your Terra user ID must have a Google Cloud Billing account

  In order to set up a GCP-native project on GCP console, you need to be an owner or a user on a GCP Billing account linked to Terra. If what you see on the console does not look like the screenshots, it is most likely because you do not have the right permissions on a GCP Billing account.

To learn how to set up GCP billing and access $300 in free credits from Google, see this article.

1. Set up a GCP-native project (on GCP console)

1.1. From the main menu (three horizontal lines at the top left of the GCP console page) go to the IAM & Admin > Manage resources page. 
Advanced-GCP-features_Set-up-project_Step-1.png

1.2. On the "Select organization" drop-down at the top of the page, select the organization in which you want to create a project. Free trial users can skip this step, as this list does not appear.

1.3. Select Create project.
Advanced-GCP-features_Set-up-project_Step2.png

1.4. In the New Project window, enter a project name and select a Billing account. This is the Cloud Billing account that will cover all GCP costs incurred in your Google project.

Advanced-GCP-features_Set-up-project_Step-3.png
  • If you don't see a Billing account in the drop-down, you can set one up following these instructions.

  • Note that a project name can contain only letters, numbers, single quotes, hyphens, spaces, or exclamation points, and must be between 4 and 30 characters.
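
If you want to check a candidate project name before typing it into the console, the naming rule above can be sketched as a small Python check. This is only an illustration: the function name is made up for this example, and it assumes "letters" means unaccented ASCII letters, which the rule above does not spell out.

```python
import re

# Encodes the rule quoted above: letters, numbers, single quotes, hyphens,
# spaces, or exclamation points, and a length of 4 to 30 characters.
# Assumption: "letters" is read as ASCII letters only.
PROJECT_NAME_RE = re.compile(r"[A-Za-z0-9'\- !]{4,30}")


def is_valid_project_name(name: str) -> bool:
    """Return True if `name` satisfies the project-name rule (hypothetical helper)."""
    return PROJECT_NAME_RE.fullmatch(name) is not None


print(is_valid_project_name("My Terra project!"))  # True
print(is_valid_project_name("abc"))                # False: shorter than 4 characters
print(is_valid_project_name("bad_name_here"))      # False: underscores are not allowed
```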

1.5. Enter the parent organization or folder in the Location box. That resource will be the hierarchical parent of the new project.

1.6. When you're finished entering new project details, click Create.

2. Create a human-friendly personal Terra group in three steps

Why use a Terra group for external access?

Each Terra user has a pre-built "Proxy" Group for accessing resources that exist outside of Terra.

However, your proxy group is not very human-friendly. If you're looking at a list of users with access to an external GCS bucket, seeing a grant to PROXY_11564882405514439@firecloud.org is not helpful unless you happen to have a way to figure out which user is associated with that proxy group.

Instead, you can create a Terra group (with a sensible name) as an alias for your proxy. If your registered Terra account is j_doe@someplace.org, create a Terra group named j_doe_at_someplace_org. Don't add anyone else to this group. You can then make grants to j_doe_at_someplace_org@firecloud.org. This group contains one member, namely the proxy group for j_doe@someplace.org, which is much easier for a human to reason about.
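
The naming convention above can be sketched in a few lines of Python. The helper names are hypothetical, and the character substitution assumes that Terra group names accept only letters, numbers, underscores, and dashes. Check the "Create a new group" dialog if your email address contains other characters.

```python
import re

FIRECLOUD_DOMAIN = "firecloud.org"  # domain used for Terra groups in the example above


def personal_group_name(email: str) -> str:
    """Turn a Terra login such as j_doe@someplace.org into j_doe_at_someplace_org."""
    local, _, domain = email.partition("@")
    if not local or not domain:
        raise ValueError(f"not an email address: {email!r}")
    raw = f"{local}_at_{domain}"
    # Assumption: anything other than letters, digits, "_" and "-" becomes "_".
    return re.sub(r"[^A-Za-z0-9_-]", "_", raw)


def personal_group_email(email: str) -> str:
    """Return the address you would use when granting access to the group."""
    return f"{personal_group_name(email)}@{FIRECLOUD_DOMAIN}"


print(personal_group_email("j_doe@someplace.org"))
# j_doe_at_someplace_org@firecloud.org
```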

2.1. Go to your Groups page (Your name > "Groups" from the main menu at top left of any page in Terra).
Create-Terra-Group_Step-1_Screen_shot.png

2.2. Click on the blue Create a new group button.
Create-a-new-group_Screen_shot.png

2.3. Enter your human-friendly user-ID (can be your Terra login - see screenshot below) and click the Create Group button.
Create-Terra-Group_Step-3_Scren_shot.png
Terra creates a mirrored Google group (your Terra ID plus your built-in proxy) for interfacing directly with GCP that you can use as well.  

You'll see the full name in your list of Groups (below). In the next step, you'll grant this group permission to access the GCP-native project you created in step 1:
Accessing-advanced-GCP-features_Personal-group_Screen_shot.png

3. Add your Terra group on the Google project

This step allows you to work in the UI (i.e. a Terra notebook), while Terra acts on your behalf (as your "proxy") behind the scenes in the project you just set up in GCP. 

You will give your personal Terra group "Editor" permission (for more information about GCP permissions, see this article).

Note that if your Terra group includes additional people, you will want to be careful what permissions you grant the group. This is because editors can turn on a large number of services, including ones that can be expensive!

3.1. Go to IAM > Manage Resources in your new GCP project and select Add Member.
Access-advanced-GCP-features_Add-member-to-project_Screen_shot.png

3.2. Add your human-friendly personal Terra group as a member in your project permissions.
Access-advanced-GCP-features_Add-personal-group_Screen_shot.png

3.3. Give the group Editor permission.
Access-advanced-GCP-features_Project-editor_Screen_shot.png
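
If you prefer the command line to the console, the same grant can likely be made with the gcloud CLI. The sketch below only assembles and prints the command, so you can review it before running it in a terminal where gcloud is installed and authenticated. The project ID and group address are placeholders, and the helper name is made up for this example. Note that gcloud expects the project ID, which can differ from the project name you entered in step 1.4.

```python
import shlex
from typing import List


def grant_editor_command(project_id: str, group_email: str) -> List[str]:
    """Build a gcloud command that gives a group the Editor role on a project."""
    return [
        "gcloud", "projects", "add-iam-policy-binding", project_id,
        f"--member=group:{group_email}",
        "--role=roles/editor",
    ]


# Placeholder values: substitute your own project ID and personal Terra group.
cmd = grant_editor_command("my-gcp-project", "j_doe_at_someplace_org@firecloud.org")
print(shlex.join(cmd))
```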

Once these three steps are complete, you'll be able to do many advanced GCP tasks. In many cases, Terra will interface with GCP on your behalf! Read on for details of how to do specific tasks. We will continue to add to this list. 

Step-by-step instructions and template notebooks

Below is a series of features users have asked about that are not (yet!) available in Terra. Expand each section for step-by-step instructions or a link to a notebook in the public workspace.

Create an external GCP bucket accessible by your Terra workspaces

To learn more about the benefits of using external buckets for storing shared data resources, see this article. 

1. Go to the GCP Storage console.

2. Select your GCP-native project from the dropdown and click Create bucket.
Advanced-GCP-features_Create-external-bucket_Step1.png

External GCP bucket configuration tips

In general, you can use the default values when setting up your external bucket.

For customization details, see the Google documentation.

When you are done, you will see your external bucket in the console!
Advanced-GCP-features_Create-external-bucket_Final.png
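
As an alternative to the console steps above, a bucket can also be created with the gsutil command-line tool. This is a minimal sketch, assuming gsutil is installed and authenticated wherever you run the printed command. The project ID, bucket name, and the "US" location are placeholders, not values from this article.

```python
import shlex
from typing import List


def create_bucket_command(project_id: str, bucket_name: str, location: str = "US") -> List[str]:
    """Build a `gsutil mb` (make bucket) command for the given project."""
    return ["gsutil", "mb", "-p", project_id, "-l", location, f"gs://{bucket_name}"]


# Placeholder values: bucket names must be globally unique across GCP.
cmd = create_bucket_command("my-gcp-project", "my-external-bucket")
print(shlex.join(cmd))
```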

Set an external GCP bucket to auto-delete

When you're testing code, you may generate a lot of data that you don't want to keep (or pay for). To avoid having to clean up at the end of the day, you can set your storage bucket to delete the contents every day with the following steps.

1. Go to the GCP Storage console.

2. Select the bucket you want to set to automatically delete data by clicking the bucket name.
Advanced-GCP-features_Autodelete-bucket_Step1.png

3. Select the Lifecycle tab.
Advanced-GCP-features_Autodelete-bucket_Step2.png

4. Choose Add a Rule.
Advanced-GCP-features_Autodelete-bucket_Step3.png

5. Follow the instructions to set up a custom rule.

If you set up a rule to delete contents after 1 day, for example, you will see this:
Advanced-GCP-features_Autodelete-bucket_Final.png
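
The same kind of rule can be expressed as a lifecycle configuration file and applied from the command line instead of the Lifecycle tab. The sketch below builds the JSON for a "delete objects older than N days" rule. The helper name is made up for this example, and the bucket name in the closing note is a placeholder.

```python
import json


def delete_after_days_rule(days: int) -> dict:
    """Build a Cloud Storage lifecycle config that deletes objects older than `days`."""
    if days < 0:
        raise ValueError("days must be zero or greater")
    return {
        "rule": [
            {
                "action": {"type": "Delete"},
                "condition": {"age": days},  # object age in days
            }
        ]
    }


config = delete_after_days_rule(1)
print(json.dumps(config, indent=2))
```

To apply it, save the printed JSON to a file such as lifecycle.json and run `gsutil lifecycle set lifecycle.json gs://my-external-bucket` in a terminal where gsutil is available.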

Interact with Cloud Storage buckets other than the workspace bucket (template notebook)

There are times when you may not want to keep shared data in a workspace bucket (particularly if you're sharing large numbers of large data files with a large group; see this article for why).

Why use external buckets?

To learn more about sharing large numbers of large data files with large groups, see this article.

Example notebook

For an end-to-end example of interacting with an external bucket, see this template notebook.

Create a BigQuery dataset (in GCP console)

1. Go to BigQuery in the GCP console and select the native GCP project you created above.

2. Select Create Dataset to the right of the project name.
Advanced-GCP-features_Create-BQ-dataset_Step1.png

3. In the dataset creation form, choose a unique dataset name and select the default table expiration.
In general, you would choose "Never". But if you are testing queries and saving the results as tables, you may generate a lot of tables that you don't want to keep (or pay for). To avoid having to clean up those tables at the end of the day, you can create a BigQuery dataset for test results that automatically deletes its tables after a set period of time.
Advanced-GCP-features_Create-BQ-dataset-with-autodelete_Step2.png

4. You will see your new BigQuery dataset in the Resources section on the far left.
Advanced-GCP-features_Create-BQ-autodelete-dataset_Step3.png
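
A dataset with a default table expiration can also be created with the bq command-line tool, which takes the expiration in seconds. The sketch below converts days to seconds and prints the command for review. The project ID and dataset name are placeholders, and the helper name is made up for this example.

```python
import shlex
from typing import List, Optional

SECONDS_PER_DAY = 24 * 60 * 60


def create_dataset_command(project_id: str, dataset: str,
                           expiration_days: Optional[int] = None) -> List[str]:
    """Build a `bq mk` command; omit expiration_days for tables that never expire."""
    cmd = ["bq", "mk", "--dataset"]
    if expiration_days is not None:
        # bq expects the default table lifetime in seconds.
        cmd.append(f"--default_table_expiration={expiration_days * SECONDS_PER_DAY}")
    cmd.append(f"{project_id}:{dataset}")
    return cmd


# Placeholder values: a scratch dataset whose tables expire after one day.
print(shlex.join(create_dataset_command("my-gcp-project", "test_results", expiration_days=1)))
```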

How to load data to BigQuery (template notebook)

Note that before you can load data to BigQuery, you must have (at least) WRITE access permission to an existing BQ dataset. If you have set up your own BigQuery dataset (above), you will automatically have those permissions. 

Example notebook

  See an example notebook in a public Terra workspace.
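
For a rough idea of what a load looks like outside the example notebook, the sketch below builds a bq command that loads a CSV file from a Cloud Storage bucket into a table and lets BigQuery infer the schema. It is not taken from the example notebook. The project, dataset, table, and file path are placeholders, and the helper name is made up for this example.

```python
import shlex
from typing import List


def load_csv_command(project_id: str, dataset: str, table: str, gcs_uri: str) -> List[str]:
    """Build a `bq load` command for a CSV file stored in Cloud Storage."""
    if not gcs_uri.startswith("gs://"):
        raise ValueError("gcs_uri must start with gs://")
    return [
        "bq", "load",
        "--source_format=CSV",
        "--autodetect",  # let BigQuery infer column names and types
        f"{project_id}:{dataset}.{table}",
        gcs_uri,
    ]


# Placeholder values: you need write access to the target dataset.
cmd = load_csv_command("my-gcp-project", "test_results", "samples",
                       "gs://my-external-bucket/samples.csv")
print(shlex.join(cmd))
```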

Other things you can do in a GCP Project

Additional resources

dsub
See this GCP tutorial on running dsub jobs in Python.

Cloud Dataflow

See this GCP quickstart on running Dataflow in Python.

Cloud ML
See GCP documentation on ML or TensorFlow.


Comments

  • Laura Egolf

    Can the GCP project created using these instructions be linked to a Terra billing project or workspace? My organization wants to create a GCP project for me so that they can have better control of billing on their end, but I don't see a way to link this GCP project to a Terra project/workspace.

  • Allie Hajian

    Laura Egolf, unfortunately you cannot create a Google project on GCP and add or link it to Terra Billing projects. See this article (the section on the relationship between GCP projects and Terra Billing projects).

    I completely understand your organization wanting to have control over billing, and it's possible! The key is ownership of the GCP Billing account linked to your Terra account. Here are the steps:

    1. Your organization can create a GCP Billing account 
    2. They then link it to Terra 
    3. Then within Terra, you would create a Terra Billing project funded by that GCP account that you and they would be able to see on GCP console.

    As the owners of the GCP billing, they would have access to all billing and spend information about that Terra Billing project (on GCP and in the Terra UI, as available). For more detail, see this article. Hope this helps!

  • William Grisaitis

    Allie Hajian what permissions do i need so that the drop-down for selecting a billing account shows up in the "create project" dialog on GCP? i am able to create [terra] workspaces in [terra] billing projects, but i see that i can't enable many GCP APIs... so it sounds like i need to create a new GCP project. but, when i follow the directions above and pursue "create project" (under IAM & Permissions > Manage Resources), i don't see that drop-down for selecting my lab's billing project. Someone else on my team (who is an owner on the billing project) does see the dropdown. so, do I need to be an owner? is there any other way?

    my goal - all i want to do is have a normal GCP project where i can enable GCP APIs and have billing linked to my lab's terra billing project. what's the simplest way to do that? can i just create a GCP project on my own (not via terra) and then link it to a terra billing project later? would there be any downside to doing this? (would data in my buckets not be accessible for running terra pipelines?)

    EDIT - i asked this question here, too: https://support.terra.bio/hc/en-us/community/posts/4409556648475-What-permissions-do-I-need-to-create-GCP-projects-on-GCP-not-on-terra-linked-to-a-Terra-billing-project-
