Maintaining control over sensitive data and keeping it private are significant concerns in genetic analysis. Research is frequently collaborative, so how do you keep data secure while still making it easy to share? Terra was designed with these competing requirements in mind. Read on to see how.
- Use case scenario
- Traditional approach to data security
- Assuring security with authorization domains
- A step by step guide to setting up and using authorization domains
Use case scenario
Conventionally, a researcher would need to be authorized to work with sensitive data (let’s say it’s private to your lab). On Terra, you would have a workspace in which you do your analysis and where any data you generate is stored. Let's assume that the derived data must still be kept under controlled access.
Traditional approach to data security
You know you are a member of your lab (indicated by the blue color in the diagram above), so you can access your lab’s data (blue again) in the workspace. You perform some analysis and generate a clone of the workspace containing your derived data, which is possible because you have this lab's authorization. If a new coworker asks you to share the workspace with them, you are responsible for checking that this new coworker is officially a part of your lab rather than an imposter! (Unlikely, but still.) Your cloned workspace has no inherent way of checking whether the recipient has the proper authorization. You’d have to keep track yourself of which of your fellow scientists do and do not have up-to-date authorization.
Assuring security with authorization domains
Enter Authorization Domains, which are like a badge you wear that gives you access to workspaces with the same badge. When you clone a workspace, that badge stays with the new copy. You no longer need to worry about accidentally sharing sensitive data because if you try to share the cloned workspace with a user who doesn’t have the right badge, that researcher won’t be able to enter.
Let’s revisit our example from before. With authorization domains enabled, you have your lab’s badge, and you’re working with a workspace that also has the lab’s badge. You do your analysis, generate some derived data, and create a clone. The copied workspace also has the lab badge. When you try to share the workspace with your new coworker, the authorization domain checks whether your coworker has the right badge or is an imposter (or, more likely, needs to go get a badge). Either way, the responsibility is on the authorization domain to keep track of access, not you.
A step by step guide to setting up and using authorization domains
(steps 1 and 2 shown above)
STEP 1: Set the workspace Authorization Domain
When creating a workspace, you can select one or more groups to set as the Authorization Domain. (If you don’t see your group in the list, you may need to create it. See Step 2.) An Authorization Domain can only be set when creating the workspace, and once set, it cannot be removed from the workspace. It will be copied over to any cloned version of the workspace to keep any derived data protected.
When multiple groups are set as the Authorization Domain, the system requires the user to be a member of all groups in order to access the workspace. This requirement reflects the strict guidelines that apply to third-party, dbGaP-registered datasets (TCGA and Target).
For example, say there is a workspace whose Authorization Domain contains the TCGA and Target groups. If a user is invited to the workspace, the system checks both the TCGA and the Target whitelists for the user's account before allowing access.
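The all-groups rule can be sketched in a few lines of Python. This is only an illustration of the logic described above, not Terra's actual implementation; the function name `can_access` and the user's group lists are hypothetical, and the sketch models only the Authorization Domain check (the workspace must still be shared with the user).

```python
def can_access(user_groups, authorization_domain):
    """Return True only if the user belongs to every group in the
    workspace's Authorization Domain (hypothetical helper)."""
    return set(authorization_domain) <= set(user_groups)


# Hypothetical workspace protected by two groups.
workspace_ad = {"TCGA", "Target"}

# Member of both groups (plus an unrelated one): access is allowed.
print(can_access({"TCGA", "Target", "my_lab"}, workspace_ad))

# Member of only one of the two groups: access is denied.
print(can_access({"TCGA"}, workspace_ad))

# A workspace with no Authorization Domain imposes no group requirement.
print(can_access({"TCGA"}, set()))
```

Membership in extra groups never hurts; only a missing group blocks entry.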
To import data from another workspace, the groups in the Authorization Domain of the source workspace (where the data is coming from) must be a subset of the groups in the Authorization Domain of the destination workspace (where the data is going to).
For example, if the destination workspace has the TCGA-dbGap-Authorized and Tiffs-Test-Group groups in its Authorization Domain, you can import data from workspaces whose Authorization Domain is set to TCGA-dbGap-Authorized only (rows 2, 3, and 5-10), Tiffs-Test-Group only (row 4), both groups (row 1), or no groups. If the source workspace had additional groups, you would not be able to import from it. In this example, Terra informs you that six workspaces are unavailable for this reason.
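The import rule is a subset test, which the following Python sketch illustrates. It assumes nothing about Terra's internals: `can_import` is a hypothetical helper, and `Some-Other-Group` is an invented group name used only to show a blocked import.

```python
def can_import(source_ad, destination_ad):
    """Return True if every group protecting the source workspace also
    protects the destination workspace (hypothetical helper)."""
    return set(source_ad) <= set(destination_ad)


# Destination workspace from the example above.
destination = {"TCGA-dbGap-Authorized", "Tiffs-Test-Group"}

# Allowed sources: one of the groups, both groups, or no groups at all.
print(can_import({"TCGA-dbGap-Authorized"}, destination))
print(can_import({"Tiffs-Test-Group"}, destination))
print(can_import({"TCGA-dbGap-Authorized", "Tiffs-Test-Group"}, destination))
print(can_import(set(), destination))

# Blocked source: it carries a group the destination does not have.
# "Some-Other-Group" is an invented name for illustration.
blocked_source = {"TCGA-dbGap-Authorized", "Some-Other-Group"}
print(can_import(blocked_source, destination))

# The groups that would have to be added to the destination.
print(blocked_source - destination)
```

The direction of the test matters: data may only move to a workspace that is at least as restricted as the one it came from, so imported data never ends up with fewer protections.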
STEP 2: Give users their badges
There are two ways to get your badge, depending on whether you use a third-party or user-defined group. The difference between third-party and user-defined is how membership to the group is managed.
In the case of third-party groups, external permissions are checked in order to give you access. Currently Terra supports two third-party groups, TCGA Controlled Access and Target. To gain access, you must link your Terra account to your eRA Commons or NIH account on your Profile page. Terra then checks for the user ID of the linked account in the dbGaP whitelist to complete the authorization.
User-defined groups are created and managed within Terra. Groups are simple to set up and are ideal when you want to share data with a fixed set of people (within your lab, for example). Your PI can create a user-defined group by going to the Groups page found in the menu under your username, following the prompts to create a group, e.g. “sample_group”, and adding each member of your lab to the group, thereby giving them the “sample_group” badge. The PI (or anyone to whom they give Owner access to the group) is then responsible for giving and revoking these badges.
STEP 3: Share the workspace
To complete the process, you can now share the workspace. You can share the workspace with the group you used in the Authorization Domain, or with an individual.
To share with a group, start typing the name into the Sharing dialog and choose from the autocomplete options:
If you share with an individual or a group that is not in the Authorization Domain, they will see the workspace greyed out in their workspace list. When they click it, Terra facilitates a request process that sends an email to all owners of the groups in the Authorization Domain. Once the user has the proper badge(s), they can enter the workspace to see the protected data.
You can also create new workspaces within existing authorization domains, so that users already in that authorization domain have the proper permission to enter the workspace as soon as it is shared with them: