Step-by-step instructions for accessing GTEx data for analysis in a Terra workspace, or for downloading it locally, once your dbGaP request has been approved. These instructions work for AnVIL and BioData Catalyst researchers working in Terra.
Overview
Requests for GTEx data can no longer be made through DUOS and must be submitted through dbGaP. Once your request has been approved, you can access the data from the DUOS Data Library (see instructions below).
Please note: If your request has been approved in dbGaP, you should now see a blue Export link in the View by Dataset tab under Export to Terra (right column) in the DUOS Data Library.
Step 1: Register in DUOS
1.1. Go to duos.org and click sign-up/sign in. You can sign in with either a Microsoft- or Google-backed account. Make sure to sign in with the same institutional email account that you used to request GTEx access through dbGaP.
1.2. Accept the Terms of Service.
Step 2: Complete Your User Profile
2.1. In the Researcher Console, select Your Profile under your name (top right).
2.2. On the profile page, add your full name and select your institution from the dropdown (start typing the name to narrow the list). If your institution is not yet in DUOS, please email DUOS support at support@duos.org and we will add it for you!
2.3. Link your NIH RAS Account. If you do not already have a RAS account, you can find instructions for obtaining one here. You’ll be taken to the external NIH page to sign in to your RAS account.
Please note: You do not need to obtain a Library Card to access GTEx data through DUOS.
Step 3: Access GTEx data
Please remember that you must already have an approved dbGaP request to access GTEx data in DUOS. You can either access the data by cloning the associated workspace (preferred method) or by exporting the snapshot to Terra.
Option 1 (preferred method): Access GTEx data through the AnVIL workspace
3.1. Go to the DUOS Data Library (https://duos.org/datalibrary) and filter/search for the dataset you’re looking for. Click on the dataset name that you want to access.
3.2. You will be taken to a page containing information about the dataset. Click on the link to the associated AnVIL workspace.
3.3. You will then be taken to the AnVIL workspace. Click the three dots in the upper-right corner of the workspace, then select "Clone" to clone the workspace.
For more detailed instructions on cloning a Terra workspace, please see How to Clone Your Workspace.
Option 2: Export the snapshot to Terra
3.4. If your request has been approved in dbGaP, you should now see a blue Export link in the View by Dataset tab under Export to Terra (right column) in the Data Library.
3.5. Clicking the Export link lets you export the snapshot to an existing Terra workspace or create a new one.
3.6. Note that if you're accessing controlled data, DUOS will automatically enable additional security on your new or existing workspace.
What to expect
DUOS will import the data snapshot to a new or existing workspace.
The export takes a few minutes. You’ll get a green popup (upper right) when the data is in your workspace.
Once you refresh your page, you’ll see the data tables containing all the snapshot data and metadata in the Data tab of your workspace. Note the security shield at the top right indicating additional security monitoring.
Download GTEx data to local machine
GTEx data, including controlled access data, can be downloaded from the Terra workspace GCP bucket using the CLI commands provided by Google (gsutil or gcloud storage).
Download caveats
- The bucket has requester pays enabled.
- You must be in the appropriate Authorization Domain to access these workspaces.
GTEx v11 details
- Workspace: AnVIL_GTEx_v11_hg38
- Bucket: fc-secure-1078149c-12c9-44ee-bdfe-1389fc74cfbf
GTEx v10 details
- Workspace: AnVIL_GTEx_v10_hg38
- Bucket: fc-secure-e0503432-75b9-4674-8e6d-2597dc529c4c
GTEx v8 details
- Workspace: AnVIL_GTEx_v8_hg38
- Bucket: fc-secure-ff8156a3-ddf3-42e4-9211-0fd89da62108
Step-by-step instructions
1. Install the gcloud CLI
For detailed instructions, see How to install gcloud on a local machine.
2. Authenticate with Google
Set up user credentials with the Google user identity you use when logging in to Terra (described in the article above).
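As a minimal sketch of this step, assuming the gcloud CLI from step 1 is already on your PATH: the helper function names and the account address below are placeholders introduced for illustration, not part of Terra or DUOS.

```shell
# Placeholder (assumption): replace with the Google identity you use to log in to Terra.
TERRA_ACCOUNT="you@example.org"

# Open a browser sign-in flow and store user credentials for the gcloud CLI.
terra_login() {
  gcloud auth login "$TERRA_ACCOUNT"
}

# List credentialed accounts and show which one is currently active.
terra_whoami() {
  gcloud auth list
}
```

Run terra_login once, then terra_whoami to confirm that the active account is the one you use in Terra.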
3. Select and download the desired files
- See How to move data to/from a Google bucket for detailed instructions.
- Because the bucket has requester pays enabled, you must provide a Google project to be billed (using the "--billing-project" option with gcloud storage, or the "-u" option with gsutil), as described in How to access Requester Pays data/resources in Terra.
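The listing and download step might look like the sketch below, which uses the GTEx v10 bucket from the details above. The billing project ID, the destination folder, and the helper function names are placeholders chosen for illustration; object paths inside the bucket are not documented here, so list the bucket first to find them.

```shell
# Placeholders (assumptions): your own Google billing project and a local folder.
BILLING_PROJECT="my-billing-project"
DEST_DIR="./gtex_download"

# GTEx v10 workspace bucket, as listed under "GTEx v10 details".
GTEX_BUCKET="fc-secure-e0503432-75b9-4674-8e6d-2597dc529c4c"

# Build the gs:// URL for an object path inside the bucket (a leading "/" is dropped).
gtex_url() {
  printf 'gs://%s/%s' "$GTEX_BUCKET" "${1#/}"
}

# List objects under a prefix; the request is billed to BILLING_PROJECT.
gtex_ls() {
  gcloud storage ls --billing-project="$BILLING_PROJECT" "$(gtex_url "$1")"
}

# Copy an object, or a prefix recursively, into DEST_DIR.
gtex_cp() {
  mkdir -p "$DEST_DIR"
  gcloud storage cp --recursive --billing-project="$BILLING_PROJECT" \
    "$(gtex_url "$1")" "$DEST_DIR/"
}
```

Calling gtex_ls "" lists the top level of the bucket, and gtex_cp with a path from that listing downloads it. With gsutil, the equivalent is to pass the billing project as a top-level option, for example gsutil -u my-billing-project ls followed by the gs:// URL. Egress and request charges go to the project you name.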