GWAS tutorial (Terra on Azure)

Allie Cliffe

Step-by-step instructions to run a GWAS workflow in Terra on Azure. See the GWAS using Regenie workspace. 

Step 1. Clone the template workspace

1.1. Click on the three-dot menu at the top-right of the GWAS using Regenie workspace and select Clone.

1.2. Fill in a Workspace name. For example, you can add your initials and the date so it’s easy to identify your copy.

Select your Terra Billing project and click the CLONE WORKSPACE button.

1.3. After a few minutes, the workspace is created and you are automatically redirected to your cloned workspace.

Step 2. Examine the workspace data table

2.1. In your cloned workspace, click on the DATA tab.

2.2. The workspace comes with a data table called callsets. Click on the name to examine the table.

2.3. This table contains one row and six columns of sample data.

What’s in the callsets table?

The table includes links to files in a Terra workspace’s Azure storage container. These files are the inputs for the GWAS-Regenie workflow.
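If you export the callsets table to a tab-separated (TSV) file, you can check which columns hold storage links without opening it by hand. The following is only a minimal sketch under stated assumptions: it assumes links appear as https:// URLs, and the column names and URLs in the sample are hypothetical placeholders, not the actual contents of the table.

```python
import csv
import io


def link_columns(tsv_text):
    """Return the names of columns whose value in every row is an https:// URL."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    rows = list(reader)
    if not rows:
        # Header only (or empty input): nothing to inspect.
        return []
    return [
        name
        for name in reader.fieldnames
        if all((row[name] or "").startswith("https://") for row in rows)
    ]


# Hypothetical one-row table; real column names and URLs will differ.
SAMPLE_TSV = (
    "callset_id\tgenotype_file\tphenotype_file\n"
    "test_cohort\t"
    "https://example.blob.core.windows.net/container/genotypes.bed\t"
    "https://example.blob.core.windows.net/container/phenotypes.txt\n"
)

print(link_columns(SAMPLE_TSV))
```

Columns that contain plain identifiers, such as the row ID, are left out of the result, so the list shows only the columns that point at files.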

Step 3. Run the GWAS-Regenie workflow

Workflows in Terra are orchestrated by the workflows application, Cromwell. In a new workspace, you need to launch Cromwell before you can run a workflow.

You only need to do this once, the first time you set up or run a workflow in the workspace.

3.1. Go to the WORKFLOWS tab.

3.2. Click the blue Launch Workflows App button.

The Workflows app may take 5 to 15 minutes to launch.

When workflows are ready

After a few minutes, you will see the workflows menu on the left-hand side and the card for the GWAS-Regenie workflow from the original workspace in the center under Workflows in this workspace.

3.3. Click the blue Configure button.

3.4. The data table should already be set to the callsets table. If it is not already selected, choose it from the dropdown menu (circled in the screenshot below). Check the box to select the only row in the table (the test_cohort callset).

3.5. The Inputs and Outputs have already been configured for you. You can click on these tabs to see what is configured for each workflow attribute, and what columns will be generated for the outputs.

Pre-configured inputs screenshot

The workflow will generate one new output file. A link to that file will be written back to the callsets table you examined earlier, in a new workflow_output column.

Pre-configured outputs screenshot


3.6. When you’re ready to run the workflow, click the blue SUBMIT button to open the Send submission popup. You can add a comment for this submission if you wish, then click SUBMIT again.

3.7. The first time you run a workflow in your workspace, Cromwell will take some time to launch. The workflow will submit automatically once Cromwell is up and running and you’ll be redirected to the Submission details page.

3.8. You can monitor your submission progress at any time by clicking the Submission history option on the left side of the Workflows page. Clicking the submission name (link next to arrow) will give you more information.

3.9. Once the submission is complete, you will see the status change to a green check mark and Success.

3.10. Click on the Submission name to see more Submission details.

Under Workflows Data (far right column in the screenshot above), you'll find direct links to the inputs, outputs, and log files in cloud storage.

Clicking on a link in the value column lets you download that file to your local machine.

Step 4. Find/explore results

4.1. Click on the DATA tab again to examine the callsets table once more.

4.2. Click to open the callsets table and scroll to the right to see the newly generated output file that has been added to the table in a new workflow_output column.

4.3. Clicking on the link in the workflow_output column shows where the file is stored in your workspace cloud storage and lets you download the output file to local storage.
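The storage location behind such a link can be broken into its parts. As a rough sketch, assuming the link is a standard Azure Blob Storage URL of the form https://<account>.blob.core.windows.net/<container>/<path>, the snippet below splits it into storage account, container, and blob path. The URL in the example is a made-up placeholder, not a path from the workspace.

```python
from urllib.parse import unquote, urlsplit


def split_blob_url(url):
    """Split an Azure Blob Storage URL into (account, container, blob_path).

    Any query string (for example an access token) is ignored.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    if not host.endswith(".blob.core.windows.net"):
        raise ValueError(f"not an Azure Blob Storage URL: {url}")
    account = host.split(".", 1)[0]
    # The first path segment is the container; the rest is the blob path.
    container, _, blob_path = parts.path.lstrip("/").partition("/")
    return account, container, unquote(blob_path)


# Placeholder URL for illustration only.
example = (
    "https://myaccount.blob.core.windows.net/"
    "my-container/submissions/output/results.txt?sv=token"
)
print(split_blob_url(example))
```

Knowing the container and blob path separately can help when you want to locate the same file from another tool instead of downloading it through the browser.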


