We think the best way to learn how to run workflows on Terra is to dive in and try it!
The Terra QuickStart workspace is a hands-on tutorial that guides you through the process. You'll follow the steps below to work through increasingly realistic exercises. Setting up and running all the exercises should take about 15 minutes in total.
Exercise 1: Run a pre-configured workflow from the Data tab
Here you'll run a short file format conversion (CRAM_to_BAM) workflow on data that is already in the workspace sample data table. In addition to giving you the satisfaction of running your first workflow, this exercise shows how the data table gets updated with workflow outputs (the QuickStart workflow is configured to write to the data table). You'll be running on a downsized sample CRAM, NA12868, in a public bucket.
Follow the step-by-step instructions in the dashboard to select pre-loaded sample data and run a pre-configured file format conversion workflow, Exercise_1_CRAM_to_BAM.
Monitor your submission
After submitting the workflow, you'll be redirected to the Job History tab, where you will see the status of Exercise_1_CRAM_to_BAM. How long the job waits in the queue depends on how many other jobs have been submitted at the same time. Once it starts running, the workflow should take only a few minutes to complete.
- Note: the page does not update automatically, so refresh it to see the current status.
After your job runs successfully, a green check will show up in Job History.
To learn more about troubleshooting and monitoring your workflow status, see this article.
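Refreshing the page by hand amounts to polling: ask for the status, stop once it is final, otherwise wait and ask again. The sketch below illustrates that loop in Python. It does not call Terra; `fetch_status` is a hypothetical stand-in for whatever returns the current status, and the status names in `TERMINAL_STATES` are assumptions that may not match the labels shown in Job History.

```python
import time
from typing import Callable

# Assumed names for final states; the labels in Job History may differ.
TERMINAL_STATES = {"Succeeded", "Failed", "Aborted"}


def wait_for_submission(fetch_status: Callable[[], str],
                        poll_seconds: float = 30.0,
                        max_polls: int = 120) -> str:
    """Call fetch_status() until it returns a final state or max_polls is reached.

    Returns the last status seen, which may be non-final if polling gave up.
    """
    status = "Unknown"
    for attempt in range(max_polls):
        status = fetch_status()
        if status in TERMINAL_STATES:
            return status
        # Wait before the next check, but not after the final attempt.
        if attempt < max_polls - 1:
            time.sleep(poll_seconds)
    return status


# Usage with a fake status source standing in for a real submission.
fake_statuses = iter(["Queued", "Running", "Running", "Succeeded"])
final_status = wait_for_submission(lambda: next(fake_statuses), poll_seconds=0)
print(final_status)
```

The `max_polls` cap keeps the loop from running forever if a job is stuck in the queue.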
Check outputs in the data table
After the workflow finishes, you'll notice that the data table now contains the workflow outputs.
Results and thought questions
How is the sample data table different from how it looked before you ran the WDL?
Where did the additional columns come from? (Hint: Compare the output attributes in the "Workflows" tab to the new columns in the workspace data table.)
What metadata do the new columns contain?
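To make the hint concrete, here is a toy Python model of how an output configured with a `this.<column>` attribute ends up as a column in the row the workflow ran on. It is only a sketch of the idea, not Terra's implementation. The workflow name `CramToBam`, the output names, the column names, and the file paths are all hypothetical.

```python
def write_outputs_to_row(row: dict, output_config: dict, workflow_outputs: dict) -> dict:
    """Return a copy of `row` with each output stored under its `this.<column>` target."""
    updated = dict(row)
    for output_name, target in output_config.items():
        if not target.startswith("this."):
            continue  # outputs without a `this.` target are not written to the table
        column = target[len("this."):]
        updated[column] = workflow_outputs[output_name]
    return updated


# Hypothetical row before the run: an ID plus one input file column.
row_before = {"sample_id": "NA12868", "cram": "gs://example-bucket/NA12868.cram"}

# Hypothetical output attributes, as they might be set in the workflow configuration.
output_config = {
    "CramToBam.output_bam": "this.bam",
    "CramToBam.output_bai": "this.bam_index",
}

# Hypothetical output file paths produced by the run.
workflow_outputs = {
    "CramToBam.output_bam": "gs://example-bucket/outputs/NA12868.bam",
    "CramToBam.output_bai": "gs://example-bucket/outputs/NA12868.bai",
}

row_after = write_outputs_to_row(row_before, output_config, workflow_outputs)
new_columns = sorted(set(row_after) - set(row_before))
print(new_columns)
```

In this model, the new columns are exactly the names that follow `this.` in the output attributes, and each cell holds the location of an output file rather than the file itself.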
Exercise 1b: Run a pre-configured workflow on your own data
After running your first completely pre-configured workflow on sample data, you'll move on to the next step: running the same workflow on your own data. In this exercise, you'll change the input metadata in the data table to point to a data file in a public bucket that serves as a proxy for your own data.
Go into the sample data table and edit the metadata in the second (template) row. First update `template_sample_id` to your own ID. Then replace `your_file.cram` with the complete path to this sample CRAM (in a public Google bucket):
Then select this data row, and run and monitor the Exercise_1_CRAM_to_BAM workflow as before.
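A common slip in this exercise is to leave one of the template values in place or to paste a partial path. The Python sketch below shows the kind of sanity check you could apply to the edited row before submitting. The function name and the example bucket path are hypothetical, and the rules (a `gs://` prefix and a `.cram` suffix) are assumptions based on the placeholder values named above.

```python
from typing import List


def check_template_row(sample_id: str, cram_path: str) -> List[str]:
    """Return a list of problems found in the edited row (empty if none)."""
    problems = []
    if not sample_id.strip() or sample_id == "template_sample_id":
        problems.append("sample ID is empty or still the template value")
    if not cram_path.startswith("gs://"):
        problems.append("CRAM path is not a complete gs:// path")
    if not cram_path.endswith(".cram"):
        problems.append("CRAM path does not end in .cram")
    return problems


# Hypothetical edited row; the bucket path is a placeholder, not the tutorial's file.
edited_ok = check_template_row("my_sample_01", "gs://example-bucket/path/to/sample.cram")

# The untouched template row fails the ID check and the gs:// check.
untouched = check_template_row("template_sample_id", "your_file.cram")

print(edited_ok)
print(len(untouched))
```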
Exercise 2: Configure and run an analysis on your own data from the Workflows tab
Going through the Data tab is one way to submit workflows in Terra, but it's not the only way. You can also run an analysis from the Workflows tab. In Exercise 2, you'll explore how to configure and run an analysis from the Workflows tab.
Follow the step-by-step instructions for configuring and submitting the Exercise_2_CRAM_to_BAM workflow in the QuickStart dashboard.
Along the way, you'll get a sense of how to configure all variables and attributes of a workflow using the built-in Terra interface.
Once you complete all three exercises, you'll be well on your way to running your own workflows on your own data.