Welcome to the Workflows Quickstart Tutorial, Part I. Learn the basics of how to launch and monitor workflows to analyze a single entity of genomic data in Terra.
There are four parts to the Workflows Quickstart. Each part is independent, with its own learning objectives and its own time and cost estimates. You should do the parts in order, but you don’t need to do them in one sitting.
Make your own copy of the Quickstart workspace
Run a pre-configured workflow on sample data
- Learning objectives: time and cost to complete
- The Quickstart workflows, explained
- Choose data and set up and run the workflow
- Your submission is complete! What to expect
- Where’s the data?
- Follow-up questions
First - Make your own copy of the Quickstart workspace
The Workflows-Quickstart featured workspace is “Read only”. For hands-on practice, you'll need to be able to run workflows and store data in your workspace bucket. Making your own copy of the workspace gives you that power. If you haven't already done so, make your own copy following the directions below.
Start by clicking on the circle with three dots at the upper right-hand corner and select "Clone" from the dropdown menu. Then follow the directions below to complete the form:
Step-by-step instructions + video tip
Once you're in your own copy of the workspace, you'll be ready to get hands-on to learn about setting up and running workflows!
Run a pre-configured workflow on sample data
What you will learn
How much will it cost? How long will it take?
The workflows in Parts 1 and 2 are identical: they convert genomic files from one format (CRAM) to another (BAM) for downstream analysis. They’ve been renamed to simplify the instructions. This workflow should complete in just a few minutes once it starts running. Don’t forget to refresh the Job History tab to monitor your submissions.
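For readers curious what a CRAM-to-BAM conversion looks like outside of Terra, here is a minimal sketch. It assumes the samtools command-line tool, which is one common way to do this conversion; the tutorial does not show the commands the Quickstart workflow actually runs, so this is an illustration rather than the workflow's implementation. The file names and the helper `cram_to_bam_command` are hypothetical. The sketch only builds and prints the command, so it runs without samtools installed.

```python
import shlex


def cram_to_bam_command(cram_path, reference_fasta, bam_path):
    """Build a samtools command line that converts a CRAM file to BAM.

    Assumption: samtools is the conversion tool. The Quickstart workflow's
    real commands are not shown in the tutorial.
    """
    return [
        "samtools", "view",
        "-b",                   # write the output in BAM format
        "-T", reference_fasta,  # reference FASTA used to decode the CRAM
        "-o", bam_path,         # where to write the BAM file
        cram_path,              # input CRAM file
    ]


# Hypothetical file names, for illustration only.
command = cram_to_bam_command("NA12878.cram", "reference.fasta", "NA12878.bam")
print(shlex.join(command))
```

To actually run the conversion on a machine with samtools installed, the same list could be passed to `subprocess.run(command, check=True)`.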
Start by going to the Workflows page and selecting the Part1_CRAM_to_BAM workflow by clicking on its name.
This will reveal the workflow configuration form, which is where you set up the workflow to run on your data. For the first part of the Quickstart, this form has mostly been filled out for you, so you will only be confirming what's there.
Set up the workflow to run: step-by-step instructions
In the workflow configuration form:
1. Confirm the data inputs - Make sure the "Run workflow(s) with inputs defined by the data table" radio button is checked
2. Step 1 - Select root entity type = "sample"
3. Step 2 - Click the "Select Data" button
In the Select Data form:
4. Choose data to process - Select the "Choose specific rows to process" radio button
5. Select the "NA12878" sample (first row)
6. Click "OK" to finalize your selection
When you return to the workflow configuration form, click the blue "Run Analysis" button to submit your workflow.
When it's submitted, you'll be directed to the Job History page to monitor your submission. To see job status updates, refresh the page.
Your submission is complete! What to expect
When your job completes successfully, you'll see a green check in the Status column. This should only take a couple of minutes once the job starts running. Your Job History will look like this:
Once you see those green checks, go back to the Data tab.
Your data table will include a new column with links to the generated output data in the workspace bucket. It will look like this:
Where’s the (generated) data?
All output files generated when you run a workflow are stored in the workspace bucket by default. You can check this by clicking on the “File” icon (far left column) in the Data tab. Note that this workflow is pre-configured to write to the data table. When the workflow completes, the data table contains links to the generated files, which are located in the workspace bucket.
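If you want to check which bucket a data table link points to, the sketch below splits a link into its bucket name and file path. It assumes the links are Google Cloud Storage URIs of the form `gs://bucket/path`; the bucket name, the path, and the helper `split_gs_uri` are hypothetical and are not taken from the Quickstart workspace.

```python
from urllib.parse import urlparse


def split_gs_uri(uri):
    """Split a gs:// URI into (bucket_name, object_path)."""
    parsed = urlparse(uri)
    if parsed.scheme != "gs" or not parsed.netloc:
        raise ValueError(f"not a gs:// URI: {uri!r}")
    # netloc holds the bucket; path holds the object name with a leading "/"
    return parsed.netloc, parsed.path.lstrip("/")


# Hypothetical values, for illustration only.
workspace_bucket = "example-workspace-bucket"
link = "gs://example-workspace-bucket/submissions/example-run/NA12878.bam"

bucket, object_path = split_gs_uri(link)
print(bucket)
print(object_path)
print(bucket == workspace_bucket)  # True when the link points into the workspace bucket
```

Comparing the bucket name from a link with your own workspace bucket name is a quick way to confirm where a generated file lives.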
Follow-up (thought) questions
- How is the data table different after running the workflow?
- Where did the additional columns come from? (Hint: select the workflow card in the Workflows tab and compare the “Outputs” attributes to the new columns in the sample table.)
- What do the new columns contain?
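One way to work through these questions is to compare the table's column names before and after the run with the workflow's configured outputs. The sketch below does that with made-up values. The column names, the output names, and the `this.` prefix on output attributes are all assumptions for illustration, so substitute what you actually see in your workspace.

```python
def added_columns(before, after):
    """Return the columns present in `after` but not in `before`, in order."""
    return [name for name in after if name not in before]


def output_attribute_names(outputs):
    """Return the table column each configured output writes to.

    Assumption: an output written to the data table is configured as
    "this.<column_name>".
    """
    return [
        value.split(".", 1)[1]
        for value in outputs.values()
        if value.startswith("this.")
    ]


# Hypothetical column and output names, for illustration only.
columns_before = ["sample_id", "cram", "crai"]
columns_after = ["sample_id", "cram", "crai", "bam", "bai"]
outputs = {
    "example_workflow.output_bam": "this.bam",
    "example_workflow.output_bai": "this.bai",
}

new_columns = added_columns(columns_before, columns_after)
print(new_columns)
print(new_columns == output_attribute_names(outputs))
```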