Workflows Quickstart Part 3 - Run downstream analysis

Allie Hajian

Welcome to the Workflows Quickstart Tutorial, Part 3. Learn how to run a downstream analysis on the output data you generated in Part 2 of the Workflows Quickstart. 

Using workflow output data in downstream analysis

Learning objectives: time and cost to complete

In Part 3, you'll learn how data tables can help streamline and scale your analysis.

How data tables help scale your analysis
Using a sample set as input makes it quicker to set up the workflow to run on a subset of samples in the sample table, with less possibility of human error. You'll learn to set up a workflow to run on samples from a sample_set table in the configuration form.

Streamlining your analysis
Using the data table makes it easy for a workflow to use the outputs of a previous workflow analysis as inputs. Since this can be scripted, it helps automate running back-to-back workflows. You'll learn how to configure back-to-back workflows in the configuration form.

How much will it cost? How long will it take? 
The exercise should take no more than fifteen minutes (unless your job waits in the queue for a long time) and should cost a few cents.

You'll build on the first two exercises by configuring a workflow to use the outputs from Part_2_CRAM_to_BAM as inputs to the Part_3_BAM_to_unmapped_BAM workflow. You will run on the two samples as a group, using the set Terra created in Part 2.

About the workflow
The workflow in Part 3 takes the BAM file output from Part 2 and converts it to an unmapped BAM file (uBAM). 

HINT: Right click to open the tutorial demo in a new tab

Setting up the workflow: Overview

Your goal is to run the Part 3 workflow on the output data in the set you created in Part 2.

Start by selecting the Part3_BAM_to_unmappedBAM workflow. You'll be directed to the configuration form (see screenshot below). The parts you will need to complete are numbered. See if you can complete them on your own. Open the sections below for hints. 

Workflows-QuickStart-Part3_Config-form_Screen_shot.png

Step 1. Choose the root entity type

HINT: The root entity is the table where the input data files are referenced.

  • Workflows-QuickStart_Part3-Select-root-entity-type-Answer_Screen_shot.png

    The root entity type is "sample" because the outputs from the previous workflow were written to the sample table, where you will find them alongside the primary input data.

Step 2. Select Data

For this part, you'll run on the output of the two samples in the set from the Quickstart Part 2. See if you can set it up on your own!

  • Workflows-QuickStart_Part3_Select-data-answer_Screen_shot.png
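
To make the selection concrete, here is a minimal Python sketch of how choosing a set expands into one workflow per member sample. It is only an illustration: the in-memory tables, the set name `part2_set`, the file paths, and the helper `expand_set` are all hypothetical stand-ins, not Terra's API or the names in your workspace.

```python
# Hypothetical in-memory stand-ins for the workspace data tables.
sample_table = {
    "sample_1": {"cram": "gs://example-bucket/sample_1.cram"},
    "sample_2": {"cram": "gs://example-bucket/sample_2.cram"},
    "sample_3": {"cram": "gs://example-bucket/sample_3.cram"},
}
# A sample_set row only lists which samples belong to the set.
sample_set_table = {
    "part2_set": ["sample_1", "sample_2"],
}


def expand_set(set_name, set_table, entity_table):
    """Return the rows of the root entity table that belong to one set."""
    members = set_table[set_name]
    missing = [m for m in members if m not in entity_table]
    if missing:
        raise KeyError(f"set {set_name!r} references unknown samples: {missing}")
    return {m: entity_table[m] for m in members}


selected = expand_set("part2_set", sample_set_table, sample_table)
for sample_id in selected:
    # One workflow per member of the set, none for samples outside it.
    print(f"one workflow would be launched for {sample_id}")
```

Because the set names its members once, you pick a single set instead of ticking samples one by one, which is where the reduced risk of human error comes from.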

Step 3. Configure Input data

Notice you can choose to show only the required variables in the Inputs tab to simplify things.

  • 3.1. In the input_bam attribute field, start typing "this." and then select the output name you set up in Part 2 from the options in the dropdown.

    Workflows-QuickStart_Part3_Configure-inputs_Screen_shot.png

    Notice that the dropdown includes all columns in the "sample" data table, including those from Part 1! Be careful to select the name you used as output in Part 2.

    3.2. Save your Inputs attributes by clicking on the blue Save button!
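
The following Python sketch illustrates the idea behind a "this." attribute: "this" stands for one row of the root entity table, and the name after the dot picks a column in that row. The column names (`part1_bam`, `part2_bam`), the file paths, and both helper functions are hypothetical, chosen only for illustration; they are not part of Terra.

```python
# Hypothetical rows of the "sample" table after Parts 1 and 2.
rows = {
    "sample_1": {
        "cram": "gs://example-bucket/sample_1.cram",
        "part1_bam": "gs://example-bucket/part1/sample_1.bam",
        "part2_bam": "gs://example-bucket/part2/sample_1.bam",
    },
    "sample_2": {
        "cram": "gs://example-bucket/sample_2.cram",
        "part2_bam": "gs://example-bucket/part2/sample_2.bam",
    },
}


def available_attributes(table):
    """List every column in the table as a `this.<column>` suggestion."""
    columns = set()
    for row in table.values():
        columns.update(row)
    return sorted(f"this.{name}" for name in columns)


def resolve(expression, row):
    """Look up a `this.<column>` expression in a single row."""
    prefix = "this."
    if not expression.startswith(prefix):
        raise ValueError(f"expected an expression starting with {prefix!r}")
    column = expression[len(prefix):]
    if column not in row:
        raise KeyError(f"row has no column named {column!r}")
    return row[column]


print(available_attributes(rows))
print(resolve("this.part2_bam", rows["sample_2"]))
```

The suggestion list mixes columns from every earlier run, which is why picking the Part 2 name deliberately matters.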

Step 4. Configure the outputs and run the workflow

  • 4.1. Go to the Outputs tab of the configuration form, where you'll fill in the attribute for the output_bams variable.

    4.2. To write to the data table, start by typing "this." and then add a name for this attribute. The workflow will add a column with that name to the sample table and write the generated data there.

    Workflows-QuickStart_Part3_Configure-outputs_Screen_shot.png

    4.3. Save the Outputs.

    4.4. Click the blue Run Analysis button to submit your workflows.

    You will see the following popup:

    Workflows-QuickStart_Part3_Confirm-launch_Screen_shot.png

     

Once your job is running, you can sit back and wait for the results!

What to expect - successful submissions

Congratulations! You configured the inputs correctly and the workflow succeeded. You should see the green “succeeded” icon in the Job History.

Workflows-QuickStart_Part3_Succeeded-run_Screen_shot.png

If you go to the Data tab and expand the "sample" table, you will see the outputs under a new column (whatever you named it - uBAM in the example below).

Workflows-QuickStart_Part3_Outputs-in-table_Screen_shot.png

If you click on the "3 items" link, you'll notice that there are three output files for each sample, corresponding to how the workflow processes the data (by separate shards):

Workflows-QuickStart_Part3_Outputs-closeup_Screen_shot.png
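
As a rough model of what you are seeing, the Python sketch below writes a list of three files per sample into a new column and summarizes each cell the way the table does. The column name `uBAM` comes from the example above; the shard file names, the `part2_bam` column, and the helper functions are hypothetical and only stand in for what the workflow and the data table do.

```python
# Hypothetical "sample" table holding the Part 2 outputs.
sample_table = {
    "sample_1": {"part2_bam": "gs://example-bucket/part2/sample_1.bam"},
    "sample_2": {"part2_bam": "gs://example-bucket/part2/sample_2.bam"},
}


def shard_outputs(sample_id, shard_count=3):
    """Invent one output path per shard (file names are illustrative only)."""
    return [
        f"gs://example-bucket/part3/{sample_id}.shard_{i}.unmapped.bam"
        for i in range(shard_count)
    ]


def write_outputs(table, column, outputs_by_sample):
    """Store each sample's outputs under `column`, creating it if needed."""
    for sample_id, files in outputs_by_sample.items():
        table[sample_id][column] = list(files)


def cell_label(value):
    """Summarize a cell: lists show an item count, other values show as text."""
    if isinstance(value, list):
        return f"{len(value)} items"
    return str(value)


outputs = {sample_id: shard_outputs(sample_id) for sample_id in sample_table}
write_outputs(sample_table, "uBAM", outputs)
for sample_id, row in sample_table.items():
    print(sample_id, cell_label(row["uBAM"]))
```

Note that the earlier column is left in place, so the primary inputs and every generation of outputs stay side by side in the same row.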

Workflow didn't succeed? Try these troubleshooting tips

If your Job History looks like the screenshot below, don’t despair! Especially if your submission failed immediately, it’s likely the error is a mistyped input attribute in the workflow configuration form. You can get further information by clicking the Submission (arrow).
Workflows-QuickStart_Part3_Failed-workflow-View-submission_Screen_shot.png

This will lead you to a more detailed page (below). If you hover over the link in the Messages column, you'll get information that can help with troubleshooting. In the case below, one of the submissions failed because it didn't find the input file. I was using the output name from Part 1, where I only ran the first sample.

Workflows-QuickStart_Part3_Failed-workflow-message_Screen_shot.png

Troubleshooting tips and tricks

  1. Check attribute names carefully - if you start typing “this.” in the inputs form, you’ll get a dropdown list of available data files. Make sure that the file type (BAM) matches the expected input.
  2. Check the log files by selecting the submission details (in the box outlined orange in the screenshot above).
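
The first tip can be pictured as a simple pre-launch check, sketched below in Python: for every selected sample, confirm that the chosen column has a value and that the file looks like a BAM. This is an illustration of the reasoning, not a Terra feature; the column names, file paths, and the `preflight` helper are hypothetical. The sample data mirrors the failure described above, where the Part 1 output exists for the first sample only.

```python
# Hypothetical rows: the Part 1 output exists only for the first sample.
rows = {
    "sample_1": {
        "part1_bam": "gs://example-bucket/part1/sample_1.bam",
        "part2_bam": "gs://example-bucket/part2/sample_1.bam",
    },
    "sample_2": {
        "part2_bam": "gs://example-bucket/part2/sample_2.bam",
    },
}


def preflight(table, column, extension=".bam"):
    """Return {sample_id: problem} for rows that would fail on this input."""
    problems = {}
    for sample_id, row in table.items():
        value = row.get(column)
        if value is None:
            problems[sample_id] = f"no value in column {column!r}"
        elif not str(value).endswith(extension):
            problems[sample_id] = f"{value!r} does not end in {extension!r}"
    return problems


print(preflight(rows, "part1_bam"))  # the wrong column: one sample has no file
print(preflight(rows, "part2_bam"))  # the intended column: no problems found
```

An empty result only means the attribute is present and has the expected extension; the log files remain the place to look for failures inside the workflow itself.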

For more tips, see Troubleshooting Workflows: Tips and Tricks.

Congratulations! You've completed Part 3 of the Workflows Quickstart!
