ExomeGermlineSingleSample fails

October 18, 2021 13:46
13 comments

I'm new to Terra.bio, but I've converted my fastq.gz files to uBAM using the Sequence-Format-Conversion workflow, and now want to put that uBAM into the ExomeGermlineSingleSample workflow.

I've imported the text file into the read_group table for my single sample with the path to my uBAM.
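
For reference, a load file of this kind can be sketched as follows. This is only a minimal sketch: it assumes Terra's TSV load-file convention, in which the first column header is `entity:<table>_id`, and it uses the read_group column names mentioned later in this thread. The bucket path `gs://example-bucket/...` and the helper name `build_read_group_tsv` are hypothetical.

```python
import csv
import io


def build_read_group_tsv(rows):
    """Return TSV text for a read_group table.

    Each row is a (read_group_id, ubam_path, sample) tuple.
    """
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    # Assumed convention: the first header names the table as entity:<table>_id.
    writer.writerow(["entity:read_group_id", "flowcell_unmapped_bams", "sample"])
    for row in rows:
        writer.writerow(row)
    return buf.getvalue()


# Hypothetical bucket path; replace it with the real location of the uBAM.
tsv_text = build_read_group_tsv(
    [("F10-081", "gs://example-bucket/F10-081.unmapped.bam", "F10-081")]
)
print(tsv_text)
```

The printed text could be saved as a .tsv file and uploaded in the Data tab; the column names must match whatever the workflow configuration later references.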

As it's a single uBAM, I amended the sample_and_unmapped_bams attribute by changing it to this:

{ "sample_name": this.read_group_set_id, "base_file_name": this.read_group_set_id, "flowcell_unmapped_bams": [this.read_groups.flowcell_unmapped_bams], "final_gvcf_base_name": this.read_group_set_id, "unmapped_bam_suffix": ".bam" }

But I keep getting the error

Workflow input processing failed (Caused by [reason 1 of 1]: Failed to evaluate input 'sample_and_unmapped_bams' (reason 1 of 1): Error(s): No coercion defined from 'null' of type 'spray.json.JsNull$' to 'File'. No coercion defined from 'null' of type 'spray.json.JsNull$' to 'String'. No coercion defined from 'null' of type 'spray.json.JsNull$' to 'String'.)

Any help is appreciated, Kylee Degatano.

Comments


Emil Furat
- October 18, 2021 17:39
Hi Mat,

Thanks for reaching out. Can you share the workspace where you are seeing this issue with GROUP_FireCloud-Support@firecloud.org by clicking the Share button in your workspace? The Share option is in the three-dots menu at the top-right.
1. Add GROUP_FireCloud-Support@firecloud.org to the User email field and press enter on your keyboard.
2. Click Save.
Let us know the workspace name, as well as the relevant submission and workflow IDs. We’ll be happy to take a closer look as soon as we can.

Best,

Emil

Mat Nightingale
- October 18, 2021 18:16
Thanks Emil. I've shared it. I'm trying to run 1-ExomeGermlineSingleSample; my sample is F10-081 and can be seen in the read_group area.

I'm sure it's something simple, but I'm really looking to dip my toe into Terra.bio by running a single sample before switching my institute's workflow over completely.

Let me know if you need anything else.

Mat

Emil Furat
- October 18, 2021 18:31
Hi Mat,

Could you please share a link to your workspace?

Best,

Emil

Mat Nightingale
- October 18, 2021 20:34
Hi Emil,

Did the share not go through? It's showing as shared with GROUP_FireCloud-Support@firecloud.org at my end.

Mat

Emil Furat
- October 18, 2021 20:57
Hi Mat,

My apologies for the misunderstanding. Could you please share the URL of your workspace?

Best,

Emil

Mat Nightingale
- October 19, 2021 10:32
Hi Emil,

It's https://app.terra.bio/#workspaces/Genomics/Exome-Analysis-Pipeline%20copy

Emil Furat
- Edited October 21, 2021 19:44
Hi Mat,

If I'm not mistaken, the two "sample_and_unmapped_bams" attribute descriptions you have shared are the same. Is it possible that you have shared the wrong version?

I would recommend reviewing how you have defined your sample_and_unmapped_bams workflow attribute. The following article has more information about properly configuring workflows in Terra: https://support.terra.bio/hc/en-us/articles/360026521831

I see that for your most recent attempts at the ExomeGermlineSingleSample workflow you have defined your root entity as "F10-081 (read_group)".

Looking at this data entity in the Data section of your workspace, I see that the read_group data table has the columns read_group_id, flowcell_unmapped_bams, and sample.

Because you have set read_group as the root entity for your ExomeGermlineSingleSample workflows, your workflow attributes can reference only those columns: this.read_group_id, this.flowcell_unmapped_bams, and this.sample.

In the sample_and_unmapped_bams attribute configuration you have shared, however, I see that you are trying to reference "this.read_group_set_id". This would only be possible if you set read_group_set as the root entity for your workflow, since the read_group_set_id column exists in the read_group_set data table, not in read_group.

In order to correct this, you will want to either change the data entity you are using in your workflow to the read_group_set data table, or change the sample_and_unmapped_bams attribute configuration to only reference columns present in the read_group data table.
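
The mismatch described above can be illustrated with a toy model. This is only a sketch of the idea, not how Terra evaluates expressions: the `TABLES` dictionary, the `resolve` helper, and the bucket path are hypothetical, and only simple `this.<column>` lookups are modelled (following references such as `this.read_groups.<column>` is left out).

```python
# Toy model of the two data tables: table name -> row id -> column -> value.
TABLES = {
    "read_group": {
        "F10-081": {
            "read_group_id": "F10-081",
            "flowcell_unmapped_bams": "gs://example-bucket/F10-081.unmapped.bam",
            "sample": "F10-081",
        }
    },
    "read_group_set": {
        "F10-081": {
            "read_group_set_id": "F10-081",
            "read_groups": ["F10-081"],
        }
    },
}


def resolve(expression, root_table, row_id):
    """Look up a 'this.<column>' expression on the selected root-entity row.

    Returns None when the column does not exist in the root table, which is
    the situation that surfaces as a null input in the error message.
    """
    parts = expression.split(".")
    if len(parts) != 2 or parts[0] != "this":
        raise ValueError("only 'this.<column>' expressions are modelled here")
    return TABLES[root_table][row_id].get(parts[1])


# The column only exists in read_group_set, so the result depends on the root entity.
print(resolve("this.read_group_set_id", "read_group", "F10-081"))
print(resolve("this.read_group_set_id", "read_group_set", "F10-081"))
```

With read_group as the root entity the first lookup finds nothing, while the same expression resolves once read_group_set is the root entity.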

One other thing: when I clicked on the "F10-081.unmapped.bam" link for the F10-081 entity in your read_group data table, I received an error message.

Do you receive the same message when you click on the link to your F10-081 file?

Kind regards,
Emil

Mat Nightingale
- October 22, 2021 12:22
Hi Emil,

Thanks for the reply.

Changing the attributes from read_group_set to just read_group made no difference.

I still get the error

Workflow input processing failed (Caused by [reason 1 of 1]: Failed to evaluate input 'sample_and_unmapped_bams' (reason 1 of 1): Error(s): No coercion defined from 'null' of type 'spray.json.JsNull$' to 'File'.)

If I download the example read_group_set file to see how it's structured, the .tsv files don't mention the example H2GCKCCXX.7.unmapped.bam file anywhere.

It makes me wonder if the paired-fastq-to-unmapped-bam workflow is a suitable precursor to 1-ExomeGermlineSingleSample.

When I click on the F10-081.unmapped.bam file I get the option to download it, so it should be accessible to the workspace.

It amazes me how unintuitive this is. I could understand it if I were trying to do something unconventional, but all I want to do is throw a standard set of fastq files through a pipeline and get a list of variants. This is something that I've had working on my own locally computed hg19 pipeline for years. That switching to GATK4 and hg38 is so cumbersome, after they have been out for so long, is unbelievable.

I'll keep banging away at it but there has to be a more streamlined workflow than this.

Mat

Emil Furat
- Edited October 22, 2021 18:31
Hi Mat,

We have made a copy of your workspace and reconfigured the data tables/workflow you are using - our hope is that we will be able to use this in order to help identify the problem. Could you please follow the instructions provided below and let us know what (if any) errors you receive along the way?

1. Navigate to the following workspace which has already been shared with you: https://app.terra.bio/#workspaces/help-terra/Exome-Analysis-Pipeline-Matt-copy-test

2. Copy the read_group data table into your workspace by selecting all three rows, clicking on the three vertical white dots inside the blue circle, and choosing Export to your workspace.

3. Follow the same procedure as in step (2) for the read_group_set data table.

4. Copy the 1-ExomeGermlineSingleSample workflow from the workspace I have shared with you into your workspace.

5. Go to the workflow you have copied into your workspace and press "select data".

6. Select F10-081 and press ok.

7. Run your analysis.

I'm sorry to hear that you have found the process of running your workflow to be unintuitive. Since you are new to Terra I would highly suggest that you review the quickstart tutorials from our documentation here: https://support.terra.bio/hc/en-us/sections/4408259082139-Tutorials.

These tutorials provide step-by-step instructions for all of the basic operations in Terra such as loading data into data tables.

Kind regards,

Emil

Liz Kiernan
- Edited October 23, 2021 11:36
Mat Nightingale, thanks for reaching out about the issue.

I just wanted to give some context for what's different about the setup in the workspace Emil shared. We think the issue might be related to the fact that you are running a single uBAM file, which requires a slight modification to the workflow setup; the instructions for this are listed on the workspace Dashboard in case that's helpful:
- Important configuration notes
  - The workflow is written in WDL1.0 and imports structs to organize and use inputs.
  - If you run the workflow on a sample that only has one uBAM (i.e. one read group), you need to update the config attributes for sample_and_unmapped_bams to include [] around the flowcell_unmapped_bams as shown below:
  { "sample_name": this.read_group_set_id, "base_file_name": this.read_group_set_id, "flowcell_unmapped_bams": [this.read_groups.flowcell_unmapped_bams], "final_gvcf_base_name": this.read_group_set_id, "unmapped_bam_suffix": ".bam" }
The workflow in the workspace Emil shared is set up to use the read_group_set table and it includes this correction in the configuration. We hope these two changes will help the workflow run.
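
To make the bracket note concrete, here is a small sketch of the shape the input needs to have. It is an illustration only: it assumes, as the Dashboard note implies, that flowcell_unmapped_bams must be a list of files and that a sample with one read group yields a single path instead. The helper names and bucket paths are hypothetical, and Terra itself does not run this code.

```python
import json


def as_file_array(value):
    """Wrap a single path in a list; leave an existing list unchanged."""
    if isinstance(value, list):
        return value
    return [value]


def build_sample_and_unmapped_bams(set_id, unmapped_bams):
    """Build a dict with the same fields as the sample_and_unmapped_bams attribute."""
    return {
        "sample_name": set_id,
        "base_file_name": set_id,
        # The workflow expects a list here, even when there is only one uBAM.
        "flowcell_unmapped_bams": as_file_array(unmapped_bams),
        "final_gvcf_base_name": set_id,
        "unmapped_bam_suffix": ".bam",
    }


# One read group: a bare path is wrapped into a one-element list.
single = build_sample_and_unmapped_bams(
    "F10-081", "gs://example-bucket/F10-081.unmapped.bam"
)
# Several read groups: the list is passed through as it is.
multi = build_sample_and_unmapped_bams(
    "example-sample",
    ["gs://example-bucket/a.unmapped.bam", "gs://example-bucket/b.unmapped.bam"],
)
print(json.dumps(single, indent=2))
```

In the Terra configuration the same effect comes from writing [this.read_groups.flowcell_unmapped_bams] with the brackets for a single-uBAM sample.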

Emil Furat
- October 27, 2021 17:12
Hi Mat,

We haven't heard from you in a couple of days, so I just wanted to check in and see how things are going. Have you had any luck using the reconfigured data tables/workflow we shared with you?

If you have any other questions please let us know!

Kind regards,

Emil

Mat Nightingale
- October 28, 2021 15:47
Hi Emil,

I was able to use your instructions to complete the analysis successfully, but I've not had much time to look at it since. I hope to spend some time on it over the next week or so.

Thanks for your assistance.

Mat

Emil Furat
- October 28, 2021 19:57
Hi Mat,

Glad to hear!

As always, if you have any other questions please let us know.

Kind regards,

Emil
