Only copy outputs to data model for conditional call

Comments

  • Samantha (she/her)

    Hi Kathleen Morrill,

    Thanks for reaching out. Can you share the workspace where you are seeing this issue with GROUP_FireCloud-Support@firecloud.org? To do so, click the Share button in your workspace (the Share option is in the three-dots menu at the top-right), then:

    1. Add GROUP_FireCloud-Support@firecloud.org to the User email field and press enter on your keyboard.
    2. Click Save.

    Let us know the workspace name, as well as the relevant submission and workflow IDs. We’ll be happy to take a closer look as soon as we can!

    Best,

    Samantha

  • Kathleen Morrill

    Thanks Samantha, I have added the support email to the workspace.

    Here is the workspace name: Dog Aging Project - Sequencing Data

    The following submission has examples of samples for which the conditional is false and I do not want outputs copied to the data model:

    a255de8c-14c6-4bb9-8c81-bc62fd4b679e

    The workflows are as follows:

    DAP_SequencingData_Status checks the website for sample assignments and the sequencing platform, and updates the data model.

    DAP_SequencingData_Ingestion checks whether data…
    1. is ready to download (sample_status == "succeeded")
    2. is available to download (sample_availability != "archived")
    3. is not already downloaded (data_ingested != "TRUE")

    If all three conditions hold, it returns Boolean download = true.

    If download is true, then the download task occurs with outputs to the data model.
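
    For readers following along, the logic described above could be sketched in WDL roughly as follows. This is a minimal illustration, not the actual DAP_SequencingData_Ingestion workflow: the workflow name IngestionSketch, the sample_id input, the DownloadPlaceholder task, and the Docker image are all hypothetical, and only the three attribute names come from the conditions listed above. Because the call sits inside if (download), its output is optional (File?) at the workflow level and has no value when the task does not run.

    ```wdl
    version 1.0

    # Sketch only: names other than the three status attributes are assumptions.
    workflow IngestionSketch {
      input {
        String sample_id
        String sample_status
        String sample_availability
        String data_ingested
      }

      # True only when all three conditions from the list above hold.
      Boolean download = (sample_status == "succeeded")
        && (sample_availability != "archived")
        && (data_ingested != "TRUE")

      if (download) {
        call DownloadPlaceholder { input: sample_id = sample_id }
      }

      output {
        # The call is inside a conditional, so this output is optional.
        File? vcf = DownloadPlaceholder.vcf
      }
    }

    # Placeholder standing in for the real download task.
    task DownloadPlaceholder {
      input {
        String sample_id
      }
      command <<<
        echo "placeholder for ~{sample_id}" > "~{sample_id}.vcf"
      >>>
      output {
        File vcf = "~{sample_id}.vcf"
      }
      runtime {
        docker: "ubuntu:20.04"  # assumed image
      }
    }
    ```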

    I suppose I could make a task that takes in the data model attributes and re-outputs them if (!download). If they are bucket links (File), then should they also be input as File, or as String? I don't actually want the data localized or copied.

  • Samantha (she/her)

    Hi Kathleen Morrill,

    Sorry for the delayed response. I brought this to our engineers to confirm the behavior you were seeing. Unfortunately, the data model currently has no way to prevent an attribute from being overwritten when the corresponding workflow output is NULL. I'd be happy to create a feature request for the team to consider.

    For now, you could try your proposed workaround of creating another task that inputs the current data model attributes and re-outputs them if(!download). They should be input as String.
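
    A minimal sketch of such a pass-through task, assuming WDL 1.0. The task name ReturnModel comes from a later comment in this thread; the input and output names and the Docker image are illustrative assumptions. Declaring the inputs as String? rather than File means the bucket paths are passed through as plain text and nothing is localized.

    ```wdl
    version 1.0

    # Sketch only: attribute names are taken from this thread, everything else is assumed.
    task ReturnModel {
      input {
        # Optional, in case an attribute is still empty in the data model.
        String? vcf
        String? bam
      }
      command <<<
        echo "download is false; returning existing data model attributes"
      >>>
      output {
        # Re-emit the inputs unchanged.
        String? vcf_out = vcf
        String? bam_out = bam
      }
      runtime {
        docker: "ubuntu:20.04"  # assumed image
      }
    }
    ```

    Note that a later comment in this thread reports that attributes were still overwritten with NULL when both branches wrote to the same columns.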

    Please let me know if you have any questions.

    Best,

    Samantha

  • Kathleen Morrill

    Thank you!

    Another suggested feature request, which might help with the same goal in mind, is the ability to sort the data model by multiple columns when selecting samples for a sample set. That way, we can easily select samples fulfilling multiple conditions for workflow submission (e.g. new sample, successful sequencing, not yet run).

    Also, this is probably a less ideal solution than the input->output way I mentioned, but is there a way to force a workflow failure after a boolean condition? I noticed that failed workflows do not output anything to the data model.

  • Kathleen Morrill

    So, I tried the workaround: a task that runs only if !download takes the current data model attributes as inputs and returns them as outputs. However, I've found that a successful workflow attempts to update the data model with the outputs from both tasks (if(download) and if(!download)), which map to the same attributes, even when only one task runs. The outputs of the task that did not run are set to NULL and overwrite the outputs from the other.

    I have an example of this behavior for the following run:
    Sample 31020061513478
    Submission 03f0ac71-8e93-400c-9b15-73f1bf6a0590
    Workflow 71c915c8-93eb-4441-a574-5bc28a577214

    vcf and bam got set from GencoveAPI_Download outputs but fastqr1 and fastqr2 got set from ReturnModel outputs (null), even though fastqr1 and fastqr2 had valid outputs from GencoveAPI_Download.

  • Kathleen Morrill

    Found a solution! Rather than inputting existing attributes and outputting those, I set the Task for if(!download) to run the command `exit 1`, forcing a fail state for the workflow. Failed workflows don't output to the data model, so that's just what I needed.
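
    A minimal sketch of this approach, assuming WDL 1.0 and that a non-zero exit code is treated as a task failure (the default behavior in Cromwell, as far as I know). The workflow and task names, the Boolean input, and the Docker image are hypothetical; in the real workflow, download would be computed from the data model attributes rather than passed in.

    ```wdl
    version 1.0

    # Sketch only: all names here are assumptions for illustration.
    workflow FailWhenNotDownloading {
      input {
        Boolean download
      }

      if (!download) {
        # Failing this call fails the whole workflow, so no outputs are written back.
        call ForceFail
      }
    }

    task ForceFail {
      command <<<
        echo "download is false; failing so the data model is left untouched" >&2
        exit 1
      >>>
      runtime {
        docker: "ubuntu:20.04"  # assumed image
      }
    }
    ```

    One trade-off of this design is that skipped samples show up as failed workflows in the submission, so genuine failures need to be told apart by checking the task's stderr message.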

  • Samantha (she/her)

    Hi Kathleen Morrill,

    Glad to hear you were able to get it to work, and thank you for sharing your solution here. I'll submit feature requests for this and the data table sorting functionality you mentioned.

    If you need assistance with anything else, please don't hesitate to reach out!

    Best,

    Samantha
