write_lines()/write_map()/write_tsv()/write_json() fail when run in a workflow rather than in a task

Comments

4 comments

  • Jason Cerrato

    Hi Giulio,

    I'll take a closer look at this and get back to you as soon as I can.

    Kind regards,

    Jason

  • Jason Cerrato

    Can you share the workspace where you are seeing this issue with GROUP_FireCloud-Support@firecloud.org by clicking the Share button in your workspace (see the icon with the three dots at the top-right)?

    1. Add GROUP_FireCloud-Support@firecloud.org to the User email field and press Enter on your keyboard
    2. Click Save

    Let me know the workspace name, as well as the relevant submission and workflow IDs. If there are any authorization domains, please add jcerrato@broadinstitute.org to them if possible. If not, please still provide the relevant submission and workflow IDs.

    Many thanks,

    Jason

  • Giulio Genovese

    Hi Jason,

    Chris Whelan has explained to me that this is the result of a deliberate configuration of Cromwell on Terra, which does not allow tables to be serialized on the machine that runs the Cromwell server. I was told that this used to be possible, but someone abused it and crashed the server, and it has been disallowed ever since. However, as reasonable as this sounds, it breaks the WDL specification, which does not restrict write_map() to tasks.

    I have resolved my own issue by writing a simple task equivalent of write_map():

    task write_map_task {
      input {
        Map[String, String] map
        String docker
      }

      # Empty command: write_map() is evaluated in the output section below
      command <<<
      >>>

      output {
        File map_file = write_map(map)
      }

      runtime {
        docker: docker
      }
    }
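
    For readers who land here, a minimal sketch of how such a task might be called from a workflow in place of a workflow-level write_map(). The workflow name, the sample map contents, and the Docker image are hypothetical and only there for illustration; the task is the one shown above.

    ```wdl
    version 1.0

    workflow write_map_example {
      input {
        # Hypothetical inputs, for illustration only
        Map[String, String] sample_to_value = {"sample1": "value1", "sample2": "value2"}
        String docker = "ubuntu:20.04"
      }

      # Instead of `File f = write_map(sample_to_value)` at the workflow level,
      # delegate the serialization to a task
      call write_map_task {
        input:
          map = sample_to_value,
          docker = docker
      }

      output {
        File map_file = write_map_task.map_file
      }
    }

    task write_map_task {
      input {
        Map[String, String] map
        String docker
      }

      # Empty command: write_map() is evaluated in the output section
      command <<<
      >>>

      output {
        File map_file = write_map(map)
      }

      runtime {
        docker: docker
      }
    }
    ```

    The same wrapper pattern should carry over to write_lines(), write_tsv(), and write_json() by changing the input type and the function called in the output section.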

    However, I don't understand why I have to modify my workflow, which was written according to the WDL specification, to accommodate Cromwell breaking the specification. Wouldn't it make more sense for Cromwell on Terra to automatically dispatch write_lines()/write_map()/write_tsv()/write_json() as separate tasks (maybe only when the input is large enough) so that developers can avoid this additional setback?

    Giulio

  • Chris Llanwarne

    Hi Giulio,

    I tried to answer this for you in your other post (https://support.terra.bio/hc/en-us/community/posts/360071476431-Terra-fails-to-delocalize-files-listed-through-read-lines-?page=1#community_comment_360011392571), but wanted to link it here in case others come across your question. If you have follow-ups, we can continue in that thread.

    Thanks - 

    Chris

    ---

    Copy of the relevant part of the answer from the other thread:

    (ii) and (iii) are side effects of Cromwell's first cloud backend being JES (now named PAPIv1, the Pipelines API on Google Cloud). In PAPIv1, the request to the API needs to specify the spec of the VM we want to run on, which files to localize at the start, which script to run, and which files to delocalize at the end. PAPIv1 then deletes the VM, along with any files we didn't ask it ahead of time to rescue, before letting us know that it did what we asked. That model makes writing simple jobs in PAPIv1 easy, but it doesn't fit well with file outputs being defined as functions of other file outputs (we can't predict the result ahead of time), nor with file outputs being optional. HOWEVER: now that we're operating in PAPIv2, we do have the opportunity to refactor some of that localization/delocalization logic to happen on the VM itself after the job completes, rather than having to predict it ahead of time in the Cromwell engine. We have tickets in our backlog to do just that.
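
    As an illustration of the two output shapes described above, here is a hypothetical task (not taken from either thread; the task name, file names, and Docker image are assumptions). The first output is only known after reading a file the command itself writes, and the second may not exist at all, so neither can be listed reliably before the job runs.

    ```wdl
    version 1.0

    task unpredictable_outputs {
      input {
        String docker = "ubuntu:20.04"
      }

      command <<<
        echo "hello" > a.txt
        echo "world" > b.txt
        # The list of produced files is itself written at run time
        printf 'a.txt\nb.txt\n' > manifest.txt
      >>>

      output {
        # File outputs defined as a function of another file output:
        # the paths are only known once manifest.txt has been read
        Array[File] listed_files = read_lines("manifest.txt")

        # Optional file output: the command above never creates this file
        File? maybe_log = "optional.log"
      }

      runtime {
        docker: docker
      }
    }
    ```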
