WDL function read_string() not properly delocalizing paths as files

  • Jason Cerrato

    Hi James,

    Thank you for your inquiry. We'll take a look at this as soon as we can and get back to you.

    Kind regards,

    Jason

  • Jason Cerrato

    Hi James,

    Is this metadata.txt file something that's generated in the workflow, or is it a file you already have? Can you point me to where this file is generated/already exists?

    Can you also clarify which version of dropseq_cumulus you ran for job submission 469a9ccb-43c5-46e5-8f35-a8827a345735?

    Many thanks,

    Jason

  • James Gatter

    Hey Jason,

    I ran it in alexandria/dropseq_cumulus/1. sco.scp.metadata.txt and the other files are generated by the Cumulus workflow, but they are wrapped in the Array[File] output_scp_files. In dropseq_cumulus' scp_outputs task, I split this Array[File] into separate File variables.

    Specifically, I do this by identifying each File and writing the path to that file in another file (e.g. "/cromwell_root/path/to/sco.scp.metadata.txt" is written in file ./metadata.txt). If you read my original post now, it might make more sense.
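
    A minimal sketch of the pattern described above, for readers following along. This is a hypothetical reconstruction, not the actual scp_outputs task: the task name, the shell loop, the output names, and the Docker image are all assumptions; only output_scp_files, the file names, and the use of read_string() come from the thread.

    ```wdl
    version 1.0

    # Hypothetical reconstruction of the pattern described in the thread.
    task scp_outputs_read_string_sketch {
      input {
        Array[File] output_scp_files
      }

      command <<<
        set -euo pipefail
        # For each file of interest, write its localized path into a small text file.
        # (Assumes the paths contain no spaces.)
        for f in ~{sep=" " output_scp_files}; do
          case "$f" in
            *scp.metadata.txt)        printf '%s' "$f" > metadata.txt ;;
            *scp.expr.txt)            printf '%s' "$f" > expr.txt ;;
            *scp.X_fitsne_coords.txt) printf '%s' "$f" > X_fitsne_coords.txt ;;
          esac
        done
      >>>

      output {
        # read_string() returns the stored path as a String, which WDL then
        # coerces to File. This is the step reported as misbehaving here.
        File metadata = read_string("metadata.txt")
        File expr = read_string("expr.txt")
        File coords = read_string("X_fitsne_coords.txt")
      }

      runtime {
        docker: "ubuntu:20.04"  # assumed image; any image with bash would do
      }
    }
    ```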

    But overall, this isn't so much a problem anymore as I've been using another solution in alexandria_dev/dropseq_cumulus/11:

    First I move the file to the present working directory and then it delocalizes as such:

    File file = glob("*filename.txt")[0] # Delocalizes the first file that matches the glob pattern
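
    The workaround can be sketched as a complete task along these lines. Again, this is an illustration under assumptions rather than the code in alexandria_dev/dropseq_cumulus: the task name, output names, glob patterns, and Docker image are hypothetical, and cp is used here where the post says the file is moved.

    ```wdl
    version 1.0

    # Hypothetical sketch of the glob()-based workaround.
    task scp_outputs_glob_sketch {
      input {
        Array[File] output_scp_files
      }

      command <<<
        set -euo pipefail
        # Place each input in the task's working directory so that glob(),
        # which is evaluated relative to that directory, can match it.
        # (Assumes the paths contain no spaces.)
        for f in ~{sep=" " output_scp_files}; do
          cp "$f" .
        done
      >>>

      output {
        # glob() returns Array[File]; [0] takes the first match.
        File metadata = glob("*scp.metadata.txt")[0]
        File expr = glob("*scp.expr.txt")[0]
        File coords = glob("*scp.X_fitsne_coords.txt")[0]
      }

      runtime {
        docker: "ubuntu:20.04"  # assumed image
      }
    }
    ```

    Note that indexing with [0] fails the task if a pattern matches nothing, so this form only suits outputs that are always produced.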

    If anything, I just want to bring this to the attention of the people working on Terra/Cromwell: this function and the subsequent delocalization do not behave as expected.

    Best,

    James

  • Jason Cerrato

    Hi James,

    Thank you for letting us know that you have found a solution. I will pass the information from your original post on to our workflow engineers for investigation.

    Kind regards,

    Jason

  • Jason Cerrato

    Hi James,

    I'm still having some difficulty finding this exact line in your job submission ID 469a9ccb-43c5-46e5-8f35-a8827a345735, where it claims to delocalize to

    gs://fc-secure-ec2ce7e8-339a-47b4-b9d9-34f652cbf41f/469a9ccb-43c5-46e5-8f35-a8827a345735/dropseq_cumulus/d7cc5552-e1e6-4f51-93aa-fcf1c8b510ea/call-scp_outputs/call-cumulus/cumulus.cumulus/90d9eb9e-4ffc-4e4f-b9fa-0198e2629158/call-scp_output/attempt-2/glob-1c77504a8a1d1e9b00f3be9956ddf1c3/sco.scp.metadata.txt

    Can you point me to where I can find it? Looking at the scp_outputs task log, I've only found these.

     

    So far, I've only found this file, sco.scp.metadata.txt, in the Localization section of the log.

     

    Many thanks,

    Jason

  • James Gatter

    Hi Jason,

    So for delocalization in the log, the wrong files are being delocalized. For example, read_string("metadata.txt") should have looked inside metadata.txt for the path to the desired file, e.g. "/cromwell_root/path/to/sco.scp.metadata.txt", and returned that path to become the WDL File output. Instead it delocalized the undesired files, "metadata.txt", "expr.txt", and "X_fitsne_coords.txt", which each only contain the path to the desired file.

    In the Terra job's Outputs tab, however, Terra claims that the desired files were delocalized instead, but the gs:// links to these files are dead and lead down nonexistent paths. Terra doesn't mention that "metadata.txt", "expr.txt", and "X_fitsne_coords.txt" were all delocalized instead of "sco.scp.metadata.txt", "sco.scp.expr.txt", and "sco.scp.X_fitsne_coords.txt", so that conflicts with the log file. I'm just as confused as you are on this one.

    Sorry for the confusion, I know this is pretty wacky and convoluted.

    Best,

    James

  • Jason Cerrato

    Hi James,

    Would you be willing to share dropseq_cumulus with jcerrato@broadinstitute.org so I can take a closer look at versions 1 and 11?

    Kind regards,

    Jason

  • James Gatter

    Hi Jason,

    The methods should be publicly viewable, but I've shared both tools with you just in case. Hope that helps!

    Oh and the dev snapshot that fails is 9, not 11!

    James

  • Jason Cerrato

    Hi James,

    Many thanks. I was getting a message that the snapshot had been removed or that I didn't have access. I have access now after being added.

    Kind regards,

    Jason

  • James Gatter

    I accidentally removed dev snapshot 10, sorry! If you want to look at the dev version, make sure you are looking at 9; I believe that is the same as alexandria/dropseq_cumulus/1. I think just looking at alexandria/dropseq_cumulus/1 would be best anyway.

  • Jason Cerrato

    Hi James,

    Thank you! Our internal team will take a look and see if there is anything unexpected going on. Thank you for your report. If there's anything else we can help with, please let us know!

    Kind regards,

    Jason
