Struct Definition and sample_set entity

Comments

6 comments

  • Kathleen Morrill

    OK, update: this appears to have just been a WDL versioning issue. Declaring version 1.0 at the top of the WDL allowed struct definitions, but other syntax had to change too, such as wrapping workflow and task inputs in input {} blocks (which is better-looking syntax anyway!).
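
    For readers landing here, a minimal sketch of what that looks like, assuming WDL 1.0. The struct name SampleVcf, the task name PrintSample, and the Docker image are hypothetical; the field names mirror the attributes discussed in this thread.

    ```wdl
    version 1.0

    # Hypothetical struct; field names follow the attributes used in this thread.
    struct SampleVcf {
      File vcf
      File vcf_ind
      String sample
      String participant
    }

    task PrintSample {
      # In WDL 1.0, task and workflow inputs live inside an input {} block.
      input {
        SampleVcf s
      }

      # Struct members are accessed with dot notation inside placeholders.
      command {
        echo "~{s.sample} ~{s.participant} ~{s.vcf}"
      }

      output {
        String line = read_string(stdout())
      }

      runtime {
        docker: "ubuntu:20.04"  # assumed image; any image with a shell would do
      }
    }
    ```

    Without the version 1.0 line, Cromwell parses the file as draft-2, which has no struct keyword.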

    As far as the input attributes go, this is what I see under Workflows -- unsure what "WomCom..." is but I'll play around with it / upload a JSON instead of using the menu here. If my entity is sample_set, then how do I reference which attributes of the samples to use?

    EDIT: It's "WomCompositeType". I tried giving it this input (a spray.json.JsString):

    "{vcf: this.vcf, vcf_ind: this.vcf_ind, sample: this.sample_id, participant: this.participant}"

    From the error, it seems to need an input of WomCompositeType, though...

    { vcf -> File vcf_ind -> File sample -> String participant -> String }

    How does that work? How do I define this type of input from sample attributes for a sample_set?

  • Kathleen Morrill

    I ended up splitting this into two workflows: a filter-VCF workflow for individual samples and a merge-VCFs workflow for sample sets, which seems to be the better approach!

    I'm still curious about how to make Arrays of structs from sample sets in the future, though. The main purpose would be to scatter over a set of structs.
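
    On the WDL side, scattering over an array of structs could look roughly like the sketch below. The names SampleVcf, FilterSet, and FilterVcf are hypothetical, and the cp command only stands in for a real filtering step. How Terra's attribute expressions would populate the samples input for a sample_set is not covered here.

    ```wdl
    version 1.0

    # Hypothetical struct holding one sample's files and identifiers.
    struct SampleVcf {
      File vcf
      File vcf_ind
      String sample
      String participant
    }

    workflow FilterSet {
      input {
        Array[SampleVcf] samples
      }

      # One FilterVcf call per struct in the array.
      scatter (s in samples) {
        call FilterVcf {
          input:
            vcf = s.vcf,
            vcf_ind = s.vcf_ind,
            sample = s.sample
        }
      }

      output {
        # Outputs of a scattered call are gathered into an array.
        Array[File] filtered = FilterVcf.filtered
      }
    }

    task FilterVcf {
      input {
        File vcf
        File vcf_ind  # declared so the index is localized alongside the VCF
        String sample
      }

      # Placeholder command; a real workflow would run its filter here.
      command {
        cp ~{vcf} ~{sample}.filtered.vcf.gz
      }

      output {
        File filtered = "~{sample}.filtered.vcf.gz"
      }

      runtime {
        docker: "ubuntu:20.04"  # assumed image
      }
    }
    ```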

  • Anika Das

    Hi Kathleen, 

    Can you try quoting the keys in the JSON objects? Instead of:

    {
      vcf: this.vcf,
      vcf_ind: this.vcf_ind,
      sample: this.sample_id,
      participant: this.participant
    }

    try:

    {
      "vcf": this.vcf,
      "vcf_ind": this.vcf_ind,
      "sample": this.sample_id,
      "participant": this.participant
    }

    (Newlines added for legibility here, but I don’t know if Terra-UI allows them.)

    Let us know if you have any other questions!
     
    Best, 
    Anika
     
  • Kathleen Morrill

    I'll try that out later!

    I've run into a different problem with the merge workflow. Trying to get BCFtools to accept my Array[File] for input to merge.

    Done in this way, write_lines is making a file that lists the original buckets, not the localized files, so it fails:

    task MergeData {
      input {
        Array[File] inputSetVCF
        Array[File] inputSetVCFind
        File inputSetVCFlist = write_lines(inputSetVCF)
      }

      command {
        ../bin/bcftools-1.9/bcftools merge -l ${inputSetVCFlist} --force-samples --threads ${num_threads} -Oz -o 'DogAgingProject_${setID}_gp-${filterGP}.vcf.gz'
      }

    Done within the command, it cannot coerce Array[File] to Array[String] for write_lines() to do its magic:

    task MergeData {
      input {
        Array[File] inputSetVCF
        Array[File] inputSetVCFind
      }

      command {
        ../bin/bcftools-1.9/bcftools merge -l write_lines(${inputSetVCF}) --force-samples --threads ${num_threads} -Oz -o 'DogAgingProject_${setID}_gp-${filterGP}.vcf.gz'
      }

    Done as just feeding bcftools the array (no write_lines or sep) and hoping for the best also fails with "Array value was given but no 'sep' attribute was provided".

  • Kathleen Morrill

    The second approach, feeding bcftools write_lines(${inputSetVCF}), does indeed make a .tmp file containing a localized input per line. But the workflow errors out there, throwing "Array value was given but no 'sep' attribute was provided" several times.

  • Anika Das

    Hi Kathleen, 

    You mentioned in your question that write_lines() is expecting Array[String] as input but you are providing Array[File]. What you can do is add a line at the top of your command that creates a file from the Array[File] you provided, and then use that file as input for your other command:

    command {
      echo "~{sep="\n" inputSetVCF}" > list.txt
      ../bin/bcftools-1.9/bcftools merge -l list.txt --force-samples ...

    Please let us know if you have further questions!
     
    Best, 
    Anika
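
    A possible alternative, sketched under the assumption that Cromwell evaluates command-section expressions against localized paths: put the whole write_lines() call inside the placeholder, rather than putting a placeholder inside write_lines(). Writing write_lines(${inputSetVCF}) asks the placeholder to print the bare array, which would explain the "no 'sep' attribute" error. The Docker image name, the output file name, and having bcftools on the PATH are assumptions, not details from this thread.

    ```wdl
    version 1.0

    task MergeData {
      input {
        Array[File] inputSetVCF
        Array[File] inputSetVCFind  # declared so the index files are localized too
      }

      # The function call sits inside ~{...}; the placeholder expands to the
      # path of a temp file listing one input path per line.
      command {
        bcftools merge -l ~{write_lines(inputSetVCF)} --force-samples -Oz -o merged.vcf.gz
      }

      output {
        File merged = "merged.vcf.gz"
      }

      runtime {
        docker: "my-bcftools-image"  # hypothetical image with bcftools installed
      }
    }
    ```

    The same command can also use the ${...} placeholder form, which WDL 1.0 still accepts inside command { } blocks.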