Output files hidden in `glob-*` directories
I have a workflow that writes multiple output files organized in subdirectories. I tried to map these various output files to informatively named workflow outputs, like so:
```
Array[File] imputed = glob("${job_id}/local/*.gz")
File md5 = "${job_id}/local/results.md5"
File qc_report = "${job_id}/qcreport/qcreport.html"
Array[File] qc_stats = glob("${job_id}/statisticDir/*.txt")
Array[File] log = glob("${job_id}/logfile/*.log")
```
However, there doesn't seem to be any connection in Terra between my workflow output names and the resulting files. Is there any advantage to defining these named outputs, rather than just collecting every output file with a single catch-all glob like

```
Array[File] imputed = glob("${job_id}/*/*")
```
Also, the `glob()` call collects every file it matches into directories named `glob-*`, so my output directory ends up as a confusing mix of `glob-*` directories and the original directory structure.

There is no easy way to tell which files ended up in which `glob-*` directory without clicking through each one. Is there a better way to manage my outputs?
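For what it's worth, one workaround I've considered is to bundle the structured results into a single archive inside the task, so the layout survives as one File output instead of being scattered across `glob-*` directories. This is only a rough sketch; `run_imputation.sh` and the paths are placeholders for my actual command and outputs:

```
task impute {
  String job_id

  command {
    # run_imputation.sh is a placeholder for the real per-job command
    run_imputation.sh --job ${job_id}

    # Bundle the whole structured output directory so a single File output
    # preserves the original layout instead of producing glob-* shards.
    tar czf ${job_id}_results.tar.gz ${job_id}/
  }

  output {
    File results_bundle = "${job_id}_results.tar.gz"
    # Files I still want addressable by name stay as plain File outputs.
    File qc_report = "${job_id}/qcreport/qcreport.html"
  }
}
```

The obvious trade-off is that downstream steps see one tarball rather than individual files, which is why I'd rather understand how the named outputs are meant to be used.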
Comments
Hi Stephanie,
Thanks for writing in. This is a great question! The answer ultimately boils down to what your end goals are and what works best for your process. For most users, writing outputs back to their data tables is the best way to keep their data organized.
Our article Data Tables Quickstart Part 1 - Intro to data tables walks through setting them up. Once your data tables are set up, I consider this the easiest way of managing outputs: the output locations are written to the table in the row of the root entity they relate to, making it easy to tell which files belong together.
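As a rough illustration (the workflow name `myWorkflow` and the column names below are just examples, not anything required), each named output in the Outputs tab of your workflow configuration can be mapped to a column on the row the workflow ran on using a `this.column_name` expression, something along these lines:

```
# Hypothetical output mappings from the workflow configuration's Outputs tab;
# "this" refers to the row (root entity) the workflow ran on.
myWorkflow.imputed    -> this.imputed_files
myWorkflow.md5        -> this.imputed_md5
myWorkflow.qc_report  -> this.qc_report
myWorkflow.qc_stats   -> this.qc_stats
myWorkflow.log        -> this.imputation_logs
```

This is also where your informatively named outputs pay off: each one can land in its own column, so you can find the imputed files, QC report, and logs for a given row without digging through the `glob-*` directories in the bucket.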
Can you explain a little more what you mean by
Kind regards,
Jason