We have workflows that need to write outputs to a specific path in the workspace bucket, e.g. the Cumulus workflow in the Intro to HCA data workspace. This usually comes up when the workflow doesn't attach its outputs to a data table (for whatever reason). We don't want to go digging through the submission directories to find outputs, so we include a task that copies them to a known path. To make that work today, the workflow config includes an input variable where the user supplies the workspace bucket ID, which the copy task uses to compose the output path.
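As a sketch of the current workaround (function and argument names are hypothetical, not from any actual workflow): the copy task receives the bucket ID as a plain workflow input and glues it into a destination URI.

```python
# Hypothetical sketch of how a copy task composes its destination path today.
# The bucket ID is a plain workflow input the user must paste in by hand.
def compose_output_path(bucket_id: str, subdir: str, filename: str) -> str:
    """Join a workspace bucket ID and relative parts into a gs:// URI."""
    # Tolerate users pasting either "fc-..." or the full "gs://fc-..." form.
    bucket_id = bucket_id.strip().removeprefix("gs://").rstrip("/")
    return f"gs://{bucket_id}/{subdir.strip('/')}/{filename}"

print(compose_output_path("fc-1234-abcd", "cumulus-outputs", "results.h5ad"))
# gs://fc-1234-abcd/cumulus-outputs/results.h5ad
```

The pasted bucket ID is the only thing tying this path to the right workspace, which is exactly what breaks on clone.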
That requires the user to look up the workspace bucket ID in the dashboard and paste it into the input field, which is clunky and brittle. Imagine you clone the workspace after the workflow has been configured: when you run it in your clone, it will try to write outputs to the parent workspace's bucket. If you have write access to the parent, you may not realize your outputs landed in a different workspace (and you might overwrite things there); if you don't have write access, the workflow fails with a permissions error. Ack.
In Notebooks and RStudio cloud environments, metadata like the workspace bucket ID is available through built-in environment variables, which is incredibly convenient.
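For example, in a Terra notebook the bucket can be read straight from the environment; `WORKSPACE_BUCKET` is the variable the cloud environment sets (a `gs://` URI), though the helper function here is just an illustrative sketch.

```python
import os

def workspace_bucket() -> str:
    """Return the current workspace's bucket URI from the environment.

    Terra cloud environments expose the bucket as WORKSPACE_BUCKET,
    so notebooks never need the user to paste a bucket ID by hand.
    """
    bucket = os.environ.get("WORKSPACE_BUCKET", "")
    if not bucket:
        raise RuntimeError(
            "WORKSPACE_BUCKET is not set; are you running in a cloud environment?"
        )
    return bucket
```

Nothing equivalent exists on the workflow side, which is the gap this request is about.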
It would be very useful for the use case above to be able to reference something like "workspace.bucket-id" in the workflow config. There is already a "Workspace Data" table this metadata could live in; it just needs to be populated with the relevant metadata by default. There is precedent for this, e.g. workspace tags, which are hidden in the UI but visible if you download the CSV for that table:
["HCA","single-cell","Bioconductor","10x Analysis","cumulus","10X Genomics","Jupyter Notebooks","warp-pipelines"]
Bonus points for actually having a way to display built-in workspace variables in the UI that doesn't involve downloading a CSV.