Give users control over deleting bucket data to better manage cloud costs
Hi,
One recurring theme in Terra feature requests is better control over cost. When running workflows with many intermediate steps and files, storage costs accumulate quickly, and there are few simple ways to manage them. Some people have developed scripts for this with varying success, but an official, guaranteed-to-work solution would be best, as I and others have often struggled to get these fixes to work.
One proposed solution would be to let users toggle settings or actions that manage their bucket storage. For example, there could be workspace settings like "Delete all files created more than X weeks ago" or "Delete all files generated by workflows except final outputs," or a combination of both ("Delete all files created more than X weeks ago except workflow outputs," etc.). A checklist-style UI for mixing and matching deletion conditions would be ideal. These could run automatically, or as actions triggered when the user wants.
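To make the checklist idea concrete, here is a minimal sketch of how mix-and-match deletion conditions could combine. This is plain Python over hypothetical file metadata (the `select_for_deletion` helper and its field names are made up for illustration; this is not a Terra or Google Cloud Storage API):

```python
from datetime import datetime, timedelta, timezone

def select_for_deletion(files, max_age_weeks=None, keep_final_outputs=True, now=None):
    """Return paths of files matching the deletion rules.

    `files` is a list of dicts with keys:
      - 'path': object path within the bucket
      - 'created': timezone-aware creation datetime
      - 'is_final_output': True if the file is a final workflow output
    Each enabled rule acts as an exemption filter, so combining rules
    only ever shrinks the deletion set.
    """
    now = now or datetime.now(timezone.utc)
    to_delete = []
    for f in files:
        if keep_final_outputs and f["is_final_output"]:
            continue  # exempt final workflow outputs
        if max_age_weeks is not None and now - f["created"] < timedelta(weeks=max_age_weeks):
            continue  # too recent to delete
        to_delete.append(f["path"])
    return to_delete
```

Treating each checkbox as an exemption (rather than an inclusion) makes combined settings strictly more conservative, which seems like the safer default for a destructive action.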
This is just a rough concept, but something to this effect would help users feel confident that even if costs climb higher than expected, they have options to bring them back under control quickly with little disruption to the workspace itself. Right now the only "official" options seem to be either deleting the workspace and starting over (very undesirable for complex workspaces) or copying the workspace and deleting the old one (which preserves the WDLs and data structure, but deletes potentially valuable bucket files, such as workflow outputs referenced in the data tables). Finer-grained control over which files get deleted, and when, would go a long way toward controlling cloud costs.
Thanks for your consideration!
Comments
Hi Ricky, this is definitely an important topic, thanks for writing in. We do have a couple of official solutions for this, which may not cover the full spectrum of what you're envisioning, but address at least part of the problem you're describing. Have you tried using either the "delete intermediate files" option in the workflow configuration, or the cleanup notebook, which are demo'ed in the blog post linked below?
https://terra.bio/deleting-intermediate-workflow-outputs/
Totally agree with this request! The "Delete intermediate outputs" functionality has been very helpful to prevent the accumulation of storage! There is still a need for easier workspace cleanup functionality, though. For instance, intermediate files from failed workflows don't get cleaned up by "Delete intermediate outputs" - which is the desired behavior, but they can accumulate quickly. Or, sometimes a workflow is run successfully several times, and the intermediates are deleted each time, but over time the outputs of several runs accumulate.
Some additional ideas for workspace storage management:
Geraldine Van der Auwera Thanks for the reminder about the "delete intermediate files" option when running WDLs. I agree this can definitely help, but I've still had a bit of trouble getting rid of files once they've accumulated, either because I forgot to check the box, or because of failed workflows like Emma Pierce-Hoffman mentioned.
For example, consider the following situation. Suppose you run a workflow that fails halfway. You definitely don't want that checkbox to delete the intermediate files yet, since you'd like to call-cache them when you retry. However, once the workflow eventually succeeds on another submission, those old files just sit there unless there's a timer to automatically delete older files, or some other way to manage them.
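For what it's worth, the "timer to automatically delete older files" idea maps fairly closely onto Google Cloud Storage's built-in object lifecycle rules. As a rough sketch, an age-based rule deleting objects older than 28 days would look like this (assuming Terra allows setting a lifecycle policy on the workspace bucket, which I haven't verified, and noting that a bare age rule is blunt: it would also delete final outputs and uploaded data older than the cutoff unless narrowed with extra conditions like `matchesPrefix`):

```json
{
  "rule": [
    {
      "action": {"type": "Delete"},
      "condition": {"age": 28}
    }
  ]
}
```

Saved as `lifecycle.json`, this could be applied with `gsutil lifecycle set lifecycle.json gs://<workspace-bucket>`, where the bucket name is a placeholder for your workspace's bucket.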
Regarding the blog post, I'm having trouble viewing the workspace linked in the section on deleting files once they've accumulated: it says either the workspace doesn't exist or I don't have permission to view it. It sounds like it points to another mop script, though, which I've had trouble running successfully in Terra notebooks. (Anecdotally, it sounds like others have also had intermittent success with running these.)
Totally understand there are use cases like what you describe that aren't currently covered, for sure. I believe there is some work in progress to address some of this, though I'm not personally aware of details or timeline. Our support team may jump in with details if there's anything they can share at this time.
In any case, thank you both for speaking up, and don't hesitate to get colleagues to upvote your request if they are also feeling that pain — one of the factors that go into prioritizing this kind of work (where something is "not broken, but could work better") is how many people are affected by the issue.
Speaking of broken, though, I see the bad link now — the workspace was replaced by a newer one and although it should have remained public (so people could see the deprecation message) it seemed it was accidentally made private. We'll get that fixed asap for others, but you can go ahead and just use the new workspace. You're correct that it's a mop script (which you seem to have already encountered); if you have trouble getting it to work, don't hesitate to report the problem to the helpdesk.
Hi Ricky,
To reiterate what Geraldine said, we really appreciate your thorough explanation of how these features can be of great benefit to you and your teams. We're going to bring this to the attention of the appropriate product manager so they have this on their radar.
As Geraldine mentioned, it would be a big help for your case to have colleagues comment and upvote support of this feature, as it makes it much easier for our product team to recognize the impact building it could have for the community.
Kind Regards,
Anika