Switch Localization/Delocalization From gsutil to gcloud storage
When executing workflows, I believe that localization and delocalization are done with gsutil rather than gcloud. I haven't benchmarked it, but localization does seem to be slow on Terra for large files, and this is one area where gcloud alpha storage should help.
Thank you for writing in! I've sent this request to our development team for consideration, and I'll be happy to follow up with you if this feature gets built.
Chiming in to agree. Right now, localization and delocalization are pretty slow. Even after doing what's suggested here plus switching to SSDs, I'm seeing speeds of about 150 GB/hour. For comparison, the real-life transfer speed of files to an external hard drive connected via USB 3.0 seems to be over twice that. I'll be the first to admit my benchmarking was casual, but this back-of-the-napkin calculation indicates that file localization on Terra using big SSDs can be significantly slower than consumer-grade file transfer between HDDs.
For tasks that already take a while, this increases the chance that your VM will be taken away from you when everything is effectively done but files are still being delocalized (especially if you use preemptibles). In my current work, I've found tasks that aren't feasible to run on Terra because a quick calculation indicates that delocalization alone could take up most of the 168-hour time limit allotted to non-preemptible VMs.
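For anyone who wants to repeat that kind of quick calculation, here is a minimal Python sketch. It assumes the roughly 150 GB/hour rate and the 168-hour limit mentioned above, and it treats throughput as constant, which real transfers won't be. The function names and the 10 TB example size are hypothetical and chosen only for illustration.

```python
# Rough estimate of how long delocalization might take.
# Assumptions (from the casual benchmark above, not measured rigorously):
#   - sustained throughput of about 150 GB/hour
#   - a 168-hour time limit on non-preemptible VMs
# All function names here are illustrative, not part of any Terra or gcloud API.

ASSUMED_RATE_GB_PER_HOUR = 150.0
ASSUMED_LIMIT_HOURS = 168.0


def transfer_hours(size_gb: float, rate_gb_per_hour: float = ASSUMED_RATE_GB_PER_HOUR) -> float:
    """Hours needed to move size_gb at a constant rate."""
    if rate_gb_per_hour <= 0:
        raise ValueError("rate_gb_per_hour must be positive")
    return size_gb / rate_gb_per_hour


def fraction_of_limit(
    size_gb: float,
    rate_gb_per_hour: float = ASSUMED_RATE_GB_PER_HOUR,
    limit_hours: float = ASSUMED_LIMIT_HOURS,
) -> float:
    """Share of the VM time limit consumed by the transfer alone."""
    return transfer_hours(size_gb, rate_gb_per_hour) / limit_hours


def rate_mb_per_second(rate_gb_per_hour: float) -> float:
    """Convert GB/hour to MB/s using decimal units (1 GB = 1000 MB)."""
    return rate_gb_per_hour * 1000.0 / 3600.0


if __name__ == "__main__":
    size_gb = 10_000.0  # hypothetical 10 TB of task outputs
    print(f"Rate: {rate_mb_per_second(ASSUMED_RATE_GB_PER_HOUR):.1f} MB/s")
    print(f"Transfer time: {transfer_hours(size_gb):.1f} hours")
    print(f"Share of time limit: {fraction_of_limit(size_gb):.0%}")
```

With those assumed numbers, 150 GB/hour works out to a bit under 42 MB/s, and a hypothetical 10 TB of outputs would need about 67 hours to delocalize, roughly 40% of the 168-hour limit before the task does any actual work.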
Thank you for voicing your support for this feature request! I will make a note of your concerns, which our product team will take into consideration when prioritizing new features.