I have some very long-running jobs (~5 days) that I have been trying to run for a few weeks now. I am seeing some odd behavior in Terra: there is a delay of 3+ days between submission and job start. The real issue is the built-in 7-day time limit on Terra jobs, which seems to count from submission rather than from job start, leaving me with <4 days for a given job. This ends up wasting a lot of money, because the jobs all fail on this time limit even though the GCS-imposed limit is much longer than a week.
Is there any way to get around this? Either the submission-to-start lag or the time limit?
I really don't understand why it takes only ~5 minutes to provision a GCS VM through Terra, but 3 days to start a batch job. I get that they may be using different Google services, but 3 days just seems astounding for cloud computing.
Any advice would be great! Thanks!