Long-running command in Terra Cloud Environment terminal

Post author
Stephen Fleming

I have a long-running command I'd like to run in the terminal on a Terra workspace Cloud Environment.  I'd like it to be able to crank away on a GPU for days, in some cases.

However, there seems to be a timeout that turns off my VM.  I had thought the timeout was only for "inactivity".

Can you tell me how to allow a long-running command in terminal to run to completion on a Terra VM?

Thanks,

Stephen

Comments

3 comments

  • Comment author
    Josh Evans

    Hi Stephen,

    Thanks for writing in! You're correct, the Cloud Environment should pause after some time of inactivity.  The inactivity period should start only after any commands have finished running.  What is the environment's autopause set to? Is this a single command, or are you running a shell script with a lot of individual commands? Is there any time during the command's run when the system would be idle?

    If you're running into issues with autopausing and want to ensure it doesn't happen, there's a workaround you can try.  Leave the Cloud Environment's Terminal tab open and set your computer to not sleep.  That'll lock the session and the environment won't pause.  Be careful, though: it won't pause even after the command finishes, so you'll need to check the tab to see when the command is done.
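
    Since the workaround means checking back to see whether the command has finished, a small wrapper can make that check easier.  The following is only a minimal Python sketch, not an official Terra tool, and it does not change autopause behavior.  The function name run_and_log, the file names train.log and train.done, and the example command are all placeholders chosen for illustration.

```python
import subprocess
import sys
import time
from pathlib import Path


def run_and_log(cmd, log_path, done_path):
    """Run cmd, send its output to log_path, then write a completion marker."""
    log_path, done_path = Path(log_path), Path(done_path)
    done_path.unlink(missing_ok=True)  # clear a stale marker from an earlier run
    with log_path.open("w") as log:
        # stdout and stderr both go to the log file so nothing is lost
        result = subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT)
    finished = time.strftime("%Y-%m-%d %H:%M:%S")
    # The marker file only exists once the command has exited.
    done_path.write_text(f"exit_code={result.returncode} finished={finished}\n")
    return result.returncode


if __name__ == "__main__":
    # Hypothetical stand-in for the real long-running training command.
    command = [sys.executable, "-c", "print('training step done')"]
    code = run_and_log(command, "train.log", "train.done")
    print(f"command exited with code {code}; see train.log and train.done")
```

    With something like this, the presence of train.done (and the exit code inside it) shows that the run has ended, so you know when to pause the environment by hand.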

    Please let me know if you have any questions. 

    Best,

    Josh

  • Comment author
    Stephen Fleming

    Hi Josh,

    I think the Environment's autopause is set to 30 mins.

    It's just a single command that runs for a very long time.  It's training an ML model using a GPU, and so the task is cranking away on the CPU and GPU at nearly full blast (for both).  But it's just one command.  There is no time when the system would be idle.

    Okay thanks for the suggestion.

    But I think it would be a good idea to have the IA team test this and look into a fix.  I do not think this should be the desired / expected behavior.

    It seems to me like autopause is not working as intended, unless there's something I'm not understanding.

    Thanks,

    Stephen

  • Comment author
    Josh Evans

    Hi Stephen,

    Thanks for the reply.  I'll certainly have our engineers take a look, but I wanted to make sure that you had a workaround so that you could continue to work in the meantime.  Could you please provide the command that you're using? If you've been seeing this behavior with an existing Cloud Environment, could you also provide the following:

    • Google Project ID for workspace, found on the right side of the workspace dashboard
    • Cluster ID (visit https://app.terra.bio/#clusters, click Details, and see Name field)
    • Screenshot of your cloud environment configuration
    • Approximate time issue first occurred

    Once we have this information, we'll take a look as soon as we can.

    Best,

    Josh 
