Make larger-memory nodes available

Post author
Glenn Hickey

As I understand it, Terra only allows tasks to run on N1 nodes. In practice, this seems to limit available memory to 512 GB. I have genome alignment jobs that can require closer to 700 GB, and I would like to be able to run them on Terra.

The Terra documentation (and tech support!) points to Google Cloud pages that describe the various "M" machine types with very large memory. It would be a really nice feature if Terra were able to use them.

Also: it would be extremely helpful to document which machine types are actually available through Terra, rather than referring users to the Google Cloud docs. I.e., if everything runs on N1 nodes, it would be great if the documentation made that clear.

Thanks.

Comments

9 comments

  • Comment author
    Jason Cerrato

    Hi Glenn,

    Thank you for writing in with this feature request. To confirm, are you looking to take advantage of these "M" nodes in a submitted workflow or in an interactive analysis?

    Kind regards,

    Jason

  • Comment author
    Glenn Hickey

    Personally, I'm just interested in submitted workflows.

    Thanks

    -Glenn

  • Comment author
    Jason Cerrato

    Hi Glenn,

Understood! Our Batch team has work slated this quarter to allow the use of N2 machines, which would allow more than 700 GB of memory with 96 CPUs. They expect this functionality to be available toward the end of this quarter or sometime next quarter. I'm happy to follow up in this thread once I receive word that these are available.

At present, N1 machines allow you to request up to 96 CPUs with no more than 6.5 GB per CPU, meaning you can request up to 624 GB of memory at this time. While this isn't quite the 700 GB you would like, we hope you will be able to perform your analysis without issue. As I mentioned, I'll let you know as soon as I hear N2s are available!
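For concreteness, here is a minimal sketch of how that ceiling maps onto a task's runtime section (the task name and command are hypothetical; `cpu` and `memory` are standard Cromwell runtime attributes, and 96 × 6.5 GB = 624 GB):

```wdl
task align_genomes {
  command <<<
    ./run_alignment.sh   # placeholder for the actual alignment command
  >>>
  runtime {
    docker: "ubuntu:20.04"
    cpu: 96              # N1 maximum
    memory: "624 GB"     # 96 CPUs x 6.5 GB/CPU, the current N1 ceiling
  }
}
```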

    I'll also follow up once I've added information about node availability to our documentation.

    Kind regards,

    Jason

  • Comment author
    Glenn Hickey

    Great news, thanks!  Please do let me know. 

I did try pushing up toward 624 GB. When I did, I didn't get an error, but my tasks seemed to just wait for a long time. With ≤512 GB they get scheduled right away.

  • Comment author
    Jason Cerrato

    Hi Glenn,

Thanks for following up. That makes sense; it likely takes a while for that exact configuration to become available on Google's end.

    Jason

  • Comment author
    Jason Cerrato

    Hi Glenn,

I've updated our documentation to let users know that workflow VMs are set up using N1 machine types: https://support.terra.bio/hc/en-us/articles/360046944671

    Kind regards,

    Jason

  • Comment author
    Jason Cerrato

    Hey Glenn Hickey,

Just wanted to follow up and let you know that the Intel Cascade Lake CPU platform is now available, which uses n2 machine types. You can read more about that here:

    https://support.terra.bio/hc/en-us/articles/4402240831771-June-16-2021
    https://cromwell.readthedocs.io/en/develop/RuntimeAttributes/#cpuplatform
    https://support.terra.bio/hc/en-us/articles/360046944671
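Per the Cromwell runtime-attributes documentation linked above, selecting the platform in a task would look roughly like this (a sketch only; the task name and resource figures are illustrative):

```wdl
task big_mem_align {
  command <<<
    ./run_alignment.sh   # placeholder command
  >>>
  runtime {
    docker: "ubuntu:20.04"
    cpu: 96
    memory: "700 GB"                     # above the 624 GB N1 ceiling
    cpuPlatform: "Intel Cascade Lake"    # requests an n2 machine type
  }
}
```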

    Kind regards,

    Jason

  • Comment author
    Jean Monlong

We would also like to use instances with more memory, closer to 1 TB. The maximum I could request with the currently available machines is 768 GB, using AMD Rome.

I think that supporting Intel Ice Lake would allow us to use up to 864 GB of memory. Ideally, our genome assembly pipeline could use up to ~1 TB of memory on some samples.

    It would be nice to have an option to request instances with that amount of memory.

    Thanks,

    Jean

  • Comment author
    Anika Das

Hello Jean Monlong,

    Thank you for writing in! I've sent this request to our development team for consideration, and I'll be happy to follow up with you if this feature gets built.

    Kind regards,

Anika

