Make larger-memory nodes available

Post author
Glenn Hickey

As I understand it, Terra only allows tasks to run on N1 nodes. In practice, this seems to limit available memory to 512 GB. I have genome alignment jobs that can require closer to 700 GB, and I would like to be able to run them on Terra.

The Terra documentation (and tech support!) points to Google Cloud pages that describe the various "M" machine types with very large memory. It would be a really nice feature if Terra were able to use them.

Also: it would be extremely helpful to document which machine types are actually available through Terra, rather than referring users to the Google Cloud docs. I.e., if everything runs on N1 nodes, it would be great if the documentation made that clear.

Thanks.

Comments

9 comments

  • Comment author
    Jason Cerrato

    Hi Glenn,

    Thank you for writing in with this feature request. To confirm, are you looking to take advantage of these "M" nodes in a submitted workflow or in an interactive analysis?

    Kind regards,

    Jason

  • Comment author
    Glenn Hickey

    Personally, I'm just interested in submitted workflows.

    Thanks

    -Glenn

  • Comment author
    Jason Cerrato

    Hi Glenn,

Understood! Our Batch team has work slated this quarter to allow the use of N2 machines, which would allow more than 700 GB of memory with 96 CPUs. They expect this functionality to be available toward the end of this quarter or sometime next quarter. I'm happy to follow up in this thread once I receive word that these are available.

At present, N1 machines allow you to request up to 96 CPUs with no more than 6.5 GB per CPU, meaning you can request up to 624 GB of memory at this time. While this isn't quite the 700 GB you would like, we hope you will be able to perform your analysis without issue. As I mentioned, I'll let you know as soon as I hear N2s are available!
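For concreteness, here is a minimal sketch of how that ceiling maps onto a task's runtime section (the task name and command are hypothetical; `cpu` and `memory` are standard Cromwell runtime attributes, and 96 × 6.5 GB = 624 GB):

```wdl
task align_genomes {
  command <<<
    ./run_alignment.sh   # placeholder for the actual alignment command
  >>>
  runtime {
    docker: "ubuntu:20.04"
    cpu: 96              # N1 maximum
    memory: "624 GB"     # 96 CPUs x 6.5 GB/CPU, the current N1 ceiling
  }
}
```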

    I'll also follow up once I've added information about node availability to our documentation.

    Kind regards,

    Jason

  • Comment author
    Glenn Hickey

    Great news, thanks!  Please do let me know. 

I did try pushing up toward 624 GB. When I did, I didn't get an error, but my tasks seemed to just wait for a long time. With ≤512 GB they get scheduled right away.

  • Comment author
    Jason Cerrato

    Hi Glenn,

Thanks for following up. That makes sense; it likely takes a while for that exact configuration to become available on Google's end.

    Jason

  • Comment author
    Jason Cerrato

    Hi Glenn,

I've updated our documentation to let users know that workflow VMs are set up using N1 machine types: https://support.terra.bio/hc/en-us/articles/360046944671

    Kind regards,

    Jason

  • Comment author
    Jason Cerrato

    Hey Glenn Hickey,

Just wanted to follow up and let you know that the Intel Cascade Lake CPU platform is now available, which uses n2 machine types. You can read more about that here:

    https://support.terra.bio/hc/en-us/articles/4402240831771-June-16-2021
    https://cromwell.readthedocs.io/en/develop/RuntimeAttributes/#cpuplatform
    https://support.terra.bio/hc/en-us/articles/360046944671
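Per the Cromwell runtime-attributes documentation linked above, selecting the platform in a task would look roughly like this (a sketch only; the task name and resource figures are illustrative):

```wdl
task big_mem_align {
  command <<<
    ./run_alignment.sh   # placeholder command
  >>>
  runtime {
    docker: "ubuntu:20.04"
    cpu: 96
    memory: "700 GB"                     # above the 624 GB N1 ceiling
    cpuPlatform: "Intel Cascade Lake"    # requests an n2 machine type
  }
}
```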

    Kind regards,

    Jason

  • Comment author
    Jean Monlong

We would also like to use instances with more memory, closer to 1 TB. The maximum I could request with the currently available machines is 768 GB, using AMD Rome.

I think that supporting Intel Ice Lake would allow us to use up to 864 GB of memory. Ideally, our genome assembly pipeline could use up to ~1 TB of memory on some samples.

    It would be nice to have an option to request instances with that amount of memory.

    Thanks,

    Jean

  • Comment author
    Anika Das

Hello Jean Monlong,

    Thank you for writing in! I've sent this request to our development team for consideration, and I'll be happy to follow up with you if this feature gets built.

    Kind regards,

Anika

