monitoring_script and monitoring_image options in Terra Completed

Post author
Denis Loginov

Hi Team,

We'd like to be able to monitor the resource utilization (CPU, memory, disk) of tasks in our workflows run on Terra. Are there plans to support monitoring_script and monitoring_image for this use case?

Thanks

Comments

16 comments

  • Comment author
    Chris Kachulis

    Just want to add another vote for this request. Beyond the flexibility of monitoring_script and monitoring_image, building even basic resource-usage reporting into the Terra UI would be very valuable to me. Being able to see the memory, disk, and CPU usage of all of my tasks in Terra, rather than having to modify my WDL to gather these metrics myself, would be really great.

    5
  • Comment author
    Kenneth Westerman

    Any updates on this by chance? Thanks!

    0
  • Comment author
    Jason Cerrato

    Hi Kenny,

    The feature has not yet been built, but I would be happy to reach out and let you know if that changes.

    Kind regards,

    Jason

    0
  • Comment author
    Haley Abel

    Another vote for support of monitoring_script.  It's very difficult to troubleshoot workflows without this capability.

    0
  • Comment author
    Stephen Fleming

    I don't know the best answer, but just wanted to throw a couple links up here for the next time I search for this:

    https://github.com/broadinstitute/dsp-scripts/blob/master/cromwell/methods/cromwell_monitoring_script.sh

    https://github.com/broadinstitute/cromwell-monitor

    0
  • Comment author
    Stephen Fleming

    runtime {
    ...
        monitor_image: quay.io/broadinstitute/cromwell-monitor
    }

    I believe you can do the above in a WDL, but then I am not clear on how to get the results.  Denis Loginov any tips on how to view results?  I don't know what StackDriver is or how to use it.

    0
  • Comment author
    Denis Loginov

    Stephen Fleming unfortunately not: both `monitoring_script` and `monitoring_image` need to be supplied through a Cromwell API call as workflow-level options (they are not available through the `runtime` section of tasks).
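For anyone submitting directly to their own Cromwell server, workflow-level options like the ones described above might be passed roughly as in the sketch below. This is only an illustration: the gs:// path, the server URL, and the file names workflow.wdl and inputs.json are placeholders I made up, and `monitoring_image` would be set as another key in the same options file. The submit helper is defined but never called, so nothing is sent anywhere.

```shell
#!/usr/bin/env bash
set -eu

# Write a workflow options file.
# The gs:// path is a hypothetical placeholder, not a real script location.
cat > options.json <<'EOF'
{
  "monitoring_script": "gs://my-bucket/scripts/monitor.sh"
}
EOF

# Hypothetical helper: submit a workflow with the options file attached.
# CROMWELL_URL, workflow.wdl and inputs.json are assumptions for illustration.
submit_workflow() {
  curl -X POST "${CROMWELL_URL:-http://localhost:8000}/api/workflows/v1" \
    -F workflowSource=@workflow.wdl \
    -F workflowInputs=@inputs.json \
    -F workflowOptions=@options.json
}

echo "Wrote options.json; call submit_workflow to send it to Cromwell."
```

The point is only that the option travels with the submission request, not inside the WDL itself.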

    The Terra team may have some updates for you soon, however ;)

    0
  • Comment author
    Stephen Fleming

    Okay thanks!

    0
  • Comment author
    William Thistlethwaite

    Are there any updates on this? Thanks!

    0
  • Comment author
    Anika Das

    Hi William, 

    Unfortunately these scripts have not yet been prioritized, but I will let the appropriate team know of your interest in the feature!

    Kind Regards, 

    Anika

    0
  • Comment author
    Anika Das

    Hi Chris Kachulis

    Thank you for your note, I will add your comments and upvote to the request for this feature!

    Kind Regards, 

    Anika

    0
  • Comment author
    Thiriveedhi, Vamsi Krishna

    Another upvote to have this feature available in the UI! Thank you.

    0
  • Comment author
    Josh Evans

    Hi,

    Thank you for writing in! I've already added your vote to the feature request and you'll be contacted if the feature gets built.

    Best,

    Josh

    0
  • Comment author
    Thiriveedhi, Vamsi Krishna
    • Edited

    Thank you!

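    To anyone who visits this thread in the future, a quick workaround may be:

    set -e

    # start dstat in the background; --output writes the samples to a CSV file
    dstat -t --cpu --mem --disk --io --output output.csv > /dev/null &
    dstat_pid=$!

    <your bash commands>

    # stop dstat once your commands have finished
    kill "$dstat_pid"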

    This creates a CSV output file and will require more post-processing to get into a usable form. 

    Inspired by the FireCloud paper published in 2017: biorxiv.org/content/10.1101/209494v1.full.pdf

    dstat home page: http://dag.wiee.rs/home-made/dstat/

    To install, use !apt-get install dstat
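One caveat with running a monitor in the background under `set -e`: if your command fails, the script exits before the monitor is stopped. A possible way around that is to record the monitor's PID and stop it from an EXIT trap. The sketch below is only an illustration: the `monitor` function is a stand-in I wrote (timestamp plus `df` output once a second) so the example runs without dstat installed, and usage.log and `sleep 2` are arbitrary placeholders.

```shell
#!/usr/bin/env bash
set -eu

# Stand-in monitor (assumption): logs a timestamp and disk usage every second.
# Replace the loop body with your dstat invocation if dstat is installed.
monitor() {
  while true; do
    echo "$(date +%s) $(df -P . | tail -n 1)"
    sleep 1
  done
}

# Start the monitor in the background and remember its PID.
monitor > usage.log &
MONITOR_PID=$!

# Stop the monitor when the script exits, whether the commands succeed or fail.
trap 'kill "$MONITOR_PID" 2>/dev/null || true' EXIT

# <your bash commands> go here; `sleep 2` is a placeholder workload.
sleep 2
```

Killing by PID rather than by name also avoids stopping an unrelated monitor that happens to be running on the same machine.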

     

    0
  • Comment author
    Adam Nichols

    Hi all,

    Terra developer here. Pleased to report that this feature has shipped. Check it out!

    Best,
    Adam

    0
  • Comment author
    Adam Mullen

    Hello all! I am a product manager working on Terra and I'm excited to provide another update here. In addition to monitoring_script support, Google has recently improved its VM monitoring tools, and we've added the ability for Terra GCP Workspace Owners and Workspace Writers with can_compute to access and create Monitoring Dashboards in the GCP Cloud Console.

    These dashboards allow you to monitor interactive and batch VMs for performance and debugging purposes. You can also monitor quotas and set up alerts for any metric you want to track in your workspaces. See https://cloud.google.com/monitoring/charts/dashboards for more information. Please check it out and let us know if you have any questions or feedback!

    0
