Support workflows with more than 50,000 nodes



  • Matt Bookman

    Two additional notes:

    1- My statement above was incorrect:

    I can then re-run the workflow with call caching enabled, commenting out the ApplyRecalibration, in order to gather the metrics.

    CollectMetricsSharded takes the output of ApplyRecalibration as its input, so it isn't as simple as I indicated. We will need to craft a separate workflow that takes the ApplyRecalibration output as input and does the metrics collection and gathering.

    2- I also noticed that the maximum number of jobs is configurable in Cromwell and the default is 1,000,000:

    private val DefaultTotalMaxJobsPerRootWf = 1000000
    private val DefaultMaxScatterSize = 1000000
    private val TotalMaxJobsPerRootWf = params.rootConfig.getOrElse("", DefaultTotalMaxJobsPerRootWf)
    private val MaxScatterWidth = params.rootConfig.getOrElse("system.max-scatter-width-per-scatter", DefaultMaxScatterSize)

    If possible, please raise the limit in Terra's Cromwell configuration to 60,000 so that the joint discovery workflow can run to completion.
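
    To illustrate the fallback pattern in the Cromwell snippet above, here is a minimal sketch. It assumes a plain Map as a hypothetical stand-in for Cromwell's rootConfig (the real object is a configuration type, not a Map), and LimitLookupSketch and maxScatterWidth are names made up for this example. Only the key and default that appear in the snippet are used.

    ```scala
    object LimitLookupSketch {
      // Default copied from the Cromwell snippet quoted above.
      val DefaultMaxScatterSize = 1000000

      // Hypothetical stand-in for the config lookup: return the configured
      // value if the key is present, otherwise fall back to the default.
      def maxScatterWidth(rootConfig: Map[String, Int]): Int =
        rootConfig.getOrElse("system.max-scatter-width-per-scatter", DefaultMaxScatterSize)

      def main(args: Array[String]): Unit = {
        // No override configured: the default applies.
        println(maxScatterWidth(Map.empty))
        // Override configured: the configured value wins.
        println(maxScatterWidth(Map("system.max-scatter-width-per-scatter" -> 60000)))
      }
    }
    ```

    With no entry for the key, the lookup returns the 1,000,000 default; with an entry, it returns the configured value. That suggests a lower limit seen in Terra would come from an explicit override rather than from the Cromwell default.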

  • Matt Bookman

    Note that I have added a GitHub issue for the workflow itself:

    If the 50,000 limit is going to remain a hard limit, there are options within the workflow to examine.
