Support workflows with more than 50,000 nodes



  • Matt Bookman

    Two additional notes:

    1- My statement above was incorrect:

    "I can then re-run the workflow with call caching enabled, commenting out the ApplyRecalibration, in order to gather the metrics."

    CollectMetricsSharded needs the output of ApplyRecalibration as its input, so it isn't as simple as I indicated. We will need to craft a separate workflow that takes the ApplyRecalibration output as input and does the metrics collection and gathering.

    2- I also noticed that the maximum number of jobs is configurable in Cromwell and the default is 1,000,000:


    private val DefaultTotalMaxJobsPerRootWf = 1000000
    private val DefaultMaxScatterSize = 1000000
    private val TotalMaxJobsPerRootWf = params.rootConfig.getOrElse("system.total-max-jobs-per-root-workflow", DefaultTotalMaxJobsPerRootWf)
    private val MaxScatterWidth = params.rootConfig.getOrElse("system.max-scatter-width-per-scatter", DefaultMaxScatterSize)

    If possible, please increase the Terra configuration to 60,000 so that the joint discovery workflow can run to completion.
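
    For reference, here is a rough sketch of what that override might look like in a Cromwell configuration file (HOCON). The key name comes from the lookup in the code above. I am assuming that Terra's 50,000 limit is set through this same key, and I don't know where Terra's Cromwell configuration actually lives.

    ```hocon
    # Sketch only: assumes Terra's limit is controlled by this Cromwell key.
    system {
      # Raise the per-root-workflow job cap to the requested 60,000.
      total-max-jobs-per-root-workflow = 60000
    }
    ```

    The max-scatter-width-per-scatter setting is left out here, so it would keep whatever value is already in effect.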

  • Matt Bookman

    Note that I have added a GitHub issue for the workflow itself:


    If this 50,000 limit is going to remain a hard limit, there are options within the workflow that we can examine.
