Summary
The issue was reported at 12:02 PM EDT on Monday, June 17. It affects users who ran workflows with call caching enabled between June 15 and June 21 at 2:00 AM EDT, when the fix was released. See the Timeline section for the latest troubleshooting and resolution updates, and the Impact section to understand how this could affect your use of the system.
Timeline
June 21 - Issue resolved
June 20, 12:00 PM EDT - Release extended until 2:00 AM EDT
Today's rollout of the Cromwell call cache fix was expected to require 5 hours of downtime (4:00 PM to 9:00 PM EDT). Unfortunately, it is taking longer than expected, and the downtime is now expected to last until 2:00 AM tomorrow morning.
June 20, 4:00-9:00 PM EDT - Release scheduled
June 18, 12:28 PM EDT - Bug fix rescheduled to be included in a release on June 20 from 4:00-9:00 PM EDT. During the release window, Terra will be accessible, but you will not be able to view workflow details in Job Manager, running workflows will be paused, and new workflows will be queued. Workflows will resume after the outage.
June 17, 09:15 PM EDT - Bug fix is ready for release on June 18. The expected outage time is around 4:00-9:00 PM EDT.
June 17, 01:18 PM EDT - Testing a bug fix
June 17, 12:02 PM EDT - Issue reported
June 15 - Issue starts
Impact
Call caching is not working as expected. Jobs that completed successfully with call caching enabled between June 15 and June 21 at 2:00 AM EDT (when the fix was released) will rerun if relaunched. If you launch a submission during that window, call caching will only pick up successful jobs run up until June 15. Here is a visual diagram that explains the impact on call caching during this incident.
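The rule described above can also be written as a small check. This is only an illustrative sketch, not Cromwell or Terra code: the function names are hypothetical, the year is not stated in this article (2019 is an assumed placeholder in which June 17 falls on a Monday), and the incident is assumed to start at midnight EDT on June 15 because no start time is given.

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC-4 offset standing in for EDT.
EDT = timezone(timedelta(hours=-4))

# Assumption: the article does not state the year; 2019 is a placeholder.
ASSUMED_YEAR = 2019

# Assumption: no start time is given for June 15, so midnight EDT is used.
INCIDENT_START = datetime(ASSUMED_YEAR, 6, 15, 0, 0, tzinfo=EDT)
# The fix was released on June 21 at 2:00 AM EDT.
FIX_RELEASED = datetime(ASSUMED_YEAR, 6, 21, 2, 0, tzinfo=EDT)


def completed_in_affected_window(completed_at: datetime) -> bool:
    """True if a job finished while call caching was not working."""
    return INCIDENT_START <= completed_at < FIX_RELEASED


def expect_cache_hit(prior_completed_at: datetime, relaunched_at: datetime) -> bool:
    """Hypothetical helper: would a relaunch be expected to reuse the earlier result?

    A job that completed successfully inside the affected window is not
    expected to be picked up by call caching, so it reruns when relaunched.
    """
    if prior_completed_at >= relaunched_at:
        # The earlier job had not finished when the relaunch started.
        return False
    return not completed_in_affected_window(prior_completed_at)


if __name__ == "__main__":
    before = datetime(ASSUMED_YEAR, 6, 14, 10, 0, tzinfo=EDT)
    during = datetime(ASSUMED_YEAR, 6, 16, 10, 0, tzinfo=EDT)
    relaunch = datetime(ASSUMED_YEAR, 6, 22, 9, 0, tzinfo=EDT)
    print(expect_cache_hit(before, relaunch))  # True: completed before June 15
    print(expect_cache_hit(during, relaunch))  # False: completed during the incident
```

In practice, this means that a job which succeeded during the incident may run again, and incur compute cost again, when you relaunch it.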
For more information
Please follow this article to get the most up-to-date information on this incident. If you would like to be notified of all service incidents or upcoming scheduled maintenance, click Follow on this page.