Between 5:42 PM and 7:00 PM UTC on Thursday, August 7, 2025, customers using Feature Flags in the Prod1, Prod2, and Prod3 clusters encountered “Failed to fetch (5xx)” errors when attempting to read or evaluate feature flags from the backend database. All other Harness modules (including CI, CD, STO, SSCA, Code, AR, and Pipeline) remained fully functional.
As part of enhancing our Disaster Recovery (DR) capabilities to improve the overall availability of the Harness platform, we introduced a new architecture that relies on backup-and-restore mechanisms during disaster scenarios. However, this change inadvertently brought down the replica database(s) that were serving read traffic for the Feature Flag service. Under normal circumstances, no service in the live (primary) region should connect to resources in the DR region; this was an isolated, one-off case.
During the incident, some customers were unable to retrieve and evaluate feature flag definitions when attempting to connect to Harness services. In those cases, flag evaluations returned the default value configured in the customer’s applications.
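The fallback behavior described above can be illustrated with a minimal sketch. This is not the Harness SDK implementation; the names `evaluate_flag`, `FlagFetchError`, `healthy_fetch`, and `failing_fetch` are hypothetical and exist only to show how an application ends up with its configured default when flag definitions cannot be fetched.

```python
from typing import Callable, Dict


class FlagFetchError(Exception):
    """Hypothetical error raised when the flag backend is unreachable (e.g. a 5xx)."""


def evaluate_flag(name: str, default: bool, fetch: Callable[[], Dict[str, bool]]) -> bool:
    """Return the flag value from the backend, or `default` if the fetch fails."""
    try:
        flags = fetch()
    except FlagFetchError:
        # Backend unavailable: fall back to the application's configured default.
        return default
    # A flag missing from a successful response also falls back to the default.
    return flags.get(name, default)


def healthy_fetch() -> Dict[str, bool]:
    # Stand-in for a successful read of flag definitions.
    return {"new_checkout": True}


def failing_fetch() -> Dict[str, bool]:
    # Stand-in for the failure mode seen during the incident.
    raise FlagFetchError("Failed to fetch (5xx)")


print(evaluate_flag("new_checkout", False, healthy_fetch))  # served value
print(evaluate_flag("new_checkout", False, failing_fetch))  # configured default
```

In this sketch the application keeps running during an outage, but every evaluation resolves to the default, so any flag whose served value differs from its default behaves as if it had been reset.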
Rollout and Testing
Evaluate and enhance Feature Flag architecture
Improve Change Management Process