Between December 29th 2025 02:45 A.M to 09:44 A.M, PT customers using Harness Pipelines in Prod2 experienced few stuck executions.
On November 29th, we deployed a configuration change intended to improve system performance and observability. This change modified the processing model of a failure strategy related event, making it asynchronous.Inadvertently, due to this change, existing idempotent logic incorrectly classified valid failure strategy events as duplicates. As a result, these events were not processed, causing a small number of pipeline executions that triggered failure strategies to become stuck.
Customers experienced the following during this incident:
We identified the configuration change as the root cause and reverted it. The corrected configuration was deployed to Prod2, restoring normal pipeline behavior.
Following the fix, Harness ran a cleanup job in the subsequent hours to identify and abort any remaining stuck executions.
To prevent similar issues in the future, we are taking the following actions: