On April 27, 2026, customers running pipelines in the Prod3 environment experienced intermittent slowness in pipeline execution and delays in execution status updates in the UI.
The incident was caused by an unexpected load spike that led to contention on a backend database supporting pipeline orchestration. The issue has been mitigated and fully resolved.
Incident window: April 27, 2026, 1:00 PM – 3:12 PM PDT
There was no data loss. The majority of pipelines continued to execute successfully, with the primary impact being increased latency and delayed UI updates.
Pipeline orchestration relies on a backend database to track execution state and power the execution view in the UI.
During the incident, a spike in load increased query latency across the orchestration layer. This resulted in a backlog that caused UI updates to lag behind actual pipeline execution until the system was scaled.
Immediate Mitigation
These actions restored normal pipeline performance and UI responsiveness.
To prevent such issues from happening again.