Unified pipeline dashboard is experiencing delay in Prod3

Incident Report for Harness

Postmortem

Summary:

On 08 May 2025 at 4:05 AM UTC, a system release introduced a change that increased data ingestion activity. As a result, customers experienced latency in the Unified Pipeline dashboard, which caused temporary data staleness within the Prod3 environment.

There was no downtime or data loss, but dashboard visibility was temporarily impacted. The issue has since been mitigated, and preventive measures are being implemented to avoid recurrence.

What was the issue?

A system release triggered high-frequency data ingestion, which caused processing delays and led to temporary latency in the Unified Pipeline dashboard within the Prod3 environment.

Timeline

Timeline                      Activity
08 May 2025 at 4:05 AM UTC    Prod3 system release completed
08 May 2025 at 12:17 PM UTC   Issue was identified
08 May 2025 at 1:48 PM UTC    A fix to increase system resources of our databases was validated and implemented
08 May 2025 at 3:20 PM UTC    Data was caught up and issue resolved

Resolution

To mitigate the processing delays, the database resources were scaled up, which restored normal processing throughput. Once the delayed data had caught up, the latency in the Unified Pipeline dashboard was resolved and full visibility was restored in the Prod3 environment.

Next Steps

Move major data migrations under a feature flag and schedule them over weekends to ensure better processing performance and minimize impact on live systems.
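
As an illustration only, the gating described in this step could be sketched in Python as below. This is a minimal sketch under stated assumptions, not Harness's actual implementation: the function names `is_weekend` and `should_run_migration`, the `flag_enabled` parameter, and the choice to evaluate the weekend window in UTC are all hypothetical.

```python
from datetime import datetime, timezone


def is_weekend(now: datetime) -> bool:
    """Return True when `now` falls on a Saturday or Sunday."""
    # datetime.weekday() numbers Monday as 0, so Saturday is 5 and Sunday is 6.
    return now.weekday() >= 5


def should_run_migration(flag_enabled: bool, now: datetime) -> bool:
    """Allow a major data migration only when both conditions hold.

    flag_enabled: hypothetical feature-flag value for this migration.
    now: a timezone-aware timestamp; it is normalized to UTC (an assumption)
         before the weekend check.
    """
    now_utc = now.astimezone(timezone.utc)
    return flag_enabled and is_weekend(now_utc)


if __name__ == "__main__":
    release_time = datetime(2025, 5, 8, 4, 5, tzinfo=timezone.utc)
    following_saturday = datetime(2025, 5, 10, 4, 5, tzinfo=timezone.utc)
    print(should_run_migration(True, release_time))
    print(should_run_migration(True, following_saturday))
```

With a check of this kind, a migration shipped in a weekday release would stay dormant until the flag is turned on and the weekend window opens, and turning the flag off would stop it from starting in any window.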

Posted May 22, 2025 - 08:09 PDT

Resolved

This incident has been resolved.
Posted May 08, 2025 - 08:23 PDT

Investigating

We're currently experiencing delays in the Unified Pipeline dashboard on Prod3.
Our team is actively investigating the issue and will share an update shortly.
Posted May 08, 2025 - 05:27 PDT
This incident affected: Prod 3 (Custom Dashboards).