PROD1: Stale Data Observed for Unified Custom Dashboards

Incident Report for Harness

Postmortem

Summary

On April 4, 2025, customers in the prod-1 production environment saw stale data in the pipeline, stage, and step execution custom dashboards for 35 minutes. We found that an incorrect version of the ETL process had been deployed by mistake, which caused its runs to fail periodically.

Resolution

Upgrading the ETL process to a newer version addressed this issue.

RCA

Pipeline, stage, and step execution custom dashboards showed stale data because the ETL process was upgraded to an incorrect version. The upgrade caused periodic execution failures, which interrupted the data refresh for the views behind these dashboards. No data was lost, but the dashboards briefly rendered stale data.
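
As an illustration only, the kind of staleness described above could be detected by comparing each view's last successful refresh time against a maximum allowed age. The sketch below is not the check Harness runs; the view names, the 15-minute threshold, and the function names are all assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical threshold: the report does not state the expected refresh interval.
MAX_AGE = timedelta(minutes=15)


def is_stale(last_refresh: datetime, now: datetime,
             max_age: timedelta = MAX_AGE) -> bool:
    """Return True when the last successful refresh is older than max_age."""
    return now - last_refresh > max_age


def stale_views(last_refresh_by_view: dict[str, datetime], now: datetime,
                max_age: timedelta = MAX_AGE) -> list[str]:
    """Return the names of views whose data is older than max_age, sorted."""
    return sorted(
        name
        for name, refreshed_at in last_refresh_by_view.items()
        if is_stale(refreshed_at, now, max_age)
    )


if __name__ == "__main__":
    # Hypothetical view names and timestamps, for demonstration only.
    now = datetime(2025, 4, 4, 17, 43, tzinfo=timezone.utc)
    refreshes = {
        "pipeline_executions": now - timedelta(minutes=35),
        "stage_executions": now - timedelta(minutes=3),
    }
    print(stale_views(refreshes, now))
```

A check of this shape only reports that a refresh is overdue; it does not say why the ETL run failed.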

Action Items

  • Implement a robust deployment process and runbooks to prevent unintended upgrades.
Posted Apr 23, 2025 - 21:13 PDT

Resolved

This incident has been resolved. Thanks for your patience.
Posted Apr 04, 2025 - 10:43 PDT

Identified

We have identified the issue and are working on a fix.
Posted Apr 04, 2025 - 10:08 PDT

Investigating

We are currently investigating this issue.
Posted Apr 04, 2025 - 10:08 PDT
This incident affected: Prod 1 (Custom Dashboards).