PROD4 – Overview Dashboards Experiencing Issues

Incident Report for Harness

Postmortem

Summary

On February 3, 2026, dashboards in the Prod4 environment experienced data-fetch failures following the deployment . The issue was caused by an unintended configuration change prevented dashboard metrics from loading.The issue was identified and mitigated the same day through configuration correction and service redeployment.

Impact

  • Dashboard metrics failed to load in the Prod4 environment.
  • Underlying system functionality remained operational.
  • No data loss occurred.
  • The issue was limited to dashboard data visibility.

Root Cause

A recent update to a shared configuration chart was introduced to support an unrelated migration initiative.

This change resulted in authentication failures, preventing services from connecting to the database and loading dashboard data

Mitigation

The issue was mitigated through the following steps:

  • Reverted the shared configuration to restore normal functionality.
  • Applied an emergency configuration fix to explicitly set the correct values.
  • Rebuilt and redeployed all affected services across relevant versions.

Action Items

To prevent recurrence, the following improvements are being implemented:

  • Enhance version control mechanisms to prevent automatic adoption of shared configuration changes without explicit service-level upgrades.
  • Improve configuration precedence logic to ensure safer fallback mechanisms.
  • Enhance monitoring and validation for all dependency connectivity.
  • Strengthen regression validation processes for shared configuration changes.
Posted Feb 16, 2026 - 10:31 PST

Resolved

This incident has been resolved.
Posted Feb 03, 2026 - 06:26 PST

Monitoring

A fix has been implemented and we are monitoring the results.
Posted Feb 03, 2026 - 04:39 PST

Identified

The issue has been identified and a fix is being implemented.
Posted Feb 03, 2026 - 03:56 PST

Investigating

We are currently investigating this issue.
Posted Feb 03, 2026 - 03:24 PST
This incident affected: Prod 4 (Continuous Delivery - Next Generation (CDNG), Platform).