CCM Prod2 : Degraded Performance - Cluster Utilization and Cluster cost dashboards are not loading

Incident Report for Harness

Postmortem

Summary

Between 10:40 AM and 12:03 PM PDT on Aug 5th, customers were unable to use the Cluster cost and utilization dashboards.

Root Cause

While implementing a new Looker feature, the team inadvertently introduced a bug by including a required_access_grant within a refinement.

Impact

During the incident window, the Cluster cost and utilization dashboards did not load for users on Prod 2.

Mitigation

  • Deployed an updated build containing the fix.

Action Items

To prevent a recurrence of this incident, we will be implementing the following key actions:

  • Further strengthen monitoring and alerting to detect early signs of similar issues.

  • Implement a preventative mechanism by enhancing the testing process. We are adding further code checks and increasing test coverage so that such issues do not recur.
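
As an illustration of the kind of code check mentioned in the action items, the sketch below scans LookML text for a grant parameter placed inside a refinement, the pattern named in the root cause. It is a minimal sketch, not the check Harness actually uses: the function name `find_grants_in_refinements`, the regular expressions, and the sample view and grant names are all hypothetical, and it assumes refinements are written as `view: +name { ... }` or `explore: +name { ... }`.

```python
import re

# Hypothetical patterns: the opening of a LookML refinement (note the "+"),
# e.g. `view: +cluster_cost {`, and a grant parameter in singular or plural form.
REFINEMENT_OPEN = re.compile(r"\b(view|explore)\s*:\s*\+([\w.]+)\s*\{")
GRANT_PARAM = re.compile(r"\brequired_access_grants?\s*:")


def find_grants_in_refinements(lookml: str) -> list[tuple[str, str, int]]:
    """Return (kind, name, line_number) for each grant parameter found
    anywhere inside a refinement block, including nested blocks."""
    findings = []
    for match in REFINEMENT_OPEN.finditer(lookml):
        kind, name = match.group(1), match.group(2)
        # Walk forward from the opening brace to its matching closing brace.
        depth = 1
        pos = match.end()
        while pos < len(lookml) and depth > 0:
            if lookml[pos] == "{":
                depth += 1
            elif lookml[pos] == "}":
                depth -= 1
            pos += 1
        body = lookml[match.end():pos]
        for grant in GRANT_PARAM.finditer(body):
            # Convert the character offset into a 1-based line number.
            line = lookml.count("\n", 0, match.end() + grant.start()) + 1
            findings.append((kind, name, line))
    return findings


if __name__ == "__main__":
    # Hypothetical sample: one refinement and one ordinary view.
    sample = "\n".join([
        "view: +cluster_cost {",
        "  required_access_grants: [can_view_cost]",
        "}",
        "view: cluster_utilization {",
        "  required_access_grants: [can_view_cost]",
        "}",
    ])
    for kind, name, line in find_grants_in_refinements(sample):
        print(f"{kind} refinement '+{name}' sets a grant on line {line}")
```

A check along these lines could run in CI before a LookML change is deployed. It is a coarse text scan rather than a LookML parser, so unbalanced braces inside comments or strings could confuse it.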

Posted Sep 16, 2025 - 10:30 PDT

Resolved

This incident has been resolved.
Posted Aug 05, 2025 - 12:03 PDT

Monitoring

A fix has been implemented and we are monitoring the results.
Posted Aug 05, 2025 - 12:00 PDT

Identified

The issue has been identified and a fix is being implemented.
Posted Aug 05, 2025 - 11:58 PDT

Investigating

We are currently investigating this issue.
Posted Aug 05, 2025 - 11:15 PDT
This incident affected: Prod 2 (Cloud Cost Management (CCM)).