[Prod-8] Degraded access to the login page

Incident Report for Harness

Postmortem

Summary

On March 2, the prod8 environment became temporarily inaccessible due to a configuration issue introduced during a platform deployment. The issue affected ingress routing for the platform UI, resulting in HTTP 404 responses when users attempted to access the environment.

The issue was quickly identified as an ingress configuration problem. A temporary mitigation was applied by updating the ingress configuration, which immediately restored access. A permanent fix is being implemented to prevent recurrence.

Root Cause

The issue was caused by a service configuration that generated incorrect ingress configuration during deployment. As a result, the ingress controller could not match incoming requests to the expected routing rule, so it directed them to the default backend, which returned 404 responses. The problem was isolated to the ingress routing layer. Network connectivity and the Google Cloud Network Load Balancer were functioning normally.
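To illustrate the failure mode described above, the following is a minimal sketch of how a request falls through to a default backend when no ingress rule matches it. It is a simplified model, not the behavior of any specific ingress controller: the hostnames, the `Rule` type, the `route` function, and the `default-backend` name are all hypothetical, and path matching is reduced to a plain string-prefix check.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    host: str         # exact host the rule applies to
    path_prefix: str  # path prefix handled by the rule
    backend: str      # service that receives matching requests


# Hypothetical name for the catch-all backend that answers with 404.
DEFAULT_BACKEND = "default-backend"


def route(rules, host, path):
    """Return the backend of the longest matching rule, or the default backend."""
    matches = [
        r for r in rules
        if r.host == host and path.startswith(r.path_prefix)
    ]
    if not matches:
        return DEFAULT_BACKEND
    return max(matches, key=lambda r: len(r.path_prefix)).backend


# Assumed example hosts; the real hostnames are not given in this report.
correct_rules = [Rule("app.example.com", "/", "platform-ui")]
broken_rules = [Rule("wrong-host.example.com", "/", "platform-ui")]

print(route(correct_rules, "app.example.com", "/login"))
print(route(broken_rules, "app.example.com", "/login"))
```

With the correctly rendered rule the request reaches `platform-ui`; with the incorrectly rendered host entry the same request matches nothing and lands on the default backend.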

Impact

  • Affected Environment: prod8
  • Customer Impact: Users were unable to access the platform UI and received HTTP 404 responses.
  • Scope: Limited to the specific environment impacted by the ingress configuration change.

Resolution

Engineering teams applied a temporary mitigation by patching the platform-ui ingress configuration in production to remove the incorrect host entries. This restored correct routing behavior and resolved the accessibility issue.

Access to the prod8 environment was fully restored after the ingress configuration update.

Prevention and Improvements

To prevent recurrence of this issue, the following steps are underway:

  • Adding validation checks to ensure ingress configuration is rendered correctly during deployment.
  • Improving deployment testing for ingress routing scenarios to detect configuration regressions earlier.

These improvements are intended to catch similar misconfigurations before they reach production environments.
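As an illustration of what a pre-deployment validation check could look like, the sketch below inspects a rendered Kubernetes Ingress manifest (already parsed into a dictionary) and reports hosts that are not on an expected list, along with rules that have no paths or no backend service. This is an assumption about the general shape of such a check, not the validation actually being implemented: the `validate_ingress` function, the allowed-host list, and the example hostnames are hypothetical. Only the manifest field names (`spec.rules[].host`, `http.paths[].path`, `backend.service.name`) follow the standard `networking.k8s.io/v1` Ingress schema.

```python
def validate_ingress(manifest, allowed_hosts):
    """Return a list of human-readable problems found in a rendered Ingress."""
    errors = []
    rules = (manifest.get("spec") or {}).get("rules") or []
    if not rules:
        errors.append("no rules: all traffic would go to the default backend")

    for i, rule in enumerate(rules):
        host = rule.get("host")
        if host not in allowed_hosts:
            errors.append(f"rule {i}: unexpected host {host!r}")

        paths = (rule.get("http") or {}).get("paths") or []
        if not paths:
            errors.append(f"rule {i}: no paths defined")

        for p in paths:
            path = p.get("path", "")
            if not path.startswith("/"):
                errors.append(f"rule {i}: path {path!r} must start with '/'")
            service = ((p.get("backend") or {}).get("service") or {}).get("name")
            if not service:
                errors.append(f"rule {i}: path {path!r} has no backend service")

    return errors


# Hypothetical rendered manifest with an incorrect host entry.
rendered = {
    "apiVersion": "networking.k8s.io/v1",
    "kind": "Ingress",
    "metadata": {"name": "platform-ui"},
    "spec": {
        "rules": [
            {
                "host": "wrong-host.example.com",
                "http": {
                    "paths": [
                        {
                            "path": "/",
                            "pathType": "Prefix",
                            "backend": {
                                "service": {"name": "platform-ui", "port": {"number": 80}}
                            },
                        }
                    ]
                },
            }
        ]
    },
}

problems = validate_ingress(rendered, allowed_hosts={"app.example.com"})
for problem in problems:
    print(problem)
```

A deployment pipeline could run a check of this kind against the rendered manifest and stop the rollout whenever the returned list is not empty.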

Posted Mar 04, 2026 - 11:22 PST

Resolved

This incident has been resolved.
Posted Mar 03, 2026 - 08:40 PST

Monitoring

A fix has been implemented and we are monitoring the results.
Posted Mar 03, 2026 - 02:43 PST

Investigating

We are currently investigating this issue.
Posted Mar 03, 2026 - 02:32 PST
This incident affected: Prod 8 (Platform).