Prod1/Prod2 pipelines and logins are degraded. Some delegates are disconnected

Incident Report for Harness

Postmortem

Summary

On February 26, 2026, multiple customers experienced disruptions accessing Harness on Prod1 and Prod2. A transient network connectivity issue disrupted our backend systems, leaving the platform unresponsive. Service was restored within approximately one hour.

Impact

  • Customers on Prod2 were unable to log in or access the Harness platform.
  • Prod1 experienced login disruptions due to a cross-environment dependency on Prod2.
  • Delegates disconnected; Kubernetes-based delegates reconnected automatically, while non-Kubernetes delegates required a manual restart.

Root Cause

A transient network connectivity disruption caused connection timeouts across the platform. The exact infrastructure-side trigger of the initial connectivity disruption is still under investigation.

Remediation

  • Immediate: Affected services were manually restarted, clearing stuck connections and restoring platform availability.
  • Short-term: Autoscaling limits were adjusted to better handle sudden reconnection load.
  • Ongoing: Investigation into timeout configuration and application resilience improvements is in progress.

Action Items

To prevent such issues from happening again:

  1. Review and update timeout settings so that calls fail fast and thread blocking is limited during connectivity issues.
  2. Improve application resilience: enhance circuit breakers to contain connectivity failures and limit retries.
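
As an illustration of the two action items above, the following is a minimal sketch of a fail-fast timeout combined with a circuit breaker. It is not Harness's implementation: the names `CircuitBreaker`, `CircuitOpenError`, and `check_tcp`, and the threshold and timeout values, are hypothetical and chosen only for the example.

```python
import socket
import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    """Hypothetical breaker: opens after N consecutive failures, then
    allows a trial call once reset_timeout seconds have elapsed."""

    def __init__(self, failure_threshold=3, reset_timeout=30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    @property
    def state(self):
        if self._opened_at is None:
            return "closed"
        if self._clock() - self._opened_at >= self.reset_timeout:
            return "half-open"
        return "open"

    def call(self, func, *args, **kwargs):
        # While open, reject immediately instead of blocking a thread.
        if self.state == "open":
            raise CircuitOpenError("dependency marked unhealthy; failing fast")
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            # Open on reaching the threshold, or re-open if a half-open
            # trial call failed.
            if (self._failures >= self.failure_threshold
                    or self._opened_at is not None):
                self._opened_at = self._clock()
            raise
        # Any success closes the circuit and resets the failure count.
        self._failures = 0
        self._opened_at = None
        return result


def check_tcp(host, port, timeout=2.0):
    """Fail-fast probe: the explicit timeout bounds how long the caller
    blocks when the network is unreachable (values are assumptions)."""
    with socket.create_connection((host, port), timeout=timeout):
        return True
```

A caller would wrap each outbound dependency call, for example `breaker.call(check_tcp, host, port)`, so that after repeated timeouts further requests are rejected immediately rather than tying up threads until the dependency recovers.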
Posted Mar 02, 2026 - 08:30 PST

Resolved

This incident has been resolved.
Posted Feb 26, 2026 - 10:29 PST

Monitoring

A fix has been implemented and we are monitoring the results.
Posted Feb 26, 2026 - 10:14 PST

Identified

The issue has been identified and a fix is being implemented.
Posted Feb 26, 2026 - 10:05 PST

Investigating

We are currently investigating this issue.
Posted Feb 26, 2026 - 09:56 PST
This incident affected: Prod 2 (Continuous Delivery - Next Generation (CDNG)) and Prod 1 (Continuous Delivery - Next Generation (CDNG)).