A few pipelines are running slowly and some are failing

Incident Report for Harness

Postmortem

Summary

Due to capacity issues on our primary database, pipeline latency degraded for a couple of our customers, and some of their pipelines errored out.

Root Cause:
One of our databases was under-provisioned for the high workload it experienced.

Mitigation:
We scaled out the database and confirmed that it has enough capacity and headroom for the workload.

Next Steps:

  • Add more granular monitoring for our databases.
  • Update alerting thresholds so that we are warned when utilization increases or capacity runs low.
Posted Sep 08, 2025 - 22:42 PDT

Resolved

This incident has been resolved.
Posted Aug 27, 2025 - 20:32 PDT

Monitoring

A fix has been implemented and we are monitoring the results.
Posted Aug 27, 2025 - 20:30 PDT

Identified

We have identified and mitigated the issue.
Posted Aug 27, 2025 - 20:30 PDT

Investigating

Internal monitoring identified a potential issue with pipelines.
Posted Aug 27, 2025 - 20:29 PDT
This incident affected: Prod 2 (Continuous Delivery - Next Generation (CDNG)).