Issue with In-Progress Pipelines

Incident Report for Harness

Postmortem

Summary

Between September 15, 2025 3:30 PM and 4:11 PM PST, customers with actively running pipelines containing service or infrastructure deployment steps experienced execution failures after a Harness platform upgrade. The affected pipelines were unable to resume execution after the upgrade due to a data compatibility issue.

Only pipelines with service/environment steps that were actively running during the upgrade in prod1 and prod2 were affected. Pipelines in prod3 and above were unaffected, and no data loss occurred.

Root Cause

During the platform upgrade, an internal data structure change prevented active pipelines from resuming execution. Without a compatibility layer, the system could not process pipeline execution data that was created before the upgrade.

Remediation

  • Immediate: Customers were advised to restart the failing pipelines, which immediately resolved the issue.
  • Permanent: Deployed a compatibility layer to handle data from pipelines that started before platform upgrades, ensuring seamless execution continuity.
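
For illustration only, a compatibility layer of this kind might look like the sketch below. It is not the Harness implementation: the record shape, the `schema_version` field, the `environment` change, and all function names are hypothetical. It only shows the general idea of upgrading execution data written before an upgrade into the current format before a pipeline resumes.

```python
from dataclasses import dataclass
from typing import Any, Dict

# Hypothetical: version 2 is the format introduced by the upgrade.
CURRENT_SCHEMA_VERSION = 2


@dataclass
class ExecutionState:
    pipeline_id: str
    step: str
    environment: Dict[str, Any]


def upgrade_record(record: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a stored record converted to the current schema."""
    # Records written before versioning existed are treated as version 1.
    version = record.get("schema_version", 1)
    upgraded = dict(record)  # shallow copy so the stored record is not mutated

    if version == 1:
        # Hypothetical change: v1 stored the environment as a plain name,
        # v2 stores it as a nested object.
        upgraded["environment"] = {"name": record["environment"]}
        version = 2

    if version != CURRENT_SCHEMA_VERSION:
        raise ValueError(f"unsupported schema version: {version}")

    upgraded["schema_version"] = version
    return upgraded


def load_execution_state(record: Dict[str, Any]) -> ExecutionState:
    """Build the in-memory state for resuming, accepting old and new records."""
    current = upgrade_record(record)
    return ExecutionState(
        pipeline_id=current["pipeline_id"],
        step=current["step"],
        environment=current["environment"],
    )
```

With a layer like this, a pipeline that started before the upgrade and one that started after it would both load into the same in-memory state, while a record from an unknown version fails loudly instead of being misread.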

Action Items

  1. Enhanced Testing – Implement testing procedures to validate running pipeline compatibility during platform upgrades.
  2. Improved Change Detection – Strengthen internal monitoring and alerting processes to identify potential compatibility issues during development.
Posted Sep 23, 2025 - 09:33 PDT

Resolved

This incident has been resolved.
Posted Sep 15, 2025 - 04:06 PDT

Monitoring

A fix has been implemented and we are monitoring the results.
Posted Sep 15, 2025 - 03:44 PDT

Investigating

We are currently investigating an issue affecting pipelines that are already in progress. New pipeline executions are not impacted and continue to run successfully.
Posted Sep 15, 2025 - 03:29 PDT
This incident affected: Prod 2 (Continuous Delivery - Next Generation (CDNG)).