infracost integration degraded in iacm pipelines

Incident Report for Harness

Postmortem

Summary

On May 5, 2026, customers using Infracost evaluations in the Prod2 environment experienced failures during infrastructure cost estimation workflows. Infrastructure deployments themselves continued to function normally, but pricing evaluations failed due to requests being routed to an incorrect endpoint configuration.  

The issue was caused by a ingress routing configuration issue during a Prod2-specific rollout. Requests to the Infracost service were routed to a default HTML login surface instead of the expected API endpoint, causing evaluation failures.  

Impact

  • Infracost evaluations failed for customers in the Prod2 environment.
  • Infrastructure deployments and pipeline executions continued functioning normally.
  • The issue was isolated to pricing and evaluation functionality only.
  • Prod1 and other environments were not impacted.  

Root Cause

The incident was caused by a configuration mismatch between the Infracost plugin endpoint and the ingress routing configuration in Prod2. This resulted in invalid responses and caused Infracost evaluations to fail.  

Remediation

The issue was mitigated by rerouting Prod2 traffic to the stable Prod1 Infracost API endpoint while validating routing behavior. Once traffic was repointed, impacted pipelines successfully completed and Infracost evaluations were restored.  

Preventive Actions

  • Add automated validation to ensure ingress routes and endpoint configurations remain aligned.
  • Introduce synthetic health checks for Infracost evaluation APIs across environments.
  • Strengthen rollout validation processes for environment-specific routing overrides.
  • Improve deployment guardrails to detect configuration drift between application and ingress configurations before rollout completion.
Posted May 19, 2026 - 17:36 PDT

Resolved

This incident has been resolved.
Posted May 05, 2026 - 10:51 PDT

Investigating

We are currently investigating this issue.
Posted May 05, 2026 - 10:14 PDT
This incident affected: Prod 2 (Infrastructure as Code Management (IaCM)).