HostedCI is facing intermittent build failures in a few environments

Incident Report for Harness

Postmortem

Summary

On Nov 25, 2025, at 6:11 AM IST, builds began failing for several customers because of an outage in a third-party dependency (Canonical) that is used during build initialization.

Root Cause

During build initialization, an apt-get update step relies on this external dependency. While the dependency was degraded, the step ran significantly slower than usual, causing health checks to time out.
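
The failure mode above, a slow external step that outlasts a health-check window, can be illustrated with a minimal sketch that puts a hard time bound on an initialization command. This is not Harness's implementation: the helper name run_bounded, the 60-second budget, and the exact apt-get options are assumptions chosen for illustration.

```python
import subprocess
import sys

# Illustrative command only. Acquire::http::Timeout is apt's per-connection
# HTTP timeout in seconds; the value 10 is an assumption, not a tuned setting.
APT_UPDATE = ["apt-get", "update", "-o", "Acquire::http::Timeout=10"]


def run_bounded(cmd, timeout_s):
    """Run cmd and return 'ok', 'failed', or 'timed_out'.

    The call never blocks much longer than timeout_s: subprocess.run kills
    the child process and raises TimeoutExpired once the budget is spent.
    """
    try:
        result = subprocess.run(cmd, timeout=timeout_s, capture_output=True)
    except subprocess.TimeoutExpired:
        print(f"step exceeded {timeout_s}s: {cmd[0]}", file=sys.stderr)
        return "timed_out"
    except OSError as exc:
        # For example, the binary is not installed on this image.
        print(f"step could not start: {exc}", file=sys.stderr)
        return "failed"
    return "ok" if result.returncode == 0 else "failed"


def initialize_build(timeout_s=60):
    """Hypothetical init hook: report the step status instead of hanging."""
    return run_bounded(APT_UPDATE, timeout_s)
```

With a bound like this, a degraded mirror yields an explicit timed_out status within a known budget, which the caller can act on before a separate health check expires.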

Impact

Builds failed during the initialization step between 6:11 AM IST and 6:57 AM IST on Nov 25, 2025.

Mitigation

The third-party outage was resolved externally, and the issue self-recovered once the dependency became available again.

Action Items

  • Disable third-party dependency checks during build initialization to avoid similar failures.
  • Introduce proactive pre-build checks with fast-fail behavior to detect and mitigate issues earlier.
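
As a rough sketch of what a pre-build check with fast-fail behavior might look like, the snippet below attempts one short TCP connection to the dependency before the build starts and turns the result into a decision. The function names, the 2-second timeout, the placeholder host, and the choice to skip the refresh when the dependency is unreachable are all assumptions for illustration, not the approach Harness has committed to.

```python
import socket
import time


def probe(host, port, timeout_s=2.0):
    """Try a single TCP connect; return (reachable, elapsed_seconds)."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            reachable = True
    except OSError:
        # Covers refused connections, DNS failures, and connect timeouts.
        reachable = False
    return reachable, time.monotonic() - start


def plan_init(reachable):
    """Decide how build initialization should treat the package refresh.

    Assumed policy: if the dependency is unreachable, skip the refresh and
    continue rather than let a slow step run into a health-check timeout.
    """
    return "refresh_package_index" if reachable else "skip_refresh"


def pre_build_check(host="mirror.example.com", port=80):
    """Hypothetical entry point; the host is a placeholder, not a real mirror."""
    reachable, elapsed = probe(host, port)
    return plan_init(reachable), elapsed
```

Because the probe is bounded by its own short timeout, an unavailable dependency is detected in a couple of seconds rather than after the initialization step has already stalled.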
Posted Dec 03, 2025 - 10:29 PST

Resolved

This incident has been resolved.
Posted Nov 24, 2025 - 17:28 PST

Monitoring

A fix has been implemented and we are monitoring the results.
Posted Nov 24, 2025 - 17:23 PST

Identified

The issue has been identified and a fix is being implemented.
Posted Nov 24, 2025 - 17:23 PST

Investigating

We are currently investigating this issue.
Posted Nov 24, 2025 - 17:15 PST
This incident affected: Prod 3 (Continuous Integration Enterprise(CIE) - Windows Cloud Builds, Continuous Integration Enterprise(CIE) - Linux Cloud Builds), Prod 1 (Continuous Integration Enterprise(CIE) - Windows Cloud Builds, Continuous Integration Enterprise(CIE) - Linux Cloud Builds), and Prod 2 (Continuous Integration Enterprise(CIE) - Windows Cloud Builds, Continuous Integration Enterprise(CIE) - Linux Cloud Builds).