CI-hosted MacOS pipelines were failing during the initialisation step, impacting specific customers using our MacOS-hosted service.
We tightened a firewall rule for our Mac VM registry that was previously too permissive. As a result, another component couldn’t access the registry, leading to pipeline failures.
Time | Event |
---|---|
Sept 1st, 17:00 UTC | Restricted the firewall rule. |
Sept 04, 06:03 UTC | Issue reported by the customer. |
Sept 04, 08:39 UTC | We re-created the firewall rule and validated that the issue was fixed. |
Our MacOS production setup includes several components. When we restricted the permissive firewall rule, the new rule did not account for the NAT IP address of one of these components. After the change, we ran a full sanity pipeline on the Mac machines, which passed successfully. The issue didn’t surface immediately as the affected component maintains a persistent socket connection, unaffected by the firewall until the connection is re-established or restarted. This explains why the failure didn’t occur immediately after we removed the permissive rule on September 1st. We restored the rule, and the issue was resolved.