Failed to retrieve license information seen for some customers.
Incident Report for Harness
Postmortem

Overview

A few customers in the Prod2 production environment reported a "401 - Failed to fetch" error when attempting to access the Harness User Interface (UI). These customers could still log in and access the Harness platform from an incognito window.

Timeline (PDT)

Time Event
7:02 AM Incident reported by customers
7:10 AM The team rolled back the recent deployment in the Prod2 environment, which resolved the incident.
7:11 AM Monitoring
7:41 AM The issue has been confirmed as resolved

Resolution

We rolled back the deployment in the Prod2 environment from 810xx to 809xx.

Affected Users

Users in Prod2 whose tokens had expired over the weekend.

RCA

Users encountered the "Failed to fetch: 401" error because their session tokens had expired, which caused the Gateway to return a 401 Unauthorized response. Normally, a 401 response redirects the user to the login page. With the recent deployment, the UI did not handle the 401 response, so users remained on the same page. We mitigated the incident by rolling back to the previously deployed version in the Prod2 environment.
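The expected behavior described above can be illustrated with a minimal sketch. This is not the Harness UI code: the names handleUnauthorized, fetchJsonWithAuthRedirect, RedirectDeps, and FetchLike are hypothetical, and the fetch function and redirect callback are injected as assumptions so the logic can be exercised without a browser or network.

```typescript
// Hypothetical sketch of 401 handling in a UI client.
// All names here are illustrative assumptions, not Harness's actual code.

interface RedirectDeps {
  // Returns the page the user is currently on, so login can return them there.
  currentUrl: () => string;
  // Sends the user to the login page (in a browser, this might change the location).
  redirectToLogin: (returnUrl: string) => void;
}

// Returns true and triggers a redirect only when the status is 401 Unauthorized.
function handleUnauthorized(status: number, deps: RedirectDeps): boolean {
  if (status !== 401) {
    return false;
  }
  deps.redirectToLogin(deps.currentUrl());
  return true;
}

// Minimal response shape, so the sketch does not depend on browser typings.
interface MinimalResponse {
  status: number;
  json(): Promise<unknown>;
}

type FetchLike = (url: string) => Promise<MinimalResponse>;

// Wraps a request so that an expired session leads to the login page
// instead of leaving the user on a page showing a fetch error.
async function fetchJsonWithAuthRedirect(
  url: string,
  fetchFn: FetchLike,
  deps: RedirectDeps
): Promise<unknown> {
  const response = await fetchFn(url);
  if (handleUnauthorized(response.status, deps)) {
    throw new Error("Session expired; redirected to login");
  }
  return response.json();
}
```

Keeping the 401 check in one small function that every request passes through makes it straightforward to cover with an automated test, which is the kind of check the second action item below describes.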

Action Items

  • We will ensure that any reverts are isolated and not combined with additional changes in the same Pull Request (PR) to prevent similar issues.
  • We will enhance our UI Automation by incorporating a critical test case to confirm that all 401 errors consistently redirect users to the login page.
Posted Oct 16, 2023 - 19:10 PDT

Resolved
The issue has been confirmed as resolved. We are working on the root cause analysis and will post it as soon as it is available.
Posted Oct 16, 2023 - 07:41 PDT
Update
We are continuing to monitor for any further issues.
Posted Oct 16, 2023 - 07:11 PDT
Monitoring
The rollback of the deployment is complete. We are now actively working to determine the root cause of the issue and are monitoring closely for any additional issues.
Posted Oct 16, 2023 - 07:10 PDT
Investigating
Some of our users are unable to log in to Harness and are seeing the error "401 Failed to retrieve license information". We are performing a rollback now to mitigate the issue and will post an update as soon as it is complete.
Posted Oct 16, 2023 - 07:02 PDT
This incident affected: Prod 2 (Continuous Delivery (CD) - FirstGen - EOS, Continuous Delivery - Next Generation (CDNG), Cloud Cost Management (CCM), Continuous Error Tracking (CET), Chaos Engineering, Continuous Integration Enterprise(CIE) - Cloud Builds, Continuous Integration Enterprise(CIE) - Self Hosted Runners, Custom Dashboards, Feature Flags (FF), Security Testing Orchestration (STO), Service Reliability Management (SRM)).