Degradation in SEI 2.0

Incident Report for Harness

Postmortem

Summary 

On September 8, 2025, the SEI 2.0 module experienced a service disruption caused by connectivity issues between services. This impacted SEI 2.0 Beta and SEI 2.0 onboarding customers.

Root Cause

The primary root cause was an infrastructure configuration update. The change inadvertently disrupted connectivity between the services and the database, causing the SEI 2.0 insights outage.

Mitigation

We rolled back to the previous stable configuration, which immediately resolved the connectivity issue and restored the SEI 2.0 services.

Next Steps

To prevent a recurrence of this incident, we will take the following actions:

  • Further strengthen monitoring and alerting to detect early signs of similar issues.
  • Update the change management playbook to include mandatory sanity tests following any infrastructure changes.
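
As an illustration only, a post-change sanity test of the kind described above could be as simple as checking that each service dependency still accepts TCP connections. The sketch below is not Harness's actual tooling; the function names `can_connect` and `run_sanity_checks` and the shape of the `targets` mapping are assumptions made for this example.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


def run_sanity_checks(targets: dict[str, tuple[str, int]]) -> list[str]:
    """Return the names of targets that could not be reached.

    `targets` maps a hypothetical label (for example "database") to a
    (host, port) pair; an empty result means every target was reachable.
    """
    return [
        name
        for name, (host, port) in targets.items()
        if not can_connect(host, port)
    ]
```

A non-empty result after an infrastructure change would be a signal to roll back before customers are affected. A TCP check only confirms reachability; a fuller test would also run a trivial query against the database.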
Posted Sep 16, 2025 - 09:34 PDT

Resolved

This incident has been resolved.
Posted Sep 08, 2025 - 20:29 PDT

Monitoring

A fix has been implemented and we are monitoring the results.
Posted Sep 08, 2025 - 07:56 PDT

Update

We are continuing to investigate this issue.
Posted Sep 08, 2025 - 07:55 PDT

Update

We are continuing to investigate this issue.
Posted Sep 08, 2025 - 07:10 PDT

Investigating

We are investigating the issue.
Posted Sep 08, 2025 - 06:59 PDT
This incident affected: Prod 1 (Software Engineering Insights (SEI)), Prod 2 (Software Engineering Insights (SEI)), and Prod 3 (Software Engineering Insights (SEI)).