CCM Connector Validation failing for some accounts
Incident Report for Harness
Postmortem

Overview

On November 13, 2023, a failure in CCM Cloud Connectors impacted customers with currency preferences enabled. The external API used to fetch currency rates failed, which caused data ingestion to fail for those customers. The incident affected data ingestion only; all other CCM features were unaffected. It was resolved in approximately 2 hours and 15 minutes, with no reported downtime.

Timeline

Time                        Event
2023-11-13, 02:15 PM UTC    Issue first reported in a Slack channel for a customer account
2023-11-13, 02:30 PM UTC    Incident acknowledged and internal investigation initiated
2023-11-13, 03:17 PM UTC    Root cause identified
2023-11-13, 03:29 PM UTC    Temporary fix raised, followed by deployment of latest code
2023-11-13, 06:55 PM UTC    Data replayed for all affected customers

Root Cause Analysis (RCA)

The incident originated from the failure of an external API that fetches currency rates, which broke data ingestion for CCM Cloud Connectors with currency preferences enabled. An external API is used because currency conversion rates change continuously. The incident was exacerbated by the failure of the fallback mechanism: backup currency rates had not been populated for the current month.
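
For illustration only, the fallback gap described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the CCM implementation: the names resolve_rates, fetch_live, backups, and max_months_back are hypothetical, and the choice to fall back to an earlier month's backup rates when the current month is missing is an assumption about one possible design.

```python
from datetime import date
from typing import Callable, Dict, Tuple

# Hypothetical shape: currency code -> conversion rate against a base currency.
Rates = Dict[str, float]


def month_key(d: date) -> str:
    """Key used to index backup rates, e.g. '2023-11'."""
    return d.strftime("%Y-%m")


def previous_month(d: date) -> date:
    """First day of the month before the month containing d."""
    if d.month == 1:
        return date(d.year - 1, 12, 1)
    return date(d.year, d.month - 1, 1)


def resolve_rates(
    fetch_live: Callable[[], Rates],   # hypothetical wrapper around the external API
    backups: Dict[str, Rates],         # hypothetical store of backup rates per month
    today: date,
    max_months_back: int = 3,          # assumed limit on how stale a backup may be
) -> Tuple[Rates, str]:
    """Return (rates, source) using live rates first, then the newest backup."""
    try:
        rates = fetch_live()
        if rates:
            return rates, "live"
    except Exception:
        # The external API failed; fall through to the backup rates.
        pass

    # Walk back from the current month so that a missing current-month
    # backup does not stop ingestion outright.
    cursor = date(today.year, today.month, 1)
    for _ in range(max_months_back + 1):
        key = month_key(cursor)
        if backups.get(key):
            return backups[key], "backup:" + key
        cursor = previous_month(cursor)

    raise LookupError("no live or backup currency rates available")


if __name__ == "__main__":
    def failing_api() -> Rates:
        raise ConnectionError("external currency API unavailable")

    # Backup exists for October only, mirroring a missing current-month backup.
    stored = {"2023-10": {"EUR": 0.94, "INR": 83.2}}
    rates, source = resolve_rates(failing_api, stored, date(2023, 11, 13))
    print(source, rates)
```

Returning the source alongside the rates lets the caller record when stale backup rates were used, so that affected data can be identified and replayed once live rates are available again.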

Follow-up Actions

  1. Add better fallback mechanisms for currency rates.
  2. Add monitoring for the external public API.
Posted Nov 15, 2023 - 12:57 PST

Resolved
For the customer that raised the issue, data has been replayed and everything is working fine for all three cloud providers.
Posted Nov 13, 2023 - 08:19 PST
Monitoring
A fix has been implemented and we are monitoring the results.
Posted Nov 13, 2023 - 07:46 PST
Identified
Customers in Prod-1 and Prod-2 are affected.
We have identified the issue and implemented a fix. We will replay the cloud functions from Nov 9 onward for these accounts.
Posted Nov 13, 2023 - 07:34 PST
Update
The issue affects all customers with currency preferences enabled.
Posted Nov 13, 2023 - 07:03 PST
Investigating
We are currently investigating this issue.
Posted Nov 13, 2023 - 06:09 PST
This incident affected: Prod 2 (Cloud Cost Management (CCM)).