Pipelines failed to resolve secrets when more than one secret stored in a custom secret manager was used. The issue was isolated to secrets associated with custom secret managers.
Root Cause Analysis:
The pipelines failed because the system could not resolve secrets correctly. A code change intended to improve the performance of secret decryption was deployed behind a feature flag, and it caused resolution failures for secrets stored in custom secret managers. Disabling the feature flag restored normal pipeline operations.
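The rollback described above relies on the new decryption path being gated by a feature flag. The following is a minimal sketch of that pattern, not the actual implementation: `resolve_secrets`, `legacy_resolve`, `optimized_resolve`, and `faulty_batch_resolve` are hypothetical names, and the simulated fault is only a stand-in for the reported symptom (failures when more than one secret is resolved), not the real defect.

```python
from typing import Callable, Dict, List


def resolve_secrets(
    refs: List[str],
    legacy_resolve: Callable[[str], str],
    optimized_resolve: Callable[[List[str]], Dict[str, str]],
    flag_enabled: bool,
) -> Dict[str, str]:
    """Resolve secret references, using the new path only when the flag is on."""
    if not flag_enabled:
        # Flag off: the known-good path, one secret at a time.
        return {ref: legacy_resolve(ref) for ref in refs}
    # Flag on: the new (hypothetical) batched path.
    return optimized_resolve(refs)


def legacy_resolve(ref: str) -> str:
    # Hypothetical stand-in for a real secret manager lookup.
    return f"value-of-{ref}"


def faulty_batch_resolve(refs: List[str]) -> Dict[str, str]:
    # Simulated fault: fails as soon as more than one secret is requested.
    if len(refs) > 1:
        raise RuntimeError("could not resolve secrets")
    return {ref: legacy_resolve(ref) for ref in refs}
```

With the flag enabled, `resolve_secrets(["a", "b"], legacy_resolve, faulty_batch_resolve, flag_enabled=True)` raises; with the flag disabled, the same call returns both values through the legacy path. That is why turning the flag off is enough to restore service without a redeploy.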
Action Items:
Add New Test Cases: Add new test cases to the automation suite to cover different configuration combinations for custom secret managers.
Add Metrics and Alerts: Implement appropriate metrics and alerts to detect secret/expression resolution failures proactively and mitigate them.
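The two action items above could be approached along the following lines. This is an illustrative sketch under stated assumptions, not the planned implementation: the configuration dimensions (`MANAGER_TYPES`, `SECRET_COUNTS`, `FLAG_STATES`), the `ResolutionMonitor` class, and the threshold values are all hypothetical.

```python
import itertools
from collections import Counter
from typing import Dict, Iterator

# Hypothetical test dimensions; a real suite would use the product's own options.
MANAGER_TYPES = ["builtin", "custom"]
SECRET_COUNTS = [1, 2, 5]
FLAG_STATES = [False, True]


def config_matrix() -> Iterator[Dict[str, object]]:
    """Yield every combination of manager type, secret count, and flag state."""
    for manager, count, flag in itertools.product(
        MANAGER_TYPES, SECRET_COUNTS, FLAG_STATES
    ):
        yield {"manager": manager, "secret_count": count, "flag_enabled": flag}


class ResolutionMonitor:
    """Track secret resolution outcomes per secret manager type."""

    def __init__(self, threshold: float, min_samples: int = 10) -> None:
        self.threshold = threshold      # failure rate that triggers an alert
        self.min_samples = min_samples  # avoid alerting on too few attempts
        self._total: Counter = Counter()
        self._failed: Counter = Counter()

    def record(self, manager: str, ok: bool) -> None:
        self._total[manager] += 1
        if not ok:
            self._failed[manager] += 1

    def failure_rate(self, manager: str) -> float:
        total = self._total[manager]
        return self._failed[manager] / total if total else 0.0

    def should_alert(self, manager: str) -> bool:
        enough = self._total[manager] >= self.min_samples
        return enough and self.failure_rate(manager) >= self.threshold
```

Enumerating the full matrix ensures the "custom manager, multiple secrets, flag enabled" case is exercised alongside the simpler ones, and tracking failures per manager type lets an alert fire for custom secret managers even when overall resolution traffic looks healthy.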
Posted Mar 24, 2025 - 10:09 PDT
Resolved
This incident has been resolved.
Posted Mar 19, 2025 - 14:55 PDT
Monitoring
A fix has been implemented and we are monitoring the results.
Posted Mar 19, 2025 - 13:20 PDT
Update
We are continuing to work on a fix for this issue.
Posted Mar 19, 2025 - 13:03 PDT
Identified
The issue has been identified and a fix is being implemented.
Posted Mar 19, 2025 - 13:01 PDT
Investigating
We are currently investigating this issue.
Posted Mar 19, 2025 - 12:43 PDT
This incident affected: Prod 2 (Continuous Delivery - Next Generation (CDNG)) and Prod 1 (Continuous Delivery - Next Generation (CDNG)).