


@nikhilsuri-db nikhilsuri-db commented Sep 26, 2025

What type of PR is this?

  • Feature

Description

This PR adds a circuit breaker to the telemetry system to prevent cascading failures and improve resilience. The implementation uses the pybreaker library to monitor telemetry request failures and opens the circuit when failures exceed configurable thresholds. While the circuit is open, telemetry requests are temporarily blocked to protect downstream services. When the system recovers, the circuit transitions to half-open for testing and closes once normal operation resumes. The circuit breaker configuration is centralized with immutable settings, includes logging for monitoring, and maintains full backward compatibility with existing telemetry functionality.
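
The closed, open, and half-open transitions described above can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the code in this PR and not pybreaker's API: `BreakerConfig`, `CircuitBreaker`, `CircuitOpenError`, `fail_max`, and `reset_timeout` are hypothetical names chosen for the example, and the breaker counts consecutive failures rather than a failure rate.

```python
import time
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    CLOSED = "closed"
    OPEN = "open"
    HALF_OPEN = "half-open"


@dataclass(frozen=True)
class BreakerConfig:
    # Hypothetical settings; frozen=True makes instances immutable.
    fail_max: int = 3            # consecutive failures before opening
    reset_timeout: float = 30.0  # seconds to wait before a trial call


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, config, clock=time.monotonic):
        self.config = config
        self._clock = clock  # injectable so tests can control time
        self._state = State.CLOSED
        self._failures = 0
        self._opened_at = 0.0

    @property
    def state(self):
        # An open circuit becomes half-open once reset_timeout has elapsed.
        if (self._state is State.OPEN
                and self._clock() - self._opened_at >= self.config.reset_timeout):
            self._state = State.HALF_OPEN
        return self._state

    def call(self, func, *args, **kwargs):
        # Reject immediately while open; rejections are not counted as failures.
        if self.state is State.OPEN:
            raise CircuitOpenError("telemetry circuit is open")
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._on_failure()
            raise
        self._on_success()
        return result

    def _on_failure(self):
        self._failures += 1
        # A failed trial call in half-open reopens the circuit right away.
        if (self._state is State.HALF_OPEN
                or self._failures >= self.config.fail_max):
            self._state = State.OPEN
            self._opened_at = self._clock()

    def _on_success(self):
        # Any success resets the failure count and closes the circuit.
        self._failures = 0
        self._state = State.CLOSED
```

A caller would wrap each telemetry send in `breaker.call(send_fn, payload)` and treat `CircuitOpenError` as "skip this event", so a failing telemetry endpoint never slows down the main request path. The frozen dataclass is one simple way to get the immutable, centralized settings the description mentions.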

How is this tested?

  • Unit tests
  • E2E Tests
  • Manually
  • N/A

Related Tickets & Documents

https://docs.google.com/document/d/1ftRvby9bwDZzE3s1tOb4hJ4Pd9USiXskb9cDw-uQNPM/edit?usp=sharing

@nikhilsuri-db nikhilsuri-db self-assigned this Sep 26, 2025
@nikhilsuri-db nikhilsuri-db marked this pull request as ready for review September 30, 2025 08:07
@nikhilsuri-db nikhilsuri-db changed the title from "circuit breaker changes using pybreaker" to "[DRAFT] circuit breaker changes using pybreaker" Sep 30, 2025