Proposal
The current implementation is susceptible to permanent failure from transient errors. If a session expires or an endpoint enters a failing state, the listener can exit or hang without attempting a recovery. I propose adding a resilience layer to handle session recreation and health monitoring automatically.
Proposed Solution
We should introduce a ResilienceConfig to govern how the client handles degraded states and expose health/recovery methods for the listener loop to utilize.
type ResilienceConfig struct {
    AutoRecoverSession     bool
    CircuitBreakerSettings CircuitBreakerConfig
    HealthCheckInterval    time.Duration
}

// IsHealthy performs a lightweight check to ensure the client can still communicate with the API.
func (c *Client) IsHealthy(ctx context.Context) error

// Recover attempts to re-establish the session and refresh tokens without a full process restart.
func (c *Client) Recover(ctx context.Context) error

Technical Improvements
- Automatic Session Recovery: When AutoRecoverSession is enabled, the client attempts to negotiate a new session ID if the server invalidates the current one, preventing unnecessary listener crashes.
- Circuit Breaking: A circuit breaker around the scaleset API prevents the client from hammering GitHub's infrastructure during an outage, allowing it to back off and "probe" for health gracefully.
- Health Probing: The IsHealthy method allows the listener (or an external orchestrator like Kubernetes) to verify the connection's integrity, enabling proactive restarts if the client enters an unrecoverable state.
Benefits
- Self-Healing: Reduces manual intervention by recovering from expired sessions or transient network partitions.
- Stability: Prevents cascading failures during upstream API degradations.
- Operational Visibility: Provides a clear hook for liveness and readiness probes.