Cloud Controller Manager restarts cause repeated LoadBalancer service disruptions with externalTrafficPolicy: Local #10224

@cxgk46-jagadeswar

Description

Environment:
Kubernetes Version: 1.33.x (self-managed cluster)
Cloud Provider: Azure

During a routine maintenance operation, the Cloud Controller Manager (CCM) pods restarted multiple times over a period of more than 3 hours. Each CCM pod restart triggered the following sequence:

  1. Leader election lease expires (~15 seconds)
  2. New CCM pod acquires leadership
  3. New leader's Service controller performs a full sync of all LoadBalancer services
  4. Azure Load Balancer reconfigures backend pools and health probes
  5. Active connections through the LoadBalancer are terminated

With externalTrafficPolicy: Local, the impact is amplified because:

  1. Traffic is routed only to nodes that run ready pods for the Service
  2. Azure LB health probes (against the Service's healthCheckNodePort) are reconfigured during each reconciliation
  3. During the transition there are brief windows in which backends are marked unhealthy
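
To make the health-check dependency concrete, here is a minimal sketch of how a node's eligibility is decided under externalTrafficPolicy: Local. It assumes the usual kube-proxy behavior, where the healthCheckNodePort endpoint answers 200 when the node has at least one local ready endpoint for the Service and 503 otherwise. The `Node` type and function names are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    name: str
    local_ready_endpoints: int  # ready pods of the Service on this node


def health_check_status(node: Node) -> int:
    """HTTP status a load balancer probe would see on healthCheckNodePort
    (assumed behavior: 200 with local endpoints, 503 without)."""
    return 200 if node.local_ready_endpoints > 0 else 503


def eligible_backends(nodes: List[Node]) -> List[str]:
    """Names of nodes the load balancer would keep in rotation."""
    return [n.name for n in nodes if health_check_status(n) == 200]


if __name__ == "__main__":
    nodes = [Node("node-a", 2), Node("node-b", 0), Node("node-c", 1)]
    print(eligible_backends(nodes))  # ['node-a', 'node-c']
```

Under this model, any window in which the probe is reconfigured or reset shrinks the eligible set even though the pods themselves never changed, which matches what we observed.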

Impact

  • 6 CCM pod transitions occurred during the incident

  • Each transition triggered 3-5 service reconciliation events

  • Long-lived connections (websockets) were repeatedly dropped

  • Total disruption window extended to ~3.5 hours due to cascading reconciliations
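
For scale, the figures above can be combined into a rough estimate. This is back-of-the-envelope arithmetic only; the even spacing of transitions is an assumption, not something we measured.

```python
# Figures taken from the incident summary above.
transitions = 6
reconciles_per_transition = (3, 5)   # observed range per transition
disruption_window_minutes = 3.5 * 60

min_events = transitions * reconciles_per_transition[0]
max_events = transitions * reconciles_per_transition[1]

# Assumption: transitions were spread evenly across the window.
avg_minutes_between_transitions = disruption_window_minutes / transitions

print(min_events, max_events)            # 18 30
print(avg_minutes_between_transitions)   # 35.0
```

That is between 18 and 30 service reconciliation events in total, each a potential connection drop for long-lived clients.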

Expected Behavior

CCM restarts should minimize disruption to existing LoadBalancer services, particularly when:

  • The underlying service configuration has not changed

  • Backend pods remain healthy and unchanged

  • Only the CCM pod itself is restarting

Questions for Maintainers

  1. Is there a mechanism to perform incremental reconciliation rather than full sync on leader election?
  2. Can the Service controller detect that no actual changes occurred and skip Azure API calls?
  3. Are there recommended configurations to reduce disruption during CCM pod transitions?
  4. Would implementing connection draining or gradual backend pool updates help mitigate this?
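
To illustrate what question 2 has in mind, here is a minimal sketch of a no-op check based on a fingerprint of the desired load balancer configuration. It is not how cloud-provider-azure currently works. `Reconciler`, `fingerprint`, and `apply_fn` are hypothetical names, and the sketch assumes the last-applied fingerprints can be persisted somewhere that survives a CCM restart (for example, on the Service object), since in-memory state is lost when the leader changes.

```python
import hashlib
import json
from typing import Callable, Dict, Optional


def fingerprint(desired: dict) -> str:
    """Stable hash of a desired configuration (key order does not matter)."""
    canonical = json.dumps(desired, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class Reconciler:
    """Skips the cloud API call when the desired config is unchanged."""

    def __init__(self, apply_fn: Callable[[str, dict], None],
                 last_applied: Optional[Dict[str, str]] = None):
        self._apply = apply_fn                       # stands in for the cloud API call
        self._last_applied = dict(last_applied or {})

    def state(self) -> Dict[str, str]:
        """Fingerprints to persist so that a new leader can reuse them."""
        return dict(self._last_applied)

    def reconcile(self, service_key: str, desired: dict) -> bool:
        """Return True if the cloud API was called, False if skipped."""
        fp = fingerprint(desired)
        if self._last_applied.get(service_key) == fp:
            return False
        self._apply(service_key, desired)
        self._last_applied[service_key] = fp
        return True
```

A new leader constructed with the previous leader's `state()` would skip every Service whose desired configuration is unchanged. A real implementation would also have to detect drift on the Azure side, which this sketch ignores.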

This behavior appears to be by design, given that the Service controller performs a full reconciliation on startup, but the impact on production workloads with long-lived connections is significant. We are looking for guidance on best practices or potential enhancements that would reduce this impact.
