Honor AzureCluster ControlPlaneEndpoint.Port in API server LB#6278

Open
mboersma wants to merge 1 commit into
kubernetes-sigs:mainfrom
mboersma:fix-controlplane-port

@mboersma mboersma commented May 5, 2026

What type of PR is this?

/kind bug

What this PR does / why we need it:

ClusterScope.APIServerPort() only consulted Cluster.Spec.ClusterNetwork.APIServerPort, falling back to 6443. As a result, when a user set AzureCluster.Spec.ControlPlaneEndpoint.Port (e.g. to 443) the API server load balancer rule, health probe, and NSG rule were still constructed for port 6443, even though the AzureCluster spec doc string explicitly tells users they may set this field and CAPZ will honor it.

This change updates APIServerPort() to consult AzureCluster.Spec.ControlPlaneEndpoint.Port first, so a user-supplied port flows through to:

  • the API server LB rule frontend/backend port (getLoadBalancingRules)
  • the HTTPS health probe port (getProbes)
  • the NSG rule for inbound API server traffic
  • the back-fill of ControlPlaneEndpoint.Port in the AzureCluster reconciler (already idempotent — only writes when the field is 0)
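
The precedence described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `portInputs` and `apiServerPort` are hypothetical names, and modeling `Cluster.Spec.ClusterNetwork.APIServerPort` as a nil-able pointer is an assumption of the sketch.

```go
package main

import "fmt"

// defaultAPIServerPort is the fallback named in the PR description.
const defaultAPIServerPort int32 = 6443

// portInputs is a hypothetical, flattened stand-in for the two spec fields
// that the PR description says are consulted.
type portInputs struct {
	// Mirrors AzureCluster.Spec.ControlPlaneEndpoint.Port; 0 means unset.
	controlPlaneEndpointPort int32
	// Mirrors Cluster.Spec.ClusterNetwork.APIServerPort; nil means unset
	// (the pointer type is an assumption of this sketch).
	clusterNetworkAPIServerPort *int32
}

// apiServerPort applies the precedence described in the PR:
// AzureCluster endpoint port, then Cluster network port, then 6443.
func apiServerPort(in portInputs) int32 {
	if in.controlPlaneEndpointPort != 0 {
		return in.controlPlaneEndpointPort
	}
	if in.clusterNetworkAPIServerPort != nil {
		return *in.clusterNetworkAPIServerPort
	}
	return defaultAPIServerPort
}

func main() {
	clusterPort := int32(8443)

	// Both set: the AzureCluster endpoint port wins.
	fmt.Println(apiServerPort(portInputs{
		controlPlaneEndpointPort:    443,
		clusterNetworkAPIServerPort: &clusterPort,
	}))
	// Only the Cluster network port is set.
	fmt.Println(apiServerPort(portInputs{clusterNetworkAPIServerPort: &clusterPort}))
	// Neither set: fall back to the default.
	fmt.Println(apiServerPort(portInputs{}))
}
```

Treating 0 as "unset" for the endpoint port matches the back-fill behavior noted in the last bullet, which only writes the field when it is 0.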

Which issue(s) this PR fixes:
Fixes #5781

Special notes for your reviewer:

No e2e change is included. None of the existing flavors uses a non-6443 port. Modifying prow-apiserver-ilb (the closest candidate) would also require coordinating the API server bind port in the KubeadmControlPlane and updating the test's probe assertions, which is non-trivial. Coverage is provided at the unit level instead:

  • azure/scope/cluster_test.go::TestAPIServerPort gains cases for the new precedence.
  • azure/services/loadbalancers/spec_test.go::TestAPIServerLBPortPropagation verifies that a non-default LBSpec.APIServerPort propagates to LB rule ports and the probe port.
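
For readers unfamiliar with the shape of such a propagation check, here is a rough, self-contained sketch. The types `lbRule` and `probe` and the helpers `buildRule`, `buildProbe`, and `checkPropagation` are hypothetical stand-ins, not the CAPZ or Azure SDK types that the real test exercises.

```go
package main

import "fmt"

// lbRule and probe are hypothetical, simplified stand-ins for the load
// balancer rule and health probe built from the API server port.
type lbRule struct {
	FrontendPort int32
	BackendPort  int32
}

type probe struct {
	Protocol string
	Port     int32
}

// buildRule uses the same port for the frontend and the backend.
func buildRule(apiServerPort int32) lbRule {
	return lbRule{FrontendPort: apiServerPort, BackendPort: apiServerPort}
}

// buildProbe creates an HTTPS probe against the API server port.
func buildProbe(apiServerPort int32) probe {
	return probe{Protocol: "Https", Port: apiServerPort}
}

// checkPropagation verifies that the given port reaches every derived field.
func checkPropagation(port int32) error {
	r := buildRule(port)
	if r.FrontendPort != port || r.BackendPort != port {
		return fmt.Errorf("rule ports %d/%d, want %d", r.FrontendPort, r.BackendPort, port)
	}
	if p := buildProbe(port); p.Port != port {
		return fmt.Errorf("probe port %d, want %d", p.Port, port)
	}
	return nil
}

func main() {
	// Include the default and two non-default ports.
	for _, port := range []int32{6443, 443, 8443} {
		if err := checkPropagation(port); err != nil {
			fmt.Println("FAIL:", err)
			continue
		}
		fmt.Println("ok:", port)
	}
}
```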

Adding a dedicated e2e flavor for a non-default control plane port would be a reasonable follow-up.

TODOs:

  • squashed commits
  • includes documentation
  • adds unit tests
  • cherry-pick candidate

Release note:

Honor `AzureCluster.spec.controlPlaneEndpoint.port` when configuring the API server load balancer rule, health probe, and NSG rule. Previously these were always created for port 6443 (or `Cluster.spec.clusterNetwork.apiServerPort` if set), ignoring the AzureCluster control plane endpoint port.

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/bug Categorizes issue or PR as related to a bug. labels May 5, 2026
@k8s-ci-robot k8s-ci-robot requested a review from bryan-cox May 5, 2026 20:55
@k8s-ci-robot

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign justinsb for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot requested a review from Jont828 May 5, 2026 20:55
@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels May 5, 2026

codecov Bot commented May 5, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 43.88%. Comparing base (bd1e799) to head (d49ccac).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #6278   +/-   ##
=======================================
  Coverage   43.88%   43.88%           
=======================================
  Files         289      289           
  Lines       25351    25353    +2     
=======================================
+ Hits        11125    11127    +2     
  Misses      13448    13448           
  Partials      778      778           

@mboersma mboersma requested review from jackfrancis and willie-yao and removed request for Jont828 May 5, 2026 22:05

@willie-yao willie-yao left a comment

Looks good overall, thanks for the fix! Just had a few comments

Comment thread azure/scope/cluster.go
Comment on lines 1055 to 1070
port := strconv.Itoa(int(s.APIServerPort()))

missingAPIPort := subnet.GetSecurityRuleByDestination(port) == nil
if missingAPIPort {
	subnet.SecurityGroup.SecurityRules = append(subnet.SecurityGroup.SecurityRules, infrav1.SecurityRule{
		Name:             "allow_apiserver",
		Description:      "Allow K8s API Server",
		Priority:         2201,
		Protocol:         infrav1.SecurityGroupProtocolTCP,
		Direction:        infrav1.SecurityRuleDirectionInbound,
		Source:           ptr.To("*"),
		SourcePorts:      ptr.To("*"),
		Destination:      ptr.To("*"),
		DestinationPorts: ptr.To(port),
		Action:           infrav1.SecurityRuleActionAllow,
	})

One odd thing Copilot flagged: this adds a security rule allowing the API server based on s.APIServerPort(), and before this fix that method would erroneously return 6443 even when a custom port was set. If someone upgrades a cluster with a custom port to a version that includes this fix, APIServerPort() will now return 443 (or whatever custom port was set), the lookup by destination port will not find the existing 6443 rule, and a second allow_apiserver rule will be appended. That would be a problem because Azure rejects NSGs with duplicate rule names/priorities. Do you think this is something we need to address in this PR or in a follow-up?
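
To make the upgrade concern concrete, here is a small sketch of the scenario. It assumes the rule created for port 6443 is still present in the subnet's rule list when the fixed version reconciles; `securityRule`, `findByDestinationPort`, `ensureAPIServerRule`, and `duplicateNames` are hypothetical simplifications, not the infrav1 types or the helper shown in the diff.

```go
package main

import (
	"fmt"
	"strconv"
)

// securityRule is a hypothetical, trimmed-down stand-in for an NSG rule.
type securityRule struct {
	Name             string
	Priority         int32
	DestinationPorts string
}

// findByDestinationPort returns the first rule whose destination port matches.
func findByDestinationPort(rules []securityRule, port string) *securityRule {
	for i := range rules {
		if rules[i].DestinationPorts == port {
			return &rules[i]
		}
	}
	return nil
}

// ensureAPIServerRule appends the allow_apiserver rule when no rule exists
// for the given port, following the pattern in the diff under review.
func ensureAPIServerRule(rules []securityRule, port int32) []securityRule {
	p := strconv.Itoa(int(port))
	if findByDestinationPort(rules, p) == nil {
		rules = append(rules, securityRule{
			Name:             "allow_apiserver",
			Priority:         2201,
			DestinationPorts: p,
		})
	}
	return rules
}

// duplicateNames reports each rule name that occurs more than once.
func duplicateNames(rules []securityRule) []string {
	seen := map[string]int{}
	var dups []string
	for _, r := range rules {
		seen[r.Name]++
		if seen[r.Name] == 2 {
			dups = append(dups, r.Name)
		}
	}
	return dups
}

func main() {
	// Before the fix: the port resolves to 6443 despite a custom endpoint port.
	rules := ensureAPIServerRule(nil, 6443)
	// After the fix: the port resolves to 443, so the lookup misses the old rule.
	rules = ensureAPIServerRule(rules, 443)

	fmt.Println(len(rules), duplicateNames(rules))
}
```

In this model the second call appends a rule with the same name and priority as the first, which is the condition the comment says Azure rejects. Whether the old rule is actually still in the spec at upgrade time is the assumption the sketch depends on.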

expectAPIServerPort int32
name string
clusterName string
clusterNetowrk clusterv1.ClusterNetwork

Suggested change
- clusterNetowrk clusterv1.ClusterNetwork
+ clusterNetwork clusterv1.ClusterNetwork

Labels

cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.

Projects

Status: Todo

Development

Successfully merging this pull request may close these issues.

Incorrect loadbalancer rule and healthprobe when exposing controlplane api on different port than 6443

3 participants