fix: configure service-account-issuer in prow CI cluster templates #6288

Merged
k8s-ci-robot merged 2 commits into kubernetes-sigs:main from andyzhangx:fix-oidc-issuer-prow-templates on May 8, 2026

andyzhangx (Member) commented May 7, 2026

What this PR does

Adds service-account-issuer: ${SERVICE_ACCOUNT_ISSUER:-https://kubernetes.default.svc.cluster.local} to the apiServer.extraArgs in cluster templates that configure kubeadm-based control planes.
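
As a rough illustration of the change described above, the affected section of a template might look like the fragment below. This is a sketch, not copied from any specific template: the parent keys are omitted, and the comment about the earlier empty map follows the PR description rather than a diff.

```yaml
# Illustrative fragment of a kubeadm-based control plane spec (parent keys omitted).
apiServer:
  extraArgs:
    # Per the PR description, this map was previously empty: extraArgs: {}
    # The ${VAR:-default} form falls back to the default when the variable is unset.
    service-account-issuer: ${SERVICE_ACCOUNT_ISSUER:-https://kubernetes.default.svc.cluster.local}
```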

Why

Cluster templates create workload clusters with apiServer.extraArgs: {}, causing kube-apiserver to use the default --service-account-issuer=https://kubernetes.default.svc.cluster.local. This is an internal URL that AAD cannot reach from the public internet.

When CSI drivers (blob-csi-driver, azurefile-csi-driver) run workload identity e2e tests on these clusters, the projected service account tokens are signed with the internal issuer. AAD rejects them because it cannot fetch the OIDC discovery document, causing mount failures (mount error(126): Required key not available).

Changes

  • Modified 64 cluster templates (31 prow CI templates under templates/test/ci/, 19 user-facing templates under templates/, and 14 dev templates under templates/test/dev/)
  • Updated docs/book/src/topics/workload-identity.md to expand the description of SERVICE_ACCOUNT_ISSUER
  • The default fallback (https://kubernetes.default.svc.cluster.local) preserves backward compatibility: clusters generated without SERVICE_ACCOUNT_ISSUER set behave exactly as before
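
One way to supply the variable, sketched here as an assumption rather than something this PR adds, is through the variables section of the clusterctl configuration file; exporting it as an environment variable before running `clusterctl generate cluster` works as well. The URL below is a placeholder for an Azure storage container URL, and its exact shape is hypothetical.

```yaml
# clusterctl configuration file, e.g. $XDG_CONFIG_HOME/cluster-api/clusterctl.yaml
# Placeholder value: replace with the URL of the storage container that hosts
# the OIDC discovery document for the workload cluster.
SERVICE_ACCOUNT_ISSUER: "https://<storage-account>.blob.core.windows.net/<container>/"
```

Leaving the variable unset keeps the fallback issuer, so generated clusters behave as they did before.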

Addresses

Feedback from #6268

/kind bug

Release note:

configure service-account-issuer on workload cluster apiserver in cluster templates to enable workload identity

k8s-ci-robot added the kind/bug and do-not-merge/release-note-label-needed labels May 7, 2026
k8s-ci-robot requested review from Jont828 and bryan-cox May 7, 2026 00:50
k8s-ci-robot added the size/M label May 7, 2026
codecov Bot commented May 7, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 43.84%. Comparing base (0993e6b) to head (4937536).
⚠️ Report is 5 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #6288   +/-   ##
=======================================
  Coverage   43.84%   43.84%           
=======================================
  Files         289      289           
  Lines       25346    25346           
=======================================
  Hits        11112    11112           
  Misses      13460    13460           
  Partials      774      774           

andyzhangx (Member, Author) commented

/retest

k8s-ci-robot added the release-note label and removed the do-not-merge/release-note-label-needed label May 7, 2026
andyzhangx force-pushed the fix-oidc-issuer-prow-templates branch from 3cae9d8 to 06782c2 May 7, 2026 01:21
k8s-ci-robot added the size/L and cncf-cla: yes labels and removed the size/M label May 7, 2026
andyzhangx force-pushed the fix-oidc-issuer-prow-templates branch from 06782c2 to 0367def May 7, 2026 01:34
Set service-account-issuer: ${SERVICE_ACCOUNT_ISSUER} in the base
cluster template (templates/flavors/base/cluster-template.yaml).

This ensures projected service account tokens are signed with a
discoverable issuer URL. The default value falls back to the
kube-apiserver default (https://kubernetes.default.svc.cluster.local),
so existing deployments are unaffected.

When SERVICE_ACCOUNT_ISSUER is set (e.g., to a public OIDC endpoint),
workload identity flows (CSI drivers, pod identity) will work correctly
because AAD can discover and validate the token issuer.

/kind bug
andyzhangx force-pushed the fix-oidc-issuer-prow-templates branch from 0367def to ca6f966 May 7, 2026 01:49
andyzhangx (Member, Author) commented

/assign @mboersma

mboersma (Contributor) left a comment

/test pull-cluster-api-provider-azure-e2e

(The other failures are "ghosts" from when this was opened against the release-1.23 branch, so we can ignore them.)

This looks good and AFAICT should be safely backward-compatible. But in the PR description you say:

Modified 29 prow CI templates under templates/test/ci/
No user-facing templates are changed

That's not actually true: it changes 50+ templates, many of which are user-facing. I just want to ensure that this was the intention, because if you intended to scope it just to prow CI templates, it doesn't.

Also we could perhaps expand the description of SERVICE_ACCOUNT_ISSUER in docs/book/src/topics/workload-identity.md (line 43 and/or 96). Maybe:

If unset, the cluster template falls back to kubeadm's default 
https://kubernetes.default.svc.cluster.local. Set it to your Azure storage container URL 
when running `clusterctl generate cluster ...` so projected service account tokens in the
workload cluster are signed with an AAD-discoverable issuer.

kubernetes-sigs deleted a comment from k8s-ci-robot May 7, 2026
Clarify that if SERVICE_ACCOUNT_ISSUER is unset, kubeadm defaults to
https://kubernetes.default.svc.cluster.local, and explain when users
should set it to their Azure storage container URL.
andyzhangx (Member, Author) commented

/test pull-cluster-api-provider-azure-e2e

andyzhangx (Member, Author) commented

@mboersma it's done, thanks.

mboersma (Contributor) left a comment

/label tide/merge-method-squash
/lgtm
/approve
/cherry-pick release-1.24
/cherry-pick release-1.23

k8s-ci-robot added the tide/merge-method-squash and lgtm labels May 8, 2026
k8s-ci-robot (Contributor) commented

LGTM label has been added.

Git tree hash: fbeaf3bd6ca94394eeccdd8d02412cce29a49152

k8s-ci-robot (Contributor) commented

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mboersma

k8s-ci-robot added the approved label May 8, 2026
mboersma (Contributor) commented May 8, 2026

/cherry-pick release-1.24

k8s-infra-cherrypick-robot commented

@mboersma: once the present PR merges, I will cherry-pick it on top of release-1.24 in a new PR and assign it to you.

k8s-ci-robot merged commit 0d3a836 into kubernetes-sigs:main May 8, 2026
30 checks passed
github-project-automation Bot moved this from Todo to Done in CAPZ Planning May 8, 2026
k8s-ci-robot added this to the v1.25 milestone May 8, 2026
mboersma (Contributor) commented May 8, 2026

/cherry-pick release-1.23

k8s-infra-cherrypick-robot commented

@mboersma: #6288 failed to apply on top of branch "release-1.23":

Applying: fix: configure service-account-issuer in cluster templates
Using index info to reconstruct a base tree...
M	templates/test/ci/cluster-template-prow-ci-version-dra.yaml
M	templates/test/ci/cluster-template-prow-ci-version-md-and-mp.yaml
A	templates/test/ci/cluster-template-prow-machine-pool-ci-version-multi-zone.yaml
M	templates/test/dev/cluster-template-custom-builds-dra.yaml
M	templates/test/dev/cluster-template-custom-builds-load-dra.yaml
M	templates/test/dev/cluster-template-custom-builds-load.yaml
M	templates/test/dev/cluster-template-custom-builds-machine-pool-load-dra.yaml
M	templates/test/dev/cluster-template-custom-builds.yaml
Falling back to patching base and 3-way merge...
Auto-merging templates/test/dev/cluster-template-custom-builds.yaml
CONFLICT (content): Merge conflict in templates/test/dev/cluster-template-custom-builds.yaml
Auto-merging templates/test/dev/cluster-template-custom-builds-machine-pool-load-dra.yaml
Auto-merging templates/test/dev/cluster-template-custom-builds-load.yaml
CONFLICT (content): Merge conflict in templates/test/dev/cluster-template-custom-builds-load.yaml
Auto-merging templates/test/dev/cluster-template-custom-builds-load-dra.yaml
Auto-merging templates/test/dev/cluster-template-custom-builds-dra.yaml
CONFLICT (modify/delete): templates/test/ci/cluster-template-prow-machine-pool-ci-version-multi-zone.yaml deleted in HEAD and modified in fix: configure service-account-issuer in cluster templates. Version fix: configure service-account-issuer in cluster templates of templates/test/ci/cluster-template-prow-machine-pool-ci-version-multi-zone.yaml left in tree.
Auto-merging templates/test/ci/cluster-template-prow-ci-version-md-and-mp.yaml
CONFLICT (content): Merge conflict in templates/test/ci/cluster-template-prow-ci-version-md-and-mp.yaml
Auto-merging templates/test/ci/cluster-template-prow-ci-version-dra.yaml
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
hint: When you have resolved this problem, run "git am --continue".
hint: If you prefer to skip this patch, run "git am --skip" instead.
hint: To restore the original branch and stop patching, run "git am --abort".
hint: Disable this message with "git config set advice.mergeConflict false"
Patch failed at 0001 fix: configure service-account-issuer in cluster templates

k8s-infra-cherrypick-robot commented

@mboersma: new pull request created: #6290

k8s-ci-robot pushed a commit that referenced this pull request May 11, 2026
…6288)

* fix: configure service-account-issuer in cluster templates

Set service-account-issuer: ${SERVICE_ACCOUNT_ISSUER} in the base
cluster template (templates/flavors/base/cluster-template.yaml).

This ensures projected service account tokens are signed with a
discoverable issuer URL. The default value falls back to the
kube-apiserver default (https://kubernetes.default.svc.cluster.local),
so existing deployments are unaffected.

When SERVICE_ACCOUNT_ISSUER is set (e.g., to a public OIDC endpoint),
workload identity flows (CSI drivers, pod identity) will work correctly
because AAD can discover and validate the token issuer.

/kind bug

* docs: expand SERVICE_ACCOUNT_ISSUER description in workload-identity.md

Clarify that if SERVICE_ACCOUNT_ISSUER is unset, kubeadm defaults to
https://kubernetes.default.svc.cluster.local, and explain when users
should set it to their Azure storage container URL.
Labels

approved, cncf-cla: yes, kind/bug, lgtm, release-note, size/L, tide/merge-method-squash

Projects

Status: Done

4 participants