test: fix workload identity e2e test #3154
Merged
[APPROVALNOTIFIER] This PR is APPROVED
This pull request has been approved by: andyzhangx
Premium_LRS has a minimum share size of 100GiB. The test was requesting 10Gi which Azure auto-expanded to 100Gi, causing the pvCapacity assertion to fail.
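The failure mode described above can be illustrated with a small sketch. It assumes the auto-expansion behaves like rounding up to the 100GiB minimum; `provisionedSize` is a hypothetical helper written for this illustration and is not part of the driver or the test suite.

```go
package main

import "fmt"

// gib is one gibibyte in bytes.
const gib = int64(1) << 30

// premiumMinShareBytes is the 100GiB Premium_LRS minimum quoted in the commit message.
const premiumMinShareBytes = 100 * gib

// provisionedSize is a hypothetical model of the auto-expansion:
// requests below the minimum are rounded up to it.
func provisionedSize(requested int64) int64 {
	if requested < premiumMinShareBytes {
		return premiumMinShareBytes
	}
	return requested
}

func main() {
	// A capacity assertion of the form "provisioned == requested" only holds
	// when the request is already at or above the minimum.
	for _, req := range []int64{10 * gib, 100 * gib} {
		got := provisionedSize(req)
		fmt.Printf("requested=%dGiB provisioned=%dGiB match=%v\n", req/gib, got/gib, got == req)
	}
}
```

Under this model a 10Gi request no longer matches the provisioned capacity, while a 100Gi request does, which is consistent with the ClaimSize change in this PR.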
…ad identity e2e test

Port critical workload identity infrastructure from blob-csi-driver PR kubernetes-sigs#2445:

1. Background AAD token exchange warm-up (waitForAADTokenExchange) - polls AAD for up to 45min until token exchange succeeds, running in parallel with other tests to avoid blocking the suite.
2. OIDC JWKS readiness check (waitForOIDCJWKS) - ensures the JWKS endpoint returns valid signing keys before proceeding.
3. CAPZ JWKS key mismatch detection and repair (verifyJWKSKeyMatch) - detects when blob-hosted JWKS has different signing keys than kube-apiserver and re-uploads the correct JWKS. Without this, AAD permanently rejects token exchanges with AADSTS7000272.
4. WI test now waits on wiReady channel for warm-up completion before running, preventing the 30min pod Pending timeout.
5. setupWorkloadIdentity now returns (clientID, error) to pass the client ID to the background warm-up goroutine.

Also adds github.com/Azure/azure-sdk-for-go/sdk/storage/azblob dependency for blob upload operations.
…t false for WI

CSI tokenRequests for workload identity are handled by kubelet based on the pod's service account; no in-container token mount is needed.
/cherrypick release-1.35
@andyzhangx: Failed to get PR patch from GitHub. This PR will need to be manually cherrypicked.
Error message: status code 406 not one of [200], body: {"message":"Sorry, the diff exceeded the maximum number of lines (20000)","errors":[{"resource":"PullRequest","field":"diff","code":"too_large"}],"documentation_url":"https://docs.github.com/rest/pulls/pulls#get-a-pull-request","status":"406"}
What this PR does
Improves the workload identity (WI) e2e test reliability by porting the AAD OIDC cache warm-up infrastructure from kubernetes-sigs/blob-csi-driver#2445.
Changes
test/e2e/suite_test.go:
- setupWorkloadIdentity() now returns the client ID for background warm-up
- waitForOIDCJWKS(): polls the OIDC JWKS endpoint until signing keys are available
- verifyJWKSKeyMatch(): detects and repairs CAPZ JWKS key mismatch (root cause of AADSTS7000272)
- waitForAADTokenExchange(): background goroutine polls AAD token exchange for up to 45min, running in parallel with other tests
- WI test waits on the <-wiReady channel to ensure the AAD cache is warm before mounting
test/e2e/dynamic_provisioning_test.go:
- ClaimSize from 10Gi to 100Gi (Premium_LRS minimum share is 100GiB)
- skuName from Standard_LRS to Premium_LRS
test/e2e/testsuites/:
- SetAutomountServiceAccountToken(): CSI tokenRequests are handled by kubelet, no in-container token mount needed
Why
AAD independently caches OIDC metadata and can take 20-30+ minutes to accept token exchanges for a new OIDC issuer. Without the warm-up, the WI test pod gets stuck in Pending state waiting for the CSI volume mount, which relies on a successful token exchange.
Additionally, CAPZ clusters have a known race condition where the JWKS blob in Azure Storage may contain a stale signing key different from what kube-apiserver actually uses, causing permanent AADSTS7000272 rejections.
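The mismatch check can be sketched as a comparison of key IDs between two JWKS documents. This assumes the comparison is done on the `kid` field, which the PR text does not specify; `keyIDs` and `missingKeyIDs` are hypothetical helpers, and the sample documents are made up for illustration. The empty-set check also mirrors the readiness condition described for waitForOIDCJWKS, namely that signing keys must be present.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"sort"
)

// jwks models only the fields needed here from a JSON Web Key Set (RFC 7517).
type jwks struct {
	Keys []struct {
		Kid string `json:"kid"`
	} `json:"keys"`
}

// keyIDs parses a JWKS document and returns its key IDs as a set.
// A document with no keys is treated as not ready and returns an error.
func keyIDs(doc []byte) (map[string]bool, error) {
	var set jwks
	if err := json.Unmarshal(doc, &set); err != nil {
		return nil, fmt.Errorf("parse JWKS: %w", err)
	}
	if len(set.Keys) == 0 {
		return nil, errors.New("JWKS contains no keys")
	}
	ids := make(map[string]bool, len(set.Keys))
	for _, k := range set.Keys {
		ids[k.Kid] = true
	}
	return ids, nil
}

// missingKeyIDs returns, in sorted order, the key IDs present in the cluster's
// JWKS that the published JWKS lacks.
func missingKeyIDs(cluster, published []byte) ([]string, error) {
	want, err := keyIDs(cluster)
	if err != nil {
		return nil, err
	}
	have, err := keyIDs(published)
	if err != nil {
		return nil, err
	}
	var missing []string
	for id := range want {
		if !have[id] {
			missing = append(missing, id)
		}
	}
	sort.Strings(missing)
	return missing, nil
}

func main() {
	// Made-up documents: the published copy carries a stale key.
	cluster := []byte(`{"keys":[{"kty":"RSA","kid":"new-key"}]}`)
	published := []byte(`{"keys":[{"kty":"RSA","kid":"stale-key"}]}`)
	missing, err := missingKeyIDs(cluster, published)
	fmt.Println("missing:", missing, "error:", err)
}
```

A non-empty result would be the signal to re-upload the JWKS that kube-apiserver actually serves; the upload itself is outside this sketch.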
Related: kubernetes-sigs/blob-csi-driver#2445, #3118