
feat: support mount SMB Azure File with user-provided OAuth token through k8s secret #3100

Open
andyzhangx wants to merge 67 commits into kubernetes-sigs:master from andyzhangx:mount-with-mi-token

Conversation

Member

@andyzhangx andyzhangx commented Apr 21, 2026

What type of PR is this?

/kind feature

What this PR does / why we need it:

Add mountWithOAuthToken volume context parameter to support cross-tenant scenarios where users provide their own OAuth token via a Kubernetes Secret.

This enables first-party users who cannot use kubelet identity or workload identity (cross-tenant) to mount Azure File shares using an externally-managed OAuth token.

Changes:

  • New volume context parameter: mountWithOAuthToken
  • New secret field: oauthtoken
  • Extend setCredentialCache to support direct token passing to azfilesauthmanager set
  • Extend GetStorageAccountFromSecret to return OAuth token from secret
  • New method setCredentialCacheWithOAuthToken — reads OAuth token from K8s Secret and updates credential cache
  • Mutually exclusive with mountWithManagedIdentity and mountWithWorkloadIdentityToken
  • Linux SMB only (sec=krb5 mount)
  • Supports both dynamic provisioning (via StorageClass parameters) and static provisioning (via PV volumeAttributes)
  • NodeStageVolume: performs initial mount with OAuth token credential cache setup
  • NodePublishVolume: refreshes OAuth token credential cache on every publish (even when already mounted), with SHA256-based skip to avoid unnecessary azfilesauthmanager calls when the token is unchanged
  • getSecretNamespace helper for consistent namespace resolution (secretNamespace → pvcNamespace → "default")
  • Input validation: secretName required, empty server returns InvalidArgument
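The namespace fallback described for getSecretNamespace can be sketched as follows. This is a minimal illustration, not the PR's code: the function name resolveSecretNamespace and its two-string signature are assumptions, and only the ordering (secretNamespace, then pvcNamespace, then "default") comes from the description above.

```go
package main

import "fmt"

// resolveSecretNamespace is a hypothetical stand-in for the PR's
// getSecretNamespace helper. It applies the documented fallback order:
// explicit secretNamespace, then the PVC namespace, then "default".
func resolveSecretNamespace(secretNamespace, pvcNamespace string) string {
	if secretNamespace != "" {
		return secretNamespace
	}
	if pvcNamespace != "" {
		return pvcNamespace
	}
	return "default"
}

func main() {
	fmt.Println(resolveSecretNamespace("storage", "apps")) // explicit value wins
	fmt.Println(resolveSecretNamespace("", "apps"))        // falls back to the PVC namespace
	fmt.Println(resolveSecretNamespace("", ""))            // falls back to "default"
}
```

Keeping the fallback in one helper means NodeStageVolume and NodePublishVolume look up the Secret in the same namespace.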

Usage Example

1. Create Secret with OAuth Token

kubectl create secret generic azure-oauth-token-secret --from-literal=oauthtoken="<oauth-token>"

2. Dynamic Provisioning (StorageClass)

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: azurefile-oauth-token-sc
provisioner: file.csi.azure.com
parameters:
  skuName: Premium_LRS
  mountWithOAuthToken: "true"
  secretName: "azure-oauth-token-secret"
  secretNamespace: "default"
mountOptions:
  - sec=krb5

3. PV (Static Provisioning)

apiVersion: v1
kind: PersistentVolume
metadata:
  name: azurefile-oauth-token-pv
spec:
  capacity:
    storage: 100Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  csi:
    driver: file.csi.azure.com
    volumeHandle: "unique-volume-id"
    volumeAttributes:
      storageAccount: "mystorageaccount"
      shareName: "myshare"
      mountWithOAuthToken: "true"
      secretName: "azure-oauth-token-secret"
      secretNamespace: "default"

Which issue(s) this PR fixes:

Fixes #3099

Does this PR introduce a user-facing change?

Support mounting Azure File shares with user-provided OAuth token via `mountWithOAuthToken` volume context parameter for cross-tenant scenarios. Supports both dynamic provisioning (StorageClass) and static provisioning (PV). Token credential cache is refreshed on NodePublishVolume with SHA256-based deduplication.
  • If the token has expired, reads and writes on the Azure File mount fail with the following error:
    W0427 03:24:50.439219 1016124 nodeserver.go:841] ReadDir /var/lib/kubelet/pods/90ae2394-8dc4-4f23-ad79-2f4ae3c47d47/volumes/kubernetes.io~csi/pv-azurefile/mount failed with readdirent /var/lib/kubelet/pods/90ae2394-8dc4-4f23-ad79-2f4ae3c47d47/volumes/kubernetes.io~csi/pv-azurefile/mount: required key not available, unmount this directory

@k8s-ci-robot k8s-ci-robot added the kind/feature Categorizes issue or PR as related to a new feature. label Apr 21, 2026
@k8s-ci-robot k8s-ci-robot requested review from cvvz and gnufied April 21, 2026 11:50
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. approved Indicates a PR has been approved by an approver from all required OWNERS files. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Apr 21, 2026
@andyzhangx andyzhangx force-pushed the mount-with-mi-token branch 2 times, most recently from 552cc56 to c2bd547 Compare April 21, 2026 11:59
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Apr 21, 2026
@andyzhangx andyzhangx changed the title from "feat: support mount Azure File with user-provided managed identity token" to "[WIP] feat: support mount Azure File with user-provided managed identity token" Apr 21, 2026
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 21, 2026
@andyzhangx andyzhangx force-pushed the mount-with-mi-token branch 2 times, most recently from 4188373 to 18fe595 Compare April 21, 2026 12:15
@andyzhangx andyzhangx requested a review from Copilot April 21, 2026 12:18
@andyzhangx andyzhangx force-pushed the mount-with-mi-token branch from 18fe595 to fa39190 Compare April 21, 2026 12:20
@andyzhangx andyzhangx changed the title from "[WIP] feat: support mount Azure File with user-provided managed identity token" to "feat: support mount Azure File with user-provided OAuth token" Apr 21, 2026
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 21, 2026

Copilot AI left a comment


Pull request overview

Adds support for mounting Azure File shares using a user-provided managed identity OAuth token (via Kubernetes Secret) to enable cross-tenant scenarios, integrating this as a new volume context parameter and wiring it into NodeStageVolume credential-cache setup.

Changes:

  • Add mountwithmanagedidentitytoken volume context parameter and secret field azurestoragemanagedidentitytoken.
  • Extend credential-cache setup to support passing an OAuth token directly to azfilesauthmanager set.
  • Update node staging logic to handle the new MI-token path and enforce mutual exclusivity with existing identity-based options.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 6 comments.

File | Description
pkg/azurefile/utils.go | Extends setCredentialCache to accept either tokenFile-based WI flow or direct token flow; redacts token in logs.
pkg/azurefile/utils_test.go | Updates setCredentialCache call signature in tests.
pkg/azurefile/nodeserver.go | Adds parsing/validation for new volume context flag and implements setCredentialCacheWithMIToken called during SMB mount.
pkg/azurefile/nodeserver_test.go | Updates mutual exclusivity test/error message to include the new flag.
pkg/azurefile/azurefile.go | Adds new constants and extends GetStorageAccountFromSecret to return the MI token; recognizes the new mount mode in GetAccountInfo.


Comment thread pkg/azurefile/utils.go
Comment thread pkg/azurefile/nodeserver.go
Comment thread pkg/azurefile/azurefile.go
Comment thread pkg/azurefile/nodeserver.go Outdated
Comment thread pkg/azurefile/nodeserver.go Outdated
Comment thread pkg/azurefile/utils_test.go
@andyzhangx andyzhangx changed the title from "feat: support mount Azure File with user-provided OAuth token" to "[WIP] feat: support mount Azure File with user-provided OAuth token" Apr 21, 2026
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 21, 2026
@andyzhangx andyzhangx force-pushed the mount-with-mi-token branch 2 times, most recently from 6f727c1 to 89e66ed Compare April 21, 2026 13:27
@andyzhangx andyzhangx requested a review from Copilot April 21, 2026 13:33

Copilot AI left a comment


Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.



Comment thread pkg/azurefile/nodeserver.go Outdated
Comment thread pkg/azurefile/azurefile.go Outdated
Comment thread pkg/azurefile/utils_test.go
@andyzhangx andyzhangx force-pushed the mount-with-mi-token branch 2 times, most recently from 5c07d5e to 927abaf Compare April 21, 2026 14:01
@andyzhangx andyzhangx requested a review from Copilot April 21, 2026 14:03
andyzhangx and others added 23 commits May 16, 2026 14:26
- Merge storeAccountKey=false and requiresSmbOAuth into a single
  conditional block for mountWithManagedIdentity/mountWithWIToken/mountWithOAuthToken
- Remove duplicate early secretName validation in NodePublishVolume
  (already validated in the mountWithOAuthToken block below)
- Add mountWithOAuthToken to shouldUseServiceAccountToken function
…to setCredentialCacheWithOAuthToken

Move server resolution logic (from volumeContext, volumeID, or secret)
and secretName validation into setCredentialCacheWithOAuthToken. The
function now returns (server, error) and encapsulates:
- secretName validation
- server resolution (from context, volumeID parsing, or secret lookup)
- oauthTokenSHAMap check (skip refresh if token unchanged)
- credential cache refresh

This removes duplicate server resolution code and makes both callers
(pre-ensureMountPoint and mount path) simpler.
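
The "skip refresh if token unchanged" step in the commit above could look roughly like the sketch below. It is an illustration under assumptions, not the PR's implementation: the tokenCache type, its refreshIfChanged method, and the apply callback (standing in for the azfilesauthmanager call) are hypothetical names. Only the idea of keying a SHA-256 digest per server, as oauthTokenSHAMap suggests, is taken from the PR text.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sync"
)

// tokenCache remembers the SHA-256 digest of the last token that was
// successfully applied for each server (hypothetical type).
type tokenCache struct {
	mu     sync.Mutex
	digest map[string]string
}

func newTokenCache() *tokenCache {
	return &tokenCache{digest: make(map[string]string)}
}

// refreshIfChanged runs apply only when token differs from the last token
// applied for server. It reports whether apply ran. The digest is stored
// only after apply succeeds, so a failed refresh is retried next time.
func (c *tokenCache) refreshIfChanged(server, token string, apply func(server, token string) error) (bool, error) {
	sum := sha256.Sum256([]byte(token))
	d := hex.EncodeToString(sum[:])

	c.mu.Lock()
	defer c.mu.Unlock()
	if c.digest[server] == d {
		return false, nil // token unchanged, nothing to do
	}
	if err := apply(server, token); err != nil {
		return false, err
	}
	c.digest[server] = d
	return true, nil
}

func main() {
	cache := newTokenCache()
	calls := 0
	apply := func(server, token string) error { calls++; return nil }

	for _, tok := range []string{"token-a", "token-a", "token-b"} {
		refreshed, _ := cache.refreshIfChanged("example.file.core.windows.net", tok, apply)
		fmt.Println("refreshed:", refreshed)
	}
	fmt.Println("apply calls:", calls)
}
```

Storing a digest rather than the token keeps the secret out of long-lived driver memory. Holding the lock across apply is a simplification that serializes refreshes.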
OAuth token authentication uses a Kubernetes Secret, not a service
account token. Including it in shouldUseServiceAccountToken caused
NodeStageVolume to skip all validation and return early when
serviceAccountToken was empty, making the NFS/secretName/createFolder
checks unreachable.
…n NodePublishVolume

Validation errors (missing secretName) are caught in NodeStageVolume.
By the time NodePublishVolume runs, failures from setCredentialCacheWithOAuthToken
are server-side issues (secret fetch, credential cache), not input validation.
…cret fetch

- Return codes.InvalidArgument instead of codes.Internal when secretName
  is missing in volume context (input validation error, not driver error).
- Fetch secret once upfront in setCredentialCacheWithOAuthToken and reuse
  parsed values to avoid duplicate Kubernetes API calls.

Addresses copilot review feedback.
setCredentialCacheWithOAuthToken now returns proper gRPC status errors:
- codes.InvalidArgument for missing secretName, empty server, missing token
- plain errors for system failures (secret fetch, credential cache)

The caller propagates the status code from the helper instead of
only checking for empty secretName.
No need to propagate gRPC status codes from the internal helper -
kubelet treats all mount failures the same regardless of code.
Just return codes.Internal with the error message.
Use status.Convert to unwrap the error's code and message cleanly,
avoiding nested 'rpc error' in the message while preserving the
original status code (e.g. InvalidArgument for validation errors).
Use status.Errorf instead of fmt.Errorf so the error surfaces as
codes.Internal rather than codes.Unknown through status.Convert.
Use status.Errorf instead of fmt.Errorf so the error surfaces as
codes.Internal rather than codes.Unknown through status.Convert.
- CreateVolume: validate secretName is provided when mountWithOAuthToken
  is true, fail early with InvalidArgument instead of deferring failure
  to NodePublish
- e2e: use shared oauthSecretName/oauthSecretNamespace constants instead
  of hardcoded strings in dynamic provisioning test
- utils_test: add test case for mutually exclusive token and tokenFile
- CreateVolume: return InvalidArgument when mountWithOAuthToken is
  combined with NFS protocol, preventing provisioning shares that
  will fail at mount time
- e2e: add trailing slash to IMDS resource URL (https://storage.azure.com/)
  to avoid invalid_resource rejection in some environments
NFS can be indicated via either protocol=nfs or fsType=nfs. The
mountWithOAuthToken guard only checked protocol, so a PV with
fsType: nfs could bypass the check. Reject both.
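
A guard that rejects both NFS indicators, together with the secretName requirement from the earlier CreateVolume commit, might be sketched like this. The function name checkOAuthTokenParams, the exact-key map lookups, and the error strings are assumptions made for illustration; the real driver returns gRPC InvalidArgument statuses and may normalize parameter keys differently.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// checkOAuthTokenParams is a hypothetical validator. It only inspects the
// parameters when mountWithOAuthToken is "true", then rejects NFS (signalled
// by either protocol or fsType) and a missing secretName.
func checkOAuthTokenParams(params map[string]string) error {
	if !strings.EqualFold(params["mountWithOAuthToken"], "true") {
		return nil // feature not requested, nothing to validate here
	}
	if strings.EqualFold(params["protocol"], "nfs") || strings.EqualFold(params["fsType"], "nfs") {
		return errors.New("mountWithOAuthToken is not supported with NFS")
	}
	if params["secretName"] == "" {
		return errors.New("secretName must be provided when mountWithOAuthToken is true")
	}
	return nil
}

func main() {
	viaFsType := map[string]string{"mountWithOAuthToken": "true", "secretName": "s", "fsType": "nfs"}
	fmt.Println(checkOAuthTokenParams(viaFsType))

	valid := map[string]string{"mountWithOAuthToken": "true", "secretName": "s"}
	fmt.Println(checkOAuthTokenParams(valid))
}
```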
In CAPZ clusters, the OIDC JWKS document is served from Azure Blob
Storage and may not be immediately available after cluster creation.
When the CSI driver tries to mount with workload identity, it exchanges
the SA token for an Azure OAuth token via AAD. If AAD cannot find the
signing key in the OIDC issuer's JWKS metadata, it returns
AADSTS7000272 / 400 Bad Request and the mount fails.

Changes:
- Add 120s wait for FIC and RBAC propagation after setup
- Add waitForOIDCJWKS() that polls the issuer's JWKS endpoint until
  it returns at least one signing key (up to 5 min timeout)
…AuthToken

Instead of wrapping the error with codes.Internal, return the error
directly so that InvalidArgument and other specific status codes from
setCredentialCacheWithOAuthToken are preserved.
…essages

- Extract mountWithOAuthToken validation to shared validateMountWithOAuthToken()
- Log secret name and namespace when using OAuth token
- Improve error message: clientID must be provided when tokenFile is set

Copilot AI left a comment


Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated 3 comments.

Comment thread test/e2e/suite_test.go Outdated
Comment thread test/e2e/suite_test.go Outdated
Comment thread go.mod
@andyzhangx
Member Author

/test

@andyzhangx
Member Author

/retest

@andyzhangx
Member Author

/retest


Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.

Development

Successfully merging this pull request may close these issues.

Support bring your own managed identity token

6 participants