
feat(oss): implement version-aware ResourceVersion strategy for fuse pod delete #1682

Open
AlbeeSo wants to merge 1 commit into kubernetes-sigs:master from AlbeeSo:oss/delete-pod-optimization


@AlbeeSo AlbeeSo commented Apr 30, 2026

Implement intelligent ResourceVersion selection based on Kubernetes version to balance data consistency and etcd pressure in large-scale clusters.

What type of PR is this?

/kind feature

What this PR does / why we need it:

Key Changes:

  • Add K8s version detection using Discovery API at startup (main.go)
  • Implement ShouldConstrainResourceVersion() with version-based logic:
    • K8s >= 1.31: Use RV="" (Consistent Reads from Cache supported)
    • K8s < 1.31: Use RV="0" (served from the API server watch cache, so
      possibly stale; avoids etcd quorum read pressure)
    • K8s unknown: Use RV="0" (conservative approach)
  • Add ENV override support: OSS_FUSE_CONSTRAIN_RV=true/false
  • Replace NewFilteredPodInformer with custom ListWatch to control RV
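
A minimal sketch of the selection logic listed above, not the PR's actual code. The names `k8sVersion`, `shouldConstrainRV`, and `resourceVersionFor` are hypothetical. It assumes that "constrain" means sending RV="0", that a parseable OSS_FUSE_CONSTRAIN_RV value overrides version detection, and that an unparseable value falls back to the version-based rule.

```go
package main

import (
	"fmt"
	"strconv"
)

// k8sVersion is a hypothetical stand-in for the detected server version.
// A nil *k8sVersion models "version unknown".
type k8sVersion struct {
	Major, Minor int
}

// atLeast reports whether v is greater than or equal to major.minor.
func (v k8sVersion) atLeast(major, minor int) bool {
	return v.Major > major || (v.Major == major && v.Minor >= minor)
}

// shouldConstrainRV returns true when list requests should use RV="0".
// envValue is the raw value of OSS_FUSE_CONSTRAIN_RV ("" when unset).
// Assumption: a parseable override wins; an invalid one is ignored.
func shouldConstrainRV(v *k8sVersion, envValue string) bool {
	if envValue != "" {
		if b, err := strconv.ParseBool(envValue); err == nil {
			return b
		}
	}
	if v == nil {
		return true // unknown version: conservative fallback
	}
	return !v.atLeast(1, 31)
}

// resourceVersionFor maps the flag to the ResourceVersion sent on list calls.
func resourceVersionFor(constrain bool) string {
	if constrain {
		return "0"
	}
	return ""
}

func main() {
	for _, v := range []*k8sVersion{{1, 30}, {1, 31}, nil} {
		rv := resourceVersionFor(shouldConstrainRV(v, ""))
		fmt.Printf("%+v -> RV=%q\n", v, rv)
	}
}
```

Passing the environment value in as a parameter, rather than calling `os.Getenv` inside the function, keeps the decision easy to test in isolation.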

Design Decisions:

  • K8s >= 1.31: Consistent Reads from Cache provides both consistency
    and performance (reads from cache instead of etcd)
  • K8s < 1.31: Force RV="0" to avoid etcd quorum reads that could
    cause mount/unmount hangs in large clusters
  • Version detection happens once at startup, converted to bool flag
    to avoid repeated version checks in hot path
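
The "detect once, store a bool" decision could look roughly like the sketch below. It is illustrative only: `parseMajorMinor`, `fusePodManager`, and `newFusePodManager` are hypothetical names. The client-go Discovery client's `ServerVersion()` returns the major and minor versions as strings, and the minor may carry a non-numeric suffix such as "31+", so the sketch takes two strings as input and strips the suffix.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseMajorMinor converts version strings such as ("1", "31+") into ints.
// Only the leading digits of minor are used.
func parseMajorMinor(major, minor string) (int, int, error) {
	maj, err := strconv.Atoi(major)
	if err != nil {
		return 0, 0, fmt.Errorf("parse major %q: %w", major, err)
	}
	digits := minor
	notDigit := func(r rune) bool { return r < '0' || r > '9' }
	if i := strings.IndexFunc(minor, notDigit); i >= 0 {
		digits = minor[:i]
	}
	mnr, err := strconv.Atoi(digits)
	if err != nil {
		return 0, 0, fmt.Errorf("parse minor %q: %w", minor, err)
	}
	return maj, mnr, nil
}

// fusePodManager is a hypothetical holder for the precomputed flag.
type fusePodManager struct {
	constrainRV bool // computed once at construction, read in the hot path
}

// newFusePodManager evaluates the version a single time. A parse failure is
// treated as "version unknown", which selects the conservative RV="0" mode.
func newFusePodManager(major, minor string) *fusePodManager {
	maj, mnr, err := parseMajorMinor(major, minor)
	constrain := err != nil || maj < 1 || (maj == 1 && mnr < 31)
	return &fusePodManager{constrainRV: constrain}
}

func main() {
	for _, v := range [][2]string{{"1", "28"}, {"1", "31+"}, {"", ""}} {
		m := newFusePodManager(v[0], v[1])
		fmt.Printf("major=%q minor=%q constrainRV=%v\n", v[0], v[1], m.constrainRV)
	}
}
```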

Logging Levels:

  • Info: K8s >= 1.31 (normal operation with new feature)
  • Warning: K8s < 1.31 (using legacy approach with etcd pressure risk)
  • Error: K8s version unknown (conservative fallback)
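
As an illustration of the level mapping above, the sketch below uses the standard library's `log/slog`. That choice is an assumption made to keep the example self-contained; the driver may well use a different logger. `rvLogLevel` is a hypothetical helper.

```go
package main

import (
	"context"
	"log/slog"
)

// rvLogLevel picks the level for the startup message that reports which
// ResourceVersion mode was selected.
func rvLogLevel(versionKnown, atLeast131 bool) slog.Level {
	switch {
	case !versionKnown:
		return slog.LevelError // conservative fallback
	case atLeast131:
		return slog.LevelInfo // normal operation
	default:
		return slog.LevelWarn // legacy approach
	}
}

func main() {
	ctx := context.Background()
	slog.Log(ctx, rvLogLevel(true, true), "fuse pod list mode", "resourceVersion", "")
	slog.Log(ctx, rvLogLevel(true, false), "fuse pod list mode", "resourceVersion", "0")
	slog.Log(ctx, rvLogLevel(false, false), "fuse pod list mode", "resourceVersion", "0")
}
```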

Testing:

  • Add TestShouldConstrainResourceVersion with 8 test cases:
    • K8s version detection (1.28, 1.30, 1.31, 1.32, nil)
    • ENV override (true/false/invalid)
  • Update TestDelete to cover both constrainRV modes
  • All existing tests updated for new function signatures
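
The eight cases could be laid out as a table along the following lines. This is a sketch rather than the PR's test file. `rvCase`, `decide`, `defaultCases`, and `failedCases` are hypothetical names. The expected values assume that `true` means RV="0", that a valid ENV value overrides the version, and that an invalid one is ignored. The versions paired with the ENV cases are also assumptions.

```go
package main

import (
	"fmt"
	"strconv"
)

// rvCase is one row of the table; a nil version models "unknown".
type rvCase struct {
	name    string
	version *[2]int // {major, minor}
	env     string  // value of OSS_FUSE_CONSTRAIN_RV, "" when unset
	want    bool    // true means RV="0" is expected
}

// decide is a compact stand-in for the function under test.
func decide(version *[2]int, env string) bool {
	if b, err := strconv.ParseBool(env); err == nil {
		return b
	}
	if version == nil {
		return true
	}
	return version[0] < 1 || (version[0] == 1 && version[1] < 31)
}

func defaultCases() []rvCase {
	return []rvCase{
		{"k8s 1.28", &[2]int{1, 28}, "", true},
		{"k8s 1.30", &[2]int{1, 30}, "", true},
		{"k8s 1.31", &[2]int{1, 31}, "", false},
		{"k8s 1.32", &[2]int{1, 32}, "", false},
		{"k8s unknown", nil, "", true},
		{"env true overrides 1.31", &[2]int{1, 31}, "true", true},
		{"env false overrides 1.28", &[2]int{1, 28}, "false", false},
		{"env invalid falls back", &[2]int{1, 31}, "maybe", false},
	}
}

// failedCases returns the names of rows whose result differs from want.
func failedCases(cases []rvCase) []string {
	var failed []string
	for _, c := range cases {
		if got := decide(c.version, c.env); got != c.want {
			failed = append(failed, c.name)
		}
	}
	return failed
}

func main() {
	fmt.Println("failed cases:", failedCases(defaultCases()))
}
```

In a real `_test.go` file the same table would typically be iterated with `t.Run(c.name, ...)` so each row reports separately.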

Files Modified:

  • main.go: Add detectKubernetesVersion(), pass to OSS driver
  • pkg/oss/oss.go: Accept and propagate k8sVersion
  • pkg/oss/controllerserver.go: Store k8sVersion field
  • pkg/mounter/fuse_pod_manager/oss/registry.go:
    ShouldConstrainResourceVersion() with ENV support
  • pkg/mounter/fuse_pod_manager/fuse_pod_manager.go:
    Use constrainResourceVersion flag in Delete()
  • pkg/mounter/fuse_pod_manager/oss/oss_fuse_pod_manager.go:
    Update constructor signature
  • Test files: Update all test cases for new signatures

Impact:

  • Prevents fuse pod delete from missing newly created pods when RV=""
    is used (K8s >= 1.31, or via ENV override)
  • Reduces etcd pressure in K8s 1.31+ clusters (3x performance gain)
  • Avoids mount/unmount hangs in K8s < 1.31 large clusters
  • Backward compatible with ENV override for manual control

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?


Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot added the kind/feature, cncf-cla: yes, and size/XL labels Apr 30, 2026
@k8s-ci-robot k8s-ci-robot requested review from iltyty and mowangdk April 30, 2026 11:10
@AlbeeSo AlbeeSo force-pushed the oss/delete-pod-optimization branch from df75c92 to 9e2aacb Compare May 6, 2026 03:39
@k8s-ci-robot

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: AlbeeSo
Once this PR has been reviewed and has the lgtm label, please assign huww98, mowangdk for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot added the size/XXL label and removed the size/XL label May 6, 2026