Fix finished Workload GC with finalizers#11181
Conversation
✅ Deploy Preview for kubernetes-sigs-kueue canceled.
|
|
the failed test seems another recurrence of #10298 |
| if !deleteRequested { | ||
| return ctrl.Result{}, nil | ||
| } | ||
|
|
||
| r.recorder.Eventf(&wl, nil, corev1.EventTypeNormal, "Deleted", "Deleted", "Deleted finished workload due to elapsed retention") |
There was a problem hiding this comment.
nit, but let's move the record under if deleteRequested, becasuse the return statements don't differ
There was a problem hiding this comment.
Signed-off-by: Shaanveer Singh <shaanver.singh@gmail.com>
mimowo
left a comment
There was a problem hiding this comment.
/lgtm
/approve
/cherrypick release-0.17
/cherrypick release-0.16
Thank you for fixing the issue, and also commonizing the code to make the code more resilient in the future.
|
LGTM label has been added. DetailsGit tree hash: ac6f38781f40487b2a44ff932637513762c01874 |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: mimowo, ShaanveerS The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
/cherrypick release-0.17 |
|
@mimowo: #11181 failed to apply on top of branch "release-0.16": DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
@mimowo: #11181 failed to apply on top of branch "release-0.17": DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
@ShaanveerS please prepare CPs manually |
What type of PR is this?
/kind bug
What this PR does / why we need it:
When finished Workload retention expires, if the Workload still has the
resource-in-usefinalizer attached, Kubernetes marks it for deletion but won't actually remove it.This was already called out as a caveat in KEP-1618:
WorkloadSlices were added after the KEP.
Replaced WorkloadSlices are marked finished by the scheduler in
replaceWorkloadSlice, and that path only callsworkload.Finish, which sets the finished condition but does not remove the finalizer.The fix is to remove it in the finished Workload GC path, so the Workload can actually be removed once retention has elapsed.
Which issue(s) this PR fixes:
Fixes #11130
Special notes for your reviewer:
Does this PR introduce a user-facing change?