[release-0.16] Finish old workload slice only after new slice is admitted#11558
[release-0.16] Finish old workload slice only after new slice is admitted#11558sohankunkerkar wants to merge 2 commits into
Conversation
✅ Deploy Preview for kubernetes-sigs-kueue ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: sohankunkerkar The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
This PR adjusts workload-slice replacement ordering so the old slice is only finished after the replacement slice is admitted, and adds an integration assertion to prevent regressions.
Changes:
- Move workload-slice replacement/finish logic to run after successful admission in the scheduler.
- Add a watch-based test utility + integration test that asserts “new slice admitted before old slice finished”.
- Update scheduler unit test expected event ordering accordingly.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| test/util/util.go | Adds a watcher-based assertion helper for workload-slice ordering. |
| test/integration/singlecluster/controller/jobs/job/job_controller_test.go | Adds an integration test to enforce the new ordering behavior. |
| pkg/scheduler/scheduler.go | Reorders slice replacement to occur after admission; improves MultiKueue behavior by preserving ClusterName. |
| pkg/scheduler/scheduler_test.go | Adjusts expected event order to match new behavior. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| status, _ := evt.Object.(*metav1.Status) | ||
| gomega.Expect(evt.Type).ShouldNot(gomega.Equal(watch.Error), fmt.Sprintf("watch error: %v", status)) | ||
| } | ||
| if evt.Type != watch.Modified { |
| if wl.Name == oldWorkloadName && workload.IsFinished(wl) { | ||
| oldSliceFinished = true | ||
| } | ||
| if wl.Name != oldWorkloadName && workload.IsAdmitted(wl) { | ||
| gomega.Expect(oldSliceFinished).Should(gomega.BeFalse(), | ||
| "old workload slice was finished before new slice was admitted") | ||
| newSliceAdmitted = true | ||
| } |
| if features.Enabled(features.ElasticJobsViaWorkloadSlices) && oldWorkloadSlice != nil { | ||
| if err := s.replaceWorkloadSlice(ctx, oldWorkloadSlice.WorkloadInfo.ClusterQueue, e.Obj, oldWorkloadSlice.WorkloadInfo.Obj.DeepCopy()); err != nil { | ||
| log.Error(err, "Failed to finish old workload slice after admitting replacement; job reconciler will handle recovery") | ||
| } | ||
| } |
| ginkgo.GinkgoHelper() | ||
| oldSliceFinished := false | ||
| newSliceAdmitted := false | ||
| timeoutCh := time.After(timeout) |
|
/test pull-kueue-verify-release-0-16 |
|
@sohankunkerkar: The following test failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
This is a manual cherry-pick of #11195 and #11292
/assign sohankunkerkar