KEP-9270: Introduce configuration for the MultiKueue Incremental Dispatcher #10877
Open: Mostafahassen1 wants to merge 19 commits into kubernetes-sigs:main from Mostafahassen1:kep-9270-multikueue-step-size
+177
−0
Commits (19, all by Mostafahassen1):
d17aa9a  KEP: Introduce stepSize for MultiKueue Incremental Dispatcher (#9270)
00933d0  rename customconfigs to sequential tests and split taxonomy
d8312bf  Update keps/9270-multikueue-incremental-step-size/README.md
f384da4  Update keps/9270-multikueue-incremental-step-size/README.md
9c1bd75  Update keps/9270-multikueue-incremental-step-size/README.md
5d015b5  feat(multikueue): add incremental dispatcher step size
14b8064  feat(multikueue): add incremental dispatcher step size
c371136  feat(multikueue): add incremental dispatcher step size
1d05665  feat(multikueue): add incremental dispatcher step size
22eb663  Update keps/9270-multikueue-incremental-step-size/kep.yaml
2ecc54c  Update keps/9270-multikueue-incremental-step-size/README.md
601b212  feat(multikueue): add incremental dispatcher step size
038f1eb  update : kep.yml
5a299ea  update : kep.yml and readme files
98fe7a1  docs: remove KEP template and update feature gate name
3224bcc  chore: restore NNNN-template README to clean up PR
2564722  fix : format and update Kep.yml
022f1c3  fix : format in readme file
ece1621  fix : format in readme file
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,119 @@ | ||
|
|
||
| # KEP-9270: MultiKueue Incremental Dispatcher Step Size | ||
| <!-- toc --> | ||
| - [Summary](#summary) | ||
| - [Motivation](#motivation) | ||
|   - [Goals](#goals) | ||
|   - [Non-Goals](#non-goals) | ||
| - [Proposal](#proposal) | ||
|   - [Risks and Mitigations](#risks-and-mitigations) | ||
| - [Design Details](#design-details) | ||
|   - [API Changes](#api-changes) | ||
|   - [Execution Loop](#execution-loop) | ||
|   - [Test Plan](#test-plan) | ||
|     - [Prerequisite testing updates](#prerequisite-testing-updates) | ||
|     - [Unit tests](#unit-tests) | ||
|     - [Integration tests](#integration-tests) | ||
|     - [e2e tests](#e2e-tests) | ||
|   - [Graduation Criteria](#graduation-criteria) | ||
| - [Implementation History](#implementation-history) | ||
| - [Drawbacks](#drawbacks) | ||
| <!-- /toc --> | ||
|
|
||
| ## Summary | ||
|
|
||
| This proposal introduces a new configuration field called `stepSize` to the MultiKueue Incremental Dispatcher. This setting will control the number of worker clusters that the Manager Cluster will attempt to dispatch a workload to concurrently. | ||
|
|
||
| ## Motivation | ||
|
|
||
| Currently, when MultiKueue attempts to schedule a workload across multiple worker clusters, the Incremental Dispatcher faces a trade-off. If it checks clusters one by one, the process is slow and delays job execution. If it attempts to dispatch to all clusters simultaneously, it generates unnecessary API traffic, increases the load on the Manager Cluster, and can lead to race conditions in which multiple clusters try to admit the same job at the same time. | ||
|
|
||
| Introducing a `stepSize` lets cluster administrators configure a batch size for this process, so they can trade dispatch speed against API load and system stability. | ||
|
|
||
| ### Goals | ||
|
|
||
| * Add a `stepSize` parameter to the `MultiKueueConfig` API. | ||
| * Update the Workload Controller's dispatching loop to process worker clusters in batches defined by the `stepSize`. | ||
| * Ensure that if a workload is admitted in a batch, the remaining pending proxy workloads in that batch are cleanly removed. | ||
|
|
||
| ### Non-Goals | ||
|
|
||
| * We will not change the way worker clusters are scored or prioritized. | ||
| * We will not alter the `AllAtOnce` dispatcher behavior, as this proposal strictly targets the Incremental Dispatcher. | ||
|
|
||
| ## Proposal | ||
|
|
||
| We propose updating the `MultiKueueConfig` API to accept an optional integer for the step size, and updating the Workload Controller to respect this batch limit when iterating through the list of available worker clusters. | ||
|
|
||
|
|
||
| ### Risks and Mitigations | ||
|
|
||
| **Risk:** An administrator might set the `stepSize` to an extremely high number (e.g., equal to the total number of clusters), which effectively recreates the heavy load issues of the `AllAtOnce` dispatcher and increases the chance of race conditions where multiple clusters admit the job at the exact same time. | ||
| **Mitigation:** Provide clear API documentation regarding best practices. The Workload Controller's conflict resolution logic (which deletes duplicate proxy workloads once one is admitted) will naturally handle race conditions, just as it already does for the `AllAtOnce` strategy. | ||
|
|
||
| ## Design Details | ||
|
|
||
| ### API Changes | ||
| We will update the `MultiKueueConfigSpec` struct to include the new field. This change will be backwards compatible by defaulting the step size to `1` (which matches the current one-by-one behavior). | ||
|
|
||
| ```go | ||
| type MultiKueueConfigSpec struct { | ||
| 	// Other existing fields... | ||
|
|
||
| 	// StepSize defines the number of worker clusters the Incremental | ||
| 	// Dispatcher will query at the same time. | ||
| 	// Minimum value is 1. If not set, it defaults to 1. | ||
| 	// +optional | ||
| 	// +kubebuilder:default=1 | ||
| 	// +kubebuilder:validation:Minimum=1 | ||
| 	StepSize *int32 `json:"stepSize,omitempty"` | ||
| } | ||
| ``` | ||
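
As a minimal sketch of how calling code might resolve the optional pointer field, the helper below falls back to `1` when `StepSize` is unset. The names `effectiveStepSize` and `defaultStepSize` are hypothetical and used only for illustration; they are not taken from the Kueue codebase. In a real cluster the `+kubebuilder:default=1` marker would normally fill in the value when the object is stored, so this fallback is only a defensive assumption.

```go
package main

import "fmt"

// defaultStepSize mirrors the proposed `+kubebuilder:default=1` marker.
// Hypothetical constant, not an existing Kueue symbol.
const defaultStepSize int32 = 1

// effectiveStepSize resolves the optional StepSize pointer to a usable value.
// A nil pointer or a value below the proposed minimum of 1 yields the default.
// Hypothetical helper for illustration only.
func effectiveStepSize(stepSize *int32) int32 {
	if stepSize == nil || *stepSize < 1 {
		return defaultStepSize
	}
	return *stepSize
}

func main() {
	three := int32(3)
	fmt.Println(effectiveStepSize(nil))    // field omitted by the user
	fmt.Println(effectiveStepSize(&three)) // field set explicitly
}
```

Keeping the field a pointer lets the API distinguish "not set" from an explicit value, which is why the sketch checks for `nil` before dereferencing.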
|
|
||
| ### Execution Loop | ||
| The core logic change will take place inside the MultiKueue Workload Controller's reconciliation loop: | ||
| 1. When a workload is ready for dispatch, the controller reads the `stepSize` from the `MultiKueueConfig`. | ||
| 2. The controller takes the full list of available worker clusters and divides it into consecutive groups (batches) of at most `stepSize` clusters; the last batch may be smaller. | ||
| 3. The controller loops through the first batch, creating proxy workloads on those specific worker clusters simultaneously. | ||
| 4. If a cluster in the batch admits the job, the controller immediately deletes the proxy workloads from the other clusters in that same batch and stops the loop. | ||
| 5. If no cluster in the batch admits the job, the controller moves on to the next batch of clusters. | ||
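
The steps above can be sketched as a small, self-contained simulation. This is not the controller's actual code: `splitIntoBatches`, `dispatch`, and the `admits` callback are hypothetical stand-ins, cluster names are plain strings, and "creating" or "deleting" a proxy workload is reduced to returning the affected cluster names. It follows the steps as written, so cleanup covers only the batch in which admission happened.

```go
package main

import "fmt"

// splitIntoBatches divides clusters into consecutive groups of at most
// stepSize entries. A stepSize below 1 is treated as 1.
func splitIntoBatches(clusters []string, stepSize int) [][]string {
	if stepSize < 1 {
		stepSize = 1
	}
	var batches [][]string
	for start := 0; start < len(clusters); start += stepSize {
		end := start + stepSize
		if end > len(clusters) {
			end = len(clusters)
		}
		batches = append(batches, clusters[start:end])
	}
	return batches
}

// dispatch walks the batches in order. admits stands in for a worker
// cluster's admission decision. It returns the first admitting cluster and
// the other clusters of that batch, whose proxy workloads would be deleted.
// If no cluster admits, it returns an empty winner and a nil slice.
func dispatch(clusters []string, stepSize int, admits func(cluster string) bool) (string, []string) {
	for _, batch := range splitIntoBatches(clusters, stepSize) {
		winner := ""
		for _, c := range batch {
			if admits(c) {
				winner = c
				break
			}
		}
		if winner == "" {
			continue // no admission in this batch: move on to the next one
		}
		var cleanup []string
		for _, c := range batch {
			if c != winner {
				cleanup = append(cleanup, c)
			}
		}
		return winner, cleanup
	}
	return "", nil
}

func main() {
	clusters := []string{"w1", "w2", "w3", "w4", "w5"}
	winner, cleanup := dispatch(clusters, 2, func(c string) bool { return c == "w4" })
	fmt.Println(winner, cleanup) // prints: w4 [w3]
}
```

With `stepSize` equal to or larger than the number of clusters, `splitIntoBatches` returns a single batch, which is the situation described under Risks and Mitigations.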
|
|
||
| ### Test Plan | ||
|
|
||
| I/we understand the owners of the involved components may require updates to existing tests to make this code solid enough prior to committing the changes necessary to implement this enhancement. | ||
|
|
||
| #### Prerequisite testing updates | ||
|
|
||
| No major prerequisite testing updates are required. Current mock worker cluster frameworks in EnvTest are sufficient. | ||
|
|
||
| #### Unit tests | ||
|
|
||
| - `pkg/controller/core/workload_controller_test.go`: `<date>` - Verify that the worker cluster list is correctly divided into chunks matching the `stepSize`. | ||
| - `pkg/controller/core/workload_controller_test.go`: `<date>` - Verify that the default value remains `1` if the user does not specify a `stepSize`. | ||
|
|
||
| #### Integration tests | ||
|
|
||
| - `TestIncrementalDispatcherStepSize`: Simulate a Manager Cluster with 5 Worker Clusters. Set `stepSize` to `2`. Submit a workload and verify via the API server that exactly 2 proxy workloads are created in the first step. Simulate a rejection on the first 2 clusters, and verify that the controller correctly creates the next 2 proxy workloads in the second step. | ||
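
To make the expected counts in this scenario concrete, the following sketch works out which cluster indices each step would target for 5 clusters and a `stepSize` of 2. The helpers `roundCount` and `roundBounds` are hypothetical names introduced here for the arithmetic only; they do not correspond to existing test utilities.

```go
package main

import "fmt"

// roundCount returns how many dispatch steps are needed to cover
// clusterCount clusters when stepSize clusters are tried per step
// (ceiling division). Both arguments are assumed to be positive.
func roundCount(clusterCount, stepSize int) int {
	return (clusterCount + stepSize - 1) / stepSize
}

// roundBounds returns the half-open index range [start, end) of the clusters
// targeted in the given zero-based round, clamped to clusterCount.
func roundBounds(clusterCount, stepSize, round int) (start, end int) {
	start = round * stepSize
	if start > clusterCount {
		start = clusterCount
	}
	end = start + stepSize
	if end > clusterCount {
		end = clusterCount
	}
	return start, end
}

func main() {
	const clusters, step = 5, 2
	for r := 0; r < roundCount(clusters, step); r++ {
		start, end := roundBounds(clusters, step, r)
		fmt.Printf("step %d: clusters[%d:%d] -> %d proxy workloads\n", r+1, start, end, end-start)
	}
}
```

Under these assumptions the first two steps each create 2 proxy workloads, as the test description expects, and a third step would cover the single remaining cluster.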
|
|
||
| #### e2e tests | ||
|
|
||
| Standard E2E multi-cluster tests will be updated to include an iteration where `MultiKueueConfig` is deployed with a `stepSize > 1` to ensure workloads execute successfully end-to-end without duplication. | ||
|
|
||
| ### Graduation Criteria | ||
|
|
||
| * **Alpha:** | ||
|   * API fields added. | ||
|   * Workload Controller updated to support batching. | ||
|   * Unit tests and EnvTest integration tests passing. | ||
| * **Beta:** | ||
|   * Gather feedback from the community. | ||
|   * E2E tests added and passing consistently. | ||
| * **Stable:** | ||
|   * Feature has been in Beta for at least one release cycle with no major bugs reported. | ||
|
|
||
| ## Implementation History | ||
|
|
||
| - 2026-05-01: KEP proposed and initial draft created. | ||
|
|
||
| ## Drawbacks | ||
|
|
||
| This adds slight code complexity to the existing Incremental Dispatcher loop and introduces one additional configuration parameter for administrators to manage and tune. | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,35 @@ | ||
| title: MultiKueue Incremental Dispatcher Step Size | ||
| kep-number: 9270 | ||
| authors: | ||
| - "@Mostafahassen1" | ||
| status: provisional | ||
| creation-date: "2026-05-01" | ||
| reviewers: | ||
| - TBD | ||
| approvers: | ||
| - TBD | ||
|
|
||
| # The target maturity stage in the current dev cycle for this KEP. | ||
| stage: alpha | ||
|
|
||
| # The most recent milestone for which work toward delivery of this KEP has been | ||
| # done. Kueue typically uses 0.x versioning. | ||
| latest-milestone: "v0.9" | ||
|
|
||
| # The milestone at which this feature was, or is targeted to be, at each stage. | ||
| milestone: | ||
|   alpha: "v0.9" | ||
|   beta: "v0.10" | ||
|   stable: "v1.0" | ||
|
|
||
| # The following PRR answers are required at alpha release | ||
| # List the feature gate name and the components for which it must be enabled | ||
| feature-gates: | ||
|   - name: MultiKueueStepSize | ||
|     components: | ||
|       - kueue-controller-manager | ||
| disable-supported: true | ||
|
|
||
| # The following PRR answers are required at beta release | ||
| metrics: | ||
| - kueue_multikueue_dispatched_workloads_total | ||