feat: add HTTP trigger API for on-demand descheduling cycles #1857
ergoz wants to merge 2 commits
Signed-off-by: Sergey Morozov <sergey.morozov@flant.com>
@googs1025 @JaneLiuL Hello! I hope you're all doing well. I wanted to gently follow up on the pull request I submitted a little while ago. I'd really appreciate it if someone could take a look when they have a free moment. No rush at all, I just wanted to make sure it didn't get lost in the shuffle! Happy to answer any questions or make changes based on your feedback. Thanks so much in advance! 🙏

@googs1025 @JaneLiuL Hello! A kind reminder :) We really need this feature :)

Will check this PR tomorrow.

@googs1025 great news! Thank you very much!

/ok-to-test
    for {
        select {
        case <-ctx.Done():
            return nil
    case <-ctx.Done():
        // Drain any pending trigger so the caller doesn't hang.
        select {
        case resultCh := <-rs.TriggerCh:
            resultCh <- fmt.Errorf("descheduler shutting down")
        default:
        }
        return nil

Edge case: if a request is sitting in TriggerCh (buffer 1) when shutdown fires, this returns and the handler's <-resultCh blocks until the request context is torn down, which usually shows up as a 504 to the caller during a graceful shutdown. Maybe drain it on the way out? 🤔
Oh, I missed this case :) I will fix it!
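
The drain-on-shutdown idea from this thread can be sketched as a standalone program. This is a minimal sketch under stated assumptions, not the PR's code: `runLoop`, `errShuttingDown`, and `demoShutdownWithPendingTrigger` are hypothetical names, the result channel is assumed to be buffered so the loop's reply cannot block, and the interval must be positive because `time.NewTicker` panics otherwise.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// errShuttingDown is a hypothetical sentinel error used only in this sketch.
var errShuttingDown = errors.New("descheduler shutting down")

// runLoop is a simplified stand-in for the event loop under review.
// A nil triggerCh is never ready in a select, so that case is effectively disabled.
func runLoop(ctx context.Context, interval time.Duration, triggerCh chan chan error, cycle func() error) error {
	ticker := time.NewTicker(interval) // interval must be > 0
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			// Drain one pending trigger (buffer size 1) so its caller is not left waiting.
			select {
			case resultCh := <-triggerCh:
				resultCh <- errShuttingDown
			default:
			}
			return nil
		case <-ticker.C:
			_ = cycle() // the sketch ignores errors from scheduled cycles
		case resultCh := <-triggerCh:
			resultCh <- cycle() // report the cycle result to the waiting caller
		}
	}
}

// demoShutdownWithPendingTrigger queues one trigger, cancels the context, and
// reports whether the queued caller received a reply and what that reply was.
func demoShutdownWithPendingTrigger() (replied bool, err error) {
	triggerCh := make(chan chan error, 1)
	resultCh := make(chan error, 1) // buffered so the loop's reply never blocks
	triggerCh <- resultCh

	ctx, cancel := context.WithCancel(context.Background())
	cancel() // shutdown fires while the trigger is still queued

	done := make(chan struct{})
	go func() {
		defer close(done)
		_ = runLoop(ctx, time.Hour, triggerCh, func() error { return nil })
	}()

	select {
	case err = <-resultCh:
		replied = true
	case <-time.After(2 * time.Second):
		// No reply: this is the hang the review comment describes.
	}
	<-done
	return replied, err
}

func main() {
	replied, err := demoShutdownWithPendingTrigger()
	fmt.Println("caller replied:", replied, "err:", err)
}
```

When the context is already cancelled and a trigger is queued, both cases of the outer `select` are ready and Go picks one at random, so the caller receives either the cycle result or the shutdown error. In both paths it receives a reply instead of waiting for its request context to expire.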
Thanks for the detailed write-up. A few things I'd want to see before this lands, though:

    ## Trigger API

    The descheduler exposes an optional HTTP endpoint that allows any authorized caller to **manually trigger a descheduling cycle on demand** without waiting for the next scheduled interval or restarting the process.
What sort of authorization is done? I.e., how does this code distinguish between authorized and unauthorized callers?
The meaning here is ambiguous; I meant that only those with access to the cluster and port can run it. But as @googs1025 asked, I'll add authorization. :)
In the past we discussed the option of running the descheduler asynchronously based on, e.g., node condition changes: every time a node goes unready for some time, or nodes scale down/up. This would mean observing cluster objects via informers and custom indexers.
Can you please elaborate more on how this would work from the user perspective? A single endpoint for every user with a single role? Or is there a possibility of creating multiple roles, e.g. assigning users permissions for individual actions?
Description
Problem
The descheduler today supports two operational modes when it runs in deployment mode:
- `--descheduling-interval=0`: runs a single eviction pass at startup and exits.
- `--descheduling-interval=5m`: runs passes on a fixed time interval.

Neither mode gives operators any way to manually trigger an eviction cycle without restarting the process or waiting for the next scheduled tick. This creates friction in several real-world scenarios:
There was no escape hatch. The only workaround was to delete the descheduler pod and let it restart, which is disruptive and causes a gap in leader election.
Solution
This PR introduces an optional HTTP trigger API that allows any authorized caller to synchronously initiate a descheduling cycle at any time via a single `curl` command.

The API is disabled by default and must be explicitly enabled with `--enable-trigger-api`, so backward compatibility is fully preserved.

Closes #1855
Changes
cmd/descheduler/app/options/options.go
- `EnableTriggerAPI bool` field on `DeschedulerServer` to control the feature.
- `TriggerCh chan chan error`: the communication channel between the HTTP handler and the main scheduling loop. Using `chan chan error` allows the handler to receive the cycle result and propagate it synchronously to the HTTP caller.
- `--enable-trigger-api` CLI flag.

cmd/descheduler/app/server.go
- When `EnableTriggerAPI` is set, a buffered `chan chan error` is created and `POST /api/v1/descheduler/run` is registered on the existing `PathRecorderMux` (alongside `/metrics` and `/healthz`), so it is served by the same HTTPS server: no new port, no new listener, no new TLS config.
- On `r.Context().Done()`, the handler returns 504 Gateway Timeout.

pkg/descheduler/descheduler.go
- Replaces `wait.NonSlidingUntil` with an explicit `select`-based event loop. The loop listens on three channels simultaneously:
  - `ticker.C`: fires at `--descheduling-interval` (existing behavior, unmodified)
  - `rs.TriggerCh`: receives a manual trigger; executes the cycle and writes the result back to the caller's response channel
  - `ctx.Done()`: graceful shutdown
- When `TriggerCh` is `nil` (feature disabled), it is never selected in `select`, so the scheduler behaves exactly as before.
- When `--descheduling-interval=0` and `--enable-trigger-api` are set, the process now stays alive after the initial cycle and responds to trigger calls indefinitely, enabling a daemon-without-tick mode useful for purely on-demand operation.

API Reference
POST /api/v1/descheduler/run

Success (200)

    {"message": "descheduling cycle completed successfully", "status": "ok"}

Cycle already running (429)

    {"message": "descheduling cycle already in progress or pending", "status": "error"}

Internal error (500)

    {"message": "failed to run descheduler loop: ...", "status": "error"}

Client disconnected (504)

    {"message": "request cancelled or timed out", "status": "error"}

Usage Examples
Start descheduler with the trigger API enabled
Trigger a cycle immediately from inside the cluster

    kubectl exec -n kube-system deploy/descheduler -- \
      curl -sk -X POST https://localhost:10258/api/v1/descheduler/run

Trigger via port-forward from a local machine

    kubectl port-forward -n kube-system deploy/descheduler 10258:10258 &
    curl -sk -X POST https://localhost:10258/api/v1/descheduler/run

Use in a CI/CD step after a deployment
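
Because the endpoint answers synchronously, a pipeline step could trigger a cycle and branch on the status code. The following is only a sketch under assumptions: `outcome`, `trigger`, `triggerWithRetry`, and `newFakeDescheduler` are hypothetical helpers, the retry policy is invented for illustration, and a stand-in `httptest` server replaces the real endpoint so the program runs without a cluster. Against a real descheduler the client would also need whatever HTTPS settings and authorization the deployment requires.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
	"time"
)

// triggerPath is the endpoint path described in this PR.
const triggerPath = "/api/v1/descheduler/run"

// outcome maps the status codes from the API reference to a hypothetical
// three-way result that a pipeline step might act on.
func outcome(status int) string {
	switch status {
	case http.StatusOK:
		return "done"
	case http.StatusTooManyRequests:
		return "retry" // a cycle is already running or pending
	default:
		return "fail" // 500, 504, or anything unexpected
	}
}

// trigger sends one POST to the trigger endpoint and classifies the response.
func trigger(client *http.Client, baseURL string) (string, error) {
	resp, err := client.Post(baseURL+triggerPath, "application/json", nil)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	return outcome(resp.StatusCode), nil
}

// triggerWithRetry retries while the server reports a cycle in progress (429).
func triggerWithRetry(client *http.Client, baseURL string, attempts int, backoff time.Duration) (string, error) {
	result := "retry"
	for i := 0; i < attempts && result == "retry"; i++ {
		if i > 0 {
			time.Sleep(backoff)
		}
		var err error
		result, err = trigger(client, baseURL)
		if err != nil {
			return "", err
		}
	}
	return result, nil
}

// newFakeDescheduler is a stand-in server for this sketch, not the PR's handler:
// it answers 429 for the first busyReplies requests and 200 afterwards.
func newFakeDescheduler(busyReplies int32) *httptest.Server {
	var calls int32
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodPost || r.URL.Path != triggerPath {
			http.NotFound(w, r)
			return
		}
		if atomic.AddInt32(&calls, 1) <= busyReplies {
			w.WriteHeader(http.StatusTooManyRequests)
			return
		}
		w.WriteHeader(http.StatusOK)
	}))
}

func main() {
	srv := newFakeDescheduler(1) // first request is rejected as busy
	defer srv.Close()

	result, err := triggerWithRetry(srv.Client(), srv.URL, 3, 10*time.Millisecond)
	fmt.Println("outcome:", result, "err:", err)
}
```

Treating 429 as retryable and everything else as a failure is one possible policy; a pipeline that only needs "a cycle ran recently" might instead accept 429 as good enough.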
Daemon-without-tick mode (purely on-demand, no automatic interval)
Backward Compatibility
- `--enable-trigger-api` defaults to `false`. Existing deployments are unaffected.
- The `select`-based replacement of `wait.NonSlidingUntil` is functionally equivalent for all existing flag combinations when the trigger API is disabled.

Checklist
Please ensure your pull request meets the following criteria before submitting
for review, these items will be used by reviewers to assess the quality and
completeness of your changes: