feat: add HTTP trigger API for on-demand descheduling cycles by ergoz · Pull Request #1857 · kubernetes-sigs/descheduler

ergoz · 2026-04-10T11:45:40Z

Description

Problem

The descheduler today supports two operational modes when it runs as a Deployment:

One-shot (--descheduling-interval=0): runs a single eviction pass at startup and exits.
Continuous (--descheduling-interval=5m): runs passes on a fixed time interval.

Neither mode gives operators any way to manually trigger an eviction cycle without restarting the process or waiting for the next scheduled tick. This creates friction in several real-world scenarios:

Post-incident remediation: a node drain or topology imbalance just occurred and you need eviction to run right now, not in 4 minutes when the timer fires.
CI/CD pipelines: a deployment job wants to explicitly rebalance pods after a rollout and needs to know when descheduling is complete before proceeding.
Debugging & dry-run validation: an SRE wants to trigger a dry-run pass interactively, inspect logs, and iterate — without restarting the descheduler pod.
Long interval + emergency: the cluster runs descheduler with a 30-minute interval to reduce churn, but an on-call engineer needs to act immediately.

There is no escape hatch. The only workaround is to delete the descheduler pod and let it restart, which is disruptive and causes a gap in leader election.

Solution

This PR introduces an optional HTTP trigger API that allows any authorized caller to synchronously initiate a descheduling cycle at any time via a single curl command.

The API is disabled by default and must be explicitly opted into with --enable-trigger-api, so existing deployments keep their current behavior.

Closes #1855

Changes

`cmd/descheduler/app/options/options.go`

Added EnableTriggerAPI bool field to DeschedulerServer to control the feature.
Added TriggerCh chan chan error — the communication channel between the HTTP handler and the main scheduling loop. Using chan chan error allows the handler to receive the cycle result and propagate it synchronously to the HTTP caller.
Registered the --enable-trigger-api CLI flag.
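
As a rough illustration of the chan chan error idea described above (not the PR's actual code), the sketch below has a caller hand a private reply channel to a worker and block until the result comes back. The names runCycle, worker, and trigger are hypothetical and exist only for this example.

```go
package main

import (
	"errors"
	"fmt"
)

// runCycle is a hypothetical stand-in for one descheduling pass.
func runCycle(fail bool) error {
	if fail {
		return errors.New("failed to run descheduler loop")
	}
	return nil
}

// worker plays the role of the main loop: it receives a reply channel,
// runs one cycle, and sends the result back on that channel.
func worker(triggerCh <-chan chan error, fail bool) {
	for resultCh := range triggerCh {
		resultCh <- runCycle(fail)
	}
}

// trigger plays the role of the HTTP handler: it hands the worker a
// private reply channel and blocks until the cycle result arrives.
func trigger(triggerCh chan<- chan error) error {
	resultCh := make(chan error, 1) // buffered so the worker never blocks on the reply
	triggerCh <- resultCh
	return <-resultCh
}

func main() {
	triggerCh := make(chan chan error, 1)
	go worker(triggerCh, false)
	fmt.Println("cycle result:", trigger(triggerCh))
}
```

Because each caller owns its reply channel, the result of a cycle can only reach the request that asked for it.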

`cmd/descheduler/app/server.go`

When EnableTriggerAPI is set, a buffered chan chan error is created and POST /api/v1/descheduler/run is registered on the existing PathRecorderMux (alongside /metrics and /healthz), so it is served by the same HTTPS server — no new port, no new listener, no new TLS config.
The handler is synchronous: it blocks until the descheduling cycle finishes and returns the result as JSON. The caller gets a definitive success/failure signal.
Concurrency protection: the trigger channel has a buffer of 1. If a cycle is already queued or running, a subsequent request receives 429 Too Many Requests immediately instead of blocking indefinitely or spawning a second concurrent cycle.
Request cancellation is respected: if the HTTP client disconnects, the handler unblocks via r.Context().Done() and returns 504 Gateway Timeout.
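
A minimal sketch of how such a handler could be written, assuming the behavior described above; the real handler in the PR may differ. newTriggerHandler, writeJSON, and post are hypothetical names, and the HTTP method check is omitted for brevity. Note that in this sketch the 429 path fires only while the single buffer slot is occupied; whether a request that arrives mid-cycle is rejected or queued depends on when the loop dequeues, which the description does not spell out.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

type triggerResponse struct {
	Message string `json:"message"`
	Status  string `json:"status"`
}

func writeJSON(w http.ResponseWriter, code int, status, msg string) {
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(code)
	_ = json.NewEncoder(w).Encode(triggerResponse{Message: msg, Status: status})
}

// newTriggerHandler returns a handler that queues one trigger and waits for its result.
func newTriggerHandler(triggerCh chan chan error) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		resultCh := make(chan error, 1)
		// Non-blocking send: if the buffer slot is taken, reject immediately.
		select {
		case triggerCh <- resultCh:
		default:
			writeJSON(w, http.StatusTooManyRequests, "error",
				"descheduling cycle already in progress or pending")
			return
		}
		// Wait for the cycle result or for the client to go away.
		select {
		case err := <-resultCh:
			if err != nil {
				writeJSON(w, http.StatusInternalServerError, "error", err.Error())
				return
			}
			writeJSON(w, http.StatusOK, "ok", "descheduling cycle completed successfully")
		case <-r.Context().Done():
			writeJSON(w, http.StatusGatewayTimeout, "error", "request cancelled or timed out")
		}
	}
}

// post sends one in-memory POST to the handler and returns the status code and body.
func post(h http.Handler) (int, string) {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodPost, "/api/v1/descheduler/run", nil)
	h.ServeHTTP(rec, req)
	return rec.Code, rec.Body.String()
}

func main() {
	triggerCh := make(chan chan error, 1)
	go func() {
		for resultCh := range triggerCh {
			resultCh <- nil // pretend the cycle succeeded
		}
	}()
	code, body := post(newTriggerHandler(triggerCh))
	fmt.Println(code, body)
}
```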

`pkg/descheduler/descheduler.go`

Replaced wait.NonSlidingUntil with an explicit select-based event loop. The loop listens on three channels simultaneously:
- ticker.C — fires at --descheduling-interval (existing behavior, unmodified)
- rs.TriggerCh — receives a manual trigger; executes the cycle and writes the result back to the caller's response channel
- ctx.Done() — graceful shutdown
When TriggerCh is nil (feature disabled), its case in the select never fires, because receiving from a nil channel blocks forever, so the descheduler behaves exactly as before.
When --descheduling-interval=0 and --enable-trigger-api is set, the process now stays alive after the initial cycle and responds to trigger calls indefinitely — enabling a daemon-without-tick mode useful for purely on-demand operation.
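
The loop described above might look roughly like the following. This is a simplified sketch under stated assumptions, not the code in the PR: runLoop and demo are hypothetical names, errors from timed cycles are ignored, and the initial pass mirrors the "initial cycle runs at startup" behavior mentioned in the usage examples.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// runLoop runs one initial cycle, then waits on a ticker, a trigger channel, and ctx.
// A nil triggerCh or a non-positive interval leaves the matching case permanently blocked.
func runLoop(ctx context.Context, interval time.Duration, triggerCh <-chan chan error, cycle func() error) {
	_ = cycle() // initial pass at startup
	if interval <= 0 && triggerCh == nil {
		return // classic one-shot mode: nothing left to wait for
	}
	var tick <-chan time.Time // stays nil when no interval is configured
	if interval > 0 {
		t := time.NewTicker(interval)
		defer t.Stop()
		tick = t.C
	}
	for {
		select {
		case <-tick:
			_ = cycle()
		case resultCh := <-triggerCh:
			resultCh <- cycle() // report the outcome to the waiting caller
		case <-ctx.Done():
			return
		}
	}
}

// demo runs the loop with no interval, fires the given number of manual
// triggers when the feature is enabled, and returns how many cycles ran.
func demo(enableTrigger bool, triggers int) int {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	var triggerCh chan chan error // nil when the feature is disabled
	if enableTrigger {
		triggerCh = make(chan chan error, 1)
	}
	runs := 0
	done := make(chan struct{})
	go func() {
		defer close(done)
		runLoop(ctx, 0, triggerCh, func() error { runs++; return nil })
	}()
	for i := 0; enableTrigger && i < triggers; i++ {
		resultCh := make(chan error, 1)
		triggerCh <- resultCh
		<-resultCh // wait for the triggered cycle to finish
	}
	cancel()
	<-done
	return runs
}

func main() {
	fmt.Println("cycles with trigger API:", demo(true, 2))
	fmt.Println("cycles without trigger API:", demo(false, 0))
}
```

Relying on nil channels keeps the disabled paths out of the select without extra branching inside the loop.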

API Reference

Method	Path	Description
`POST`	`/api/v1/descheduler/run`	Trigger a descheduling cycle synchronously

Success (200)

{"message": "descheduling cycle completed successfully", "status": "ok"}

Cycle already running (429)

{"message": "descheduling cycle already in progress or pending", "status": "error"}

Internal error (500)

{"message": "failed to run descheduler loop: ...", "status": "error"}

Client disconnected (504)

{"message": "request cancelled or timed out", "status": "error"}
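
For callers written in Go, the four responses above could be interpreted along these lines. This is only a sketch: runResponse and interpret are hypothetical names, and treating 429 as retryable is an assumption of this example rather than something the PR states.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// runResponse mirrors the JSON body shown in the API reference.
type runResponse struct {
	Message string `json:"message"`
	Status  string `json:"status"`
}

// interpret maps a status code and body to an outcome.
// retryable is true only for 429 (assumption: the caller may try again later).
func interpret(code int, body []byte) (retryable bool, err error) {
	var resp runResponse
	if jsonErr := json.Unmarshal(body, &resp); jsonErr != nil {
		return false, fmt.Errorf("unexpected response body (HTTP %d): %w", code, jsonErr)
	}
	switch {
	case code == 200 && resp.Status == "ok":
		return false, nil
	case code == 429:
		return true, fmt.Errorf("busy: %s", resp.Message)
	default:
		return false, fmt.Errorf("HTTP %d: %s", code, resp.Message)
	}
}

func main() {
	body := []byte(`{"message": "descheduling cycle already in progress or pending", "status": "error"}`)
	retry, err := interpret(429, body)
	fmt.Println(retry, err)
}
```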

Usage Examples

Start descheduler with the trigger API enabled

descheduler \
  --policy-config-file=/etc/descheduler/policy.yaml \
  --descheduling-interval=30m \
  --enable-trigger-api

Trigger a cycle immediately from inside the cluster

kubectl exec -n kube-system deploy/descheduler -- \
  curl -sk -X POST https://localhost:10258/api/v1/descheduler/run

Trigger via port-forward from a local machine

kubectl port-forward -n kube-system deploy/descheduler 10258:10258 &
curl -sk -X POST https://localhost:10258/api/v1/descheduler/run

Use in a CI/CD step after a deployment

curl -sk --max-time 300 -X POST \
  https://descheduler.kube-system.svc.cluster.local:10258/api/v1/descheduler/run \
  | jq -e '.status == "ok"'

Daemon-without-tick mode (purely on-demand, no automatic interval)

descheduler \
  --policy-config-file=/etc/descheduler/policy.yaml \
  --descheduling-interval=0 \
  --enable-trigger-api
# The initial cycle runs at startup.
# The process stays alive and responds only to explicit POST /api/v1/descheduler/run calls.

Backward Compatibility

--enable-trigger-api defaults to false. Existing deployments are unaffected.
The main scheduling loop (select-based replacement of wait.NonSlidingUntil) is functionally equivalent for all existing flag combinations when the trigger API is disabled.
No new ports, certificates, or RBAC resources are introduced.

Checklist

Please ensure your pull request meets the following criteria before submitting
for review, these items will be used by reviewers to assess the quality and
completeness of your changes:

Signed-off-by: Sergey Morozov <sergey.morozov@flant.com>

linux-foundation-easycla · 2026-04-10T11:45:46Z

The committers listed above are authorized under a signed CLA.

✅ login: ergoz / name: Sergey Morozov (65cfeeb, 81c1910)

k8s-ci-robot · 2026-04-10T11:45:49Z

Welcome @ergoz!

It looks like this is your first PR to kubernetes-sigs/descheduler 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-sigs/descheduler has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

k8s-ci-robot · 2026-04-10T11:45:50Z

Hi @ergoz. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work.

Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

k8s-ci-robot · 2026-04-10T11:45:57Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign ingvagabund for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

ergoz · 2026-04-21T14:23:43Z

@googs1025 @JaneLiuL Hello! I hope you're all doing well. I wanted to gently follow up on the pull request I submitted a little while ago — I'd really appreciate it if someone could take a look when they have a free moment.

No rush at all, just wanted to make sure it didn't get lost in the shuffle! Happy to answer any questions or make changes based on your feedback.

Thanks so much in advance! 🙏

ergoz · 2026-05-04T11:18:26Z

@googs1025 @JaneLiuL Hello! A kind reminder :) We really need this feature :)

googs1025 · 2026-05-04T11:33:24Z

will check this pr tomorrow

ergoz · 2026-05-04T11:55:28Z

@googs1025 great news! Thank you very much!

googs1025 · 2026-05-05T00:42:25Z

/ok-to-test

googs1025 · 2026-05-05T00:46:08Z

+	for {
+		select {
+		case <-ctx.Done():
+			return nil


		case <-ctx.Done():
			// Drain any pending trigger so the caller doesn't hang.
			select {
			case resultCh := <-rs.TriggerCh:
				resultCh <- fmt.Errorf("descheduler shutting down")
			default:
			}
			return nil

edge case: if a request is sitting in TriggerCh (buffer 1) when shutdown fires, this returns and the handler's <-resultCh blocks until the request context is torn down — which usually shows up as a 504 to the caller during a graceful shutdown. Maybe drain it on the way out? 🤔

ergoz (May 6, 2026): Oh, I missed this case :) I will fix it!
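
One detail worth checking when draining on shutdown is that the send to the reply channel cannot itself block. The sketch below assumes the reply channel is buffered with capacity 1, which the thread does not confirm; drainPending and errShuttingDown are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
)

var errShuttingDown = errors.New("descheduler shutting down")

// drainPending answers at most one queued trigger with a shutdown error.
// It returns true if a pending request was found and answered.
func drainPending(triggerCh <-chan chan error) bool {
	select {
	case resultCh := <-triggerCh:
		// Safe only because the reply channel is buffered (assumption):
		// the send succeeds even if the caller has already given up.
		resultCh <- errShuttingDown
		return true
	default:
		return false
	}
}

func main() {
	triggerCh := make(chan chan error, 1)
	resultCh := make(chan error, 1)
	triggerCh <- resultCh // a request queued just before shutdown
	fmt.Println("drained:", drainPending(triggerCh))
	fmt.Println("caller sees:", <-resultCh)
}
```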

googs1025 · 2026-05-05T00:49:42Z

Thanks for the detailed write-up

A few things I'd want to see before this lands, though:

The endpoint mutates cluster state but doesn't go through any auth filter — it's mounted on the same mux as /metrics and /healthz, which are read-only by convention. Anyone with network reachability to :10258 can trigger an eviction pass. I think we need either an auth filter (SAR/bearer token, like other K8s components do) or at minimum a loopback-only bind so this is kubectl exec-only.
No tests in the diff — the main loop rewrite (wait.NonSlidingUntil → custom select) really needs coverage for the four flag combinations (interval 0/+, trigger on/off).
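
To make the weaker of the two options concrete, here is a rough sketch of a loopback-only guard written as HTTP middleware. It is not what the PR implements, it is only an approximation of a loopback-only bind (it checks the peer address instead of changing the listener), and it is not a substitute for the SAR/bearer-token filter mentioned above. loopbackOnly and statusFor are hypothetical names.

```go
package main

import (
	"fmt"
	"net"
	"net/http"
	"net/http/httptest"
)

// loopbackOnly rejects requests whose peer address is not a loopback IP.
func loopbackOnly(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		host, _, err := net.SplitHostPort(r.RemoteAddr)
		ip := net.ParseIP(host)
		if err != nil || ip == nil || !ip.IsLoopback() {
			http.Error(w, "forbidden", http.StatusForbidden)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// statusFor reports the status code a request from remoteAddr would receive.
func statusFor(remoteAddr string) int {
	ok := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	req := httptest.NewRequest(http.MethodPost, "/api/v1/descheduler/run", nil)
	req.RemoteAddr = remoteAddr
	rec := httptest.NewRecorder()
	loopbackOnly(ok).ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	fmt.Println("from 127.0.0.1:", statusFor("127.0.0.1:5000"))
	fmt.Println("from 10.0.0.7:", statusFor("10.0.0.7:5000"))
}
```

A check like this would still admit kubectl exec and port-forward traffic, which reaches the process over the pod's loopback interface, so it narrows exposure without distinguishing between callers.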

ricardomaraschini · 2026-05-06T08:44:20Z


+## Trigger API
+
+The descheduler exposes an optional HTTP endpoint that allows any authorized caller to **manually trigger a descheduling cycle on demand** without waiting for the next scheduled interval or restarting the process.


What sort of authorization is done? That is, how does this code distinguish between authorized and unauthorized callers?

ergoz (May 6, 2026): The wording here is ambiguous; I meant that only those with access to the cluster and the port can run it. But as @googs1025 asked, I'll add authorization. :)

ergoz · 2026-05-06T10:47:14Z

> Thanks for the detailed write-up
>
> A few things I'd want to see before this lands, though:
>
> The endpoint mutates cluster state but doesn't go through any auth filter — it's mounted on the same mux as /metrics and /healthz, which are read-only by convention. Anyone with network reachability to :10258 can trigger an eviction pass. I think we need either an auth filter (SAR/bearer token, like other K8s components do) or at minimum a loopback-only bind so this is kubectl exec-only.
>
> No tests in the diff — the main loop rewrite (wait.NonSlidingUntil → custom select) really needs coverage for the four flag combinations (interval 0/+, trigger on/off).

On auth: yes, you are right! I will add auth the way other Kubernetes components do. Should I also add a separate port listener or mux to split the endpoint from /healthz and /metrics, or is auth enough?
On tests: OK, they will be added :)

ingvagabund · 2026-05-11T10:02:29Z

In the past we discussed the option of running the descheduler asynchronously based on, e.g., node condition changes: for example, every time a node goes unready for some time or nodes scale down/up, by observing cluster objects via informers and custom indexers.

> real-world scenarios

Can you please elaborate more on how this would work from the user's perspective? A single endpoint for every user with a single role? Or is there a possibility of creating multiple roles, e.g., assigning users permissions for individual actions?

k8s-ci-robot · 2026-05-18T11:51:36Z

@ergoz: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name	Commit	Details	Required	Rerun command
pull-descheduler-verify-master	`65cfeeb`	link	true	`/test pull-descheduler-verify-master`
pull-descheduler-test-e2e-k8s-master-1-36	`65cfeeb`	link	true	`/test pull-descheduler-test-e2e-k8s-master-1-36`

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

ergoz added 2 commits April 9, 2026 01:27

feat: api handler for manual run

81c1910

Signed-off-by: Sergey Morozov <sergey.morozov@flant.com>

feat: api handler for manual run - update readme

65cfeeb

Signed-off-by: Sergey Morozov <sergey.morozov@flant.com>

k8s-ci-robot added the `cncf-cla: no` label (the PR's author has not signed the CNCF CLA) Apr 10, 2026

k8s-ci-robot added the `size/L` (a PR that changes 100-499 lines, ignoring generated files) and `needs-ok-to-test` (a PR that requires an org member to verify it is safe to test) labels Apr 10, 2026

k8s-ci-robot requested a review from googs1025 April 10, 2026 11:45

k8s-ci-robot requested a review from JaneLiuL April 10, 2026 11:45

k8s-ci-robot added the `cncf-cla: yes` label (the PR's author has signed the CNCF CLA) and removed the `cncf-cla: no` label Apr 10, 2026

k8s-ci-robot added the `ok-to-test` label (a non-member PR verified by an org member that is safe to test) and removed the `needs-ok-to-test` label May 5, 2026

googs1025 reviewed May 5, 2026


ricardomaraschini suggested changes May 6, 2026


5 participants
