diff --git a/Makefile b/Makefile
index e958d19548..86ddc82e9e 100644
--- a/Makefile
+++ b/Makefile
@@ -26,7 +26,7 @@ ARCHS = amd64 arm arm64
 
 LDFLAGS=-ldflags "-X ${LDFLAG_LOCATION}.version=${VERSION} -X ${LDFLAG_LOCATION}.buildDate=${BUILD} -X ${LDFLAG_LOCATION}.gitbranch=${BRANCH} -X ${LDFLAG_LOCATION}.gitsha1=${SHA1}"
 
-GOLANGCI_VERSION := v2.8.0
+GOLANGCI_VERSION := v2.12.2
 HAS_GOLANGCI := $(shell ls _output/bin/golangci-lint 2> /dev/null)
 
 GOFUMPT_VERSION := v0.7.0
@@ -146,7 +146,7 @@ verify-gen:
 
 lint:
 ifndef HAS_GOLANGCI
-	curl -sSfL https://raw.githubusercontent.com/golangci/golangci-lint/master/install.sh | sh -s -- -b ./_output/bin ${GOLANGCI_VERSION}
+	GOBIN=$(CURRENT_DIR)/_output/bin go install github.com/golangci/golangci-lint/v2/cmd/golangci-lint@${GOLANGCI_VERSION}
 endif
 	./_output/bin/golangci-lint run -v
 	./_output/bin/golangci-lint fmt -v
diff --git a/README.md b/README.md
index f66e03c71b..8579ce19b5 100644
--- a/README.md
+++ b/README.md
@@ -697,6 +697,25 @@ The `topologyBalanceNodeFit` arg is used when balancing topology domains while t
 topologyBalanceNodeFit: false
 ```
 
+The `zoneAwareNodeFit` arg (default: `false`) prevents over-committing a single
+under-loaded topology domain within one balancing round. For each candidate pod the
+descheduler considers evicting, it requires that at least one specific under-loaded
+domain (identified by the constraint's `topologyKey`) has both (a) a node where the pod
+fits per the scheduler's per-node predicates AND (b) sufficient remaining aggregate
+resource headroom — after subtracting pods already committed to that domain in this
+round — to absorb the pod's cpu/memory request and one pod slot.
+
+Unlike `topologyBalanceNodeFit`, which only checks per-node fit on the union of
+under-loaded domain nodes (a check that is stateless across the batch),
+`zoneAwareNodeFit` tracks per-domain cumulative state. Use it when batched eviction
+would otherwise push more pods toward an under-loaded domain than it can actually
+absorb, causing the scheduler to send the excess back to the over-loaded domain
+(eviction churn).
+
+```yaml
+zoneAwareNodeFit: true
+```
+
 Strategy parameter `labelSelector` is not utilized when balancing topology domains and is only applied during eviction to determine if the pod can be evicted.
 
 [Supported Constraints](https://kubernetes.io/docs/concepts/scheduling-eviction/topology-spread-constraints/#spread-constraint-definition) fields:
@@ -714,12 +733,13 @@ Strategy parameter `labelSelector` is not utilized when balancing topology domai
 
 **Parameters:**
 
-|Name|Type|
-|---|---|
-|`namespaces`|(see [namespace filtering](#namespace-filtering))|
-|`labelSelector`|(see [label filtering](#label-filtering))|
+|Name|Type|Description|
+|---|---|---|
+|`namespaces`|(see [namespace filtering](#namespace-filtering))||
+|`labelSelector`|(see [label filtering](#label-filtering))||
 |`constraints`|(see [whenUnsatisfiable](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.27/#topologyspreadconstraint-v1-core))||
 |`topologyBalanceNodeFit`|bool|default `true`. [node fit filtering](#node-fit-filtering) when balancing topology domains|
+|`zoneAwareNodeFit`|bool|default `false`. When enabled, gates eviction on cumulative per-topology-domain headroom: a candidate pod is admitted only if some specific under-loaded domain has both a node where it fits AND remaining aggregate cpu/memory/pod-slot headroom (after pods already committed to that domain in this round). Prevents over-committing a single under-loaded domain — a case `topologyBalanceNodeFit` cannot detect.|
 
 **Example:**
 
diff --git a/docs/proposals/zone-aware-nodefit.md b/docs/proposals/zone-aware-nodefit.md
new file mode 100644
index 0000000000..f66d042141
--- /dev/null
+++ b/docs/proposals/zone-aware-nodefit.md
@@ -0,0 +1,282 @@
+# RFC: ZoneAwareNodeFit for RemovePodsViolatingTopologySpreadConstraint
+
+**Status:** Implementable
+**Author:** bruno.chauvet@rokt.com
+**Date:** 2026-04-15 (revised 2026-05-17)
+**Related issues:** kubernetes-sigs/descheduler#1534, kubernetes-sigs/descheduler#1067
+
+---
+
+## Problem
+
+`RemovePodsViolatingTopologySpreadConstraint` evicts pods from over-loaded topology
+domains in batches: each iteration of `balanceDomains` selects N candidate pods from
+an over-loaded domain and lets the scheduler place them in under-loaded domains. The
+existing `TopologyBalanceNodeFit` gate (default `true`) checks, for each candidate,
+whether the pod can fit on at least one node in the union of all under-loaded domains.
+
+The check is **stateless across the batch**: every candidate evaluates `PodFitsAnyOtherNode`
+against the same node-view (the descheduler does not simulate intermediate scheduler
+decisions). So if N candidates each individually fit on a node in the only under-loaded
+domain `B`, all N pass the gate — even though `B` may only have aggregate room for
+K ≪ N of them. After eviction, the scheduler can place only K pods in `B`; the
+remaining N − K either go back to the over-loaded domain (churn — issues #1534, #1067)
+or remain Pending.
+
+### Illustrative scenario
+
+```
+Domain A (over-loaded): 7 pods on nodes with ample CPU
+Domain B (under-loaded by pod count): 1 pod on a node with 250 m CPU allocatable
+Existing pod in B uses 100 m → 150 m of aggregate headroom remains in domain B
+Candidate pods request 100 m CPU each
+
+balanceDomains schedules 3 evictions toward domain B.
+TopologyBalanceNodeFit evaluates each candidate against the live indexer state
+(B1 still has 150 m headroom) → all 3 pass → all 3 evicted.
+The scheduler can fit at most 1 of them in B. The other 2 go back to A → churn.
+```
+
+### Stateless vs cumulative checks, side-by-side
+
+```mermaid
+sequenceDiagram
+    autonumber
+    participant BD as balanceDomains
+    participant TBNF as topologyBalanceNodeFit gate<br/>(stateless per-node check on union of domains)
+    participant ZANF as zoneAwareNodeFit gate<br/>(cumulative per-domain headroom)
+    participant Idx as live nodeIndexer<br/>(B1 = 150 m headroom)
+    participant H as remainingHeadroom map<br/>(zoneB = 150 m initially)
+
+    Note over BD: 3 candidate pods<br/>(each requests 100 m)
+
+    BD->>TBNF: candidate 1
+    TBNF->>Idx: PodFitsAnyOtherNode([B1])
+    Idx-->>TBNF: yes (150 m ≥ 100 m)
+    TBNF-->>BD: pass
+    BD->>ZANF: candidate 1
+    ZANF->>H: zoneB headroom?
+    H-->>ZANF: 150 m
+    ZANF->>H: decrement by 100 m → 50 m
+    ZANF-->>BD: pass
+
+    BD->>TBNF: candidate 2
+    TBNF->>Idx: PodFitsAnyOtherNode([B1])
+    Idx-->>TBNF: yes — indexer is NOT mutated mid-batch
+    TBNF-->>BD: pass ← bug
+    BD->>ZANF: candidate 2
+    ZANF->>H: zoneB headroom?
+    H-->>ZANF: 50 m
+    ZANF-->>BD: reject (50 m < 100 m)
+
+    BD->>TBNF: candidate 3
+    TBNF-->>BD: pass ← bug
+    BD->>ZANF: candidate 3
+    ZANF-->>BD: reject
+
+    Note over BD: topologyBalanceNodeFit alone:<br/>3 evictions, scheduler returns 2 → churn<br/>zoneAwareNodeFit (with or without TBNF):<br/>1 eviction, no churn
+```
+
+The fundamental gap: `TopologyBalanceNodeFit` reasons about **per-node fit on the
+union of under-loaded domains**, which mathematically collapses to
+`OR(merge(domains))`. It cannot detect cumulative over-commit of a single under-loaded
+domain because it never tracks state across the batch.
+
+## Proposed Change
+
+Add an opt-in field `ZoneAwareNodeFit *bool` to
+`RemovePodsViolatingTopologySpreadConstraintArgs` (default `false`).
+
+When enabled, `balanceDomains` tracks, **per under-loaded topology domain**, the
+aggregate resource headroom remaining for the current batch. Each candidate pod is
+admitted only if some specific under-loaded domain has both:
+
+- (a) at least one node where the pod fits per the scheduler's per-node predicates
+  (`PodFitsAnyOtherNode`), AND
+- (b) sufficient remaining aggregate headroom — after subtracting all pods already
+  committed to that domain in this batch — to absorb the pod's resource request and
+  one pod slot.
+
+On admission, that domain's headroom is decremented. Subsequent candidates see the
+decremented state; once a domain is drained, later candidates must find another
+qualifying domain or be skipped.
+
+Per-domain headroom is initialised as `sum(allocatable) − sum(existing pod requests)`
+across the domain's nodes, for cpu, memory, and pod count.
+
+### Why this is qualitatively different from `TopologyBalanceNodeFit`
+
+The accumulator makes pod-N's decision depend on commitments 1..N-1. There is no
+equivalent stateless `OR(merge(...))` formulation, because `PodFitsAnyOtherNode` has
+no notion of "headroom remaining after prior commitments in this batch."
+
+`ZoneAwareNodeFit` is therefore **strictly stricter** than `TopologyBalanceNodeFit`:
+any pod that fails `ZoneAwareNodeFit` also fails the cumulative-correctness goal
+that `TopologyBalanceNodeFit` cannot express.
+
+## API
+
+```yaml
+apiVersion: "descheduler/v1alpha2"
+kind: DeschedulerPolicy
+profiles:
+  - name: default
+    plugins:
+      balance:
+        enabled:
+          - name: RemovePodsViolatingTopologySpreadConstraint
+    pluginConfig:
+      - name: RemovePodsViolatingTopologySpreadConstraint
+        args:
+          topologyBalanceNodeFit: true   # unchanged — existing flag
+          zoneAwareNodeFit: true         # NEW — cumulative per-domain gate
+```
+
+### Field semantics
+
+| Field | Default | Reasoning level | Catches cumulative over-commit? |
+|---|---|---|---|
+| `topologyBalanceNodeFit` | `true` | Per-node, stateless across batch | No |
+| `zoneAwareNodeFit` | `false` | Per-domain, cumulative within batch | Yes |
+
+Enabling both is valid: `TopologyBalanceNodeFit` runs first as a fast per-node pre-filter,
+then `ZoneAwareNodeFit` applies the cumulative-headroom constraint.
+
+## Implementation
+
+### types.go
+
+```go
+// ZoneAwareNodeFit, if set to true, requires that each pod evicted by topology-spread
+// balancing have, in at least one specific under-loaded topology domain (as identified
+// by the constraint's TopologyKey), (a) a node where the pod fits per the scheduler's
+// per-node predicates AND (b) sufficient remaining aggregate resource headroom — after
+// accounting for other pods already committed to that domain during the same balancing
+// round — to absorb the pod.
+ZoneAwareNodeFit *bool `json:"zoneAwareNodeFit,omitempty"`
+```
+
+### defaults.go
+
+```go
+if args.ZoneAwareNodeFit == nil {
+    args.ZoneAwareNodeFit = utilptr.To(false)
+}
+```
+
+### topologyspreadconstraint.go
+
+Two new helpers replace the original duplicate filter + naive OR(OR) check:
+
+```go
+// groupNodesByDomain classifies the supplied nodes (already filtered to under-loaded
+// domains) by their topology-key label value. Single pass over the same node slice
+// the existing filterNodesBelowIdealAvg returns.
+func groupNodesByDomain(nodes []*v1.Node, topologyKey string) map[string][]*v1.Node
+
+// computeDomainHeadroom returns, per under-loaded domain, the aggregate remaining
+// resource headroom (allocatable − already-requested) summed across the domain's
+// nodes. Tracks cpu, memory, and pod count. Computed once per balanceDomains call.
+func computeDomainHeadroom(
+    nodesByDomain map[string][]*v1.Node,
+    nodeIndexer podutil.GetPodsAssignedToNodeFunc,
+) map[string]v1.ResourceList
+
+// podFitsSomeDomainWithHeadroom returns the topology-domain key whose remaining
+// headroom AND per-node fit both admit pod. On success it decrements that domain's
+// headroom in place and returns ok=true. Iterates domains in descending remaining
+// CPU headroom order so pods spread toward the roomiest under-loaded domain first;
+// ties fall back to alphabetical for determinism.
+func podFitsSomeDomainWithHeadroom(
+    nodeIndexer podutil.GetPodsAssignedToNodeFunc,
+    pod *v1.Pod,
+    nodesByDomain map[string][]*v1.Node,
+    remainingHeadroom map[string]v1.ResourceList,
+) (string, bool)
+```
+
+In `balanceDomains`:
+
+```go
+zoneAwareNodeFit := utilptr.Deref(d.args.ZoneAwareNodeFit, false)
+var nodesByDomain map[string][]*v1.Node
+var remainingHeadroom map[string]v1.ResourceList
+if zoneAwareNodeFit {
+    nodesByDomain = groupNodesByDomain(nodesBelowIdealAvg, tsc.TopologyKey)
+    remainingHeadroom = computeDomainHeadroom(nodesByDomain, getPodsAssignedToNode)
+}
+```
+
+Inside the per-pod loop, after the existing `topologyBalanceNodeFit` gate:
+
+```go
+if zoneAwareNodeFit {
+    if _, ok := podFitsSomeDomainWithHeadroom(
+        getPodsAssignedToNode, aboveToEvict[k], nodesByDomain, remainingHeadroom,
+    ); !ok {
+        d.logger.V(2).Info(
+            "ignoring pod for eviction: no target topology domain has fit + remaining headroom",
+            "pod", klog.KObj(aboveToEvict[k]),
+        )
+        continue
+    }
+}
+```
+
+## Alternatives Considered
+
+**Modify `TopologyBalanceNodeFit` semantics to be cumulative.** Silent behavioural change
+for existing users. Rejected.
+
+**Per-node-fit-only gate, no headroom tracking.** This is what the first revision of this
+proposal shipped; review surfaced that it collapses to `OR(merge(domains))` and adds
+nothing over `TopologyBalanceNodeFit`. Rejected — see Problem section.
+
+**Threading zone context through `EvictorPlugin.PreEvictionFilter`.** Breaking API change
+across all plugins. Rejected in favour of a TSC-local gate.
+
+## Test Plan
+
+The acceptance bar for this test plan: each behavioural case must produce a different
+result under the redesign than under a naive `OR(merge(domains))` per-node-fit
+implementation. Cases marked **gap-catching** would have flagged the prior revision's
+churn bug. Cases marked **regression** ensure the new gate still enforces existing
+invariants (per-node fit, default-off no-op).
+
+### Integration cases (`TestTopologySpreadConstraint` table)
+
+Shared topology unless noted: `zoneA` over-loaded (7 pods on a 2 000 m node);
+`zoneB` under-loaded (1 pod on a 250 m node → 150 m aggregate headroom).
+Each candidate pod requests 100 m CPU. balanceDomains schedules 3 evictions.
+
+| # | Scenario | Args | Expected | Catches gap? |
+|---|---|---|---|---|
+| 1 | ZoneAwareNodeFit caps cumulative load **on top of** TopologyBalanceNodeFit | TBNF=true (default), ZANF=true, evictor nodeFit=false | 1 of 3 | ✅ — prior impl admits 3 (per-node check stateless across batch) |
+| 2 | ZoneAwareNodeFit alone caps at aggregate headroom | TBNF=false, ZANF=true, evictor nodeFit=false | 1 of 3 | ✅ — prior impl admits 3 |
+| 3 | TopologyBalanceNodeFit alone does **not** cap cumulative load (the bug) | defaults (TBNF=true, ZANF=false), nodeFit=false | 3 of 3 | ❌ baseline contrast (proves the bug exists without the new gate) |
+| 4 | Drains domain by domain, caps at sum of headroom across domains (zoneB=150 m, zoneC=250 m → total headroom 400 m fits 4×100 m, but algorithm schedules 4 evictions in 2 iterations; the 4th finds both domains drained) | TBNF=false, ZANF=true, nodeFit=false | 3 of 4 | ✅ — prior impl admits 4 (zoneC always has per-node fit) |
+| 5 | Per-node fit still required when aggregate headroom is misleadingly high (zoneB has 5 × 50 m nodes = 250 m aggregate but no single node fits 100 m) | TBNF=false, ZANF=true, nodeFit=false | 0 evictions | ❌ regression — both impls reject |
+| 6 | Multi-node-per-domain grouping (zoneB has 2 nodes each contributing 100 m → 200 m aggregate fits 2×100 m) | TBNF=false, ZANF=true, nodeFit=false | 2 evictions | ✅ — would surface a grouping bug that mis-bucketed nodes |
+| 7 | Default off — backward compat (ZANF unset, evictor nodeFit handles it) | defaults, nodeFit=true | 0 evictions | ❌ regression — identical to prior versions |
+
+### Helper unit tests
+
+| Test | Coverage |
+|---|---|
+| `TestGroupNodesByDomain` | Classifies by `node.Labels[topologyKey]`; drops nodes missing the label |
+| `TestComputeDomainHeadroom` | Aggregates allocatable − sum-of-requests across a domain's nodes; counts cpu (millicores) and pod slots correctly |
+| `TestComputeDomainHeadroomFailsClosedOnIndexerError` | When `ListPodsOnANode` errors for a node, that node contributes zero to the domain aggregate (no allocatable, no requests) |
+| `TestPodFitsSomeDomainWithHeadroom` — roomiest first | Picks the domain with the highest remaining CPU headroom even when it sorts later alphabetically |
+| `TestPodFitsSomeDomainWithHeadroom` — alphabetical tie-break | When two domains have equal headroom, commits to the alphabetically-earlier domain (determinism) |
+| `TestPodFitsSomeDomainWithHeadroom` — drain then reject | Three sequential calls with starting headroom = 250 m; first two admit, third rejects (50 m < 100 m) |
+| `TestPodFitsSomeDomainWithHeadroom` — aggregate failure | Headroom < pod request → reject |
+| `TestPodFitsSomeDomainWithHeadroom` — per-node failure | Aggregate ample but no single node fits (occupant pod consumes capacity) → reject |
+| `TestPodFitsSomeDomainWithHeadroom` — empty input | Empty `nodesByDomain` → reject without error |
+
+## Risk
+
+**Low.**
+- Opt-in (`false` by default).
+- Additive: existing `TopologyBalanceNodeFit` is unchanged.
+- Localised to `balanceDomains` and a few new helpers in the TSC plugin.
+- No changes to framework types, eviction API, or other plugins.
diff --git a/hack/lib/go.sh b/hack/lib/go.sh
index 311d0f93b7..d0a4242ba0 100644
--- a/hack/lib/go.sh
+++ b/hack/lib/go.sh
@@ -19,7 +19,7 @@
 go::verify_version() {
   GO_VERSION=($(go version))
 
-  if [[ -z $(echo "${GO_VERSION[2]}" | grep -E 'go1.23|go1.24|go1.25') ]]; then
+  if [[ -z $(echo "${GO_VERSION[2]}" | grep -E 'go1.24|go1.25|go1.26') ]]; then
     echo "Unknown go version '${GO_VERSION[2]}', skipping gofmt."
     exit 1
   fi
diff --git a/pkg/framework/plugins/removepodsviolatingtopologyspreadconstraint/defaults.go b/pkg/framework/plugins/removepodsviolatingtopologyspreadconstraint/defaults.go
index 55b4092f87..49016363ea 100644
--- a/pkg/framework/plugins/removepodsviolatingtopologyspreadconstraint/defaults.go
+++ b/pkg/framework/plugins/removepodsviolatingtopologyspreadconstraint/defaults.go
@@ -36,6 +36,9 @@ func SetDefaults_RemovePodsViolatingTopologySpreadConstraintArgs(obj runtime.Obj
 	if args.TopologyBalanceNodeFit == nil {
 		args.TopologyBalanceNodeFit = utilptr.To(true)
 	}
+	if args.ZoneAwareNodeFit == nil {
+		args.ZoneAwareNodeFit = utilptr.To(false)
+	}
 	if len(args.Constraints) == 0 {
 		args.Constraints = append(args.Constraints, v1.DoNotSchedule)
 	}
diff --git a/pkg/framework/plugins/removepodsviolatingtopologyspreadconstraint/defaults_test.go b/pkg/framework/plugins/removepodsviolatingtopologyspreadconstraint/defaults_test.go
index 0901e70874..4b96b9760c 100644
--- a/pkg/framework/plugins/removepodsviolatingtopologyspreadconstraint/defaults_test.go
+++ b/pkg/framework/plugins/removepodsviolatingtopologyspreadconstraint/defaults_test.go
@@ -49,6 +49,7 @@ func TestSetDefaults_RemovePodsViolatingTopologySpreadConstraintArgs(t *testing.
 				LabelSelector:          nil,
 				Constraints:            []v1.UnsatisfiableConstraintAction{v1.DoNotSchedule},
 				TopologyBalanceNodeFit: utilptr.To(true),
+				ZoneAwareNodeFit:       utilptr.To(false),
 			},
 		},
 		{
@@ -63,6 +64,7 @@ func TestSetDefaults_RemovePodsViolatingTopologySpreadConstraintArgs(t *testing.
 				LabelSelector:          &metav1.LabelSelector{},
 				Constraints:            []v1.UnsatisfiableConstraintAction{v1.DoNotSchedule, v1.ScheduleAnyway},
 				TopologyBalanceNodeFit: utilptr.To(true),
+				ZoneAwareNodeFit:       utilptr.To(false),
 			},
 		},
 		{
@@ -71,6 +73,7 @@ func TestSetDefaults_RemovePodsViolatingTopologySpreadConstraintArgs(t *testing.
 			want: &RemovePodsViolatingTopologySpreadConstraintArgs{
 				Constraints:            []v1.UnsatisfiableConstraintAction{v1.DoNotSchedule},
 				TopologyBalanceNodeFit: utilptr.To(true),
+				ZoneAwareNodeFit:       utilptr.To(false),
 			},
 		},
 		{
@@ -81,6 +84,7 @@ func TestSetDefaults_RemovePodsViolatingTopologySpreadConstraintArgs(t *testing.
 			want: &RemovePodsViolatingTopologySpreadConstraintArgs{
 				TopologyBalanceNodeFit: utilptr.To(false),
 				Constraints:            []v1.UnsatisfiableConstraintAction{v1.DoNotSchedule},
+				ZoneAwareNodeFit:       utilptr.To(false),
 			},
 		},
 		{
@@ -91,6 +95,18 @@ func TestSetDefaults_RemovePodsViolatingTopologySpreadConstraintArgs(t *testing.
 			want: &RemovePodsViolatingTopologySpreadConstraintArgs{
 				Constraints:            []v1.UnsatisfiableConstraintAction{v1.DoNotSchedule},
 				TopologyBalanceNodeFit: utilptr.To(true),
+				ZoneAwareNodeFit:       utilptr.To(false),
+			},
+		},
+		{
+			name: "RemovePodsViolatingTopologySpreadConstraintArgs with ZoneAwareNodeFit=true is not overwritten",
+			in: &RemovePodsViolatingTopologySpreadConstraintArgs{
+				ZoneAwareNodeFit: utilptr.To(true),
+			},
+			want: &RemovePodsViolatingTopologySpreadConstraintArgs{
+				TopologyBalanceNodeFit: utilptr.To(true),
+				Constraints:            []v1.UnsatisfiableConstraintAction{v1.DoNotSchedule},
+				ZoneAwareNodeFit:       utilptr.To(true),
 			},
 		},
 	}
diff --git a/pkg/framework/plugins/removepodsviolatingtopologyspreadconstraint/topologyspreadconstraint.go b/pkg/framework/plugins/removepodsviolatingtopologyspreadconstraint/topologyspreadconstraint.go
index a47cb77f72..e036a27e32 100644
--- a/pkg/framework/plugins/removepodsviolatingtopologyspreadconstraint/topologyspreadconstraint.go
+++ b/pkg/framework/plugins/removepodsviolatingtopologyspreadconstraint/topologyspreadconstraint.go
@@ -21,6 +21,7 @@ import (
 	"sort"
 
 	v1 "k8s.io/api/core/v1"
+	"k8s.io/apimachinery/pkg/api/resource"
 	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
 	"k8s.io/apimachinery/pkg/labels"
 	"k8s.io/apimachinery/pkg/runtime"
@@ -315,6 +316,28 @@ func (d *RemovePodsViolatingTopologySpreadConstraint) balanceDomains(ctx context
 	eligibleNodes := filterEligibleNodes(ctx, nodes, tsc)
 	nodesBelowIdealAvg := filterNodesBelowIdealAvg(eligibleNodes, sortedDomains, tsc.TopologyKey, idealAvg)
 
+	zoneAwareNodeFit := utilptr.Deref(d.args.ZoneAwareNodeFit, false)
+
+	// When zoneAwareNodeFit is enabled, group the same under-loaded nodes by their
+	// topology-domain value and compute each domain's remaining aggregate headroom.
+	// The headroom map is mutated as evictions are committed within this call so that
+	// later candidate pods see headroom already consumed by earlier ones.
+	//
+	// The grouping and headroom snapshot are taken once at entry and are *not* recomputed
+	// as the outer (i, j) loop progresses. As pods get added to podsForEviction the
+	// effective set of below-ideal domains can shift, but the snapshot's drift is the
+	// safe (conservative) direction: a domain that becomes saturated mid-batch will stop
+	// admitting new pods once its tracked headroom is drained, and a domain that becomes
+	// further below idealAvg is simply not promoted into the gate's view. Recomputing
+	// per iteration would be more accurate but costs an extra ListPodsOnANode sweep
+	// across the domain's nodes for every (i, j) step.
+	var nodesByDomain map[string][]*v1.Node
+	var remainingHeadroom map[string]v1.ResourceList
+	if zoneAwareNodeFit {
+		nodesByDomain = groupNodesByDomain(nodesBelowIdealAvg, tsc.TopologyKey)
+		remainingHeadroom = computeDomainHeadroom(nodesByDomain, getPodsAssignedToNode)
+	}
+
 	// i is the index for belowOrEqualAvg
 	// j is the index for aboveAvg
 	i := 0
@@ -379,6 +402,23 @@ func (d *RemovePodsViolatingTopologySpreadConstraint) balanceDomains(ctx context
 				continue
 			}
 
+			if zoneAwareNodeFit {
+				// Note: headroom is decremented at this point, before the actual eviction
+				// is recorded in the eviction loop. If a later eviction filter (rate limit,
+				// pre-eviction predicate, eviction error) rejects this pod, the chosen
+				// domain's headroom has been consumed without an actual pod move, making
+				// the gate conservative on the next candidate. Deferring the subtraction
+				// would require restructuring the eviction loop to call back into the
+				// gate on success; the current behaviour favours safety (admit fewer
+				// pods than nominally possible) over throughput.
+				targetDomain, ok := podFitsSomeDomainWithHeadroom(getPodsAssignedToNode, aboveToEvict[k], nodesByDomain, remainingHeadroom)
+				if !ok {
+					d.logger.V(2).Info("ignoring pod for eviction: no target topology domain has fit + remaining headroom", "pod", klog.KObj(aboveToEvict[k]))
+					continue
+				}
+				d.logger.V(2).Info("ZoneAwareNodeFit admitted pod; target domain headroom decremented", "pod", klog.KObj(aboveToEvict[k]), "targetDomain", targetDomain)
+			}
+
 			podsForEviction[aboveToEvict[k]] = struct{}{}
 		}
 		sortedDomains[j].pods = sortedDomains[j].pods[:len(sortedDomains[j].pods)-movePods]
@@ -405,6 +445,150 @@ func filterNodesBelowIdealAvg(nodes []*v1.Node, sortedDomains []topology, topolo
 	return nodesBelowIdealAvg
 }
 
+// groupNodesByDomain classifies the given nodes by their topology-domain label value.
+// Nodes missing the label are skipped. Returns an empty (non-nil) map if no nodes have the label.
+func groupNodesByDomain(nodes []*v1.Node, topologyKey string) map[string][]*v1.Node {
+	result := make(map[string][]*v1.Node)
+	for _, n := range nodes {
+		if v, ok := n.Labels[topologyKey]; ok {
+			result[v] = append(result[v], n)
+		}
+	}
+	return result
+}
+
+// computeDomainHeadroom returns, for each topology domain, the aggregate remaining
+// resource headroom (allocatable minus already-requested) summed across the domain's
+// nodes. Tracked resources: cpu, memory, pod count.
+//
+// Fails closed: if the pod indexer cannot list pods on a node, that node is omitted
+// from the headroom entirely (neither its allocatable nor any requests are counted)
+// and the error is logged. Counting allocatable without subtracting requests would
+// over-state available capacity and re-introduce the eviction churn this gate is
+// meant to prevent.
+func computeDomainHeadroom(nodesByDomain map[string][]*v1.Node, nodeIndexer podutil.GetPodsAssignedToNodeFunc) map[string]v1.ResourceList {
+	result := make(map[string]v1.ResourceList, len(nodesByDomain))
+	for domain, dnodes := range nodesByDomain {
+		var cpuMilli, memBytes, podSlots int64
+		for _, n := range dnodes {
+			// Filter is intentionally nil so this aggregation reasons over the same pod
+			// set as nodeAvailableResources / fitsRequest (pkg/descheduler/node/node.go),
+			// which also calls ListPodsOnANode with no filter. Diverging here would let
+			// per-node fit and per-domain headroom disagree about what counts as occupied
+			// capacity; if fitsRequest ever tightens its filter (e.g. to exclude
+			// terminal/terminating pods), this call site must be updated in the same
+			// change to keep the two halves of the gate aligned.
+			podsOnNode, err := podutil.ListPodsOnANode(n.Name, nodeIndexer, nil)
+			if err != nil {
+				klog.V(2).ErrorS(err, "ZoneAwareNodeFit: failed to list pods on node; excluding from domain headroom (fail-closed)", "node", n.Name, "domain", domain)
+				continue
+			}
+			cpuMilli += n.Status.Allocatable.Cpu().MilliValue()
+			memBytes += n.Status.Allocatable.Memory().Value()
+			podSlots += n.Status.Allocatable.Pods().Value()
+			for _, p := range podsOnNode {
+				req, _ := utils.PodRequestsAndLimits(p)
+				if q, ok := req[v1.ResourceCPU]; ok {
+					cpuMilli -= q.MilliValue()
+				}
+				if q, ok := req[v1.ResourceMemory]; ok {
+					memBytes -= q.Value()
+				}
+				podSlots--
+			}
+		}
+		result[domain] = v1.ResourceList{
+			v1.ResourceCPU:    *resource.NewMilliQuantity(cpuMilli, resource.DecimalSI),
+			v1.ResourceMemory: *resource.NewQuantity(memBytes, resource.BinarySI),
+			v1.ResourcePods:   *resource.NewQuantity(podSlots, resource.DecimalSI),
+		}
+	}
+	return result
+}
+
+// podFitsSomeDomainWithHeadroom returns the topology-domain key whose remaining aggregate
+// headroom can absorb the pod AND whose nodes include at least one where the pod fits per
+// the scheduler's per-node predicates. On success it decrements that domain's headroom by
+// the pod's requests (cpu/mem/1 pod slot) and returns ok=true. Returns ok=false if no
+// domain qualifies.
+//
+// Domain selection order: domains are iterated by descending remaining CPU headroom so
+// pods are spread toward the roomiest under-loaded domain first; ties (and zero/negative
+// headroom) fall back to alphabetical order for determinism. This avoids draining
+// alphabetically-earlier domains first and leaving later candidates rejected even though
+// a different domain had headroom to spare.
+//
+// The CPU-only sort key is intentional: CPU is the most commonly request-constrained
+// resource for workloads we are spreading across topology domains, and a multi-resource
+// composite scoring (e.g. normalized headroom across the pod's actual requests) would
+// add complexity without changing behaviour for the typical case. For memory-heavy pods
+// in environments with abundant CPU slack but uneven memory headroom, the alphabetical
+// fallback can pick a domain with tighter memory than another candidate; the
+// headroomCoversPod check below still rejects domains that cannot absorb the pod, so
+// this only affects ordering, not correctness.
+func podFitsSomeDomainWithHeadroom(
+	nodeIndexer podutil.GetPodsAssignedToNodeFunc,
+	pod *v1.Pod,
+	nodesByDomain map[string][]*v1.Node,
+	remainingHeadroom map[string]v1.ResourceList,
+) (string, bool) {
+	podReq, _ := utils.PodRequestsAndLimits(pod)
+	domains := make([]string, 0, len(nodesByDomain))
+	for d := range nodesByDomain {
+		domains = append(domains, d)
+	}
+	sort.Slice(domains, func(i, j int) bool {
+		ci := remainingHeadroom[domains[i]][v1.ResourceCPU]
+		cj := remainingHeadroom[domains[j]][v1.ResourceCPU]
+		if ci.MilliValue() != cj.MilliValue() {
+			return ci.MilliValue() > cj.MilliValue()
+		}
+		return domains[i] < domains[j]
+	})
+
+	for _, domain := range domains {
+		if !headroomCoversPod(remainingHeadroom[domain], podReq) {
+			continue
+		}
+		if !node.PodFitsAnyOtherNode(nodeIndexer, pod, nodesByDomain[domain]) {
+			continue
+		}
+		subtractPodFromHeadroom(remainingHeadroom, domain, podReq)
+		return domain, true
+	}
+	return "", false
+}
+
+// headroomCoversPod reports whether the domain's remaining headroom is sufficient to
+// absorb pod's cpu/memory request plus one pod slot. Missing keys are treated as zero.
+func headroomCoversPod(headroom, podReq v1.ResourceList) bool {
+	cpuHead := headroom[v1.ResourceCPU]
+	memHead := headroom[v1.ResourceMemory]
+	podsHead := headroom[v1.ResourcePods]
+	cpuReq := podReq[v1.ResourceCPU]
+	memReq := podReq[v1.ResourceMemory]
+	return cpuHead.MilliValue() >= cpuReq.MilliValue() &&
+		memHead.Value() >= memReq.Value() &&
+		podsHead.Value() >= 1
+}
+
+// subtractPodFromHeadroom decrements remainingHeadroom[domain] by the pod's cpu/memory
+// request and one pod slot.
+func subtractPodFromHeadroom(remainingHeadroom map[string]v1.ResourceList, domain string, podReq v1.ResourceList) {
+	rl := remainingHeadroom[domain]
+	cpuReq := podReq[v1.ResourceCPU]
+	memReq := podReq[v1.ResourceMemory]
+	cpu := rl[v1.ResourceCPU]
+	cpu.Sub(cpuReq)
+	rl[v1.ResourceCPU] = cpu
+	mem := rl[v1.ResourceMemory]
+	mem.Sub(memReq)
+	rl[v1.ResourceMemory] = mem
+	pods := rl[v1.ResourcePods]
+	pods.Sub(*resource.NewQuantity(1, resource.DecimalSI))
+	rl[v1.ResourcePods] = pods
+}
+
 // sortDomains sorts and splits the list of topology domains based on their size
 // it also sorts the list of pods within the domains based on their node affinity/selector and priority in the following order:
 // 1. non-evictable pods
diff --git a/pkg/framework/plugins/removepodsviolatingtopologyspreadconstraint/topologyspreadconstraint_test.go b/pkg/framework/plugins/removepodsviolatingtopologyspreadconstraint/topologyspreadconstraint_test.go
index 4a484288ab..a3ea9d6402 100644
--- a/pkg/framework/plugins/removepodsviolatingtopologyspreadconstraint/topologyspreadconstraint_test.go
+++ b/pkg/framework/plugins/removepodsviolatingtopologyspreadconstraint/topologyspreadconstraint_test.go
@@ -9,6 +9,7 @@ import (
 
 	appsv1 "k8s.io/api/apps/v1"
 	v1 "k8s.io/api/core/v1"
+	"k8s.io/apimachinery/pkg/api/resource"
 	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
 	"k8s.io/apimachinery/pkg/runtime"
 	"k8s.io/apimachinery/pkg/util/sets"
@@ -16,6 +17,7 @@ import (
 	utilptr "k8s.io/utils/ptr"
 
 	"sigs.k8s.io/descheduler/pkg/api"
+	podutil "sigs.k8s.io/descheduler/pkg/descheduler/pod"
 	"sigs.k8s.io/descheduler/pkg/framework/plugins/defaultevictor"
 	frameworktesting "sigs.k8s.io/descheduler/pkg/framework/testing"
 	frameworktypes "sigs.k8s.io/descheduler/pkg/framework/types"
@@ -1454,6 +1456,219 @@ func TestTopologySpreadConstraint(t *testing.T) {
 			namespaces:           []string{"ns1"},
 			args:                 RemovePodsViolatingTopologySpreadConstraintArgs{},
 		},
+		// --- ZoneAwareNodeFit test cases ---
+		// All five share the same topology: zoneA (over-loaded, 7 pods) vs zoneB (under-loaded,
+		// 1 pod). zoneB has just 250m CPU on a single node, of which 100m is already used by the
+		// existing pod — i.e. 150m of aggregate CPU headroom, enough for exactly one additional
+		// 100m-request pod. balanceDomains schedules 3 evictions toward zoneB.
+		//
+		// These cases isolate ZoneAwareNodeFit by disabling DefaultEvictor NodeFit and (where
+		// noted) TopologyBalanceNodeFit, so the only differentiator is the new gate.
+		{
+			name: "ZoneAwareNodeFit caps cumulative load on top of TopologyBalanceNodeFit (1 of 3)",
+			// Critical regression: with the previous OR(merge) implementation of
+			// ZoneAwareNodeFit, this case would evict 3 (the stateless per-node-fit check
+			// passes for every candidate because the live indexer reports B1 at 150m
+			// throughout the loop). The cumulative-headroom gate must reduce this to 1,
+			// even when TopologyBalanceNodeFit is also enabled. This proves ZoneAwareNodeFit
+			// adds value over the existing flag — not just as a renamed alias.
+			nodes: []*v1.Node{
+				test.BuildTestNode("A1", 2000, 3000, 10, func(n *v1.Node) { n.Labels["zone"] = "zoneA" }),
+				test.BuildTestNode("B1", 250, 3000, 10, func(n *v1.Node) { n.Labels["zone"] = "zoneB" }),
+			},
+			pods: createTestPods([]testPodList{
+				{
+					count:       7,
+					node:        "A1",
+					labels:      map[string]string{"foo": "bar"},
+					constraints: getDefaultTopologyConstraints(1),
+				},
+				{count: 1, node: "B1", labels: map[string]string{"foo": "bar"}},
+			}),
+			expectedEvictedCount: 1,
+			namespaces:           []string{"ns1"},
+			args: RemovePodsViolatingTopologySpreadConstraintArgs{
+				// TopologyBalanceNodeFit default-on; ZoneAwareNodeFit on. The new gate
+				// must still cap evictions at zoneB's 150m headroom.
+				ZoneAwareNodeFit: utilptr.To(true),
+			},
+			nodeFit: false,
+		},
+		{
+			name: "ZoneAwareNodeFit=true alone caps evictions at zoneB's aggregate headroom (1 of 3)",
+			nodes: []*v1.Node{
+				test.BuildTestNode("A1", 2000, 3000, 10, func(n *v1.Node) { n.Labels["zone"] = "zoneA" }),
+				test.BuildTestNode("B1", 250, 3000, 10, func(n *v1.Node) { n.Labels["zone"] = "zoneB" }),
+			},
+			pods: createTestPods([]testPodList{
+				{
+					count:       7,
+					node:        "A1",
+					labels:      map[string]string{"foo": "bar"},
+					constraints: getDefaultTopologyConstraints(1),
+				},
+				{count: 1, node: "B1", labels: map[string]string{"foo": "bar"}},
+			}),
+			// zoneB headroom: 250m allocatable - 100m existing = 150m → fits exactly 1 more
+			// 100m-request pod. Pod 1 commits and decrements headroom to 50m. Pods 2 and 3 are
+			// rejected by the aggregate-headroom check (50m < 100m) even though per-node fit
+			// (live indexer view) still says B1 has room.
+			expectedEvictedCount: 1,
+			namespaces:           []string{"ns1"},
+			args: RemovePodsViolatingTopologySpreadConstraintArgs{
+				TopologyBalanceNodeFit: utilptr.To(false),
+				ZoneAwareNodeFit:       utilptr.To(true),
+			},
+			nodeFit: false,
+		},
+		{
+			name: "TopologyBalanceNodeFit=true alone does NOT cap cumulative load (evicts 3)",
+			// Same topology and same cumulative-overflow condition as the previous case. This
+			// case shows the existing TopologyBalanceNodeFit gate cannot detect the over-commit
+			// because it evaluates each candidate pod against a stateless per-node fit check;
+			// all three candidates individually see B1 as having 150m headroom (the indexer is
+			// not mutated mid-loop), so all three pass and get evicted — the exact churn the
+			// new gate prevents.
+			nodes: []*v1.Node{
+				test.BuildTestNode("A1", 2000, 3000, 10, func(n *v1.Node) { n.Labels["zone"] = "zoneA" }),
+				test.BuildTestNode("B1", 250, 3000, 10, func(n *v1.Node) { n.Labels["zone"] = "zoneB" }),
+			},
+			pods: createTestPods([]testPodList{
+				{
+					count:       7,
+					node:        "A1",
+					labels:      map[string]string{"foo": "bar"},
+					constraints: getDefaultTopologyConstraints(1),
+				},
+				{count: 1, node: "B1", labels: map[string]string{"foo": "bar"}},
+			}),
+			expectedEvictedCount: 3,
+			namespaces:           []string{"ns1"},
+			args:                 RemovePodsViolatingTopologySpreadConstraintArgs{}, // defaults: TBNF=true, ZANF=false
+			nodeFit:              false,
+		},
+		{
+			name: "ZoneAwareNodeFit drains domain by domain and caps total at sum of headroom",
+			// 3 domains, each under-loaded zone individually tight. Under the previous
+			// OR(merge) implementation, every candidate would see at least one node in some
+			// domain with per-node headroom — all candidates pass. The cumulative gate must
+			// pick the roomiest domain first (zoneC, 250m), commit until it can no longer
+			// fit, then fall back to zoneB (150m), and reject any 4th candidate because
+			// both domains are saturated.
+			//
+			// zoneB: 250m allocatable - 100m used = 150m headroom → fits 1 × 100m pod
+			// zoneC: 350m allocatable - 100m used = 250m headroom → fits 2 × 100m pods
+			nodes: []*v1.Node{
+				test.BuildTestNode("A1", 2000, 3000, 10, func(n *v1.Node) { n.Labels["zone"] = "zoneA" }),
+				test.BuildTestNode("B1", 250, 3000, 10, func(n *v1.Node) { n.Labels["zone"] = "zoneB" }),
+				test.BuildTestNode("C1", 350, 3000, 10, func(n *v1.Node) { n.Labels["zone"] = "zoneC" }),
+			},
+			pods: createTestPods([]testPodList{
+				{
+					count:       7,
+					node:        "A1",
+					labels:      map[string]string{"foo": "bar"},
+					constraints: getDefaultTopologyConstraints(1),
+				},
+				{count: 1, node: "B1", labels: map[string]string{"foo": "bar"}},
+				{count: 1, node: "C1", labels: map[string]string{"foo": "bar"}},
+			}),
+			// Total batch headroom: 150 + 250 = 400m → fits 3 × 100m pods plus 100m
+			// stranded across the two domains. The previous OR(merge) implementation
+			// would admit all 4 candidates; the new gate admits 3 (2 to zoneC, 1 to
+			// zoneB) and rejects the 4th when both domains are drained below the pod
+			// request.
+			expectedEvictedCount: 3,
+			namespaces:           []string{"ns1"},
+			args: RemovePodsViolatingTopologySpreadConstraintArgs{
+				TopologyBalanceNodeFit: utilptr.To(false),
+				ZoneAwareNodeFit:       utilptr.To(true),
+			},
+			nodeFit: false,
+		},
+		{
+			name: "ZoneAwareNodeFit still requires per-node fit (aggregate headroom alone is insufficient)",
+			// zoneB has 5 nodes with 50m CPU each → 250m aggregate headroom (looks like 2 pods'
+			// worth), but no single node can host a 100m-request pod. The new gate must reject.
+			nodes: []*v1.Node{
+				test.BuildTestNode("A1", 2000, 3000, 10, func(n *v1.Node) { n.Labels["zone"] = "zoneA" }),
+				test.BuildTestNode("B1", 50, 3000, 10, func(n *v1.Node) { n.Labels["zone"] = "zoneB" }),
+				test.BuildTestNode("B2", 50, 3000, 10, func(n *v1.Node) { n.Labels["zone"] = "zoneB" }),
+				test.BuildTestNode("B3", 50, 3000, 10, func(n *v1.Node) { n.Labels["zone"] = "zoneB" }),
+				test.BuildTestNode("B4", 50, 3000, 10, func(n *v1.Node) { n.Labels["zone"] = "zoneB" }),
+				test.BuildTestNode("B5", 50, 3000, 10, func(n *v1.Node) { n.Labels["zone"] = "zoneB" }),
+			},
+			pods: createTestPods([]testPodList{
+				{
+					count:       7,
+					node:        "A1",
+					labels:      map[string]string{"foo": "bar"},
+					constraints: getDefaultTopologyConstraints(1),
+				},
+				{count: 1, node: "B1", labels: map[string]string{"foo": "bar"}},
+			}),
+			expectedEvictedCount: 0,
+			namespaces:           []string{"ns1"},
+			args: RemovePodsViolatingTopologySpreadConstraintArgs{
+				TopologyBalanceNodeFit: utilptr.To(false),
+				ZoneAwareNodeFit:       utilptr.To(true),
+			},
+			nodeFit: false,
+		},
+		{
+			name: "ZoneAwareNodeFit aggregates headroom across multiple nodes in the same below-ideal domain",
+			// Verifies grouping correctness when more than one node belongs to the same
+			// below-ideal domain. zoneB has two nodes (B1, B2) each with 200m allocatable
+			// and a 100m existing pod → 100m per-node headroom, 200m aggregate. A grouping
+			// bug that mis-bucketed nodes (e.g. counted only one B node) would yield
+			// expectedEvictedCount=1; correct multi-node aggregation yields 2 commits
+			// (cumulatively draining the domain) and a 3rd rejection.
+			nodes: []*v1.Node{
+				test.BuildTestNode("A1", 2000, 3000, 10, func(n *v1.Node) { n.Labels["zone"] = "zoneA" }),
+				test.BuildTestNode("B1", 200, 3000, 10, func(n *v1.Node) { n.Labels["zone"] = "zoneB" }),
+				test.BuildTestNode("B2", 200, 3000, 10, func(n *v1.Node) { n.Labels["zone"] = "zoneB" }),
+			},
+			pods: createTestPods([]testPodList{
+				{
+					count:       9,
+					node:        "A1",
+					labels:      map[string]string{"foo": "bar"},
+					constraints: getDefaultTopologyConstraints(1),
+				},
+				{count: 1, node: "B1", labels: map[string]string{"foo": "bar"}},
+				{count: 1, node: "B2", labels: map[string]string{"foo": "bar"}},
+			}),
+			expectedEvictedCount: 2,
+			namespaces:           []string{"ns1"},
+			args: RemovePodsViolatingTopologySpreadConstraintArgs{
+				TopologyBalanceNodeFit: utilptr.To(false),
+				ZoneAwareNodeFit:       utilptr.To(true),
+			},
+			nodeFit: false,
+		},
+		{
+			name: "ZoneAwareNodeFit default (false) is no-op — behaviour matches prior versions",
+			// Default args (TBNF=true, ZANF=false). With nodeFit=true on DefaultEvictor and
+			// zoneB unable to host more pods, eviction is blocked by the DefaultEvictor's own
+			// pre-filter — confirming the new flag introduces no behavioural change when off.
+			nodes: []*v1.Node{
+				test.BuildTestNode("A1", 2000, 3000, 10, func(n *v1.Node) { n.Labels["zone"] = "zoneA" }),
+				test.BuildTestNode("B1", 50, 3000, 10, func(n *v1.Node) { n.Labels["zone"] = "zoneB" }),
+			},
+			pods: createTestPods([]testPodList{
+				{
+					count:       4,
+					node:        "A1",
+					labels:      map[string]string{"foo": "bar"},
+					constraints: getDefaultTopologyConstraints(1),
+				},
+				{count: 1, node: "B1", labels: map[string]string{"foo": "bar"}},
+			}),
+			expectedEvictedCount: 0,
+			namespaces:           []string{"ns1"},
+			args:                 RemovePodsViolatingTopologySpreadConstraintArgs{},
+			nodeFit:              true,
+		},
 	}
 
 	for _, tc := range testCases {
@@ -1818,6 +2033,224 @@ func getDefaultTopologyConstraints(maxSkew int32, edits ...func(*v1.TopologySpre
 	return []v1.TopologySpreadConstraint{constraint}
 }
 
+func TestGroupNodesByDomain(t *testing.T) {
+	mkNode := func(name, zone string) *v1.Node {
+		return test.BuildTestNode(name, 1000, 1<<30, 10, func(n *v1.Node) {
+			if zone != "" {
+				n.Labels["zone"] = zone
+			}
+		})
+	}
+	nodes := []*v1.Node{
+		mkNode("A1", "zoneA"),
+		mkNode("A2", "zoneA"),
+		mkNode("B1", "zoneB"),
+		mkNode("X1", ""), // missing label
+	}
+	got := groupNodesByDomain(nodes, "zone")
+	if len(got["zoneA"]) != 2 {
+		t.Errorf("zoneA: got %d nodes, want 2", len(got["zoneA"]))
+	}
+	if len(got["zoneB"]) != 1 {
+		t.Errorf("zoneB: got %d nodes, want 1", len(got["zoneB"]))
+	}
+	if _, ok := got[""]; ok {
+		t.Error("nodes missing the topology key must not produce an empty-string entry")
+	}
+}
+
+func TestComputeDomainHeadroom(t *testing.T) {
+	// zoneA has two nodes: A1 (1000m cpu, no pods), A2 (500m cpu, one 200m pod).
+	// Expected aggregate headroom: (1000 - 0) + (500 - 200) = 1300m. Pod slots: 10 + 10 - 1 = 19.
+	a1 := test.BuildTestNode("A1", 1000, 1<<30, 10, nil)
+	a2 := test.BuildTestNode("A2", 500, 1<<30, 10, nil)
+	occupant := test.BuildTestPod("occ", 200, 0, "A2", nil)
+	indexer := podutil.GetPodsAssignedToNodeFunc(func(nodeName string, filter podutil.FilterFunc) ([]*v1.Pod, error) {
+		if nodeName == "A2" {
+			return []*v1.Pod{occupant}, nil
+		}
+		return nil, nil
+	})
+
+	got := computeDomainHeadroom(map[string][]*v1.Node{"zoneA": {a1, a2}}, indexer)
+	if cpu := got["zoneA"][v1.ResourceCPU]; cpu.MilliValue() != 1300 {
+		t.Errorf("aggregate CPU: got %dm, want 1300m", cpu.MilliValue())
+	}
+	if pods := got["zoneA"][v1.ResourcePods]; pods.Value() != 19 {
+		t.Errorf("aggregate pod slots: got %d, want 19", pods.Value())
+	}
+}
+
+func TestComputeDomainHeadroomFailsClosedOnIndexerError(t *testing.T) {
+	// If ListPodsOnANode fails for a node, that node must contribute zero — neither
+	// its allocatable nor any pod requests should land in the aggregate. Over-counting
+	// would re-introduce the eviction churn this gate prevents.
+	a1 := test.BuildTestNode("A1", 1000, 1<<30, 10, nil)
+	a2 := test.BuildTestNode("A2", 500, 1<<30, 10, nil)
+	indexer := podutil.GetPodsAssignedToNodeFunc(func(nodeName string, filter podutil.FilterFunc) ([]*v1.Pod, error) {
+		if nodeName == "A2" {
+			return nil, fmt.Errorf("simulated indexer failure")
+		}
+		return nil, nil
+	})
+
+	got := computeDomainHeadroom(map[string][]*v1.Node{"zoneA": {a1, a2}}, indexer)
+	// Only A1 (1000m, 10 pod slots) contributes; A2 is excluded fail-closed.
+	if cpu := got["zoneA"][v1.ResourceCPU]; cpu.MilliValue() != 1000 {
+		t.Errorf("aggregate CPU after fail-closed exclusion: got %dm, want 1000m", cpu.MilliValue())
+	}
+	if pods := got["zoneA"][v1.ResourcePods]; pods.Value() != 10 {
+		t.Errorf("aggregate pod slots after fail-closed exclusion: got %d, want 10", pods.Value())
+	}
+}
+
+func TestPodFitsSomeDomainWithHeadroom(t *testing.T) {
+	// Build a stub node indexer that reports the supplied pods as living on the named node.
+	makeIndexer := func(podsByNode map[string][]*v1.Pod) podutil.GetPodsAssignedToNodeFunc {
+		return podutil.GetPodsAssignedToNodeFunc(func(nodeName string, filter podutil.FilterFunc) ([]*v1.Pod, error) {
+			out := []*v1.Pod{}
+			for _, p := range podsByNode[nodeName] {
+				if filter == nil || filter(p) {
+					out = append(out, p)
+				}
+			}
+			return out, nil
+		})
+	}
+
+	t.Run("commits to roomiest qualifying domain and decrements its headroom", func(t *testing.T) {
+		// zoneA has more CPU headroom than zoneB, so even though zoneB sorts first
+		// alphabetically, the candidate must commit to zoneA. This guards against
+		// regressing back to an alphabetical-first selection that would drain the
+		// tighter domain prematurely.
+		nodesByDomain := map[string][]*v1.Node{
+			"zoneB": {test.BuildTestNode("B1", 1000, 1<<30, 10, nil)},
+			"zoneA": {test.BuildTestNode("A1", 1000, 1<<30, 10, nil)},
+		}
+		headroom := map[string]v1.ResourceList{
+			"zoneA": {
+				v1.ResourceCPU:    *resource.NewMilliQuantity(800, resource.DecimalSI),
+				v1.ResourceMemory: *resource.NewQuantity(1<<28, resource.BinarySI),
+				v1.ResourcePods:   *resource.NewQuantity(5, resource.DecimalSI),
+			},
+			"zoneB": {
+				v1.ResourceCPU:    *resource.NewMilliQuantity(300, resource.DecimalSI),
+				v1.ResourceMemory: *resource.NewQuantity(1<<28, resource.BinarySI),
+				v1.ResourcePods:   *resource.NewQuantity(5, resource.DecimalSI),
+			},
+		}
+		pod := test.BuildTestPod("candidate", 100, 0, "other", nil)
+		indexer := makeIndexer(nil)
+
+		domain, ok := podFitsSomeDomainWithHeadroom(indexer, pod, nodesByDomain, headroom)
+		if !ok || domain != "zoneA" {
+			t.Fatalf("expected commit to roomiest domain zoneA, got domain=%q ok=%v", domain, ok)
+		}
+		if cpu := headroom["zoneA"][v1.ResourceCPU]; cpu.MilliValue() != 700 {
+			t.Errorf("zoneA CPU headroom after commit: got %dm, want 700m", cpu.MilliValue())
+		}
+		if cpu := headroom["zoneB"][v1.ResourceCPU]; cpu.MilliValue() != 300 {
+			t.Errorf("zoneB CPU headroom must be untouched: got %dm, want 300m", cpu.MilliValue())
+		}
+	})
+
+	t.Run("equal headroom falls back to alphabetical for determinism", func(t *testing.T) {
+		nodesByDomain := map[string][]*v1.Node{
+			"zoneB": {test.BuildTestNode("B1", 1000, 1<<30, 10, nil)},
+			"zoneA": {test.BuildTestNode("A1", 1000, 1<<30, 10, nil)},
+		}
+		headroom := map[string]v1.ResourceList{
+			"zoneA": {
+				v1.ResourceCPU:    *resource.NewMilliQuantity(500, resource.DecimalSI),
+				v1.ResourceMemory: *resource.NewQuantity(1<<28, resource.BinarySI),
+				v1.ResourcePods:   *resource.NewQuantity(5, resource.DecimalSI),
+			},
+			"zoneB": {
+				v1.ResourceCPU:    *resource.NewMilliQuantity(500, resource.DecimalSI),
+				v1.ResourceMemory: *resource.NewQuantity(1<<28, resource.BinarySI),
+				v1.ResourcePods:   *resource.NewQuantity(5, resource.DecimalSI),
+			},
+		}
+		pod := test.BuildTestPod("candidate", 100, 0, "other", nil)
+		indexer := makeIndexer(nil)
+
+		domain, ok := podFitsSomeDomainWithHeadroom(indexer, pod, nodesByDomain, headroom)
+		if !ok || domain != "zoneA" {
+			t.Fatalf("expected alphabetical tie-break to commit to zoneA, got domain=%q ok=%v", domain, ok)
+		}
+	})
+
+	t.Run("repeated calls drain headroom and then reject", func(t *testing.T) {
+		nodesByDomain := map[string][]*v1.Node{
+			"zoneA": {test.BuildTestNode("A1", 1000, 1<<30, 10, nil)},
+		}
+		headroom := map[string]v1.ResourceList{
+			"zoneA": {
+				v1.ResourceCPU:    *resource.NewMilliQuantity(250, resource.DecimalSI),
+				v1.ResourceMemory: *resource.NewQuantity(1<<28, resource.BinarySI),
+				v1.ResourcePods:   *resource.NewQuantity(5, resource.DecimalSI),
+			},
+		}
+		indexer := makeIndexer(nil)
+		pod := test.BuildTestPod("p", 100, 0, "other", nil)
+
+		if _, ok := podFitsSomeDomainWithHeadroom(indexer, pod, nodesByDomain, headroom); !ok {
+			t.Fatal("first call: expected ok=true (250m >= 100m)")
+		}
+		if _, ok := podFitsSomeDomainWithHeadroom(indexer, pod, nodesByDomain, headroom); !ok {
+			t.Fatal("second call: expected ok=true (150m >= 100m)")
+		}
+		if _, ok := podFitsSomeDomainWithHeadroom(indexer, pod, nodesByDomain, headroom); ok {
+			t.Fatal("third call: expected ok=false (50m < 100m)")
+		}
+	})
+
+	t.Run("returns false when no domain has aggregate headroom", func(t *testing.T) {
+		nodesByDomain := map[string][]*v1.Node{
+			"zoneA": {test.BuildTestNode("A1", 1000, 1<<30, 10, nil)},
+		}
+		headroom := map[string]v1.ResourceList{
+			"zoneA": {
+				v1.ResourceCPU:    *resource.NewMilliQuantity(50, resource.DecimalSI),
+				v1.ResourceMemory: *resource.NewQuantity(1<<28, resource.BinarySI),
+				v1.ResourcePods:   *resource.NewQuantity(5, resource.DecimalSI),
+			},
+		}
+		pod := test.BuildTestPod("p", 100, 0, "other", nil)
+		if _, ok := podFitsSomeDomainWithHeadroom(makeIndexer(nil), pod, nodesByDomain, headroom); ok {
+			t.Error("expected ok=false when aggregate CPU is insufficient")
+		}
+	})
+
+	t.Run("returns false when no domain node can host the pod (per-node fit fails)", func(t *testing.T) {
+		// Node has plenty of aggregate headroom by allocatable, but an existing pod on the
+		// node consumes most of it, leaving no per-node room. Aggregate headroom is supplied
+		// separately so only the per-node fit check (via fitsRequest) should reject.
+		smallNode := test.BuildTestNode("A1", 120, 1<<30, 10, nil)
+		nodesByDomain := map[string][]*v1.Node{"zoneA": {smallNode}}
+		headroom := map[string]v1.ResourceList{
+			"zoneA": {
+				v1.ResourceCPU:    *resource.NewMilliQuantity(10000, resource.DecimalSI),
+				v1.ResourceMemory: *resource.NewQuantity(1<<30, resource.BinarySI),
+				v1.ResourcePods:   *resource.NewQuantity(10, resource.DecimalSI),
+			},
+		}
+		occupant := test.BuildTestPod("occupant", 100, 0, "A1", nil)
+		indexer := makeIndexer(map[string][]*v1.Pod{"A1": {occupant}})
+		candidate := test.BuildTestPod("candidate", 100, 0, "other", nil)
+		if _, ok := podFitsSomeDomainWithHeadroom(indexer, candidate, nodesByDomain, headroom); ok {
+			t.Error("expected ok=false when no node has per-node room despite aggregate headroom")
+		}
+	})
+
+	t.Run("empty nodesByDomain returns false without error", func(t *testing.T) {
+		pod := test.BuildTestPod("p", 100, 0, "other", nil)
+		if _, ok := podFitsSomeDomainWithHeadroom(makeIndexer(nil), pod, map[string][]*v1.Node{}, map[string]v1.ResourceList{}); ok {
+			t.Error("expected ok=false on empty input")
+		}
+	})
+}
+
 func TestCheckIdenticalConstraints(t *testing.T) {
 	selector, _ := metav1.LabelSelectorAsSelector(&metav1.LabelSelector{MatchLabels: map[string]string{"foo": "bar"}})
 
diff --git a/pkg/framework/plugins/removepodsviolatingtopologyspreadconstraint/types.go b/pkg/framework/plugins/removepodsviolatingtopologyspreadconstraint/types.go
index 272dde144a..cf45e2e423 100644
--- a/pkg/framework/plugins/removepodsviolatingtopologyspreadconstraint/types.go
+++ b/pkg/framework/plugins/removepodsviolatingtopologyspreadconstraint/types.go
@@ -33,4 +33,19 @@ type RemovePodsViolatingTopologySpreadConstraintArgs struct {
 	LabelSelector          *metav1.LabelSelector              `json:"labelSelector,omitempty"`
 	Constraints            []v1.UnsatisfiableConstraintAction `json:"constraints,omitempty"`
 	TopologyBalanceNodeFit *bool                              `json:"topologyBalanceNodeFit,omitempty"`
+	// ZoneAwareNodeFit, if set to true, requires that each pod evicted by
+	// topology-spread balancing have, in at least one specific under-loaded
+	// topology domain (as identified by the constraint's TopologyKey),
+	// (a) a node where the pod fits per the scheduler's per-node predicates
+	// AND (b) sufficient remaining aggregate resource headroom — after
+	// accounting for other pods already committed to that domain during
+	// the same balancing round — to absorb the pod.
+	//
+	// Unlike TopologyBalanceNodeFit, which only checks per-node fit across
+	// the union of under-loaded domains, ZoneAwareNodeFit reasons about
+	// per-domain cumulative capacity. This prevents over-committing a single
+	// under-loaded domain within one balancing round, which would otherwise
+	// cause the scheduler to push evicted pods back to the over-loaded
+	// domain (eviction churn). Defaults to false.
+	ZoneAwareNodeFit *bool `json:"zoneAwareNodeFit,omitempty"`
 }
diff --git a/pkg/framework/plugins/removepodsviolatingtopologyspreadconstraint/zz_generated.deepcopy.go b/pkg/framework/plugins/removepodsviolatingtopologyspreadconstraint/zz_generated.deepcopy.go
index 223bcbc913..dde145651c 100644
--- a/pkg/framework/plugins/removepodsviolatingtopologyspreadconstraint/zz_generated.deepcopy.go
+++ b/pkg/framework/plugins/removepodsviolatingtopologyspreadconstraint/zz_generated.deepcopy.go
@@ -52,6 +52,11 @@ func (in *RemovePodsViolatingTopologySpreadConstraintArgs) DeepCopyInto(out *Rem
 		*out = new(bool)
 		**out = **in
 	}
+	if in.ZoneAwareNodeFit != nil {
+		in, out := &in.ZoneAwareNodeFit, &out.ZoneAwareNodeFit
+		*out = new(bool)
+		**out = **in
+	}
 	return
 }