LowNodeUtilization: heterogeneous / cluster-total capacity awareness #1870

@derhornspieler

Problem

LowNodeUtilization evaluates each node's utilization as a percentage of its own allocatable. On a cluster with heterogeneous node sizes, two nodes at the same percentage can be doing very different absolute amounts of work — but the plugin treats them as equivalently loaded and won't rebalance between them.

Concrete scenario

A 3-node cluster with asymmetric hardware (real numbers from a Harvester / KubeVirt cluster):

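| Node | Physical cores | Logical CPUs (allocatable) | Real CPU usage | Usage % | Cores in use |
|------|----------------|----------------------------|----------------|---------|--------------|
| A    | 24             | 48                         | 32.1 cores     | 67%     | 32.1         |
| B    | 12             | 24                         | 19.0 cores     | 82%     | 19.0         |
| C    | 12             | 24                         | 19.3 cores     | 84%     | 19.3         |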

With default thresholds (cpu: 30 / 50), all three nodes report as "appropriately utilized." The plugin does not evict from B or C because it flags no node as overutilized and sees no underutilized node to move pods to, even though B and C are within roughly 16-18% of saturation while A has about 16 logical CPUs of headroom.
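The gap between the percentage view and the absolute view can be checked with a few lines of Python. This is only an illustrative sketch: the `Node` type and its property names are made up here and are not part of descheduler. The percentages are recomputed as usage divided by logical CPUs, so they can differ by a few points from the "Usage %" column reported above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Illustrative record for one row of the table (not a descheduler type)."""
    name: str
    logical_cpus: float  # allocatable logical CPUs
    used_cores: float    # measured CPU usage, in cores

    @property
    def utilization_pct(self) -> float:
        # Percentage of this node's own allocatable, the signal the plugin uses.
        return 100.0 * self.used_cores / self.logical_cpus

    @property
    def headroom_cores(self) -> float:
        # Absolute spare capacity, which the percentage hides.
        return self.logical_cpus - self.used_cores


NODES = [Node("A", 48, 32.1), Node("B", 24, 19.0), Node("C", 24, 19.3)]

for node in NODES:
    print(
        f"{node.name}: {node.utilization_pct:5.1f}% used, "
        f"{node.headroom_cores:4.1f} cores free"
    )
```

On these numbers A has roughly three times the free cores of B or C, a difference that a per-node percentage threshold does not express.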

The metricsUtilization: { metricsServer: true } option doesn't help here either — the percentage-based signal is the structural issue, not the request-vs-metrics axis.

Suggested enhancements

  1. balanceMode: total option that targets even absolute utilization across nodes (e.g., aim for each node to absorb 1/N of total cluster load weighted by capacity) rather than even percentage. Useful when nodes have heterogeneous capacity and the goal is to use the larger nodes proportionally more.

  2. Memory-side parity — the same gap exists for memory thresholds, with the same consequences for nodes of different RAM sizes.

  3. Optional NFD label integration for physical-core awareness — read feature.node.kubernetes.io/cpu-* labels (or similar) so the plugin can weight by physical resources separately from logical CPU allocatable. On hyperthreaded x86, two threads on the same physical core deliver ~1.3x throughput rather than 2x, which matters for CPU-bound workloads where saturating SMT siblings has diminishing returns.
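To make suggestions 1 and 3 concrete, here is a rough Python sketch of one possible reading of `balanceMode: total`: split the total cluster load across nodes in proportion to a per-node weight. Nothing here exists in descheduler today. `balanced_targets`, `smt_weight`, and the `sibling_gain` parameter are hypothetical names, and the 0.3 value simply encodes the ~1.3x SMT figure quoted above as an assumption.

```python
def balanced_targets(usage: dict[str, float], weight: dict[str, float]) -> dict[str, float]:
    """Split total cluster usage across nodes in proportion to each node's weight."""
    total_usage = sum(usage.values())
    total_weight = sum(weight.values())
    return {name: total_usage * weight[name] / total_weight for name in usage}


def smt_weight(physical_cores: int, threads_per_core: int, sibling_gain: float = 0.3) -> float:
    """Hypothetical effective capacity: each extra SMT thread adds `sibling_gain` of a core."""
    return physical_cores * (1.0 + (threads_per_core - 1) * sibling_gain)


# Figures from the scenario table.
usage = {"A": 32.1, "B": 19.0, "C": 19.3}
logical = {"A": 48.0, "B": 24.0, "C": 24.0}

# Weighting by logical CPUs: every node ends up at the same percentage,
# but the target is expressed in cores rather than as a fixed threshold.
targets = balanced_targets(usage, logical)
deltas = {name: targets[name] - usage[name] for name in usage}

for name in usage:
    print(f"{name}: target {targets[name]:.1f} cores, move {deltas[name]:+.1f}")

# Weighting by SMT-adjusted physical cores (assumed 2 threads per core everywhere).
physical = {"A": 24, "B": 12, "C": 12}
smt_targets = balanced_targets(usage, {n: smt_weight(c, 2) for n, c in physical.items()})
```

With these inputs the sketch suggests moving about 3.1 cores of load onto A, taken from B (about 1.4) and C (about 1.7). Because every node in this cluster has the same threads-per-core ratio, the SMT-adjusted weights give the same targets; the physical-core weighting would only change the outcome on clusters that mix SMT and non-SMT nodes.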

Workarounds today

None that I'm aware of within the LowNodeUtilization plugin. The closest alternative is hand-crafted nodeAffinity/taints policies that are static rather than load-aware. Would be glad to be pointed at any existing pattern I've missed.

Related

Downstream report (config gap that surfaced this on a Harvester cluster): harvester/harvester#10547
