fix: validate LowNodeUtilization metrics source #1877
/ok-to-test
@tiraboschi fyi
Pull request overview
This PR tightens policy-time validation for the LowNodeUtilization plugin’s metricsUtilization configuration so invalid/typo’d metrics source settings fail early instead of surfacing as runtime plugin initialization errors.
Changes:
- Rejects `metricsUtilization: {}` when neither `metricsUtilization.source` nor legacy `metricsUtilization.metricsServer` is set.
- Rejects unknown `metricsUtilization.source` values (anything other than `KubernetesMetrics` or `Prometheus`).
- Adds unit tests covering empty source, unknown source, and legacy `metricsServer: true` without a source.
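The rules in the list above can be sketched in isolation as follows. This is a minimal illustration, not the PR's code: the types are simplified stand-ins for the descheduler API types, `validateMetricsSource` is a hypothetical name, and treating a nil `metricsUtilization` as valid is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// MetricsSource and MetricsUtilization are simplified stand-ins for the
// descheduler types; the field set is an assumption made for illustration.
type MetricsSource string

const (
	KubernetesMetrics MetricsSource = "KubernetesMetrics"
	PrometheusMetrics MetricsSource = "Prometheus"
)

type MetricsUtilization struct {
	MetricsServer bool          // legacy flag
	Source        MetricsSource // empty when unset
}

// validateMetricsSource (hypothetical name) rejects an empty or unknown
// source at validation time, while keeping the legacy flag valid.
func validateMetricsSource(m *MetricsUtilization) error {
	if m == nil {
		return nil // assumption: metricsUtilization is optional
	}
	if m.Source == "" && !m.MetricsServer {
		return errors.New("metrics source is empty")
	}
	switch m.Source {
	case "", KubernetesMetrics, PrometheusMetrics:
		return nil
	default:
		return errors.New("unrecognized metrics source")
	}
}

func main() {
	cases := []*MetricsUtilization{
		{},                          // metricsUtilization: {}
		{Source: "Promethues"},      // typo'd source
		{MetricsServer: true},       // legacy flag only
		{Source: PrometheusMetrics}, // known source
	}
	for _, c := range cases {
		fmt.Printf("%+v -> %v\n", *c, validateMetricsSource(c))
	}
}
```

The first two cases return an error and the last two return nil, which mirrors the behavior the new table tests are described as covering.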
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| pkg/framework/plugins/nodeutilization/validation.go | Adds early validation for empty/unknown metrics source in LowNodeUtilization args. |
| pkg/framework/plugins/nodeutilization/validation_test.go | Extends table tests to cover the new validation behavior and legacy metricsServer usage. |
Comments suppressed due to low confidence (1)
pkg/framework/plugins/nodeutilization/validation.go:75
`MetricsServer` is treated as the Kubernetes-metrics path at runtime (see `lownodeutilization.go` switch: `case metrics.MetricsServer, metrics.Source == KubernetesMetrics`), but this validation only forbids `MetricsServer` when `Source == KubernetesMetrics`. As a result, configs like `metricsServer: true` combined with `source: Prometheus` (and/or `prometheus:` config) pass validation but will silently run with Metrics Server instead of Prometheus. Consider rejecting any non-empty `Source` and any `Prometheus` config when `MetricsServer` is true (or explicitly require `Source` to be empty when using the legacy flag).
if args.MetricsUtilization.Source == "" && !args.MetricsUtilization.MetricsServer {
return fmt.Errorf("metrics source is empty")
}
if args.MetricsUtilization.Source != "" && args.MetricsUtilization.Source != api.KubernetesMetrics && args.MetricsUtilization.Source != api.PrometheusMetrics {
return fmt.Errorf("unrecognized metrics source")
}
if args.MetricsUtilization.Source == api.KubernetesMetrics && args.MetricsUtilization.MetricsServer {
return fmt.Errorf("it is not allowed to set both %q source and metricsServer", api.KubernetesMetrics)
}
if args.MetricsUtilization.Source == api.KubernetesMetrics && args.MetricsUtilization.Prometheus != nil {
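One way the stricter check suggested in this comment might look is sketched below. It is an illustration only: the types are simplified stand-ins, the `Prometheus` struct and its `Query` field are placeholders, and `validateLegacyFlag` is a hypothetical helper that does not exist in the PR.

```go
package main

import (
	"errors"
	"fmt"
)

// Simplified stand-ins for the plugin's argument types (assumed shapes).
type MetricsSource string

const (
	KubernetesMetrics MetricsSource = "KubernetesMetrics"
	PrometheusMetrics MetricsSource = "Prometheus"
)

// Prometheus is a placeholder for the plugin's Prometheus settings.
type Prometheus struct {
	Query string
}

type MetricsUtilization struct {
	MetricsServer bool
	Source        MetricsSource
	Prometheus    *Prometheus
}

// validateLegacyFlag (hypothetical) requires the legacy metricsServer flag
// to be used on its own: no source and no Prometheus configuration.
func validateLegacyFlag(m *MetricsUtilization) error {
	if m == nil || !m.MetricsServer {
		return nil // the legacy flag is not in use
	}
	if m.Source != "" {
		return fmt.Errorf("metricsServer cannot be combined with source %q", m.Source)
	}
	if m.Prometheus != nil {
		return errors.New("metricsServer cannot be combined with prometheus configuration")
	}
	return nil
}

func main() {
	// metricsServer: true together with source: Prometheus is rejected.
	mixed := &MetricsUtilization{MetricsServer: true, Source: PrometheusMetrics}
	fmt.Println(validateLegacyFlag(mixed))

	// The legacy flag on its own stays valid.
	fmt.Println(validateLegacyFlag(&MetricsUtilization{MetricsServer: true}))
}
```

Requiring the legacy flag to stand alone avoids the case where a policy names Prometheus but the plugin runs against Metrics Server.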
{
	name: "legacy metrics server without metrics source",
	args: &LowNodeUtilizationArgs{
		Thresholds: api.ResourceThresholds{
			v1.ResourceCPU:    20,
			v1.ResourceMemory: 20,
			extendedResource:  20,
		},
		TargetThresholds: api.ResourceThresholds{
			v1.ResourceCPU:    80,
			v1.ResourceMemory: 80,
			extendedResource:  80,
		},
		MetricsUtilization: &MetricsUtilization{
			MetricsServer: true,
		},
	},
	errInfo: nil,
Description
Tiny config validation fix, nothing fancy.
`LowNodeUtilization` accepted `metricsUtilization: {}` or a typo'd `metricsUtilization.source`. That can come straight from policy YAML, since `source` is just a string; no cloud quota / API ceiling blocks it. Later the plugin trips at runtime with `metrics source is empty` or `unrecognized metrics source`.
This catches that during policy validation instead. Legacy `metricsServer: true` stays valid.
Reproduce
Before the fix, the new table cases returned `nil`:
`go test ./pkg/framework/plugins/nodeutilization -run TestValidateLowNodeUtilizationPluginConfig -count=1`
Now they pass, and the wider package suite is clean:
`go test ./pkg/...`
Related search: #1658 and #1840 are in the same metrics config area, but this does not directly close them.
Checklist