Exclude kube-system from webhooks targeting common resources #11192

Merged
k8s-ci-robot merged 8 commits into kubernetes-sigs:main from dkaluza:webhooks-exclude-kube-system
May 18, 2026

Conversation

@dkaluza
Contributor

@dkaluza dkaluza commented May 14, 2026

What type of PR is this?

/kind feature
/area integrations

What this PR does / why we need it:

Adds exclusions for the `kube-system` and `kueue-system` namespaces (or the release namespace, for Helm installations) to the webhooks that target common resource types, i.e. resource types, including custom resources, that are not added by Kueue.

This ensures that a Kueue deployment does not suspend crucial system components, e.g. Jobs run by kubeadm.

Which issue(s) this PR fixes:

Fixes #11006

Special notes for your reviewer:

I haven't modified the webhook selectors for Kueue's own resources, as those shouldn't be run by other crucial system components. But if you think it is worth excluding those as well, just let me know.

Prepared with Antigravity, but carefully reviewed so as not to miss any resource.

Does this PR introduce a user-facing change?

Kueue integration webhooks now consistently exclude `kube-system` and the Kueue
installation namespace. For manifest-based installs this is `kueue-system`; for Helm
installs this is the Helm release namespace.

Previously, Pod, Deployment, and StatefulSet integrations already excluded these
namespaces, while other workload integrations did not.

ACTION REQUIRED: If you run Kueue-managed workloads in `kube-system` or in the namespace
where Kueue is installed, move them to another namespace before upgrading.

If this is not possible, update `managedJobsNamespaceSelector` and the relevant webhook
`namespaceSelector`s to include those namespaces. This setup is discouraged, as it may
affect system workloads and disrupt cluster upgrades.
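
To illustrate the exclusion described in the release note, here is a minimal Go sketch of how a `NotIn` requirement on the `kubernetes.io/metadata.name` label decides whether a webhook applies to a namespace. The type `notInRequirement` and the function `webhookApplies` are hypothetical stand-ins written for this example; they are not Kueue code or the Kubernetes API types.

```go
package main

import (
	"fmt"
	"slices"
)

// notInRequirement is a simplified, hypothetical stand-in for one
// matchExpressions entry with operator NotIn.
type notInRequirement struct {
	Key    string
	Values []string
}

// matches reports whether a namespace with the given labels is selected.
// With NotIn, a namespace is selected when the label is absent or when its
// value is not in the listed values.
func (r notInRequirement) matches(labels map[string]string) bool {
	v, ok := labels[r.Key]
	if !ok {
		return true
	}
	return !slices.Contains(r.Values, v)
}

// webhookApplies mimics the selector evaluation for one namespace name.
// installNamespace is kueue-system for manifest installs, or the Helm
// release namespace for Helm installs.
func webhookApplies(namespace, installNamespace string) bool {
	req := notInRequirement{
		Key:    "kubernetes.io/metadata.name",
		Values: []string{"kube-system", installNamespace},
	}
	// Kubernetes sets this label on every namespace to the namespace name.
	labels := map[string]string{"kubernetes.io/metadata.name": namespace}
	return req.matches(labels)
}

func main() {
	for _, ns := range []string{"kube-system", "kueue-system", "team-a"} {
		fmt.Printf("%s: webhook applies = %v\n", ns, webhookApplies(ns, "kueue-system"))
	}
}
```

Under these assumptions, only namespaces other than `kube-system` and the installation namespace reach the integration webhooks.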

@k8s-ci-robot k8s-ci-robot added release-note-action-required Denotes a PR that introduces potentially breaking changes that require user action. kind/feature Categorizes issue or PR as related to a new feature. area/integrations Workload integrations labels May 14, 2026
@netlify

netlify Bot commented May 14, 2026

Deploy Preview for kubernetes-sigs-kueue canceled.

Name Link
🔨 Latest commit 3bf8e90
🔍 Latest deploy log https://app.netlify.com/projects/kubernetes-sigs-kueue/deploys/6a0b3182ef6feb00084c6a40

@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels May 14, 2026
@k8s-ci-robot
Contributor

Hi @dkaluza. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work.

Tip

We noticed you've done this a few times! Consider joining the org to skip this step and gain /lgtm and other bot rights. We recommend asking approvers on your previous PRs to sponsor you.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label May 14, 2026
@mbobrovskyi
Contributor

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels May 14, 2026
Comment thread hack/processing-plan.yaml Outdated
{{- if (hasKey $managerConfig "managedJobsNamespaceSelector") }}
namespaceSelector:
{{- toYaml $managerConfig.managedJobsNamespaceSelector | nindent 6 }}
{{- else }}
Contributor Author

I do not like the fact that we have to modify 200+ lines to make such a small change, but it looks like we want to be really explicit about the changes here, so I didn't want to change this.

- webhooks.[name=mstatefulset.kb.io].namespaceSelector
- webhooks.[name=mtfjob.kb.io].namespaceSelector
- webhooks.[name=mtrainjob.kb.io].namespaceSelector
- webhooks.[name=mxgboostjob.kb.io].namespaceSelector
Contributor Author

Unfortunately, this still requires registering each new webhook here whenever a new integration is added.

That would not be the case if we just used `*`, but then we would also target Kueue's CRDs... This could be solved by splitting those in some way, but I'm not convinced it is worth it right now.

Contributor

Cool, do I understand correctly that this change only impacts workloads other than Pod, Deployment, and StatefulSet? Because Pod, Deployment, and StatefulSet already had the exclusions in the code deleted here:

- patch: |-
    apiVersion: admissionregistration.k8s.io/v1
    kind: MutatingWebhookConfiguration
    metadata:
      name: mutating-webhook-configuration
    webhooks:
    - name: mpod.kb.io
      namespaceSelector:
        matchExpressions:
        - key: kubernetes.io/metadata.name
          operator: NotIn
          values:
            - kube-system
            - kueue-system
    - name: mdeployment.kb.io
      namespaceSelector:
        matchExpressions:
        - key: kubernetes.io/metadata.name
          operator: NotIn
          values:
            - kube-system
            - kueue-system
    - name: mstatefulset.kb.io
      namespaceSelector:
        matchExpressions:
        - key: kubernetes.io/metadata.name
          operator: NotIn
          values:
            - kube-system
            - kueue-system

Contributor Author

That's correct.

@dkaluza dkaluza changed the title Webhooks exclude kube system Exclude kube-system from webhooks targeting common resources May 14, 2026
Comment thread config/components/webhook/kustomization.yaml Outdated
….yaml

Co-authored-by: Olek Zabłocki <olekz@google.com>
@mimowo
Contributor

mimowo commented May 15, 2026

/test pull-kueue-test-e2e-multikueue-main
/test pull-kueue-test-e2e-main-1-34
Looks like an infra issue; we are currently fixing similar issues with retries: #11238 (comment)

Comment thread hack/processing-plan.yaml
namespaceSelector:
{{- if (hasKey $managerConfig "managedJobsNamespaceSelector") -}}
{{- toYaml $managerConfig.managedJobsNamespaceSelector | nindent 6 -}}
{{- if (hasKey $managerConfig "managedJobsNamespaceSelector") }}
Contributor

This part looks duplicated for all CRDs:

          namespaceSelector:
            {{- if (hasKey $managerConfig "managedJobsNamespaceSelector") -}}
              {{- toYaml $managerConfig.managedJobsNamespaceSelector | nindent 6 -}}
            {{- else }}
            matchExpressions:
              - key: kubernetes.io/metadata.name
                operator: NotIn
                values:
                  - kube-system
                  - '{{ .Release.Namespace }}'
            {{- end }}

I'm wondering if there would be a way to introduce some kind of placeholder for that in processing-plan.yaml, wdyt?

This is not a blocker, but it would be good to research this topic on this occasion so that we can open a follow-up issue to maybe introduce that capability if it does not exist currently.

Contributor

Would named templates help?

Contributor Author

@dkaluza dkaluza May 15, 2026

I believe it should generally be doable with just a better yq selector (the condition here), with no need for a named template yet, but I agree that named templates also look like a viable way to achieve deduplication in similar cases.

I got some strange behavior from our YAML processor with the new condition (not pushed yet); I will investigate on Monday.

Contributor Author

Done, needed a small hotfix to make this feasible, added a test for that.

- webhooks.[name=vleaderworkerset.kb.io].namespaceSelector
- webhooks.[name=vmpijob.kb.io].namespaceSelector
- webhooks.[name=vpaddlejob.kb.io].namespaceSelector
- webhooks.[name=vpod.kb.io].namespaceSelector
Contributor

FYI this target path is exactly the source path.

An AI bot told me it'll still work, even with create: true (just silently ignoring this target).
So maybe this is desired, for the completeness of the target list.
Or maybe it could be considered confusing/misleading.

I'm leaving the choice to you; I just wanted to make sure you're aware of this.

Contributor Author

Thanks for pointing this out.

Generally, I consider this beneficial, as it provides the completeness you mentioned. This way, the number of webhooks in both selects is the same. They can also be matched more easily against the expected number, and you do not always have to remember this special case.

> An AI bot told me it'll still work, even with create: true (just silently ignoring this target).

According to the kustomize docs, create: true just means that the target path is created if it does not exist, instead of throwing an error.
Generally, this replaces the content with identical content, so I wouldn't expect any problems here.

@olekzabl
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 15, 2026
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: 56e78ee8647edbf06877089eb5a2a3c0437e523c

@mimowo
Contributor

mimowo commented May 18, 2026

/release-note-edit

Kueue integration webhooks will no longer target the kube-system or kueue-system namespaces.

ACTION REQUIRED: If you are running Kueue-managed workloads in the kube-system namespace or in the
namespace in which Kueue is installed, please move those workloads to a different namespace.

If this is not feasible, you may need to adjust your `managedJobsNamespaceSelector` and
webhook configuration to also cover these namespaces. However, such a deployment is
discouraged, as it may result in disruptions during cluster upgrades.

path: /mutate-workload-codeflare-dev-v1beta2-appwrapper
failurePolicy: Fail
name: mappwrapper.kb.io
{{- if (hasKey $managerConfig "managedJobsNamespaceSelector") }}
Contributor

qq: does this change:

  {{- if (hasKey $managerConfig "managedJobsNamespaceSelector") }}
    namespaceSelector:
      {{- toYaml $managerConfig.managedJobsNamespaceSelector | nindent 6 }}
    {{- end }}

to

    namespaceSelector:
      {{- if (hasKey $managerConfig "managedJobsNamespaceSelector") }}
        {{- toYaml $managerConfig.managedJobsNamespaceSelector | nindent 6 }}
      {{- else }}
      matchExpressions:
        - key: kubernetes.io/metadata.name
          operator: NotIn
          values:
            - kube-system
            - '{{ .Release.Namespace }}'
      {{- end }}

impact the default configuration by Helm? IIUC it does because by default we would keep managedJobsNamespaceSelector as nil, and so by default Helm would cover kueue-system and kube-system, but I want to double check.

Contributor Author

@dkaluza dkaluza May 18, 2026

Right now, the default Helm deployment has namespace excludes only for the mentioned types:

  • statefulsets
  • pods
  • deployments

After this PR, the namespaces will be excluded for all workload source types by default, which makes them consistent across types and aligned with the release manifests.
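
The branching in the Helm template discussed above can be sketched in Go as follows. This is only an illustration of the if/else logic, under the assumption that an absent `managedJobsNamespaceSelector` key triggers the default exclusion. The types `expression` and `selector` and the function `effectiveNamespaceSelector` are hypothetical names for this example, not part of the chart or of Kueue.

```go
package main

import "fmt"

// expression and selector are simplified, hypothetical stand-ins for a
// label selector with matchExpressions.
type expression struct {
	Key      string
	Operator string
	Values   []string
}

type selector struct {
	MatchExpressions []expression
}

// effectiveNamespaceSelector mirrors the template branch: if the manager
// config has the managedJobsNamespaceSelector key, that selector is used as
// is; otherwise kube-system and the release namespace are excluded.
func effectiveNamespaceSelector(managerConfig map[string]selector, releaseNamespace string) selector {
	// The map lookup plays the role of hasKey in the template.
	if sel, ok := managerConfig["managedJobsNamespaceSelector"]; ok {
		return sel
	}
	return selector{MatchExpressions: []expression{{
		Key:      "kubernetes.io/metadata.name",
		Operator: "NotIn",
		Values:   []string{"kube-system", releaseNamespace},
	}}}
}

func main() {
	// Default install: no selector configured, so the exclusion applies.
	fmt.Printf("%+v\n", effectiveNamespaceSelector(map[string]selector{}, "my-release"))

	// Custom selector: used unchanged, so the default exclusion is not added.
	custom := selector{MatchExpressions: []expression{{
		Key: "team", Operator: "In", Values: []string{"a"},
	}}}
	cfg := map[string]selector{"managedJobsNamespaceSelector": custom}
	fmt.Printf("%+v\n", effectiveNamespaceSelector(cfg, "my-release"))
}
```

Note that in this sketch a user-provided selector replaces the default exclusion entirely, which matches the release note's advice to adjust the selector explicitly if those namespaces must be covered.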

Contributor

Cool, let's capture that in the release note

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 18, 2026
@k8s-ci-robot k8s-ci-robot requested a review from olekzabl May 18, 2026 14:28
buffer.WriteString(line + "\n")

if slices.Contains(keyLines, i+offset) {
if slices.Contains(keyLines, i) {
Contributor Author

As keyLines are from the original yaml, we shouldn't be skipping lines if we add a new snippet.
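
A minimal Go sketch of the idea behind this fix, assuming a processor that walks the original lines and appends a snippet after each key line. The function `insertAfterKeyLines` and its parameters are hypothetical; the actual processor in the repository may be structured differently.

```go
package main

import (
	"fmt"
	"slices"
	"strings"
)

// insertAfterKeyLines copies lines to the output and appends snippet after
// every line whose index in the ORIGINAL input is listed in keyLines.
func insertAfterKeyLines(lines []string, keyLines []int, snippet string) string {
	var buffer strings.Builder
	for i, line := range lines {
		buffer.WriteString(line + "\n")
		// i indexes the original input, the same coordinate system as
		// keyLines, so no offset for earlier insertions is applied.
		if slices.Contains(keyLines, i) {
			buffer.WriteString(snippet + "\n")
		}
	}
	return buffer.String()
}

func main() {
	lines := []string{"a:", "b:", "c:"}
	fmt.Print(insertAfterKeyLines(lines, []int{0, 1}, "  # inserted"))
}
```

In this sketch, comparing against an index shifted by the number of lines already inserted would miss the key line at index 1 once a snippet had been added after index 0, which is the kind of skipped line the comment above describes.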

@mimowo
Contributor

mimowo commented May 18, 2026

/release-note-edit

Kueue integration webhooks now consistently exclude `kube-system` and the Kueue
installation namespace. For manifest-based installs this is `kueue-system`; for Helm
installs this is the Helm release namespace.

Previously, Pod, Deployment, and StatefulSet integrations already excluded these
namespaces, while other workload integrations did not.

ACTION REQUIRED: If you run Kueue-managed workloads in `kube-system` or in the namespace
where Kueue is installed, move them to another namespace before upgrading.

If this is not possible, update `managedJobsNamespaceSelector` and the relevant webhook
`namespaceSelector`s to include those namespaces. This setup is discouraged, as it may
affect system workloads and disrupt cluster upgrades.

@mimowo
Contributor

mimowo commented May 18, 2026

/lgtm
/approve
Thank you, nice to see closing the inconsistencies between configuration for different Jobs 👍 I'm even wondering if we should cherrypick this as a consistency bug.

Any opinion @tenzen-y ?

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 18, 2026
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: e3a7c6d93ff7bd83f4bd9fcb5369d24c3c3adb80

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dkaluza, mimowo

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 18, 2026
@k8s-ci-robot k8s-ci-robot merged commit 03754ac into kubernetes-sigs:main May 18, 2026
37 checks passed
@k8s-ci-robot k8s-ci-robot added this to the v0.18 milestone May 18, 2026
@mimowo
Contributor

mimowo commented May 18, 2026

> Thank you, nice to see closing the inconsistencies between configuration for different Jobs 👍 I'm even wondering if we should cherrypick this as a consistency bug.
> Any opinion @tenzen-y ?

I think we can skip the cherry-pick for now because no user has requested the change as a bugfix. We will reconsider if this change is needed to cherry-pick another bugfix, or if a user requests it.

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. area/integrations Workload integrations cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. release-note-action-required Denotes a PR that introduces potentially breaking changes that require user action. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Exclude kube-system namespace from webhooks targeting common resource types

5 participants