Skip to content

channels: move from protokube to a static pod#18328

Open
hakman wants to merge 8 commits into
kubernetes:masterfrom
hakman:channels
Open

channels: move from protokube to a static pod#18328
hakman wants to merge 8 commits into
kubernetes:masterfrom
hakman:channels

Conversation

@hakman
Copy link
Copy Markdown
Member

@hakman hakman commented May 13, 2026

Extracts addon channel reconciliation from protokube into a new kops-channels static pod, rendered by nodeup on control-plane nodes. The pod runs the existing channels binary in a long-lived loop, reconciles every cluster channel on a 60s interval, and patches the local node with the mandatory control-plane labels.

To make this easier to review, keeping the cleanup for a separate PR.

TODO (followups):

  • cleanup of old binaries
  • move manifest generation to cloudup
  • move env vars to mounted file
  • ...

@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 13, 2026
@k8s-ci-robot
Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign justinsb for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot requested review from olemarkus and zetaab May 13, 2026 02:20
@k8s-ci-robot k8s-ci-robot added area/channels cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. area/nodeup size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels May 13, 2026
@hakman
Copy link
Copy Markdown
Member Author

hakman commented May 13, 2026

/cc @rifelpet @justinsb

@k8s-ci-robot k8s-ci-robot requested review from justinsb and rifelpet May 13, 2026 02:21
@hakman hakman removed request for olemarkus and zetaab May 13, 2026 02:21
@hakman hakman force-pushed the channels branch 2 times, most recently from b1dd5ca to 427e3b3 Compare May 13, 2026 05:57
@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels May 13, 2026
@hakman hakman force-pushed the channels branch 4 times, most recently from ed8903e to 56c10c5 Compare May 13, 2026 12:10
@hakman hakman changed the title WIP channels: move from protokube to a static pod channels: move from protokube to a static pod May 13, 2026
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 13, 2026
@hakman hakman force-pushed the channels branch 2 times, most recently from 72dcd77 to 1bd57d7 Compare May 13, 2026 15:38
@hakman
Copy link
Copy Markdown
Member Author

hakman commented May 14, 2026

This is ready for review. It is based on what we discussed during office hours, so no concerns regarding the direction.
@rifelpet any thoughts on getting this into 1.36?

Comment thread nodeup/pkg/model/channels.go Outdated

if os.Getenv("S3_ENDPOINT") != "" {
for _, name := range []string{"S3_ENDPOINT", "S3_REGION", "S3_ACCESS_KEY_ID", "S3_SECRET_ACCESS_KEY"} {
add(name, os.Getenv(name))
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would expose secrets to anything with "get pods" access in kube-system. could we replace this with a volume mount and a .env file?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call, but already the pattern in a few other places. Updating to follow the same approach as there for now in 0b33d94. Once this merges we can move all these to separate files.

@hakman hakman requested a review from rifelpet May 15, 2026 12:41
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 17, 2026
When set, re-applies the channel on the given duration until SIGTERM,
logging per-iteration errors instead of exiting. Default 0 preserves the
existing one-shot behavior for current callers (protokube, CI). Enables
running channels as a long-lived workload (e.g. a static pod) without
an external loop.
hakman added 6 commits May 18, 2026 08:52
Relocates the control-plane node labeler from protokube to a new
channels/pkg/nodelabeler package and renames it to
BootstrapControlPlaneNodeLabels. Protokube still drives the call for
now via the new import path. This is preparation for running channels
as a static pod that owns both addon application and the labels addons
target.

The labeler's tainter.go scratch types are removed; the new package
inlines only the patch struct it needs.
apply channel now takes one or more channel URLs and applies them
sequentially per invocation. With --interval the loop iterates over the
URLs each tick, mirroring protokube's old syncOnce ordering. Per-channel
errors are collected via multierr so one bad channel does not stop the
rest. Single-URL callers continue to work unchanged.

Adds --node-name: when set, each iteration patches the named node with
the mandatory control-plane labels via channels/pkg/nodelabeler. Empty
--node-name skips labeling, which is the right default for one-shot
CLI use from a developer's laptop. The kops-channels static pod supplies
--node-name via the downward API.

Together these let a single channels process own both addon application
and control-plane labeling for the entire channel set, replacing
protokube's per-channel subprocess fan-out and its separate labeler
step.
Adds the ko-kops-channels-export Makefile target set (build, export,
version-dist, dev-upload, push) cloning the kops-controller pattern,
and wires kops-channels-push into cloudbuild.yaml so the staging push
step pushes the new image alongside the others. Needed so channels
can run as a static pod under kubelet instead of as a host binary
invoked by protokube.
Adds a ChannelsBuilder that emits /etc/kubernetes/manifests/kops-channels.manifest.
The pod runs one container per channel URL on a 60s interval; the
bootstrap-channel container additionally patches the local node with
control-plane labels via --bootstrap-node-labels and the downward API.
The pod is system-node-critical because it owns the labels addons target
for scheduling, and uses hostNetwork so VFS can reach the cloud
metadata service before CNI is up.

At this commit the static pod and protokube both apply channels in
parallel; that is safe because apply is idempotent via manifest-hash
annotations. The protokube side is removed in the next commit.
Now that the kops-channels static pod owns both responsibilities, drop
the protokube-side reconciliation: the channels exec wrapper, the
--channels and --node-name flags, the labeler call, and the host-side
install of /opt/kops/bin/channels in the nodeup builder. The KubeBoot
struct sheds Channels and NodeName; the sync loop is now an idle
keep-alive for the gossip goroutines and will be removed alongside the
legacy gossip code path.
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 18, 2026
The first apply fails while a control-plane node's apiserver is still
starting; retry every 5s until it succeeds rather than waiting a full
interval, which delays cluster bootstrap. Also reuse a cached kube
client per iteration.
@hakman
Copy link
Copy Markdown
Member Author

hakman commented May 18, 2026

@rifelpet anything else left to clarify here?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/channels area/nodeup cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants