cluster-inventory: Add ClusterProfile discovery for in-cluster mode #4577

Open
kahirokunn wants to merge 14 commits into kubernetes-sigs:main from kahirokunn:support-cluster-inventory

@kahirokunn (Member) commented Feb 6, 2026

Summary

This PR adds support for the Cluster Inventory API to automatically discover and register Kubernetes clusters in Headlamp via ClusterProfile resources, enabling multi-cluster management without manual kubeconfig configuration.

Cluster Inventory support in Headlamp is currently alpha/experimental and is disabled by default. It uses the upstream Cluster Inventory v1alpha1 API from the v0.1.x releases, so fields and behavior may change until that project reaches beta.

Why This Belongs in Headlamp

This is implemented in Headlamp core rather than as a Headlamp UI plugin because Cluster Inventory discovery needs backend integration. Headlamp has to watch ClusterProfile CRDs, build Kubernetes clients from Cluster Inventory access providers, register discovered clusters in the shared kubeconfig context store, and expose those clusters consistently to the frontend and proxy layer.

The Cluster Inventory API is also a Kubernetes SIG Multicluster project hosted under kubernetes-sigs. It is intended to provide a vendor-neutral, standardized interface for multi-cluster discovery and cluster status. Supporting it in core lets Headlamp interoperate with any implementation of that API instead of tying this behavior to one plugin or one vendor-specific integration.

Architecture

graph TB
    subgraph ManagementCluster["🏢 Management Cluster (Hub)"]
        HL["Headlamp Server<br/>--enable-cluster-inventory"]
        CP1["ClusterProfile<br/>name: workload-1<br/>ns: default"]
        CP2["ClusterProfile<br/>name: workload-2<br/>ns: fleet-a"]

        subgraph ProviderConfig["🔑 Credential Provider"]
            PF["provider-config.json"]
            EXEC["Exec Plugin<br/>(credential-plugin.sh)"]
            PF --> EXEC
        end

        HL -- "ClusterProfile informer/watch" --> CP1
        HL -- "ClusterProfile informer/watch" --> CP2
        HL -. "Root reconcile loop<br/>watcher lifecycle / no-CRD retry" .-> HL
        HL -- "Fetch credentials" --> PF
    end

    subgraph WorkloadCluster1["⚙️ Workload Cluster 1"]
        API1["kube-apiserver"]
        RESOURCES1["Pods, Deployments, ..."]
    end

    subgraph WorkloadCluster2["⚙️ Workload Cluster 2"]
        API2["kube-apiserver"]
        RESOURCES2["Pods, Deployments, ..."]
    end

    CP1 -- "status.accessProviders<br/>server + CA" --> API1
    CP2 -- "status.accessProviders<br/>server + CA" --> API2
    EXEC -- "ExecCredential<br/>(clientCert + clientKey)" --> API1
    EXEC -- "ExecCredential<br/>(clientCert + clientKey)" --> API2

    style ManagementCluster fill:#1a237e,stroke:#5c6bc0,color:#fff
    style WorkloadCluster1 fill:#004d40,stroke:#26a69a,color:#fff
    style WorkloadCluster2 fill:#004d40,stroke:#26a69a,color:#fff
    style ProviderConfig fill:#4a148c,stroke:#ce93d8,color:#fff

Related Issue

#4708

Changes

  • Automatic discovery of clusters from ClusterProfile resources (Cluster Inventory API) in both in-cluster and local/desktop modes
  • Add a Cluster Inventory label selector config (--cluster-inventory-label-selector, env var, Helm config.clusterInventory.labelSelector) that limits discovery to matching ClusterProfile resources; profiles that do not match the selector are hidden.
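
As a rough illustration of how such a selector behaves, the sketch below previews which ClusterProfile resources a given selector matches. It assumes Headlamp applies standard Kubernetes label-selector semantics, which this description does not spell out. The helper name `list_selected_profiles` and the `KUBECTL` override are hypothetical and exist only for illustration; the default `!headlamp.dev/ignore` is the selector used in the test steps.

```shell
# Hypothetical helper: list the ClusterProfiles that a label selector matches.
# KUBECTL can be overridden (for example with a stub) and defaults to kubectl.
list_selected_profiles() {
  local default_selector='!headlamp.dev/ignore'  # single quotes keep "!" literal
  local selector="${1:-$default_selector}"
  "${KUBECTL:-kubectl}" get clusterprofiles.multicluster.x-k8s.io \
    --all-namespaces --selector "${selector}" -o name
}

# Usage (not run here):
#   list_selected_profiles                 # profiles without headlamp.dev/ignore
#   list_selected_profiles 'env=prod'      # profiles labeled env=prod
```

Under standard selector semantics, `!headlamp.dev/ignore` matches only objects that do not carry that label key, so a profile labeled headlamp.dev/ignore=true should drop out of this list.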

Steps to Test

Prerequisites

  • Docker running
  • kind, kubectl, clusterctl, yq, python3 installed (python3 is used by the credential plugin in step 5)

1. Create a Kind management cluster

kind create cluster --name headlamp-ci-hub --config - <<'EOF'
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
  extraMounts:
    - hostPath: /var/run/docker.sock
      containerPath: /var/run/docker.sock
EOF

kubectl config use-context kind-headlamp-ci-hub

2. Install CAPI + k0smotron

clusterctl init --infrastructure docker

kubectl wait --for=condition=Available deployment --all -n capi-system --timeout=120s
kubectl wait --for=condition=Available deployment --all -n capd-system --timeout=120s

kubectl apply --server-side=true -f https://docs.k0smotron.io/v1.10.6/install.yaml
kubectl wait --for=condition=Available deployment --all -n k0smotron --timeout=300s

3. Create a workload cluster

K0S_VERSION="v1.35.3+k0s.0"
K8S_VERSION="${K0S_VERSION%%+*}"
WORKLOAD_CLUSTER_NAME="workload-1"

kubectl apply -f - <<EOF
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: ${WORKLOAD_CLUSTER_NAME}
  namespace: default
spec:
  clusterNetwork:
    pods:
      cidrBlocks: ["192.168.0.0/16"]
    serviceDomain: cluster.local
    services:
      cidrBlocks: ["10.128.0.0/12"]
  controlPlaneRef:
    apiVersion: controlplane.cluster.x-k8s.io/v1beta1
    kind: K0smotronControlPlane
    name: ${WORKLOAD_CLUSTER_NAME}-cp
    namespace: default
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    kind: DockerCluster
    name: ${WORKLOAD_CLUSTER_NAME}
    namespace: default
---
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: K0smotronControlPlane
metadata:
  name: ${WORKLOAD_CLUSTER_NAME}-cp
  namespace: default
spec:
  version: ${K0S_VERSION}
  persistence:
    type: emptyDir
  service:
    type: LoadBalancer
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: DockerCluster
metadata:
  name: ${WORKLOAD_CLUSTER_NAME}
  namespace: default
  annotations:
    cluster.x-k8s.io/managed-by: k0smotron
spec: {}
---
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineDeployment
metadata:
  name: ${WORKLOAD_CLUSTER_NAME}-md
  namespace: default
spec:
  clusterName: ${WORKLOAD_CLUSTER_NAME}
  replicas: 1
  selector:
    matchLabels:
      cluster.x-k8s.io/cluster-name: ${WORKLOAD_CLUSTER_NAME}
      pool: worker-pool-1
  template:
    metadata:
      labels:
        cluster.x-k8s.io/cluster-name: ${WORKLOAD_CLUSTER_NAME}
        pool: worker-pool-1
    spec:
      clusterName: ${WORKLOAD_CLUSTER_NAME}
      version: ${K8S_VERSION}
      bootstrap:
        configRef:
          apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
          kind: K0sWorkerConfigTemplate
          name: ${WORKLOAD_CLUSTER_NAME}-machine-config
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
        kind: DockerMachineTemplate
        name: ${WORKLOAD_CLUSTER_NAME}-mt
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: DockerMachineTemplate
metadata:
  name: ${WORKLOAD_CLUSTER_NAME}-mt
  namespace: default
spec:
  template:
    spec: {}
---
apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
kind: K0sWorkerConfigTemplate
metadata:
  name: ${WORKLOAD_CLUSTER_NAME}-machine-config
spec:
  template:
    spec:
      version: ${K0S_VERSION}
EOF

Wait for the control plane and workers:

kubectl wait k0smotroncontrolplane ${WORKLOAD_CLUSTER_NAME}-cp \
  --for=jsonpath='{.status.ready}'=true --timeout=600s

kubectl wait machine -l cluster.x-k8s.io/cluster-name=${WORKLOAD_CLUSTER_NAME} \
  --for=jsonpath='{.status.phase}'=Running --timeout=600s

4. Install ClusterProfile CRD and create a ClusterProfile

kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/cluster-inventory-api/v0.1.0/config/crd/bases/multicluster.x-k8s.io_clusterprofiles.yaml
kubectl wait --for=condition=Established crd clusterprofiles.multicluster.x-k8s.io --timeout=30s

# Extract connection info from the CAPI-managed kubeconfig secret
SECRET_DATA=$(kubectl get secret ${WORKLOAD_CLUSTER_NAME}-kubeconfig -o jsonpath='{.data.value}' | base64 -d)
SERVER_URL=$(echo "${SECRET_DATA}" | yq -r '.clusters[0].cluster.server')
CA_DATA=$(echo "${SECRET_DATA}" | yq -r '.clusters[0].cluster.certificate-authority-data')

# Create ClusterProfile (spec)
kubectl apply -f - <<EOF
apiVersion: multicluster.x-k8s.io/v1alpha1
kind: ClusterProfile
metadata:
  name: ${WORKLOAD_CLUSTER_NAME}
  namespace: default
spec:
  clusterManager:
    name: capi-docker
EOF

# Patch status subresource with access provider info
kubectl patch clusterprofile ${WORKLOAD_CLUSTER_NAME} \
  --type merge --subresource=status -p "{
  \"status\": {
    \"version\": { \"kubernetes\": \"${K8S_VERSION}\" },
    \"accessProviders\": [{
      \"name\": \"capi-secret\",
      \"cluster\": {
        \"server\": \"${SERVER_URL}\",
        \"certificate-authority-data\": \"${CA_DATA}\"
      }
    }]
  }
}"
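
Before starting Headlamp, it can help to confirm that the status patch landed. The sketch below prints the name and server of each access provider recorded on a ClusterProfile, using the same fields the patch above sets. The helper name `show_access_providers` and the `KUBECTL` override are hypothetical, added only for illustration.

```shell
# Hypothetical helper: print "<provider name><TAB><server>" for each access
# provider recorded in a ClusterProfile's status.
show_access_providers() {
  local name="${1:?usage: show_access_providers <name> [namespace]}"
  local namespace="${2:-default}"
  "${KUBECTL:-kubectl}" get clusterprofile "${name}" --namespace "${namespace}" \
    -o jsonpath='{range .status.accessProviders[*]}{.name}{"\t"}{.cluster.server}{"\n"}{end}'
}

# Usage (not run here):
#   show_access_providers "${WORKLOAD_CLUSTER_NAME}"
```

An empty result would suggest the patch did not apply to the status subresource.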

5. Create the credential plugin and provider config

Create credential-plugin.sh:

cat > /tmp/credential-plugin.sh <<'PLUGIN_EOF'
#!/usr/bin/env bash
set -euo pipefail
CLUSTER_NAME="${1:?Usage: credential-plugin.sh <cluster-name>}"
MGMT_CONTEXT="kind-headlamp-ci-hub"

SECRET_VALUE=$(kubectl --context "${MGMT_CONTEXT}" get secret "${CLUSTER_NAME}-kubeconfig" \
  -o jsonpath='{.data.value}' | base64 -d)

CLIENT_CERT=$(echo "${SECRET_VALUE}" | yq -r '.users[0].user.client-certificate-data' | base64 -d)
CLIENT_KEY=$(echo "${SECRET_VALUE}" | yq -r '.users[0].user.client-key-data' | base64 -d)

cat <<EOF
{
  "apiVersion": "client.authentication.k8s.io/v1",
  "kind": "ExecCredential",
  "status": {
    "clientCertificateData": $(echo "${CLIENT_CERT}" | python3 -c 'import sys,json; print(json.dumps(sys.stdin.read()))'),
    "clientKeyData": $(echo "${CLIENT_KEY}" | python3 -c 'import sys,json; print(json.dumps(sys.stdin.read()))')
  }
}
EOF
PLUGIN_EOF
chmod +x /tmp/credential-plugin.sh
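
A quick smoke test of the plugin output can catch problems before Headlamp invokes it. The sketch below is a shallow textual check, not a JSON validator: it only confirms that the output names the ExecCredential kind and carries the two fields the plugin above emits. The helper name `looks_like_exec_credential` is hypothetical, and the function deliberately never prints the credential.

```shell
# Hypothetical helper: read exec plugin output on stdin and check, textually,
# that it looks like an ExecCredential with a client certificate and key.
looks_like_exec_credential() {
  local output pattern
  output="$(cat)"
  for pattern in \
    '"kind"[[:space:]]*:[[:space:]]*"ExecCredential"' \
    '"clientCertificateData"[[:space:]]*:' \
    '"clientKeyData"[[:space:]]*:'
  do
    # Fail on the first missing field.
    printf '%s\n' "${output}" | grep -Eq "${pattern}" || return 1
  done
}

# Usage (not run here):
#   /tmp/credential-plugin.sh workload-1 | looks_like_exec_credential \
#     && echo "plugin output looks OK"
```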

Create provider-config.json:

cat > /tmp/provider-config.json <<EOF
{
  "providers": [
    {
      "name": "capi-secret",
      "execConfig": {
        "apiVersion": "client.authentication.k8s.io/v1",
        "command": "/tmp/credential-plugin.sh",
        "args": ["workload-1"],
        "provideClusterInfo": false
      }
    }
  ]
}
EOF

6. Start Headlamp and verify

Terminal 1:

make backend
HEADLAMP_BACKEND_TOKEN=headlamp ./backend/headlamp-server -dev -listen-addr=localhost \
  --kubeconfig ~/.kube/config \
  --enable-cluster-inventory \
  --cluster-inventory-provider-file /tmp/provider-config.json

# Optional: shorten root watcher lifecycle retry while testing seed/root changes.
# --cluster-inventory-root-reconcile-interval=1m

Terminal 2:

npm run frontend:start

  1. Open http://localhost:3000
  2. Confirm cluster-inventory-store--kind-headlamp-ci-hub--default--workload-1--64f958ab98d6 appears in the cluster list with source Cluster Inventory
  3. Click into the cluster and verify you can browse its resources, for example by opening the Pods view
  4. Label the discovered profile with headlamp.dev/ignore=true and verify it disappears when using the default Helm selector or --cluster-inventory-label-selector='!headlamp.dev/ignore'.
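
The entry name in step 2 appears to combine a fixed prefix, the management kubeconfig context, the ClusterProfile namespace and name, and a trailing hash, joined by `--`. This layout is inferred from the single example above rather than from a documented contract, and the hash derivation is not described here, so the hypothetical helper below only builds the prefix to look for in the cluster list.

```shell
# Hypothetical helper: build the expected name prefix (everything before the
# trailing hash) for a discovered cluster. Layout inferred from one example.
cluster_inventory_name_prefix() {
  local context="${1:?usage: cluster_inventory_name_prefix <context> <namespace> <name>}"
  local namespace="${2:?missing namespace}"
  local name="${3:?missing name}"
  printf 'cluster-inventory-store--%s--%s--%s--\n' \
    "${context}" "${namespace}" "${name}"
}

# Usage:
#   cluster_inventory_name_prefix kind-headlamp-ci-hub default workload-1
```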

7. Clean up

kind delete cluster --name headlamp-ci-hub
rm -f /tmp/credential-plugin.sh /tmp/provider-config.json

Notes for the Reviewer

  • Adds sigs.k8s.io/cluster-inventory-api as a new Go dependency.
  • Cluster Inventory support is alpha/experimental, disabled by default, and currently targets the upstream v1alpha1 / v0.1.x API.
  • The --cluster-inventory-provider-file flag is required when Cluster Inventory is enabled.
  • When Cluster Inventory is disabled, Headlamp does not start ClusterProfile discovery and existing kubeconfig, in-cluster, and dynamic-cluster behavior is unchanged.
  • When Cluster Inventory is enabled but a watched API server does not have the ClusterProfile CRD, Headlamp treats that root as having no Cluster Inventory source, logs that the CRD is unavailable, caches that no-CRD state, and retries later.
  • Cluster Inventory credentials are obtained through the operator-provided access provider configuration. Provider commands/images should be treated as trusted deployment inputs, similar to kubeconfig exec credential plugins, because they can return credentials used to connect to discovered clusters.
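
To see ahead of time whether a given API server would fall into the no-CRD case described above, one can check for the ClusterProfile CRD directly. This is a sketch of a manual check, not Headlamp's internal detection logic; `has_clusterprofile_crd` and the `KUBECTL` override are hypothetical names.

```shell
# Hypothetical helper: succeed if the ClusterProfile CRD exists on the API
# server behind the given kubeconfig context, fail otherwise.
has_clusterprofile_crd() {
  local context="${1:?usage: has_clusterprofile_crd <kube-context>}"
  "${KUBECTL:-kubectl}" --context "${context}" get crd \
    clusterprofiles.multicluster.x-k8s.io >/dev/null 2>&1
}

# Usage (not run here):
#   has_clusterprofile_crd kind-headlamp-ci-hub \
#     || echo "ClusterProfile CRD not installed on this cluster"
```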

@kahirokunn kahirokunn marked this pull request as draft February 6, 2026 11:45
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 6, 2026
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Feb 6, 2026
@kahirokunn kahirokunn force-pushed the support-cluster-inventory branch 3 times, most recently from 7244999 to aecb41c Compare February 6, 2026 17:23
@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Feb 6, 2026
@kahirokunn kahirokunn force-pushed the support-cluster-inventory branch from aecb41c to 9a2ca17 Compare February 6, 2026 17:30
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 7, 2026
@kahirokunn kahirokunn force-pushed the support-cluster-inventory branch 3 times, most recently from e369623 to 595dfb8 Compare February 7, 2026 01:52
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 7, 2026
@kahirokunn kahirokunn force-pushed the support-cluster-inventory branch from 595dfb8 to 54c6f02 Compare February 7, 2026 02:21
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 14, 2026
@kahirokunn kahirokunn force-pushed the support-cluster-inventory branch from 54c6f02 to fde608e Compare February 14, 2026 18:04
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 14, 2026
@kahirokunn kahirokunn force-pushed the support-cluster-inventory branch 4 times, most recently from 593ca6b to 1cdd0cd Compare February 14, 2026 19:02
@kahirokunn kahirokunn marked this pull request as ready for review February 14, 2026 19:04
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 14, 2026
@k8s-ci-robot k8s-ci-robot requested review from skoeva and sniok February 14, 2026 19:04
@kahirokunn kahirokunn force-pushed the support-cluster-inventory branch from 1cdd0cd to 4b70bff Compare February 14, 2026 19:14
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 42 out of 43 changed files in this pull request and generated 2 comments.

Comment thread charts/headlamp/templates/deployment.yaml
Comment thread charts/headlamp/values.schema.json
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 43 out of 44 changed files in this pull request and generated 1 comment.

Comment thread frontend/src/components/App/Home/ClusterTable.tsx Outdated
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 44 out of 45 changed files in this pull request and generated no new comments.

@kahirokunn
Member Author

kahirokunn commented May 10, 2026

@illume Could you please review this again when you have a moment?
We plan to use this feature in the Headlamp plugin for Open Cluster Management, which is scheduled to begin development.
Thank you 🙏

@kahirokunn
Member Author

@illume Hi🥺

@illume
Contributor

illume commented May 14, 2026

Hey hey.

Very cool! Looks like very excellent work.

Yes, sorry, I will have a look in the next few days. I have to learn what the Cluster Inventory API is too.

From a quick browse, I see it uses CRDs. Can you please add a note to the PR description about why it should be in Headlamp rather than in a plugin? It looks like it uses a backend component, and it's an official k8s API? (Two good reasons to include it in Headlamp rather than in a plugin.)

Comment thread frontend/src/components/App/Home/ClusterTable.tsx Outdated
@kahirokunn
Member Author

Thank you for taking a look! I have added "Why This Belongs in Headlamp" to the body of the PR.
I have also addressed the points raised in your review and resolved the conflicts.

@kahirokunn
Member Author

/retest

@kahirokunn
Member Author

I pushed again to re-run GHA.

Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 44 out of 45 changed files in this pull request and generated no new comments.

Contributor

@illume illume left a comment


In reply to @kahirokunn: "Thank you for taking a look! I have added "Why This Belongs in Headlamp" to the body of the PR. I have also addressed the points raised in your review and resolved the conflicts."

Thanks for this!!

I left a few notes about some code organisation.

After reading the Cluster Inventory project a bit more:

  • It looks like Cluster Inventory is in an alpha/experimental state so far. Because of this, I think we need to make sure it's marked that way for Headlamp users. When the project reaches a beta stage, we can mark it as beta in Headlamp. How does that sound?
  • Can you please explain in the PR description how Headlamp behaves in clusters where Cluster Inventory is not installed?
  • I haven't reviewed the newly imported Go code yet. I need to understand how it affects the threat model for Headlamp. If you can, please add a note to the PR description; otherwise I will look into it.

Comment thread backend/go.mod
Comment thread backend/pkg/kubeconfig/kubeconfig.go
Comment thread backend/pkg/kubeconfig/watcher.go
Comment thread charts/headlamp/README.md
Comment thread charts/headlamp/values.schema.json Outdated
Comment thread backend/cmd/headlamp_test.go
Comment thread backend/pkg/headlampconfig/headlampConfig.go
Comment thread backend/pkg/kubeconfig/kubeconfig.go Outdated
Comment thread frontend/src/components/App/Home/ClusterTable.tsx Outdated
Comment thread backend/pkg/config/config.go
@kahirokunn
Member Author

@illume Hi, Thank you for your review 🙏 could you please review this again? Thank you 🙏

Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 55 out of 56 changed files in this pull request and generated no new comments.

@kahirokunn
Member Author

I rebased with the latest main.


Labels

cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.


5 participants