docs: Add tool discovery proposal #227

Open
rubambiza wants to merge 2 commits into kubernetes-sigs:main from rubambiza:proposal-tool-discovery-upstream

@rubambiza
Contributor

Summary

  • Proposes optional, opt-in tool discovery for KAN: a controller that calls tools/list on MCP backends, validates schemas, and surfaces discovered tools in XBackend.status
  • Enables policy validation against real backend state, operator visibility via kubectl, and a foundation for response filtering
  • Discovery is advisory (does not affect policy enforcement) and opt-in via annotation
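
For illustration, a minimal sketch of the discovery step described above. It assumes MCP's JSON-RPC 2.0 framing for `tools/list` and a deliberately simple validity rule (a tool needs a non-empty `name` and an `inputSchema` of type `object`). The function names, the canned response, and the shape of the resulting entries are assumptions made for this sketch, not the proposal's API.

```python
import json


def build_tools_list_request(request_id, cursor=None):
    """Build a JSON-RPC 2.0 request for the MCP `tools/list` method."""
    request = {"jsonrpc": "2.0", "id": request_id, "method": "tools/list"}
    if cursor is not None:
        # MCP paginates list results with an opaque cursor.
        request["params"] = {"cursor": cursor}
    return request


def validate_tools(response):
    """Split the tools of a `tools/list` response into (valid, rejected).

    Hypothetical rule for this sketch: a tool is valid when it has a
    non-empty string name and an object-typed input schema.
    """
    valid, rejected = [], []
    for tool in response.get("result", {}).get("tools", []):
        name = tool.get("name") if isinstance(tool, dict) else None
        schema = tool.get("inputSchema") if isinstance(tool, dict) else None
        name_ok = isinstance(name, str) and name != ""
        schema_ok = isinstance(schema, dict) and schema.get("type") == "object"
        if name_ok and schema_ok:
            # Keep only what an operator would need to see in status.
            valid.append({"name": name, "description": tool.get("description", "")})
        else:
            rejected.append(tool)
    return valid, rejected


# Canned response standing in for a real backend (assumed data).
sample_response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "tools": [
            {
                "name": "get_weather",
                "description": "Look up weather",
                "inputSchema": {"type": "object", "properties": {"city": {"type": "string"}}},
            },
            {"name": "", "inputSchema": {"type": "object"}},  # rejected: empty name
        ]
    },
}

discovered, rejected = validate_tools(sample_response)
print(json.dumps(build_tools_list_request(1)))
print([tool["name"] for tool in discovered])
```

Only the validated entries would be candidates for `XBackend.status`; what happens to rejected entries (an event, a condition, or a log line) is left open here.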

Phasing

  • Phase 1: In-cluster discovery over plain HTTP with graceful failure for auth-required backends
  • Phase 2: External backends with TLS/authentication (deferred pending Backend spec stabilization)
  • Phase 3: AccessPolicy cross-referencing against discovered tools

Prior discussion

This proposal was initially drafted and reviewed on the fork PR (rubambiza#1) with feedback from @evaline-ju and @david-martin incorporated.

Test plan

  • Community review of proposal content and API design
  • Alignment on community consensus points (XBackend.status location, MCP client library, sync interval)

Signed-off-by: Gloire Rubambiza <gloire@ibm.com>
@netlify

netlify Bot commented Apr 6, 2026

Deploy Preview for kube-agentic-networking ready!

Name Link
🔨 Latest commit 6a59a19
🔍 Latest deploy log https://app.netlify.com/projects/kube-agentic-networking/deploys/69e99a98f65e37000802c668
😎 Deploy Preview https://deploy-preview-227--kube-agentic-networking.netlify.app

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Apr 6, 2026
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: rubambiza
Once this PR has been reviewed and has the lgtm label, please assign david-martin for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Apr 6, 2026
@k8s-ci-robot
Contributor

Hi @rubambiza. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work.

Tip

We noticed you've done this a few times! Consider joining the org to skip this step and gain /lgtm and other bot rights. We recommend asking approvers on your previous PRs to sponsor you.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Apr 6, 2026
@evaline-ju
Contributor

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Apr 7, 2026
Signed-off-by: Gloire Rubambiza <gloire@ibm.com>
@LiorLieberman
Member

/cc

Contributor

@david-martin david-martin left a comment

Given the XBackend resource won't be in v1alpha1, we should hold this for now.
I see the value in having a controller that can list and report back tools from MCP servers for more than AccessPolicy tool validation. For example, it can help with configuration validation, or with surfacing tools for informational purposes beyond auth.

I tend to like the standalone (extension?) approach, as making it a required piece could be a barrier to some agentic networking implementations, and it could be unwanted in some installations for various reasons (unwanted load, difficulty configuring auth for some MCP servers).

/hold


## Non-Goals

- **`tools/list` response filtering.** Filtering the `tools/list` response

Good to call this out.
I wonder whether we should be more explicit to avoid confusion, as there are two identities at play here:
the identity used when this new controller calls tools/list to populate the backend status,
and the identity used when an agent/user calls tools/list.
I see the latter tools/list filtering as dynamic at list time, performing an authZ function, and driven by the rules in AccessPolicy.
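
To make the distinction concrete, here is a rough sketch of the second case: filtering a tools/list response at list time for a specific caller. The policy shape (a mapping from caller identity to allowed tool names) and the identity strings are simplifications assumed for this example; they are not the AccessPolicy schema.

```python
def filter_tools_for_caller(tools, allowed_by_identity, caller):
    """Return only the tools that `caller` is authorized to see.

    Callers with no entry in the policy see nothing (deny by default).
    """
    allowed = allowed_by_identity.get(caller, set())
    return [tool for tool in tools if tool["name"] in allowed]


# What the backend returned for this request (assumed data).
backend_tools = [{"name": "get_weather"}, {"name": "delete_forecast"}]

# Hypothetical, flattened stand-in for AccessPolicy rules.
allowed_by_identity = {
    "agent-a": {"get_weather"},
    "agent-b": {"get_weather", "delete_forecast"},
}

for caller in ("agent-a", "agent-b", "unknown-agent"):
    visible = filter_tools_for_caller(backend_tools, allowed_by_identity, caller)
    print(caller, [tool["name"] for tool in visible])
```

The controller's own discovery call would not go through this path: it would see the full list under its own identity, which is why the two cases are worth naming separately.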

**Phase 1: In-cluster discovery (plain HTTP).** XBackend status extension,
discovery controller, schema validation, `list_changed` support, drift
detection. The controller connects to backends using the service name and port
from `XBackend.spec.mcp` over plain HTTP. This is sufficient for in-cluster

https?

5. **Updates** `XBackend.status.discoveredTools[]` with validated tools only.
6. **Listens** for `tools/list_changed` notifications on backends that declare
the `listChanged` capability, triggering immediate re-discovery.
7. **Falls back to polling** (configurable interval, default 30s) for backends

30s is probably too aggressive.
Maybe 5 minutes?
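
A small sketch of how the notification and polling paths could be selected per backend, assuming the interval stays configurable and using the 5-minute value suggested here as the default. It reads the `listChanged` flag from the server's declared `tools` capability. The function names and the returned plan shape are assumptions for illustration, as is the choice to skip polling entirely when notifications are available.

```python
# Assumption: the 5-minute value suggested in this comment, in seconds.
DEFAULT_POLL_INTERVAL_SECONDS = 300


def supports_list_changed(server_capabilities):
    """True when the backend declares the tools `listChanged` capability."""
    tools_capability = server_capabilities.get("tools") or {}
    return bool(tools_capability.get("listChanged", False))


def rediscovery_plan(server_capabilities, poll_interval_seconds=DEFAULT_POLL_INTERVAL_SECONDS):
    """Choose how the controller learns about tool changes for one backend."""
    if supports_list_changed(server_capabilities):
        # Re-discover when the backend sends a change notification.
        return {"mode": "notify", "poll_interval_seconds": None}
    # Otherwise fall back to periodic polling.
    return {"mode": "poll", "poll_interval_seconds": poll_interval_seconds}


print(rediscovery_plan({"tools": {"listChanged": True}}))
print(rediscovery_plan({"tools": {}}))
```

With many backends, the default matters mostly for servers that never declare `listChanged`, since those are the ones polled on every interval.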

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label May 7, 2026
An `XAccessPolicy` referencing a non-existent tool receives a Warning:

```yaml
apiVersion: agentic.prototype.x-k8s.io/v0alpha0
```

agentic.networking.x-k8s.io

### AccessPolicy Validation

The existing AccessPolicy reconciler is extended to cross-reference
`rules[].authorization.tools[]` against `XBackend.status.discoveredTools`:

Note the change to method-based matching in v1alpha1:

// +kubebuilder:validation:XValidation:rule="has(self.params) && self.params.size() > 0 ? self.name in ['prompts/get', 'tools/call', 'resources/subscribe', 'resources/unsubscribe', 'resources/read'] : true",message="Params can only be specified for get, call, subscribe, unsubscribe, or read methods"

We will need to account for that and update this section.
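
As a point of reference for that update, a minimal sketch of the cross-referencing as the proposal text currently describes it: tool names listed in a policy are compared against the discovered tools, and the missing ones are reported. It uses plain dictionaries with assumed field names and does not model the method-based matching in v1alpha1.

```python
def find_unknown_tools(policy_rules, discovered_tools):
    """Return policy tool names that are absent from the discovered tools.

    Names are returned in first-seen order, without duplicates.
    """
    discovered_names = {tool["name"] for tool in discovered_tools}
    unknown = []
    for rule in policy_rules:
        for tool_name in rule.get("authorization", {}).get("tools", []):
            if tool_name not in discovered_names and tool_name not in unknown:
                unknown.append(tool_name)
    return unknown


# Assumed data: two rules in a policy and one backend's discovered tools.
policy_rules = [
    {"authorization": {"tools": ["get_weather", "get_forecast"]}},
    {"authorization": {"tools": ["get_forecast"]}},
]
discovered_tools = [{"name": "get_weather"}]

for tool_name in find_unknown_tools(policy_rules, discovered_tools):
    # Advisory only: a warning is surfaced, enforcement is unchanged.
    print(f"Warning: tool {tool_name!r} not found in discovered tools")
```

Under method-based matching, the comparison would presumably apply only to rules whose method is `tools/call`, with the tool name taken from the rule's params; that mapping is not defined here.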


5 participants