feat: add dynamic authorization using CEL in AccessPolicy#241
Conversation
✅ Deploy Preview for kube-agentic-networking ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
|
Welcome @abhipatnala! |
|
Hi @abhipatnala. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with Regular contributors should join the org to skip this step. Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
@abhipatnala , Can you fix this? |
4d167a1 to
972a554
Compare
|
This is a great and promising start. Do you have specific instructions for one who wants to try this PR? |
| case agenticv0alpha0.AuthorizationRuleTypeCEL: | ||
| if rule.Authorization.CEL != nil { | ||
| env, err := cel.NewEnv( | ||
| cel.Variable("request", cel.MapType(cel.StringType, cel.AnyType)), |
There was a problem hiding this comment.
I guess without defining an identity (or principal) variable as well, users can use rules.source to condition a particular CEL expression based on exact matching properties of the identity – to the extent of what source supports today (a SPIFFE ID, a Service Account name/namespace). Was that the thinking?
There was a problem hiding this comment.
@guicassolato I thought we use source field in yaml for identity matching and CEL for dynamic payload matching. Exposing identity directly in CEL is a great idea for a follow-up.
| } | ||
| ast, issues := env.Compile(rule.Authorization.CEL.Expression) | ||
| if issues != nil && issues.Err() != nil { | ||
| klog.Errorf("Failed to compile CEL expression %q: %v", rule.Authorization.CEL.Expression, issues.Err()) |
There was a problem hiding this comment.
We should update the status of the XAccessPolicy resource to reflect this failure.
cc @guicassolato We should merge #236 before this PR.
There was a problem hiding this comment.
@abhipatnala ended up validating the CEL expression in pkg/controller. So we no longer need to merge #236 before this PR.
|
Please extend https://github.com/kubernetes-sigs/kube-agentic-networking/tree/main/tests/crd for this PR |
|
Please extend https://github.com/kubernetes-sigs/kube-agentic-networking/tree/main/tests/cel for this PR |
|
/ok-to-test |
|
Please add e2e test coverage into https://github.com/kubernetes-sigs/kube-agentic-networking/tree/main/tests/e2e |
9537973 to
bc87664
Compare
…ching - Move `celEnv`, `GetCelEnv` and `CompileCelExpression` from `accesspolicy.go` to a new file `cel.go` within the `translator` package to clean up code organization. - Implement a thread-safe `sync.RWMutex` cache for compiled `cel.Ast` expressions to avoid duplicate compilation overhead between the controller validation and Envoy filter rendering. - Add comprehensive unit test coverage in `cel_test.go` verifying singleton initialization, macro expansion (`request.mcp.tool_name`), syntax errors, and concurrent safety.
4a47649 to
dfe0d54
Compare
LiorLieberman
left a comment
There was a problem hiding this comment.
Thanks for updating the description. one last request.
There was a problem hiding this comment.
Can we get another e2e test(s) that demostrate startsWith and/or other CEL expressions?
dfe0d54 to
7241db9
Compare
7241db9 to
dbb9107
Compare
|
Thanks @abhipatnala ! |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: abhipatnala, LiorLieberman The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
- Update k8s/crds/agentic.networking.x-k8s.io_xaccesspolicies.yaml manifest via make generate. - Fix gofumpt trailing blank newlines at EOF in accesspolicy.go and cel_test.go. - Synchronize API group version and target refs to 'agentic.networking.x-k8s.io' in five new YAML test data manifests to fix E2E and CRD validation pre-submits. - Extend E2E tests to verify native Envoy CEL variables (method, path, headers) are correctly enforced alongside custom macros.
dbb9107 to
beb5b4f
Compare
|
/lgtm |
| ) | ||
|
|
||
| const ( | ||
| celVarRequestMCPToolName = "request.mcp.tool_name" |
There was a problem hiding this comment.
I think this conflict with existing CEL dictionaries that have different meaning. I suggest the following naming to help disambiguate.
mcp.request.method
mcp.request.tool_name (valid if mcp.request.method == tools/call)
http.request.authority
http.request.method
http.request.path (without query)
http.request.query
http.request.headers (we will need to disambiguate how multivalue headers are handled, since this is a security sensitive issue).
| // 1. Validate variables in the original expression to ensure they are on the allowlist | ||
| matches := variableRegex.FindAllString(expression, -1) | ||
| for _, match := range matches { | ||
| if !isValidVariable(match) { | ||
| return nil, fmt.Errorf("unsupported CEL variable: %s", match) | ||
| } | ||
| } |
There was a problem hiding this comment.
I'm not sure about enforcing this validation in the control plane.
Because request was defined as a cel.Map, the validation would block perfectly valid expressions such as request["path"], to favour only the syntactically sugary version request.path. This will obviously limit more advanced use cases such as request[request.headers["x-dyn-property"]].
The validation also blocks macros on the request map such as request.all(…), request.exists(…), etc.
What type of PR is this?
/kind feature
What this PR does / why we need it:
This PR introduces CEL-based dynamic authorization for XAccessPolicy. To ensure security and compatibility with Envoy, the controller enforces a strict allowlist of supported CEL variables.
Currently supported variables:
Custom MCP Variables:
request.mcp.tool_name (e.g., request.mcp.tool_name == 'get_weather' or request.mcp.tool_name.startsWith('read_'))
Standard Envoy HTTP Attributes:
request.path
request.url_path
request.method
request.host
request.headers
request.time
How it Works (Validation & Translation)
To provide a seamless developer experience while maintaining strict security, this implementation uses a highly optimized, single-pass compilation architecture:
Validation (Fail-Fast): During reconciliation, the controller parses the user's raw CEL string and extracts all request.* identifier chains. It strictly checks these against the allowlist above. If a user attempts to use an unsupported variable (e.g., request.unrecognized), the controller immediately rejects the policy with a clear error in the XAccessPolicy status.
Translation: Valid custom variables (specifically request.mcp.tool_name) undergo a macro replacement. The controller translates them into the underlying Envoy-native metadata path (metadata.filter_metadata['mcp_proxy'].params.name) populated by our upstream WASM filter. Standard attributes (like request.path) are passed through untouched.
Native Envoy Enforcement: The translated expression is compiled into a CEL Abstract Syntax Tree (AST) using a loosely-typed map environment. This perfectly matches the SelectExpr structure Envoy expects. The AST is injected into the proxy via xDS, allowing Envoy's native RBAC engine to evaluate the complex conditions at runtime.
Which issue(s) this PR fixes:
Fixes #52
Does this PR introduce a user-facing change?:
Yes