Add v1alpha1 api#259
Conversation
|
Skipping CI for Draft Pull Request. |
✅ Deploy Preview for kube-agentic-networking ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
bdfd3c7 to
2ae1a91
Compare
|
/retest |
haiyanmeng
left a comment
There was a problem hiding this comment.
@LiorLieberman , can you add test coverage into https://github.com/kubernetes-sigs/kube-agentic-networking/tree/main/tests/cel and https://github.com/kubernetes-sigs/kube-agentic-networking/tree/main/tests/crd? Thanks!
| // +kubebuilder:validation:MinItems=1 | ||
| // +kubebuilder:validation:MaxItems=10 | ||
| // +listType=atomic | ||
| // +kubebuilder:validation:XValidation:rule="self.all(ref, ref.kind == self[0].kind)",message="All targetRefs must have the same Kind" |
There was a problem hiding this comment.
why
is not included in this PR?If we decide to keep supporting Backend-targeted AccessPolicy in the reference implementation, we should allow the target to be either Gatway or Backend.
There was a problem hiding this comment.
I think given there is no official support for XBackend, we should leave references to it out of the validation. When the support is official, then it makes sense to add it back in.
There was a problem hiding this comment.
Hi Haiyan, as we indicated before, the release is not going to include XBackend in it. Therefore Core support for targeting AccessPolicy is going to be only Gateway. However, we still want to allow room for extensibility for targeting any GVK (one example would be the protoype in this repo which will continue to target v0alpha0 backend) so can not have this validation.
There was a problem hiding this comment.
Should we validate the targetRef in control plane?
There was a problem hiding this comment.
TargetRef validation would be in each control plane, yes. And if invalid status will be posted.
There was a problem hiding this comment.
This is about not mixing targetRefs?
So all can be Gateways, or all can be XBackends, but not a mix of both?
There was a problem hiding this comment.
Correct. To start simple. I think this came out of a conclusion in extAuth discussion where Backend and Gateway could have been confusing to be together. This may not be required though anymore since we are not releasing this with XBackend support. I still think its smart to include (easier to relax this later)
| // 1. ExternalAuth runs before all other Allow policies. | ||
| // 2. If an ExternalAuth server denies the request, the request is denied. | ||
| // 3. If it allows the request, processing continues for all other allow policies for that target. | ||
| // 4. The request is allowed only if all allow policies allow it. |
There was a problem hiding this comment.
#220 proposes that the request is allowed if any allow policy allows it.
There was a problem hiding this comment.
While we can let ExternalAuth deny early. Having it allow early has several concerns.
External Auth servers typically handle global context (like JWTs or tokens), but they don't necessarily anything about local boundaries inside the cluster. There may be cases where we still need our downstream policies (like SPIFFE identity matching or MCP method whitelisting) to validate that workload A is actually authorized to communicate with workload B, after external auth callout.
Hope thats clear
There was a problem hiding this comment.
Looks like that proposal is still in discussion? v1alpha1 is getting cut with the current behavior -- it is still alpha and can change.
There was a problem hiding this comment.
I was going to ask for some clarification on a user/group scenario with multiple rules having to be allowed, but I think this discussion is moreso about rules across different policies?
Also, we don't have support for groups or spiffe wildcards right now? So that wouldn't be relevant here (which is probably a good motivation for CEL/OIDC)
There was a problem hiding this comment.
This is not what #220 proposes. allow policies means inline policies. I am trying to suggest that we define the rules here to address #237.
We currently propose Allow and ExternalAuth. We dont propose deny in this version given the consensus we got with other folks in the group. This rules are just outlining some guidance on how ExternalAuth and Allow should work across different policies (as @david-martin suggested)
David - right - I would really love to see CEL and OIDC support, hopefully you/gui can pick this up soon and we release it as part of 0.2.
(there is someone thats working on CEL right now, hopefully we incorporate this soon)
@LiorLieberman , this seems unexpected. PTAL. |
|
/retest |
2ae1a91 to
28c18ed
Compare
| // +kubebuilder:validation:MinItems=1 | ||
| // +kubebuilder:validation:MaxItems=10 | ||
| // +listType=atomic | ||
| // +kubebuilder:validation:XValidation:rule="self.all(ref, ref.kind == self[0].kind)",message="All targetRefs must have the same Kind" |
There was a problem hiding this comment.
I think given there is no official support for XBackend, we should leave references to it out of the validation. When the support is official, then it makes sense to add it back in.
| // 1. ExternalAuth runs before all other Allow policies. | ||
| // 2. If an ExternalAuth server denies the request, the request is denied. | ||
| // 3. If it allows the request, processing continues for all other allow policies for that target. | ||
| // 4. The request is allowed only if all allow policies allow it. |
There was a problem hiding this comment.
Looks like that proposal is still in discussion? v1alpha1 is getting cut with the current behavior -- it is still alpha and can change.
47b6384 to
1004ea0
Compare
1004ea0 to
d22ba83
Compare
This test is very flaky. This API should not cause this to fail. I think its is related to @ziyue-101 's last changes with TLS. |
|
/retest |
|
/lgtm But you need to get the tests to pass |
dee5b7d to
204c206
Compare
204c206 to
46a4e31
Compare
46a4e31 to
edcc468
Compare
7573f1b to
107d717
Compare
david-martin
left a comment
There was a problem hiding this comment.
Some questions so I can better understand the changes from 0 to 1.
In general though, it looks reasonable to me.
/approve
| // +kubebuilder:validation:MinItems=1 | ||
| // +kubebuilder:validation:MaxItems=10 | ||
| // +listType=atomic | ||
| // +kubebuilder:validation:XValidation:rule="self.all(ref, ref.kind == self[0].kind)",message="All targetRefs must have the same Kind" |
There was a problem hiding this comment.
This is about not mixing targetRefs?
So all can be Gateways, or all can be XBackends, but not a mix of both?
| ActionTypeAllow AccessPolicyActionType = "Allow" | ||
|
|
||
| // ActionTypeExternalAuth is used to identify that the request should be delegated to an external auth service if rules match. | ||
| ActionTypeExternalAuth AccessPolicyActionType = "ExternalAuth" |
There was a problem hiding this comment.
Is there a contradiction here?
the request should be delegated to an external auth service if rules match
and
Evaluation logic:
// 1. ExternalAuth runs before all other Allow policies.
There was a problem hiding this comment.
This means that if action: ExternalAuth and any rule (defined under the "rules" field) match - it goes to the externalAuth server for evaluation. If action: Allow and any rule (defined under the "rules" field) match - the evaluation of this is ALLOW.
Regardless of that, Evaluation logic between policies on the same target works in a way that if there is ExternalAuth, it before all other Allow policies. If the ExternalAuth server denies it - request is denied. If it allows it, you still need all other policies to allow it (unless they dont exist).
A concrete example:
action: extAuth
rules:
- match path /foo
target: Gateway1
action: Allow
rules:
- match header bar
target: Gateway1
If the request is "/foo", ExtAuthServer allows, and header bar is missing -- you still get the request denied.
| // matched by this rule. A request originating from a pod associated with | ||
| // this Service Account will match the rule. | ||
| // | ||
| // The Service Account listed here is expected to exist within the same |
There was a problem hiding this comment.
by trust domain, do we mean same kubernetes cluster, or same namespace?
There was a problem hiding this comment.
its configurable. Often times it would be a cluster identifier, but its not strictly required.
Here is how it could commonly be: spiffe://<trust_domain>/ns/<namespace>/sa/<serviceaccount>
| // 1. ExternalAuth runs before all other Allow policies. | ||
| // 2. If an ExternalAuth server denies the request, the request is denied. | ||
| // 3. If it allows the request, processing continues for all other allow policies for that target. | ||
| // 4. The request is allowed only if all allow policies allow it. |
There was a problem hiding this comment.
I was going to ask for some clarification on a user/group scenario with multiple rules having to be allowed, but I think this discussion is moreso about rules across different policies?
Also, we don't have support for groups or spiffe wildcards right now? So that wouldn't be relevant here (which is probably a good motivation for CEL/OIDC)
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: david-martin, LiorLieberman The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
| generate: manifests deepcopy register clientsets ## Generate manifests, deepcopy code, and clientsets. | ||
|
|
||
| # TODO: Remove the python post-processing patch once XAccessPolicy v1alpha1 is fully implemented | ||
| # and flipped to `served: true` instead of v0alpha0. |
There was a problem hiding this comment.
do we have a tracking issue for this?
There was a problem hiding this comment.
Planning to fast follow after I add ref implementation support for the new field structure (i.e moving the support to support the v1alpha1 from v0alpha0)
|
/hold pending some q's |
|
|
||
| // spec defines the desired state of AccessPolicy. | ||
| // +required | ||
| Spec AccessPolicySpec `json:"spec"` |
There was a problem hiding this comment.
the k8s CRD convention naming would be something like this
Spec XAccessPolicySpec `json:"spec"`
I wonder why we have two different naming here?
There was a problem hiding this comment.
Good question. Looks like we have not done this in the Gateway community. I have no strong opinion, but its going to be more nice code wise to not call v1alpha1.XAcessPolicySpec. The X prefix serves primarily the purpose of safe upgrades to standard channel and to explicitly indicate that this is an experimental resource
https://github.com/kubernetes-sigs/gateway-api/blob/main/apisx/v1alpha1/xbackendtrafficpolicy_types.go
https://github.com/kubernetes-sigs/gateway-api/blob/main/apisx/v1alpha1/xmesh_types.go
|
|
||
| const ( | ||
| // ActionTypeAllow is used to identify that the request should be allowed if rules match. | ||
| ActionTypeAllow AccessPolicyActionType = "Allow" |
There was a problem hiding this comment.
Shouldn't this be "LocalAuth" when the other Value of Enum is "ExternalAuth"?
There was a problem hiding this comment.
The idea is that actions, will evolve to add Deny in the future as well. We already got some usecases that suggests deny will be needed.
Then you will have 3 actions. (1) this policy is an ALLOW policy, (2) this policy is a delegate to an external server and (3) this policy is a deny policy (if we ends up adding it in future releases)
| // | ||
| // Implementations SHOULD return a regular HTTP formatted response if the policy is enforced against non-MCP traffic. | ||
| // Implementations MAY return a JSON-RPC formatted response if the policy is enforced against MCP traffic. | ||
| // +kubebuilder:validation:XValidation:rule="self.action == 'ExternalAuth' ? has(self.externalAuth) : true",message="externalAuth must be specified when action is set to 'ExternalAuth'" |
There was a problem hiding this comment.
I am a bit confused. When the AccessPolicyActionType is ExternalAuth, does the external server do both the authentication and autherization? If so, does it mean that the AccessRule field is not needed in that case?
Also, how does the external server do Authentication? What identity does the controller use to talk to the external auth sever?
| } | ||
|
|
||
| // MCPAttributes defines the protocol-specific attributes for MCP authorization. | ||
| type MCPAttributes struct { |
There was a problem hiding this comment.
Does this imply that the rules are basically based on MCP tool name and parameters?
There was a problem hiding this comment.
for now yes. The group discussed to add additional support for "Inline" ways to specify other requests attributes like path, headers, etc.
That said, once CEL support is added you will be able to do more granular things with CEL. You would have AuthorizationRuleTypeCEL (in addition to the AuthorizationRuleTypeInline). See this slide for the general idea
…cel crd tests environment
107d717 to
e87c703
Compare
| .PHONY: manifests | ||
| manifests: controller-gen ## Generate CustomResourceDefinition objects. | ||
| $(CONTROLLER_GEN) rbac:roleName=manager-role crd paths="./api/..." output:crd:artifacts:config=k8s/crds | ||
| python3 -c "p='k8s/crds/agentic.networking.x-k8s.io_xaccesspolicies.yaml'; text=open(p).read(); parts=text.split(' - name: v1alpha1'); parts[1]=parts[1].replace(' served: true', ' served: false', 1); open(p,'w').write(' - name: v1alpha1'.join(parts))" |
There was a problem hiding this comment.
you should put this in a hack/script.py. (follow up PR)
|
/lgtm Let's get an initial version in and iterate |
|
/unhold |
What type of PR is this?
What this PR does / why we need it:
Add v1alpha1 API
Move ExternalAuth to top level and add more native mcp support as discussed.
Which issue(s) this PR fixes:
Relates to #255
Does this PR introduce a user-facing change?: