@usize usize commented Nov 6, 2025

This PR introduces the initial “How” section for the Egress Gateways proposal.

It defines a clear model for routing outbound traffic through Kubernetes Gateways and describes how policy and extension points can be applied for both general egress and AI-specific use cases.

It also opens discussion on two important questions:

  • integration with existing Gateway API resources versus creating new ones.
  • support for policies with varying levels of granularity without relying on known anti-patterns.

linux-foundation-easycla bot commented Nov 6, 2025

CLA Signed

The committers listed above are authorized under a signed CLA.

@k8s-ci-robot
Contributor

Welcome @usize!

It looks like this is your first PR to kubernetes-sigs/wg-ai-gateway 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-sigs/wg-ai-gateway has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added the cncf-cla: no, size/L, and cncf-cla: yes labels and removed the cncf-cla: no label Nov 6, 2025
Author

usize commented Nov 6, 2025

cc @shaneutt @kflynn (since I tried incorporating your feedback directly)

@usize usize changed the title from "Egress gateways how proposal" to "docs: Begin laying the groundwork for a 'how' section regarding egress gateways." Nov 6, 2025
# name: egress-client-cert
extensions:
- name: inject-credentials
  type: CredentialInjector
Contributor

what is this type referring to?
do you assume a predefined catalog of extensions?
can I “bring my own” payload processor?

Good question. We probably want a set of built-in/standard processors, with the ability to add (prefixed) implementation-specific ones.

Author

`type` is the processor kind here. I think supporting a small catalog of processors would be great. And yes, users can bring their own processor (e.g. foo.bar.com/EntityRedactor:v1) by conforming to an extension interface.

The interface for extensions will need something like:

  • Phase(s) it operates on e.g. request-headers
  • failOpen/Closed (default to closed?)
  • priority
  • type specific config schema
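
A minimal sketch of how those fields might be captured and validated, in Python purely for illustration. `ExtensionDecl`, `PHASES`, and `is_implementation_specific` are hypothetical names, the phase list is borrowed from the example YAML in this PR, and treating a `/` in `type` as the marker for a prefixed, implementation-specific processor is an assumption, not something the proposal defines.

```python
from dataclasses import dataclass, field
from typing import Any

# Phase names copied from the example in this PR; the final set is undecided.
PHASES = {
    "request-headers", "request-body", "connect", "backend-request",
    "backend-response", "response-headers", "response-body",
}


@dataclass(frozen=True)
class ExtensionDecl:
    """Hypothetical declaration of one extension attached to a Backend."""
    name: str
    type: str                 # e.g. "CredentialInjector" or "foo.bar.com/EntityRedactor:v1"
    phase: str
    priority: int = 0
    fail_open: bool = False   # default closed, as floated in the comment above
    config: dict[str, Any] = field(default_factory=dict)  # type-specific config

    def __post_init__(self) -> None:
        # Reject phases outside the known set at declaration time.
        if self.phase not in PHASES:
            raise ValueError(f"unknown phase: {self.phase!r}")


def is_implementation_specific(decl: ExtensionDecl) -> bool:
    # Assumption: a domain-style prefix ("example.com/Kind:v1") marks a BYO processor.
    return "/" in decl.type
```

Validating the phase when the declaration is constructed means a typo is rejected up front rather than being silently ignored at request time.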

Author

The idea here was to reuse the approach that filters use on HTTPRoute, but without the rule match, since the Backend policy should apply to all traffic going to the backend and not depend on routing decisions.

### Conflict Resolution
When multiple policies influence the same request:
- **Specificity precedence**: Route > Backend > Gateway.
- **Same-level ties**: Implementations MUST use a deterministic tie-break (e.g., lexical name order) and surface status indicating the conflict.
Contributor

I think this should be rephrased.
The current phrasing leaves the tie-breaking question to the specific gateway implementation.
We need to define a deterministic order that applies to all implementations.

I know in Istio, we pick the oldest policy first to bias towards stability. Is it really a requirement to force all implementations to use the same mechanism? Is that feasible?

Contributor

I think a deterministic order should be defined one way or another; otherwise, switching between different Gateways might break the expected order.

I'm not talking about the mechanism, only about the expected result.

> otherwise switching between different Gateways might break expected order.

IMO, switching between Gateway implementations should not be a core design goal of our APIs. In my experience, this is a far more complicated process than most users ever want to take on. I don't feel super strongly, but I am wary about imposing these requirements on implementations w.r.t the tie-breaker


## Observability Considerations

- Implementations SHOULD expose metrics tagged by `{gateway, route, backend, namespace, serviceAccount}` and surface conditions (e.g., `Accepted`, `Programmed`, `Degraded`).
Contributor

Learning from past mistakes, I think we should phrase things we believe in as MUST.
Anything phrased as SHOULD may or may not be implemented, so it doesn't say a lot.

SHOULD seems appropriate for metrics IMO, but +1 on a MUST for status


## Next Steps

1. Decide on Gateway resource approach (reuse vs. new EgressGateway type)
Contributor

We should reuse as much as we can.
This should probably be stated clearly.

## Next Steps

1. Decide on Gateway resource approach (reuse vs. new EgressGateway type)
2. Define Backend resource schema with embedded policy rules
Contributor

By embedded policy rules, I assume you mean stating the policy (like a rate limit) in the CR?
What about use cases where the payload processing requires custom logic that cannot be expressed declaratively?

Author

I'd felt a bit fuzzy about having a catalog of policies versus embedding some policy directly into the CR. As per our discussion above, I've changed the proposal to suggest maintaining a catalog of policies that can be extended via extensionRef (like filters on HTTPRoute).

1. Decide on Gateway resource approach (reuse vs. new EgressGateway type)
2. Define Backend resource schema with embedded policy rules
3. Specify filter extension points for payload processing
4. Align with multi-cluster and agentic networking proposals
Contributor

@nirrozenbaum nirrozenbaum Nov 7, 2025

Completely agree that SIG MC should be involved. Having said that, I think it belongs in phase 2.
Phase 1 is being able to call an external service while staying within the scope of a single cluster.

extensions:
- name: inject-credentials
  type: CredentialInjector
  phase: request-headers # request-headers|request-body|connect|backend-request|backend-response|response-body|response-headers

I think we need to flesh out phases a bit more, especially in the security context. For example, some payload processing (most? all?) should probably not be done until after the auth step has established that the request is coming from a trusted peer, or until rate limiting has been applied (to guard against malicious or untrusted requests). How does an implementation think about that?


Option B implies defining equivalents of parentRefs, listeners, and route attachment; this is a significant fork from Gateway API and should be justified by clear need for an egress specific spec.

**Recommendation needed**: Feedback requested on whether the semantics justify a new resource or if Gateway reuse is sufficient.

I'm a +1 on continuing to use Gateway. It's a standard resource that's familiar to users and understood by implementations

hostname: api.openai.com
port: 443
tls:
  mode: SIMPLE # SIMPLE | MUTUAL

Should borrow from Gateway here for implementation specific TLS modes

port: 443
tls:
  mode: SIMPLE # SIMPLE | MUTUAL
  serverName: api.openai.com

nit: SNI may be more ubiquitous but don't feel strongly

config:
requestsPerMinute: 1000
```
Alternatively, policies MAY be separate CRDs (e.g., `BackendTLSPolicy`, `EgressPolicy`) with `spec.targetRef: Backend`, avoiding schema growth on `Backend`.

I definitely see the schema growth issue...does something like Envoy's typed config model help out a bit? Give implementations a way to dynamically instantiate a policy based on well-known type (urls?)?
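
One way to read the typed-config idea, sketched in Python under loud assumptions: a registry keyed by a well-known type name, where each entry validates its own `config` block so the `Backend` schema itself does not grow. The type name `example.com/RateLimit:v1` and the helper names `register` and `validate` are hypothetical; only `requestsPerMinute` comes from the example in this PR.

```python
from typing import Any, Callable

Validator = Callable[[dict[str, Any]], None]

# Maps a well-known type name to the function that validates its config.
_REGISTRY: dict[str, Validator] = {}


def register(type_name: str) -> Callable[[Validator], Validator]:
    """Decorator that registers a config validator for one policy type."""
    def wrap(fn: Validator) -> Validator:
        _REGISTRY[type_name] = fn
        return fn
    return wrap


@register("example.com/RateLimit:v1")  # hypothetical type name
def _validate_rate_limit(config: dict[str, Any]) -> None:
    rpm = config.get("requestsPerMinute")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(rpm, bool) or not isinstance(rpm, int) or rpm <= 0:
        raise ValueError("requestsPerMinute must be a positive integer")


def validate(type_name: str, config: dict[str, Any]) -> None:
    """Dispatch to the validator registered for type_name."""
    try:
        validator = _REGISTRY[type_name]
    except KeyError:
        raise ValueError(f"unsupported policy type: {type_name}") from None
    validator(config)
```

An implementation would reject unknown types outright, which is also where a status condition about unsupported kinds could be raised.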

Client traffic flows through the egress gateway directly to an external endpoint (FQDN or IP). The gateway applies policies and routing logic before forwarding to the destination.

### Parent Mode
Client traffic flows through a local egress gateway to an upstream gateway before reaching the final endpoint. This enables gateway chaining for multi-cluster or multi-zone topologies. The local egress gateway treats the parent as a single upstream. Local retries are limited to establishing the parent connection. Request-level retries are performed by the parent. Implementations MUST prevent retry loops across gateways.

More details are going to be needed here, I think, especially around communication between the gateway and the client, best practices for ensuring traffic goes through the gateway, etc.

When multiple policies influence the same request:
- **Specificity precedence**: Route > Backend > Gateway.
- **Same-level ties**: Implementations MUST use a deterministic tie-break (e.g., lexical name order) and surface status indicating the conflict.
- **Same-level ties**: Implementations MUST use a deterministic lexical name order tie-break and surface status indicating the conflict.

If we're going to be prescriptive on tie-breaker, I think taking the oldest resource is an approach that biases more towards cluster stability. You'll never have a stable system become disrupted by a new, conflicting resource. Granted, I'm biased because Istio does this (less work for me 😄 ), but I figured I'd mention it

Author

Sorry about that. I re-read your original comment thread and saw that I'd missed that. I'm fine with this. Lexical name order was just the first thing I glommed onto.

After thinking about it, I'm leaning toward specifying a tie-break because now is our best opportunity to do it IMO. Trying to do it later, after a few implementations spring up, would be more painful.
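
To make the options in this thread concrete, here is a rough Python sketch of one possible ordering: specificity first (Route > Backend > Gateway, from the proposal text), then oldest resource first as suggested above, with namespace and name as a final tie-break. `PolicyRef`, `precedence_key`, and `resolve` are hypothetical names, and falling back to namespace/name when timestamps are equal is an assumption, not something the thread has settled.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Lower rank wins: Route > Backend > Gateway.
SPECIFICITY = {"Route": 0, "Backend": 1, "Gateway": 2}


@dataclass(frozen=True)
class PolicyRef:
    level: str          # "Route", "Backend", or "Gateway"
    namespace: str
    name: str
    created: datetime   # stands in for metadata.creationTimestamp


def precedence_key(p: PolicyRef):
    # Oldest first biases toward stability; namespace/name settles exact
    # timestamp ties (assumption).
    return (SPECIFICITY[p.level], p.created, p.namespace, p.name)


def resolve(policies):
    """Return (winner, losers); losers are candidates for a conflict status."""
    if not policies:
        raise ValueError("at least one policy is required")
    ordered = sorted(policies, key=precedence_key)
    return ordered[0], ordered[1:]


# Illustrative data only.
t0 = datetime(2025, 11, 1, tzinfo=timezone.utc)
t1 = datetime(2025, 11, 2, tzinfo=timezone.utc)
gw = PolicyRef("Gateway", "infra", "default", t0)
newer = PolicyRef("Backend", "apps", "aaa-newer", t1)
older = PolicyRef("Backend", "apps", "zzz-older", t0)

winner, losers = resolve([gw, newer, older])
print(winner.name)  # zzz-older: the older Backend policy wins despite its name
```

With a purely lexical tie-break, `aaa-newer` would have displaced the existing policy, which is the disruption the oldest-first rule is meant to avoid.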

@usize usize force-pushed the egress-gateways-how-proposal branch from 0646815 to d99dab1 November 14, 2025 04:19
Member

@shaneutt shaneutt left a comment

Sorry for the late review; KubeCon was last week.

This is a very thoughtful first pass at how we might implement egress. I have a few comments asking for some additional provisions regarding other paths we might take, other considerations we need to work through as we move forward, but once those are accounted for this seems like a great next step.

/approve

> highly focused sections as much as possible to help make things easier to
> read and review. Long, unbroken walls of code and YAML in this document are
> not advisable as that may increase the time it takes to review.
1. Resource model using Gateway + HTTPRoute with a Backend for destinations (Service or FQDN).
Member

There are other ways to implement this (for instance, Linkerd allows for using Gateway, but also attaching to the mesh network as a parentRef to configure different kinds of egress) that we're aware of. I think starting with this is good, but we'll need to leave an open door for other resource models.

- Introduce dedicated `EgressGateway` resource type
- Enables egress-specific fields (e.g., global CIDR allow-lists) without policy attachment overhead
- Clearer separation of ingress vs egress concerns

Member

I would add another Alternatives Considered here for the Mesh resource (experimental, currently) as we know that Linkerd at least already does something like this, allowing egress from the sidecars, as opposed to the GW.

failOpen: true
config:
requestsPerMinute: 1000
```
Member

I'm inclined to not fuss over this API right now, rather: it would be nice if we could put together a working prototype to test with and inform our decisions about how this API should work.


Additional processors may be defined. They MUST declare the following fields:

- phase: one of {request-headers, request-body, connect, backend-request, backend-response, response-headers, response-body}
Member

👍

- phase: one of {request-headers, request-body, connect, backend-request, backend-response, response-headers, response-body}
- priority: integer. (Lower runs first within the same phase).
- failOpen: boolean. Default false (closed).
- preAuth: boolean. Default false. (trusted-peer context unavailable before authorization)
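
A minimal sketch, in Python for illustration only, of how `priority` and `failOpen` could interact within a single phase. `Processor`, `RequestRejected`, and `run_phase` are hypothetical names, the request is modeled as a plain dict, and breaking priority ties by name is an assumption. `preAuth` is left out because its semantics are still open.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Processor:
    name: str
    priority: int                     # lower runs first within the same phase
    fail_open: bool                   # False means fail closed (the default)
    handle: Callable[[dict], dict]    # takes and returns a request-like dict


class RequestRejected(Exception):
    """Raised when a fail-closed processor errors."""


def run_phase(processors, request: dict) -> dict:
    # Sort by priority, then by name so the order is deterministic (assumption).
    for p in sorted(processors, key=lambda p: (p.priority, p.name)):
        try:
            request = p.handle(request)
        except Exception as err:
            if p.fail_open:
                continue  # skip the failed processor; the request passes through unchanged
            raise RequestRejected(f"{p.name} failed closed") from err
    return request
```

A fail-open processor that errors is skipped and the request continues with whatever earlier processors produced; a fail-closed one aborts the request.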
Member

Good callout. More to do to figure out how this is going to work ofc, but very important callout.

Comment on lines +276 to +287
extensions:
- name: pii-detector
  type: acme.example.com/PIIDetector:v1
  phase: request-body
  priority: 20
  failOpen: false
  preAuth: true
  config:
    modelRef: pii-detect-small
    redactionStyle: delete
    confidenceThreshold: 0.7
    maxBodyBytes: 2097152
Member

I'm not certain yet whether we want to add these extensions/filters/policies at the Backend level, as it seems that a pii-detector could be applied in an egress, ingress, or mesh context (in theory), so it seems more natural that these would apply to the HTTPRoute? 🤔

I think we can play with this, but before we merge let's add a comment pointing out the alternatives that we need to consider still.

Author

@usize usize Nov 18, 2025

I didn't put a great deal of effort into the example use case. I was focusing more on what adding a filter would look like. I'd like to brainstorm some good Backend-specific cases.

1. Define Backend resource schema.
2. Specify default Backend policies e.g. CredentialInjector and QoSController.
3. Specify filter extension points for payload processing
4. Align with multi-cluster and agentic networking proposals
Member

Depending on who's feeling froggy, I could see us saying we want to prototype this a bit as a next step.

Author

@usize usize Nov 18, 2025

I've proposed setting aside time for myself to do just that as a part of planning for ${MY_JOB}. 😁

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: shaneutt, usize

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved label Nov 18, 2025
usize and others added 2 commits November 18, 2025 11:37
'Filter' is a more standard term, which matches the proposed approach.

Co-authored-by: Shane Utt <[email protected]>
Signed-off-by: usize <[email protected]>
@usize usize force-pushed the egress-gateways-how-proposal branch from 18c0800 to 2a8cebc November 18, 2025 19:37
@usize usize requested a review from shaneutt November 18, 2025 19:41
Comment on lines +173 to +174
Controllers MUST publish the set of supported processor kinds and versions for a GatewayClass via `GatewayClass.status.parametersRef` or an implementation-specific status e.g. `GatewayClass.status.supportedExtensionKinds`.

Contributor

what if the list is VERY long?
e.g., do you still expect to see it in status if there are 1k supported processors?

Author

@usize usize Nov 20, 2025

Based on wg meeting:

  • Maybe we should reference an external source of truth instead of trying to keep the whole catalog in a status.
  • Users can also configure a pointer to their own external catalogs in this way.
  • But an external link can also add complexity for users via indirection.
  • Maybe status could link to a ConfigMap (probably easy to prototype and likely durable).


Additional processors may be defined. They MUST declare the following fields:

- phase: one of {request-headers, request-body, connect, backend-request, backend-response, response-headers, response-body}
Contributor

Does it make sense to add more details on these phases?
What does each phase mean? When does it happen?

Author

wg meeting notes:

  • for early stage design / proof of concept, maybe we should be less strict about phases.
  • we should gather feedback from Gateway API and potential users before defining it.

usize and others added 2 commits November 19, 2025 08:10
Co-authored-by: Nir Rozenbaum <[email protected]>
Co-authored-by: Nir Rozenbaum <[email protected]>
### Parent Mode
Client traffic flows through a local egress gateway to an upstream gateway before reaching the final endpoint. This enables gateway chaining for multi-cluster or multi-zone topologies. The local egress gateway treats the parent as a single upstream. Local retries are limited to establishing the parent connection. Request-level retries are performed by the parent.

Operators MUST use network policy or sidecar/egress proxy configuration to deny direct egress from workloads and force all outbound traffic to the Gateway.

Does this mean application pods need to be egress gateway-aware? (in lieu of a service mesh/sidecar approach?)

Author

@usize usize Nov 21, 2025

It does not. The expectation is that, without a service mesh, operators will configure NetworkPolicy or CNI routing to send traffic through the gateway without workloads needing to be aware of it.

- phase: one of {request-headers, request-body, connect, backend-request, backend-response, response-headers, response-body}
- priority: integer. (Lower runs first within the same phase).
- failOpen: boolean. Default false (closed).
- preAuth: boolean. Default false. (trusted-peer context unavailable before authorization)

name: acme-piidetector-v1-schema
namespace: gateway-system
data:
schema.json: |
Author

We need to make it clear that we can support complicated dynamic use cases, e.g. pattern matching beyond what can be described by a regular expression.
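
As one illustration of a check that is impractical to express as a regular expression alone, here is a small Python sketch that finds card-number candidates with a regex and then confirms them with the Luhn checksum before redacting. This is not the PIIDetector from the example; `CANDIDATE`, `luhn_valid`, and `redact_card_numbers` are hypothetical names chosen for the sketch.

```python
import re

# Candidate runs of 13 to 19 digits, optionally separated by spaces or hyphens.
CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits: str) -> bool:
    """Luhn checksum: double every second digit from the right, sum, mod 10."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def redact_card_numbers(text: str) -> str:
    """Redact only those candidates that pass the checksum."""
    def replace(match: re.Match) -> str:
        digits = re.sub(r"[ -]", "", match.group())
        return "[REDACTED]" if luhn_valid(digits) else match.group()
    return CANDIDATE.sub(replace, text)
```

The regex only narrows the search; the arithmetic check decides, which is the kind of logic a purely declarative pattern field would struggle to carry.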

@DamianSawicki DamianSawicki left a comment

This PR describes egress gateway as a reverse proxy with static Backends. Another approach is a dynamic forward proxy, and we see a huge demand for it. Since the filename and the Why? section are general, I'd suggest splitting How? into subsections corresponding to different models, with the proposals from the present PR fitting into something like Reverse Proxy.

fqdn:
  hostname: api.openai.com
  port: 443
  tls:

Author

I'm open to using BackendTLSPolicy here.

I avoided it in the initial proposal because I’m not yet sure what it looks like to apply BackendTLSPolicy where backends are primarily intended to be external FQDNs rather than in-cluster Services.

That is, as of today, to use BackendTLSPolicy for an external FQDN we'd need to represent the FQDN via the creation of a synthetic Service. Since we're proposing the addition of a new Backend type, it seemed useful to try to avoid re-introducing that synthetic Service pattern.

The semantics may line up with minimal changes, but we need to investigate further.

I’ll add a note that we may want to reuse or align with BackendTLSPolicy once the shape of the egress Backend resource stabilizes.

Contributor

@nirrozenbaum nirrozenbaum left a comment

@usize @shaneutt
As I said in the last call, I have mixed feelings about this PR.
TLDR - This PR is mixing both egress gateway and payload processing.

More details:
In our WG we have currently identified two main use cases of interest.

  • Egress gateway - ideally we would like to be able to use HTTPRoute (or similar) to route traffic to a service outside of the cluster.
  • Payload processors as part of our request flow, ideally also using some Gateway-level objects.

This PR mixes both, which makes the proposal much more complex and also not very easy to implement.

Ideally both would work through one intuitive solution, but for an initial PoC and to start collecting feedback I think we should separate concerns.
That is, we should try to implement (or, in this PR, to specify the How for) only the egress use case, without getting into payload processing. Then we should implement a PoC (quick and dirty) and take it to the Gateway API meetings to collect feedback.

Each of the use cases is big by itself and deserves a separate discussion.

Author

usize commented Nov 23, 2025

> TLDR - This PR is mixing both egress gateway and payload processing.

@nirrozenbaum looking at the comments here, yes, I agree. I've tried to put a bit too much into this PR. Payload processing already has its own proposal. Some of this might fit better there, or in some other proposal that bridges payload processing and egress as a singular focus.

How about this. I'm going to trim it down and simplify in the direction you're suggesting (removing details around payload processing). I'll keep the feedback I've gotten so far, and start toward building a simple PoC.

That way we can generate a little forward momentum here without risking needing to backtrack all of it later.

@nirrozenbaum
Contributor

> > TLDR - This PR is mixing both egress gateway and payload processing.
>
> @nirrozenbaum looking at the comments here, yes, I agree. I've tried to put a bit too much into this PR. Payload processing already has its own proposal. Some of this might fit better there, or in some other proposal that bridges payload processing and egress as a singular focus.
>
> How about this. I'm going to trim it down and simplify in the direction you're suggesting (removing details around payload processing). I'll keep the feedback I've gotten so far, and start toward building a simple PoC.
>
> That way we can generate a little forward momentum here without risking needing to backtrack all of it later.

SGTM!
