From 5b821c13f8412bd3d7cc9c9bd5dd087d35e1b684 Mon Sep 17 00:00:00 2001 From: shadialtarsha Date: Fri, 8 Nov 2024 14:13:41 +0100 Subject: [PATCH 1/6] first draft --- geps/gep-3440/index.md | 281 ++++++++++++++++++++++++++++++++++++ geps/gep-3440/metadata.yaml | 38 +++++ 2 files changed, 319 insertions(+) create mode 100644 geps/gep-3440/index.md create mode 100644 geps/gep-3440/metadata.yaml diff --git a/geps/gep-3440/index.md b/geps/gep-3440/index.md new file mode 100644 index 0000000000..0a80dbcff6 --- /dev/null +++ b/geps/gep-3440/index.md @@ -0,0 +1,281 @@ +# GEP-3440: Gateway API Support for gRPC Retries + +* Issue: [#3440](https://github.com/kubernetes-sigs/gateway-api/issues/3440) +* Status: Provisional + +## TLDR +This proposal introduces support for gRPC retries in the Gateway API, +allowing for configuration of retry attempts, backoff duration, and retryable status codes for gRPC routes. + +## Goals + +- To allow specification of gRPC status codes that should be retried. +- To allow specification of the maximum number of times to retry a gRPC request. +- To allow specification of the minimum backoff interval between retry attempts for gRPC requests. +- Retry configuration must be applicable to most known Gateway API implementations for gRPC. +- To define any interaction with configured gRPC timeouts and backoff. + +## Non-Goals + +- No standard APIs for advanced retry logic, such as integrating with rate-limiting headers. +- No default retry policies for all routes within a namespace or for routes tied to a specific Gateway. +- No support for detailed backoff adjustments, like fine-tuning intervals, adding jitter, or setting max duration caps. +- No retry support for streaming or bidirectional APIs (may be considered in future proposals). + +## Introduction + +To keep services reliable and resilient, a Gateway API implementation should be able to retry failed gRPC requests to +backend services before giving up and returning an error to clients. 
+ +Retries are helpful for several key reasons: +1. **Network failures**: Network issues can often cause temporary errors. Retrying a request helps to mitigate these +intermittent problems. +2. **Server-side failures**: Servers may fail temporarily due to overload or other issues. +Retrying allows requests to succeed once these conditions are resolved. +3. **Recovery from Temporary Errors**: Certain errors, like "Unavailable" or "resource-exhausted" are often short-lived. +Retrying can allow the request to complete once these issues clear up. + +This proposal aims to establish a streamlined, consistent API for retrying gRPC requests, covering essential +functionality in a way that is broadly applicable across implementations. + +## Background on implementations + +Researching how different Gateway API implementations handle retries for gRPC requests. + +### Envoy +Envoy supports retries for gRPC requests using the `retry_policy` field in the `route` configuration of the HTTP filter. +`retry_on` specifies the gRPC status codes that should trigger a retry by using `x-envoy-retry-grpc-on`, +and it supports a few built-in status codes like: +- `cancelled`: Envoy will attempt a retry if the gRPC status code in the response headers is “cancelled”. +- `deadline-exceeded`: Envoy will attempt a retry if the gRPC status code in the response headers is “deadline-exceeded”. +- `internal`: Envoy will attempt a retry if the gRPC status code in the response headers is “internal”. +- `resource-exhausted`: Envoy will attempt a retry if the gRPC status code in the response headers is “resource-exhausted”. +- `unavailable`: Envoy will attempt a retry if the gRPC status code in the response headers is “unavailable”. + +As with the `x-envoy-retry-grpc-on` header, the number of retries can be controlled via the `x-envoy-max-retries` header. + +By default, Envoy uses a fully jittered exponential backoff algorithm for retries. 
+This means that after a failed attempt, Envoy waits a random amount of time (with jitter) based on +an exponential growth pattern before trying again. +- **Default Timing**: The base interval starts at 25ms, and each subsequent retry can increase +this interval exponentially. By default, the maximum interval is capped at 250ms (10 times the base interval). +- **Per-Attempt Timeout (`per_try_timeout`)**: Envoy allows you to set a specific timeout for each retry attempt, +known as `per_try_timeout`. This timeout includes the initial request and each retry attempt. +If you don’t specify a `per_try_timeout`, Envoy uses the global route timeout for the total duration of the request. + +In the Gateway API, this `per_try_timeout` will be equivalent to the BackendRequest timeout in the GRPCRouteRule. +This ensures that each retry attempt, including the initial one, respects the overall timeout defined for the backend +request, preventing retries from extending beyond the desired duration. + +### Nginx +`ngx_http_grpc_module` in Nginx supports retries for gRPC requests using the `grpc_pass` directive. + +For gRPC requests, Nginx allows retries under certain conditions by forwarding requests to another server in +an upstream pool when the initial request fails. +The following configuration options are available to control when and how retries occur: +1. **Retry Conditions** (`grpc_next_upstream`): + Nginx can retry a request if certain issues are encountered, such as: + - Network errors (e.g., connection or read errors). + - Timeouts when establishing a connection or reading a response. + - Invalid headers if the server sends an empty or malformed response. + - Specific HTTP error codes (e.g., 500, 502, 503, 504, 429) can be configured as retryable for gRPC responses. + By default, Nginx only retries on network error and timeout, + but you can specify other conditions (like HTTP status codes) to expand retry options. +2. 
**Retry Limit by Time** (`grpc_next_upstream_timeout`): + You can set a total time limit for how long Nginx will attempt retries. + This limits the retry process to a specified time window, after which Nginx will stop attempting further retries. +3. **Retry Limit by Number** (`grpc_next_upstream_tries`): + You can set a maximum number of retry attempts for a request. + Once this limit is reached, Nginx will stop attempting further retries. +4. **Non-Idempotent Requests** (`non_idempotent`): + By default, Nginx does not retry non-idempotent requests (like POST or PUT) because they can cause side effects + if sent multiple times. However, you can enable retries for non-idempotent requests if needed. + +**Important Considerations**: +- **Partial Responses**: Nginx can only retry if no part of the response has been sent to the client. +If an error occurs mid-response, retries are not possible. +- **Unsuccessful Attempts**: Errors like `timeout` and `invalid_header` are always considered unsuccessful and will +trigger retries if specified, while errors like `403` and `404` are not retryable by default. + +### HAProxy +1. **Retry Conditions**: HAProxy can retry requests based on various network conditions +(e.g., connection failures, timeouts) and some HTTP error codes. While HAProxy does support gRPC via HTTP/2, it does not +have built-in support for handling specific gRPC error codes (like `Cancelled`, `Deadline Exceeded`). +It relies on HTTP-level conditions for retries, so its gRPC support is less granular than the GEP requires. +2. **Retry Limits**: HAProxy allows you to set a maximum number of retries for a request using the `retries` directive. +It also supports setting a timeout for the entire retry process using the `timeout connect` and `timeout server` directives. + +### Traefik +1. 
**Retry Conditions**: Traefik allows for retries based on HTTP-level conditions (e.g., connection errors and +certain HTTP status codes like 500, 502, 503, and 504), but it does not natively interpret specific gRPC error codes +like `UNAVAILABLE` or `DEADLINE_EXCEEDED`. This means that, while Traefik can retry requests on common HTTP errors +that might represent temporary issues, it lacks the ability to directly handle and retry based on +gRPC-specific error codes, limiting its alignment with the GEP’s requirement for granular gRPC error handling. +2. **Retry Limits**: Traefik provides configurable retry attempts and can set a maximum number of retries. However, +Traefik does not offer per-try timeout controls specific to each retry attempt. Instead, it typically relies on a +global request timeout, limiting the flexibility needed for more precise gRPC retry management (like Envoy’s `per_try_timeout`). + +## API +Having a dedicated API for gRPC retry conditions is necessary because gRPC uses +unique error codes (e.g., `UNAVAILABLE`, `DEADLINE_EXCEEDED`) that represent transient issues specific to its protocol, +which are not adequately covered by general HTTP status codes. gRPC also supports streaming and real-time communications, +making retry strategies more complex than those used for standard HTTP requests. Existing proxies like Envoy handle +gRPC retries with specialized logic, while other proxies rely on HTTP error codes, lacking the precision needed +for gRPC. + +### Go + +```go +type GRPCRouteRule struct { + // Retry defines the configuration for when to retry a gRPC request. + // + // Support: Extended + // + // +optional + // + Retry *GRPCRouteRetry `json:"retry,omitempty"` + + // ... +} + +// GRPCRouteRetry defines retry configuration for a GRPCRoute. +// +// Implementations SHOULD retry on common transient gRPC errors +// if a retry configuration is specified. 
+// +type GRPCRouteRetry struct { + // Reasons defines the gRPC error conditions for which a backend request + // should be retried. + // + // Supported gRPC error conditions: + // * "cancelled" + // * "deadline-exceeded" + // * "internal" + // * "resource-exhausted" + // * "unavailable" + // + // Implementations MUST support retrying requests for these conditions + // when specified. + // + // Support: Extended + // + // +optional + // + Reasons []GRPCRouteRetryCondition `json:"reasons,omitempty"` + + // Attempts specifies the maximum number of times an individual request + // from the gateway to a backend should be retried. + // + // If the maximum number of retries has been attempted without a successful + // response from the backend, the Gateway MUST return an error. + // + // When this field is unspecified, the number of times to attempt to retry + // a backend request is implementation-specific. + // + // Support: Extended + // + // +optional + Attempts *int `json:"attempts,omitempty"` + + // Backoff specifies the minimum duration a Gateway should wait between + // retry attempts, represented in Gateway API Duration formatting. + // + // For example, setting the `rules[].retry.backoff` field to `100ms` + // will cause a backend request to be retried approximately 100 milliseconds + // after timing out or receiving a specified retryable condition. + // + // Implementations MAY use an exponential or alternative backoff strategy, + // MAY cap the maximum backoff duration, and MAY add jitter to stagger requests, + // as long as unsuccessful backend requests are not retried before the configured + // minimum duration. + // + // If a Request timeout (`rules[].timeouts.request`) is configured, the entire + // duration of the initial request and any retry attempts MUST NOT exceed the + // Request timeout. Ongoing retry attempts should be cancelled if this duration + // is reached, and the Gateway MUST return a timeout error. 
+ // + // Support: Extended + // + // +optional + Backoff *Duration `json:"backoff,omitempty"` +} + +// GRPCRouteRetryCondition defines a gRPC error condition for which a backend +// request should be retried. +// +// The following conditions are considered retryable: +// +// * "cancelled" +// * "deadline-exceeded" +// * "internal" +// * "resource-exhausted" +// * "unavailable" +// +// Implementations MAY support additional gRPC error codes if applicable. +// +// +kubebuilder:validation:Enum=cancelled;deadline-exceeded;internal;resource-exhausted;unavailable +type GRPCRouteRetryCondition string + +// Duration is a string value representing a duration in time. +// Format follows GEP-2257, which is a subset of Golang's time.ParseDuration syntax. +// +// +kubebuilder:validation:Pattern=`^([0-9]{1,5}(h|m|s|ms)){1,4}$` +type Duration string +``` + +### YAML +```yaml +apiVersion: gateway.networking.k8s.io/v1 +kind: GRPCRoute +metadata: + name: foo-route +spec: + parentRefs: + - name: example-gateway + hostnames: + - "foo.example.com" + rules: + - matches: + - method: + service: com.example + method: Login + retry: + reasons: + - cancelled + - deadline-exceeded + - internal + - resource-exhausted + - unavailable + attempts: 3 + backoff: 100ms + backendRefs: + - name: foo-svc + port: 50051 +``` + +## Conformance Details +To ensure correct gRPC retry functionality, the following tests must be implemented across Gateway API implementations: +1. `SupportGRPCRouteRetryBackendTimeout` + - **Test**: Verify retries respect the BackendRequestTimeout. Requests should fail if the timeout is reached, even with retries. + - **Expected**: Retries occur within the configured timeout, and fail if exceeded. +2. `SupportGRPCRouteRetry` + - **Test**: Ensure retries are triggered for retryable gRPC errors (cancelled, deadline-exceeded, internal, resource-exhausted, unavailable). + - **Expected**: Retries for retryable errors; no retries for non-retryable errors. +3. 
`SupportGRPCRouteRetryBackoff` + - **Test**: Confirm retries use the configured backoff strategy. + - **Expected**: No retry is attempted before the configured minimum backoff duration has elapsed. + +## Alternatives + +### GRPCRoute filter +An alternative approach could be to introduce a new filter for GRPCRoute that handles retries. However, as we have already +established a `retry` field in the HTTPRouteRule, it makes sense to extend this to GRPCRoute for consistency. + +## References + +- [gRPC Retry Design](https://grpc.io/blog/guides/retry/) +- [gRPC Status Codes](https://grpc.io/docs/guides/error/) +- [Envoy Retry Policy](https://www.envoyproxy.io/docs/envoy/latest/api-v3/config/route/v3/route_components.proto#envoy-v3-api-msg-config-route-v3-retry-policy) +- [Nginx gRPC Module](https://nginx.org/en/docs/http/ngx_http_grpc_module.html) +- [HAProxy Retries](https://cbonte.github.io/haproxy-dconv/2.4/configuration.html#4.2-retries) +``` diff --git a/geps/gep-3440/metadata.yaml b/geps/gep-3440/metadata.yaml new file mode 100644 index 0000000000..93022acd26 --- /dev/null +++ b/geps/gep-3440/metadata.yaml @@ -0,0 +1,38 @@ +apiVersion: internal.gateway.networking.k8s.io/v1alpha1 +kind: GEPDetails +number: 3440 +name: GRPC Retries +status: Provisional +# Any authors who contribute to the GEP in any way should be listed here using +# their GitHub handle. +authors: + - shadialtarsha +relationships: + # obsoletes indicates that a GEP makes the linked GEP obsolete, and completely + # replaces that GEP. The obsoleted GEP MUST have its obsoletedBy field + # set back to this GEP, and MUST be moved to Declined. + obsoletes: {} + obsoletedBy: {} + # extends indicates that a GEP extends the linked GEP, adding more detail + # or additional implementation. The extended GEP MUST have its extendedBy + # field set back to this GEP. + extends: {} + extendedBy: {} + # seeAlso indicates other GEPs that are relevant in some way without being + # covered by an existing relationship. 
+ seeAlso: {} +# references is a list of hyperlinks to relevant external references. +# It's intended to be used for storing GitHub discussions, Google docs, etc. +references: + - https://grpc.io/docs/guides/retry/ + - https://www.envoyproxy.io/docs/envoy/latest/api-v3/config/route/v3/route_components.proto#envoy-v3-api-msg-config-route-v3-retrypolicy + - https://grpc.github.io/grpc/core/md_doc_grpc_xds_features.html +# featureNames is a list of the feature names introduced by the GEP, if there +# are any. This will allow us to track which feature was introduced by which GEP. +featureNames: + - SupportGRPCRouteRetryBackendTimeout + - SupportGRPCRouteRetry + - SupportGRPCRouteRetryBackoff +# changelog is a list of hyperlinks to PRs that make changes to the GEP, in +# ascending date order. +changelog: {} From 7cecf093e0a3f524ccb1ca650d13f623a9311226 Mon Sep 17 00:00:00 2001 From: Shadi Altarsha <61504589+shadialtarsha@users.noreply.github.com> Date: Fri, 8 Nov 2024 16:05:41 +0100 Subject: [PATCH 2/6] Update geps/gep-3440/index.md Co-authored-by: Sotiris Nanopoulos --- geps/gep-3440/index.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/geps/gep-3440/index.md b/geps/gep-3440/index.md index 0a80dbcff6..c3f3e0d190 100644 --- a/geps/gep-3440/index.md +++ b/geps/gep-3440/index.md @@ -40,7 +40,7 @@ functionality in a way that is broadly applicable across implementations. ## Background on implementations -Researching how different Gateway API implementations handle retries for gRPC requests. +Below we list how different data planes handle retries for gRPC requests. ### Envoy Envoy supports retries for gRPC requests using the `retry_policy` field in the `route` configuration of the HTTP filter. 
From 072e76e71f8fa04ff9d0d89854d94f7d5bc96ccf Mon Sep 17 00:00:00 2001 From: Shadi Altarsha <61504589+shadialtarsha@users.noreply.github.com> Date: Fri, 8 Nov 2024 16:05:49 +0100 Subject: [PATCH 3/6] Update geps/gep-3440/index.md Co-authored-by: Sotiris Nanopoulos --- geps/gep-3440/index.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/geps/gep-3440/index.md b/geps/gep-3440/index.md index c3f3e0d190..b3ed5bf81c 100644 --- a/geps/gep-3440/index.md +++ b/geps/gep-3440/index.md @@ -107,7 +107,7 @@ It also supports setting a timeout for the entire retry process using the `timeo ### Traefik 1. **Retry Conditions**: Traefik allows for retries based on HTTP-level conditions (e.g., connection errors and -certain HTTP status codes like 500, 502, 503, and 504), but it does not natively interpret specific gRPC error codes +certain HTTP status codes like 500, 502, 503, and 504), but it does not natively interpret specific gRPC status codes like `UNAVAILABLE` or `DEADLINE_EXCEEDED`. This means that, while Traefik can retry requests on common HTTP errors that might represent temporary issues, it lacks the ability to directly handle and retry based on gRPC-specific error codes, limiting its alignment with the GEP’s requirement for granular gRPC error handling. 
From 54e5c505c27a3cd698d46cd208b12c9a8cf01898 Mon Sep 17 00:00:00 2001 From: Shadi Altarsha <61504589+shadialtarsha@users.noreply.github.com> Date: Fri, 8 Nov 2024 16:06:14 +0100 Subject: [PATCH 4/6] Update geps/gep-3440/index.md Co-authored-by: Seth Epps <18355267+seth-epps@users.noreply.github.com> --- geps/gep-3440/index.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/geps/gep-3440/index.md b/geps/gep-3440/index.md index b3ed5bf81c..c877d2f3b9 100644 --- a/geps/gep-3440/index.md +++ b/geps/gep-3440/index.md @@ -273,7 +273,7 @@ established a `retry` field in the HTTPRouteRule, it makes sense to extend this ## References -- [gRPC Retry Design](https://grpc.io/blog/guides/retry/) +- [gRPC Retry Design](https://grpc.io/docs/guides/retry/) - [gRPC Status Codes](https://grpc.io/docs/guides/error/) - [Envoy Retry Policy](https://www.envoyproxy.io/docs/envoy/latest/api-v3/config/route/v3/route_components.proto#envoy-v3-api-msg-config-route-v3-retry-policy) - [Nginx gRPC Module](https://nginx.org/en/docs/http/ngx_http_grpc_module.html) From 0357b137b35a8055d2ab084dced29ef2fb212241 Mon Sep 17 00:00:00 2001 From: shadialtarsha Date: Fri, 8 Nov 2024 17:47:42 +0100 Subject: [PATCH 5/6] Follow the gRPC status codes spec naming --- geps/gep-3440/index.md | 46 +++++++++++++++++++++--------------------- 1 file changed, 23 insertions(+), 23 deletions(-) diff --git a/geps/gep-3440/index.md b/geps/gep-3440/index.md index c877d2f3b9..746b72f53e 100644 --- a/geps/gep-3440/index.md +++ b/geps/gep-3440/index.md @@ -100,7 +100,7 @@ trigger retries if specified, while errors like `403` and `404` are not retryabl ### HAProxy 1. **Retry Conditions**: HAProxy can retry requests based on various network conditions (e.g., connection failures, timeouts) and some HTTP error codes. While HAProxy does support gRPC via HTTP/2, it does not -have built-in support for handling specific gRPC error codes (like `Cancelled`, `Deadline Exceeded`). 
+have built-in support for handling specific gRPC status codes (like `Cancelled`, `Deadline Exceeded`). It relies on HTTP-level conditions for retries, so its gRPC support is less granular than the GEP requires. 2. **Retry Limits**: HAProxy allows you to set a maximum number of retries for a request using the `retries` directive. It also supports setting a timeout for the entire retry process using the `timeout connect` and `timeout server` directives. @@ -110,7 +110,7 @@ It also supports setting a timeout for the entire retry process using the `timeo certain HTTP status codes like 500, 502, 503, and 504), but it does not natively interpret specific gRPC status codes like `UNAVAILABLE` or `DEADLINE_EXCEEDED`. This means that, while Traefik can retry requests on common HTTP errors that might represent temporary issues, it lacks the ability to directly handle and retry based on -gRPC-specific error codes, limiting its alignment with the GEP’s requirement for granular gRPC error handling. +gRPC-specific status codes, limiting its alignment with the GEP’s requirement for granular gRPC status code handling. 2. **Retry Limits**: Traefik provides configurable retry attempts and can set a maximum number of retries. However, Traefik does not offer per-try timeout controls specific to each retry attempt. Instead, it typically relies on a global request timeout, limiting the flexibility needed for more precise gRPC retry management (like Envoy’s `per_try_timeout`). @@ -140,28 +140,28 @@ type GRPCRouteRule struct { // GRPCRouteRetry defines retry configuration for a GRPCRoute. // -// Implementations SHOULD retry on common transient gRPC errors +// Implementations SHOULD retry on common transient gRPC status codes // if a retry configuration is specified. // type GRPCRouteRetry struct { - // Reasons defines the gRPC error conditions for which a backend request + // Reasons defines the gRPC status codes for which a backend request // should be retried. 
// - // Supported gRPC error conditions: - // * "cancelled" - // * "deadline-exceeded" - // * "internal" - // * "resource-exhausted" - // * "unavailable" + // Supported gRPC status codes: + // * "CANCELLED" + // * "DEADLINE_EXCEEDED" + // * "INTERNAL" + // * "RESOURCE_EXHAUSTED" + // * "UNAVAILABLE" // - // Implementations MUST support retrying requests for these conditions + // Implementations MUST support retrying requests for these status codes // when specified. // // Support: Extended // // +optional // - Reasons []GRPCRouteRetryCondition `json:"reasons,omitempty"` + Reasons []GRPCRouteRetryStatusCode `json:"reasons,omitempty"` // Attempts specifies the maximum number of times an individual request // from the gateway to a backend should be retried. @@ -200,21 +200,21 @@ type GRPCRouteRetry struct { Backoff *Duration `json:"backoff,omitempty"` } -// GRPCRouteRetryCondition defines a gRPC error condition for which a backend +// GRPCRouteRetryStatusCode defines a gRPC status code for which a backend // request should be retried. // -// The following conditions are considered retryable: +// The following status codes are considered retryable: // -// * "cancelled" -// * "deadline-exceeded" -// * "internal" -// * "resource-exhausted" -// * "unavailable" +// * "CANCELLED" +// * "DEADLINE_EXCEEDED" +// * "INTERNAL" +// * "RESOURCE_EXHAUSTED" +// * "UNAVAILABLE" // -// Implementations MAY support additional gRPC error codes if applicable. +// Implementations MAY support additional gRPC status codes if applicable. // -// +kubebuilder:validation:Enum=cancelled;deadline-exceeded;internal;resource-exhausted;unavailable -type GRPCRouteRetryCondition string +// +kubebuilder:validation:Enum=CANCELLED;DEADLINE_EXCEEDED;INTERNAL;RESOURCE_EXHAUSTED;UNAVAILABLE +type GRPCRouteRetryStatusCode string // Duration is a string value representing a duration in time. // Format follows GEP-2257, which is a subset of Golang's time.ParseDuration syntax. 
@@ -259,7 +259,7 @@ To ensure correct gRPC retry functionality, the following tests must be implemen - **Test**: Verify retries respect the BackendRequestTimeout. Requests should fail if the timeout is reached, even with retries. - **Expected**: Retries occur within the configured timeout, and fail if exceeded. 2. `SupportGRPCRouteRetry` - - **Test**: Ensure retries are triggered for retryable gRPC errors (cancelled, deadline-exceeded, internal, resource-exhausted, unavailable). + - **Test**: Ensure retries are triggered for retryable gRPC status codes (`CANCELLED`, `DEADLINE_EXCEEDED`, `INTERNAL`, `RESOURCE_EXHAUSTED`, `UNAVAILABLE`). - **Expected**: Retries for retryable errors; no retries for non-retryable errors. 3. `SupportGRPCRouteRetryBackoff` - **Test**: Confirm retries use the configured backoff strategy. From e435f1a043c661edf84bbbf0646287aff66e5b8e Mon Sep 17 00:00:00 2001 From: shadialtarsha Date: Tue, 12 Nov 2024 15:23:48 +0100 Subject: [PATCH 6/6] Add explicit goal for HTTP errors that don't have a gRPC status code --- geps/gep-3440/index.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/geps/gep-3440/index.md b/geps/gep-3440/index.md index 746b72f53e..f2b96e8f24 100644 --- a/geps/gep-3440/index.md +++ b/geps/gep-3440/index.md @@ -13,6 +13,8 @@ allowing for configuration of retry attempts, backoff duration, and retryable st - To allow specification of the maximum number of times to retry a gRPC request. - To allow specification of the minimum backoff interval between retry attempts for gRPC requests. - Retry configuration must be applicable to most known Gateway API implementations for gRPC. +- Retry configuration must be applicable to errors that occur at the HTTP layer (e.g., connection errors, timeouts) +but have no direct mapping to a gRPC status code. - To define any interaction with configured gRPC timeouts and backoff. ## Non-Goals