Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf(pdk): refine with optional cache to improvement performance #13981

Open
wants to merge 18 commits into
base: master
Choose a base branch
from

Conversation

ProBrian
Copy link
Contributor

@ProBrian ProBrian commented Dec 5, 2024

Summary

Checklist

  • The Pull Request has tests
  • A changelog file has been created under changelog/unreleased/kong or skip-changelog label added on PR if changelog is unnecessary. README.md
  • There is a user-facing docs PR against https://github.com/Kong/docs.konghq.com - PUT DOCS PR HERE

Issue reference

Fix #[issue number]

@ProBrian ProBrian changed the title [draft] test pdk refine without patching all resty APIs Pdk refine with optional cache to improvement performance Dec 16, 2024
@ProBrian ProBrian marked this pull request as ready for review December 16, 2024 01:52
@@ -619,14 +617,14 @@ local function new(self)
-- kong.request.get_header("Host") -- "foo.com"
-- kong.request.get_header("x-custom-header") -- "bla"
-- kong.request.get_header("X-Another") -- "foo bar"
function _REQUEST.get_header(name)
function _REQUEST.get_header(name, ctx)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

http_get_header supports ctx param to cache headers, expose it to get_header API so we could use cache.

Copy link
Member

@bungle bungle Jan 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like this either. If we add this parameter, we need to always keep the API like that, that the second parameter needs to be ctx. The benefits are so small that I would rather not do it. The most of the changes in plugins seems to be utilizing this. And that is also I think somewhat "bad" code to make people create tables of ctx and pass it here. If we cannot find better way to do this, then I would not do it.

Just an example e.g. if someone wants to add more beneficial parameter lilke default in case the header is missing the code would become:

local value = kong.request.get_header("Some-Header", nil, "default")

Perhaps we will never add that either, but just tells that we don't want these strange ctx parameters in public API.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if we use kong.tools.get_header(name, ctx) directly in plugins, to replace non-cached kong.request.get_header API in certain places to achieve the cache benefit? It must be better to use PDK APIs in plugins, but I do see that there are a lot of kong.tools usage in plugins(which I feel more like should be used "internally" in core or some non-plugin module?), Do you think it's a proper practice?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ProBrian In general it would be best if we can get to state where there is no require "module" statements in plugins. That is we have a powerful PDK. But meanwhile, yes, many plugins require different modules. That also then spreads to all custom plugins as plugins are copy pasted. That then will make all our internal APIs and tools public too which we cannot break. I think for example those plugins, just call get_headers once and work with headers and call it done, rather than pass a table to somewhere that something else caches the get_headers.

@@ -285,10 +285,11 @@ local function new(pdk, major_version)
-- -- or `access` phase prior calling this function.
--
-- local body = kong.service.response.get_raw_body()
function response.get_raw_body()
function response.get_raw_body(buffered_proxying)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just a minor refinement so that if we could reduce the reads of ngx.ctx.buffered_proxying by passing it by param

@@ -158,21 +158,22 @@ end

local function do_authentication(conf)
local www_authenticate = "Basic realm=\"" .. conf.realm .. "\""
local ctx = {}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use a local ctx here which is cheaper thanngx.ctx, and the cache is only fresh in the local ctx lifetime, not in the request lifetime, so we don't need ngx.ctx

@ProBrian ProBrian requested a review from bungle December 16, 2024 03:11
Copy link
Member

@bungle bungle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's revert all the API changes and related changes in where the changed APIs were used. Benefit vs. drawbacks are too small. Those optimizations can be implemented inside plugins themselves if they are that good. In many cases it also depends whether plugin is reading single or multiple headers to what to use, eg. is. get_headers better than multiple get_header and is multiple get_header better with created context table at what count?

@@ -619,14 +617,14 @@ local function new(self)
-- kong.request.get_header("Host") -- "foo.com"
-- kong.request.get_header("x-custom-header") -- "bla"
-- kong.request.get_header("X-Another") -- "foo bar"
function _REQUEST.get_header(name)
function _REQUEST.get_header(name, ctx)
Copy link
Member

@bungle bungle Jan 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like this either. If we add this parameter, we need to always keep the API like that, that the second parameter needs to be ctx. The benefits are so small that I would rather not do it. The most of the changes in plugins seems to be utilizing this. And that is also I think somewhat "bad" code to make people create tables of ctx and pass it here. If we cannot find better way to do this, then I would not do it.

Just an example e.g. if someone wants to add more beneficial parameter lilke default in case the header is missing the code would become:

local value = kong.request.get_header("Some-Header", nil, "default")

Perhaps we will never add that either, but just tells that we don't want these strange ctx parameters in public API.

kong/pdk/service/response.lua Outdated Show resolved Hide resolved
@ProBrian
Copy link
Contributor Author

Let's revert all the API changes and related changes in where the changed APIs were used. Benefit vs. drawbacks are too small. Those optimizations can be implemented inside plugins themselves if they are that good. In many cases it also depends whether plugin is reading single or multiple headers to what to use, eg. is. get_headers better than multiple get_header and is multiple get_header better with created context table at what count?

Removed all public API changes. @bungle

@bungle
Copy link
Member

bungle commented Jan 31, 2025

@ProBrian I'll setup a meeting, where we can discuss how to get this in.

@chronolaw chronolaw changed the title Pdk refine with optional cache to improvement performance perf(pdk): refine with optional cache to improvement performance Feb 7, 2025
kong/globalpatches.lua Outdated Show resolved Hide resolved
@chronolaw
Copy link
Contributor

Do we have any data about improvements?

@@ -116,7 +116,7 @@ X-Missing: nil
=== TEST 4: response.get_header() returns nil when response header does not fit in default max_headers
=== TEST 4: response.get_header() returns existing header will not limited by get_headers() default max_headers configuration
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a new feature?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not a new feature, just a refine, since we use ngx.header[name] to replace _RESPONSE.get_headers()[name] in the get_header implementation, the max headers limitation brought by get_headers() will not exist anymore.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this extends the capability of this PDK function. Shall we consider this a new feature?

Copy link
Contributor Author

@ProBrian ProBrian Feb 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thinks it's not needed to consider this a new feature, because

  1. the limitation is due to our internal implementation.
  2. the limitation is not documented which means users aren't even aware of this limitation.

that means we have no interface changes and no documentation changes from user's perspective.

@ProBrian
Copy link
Contributor Author

ProBrian commented Feb 7, 2025

Do we have any data about improvements?

I did a preliminary test on aws, you can refer to https://konghq.atlassian.net/wiki/spaces/~712020faf780811dc24bb28ad1df7390c03c41/pages/4217667620/A+Journey+to+PDK+Caching#One-more-try-for-the-2nd-half, but that's not a standard test environment. I think we should run a perf test after this PR merged to see.

kong/globalpatches.lua Outdated Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment