API: consider using APIv3 standard endpoints #356
Comments
Noting:
|
I did a test for this using this script: https://gist.github.com/humitos/f7f5e48881868616ebb63b417887515d
Executing this function multiple times, I get times from 300ms to 600ms in the worst cases. Taking a look at NewRelic, the average time for the current addons endpoint is 250ms and the median is 150ms. However, it seems that API responses are not cached by Cloudflare and they are always hitting our servers -- I see the header
Another thing that I noticed is that APIv3 endpoints are not proxied via
Required changes
Is there any reason why these two points aren't already implemented? cc @readthedocs/backend |
I think we should proxy only the required endpoints. We don't want to proxy all endpoints (bringing the entire jungle just to get the banana), and we should expose only read-only operations, as write/delete operations can be abused by an attacker if there is a vulnerability in the docs pages. We also need to take into consideration permissions from sharing tokens on .com, and we shouldn't cache responses there, as we check for authn/authz on .com. I'm +1 on splitting the addons API into several endpoints, but maybe just split by functionality rather than trying to re-implement all of API v3 under docs pages. We would also need to apply purge rules for each proxied endpoint on .org. Which also raises the question: are we really saving resources by splitting the cached resources (some endpoints will still need to be purged frequently on some actions)? Is it worth the extra complexity (we now have to manage several caches)? |
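A minimal sketch of the "proxy only the required, read-only endpoints" idea, written as an allowlist check. The function name `is_proxy_allowed` and the exact path patterns are hypothetical and only for illustration; the paths follow the public APIv3 layout, and the actual proxy configuration is not described in this thread.

```python
import re

# Hypothetical allowlist: the only APIv3 paths that would be exposed on docs domains.
ALLOWED_PATHS = [
    re.compile(r"^/api/v3/projects/[\w-]+/$"),
    re.compile(r"^/api/v3/projects/[\w-]+/versions/$"),
    re.compile(r"^/api/v3/projects/[\w-]+/versions/[\w.-]+/$"),
]

# Methods that never modify state on the server.
READ_ONLY_METHODS = {"GET", "HEAD", "OPTIONS"}


def is_proxy_allowed(method: str, path: str) -> bool:
    """Return True only for read-only requests to an allowlisted endpoint."""
    if method.upper() not in READ_ONLY_METHODS:
        return False
    return any(pattern.match(path) for pattern in ALLOWED_PATHS)
```

With this shape, a write request to an otherwise allowed path is rejected, and so is a read request to any endpoint that was not explicitly listed.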
I'm not suggesting re-implementing APIv3 on documentation pages, just exposing the same endpoint/backend code under
We aren't caching APIv3 at all right now. We will have the same caching complexity when we add cache even if we don't use it for addons; so that's not extra work. We will be saving a lot, since there will be a few endpoints that don't need to be purged if only the project has changed for example (eg. file tree diff, builds, versions, etc won't be purged). Otherwise, just by changing any field in any object, everything has to be purged (current behavior) |
And we shouldn't cache API v3, since it's served from our main application (we don't want to cache anything in our main application, as it serves dynamic content).
But we will still need to purge the other endpoints. We will be saving something like 1/3 of the response, at the cost of the added complexity of tracking each action that will need to purge each endpoint instead of doing one purge per build. And I'm +1 on splitting things by functionality (but mostly to avoid having one big response), so things like FTD will be cached independently (but probably most endpoints will need to be purged on each build). This also makes me wonder how much we are really saving by not just purging on each build, unless lots of projects trigger builds every minute. Our current browser cache TTL is set to 20 min, and our edge cache TTL is two hours https://developers.cloudflare.com/cache/how-to/configure-cache-status-code/#edge-ttl, which means that each response is cached for at most 2 hours and 20 min. |
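The "2 hours and 20 min max" figure above can be checked with a small calculation. This sketch assumes the worst case is an edge cache serving a response just before its TTL expires, after which the browser keeps that copy for its own full TTL; the function name is hypothetical.

```python
from datetime import timedelta

# TTL values quoted in the comment above.
BROWSER_TTL = timedelta(minutes=20)
EDGE_TTL = timedelta(hours=2)


def max_staleness(edge_ttl: timedelta, browser_ttl: timedelta) -> timedelta:
    """Worst-case age of a response seen by a user if no purge happens.

    Assumes the browser TTL starts counting when the browser receives the
    response, regardless of how long the edge has already held it.
    """
    return edge_ttl + browser_ttl


worst_case = max_staleness(EDGE_TTL, BROWSER_TTL)
```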
I don't understand why you are saying we shouldn't cache APIv3. Can you expand on that? What are the cons you are seeing there? We can still serve uncached content from our application and cached content from our APIv3. I don't see any problem with that.
No, if you take a look at the example I shared, it performs 5 requests and only one depends on the build object. That means we will need to perform only 1 request when the build object changes, since all the other endpoints will be cached already. We will be saving 80% of the requests we currently make.
This is fine. Note this 2-hour cache will be shared between all the users; that means we will avoid a lot of requests hitting our servers. Besides, we can eventually increase that TTL if we want to save even more. |
Initial implementation as a POC to give it a try. It works fine at first sight and I think it could be a good idea to move forward with it. I didn't find any blocker yet. Closes #356
We should cache read-only/static content. API v3 isn't read-only, and we also serve dynamic content from our main application (content that depends on the user session). We can create very specific cache rules to avoid things like cache poisoning, but I'd just avoid caching content on our main application domain. We can cache content over docs domains (assuming we only expose read-only resources, and on .org only).
Version objects do depend on build objects: fields like identifier (commit), built and downloads depend on the result of the latest build. Like I said, I'm fine that we are splitting the addons response, but purging the cache should still happen after each build. Trying to purge the cache only in specific cases will add more complexity and not save us much in the end. |
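The disagreement above about how much selective purging saves can be made concrete with a small model. The endpoint names and their dependencies below are hypothetical placeholders, not the five requests from the linked gist; the point is only that the saving depends on how many endpoints depend on the build object.

```python
from typing import Dict, List, Set


def endpoints_to_purge(changed: Set[str], dependencies: Dict[str, Set[str]]) -> List[str]:
    """Endpoints whose response depends on at least one changed object type."""
    return sorted(name for name, deps in dependencies.items() if deps & changed)


def fraction_still_cached(changed: Set[str], dependencies: Dict[str, Set[str]]) -> float:
    """Share of endpoints that can keep serving from cache after the change."""
    purged = endpoints_to_purge(changed, dependencies)
    return (len(dependencies) - len(purged)) / len(dependencies)


# Scenario A (hypothetical): only one of five endpoints depends on the build.
OPTIMISTIC = {
    "project-detail": {"project"},
    "subprojects-list": {"project"},
    "version-detail": {"version"},
    "versions-list": {"version"},
    "builds-list": {"build"},
}

# Scenario B (hypothetical): version responses also depend on the latest build,
# as argued for fields like identifier, built and downloads.
PESSIMISTIC = dict(OPTIMISTIC)
PESSIMISTIC["version-detail"] = {"version", "build"}
PESSIMISTIC["versions-list"] = {"version", "build"}

optimistic_saving = fraction_still_cached({"build"}, OPTIMISTIC)
pessimistic_saving = fraction_still_cached({"build"}, PESSIMISTIC)
```

Under scenario A a new build leaves four of the five endpoints cached, which matches the 80% figure; under scenario B only two of the five survive a build, which is the case for simply purging after each build.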
I understand you are OK proxying APIv3 and serving it under
I'm fine to start purging the cache after each build for now. We can make it more performant in the future if we want. We can continue the discussion around cache invalidation in https://github.com/readthedocs/meta/discussions/162. |
Summarizing, the work required here so far is:
|
Now that we are allowing public access to APIv3 endpoints, we can think about making different API queries using the common APIv3 endpoints to fetch small chunks of data instead of doing one big call to the specific /_/addons/ one.
This would be transparent for our own JavaScript library and for users of the CustomEvent because it will be managed internally and the exact same JSON-like object will be returned.
As an example:
- projects.current field can be populated with https://docs.readthedocs.io/en/stable/api/v3.html#project-details
- versions.current field with https://docs.readthedocs.io/en/stable/api/v3.html#version-detail
- versions.active field with https://docs.readthedocs.io/en/stable/api/v3.html#versions-listing
This may be something good to explore, since I'm seeing some benefits for caching (not all of the responses will be purged on each new build) and for maintenance as well, since it's going to be one less big endpoint to maintain over time. Besides, we will become users of our own APIv3, which will push us to keep improving it over time.
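A rough sketch of how the three fields listed above could be assembled from separate APIv3 calls while returning one combined object. The function name `build_addons_data` and the injected `fetch_json` callable are hypothetical. The URL paths, the `active=true` filter and the paginated `results` key are assumptions based on the linked APIv3 documentation, and only the first page of the versions listing is read.

```python
from typing import Any, Callable, Dict

# Any callable that takes a URL path and returns the decoded JSON body.
Fetcher = Callable[[str], Dict[str, Any]]


def build_addons_data(fetch_json: Fetcher, project_slug: str, version_slug: str) -> Dict[str, Any]:
    """Compose an addons-like object from three independent APIv3 requests."""
    base = f"/api/v3/projects/{project_slug}"

    # Each request can be cached and purged on its own schedule.
    project = fetch_json(f"{base}/")
    current_version = fetch_json(f"{base}/versions/{version_slug}/")
    active_versions = fetch_json(f"{base}/versions/?active=true")

    return {
        "projects": {"current": project},
        "versions": {
            "current": current_version,
            # Assumes a paginated listing; only the first page is used here.
            "active": active_versions["results"],
        },
    }
```

Passing the fetcher in as a parameter keeps the composition logic independent of the HTTP client, so the same function can be exercised with canned responses.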