Improving cache hit ratio #304
Comments
Thanks for using Serverless Image Handler Solution.
Hi @ddonahue99, do you have any experience with CloudFront Origin Shield already? I'm thinking about using it too, to at least limit the Lambda execution count/time somewhat.
Hi @Buthrakaur - Since posting this, I've made a few changes that have greatly improved the hit ratio, including enabling Origin Shield (which resulted in a modest improvement). The more notable impact, however, came from modifying the CloudFront cache settings: I bumped the TTL up to the max (1 year) and changed the cache key to exclude the Origin and Accept headers. From what I could tell, the Accept header is part of the cache key by default to support the AUTO_WEBP setting, which makes sense, because depending on the client the response could be WebP, JPEG, or whatever other fallback you specify. If you are not using AUTO_WEBP, the response will always be the same, so it doesn't make sense to have roughly one cache entry per major browser. Example of how accept headers vary by browser:
With all of these changes, my application went from hovering around a 70-75% hit ratio to closer to 96%. Ultimately, the better solution for optimizing the hit ratio would be to permanently cache the output in S3. I'd still love to see that as a built-in option to this template. 🙏
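To make the cache key point concrete, here is a minimal sketch that models a cache key as "path plus selected header values". It is only an illustration: CloudFront computes its real cache key internally, the `cacheKey` and `distinctKeys` helpers are hypothetical, and the three Accept values are made-up stand-ins for different browsers.

```typescript
// Illustrative model only: CloudFront builds its real cache key internally.
type HeaderMap = Record<string, string>;

// Build a cache key from the path plus the values of the listed headers.
function cacheKey(path: string, headers: HeaderMap, keyHeaders: string[]): string {
  const parts = keyHeaders
    .map((h) => h.toLowerCase())
    .sort()
    .map((h) => `${h}=${headers[h] ?? ""}`);
  return [path, ...parts].join("|");
}

// Hypothetical Accept values standing in for three different browsers.
const sampleRequests: HeaderMap[] = [
  { accept: "image/avif,image/webp,*/*" },
  { accept: "image/webp,*/*" },
  { accept: "*/*" },
];

// Count how many distinct cache entries the same image would occupy.
function distinctKeys(keyHeaders: string[]): number {
  const keys = sampleRequests.map((r) =>
    cacheKey("/fit-in/120x120/photo.jpg", r, keyHeaders)
  );
  return new Set(keys).size;
}

const entriesWithAccept = distinctKeys(["accept"]); // one entry per Accept variant
const entriesWithoutAccept = distinctKeys([]); // a single shared entry
```

With Accept in the key, every distinct header value produces its own entry for the same image; without it, all three requests share one entry. That only holds when the response does not actually depend on the header, which is the non-AUTO_WEBP case described above.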
Hi! I noticed another possible improvement. Please read this: "Cache Hit Ratio - Remove Accept-Encoding header when compression is not needed".
Hi @ddonahue99, from what I can see, the TTL is already one year when the image in S3 doesn't provide one, and when the S3 object does provide a TTL, that value is used. See this. CloudFront honors the Cache-Control header from the origin when one is provided. Am I missing something? Please share some details (I am currently working on improving the hit ratio too).
Note: making Origin Shield optional (enabled by a parameter) in the solution is a little complicated with the current CDK definition (or at least I can't figure out a simple way). But a user of the solution could modify the provided template.yaml to add it. It's as simple as:
As a simple workaround, I would suggest that the maintainers add it to the documentation.
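As a rough sketch of what such a template modification amounts to, the function below patches a parsed CloudFormation template so that every origin of every `AWS::CloudFront::Distribution` resource gets an `OriginShield` block. The `OriginShield`, `Enabled`, and `OriginShieldRegion` property names come from the CloudFormation resource specification; the `enableOriginShield` helper, the trimmed-down interfaces, and the idea of patching the template programmatically are assumptions made for illustration, not part of the solution.

```typescript
// Minimal shapes for the parts of a CloudFormation template this sketch touches.
interface OriginConfig {
  Id?: string;
  DomainName?: string;
  OriginShield?: { Enabled: boolean; OriginShieldRegion?: string };
}
interface TemplateResource {
  Type: string;
  Properties?: { DistributionConfig?: { Origins?: OriginConfig[] } };
}
interface Template {
  Resources: Record<string, TemplateResource>;
}

// Hypothetical helper: adds OriginShield to every origin of every CloudFront
// distribution in the template (in place) and returns how many were patched.
function enableOriginShield(template: Template, region: string): number {
  let patched = 0;
  for (const logicalId of Object.keys(template.Resources)) {
    const resource = template.Resources[logicalId];
    if (!resource || resource.Type !== "AWS::CloudFront::Distribution") continue;
    const origins = resource.Properties?.DistributionConfig?.Origins ?? [];
    for (const originConfig of origins) {
      originConfig.OriginShield = { Enabled: true, OriginShieldRegion: region };
      patched++;
    }
  }
  return patched;
}
```

The region passed in should be the Origin Shield region closest to the origin; which one that is depends on where the stack is deployed.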
One other possible optimization [0]. Thumbor requests without an id [1] (such as "/fit-in/120x120/") are forwarded to the origin. The important thing here is that we could avoid making a request to the origin simply by using a CloudFront Function (or a Lambda@Edge extension) that matches the incorrect path. I'll show an example:
Notice:
Of course, this approach can also be applied to other paths that are known in advance to be always invalid or nonexistent (but are frequently requested). As with optionally enabling Origin Shield (see my previous message), enabling this code based on the EnableDefaultFallbackImageParameter value is a little complicated with the current CDK definition (or at least I can't figure out a simple way). But readers can make a custom modification of the base template.yaml. [0] = strictly speaking, the proposal here is not a cache optimization: CloudFront Functions run before the cache lookup. But it can avoid requests to the origin, which, in the end, achieves the same result.
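A minimal sketch of that idea, written as a viewer-request handler: the rule "a path ending in `/` carries no image key" is an assumption chosen to match the "/fit-in/120x120/" example, and the `CfEvent`, `CfRequest`, and `CfResponse` shapes are trimmed-down stand-ins for the real CloudFront Functions event. CloudFront Functions run JavaScript, so the type annotations would need to be stripped before deploying.

```typescript
// Trimmed-down stand-ins for the CloudFront Functions event shapes.
interface CfRequest {
  uri: string;
  method: string;
}
interface CfResponse {
  statusCode: number;
  statusDescription: string;
}
interface CfEvent {
  request: CfRequest;
}

// Assumption: a Thumbor-style path that ends in "/" has no image key,
// e.g. "/fit-in/120x120/" as opposed to "/fit-in/120x120/photo.jpg".
function hasImageKey(uri: string): boolean {
  return uri.length > 1 && !uri.endsWith("/");
}

// Viewer-request handler: answer known-bad paths at the edge,
// and pass everything else through to the cache and origin unchanged.
function handler(event: CfEvent): CfRequest | CfResponse {
  if (!hasImageKey(event.request.uri)) {
    return { statusCode: 404, statusDescription: "Not Found" };
  }
  return event.request;
}
```

Short-circuiting with a 404 would bypass any default fallback image the origin might otherwise return, which is presumably why the comment ties enabling it to EnableDefaultFallbackImageParameter.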
As a picture is worth a thousand words, these are my results (invocations of the backend Lambda) after applying all of these changes at the same time. Sorry, that is how I did it, so I cannot show the effect of each one separately.
The initial peak is attributable to old cache entries being invalidated when the composition of the cache key changed. But @ddonahue99's proposal of caching converted images in S3 would be a very important improvement, because CloudFront's caches (I understand that this applies to POP, regional, and Origin Shield caches) will discard less popular objects (please read this).
@fvsnippets It looks like you made a really meaningful dent, nice work and thank you for sharing all of your findings! I'm going to have to investigate the If the AWS team is open to allowing for permanent caching in S3, I still agree that would have the biggest impact over the long-term. This solution is not the most efficient as-is for performance/cost at scale. |
Maybe I'm wrong, but the main issue with serving straight from S3 in CloudFront is how to map the CloudFront cache key to a filename, especially when using things like AUTO_WEBP (and especially once AUTO_WEBP also does AUTO_AVIF ;)), without adding the runtime cost (time and money) of another Lambda@Edge call. I suppose it could be dealt with by the image handler: have it check a CACHE_BUCKET once it has fully resolved all parameters, immediately before it loads the image from the SOURCE BUCKET and performs the operation. If the object is present, return it as if it had been through the entire process; if not, proceed and store the output in the CACHE_BUCKET. That does mean the flow is not CDN => CACHE_S3? => API-GW but CDN => API-GW => CACHE_S3, so you won't save on the API Gateway calls, but you /will/ save on customer wait time for items that have already been processed once.
Maybe it could be enabled only under certain circumstances (AUTO_WEBP not enabled, etc.) and only for certain paths (e.g. Thumbor resize URLs). I understand that CloudFront allows the latter by using origin groups (but I haven't read enough about that topic, and don't have the experience with it, to tell for sure).
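The check-then-process flow described above could be sketched roughly as follows. Everything here is hypothetical: `CacheStore` stands in for the CACHE_BUCKET, `InMemoryCache` replaces S3 so the example runs on its own, and `cacheKeyFor` uses a sorted parameter string where a real implementation would more likely hash it. Real S3 calls are asynchronous; the sketch is synchronous to stay short.

```typescript
// Fully resolved request parameters (after AUTO_WEBP etc. have been applied).
type ResolvedParams = Record<string, string | number | boolean>;

// Stand-in for the CACHE_BUCKET; a real version would call S3 asynchronously.
interface CacheStore {
  get(key: string): string | undefined;
  put(key: string, body: string): void;
}

class InMemoryCache implements CacheStore {
  private entries = new Map<string, string>();
  get(key: string): string | undefined {
    return this.entries.get(key);
  }
  put(key: string, body: string): void {
    this.entries.set(key, body);
  }
}

// Canonical key: sort parameter names so equivalent requests share one entry.
function cacheKeyFor(params: ResolvedParams): string {
  return Object.keys(params)
    .sort()
    .map((k) => `${k}=${String(params[k])}`)
    .join("&");
}

// Return the cached output if present; otherwise process once and store it.
function getOrProcess(
  params: ResolvedParams,
  cache: CacheStore,
  processImage: (p: ResolvedParams) => string
): { body: string; fromCache: boolean } {
  const key = cacheKeyFor(params);
  const cached = cache.get(key);
  if (cached !== undefined) {
    return { body: cached, fromCache: true };
  }
  const body = processImage(params);
  cache.put(key, body);
  return { body, fromCache: false };
}
```

Because the key is built from the resolved parameters rather than from the CloudFront cache key, the output format chosen by AUTO_WEBP would be part of the key, which sidesteps the filename-mapping problem at the cost of still invoking the handler.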
I think my main concern with not storing already processed items is this: if I upload nice and juicy 10 MB PNGs as source images, it takes 5-10 seconds to turn one into an AVIF (after bumping sharp to .30 and adding it as a valid format), which is not going to be a smooth experience for the end user. Honestly, I have no idea what number of cache evictions I would be looking at under normal circumstances (I just started playing with this lib), but my site does have a few hundred thousand images, and with 8 size variants for each, in 3 potential formats (AVIF, WebP, JPG), it does add up, especially if it also adds a cache key per Accept header variant (which for /some/ Internet Explorer/Edge variants seems to include every Office program installed). If anyone has some experience with this and is willing to share it, that would be great. I was wondering if a CloudFront Function could be used to "normalize" the Accept header into, maybe, only the optimal image/ prefix the client can understand, and use that as the cache key? (Although that might break HMAC validation?)
For what it's worth, I tried adding this to the backend-end-construct.ts
and wired it up further down
And it does seem to work for the AutoWebP scenario, where you just want to return the best possible representation the client can consume. I.e.
gives a cache miss on first access (and hits afterwards) but
gives a cache hit because the Accept header is rewritten to a single value. Now, I realize this will probably conflict with other features, and with requests for specific formats. I.e. if you explicitly ask for a JPG in the transformations, it would be cached under the image/webp Accept header, but... I suppose it will still actually RETURN content type image/jpg, and the filename/path part will already make it unique for requests that ask for transformation to JPG. Unsure if this is a problem, really...
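The snippets from this comment are not preserved here, so the following is only a sketch of the general technique, not the commenter's code. It collapses an incoming Accept header to a single value before the cache lookup, so that clients with different but equivalent headers share one cache entry. The `normalizeAccept` name, the default `supported` list, and the `*/*` fallback are all assumptions.

```typescript
// Collapse an Accept header to one value so equivalent clients share a cache key.
// Assumptions: `supported` lists formats in preference order, and "*/*" is used
// when the client advertises none of them.
function normalizeAccept(
  accept: string | undefined,
  supported: string[] = ["image/webp"]
): string {
  // Split "image/avif,image/webp,*/*;q=0.8" into bare media types.
  const offered = (accept ?? "")
    .toLowerCase()
    .split(",")
    .map((part) => (part.split(";")[0] ?? "").trim());

  for (const format of supported) {
    if (offered.indexOf(format) !== -1) {
      return format;
    }
  }
  return "*/*";
}
```

In a viewer-request CloudFront Function, the returned value would replace the request's Accept header before CloudFront computes the cache key. With the default list, a header such as `image/avif,image/webp,*/*;q=0.8` and a header such as `image/webp,*/*` both normalize to `image/webp`.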
It would be super nice to have a comprehensive guide for people who are just getting started with improving the cache hit ratio. I can see several improvements mentioned above, but I'm not sure how they translate into configuration updates. Could someone clarify what, if anything, should be done?
We will evaluate adding some information on this subject to the Implementation Guide.
This issue has not received a response in a while. If you want to keep this issue open, please leave a comment below and auto-close will be canceled. |
Hi AWS team, I am bringing this task to your attention as I think it is an absolute must to improve the cache ratio. This task has been open for two years already and no steps have been taken to improve it. We have 50+ websites where we use this image handler and are running high costs because of this. |
Hi Folks, Planned
Potential for future
Not Planned
Thanks for your interest in SIH, |
hi everyone! I started working on S3 caching today using a hash of image request info as an additional key. here's the basic approach: https://github.com/wonathanjong/sls-img-cache let me know what y'all think :) |
Just made edits to forked repo
My company recently deployed the serverless image handler, and it was a breeze - nice work! One thing we've noticed that has been a little surprising is a lower than expected CloudFront cache hit ratio, and we'd love to be able to get the Lambda costs down. My assumption is that the serverless image handler is caching at each CloudFront edge location, so for a given image requested from several places around the globe, it will need to hit the lambda multiple times. Over time, those cached items will expire and will need to be re-hydrated again. Is that correct?
Assuming that's what's going on, a couple options come to mind for optimizing the hit ratio:
Cache the converted images in S3, rather than relying solely on the CloudFront cache. Storage costs would be higher, but it would need to hit the lambda exactly once for a given set of image parameters. This would obviously require some fundamental changes to the serverless-image-handler.
For a lighter approach, would CloudFront Origin Shield solve this problem? Would need to crunch the numbers to evaluate cost implications, but it seems like it exists for this sort of use case.
Thanks in advance for any guidance, and please let me know if there are any other options I am not considering.