Add cache limits for resources and attributes #509
Conversation
Here's what's happening: We are using one set of the form
We are using sets of the form
So, for one tenant and one signal, the max number of sets will be 8192 + 1. For each payload, we do the following:
I propose the following structure:
So for each payload, we have
Bonus points if this can be a Lua script.
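A minimal sketch of how such a per-payload check could look with go-redis, assuming hypothetical key names (limits:{tenant}:{signal}:resources for the resource-tracking set and limits:{tenant}:{signal}:{resourceFP} for each per-resource attribute set) and limit parameters passed in by the caller; this illustrates the idea, not the exact implementation.

```go
package limits

import (
	"context"

	"github.com/redis/go-redis/v9"
)

// CheckAndAdd records a (resource fingerprint, attribute fingerprint) pair,
// enforcing a resource-count limit and a per-resource cardinality limit.
// Key names and the exact flow are assumptions for illustration only.
func CheckAndAdd(ctx context.Context, rdb *redis.Client, tenant, signal, resourceFP, attrFP string,
	maxResources, maxCardinalityPerResource int64) (bool, error) {

	resourcesKey := "limits:" + tenant + ":" + signal + ":resources"  // set of resource fingerprints
	attrsKey := "limits:" + tenant + ":" + signal + ":" + resourceFP  // set of attribute fingerprints for one resource

	// Is this a new resource fingerprint?
	known, err := rdb.SIsMember(ctx, resourcesKey, resourceFP).Result()
	if err != nil {
		return false, err
	}
	if !known {
		// Reject if we are already at the resource limit for this window.
		count, err := rdb.SCard(ctx, resourcesKey).Result()
		if err != nil {
			return false, err
		}
		if count >= maxResources {
			return false, nil
		}
		if err := rdb.SAdd(ctx, resourcesKey, resourceFP).Err(); err != nil {
			return false, err
		}
	}

	// Enforce the per-resource attribute-fingerprint limit.
	card, err := rdb.SCard(ctx, attrsKey).Result()
	if err != nil {
		return false, err
	}
	if card >= maxCardinalityPerResource {
		return false, nil
	}
	return true, rdb.SAdd(ctx, attrsKey, attrFP).Err()
}
```

Doing the check and the write as separate round trips is racy under concurrent writers, which is why folding the whole flow into a single Lua script, as suggested above, would be preferable.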
What is happening with the metrics? Several reasons: 1. We have pre-filtering implemented for logs and traces with the help of a separate

I spent some time reading about the memory overhead associated with different choices in Redis. Redis maintains a dictionary of keys, and each key entry adds an overhead of 50-70 bytes on top of its key string length. When the requirement is a bunch of k1:v1 mappings, hashes shine if the keys can be sharded to fit within hash-max-ziplist-entries. We are better off using an HLL per signal for total cardinality and total resources than maintaining the counters ourselves. Now, the remaining requirements are membership checks and per-resource limits. We have two options at hand: 1. one global set, or 2. a separate set per resource.

Comparison b/w one set vs per-resource sets

One global set:
Memory for each entry: ~100 bytes
For 3 million entries, at ~100 bytes each = 300 MB

Separate set per resource:
Memory for each entry: ~70 bytes
Overhead of multiple keys: ~1.2 MB
Total memory usage: 214 MB (for 3 million entries at an average of 70 bytes per value) + 1.2 MB = ~215 MB

In conclusion, the per-resource sets use less memory, as we don't repeat the resource fingerprint in every value of the set. I believe we should go with the per-resource approach. I am not expecting any current user to come close to any of the limits. (Will probably revisit this for a metrics name/type-based scheme to handle histograms better.)
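As a sketch of the HLL idea above, assuming hypothetical hll:{tenant}:{signal}:* key names, the per-signal totals could be tracked roughly like this with go-redis:

```go
package limits

import (
	"context"

	"github.com/redis/go-redis/v9"
)

// TrackTotals feeds per-signal HyperLogLogs so we get approximate
// total-resource and total-cardinality counts without maintaining our own
// counters. Key names here are assumptions for illustration only.
func TrackTotals(ctx context.Context, rdb *redis.Client, tenant, signal, resourceFP, attrFP string) (resources, cardinality int64, err error) {
	resHLL := "hll:" + tenant + ":" + signal + ":resources"
	cardHLL := "hll:" + tenant + ":" + signal + ":cardinality"

	if err = rdb.PFAdd(ctx, resHLL, resourceFP).Err(); err != nil {
		return
	}
	// A (resource fingerprint, attribute fingerprint) pair identifies one unique series.
	if err = rdb.PFAdd(ctx, cardHLL, resourceFP+":"+attrFP).Err(); err != nil {
		return
	}

	if resources, err = rdb.PFCount(ctx, resHLL).Result(); err != nil {
		return
	}
	cardinality, err = rdb.PFCount(ctx, cardHLL).Result()
	return
}
```

A Redis HyperLogLog stays at most ~12 KB per key regardless of how many elements it has seen, which is why it beats maintaining exact sets or counters for the totals.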
The main objective of this change is to prevent rogue datasets from adversely affecting others. The shortcoming of a single set of resource + attribute fingerprints is that all resources are treated equally, when in practice some resources are more cardinal than others. So, instead of maintaining a single set for a combination of resource fingerprint + attrs fingerprint, we now maintain one key for each resource fingerprint, which holds that resource's set of attribute fingerprints. To prevent key explosion, a separate set tracking the number of unique resource fingerprints (configured with max_resources) is maintained for each data source. (Some users add the timestamp of the log records as resource attributes; we don't want to accept such data as part of this.)

The (configurable) limits are that there can only be a maximum of 8192 resources for a data source in the current window, and each resource can have a maximum of 2048 unique attribute fingerprints. Since any data can go into attributes, we want to limit the attribute fingerprints as well. We have several layers of filtering intended to filter out high-cardinality values before they reach fingerprint creation. This greatly reduces the number of unique fingerprints; however, even when the distinct values per attribute are not high, a handful of attributes with 10-20 values each can combine into a large number of unique attribute fingerprints, so we want to limit those as well. This is max_cardinality_per_resource. Even if every resource stays below that limit, the number of resources × the number of attribute fingerprints can be ~17 million, which we don't want to allow, so there is also a total maximum cardinality allowed for each data source in the current window, configured with max_total_cardinality.

All of these settings have defaults based on our observations from monitoring our own system; we may tweak some of them as we learn more.
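For illustration, a hedged sketch of what the configuration surface could look like; only the three setting names and the 8192/2048 limits come from the description above, while the field tags and the max_total_cardinality default are assumptions.

```go
package limits

// Config mirrors the cache-limit settings discussed above. The struct layout
// and tags are illustrative; only the three setting names and the 8192/2048
// limits come from the description, the rest is an assumption.
type Config struct {
	// Maximum number of unique resource fingerprints per data source per window.
	MaxResources int `mapstructure:"max_resources"`
	// Maximum number of unique attribute fingerprints for a single resource.
	MaxCardinalityPerResource int `mapstructure:"max_cardinality_per_resource"`
	// Cap on total cardinality (resources x attribute fingerprints) per data source.
	MaxTotalCardinality int `mapstructure:"max_total_cardinality"`
}

// DefaultConfig returns limits matching the numbers discussed above; the
// max_total_cardinality value here is a placeholder, not the real default.
func DefaultConfig() Config {
	return Config{
		MaxResources:              8192,
		MaxCardinalityPerResource: 2048,
		MaxTotalCardinality:       100_000, // hypothetical default
	}
}
```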