fix(cache): Estimate size of posting lists #9515
Open
+99
−4
Description
The maximum cost of the Ristretto cache is configured as a number of megabytes, but every posting list item is inserted into the cache with a cost of 1. As a result, the cache only grows and no items are evicted. This leads to high memory consumption, especially with the new UID cache (see #9513).
This PR introduces a cost function that estimates the in-memory size of each item, so that the cost of an item is expressed in the same unit as the cache's maximum cost. For more details, see predictable-labs#6. Credit to @darkcoderrises for the implementation.
The default cache size is changed from 1024 to 4096 to reflect the more accurate cost estimation. The measurements below show how the configured cache size (size-mb) relates to the memory occupied by the Dgraph process with this PR applied.
size-mb=1024: Dgraph process occupies up to 3.7 GiB of memory [heap profile image]
size-mb=2048: Dgraph process occupies up to 5.4 GiB of memory [heap profile image]
size-mb=4096: Dgraph process occupies up to 8.4 GiB of memory [heap profile image]
size-mb=8192: Dgraph process occupies up to 13.3 GiB of memory [heap profile image]
Closes #9513