Conversation
Signed-off-by: Robert Edmonds <edmonds@users.noreply.github.com>
Signed-off-by: Robert Edmonds <edmonds@users.noreply.github.com>
This commit makes the dnsdist CacheValue structure definition public, so that the type can be referred to in the Rust `cxx::bridge` bindings as part of the Moka cache value data type. That is, we want to use `std::shared_ptr<CacheValue>` as the values stored in the Moka cache, which requires making CacheValue public. Signed-off-by: Robert Edmonds <edmonds@users.noreply.github.com>
… `dnsdist` This line is needed in order to bring in the correct header include paths (-I) when building dnsdist pointing to the automatically generated C++ headers built by `cxx::bridge`. Signed-off-by: Robert Edmonds <edmonds@users.noreply.github.com>
This adds a basic `cxx::bridge` binding to the Rust `moka` crate. A few simple functions/methods are exposed that allow creating a Moka cache object and performing get/insert/etc. operations on it. The key type is hardcoded to `Vec<u8>` and the value type is hardcoded to `cxx::SharedPtr<CacheValue>` (Rust), aka `std::shared_ptr<CacheValue>` (C++), where `CacheValue` is dnsdist's structure type for storing packet cache entries. Signed-off-by: Robert Edmonds <edmonds@users.noreply.github.com>
This commit adds a `d_cache` member to DNSDistPacketCache which is initialized by calling into our Rust `cxx::bridge` binding of the `moka` crate, using the configured number of max cache entries and shard count specified in the CacheSettings struct. Signed-off-by: Robert Edmonds <edmonds@users.noreply.github.com>
…s contents pdns/dnsdistdist/packetcache.hh is a symlink to pdns/packetcache.hh, because this header file is shared between dnsdist and PowerDNS Authoritative. This commit prepares for making dnsdist-specific changes to pdns/dnsdist/packetcache.hh by replacing the symlink with an ordinary file containing the original contents of the symlink target. XXX: This is obviously bad since it results in code duplication. Figure out a better way to do this later. Signed-off-by: Robert Edmonds <edmonds@users.noreply.github.com>
Some of the functions in packetcache.hh are not used in dnsdist, so they could be removed from dnsdist's version of this file. Signed-off-by: Robert Edmonds <edmonds@users.noreply.github.com>
…CI() Signed-off-by: Robert Edmonds <edmonds@users.noreply.github.com>
Signed-off-by: Robert Edmonds <edmonds@users.noreply.github.com>
This commit adds a wrapper function that exposes Rust's `Vec::extend_from_slice()`, since this isn't supported by the `cxx::bridge` binding. This allows faster copying of data into a Rust `Vec<u8>`. Signed-off-by: Robert Edmonds <edmonds@users.noreply.github.com>
This commit updates CacheKey to accumulate copies of the bytes passed to update() and updateCI() into a Rust Vec<u8> member of the struct. (Technically, updateCI() downcases the bytes first.) This allows a CacheKey to be built up incrementally by calls to update() / updateCI() as before, but after the key has been built we have both the original burtle hash value of the key as well as a buffer of bytes containing the actual key. Signed-off-by: Robert Edmonds <edmonds@users.noreply.github.com>
This commit adjusts the `DNSDistPacketCache::insert()` and `DNSDistPacketCache::get()` methods to use the `CacheKey` type for cache keys rather than raw 32-bit hash values, and to query the new Moka packet cache rather than the original packet cache. The `InternalQueryState` structure originally had three member fields which were raw 32-bit hash values, and this commit also adjusts those fields to be `CacheKey` objects. Signed-off-by: Robert Edmonds <edmonds@users.noreply.github.com>
This commit removes the unused parameter `dnssecOK` from the signature of `DNSDistPacketCache::get()`. This method previously called a helper function `cachedValueMatches()` since it needed to handle cache key collisions, but since we now pass a fully constructed cache key (rather than a hash of the key) to the Moka cache's lookup function, the value that is returned matches the given key, so this step is unnecessary, and thus passing the `dnssecOK` parameter (part of the cache key) to the `DNSDistPacketCache::get()` method is unnecessary. Signed-off-by: Robert Edmonds <edmonds@users.noreply.github.com>
This commit adjusts the `DNSDistPacketCache::getSize()` method so that it calls the underlying Moka cache's `entry_count()` method to obtain the number of entries in the cache. Signed-off-by: Robert Edmonds <edmonds@users.noreply.github.com>
Stub out this method until a way of dumping the Moka cache contents is implemented. Signed-off-by: Robert Edmonds <edmonds@users.noreply.github.com>
None of the "cache cleaning" functionality is needed for the Moka-based packet cache, so disable the relevant block of code in `maintThread()`. Note that this is the only place where the packet cache's "keep_stale_data" setting is used, so this setting now no longer does anything. Signed-off-by: Robert Edmonds <edmonds@users.noreply.github.com>
Signed-off-by: Robert Edmonds <edmonds@users.noreply.github.com>
Signed-off-by: Robert Edmonds <edmonds@users.noreply.github.com>
Signed-off-by: Robert Edmonds <edmonds@users.noreply.github.com>
…() method Signed-off-by: Robert Edmonds <edmonds@users.noreply.github.com>
Signed-off-by: Robert Edmonds <edmonds@users.noreply.github.com>
Signed-off-by: Robert Edmonds <edmonds@users.noreply.github.com>
Signed-off-by: Robert Edmonds <edmonds@users.noreply.github.com>
Signed-off-by: Robert Edmonds <edmonds@users.noreply.github.com>
Signed-off-by: Robert Edmonds <edmonds@users.noreply.github.com>
Signed-off-by: Robert Edmonds <edmonds@users.noreply.github.com>
Signed-off-by: Robert Edmonds <edmonds@users.noreply.github.com>
Signed-off-by: Robert Edmonds <edmonds@users.noreply.github.com>
Signed-off-by: Robert Edmonds <edmonds@users.noreply.github.com>
Signed-off-by: Robert Edmonds <edmonds@users.noreply.github.com>
…Flags` Signed-off-by: Robert Edmonds <edmonds@users.noreply.github.com>
Signed-off-by: Robert Edmonds <edmonds@users.noreply.github.com>
…cOK` Signed-off-by: Robert Edmonds <edmonds@users.noreply.github.com>
Signed-off-by: Robert Edmonds <edmonds@users.noreply.github.com>
…vedOverUDP` Signed-off-by: Robert Edmonds <edmonds@users.noreply.github.com>
This commit updates `DNSDistPacketCache::get()` to calculate the length of the CacheValue's `value` field rather than using the pre-calculated length stored in the CacheValue's `len` field. Since this is the only place where the `len` field is used, this makes it possible to remove it. Signed-off-by: Robert Edmonds <edmonds@users.noreply.github.com>
Signed-off-by: Robert Edmonds <edmonds@users.noreply.github.com>
Now that only the Moka cache is being used, there is no need to maintain the `hash` member of `CacheKey`. Signed-off-by: Robert Edmonds <edmonds@users.noreply.github.com>
… object This commit avoids constructing and returning a new CacheKey object from `DNSDistPacketCache::getKey()` and instead writes the key into the `CacheKey` reference that is passed in via parameter. Since CacheKey objects are stored in an InternalQueryState, this avoids overwriting an empty CacheKey with a new one. (And an empty CacheKey itself contains an empty Rust Vec<u8>). The impact is very, very minor but still seems to be a win, maybe on the order of ~0.2% or less, if I am reading the perf output correctly. Signed-off-by: Robert Edmonds <edmonds@users.noreply.github.com>
`cxx::bridge` needs to be able to parse the header that has the definition of `CacheValue`. The `*.rs.h` header files generated by `cxx::bridge` won't be available until after `cxx::bridge` has finished building (i.e., until after dnsdist-rust-lib has been built). That means that any header file that `cxx::bridge` wants to read via an `include!` directive cannot itself have a dependency on any of the header files generated by `cxx::bridge`, because those generated header files won't actually exist (if this is a clean working tree). Since dnsdist-cache.hh has an include of `rust/moka.rs.h`, this causes a build failure when `cxx::bridge` runs and can't find that header file. So, this commit breaks out the `CacheValue` struct into its own header file that doesn't have a dependency on any of the headers generated by `cxx::bridge`. Signed-off-by: Robert Edmonds <edmonds@users.noreply.github.com>
|
Thanks a lot for contributing this! I have two main concerns at the moment:
So I guess we need to come up with a realistic benchmark setup to evaluate the impact of both this PR and #16692. |
Thanks for taking a look! For my use case I don't necessarily have the backends co-located with the dnsdist frontends, so I care a lot more about having a higher cache-hit rate in dnsdist's packet cache, and also about maintaining a good cache-hit rate in the face of abusive traffic like pseudorandom subdomain attacks. Regarding performance under concurrent reads, I did make sure to use Any thoughts on the non-algorithmic changes in this branch? I actually spent a bit more time on refactoring how |
Description
Hi,
This is a draft PR for discussion purposes because there are definitely some things that are broken or weird in this branch.
This branch replaces the underlying data store for the
DNSDistPacketCachewith the Rust moka crate, a "fast, concurrent cache library for Rust".cxx::bridgeis used to create a binding from Rust to C++, wrapping the moka::sync::SegmentedCache object, specialized for the key and value data types required by the packet cache.The key type stored in the
mokacache is Rust'sVec<u8>. These keys are in Rust-allocated memory and store the bytes extracted by dnsdist'sDNSDistPacketCache::getKey()parser, concatenated together into a contiguous buffer. Because this buffer is in Rust-allocated memory, it can be inserted into themokacache, taking ownership of the key instead of copying it. In the C++ portion of the dnsdist code base, a newCacheKeystruct has been introduced that wraps the Rust-ownedVec<u8>buffer. The parts of the code base that previously operated on Jenkins hash values directly have been updated to use this newCacheKeytype to accumulate the bytes of the key. (Themokacrate internally uses the aHash crate to hash keys which probably compares favorably to the Jenkins lookup2 hash.)Adding this
CacheKeyabstraction and giving the bytes of the actual query key directly to the container to store seems to simplify things, because it removes the need for a post-lookup step that checks for collisions, and as a result at least half of the members ofCacheValuewould seem to be unnecessary and could be removed, which then makes it unnecessary to pass these fields around as function parameters.The value type stored in the
mokacache is C++'sstd::shared_ptr<CacheValue>.cxx::bridgehas built-in support for std::shared_ptr, which lets us allocate cache value objects on the C++ side and insert them directly into themokacache. If I understand correctly, the refcount is automatically increased when inserting into the cache, so whenmokadecides to evict a cache entry it may or may not be deallocated immediately, depending on if another worker thread has taken out a reference. This is a big difference from the original implementation which storesCacheValuedirectly in the container and copies out the required elements from the cache value directly into the response packet while holding a read lock on the container.Because
mokahandles eviction incrementally, it does not have its own background maintenance thread, and the "cache cleaning" work that was being done by dnsdist's maintenance thread is unnecessary in this branch.Discussion
This is an early draft submitted for feedback purposes. There is another draft PR pending (#16692) that adds LRU and SIEVE into the existing implementation, but my branch is a lot more intrusive since it introduces a hard dependency on Rust and reworks how keys/values are stored in the cache.
My primary interest is in getting improved caching admission/eviction algorithms in dnsdist that have "scan resistance", or being robust against "one hit wonders", i.e. it would be very nice if a pseudorandom subdomain attack could not trivially empty the packet cache by just sending more queries than the size of the cache in a short amount of time.
I have done some initial performance tests and found that this implementation does use a few percent more CPU cycles than the original implementation, which makes sense given the additional work needed to implement the algorithms that
mokauses. The tradeoff of slightly more CPU usage may be worth it if it results in a better cache hit ratio.There are several commits marked with
XXXin the commit subject that indicate where more work is needed, for instance there are some ancillary operations (dumping the cache, expunging records, searching for cached records, etc.) that the new implementation would need to support that it doesn't yet. There is alsopdns/dnsdistdist/packetcache.hhwhich is a symlink that I broke so that I could test out the dnsdist-specific changes in this branch without affecting the other products, but there is probably a better way to do this.Checklist
I have: