Releases: bitfaster/BitFaster.Caching
v2.5.4
What's changed
- Eagerly purge deleted items from the internal `ConcurrentLru` queues. Previously, deleted items remained in the internal queues until fully cycled out of cold; now they are purged as items transition from queue to queue (e.g. from hot to warm) as part of the cycle.
- Fix `UnobservedTaskException` for value creation when using `AsAsyncCache()`/`AsyncAtomicFactory`/`ScopedAsyncAtomicFactory`. If the value factory delegate threw an exception, the internal `TaskCompletionSource` had an exception set that was not observed unless another thread concurrently evaluated the result (a sketch of the failure mode follows this list). Fixed by @advdotnet.
- In `ConcurrentLru.Trim()`, avoid incorrectly trimming an extra warm item when values are trimmed from cold but not warm.
- `ConcurrentLru.TrimExpired()` is now thread safe.
- Minor code cleanups by @Joy-less.
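
A minimal sketch of the unobserved exception scenario fixed above, assuming a cache that stores one `TaskCompletionSource<V>` per key (the type and member names here are illustrative, not the library's internals):

```csharp
using System;
using System.Threading.Tasks;

// Illustrative only: a faulted TaskCompletionSource whose Task is never
// awaited raises TaskScheduler.UnobservedTaskException at finalization.
class AsyncValueSketch<V>
{
    private readonly TaskCompletionSource<V> tcs = new();

    public async Task<V> CreateAsync(Func<Task<V>> valueFactory)
    {
        try
        {
            tcs.SetResult(await valueFactory());
        }
        catch (Exception ex)
        {
            tcs.SetException(ex);
            // Touching tcs.Task.Exception marks the exception as observed,
            // so it no longer surfaces as an UnobservedTaskException when
            // no other thread awaits the faulted task.
            _ = tcs.Task.Exception;
            throw;
        }

        return await tcs.Task;
    }
}
```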
Full changelog: v2.5.3...v2.5.4
v2.5.3
What's changed
- Eliminate volatile writes in `ConcurrentLru` internal bookkeeping code for pure reads, improving concurrent read throughput by 175% (see the first sketch after this list).
- Vectorize the hot methods in `CmSketch` using Neon intrinsics for ARM CPUs. This results in slightly better `ConcurrentLfu` cache throughput measured on Apple M series and Azure Cobalt 100 CPUs.
- Unroll loops in the hot methods in `CmSketch`. This results in slightly better `ConcurrentLfu` throughput on CPUs without vector support (i.e. neither x86 AVX2 nor Arm Neon).
- On vectorized code paths (AVX2 and Neon), `CmSketch` allocates the internal buffer using the pinned object heap on .NET 6 or newer. Use of the `fixed` statement is removed, eliminating a very small overhead. Sketch block pointers are then aligned to 64 bytes, guaranteeing each block is always on the same CPU cache line. This provides a small speedup for the `ConcurrentLfu` maintenance thread by reducing CPU cache misses (an allocation sketch follows this list).
- Minor improvements to the AVX2 JITted code via `MethodImpl(MethodImplOptions.AggressiveInlining)` and removal of local variables to improve performance on .NET 8/9 and dynamic PGO.
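
A sketch of the volatile write elimination, assuming a per-item accessed flag like the one `ConcurrentLru` keeps for its queue bookkeeping (names here are illustrative):

```csharp
using System.Threading;

class LruItemSketch
{
    private bool wasAccessed;

    public void MarkAccessed()
    {
        // Read first: a cache hit on an already-marked item now performs
        // no write at all, avoiding cache line contention between readers.
        if (!Volatile.Read(ref wasAccessed))
        {
            Volatile.Write(ref wasAccessed, true);
        }
    }
}
```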
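
A sketch of the pinned allocation and alignment described above, assuming .NET 6 or newer; the field names are illustrative, not `CmSketch`'s actual layout:

```csharp
using System;
using System.Runtime.CompilerServices;

unsafe class AlignedBufferSketch
{
    private readonly long[] table;   // lives on the pinned object heap
    private readonly long* blocks;   // 64-byte aligned view of the buffer

    public AlignedBufferSketch(int blockCount)
    {
        // pinned: true allocates on the pinned object heap, so the address
        // never moves and no fixed statement is needed to take a pointer.
        // Over-allocate by one 64-byte block (8 longs) to allow alignment.
        table = GC.AllocateArray<long>((blockCount + 1) * 8, pinned: true);

        long* p = (long*)Unsafe.AsPointer(ref table[0]);

        // Round up to the next 64-byte boundary so each 8-long block sits
        // entirely within a single CPU cache line.
        blocks = (long*)(((nuint)p + 63) & ~(nuint)63);
    }
}
```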
Full changelog: v2.5.2...v2.5.3
v2.5.2
What's changed
- Fix race between update and `TryRemove(KeyValuePair)` for both `ConcurrentLru` and `ConcurrentLfu`. Prior to this fix, values could be deleted if the value was updated to no longer match the `TryRemove` input argument while `TryRemove` was executing.
- Fix `ConcurrentLfu` torn writes for large structs using SeqLock (a sketch of the scheme follows below).
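
A minimal SeqLock sketch for context (illustrative, not the library's internals): the writer bumps a version counter to an odd value before copying a multi-word struct and back to even afterwards, so lock-free readers can detect and retry a torn read:

```csharp
using System.Threading;

class SeqLockBox<T> where T : struct
{
    private int sequence; // even = stable, odd = write in progress
    private T value;

    // Assumes writers are serialized by an outer lock.
    public void Write(in T newValue)
    {
        Interlocked.Increment(ref sequence); // now odd
        value = newValue;                    // multi-word copy may tear
        Interlocked.Increment(ref sequence); // now even
    }

    public T Read()
    {
        while (true)
        {
            int start = Volatile.Read(ref sequence);
            T copy = value;
            // Retry if a write was in progress or started during the copy.
            if ((start & 1) == 0 && Volatile.Read(ref sequence) == start)
            {
                return copy;
            }
        }
    }
}
```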
Full changelog: v2.5.1...v2.5.2
v2.5.1
What's changed
- Fix `ConcurrentLfu` time-based expiry policy failing to update the entry's expiry on read. Prior to this fix, expiry was only updated when the read buffer was processed (following a cache write, or when the read buffer was full).
- Fix `ConcurrentLru` torn writes for large structs using SeqLock.
- Fix torn writes for 64-bit current time on 32-bit platforms for the `ConcurrentLru` `AfterAccessPolicy` and `DiscretePolicy`.
- P/Invoke `TickCount64` to evaluate the current time for .NET Standard on Windows; `Duration.SinceEpoch` is 5x faster, resulting in lower latency lookups for `ConcurrentTLru`/`ConcurrentTLfu`.
- Use `Stopwatch.GetTimestamp` to evaluate the current time on macOS; `Duration.SinceEpoch` is about 20% faster, resulting in slightly lower latency lookups for `ConcurrentTLru`/`ConcurrentTLfu` (a sketch of this pattern follows below).
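
A sketch of the Windows fast path, assuming the standard `kernel32` `GetTickCount64` export; `Duration`'s actual internals may differ:

```csharp
using System;
using System.Diagnostics;
using System.Runtime.InteropServices;

static class FastTimeSketch
{
    [DllImport("kernel32.dll")]
    private static extern ulong GetTickCount64();

    // Milliseconds of monotonic time: the cheap kernel tick counter on
    // Windows, Stopwatch (also monotonic) everywhere else.
    public static long CurrentMilliseconds()
    {
        if (RuntimeInformation.IsOSPlatform(OSPlatform.Windows))
        {
            return (long)GetTickCount64();
        }

        return Stopwatch.GetTimestamp() / (Stopwatch.Frequency / 1000);
    }
}
```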
Full changelog: v2.5.0...v2.5.1
v2.5.0
What's changed
- Provide time-based expiry for `ConcurrentLfu`, matching `ConcurrentLru`. This closely follows the implementation in Java's Caffeine, using a port of Caffeine's hierarchical timer wheel to perform all operations in O(1) time. Expire after write, expire after access, and expire after using `IExpiryCalculator` can be configured via `ConcurrentLfuBuilder` extension methods (a usage sketch follows this list).
- Provide `ICacheExt` and `IAsyncCacheExt` to enable client code compiled against .NET Standard to use the builder APIs and cache methods added since v2.0. These new methods are excluded from the base interfaces for .NET Standard, since adding them would be a breaking change.
- Provide the `Duration` convenience methods `FromHours` and `FromDays`.
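
A usage sketch of the new expiry options, assuming the `ConcurrentLfuBuilder` extension methods mirror the existing `ConcurrentLru` builder shape (method names should be checked against the docs):

```csharp
using System;
using BitFaster.Caching;
using BitFaster.Caching.Lfu;

ICache<string, string> cache = new ConcurrentLfuBuilder<string, string>()
    .WithCapacity(128)
    .WithExpireAfterWrite(TimeSpan.FromMinutes(5))
    .Build();

// Entries are evicted 5 minutes after they were last written.
cache.AddOrUpdate("key", "value");
```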
Full changelog: v2.4.1...v2.5.0
v2.4.1
What's changed
- Fixed a race condition in `ConcurrentLfu` for add-remove-add of the same key.
- `MpscBoundedBuffer.Clear()` is now thread safe, fixing a race in `ConcurrentLfu.Clear()`.
- Fixed `ConcurrentLru` `Count` and `IEnumerable<KeyValuePair<K,V>>` to filter out expired items when used with time-based expiry.
- BitFaster.Caching is now compiled with `<nullable>enable</nullable>`, and APIs are annotated to support null reference type static analysis.
Full changelog: v2.4.0...v2.4.1
v2.4.0
What's changed
- Provide two new time-based expiry schemes for `ConcurrentLru` (a usage sketch follows below):
  - Expire after access: evict after a fixed duration since an entry's most recent read or write. This is equivalent to MemoryCache's sliding expiry, and is useful for data bound to a session that expires due to inactivity.
  - Per item expiry time: evict after a duration calculated for each item using the specified `IExpiryCalculator`. Expiry time may be set independently at creation, after a read, and after a write.
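
A sketch of per item expiry, assuming `IExpiryCalculator` exposes create/read/update hooks as described; exact member signatures should be checked against the library:

```csharp
using System;
using BitFaster.Caching;
using BitFaster.Caching.Lru;

// Hypothetical policy: session entries live longer than ordinary entries.
class TtlByKey : IExpiryCalculator<string, string>
{
    public Duration GetExpireAfterCreate(string key, string value)
        => Duration.FromMinutes(key.StartsWith("session:") ? 20 : 5);

    public Duration GetExpireAfterRead(string key, string value, Duration current)
        => current; // reads keep the remaining time unchanged

    public Duration GetExpireAfterUpdate(string key, string value, Duration current)
        => GetExpireAfterCreate(key, value); // writes reset the clock
}

ICache<string, string> cache = new ConcurrentLruBuilder<string, string>()
    .WithCapacity(128)
    .WithExpireAfter(new TtlByKey())
    .Build();
```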
- Align `TryRemove` overloads with `ConcurrentDictionary` for `IAsyncCache` and `AsyncAtomicFactory`, matching the implementation for `ICache` added in v2.3.0. This adds two new overloads (a usage sketch follows below):
  - `bool TryRemove(K key, out V value)` - enables getting the value that was removed.
  - `bool TryRemove(KeyValuePair<K, V> item)` - enables removing an item only when the key and value are the same.
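
A usage sketch of the two overloads on the synchronous interface (the same shapes apply to `IAsyncCache`):

```csharp
using System;
using System.Collections.Generic;
using BitFaster.Caching;

static class RemoveExamples
{
    public static void Run(ICache<string, int> cache)
    {
        // Remove and retrieve the removed value in one call.
        if (cache.TryRemove("a", out int removed))
        {
            Console.WriteLine($"removed {removed}");
        }

        // Remove only when both key and value match, so a concurrent
        // update to a different value makes the call a no-op.
        cache.TryRemove(new KeyValuePair<string, int>("b", 42));
    }
}
```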
- Add extension methods to make it more convenient to use `AsyncAtomicFactory` with a plain `ConcurrentDictionary`. This is similar to storing an `AsyncLazy<T>` instead of `T`, but with the same exception propagation semantics and API as `ConcurrentDictionary.GetOrAdd` (a usage sketch follows this list).
- BitFaster.Caching assembly marked as trim compatible to enable trimming when used in native AOT applications.
- `AtomicFactory` value initialization logic modified to mitigate lock convoys, based on the approach given here.
- Fixed `ConcurrentLru.Clear` to correctly handle removed items present in the internal bookkeeping data structures.
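
A usage sketch of the `ConcurrentDictionary` extension methods, assuming an extension named `GetOrAddAsync` over `AsyncAtomicFactory` values (check the library docs for the exact shape):

```csharp
using System.Collections.Concurrent;
using System.Threading.Tasks;
using BitFaster.Caching.Atomic;

var dictionary = new ConcurrentDictionary<string, AsyncAtomicFactory<string, int>>();

// The value factory runs at most once per key, even with concurrent
// callers, and exceptions propagate like ConcurrentDictionary.GetOrAdd
// (a failed factory result is not cached).
int value = await dictionary.GetOrAddAsync("a", async key =>
{
    await Task.Delay(10); // stand-in for real async work
    return key.Length;
});
```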
Full changelog: v2.3.3...v2.4.0
v2.3.3
What's changed
- Eliminated all races in `ConcurrentLru` eviction logic, and the transition between the cold cache and warm cache eviction routines. This prevents a variety of rare 'off by one item count' situations that could needlessly evict items when the cache is within bounds.
- Fix `ConcurrentLru.Clear()` to always clear the cache when items in the warm queue are marked as accessed.
- Optimize `ConcurrentLfu` drain buffers logic to give ~5% better throughput (measured by the eviction throughput test).
- Cache the `ConcurrentLfu` drain buffers delegate to prevent allocating a closure when scheduling maintenance (see the sketch after this list).
- `BackgroundThreadScheduler` and `ThreadPoolScheduler` now use `TaskScheduler.Default`, instead of implicitly using `TaskScheduler.Current` (fixes CA2008).
- `ScopedAsyncCache` now internally calls `ConfigureAwait(false)` when awaiting tasks (fixes CA2007).
- Fix `ConcurrentLru` debugger display on .NET Standard.
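
A sketch of the delegate caching pattern (illustrative names, not the library's internals): hoisting the drain action into a field means scheduling no longer allocates a delegate, and potentially a closure, per call:

```csharp
using System;
using System.Threading.Tasks;

class MaintenanceSketch
{
    private readonly Action drainBuffers;

    public MaintenanceSketch()
    {
        // Allocate the delegate once, instead of creating a new one every
        // time maintenance is scheduled.
        drainBuffers = DrainBuffers;
    }

    public void Schedule() => Task.Run(drainBuffers); // no per-call allocation

    private void DrainBuffers()
    {
        // process read/write buffers, evict, etc.
    }
}
```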
Full changelog: v2.3.2...v2.3.3
v2.3.2
What's changed
- Fix `ConcurrentLru` `NullReferenceException` when expiring and disposing null values (i.e. the cached value is a reference type, and the caller cached a null value).
- Fix `ConcurrentLfu` handling of updates to detached nodes, caused by concurrent reads and writes. Detached nodes could be re-attached to the probation LRU, pushing out fresh items prematurely, but would eventually expire since they can no longer be accessed.
Full changelog: v2.3.1...v2.3.2
v2.3.1
What's changed
- Introduce a simple heuristic to estimate the optimal `ConcurrentDictionary` bucket count for `ConcurrentLru`/`ConcurrentLfu`/`ClassicLru` based on the `capacity` constructor arg. When the cache is at capacity, the `ConcurrentDictionary` will have a prime number bucket count and a load factor of 0.75 (a sketch of the small-cache case follows below).
  - When capacity is less than 150 elements, start with a `ConcurrentDictionary` capacity that is a prime number 33% larger than the cache capacity. The initial size is large enough to avoid resizing.
  - For larger caches, pick the `ConcurrentDictionary` initial size using a lookup table. The initial size is approximately 10% of the cache capacity, such that 4 `ConcurrentDictionary` grow operations will arrive at a hash table size that is a prime number approximately 33% larger than the cache capacity.
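
An illustrative sketch of the small-cache case only (the `NextPrime` helper here is naive, not the library's implementation):

```csharp
static class BucketHint
{
    // For capacity < 150: a prime ~33% larger than capacity, so the table
    // reaches a 0.75 load factor at capacity without ever resizing.
    public static int EstimateBucketCount(int capacity)
    {
        return NextPrime(capacity * 4 / 3);
    }

    private static int NextPrime(int n)
    {
        while (!IsPrime(n))
        {
            n++;
        }
        return n;
    }

    private static bool IsPrime(int x)
    {
        if (x < 2) return false;
        for (int i = 2; i * i <= x; i++)
        {
            if (x % i == 0) return false;
        }
        return true;
    }
}
```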
- `SingletonCache` sets the internal `ConcurrentDictionary` capacity to the next prime number greater than the `capacity` constructor argument.
- Fix ABA concurrency bug in `Scoped` by changing `ReferenceCount` to use reference equality (via `object.ReferenceEquals`).
- .NET 6 target now compiled with `SkipLocalsInit`. Minor performance gains.
- Simplified `AtomicFactory`/`AsyncAtomicFactory`/`ScopedAtomicFactory`/`ScopedAsyncAtomicFactory` by removing redundant reads, reducing code size.
- `ConcurrentLfu.Count` no longer locks the underlying `ConcurrentDictionary`, matching `ConcurrentLru.Count`.
- Use `CollectionsMarshal.AsSpan` to enumerate candidates within `ConcurrentLfu.Trim` on .NET 6.
Full changelog: v2.3.0...v2.3.1