Reimplement weakref_lru_cache on top of a custom hash map type. #35003

copybara-service · 2026-02-11T19:29:00Z

Reimplement weakref_lru_cache on top of a custom hash map type.

The existing weakref_lru_cache has a number of concurrency problems, exposed by TSan and by the tests added in this PR. The key problem, and the one that convinced me the existing approach cannot work, is the following:

* If we use a std::unordered_map<>, we must protect it with a mutex mu_ that is not the GIL, because the GIL can be released during Python equality functions and std::unordered_map<> does not tolerate being mutated in the middle of a lookup. This implies that the lock order must be mu_ then GIL; otherwise it would not be safe to release and reacquire the GIL while holding mu_.
* However, the `tp_traverse` GC traversal handler runs with the GIL held (or at least with all Python threads stopped, since Python GC is stop-the-world, which is much the same as holding the GIL), and it would need to acquire mu_ to traverse the std::unordered_map<>. Since we are in the middle of GC, we cannot release the GIL, so the order there is GIL then mu_. No valid lock ordering is possible.
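The circular wait described above can be reproduced in miniature. The following is a minimal sketch, not code from this PR: it assumes the GIL can be modeled as a plain `std::mutex`, and the names `Outcome` and `DemoLockInversion` are hypothetical. It uses `try_lock` instead of a blocking `lock` so that the demonstration reports the conflict rather than hanging.

```cpp
#include <cassert>
#include <future>
#include <mutex>
#include <thread>

// Hypothetical result type: did each side manage to take its second lock?
struct Outcome {
  bool lookup_got_gil = true;
  bool gc_got_mu = true;
};

Outcome DemoLockInversion() {
  std::mutex gil;  // stand-in for the Python GIL (assumption: a plain mutex)
  std::mutex mu;   // stand-in for the cache mutex mu_
  std::promise<void> lookup_holds_mu, gc_holds_gil, lookup_tried, gc_tried;
  std::future<void> lookup_holds_mu_f = lookup_holds_mu.get_future();
  std::future<void> gc_holds_gil_f = gc_holds_gil.get_future();
  std::future<void> lookup_tried_f = lookup_tried.get_future();
  std::future<void> gc_tried_f = gc_tried.get_future();
  Outcome out;

  // Cache lookup: holds mu_, then wants the GIL back (order: mu_ then GIL).
  std::thread lookup([&] {
    std::lock_guard<std::mutex> hold(mu);
    lookup_holds_mu.set_value();
    gc_holds_gil_f.wait();
    out.lookup_got_gil = gil.try_lock();
    if (out.lookup_got_gil) gil.unlock();
    lookup_tried.set_value();
    gc_tried_f.wait();  // keep mu held until the GC side has also tried
  });

  // GC traversal: holds the GIL, then wants mu_ (order: GIL then mu_).
  std::thread gc([&] {
    std::lock_guard<std::mutex> hold(gil);
    gc_holds_gil.set_value();
    lookup_holds_mu_f.wait();
    out.gc_got_mu = mu.try_lock();
    if (out.gc_got_mu) mu.unlock();
    gc_tried.set_value();
    lookup_tried_f.wait();  // keep the GIL held until lookup has also tried
  });

  lookup.join();
  gc.join();
  return out;
}
```

In this sketch neither thread can take its second lock while the other holds its first. With blocking `lock()` calls in place of `try_lock()`, both threads would wait forever.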

This PR takes a different approach:

* We add a new hash map, ReentrantHashMap, which is heavily inspired by absl::flat_hash_map.
* The key feature of ReentrantHashMap is that it allows the map to be mutated from within the equality function. We use a simple optimistic versioning scheme: we detect mutations that occur during equality tests and retry the lookup.
* This scheme is not robust to equality tests that intentionally mutate the map (e.g., the lookup might loop endlessly), but the goal is to catch incidental mutation caused by, e.g., releasing the GIL, which we assume is improbable.
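To make the optimistic versioning idea concrete, here is a minimal sketch under stated assumptions. It is not the ReentrantHashMap from this PR: `VersionedMap`, `Locate`, `retries()`, and `DemoLookupWithMutation` are hypothetical names, the table uses plain linear probing rather than an absl-style layout, there is no erase, and nothing here is thread-safe. It only shows the detect-and-retry step: record a version before calling the user's equality function, and restart the probe if the version changed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical illustration only; this is NOT the PR's ReentrantHashMap.
template <typename K, typename V>
class VersionedMap {
 public:
  using Hash = std::function<std::size_t(const K&)>;
  using Eq = std::function<bool(const K&, const K&)>;

  VersionedMap(Hash hash, Eq eq)
      : hash_(std::move(hash)), eq_(std::move(eq)), slots_(8) {}

  void Insert(const K& key, V value) {
    const std::size_t h = hash_(key);
    const std::size_t i = Locate(key, h);  // no user callbacks run after this
    if (slots_[i].has_value()) {
      slots_[i]->value = std::move(value);
    } else {
      slots_[i] = Slot{key, std::move(value), h};
      ++size_;
      if (size_ * 2 > slots_.size()) Grow();  // keep load factor <= 1/2
    }
    ++version_;  // every mutation bumps the version
  }

  std::optional<V> Find(const K& key) {
    const std::size_t i = Locate(key, hash_(key));
    if (!slots_[i].has_value()) return std::nullopt;
    return slots_[i]->value;
  }

  std::uint64_t retries() const { return retries_; }

 private:
  struct Slot { K key; V value; std::size_t hash; };

  // Returns the slot holding `key`, or the empty slot where it would go.
  std::size_t Locate(const K& key, std::size_t h) {
    for (;;) {
      const std::uint64_t start = version_;
      const std::size_t mask = slots_.size() - 1;
      for (std::size_t i = h & mask;; i = (i + 1) & mask) {
        if (!slots_[i].has_value()) return i;
        if (slots_[i]->hash != h) continue;
        const K candidate = slots_[i]->key;  // copy: eq_ may relocate slots
        const bool equal = eq_(candidate, key);
        if (version_ != start) break;  // map changed during eq_: restart
        if (equal) return i;
      }
      ++retries_;
    }
  }

  void Grow() {  // uses stored hashes, so no user callbacks run here
    std::vector<std::optional<Slot>> old(slots_.size() * 2);
    old.swap(slots_);
    const std::size_t mask = slots_.size() - 1;
    for (auto& s : old) {
      if (!s.has_value()) continue;
      std::size_t i = s->hash & mask;
      while (slots_[i].has_value()) i = (i + 1) & mask;
      slots_[i] = std::move(s);
    }
  }

  Hash hash_;
  Eq eq_;
  std::vector<std::optional<Slot>> slots_;
  std::size_t size_ = 0;
  std::uint64_t version_ = 0;
  std::uint64_t retries_ = 0;
};

// Usage sketch: the equality callback inserts into the map once mid-lookup,
// standing in for another thread mutating the cache while the GIL is released.
std::pair<int, std::uint64_t> DemoLookupWithMutation() {
  VersionedMap<int, int>* self = nullptr;
  bool mutate = false;
  VersionedMap<int, int> map(
      [](const int&) { return std::size_t{0}; },  // force every key to collide
      [&](const int& a, const int& b) {
        if (mutate) { mutate = false; self->Insert(3, 30); }
        return a == b;
      });
  self = &map;
  map.Insert(1, 10);
  map.Insert(2, 20);
  mutate = true;
  const int found = map.Find(2).value();
  return {found, map.retries()};
}
```

In the demo, the lookup for key 2 notices that the version changed during its first equality test, discards that probe, and succeeds on the second pass. An equality function that mutated the map on every call would never let a pass complete, which is the endless-loop caveat noted above.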

PiperOrigin-RevId: 868228444

copybara-service bot assigned hawkinsp Feb 11, 2026

copybara-service bot force-pushed the test_868228444 branch from 79345b3 to ed95ee6 Compare February 11, 2026 20:59
