Skip to content

Commit 3262000

Browse files
swolchokrwgk
andauthored
Add fast_type_map, use it authoritatively for local types and as a hint for global types (ABI breaking) (#5842)
* Add fast_type_map, use it authoritatively for local types and as a hint for global types nanobind has a similar two-level lookup strategy, added and explained by wjakob/nanobind@b515b1f In this PR I've ported this approach to pybind11. To avoid an ABI break, I've kept the fast maps to the `local_internals`. I think this should be safe because any particular module should see its `local_internals` reset at least as often as the global `internals`, and misses in the fast "hint" map for global types fall back to the global `internals`. Performance seems to have improved. Using my patched fork of pybind11_benchmark (https://github.com/swolchok/pybind11_benchmark/tree/benchmark-updates, specifically commit hash b6613d12607104d547b1c10a8145d1b3e9937266), I run bench.py and observe the MyInt case. Each time, I do 3 runs and just report all 3. master, Mac: 75.9, 76.9, 75.3 nsec/loop this PR, Mac: 73.8, 73.8, 73.6 nsec/loop master, Linux box: 188, 187, 188 nsec/loop this PR, Linux box: 164, 165, 164 nsec/loop Note that the "real" percentage improvement is larger than implied by the above because master does not yet include #5824. * simplify unsafe_reset_local_internals in test * pre-implement PYBIND11_INTERNALS_VERSION 12 * use PYBIND11_INTERNALS_VERSION 12 on Python 3.14 per suggestion * Implement reviewer comments: revert PY_VERSION_HEX change, fix REVIEW comment, add two-level lookup comments. ci.yml coming separately * Use the inplace build to smoke test ABI bump? * [skip ci] Remove "smoke" from comment. This is full testing, just only on a few platforms. --------- Co-authored-by: Ralf W. Grosse-Kunstleve <[email protected]>
1 parent 9ea1976 commit 3262000

File tree

6 files changed

+91
-16
lines changed

6 files changed

+91
-16
lines changed

.github/workflows/ci.yml

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -203,7 +203,8 @@ jobs:
203203
if: runner.os != 'Windows'
204204
run: echo "CMAKE_GENERATOR=Ninja" >> "$GITHUB_ENV"
205205

206-
# More-or-less randomly adding a few extra flags here
206+
# More-or-less randomly adding a few extra flags here.
207+
# In particular, use this one to test the next ABI bump (internals version).
207208
- name: Configure
208209
run: >
209210
cmake -S. -B.
@@ -213,6 +214,7 @@ jobs:
213214
-DDOWNLOAD_CATCH=ON
214215
-DDOWNLOAD_EIGEN=ON
215216
-DCMAKE_CXX_STANDARD=14
217+
-DPYBIND11_INTERNALS_VERSION=10000000
216218
217219
# Checks to makes sure defining `_` is allowed
218220
# Triggers EHsc missing error on Windows

include/pybind11/detail/class.h

Lines changed: 5 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -221,10 +221,14 @@ extern "C" inline void pybind11_meta_dealloc(PyObject *obj) {
221221
auto tindex = std::type_index(*tinfo->cpptype);
222222
internals.direct_conversions.erase(tindex);
223223

224+
auto &local_internals = get_local_internals();
224225
if (tinfo->module_local) {
225-
get_local_internals().registered_types_cpp.erase(tindex);
226+
local_internals.registered_types_cpp.erase(tinfo->cpptype);
226227
} else {
227228
internals.registered_types_cpp.erase(tindex);
229+
#if PYBIND11_INTERNALS_VERSION >= 12
230+
internals.registered_types_cpp_fast.erase(tinfo->cpptype);
231+
#endif
228232
}
229233
internals.registered_types_py.erase(tinfo->type);
230234

include/pybind11/detail/internals.h

Lines changed: 22 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -39,7 +39,6 @@
3939
/// further ABI-incompatible changes may be made before the ABI is officially
4040
/// changed to the new version.
4141
#ifndef PYBIND11_INTERNALS_VERSION
42-
// REMINDER for next version bump: remove loader_life_support_tls
4342
# define PYBIND11_INTERNALS_VERSION 11
4443
#endif
4544

@@ -177,6 +176,12 @@ struct type_equal_to {
177176
};
178177
#endif
179178

179+
// For now, we don't bother adding a fancy hash for pointers and just
180+
// let the standard library use the identity hash function if that's
181+
// what it wants to do (e.g., as in libstdc++).
182+
template <typename value_type>
183+
using fast_type_map = std::unordered_map<const std::type_info *, value_type>;
184+
180185
template <typename value_type>
181186
using type_map = std::unordered_map<std::type_index, value_type, type_hash, type_equal_to>;
182187

@@ -237,6 +242,15 @@ struct internals {
237242
pymutex mutex;
238243
pymutex exception_translator_mutex;
239244
#endif
245+
#if PYBIND11_INTERNALS_VERSION >= 12
246+
// non-normative but fast "hint" for registered_types_cpp. Meant
247+
// to be used as the first level of a two-level lookup: successful
248+
// lookups are correct, but unsuccessful lookups need to try
249+
// registered_types_cpp and then backfill this map if they find
250+
// anything.
251+
fast_type_map<type_info *> registered_types_cpp_fast;
252+
#endif
253+
240254
// std::type_index -> pybind11's type information
241255
type_map<type_info *> registered_types_cpp;
242256
// PyTypeObject* -> base type_info(s)
@@ -261,7 +275,9 @@ struct internals {
261275
PyObject *instance_base = nullptr;
262276
// Unused if PYBIND11_SIMPLE_GIL_MANAGEMENT is defined:
263277
thread_specific_storage<PyThreadState> tstate;
278+
#if PYBIND11_INTERNALS_VERSION <= 11
264279
thread_specific_storage<loader_life_support> loader_life_support_tls; // OBSOLETE (PR #5830)
280+
#endif
265281
// Unused if PYBIND11_SIMPLE_GIL_MANAGEMENT is defined:
266282
PyInterpreterState *istate = nullptr;
267283

@@ -302,7 +318,11 @@ struct internals {
302318
// impact any other modules, because the only things accessing the local internals is the
303319
// module that contains them.
304320
struct local_internals {
305-
type_map<type_info *> registered_types_cpp;
321+
// It should be safe to use fast_type_map here because this entire
322+
// data structure is scoped to our single module, and thus a single
323+
// DSO and single instance of type_info for any particular type.
324+
fast_type_map<type_info *> registered_types_cpp;
325+
306326
std::forward_list<ExceptionTranslator> registered_exception_translators;
307327
PyTypeObject *function_record_py_type = nullptr;
308328
};

include/pybind11/detail/type_caster_base.h

Lines changed: 28 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -205,21 +205,43 @@ PYBIND11_NOINLINE detail::type_info *get_type_info(PyTypeObject *type) {
205205
return bases.front();
206206
}
207207

208-
inline detail::type_info *get_local_type_info(const std::type_index &tp) {
209-
auto &locals = get_local_internals().registered_types_cpp;
210-
auto it = locals.find(tp);
208+
inline detail::type_info *get_local_type_info(const std::type_info &tp) {
209+
const auto &locals = get_local_internals().registered_types_cpp;
210+
auto it = locals.find(&tp);
211211
if (it != locals.end()) {
212212
return it->second;
213213
}
214214
return nullptr;
215215
}
216216

217-
inline detail::type_info *get_global_type_info(const std::type_index &tp) {
217+
inline detail::type_info *get_global_type_info(const std::type_info &tp) {
218+
// This is a two-level lookup. Hopefully we find the type info in
219+
// registered_types_cpp_fast, but if not we try
220+
// registered_types_cpp and fill registered_types_cpp_fast for
221+
// next time.
218222
return with_internals([&](internals &internals) {
219223
detail::type_info *type_info = nullptr;
224+
#if PYBIND11_INTERNALS_VERSION >= 12
225+
auto &fast_types = internals.registered_types_cpp_fast;
226+
#endif
220227
auto &types = internals.registered_types_cpp;
221-
auto it = types.find(tp);
228+
#if PYBIND11_INTERNALS_VERSION >= 12
229+
auto fast_it = fast_types.find(&tp);
230+
if (fast_it != fast_types.end()) {
231+
# ifndef NDEBUG
232+
auto types_it = types.find(std::type_index(tp));
233+
assert(types_it != types.end());
234+
assert(types_it->second == fast_it->second);
235+
# endif
236+
return fast_it->second;
237+
}
238+
#endif // PYBIND11_INTERNALS_VERSION >= 12
239+
240+
auto it = types.find(std::type_index(tp));
222241
if (it != types.end()) {
242+
#if PYBIND11_INTERNALS_VERSION >= 12
243+
fast_types.emplace(&tp, it->second);
244+
#endif
223245
type_info = it->second;
224246
}
225247
return type_info;
@@ -228,7 +250,7 @@ inline detail::type_info *get_global_type_info(const std::type_index &tp) {
228250

229251
/// Return the type info for a given C++ type; on lookup failure can either throw or return
230252
/// nullptr.
231-
PYBIND11_NOINLINE detail::type_info *get_type_info(const std::type_index &tp,
253+
PYBIND11_NOINLINE detail::type_info *get_type_info(const std::type_info &tp,
232254
bool throw_if_missing = false) {
233255
if (auto *ltype = get_local_type_info(tp)) {
234256
return ltype;

include/pybind11/pybind11.h

Lines changed: 17 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -1637,10 +1637,14 @@ class generic_type : public object {
16371637
with_internals([&](internals &internals) {
16381638
auto tindex = std::type_index(*rec.type);
16391639
tinfo->direct_conversions = &internals.direct_conversions[tindex];
1640+
auto &local_internals = get_local_internals();
16401641
if (rec.module_local) {
1641-
get_local_internals().registered_types_cpp[tindex] = tinfo;
1642+
local_internals.registered_types_cpp[rec.type] = tinfo;
16421643
} else {
16431644
internals.registered_types_cpp[tindex] = tinfo;
1645+
#if PYBIND11_INTERNALS_VERSION >= 12
1646+
internals.registered_types_cpp_fast[rec.type] = tinfo;
1647+
#endif
16441648
}
16451649

16461650
PYBIND11_WARNING_PUSH
@@ -2138,10 +2142,18 @@ class class_ : public detail::generic_type {
21382142

21392143
if (has_alias) {
21402144
with_internals([&](internals &internals) {
2141-
auto &instances = record.module_local ? get_local_internals().registered_types_cpp
2142-
: internals.registered_types_cpp;
2143-
instances[std::type_index(typeid(type_alias))]
2144-
= instances[std::type_index(typeid(type))];
2145+
auto &local_internals = get_local_internals();
2146+
if (record.module_local) {
2147+
local_internals.registered_types_cpp[&typeid(type_alias)]
2148+
= local_internals.registered_types_cpp[&typeid(type)];
2149+
} else {
2150+
type_info *const val
2151+
= internals.registered_types_cpp[std::type_index(typeid(type))];
2152+
internals.registered_types_cpp[std::type_index(typeid(type_alias))] = val;
2153+
#if PYBIND11_INTERNALS_VERSION >= 12
2154+
internals.registered_types_cpp_fast[&typeid(type_alias)] = val;
2155+
#endif
2156+
}
21452157
});
21462158
}
21472159
def("_pybind11_conduit_v1_", cpp_conduit_method);

tests/test_with_catch/external_module.cpp

Lines changed: 16 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -6,11 +6,26 @@ namespace py = pybind11;
66
* modules aren't preserved over a finalize/initialize.
77
*/
88

9+
namespace {
10+
// Compare unsafe_reset_internals_for_single_interpreter in
11+
// test_subinterpreter.cpp.
12+
void unsafe_reset_local_internals() {
13+
// NOTE: This code is NOT SAFE unless the caller guarantees no other threads are alive
14+
// NOTE: This code is tied to the precise implementation of the internals holder
15+
16+
py::detail::get_local_internals_pp_manager().unref();
17+
py::detail::get_local_internals();
18+
}
19+
} // namespace
20+
921
PYBIND11_MODULE(external_module,
1022
m,
1123
py::mod_gil_not_used(),
1224
py::multiple_interpreters::per_interpreter_gil()) {
13-
25+
// At least one test ("Single Subinterpreter") wants to reset
26+
// internals. We have separate local internals because we are a
27+
// separate DSO, so ours need to be reset too!
28+
unsafe_reset_local_internals();
1429
class A {
1530
public:
1631
explicit A(int value) : v{value} {};

0 commit comments

Comments
 (0)