perf: swap Signature.bind for internal dataclass within Annotable #11691

JonAnCla · 2025-10-15T20:51:48Z

Description of changes

This PR makes some performance improvements to the Annotable class, which is a base class used for many ibis internals, particularly Node. It therefore speeds up the building of all expressions.

While investigating the points made in #11641 further, I found that Signature.bind is a major bottleneck when instantiating Annotable objects. Signature.bind is a Python standard library function that implements, for a given Python function/class, an "interpreter" for mapping passed args/kwargs to the actual named args/kwargs. However, because it is implemented in Python, it is roughly an order of magnitude slower (~10-20x) than the code inside CPython that implements the same process.
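To make the comparison concrete, here is a minimal timing sketch. It is not the benchmark used for this PR: the `Proxy` dataclass, the two helper functions and the iteration count are illustrative assumptions, and the absolute numbers will vary by machine and Python version.

```python
import timeit
from dataclasses import dataclass
from inspect import Parameter, Signature

# A two-parameter signature, mirroring a small Annotable with `foo` and `bar`.
sig = Signature(
    [
        Parameter("foo", Parameter.POSITIONAL_OR_KEYWORD),
        Parameter("bar", Parameter.POSITIONAL_OR_KEYWORD),
    ]
)


@dataclass
class Proxy:
    # Hypothetical stand-in with the same two fields.
    foo: int
    bar: str


def bind_with_signature():
    # Pure-Python binding of kwargs to parameter names.
    return sig.bind(foo=42, bar="hello").arguments


def bind_with_dataclass():
    # Binding is done by the interpreter's normal call machinery.
    return Proxy(foo=42, bar="hello").__dict__


n = 20_000
t_sig = timeit.timeit(bind_with_signature, number=n)
t_dc = timeit.timeit(bind_with_dataclass, number=n)
print(f"Signature.bind: {t_sig / n * 1e6:.2f} us per call")
print(f"dataclass init: {t_dc / n * 1e6:.2f} us per call")
```

Both helpers produce the same name-to-value mapping; only the cost of getting there differs.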

In this PR I've therefore replaced the call to Signature.bind with the following:

- When an Annotable class is created, also create a "proxy dataclass" using standard Python dataclasses, with exactly the same signature.
- When an Annotable is instantiated, use the "proxy dataclass" to bind the passed args/kwargs to the annotated/named args/kwargs, and extract the resulting __dict__ from that dataclass instance.
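The two steps above can be sketched roughly as follows. This is an illustration of the idea rather than the code in this PR: `make_proxy`, `MyFields` and `bind` are hypothetical names, and the sketch ignores defaults, keyword-only arguments and varargs, which a real implementation would need to handle.

```python
from dataclasses import make_dataclass


def make_proxy(cls):
    """Build a plain dataclass with one field per annotation on `cls`.

    Hypothetical helper: only handles required fields without defaults.
    """
    fields = list(cls.__annotations__.items())
    return make_dataclass(f"_{cls.__name__}Proxy", fields)


class MyFields:
    # Stand-in for an Annotable subclass; only the annotations matter here.
    foo: int
    bar: str


# Step 1: create the proxy once, at class-creation time.
Proxy = make_proxy(MyFields)


def bind(*args, **kwargs):
    # Step 2: let the generated __init__ do the binding, then read the
    # bound values back out of the instance __dict__.
    return dict(Proxy(*args, **kwargs).__dict__)


print(bind(42, bar="hello"))  # {'foo': 42, 'bar': 'hello'}
```

Because the proxy is created once per class, the per-instantiation cost is a single ordinary constructor call.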

The changes are not that intrusive, but they are a little bit "funky". Performance is improved by 10-50% across all expression-building benchmarks (larger expressions benefit more).

This is a POC: a few tests that check that Annotable raises when incorrect args/kwargs are passed currently fail, because the exceptions raised have slightly different text than before. I think these are all solvable, but I wanted to check that the approach is acceptable before continuing.
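As an illustration of why such tests can break, the sketch below triggers the same mistake (a missing argument) through both paths and compares the error text. The `Proxy` class and `error_text` helper are hypothetical, and the exact wording of each message depends on the Python version, so no specific output is claimed here.

```python
from dataclasses import dataclass
from inspect import Parameter, Signature

sig = Signature(
    [
        Parameter("foo", Parameter.POSITIONAL_OR_KEYWORD),
        Parameter("bar", Parameter.POSITIONAL_OR_KEYWORD),
    ]
)


@dataclass
class Proxy:
    # Hypothetical proxy with the same two required fields.
    foo: int
    bar: str


def error_text(func, *args, **kwargs):
    """Return the TypeError message raised by the call, or None if it succeeds."""
    try:
        func(*args, **kwargs)
    except TypeError as exc:
        return str(exc)
    return None


# Omit `bar` in both cases and compare the resulting messages.
sig_msg = error_text(sig.bind, foo=42)
dc_msg = error_text(Proxy, foo=42)
print("Signature.bind:", sig_msg)
print("dataclass init:", dc_msg)
```

Both calls raise TypeError and name the missing parameter, but the surrounding wording differs, which is enough to break tests that match on the full message.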

I've attached some profiles. The (IPython) code to generate them is below:

from ibis.common.grounds import Annotable

class MyAnnotable(Annotable):
    foo: int
    bar: str

import line_profiler
%load_ext line_profiler

%lprun -s -m ibis.common.grounds -m ibis.common.bases [MyAnnotable(foo=42, bar="hello") for _ in range(100)]

profile-after.txt
profile-before.txt

@kszucs and @cpcloud if you could take a look as time allows and let me know thoughts that'd be much appreciated. Thanks!

kszucs · 2025-10-16T09:44:18Z

Python dataclasses render the source of the __init__ method and then call exec() to turn it into an actual function, which is why they are faster. We can implement that ourselves based on the signature object to speed up instantiation. I would rather not create an intermediate dataclass, since we can generate the __init__ method ourselves with tighter control.
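A minimal sketch of that technique, assuming a signature made up only of positional-or-keyword parameters without defaults. `make_binder` is a hypothetical helper, not an existing ibis or standard library function, and a real version would also need to render defaults, keyword-only parameters and varargs.

```python
from inspect import Parameter, Signature


def make_binder(sig):
    """Render and compile a function that binds args/kwargs to parameter names.

    Hypothetical helper: supports only required positional-or-keyword
    parameters.
    """
    names = list(sig.parameters)
    params = ", ".join(names)
    items = ", ".join(f"{name!r}: {name}" for name in names)
    # For (foo, bar) this renders:
    #   def bind(foo, bar):
    #       return {'foo': foo, 'bar': bar}
    source = f"def bind({params}):\n    return {{{items}}}\n"
    namespace = {}
    exec(source, namespace)  # compile the rendered source once
    return namespace["bind"]


sig = Signature(
    [
        Parameter("foo", Parameter.POSITIONAL_OR_KEYWORD),
        Parameter("bar", Parameter.POSITIONAL_OR_KEYWORD),
    ]
)

bind = make_binder(sig)
print(bind(42, bar="hello"))  # {'foo': 42, 'bar': 'hello'}
```

The exec() call runs once per signature; every later call to `bind` is an ordinary function call, so the interpreter performs the argument binding natively.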

JonAnCla · 2025-10-16T09:58:08Z

Thanks, my one reservation would be that using exec feels a bit icky :)

At least if we delegate that job to dataclasses (which, as you say, uses exec to build the dataclass __init__ method), we're getting a well-known and well-tested piece of code to do that piece of dirty work.

Another thing to consider is that, with these changes, initialising objects via the internal dataclass is no longer a bottleneck (checking types using Pattern etc. is), so we may not need to speed up the Signature.bind part much further.

Having said the above, I don't have a strong opinion and am happy to re-implement as preferred, so let me know whether you still have the same preference. Thanks for taking a look!

kszucs · 2025-10-16T19:24:42Z

I actually have a port of GitHub.com/kszucs/koerce using mypyc that has better runtime performance than the current Cython implementation of koerce. I also managed to speed up signature binding significantly. If you are interested, I can share it in the upcoming days or next week. We could offload additional perf-critical paths, with continuous benchmarking configured to ensure good performance.

JonAnCla · 2025-10-16T21:12:48Z

Sounds great, please do share when you have time :)
Would this be something you'd hope to get into ibis itself, or would it sit outside as an add-on?

- WIP

9a2f4e7
