-
Notifications
You must be signed in to change notification settings - Fork 8
Proof-of-concept Python support using converter nodes #45
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. @@ Coverage Diff @@
## main #45 +/- ##
===========================================
+ Coverage 52.94% 81.67% +28.72%
===========================================
Files 114 118 +4
Lines 2289 2074 -215
Branches 1107 336 -771
===========================================
+ Hits 1212 1694 +482
- Misses 237 244 +7
+ Partials 840 136 -704
Flags with carried forward coverage won't be shown. Click here to find out more. Continue to review full report in Codecov by Sentry.
🚀 New features to boost your workflow:
|
…ion from annotations
beojan
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't help but think this could be made a lot shorter and easier to read by using Pybind11 (which supports embedding the Python interpreter). That would be a fair amount of work though, and probably not worth it for the prototype.
| // Python wrapper for Phlex handles | ||
| extern PyTypeObject PhlexLifeline_Type; | ||
| // clang-format off | ||
| typedef struct py_lifeline { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be re-written in the standard C++ style. The C syntax ends up creating two names for the type (py_lifeline and py_lifeline_t).
| if (mod) { | ||
| PyObject* reg = PyObject_GetAttrString(mod, "PHLEX_EXPERIMENTAL_REGISTER_ALGORITHMS"); | ||
| if (reg) { | ||
| PyObject* pym = wrap_module(&m); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wrapped_module?
| dlopen(info.dli_fname, RTLD_GLOBAL | RTLD_NOW); | ||
| } | ||
|
|
||
| #if PY_VERSION_HEX < 0x03020000 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is support for such old Python versions required?
| import_numpy(true); | ||
| #endif | ||
|
|
||
| // TODO: the GIL should first be released on the main thread and this seems |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you use a static RAII type that releases the GIL on construction and re-acquires on destruction that should solve this, and automatically handle only releasing it once:
https://en.cppreference.com/w/cpp/language/storage_duration.html#Static_block_variables
| // clang-format on | ||
|
|
||
| PyObject* phlex::experimental::wrap_configuration(configuration const* config) | ||
| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should document that the caller must release the reference
| } | ||
|
|
||
| template <typename... Args> | ||
| void callv(Args... args) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the difference between call and callv?
| if (!np_view) \ | ||
| return (intptr_t)nullptr; \ | ||
| \ | ||
| /* make the date read-only by not making it writable */ \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
data?
| .input_family(cinputs[0] + "py", cinputs[1] + "py", cinputs[2] + "py"); | ||
| } | ||
| } else { | ||
| PyErr_SetString(PyExc_TypeError, "unsupported number of inputs"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems limiting
| auto const& inp_type = input_types[i]; | ||
|
|
||
| if (inp_type == "bool") | ||
| INSERT_INPUT_CONVERTER(bool, inp); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we have more than one Python algorithm using the same inputs, wouldn't this register the same conversion algorithm multiple times?
| .input_family(cinputs[0] + "py") | ||
| .output_products("py" + output); | ||
| } else { | ||
| auto* pyc = new py_callback_1v{callable}; // id. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not obvious that the v suffix here is for void and not vector.
This is a proof-of-concept breadboard to show we can actually run Python algorithms with very low overhead and without requiring the Python algorithm author to write any C++. The current implementation largely mimics the C++ one, while using Python reflection to reduce boilerplate.
The trick is to add converter nodes, to/from
PyObject*, which solves the combinatorical problem by making it enumerable. (This current code is just an exemplar providing converters for a subset of builtin types; the actual enumeration still needs to happen, e.g. generated from the IDL or maybe using the public converter interface of cppyy to have a complete set of converters.)Lifetimes of view objects created by converter nodes are handled with a "lifeline" Python object to equalize their lives. The actual callback checks for lifeline objects and extracts the view for passing to Python for any found, leaving the source alive until the end of the callback.
Intermediaries are created from a combination of input/output labels as used in the registration and Python reflection (eg. function name and annotations). There's a trade-off between conformity and allowable flexibility that will need to be fleshed out with the stake-holders. For example, the formal argument names of the Python function could be used as (part of) the label.
Current tests show the use of a C++ producer, a Python algorithm performing a transformation, and a Python observer to verify the result. Intermediate converter nodes are not yet discarded for back-to-back Python calls, which is an optimization that needs to happen either at graph construction time (probably difficult) or after the fact on the graph. All Python calls are still serialized explicitly (and acquire the GIL). Error reporting translates any Python exceptions to C++
std::runtime_error.One important open question remains, but does not need to be resolved before including this PR into the prototype:
Do not merge this pull request yet. This PR exists to document delivery of the matching F1 milestones and in preparation of the prototype. Of the listed open questions above, at minimum point 1 needs to be resolved before being able to merge it, as the current location (in
test/python) isn't good enough for the prototype.