Skip to content

Conversation

@phoebusm
Copy link
Collaborator

@phoebusm phoebusm commented Dec 19, 2025

Reference Issues/PRs

Part of https://man312219.monday.com/boards/7852509418/pulses/18378732140 and
https://man312219.monday.com/boards/7852509418/pulses/8371158618

What does this implement or fix?

  1. Consolidate some binding functions so they can be reused in different repos
  2. Fix sts fixtures missing configuration if the native version store is reused
    ( Now add_s3_library_to_env set both native config and class variable for config aws_auth and aws_profile. They should only be set in native config. It can't be changed until 7.0.0)

Any other comments?

Checklist

Checklist for code changes...
  • Have you updated the relevant docstrings, documentation and copyright notice?
  • Is this contribution tested against all ArcticDB's features?
  • Do all exceptions introduced raise appropriate error messages?
  • Are API changes highlighted in the PR description?
  • Is the PR labelled as enhancement or bug so it appears in autogenerated release notes?

@phoebusm phoebusm added patch Small change, should increase patch version no-release-notes This PR shouldn't be added to release notes. labels Dec 19, 2025
@phoebusm phoebusm changed the title For Enterprise ABI compatibility v2 For Enterprise ABI compatibility Dec 19, 2025
@phoebusm phoebusm changed the title For Enterprise ABI compatibility Consolidate bindings for Enterprise ABI compatibility Dec 24, 2025
@phoebusm phoebusm marked this pull request as ready for review December 24, 2025 17:32
@phoebusm phoebusm removed the no-release-notes This PR shouldn't be added to release notes. label Dec 24, 2025
s3::S3Settings s3_settings(const pybind11::tuple& t);
pybind11::tuple to_tuple(const s3::GCPXMLSettings& settings);
pybind11::tuple to_tuple(const s3::S3Settings& settings);
void register_common_bindings(pybind11::module& m, bool local_bindings);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Maybe rename to register_common_storage_bindings?

[](py::tuple t) {
util::check(t.size() >= 7, "Invalid AtomKey pickle object!");

[[maybe_unused]] const int serialization_version = t[0].cast<int>();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this needed? Why don't we remove it?

);
},
[](py::tuple t) {
util::check(t.size() >= 7, "Invalid AtomKey pickle object!");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not check for t.size() == 7

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's for some sort of future proof, if new setting is added to the class.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's for the staged data tokens API, which makes the staged data tokens (which wrap atom keys) pickleable. This helps to use the staged data API across processes which may have different ArcticDB versions.


#pragma once

#include <pybind11/pybind11.h>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need this header in the .hpp file. You can forward declare pybind11::module and place the header only in the .cpp

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That'll work as well. Is it for the sake of compilation speed?

@@ -0,0 +1,13 @@
#pragma once
#include <pybind11/pybind11.h>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

namespace arcticdb::entity::apy {

void register_common_entity_bindings(py::module& m, BindingScope scope) {
bool local_bindings = (scope == BindingScope::LOCAL);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

register_common_entity_bindings is only called with BindingScope::GLOBAL why does it have an option to pass LOCAL?

I've probably missed some part of the discussion. So I wanted to double check the intent.

We will distinguish ArcticDB and ArcticDB Enterprise AtomKey and VersionItem? ArcticDB Enterprise users won't be able to pass in AtomKey return from ArcticDB.

This will also mean the the library tool currently won't be able to operate with AtomKeys from Enterprise.

Copy link
Collaborator Author

@phoebusm phoebusm Jan 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

register_common_entity_bindings is only called with BindingScope::GLOBAL why does it have an option to pass LOCAL?

It used by downstream library, namely Enterprise.

Copy link
Collaborator Author

@phoebusm phoebusm Jan 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will distinguish ArcticDB and ArcticDB Enterprise AtomKey and VersionItem? ArcticDB Enterprise users won't be able to pass in AtomKey return from ArcticDB.

It's never supposed to work out of the box as they are defined in different binaries.
The reason why they are needed to be bound in enterprise, as they are the return type of some functions. If core and enterprise are ABI-compatible, not binding those class will give us pybind exception.
But it will be very useful to provide some helper functions to convert those two types between two binaries

Comment on lines +3 to +7
#include <pybind11/pybind11.h>
#include <arcticdb/util/error_code.hpp>
#include <arcticdb/storage/s3/s3_settings.hpp>
#include <arcticdb/storage/common.hpp>
#include <arcticdb/python/python_bindings_common.hpp>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can trim some of the includes using fwd declarations.


namespace arcticdb::entity::apy {

void register_common_entity_bindings(pybind11::module& m, bool local_bindings);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does "common" mean here?


namespace arcticdb::entity::apy {

void register_common_entity_bindings(pybind11::module& m, bool local_bindings);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should avoid bool flags on APIs, it's hard to guess what the false in entity::apy::register_common_entity_bindings(version, false); means. Instead there should be separate functions for the true and false cases, which might push down to a private function with the bool in its signature.

logger.error(f"Error creating access key: {e}")
raise e

out.native_config = NativeVariantStorage(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand this change, although I've seen your comment,

Fix sts fixtures missing configuration if the native version store is reused
( Now add_s3_library_to_env set both native config and class variable for config aws_auth and aws_profile. They should only be set in native config. It can't be changed until 7.0.0)

What does if the native version store is reused mean? A unit test to go with this change would make this easier to review, and help us avoid breaking it again.

raise e

out.native_config = NativeVariantStorage(
NativeS3Settings(AWSAuthMethod.STS_PROFILE_CREDENTIALS_PROVIDER, profile_name, False)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use kwargs for the last argument whenever we create this object? This also has the True / False readability problem.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These unit tests don't cover the _core_api.py functions?

Copyright 2023 Man Group Operations Ltd.
NO WARRANTY, EXPRESSED OR IMPLIED.
This module implements a backwards compatible version of msgpack functions.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it?

This module implements a backwards compatible version of msgpack functions.
"""

# Treat everything here as public APIs!!!! Any change made here may break downstream repos!!!!
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worth mentioning that we should apply semantic versioning here

Copy link
Collaborator

@poodlewars poodlewars Jan 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to make the metrics config pickleable? This relates to this comment on the enterprise PR https://github.com/man-group/arcticdb-enterprise/pull/281#discussion_r2650734598 . All we need is to add accessors to the fields in hte metrics config? It's confusing to use the pickling APIs for this purpose.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

patch Small change, should increase patch version

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants