Multiprocessing components do not work properly #36

Open
andrewstenger opened this issue May 30, 2024 · 0 comments

andrewstenger commented May 30, 2024

The functions in Boto-Plus that implement multiprocessing via Python's multiprocessing library fail when the use_multiprocessing argument is set to True. The cause is that boto3's resource/client objects cannot be pickled, but multiprocessing pickles everything it sends to the worker processes, including the function and arguments passed to a Pool method such as map(). Any call that ships a boto3 object to the pool therefore fails.

The solution will involve a non-trivial refactoring of the code. One idea I've had: when a function is called with the use_multiprocessing argument, it instantiates a separate class made specifically for multiprocessing. That class creates the boto3 resource/client entirely within the _mp_ function that gets applied during multiprocessing, which is invoked via a with statement similar to the code below:

with mp.Pool() as pool:
    uris = pool.map(self.__copy_object_mp_unpack, payloads)

In this case, we would have to instantiate the boto3 objects within the self.__copy_object_mp_unpack function. Alternative solutions can also be explored if this is too unwieldy.
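As a rough sketch of that idea, the snippet below contrasts the two layouts. Everything in it is hypothetical and is not Boto-Plus code: `FakeClient` is a stand-in for a boto3 client (it holds a lock so that it cannot be pickled, and no AWS access is needed), and `Copier`, `copy_object_worker`, `copy_objects`, and the payload keys are illustrative names only.

```python
import multiprocessing as mp
import pickle
import threading


class FakeClient:
    """Hypothetical stand-in for a boto3 client; the lock makes it unpicklable."""

    def __init__(self):
        self._lock = threading.Lock()

    def copy(self, source_uri, target_uri):
        # A real implementation would call the S3 API here.
        return target_uri


class Copier:
    """Mirrors the failing layout: the client is stored on the instance."""

    def __init__(self):
        self.client = FakeClient()

    def copy_object_mp_unpack(self, payload):
        return self.client.copy(**payload)


def copy_object_worker(payload):
    """Module-level worker that builds its own client inside the child process."""
    client = FakeClient()  # with boto3 this would be e.g. boto3.client("s3")
    return client.copy(**payload)


def copy_objects(payloads, processes=2):
    # Only the function reference and the plain-dict payloads get pickled,
    # so no client object ever crosses the process boundary.
    with mp.Pool(processes) as pool:
        return pool.map(copy_object_worker, payloads)
```

Passing `Copier().copy_object_mp_unpack` to `pool.map` fails because pickling a bound method also pickles the instance and its client, whereas `copy_objects(payloads)` only sends plain dictionaries. On platforms that spawn worker processes, the call should sit under an `if __name__ == "__main__":` guard. Creating a client per payload is simple but may be wasteful. Building one client per worker through the `initializer` argument of `mp.Pool` is another option worth exploring.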
