Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[pickle] Dump using the highest available protocol #261

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

john-bodley
Copy link

@john-bodley john-bodley commented Jul 17, 2021

This PR ensures that all backends use the same protocol (the highest available) for pickling the payload to cache. Some backends currently use the default (4 for Python 3.8) whereas some use 2 and others use the highest protocol available (maximum of 5).

Per the pickle documentation,

Protocol version 5 was added in Python 3.8. It adds support for out-of-band data and speedup for in-band data. Refer to PEP 574 for information about improvements brought by protocol 5.

it seems there is merit in utilizing the highest available protocol for consistency and increased performance.

Note that it seems the protocol is encoded within the dumped object, i.e., one does not need to re-specify the protocol when loading via pickle.loads(...), thus this should be a non-breaking change.

@@ -327,7 +327,7 @@ def _set(self, key, value, timeout=None):
# I didn't found a good way to avoid pickling/unpickling if
# key is smaller than chunksize, because in case or <werkzeug.requests>
# getting the length consume the data iterator.
serialized = pickle.dumps(value, 2)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm uncertain why this was hardcoded as 2.

@sh4nks
Copy link
Collaborator

sh4nks commented Jul 22, 2021

While I am generally for this PR I am not sure if this might break some applications?

@@ -102,7 +102,7 @@ def set(self, key, value, timeout=None):
full_key = self.key_prefix + key
content_type = "application/json"
try:
value = json.dumps(value)
value = json.dumps(value, pickle.HIGHEST_PROTOCOL)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pickle isn't used here

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies. I got a tad carried away.

@john-bodley
Copy link
Author

@sh4nks regarding,

While I am generally for this PR I am not sure if this might break some applications?

per here:

Note: Newer versions of the protocol offer more features and improvements but are limited to higher versions of the interpreter. Be sure to consider this when choosing which protocol to use.

To identify the highest protocol that your interpreter supports, you can check the value of the pickle.HIGHEST_PROTOCOL attribute.

My take is that by using pickle.HIGHEST_PROTOCOL (as opposed to an explicit version—other than -1) ensures safely on a per interpreter level.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

3 participants