-
-
Notifications
You must be signed in to change notification settings - Fork 18.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
TST(string dtype): Resolve xfail in test_base.py #60713
Conversation
if dtype is not None: | ||
raise TypeError("Cannot change data-type for string array.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are there any dtypes we do want to accept here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think so - the point of NumPy's view
method seems incompatible with the string data types (both object-based and Arrow-based)
def view(self, dtype: Dtype | None = None) -> ArrayLike: | ||
if dtype is not None: | ||
raise TypeError("Cannot change data-type for string array.") | ||
return super().view(dtype=dtype) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As an aside, I noticed NumPy's documentation on view says:
Passing None for dtype is different from omitting the parameter, since the former invokes dtype(None) which is an alias for dtype('float64').
So I'm assuming we overload the meaning of dtype=None
in our extension class hierarchy somewhere, otherwise this would fail (?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure what you mean by "otherwise this would fail". I think we have behavior that differs from NumPy here:
pandas/pandas/core/arrays/base.py
Lines 1874 to 1876 in fa5c255
if dtype is not None: | |
raise NotImplementedError(dtype) | |
return self[:] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK good to know we override. It wouldn't make sense otherwise to reinterpret these bytes using NumPy's default behavior (float)
@@ -533,6 +533,11 @@ def _str_map_nan_semantics( | |||
else: | |||
return self._str_map_str_or_object(dtype, na_value, arr, f, mask) | |||
|
|||
def view(self, dtype: Dtype | None = None) -> ArrayLike: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overriding this here, the only thing it does is change the error from NotImplementedError to TypeError compared to the implementation in the base class?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, for non-Arrow StringArray, this avoids getting the NDArrayBackedExtensionArray.view
implemetation, which has different behaviour compared to the base class Extensionarray.view
Thanks @rhshadrach |
Owee, I'm MrMeeseeks, Look at me. There seem to be a conflict, please backport manually. Here are approximate instructions:
And apply the correct labels and milestones. Congratulations — you did some good work! Hopefully your backport PR will be tested by the continuous integration and merged soon! Remember to remove the If these instructions are inaccurate, feel free to suggest an improvement. |
Backport PR: #60742 |
doc/source/whatsnew/vX.X.X.rst
file if fixing a bug or adding a new feature.