ENH: Added isascii() string method fixing issue #59091 (#60532)

Merged
mroeschke merged 40 commits into pandas-dev:main on Jan 2, 2025

Conversation

avecasey
Contributor

This pull request introduces the isascii() string method to the pandas API, enabling users to efficiently check whether all characters in each string are ASCII. The implementation leverages the pyarrow.compute.string_is_ascii function for high-performance processing when working with Arrow-backed pandas objects. The method has been integrated into the pandas string accessor (str) via the _map_and_wrap utility, ensuring it aligns with pandas' existing API design.

Summary of Changes:

* Added the isascii method to the string accessor using _map_and_wrap for dynamic registration and documentation integration.
* Implemented the core logic in _str_isascii, which uses pyarrow.compute.string_is_ascii.
* Added corresponding tests to verify correctness, performance, and edge cases.
* Updated the whatsnew documentation to include the new feature in the upcoming version notes.

This enhancement addresses requests for additional string validation methods (closes #59091) and aligns pandas with commonly needed text processing capabilities.
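
For readers who want a feel for the element-wise behaviour described above, here is a minimal pure-Python sketch. It is not the code from this PR: `isascii_elementwise` is a hypothetical helper, and the choice to pass missing values through as `None` is an assumption made for illustration. It relies only on the built-in `str.isascii`.

```python
from typing import List, Optional, Sequence


def isascii_elementwise(values: Sequence[Optional[str]]) -> List[Optional[bool]]:
    """Hypothetical helper: apply str.isascii to each element.

    Missing values (None) are passed through unchanged; this is an
    assumption for the sketch, not a statement about pandas internals.
    """
    return [None if v is None else v.isascii() for v in values]


data = ["hello", "héllo", "", "123", None, "日本語"]
result = isascii_elementwise(data)
print(result)  # [True, False, True, True, None, False]
```

Note that the empty string counts as ASCII, matching the built-in `str.isascii`. With this PR applied, the corresponding call on a pandas Series would presumably be `ser.str.isascii()`.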

@WillAyd
Member

WillAyd commented Dec 23, 2024

I think this looks good in general. I don't think the CI failures are related - can you try merging in main to see if that fixes things?

mroeschke added the "Strings" label (String extension data type and string data) on Dec 29, 2024
@mroeschke
Member

pre-commit.ci autofix

mroeschke added this to the 3.0 milestone on Jan 2, 2025
mroeschke merged commit 228627a into pandas-dev:main on Jan 2, 2025
51 checks passed
mroeschke added a commit that referenced this pull request Jan 2, 2025
* first

* second

* Update object_array.py

* third

* ascii

* ascii2

* ascii3

* ascii3

* ascii3

* ascii3

* ascii3

* ascii3

* ascii3

* ascii3

* ascii3

* ascii3

* ascii3

* ascii3

* ascii3

* ascii3

* ascii3

* ascii3

* ascii3

* ascii3

* ascii3

* ascii3

* ascii3

* ascii3

* ascii3

* ascii3

* style

* style

* style

* style

* docs

* reset

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update doc/source/whatsnew/v3.0.0.rst

---------

Co-authored-by: Abby VeCasey <[email protected]>
Co-authored-by: Matthew Roeschke <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
@mroeschke
Member

Thanks @avecasey

gmcrocetti pushed a commit to gmcrocetti/pandas that referenced this pull request Jan 3, 2025
Labels
Strings String extension data type and string data
Development

Successfully merging this pull request may close these issues.

BUG: String methods has no method "isascii()"
3 participants