extract_dict and extract_regex_tok should return TokenSpanArray, not DataFrame #206

@frreiss

For legacy reasons, the functions extract_dict() and extract_regex_tok() in spanner/extract.py return single-column DataFrames. These functions should return TokenSpanArray objects instead. Users who want a DataFrame can construct one on top of the returned array.
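A minimal sketch of what "construct one on top of the returned array" could look like for callers. To keep it runnable with pandas alone, it uses `pd.arrays.IntervalArray` as a stand-in for `TokenSpanArray`. The function `extract_spans_stub` and the column name `"match"` are hypothetical and are not part of the library.

```python
import pandas as pd


def extract_spans_stub() -> pd.arrays.IntervalArray:
    # Hypothetical stand-in for extract_dict() / extract_regex_tok() after
    # the proposed change: it returns an extension array, not a DataFrame.
    # IntervalArray is used here only because it ships with pandas; the real
    # functions would return a TokenSpanArray.
    return pd.arrays.IntervalArray.from_tuples([(0, 3), (5, 9)])


# Callers that only need the spans use the array directly.
spans = extract_spans_stub()

# Callers that still want a DataFrame wrap the array in one line.
# The column name "match" is an assumption made for this example.
df = pd.DataFrame({"match": spans})
```

Because pandas stores extension arrays as columns without converting them, wrapping the result this way should keep the span dtype intact.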

In addition to the testing code in test_extract.py, there is some downstream code in the notebooks that will need to be modified to deal with this API change.
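One possible way to ease the migration of downstream notebook code is a small shim that accepts either return type. This is only a sketch: `as_span_array` is a hypothetical helper, not an existing function in the project, and it assumes the legacy return value is a DataFrame with exactly one column.

```python
import pandas as pd


def as_span_array(result):
    """Return the underlying array from either the old or the new API.

    Hypothetical helper: accepts a legacy single-column DataFrame or an
    already-unwrapped array and always hands back the array.
    """
    if isinstance(result, pd.DataFrame):
        if result.shape[1] != 1:
            raise ValueError("expected a single-column DataFrame")
        # .array exposes the column's backing extension array.
        return result.iloc[:, 0].array
    # New-style return value: already an array, pass it through unchanged.
    return result


# Legacy-style result (single-column DataFrame) and new-style result (array).
legacy = pd.DataFrame({"match": pd.arrays.IntervalArray.from_tuples([(0, 3)])})
modern = pd.arrays.IntervalArray.from_tuples([(0, 3)])

legacy_arr = as_span_array(legacy)
modern_arr = as_span_array(modern)
```

A shim like this would be temporary; once the notebooks are updated, callers can use the returned array directly.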
