extract_dict and extract_regex_tok should return TokenSpanArray, not DataFrame

For legacy reasons, the functions `extract_dict()` and `extract_regex_tok()` in `spanner/extract.py` return single-column DataFrames. These functions should return `TokenSpanArray` objects instead. Users who want a DataFrame can construct one on top of the returned array.
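As a rough sketch of what callers would do after this change, the snippet below wraps an extension array in a one-column DataFrame. The helper `spans_to_frame` and the column name `"match"` are hypothetical and not part of the library. The pandas `IntervalArray` is only a stand-in for the `TokenSpanArray` that `extract_dict()` or `extract_regex_tok()` would return, so the example runs without the library installed.

```python
import pandas as pd


def spans_to_frame(spans, column_name="match"):
    """Hypothetical helper: wrap an extension array in a one-column DataFrame."""
    return pd.DataFrame({column_name: spans})


# Stand-in for the array that extract_dict() or extract_regex_tok() would
# return after this change. A real call would yield a TokenSpanArray; a
# pandas IntervalArray of (begin, end) token offsets is used here only so
# the sketch runs on its own.
spans = pd.arrays.IntervalArray.from_tuples([(0, 2), (5, 6)], closed="left")

# Callers that still want the old single-column DataFrame build it themselves.
df = spans_to_frame(spans)
print(df)
```

Callers that only need the spans can use the returned array directly and skip the DataFrame entirely.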

This API change will require updates to the testing code in `test_extract.py` and to some downstream code in the notebooks.

extract_dict and extract_regex_tok should return TokenSpanArray, not DataFrame #206
