@ms8r ms8r commented Jan 1, 2017

Enables access to the items in a Dataset Row by index or by column header. For example, data[0]['first_name'] == data[0][0] if 'first_name' is the label of the first column as specified in the Dataset's headers. (Ref. issues #22, #158, #265.)

Implemented by adding a Row attribute _dset that stores a reference to the Dataset that "owns" the Row, which gives each Row access to the parent Dataset's headers. Constructors, insert methods and item getters/setters have been updated accordingly. In addition, Dataset has a new attribute _lblidx that indicates whether label-based indexing is possible (i.e. a header with unique labels exists). _lblidx is maintained via the updated headers property.
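
A minimal, self-contained sketch of the mechanism described above, not the code in this PR. `SketchRow`, `SketchDataset`, `_row` and `_resolve` are hypothetical stand-ins; only `_dset`, `_lblidx` and `headers` are names taken from the description, and the exact rules for when `_lblidx` is set are an assumption.

```python
class SketchRow:
    """Hypothetical stand-in for a Row that knows which dataset owns it."""

    def __init__(self, values, dset=None):
        self._row = list(values)
        self._dset = dset  # back-reference to the owning dataset

    def _resolve(self, key):
        # Integers and slices pass through; strings are looked up in the headers.
        if isinstance(key, str):
            if self._dset is None or not self._dset._lblidx:
                raise KeyError(f"label-based indexing unavailable for {key!r}")
            try:
                return self._dset.headers.index(key)
            except ValueError:
                raise KeyError(key) from None
        return key

    def __getitem__(self, key):
        return self._row[self._resolve(key)]

    def __setitem__(self, key, value):
        self._row[self._resolve(key)] = value


class SketchDataset:
    """Hypothetical stand-in for a Dataset that hands out SketchRow objects."""

    def __init__(self, *rows, headers=None):
        self._data = [SketchRow(r, dset=self) for r in rows]
        self.headers = headers  # goes through the property setter below

    @property
    def headers(self):
        return self._headers

    @headers.setter
    def headers(self, value):
        self._headers = list(value) if value else None
        # Assumption: label indexing is enabled only when all labels are unique.
        self._lblidx = bool(self._headers) and (
            len(set(self._headers)) == len(self._headers)
        )

    def __getitem__(self, index):
        return self._data[index]  # a row object, not a plain tuple


data = SketchDataset(
    ("John", "Adams"),
    ("George", "Washington"),
    headers=["first_name", "last_name"],
)
assert data[0]["first_name"] == data[0][0] == "John"
```

Because the uniqueness check lives in the `headers` setter, reassigning headers with duplicate labels switches label lookup off without touching the rows.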

To allow label-based access within a Row, the Dataset's __getitem__ now returns a Row rather than a tuple, with the Row behaving like a list externally. This may cause backwards compatibility issues if client code relied on Dataset items being returned as plain tuples. To minimize the impact, the PR adds __add__, __eq__, and __ne__ methods for Rows. Tests have been updated to use the Row.tuple property for comparisons with tuple literals (the PR would fail existing tests otherwise). Independent of label-based indexing, I'd suggest that returning Dataset items as Rows instead of plain tuples is preferable in any case, since it makes it possible to add functionality in the future.
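
One possible shape for the compatibility methods mentioned above, as a sketch rather than the PR's implementation. `SketchRow` and `_row` are hypothetical names. Whether the real `__eq__` accepts plain tuples, and what type the real `__add__` returns, is not stated in this description, so both behaviours below are assumptions.

```python
class SketchRow:
    """Hypothetical row wrapper that tries to stay comparable with tuples."""

    def __init__(self, values):
        self._row = list(values)

    @property
    def tuple(self):
        # Plain-tuple view, useful when comparing against tuple literals.
        return tuple(self._row)

    def __iter__(self):
        return iter(self._row)

    def __len__(self):
        return len(self._row)

    def __getitem__(self, index):
        return self._row[index]

    def __eq__(self, other):
        # Assumption: rows compare equal to rows, lists and tuples by content.
        if isinstance(other, SketchRow):
            return self._row == other._row
        if isinstance(other, (list, tuple)):
            return self._row == list(other)
        return NotImplemented

    def __ne__(self, other):
        result = self.__eq__(other)
        return result if result is NotImplemented else not result

    # Defining __eq__ without __hash__ makes instances unhashable in Python 3,
    # which is one more difference from the tuples returned previously.

    def __add__(self, other):
        # Assumption: concatenation yields a plain tuple, as tuple + tuple did.
        if isinstance(other, (SketchRow, list, tuple)):
            return self.tuple + tuple(other)
        return NotImplemented


row = SketchRow(["John", "Adams"])
assert row == ("John", "Adams")
assert row.tuple + ("x",) == row + ("x",)
```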

Other changes/additions:

  • Add copy method for Datasets that updates _dset references in the new object's Rows and uses copy.deepcopy instead of copy.copy. This should also fix a bug in the current version where copies (in filter and stack) are shallow and the new object's _data attribute points to the same list as the original object (filter and stack updated accordingly).
  • Add assertions to existing tests for methods that return new Dataset objects to verify that Row's _dset points to the new object and that the new object is not a shallow copy (filter, stack, stack_col, subset, sorted, and transpose)
  • Add tests for new functionality (plus one for existing filter)
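
The shallow-copy problem and the kind of fix described in the first bullet can be illustrated with a small sketch. This is not the PR's `copy` method: `SketchRow`, `SketchDataset`, `_row` and `shallow_copy` are hypothetical names, and only `_dset`, `_data`, `copy.copy` and `copy.deepcopy` come from the description.

```python
import copy


class SketchRow:
    """Hypothetical row holding values and a back-reference to its owner."""

    def __init__(self, values, dset=None):
        self._row = list(values)
        self._dset = dset


class SketchDataset:
    """Hypothetical dataset used to contrast shallow and deep copies."""

    def __init__(self, *rows, headers=None):
        self.headers = list(headers) if headers else None
        self._data = [SketchRow(r, dset=self) for r in rows]

    def shallow_copy(self):
        # copy.copy duplicates only the outer object: _data is the same list,
        # and every row still points back at the original dataset.
        return copy.copy(self)

    def copy(self):
        new = SketchDataset(headers=copy.deepcopy(self.headers))
        # Deep-copy the values, then bind each new row to the new owner.
        new._data = [
            SketchRow(copy.deepcopy(r._row), dset=new) for r in self._data
        ]
        return new


original = SketchDataset(("John", "Adams"), headers=["first_name", "last_name"])
shallow = original.shallow_copy()
deep = original.copy()

assert shallow._data is original._data          # shared list: the bug
assert deep._data is not original._data         # independent list
assert all(r._dset is deep for r in deep._data)  # rows re-pointed to the copy
```

The last two assertions mirror the checks the second bullet adds to the tests for methods that return new Dataset objects.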

@timofurrer
Member

Can you please resolve the conflicts? Thanks 🎉

@ms8r
Author

ms8r commented Mar 17, 2019

Done ;-) This also surfaced a leftover bug in the has_tag method (incorrect unicode handling under Python 2.7)... time to move to Python 3 only...

@hugovk hugovk mentioned this pull request Oct 4, 2019
@hugovk
Member

hugovk commented Oct 8, 2024

great work

Please don't waste our time with spam reviews.
