Fix shape for datasets of references to iterators #1238

Closed
wants to merge 5 commits into from

Conversation

h-mayorquin
Contributor

@h-mayorquin h-mayorquin commented Jan 21, 2025

Motivation

Fix #1237

This is a draft that @rly started working on that I share here for further reference.

@h-mayorquin h-mayorquin changed the title from "Fix shape for datasets of references" to "Fix shape for datasets of references to iterators" Jan 21, 2025

codecov bot commented Jan 21, 2025

Codecov Report

Attention: Patch coverage is 37.50000% with 5 lines in your changes missing coverage. Please review.

Project coverage is 90.82%. Comparing base (ff4a0aa) to head (aab88ab).
Report is 20 commits behind head on dev.

Files with missing lines         Patch %   Lines
src/hdmf/build/objectmapper.py   37.50%    4 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##              dev    #1238      +/-   ##
==========================================
- Coverage   90.87%   90.82%   -0.05%     
==========================================
  Files          42       42              
  Lines        9524     9529       +5     
  Branches     1921     1923       +2     
==========================================
  Hits         8655     8655              
- Misses        576      580       +4     
- Partials      293      294       +1     

@h-mayorquin
Contributor Author

Looking at the errors, it seems we were too quick to dismiss the multi-dimensional case, @rly.

@rly
Contributor

rly commented Jan 22, 2025

I haven't looked deeply, but I suspect the cause is this: the VectorData shape allows 1D, 2D, 3D, or 4D data (ref), and NWB has datasets of references, such as the electrode_group column in the Units table, that extend VectorData but do NOT specify that the dataset should be 1-dimensional (ref). As a result, spec_shape is [[None], [None, None], [None, None, None], [None, None, None, None]], inherited from VectorData, and the new check is triggered.

  1. We should amend the NWB schema to restrict certain table columns like Units.electrode_group to be 1-D.
  2. Instead of checking the spec, we should check whether the dataset of references being written has more than 1 dimension. We should probably put that check here: https://github.com/hdmf-dev/hdmf/blob/dev/src/hdmf/backends/hdf5/h5tools.py#L1252. Note that the dataset being created there has shape (len(data),), which means we already assume these are 1-D datasets.
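
A minimal sketch of the data-side check described in item 2, assuming the references arrive as nested Python lists or tuples. The helper names `infer_shape` and `check_references_are_1d` are hypothetical and are not part of hdmf; the real check would need to handle whatever data types the HDF5 backend accepts.

```python
def infer_shape(data):
    """Return the shape of nested lists/tuples, following only first elements.

    Hypothetical helper for illustration; not an hdmf function.
    """
    shape = []
    while isinstance(data, (list, tuple)):
        shape.append(len(data))
        if len(data) == 0:
            break
        data = data[0]
    return tuple(shape)


def check_references_are_1d(data):
    """Raise ValueError if the data to be written has more than one dimension."""
    shape = infer_shape(data)
    if len(shape) > 1:
        raise ValueError(
            f"Datasets of references must be 1-D, got shape {shape}"
        )
    return shape


# Plain objects stand in for references here.
refs = [object(), object(), object()]
print(check_references_are_1d(refs))  # shape of a flat list of three items
```

Checking the data rather than the spec means a column such as Units.electrode_group would pass as long as the values actually written are 1-D, even while its inherited spec still allows up to 4 dimensions.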

Related TODO items:

  1. Warn if creating a spec with a dataset of references that allows more than one dimension.
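
The TODO above could be sketched roughly as follows, assuming the spec shape is either a single shape such as `[None]` or a list of alternative shapes like the one inherited from VectorData. The function names `allowed_ndims` and `warn_if_multidim_reference_spec` are hypothetical and do not exist in hdmf.

```python
import warnings


def allowed_ndims(spec_shape):
    """Return the set of dimension counts a spec shape allows.

    Hypothetical helper; accepts None, one shape, or a list of shapes.
    """
    if spec_shape is None:
        return set()
    # A single shape holds only scalars/None, e.g. [None] or [None, 3].
    if all(not isinstance(dim, (list, tuple)) for dim in spec_shape):
        return {len(spec_shape)}
    # Otherwise treat it as a list of alternative shapes.
    return {len(shape) for shape in spec_shape}


def warn_if_multidim_reference_spec(name, spec_shape):
    """Warn when a reference-dataset spec allows more than one dimension."""
    ndims = allowed_ndims(spec_shape)
    if any(n > 1 for n in ndims):
        warnings.warn(
            f"Spec '{name}' is a dataset of references but allows "
            f"{sorted(ndims)} dimensions; only 1-D is supported."
        )
        return True
    return False


# The shape inherited from VectorData, as quoted in the comment above.
vector_data_shape = [[None], [None, None], [None, None, None], [None, None, None, None]]
warn_if_multidim_reference_spec("electrode_group", vector_data_shape)
```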

I'll take a look at this in the next couple of days.

@h-mayorquin
Contributor Author

I think #1270 already fixed the issue that this PR was meant to solve.

@h-mayorquin h-mayorquin closed this May 7, 2025
@h-mayorquin h-mayorquin deleted the fix_images_order_shape branch May 7, 2025 18:55
Development

Successfully merging this pull request may close these issues.

[Bug] Image iterator reveals that list of references gets an incorrect data shape
2 participants