@foreverallama
Contributor

Fixes #78, a bug where structs with a large number of fields could not be written in v7.3 format.

  • MAT_HDF5.jl: New method _write_field_reference!() to write the field names as a dataset and return an HDF5 reference to it
  • New read and write tests for struct_large.mat

Struct field names are written as attributes in v7.3 format. However, HDF5 has a maximum allowable attribute size of 64KB. When the number of fields is large, MATLAB instead writes a reference to a dataset, and writes the field names under #refs#.

This write mechanism kicks in when the total character length of all field names of the struct is >= 4096.
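
A minimal sketch of the threshold check described above, in plain Julia with no HDF5 calls. The names `FIELD_NAME_CHAR_LIMIT` and `field_storage_strategy` are hypothetical and only illustrate the decision; they are not the functions added in this PR. The `init = 0` keyword (assumes Julia 1.6 or later) keeps the sum defined for a struct with no fields.

```julia
# Hypothetical helper for illustration only; not the code from this PR.
const FIELD_NAME_CHAR_LIMIT = 4096  # threshold stated in the description above

function field_storage_strategy(names::AbstractVector{<:AbstractString})
    # Total number of characters across all field names.
    # `init = 0` avoids an error when `names` is empty.
    total_chars = sum(length, names; init = 0)
    # Below the limit the names fit in an attribute; otherwise they would go
    # into a dataset that the attribute references.
    return total_chars < FIELD_NAME_CHAR_LIMIT ? :attribute : :reference
end

small = ["a", "b", "c"]
large = ["field_$(lpad(i, 4, '0'))" for i in 1:500]  # 500 names, 10 chars each

field_storage_strategy(small)  # :attribute
field_storage_strategy(large)  # :reference (5000 characters in total)
```

Note that the comparison is on the summed name lengths, not on the number of fields, so a struct with few but very long field names would also take the reference path.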

src/MAT_HDF5.jl Outdated
end
write_attribute(g, "MATLAB_fields", HDF5.VLen(k))
total_chars = sum(length, k)
if total_chars < 4096
Member

This code should probably be added to more struct-like writing? MatlabClassObject, MatlabOpaque and MatlabStructArray perhaps?

Contributor Author

Right, I forgot about MatlabStructArray; it should be added there. I don't think it needs to be added for the others, but we can do it to be on the safe side.

@foreverallama
Contributor Author

A couple of changes to support any type of struct data:

  • Moved everything to a function _write_struct_fields() which invokes _write_field_reference() if required
  • New constant struct_field_attr_matlab = "MATLAB_fields", replaced all instances of MATLAB_fields with that
  • Support for MatlabStructArray, MatlabClassObject and scalar structs

The test is failing because, for some reason, the fields are going wildly out of order between save and load. Not sure what's going on here or how to resolve it. The data is intact, though.

@matthijscox
Member

I've made isapprox testing possible, and made sure it doesn't care about name ordering in MatlabStructArray isapprox.

@matthijscox matthijscox merged commit f41c114 into JuliaIO:master Dec 10, 2025
8 checks passed
@foreverallama
Contributor Author

Seems to be a quirk of HDF5 files. Keys are being written based on the hash order of the Dict, but HDF5 stores/retrieves them lexicographically, I guess.
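
A small sketch of why an order-insensitive comparison is needed for the round-trip test. The helper `same_field_names` is hypothetical and is not the `isapprox` method added in this PR; `sort` is used only as a stand-in for names coming back in a different order than they were written.

```julia
# Hypothetical helper for illustration only: true when both collections hold
# the same field names, regardless of order.
same_field_names(a, b) = length(a) == length(b) && Set(a) == Set(b)

written = ["zeta", "alpha", "mid"]  # e.g. the order a Dict happened to iterate in
loaded  = sort(written)             # stand-in for names read back in sorted order

written == loaded                   # false: Vector equality is order-sensitive
same_field_names(written, loaded)   # true: same names, different order
```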

@foreverallama foreverallama deleted the structs branch December 10, 2025 17:16

Development

Successfully merging this pull request may close these issues.

Nested dictionaries are limited in size

2 participants