Binary Sparse Format Working Group Meeting #4 - October 3, 2022

Attendees

  • Ben Brock
  • Tim Davis
  • Jim Kitchen
  • Tim Mattson
  • Scott McMillan
  • Erik Welch
  • Isaac Virshup
  • Will Kimmerer

Agenda

  • Discuss Erik's work on V2 spec/implementation

Summary of Discussion

Short discussion about Zarr. The community seems to be migrating away from HDF5 and toward Zarr.

Question: does Zarr have good support outside of Python?

Jim: Zarr has some support in NetCDF.

Erik: the V2 proposal has keywords to represent the storage format of each dimension:

  • "sparse" or "full" for sparse or dense values in the inner dimension

  • "compressed" or "doubly_compressed" to represent an outer dimension that points to another inner dimension

  • Could potentially add "uncompressed" to an inner dimension to reference an inner dimension that is full and uncompressed.
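As a rough illustration of the keywords above, the sketch below builds two layouts from a small dense matrix: one with a "compressed" outer dimension over a "sparse" inner dimension (CSR-like), and one with a "doubly_compressed" outer dimension that lists only the non-empty rows (DCSR-like). The field names (`dimension_formats`, `outer_indices`, `pointers`, `indices`, `values`) and the mapping of keywords to these layouts are assumptions made for illustration, not part of the V2 proposal as recorded here.

```python
def to_compressed_sparse(dense):
    """Outer dimension "compressed", inner dimension "sparse" (CSR-like).

    Field names are hypothetical; they are not taken from the V2 proposal.
    """
    pointers, indices, values = [0], [], []
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                indices.append(j)
                values.append(v)
        # One pointer per row, including empty rows.
        pointers.append(len(indices))
    return {
        "dimension_formats": ["compressed", "sparse"],
        "pointers": pointers,
        "indices": indices,
        "values": values,
    }


def to_doubly_compressed_sparse(dense):
    """Outer dimension "doubly_compressed": only non-empty rows are stored."""
    outer_indices, pointers, indices, values = [], [0], [], []
    for i, row in enumerate(dense):
        nonzeros = [(j, v) for j, v in enumerate(row) if v != 0]
        if not nonzeros:
            continue  # empty rows take no space at all
        outer_indices.append(i)
        for j, v in nonzeros:
            indices.append(j)
            values.append(v)
        pointers.append(len(indices))
    return {
        "dimension_formats": ["doubly_compressed", "sparse"],
        "outer_indices": outer_indices,
        "pointers": pointers,
        "indices": indices,
        "values": values,
    }


dense = [
    [0, 2, 0],
    [0, 0, 0],
    [1, 0, 3],
]
csr = to_compressed_sparse(dense)
dcsr = to_doubly_compressed_sparse(dense)
```

In this sketch the two layouts share the same `indices` and `values`; they differ only in whether the empty middle row costs a pointer entry.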

Resources and Links

Discussion of C support for Zarr: zarr-developers/community#9, zarr-developers/zarr-specs#41

Chunking support: https://github.com/fsspec/kerchunk

Outcomes

  • Ben will continue to investigate NetCDF

  • Ben will investigate C/C++ support for Zarr

  • Jim will talk about Awkward Array

Consider whether there are additional values we need to support from the SuiteSparse Matrix Collection's Matrix Market files:

  • Symmetric (structure and with values)
  • Are there others?
  • Jim: Matrix Market has a bunch of unusual formats like Hermitian, skew-symmetric, etc.
  • We can potentially just store these all explicitly, without too much overhead (since we are a binary format).
  • Or we could add additional attributes (e.g. "symmetric") to change the interpretation of the formats we already have (essentially what MM does).
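To make the second option concrete, here is a minimal sketch of how a reader might interpret such an attribute: only one triangle is stored as coordinate triples, and the attribute tells the reader how to mirror each off-diagonal entry. The function name `expand_structure` and the attribute strings ("general", "symmetric", "skew_symmetric", "hermitian") are hypothetical choices for illustration; nothing here was agreed in the meeting.

```python
def expand_structure(rows, cols, values, structure="general"):
    """Expand stored entries into a full {(row, col): value} mapping.

    `structure` is a hypothetical attribute describing how each stored
    off-diagonal entry (i, j) implies a mirrored entry (j, i).
    """
    known = {"general", "symmetric", "skew_symmetric", "hermitian"}
    if structure not in known:
        raise ValueError(f"unknown structure attribute: {structure!r}")

    full = {}
    for i, j, v in zip(rows, cols, values):
        full[(i, j)] = v
        if i == j or structure == "general":
            continue  # diagonal entries and general matrices are not mirrored
        if structure == "symmetric":
            full[(j, i)] = v
        elif structure == "skew_symmetric":
            full[(j, i)] = -v
        else:  # "hermitian"
            full[(j, i)] = v.conjugate()
    return full


# Lower triangle of a 3x3 symmetric matrix, stored as coordinate triples.
symmetric = expand_structure([0, 1, 2], [0, 0, 2], [4, 7, 5], "symmetric")
skew = expand_structure([1], [0], [3], "skew_symmetric")
hermitian = expand_structure([1], [0], [1 + 2j], "hermitian")
```

The first option in the list above (storing everything explicitly) would correspond to writing out the expanded mapping directly and always reading it back as "general".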