Skip to content

Conversation

@cboulay
Copy link
Member

@cboulay cboulay commented Aug 11, 2025

This adds HybridBuffer -- A stateful, FIFO buffer that combines a deque for fast appends with a contiguous circular buffer for efficient, advancing reads. It also adds HybridAxisArrayBuffer which does the same for AxisArrays.

I will leave this in draft for now because I'm going to further develop HybridAxisArrayBuffer so it is suitable in:

  • ResampleProcessor
  • SamplerTransformer
  • WindowTransformer

@cboulay
Copy link
Member Author

cboulay commented Aug 11, 2025

Note to self:

  • Add option to grow buffer on demand up to maximum size
  • For HybridAxisArrayBuffer, add ability to query time range and peek based on time range

@cboulay cboulay linked an issue Aug 11, 2025 that may be closed by this pull request
@cboulay
Copy link
Member Author

cboulay commented Aug 14, 2025

I was hoping to defer adding different overflow strategies until later, but then I realized that one of the primary purposes of the ResampleProcessor is to return even-interval data for smoother processing downstream, so allowing the buffer to drop samples was antithetical to that. So I implemented overflow_strategy="grow".

The recent smattering of commits improves upon the buffer classes then refactors resample.py to use them. This did not impact performance when using update_strategy="immediate" but had about a 18% slowdown when using update_strategy="on_demand" (0.83 s -> 098 s). The profiler resampled 100 seconds worth of 128-ch irregular rate (avg 1024 Hz) data at an even 2 kHz -- so < 1% of the compute budget.

While I didn't get the performance boost I was hoping for, I did achieve the other primary objective of this effort. By centralizing the buffer-handling code, I was able to simplify resample.py (less ~40 lines of code, though I added that much back in docstrings) and I found and fixed a bug that I might have missed otherwise. Oh and another result is that the resample module is now Array API compatible (on the data... still expects np array for the CoordinateAxis timestamps but only in a couple lines).

I'll continue this effort on sampler.py after I step away for a bit.

@cboulay cboulay marked this pull request as ready for review August 15, 2025 03:30
@cboulay
Copy link
Member Author

cboulay commented Aug 15, 2025

Sampler now refactored to use the new buffers. This resulted in about 30 fewer lines of code, Array API support, and support for messages with their primary axis being a CoordinateAxis (i.e., irregularly sampled data).

I'm not going to include the Window node in this PR. That one is a little too complex and will delay this further.

Ready for review!

@griffinmilsap
Copy link
Contributor

Sorry I've slept on this for so long, but it looks like this PR has stabilized and is likely good to go. I haven't reviewed the code in detail, but I'm alright to merge this as a new feature -- my (minor) comments below can be addressed in future PRs if at all.

  • It might be worthwhile to consider using StrEnums for update strategy and overflow strategy; I'm less of a fan of string literals generally -- it makes it a bit more difficult to introspect what all valid options are.
  • I'd suggest another overflow behavior" fill which would dump the deque into the buffer until its full, then leave the rest of the deque intact (basically don't drop data)

I'll mark this as ok-to-go and we can fix issues as they pop up in dev

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

hybrid time series buffer

2 participants