Skip to content

Refactor serialization to avoid reversing or collecting iterators #641

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
May 20, 2025

Conversation

kylewlacy
Copy link
Contributor

This PR makes it so facet-serialize no longer requires DoubleEndedIterator when iterating over values. In turn, this also means that serialization can use plain iterator values in more places, without resorting to collecting e.g. into a Vec.

The "trick" is in SerialzeTask, which now holds iterators wherever possible. When popped from the stack, it gets the next value from the iterator then pushes it back onto the stack (until the iterator is empty).

I was hoping this would lead to perf improvements, but locally, I wasn't seeing a noticeable difference... I still felt like this change had other minor benefits (e.g. less memory use during iteration, iter_vtable for HashMap no longer needs to collect its keys into a vec, etc.)

@fasterthanlime
Copy link
Contributor

I have been wanting to do that for a while, so thank you for that

Comment on lines +155 to +154
pub struct FieldsForSerializeIter<'mem, 'facet, 'shape> {
stack: Vec<FieldIter<'mem, 'facet, 'shape>>,
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This type is a hand-written iterator to re-create the old fields_for_serialize() impl Trait method. This was necessary so that facet-serialize could store a nameable iterator type (I also tried Box<dyn Iterator<...>> but hit some weird lifetime bounds issues... plus I didn't feel great about needing an extra box for this

(One minor plus to this implementation is that it's no longer recursive!)

@@ -473,9 +494,11 @@ where
let fields = peek_struct.fields_for_serialize().count();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, one really subtle thing that I hate!! this is iterating over all the flattened fields twice: once here to count the number of flattened fields, and once while doing the actual serialization

This is needed so we can call serializer.start_object(Some(_)), but I couldn't figure out an efficient way to count the number of flattened fields without just iterating (including allocating a vec)...

If there's a way to impl ExactSizedIterator for FieldsForSerializeIter then we could avoid this

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there's a way to impl ExactSizedIterator for FieldsForSerializeIter then we could avoid this

Not sure about this, but maybe we can remove the hint...

@tversteeg tversteeg added the 📁 formats facet-json, facet-yaml, facet-toml etc. label May 19, 2025
@fasterthanlime
Copy link
Contributor

I thought this was merged and I merged some PRs that introduce conflicts :( Feel free to merge it yourself as soon as the conflicts are resolved @kylewlacy

@kylewlacy kylewlacy force-pushed the serialize-without-iter-rev branch from 8e39735 to 258d9eb Compare May 20, 2025 06:39
@kylewlacy kylewlacy added this pull request to the merge queue May 20, 2025
Merged via the queue into facet-rs:main with commit e04ce61 May 20, 2025
17 checks passed
@kylewlacy kylewlacy deleted the serialize-without-iter-rev branch May 20, 2025 06:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
📁 formats facet-json, facet-yaml, facet-toml etc.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants