Skip to content

refactor: shrinks Value from 96 to 24 bytes#727

Open
claymcleod wants to merge 1 commit intomainfrom
feat/shrink-value-representation
Open

refactor: shrinks Value from 96 to 24 bytes#727
claymcleod wants to merge 1 commit intomainfrom
feat/shrink-value-representation

Conversation

@claymcleod
Copy link
Member

@claymcleod claymcleod commented Mar 13, 2026

While investigating Sprocket's baseline memory usage, I noticed that certain workloads—particularly those involving deeply nested compound types like Map[String, Map[String, ...]]—scaled poorly. The root cause was the Value enum in wdl-engine, which occupied 96 bytes per element due to large inline types (Type at 56 bytes, TaskPostEvaluationValue at 96 bytes) inflating the enum even though the common case (PrimitiveValue) is only 16 bytes.

This refactor pushes Arc wrapping down into the types that inflate Value. Each compound value type (Array, Map, Pair, Struct, EnumVariant) and each large hidden value type (TaskPreEvaluation, TaskPostEvaluation) becomes a thin wrapper around Arc<Inner>, where the inner struct owns all fields directly—eliminating nested Arc layering. Two new wrapper types (NoneValue and TypeNameRefValue) handle variants that previously stored a bare Type. CallValue wraps its CallType field behind Arc as well.

The tradeoff is an additional pointer indirection when accessing fields of compound and hidden values, since their data now lives behind Arc on the heap rather than inline in the enum. In practice, this cost is negligible: these types were already heap-allocated internally (e.g., Array stored an Arc<Vec<Value>>), so the refactor consolidates rather than adds indirection. Clone remains cheap—a refcount bump on one Arc rather than multiple.

A const assertion prevents Value from exceeding 24 bytes in the future. The as_map standard library function was also updated to use the IndexMap entry API, eliminating a redundant key clone.

Benchmarked on an Apple M3 Max against the prior version across five synthetic workloads:

Benchmark Before RSS After RSS Memory reduction Before time After time
Nested arrays (1K, 2^10) 36 MB 34 MB -6% 0.05s 0.04s
Nested arrays (59K, 3^10) 300 MB 243 MB -19% 0.63s 0.58s
Flat array (1M strings) 549 MB 423 MB -23% 4.50s 3.52s
Nested arrays (1M, 4^10) 3,578 MB 2,980 MB -17% 8.93s 8.45s
Nested maps (1M, 32^4) 3,731 MB 2,457 MB -34% 11.76s 11.06s

The improvement is most pronounced for nested maps because each map entry stores a Value for the key and a Value for the nested map, so the 4x per-element reduction compounds across all four nesting levels. Flat arrays benefit less because only one Array container exists—the savings come from shrinking each element Value within its backing Vec.

For all contributors:

  • You have not used AI on any parts of this pull request.
  • You have added a few sentences describing the PR here.
  • Your code builds clean without any errors or warnings.
  • You have added tests (when appropriate).

@claymcleod claymcleod requested a review from a team as a code owner March 13, 2026 22:26
@claymcleod claymcleod requested a review from a-frantz March 13, 2026 22:26
@claymcleod claymcleod marked this pull request as draft March 13, 2026 22:30
@claymcleod claymcleod removed the request for review from a-frantz March 13, 2026 22:30
@github-actions github-actions bot added the S-awaiting-pass-CI PR is awaiting CI to pass. label Mar 13, 2026
@claymcleod claymcleod force-pushed the feat/shrink-value-representation branch from a89a6a1 to 66913af Compare March 14, 2026 01:38
context.arguments[0].span,
));
}
indexmap::map::Entry::Vacant(e) => {
Copy link
Member Author

@claymcleod claymcleod Mar 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is more of a correctness sidequest rather than a performance improvement. In the case where the key is not in the map, we can avoid the preemptive clone of the key.

@claymcleod claymcleod force-pushed the feat/shrink-value-representation branch from 66913af to 1f6da9b Compare March 14, 2026 02:12

for (name, ty) in ty.members().iter() {
// Check for optional members that should be set to `None`
// Check for optional members that should be set to none
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took the approach that WDL None values should be referred to as "none" to disambiguate it from Rust's None value.

ty: impl Into<Type>,
name: Arc<String>,
members: Arc<IndexMap<String, Value>>,
members: impl Into<IndexMap<String, Value>>,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unlike name, every caller of this just constructed an IndexMap and immediately wrapped it with an Arc. So the original Arc was truly just for making this cheaply clonable.

@github-actions github-actions bot removed the S-awaiting-pass-CI PR is awaiting CI to pass. label Mar 14, 2026
While investigating Sprocket's baseline memory usage, certain
workloads—particularly those involving deeply nested compound types like
`Map[String, Map[String, ...]]`—scaled poorly. The root cause was the
`Value` enum in `wdl-engine`, which occupied 96 bytes per element due to
large inline types (`Type` at 56 bytes, `TaskPostEvaluationValue` at 96
bytes) inflating the enum even though the common case (`PrimitiveValue`)
is only 16 bytes.

This refactor pushes `Arc` wrapping down into the types that inflate
`Value`. Each compound value type (`Array`, `Map`, `Pair`, `Struct`,
`EnumVariant`) and each large hidden value type (`TaskPreEvaluation`,
`TaskPostEvaluation`) becomes a thin wrapper around `Arc<Inner>`, where
the inner struct owns all fields directly—eliminating nested `Arc`
layering. Two new wrapper types (`NoneValue` and `TypeNameRefValue`)
handle variants that previously stored a bare `Type`. `CallValue` wraps
its `CallType` field behind `Arc` as well.

The tradeoff is an additional pointer indirection when accessing fields
of compound and hidden values, since their data now lives behind `Arc`
on the heap rather than inline in the enum. In practice, this cost is
negligible: these types were already heap-allocated internally (e.g.,
`Array` stored an `Arc<Vec<Value>>`), so the refactor consolidates
rather than adds indirection. Clone remains cheap—a refcount bump on
one `Arc` rather than multiple.

A const assertion prevents `Value` from exceeding 24 bytes in the
future. The `as_map` standard library function was also updated to use
the `IndexMap` entry API, eliminating a redundant key clone.

See PR for benchmark results.
@claymcleod claymcleod force-pushed the feat/shrink-value-representation branch from 1f6da9b to 6ee481b Compare March 14, 2026 02:37
@claymcleod claymcleod marked this pull request as ready for review March 14, 2026 02:43
Copy link
Contributor

@peterhuene peterhuene left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, just one minor nit!

Comment on lines +1322 to +1323
let mut arguments: [CallArgument; MAX_PARAMETERS] =
std::array::from_fn(|_| CallArgument::none());
Copy link
Contributor

@peterhuene peterhuene Mar 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this could just be a call to repeat as CallArgument should probably be Clone:

Suggested change
let mut arguments: [CallArgument; MAX_PARAMETERS] =
std::array::from_fn(|_| CallArgument::none());
let mut arguments: [CallArgument; MAX_PARAMETERS] =
std::array::repeat(CallArgument::none()));

Copy link
Contributor

@peterhuene peterhuene left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also meant to approve my previous review as the nit comment isn't that big a deal either.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants