Skip to content

perf(normalizer): use Rc<str> interning to reduce alias resolution allocations#726

Open
anakrish wants to merge 1 commit into
microsoft:mainfrom
anakrish:rc-str-normalizer-opt
Open

perf(normalizer): use Rc<str> interning to reduce alias resolution allocations#726
anakrish wants to merge 1 commit into
microsoft:mainfrom
anakrish:rc-str-normalizer-opt

Conversation

@anakrish
Copy link
Copy Markdown
Collaborator

The alias normalizer allocates fresh Strings for path segments and lowercased short names on every resolution. Since these values are derived from the alias registry and never change, we can precompute them as Rc<str> at registry-load time and reuse them via cheap refcount bumps during normalization.

Concretely, ResolvedEntry now stores pre-split path segments as Vec<Rc<str>> and a pre-lowercased short name as Rc<str>. The normalizer uses Rc::clone instead of String::from / to_ascii_lowercase in the hot path. A few new helpers (obj_insert_rc, rc_lowercase, obj_insert_lc) keep the call sites clean. The is_root_field_collision check also avoids a temporary allocation by doing byte-level comparison directly.

Benchmarking across 11 real Azure policies shows 27-49% improvement in cached eval time (including normalization), with the biggest wins on high-alias-count resources where per-alias allocation cost dominates.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR optimizes Azure Policy alias normalization by precomputing and reusing Rc<str>-backed strings (lowercased short names and split ARM path segments) so the hot normalization path avoids repeated String allocations and per-call split('.') work.

Changes:

  • Precompute ResolvedEntry lowercase short names and path segments as Rc<str> at registry-load time.
  • Add ObjMap helpers to insert pre-allocated Rc<str> keys and to lowercase keys with a fast-path.
  • Update alias resolution to navigate ARM paths via precomputed segments and insert common-case scalar aliases without allocating new keys.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File Description
src/languages/azure_policy/aliases/types.rs Stores precomputed Rc<str> lowercased short name and pre-split path segments in ResolvedEntry.
src/languages/azure_policy/aliases/obj_map.rs Adds Rc<str> insertion/lowercasing helpers and changes root-field collision prefix check to avoid allocation.
src/languages/azure_policy/aliases/normalizer/alias_resolution.rs Uses precomputed segments for ARM navigation and inserts scalar aliases via Rc cloning in the common case.
Comments suppressed due to low confidence (1)

src/languages/azure_policy/aliases/obj_map.rs:283

  • is_root_field_collision no longer matches the previous starts_with("properties.") behavior: the new default_path.len() > 11 excludes "properties." (len == 11), which previously returned true. If this function is only meant to avoid allocations, consider using a slice-based, case-insensitive prefix check that preserves the exact prior semantics (including the len==11 case).
pub fn is_root_field_collision(short_name: &str, default_path: &str) -> bool {
    ROOT_FIELDS
        .iter()
        .any(|f| f.eq_ignore_ascii_case(short_name))
        && default_path
            .as_bytes()
            .iter()
            .zip(b"properties.")
            .all(|(a, b)| a.to_ascii_lowercase() == *b)
        && default_path.len() > 11

Comment thread src/languages/azure_policy/aliases/obj_map.rs Outdated
Comment thread src/languages/azure_policy/aliases/types.rs Outdated
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

Comment thread src/languages/azure_policy/aliases/obj_map.rs Outdated
Comment thread src/languages/azure_policy/aliases/obj_map.rs Outdated
…locations

Replace per-alias heap allocations with reference-counted string interning
throughout the normalizer's alias resolution pipeline:

- Store alias ARM path segments as Vec<Rc<str>> instead of Vec<String>,
  enabling zero-alloc BTreeMap lookups via Rc::clone (refcount bump)
  rather than Value::from() (heap allocation per lookup)

- Pre-compute lowercased short name (short_name_lc: Rc<str>) at
  registry-load time, enabling allocation-free insertion for the
  common case of non-dotted, non-collision alias short names

- Add rc_lowercase() fast-path that skips allocation when the input
  string is already lowercase ASCII (common for ARM property names)

- Update set_nested_lowercased/set_nested_inner/set_nested_in_btree to
  thread Rc<str> through intermediate object construction, avoiding
  String temporaries at every nesting level

- Replace to_ascii_lowercase().starts_with() in is_root_field_collision
  with a zero-alloc byte-level comparison

Measured 27-49% improvement in normalization time across a range of
Azure Policy alias-heavy resource types (293-608 aliases per type).

Signed-off-by: Anand Krishnamoorthi <anakrish@microsoft.com>
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

Comment on lines 391 to 416
/// A resolved alias entry with its default path and optional versioned paths.
#[derive(Debug, Clone)]
pub struct ResolvedEntry {
/// The original-cased alias short name (e.g., `"accountType"`, not
/// `"accounttype"`). The entries map uses lowercase keys for
/// case-insensitive lookup, but the normalizer needs the original casing
/// to write values at correctly-cased paths in the output.
pub short_name: String,
/// The default ARM JSON path.
pub default_path: String,
/// Versioned path overrides: `(api_version, arm_path)` pairs.
pub versioned_paths: Vec<(String, String)>,
/// Optional metadata from the alias catalog (type, modifiability).
pub metadata: Option<AliasPathMetadata>,

// ── Precomputed fields (derived at registry-load time) ──────────────
/// Whether `short_name` contains `[*]` (i.e., this is a wildcard/array alias).
pub is_wildcard: bool,
/// Pre-lowercased short name as `Rc<str>` for allocation-free common-case inserts.
pub(crate) short_name_lc: Rc<str>,
/// Precomputed `default_path.split('.').collect()` for fast ARM path navigation.
pub default_path_segments: Vec<String>,
pub(crate) default_path_segments: Vec<Rc<str>>,
/// Precomputed path segments for each versioned path, in the same order
/// as `versioned_paths`.
pub versioned_path_segments: Vec<Vec<String>>,
pub(crate) versioned_path_segments: Vec<Vec<Rc<str>>>,
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants