Fix partial object iteration dropping bindings and guard unsupported RVM patterns#718

Open
Copilot wants to merge 11 commits into main from copilot/fix-partial-set-rule-iteration

Conversation

Contributor

Copilot AI commented May 7, 2026

Partial object rules with iterating bodies, such as p[k] if { some k, _ in input.items }, were only emitting the first matching binding. This affected both the interpreter and the RVM path for the supported dynamic-key form; contains-based partial sets and comprehensions were not the source of the bug.

  • Interpreter: stop early-return on dynamic object keys

    • Tighten constness detection in eval_output_expr_in_loop so p[k] if is only treated as constant when k is a simple literal.
    • This preserves the existing optimization for constant keys while allowing iterating bindings to contribute all matching entries.
  • RVM compiler: classify supported partial object heads correctly

    • Update rule typing for bracket-head comprehension rules:
      • p[k] if { ... } → partial object
      • p[k] := v if { ... } → partial object
      • p contains k if { ... } → partial set
    • Supported RVM partial-object keys now include dynamic expressions, not just bare variables, so forms like p[input.aliases[k]] := v if { ... } continue to compile and collect all bindings.
    • Static literal bracket prefixes such as p["a"][k] := v if { ... } continue to compile as supported shapes.
  • RVM compiler: reject unsupported partial object patterns

    • Add explicit compiler errors for partial object heads the current RVM codegen does not handle correctly:
      • constant-key heads like p["fixed"] if { ... } and p["fixed"] := v if { ... }
      • nested bracket heads like p[a][b] if { ... }, p[a][b] := v if { ... }, and literal-prefix variants such as p[1][k] := v if { ... }
      • hidden dynamic-prefix variants such as p[a].q[b] if { ... }
    • Validation now walks the full reference prefix chain, so nested bracket prefixes are rejected even when hidden behind dot lookups, while string-literal bracket prefixes equivalent to dot access are still allowed.
    • These shapes continue to work in the interpreter, while the RVM now fails compilation clearly instead of compiling incorrect behavior.
  • Regression coverage

    • Add RVM/interpreter regression cases for:
      • dynamic-key partial objects collecting all bindings
      • explicit-value partial objects
      • dynamic expression keys
      • multi-body iteration
      • array iteration
      • contains partial sets
      • the exact issue #712 reproducer, locking down the v0/v1 distinction between:
        • violations[k] { ... } as a partial set in v0
        • violations[k] if { ... } as a partial object in v1
        • violations contains k if { ... } as a partial set in v1
    • Add interpreter-only coverage for constant-key and multi-level partial object shapes, including explicit-value forms.
    • Update the RVM cases for constant-key and nested-bracket partial objects, including explicit-value forms, to expect compiler errors.
    • Add RVM runtime edge-case coverage for:
      • undefined dynamic keys and undefined values
      • mixed undefined key/value iterations
      • duplicate explicit-value keys across multiple rules and multiple iteration paths
      • duplicate same-value keys
      • single-element input
    • Add broader partial-object coverage for:
      • type conflicts between partial objects, partial sets, and complete rules
      • large-range/stress cases
      • real-world RBAC, violation-reporting, and filtered resource-mapping patterns
      • computed/function/array-index keys and special-string keys
      • not filtering, dot+bracket object-value heads, and OPA-style join/composite-value cases
      • additional unsupported RVM shapes such as dynamic-prefix/static-suffix nesting, literal-prefix nested brackets, and with in rule bodies
    • Cases that currently rely on known-buggy undefined-key/value or conflicting-duplicate-key RVM behavior are now kept as skipped tests with TODO notes and the correct expected outcomes preserved for future unskipping, rather than asserting the buggy output as expected behavior.
    • The every vacuous-truth partial-object case was verified to still diverge in the RVM and remains explicitly skipped with a note describing the current behavior.

Example:

package test
import rego.v1

violations[k] if {
  some k, _ in input.items
}

With:

{ "items": { "FOO": 1, "BAR": 2, "BAZ": 3 } }

this now evaluates to:

{ "BAR": true, "BAZ": true, "FOO": true }

For now, unsupported RVM partial object shapes fail fast with compiler errors, so evaluation can fall back to the interpreter instead of producing incorrect results. Additional runtime edge-case and conformance-style tests lock down the supported behavior, and skipped TODO cases preserve the correct expectations for known partial-object divergences that remain to be fixed separately.
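
As a rough illustration of the interpreter side of this fix, here is a minimal sketch, not the regorus implementation. The `KeyExpr` enum and both functions are hypothetical stand-ins; only the underlying idea (stopping after the first match is safe only when the head key is a simple literal) is taken from the description above.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified key expression of a rule head `p[<key>]`.
#[derive(Debug, Clone)]
enum KeyExpr {
    /// A literal key such as `p["fixed"]`.
    Literal(String),
    /// A variable key such as `p[k]`, rebound on every body iteration.
    Var,
}

/// A head counts as constant only when its key is a simple literal.
fn is_constant_head(key: &KeyExpr) -> bool {
    matches!(key, KeyExpr::Literal(_))
}

/// Toy evaluation of `p[key] if { some k, _ in items }` over `bindings`.
fn eval_partial_object(key: &KeyExpr, bindings: &[&str]) -> BTreeMap<String, bool> {
    let mut out = BTreeMap::new();
    for binding in bindings {
        let k = match key {
            KeyExpr::Literal(s) => s.clone(),
            KeyExpr::Var => binding.to_string(),
        };
        out.insert(k, true);
        // Early exit is only sound for constant heads: no later
        // iteration could contribute a different key.
        if is_constant_head(key) {
            break;
        }
    }
    out
}
```

With a variable key, every binding contributes an entry; with a literal key, the loop still stops after the first match, which keeps the existing optimization.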

Copilot AI and others added 2 commits May 7, 2026 11:38
Agent-Logs-Url: https://github.com/microsoft/regorus/sessions/abb61b2e-582a-48ea-91ac-0cfebaaea96f

Co-authored-by: anakrish <35780660+anakrish@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Fix partial set rule iteration to emit all bindings" to "Fix partial object iteration dropping bindings in interpreter and RVM" May 7, 2026
Copilot AI requested a review from anakrish May 7, 2026 11:46
@anakrish anakrish requested a review from Copilot May 7, 2026 12:07

Copilot AI left a comment

Pull request overview

Fixes a correctness bug where partial object rules with dynamic keys would stop after the first matching binding, causing missing keys in the produced object. The fix aligns interpreter constantness detection and RVM rule typing so both execution paths emit all bindings for dynamic-key partial objects.

Changes:

  • Interpreter: tighten “constant rule” detection so early-return is only enabled when the object key expression is a simple literal.
  • RVM compiler: classify bracket-head comprehension rules (p[k] if ...) as PartialObject instead of PartialSet.
  • Add regression YAML cases covering dynamic-key partial objects, explicit-value partial objects, multi-body iteration, array iteration, and contains partial sets (plus interpreter-only v1 coverage for constant-key and multilevel shapes).

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| tests/rvm/rego/cases/partial_object_rules.yaml | Adds RVM+interpreter regression cases for partial object iteration (and contains partial set) behavior. |
| tests/interpreter/cases/rule/partial_object_v1.yaml | Adds interpreter-focused Rego v1 cases for constant-key and multilevel partial object shapes. |
| src/languages/rego/compiler/rules.rs | Adjusts rule-type classification so p[k] if { ... } compiles as a partial object. |
| src/interpreter.rs | Prevents incorrect early-return for dynamic-key partial object rules by tightening constness detection. |

BAZ: 3
FOO: 1

- note: partial_object_constant_key_still_works
@anakrish
Collaborator

anakrish commented May 7, 2026

RVM: Add compiler errors for unsupported partial object patterns (addresses #719)

The interpreter fix in this PR is correct and approved. However, the RVM classification change (PartialSet → PartialObject at rules.rs:62) is too broad: the RVM codegen only handles variable-key partial objects correctly. Constant-key and multi-level patterns produce wrong results (see #719).

What to do

Add compiler error guards so unsupported patterns fall back to the interpreter. The RVM compiler already does this for other unsupported features (e.g., WithKeywordUnsupported, walk loops).

Step 1: Add error variants to src/languages/rego/compiler/error.rs

#[error("partial object rules with constant keys are not yet supported by the RVM compiler")]
PartialObjectConstantKeyUnsupported,

#[error("partial object rules with nested bracket keys are not yet supported by the RVM compiler")]
PartialObjectNestedKeyUnsupported,

Step 2: Add a validation helper in src/languages/rego/compiler/rules.rs

After compute_rule_type classifies a Compr { refr: RefBrack, assign: None } as PartialObject, validate that the pattern is one the codegen can handle. Add a helper that checks:

  1. The RefBrack has exactly one dynamic bracket at the leaf (no nesting like p[a][b])
  2. The bracket index is a variable (Expr::Var), not a literal/constant (not p["fixed"])

If either check fails, return the appropriate CompilerError. This should be checked in the rule compilation path, before codegen emits ObjectSet.

The simplest place is in compile_worklist_rules (around line 340-360 where key_expr is extracted from RefBrack). Before proceeding with compilation:

// For PartialObject without assign, validate the key is a variable
if rule_type == RuleType::PartialObject && assign.is_none() {
    // Check: index must be a Var (not a literal/constant)
    match index.as_ref() {
        Expr::Var { .. } => {} // OK — variable key like p[k]
        _ => return Err(CompilerError::PartialObjectConstantKeyUnsupported.at(&refr_span)),
    }
    // Check: prefix must not contain another RefBrack (no nesting like p[a][b])
    if matches!(prefix.as_ref(), Expr::RefBrack { .. }) {
        return Err(CompilerError::PartialObjectNestedKeyUnsupported.at(&refr_span));
    }
}

Step 3: Un-skip the RVM tests that now correctly error

The two skipped test cases in tests/rvm/rego/cases/partial_object_rules.yaml:

  • partial_object_constant_key_still_works — change from skip: true to want_error: true (or whatever the test harness uses to expect compilation failure)
  • partial_object_multilevel_key_collects_nested_bindings — same: expect a compiler error

Check tests/rvm/rego/mod.rs for how error expectations work in the test harness (look for want_error, expect_error, or similar fields in TestCase).

Step 4: Run tests

cargo test --features rvm
cargo test --test opa --features opa-testutil

Summary

This is a temporary mitigation for #719. The RVM codegen will be properly fixed in a future PR. For now, unsupported partial object patterns gracefully fall back to the interpreter via compiler errors.
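
The fallback idea in this comment can be sketched roughly as follows. This is a toy model under stated assumptions: `CompilerError` here is a stand-in that only reuses the two variant names proposed in Step 1, `compile_for_rvm` and `choose_engine` are hypothetical, and the string matching replaces real AST inspection purely for illustration. It models the guard as first proposed in Step 2, before later review rounds refined it.

```rust
/// Stand-in error type reusing the two variant names proposed above.
#[derive(Debug, PartialEq)]
enum CompilerError {
    PartialObjectConstantKeyUnsupported,
    PartialObjectNestedKeyUnsupported,
}

/// Which engine ends up evaluating the rule.
#[derive(Debug, PartialEq)]
enum Engine {
    Rvm,
    Interpreter,
}

/// Hypothetical RVM compile step: reject heads it cannot lower correctly.
/// A real compiler inspects the AST; string matching is illustrative only.
fn compile_for_rvm(rule_head: &str) -> Result<(), CompilerError> {
    if rule_head.matches('[').count() > 1 {
        return Err(CompilerError::PartialObjectNestedKeyUnsupported);
    }
    if rule_head.contains("[\"") {
        return Err(CompilerError::PartialObjectConstantKeyUnsupported);
    }
    Ok(())
}

/// Prefer the RVM, but fall back to the interpreter on any compile error.
fn choose_engine(rule_head: &str) -> Engine {
    match compile_for_rvm(rule_head) {
        Ok(()) => Engine::Rvm,
        Err(_) => Engine::Interpreter,
    }
}
```

The point of the design is that a compile-time rejection is observable by the caller, whereas silently compiling an unsupported shape would only show up as wrong results.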

@anakrish
Collaborator

anakrish commented May 7, 2026

@copilot Take a look at #718 (comment)

Agent-Logs-Url: https://github.com/microsoft/regorus/sessions/6aa57d3d-7860-4646-ac8a-8e6183b117da

Co-authored-by: anakrish <35780660+anakrish@users.noreply.github.com>
Contributor Author

Copilot AI commented May 7, 2026

RVM: Add compiler errors for unsupported partial object patterns (addresses #719)

The interpreter fix in this PR is correct and approved. However, the RVM classification change (PartialSet → `...

Addressed in f2ff982. I added explicit RVM compiler errors for constant-key and nested-bracket partial object heads, validated them in the rule compilation path, and updated the RVM regressions to expect those compiler errors while allowing interpreter success.

@copilot Take a look at #718 (comment)

Done in f2ff982.

Copilot AI changed the title from "Fix partial object iteration dropping bindings in interpreter and RVM" to "Fix partial object iteration dropping bindings and guard unsupported RVM patterns" May 7, 2026
@anakrish
Collaborator

anakrish commented May 8, 2026

Review Findings — Round 2

The interpreter fix is correct and approved. The RVM compiler error guards need adjustments:

Finding 1 (High): Validation guard is too narrow — assign.is_none() skips := forms

The guard at rules.rs:366-370 only validates p[k] if (no assign), but p[k] := v if forms have the same nested-key and constant-key codegen issues.

Example: p[a][b] := 1 if { some a, obj in input.nested; some b, _ in obj } bypasses the guard, and the compiler drops outer key a, producing {"read":1,"deploy":1} instead of {"app":{"read":1},"ops":{"deploy":1}}.

Fix: Remove the assign.is_none() condition. Validate all RuleType::PartialObject RuleHead::Compr heads:

// Change from:
if rule_type == RuleType::PartialObject && assign.is_none() {
    self.validate_partial_object_shape(refr)?;
}

// To:
if rule_type == RuleType::PartialObject {
    self.validate_partial_object_shape(refr)?;
}

Finding 2 (Medium): Dynamic non-Var keys incorrectly rejected

validate_partial_object_shape rejects ALL non-Var keys as "constant keys", but expressions like p[input.x] if and p[x+1] if are dynamic — the RVM codegen handles them fine via compile_rego_expr + ObjectSet.

Fix: Instead of !matches!(index.as_ref(), Expr::Var { .. }), reject only actual literals:

// Change from:
if !matches!(index.as_ref(), Expr::Var { .. }) {
    return Err(CompilerError::PartialObjectConstantKeyUnsupported.at(index.span()));
}

// To: reject only simple literals (strings, numbers, bools, null)
if matches!(index.as_ref(), 
    Expr::String { .. } | Expr::RawString { .. } | Expr::Number { .. } 
    | Expr::Bool { .. } | Expr::Null { .. }) {
    return Err(CompilerError::PartialObjectConstantKeyUnsupported.at(index.span()));
}

Finding 3 (Note): Semver consideration

Adding new public CompilerError enum variants is technically a semver break for downstream code that exhaustively matches on the enum. Consider adding #[non_exhaustive] to CompilerError (can be a separate PR/issue).
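
For reference, a minimal sketch of what Finding 3 suggests. The enum below is a stand-in with only the two variants named in this thread, not the real `CompilerError`, and `describe` is a hypothetical downstream-style consumer.

```rust
/// Stand-in error enum marked `#[non_exhaustive]`.
#[non_exhaustive]
#[derive(Debug)]
enum CompilerError {
    PartialObjectConstantKeyUnsupported,
    PartialObjectNestedKeyUnsupported,
}

/// How downstream code has to match once the attribute is present:
/// with a wildcard arm that absorbs variants added in later releases.
#[allow(unreachable_patterns)]
fn describe(err: &CompilerError) -> &'static str {
    match err {
        CompilerError::PartialObjectConstantKeyUnsupported => "constant key",
        CompilerError::PartialObjectNestedKeyUnsupported => "nested key",
        // Mandatory in other crates; inside the defining crate the
        // attribute has no effect and this arm is simply unreachable.
        _ => "other compiler error",
    }
}
```

Because other crates can no longer match exhaustively, adding a variant later stops being a breaking change for them.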

Test additions needed

  1. Add a test for p[a][b] := v if with want_error + allow_interpreter_success
  2. Add a test for p["fixed"] := v if with want_error + allow_interpreter_success
  3. The existing partial_object_explicit_value_collects_all_bindings test (p[k] := v if) should continue to pass (variable key with assign is supported)

@anakrish
Collaborator

anakrish commented May 8, 2026

@copilot Please address the findings in the previous comment.

Agent-Logs-Url: https://github.com/microsoft/regorus/sessions/357566e7-fc57-4672-94cc-d31687ed59d1

Co-authored-by: anakrish <35780660+anakrish@users.noreply.github.com>
Contributor Author

Copilot AI commented May 8, 2026

Review Findings — Round 2

The interpreter fix is correct and approved. The RVM compiler error guards need adjustments:...

Addressed in bf43dd4. The partial-object guard now applies to all RuleType::PartialObject bracket heads, only rejects literal constant keys (not dynamic expressions like input.aliases[k]), and adds the requested explicit-value regression cases for constant-key and nested-key forms.

@copilot Please address the findings in the previous comment.

Done in bf43dd4.


Copilot AI left a comment

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

Comment on lines +4 to +16
cases:
  - note: constant_key_partial_object_v1
    data: {}
    input:
      enabled: true
    modules:
      - |
        package test
        import rego.v1

        p["fixed"] if {
          input.enabled
        }
Contributor Author

Copilot AI commented May 8, 2026

Comprehensive test cases for partial object rules

Please add the following test cases to tests/rvm/rego/cases/partial_object_rules.yaml. These cover edge cases, adversarial scenarios, real-world...

Addressed in 3b4f618. I added the comprehensive partial-object cases to tests/rvm/rego/cases/partial_object_rules.yaml, pinned the diverging RVM behaviors to current reality with TODO(#719) notes, and kept the every vacuous-truth case skipped with an explanation after verifying it still diverges.

@copilot Please add all the test cases from the previous comment to tests/rvm/rego/cases/partial_object_rules.yaml. For cases where the actual behavior diverges, adjust expectations to match reality...

Done in 3b4f618.


Copilot AI left a comment

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

Comment on lines 58 to 63
 RuleHead::Compr { refr, assign, .. } => match refr.as_ref() {
     crate::ast::Expr::RefBrack { .. } if assign.is_some() => {
         RuleType::PartialObject
     }
-    crate::ast::Expr::RefBrack { .. } => RuleType::PartialSet,
+    crate::ast::Expr::RefBrack { .. } => RuleType::PartialObject,
     _ => RuleType::Complete,
@anakrish
Collaborator

anakrish commented May 8, 2026

Fix: Tests should not encode buggy behavior as expected output

Several test cases currently hard-code known-incorrect RVM output as want_result. This blesses bugs — when someone fixes the RVM, these tests would break instead of pass.

Rule: If the expected behavior cannot be tested correctly today, use skip: true with a TODO comment explaining the correct behavior and referencing #719.

Tests to fix:

1. Undefined key/value tests — These currently assert "#undefined" as a key or encode partial results that include undefined entries:

  • partial_object_undefined_key_skipped
  • partial_object_undefined_key_skips_iteration
  • partial_object_undefined_value_skips_iteration
  • partial_object_mixed_undefined_key_value_cases_skip_bad_iterations

For each, change to skip: true and add a comment like:

    # TODO (#719): RVM incorrectly materializes undefined keys/values instead of
    # skipping them. Correct behavior: skip iterations where key or value is undefined.
    skip: true

Keep the want_result set to the correct expected output (what OPA would produce), so when the RVM is fixed these tests can simply be unskipped.

2. Duplicate key conflict tests — These currently assert last-writer-wins (a: 2) but OPA errors on conflicting keys with different values:

  • partial_object_duplicate_key_last_wins
  • partial_object_duplicate_paths_same_key_different_values_conflict

For each, set the correct expectation (want_error with a conflict message) and skip: true:

    # TODO (#719): regorus silently overwrites conflicting keys instead of erroring.
    # Correct behavior: error on same key with different values.
    skip: true
    want_error: "conflict"
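
The conflict rule these tests should encode can be sketched as below. This is an illustrative helper, not regorus or OPA code: `insert_checked` is a hypothetical name, and values are narrowed to `i64` to keep the example short.

```rust
use std::collections::BTreeMap;

/// Insert `key -> value` into a partial object under construction.
/// Re-inserting an identical value is a no-op; a different value for
/// an existing key is reported as a conflict instead of overwriting.
fn insert_checked(
    obj: &mut BTreeMap<String, i64>,
    key: &str,
    value: i64,
) -> Result<(), String> {
    if let Some(existing) = obj.get(key) {
        if *existing != value {
            return Err(format!(
                "conflict: key {:?} already has value {}, cannot set {}",
                key, existing, value
            ));
        }
        return Ok(());
    }
    obj.insert(key.to_string(), value);
    Ok(())
}
```

Under last-writer-wins the second value would silently replace the first; here the first value is kept and the caller sees an error.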

3. Nested literal prefix: p[1][k] slips through has_dynamic_bracket since 1 is a literal. Add this to the nested-bracket rejection in validate_partial_object_shape: walk the full prefix chain and reject ANY RefBrack, not just those with dynamic indices. A static RefBrack prefix like p[1] still cannot be correctly lowered by the current codegen.

Alternatively, if fixing the guard is too complex, just add a test case for p[1][k] := v if with skip: true documenting the gap.

Do NOT use allow_interpreter_incorrect_behavior: true for cases where the interpreter is correct — that flag means "interpreter is wrong, RVM is right", which is the opposite of these cases.

@anakrish
Collaborator

anakrish commented May 8, 2026

@copilot Please address the findings in the previous comment.

Agent-Logs-Url: https://github.com/microsoft/regorus/sessions/19553cd8-af3d-4285-86c1-3ef1a72c0b93

Co-authored-by: anakrish <35780660+anakrish@users.noreply.github.com>
Contributor Author

Copilot AI commented May 8, 2026

Fix: Tests should not encode buggy behavior as expected output

Several test cases currently hard-code known-incorrect RVM output as want_result. This blesses bugs — when someone fixes the RVM,...

Addressed in 4f6651a. The undefined-key/value and conflicting-duplicate-key cases now use skip: true with TODO(#719) notes while keeping the correct expected behavior, and the RVM validator now rejects nested literal-prefix bracket heads like p[1][k].

@copilot Please address the findings in the previous comment.

Done in 4f6651a.

@anakrish
Collaborator

anakrish commented May 8, 2026

@copilot Please take a look at #718 (comment)

Contributor Author

Copilot AI commented May 8, 2026

@copilot Please take a look at #718 (comment)

Took another look — this was already addressed in 4f6651a, and there aren’t any additional changes needed beyond that update.


Copilot AI left a comment

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated no new comments.

@anakrish anakrish marked this pull request as ready for review May 8, 2026 18:37
@anakrish
Collaborator

anakrish commented May 8, 2026

Missing: Regression tests for the exact issue #712 reproducer

The PR fixes the bug but doesn't have a test for the exact scenario reported in the issue. Please add these to lock down v0/v1 semantics for the original reproducer:

  # --- Issue #712 exact reproducer: violations[k] in v0 vs v1 ---

  # v0: violations[k] { ... } is a partial SET (no "if" keyword)
  - note: issue_712_reproducer_v0_partial_set
    data: {}
    input:
      servers:
        FOO: 1
        BAR: 2
        BAZ: 3
    modules:
      - |
        package test

        violations[k] {
          some k, _ in input.servers
        }
    query: data.test.violations
    want_result:
      set!: ["BAR", "BAZ", "FOO"]

  # v1: violations[k] if { ... } is a partial OBJECT (key -> true)
  # This is the exact bug from issue #712 — was producing only one entry
  - note: issue_712_reproducer_v1_partial_object
    data: {}
    input:
      servers:
        FOO: 1
        BAR: 2
        BAZ: 3
    modules:
      - |
        package test

        violations[k] if {
          some k, _ in input.servers
        }
    query: data.test.violations
    want_result:
      BAR: true
      BAZ: true
      FOO: true

  # v1: violations contains k if { ... } is a partial SET
  - note: issue_712_reproducer_v1_contains_partial_set
    data: {}
    input:
      servers:
        FOO: 1
        BAR: 2
        BAZ: 3
    modules:
      - |
        package test

        violations contains k if {
          some k, _ in input.servers
        }
    query: data.test.violations
    want_result:
      set!: ["BAR", "BAZ", "FOO"]

These three tests together lock down the v0/v1 semantic distinction that caused the original issue. Add them to both tests/rvm/rego/cases/partial_object_rules.yaml and tests/interpreter/cases/rule/partial_object_v1.yaml (the v0 case goes in the interpreter test since it's v0 syntax).
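
The semantic distinction these three cases pin down can be summarized in a small sketch. All types here are hypothetical simplifications: `Head::Bracket` means a bracket head without an assigned value, and the real classifier works on the AST rather than on a two-variant enum.

```rust
/// Rego language version of the module.
#[derive(Debug, Clone, Copy, PartialEq)]
enum RegoVersion {
    V0,
    V1,
}

/// Hypothetical summary of a rule head shape.
#[derive(Debug, Clone, Copy)]
enum Head {
    /// `violations[k] ...` (bracket head, no assigned value)
    Bracket,
    /// `violations contains k ...`
    Contains,
}

#[derive(Debug, PartialEq)]
enum RuleType {
    PartialSet,
    PartialObject,
}

/// Classify the head as described in the three regression cases.
fn classify(version: RegoVersion, head: Head) -> RuleType {
    match (version, head) {
        // `contains` always builds a set.
        (_, Head::Contains) => RuleType::PartialSet,
        // v0: `violations[k] { ... }` is a partial set.
        (RegoVersion::V0, Head::Bracket) => RuleType::PartialSet,
        // v1: `violations[k] if { ... }` is a partial object (key -> true).
        (RegoVersion::V1, Head::Bracket) => RuleType::PartialObject,
    }
}
```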

@anakrish
Collaborator

anakrish commented May 8, 2026

@copilot Please add the issue #712 reproducer tests from the previous comment.

Agent-Logs-Url: https://github.com/microsoft/regorus/sessions/ca6066bd-e40f-4334-9ace-4b14a72ad7b4

Co-authored-by: anakrish <35780660+anakrish@users.noreply.github.com>
Contributor Author

Copilot AI commented May 8, 2026

Missing: Regression tests for the exact issue #712 reproducer

The PR fixes the bug but doesn't have a test for the exact scenario reported in the issue. Please add these to lock down v0/v1 semanti...

Added the issue #712 reproducer coverage in a639b9d. The RVM suite now includes the v1 partial-object and contains cases, and the interpreter suite includes those plus the v0 partial-set case with future.keywords.in.

@copilot Please add the issue #712 reproducer tests from the previous comment.

Done in a639b9d.


Copilot AI left a comment

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

Comment on lines +92 to +111
    let Expr::RefBrack {
        refr: prefix,
        index,
        ..
    } = refr.as_ref()
    else {
        return Ok(());
    };

    if Self::has_unsupported_bracket_prefix(prefix) {
        return Err(CompilerError::PartialObjectNestedKeyUnsupported.at(refr.span()));
    }

    if Self::is_simple_literal(index) {
        return Err(CompilerError::PartialObjectConstantKeyUnsupported.at(index.span()));
    }

    Ok(())
}
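
The two helpers called in this snippet are not shown on this page. The following is one possible reading of them on a toy AST, based only on the rules stated in this thread (reject literal keys, walk the full prefix chain, tolerate string-literal bracket prefixes that are equivalent to dot access). The `Expr` and `ShapeError` types are hypothetical simplifications, and the helper bodies are assumptions rather than the PR's code.

```rust
/// Hypothetical, heavily simplified reference AST (not the regorus `Expr`).
#[derive(Debug)]
enum Expr {
    Var(&'static str),
    Str(&'static str),
    Num(i64),
    /// `refr.field`
    RefDot { refr: Box<Expr>, field: &'static str },
    /// `refr[index]`
    RefBrack { refr: Box<Expr>, index: Box<Expr> },
}

#[derive(Debug, PartialEq)]
enum ShapeError {
    NestedKeyUnsupported,
    ConstantKeyUnsupported,
}

/// Only plain literals count as constant keys; variables and other
/// dynamic expressions do not.
fn is_simple_literal(e: &Expr) -> bool {
    matches!(e, Expr::Str(_) | Expr::Num(_))
}

/// Walk the whole prefix chain, including through dot lookups. A bracket
/// in the prefix is tolerated only when its index is a string literal.
fn has_unsupported_bracket_prefix(prefix: &Expr) -> bool {
    match prefix {
        Expr::RefDot { refr, .. } => has_unsupported_bracket_prefix(refr),
        Expr::RefBrack { refr, index } => {
            !matches!(index.as_ref(), Expr::Str(_)) || has_unsupported_bracket_prefix(refr)
        }
        _ => false,
    }
}

fn validate_partial_object_shape(head: &Expr) -> Result<(), ShapeError> {
    let Expr::RefBrack { refr: prefix, index } = head else {
        return Ok(());
    };
    if has_unsupported_bracket_prefix(prefix) {
        return Err(ShapeError::NestedKeyUnsupported);
    }
    if is_simple_literal(index) {
        return Err(ShapeError::ConstantKeyUnsupported);
    }
    Ok(())
}
```

In this sketch `p[k]` and `p["a"][k]` are accepted, while `p["fixed"]`, `p[a][b]`, `p[1][k]`, and `p[a].q[b]` are rejected, matching the shapes listed in the PR description.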

Development

Successfully merging this pull request may close these issues.

regorus 0.9.1 — partial set rule with iterating body emits only one element

3 participants