Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

scatter_inc with all-false literal active mask #113

Open
merlinND opened this issue Dec 17, 2024 · 7 comments · May be fixed by #114
Open

scatter_inc with all-false literal active mask #113

merlinND opened this issue Dec 17, 2024 · 7 comments · May be fixed by #114

Comments

@merlinND
Copy link
Member

Calling scatter_inc with a literal all-False mask returns an array of size 0, which is unexpected.
I would have expected scatter_inc(..., False) to boil down to a gather instead.

We are running into this case in real code, and the zero-sized array causes a crash when used later in the code.
This was found by @faycalaa.

Likely root cause

It is most likely due to this early exit:

drjit-core/src/op.cpp

Lines 2084 to 2085 in 61570f7

if (mask_v->is_literal() && mask_v->literal == 0)
return 0;

I think that something like this would make more sense (untested):

    if (mask_v->is_literal() && mask_v->literal == 0)
        return steal(jitc_var_gather(target, index, /* true-valued literal for the gather */));

Reproducer

def reproduce(opaque: bool):
    counter = dr.ones(mi.UInt, 2)
    # dr.make_opaque(counter)  # No effect
    index = dr.zeros(mi.UInt, 2)

    # All-false literal mask triggers an "optimization" that is not correct.
    active = dr.full(mi.Bool, False, 2)
    if opaque:
        dr.make_opaque(active)

    old_value = dr.scatter_inc(counter, index, active)
    dr.eval(old_value, counter)

    print(f"{opaque=}:")
    print(f"- {old_value=}")
    print(f"- {counter=}")


reproduce(True)
reproduce(False)

Results in:

opaque=True:
- old_value=[0, 0]
- counter=[1, 1]
opaque=False:
- old_value=[]
- counter=[1, 1]
@merlinND merlinND changed the title scatter_inc with all-false active mask scatter_inc with all-false literal active mask Dec 17, 2024
@wjakob
Copy link
Member

wjakob commented Dec 17, 2024

Scatter-inc does an optimization that collapses writes to same index, while using thread-local commination to locally establish the right indices that will be returned to the caller. You should not expect to get usable data for entries for which the mask is set to false.

@merlinND
Copy link
Member Author

Hi @wjakob,

In the documentation it is written:

The operation also supports masking—the return value in the unmasked case is undefined.

Is it meant to say that for lanes where active is False, the return value is undefined? Specifically, if we don't increment a value, we don't get the current value back either?

@wjakob
Copy link
Member

wjakob commented Dec 17, 2024

Right, that's a typo. It's meant to say "in the masked case"

@wjakob
Copy link
Member

wjakob commented Dec 17, 2024

By the way: you can easily combine this with a gather to get the behavior your want. But of course this has a perf cost.

@merlinND
Copy link
Member Author

Okay, that makes sense!
In this case we're not actually relying on the return values, so the current behavior is almost fine.

It's just the particular case of an all-false literal active mask which leads an empty array to be returned (see first post):

opaque=False:
- old_value=[]
- counter=[1, 1]

Then we run into exceptions like operands have incompatible sizes! in subsequent code.
Would it make sense to return an array of arbitrary values, but with the correct width?

@wjakob
Copy link
Member

wjakob commented Dec 17, 2024

Yes, we should return an array of the correct size.

@merlinND
Copy link
Member Author

I fixed the typo in the scatter_inc docstring: mitsuba-renderer/drjit@0bbbda0

And opened a PR to return an array of the correct width even in the all-false literal mask case: #114

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants