Hi,
I've implemented BVH traversal routines with Dr.Jit (very similar to what FCPW does here). The issue I'm seeing is that with the CUDA backend my code runs correctly (I have tests against a pure-Python implementation), but with the LLVM backend it fails with out-of-bounds memory accesses that I can't pinpoint. In debug mode I get errors like: `drjit.gather(): out-of-bounds read from position 1009400057 in an array of size 7. (typing.py:2279)`
Some traces (like the one above) even point into code I didn’t write (e.g., typing.py), which makes it hard to diagnose.
Concretely, I'm using a fixed-size stack implemented as a `dr.Local` over a custom dataclass, following the documentation's instructions for nested objects. When reading and writing, I'm careful to do:
```python
stack_entry = subtree.read(read_stack(stack_ptr), active=active)
node_index = stack_entry.node
current_dist = stack_entry.distance
subtree.write(stack_entry, read_stack(stack_ptr), active=active)
stack_ptr -= mi.Int32(1)
```

I also keep the `active` mask updated conservatively, since my initial suspicion was that incorrect reads/writes might come from there.
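To illustrate my working hypothesis, here is a minimal pure-Python sketch (hypothetical helper names, not the Dr.Jit API): a lane that becomes inactive can keep a stale stack pointer, and a backend that still evaluates the gather for that lane would read out of bounds unless the index is clamped into range first.

```python
def masked_stack_read(stack, stack_ptr, active):
    """Per-lane stack read: clamp the index of inactive lanes so a
    speculative read never goes out of bounds (hypothetical helper,
    not part of Dr.Jit)."""
    out = []
    for ptr, act in zip(stack_ptr, active):
        safe = min(max(ptr, 0), len(stack) - 1)  # clamp into [0, len - 1]
        out.append(stack[safe] if act else None)  # inactive lanes: result discarded
    return out

# An inactive lane with a stale pointer (e.g. 1009400057) stays safe:
print(masked_stack_read([10, 20, 30], [0, 2, 1009400057],
                        [True, True, False]))  # → [10, 30, None]
```

If this mental model is right, it would explain why CUDA (per-thread execution) happens to work while the LLVM backend (vectorized, possibly evaluating both sides of a mask) trips the bounds check.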
Do you have any idea where this backend-dependent behavior (LLVM vs. CUDA) could be coming from?
If it helps, I’m happy to share the full snippet privately.
Thank you in advance!