Is there an existing CVA6 bug for this?
- I have searched the existing bug issues
Bug Description
Description
We encountered a non-compliant behavior in CVA6 where enabling interrupts via a CSR write (csrrw to mie) does not trigger the trap immediately. Instead, the core continues to execute subsequent instructions (an "interrupt skid"), which incorrectly modifies the architectural state before the trap is finally taken.
This issue was identified during fuzzing verification by comparing CVA6 execution traces against Spike.
Environment
- Commit ID: aa4f5a5
- Config: cv64a6_imafdc_sv39
- ISA: rv64g_zba_zbb_zbs_zbc_zbkb_zbkx_zkne_zknd_zknh
Steps to Reproduce
The following sequence demonstrates the issue. A machine timer interrupt (reported as interrupt #7 in the Spike log below) is pending, and we enable it by writing mie.
# ... setup logic ...
li a3, 0x000000000000b8d8 # Value written to mie; reads back as 0x888 in Spike (MSIE, MTIE, MEIE set)
csrrw s9, mie, a3 # <--- Interrupt enabled here.
addi x1, x0, 1145 # <--- CVA6 EXECUTES this (Resulting in x1=1145)
nop # <--- CVA6 EXECUTES this
# <--- CVA6 finally traps here
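To make the written value concrete, here is a minimal Python sketch that decodes which interrupt-enable bits survive the write. It assumes the writable bits of mie are the six standard machine- and supervisor-level enables (mask 0xAAA). That mask and the helper name `decode_mie` are illustrative assumptions, not taken from the CVA6 or Spike sources. The result agrees with the 0x888 that Spike reports in its commit log.

```python
# Standard mie enable bits from the RISC-V privileged spec (bit position -> name).
MIE_BITS = {1: "SSIE", 3: "MSIE", 5: "STIE", 7: "MTIE", 9: "SEIE", 11: "MEIE"}

def decode_mie(written):
    """Mask a value written to mie and name the enable bits that remain set.

    Assumption: only the six standard enable bits are writable (mask 0xAAA).
    """
    mask = sum(1 << bit for bit in MIE_BITS)
    effective = written & mask
    names = [name for bit, name in sorted(MIE_BITS.items()) if (effective >> bit) & 1]
    return effective, names

effective, names = decode_mie(0xB8D8)
print(hex(effective), names)  # 0x888 ['MSIE', 'MTIE', 'MEIE']
```

With MTIE set and a machine timer interrupt already pending, the write in the sequence above is the point at which the interrupt becomes takeable.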
Expected Behavior (Spike)
According to the RISC-V Privileged Architecture Specification, the interrupt conditions must be evaluated immediately after the csrrw to mie, so the trap should be taken as soon as the csrrw retires. The addi instruction should not be executed, and register x1 should not be modified. mepc should point to the addi instruction (0x80002028).
Observed Behavior (CVA6)
CVA6 executes the csrrw, but then proceeds to fetch, execute, and commit the subsequent addi and nop instructions. The trap is taken only after these instructions retire.
Crucially, the delay is architecturally visible: register x1 is updated to 1145 (0x479) and the trap is reported at 0x80002030 instead of 0x80002028, so the state at trap entry differs from what the ISA requires after an explicit write to mie.
Log Evidence
Spike:
core 0: 0x0000000080002024 (0x30469cf3) csrrw s9, mie, a3
core 0: 3 0x0000000080002024 (0x30469cf3) x25 0x0000000000000000 c772_mie 0x0000000000000888
core 0: exception interrupt #7, epc 0x0000000080002028 <-- Traps immediately at next PC
core 0: >>>> other_exp
CVA6:
core 0: 0x0000000080002024 (0x30469cf3) csrrw s9, mie, a3
3 0x0000000080002024 (0x30469cf3) x25 0x0000000000000000
core 0: 0x0000000080002028 (0x47900093) li ra, 1145 <-- EXECUTED! (Skid)
3 0x0000000080002028 (0x47900093) x 1 0x0000000000000479 <-- x1 (ra) CORRUPTED
core 0: 0x000000008000202c (0x00000013) nop <-- EXECUTED!
3 0x000000008000202c (0x00000013)
exception @ 0x0000000080002030 (0x00000013) <-- Late Trap
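The comparison between the two logs can be automated. The sketch below reduces each log to a sequence of "instruction at PC" and "trap at PC" events and reports the first position where the two sequences differ. The regular expressions are inferred only from the excerpts in this report and are an assumption about the full log format. `events` and `first_divergence` are hypothetical helper names, not part of any Spike or CVA6 tooling.

```python
import re
from itertools import zip_longest

# Patterns inferred from the log excerpts in this report (assumed, not official).
INSN = re.compile(r"^core\s+\d+:\s+0x([0-9a-f]+)\s+\(0x[0-9a-f]+\)")
TRAP = re.compile(r"exception.*?0x([0-9a-f]+)")

def events(log):
    """Reduce a trace to a list of ("insn", pc) and ("trap", pc) tuples."""
    out = []
    for line in log.splitlines():
        line = line.strip()
        m = INSN.match(line)
        if m:
            out.append(("insn", int(m.group(1), 16)))
            continue
        m = TRAP.search(line)
        if m:
            out.append(("trap", int(m.group(1), 16)))
    return out

def first_divergence(ref_log, dut_log):
    """Return (index, ref_event, dut_event) at the first mismatch, or None."""
    pairs = zip_longest(events(ref_log), events(dut_log))
    for i, (ref, dut) in enumerate(pairs):
        if ref != dut:
            return i, ref, dut
    return None

SPIKE_LOG = """
core 0: 0x0000000080002024 (0x30469cf3) csrrw s9, mie, a3
core 0: 3 0x0000000080002024 (0x30469cf3) x25 0x0000000000000000 c772_mie 0x0000000000000888
core 0: exception interrupt #7, epc 0x0000000080002028
core 0: >>>> other_exp
"""

CVA6_LOG = """
core 0: 0x0000000080002024 (0x30469cf3) csrrw s9, mie, a3
3 0x0000000080002024 (0x30469cf3) x25 0x0000000000000000
core 0: 0x0000000080002028 (0x47900093) li ra, 1145
3 0x0000000080002028 (0x47900093) x 1 0x0000000000000479
core 0: 0x000000008000202c (0x00000013) nop
3 0x000000008000202c (0x00000013)
exception @ 0x0000000080002030 (0x00000013)
"""

index, ref, dut = first_divergence(SPIKE_LOG, CVA6_LOG)
print(index, ref[0], hex(ref[1]), "vs", dut[0], hex(dut[1]))
```

On the excerpts above, the first mismatch is at event index 1: Spike takes the trap with epc 0x80002028, while CVA6 retires the instruction at that same PC.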
Reference to Specification
This violates the RISC-V Privileged Architecture Specification:
"These conditions for an interrupt trap to occur must be evaluated in a bounded amount of time from when an interrupt becomes, or ceases to be, pending in mip, and must also be evaluated immediately following the execution of an xRET instruction or an explicit write to a CSR on which these interrupt trap conditions expressly depend (including mip, mie, mstatus, and mideleg)."
By committing the addi and nop after the write to mie, CVA6 does not evaluate the interrupt trap conditions immediately following that write, as the quoted text requires.
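For reference, the trap condition that must be re-evaluated after the write can be sketched as follows. This is a simplified model for a hart running in M-mode only. The function name is hypothetical, and the concrete inputs (MTIP pending, mstatus.MIE set, nothing delegated) are assumptions consistent with the logs in this report, not values captured from the test.

```python
# Interrupt cause numbers in priority order: MEI, MSI, MTI, SEI, SSI, STI.
PRIORITY = (11, 3, 7, 9, 1, 5)

def m_mode_interrupt_to_take(mip, mie, mstatus_mie, mideleg=0):
    """Return the cause of the interrupt an M-mode hart should take, or None.

    Simplified: in M-mode an interrupt traps when it is pending in mip,
    enabled in mie, not delegated via mideleg, and mstatus.MIE is set.
    """
    if not mstatus_mie:
        return None
    candidates = mip & mie & ~mideleg
    for cause in PRIORITY:
        if (candidates >> cause) & 1:
            return cause
    return None

MTIP = 1 << 7  # assumed pending interrupt (machine timer)

before = m_mode_interrupt_to_take(mip=MTIP, mie=0x000, mstatus_mie=True)
after = m_mode_interrupt_to_take(mip=MTIP, mie=0x888, mstatus_mie=True)
print(before, after)  # None 7
```

Under these assumptions the condition becomes true as soon as mie holds 0x888, which is why the reference model traps with epc pointing at the instruction directly after the csrrw.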
Analysis
We suspect that writing to the mie CSR does not trigger a pipeline flush.
When mie is updated in the commit stage (or CSR functional unit), the instructions immediately following it (which are already in the Decode/Issue stages) are not flushed. They are allowed to proceed to Commit before the interrupt logic asserts the trap.
The issue appears sensitive to instruction alignment and timing (for example, adding or removing nop padding sometimes masks it). This is consistent with a race between the propagation of the CSR update and the pipeline commit logic, caused by the lack of an explicit flush.
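The hypothesis can be illustrated with a toy commit-order model. This is not derived from the CVA6 RTL. The `latency` parameter (how many younger instructions commit before the interrupt logic sees the new mie value) is a hypothetical knob, and the value 2 is chosen only because it reproduces the observed log. The pending bit and the operation names are likewise assumptions.

```python
def run(program, mip, latency):
    """Commit instructions in order and return (trap_pc, regs).

    latency: number of younger instructions that commit before a write to
    mie becomes visible to the interrupt check (0 models an immediate flush).
    """
    mie = 0            # architectural value written by software
    visible_mie = 0    # value currently seen by the interrupt logic
    countdown = None   # commits left until visible_mie catches up
    regs = {}
    for pc, op, arg in program:
        if countdown is not None:
            if countdown == 0:
                visible_mie, countdown = mie, None
            else:
                countdown -= 1
        if mip & visible_mie:
            return pc, regs          # trap taken instead of this instruction
        if op == "csrw_mie":
            mie, countdown = arg, latency
        elif op == "li":
            regs[arg[0]] = arg[1]
        # "nop" changes no state
    return None, regs

PROGRAM = [
    (0x80002024, "csrw_mie", 0x888),
    (0x80002028, "li", ("x1", 1145)),
    (0x8000202C, "nop", None),
    (0x80002030, "nop", None),
]
MTIP = 1 << 7  # assumed pending interrupt

print(run(PROGRAM, MTIP, latency=0))  # trap at 0x80002028, x1 untouched
print(run(PROGRAM, MTIP, latency=2))  # trap at 0x80002030, x1 == 1145
```

With `latency=0` the model matches the Spike trace. With `latency=2` it matches the CVA6 trace, including the write to x1.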
Attachments
ELF binary and full execution logs are attached.