Description
Chipyard Version and Hash
Hash: ucb-bar/chipyard@44fec76
OS Setup
Linux 6.6.114.1-microsoft-standard-WSL2 SMP PREEMPT_DYNAMIC Mon Dec 1 20:46:23 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Other Setup
Using MediumBoomV4CosimConfig
Current Behavior
Description
We suspect that the Cospike harness (cospike_impl.cc) might be incorrectly using the wdata from the DUT's commit log to force-update Spike's CSR state. It appears that for Read-Modify-Write (RMW) instructions like csrrc, wdata may reflect the old value of the CSR (pre-modification). If this is the case, the harness could be effectively "rolling back" Spike's correctly updated state to the old value, which would explain the simulation divergence.
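To make the read-modify-write point concrete, here is a minimal sketch of the csrrc data flow, using the mstatus value and mask from the logs in this report. The `csrrc` helper and `CsrRmwResult` struct are hypothetical names used only for illustration; they are not part of Spike or the Cospike harness. The model is deliberately simplified and ignores derived fields such as mstatus.SD, which is why Spike's real post-write value in the logs (0x0000000a000c0088) also has bit 63 cleared.

```cpp
#include <cstdint>

// Hypothetical illustration type: what a CSR read-modify-write produces.
struct CsrRmwResult {
    std::uint64_t rd_value;   // value written to rd: the OLD CSR contents
    std::uint64_t csr_value;  // value left in the CSR: old contents with mask bits cleared
};

// Simplified CSRRC model. Ignores WARL/read-only behaviour (e.g. mstatus.SD).
constexpr CsrRmwResult csrrc(std::uint64_t old_csr, std::uint64_t mask) {
    return CsrRmwResult{old_csr, old_csr & ~mask};
}

constexpr std::uint64_t MSTATUS_FS = 0x6000;  // mstatus bits 14:13

// Values taken from the logs in this report: old mstatus and the s11 mask.
constexpr CsrRmwResult r = csrrc(0x8000000a000c6088ULL, 0xf000ULL);

static_assert(r.rd_value == 0x8000000a000c6088ULL, "rd (gp) holds the pre-modification value");
static_assert((r.rd_value & MSTATUS_FS) == MSTATUS_FS, "rd still shows FS = Dirty");
static_assert((r.csr_value & MSTATUS_FS) == 0, "the CSR itself now has FS = Off");
```

The register writeback in the commit log (x3 = 0x8000000a000c6088) is therefore the pre-modification mstatus, not the value the CSR holds after the instruction retires.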
Conditions
- Chipyard + BOOM (MediumBoomV4CosimConfig), Cospike co-simulation enabled.
- mstatus.FS initially set to Dirty (FP enabled).
- Execute: csrrc gp, mstatus, mask (clears mstatus.FS bits to Off) → fcvt.lu.s (FP instruction).
Observed
- DUT (BOOM): Traps on fcvt.lu.s (Cause 2: Illegal Instruction). ✓
- Spike (in Cosim): Does NOT trap on fcvt.lu.s and continues execution. ✗ (The Spike log in Cosim does show mstatus being updated to the new value when the instruction that clears the FS bits executes.)
- Result: PC mismatch reported.
Additional Observation (with csrr inserted)
After inserting csrr x30, mstatus between csrrc and fcvt, Spike returned the old value (FS still set, which contradicts the earlier Spike log line for csrrc), whereas the DUT returned the new value. In this run, however, both Spike and the DUT raised an exception at the fcvt instruction, so the simulation passed. If our hypothesis is correct, this is explained by the same override mechanism in the Cospike harness: after the csrr read of mstatus, the harness copied the DUT's mstatus value into Spike again, this time the new value, so Spike correctly raised an exception at the subsequent fcvt instruction.
Expected Behavior
After csrrc clears mstatus.FS, both DUT and Spike should trap on fcvt.lu.s. Co-simulation should pass.
Other Information
Faulty Logic
File: testchipip/src/main/resources/testchipip/csrc/cospike_impl.cc
The harness attempts to synchronize Spike with DUT's wdata without distinguishing instruction types:
// Problematic Override Logic
// s->XPR.write(rd, wdata);
// ...
uint64_t read_bits = s->csrmap[csr_addr]->read();
// Force Spike to match wdata (which is OLD value for RMW)
uint64_t write_bits = (read_bits & ~ignore_bits) | (wdata & ignore_bits);
s->csrmap[csr_addr]->write(write_bits);
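The effect of this merge can be checked by hand with the values from Experiment 1. The sketch below reproduces only the arithmetic of the snippet above. The value of ignore_bits is not shown in this report, so `kAssumedIgnoreBits` is an assumption (the mstatus.FS field, bits 14:13), and `merge_override` is a hypothetical helper name, not a function in cospike_impl.cc.

```cpp
#include <cstdint>

// Same expression as the harness snippet: keep Spike's bits outside
// ignore_bits, take the DUT-reported wdata bits inside ignore_bits.
constexpr std::uint64_t merge_override(std::uint64_t read_bits,
                                       std::uint64_t wdata,
                                       std::uint64_t ignore_bits) {
    return (read_bits & ~ignore_bits) | (wdata & ignore_bits);
}

constexpr std::uint64_t MSTATUS_FS = 0x6000;              // mstatus bits 14:13
constexpr std::uint64_t kAssumedIgnoreBits = MSTATUS_FS;  // ASSUMPTION: real mask not shown here

// From the Experiment 1 logs:
constexpr std::uint64_t spike_after_csrrc = 0x0000000a000c0088ULL;  // c768_mstatus (FS = Off)
constexpr std::uint64_t dut_wdata         = 0x8000000a000c6088ULL;  // x3 writeback (FS = Dirty)

constexpr std::uint64_t merged =
    merge_override(spike_after_csrrc, dut_wdata, kAssumedIgnoreBits);

static_assert(merged == 0x0000000a000c6088ULL, "FS bits copied from wdata");
static_assert((merged & MSTATUS_FS) == MSTATUS_FS, "Spike's FS is back to Dirty");
```

Under that assumption, writing the merged value puts FS back to Dirty in Spike, which matches the missing trap on fcvt.lu.s.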
Experiments
We performed three experiments to isolate the issue.
Experiment 1: Baseline Reproduction (Original Bug)
Setup: mstatus.FS is Dirty. Execute csrrc (clearing FS) followed by fcvt (FP instruction).
li s11, 0x000000000000f000
csrrc gp, mstatus, s11
fcvt.lu.s s3, fa7, dyn
Observation:
Spike: core 0: 3 0x0000000080002004 (0x0000fdb7) lui s11,0xf x27 0x000000000000f000
DUT: Cosim: 1583 commit: 80002004 (0xfdb7) lui s11, 0xf x27 0xf000(DUT)
Spike: core 0: 3 0x0000000080002008 (0x300db1f3) csrrc gp,mstatus,s11
x3 0x8000000a000c6088 c768_mstatus 0x0000000a000c0088 <- NEW MSTATUS
DUT: Cosim: 1591 commit: 80002008 (0x300db1f3) csrrc gp, mstatus, s11 x3 0x8000000a000c6088(DUT)
Cosim: CSR read 300
Cosim: CSR status override check
Cosim: 1599 exception 2
Spike: core 0: 3 0x000000008000200c (0xc038f9d3) fcvt.lu.s s3,fa7
c1_fflags 0x0000000000000010 x19 0xffffffffffffffff
Spike: core 0: 3 0x0000000080002010 (0xffffff17) auipc t5,0xfffff
x30 0x0000000080001010
DUT: Cosim: 1661 commit: 80001000 (0x341020f3) csrr ra, mepc x30 0x8000200c(DUT)
Cosim: 67d PC mismatch spike 80002010 != DUT 80001000 <-- [Spike=80002010 vs DUT=80001000]
- DUT (BOOM): Executes csrrc. FS becomes Off. fcvt correctly triggers Trap (Cause 2).
- Trace: DUT reports wdata(gp) = Old Value (Dirty).
- Spike: Executes csrrc -> FS becomes Off. Harness Overrides mstatus with wdata (Dirty). Spike FS reverts to Dirty.
- Result: Spike executes fcvt (no trap), causing a PC mismatch with DUT.
Experiment 2: Diagnostic Check (csrr Read)
Setup: Insert csrr rd, mstatus immediately after csrrc, before fcvt.
li s11, 0x000000000000f000
csrrc gp, mstatus, s11
csrr x30, mstatus
fcvt.lu.s s3, fa7, dyn
Observation:
Spike: core 0: 3 0x0000000080002004 (0x0000fdb7) lui s11,0xf x27 0x000000000000f000
DUT: Cosim: 1583 commit: 80002004 (0xfdb7) lui s11, 0xf x27 0xf000(DUT)
core 0: 3 0x0000000080002008 (0x300db1f3) csrrc gp,mstatus,s11
x3 0x8000000a000c6088 c768_mstatus 0x0000000a000c0088 <- NEW MSTATUS
Cosim: 1591 commit: 80002008 (0x300db1f3) csrrc gp, mstatus, s11 x3 0x8000000a000c6088(DUT)
Cosim: CSR read 300
Cosim: CSR status override check
core 0: 3 0x000000008000200c (0x30002f73) csrr t5,mstatus x30 0x8000000a000c6088 <- OLD MSTATUS
Cosim: 1605 commit: 8000200c (0x30002f73) csrr t5, mstatus x30 0xa000c0088(DUT) <- NEW MSTATUS
Cosim: CSR read 300
Cosim: CSR status override check
Cosim: 1613 exception 2
core 0: 3 0x0000000080001000 (0x341020f3) csrr ra,mepc x1 0x0000000080002010 <- Trap on FP instruction
Cosim: 1673 commit: 80001000 (0x341020f3) csrr ra, mepc x1 0x80002010(DUT) <- Trap on FP instruction
- Spike: csrr returns Dirty (Old Value). This may confirm the harness rollback.
- DUT (BOOM): csrr returns OFF (New Value) in the log.
- Note: Due to the override mechanism, after executing csrr x30, mstatus, Spike's mstatus.FS field was overwritten again by the DUT's mstatus.FS.
- Result: Both simulators trap on the FP instruction.
Experiment 3: Fix Verification (Remove Override)
Setup: Comment out the s->csrmap[csr_addr]->write(...) logic in cospike_impl.cc.
Observation:
Spike: core 0: 3 0x0000000080002004 (0x0000fdb7) lui s11,0xf x27 0x000000000000f000
DUT: Cosim: 1583 commit: 80002004 (0xfdb7) lui s11, 0xf x27 0xf000(DUT)
Spike: core 0: 3 0x0000000080002008 (0x300db1f3) csrrc gp,mstatus,s11
x3 0x8000000a000c6088 c768_mstatus 0x0000000a000c0088
DUT: Cosim: 1591 commit: 80002008 (0x300db1f3) csrrc gp, mstatus, s11 x3 0x8000000a000c6088(DUT)
Cosim: CSR read 300
Cosim: CSR status override check
Cosim: 1599 exception 2
core 0: 3 0x0000000080001000 (0x341020f3) csrr ra,mepc x1 0x000000008000200c <- Trap on FP instruction
Cosim: 1661 commit: 80001000 (0x341020f3) csrr ra, mepc x1 0x8000200c(DUT) <- Trap on FP instruction
- Spike: Executes csrrc. FS remains Off (Correct).
- Result: Both Spike and DUT trap on fcvt. csrr reads the correct (New) value. Cosimulation passes.
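Removing the override entirely is what we tested. A narrower alternative might be to skip the override only when the committed instruction itself writes the CSR. The sketch below shows one way to decode that from the instruction bits. `csr_insn_writes_csr` and its helpers are hypothetical names, not existing harness functions, and whether such a guard is appropriate depends on why the override exists, which we have not verified.

```cpp
#include <cstdint>

// Field extractors for the RISC-V I-type/Zicsr encoding.
constexpr std::uint32_t opcode_of(std::uint32_t insn) { return insn & 0x7f; }
constexpr std::uint32_t funct3_of(std::uint32_t insn) { return (insn >> 12) & 0x7; }
constexpr std::uint32_t rs1_of(std::uint32_t insn)    { return (insn >> 15) & 0x1f; }

// Hypothetical helper: true if `insn` is a Zicsr instruction that writes its CSR.
//   funct3 1/5 (CSRRW/CSRRWI): always write.
//   funct3 2/3/6/7 (CSRRS/CSRRC and immediate forms): write unless the
//   rs1 field (register number or uimm) is zero.
//   funct3 0/4: not a CSR instruction.
constexpr bool csr_insn_writes_csr(std::uint32_t insn) {
    return opcode_of(insn) == 0x73                 // SYSTEM opcode
        && (funct3_of(insn) & 0x3) != 0            // exclude ECALL/EBREAK/etc.
        && ((funct3_of(insn) & 0x3) == 1           // CSRRW / CSRRWI
            || rs1_of(insn) != 0);                 // CSRRS/CSRRC with a non-zero source
}

// Encodings taken from the logs in this report.
static_assert(csr_insn_writes_csr(0x300db1f3u), "csrrc gp, mstatus, s11 writes mstatus");
static_assert(!csr_insn_writes_csr(0x30002f73u), "csrr t5, mstatus is a pure read");
static_assert(!csr_insn_writes_csr(0x0000fdb7u), "lui is not a CSR instruction");
```

With a check like this, the harness could leave Spike's own result in place for csrrc gp, mstatus, s11 while still applying the override to pure reads such as csrr t5, mstatus.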
Attachments
ELF binary and full execution logs are attached.