Add support for NMIs to ot_earlgrey #111
base: ot-earlgrey-9.1.0
For earlgrey-1.0.0, OpenTitan's aon_timer has been updated so that its wakeup timer is now a 64-bit counter rather than a 32-bit counter (whereas the watchdog timer remains 32-bit). This requires register mapping changes, with each wakeup count / threshold register now having a HI/LO counterpart. Signed-off-by: Alex Jones <alex.jones@lowrisc.org>
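To illustrate the HI/LO split described in this commit message, here is a minimal sketch of how a 64-bit wakeup count and threshold could be stored in pairs of 32-bit registers. The `WkupRegs` struct, its field names, and the helper functions are hypothetical and do not reflect the actual `ot_aon_timer` register layout; the "count >= threshold" expiry condition is likewise an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register file, for illustration only. */
typedef struct {
    uint32_t count_lo;
    uint32_t count_hi;
    uint32_t thold_lo;
    uint32_t thold_hi;
} WkupRegs;

/* Combine a HI/LO register pair into one 64-bit value. */
static uint64_t wkup_join(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) | lo;
}

/* Split a 64-bit count across the HI/LO count registers. */
static void wkup_set_count(WkupRegs *r, uint64_t count)
{
    assert(r != NULL);
    r->count_lo = (uint32_t)count;
    r->count_hi = (uint32_t)(count >> 32);
}

/* Split a 64-bit threshold across the HI/LO threshold registers. */
static void wkup_set_threshold(WkupRegs *r, uint64_t thold)
{
    assert(r != NULL);
    r->thold_lo = (uint32_t)thold;
    r->thold_hi = (uint32_t)(thold >> 32);
}

/* Assumed expiry rule: the timer fires once count >= threshold. */
static int wkup_expired(const WkupRegs *r)
{
    assert(r != NULL);
    return wkup_join(r->count_hi, r->count_lo) >=
           wkup_join(r->thold_hi, r->thold_lo);
}
```

The comparison is done on the joined 64-bit values, so a threshold above `0xFFFFFFFF` is only reached once the HI count register becomes non-zero.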
Connect the alerts for each of the OpenTitan Earlgrey devices to the alert handler. This is based on the mappings found in the autogenerated `hw/top_earlgrey/sw/autogen/top_earlgrey.h` file for the Earlgrey top found in the OpenTitan repository. Signed-off-by: Alex Jones <alex.jones@lowrisc.org>
Connects GPIO signals for the four different escalation severities of OpenTitan Earlgrey's Alert handler. For now, escalation signal 3 (phase 3) connects to the pwrmgr, where it causes a shutdown as is the default in Darjeeling. In reality, the alert handler should cause a reset in phase 3 and populate the rstmgr's `reset_info` accordingly, but this requires additional work in other blocks, and as such we simply leave this connected to the pwrmgr shutdown for now to match Darjeeling's functionality. Signed-off-by: Alex Jones <alex.jones@lowrisc.org>
Ibex IRQs are defined using the `ibex_irq_set` function such that they are optimised to only propagate the IRQ signal if there has been a change in the level (i.e. they are edge-triggered). Despite this, it is possible for alerts to be sent from hardware without previously resetting the alert, for example upon a hardware fault. This behaviour can be observed in several OpenTitan tests which force alerts several times without resetting the state of the block in-between. One possible solution would just be to send a signal with level=1 followed by a signal with level=0, but the more efficient and maintainable solution is just to use the original `qemu_irq` for this purpose, as the IbexIRQ wrapping functionality does not make sense in this case. Signed-off-by: Alex Jones <alex.jones@lowrisc.org>
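The difference between the two approaches can be shown with a toy model. This is a sketch only: `Sink`, `FilteredLine`, `filtered_set` and `raw_set` are hypothetical names invented for the example, and none of this is the QEMU or Ibex IRQ API. The filtered line mimics a wrapper that forwards a level only when it changes; the raw line forwards every call.

```c
#include <assert.h>
#include <stddef.h>

/* Toy receiver standing in for the alert handler's input. */
typedef struct {
    int level;        /* last level delivered to the receiver */
    unsigned events;  /* how many times the receiver was notified */
} Sink;

static void sink_notify(Sink *s, int level)
{
    assert(s != NULL);
    s->level = level;
    s->events++;
}

/* Hypothetical change-filtering line: forwards only level changes. */
typedef struct {
    Sink *sink;
    int last_level;
} FilteredLine;

static void filtered_set(FilteredLine *l, int level)
{
    assert(l != NULL && l->sink != NULL);
    if (l->last_level == level) {
        return; /* no change in level: nothing is propagated */
    }
    l->last_level = level;
    sink_notify(l->sink, level);
}

/* Unfiltered line: every call reaches the receiver. */
static void raw_set(Sink *s, int level)
{
    sink_notify(s, level);
}
```

Raising the filtered line twice in a row without lowering it notifies the receiver once, whereas raising the raw line twice notifies it twice, which is the behaviour needed for alerts that are forced repeatedly without an intermediate reset.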
Adds support for Non-Maskable Interrupts (NMIs) to the RISC-V QEMU CPU implementation. NMIs are defined by the RISC-V specification to be independent of regular interrupts, used only for hardware error conditions and causing an immediate jump to an implementation-defined NMI vector running in M-mode, regardless of the values of e.g. MIE/MIP. The spec states that "the values written to mcause on an NMI are implementation-defined"; in Ibex we always use an mcause of 31 (and a corresponding mtvec offset of 0x7C to determine the jump address for the NMI handler). To aid a more generic implementation, this commit introduces a generic `nmi_cause` CPU environment value that can be used alongside the NMI functionality to allow implementations to change the mcause as they see fit. Signed-off-by: Alex Jones <alex.jones@lowrisc.org>
Adds support for NMIs to Earlgrey's Ibex wrapper, allowing incoming NMI IRQ signals to be received by the device, which will then be propagated through an external IRQ signal to the Ibex itself (the hart) with the newly introduced IRQ_NMI value. This will allow support for hardware raising NMIs via GPIO signals between the QEMU devices, such that NMI support can be added to Earlgrey. Signed-off-by: Alex Jones <alex.jones@lowrisc.org>
Signed-off-by: Alex Jones <alex.jones@lowrisc.org>
This initialises Ibex within Earlgrey with the newly defined `nmi_cause` value as 31, as NMI mcause values are implementation-defined, and Ibex makes the decision to implement all NMIs with a cause of 31, leading to an offset of 0x7c from the mtvec base being loaded into mepc upon an NMI occurring. Signed-off-by: Alex Jones <alex.jones@lowrisc.org>
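The relationship between the cause of 31 and the 0x7C offset is simply the vectored-mode rule of four bytes per cause. A minimal sketch of that arithmetic follows; `IBEX_NMI_CAUSE` and `vectored_target` are hypothetical names for this example and are not taken from the QEMU sources.

```c
#include <assert.h>
#include <stdint.h>

/* Cause value used for all NMIs, as stated in the commit message. */
#define IBEX_NMI_CAUSE 31u

/*
 * Vectored-mode handler address: base + 4 * cause.
 * The low two bits of mtvec encode MODE and are masked off to get the base.
 */
static uint32_t vectored_target(uint32_t mtvec, uint32_t cause)
{
    assert(cause < 32u);
    uint32_t base = mtvec & ~UINT32_C(0x3);
    return base + 4u * cause;
}
```

With a cause of 31 the offset is 4 * 31 = 124 = 0x7C, so for example an mtvec value of 0x00008001 (base 0x8000, vectored mode) gives a handler address of 0x0000807C.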
The aon timer should send an NMI signal to Ibex when its watchdog timer barks. Now that the NMI functionality is in place, we can appropriately hook up this signal, which should appear before the equivalent IRQ. Signed-off-by: Alex Jones <alex.jones@lowrisc.org>
In escalation phase 0, the alert handler sends an NMI to the CPU, which will interrupt execution so long as the NMI is enabled in Ibex itself. As such, we connect this functionality in the alert handler. See: https://opentitan.org/book/sw/device/silicon_creator/rom/doc/shutdown.html#alert-escalation-phase-actions Signed-off-by: Alex Jones <alex.jones@lowrisc.org>
See the comment for more details; to allow for proper NMI functionality based on regular IRQ and NMI handling routines, we must ensure that in `sifive_plic_irq_request`, we only clear an IRQ's pending status when that IRQ has been claimed. Failure to do so can mean we clear an IRQ at a peripheral as part of an NMI service routine and as such never call into the IRQ handler when the NMI handler returns, as would be expected. This commit updates the SiFive PLIC so that it works as expected. Signed-off-by: Alex Jones <alex.jones@lowrisc.org>
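The intended behaviour can be sketched with a toy pending-bit model. This is not the SiFive PLIC code: `ToyPlic`, `toy_plic_irq_request` and `toy_plic_claim` are hypothetical names, and priorities, enables, thresholds and completion are all left out. The only point shown is that lowering the incoming line does not clear a pending bit; only a claim does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model: one pending bit and one line-level bit per source. */
typedef struct {
    uint32_t pending; /* bit n set: source n is awaiting a claim */
    uint32_t level;   /* bit n set: source n's input line is high */
} ToyPlic;

/* Called when a peripheral changes its interrupt line. */
static void toy_plic_irq_request(ToyPlic *p, unsigned src, int level)
{
    assert(p != NULL && src > 0 && src < 32);
    uint32_t mask = UINT32_C(1) << src;
    if (level) {
        p->level |= mask;
        p->pending |= mask;   /* latch the interrupt as pending */
    } else {
        p->level &= ~mask;    /* pending is deliberately left untouched */
    }
}

/* Claim the lowest-numbered pending source; 0 means "nothing pending". */
static unsigned toy_plic_claim(ToyPlic *p)
{
    assert(p != NULL);
    for (unsigned src = 1; src < 32; src++) {
        uint32_t mask = UINT32_C(1) << src;
        if (p->pending & mask) {
            p->pending &= ~mask; /* pending is cleared only here */
            return src;
        }
    }
    return 0;
}
```

In this model, if an NMI handler acknowledges the interrupt at the peripheral (lowering the line) before the regular handler runs, the later claim still returns the source, so the IRQ handler is still entered once the NMI handler returns.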
I think we'll need some time to review this MR. We also need to check if there is no prior NMI support upstream (I remember working on resumable-NMI for QEMU RISC-V with another project). A couple of notes:
When I started working on this I checked upstream and couldn't find any prior support. Checking again now it does appear that (at least partial) support for this may have been added just 4 days ago. So it might be more appropriate to move onto the upstream implementation, though I don't know what extra work will be required there with regards to Ibex's implementation specifics, nor how much work it will be to pull in recent upstream changes.
I guess in that case the alternative to make this fix is to have all PLIC signals go through the PLIC_EXT first, which then propagates them. But this would require duplicating a lot of the PLIC functionality outside of the PLIC, to the point that we would be essentially implementing a new PLIC anyhow. I see that we have already added e.g. tracing to the SiFive PLIC. Importantly, this is a case where OpenTitan's functionality seems to differ from the generic RISC-V PLIC. I'm not sure what would be best to do here.
When I was looking at the This is why in this case I judged it appropriate for these specific alert signals to the alert handler to change them to use
This is definitely worth considering; I've only considered |
When I started working on this I checked upstream and couldn't find any prior support. Checking again now it does appear that (at-least partial) support for this may have been added just 4 [days ago] :-)
Do you mean
The I have not read about the NMI support in OT. As RISC-V NMI are implementation-specific, I think we'll have a hard time to fit it into QEMU without breaking the other machines.
How does that work at RTL level? It would be nice to have a look at some waves. Maybe writing to ALERT_TEST creates an edge (i.e. the next clock cycle resets the level of the alert line?). I've been asking myself this question for about 2 years and never took time to have a look at it. (Rather, I tried to have a look at it, but gave up on finding out how to produce waves and observe them with gtkwave, then moved to another topic 😞)
Yes, but to make this work we would have to replicate a lot of the existing PLIC functionality in the plic_ext to have the needed values be accessible, essentially composing the PLIC and making the plic_ext wrap it to add OT-specific functionality.
As far as I can tell, the OT PLIC is both edge-triggered and level-triggered, dependent upon the hardware interrupt source: "It receives interrupt events as either edge or level of the incoming interrupt signals". That is a separate issue though, I think - the primary concern here is that interrupts can be acknowledged at the peripheral, but still be pending and awaiting claim/completion in the PLIC. Based on my understanding, current behaviour means that for example:
In the current PLIC implementation, part 6 is not correct to the OT implementation - the NMI handler acknowledges the IRQ at the peripheral, which sends a signal to the PLIC, causing it to also clear the IRQ, meaning that steps 6-7 do not properly occur. It appears that OpenTitan has custom behaviour where an IRQ that is enabled, and pending but not claimed cannot be cleared by a change in the incoming IRQ signal from the peripheral. To me the correct solution is implementing an OpenTitan PLIC that extends this generic RISC-V
I'm not sure either; I've been likewise unsuccessful in my attempts at getting waves with OT. However, digging a bit into the RTL, I think I've found something elucidating. If I'm interpreting this comment correctly, it seems that regular alerts are latched in a local register until they are reset. On the contrary, it seems like test alerts specifically are not latched until reset, which explains the behaviour I've been seeing and am trying to implement. Based on this comment, perhaps what I had deemed to be the 'less appropriate' solution is in fact more appropriate here: specifically for writes to |
Maybe we could implement what the RISC-V specification documents as the "interrupt gateway", i.e. something between the device that generates the IRQ and the PLIC controller itself. In any case, we cannot change the behavior of
If the interrupt was handled as edge-triggered, the PLIC would still see the IRQ as raised even once the peripheral had lowered its signalling line, would it not? BTW, if I understand it right, the NMI is not a real RISC-V NMI (which are not resumable for RISC-V); it is more a custom implementation of an RNMI (resumable NMI), isn't it? So there are IRQs, ~RNMI and Alerts. Wow...
I think we need to ask for help on this one, so that QEMU sticks as much as possible to the actual behavior.
Yes, that would make sense.
Yes, I think this should be ok. I'm wondering if I have not already implemented this somewhere in Darjeeling BTW (I tend to forget most of the work I do....). Nit pick, not an unsigned 0, as IRQs are signed integers for some reason in QEMU.
This sounds like what I was suggesting we make the I wonder if there's an easy way we could either (a) make a separate interrupt gateway device that has access to these internals, or (b) make a PLIC "wrapper" that composes the The main issue I'm trying to avoid here is replicating the
Yes, I think that you are correct. Making the PLIC edge-triggered would seem like the most satisfactory solution then, but I'm not sure this is 100% correct, as in OpenTitan we have both edge-triggered ("event type") and level-triggered ("status" type) interrupts. From my understanding, status type interrupts like for example the UART
Yes, as far as I can tell these are not UNMI as defined in the RISC-V spec (though they are close to the implementation), nor are these RNMI from the So reading that I am still not sure my implementation is 100% correct: I think I need to introduce additional Ibex-specific CSRs for backing up the MPP/MIE/MCAUSE etc. values, to account for the case where an NMI happens while handling another interrupt? |
I would rather see something in front of the PLIC, but I've really overlooked this issue so I might be 100% wrong :)
Unfortunately this is a global setting. Between DJ and EG, IRQs of some devices (namely the SPI device) have moved from edge-triggered to level-triggered IRQs, but these changes were only about how the device manages its IRQ signals, not how they are handled by the PLIC. We need to better understand this topic before adding/changing more code I believe.
You are right. This means it's gonna be even more difficult to integrate into QEMU, as we do not want to break the other RISC-V implementations nor add vendor-specific code to generic files...
It unfortunately does not appear to be compatible with |
This PR adds support for Non-Maskable Interrupts (NMIs) to `ot_earlgrey`. You can see more detail in each individual commit message. Note that the first three commits are instead from #109 and #110, and are just included as the base `ot_aon_timer` and `ot_alert` functionality is needed to test that the NMIs are working properly. This PR is accordingly marked as a draft, but can be reviewed by simply ignoring the three commits before b3e8936. These will be removed as that functionality is added.

To get the NMI tests passing properly, a few commits introduce other changes outside of just NMI support. Most notably, I: switch to use QEMU IRQs instead of Ibex IRQs for alerts (to allow correct propagation of multiple alert signals without clearing the signal in-between), and only clear pending IRQs at the PLIC when they are claimed.
The NMI implementation changes to `target/riscv` are intended to be minimalistic and implementation-agnostic such that they could be utilised by another RISC-V machine to implement their own NMIs. Refer to section "3.5. Non-Maskable Interrupts" of the RISC-V privileged spec and Ibex's Exceptions and Interrupts documentation, alongside the commit messages, for an explanation of the required changes.

I have verified these changes against ~50 known existing passing tests. These changes have also been tested to make previously failing tests pass: `aon_timer_smoketest`, and `rv_core_ibex_nmi_irq_test` (running with an icount shift of `icount=7`). All tests were built from master on OpenTitan.