-
Notifications
You must be signed in to change notification settings - Fork 91
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for Hyperlight KVM guest debugging using gdb #111
base: main
Are you sure you want to change the base?
Conversation
be21b85
to
ca61def
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Super excited to see this landing. Nice work! My feedback mostly consists of nits and questions.
4a44b2e
to
16492de
Compare
Signed-off-by: Doru Blânzeanu <[email protected]>
…en gdb feature is on Signed-off-by: Doru Blânzeanu <[email protected]>
Signed-off-by: Doru Blânzeanu <[email protected]>
Signed-off-by: Doru Blânzeanu <[email protected]>
- this type will be used by the gdb and the hypervisor handler to send requests and receive responses Signed-off-by: Doru Blânzeanu <[email protected]>
- the target implements the traits to provide callbacks for the gdb commands Signed-off-by: Doru Blânzeanu <[email protected]>
- adds a specific handler for the vcpu exit debug that waits for debug messages and processes them Signed-off-by: Doru Blânzeanu <[email protected]>
Signed-off-by: Doru Blânzeanu <[email protected]>
Signed-off-by: Doru Blânzeanu <[email protected]>
Signed-off-by: Doru Blânzeanu <[email protected]>
- also adds handling of gdb client disconnect by resuming execution Signed-off-by: Doru Blânzeanu <[email protected]>
Signed-off-by: Doru Blânzeanu <[email protected]>
Signed-off-by: Doru Blânzeanu <[email protected]>
Signed-off-by: Doru Blânzeanu <[email protected]>
Signed-off-by: Doru Blânzeanu <[email protected]>
Signed-off-by: Doru Blânzeanu <[email protected]>
Signed-off-by: Doru Blânzeanu <[email protected]>
2a36f06
to
d0db26e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is awesome!! Looked briefly though everything and I will give a more in-depth review tomorrow
- improve documentation - improve gdb errors by adding a specific error variant for when the rflags conversion from u64 to u32 fails - fix clippy issue on windows Signed-off-by: Doru Blânzeanu <[email protected]>
- the user experience is not great as the fields and methods related to gdb port are not accessible unless the gdb feature is turned on. - if one runs the build command in the cli and provides the --features gdb the IDE will not know that the feature is enabled and will show errors as it doesn't know about the fields/methods Signed-off-by: Doru Blânzeanu <[email protected]>
- when using visual studio code, sometimes it requests to read from unexpected addresses which can cause an error on the hypervisor side. This fix signals this to the gdb thread which marks it as a non-fatal error Signed-off-by: Doru Blânzeanu <[email protected]>
17a5d14
to
98ac60e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is awesome. I tried it out and it works great. Left some more comments, feel free to dismiss if you disagree with any of them. I realize some of them might be hard to implement as well. GREAT WORK!!
#[instrument(skip_all, parent = Span::current(), level= "Trace")] | ||
/// Sets the configuration for the guest debug | ||
pub fn set_guest_debug_port(&mut self, port: u16) { | ||
self.guest_debug_port = Some(max(port, Self::MIN_GUEST_DEBUG_PORT)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I called this function but forgot to enable the gdb feature. The code ran fine, but didn't wait for gdb to connect, and I got confused. In my opinion, maybe we should feature-gate this function, or return error somewhere (maybe not in this func) if a user is trying to use a sandbox with this setting set, without using the gdb feature. Maybe others have better ideas. Maybe some tests for this could be good to
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was the previous intention, however, I disabled it with the last commit.
The reasoning behind was that the rust-analyzer didn't know about the feature being enabled and it showed errors in IDE as it doesn't know the methods.
I am not sure which is the best approach, you might be right with being confusing when it doesn't do anything.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should be able to enable the feature for rust-analyzer in .vscode/settings.json
|
||
### End to end example | ||
|
||
Using the [Sandbox configuration](#sandbox-configuration) above to configure the [hello-world](https://github.com/hyperlight-dev/hyperlight/blob/main/src/hyperlight_host/examples/hello-world/main.rs) example |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be useful if we created a brand new example for this instead of using hello-world. I read through this doc but missed the part where it said "Using the sandbox configuration above", so I was confused why it wasn't working. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, I can add another example
```bash | ||
# Terminal 2 | ||
$ cat .gdbinit | ||
file file src/tests/rust_guests/bin/debug/simpleguest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
file file typo?
@@ -128,6 +130,8 @@ kvm = ["dep:kvm-bindings", "dep:kvm-ioctls"] | |||
mshv2 = ["dep:mshv-bindings2", "dep:mshv-ioctls2"] | |||
mshv3 = ["dep:mshv-bindings3", "dep:mshv-ioctls3"] | |||
inprocess = [] | |||
# This enables compilation of gdb stub for easy debug in the guest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can drop the first part of this comment, since we probably don't care too much about what compilation it enables, but rather that it allows for easy gdb debugging of a guest.
self.dbg_cfg.arch.debugreg = [0; 8]; | ||
for (k, addr) in addrs.iter().enumerate() { | ||
self.dbg_cfg.arch.debugreg[k] = *addr; | ||
self.dbg_cfg.arch.debugreg[7] |= 1 << (k * 2); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you mind adding a comment about what these bit operations does/means?
if self.debug.is_some() { | ||
log::debug!("Setting entrypoint bp {:X}", self.entrypoint); | ||
let mut entrypoint_debug = KvmDebug::new(); | ||
entrypoint_debug.hw_breakpoints.push(self.entrypoint); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we make an method on KvmDebug for adding a breakpoint that does error checking (>4 breapoints), instead of manually modifying the vec? It seems fragile to do the error checking during the set_breakpoints
, and to expose the inner vec like this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good point, I'll change it 👍
let addr = self.translate_gva(addr)?; | ||
|
||
if let Some(debug) = self.debug.as_mut() { | ||
if debug.hw_breakpoints.contains(&addr) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like the .contains
is unnecessary given that you use .position
later anyway. You might be able to simplify this whole block using hw_breakpoints.retain(|a| a != addrv)
or something along those lines
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for the suggestion
} | ||
} | ||
|
||
if conn.peek().map(|b| b.is_some()).unwrap_or(false) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I really understand what this is doing. Maybe a comment here could help
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll add a comment
/// Enumerates the possible responses that a hypervisor can provide to a debugger | ||
#[derive(Debug)] | ||
pub enum DebugResponse { | ||
AddHwBreakpoint(bool), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need all of these responses, or would it be sufficient to combine some of them into something like Success()? I looked briefly and I didn't see AddHwBreakpoint
and AddSwBreakpoint
being used for anything for example. ( can see DebugResponse::Step | DebugResponse::Continue | DebugResponse::DisableDebug are being used though)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The need stems from the return value the add_sw_breakpoint
and the others expect.
An example signature is:
fn add_sw_breakpoint(
&mut self,
addr: <Self::Arch as Arch>::Usize,
_kind: <Self::Arch as Arch>::BreakpointKind,
) -> TargetResult<bool, Self> {
and it expects true/false on the success path signaling whether the breakpoint is active, true being it was just added, false being it was already active and the new call didn't actually do anything.
I could put all of them together to make a generic response that returns Result<bool, Error>
but I am not sure it would make it look a lot better getting rid of only 3 variants. Let me know what you think
enable pretty-printer | ||
layout src | ||
|
||
$ gdb |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This markdown is great. One thing this gdb feature introduces is some complexity regarding threading. Would you be able to write down (or maybe create a diagram) for how the threads interact with each other, for example the handler thread and so on? It's a bit confusing to me which thread waits for what thread, etc :D. A simple list about what happens in what order when you create a sandbox with gdb enabled would be sufficient
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a good idea, I thought about adding a diagram but I was afraid it would get desynchronized with the code when there would be a change as the documentation usually doesn't get updated unless it is close to the code being changed.
I'll work on a diagram to help with the understanding of the inner workings of the feature.
The current implementation supports:
The overall architecture is designed to work like a
Request - Response
protocol from the gdb thread to the hypervisor handler over aDebugCommChannel
.gdbstub
to handle the communication with the gdb client