Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for Hyperlight KVM guest debugging using gdb #111

Open
wants to merge 20 commits into
base: main
Choose a base branch
from

Conversation

dblnz
Copy link
Contributor

@dblnz dblnz commented Dec 13, 2024

The current implementation supports:

  • add/remove HW breakpoints (max 4)
  • add/remove SW breakpoints
  • read/write registers
  • read/write addresses
  • step/continue
  • get code offset from target

The overall architecture is designed to work like a Request - Response protocol from the gdb thread to the hypervisor handler over a DebugCommChannel.

  • All the functionality is implemented on the hypervisor side so it has access to the shared memory and VcpuFd
  • The gdb thread uses gdbstub to handle the communication with the gdb client
  • when the gdb client requests the supported features mentioned above, a request is sent forward to the hypervisor handler for the running sandbox to run

@dblnz dblnz added the kind/enhancement New feature or request label Dec 13, 2024
@dblnz dblnz self-assigned this Dec 13, 2024
@dblnz dblnz marked this pull request as draft December 13, 2024 17:54
@dblnz dblnz force-pushed the gdb-support-latest branch 2 times, most recently from be21b85 to ca61def Compare December 19, 2024 11:41
Copy link
Contributor

@devigned devigned left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Super excited to see this landing. Nice work! My feedback mostly consists of nits and questions.

docs/how-to-debug-a-hyperlight-guest.md Outdated Show resolved Hide resolved
docs/how-to-debug-a-hyperlight-guest.md Outdated Show resolved Hide resolved
docs/how-to-debug-a-hyperlight-guest.md Outdated Show resolved Hide resolved
src/hyperlight_host/src/hypervisor/gdb/mod.rs Outdated Show resolved Hide resolved
src/hyperlight_host/src/hypervisor/gdb/mod.rs Outdated Show resolved Hide resolved
src/hyperlight_host/src/hypervisor/gdb/target.rs Outdated Show resolved Hide resolved
src/hyperlight_host/src/hypervisor/hypervisor_handler.rs Outdated Show resolved Hide resolved
src/hyperlight_host/src/hypervisor/kvm.rs Outdated Show resolved Hide resolved
@dblnz dblnz force-pushed the gdb-support-latest branch from 4a44b2e to 16492de Compare January 22, 2025 15:40
@dblnz dblnz marked this pull request as ready for review January 22, 2025 16:07
@dblnz dblnz changed the title Initial support for gdb debugging in Hyperlight KVM guest Add support for Hyperlight KVM guest debugging using gdb Jan 22, 2025
dblnz added 17 commits February 4, 2025 22:54
Signed-off-by: Doru Blânzeanu <[email protected]>
- this type will be used by the gdb and the hypervisor handler
to send requests and receive responses

Signed-off-by: Doru Blânzeanu <[email protected]>
- the target implements the  traits to provide callbacks
for the gdb commands

Signed-off-by: Doru Blânzeanu <[email protected]>
- adds a specific handler for the vcpu exit debug that waits for
debug messages and processes them

Signed-off-by: Doru Blânzeanu <[email protected]>
Signed-off-by: Doru Blânzeanu <[email protected]>
- also adds handling of gdb client disconnect by resuming execution

Signed-off-by: Doru Blânzeanu <[email protected]>
Signed-off-by: Doru Blânzeanu <[email protected]>
Signed-off-by: Doru Blânzeanu <[email protected]>
Signed-off-by: Doru Blânzeanu <[email protected]>
@dblnz dblnz force-pushed the gdb-support-latest branch from 2a36f06 to d0db26e Compare February 4, 2025 21:07
ludfjig
ludfjig previously approved these changes Feb 5, 2025
Copy link
Contributor

@ludfjig ludfjig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is awesome!! Looked briefly though everything and I will give a more in-depth review tomorrow

src/hyperlight_host/Cargo.toml Outdated Show resolved Hide resolved
src/hyperlight_host/src/hypervisor/gdb/mod.rs Show resolved Hide resolved
src/hyperlight_host/src/sandbox/config.rs Outdated Show resolved Hide resolved
src/hyperlight_host/src/hypervisor/gdb/x86_64_target.rs Outdated Show resolved Hide resolved
- improve documentation
- improve gdb errors by adding a specific error variant for when the
  rflags conversion from u64 to u32 fails
- fix clippy issue on windows

Signed-off-by: Doru Blânzeanu <[email protected]>
- the user experience is not great as the fields and methods
related to gdb port are not accessible unless the gdb feature
is turned on.
- if one runs the build command in the cli and provides the --features gdb
the IDE will not know that the feature is enabled and will show
errors as it doesn't know about the fields/methods

Signed-off-by: Doru Blânzeanu <[email protected]>
- when using visual studio code, sometimes it requests to read from
  unexpected addresses which can cause an error on the hypervisor side.
  This fix signals this to the gdb thread which marks it as a non-fatal
  error

Signed-off-by: Doru Blânzeanu <[email protected]>
@dblnz dblnz force-pushed the gdb-support-latest branch from 17a5d14 to 98ac60e Compare February 5, 2025 16:06
Copy link
Contributor

@ludfjig ludfjig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is awesome. I tried it out and it works great. Left some more comments, feel free to dismiss if you disagree with any of them. I realize some of them might be hard to implement as well. GREAT WORK!!

#[instrument(skip_all, parent = Span::current(), level= "Trace")]
/// Sets the configuration for the guest debug
pub fn set_guest_debug_port(&mut self, port: u16) {
self.guest_debug_port = Some(max(port, Self::MIN_GUEST_DEBUG_PORT));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I called this function but forgot to enable the gdb feature. The code ran fine, but didn't wait for gdb to connect, and I got confused. In my opinion, maybe we should feature-gate this function, or return error somewhere (maybe not in this func) if a user is trying to use a sandbox with this setting set, without using the gdb feature. Maybe others have better ideas. Maybe some tests for this could be good to

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was the previous intention, however, I disabled it with the last commit.
The reasoning behind was that the rust-analyzer didn't know about the feature being enabled and it showed errors in IDE as it doesn't know the methods.
I am not sure which is the best approach, you might be right with being confusing when it doesn't do anything.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should be able to enable the feature for rust-analyzer in .vscode/settings.json


### End to end example

Using the [Sandbox configuration](#sandbox-configuration) above to configure the [hello-world](https://github.com/hyperlight-dev/hyperlight/blob/main/src/hyperlight_host/examples/hello-world/main.rs) example
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be useful if we created a brand new example for this instead of using hello-world. I read through this doc but missed the part where it said "Using the sandbox configuration above", so I was confused why it wasn't working. What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I can add another example

```bash
# Terminal 2
$ cat .gdbinit
file file src/tests/rust_guests/bin/debug/simpleguest
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

file file typo?

@@ -128,6 +130,8 @@ kvm = ["dep:kvm-bindings", "dep:kvm-ioctls"]
mshv2 = ["dep:mshv-bindings2", "dep:mshv-ioctls2"]
mshv3 = ["dep:mshv-bindings3", "dep:mshv-ioctls3"]
inprocess = []
# This enables compilation of gdb stub for easy debug in the guest
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can drop the first part of this comment, since we probably don't care too much about what compilation it enables, but rather that it allows for easy gdb debugging of a guest.

self.dbg_cfg.arch.debugreg = [0; 8];
for (k, addr) in addrs.iter().enumerate() {
self.dbg_cfg.arch.debugreg[k] = *addr;
self.dbg_cfg.arch.debugreg[7] |= 1 << (k * 2);
Copy link
Contributor

@ludfjig ludfjig Feb 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you mind adding a comment about what these bit operations does/means?

if self.debug.is_some() {
log::debug!("Setting entrypoint bp {:X}", self.entrypoint);
let mut entrypoint_debug = KvmDebug::new();
entrypoint_debug.hw_breakpoints.push(self.entrypoint);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we make an method on KvmDebug for adding a breakpoint that does error checking (>4 breapoints), instead of manually modifying the vec? It seems fragile to do the error checking during the set_breakpoints, and to expose the inner vec like this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point, I'll change it 👍

let addr = self.translate_gva(addr)?;

if let Some(debug) = self.debug.as_mut() {
if debug.hw_breakpoints.contains(&addr) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like the .contains is unnecessary given that you use .position later anyway. You might be able to simplify this whole block using hw_breakpoints.retain(|a| a != addrv) or something along those lines

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the suggestion

}
}

if conn.peek().map(|b| b.is_some()).unwrap_or(false) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I really understand what this is doing. Maybe a comment here could help

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add a comment

/// Enumerates the possible responses that a hypervisor can provide to a debugger
#[derive(Debug)]
pub enum DebugResponse {
AddHwBreakpoint(bool),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need all of these responses, or would it be sufficient to combine some of them into something like Success()? I looked briefly and I didn't see AddHwBreakpoint and AddSwBreakpoint being used for anything for example. ( can see DebugResponse::Step | DebugResponse::Continue | DebugResponse::DisableDebug are being used though)

Copy link
Contributor Author

@dblnz dblnz Feb 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The need stems from the return value the add_sw_breakpoint and the others expect.
An example signature is:

    fn add_sw_breakpoint(
        &mut self,
        addr: <Self::Arch as Arch>::Usize,
        _kind: <Self::Arch as Arch>::BreakpointKind,
    ) -> TargetResult<bool, Self> {

and it expects true/false on the success path signaling whether the breakpoint is active, true being it was just added, false being it was already active and the new call didn't actually do anything.

I could put all of them together to make a generic response that returns Result<bool, Error> but I am not sure it would make it look a lot better getting rid of only 3 variants. Let me know what you think

enable pretty-printer
layout src

$ gdb
Copy link
Contributor

@ludfjig ludfjig Feb 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This markdown is great. One thing this gdb feature introduces is some complexity regarding threading. Would you be able to write down (or maybe create a diagram) for how the threads interact with each other, for example the handler thread and so on? It's a bit confusing to me which thread waits for what thread, etc :D. A simple list about what happens in what order when you create a sandbox with gdb enabled would be sufficient

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a good idea, I thought about adding a diagram but I was afraid it would get desynchronized with the code when there would be a change as the documentation usually doesn't get updated unless it is close to the code being changed.

I'll work on a diagram to help with the understanding of the inner workings of the feature.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants