Description
Currently, it is not possible to interrupt or cancel execution while the guest is calling a host function. If the host function hangs, the call never returns and cannot be cancelled. This currently surfaces as:
    HyperlightError::GuestExecutionHungOnHostFunctionCall() => {}
One possible solution
When running with the seccomp feature on, host functions are wrapped in their own thread like so:
hyperlight/src/hyperlight_host/src/sandbox/host_funcs.rs
Lines 208 to 228 in b9c67fb
    let join_handle = std::thread::Builder::new()
        .name(format!("Host Function Worker Thread for: {:?}", name_cloned))
        .spawn(move || {
            // We have a `catch_unwind` here because, if a disallowed syscall is issued,
            // we handle it by panicking. This is to avoid returning execution to the
            // offending host function—for two reasons: (1) if a host function is issuing
            // disallowed syscalls, it could be unsafe to return to, and (2) returning
            // execution after trapping the disallowed syscall can lead to UB (e.g., try
            // running a host function that attempts to sleep without `SYS_clock_nanosleep`,
            // you'll block the syscall but panic in the aftermath).
            match std::panic::catch_unwind(std::panic::AssertUnwindSafe(|| call_func(&host_funcs_cloned, &name_cloned, args_cloned))) {
                Ok(val) => val,
                Err(err) => {
                    if let Some(crate::HyperlightError::DisallowedSyscall) = err.downcast_ref::<crate::HyperlightError>() {
                        return Err(crate::HyperlightError::DisallowedSyscall)
                    }
                    crate::log_then_return!("Host function {} panicked", name_cloned);
                }
            }
        })?;
You could leverage these threads to cancel execution in the same way we cancel execution in the guest:
hyperlight/src/hyperlight_host/src/hypervisor/hypervisor_handler.rs
Lines 729 to 762 in b9c67fb
    let thread_id = self.execution_variables.get_thread_id()?;
    if thread_id == u64::MAX {
        log_then_return!("Failed to get thread id to signal thread");
    }
    let mut count: i32 = 0;
    // We need to send the signal multiple times in case the thread was between checking if it
    // should be cancelled and entering the run loop

    // We cannot do this forever (if the thread is calling a host function that never
    // returns we will sit here forever), so use the timeout_wait_to_cancel to limit the number
    // of iterations
    let number_of_iterations =
        self.configuration.max_wait_for_cancellation.as_micros() / 500;
    while !self.execution_variables.run_cancelled.load() {
        count += 1;
        if count > number_of_iterations.try_into().unwrap() {
            break;
        }
        info!(
            "Sending signal to thread {} iteration: {}",
            thread_id, count
        );
        let ret = unsafe { pthread_kill(thread_id, SIGRTMIN()) };
        // We may get ESRCH if we try to signal a thread that has already exited
        if ret < 0 && ret != ESRCH {
            log_then_return!("error {} calling pthread_kill", ret);
        }
        std::thread::sleep(Duration::from_micros(500));
    }
However, this would mean always wrapping host function calls in an extra thread, not only when the seccomp feature is on, and the added per-call thread overhead might be too costly in terms of performance.
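To make the idea concrete, here is a minimal, std-only sketch of running a host function on its own worker thread so the caller can stop waiting after a deadline. It is an illustration only: `call_with_timeout` and `CallError` are hypothetical names, not part of Hyperlight, and the sketch merely abandons a hung worker rather than interrupting it. Actually stopping the thread would still need something like the signal-based approach in the excerpt above.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Hypothetical error type for this sketch (not Hyperlight's error enum).
#[derive(Debug, PartialEq)]
enum CallError {
    /// The host function did not finish before the deadline.
    TimedOut,
    /// The worker could not be spawned, or it panicked before returning.
    WorkerFailed,
}

/// Runs `host_fn` on a dedicated thread and waits at most `timeout` for it.
/// On timeout the worker is left running (detached); it is NOT interrupted.
fn call_with_timeout<T, F>(host_fn: F, timeout: Duration) -> Result<T, CallError>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (tx, rx) = mpsc::channel();

    // Dropping the returned JoinHandle detaches the worker thread.
    thread::Builder::new()
        .name("host-function-worker".to_string())
        .spawn(move || {
            // If the caller already gave up, the receiver is gone and
            // `send` fails; there is nobody left to report to, so ignore it.
            let _ = tx.send(host_fn());
        })
        .map_err(|_| CallError::WorkerFailed)?;

    match rx.recv_timeout(timeout) {
        Ok(val) => Ok(val),
        Err(mpsc::RecvTimeoutError::Timeout) => Err(CallError::TimedOut),
        // The sender was dropped without sending: the worker panicked.
        Err(mpsc::RecvTimeoutError::Disconnected) => Err(CallError::WorkerFailed),
    }
}
```

For example, `call_with_timeout(|| 1 + 1, Duration::from_secs(5))` returns the function's result, while a closure that blocks past the deadline yields `Err(CallError::TimedOut)`. The sketch also shows where the performance concern comes from: every call pays for a thread spawn and a channel round trip, which a pooled or long-lived worker might reduce.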