Contributor

@ludfjig ludfjig commented Oct 2, 2025

This PR adds a poison state to the sandbox to prevent further operations when the sandbox is left in an inconsistent state that could compromise memory safety, data integrity, or security. The sandbox becomes poisoned when a guest function aborts or panics, or when the host cancels execution, since either can leave behind leaked heap allocations, corrupted data structures, or unreleased resources. For example, interrupting execution while the guest is allocating can leave the global allocation lock held, so that allocations in subsequent runs spin on the lock forever.

Poisoned sandboxes will reject all further operations (guest calls, snapshots, memory mapping) until the inconsistent state is resolved through either restoring to a snapshot or manually (unsafely) clearing the poison state.
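
The lifecycle described above can be sketched as a small state machine. This is a toy model under stated assumptions, not the Hyperlight implementation: `ToySandbox`, `ToySnapshot`, `ToyError`, and the `guest_aborts` flag are hypothetical names invented for illustration, and guest state is reduced to a single counter.

```rust
/// Hypothetical error type for this sketch (not Hyperlight's `HyperlightError`).
#[derive(Debug, PartialEq)]
pub enum ToyError {
    /// The guest aborted mid-call; this is what poisons the sandbox.
    GuestAborted,
    /// Returned for every operation attempted while the sandbox is poisoned.
    Poisoned,
}

/// Toy snapshot: just a copy of the counter.
#[derive(Debug, Clone, Copy)]
pub struct ToySnapshot(u64);

/// Toy sandbox: guest state is modelled as one counter plus a poison flag.
#[derive(Debug, Default)]
pub struct ToySandbox {
    state: u64,
    poisoned: bool,
}

impl ToySandbox {
    pub fn poisoned(&self) -> bool {
        self.poisoned
    }

    /// Simulated guest call. `guest_aborts` stands in for a real abort or panic.
    pub fn call(&mut self, guest_aborts: bool) -> Result<u64, ToyError> {
        if self.poisoned {
            return Err(ToyError::Poisoned);
        }
        if guest_aborts {
            // The guest may have left state half-updated, so refuse further use.
            self.poisoned = true;
            return Err(ToyError::GuestAborted);
        }
        self.state += 1;
        Ok(self.state)
    }

    /// Snapshots are rejected while poisoned, so bad state is never captured.
    pub fn snapshot(&self) -> Result<ToySnapshot, ToyError> {
        if self.poisoned {
            return Err(ToyError::Poisoned);
        }
        Ok(ToySnapshot(self.state))
    }

    /// Restoring a known-good snapshot replaces the state and clears the poison.
    pub fn restore(&mut self, snapshot: ToySnapshot) {
        self.state = snapshot.0;
        self.poisoned = false;
    }
}
```

In this model a snapshot taken before the failing call is the only safe way back: after `restore`, calls succeed again and start from the snapshotted counter value.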

Closes #848

@ludfjig ludfjig added the kind/enhancement For PRs adding features, improving functionality, docs, tests, etc. label Oct 2, 2025
@ludfjig ludfjig marked this pull request as ready for review October 2, 2025 19:15
@ludfjig ludfjig requested a review from Copilot October 2, 2025 20:07
Contributor

Copilot AI left a comment

Pull Request Overview

This PR introduces sandbox poisoning functionality to prevent further operations when a sandbox is in an inconsistent state that could compromise memory safety. The sandbox becomes poisoned when guest functions abort/panic or when host-initiated execution cancellation occurs.

Key changes:

  • Added poisoned state tracking with safety checks across all sandbox operations
  • Implemented automatic poison detection for specific error types (GuestAborted, ExecutionCanceledByHost)
  • Added recovery mechanisms through snapshot restoration or manual poison clearing
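
The automatic detection in the second bullet can be pictured as a predicate over error variants. The sketch below is an assumption about the shape of such a check, not the PR's code: the variant names are taken from this conversation, but `SketchError`, the omitted payloads, and the method name `is_poison_error` are hypothetical.

```rust
/// Hypothetical, payload-free stand-in for a few `HyperlightError` variants.
#[derive(Debug)]
pub enum SketchError {
    GuestAborted,
    ExecutionCanceledByHost,
    NoMemorySnapshot,
    PoisonedSandbox,
}

impl SketchError {
    /// Returns true if this error means the sandbox must now be treated as poisoned.
    pub fn is_poison_error(&self) -> bool {
        // No wildcard arm: adding a variant later forces an explicit decision here.
        match self {
            SketchError::GuestAborted | SketchError::ExecutionCanceledByHost => true,
            SketchError::NoMemorySnapshot | SketchError::PoisonedSandbox => false,
        }
    }
}
```

Listing every variant explicitly trades some verbosity for a compile error whenever a new error type is added without classifying it.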

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| src/hyperlight_host/src/sandbox/initialized_multi_use.rs | Core implementation of poison state tracking, safety checks, and recovery mechanisms |
| src/hyperlight_host/src/error.rs | Added PoisonedSandbox error variant with detailed documentation |
| src/hyperlight_host/tests/integration_test.rs | Updated interrupt tests to clear poison state for continued execution |
| src/hyperlight_host/src/sandbox/snapshot.rs | Added Debug trait to Snapshot struct |
| src/hyperlight_host/src/mem/shared_mem_snapshot.rs | Added Debug trait to SharedMemorySnapshot struct |

dblnz previously approved these changes Oct 23, 2025
Contributor

@dblnz dblnz left a comment

Great stuff!
Changes LGTM. I left some clarifying comments, but overall good to go.

| HyperlightError::NoMemorySnapshot
| HyperlightError::ParameterValueConversionFailure(_, _)
| HyperlightError::PEFileProcessingFailure(_)
| HyperlightError::PoisonedSandbox
Contributor

It seems a bit strange to me that IsPoisoned doesn't return true when this is the error.

Contributor Author

@ludfjig ludfjig Oct 23, 2025

You're right, it's confusing, but it is correct. PoisonedSandbox is what we return when some other error has already caused the poison state.

Contributor Author

Which reminds me, we should probably organize our errors into categories better.

Contributor

Isn't it by definition poisoned then?

Contributor Author

No, because this error doesn't cause poison. PoisonedSandbox is what we return if the sandbox is already poisoned.

Contributor

If the sandbox returns this error, then poisoned() == true? Something feels off about not verifying that, whenever this error is returned, the internal state is also correct.
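
One way to make the invariant raised in this thread checkable is a small consistency predicate for tests. This is only a sketch under assumptions: `SketchError` and `poison_report_is_consistent` are hypothetical names, and the rule that a successful call implies an unpoisoned sandbox is an assumption of this model rather than something stated in the PR.

```rust
/// Hypothetical, payload-free stand-in for two `HyperlightError` variants.
#[derive(Debug, PartialEq)]
pub enum SketchError {
    /// Causes the poison state.
    GuestAborted,
    /// Only reports that the sandbox was already poisoned.
    PoisonedSandbox,
}

/// Checks a call result against the sandbox's poison flag as observed after the call.
pub fn poison_report_is_consistent<T>(
    result: &Result<T, SketchError>,
    poisoned_after: bool,
) -> bool {
    match result {
        // Whether the error caused the poison or merely reported it,
        // the flag must be set once the call returns.
        Err(SketchError::GuestAborted) | Err(SketchError::PoisonedSandbox) => poisoned_after,
        // Assumption of this sketch: a successful call never leaves poison behind.
        Ok(_) => !poisoned_after,
    }
}
```

A test could call this after every sandbox operation, passing the returned result and the current value of the poison flag, so that a PoisonedSandbox error with an unset flag fails loudly.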

///
/// **Data Structure Corruption**:
/// - Hash tables with inconsistent bucket counts vs. actual entries
/// - Linked lists with broken next/prev pointers
Contributor

Are these things we have that get corrupted? These seem like general information on corruption, not specific to what we are protecting against.

Contributor Author

Yes, these are just general informational warnings.

Contributor

Would all of these fall under memory corruption?

Contributor Author

Probably yes

Contributor

IMO, this list feels overly generic and distracts from the rest of the comment, but I can see what others think.

Signed-off-by: Ludvig Liljenberg <[email protected]>
Contributor

@jsturtevant jsturtevant left a comment

Overall LGTM, a few nits, but happy to see what others say.

Development

Successfully merging this pull request may close these issues.

Consider adding some kind of "poisoned sandbox" state to prevent sandbox misuse

3 participants