[HIP] Hyperfile #85

160 changes: 160 additions & 0 deletions proposals/0001-hip-hyperfile/README.md
@@ -0,0 +1,160 @@
# HIP 0001 - Hyperfile

<!-- toc -->
- [Summary](#summary)
- [Motivation](#motivation)
- [Goals](#goals)
- [Non-Goals](#non-goals)
- [Proposal](#proposal)
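- [User Stories](#user-stories)
- [Story 1: Simplifying Development](#story-1-simplifying-development)
- [Story 2: Easier Debugging and Sharing](#story-2-easier-debugging-and-sharing)
- [Story 3: Supporting Off-the-Shelf Hosts](#story-3-supporting-off-the-shelf-hosts)
- [Notes/Constraints/Caveats](#notesconstraintscaveats)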
- [Risks and Mitigations](#risks-and-mitigations)
- [Design Details](#design-details)
- [Test Plan](#test-plan)
- [Unit tests](#unit-tests)
- [Integration tests](#integration-tests)
- [e2e tests](#e2e-tests)
- [Implementation History](#implementation-history)
- [Drawbacks](#drawbacks)
- [Alternatives](#alternatives)
<!-- /toc -->

## Summary

This proposal introduces the concept of a `hyperfile.toml` configuration file for specifying
sandbox settings in Hyperlight. This feature will allow users to define configurations such
as `stack_size_override`, `max_execution_time`, and execution modes (`RunInHypervisor` or
`RunInProcess`) in a structured and human-readable format. The goal is to enable host
applications built with Hyperlight to be configurable in a non-programmatic way,
allowing users to modify microVM settings without directly altering the host code.

## Motivation

### Goals

- Enable the creation and use of a `hyperfile.toml` configuration file for defining sandbox
settings.
- Support all current configuration options (e.g., `stack_size_override`, `max_execution_time`)
in the new format.
- Maintain backwards compatibility with direct code-based sandbox configuration.

### Non-Goals

- This proposal does not aim to change how sandbox configurations are applied programmatically.
- It does not address dynamic configuration updates while a sandbox is running.

## Proposal

The `hyperfile.toml` will serve as a configuration file for sandbox initialization. Users will
specify parameters in the TOML format. The Hyperlight API will expose methods to parse the
file and use it to initialize sandboxes.
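
The parsing step can be illustrated with a minimal sketch. It assumes only flat `key = value` lines with unsigned integers and quoted strings, which covers the sample file in this proposal but is far from full TOML. A real implementation would more likely rely on an existing TOML library, as suggested in the review discussion. The names `HyperfileValue` and `parse_hyperfile` are hypothetical and are not part of the Hyperlight API.

```rust
use std::collections::HashMap;

/// A value in the flat subset of TOML this sketch understands.
#[derive(Debug, Clone, PartialEq)]
pub enum HyperfileValue {
    Integer(u64),
    Text(String),
}

/// Parses flat `key = value` lines, ignoring blank lines and `#` comments.
/// Hypothetical helper: not a full TOML parser and not the Hyperlight API.
pub fn parse_hyperfile(input: &str) -> Result<HashMap<String, HyperfileValue>, String> {
    let mut values = HashMap::new();
    for (index, raw_line) in input.lines().enumerate() {
        let line_number = index + 1;
        let line = strip_comment(raw_line).trim();
        if line.is_empty() {
            continue;
        }
        let (key, value) = line
            .split_once('=')
            .ok_or_else(|| format!("line {line_number}: expected `key = value`"))?;
        let (key, value) = (key.trim(), value.trim());
        if key.is_empty() {
            return Err(format!("line {line_number}: missing key before `=`"));
        }
        let quoted = value.strip_prefix('"').and_then(|rest| rest.strip_suffix('"'));
        let parsed = if let Some(text) = quoted {
            HyperfileValue::Text(text.to_string())
        } else {
            // TOML allows `_` as a digit separator, e.g. 4_194_304.
            // This sketch is more lenient than TOML about where `_` may appear.
            let digits: String = value.chars().filter(|c| *c != '_').collect();
            let number = digits
                .parse::<u64>()
                .map_err(|_| format!("line {line_number}: invalid value `{value}` for `{key}`"))?;
            HyperfileValue::Integer(number)
        };
        if values.insert(key.to_string(), parsed).is_some() {
            return Err(format!("line {line_number}: duplicate key `{key}`"));
        }
    }
    Ok(values)
}

/// Removes a trailing `#` comment, leaving `#` inside quoted strings alone.
fn strip_comment(line: &str) -> &str {
    let mut in_string = false;
    for (offset, ch) in line.char_indices() {
        match ch {
            '"' => in_string = !in_string,
            '#' if !in_string => return &line[..offset],
            _ => {}
        }
    }
    line
}
```

A host could read the file with `std::fs::read_to_string`, pass the contents to `parse_hyperfile`, and then map the resulting keys onto its sandbox configuration.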

### User Stories

#### Story 1: Simplifying Development

As a developer, I want to define my sandbox settings in a `hyperfile.toml` so I can avoid
repetitive code when creating multiple sandboxes with similar configurations.

#### Story 2: Easier Debugging and Sharing

As a systems engineer, I want a standardized configuration file to share with teammates for
debugging or reproducing issues without requiring them to modify source code.

#### Story 3: Supporting Off-the-Shelf Hosts

As a host provider offering an off-the-shelf host leveraging Hyperlight, I want to use a
`hyperfile.toml` to define configurable sandbox settings. This allows end users to adjust
microVM parameters without modifying the host's code, ensuring the integrity of the host
application while providing flexibility to users.

> **Review comment (Contributor):** Would you see it as the host's responsibility to specify or resolve where these files are at runtime? We could possibly determine the default for a Sandbox via a well-known environment variable. However, if the host wanted to provide different types of Sandbox with a different configuration for each type, it would need to specify how this is done. I don't think that can be done in Hyperlight itself, regardless of whether this uses separate files or named configurations in a single file.
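
One possible resolution order for the question raised in this comment is sketched below: a path supplied by the host wins, then an environment variable, then a default file name. The variable name `HYPERLIGHT_HYPERFILE`, the fallback `hyperfile.toml` in the working directory, and both function names are assumptions made for illustration. The proposal does not define any of them.

```rust
use std::env;
use std::ffi::OsString;
use std::path::{Path, PathBuf};

/// Hypothetical environment variable name; the proposal does not define one.
pub const HYPERFILE_ENV_VAR: &str = "HYPERLIGHT_HYPERFILE";

/// File name assumed when neither the host nor the environment names a path.
pub const DEFAULT_HYPERFILE_NAME: &str = "hyperfile.toml";

/// Picks the configuration path: the host-supplied path first, then the
/// environment value, then the default file name (relative to the working
/// directory).
pub fn resolve_hyperfile_path(host_path: Option<&Path>, env_value: Option<OsString>) -> PathBuf {
    if let Some(path) = host_path {
        return path.to_path_buf();
    }
    match env_value {
        // An empty variable is treated as unset.
        Some(value) if !value.is_empty() => PathBuf::from(value),
        _ => PathBuf::from(DEFAULT_HYPERFILE_NAME),
    }
}

/// Convenience wrapper that reads the real process environment.
pub fn resolve_hyperfile_path_from_env(host_path: Option<&Path>) -> PathBuf {
    resolve_hyperfile_path(host_path, env::var_os(HYPERFILE_ENV_VAR))
}
```

Taking the environment value as a parameter keeps the precedence logic testable without mutating process state. A host that offers several sandbox types would still pass its own path for each type, which matches the point that Hyperlight itself cannot decide this.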

### Notes/Constraints/Caveats

- **Validation**: The parser must validate the TOML file to ensure all required fields are
provided and within acceptable ranges. Defaults will be used for any omitted optional fields.
- **Error Handling**: Errors during parsing should provide detailed feedback to help users
correct their configuration files.
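
The validation and error-handling notes above can be sketched as follows. Which fields are required, the default value, and the bounds are all assumptions chosen for illustration, since the proposal does not specify them. The type and function names (`ConfigError`, `RawBuffers`, `ValidatedBuffers`, `validate`) are likewise hypothetical.

```rust
use std::fmt;

/// Describes why a configuration value was rejected.
#[derive(Debug, PartialEq)]
pub enum ConfigError {
    MissingField { field: &'static str },
    OutOfRange { field: &'static str, value: u64, min: u64, max: u64 },
}

impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match self {
            ConfigError::MissingField { field } => {
                write!(f, "missing required field `{field}`")
            }
            ConfigError::OutOfRange { field, value, min, max } => {
                write!(f, "`{field}` = {value} is outside the allowed range {min}..={max}")
            }
        }
    }
}

impl std::error::Error for ConfigError {}

// Illustrative limits and default; the proposal does not define real ones.
const MIN_BUFFER_SIZE: u64 = 64;
const MAX_BUFFER_SIZE: u64 = 1 << 30;
const DEFAULT_OUTPUT_DATA_SIZE: u64 = 2048;

/// Values as read from the file, before validation.
pub struct RawBuffers {
    pub input_data_size: Option<u64>,
    pub output_data_size: Option<u64>,
}

/// Values that passed validation, with defaults filled in.
#[derive(Debug, PartialEq)]
pub struct ValidatedBuffers {
    pub input_data_size: u64,
    pub output_data_size: u64,
}

fn check_range(field: &'static str, value: u64) -> Result<u64, ConfigError> {
    if (MIN_BUFFER_SIZE..=MAX_BUFFER_SIZE).contains(&value) {
        Ok(value)
    } else {
        Err(ConfigError::OutOfRange { field, value, min: MIN_BUFFER_SIZE, max: MAX_BUFFER_SIZE })
    }
}

pub fn validate(raw: &RawBuffers) -> Result<ValidatedBuffers, ConfigError> {
    // Treated as required in this sketch.
    let input = raw
        .input_data_size
        .ok_or(ConfigError::MissingField { field: "input_data_size" })?;
    // Treated as optional: fall back to the default when omitted.
    let output = raw.output_data_size.unwrap_or(DEFAULT_OUTPUT_DATA_SIZE);
    Ok(ValidatedBuffers {
        input_data_size: check_range("input_data_size", input)?,
        output_data_size: check_range("output_data_size", output)?,
    })
}
```

Carrying the field name and the allowed range in the error lets the message tell the user what to change, which is the kind of detailed feedback the error-handling note asks for.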

### Risks and Mitigations

#### Increased Complexity for New Users

Introducing a configuration file may add initial complexity for users unfamiliar with TOML.

- **Mitigation**: Provide a well-documented template and examples in the Hyperlight
documentation.

## Design Details

### Rough Sketch of a `hyperfile.toml`

> **Review comment (Contributor):** Have you considered allowing multiple named sandbox configurations in the same file, which could be referenced by name when creating sandboxes? I can see a use case for wanting to specify different configurations. This would work similarly to how runtime classes work in containerd.

> **Review comment (Contributor):** If we do this, I imagine one of the entries would be a default, used when no named sandbox configuration is requested.

> **Review comment (Contributor):** What about just doing this in separate files, `config1.toml` and `config2.toml`? It would simplify parsing a lot, especially if we use the same struct for the config value itself and serde TOML deserialization. You could then call something like `Config::FromFile(...)` on the one you want to use.


Below is a sample configuration file that demonstrates how sandbox parameters might be
defined using TOML syntax:

```toml
# Sandbox Configuration File for Hyperlight

# Define sandbox memory settings
input_data_size = 1024 # Size of input data buffer in bytes
output_data_size = 2048 # Size of output data buffer in bytes
function_definition_size = 512 # Size of function definition buffer in bytes
host_exception_size = 256 # Size of host exception buffer in bytes
guest_error_buffer_size = 128 # Size of guest error buffer in bytes

# Optional overrides for memory sizes
stack_size_override = 4_194_304 # Stack size in bytes (4 MiB)
heap_size_override = 8_388_608 # Heap size in bytes (8 MiB)

# Kernel-specific settings
kernel_stack_size = 8192 # Kernel stack size in bytes

# Execution limits
max_execution_time = "100ms" # Maximum execution time (duration string with unit suffix)
max_initialization_time = "100ms" # Maximum initialization time (duration string with unit suffix)
max_wait_for_cancellation = "10ms" # Maximum wait for cancellation (duration string with unit suffix)

# Error handling
guest_panic_context_buffer_size = 1024 # Size of guest panic context buffer in bytes

# Execution mode
execution_mode = "RunInHypervisor" # Options: "RunInHypervisor", "RunInProcess"
```
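
The sample writes time limits as strings such as `"100ms"` and the execution mode as a string, so a loader has to convert both into typed values. The sketch below shows one way to do that with the standard library only. It assumes that `ms` and `s` are the only accepted suffixes, which the proposal does not state, and the names `ExecutionMode` and `parse_duration` are hypothetical.

```rust
use std::str::FromStr;
use std::time::Duration;

/// The two execution modes named in the sample configuration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ExecutionMode {
    RunInHypervisor,
    RunInProcess,
}

impl FromStr for ExecutionMode {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "RunInHypervisor" => Ok(ExecutionMode::RunInHypervisor),
            "RunInProcess" => Ok(ExecutionMode::RunInProcess),
            other => Err(format!(
                "unknown execution_mode `{other}`; expected `RunInHypervisor` or `RunInProcess`"
            )),
        }
    }
}

/// Parses durations such as "100ms" or "2s".
/// Only these two suffixes are accepted in this sketch.
pub fn parse_duration(text: &str) -> Result<Duration, String> {
    let text = text.trim();
    // Check "ms" before "s", because "100ms" also ends with "s".
    let (digits, is_millis) = if let Some(digits) = text.strip_suffix("ms") {
        (digits, true)
    } else if let Some(digits) = text.strip_suffix('s') {
        (digits, false)
    } else {
        return Err(format!("duration `{text}` needs an `ms` or `s` suffix"));
    };
    let amount = digits
        .trim()
        .parse::<u64>()
        .map_err(|_| format!("duration `{text}` has an invalid number"))?;
    Ok(if is_millis {
        Duration::from_millis(amount)
    } else {
        Duration::from_secs(amount)
    })
}
```

With these helpers, `parse_duration("100ms")` yields a 100 millisecond `Duration`, and `"RunInProcess".parse::<ExecutionMode>()` yields the matching variant or an error that lists the accepted values.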

### Test Plan

#### Unit Tests

- Validate parsing of correct and incorrect `hyperfile.toml` files.
- Ensure all fields map correctly from TOML to sandbox configurations.

#### Integration Tests

- Test initialization of sandboxes using `hyperfile.toml` with different configurations.
- Ensure compatibility with existing programmatic APIs.

#### e2e Tests

- Confirm that sandboxes initialized via `hyperfile.toml` behave identically to those
configured programmatically.


## Implementation History

- [x] Draft proposal created (November 2024).
- [ ] Initial implementation of TOML parser for sandbox configuration.
- [ ] Test cases and documentation updates.

## Drawbacks

- Adds another layer of configuration that may not be necessary for simple use cases.
- Potentially increases the learning curve for new users.

## Alternatives

1. **YAML or JSON Configurations**:
   - Rejected due to TOML's better readability for specifying nested configurations.

2. **Environment Variables**:
   - Less maintainable and harder to validate compared to a dedicated file.