Use page tracking for snapshot and restore #683
7e85120 to 014eab3
014eab3 to 56fd983
Looks mostly good to me. The signal handling of SIGSEGV makes me a little nervous, and there seems to be an assumption that the regions in Vec<MemoryRegions> are always consecutive in memory, which I think we might break in the future. I also think a refactor of our memory management code would be nice 😅.
I also remember I had a bunch of tests for evolve/devolve in my previous dirty-pages PR that could maybe be useful for testing the logic. Those tests were the ones that allowed me to catch the bug in mshv's implementation :P
Also, these regressions seem pretty big (compared to the main branch).
for page_idx in bit_index_iterator(&bitmap) {
    page_indices.push(current_page + page_idx);
}
current_page += num_pages;
I think we are making an assumption here that all memory regions are consecutive with no gaps. I think Vec<MemoryRegion> has this invariant, but I'm not sure it is documented anywhere.
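To illustrate the concern, here is a minimal sketch of one way to avoid depending on contiguity: derive each region's first page from its start address instead of a running `current_page` counter. The `Region` struct, `PAGE_SIZE`, `set_bits` and `global_dirty_pages` are hypothetical stand-ins for illustration, not the types or helpers used in this PR.

```rust
/// Assumed page size for this sketch.
const PAGE_SIZE: usize = 4096;

/// Hypothetical stand-in for a memory region: where it starts in guest
/// memory and how many bytes it covers.
struct Region {
    guest_start: usize,
    len: usize,
}

/// Yields the indices of the set bits in a bitmap of u64 words, lowest first.
fn set_bits(bitmap: &[u64]) -> impl Iterator<Item = usize> + '_ {
    bitmap.iter().enumerate().flat_map(|(word_idx, &word)| {
        (0..64usize)
            .filter(move |&bit| (word & (1u64 << bit)) != 0)
            .map(move |bit| word_idx * 64 + bit)
    })
}

/// Converts per-region dirty bitmaps into global page indices.
/// The first page of each region is computed from its address relative to
/// `base`, so a gap between two regions does not shift later indices.
fn global_dirty_pages(base: usize, regions: &[(Region, Vec<u64>)]) -> Vec<usize> {
    let mut pages = Vec::new();
    for (region, bitmap) in regions {
        let first_page = (region.guest_start - base) / PAGE_SIZE;
        let num_pages = (region.len + PAGE_SIZE - 1) / PAGE_SIZE;
        pages.extend(
            set_bits(bitmap)
                // Ignore padding bits beyond the end of the region.
                .filter(|&p| p < num_pages)
                .map(|p| first_page + p),
        );
    }
    pages
}
```

With two regions separated by a two-page gap, a running counter would report the second region's first page as page 2, while the address-based version reports page 4.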
    shared_mem.with_exclusivity(|e| e.copy_to_slice(&mut buffer, 0))??;
} else {
    // Sort pages for deterministic ordering and to enable consecutive page optimization
    dirty_pages.sort_unstable();
I believe this sort is redundant
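A small sketch of why the sort may be redundant: an iterator that walks the bitmap word by word and, within each word, from the lowest set bit upwards, already yields indices in ascending order. `set_bit_indices` below is a hypothetical stand-in; whether the PR's `bit_index_iterator` behaves the same way is an assumption here.

```rust
/// Collects the indices of set bits, visiting words in order and bits from
/// least significant to most significant within each word.
fn set_bit_indices(bitmap: &[u64]) -> Vec<usize> {
    let mut out = Vec::new();
    for (word_idx, &word) in bitmap.iter().enumerate() {
        let mut remaining = word;
        while remaining != 0 {
            // Index of the lowest set bit in this word.
            let bit = remaining.trailing_zeros() as usize;
            out.push(word_idx * 64 + bit);
            // Clear the lowest set bit and continue.
            remaining &= remaining - 1;
        }
    }
    out
}

/// Returns true when every element is strictly greater than the previous one.
fn is_strictly_ascending(values: &[usize]) -> bool {
    values.windows(2).all(|pair| pair[0] < pair[1])
}
```

If the real iterator gives the same guarantee, a `debug_assert!` on ordering would be a cheaper way to keep the invariant visible than sorting on every snapshot.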

// Collect dirty pages and sort them for consecutive page optimization
let mut dirty_pages: Vec<usize> = bit_index_iterator(dirty_bitmap).collect();
dirty_pages.sort_unstable();
I believe this sort is redundant
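For context on the "consecutive page optimization" mentioned in the code comment, here is a minimal sketch of how ascending page indices can be merged into runs so that each run is copied with a single call. `coalesce_runs` is a hypothetical helper and assumes its input is ascending with no duplicates; it is not the implementation in this PR.

```rust
use std::ops::Range;

/// Merges ascending, duplicate-free page indices into half-open ranges of
/// consecutive pages, e.g. [1, 2, 3, 7] becomes [1..4, 7..8].
fn coalesce_runs(sorted_pages: &[usize]) -> Vec<Range<usize>> {
    let mut runs: Vec<Range<usize>> = Vec::new();
    for &page in sorted_pages {
        match runs.last_mut() {
            // The page directly follows the current run: extend it.
            Some(run) if run.end == page => run.end += 1,
            // Otherwise start a new run of length one.
            _ => runs.push(page..page + 1),
        }
    }
    runs
}
```

Each resulting range maps to one contiguous byte range (`run.start * page_size .. run.end * page_size`), which is where ascending input order matters.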
pub(super) fn create_new_snapshot<S: SharedMemory>(
    &mut self,
    shared_mem: &mut S,
    dirty_page_map: Option<&Vec<u64>>,
I think(?) this option can also be removed
@@ -606,6 +607,12 @@ impl Hypervisor for HypervWindowsDriver {
        Ok(())
    }

    fn get_and_clear_dirty_pages(&mut self) -> Result<Vec<u64>> {
        // For now we just mark all pages dirty which is the equivalent of taking a full snapshot
        let total_size = self.mem_regions.iter().map(|r| r.guest_region.len()).sum();
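As a rough sketch of what "mark all pages dirty" could look like for a given total size, assuming the bitmap is one bit per 4 KiB page packed into `u64` words (the layout, `PAGE_SIZE` and `all_pages_dirty` are assumptions for illustration, not taken from the PR):

```rust
/// Assumed page size for this sketch.
const PAGE_SIZE: usize = 4096;

/// Builds a bitmap with one set bit for every page covered by `total_size`
/// bytes. Padding bits in the last word are left clear so they are not
/// mistaken for pages that do not exist.
fn all_pages_dirty(total_size: usize) -> Vec<u64> {
    let num_pages = (total_size + PAGE_SIZE - 1) / PAGE_SIZE;
    let num_words = (num_pages + 63) / 64;
    let mut bitmap = vec![u64::MAX; num_words];
    let used_bits = num_pages % 64;
    if used_bits != 0 {
        if let Some(last) = bitmap.last_mut() {
            // Keep only the low `used_bits` bits of the final word.
            *last = (1u64 << used_bits) - 1;
        }
    }
    bitmap
}
```

Clearing the padding bits matters if a consumer turns every set bit into a page index without checking it against the region size.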
3838d0a to 411a668
Signed-off-by: Simon Davies <[email protected]>
… sizes Signed-off-by: Simon Davies <[email protected]>
- Implemented `LinuxDirtyPageTracker` for tracking dirty pages in Linux.
  - Utilizes SIGSEGV to detect writes and marks pages as dirty.
  - Supports concurrent access and ensures memory protection.
  - Includes comprehensive tests for various scenarios including overlap detection and concurrent writes.
- Added `WindowsDirtyPageTracker` as a placeholder for Windows dirty page tracking.
  - Currently marks all pages as dirty until further implementation is completed.
  - Includes basic structure and initialization logic.
Signed-off-by: Simon Davies <[email protected]>
…g instructions Signed-off-by: Simon Davies <[email protected]>
Signed-off-by: Simon Davies <[email protected]>
…state reset validation Signed-off-by: Simon Davies <[email protected]>
Signed-off-by: Simon Davies <[email protected]>
Signed-off-by: Simon Davies <[email protected]>
…ge tracking after memory is allocated and stopping it once the uninitialized sandbox is evolved Signed-off-by: Simon Davies <[email protected]>
…memory from them. Signed-off-by: Simon Davies <[email protected]>
Signed-off-by: Simon Davies <[email protected]>
Signed-off-by: Simon Davies <[email protected]>
… offset methods to be public Signed-off-by: Simon Davies <[email protected]>
…managing memory snapshots and update related methods for improved state management Signed-off-by: Simon Davies <[email protected]>
… shared_memory_snapshot_manager for improved snapshot management Signed-off-by: Simon Davies <[email protected]>
…lated methods Signed-off-by: Simon Davies <[email protected]>
411a668 to 1231cc6
…lve methods Signed-off-by: Simon Davies <[email protected]>
…ethod Signed-off-by: Simon Davies <[email protected]>
…ptions in parameters. Signed-off-by: Ludvig Liljenberg <[email protected]>
1231cc6 to 9d4d90a
This pull request introduces snapshotting and restoring of sandbox state using dirty page tracking rather than copying the entire memory state each time. In many, if not most, scenarios where sandboxes have larger than default amounts of memory, this results in better performance and reduced memory usage.
However, due to inefficiencies in the way dirty page tracking works on mshv, and because tracking is not yet implemented for Windows, this is not universally true.
Included below are screenshots showing the difference in time taken for some benchmarks with and without dirty page tracking enabled, along with some explanations of the differences seen.
The changes comprise:
Host Shared Memory Dirty Page tracking
- Uses `SIGSEGV` to support dirty page tracking for host memory mapped into a VM. Updated documentation to reflect this change and added debugging instructions for handling `SIGSEGV` in GDB and LLDB.
VM Dirty page tracking
Snapshot Management
Benchmarking Enhancements
- Updated the `guest_call_benchmark_large_param` function to support benchmarks for multiple parameter sizes and added a new `sandbox_heap_size_benchmark` function to measure sandbox creation performance with varying heap sizes.
- `guest_call_heap_size_benchmark` to evaluate guest function call performance with different heap sizes.
Performance Changes
KVM Before
KVM After
KVM shows massive improvements in performance when creating large sandboxes and when calling functions in sandboxes with large memory configurations. Although it was not measured, the amount of memory consumed should also have reduced considerably. In the scenario where large parameters are passed to the sandbox there is a degradation of performance. This has not been investigated yet; it may be because the page-based mechanism for saving/restoring data is more expensive than copying and restoring all the data. Other work to make large parameter passing more efficient will likely have a large positive impact here as well.
There is some regression in performance for small/default sandbox sizes, which is more than likely caused by the overhead of tracking and of building/restoring page-based snapshots for small memory vs. copying and restoring the entire memory.
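To make the trade-off concrete, here is a minimal sketch of a page-based snapshot: only the listed dirty pages are copied out and later copied back, so the cost scales with the number of dirty pages plus per-page bookkeeping rather than with total memory size. `PageSnapshot`, `snapshot_dirty` and `restore` are hypothetical names and a simplification; a real restore also has to deal with pages dirtied after the snapshot that are not part of it.

```rust
/// Assumed page size for this sketch.
const PAGE_SIZE: usize = 4096;

/// Hypothetical snapshot holding a copy of each dirty page, keyed by page index.
struct PageSnapshot {
    pages: Vec<(usize, Vec<u8>)>,
}

/// Copies only the pages listed in `dirty_pages` out of `memory`.
fn snapshot_dirty(memory: &[u8], dirty_pages: &[usize]) -> PageSnapshot {
    let pages = dirty_pages
        .iter()
        .map(|&page| {
            let start = page * PAGE_SIZE;
            (page, memory[start..start + PAGE_SIZE].to_vec())
        })
        .collect();
    PageSnapshot { pages }
}

/// Writes the saved pages back into `memory`, leaving other pages untouched.
fn restore(memory: &mut [u8], snapshot: &PageSnapshot) {
    for (page, data) in &snapshot.pages {
        let start = *page * PAGE_SIZE;
        memory[start..start + PAGE_SIZE].copy_from_slice(data);
    }
}
```

For a small sandbox where most pages are dirty, the per-page slicing and allocation in a scheme like this can cost more than one `memcpy` of the whole region, which would be consistent with the regression described above.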
mshv2 Before
mshv2 After
mshv3 Before
mshv3 After
Both mshv2 and mshv3 show similar patterns to KVM: the larger sandbox sizes show large improvements, while the default/small sandboxes and the large parameter sizes show regressions, possibly for the same reasons as on KVM. Again, no investigation has been done here yet.
The biggest difference between KVM and mshv comes down to two things. First, when dirty page tracking is enabled on mshv, the first call to get dirty pages returns a bitmap showing all pages dirty. Since this would cause us to snapshot all memory as a baseline after enabling dirty page tracking, this PR gets the dirty pages immediately and then discards the result, which means we have to make 2 calls to get dirty pages when we really only need one. Second, the call to get dirty pages seems to be O(n), where n is the number of pages in the memory configuration (regardless of whether the pages are dirty or not), so the impact of the extra call is quite large, especially with larger memory configurations: for a 950 MB VM I observed a response time of ~1.4 ms for this call. Even so, this approach is still much quicker than copying all the memory for larger sandboxes. Fixing these issues in mshv will probably bring the performance much closer to KVM.
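The "get and discard" workaround described above can be sketched roughly as follows. `DirtyPageSource`, `enable_tracking_with_baseline` and `FirstCallAllDirty` are hypothetical names used only to model the behaviour; they are not the mshv API or the types in this PR.

```rust
/// Hypothetical abstraction over a hypervisor's dirty page query.
trait DirtyPageSource {
    /// Returns the current dirty bitmap and resets it.
    fn get_and_clear_dirty_pages(&mut self) -> Vec<u64>;
}

/// After tracking is enabled, query once and throw the result away so the
/// "everything is dirty" answer from the first call never reaches a snapshot.
fn enable_tracking_with_baseline<S: DirtyPageSource>(source: &mut S) {
    let _ = source.get_and_clear_dirty_pages();
}

/// Fake source modelling the described behaviour: the first query reports
/// every page dirty, later queries report only pages written since.
struct FirstCallAllDirty {
    first_call: bool,
    dirty: Vec<u64>,
}

impl FirstCallAllDirty {
    fn new(num_words: usize) -> Self {
        Self { first_call: true, dirty: vec![0; num_words] }
    }

    /// Simulates a guest write to `page`.
    fn mark_dirty(&mut self, page: usize) {
        self.dirty[page / 64] |= 1u64 << (page % 64);
    }
}

impl DirtyPageSource for FirstCallAllDirty {
    fn get_and_clear_dirty_pages(&mut self) -> Vec<u64> {
        let cleared = vec![0u64; self.dirty.len()];
        let current = std::mem::replace(&mut self.dirty, cleared);
        if self.first_call {
            self.first_call = false;
            vec![u64::MAX; current.len()]
        } else {
            current
        }
    }
}
```

The extra query in `enable_tracking_with_baseline` is the second call the description refers to; if each call is O(n) in the number of pages, that cost is paid even when nothing has been written.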
Windows 2025 Before
Windows 2025 After
Windows either performs largely the same or has regressed. This is because the Windows implementation has not been done yet: at the moment, each time dirty pages are requested Windows reports that all pages are dirty, and the snapshots/restores are done on that basis. There is some overhead to this approach, especially when restoring