 - [Overview](#overview)
 - [Snapshot files management](#snapshot-files-management)
 - [Performance](#performance)
-- [Known issues](#known-issues)
+- [Developer preview status](#developer-preview-status)
+- [Limitations](#limitations)
 - [Firecracker Snapshotting characteristics](#firecracker-snapshotting-characteristics)
 - [Snapshot versioning](#snapshot-versioning)
 - [Snapshot API](#snapshot-api)
@@ -38,6 +39,7 @@ guest workload at that particular point in time.

 The Firecracker snapshot feature is in [developer preview](../RELEASE_POLICY.md)
 on all CPU micro-architectures listed in [README](../../README.md#supported-platforms).
+See [this section](#developer-preview-status) for more info.

 ### Overview

@@ -82,8 +84,6 @@ resumed microVM.

 The Firecracker snapshot design offers a very simple interface to interact with
 snapshots but provides no functionality to package or manage them on the host.
-Using snapshots in production is currently not recommended as there are open
-[Known issues](#known-issues).

 The [threat containment model](../design.md#threat-containment) states
 that the host, host/API communication and snapshot files are trusted by Firecracker.
@@ -93,33 +93,49 @@ snapshot files by implementing authentication and encryption schemes while
 managing their lifecycle or moving them across the trust boundary, like for
 example when provisioning them from a respository to a host over the network.

-Firecracker is optimized for fast load/resume and it's designed to do some very basic
-sanity checks only on the vm state file. It only verifies integrity using a 64
-bit CRC value embedded in the vm state file, but this is only as a partial
-measure to protect against accidental corruption, as the disk files and memory
-file need to be secured as well. It is important to note that CRC computation
-is validated before trying to load the snapshot. Should it encounter failure,
-an error will be shown to the user and the Firecracker process will be terminated.
+Firecracker is optimized for fast load/resume, and it's designed to do some
+very basic sanity checks only on the vm state file. It only verifies integrity
+using a 64-bit CRC value embedded in the vm state file, but this is only
+a partial measure to protect against accidental corruption, as the disk
+files and memory file need to be secured as well. It is important to note that
+CRC computation is validated before trying to load the snapshot. Should it
+encounter failure, an error will be shown to the user and the Firecracker
+process will be terminated.

 ### Performance

 The Firecracker snapshot create/resume performance depends on the memory size,
-vCPU count and emulated devices count. The Firecracker CI runs snapshots tests
-on AWS **m5d.metal** instances for Intel and on AWS **m6g.metal** for ARM.
-The baseline for snapshot resume latency target on Intel is under **8ms** with
-5ms p90, and on ARM is under **3ms** for a microVM with the following specs:
-2vCPU/512MB/1 block/1 net device.
+vCPU count and emulated devices count.
+The Firecracker CI runs snapshot tests on:

-### Known issues
+- AWS **m5d.metal** and **m6i.metal** instances for Intel
+- AWS **m6g.metal** for ARM
+- AWS **m6a.metal** for AMD

-- High snapshot latency on 5.4+ host kernels - [#2129](https://github.com/firecracker-microvm/firecracker/issues/2129)
+We are running nightly performance tests for all the enumerated platforms on
+all supported kernel versions.
+The baselines can be found in their [respective config file](../../tests/integration_tests/performance/configs/).
+
+### Developer preview status
+
+The snapshot functionality is still in developer preview due to the following:
+
+- Poor entropy and replayable randomness when resuming multiple microvms from
+the same snapshot. We do not recommend to use snapshotting in production if
+there is no mechanism to guarantee proper secrecy and uniqueness between
+guests.
+Please see [Snapshot security and uniqueness](#snapshot-security-and-uniqueness).
+
+### Limitations
+
+- High snapshot latency on 5.4+ host kernels due to cgroups V1. We
+strongly recommend to deploy snapshots on cgroups V2 enabled hosts for the
+implied kernel versions - [related issue](https://github.com/firecracker-microvm/firecracker/issues/2129).
 - Guest network connectivity is not guaranteed to be preserved after resume.
 For recommendations related to guest network connectivity for clones please
 see [Network connectivity for clones](network-for-clones.md).
 - Vsock device does not have full snapshotting support.
 Please see [Vsock device limitation](#vsock-device-limitation).
-- Poor entropy and replayable randomness when resuming multiple microvms which
-deal with cryptographic secrets. Please see [Snapshot security and uniqueness](#snapshot-security-and-uniqueness).
 - Snapshotting on arm64 works for both GICv2 and GICv3 enabled guests.
 However, restoring between different GIC version is not possible.

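The cgroups recommendation in the Limitations hunk can be checked on a host before snapshots are deployed there. This is a minimal sketch, not part of Firecracker. It assumes the usual convention that a cgroup v2 (unified) mount exposes a `cgroup.controllers` file at its root. The function name `uses_cgroup_v2` and the default mount point `/sys/fs/cgroup` are illustrative assumptions.

```python
import os


def uses_cgroup_v2(cgroup_root: str = "/sys/fs/cgroup") -> bool:
    """Return True if cgroup_root looks like a cgroup v2 (unified) hierarchy.

    Assumption: a v2 mount has a cgroup.controllers file at its root,
    while a v1 layout has per-controller directories instead.
    """
    return os.path.isfile(os.path.join(cgroup_root, "cgroup.controllers"))


if __name__ == "__main__":
    if uses_cgroup_v2():
        print("cgroups v2 detected")
    else:
        print("cgroups v2 not detected at the default mount point")
```

On hosts running in hybrid mode the v2 hierarchy may be mounted elsewhere, so a negative result at the default path does not prove it is absent.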
@@ -542,7 +558,7 @@ Boot microVM A -> ... -> Create snapshot S -> Resume -> ...
 -> Load S in microVM B -> Resume -> ...
 ```

-Here, both microVM A and B do work staring from the state stored in snapshot S.
+Here, both microVM A and B do work starting from the state stored in snapshot S.
 Unique identifiers, random numbers, and cryptographic tokens that are meant to
 be used once may be used twice. It doesn't matter if microVM A is terminated
 before microVM B resumes execution from snapshot S or not. In this example, we
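The duplication described in the last hunk can be reproduced in miniature. This is a sketch under stated assumptions, not Firecracker code. Python's `random.Random` (which is not a cryptographic generator) stands in for a guest's PRNG, and `getstate`/`setstate` stand in for creating and loading a snapshot. The helper `resume_and_generate` is hypothetical.

```python
import random


def resume_and_generate(snapshot_state, count=3):
    """Restore a PRNG from a saved state and draw `count` 64-bit tokens."""
    rng = random.Random()
    rng.setstate(snapshot_state)
    return [rng.getrandbits(64) for _ in range(count)]


# "Boot" a guest PRNG and take a "snapshot" of its internal state.
guest_rng = random.Random(1234)
snapshot_s = guest_rng.getstate()

# Two clones resumed from the same snapshot draw identical "unique" tokens.
tokens_a = resume_and_generate(snapshot_s)
tokens_b = resume_and_generate(snapshot_s)
assert tokens_a == tokens_b

# Reseeding from a fresh source after resume (here, the OS entropy pool)
# gives a clone its own stream instead of replaying the snapshotted one.
reseeded = random.Random(random.SystemRandom().getrandbits(64))
fresh_tokens = [reseeded.getrandbits(64) for _ in range(3)]
```

The same reasoning applies to any state captured in the snapshot, which is why a mechanism for restoring uniqueness has to run after resume rather than before the snapshot is taken.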