Adding automatic bundle on zone death #3829

Merged
merged 3 commits into from
Aug 9, 2023
41 changes: 22 additions & 19 deletions docs/how-to-run.adoc
@@ -172,7 +172,7 @@ The rest of these instructions assume that you're building and running Omicron o
The Sled Agent supports operation on both:

* a Gimlet (i.e., real Oxide hardware), and
* an ordinary PC that's been set up to look like a Gimlet using the `./tools_create_virtual_hardware.sh` script.
* an ordinary PC that's been set up to look like a Gimlet using the `./tools/create_virtual_hardware.sh` script.

This script also sets up a "softnpu" zone to implement Boundary Services. SoftNPU simulates the Tofino device that's used in real systems. Just like Tofino, it can implement sled-to-sled networking, but that's beyond the scope of this doc.

@@ -373,7 +373,9 @@ $ dig recovery.sys.oxide.test @192.168.1.20 +short
192.168.1.21
----

Where did 192.168.1.20 come from? That's the external address of the external DNS server. We knew that only because it's the first address in the "internal services" IP pool in config-rss.toml.
Where did 192.168.1.20 come from? That's the external address of the external
DNS server. We knew that because it's listed in the `external_dns_ips` entry of
the `config-rss.toml` file we're using.

Having looked this up, the easiest thing will be to use `http://192.168.1.21` for your URL (replacing with `https` if you used a certificate, and replacing that IP if needed). If you've set up networking right, you should be able to reach this from your web browser. You may have to instruct the browser to accept a self-signed TLS certificate. See also <<_connecting_securely_with_tls_using_the_cli>>.

@@ -392,12 +394,19 @@ An IP pool is needed to provide external connectivity to Instances. The address

[source,console]
----
$ oxide api /v1/system/ip-pools/default/ranges/add --method POST --input - <<EOF
{
"first": "192.168.1.31",
"last": "192.168.1.40"
$ oxide ip-pool range add --pool default --first 192.168.1.31 --last 192.168.1.40
success
IpPoolRange {
id: 4a61e65a-d96d-4c56-9cfd-dc1e44d9e99b,
ip_pool_id: 1b1289a7-cefe-4a7e-a8c9-d93330846301,
range: V4(
Ipv4Range {
first: 192.168.1.31,
last: 192.168.1.40,
},
),
time_created: 2023-08-02T16:31:43.679785Z,
}
EOF
----
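
Reviewer note: a minimal sketch for sanity-checking the range passed to `oxide ip-pool range add` before submitting it. It assumes `first` and `last` are both inclusive, which is how the doc text reads but is not confirmed by this diff. `range_len` is a hypothetical helper and not part of Omicron or the CLI.

```rust
use std::net::Ipv4Addr;

/// Number of addresses in the range `first..=last`, treating both ends as
/// inclusive (an assumption). Returns `None` if `first` sorts after `last`.
fn range_len(first: Ipv4Addr, last: Ipv4Addr) -> Option<u64> {
    let lo = u32::from(first);
    let hi = u32::from(last);
    if lo > hi {
        return None;
    }
    // Widen to u64 so the full IPv4 space (2^32 addresses) does not overflow.
    Some(u64::from(hi - lo) + 1)
}
```

Under that assumption, the range in the example above (192.168.1.31 through 192.168.1.40) provides ten addresses.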

With SoftNPU you will generally also need to configure Proxy ARP. Below, `IP_POOL_START` and `IP_POOL_END` are the first and last addresses you used in the previous command:
@@ -435,11 +444,6 @@ $ oxide api /v1/images?project=myproj --method POST --input - <<EOF
{
"name": "alpine",
"description": "boot from propolis zone blob!",
"block_size": 512,
"distribution": {
"name": "alpine",
"version": "propolis-blob"
},
"os": "linux",
"version": "1",
"source": {
@@ -457,22 +461,21 @@ $ oxide api /v1/images --method POST --input - <<EOF
{
"name": "crucible-tester-sparse",
"description": "boot from a url!",
"block_size": 512,
"distribution": {
"name": "debian",
"version": "9"
},
"os": "debian",
"version": "9",
"source": {
"type": "url",
"url": "http://[fd00:1122:3344:101::15]/crucible-tester-sparse.img"
"url": "http://[fd00:1122:3344:101::15]/crucible-tester-sparse.img",
"block_size": 512
}
}
EOF
----

=== Provision an instance using the CLI

You'll need the id `$IMAGE_ID` of the image you just created.
You'll need the id `$IMAGE_ID` of the image you just created. You can fetch that
with `oxide image view --image $IMAGE_NAME`.

Now, create a Disk from that Image. The disk size must be a multiple of 1 GiB and at least as large as the image size. The example below creates a disk using the image made from the alpine ISO that ships with propolis, and sets the size to the next 1GiB multiple of the original alpine source:

Expand Down
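
Reviewer note: the disk-size rule quoted above (a multiple of 1 GiB, at least as large as the image) can be sketched as a round-up. `disk_size_for_image` is a hypothetical helper for illustration only. The 1 GiB floor for an empty image is an assumption and is not stated in the doc.

```rust
/// One gibibyte in bytes.
const GIB: u64 = 1 << 30;

/// Smallest multiple of 1 GiB that is at least `image_size` bytes.
/// Returns `None` if the rounded size would overflow a `u64`.
fn disk_size_for_image(image_size: u64) -> Option<u64> {
    // Round up: whole GiBs, plus one more if there is a remainder.
    let gibs = image_size / GIB + u64::from(image_size % GIB != 0);
    // Assumption: never return a zero-sized disk, even for an empty image.
    gibs.max(1).checked_mul(GIB)
}
```

For example, an image one byte larger than 1 GiB needs a 2 GiB disk, while an image of exactly 1 GiB fits in a 1 GiB disk.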
11 changes: 9 additions & 2 deletions illumos-utils/src/running_zone.rs
@@ -933,11 +933,10 @@ impl RunningZone {

/// Return the names of the Oxide SMF services this zone is intended to run.
pub fn service_names(&self) -> Result<Vec<String>, ServiceError> {
const NEEDLES: [&str; 2] = ["/oxide", "/system/illumos"];
let output = self.run_cmd(&["svcs", "-H", "-o", "fmri"])?;
Ok(output
.lines()
.filter(|line| NEEDLES.iter().any(|needle| line.contains(needle)))
.filter(|line| is_oxide_smf_log_file(line))
.map(|line| line.trim().to_string())
.collect())
}
@@ -1191,3 +1190,11 @@ impl InstalledZone {
path
}
}

/// Return true if the named file appears to be a log file for an Oxide SMF
/// service.
pub fn is_oxide_smf_log_file(name: impl AsRef<str>) -> bool {
const SMF_SERVICE_PREFIXES: [&str; 2] = ["/oxide", "/system/illumos"];
let name = name.as_ref();
SMF_SERVICE_PREFIXES.iter().any(|needle| name.contains(needle))
}
117 changes: 115 additions & 2 deletions openapi/sled-agent.json
@@ -10,6 +10,34 @@
"version": "0.0.1"
},
"paths": {
"/all-zone-bundles": {
"get": {
"summary": "List all zone bundles that exist, even for now-deleted zones.",
"operationId": "zone_bundle_list_all",
"responses": {
"200": {
"description": "successful operation",
"content": {
"application/json": {
"schema": {
"title": "Array_of_ZoneBundleMetadata",
"type": "array",
"items": {
"$ref": "#/components/schemas/ZoneBundleMetadata"
}
}
}
}
},
"4XX": {
"$ref": "#/components/responses/Error"
},
"5XX": {
"$ref": "#/components/responses/Error"
}
}
}
},
"/cockroachdb": {
"post": {
"summary": "Initializes a CockroachDB cluster",
@@ -528,7 +556,7 @@
},
"/zones/{zone_name}/bundles": {
"get": {
"summary": "List the zone bundles that are current available for a zone.",
"summary": "List the zone bundles that are available for a running zone.",
"operationId": "zone_bundle_list",
"parameters": [
{
@@ -639,6 +667,42 @@
"$ref": "#/components/responses/Error"
}
}
},
"delete": {
"summary": "Delete a zone bundle.",
"operationId": "zone_bundle_delete",
"parameters": [
{
"in": "path",
"name": "bundle_id",
"description": "The ID for this bundle itself.",
"required": true,
"schema": {
"type": "string",
"format": "uuid"
}
},
{
"in": "path",
"name": "zone_name",
"description": "The name of the zone this bundle is derived from.",
"required": true,
"schema": {
"type": "string"
}
}
],
"responses": {
"204": {
"description": "successful deletion"
},
"4XX": {
"$ref": "#/components/responses/Error"
},
"5XX": {
"$ref": "#/components/responses/Error"
}
}
}
},
"/zpools": {
@@ -2654,6 +2718,39 @@
"vni"
]
},
"ZoneBundleCause": {
"description": "The reason or cause for a zone bundle, i.e., why it was created.",
"oneOf": [
{
"description": "Generated in response to an explicit request to the sled agent.",
"type": "string",
"enum": [
"explicit_request"
]
},
{
"description": "A zone bundle taken when a sled agent finds a zone that it does not expect to be running.",
"type": "string",
"enum": [
"unexpected_zone"
]
},
{
"description": "An instance zone was terminated.",
"type": "string",
"enum": [
"terminated_instance"
]
},
{
"description": "Some other, unspecified reason.",
"type": "string",
"enum": [
"other"
]
}
]
},
"ZoneBundleId": {
"description": "An identifier for a zone bundle.",
"type": "object",
@@ -2677,6 +2774,14 @@
"description": "Metadata about a zone bundle.",
"type": "object",
"properties": {
"cause": {
"description": "The reason or cause a bundle was created.",
"allOf": [
{
"$ref": "#/components/schemas/ZoneBundleCause"
}
]
},
"id": {
"description": "Identifier for this zone bundle",
"allOf": [
@@ -2689,11 +2794,19 @@
"description": "The time at which this zone bundle was created.",
"type": "string",
"format": "date-time"
},
"version": {
"description": "A version number for this zone bundle.",
"type": "integer",
"format": "uint8",
"minimum": 0
}
},
"required": [
"cause",
"id",
"time_created"
"time_created",
"version"
]
},
"ZoneType": {
Expand Down
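
Reviewer note: the `ZoneBundleCause` schema above is a string enum with four snake_case values. A dependency-free sketch of the corresponding Rust type is below. The variant names are inferred from the JSON values and are an assumption. The real type in the sled agent presumably derives its serialization rather than hand-writing it as done here.

```rust
use std::str::FromStr;

/// Why a zone bundle was created. Variant names are inferred from the
/// OpenAPI enum values and may not match the sled agent's definition.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ZoneBundleCause {
    ExplicitRequest,
    UnexpectedZone,
    TerminatedInstance,
    Other,
}

impl ZoneBundleCause {
    /// The wire string listed in the schema for this cause.
    pub fn as_str(self) -> &'static str {
        match self {
            ZoneBundleCause::ExplicitRequest => "explicit_request",
            ZoneBundleCause::UnexpectedZone => "unexpected_zone",
            ZoneBundleCause::TerminatedInstance => "terminated_instance",
            ZoneBundleCause::Other => "other",
        }
    }
}

impl FromStr for ZoneBundleCause {
    type Err = String;

    /// Parse one of the four schema strings; anything else is an error.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "explicit_request" => Ok(ZoneBundleCause::ExplicitRequest),
            "unexpected_zone" => Ok(ZoneBundleCause::UnexpectedZone),
            "terminated_instance" => Ok(ZoneBundleCause::TerminatedInstance),
            "other" => Ok(ZoneBundleCause::Other),
            unknown => Err(format!("unknown zone bundle cause: {unknown}")),
        }
    }
}
```

Since `cause` and `version` are now in the `required` list, metadata written before this change would fail validation against the new schema unless it is migrated or handled separately.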
51 changes: 50 additions & 1 deletion schema/zone-bundle-metadata.json
@@ -4,10 +4,20 @@
"description": "Metadata about a zone bundle.",
"type": "object",
"required": [
"cause",
"id",
"time_created"
"time_created",
"version"
],
"properties": {
"cause": {
"description": "The reason or cause a bundle was created.",
"allOf": [
{
"$ref": "#/definitions/ZoneBundleCause"
}
]
},
"id": {
"description": "Identifier for this zone bundle",
"allOf": [
@@ -20,9 +30,48 @@
"description": "The time at which this zone bundle was created.",
"type": "string",
"format": "date-time"
},
"version": {
"description": "A version number for this zone bundle.",
"type": "integer",
"format": "uint8",
"minimum": 0.0
}
},
"definitions": {
"ZoneBundleCause": {
"description": "The reason or cause for a zone bundle, i.e., why it was created.",
"oneOf": [
{
"description": "Generated in response to an explicit request to the sled agent.",
"type": "string",
"enum": [
"explicit_request"
]
},
{
"description": "A zone bundle taken when a sled agent finds a zone that it does not expect to be running.",
"type": "string",
"enum": [
"unexpected_zone"
]
},
{
"description": "An instance zone was terminated.",
"type": "string",
"enum": [
"terminated_instance"
]
},
{
"description": "Some other, unspecified reason.",
"type": "string",
"enum": [
"other"
]
}
]
},
"ZoneBundleId": {
"description": "An identifier for a zone bundle.",
"type": "object",
Expand Down