server: VmmHdl appears to be leaked during normal VM shutdown #892

Repro steps:

1. Start an ad hoc propolis-server instance
2. Send it an ensure request
3. Ask to stop the VM instance (you don't have to start it first, though you can)

Observed: The `DESTROY_SELF` vmm ioctl is issued (and a probe set on `vmm_destroy_locked` fires), but the kernel VMM persists until the process is killed. The stack on the eventual call to `vmm_destroy_finish` shows that it originated from `genunix!proc_exit`, i.e. from process exit rather than from the stop request. Adding a simple `Drop` impl for `VmmHdl` that just prints to stderr shows that it is apparently never reached.
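For reference, the diagnostic was roughly of the following shape. This is a minimal sketch, not the actual Propolis code: the `VmmHdl` here is a hypothetical stand-in with made-up fields, the `dropped` flag and `teardown_reaches_drop` helper exist only for illustration, and it assumes the handle is shared through `Arc`. It shows how a single leaked strong reference is enough to keep the `Drop` impl from ever running.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Hypothetical stand-in for `VmmHdl`; the real type owns the kernel VMM fd.
struct VmmHdl {
    name: String,
    /// Illustration only: lets the caller observe whether `drop` ran.
    dropped: Arc<AtomicBool>,
}

impl Drop for VmmHdl {
    fn drop(&mut self) {
        // The diagnostic: record and report that the handle was really dropped.
        self.dropped.store(true, Ordering::SeqCst);
        eprintln!("VmmHdl for {} dropped", self.name);
    }
}

/// Simulates teardown with two strong references to the handle and reports
/// whether the `Drop` impl was reached. If `leak_one_ref` is set, one
/// reference is forgotten instead of dropped, so the count never hits zero.
fn teardown_reaches_drop(leak_one_ref: bool) -> bool {
    let dropped = Arc::new(AtomicBool::new(false));
    let hdl = Arc::new(VmmHdl {
        name: "demo-vm".to_string(),
        dropped: Arc::clone(&dropped),
    });

    // A second owner, e.g. a device or task holding its own clone (assumed).
    let stray = Arc::clone(&hdl);
    if leak_one_ref {
        std::mem::forget(stray);
    } else {
        drop(stray);
    }

    // The VM's own reference goes away at "shutdown".
    drop(hdl);
    dropped.load(Ordering::SeqCst)
}
```

Calling `teardown_reaches_drop(false)` prints the message and returns `true`; with `true` nothing is printed, which matches what I see in the real server.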

Expected: there is at least some way to convince Propolis to fully close the kernel VMM fd on VM destruction.

---

We've discussed this in the past and concluded that in at least some cases it's useful for the kernel VMM to outlive the Propolis instance that owns it, so that the VMM can be inspected with tools like `mdb -b`. I'd still like to consider avoiding this in production builds, though, for reasons related to [this Omicron issue comment](https://github.com/oxidecomputer/omicron/issues/6809#issuecomment-2773971685): it's useful for sled-agent to be able to say "Propolis reported that it's in the Destroyed state, so all its reservoir memory is released." That guarantee helps sled-agent tell Nexus that a VMM is gone and then do long-running zone cleanup operations afterward.

Even absent that motivation, I'd at least like to understand exactly which paths aren't fully dropping their `VmmHdl` references so that we can adjust the behavior if/when we need to.
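One possible way to narrow this down, sketched below under the assumption that the handle is shared via `Arc`, is to keep a `Weak` observer and check its strong count once the VM has been torn down. The `VmmHdl` and `Device` types and both functions are hypothetical illustrations, not Propolis APIs.

```rust
use std::sync::{Arc, Weak};

/// Hypothetical stand-in for the real `VmmHdl`.
struct VmmHdl;

/// Hypothetical component that keeps its own clone of the handle.
struct Device {
    _hdl: Arc<VmmHdl>,
}

/// Number of strong references still keeping the handle alive.
fn lingering_refs(observer: &Weak<VmmHdl>) -> usize {
    observer.strong_count()
}

/// Simulates shutdown and returns how many strong references remain
/// at the point where the VM considers itself destroyed.
fn simulate_shutdown(device_outlives_vm: bool) -> usize {
    let hdl = Arc::new(VmmHdl);
    // The observer does not keep the handle alive.
    let observer = Arc::downgrade(&hdl);
    let device = Device { _hdl: Arc::clone(&hdl) };

    // The VM releases its own reference.
    drop(hdl);

    if device_outlives_vm {
        // The device is still around, so the handle is too.
        let remaining = lingering_refs(&observer);
        drop(device);
        remaining
    } else {
        drop(device);
        lingering_refs(&observer)
    }
}
```

A nonzero count after teardown says that something still holds the handle, though not what; finding the holder would still take instrumenting the clone sites.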
