server: VmmHdl appears to be leaked during normal VM shutdown #892

Open
@gjcolombo

Description

Repro steps:

  1. Start an ad hoc propolis-server instance
  2. Send it an ensure request
  3. Ask to stop the VM instance (you don't have to start it first, though you can)

Observed: The DESTROY_SELF vmm ioctl is issued (and a probe set on vmm_destroy_locked fires), but the kernel VMM persists until the process is killed. The stack on the resulting call to vmm_destroy_finish shows it originated from genunix!proc_exit. Writing a simple Drop impl for VmmHdl that just prints to stderr shows that this drop impl is apparently never reached.
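The Drop-based check described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `VmmHdl` here is a hypothetical stand-in rather than the real Propolis type, and the sketch assumes the handle is shared through `Arc` clones, so that a single outstanding clone is enough to keep the Drop impl from running. The `drop_ran_after_teardown` helper and the `dropped` flag are invented for the example.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Hypothetical stand-in for the real handle; it only records whether Drop ran.
struct VmmHdl {
    dropped: Arc<AtomicBool>,
}

impl Drop for VmmHdl {
    fn drop(&mut self) {
        // Diagnostic only: the real handle would close the kernel VMM fd here.
        eprintln!("VmmHdl dropped");
        self.dropped.store(true, Ordering::SeqCst);
    }
}

/// Simulates VM teardown and reports whether the handle's Drop impl ran.
/// `stray_ref` models some component that still holds a clone of the handle.
fn drop_ran_after_teardown(stray_ref: bool) -> bool {
    let dropped = Arc::new(AtomicBool::new(false));
    let hdl = Arc::new(VmmHdl { dropped: Arc::clone(&dropped) });

    // A long-lived holder (for example a device or task) that keeps a clone.
    let holder = if stray_ref { Some(Arc::clone(&hdl)) } else { None };

    // The owning VM object releases its reference during teardown.
    drop(hdl);

    let ran = dropped.load(Ordering::SeqCst);
    // Keep the stray clone alive until after the check.
    drop(holder);
    ran
}

fn main() {
    println!("no stray reference: drop ran = {}", drop_ran_after_teardown(false));
    println!("stray reference:    drop ran = {}", drop_ran_after_teardown(true));
}
```

If the real handle is shared this way, logging `Arc::strong_count` at teardown might be one way to confirm that extra references remain, though it would not by itself identify which component holds them.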

Expected: there is at least some way to convince Propolis to fully close the kernel VMM fd on VM destruction.


We've discussed this in the past and concluded that in at least some cases it's useful for the kernel VMM to outlive the Propolis instance that owns it so that the VMM can be inspected with tools like mdb -b. I would at least like to consider avoiding this for production builds, though, for reasons related to this Omicron issue comment: it's useful for sled-agent to be able to say "Propolis reported that it's in the Destroyed state, so all its reservoir memory is released," because this helps give it the ability to tell Nexus that a VMM is gone and then do long-running zone cleanup operations afterward.

Even absent that motivation, I'd at least like to understand exactly which paths aren't fully dropping their VmmHdl references so that we can adjust the behavior if/when we need to.

Labels: server (Related specifically to the Propolis server API and its VM management functions.)
