Un-deleteable disk from an import failure during online update #9196

@askfongjojo

I had a disk import running during an online update. The pantry used for the import was expunged and recreated. The import failed, which is expected, but the disk was left behind in the import_ready state, and I got a 500 error when I tried to finalize it.

The error in the Nexus log suggests that the finalize action needed to contact the expunged pantry instance:

20:57:16.862Z INFO 64e1f996-d688-4e66-801a-6d36c758e513 (dropshot_external): request completed
    error_message_external = Internal Server Error
    error_message_internal = saga ACTION error at node "call_pantry_detach_for_disk": deserialize failed: unknown variant `pantry detach failed: remote server is gone`, expected one of `ObjectNotFound`, `ObjectAlreadyExists`, `InvalidRequest`, `Unauthenticated`, `InvalidValue`, `Forbidden`, `InternalError`, `ServiceUnavailable`, `InsufficientCapacity`, `TypeVersionMismatch`, `Conflict`, `NotFound`, `Gone`
    file = /home/build/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/dropshot-0.16.4/src/server.rs:855
    latency_us = 470695
    local_addr = 172.30.2.7:443
    method = POST
    remote_addr = 172.20.17.42:60899
    req_id = 0987058e-7291-41fd-b386-e6d7bb7f9ddf
    response_code = 500
    uri = /v1/disks/alpine3-boot/finalize?project=angela

A disk that can be neither finalized nor deleted is a problem for users with automated provisioning/de-provisioning processes (e.g., Terraform) that keep reusing the same object names.

This issue is not specific to online update. Expunging a bad disk or sled that hosts a pantry in the middle of a disk import will cause the same problem.
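
To make the failure mode concrete, here is a minimal sketch of a detach step that separates "the pantry is permanently gone" from a transient failure, so that a finalize or delete flow is not blocked forever by an expunged pantry. Everything in it is hypothetical: `DetachError`, `StepOutcome`, and `detach_step` are illustrative names, not Nexus or saga APIs. Whether it is actually safe to treat a gone pantry as "nothing left to detach" depends on the disk's import state, which this sketch does not model.

```rust
/// Hypothetical ways a pantry detach call can fail (not a real Nexus type).
#[derive(Debug, Clone, PartialEq, Eq)]
enum DetachError {
    /// The pantry was expunged and will never answer again.
    PantryGone,
    /// A failure that may succeed if the call is retried.
    Transient(String),
}

/// Hypothetical outcome of the detach step.
#[derive(Debug, PartialEq, Eq)]
enum StepOutcome {
    /// The pantry acknowledged the detach.
    Detached,
    /// The pantry no longer exists, so there is nothing to detach from.
    NothingToDetach,
}

/// Calls `call_pantry` up to `max_attempts` times.
/// A permanently gone pantry is treated as success (assumption: the caller
/// can then move the disk out of the import state); transient errors are
/// retried and only surface as an error once the attempts run out.
fn detach_step<F>(mut call_pantry: F, max_attempts: u32) -> Result<StepOutcome, String>
where
    F: FnMut() -> Result<(), DetachError>,
{
    for attempt in 1..=max_attempts {
        match call_pantry() {
            Ok(()) => return Ok(StepOutcome::Detached),
            Err(DetachError::PantryGone) => return Ok(StepOutcome::NothingToDetach),
            Err(DetachError::Transient(msg)) if attempt == max_attempts => {
                return Err(format!(
                    "pantry detach failed after {} attempts: {}",
                    attempt, msg
                ));
            }
            Err(DetachError::Transient(_)) => continue,
        }
    }
    // Only reachable when max_attempts is 0.
    Err("max_attempts must be at least 1".to_string())
}
```

With a closure that always returns `Err(DetachError::PantryGone)`, `detach_step` returns `Ok(StepOutcome::NothingToDetach)` after a single call instead of failing the whole request.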
