oximeter server registration could be more resilient to failure (and asynchronous) #513

Open
@jordanhendricks

#511 is the easy fix for #497: instead of blocking in instance_ensure and retrying forever, registration will fail after a couple of retries.

The better, longer-term fix is to make registration of the server endpoint asynchronous, so that a transient failure to connect to the oximeter consumer does not permanently prevent that endpoint from serving metrics. This depends on some work on the oximeter side: oxidecomputer/omicron#3956.
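To make the idea concrete, here is a minimal sketch of what background registration with retries could look like, using only the Rust standard library. Everything named here is hypothetical: `register_in_background`, the `register` closure, and the `String` error type are illustration only and are not Propolis or oximeter APIs. A real implementation would presumably use the server's async runtime and the oximeter producer client rather than a plain thread.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Hypothetical helper (not a Propolis or oximeter API).
///
/// Runs `register` on a background thread, retrying with capped
/// exponential backoff until it succeeds. The caller gets control back
/// immediately, so something like instance_ensure would not block on it.
///
/// Returns a flag that becomes `true` once registration succeeds, and a
/// join handle that yields the number of attempts that were made.
pub fn register_in_background<F>(
    mut register: F,
    initial_delay: Duration,
    max_delay: Duration,
) -> (Arc<AtomicBool>, JoinHandle<u32>)
where
    F: FnMut() -> Result<(), String> + Send + 'static,
{
    let registered = Arc::new(AtomicBool::new(false));
    let flag = Arc::clone(&registered);

    let handle = thread::spawn(move || {
        let mut delay = initial_delay;
        let mut attempts = 0u32;
        loop {
            attempts += 1;
            match register() {
                Ok(()) => {
                    // Publish success so other parts of the server can check it.
                    flag.store(true, Ordering::SeqCst);
                    return attempts;
                }
                Err(_) => {
                    // Treat the failure as transient: wait, then try again,
                    // doubling the delay up to `max_delay`.
                    thread::sleep(delay);
                    delay = delay.saturating_mul(2).min(max_delay);
                }
            }
        }
    });

    (registered, handle)
}
```

A caller would pass a closure that performs one registration attempt, keep the returned flag around to report whether metrics are currently being collected, and carry on with instance setup. This sketch retries forever and has no cancellation; a real version would also need a way to stop the task when the instance is torn down.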

Labels

enhancement: New feature or request.
server: Related specifically to the Propolis server API and its VM management functions.
