
Add a metric to expose non-ready machines or errors #1550

@renchap


Use-case:

We are deploying machines on Hetzner, and sometimes it's not possible to create a machine because the account's resource limit has been reached:

 machine_controller.go:383] Failed to reconcile machine "xxx-m-1-68c6cd6957-6hk94": failed to create machine at cloudprovider, due to failed to create server, due to core limit exceeded (resource_limit_exceeded)    

It would be very useful to have a metric to monitor for this, so that we can alert when machines have been scheduled but were not successfully created.

Labels

kind/feature: Categorizes issue or PR as related to a new feature.