feat: Bump controller-pkg to expose Provider field #284

Callisto13 · 2023-05-31T13:38:48Z

This PR exposes the option to select a provider for a deployment or machine.

The default is set by the operator who configured flintlockd on the remote host and cannot be altered here.
The provider can be set in the spec:

 kind: MicrovmMachineTemplate
 apiVersion: infrastructure.cluster.x-k8s.io/v1alpha1
 spec:
   template:
     spec:
       provider: "cloudhypervisor"
...

Merging as-is would introduce a ~~bug~~ high chance of user error. Consider the following.

Scenario: A user sets provider: "cloudhypervisor" in their CAPMVM template. A host is chosen from the list, the request is made to flintlock, but it fails because the cloud-hypervisor-static binary is not present on the host. What do we do here?

Perhaps as part of liquidmetal-dev/flintlock#509 we can add information about available providers. CAPMVM can call this endpoint first (I believe we already do a check that the service is reachable), check the provider is available, and select another host if not. This would complicate the failureDomain selection process.
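
A rough sketch of what that pre-check could look like on the CAPMVM side. Everything here is hypothetical: the `Host` type, its `Providers` field, and `hostsWithProvider` are illustrative names, and it assumes flintlock would report which providers a host offers, which is not something it is confirmed to do today.

```go
package main

import (
	"errors"
	"fmt"
)

// Host is a hypothetical view of a flintlock host as seen by CAPMVM.
// Providers is assumed to come from a (not yet existing) flintlock endpoint.
type Host struct {
	Endpoint  string
	Providers []string
}

// ErrNoHostForProvider is returned when no candidate host offers the provider.
var ErrNoHostForProvider = errors.New("no host offers the requested provider")

// hostsWithProvider keeps only the hosts that report the requested provider.
// An empty provider means "use the host default", so every host qualifies.
func hostsWithProvider(hosts []Host, provider string) ([]Host, error) {
	if provider == "" {
		return hosts, nil
	}

	var candidates []Host
	for _, h := range hosts {
		for _, p := range h.Providers {
			if p == provider {
				candidates = append(candidates, h)
				break
			}
		}
	}

	if len(candidates) == 0 {
		return nil, fmt.Errorf("%w: %q", ErrNoHostForProvider, provider)
	}
	return candidates, nil
}

func main() {
	hosts := []Host{
		{Endpoint: "host-a:9090", Providers: []string{"firecracker"}},
		{Endpoint: "host-b:9090", Providers: []string{"firecracker", "cloudhypervisor"}},
	}

	candidates, err := hostsWithProvider(hosts, "cloudhypervisor")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, h := range candidates {
		fmt.Println("candidate:", h.Endpoint)
	}
}
```

The failureDomain selection would then run over the filtered list only, which is where the extra complexity comes in.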

A second option is to do this on the flintlock side: if the desired provider is not available, cycle through to the next one until all have been tried. We would need to feed this back to the CAPMVM end user so it is clear what has happened, for example in a clear log line or in the mvm spec (on both the flintlock and CAPMVM sides).

Or maybe this does not count as a bug? Flintlock already can be misconfigured or deliberately set up with constraints which make CAPMVM creations fail. Perhaps we need to improve documentation (operators need to be clear about what is available on hosts), and put more into both handling failures gracefully and making it easy to get relevant information about the target hosts.

Callisto13 · 2023-05-31T13:40:26Z

if merged, follow with liquidmetal-dev/site#14 after releasing

yitsushi · 2023-05-31T14:12:23Z

I would say it's a "user-error" kind of bug. Yes, we can make the UX better, but I wouldn't count it as a bug. (see my comment here: liquidmetal-dev/controller-pkg#3 (comment)).

If the reported error shows what the issue is, like "failed to provision microvm because provider binary not found: %provider%", I think it's totally fine. We can't say it's a bug if a user tries to run a Linux binary on a Mac.
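
As a minimal sketch of that kind of error, assuming a typed error is wanted so callers can detect it: `ProviderNotFoundError` and `checkProvider` are hypothetical names, not part of flintlock or CAPMVM, and the `available` map stands in for however the host learns which provider binaries it has.

```go
package main

import (
	"errors"
	"fmt"
)

// ProviderNotFoundError is a hypothetical error type naming the provider
// that was requested but is not usable on the chosen host.
type ProviderNotFoundError struct {
	Provider string
}

func (e *ProviderNotFoundError) Error() string {
	return fmt.Sprintf(
		"failed to provision microvm because provider binary not found: %s",
		e.Provider,
	)
}

// checkProvider fails when the requested provider is not available.
// It never falls back to a different provider. An empty request means
// "use the host default" and always passes.
func checkProvider(requested string, available map[string]bool) error {
	if requested == "" || available[requested] {
		return nil
	}
	return &ProviderNotFoundError{Provider: requested}
}

func main() {
	// Assumption for illustration: this host only has firecracker.
	available := map[string]bool{"firecracker": true}

	err := checkProvider("cloudhypervisor", available)

	var notFound *ProviderNotFoundError
	if errors.As(err, &notFound) {
		fmt.Println(err)
	}
}
```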

yitsushi · 2023-05-31T14:14:05Z

> cycle through to the next one until all have been tried

I don't think it would work:

- Different providers have different requirements, like kernel, base image, and I don't know what else.
- If a user requested X, we should fail when that is not possible rather than use Y. They requested X for a reason.

Callisto13 · 2023-05-31T14:35:22Z

> Different providers have different requirements, like kernel, base image, I don't know what else

oh yeeeeh forgot about all that

Callisto13 · 2023-05-31T14:43:27Z

> I would say it's a "user-error" kind of bug

Eh it is and isn't. A user cannot choose which host they get a node on. They could enable cloud-h on some hosts and not others, at which point it becomes annoying luck.

But I suppose operators could say "don't set the provider override at all"... idk this just feels like we are deliberately setting a trap.

yitsushi · 2023-06-02T02:02:35Z

> Eh it is and isn't. A user cannot choose which host they get a node on. They could enable cloud-h on some hosts and not others, at which point it becomes annoying luck.

I created an issue that could resolve this. I think we wanted to do it but never got there. It is also a place for further discussion.

#285

Callisto13 · 2023-06-02T10:31:31Z

We also have this problem right now if people use the latest flintlock:

- operator has 2 hosts with firecracker, 2 hosts with cloud-hypervisor
- capmvm user specifies all hosts in template
- capmvm user specifies either firecracker or ch kernels in mvm spec
- wrong host is chosen at random
- misery ensues

could do with that tagging story sooner rather than later
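
To put a number on the scenario above, here is a small sketch. It assumes hosts are picked uniformly at random, which is an assumption about the selection and not a statement of how CAPMVM chooses; `mismatchChance` is a hypothetical helper.

```go
package main

import "fmt"

// mismatchChance returns the fraction of hosts whose provider differs from
// the one the user's kernel and image were built for. Under a uniformly
// random pick this is the probability of landing on an unsuitable host.
func mismatchChance(hostProviders []string, wanted string) float64 {
	if len(hostProviders) == 0 {
		return 0
	}

	wrong := 0
	for _, p := range hostProviders {
		if p != wanted {
			wrong++
		}
	}
	return float64(wrong) / float64(len(hostProviders))
}

func main() {
	// Two firecracker hosts and two cloud-hypervisor hosts, as in the list above.
	hosts := []string{"firecracker", "firecracker", "cloudhypervisor", "cloudhypervisor"}

	fmt.Printf("%.2f\n", mismatchChance(hosts, "cloudhypervisor")) // prints 0.50
}
```

With that split, every machine has a one in two chance of failing, and the odds of a whole deployment coming up cleanly drop with each extra replica.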

yitsushi · 2023-06-05T08:26:48Z

> could do with that tagging story sooner rather than later

I can only agree 💯

feat: Bump controller-pkg to expose Provider field

6aaa1f3

Callisto13 requested a review from yitsushi May 31, 2023 13:38

Callisto13 added the kind/feature New feature or request label May 31, 2023

Callisto13 requested a review from richardcase May 31, 2023 13:40

yitsushi approved these changes May 31, 2023


richardcase approved these changes Jun 1, 2023


richardcase merged commit 29a61b9 into main Oct 11, 2023

richardcase deleted the provider branch October 11, 2023 13:50
