Local volume health monitoring #10

Open
msau42 opened this issue Dec 19, 2018 · 12 comments
Labels: kind/feature (Categorizes issue or PR as related to a new feature.) · lifecycle/frozen (Indicates that an issue or PR should not be auto-closed due to staleness.)

Comments

@msau42 (Contributor) commented Dec 19, 2018

Migrating from kubernetes-retired/external-storage#817

/kind feature

@k8s-ci-robot k8s-ci-robot added the kind/feature Categorizes issue or PR as related to a new feature. label Dec 19, 2018
@msau42 (Contributor, Author) commented Dec 19, 2018

To summarize, I think we can potentially split this up into these main areas:

  • A common way to report PV health
  • DaemonSet controller that monitors local disks per node
  • Cluster controller that monitors nodes that get deleted
  • Workload controller that reacts to PV health

@yanniszark commented:

Link to the effort of @NickrenREN so far for a local storage monitor:

I agree with the first three but I am unsure about the last one.
Some thoughts:

A common way to report PV health

  • How to expose: annotations can be used on the PV objects to indicate certain conditions, like the mount point disappearing (a minimal sketch of this approach follows the list). The local storage design doc proposes adding taints to PV objects so that Pods can avoid binding to specific PVs. (A new PV taint will be introduced to handle unhealthy volumes. The addon or another external entity can monitor the volumes and add a taint when it detects that a volume is unhealthy.)
  • What to include: I can think of two generic conditions at the moment, corresponding to different failure scenarios:
    1. Availability: indicates whether the disk is available for operations. If the mount point disappears, the PV is not available. If the node that the local PV is defined on is deleted, the PV is not available (maybe this should be split into a separate condition?). Each storage provider can also determine whether a PV it provisioned is available and update this condition.
    2. Health: indicates an estimate of the health of the disk based on metrics (e.g. SMART).
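To make the annotation idea concrete, here is a minimal Go sketch, assuming a context-aware client-go (v0.18+). The annotation keys and the PV name are purely illustrative, not an existing convention:

```go
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

// Hypothetical annotation keys; not part of any existing API.
const (
	annAvailable = "storage.example.com/available"
	annHealth    = "storage.example.com/health"
)

// markPVUnavailable records a hypothetical availability condition on a PV
// object as annotations, which is one way a monitor could surface health.
func markPVUnavailable(cs kubernetes.Interface, pvName, reason string) error {
	ctx := context.TODO()
	pv, err := cs.CoreV1().PersistentVolumes().Get(ctx, pvName, metav1.GetOptions{})
	if err != nil {
		return err
	}
	if pv.Annotations == nil {
		pv.Annotations = map[string]string{}
	}
	pv.Annotations[annAvailable] = "false"
	pv.Annotations[annHealth] = reason
	_, err = cs.CoreV1().PersistentVolumes().Update(ctx, pv, metav1.UpdateOptions{})
	return err
}

func main() {
	config, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	cs, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}
	// "local-pv-abc123" is a made-up PV name for the example.
	if err := markPVUnavailable(cs, "local-pv-abc123", "mount point disappeared"); err != nil {
		fmt.Println("failed to annotate PV:", err)
	}
}
```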

A problem here is that in order to take action on an unhealthy PV, you need to manually bind PVs and PVCs. If a PV is reported to be unhealthy and you want to prevent a Pod from using it, there isn't anything you can do other than manually binding the PVC (sketched below). A mechanism like PV taints, as proposed in the local storage document, would be very helpful here. Please correct me if there's something I'm missing.
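For reference, manual pre-binding today means setting spec.volumeName on the claim so it can only bind to a PV you have vetted. A minimal sketch with k8s.io/api types (assuming a pre-1.29 k8s.io/api where PersistentVolumeClaimSpec.Resources is a ResourceRequirements; the names, class, and size are made up):

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// newPreBoundPVC builds a claim that is manually pre-bound to a specific PV by
// setting spec.volumeName, which is the only way today to steer a workload
// away from (or onto) a particular local volume.
func newPreBoundPVC(namespace, name, pvName, storageClass string) *corev1.PersistentVolumeClaim {
	return &corev1.PersistentVolumeClaim{
		ObjectMeta: metav1.ObjectMeta{Name: name, Namespace: namespace},
		Spec: corev1.PersistentVolumeClaimSpec{
			AccessModes:      []corev1.PersistentVolumeAccessMode{corev1.ReadWriteOnce},
			StorageClassName: &storageClass,
			// Pre-bind to a known-healthy PV; an unhealthy PV is simply never named here.
			VolumeName: pvName,
			Resources: corev1.ResourceRequirements{
				Requests: corev1.ResourceList{
					corev1.ResourceStorage: resource.MustParse("100Gi"),
				},
			},
		},
	}
}

func main() {
	pvc := newPreBoundPVC("default", "data-myapp-0", "local-pv-abc123", "local-storage")
	fmt.Printf("claim %s pre-bound to PV %s\n", pvc.Name, pvc.Spec.VolumeName)
}
```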

DaemonSet controller that monitors local disks per node

At first, the DaemonSet can watch the mount points and collect SMART data, then make the above conditions available through annotations on PV objects. We should be careful to avoid scenarios like node repair on GKE, where a failed node comes back with the same name and disks mounted in the same places, but without any data. When that happens, the PVs will still work but now point to empty volumes. Instead, the PVs should not work, as the underlying disks are essentially different; they are just mounted at the same points. To prevent this type of scenario, the DaemonSet could create symlinks for each discovered directory, include the filesystem UUID in the symlink name, and pass the path to the symlink in the PV object (a rough sketch follows).
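A rough sketch of the UUID-symlink idea, using only the Go standard library and /dev/disk/by-uuid. The device, mount point, and discovery directory are made up, and a real DaemonSet would need the host's /dev and mount tree available inside the container:

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// linkByUUID creates a symlink whose name embeds the filesystem UUID, so that
// a PV keeps pointing at the same filesystem even if the node is rebuilt and a
// fresh, empty disk ends up mounted at the old path.
func linkByUUID(device, mountPoint, discoveryDir string) (string, error) {
	byUUID := "/dev/disk/by-uuid"
	entries, err := os.ReadDir(byUUID)
	if err != nil {
		return "", err
	}
	for _, e := range entries {
		target, err := filepath.EvalSymlinks(filepath.Join(byUUID, e.Name()))
		if err != nil {
			continue
		}
		if target != device {
			continue
		}
		// e.Name() is the filesystem UUID; embed it in the symlink name.
		link := filepath.Join(discoveryDir, e.Name())
		if err := os.Symlink(mountPoint, link); err != nil && !os.IsExist(err) {
			return "", err
		}
		return link, nil
	}
	return "", fmt.Errorf("no filesystem UUID found for %s", device)
}

func main() {
	// Hypothetical example: /dev/sdb1 mounted at /mnt/disks/ssd0; the
	// provisioner would then be pointed at /mnt/disks-by-uuid instead.
	link, err := linkByUUID("/dev/sdb1", "/mnt/disks/ssd0", "/mnt/disks-by-uuid")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("created", link)
}
```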

Cluster controller that monitors nodes that get deleted

We should be careful to check all PVs on startup of the controller. If a Node is deleted before the controller is started, then we need to clean up the PVs belonging to that Node. If a Node is deleted and at the same time this controller crashes, a subsequent list+watch will not return the deleted Node and we won't be notified that something happened (a startup reconciliation sketch follows).
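A sketch of that startup pass with client-go (again assuming a context-aware client-go). It only reports orphaned local PVs; what to actually do with them (annotate, taint, delete) is intentionally left open, as in the discussion above:

```go
package main

import (
	"context"
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

// nodeNameFromPV pulls the hostname out of a local PV's required node
// affinity, which is where the static provisioner records the owning node.
func nodeNameFromPV(pv *corev1.PersistentVolume) string {
	if pv.Spec.NodeAffinity == nil || pv.Spec.NodeAffinity.Required == nil {
		return ""
	}
	for _, term := range pv.Spec.NodeAffinity.Required.NodeSelectorTerms {
		for _, expr := range term.MatchExpressions {
			if expr.Key == "kubernetes.io/hostname" && len(expr.Values) > 0 {
				return expr.Values[0]
			}
		}
	}
	return ""
}

// reconcileOrphanedPVs is the startup pass: list every Node and every local
// PV, and report PVs whose Node no longer exists.
func reconcileOrphanedPVs(ctx context.Context, cs kubernetes.Interface) error {
	nodes, err := cs.CoreV1().Nodes().List(ctx, metav1.ListOptions{})
	if err != nil {
		return err
	}
	existing := map[string]bool{}
	for _, n := range nodes.Items {
		existing[n.Name] = true
	}

	pvs, err := cs.CoreV1().PersistentVolumes().List(ctx, metav1.ListOptions{})
	if err != nil {
		return err
	}
	for _, pv := range pvs.Items {
		if pv.Spec.Local == nil {
			continue // only local PVs are of interest here
		}
		if node := nodeNameFromPV(&pv); node != "" && !existing[node] {
			fmt.Printf("PV %s references deleted node %s\n", pv.Name, node)
		}
	}
	return nil
}

func main() {
	config, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	cs, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}
	if err := reconcileOrphanedPVs(context.Background(), cs); err != nil {
		panic(err)
	}
}
```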

Workload controller that reacts to PV health

As I mentioned, I don't see any real way of reacting. Deleting the PV doesn't work, because the local provisioner will recreate it. Recreating the PVC and the Pod using it will not work either, because the PVC could still end up bound to that PV. Is there anything we can do here? The local volume doc recommends introducing taints for PVs.

@msau42 (Contributor, Author) commented Jan 7, 2019

cc @gnufied
Detecting disks properly is something that the local storage operator may be addressing

@fejta-bot commented:

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 28, 2019
@cofyc (Member) commented Apr 28, 2019

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 28, 2019
@NickrenREN (Contributor) commented Apr 29, 2019

Thanks for removing the stale label @cofyc, and thanks for your comment @yanniszark.
I agree with most of yanniszark's points. I am updating the PV monitor design doc; here is the link: https://docs.google.com/document/d/1WG51DZZeXyP50EKyhECYg5m5KEB4F0AhYBhA555_Gs4/edit.

We only focus on the monitoring mechanism in the first stage; reaction is not in the scope of that doc.
Comments are welcome, thanks.

Will submit a new PR for storage monitor later.

@NickrenREN (Contributor) commented:

PR submitted, comments are welcome, thanks: kubernetes/enhancements#1077

@fejta-bot commented:

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 28, 2019
@NickrenREN (Contributor) commented:

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 30, 2019
@fejta-bot commented:

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 28, 2019
@msau42 (Contributor, Author) commented Dec 24, 2019

/lifecycle frozen

@k8s-ci-robot k8s-ci-robot added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Dec 24, 2019
jsafrane pushed a commit to jsafrane/sig-storage-local-static-provisioner that referenced this issue Apr 14, 2020
@arianvp (Contributor) commented Apr 12, 2021

Now that the required logic in kubelet seems to have landed in 1.21, are there any concrete plans to add this to the static provisioner? https://kubernetes.io/blog/2021/04/08/kubernetes-1-21-release-announcement/#persistentvolume-health-monitor
