Skip to content

Conversation

qJkee
Copy link
Contributor

@qJkee qJkee commented Sep 25, 2025

This is a POC of device deletion for LVMS operator.
Currently we do allow to user to delete required devices and optional devices from the CR, but there are few limitations:

  1. You can't delete all devices from device class, at least one required or optional device must present in CR
  2. If device is no longer available on the node, user must fix LVM's VG on the node first and then modify the CR to reflect current state of the VG on the node. The reason for that is because we can't remove exact missing device, only all of them if there are more than one device is missing
  3. You can't delete device which has data on it. If LVMS detects that there are data on the disk, it will ask user to do pvmove command manually on the node to remove all the data from the disk in VG

The standard user workflow might be defined in 2 ways.

  1. If the disk is lost, fix the VG on the node first and then delete disk from the CR
  2. Delete the disk from CR, but make sure that there are no data on this disk, otherwise use pvmove command on the host to remove any data leftowers

@openshift-ci-robot
Copy link

openshift-ci-robot commented Sep 25, 2025

@qJkee: This pull request references OCPEDGE-2109 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set.

In response to this:

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Sep 25, 2025
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Sep 25, 2025
@openshift-ci openshift-ci bot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Sep 25, 2025
Copy link
Contributor

openshift-ci bot commented Sep 25, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: qJkee

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 25, 2025
@openshift-ci-robot
Copy link

openshift-ci-robot commented Sep 30, 2025

@qJkee: This pull request references OCPEDGE-2109 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set.

In response to this:

This is a POC of device deletion for LVMS operator.
Currently we do allow to user to delete required devices and optional devices from the CR, but there are few limitations:

  1. You can't delete all devices from device class, at least one required or optional device must present in CR
  2. If device is no longer available on the node, user must fix LVM's VG on the node first and then modify the CR to reflect current state of the VG on the node. The reason for that is because we can't remove exact missing device, only all of them if there are more than one device is missing
  3. You can't delete device which has data on it. If LVMS detects that there are data on the disk, it will ask user to do pvmove command manually on the node to remove all the data from the disk in VG

The standard user workflow might be defined in 2 ways.

  1. If the disk is lost, fix the VG on the node first and then delete disk from the CR
  2. Delete the disk from CR, but make sure that there are no data on this disk, otherwise use pvmove command on the host to remove any data leftowers

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@codecov-commenter
Copy link

codecov-commenter commented Oct 1, 2025

Codecov Report

❌ Patch coverage is 40.00000% with 54 lines in your changes missing coverage. Please review.
✅ Project coverage is 48.78%. Comparing base (44c57f7) to head (5c736d5).
⚠️ Report is 71 commits behind head on main.

Files with missing lines Patch % Lines
internal/controllers/vgmanager/controller.go 22.50% 25 Missing and 6 partials ⚠️
internal/controllers/vgmanager/lvm/lvm.go 46.15% 14 Missing ⚠️
...nal/controllers/vgmanager/device_mapping_helper.go 63.15% 4 Missing and 3 partials ⚠️
api/v1alpha1/lvmcluster_webhook.go 60.00% 1 Missing and 1 partial ⚠️
Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #1380      +/-   ##
==========================================
- Coverage   49.62%   48.78%   -0.85%     
==========================================
  Files          51       52       +1     
  Lines        3609     3663      +54     
==========================================
- Hits         1791     1787       -4     
- Misses       1680     1726      +46     
- Partials      138      150      +12     
Files with missing lines Coverage Δ
api/v1alpha1/lvmcluster_webhook.go 78.14% <60.00%> (-0.93%) ⬇️
...nal/controllers/vgmanager/device_mapping_helper.go 63.15% <63.15%> (ø)
internal/controllers/vgmanager/lvm/lvm.go 87.19% <46.15%> (-5.58%) ⬇️
internal/controllers/vgmanager/controller.go 66.73% <22.50%> (-4.16%) ⬇️

... and 1 file with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@qJkee
Copy link
Contributor Author

qJkee commented Oct 2, 2025

/hold for reviews

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 2, 2025
@jeff-roche
Copy link
Contributor

Really solid code here, I just had a few nit-picky things

@openshift-ci openshift-ci bot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Oct 14, 2025
@qJkee
Copy link
Contributor Author

qJkee commented Oct 14, 2025

/retest

@qJkee qJkee changed the title [WIP] OCPEDGE-2109: POC device deletion OCPEDGE-2109: POC device deletion Oct 15, 2025
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 15, 2025
@qJkee qJkee changed the title OCPEDGE-2109: POC device deletion [WIP] OCPEDGE-2109: POC device deletion Oct 15, 2025
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 15, 2025
@openshift-ci openshift-ci bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Oct 20, 2025
@qJkee qJkee changed the title [WIP] OCPEDGE-2109: POC device deletion OCPEDGE-2109: POC device deletion Oct 20, 2025
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 20, 2025
@qJkee
Copy link
Contributor Author

qJkee commented Oct 20, 2025

/retest

Allow removing devices from required and optional paths
from LVMCluster object
Copy link
Contributor

openshift-ci bot commented Oct 20, 2025

@qJkee: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-aws-hypershift 5c736d5 link true /test e2e-aws-hypershift
ci/prow/e2e-aws 5c736d5 link true /test e2e-aws
ci/prow/e2e-aws-single-node 5c736d5 link true /test e2e-aws-single-node

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@qJkee qJkee changed the title OCPEDGE-2109: POC device deletion OCPEDGE-2109: Allow device deletion on day2 Oct 20, 2025
@qJkee
Copy link
Contributor Author

qJkee commented Oct 20, 2025

/retest

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants