
Conversation

@mrnold
Contributor

@mrnold mrnold commented Jun 6, 2025

What this PR does / why we need it:
This pull request changes the VDDK importer to use libnbd's asynchronous API. This should give a general performance boost, and it should help address the recommended VDDK performance tuning options (referencing CNV-52722).

Which issue(s) this PR fixes:
Fixes CNV-57949

Special notes for your reviewer:
Work in progress, as I am not 100% sure I am doing this right. I get good copies, but not much improvement in transfer speed when run in a real importer pod:

  • When I run equivalent code in a standalone tool, I see around a 70MB/s speedup.
  • When I run that tool in a pod I create by hand, I see a similar speedup only if I bump up the pod's CPU limit.
  • When I run the actual importer pod, I see no speedup compared to current CDI, even if I bump up the CPU limit.

Any suggestions?
CC @rwmjones @ebblake

Release note:

VDDK: use AIO libnbd API for block status/pread.

@kubevirt-bot
Contributor

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@kubevirt-bot kubevirt-bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. dco-signoff: yes Indicates the PR's author has DCO signed all their commits. labels Jun 6, 2025
@kubevirt-bot kubevirt-bot requested review from akalenyu and mhenriks June 6, 2025 21:21
@kubevirt-bot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign akalenyu for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@rwmjones
Contributor

rwmjones commented Jun 7, 2025

I think you should be using the async API, as internally it is much more efficient. The synchronous API is really only for convenience and ease of programming.
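For reference, the asynchronous pattern being discussed looks roughly like this with the libnbd Go bindings. This is a minimal sketch, not the PR's actual code: the socket path, requestSize, maxInFlight, and the consume helper are all illustrative, and real code must cope with completions arriving out of order when writing to the destination.

```go
// Sketch of an AIO read loop with libnbd's Go bindings.
// Requires the libnbd C library and `go get libnbd.org/libnbd`.
package main

import (
	"fmt"
	"os"

	"libnbd.org/libnbd"
)

const (
	requestSize = 4 * 1024 * 1024 // bytes per AioPread; larger buffers reportedly helped
	maxInFlight = 8               // cap on concurrent requests
)

func main() {
	h, err := libnbd.Create()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	defer h.Close()

	// Socket path matches the importer's nbdkit command line (illustrative).
	if err := h.ConnectUri("nbd+unix:///?socket=/tmp/nbd.sock"); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}

	size, err := h.GetSize()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}

	var offset uint64
	for {
		inflight, err := h.AioInFlight()
		if err != nil {
			fmt.Fprintln(os.Stderr, err)
			os.Exit(1)
		}
		if offset >= size && inflight == 0 {
			break // all requests issued and completed
		}

		// Keep the queue full: issue reads until the in-flight cap is hit.
		for inflight < maxInFlight && offset < size {
			length := uint64(requestSize)
			if size-offset < length {
				length = size - offset
			}
			buf := libnbd.MakeAioBuffer(uint(length))
			cur := offset
			var optargs libnbd.AioPreadOptargs
			optargs.CompletionCallbackSet = true
			optargs.CompletionCallback = func(status *int) int {
				if *status == 0 {
					consume(cur, buf.Bytes()) // hand data to the writer
				}
				buf.Free()
				return 1 // auto-retire this command
			}
			if _, err := h.AioPread(buf, cur, &optargs); err != nil {
				fmt.Fprintln(os.Stderr, err)
				os.Exit(1)
			}
			offset += length
			inflight++
		}

		// One poll iteration; completion callbacks fire from here.
		if _, err := h.Poll(-1); err != nil {
			fmt.Fprintln(os.Stderr, err)
			os.Exit(1)
		}
	}
}

// consume is a hypothetical stand-in for whatever writes each block to the
// target; completions are not ordered, hence the offset argument.
func consume(offset uint64, data []byte) {}
```

The key difference from the synchronous Pread loop is that the handle always has up to maxInFlight requests queued, so network and VDDK latency overlap instead of serializing.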

I guess this is also relevant here: https://gitlab.com/nbdkit/nbdkit/-/commit/5a882e74cae3dbaa09bf3b942a02f9947b12f6e5

NB: This only works if the VDDK plugin is opened readonly (meaning you're using nbdkit -r option or applying nbdkit-cow-filter on top). Also there are numerous footguns with VDDK, even with the patch above. If you can share the complete nbdkit command line then I can comment further.

@mrnold
Contributor Author

mrnold commented Jun 9, 2025

Sure, here is an nbdkit command line from an importer pod:

--foreground --readonly --exit-with-parent -U /tmp/nbd.sock --pidfile /tmp/nbd.pid --filter=retry --filter=cacheextents vddk libdir=/opt/vmware-vix-disklib-distrib server=vcenter.local.lan user=[email protected] password=+/tmp/password2388921325 thumbprint=35:91:93:D6:74:31:C8:02:3D:27:C5:A6:3A:89:84:55:76:93:72:03 vm=moref=vm-15 --verbose -D nbdkit.backend.datapath=0 -D vddk.datapath=0 -D vddk.stats=1 file=[datastore2] fedora/fedora-000002.vmdk

@rwmjones
Contributor

rwmjones commented Jun 9, 2025

Get rid of --filter=cacheextents. It didn't work well in most common cases, and we deprecated it: https://libguestfs.org/nbdkit-cacheextents-filter.1.html It's quite likely it's doing nothing good for you, and might even have a negative effect.

Use the -r (readonly) flag, except in the unlikely case you want to write to the disk on VMware. Not only is this safer, but it also enables the pre-caching optimization described below.

You might need to add --filter=blocksize minblock=512 (last in the list of filters) to avoid an annoying corner case / crash when fetching extents if you have a 4G or larger hole. For more details see: https://gitlab.com/nbdkit/libnbd/-/issues/12
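Putting these suggestions together, the importer's invocation might end up looking like the sketch below. Credentials, thumbprint, and the password file are placeholders standing in for the real values from the command line shared above; cacheextents is dropped and blocksize is listed last so it sits closest to the plugin.

```shell
nbdkit --foreground --readonly --exit-with-parent \
    -U /tmp/nbd.sock --pidfile /tmp/nbd.pid \
    --filter=retry --filter=blocksize \
    vddk libdir=/opt/vmware-vix-disklib-distrib \
    server=vcenter.local.lan user=administrator password=+/tmp/password \
    thumbprint=... vm=moref=vm-15 \
    'file=[datastore2] fedora/fedora-000002.vmdk' \
    minblock=512
```

nbdkit applies filters in the order given, so requests pass through retry, then blocksize, before reaching the vddk plugin.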

Is the backing storage on VMware thin provisioned? Just because it says "Thin Provisioned" in the UI unfortunately does not mean that it is. VMware will happily let you select TP with an NFS v3 backing store that cannot possibly implement it. The only way to find out for sure is to scan the guest (eg. nbdinfo --map $uri) to see if it has holes.
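For example, against the importer's Unix socket from the command line above:

```shell
# "hole"/"zero" extents in the output confirm the backing store really is sparse;
# a map that is one solid allocated extent means thin provisioning is not working.
nbdinfo --map 'nbd+unix:///?socket=/tmp/nbd.sock'
```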

Try out upstream nbdkit. This has a number of enhancements, but the main highlights are:

https://gitlab.com/nbdkit/nbdkit/-/commit/5a882e74cae3dbaa09bf3b942a02f9947b12f6e5 (pre-caching optimization)
https://gitlab.com/nbdkit/nbdkit/-/commit/a224d7fd8d839287c63cb4a063569283dd974b48
https://gitlab.com/nbdkit/nbdkit/-/commit/2db1ede27bb529b36b0075eab337f0c585d1a7ec

Upstream nbdkit-vddk-plugin is also able to multiplex VDDK disks through a single nbdkit instance, by using the NBD export to select which disk you want to read:

https://gitlab.com/nbdkit/nbdkit/-/commit/18866d92d1a774f4241afc754266c734b9828180

(Whether this is beneficial to performance is questionable; it probably has no effect. But it may be more convenient and/or use less memory to use a single instance.)

If after testing with the enhanced stats above, you find that the time taken opening and closing the VDDK session is a problem (you'll need to look at the Open, GetInfo and Close numbers under -D vddk.stats=1 output), then you might investigate this new filter. I haven't tried it much yet myself:

https://gitlab.com/nbdkit/nbdkit/-/commit/b45b9e584687a83823f0ae6c04a1d24fb0261acb

(Some of these are in RHEL 10.1's nbdkit.)

Try out upstream libnbd too, as it has these enhancements that may be relevant:

https://gitlab.com/nbdkit/libnbd/-/commit/84e57710d095d058ab06aea6db1c897aaccf0c02
https://gitlab.com/nbdkit/libnbd/-/commit/89b3f0673f2e9d8de090c5e1228b36be6ca28997
https://gitlab.com/nbdkit/libnbd/-/commit/65d412c6846452127fa62969c5d062edea72e5f2

@mrnold
Contributor Author

mrnold commented Jun 10, 2025

Thanks! The readonly flag is set, and removing cacheextents didn't seem to make much of a difference. I think the VMware thin provisioning is working fine: it is definitely not on NFS, and I see holes in the nbdinfo map matching what CDI gets. I had some success increasing the size of the buffer passed to AioPread, but I still needed to increase the pod CPU limit. I'm not sure if running nbdkit and the importer together is more work than the default limit allows, or if I have just written super inefficient code.

Is that libnbd change available in a tagged golang release somewhere? The last version I can go get is 1.20.0; is this still the right place to look for it?

@rwmjones
Contributor

I do need to update that site. However the changes are all in the C library so I don't think the golang bindings need to be updated (as long as the resulting binary dynamically links to /usr/lib64/libnbd.so.0.0.0). The most important change is the precaching one in nbdkit anyway.
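If in doubt, it is easy to confirm which shared library the importer binary actually loads at runtime. The binary name below is hypothetical:

```shell
# The Go bindings call through cgo into the C library, so the effective libnbd
# version is whatever the dynamic linker resolves, not the Go module version.
ldd ./cdi-importer | grep libnbd
ls -l /usr/lib64/libnbd.so.0*
```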

@rwmjones
Contributor

I believe I've updated the golang bindings website to libnbd 1.22.2 now.

mrnold added 3 commits June 25, 2025 10:36
Having it global caused issues with unit tests.

Signed-off-by: Matthew Arnold <[email protected]>
@kubevirt-bot
Contributor

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@kubevirt-bot kubevirt-bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 29, 2025
@kubevirt-bot
Contributor

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

/lifecycle rotten

@kubevirt-bot kubevirt-bot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Oct 29, 2025
Labels

dco-signoff: yes Indicates the PR's author has DCO signed all their commits. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XL
