Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Delete async feature - this crate is not actually async-safe #138

Merged
merged 1 commit into from
May 18, 2023

Conversation

dralley
Copy link
Collaborator

@dralley dralley commented May 14, 2023

I have had suspicions that this library may not be a good fit for async use, and have finally gotten around to checking whether they are accurate.

I wrote some executables to test

  • building packages
  • signing packages
  • parsing packages
  • verifying digests / signatures

Since async runtimes are all about latency and non-blocking cooperative multitasking, I tried to find the most realistic worst-case scenarios. I wrote a script that printed a list of the top 20 largest packages in Fedora and downloaded a few of those (ranging from 100mb to 2.9gb). I also tested normal sized packages like zlib and sqlite but while the magnitude was smaller the relative pattern was basically the same with those. For the "build" tests I extracted the contents of those packages and then used rpm-rs to recreate the packages using the pre-extracted files.

I used the time tool on my executables to print out how much time was spent waiting on IO (sys time) vs running computations (user time). Everything was compiled in release mode, and I ran the executable directly (not through cargo) to avoid that overhead. This was all using blocking IO for simplicity but it still tells us about the relevant information to judge async suitability.

To summarize my findings:

  • Parsing the packages was pretty fast in most cases and was an ok-ish application of async, because 90% of the time was spent in IO. However:
  • If you then additionally verify the digests or signatures, then it flips the other way around and the workload becomes roughly 90% computation. Computing all the digests was much more expensive than the IO was and the program would computationally block for 0.5 - 16 seconds
  • Signature verification was even more expensive, it would computationally block for 1 - 27 seconds
  • Building the packages, even using an scenario favorable to async by not using any compression (minimizing computation time and maximizing IO time), took 10-120 seconds and was nearly all computation.

Conclusion:

This is a huge footgun-in-waiting. Most operations with rpm-rs are just too CPU-heavy to be run in an async runtime without the use of spawn_blocking() as described by the tokio documentation https://dtantsur.github.io/rust-openstack/tokio/index.html#cpu-bound-tasks-and-blocking-code. Async isn't necessarily pointless in the parsing case, but as spawn_blocking() needs to be used for everything else anyway, we ought to just reduce complexity and be consistent about what we suggest users to do.

📜 Checklist

  • Commits are cleanly separated and have useful messages
  • A changelog entry or entries has been added to CHANGELOG.md
  • Documentation is thorough
  • Test coverage is excellent and passes
  • Works when tests are run --all-features enabled

@dralley dralley requested review from drahnr and cmeister2 May 14, 2023 03:41
@dralley dralley marked this pull request as ready for review May 14, 2023 03:42
@dralley dralley changed the title Delete async feature - this crate is not async-safe Delete async feature - this crate is not actually async-safe May 14, 2023
@dralley
Copy link
Collaborator Author

dralley commented May 14, 2023

Even just taking a sha256 checksum of a large file using the sha2 crate takes multiple seconds if your CPU doesn't have hardware acceleration for it, which mine does not. Most Intel CPUs older than 2021 don't have it.

Copy link
Contributor

@ikrivosheev ikrivosheev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dralley it's cool! It's simplify my work on: #100 I like it))

@dralley dralley force-pushed the delete-async branch 4 times, most recently from 4d2715e to 14d7ce7 Compare May 17, 2023 04:22
The operations this crate performs are sufficiently computationally
expensive in the worst case that it is not at all async-safe.
@dralley
Copy link
Collaborator Author

dralley commented May 17, 2023

@cmeister2 @drahnr Any thoughts or objections?

@cmeister2
Copy link
Collaborator

cmeister2 commented May 17, 2023 via email

@drahnr
Copy link
Contributor

drahnr commented May 18, 2023

I see two paths:

  1. Delete async handling

  2. Allow passing a pool / spawner to any blocking async ops where internally the blocking calls are pushed to that

  3. is convenient for us, 2. would be for the user

Now if we don't want/can't provide 2., 1. is the same consequence

@dralley
Copy link
Collaborator Author

dralley commented May 18, 2023

Allow passing a pool / spawner to any blocking async ops where internally the blocking calls are pushed to that

Is it really more user friendly though? It feels like doing both too much and not enough, while making the documentation twice as long and doubling the number of functions. The user would still need to create and manage their own threadpool, and I don't think there's a generic trait for that, so we'd have to accept a concrete type like rayon::ThreadPool, which then becomes part of our API and a new dependency... Idk.

I still haven't heard of a compelling use case for wanting to use this library with async in the first place. Async is primarily for massive concurrency with non-CPU bound, non-memory bound workloads, while avoiding the overhead of thousands of threads, which is primarily relevant in the webserver space, but I can't imagine generating or reading RPMs within a webserver context is going to be a large use case.

@drahnr
Copy link
Contributor

drahnr commented May 18, 2023

Just from a capacity perspective, I am for removal

@dralley dralley merged commit 5c1bf15 into rpm-rs:master May 18, 2023
@dralley dralley deleted the delete-async branch May 18, 2023 19:59
@ikrivosheev ikrivosheev mentioned this pull request May 23, 2023
5 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants