Delete async feature - this crate is not actually async-safe #138
Even just taking a sha256 checksum of a large file using the sha2 crate takes multiple seconds if your CPU doesn't have hardware acceleration for it, which mine does not. Most Intel CPUs older than 2021 don't have it. |
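A rough way to see this kind of cost on a given machine is to time a hash pass over a large in-memory buffer. The sketch below is std-only and uses a simple FNV-1a loop as a stand-in for SHA-256 (an assumption made purely to avoid external crates; it is not what the sha2 crate does). The function names and the 64 MiB buffer size are illustrative, not taken from this project.

```rust
use std::time::{Duration, Instant};

/// FNV-1a 64-bit hash: a cheap stand-in for SHA-256, used only to
/// keep this sketch free of external crates.
fn fnv1a64(data: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325; // FNV offset basis
    for &byte in data {
        hash ^= u64::from(byte);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3); // FNV prime
    }
    hash
}

/// Hash `data` and report how long the CPU-bound pass took.
fn timed_checksum(data: &[u8]) -> (u64, Duration) {
    let start = Instant::now();
    let digest = fnv1a64(data);
    (digest, start.elapsed())
}

fn main() {
    // 64 MiB of synthetic data standing in for a large package payload.
    let data = vec![0xABu8; 64 * 1024 * 1024];
    let (digest, elapsed) = timed_checksum(&data);
    println!("digest {digest:016x} computed in {elapsed:?}");
}
```

Whatever duration this prints is time during which the calling thread does nothing else, which is the property that matters for async suitability.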
The operations this crate performs are sufficiently computationally expensive in the worst case that it is not at all async-safe.
@cmeister2 @drahnr Any thoughts or objections? |
I'm sad about it but I agree with your reasoning.
On Wed, 17 May 2023, 20:18 Daniel Alley wrote:
> @cmeister2 @drahnr Any thoughts or objections?
|
I see two paths:
Now if we don't want/can't provide 2., 1. is the same consequence |
Is it really more user friendly though? It feels like doing both too much and not enough, while making the documentation twice as long and doubling the number of functions. The user would still need to create and manage their own threadpool, and I don't think there's a generic trait for that, so we'd have to accept a concrete type like

I still haven't heard of a compelling use case for wanting to use this library with async in the first place. Async is primarily for massive concurrency with non-CPU bound, non-memory bound workloads, while avoiding the overhead of thousands of threads, which is primarily relevant in the webserver space, but I can't imagine generating or reading RPMs within a webserver context is going to be a large use case. |
Just from a capacity perspective, I am for removal |
I have had suspicions that this library may not be a good fit for async use, and have finally gotten around to checking whether they are accurate.
I wrote some executables to test
Since async runtimes are all about latency and non-blocking cooperative multitasking, I tried to find the most realistic worst-case scenarios. I wrote a script that printed a list of the top 20 largest packages in Fedora and downloaded a few of those (ranging from 100 MB to 2.9 GB). I also tested normal-sized packages like zlib and sqlite, but while the magnitude was smaller, the relative pattern was basically the same with those. For the "build" tests I extracted the contents of those packages and then used rpm-rs to recreate the packages using the pre-extracted files.

I used the time tool on my executables to print out how much time was spent waiting on IO (sys time) vs running computations (user time). Everything was compiled in release mode, and I ran the executable directly (not through cargo) to avoid that overhead. This was all using blocking IO for simplicity, but it still gives us the information we need to judge async suitability.

To summarize my findings:
Conclusion:
This is a huge footgun-in-waiting. Most operations with rpm-rs are just too CPU-heavy to be run in an async runtime without the use of spawn_blocking(), as described by the tokio documentation (https://dtantsur.github.io/rust-openstack/tokio/index.html#cpu-bound-tasks-and-blocking-code). Async isn't necessarily pointless in the parsing case, but as spawn_blocking() needs to be used for everything else anyway, we ought to just reduce complexity and be consistent about what we suggest users do.

📜 Checklist
--all-features enabled
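
For readers unfamiliar with the pattern referred to in the conclusion, here is a minimal std-only sketch of moving CPU-heavy work off the calling thread. It is only an analogy: in tokio the equivalent call would be tokio::task::spawn_blocking, which is not used here so that the example has no dependencies. The names run_off_thread and expensive_checksum are hypothetical and the checksum is a placeholder, not anything rpm-rs does.

```rust
use std::sync::mpsc;
use std::thread;

/// Placeholder for a CPU-heavy step such as building or verifying a package.
fn expensive_checksum(data: &[u8]) -> u64 {
    data.iter()
        .fold(0u64, |acc, &b| acc.wrapping_mul(31).wrapping_add(u64::from(b)))
}

/// Run `work` on a dedicated thread and hand back a receiver for the result.
fn run_off_thread<T, F>(work: F) -> mpsc::Receiver<T>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The receiver may already be gone; ignoring the send error is fine here.
        let _ = tx.send(work());
    });
    rx
}

fn main() {
    let payload = vec![7u8; 16 * 1024 * 1024];
    let rx = run_off_thread(move || expensive_checksum(&payload));

    // The calling thread stays free for other work while the checksum runs.
    let mut polls = 0u32;
    let digest = loop {
        match rx.try_recv() {
            Ok(d) => break d,
            Err(mpsc::TryRecvError::Empty) => {
                polls += 1;
                thread::yield_now();
            }
            Err(mpsc::TryRecvError::Disconnected) => panic!("worker thread died"),
        }
    };
    println!("digest {digest} after {polls} polls");
}
```

The point of the sketch is that the caller has to arrange the offloading itself for every CPU-heavy call, which is the extra burden the conclusion argues async support in the crate would not remove.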