writing an I/O-related library that doesn't care about whether it's used in a sync or async context #49
Comments
Hmm, so this is mostly(?) a question of "sync-async" bridging, I think? I was going to open a similar-ish (only vaguely) issue based on some conversations I had with @Mark-Simulacrum, though in that case the gist is more like "I want to write a client that uses libraries that are implemented in async with minimal fuss, and right now it's painful".

I opened #54 to cover my conversation with @Mark-Simulacrum

@nikomatsakis Yeah, I think "sync-async" bridging is a very general way to put it, although there may be many specific cases of it with this being one of them. (Not quite sure how to categorize everything, I leave that difficult task to y'all. :-))
Some elements of this blog post feel related:
Here's an approach to this from Python's ecosystem: https://sans-io.readthedocs.io/
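For readers unfamiliar with the sans-I/O idea linked above, here is a minimal Rust sketch of what it could look like. The names `LineDecoder` and `read_lines_sync` are hypothetical and not taken from any crate mentioned in this thread. The decoder only accepts bytes and hands back complete lines; it never performs a read itself, so the same type could in principle be driven by a sync or an async loop.

```rust
use std::collections::VecDeque;
use std::io::{self, Read};

/// Hypothetical sans-I/O line splitter: it never touches a reader or socket.
#[derive(Default)]
pub struct LineDecoder {
    partial: Vec<u8>,        // bytes of the line currently being assembled
    ready: VecDeque<String>, // complete lines not yet handed to the caller
}

impl LineDecoder {
    pub fn new() -> Self {
        Self::default()
    }

    /// Feed bytes obtained by any means (sync read, async read, test vector).
    pub fn push(&mut self, bytes: &[u8]) {
        for &b in bytes {
            if b == b'\n' {
                let line = String::from_utf8_lossy(&self.partial).into_owned();
                self.ready.push_back(line);
                self.partial.clear();
            } else {
                self.partial.push(b);
            }
        }
    }

    /// Pop the next complete line, if one is available.
    pub fn next_line(&mut self) -> Option<String> {
        self.ready.pop_front()
    }
}

/// One possible synchronous driver. An async driver would call `push` with
/// bytes from an async read instead, leaving `LineDecoder` untouched.
/// Simplification: a trailing line without '\n' is dropped in this sketch.
pub fn read_lines_sync<R: Read>(mut reader: R) -> io::Result<Vec<String>> {
    let mut decoder = LineDecoder::new();
    let mut buf = [0u8; 4]; // deliberately tiny so lines span several reads
    let mut lines = Vec::new();
    loop {
        let n = reader.read(&mut buf)?;
        if n == 0 {
            break; // end of input
        }
        decoder.push(&buf[..n]);
        while let Some(line) = decoder.next_line() {
            lines.push(line);
        }
    }
    Ok(lines)
}
```

The trade-off is that the caller owns the read loop and the buffering, which is exactly the part that differs between sync and async code.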
@matklad In my example above, that's what
I have recently implemented an
and that this change propagates to any code that performs I/O (it does not propagate to code that performs CPU-intensive tasks such as de-serialization) by basically adding

They represent two different compilation paths: one that calls

It all boils down to the fact that we never know what the next byte will do to the state machine. In the case of CSV, the key item is the unescaped end of line, which triggers a whole panoply of events. The reader must read byte by byte in a state machine until the state is "new row".

An equivalent way of framing this problem is that it is not possible to tell how many bytes we should read for the state machine to move to its "clean" state (finished reading a whole line). This framing of the problem also emerges in reading protobuf, flatbuffers, thrift, because certain nested types are stored in things like

The alternative to this is to offer a sync API and re-use it via

I see two potential ways to improve the situation:
I know we don't need a list of examples, but serde seems worth a special mention here. It's a complex, popular, amazing library that at the bottom just writes/reads from a stream. Currently, we have to fork a separate version of the entire serde ecosystem with all the functions in a different colour just so we can deserialise from an async source. (i.e., it's not just the I/O-related library, it's everything that calls that I/O-related library, and sometimes this involves crates from multiple third-party authors.)

@anguslees that's a very good point. We should also keep in mind that serde has been unmaintained for almost 2 years now.
`std::io::Read` or `std::io::Write`, and wants it to work in an async context with minimal fuss.
`Read`/`Write` traits, it works on streams. (Where "stream" is used colloquially here.)
`Read`/`Write` traits seems inelegant and doesn't seem to scale with respect to maintenance effort.
`Read`/`Write` impls, but it may come with additional costs that one might not want to pay.

One example here is the `flate2` crate. To solve this problem, they have an optional dependency on a specific async runtime, tokio, and have impls for `AsyncRead` and `AsyncWrite` that are specific to that async runtime.

Another example is the `csv` crate. The problem has come up a few times:

Its author (me) has not wanted to wade into these waters because of the aforementioned problems. As a result, folks are maintaining a `csv-async` fork to make it work in an async context. This doesn't seem ideal.

This is somewhat related to #45. For example, `AsyncRead`/`AsyncWrite` traits that are shared across all async runtimes might do a lot to fix this problem. But I'm not sure. Fundamentally, this, to me, is about writing I/O adapters that don't care about whether they're used in a sync or async context with minimal fuss, rather than just about trying to abstract over all async runtimes.

Apologies in advance if I've filled out this issue incorrectly. I tried to follow the others, but maybe I got a bit too specific! Happy to update it as appropriate. Overall, I think this is a really wonderful approach to gather feedback. I'm completely blown away by this!
Also, above, I said this was "almost" Barbara, because this story doesn't actually require the programmer to write or even care about async code at all.
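As a concrete picture of the kind of "I/O adapter" this issue is talking about, here is a small sketch using only the standard library. `UpperReader` and `upper_to_string` are hypothetical names invented for illustration; real adapters such as decompressors are far more involved. The adapter wraps any `std::io::Read` and transforms bytes as they pass through.

```rust
use std::io::{self, Read};

/// Hypothetical adapter: upper-cases ASCII bytes as they pass through.
pub struct UpperReader<R> {
    inner: R,
}

impl<R: Read> UpperReader<R> {
    pub fn new(inner: R) -> Self {
        UpperReader { inner }
    }
}

impl<R: Read> Read for UpperReader<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        // Delegate the actual I/O to the wrapped reader...
        let n = self.inner.read(buf)?;
        // ...then transform only the bytes that were actually filled in.
        buf[..n].make_ascii_uppercase();
        Ok(n)
    }
}

/// Convenience driver: read everything through the adapter into a `String`.
pub fn upper_to_string<R: Read>(reader: R) -> io::Result<String> {
    let mut out = String::new();
    UpperReader::new(reader).read_to_string(&mut out)?;
    Ok(out)
}
```

The transformation itself has nothing to do with blocking or non-blocking I/O, yet this impl is tied to the sync `Read` trait. Making it usable from async code today would mean writing a second impl against an `AsyncRead` trait, which is the duplication the issue describes.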