refactor!: move storage module into logstore #3382
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #3382 +/- ##
==========================================
+ Coverage 71.82% 71.99% +0.17%
==========================================
Files 145 145
Lines 45972 45774 -198
Branches 45972 45774 -198
==========================================
- Hits 33018 32957 -61
+ Misses 10859 10743 -116
+ Partials 2095 2074 -21
Signed-off-by: Robert Pack <[email protected]>
Just an FYI: the IO runtime wrapping will hopefully be replaced by a kind of SpawnService. The initial PR from tustvold became stale, but I'll try to push it forward from there with Andrew, apache/arrow-rs#7253 (comment)
LGTM! :) I'm curious to see your caching layer and at what abstraction level it will sit, because there are some quite thorough implementations out there, at least at the ObjectStore level.
Awesome, looking forward to that :)
@ion-elgreco - do you have some links maybe? 😄 The main idea, though, is to keep it quite simple on that layer - our most effective caching is how we manage the parsed actions :). Another optimisation that is not part of this (but that I'm hoping to get to one day) is to cache parquet footers - this is something we can implement on the DF level ... That said, the PR should be coming up shortly, and we can continue there :).
@roeap the overview is here apache/arrow-rs-object-store#14
Description
TLDR;
storage -> logstore::storage
runtime mod for io runtime
Another pre-factor PR, this time we move the storage module into logstore.
While working on adding a caching layer for log reads I realised that we are exposing quite a bit of functionality via the storage module, and that following its interactions with the logstore module requires quite a bit of code parsing. With kernel coming in we will have to deal with yet another storage abstraction.
The good news: in practice we actually have quite a narrow waist when creating the tables, so we can just hook into the approach we take for wrapping the IO runtime. Still, I believe that logically the ObjectStore sits behind the log store, or at least that the log store is responsible for providing it.
In an immediate follow-up I would like to add the cached store (@rtyler - in case you have some time to run benchmarks? 😄) and with that do a pass over our config for storage to make sure it stays manageable as we add yet another set of options.
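To make the "log store provides the object store" idea a bit more concrete, here is a minimal, self-contained sketch. It is not the code in this PR and does not use the real ObjectStore or LogStore traits: SimpleStore, MemoryStore, CachedStore and SimpleLogStore are hypothetical stand-ins, and the read-through cache is only an assumption about what a simple cached store could look like. The point is that wrappers get applied in one place (the narrow waist) and callers only ever reach storage through the log store.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Hypothetical, heavily simplified stand-in for an object store.
pub trait SimpleStore: Send + Sync {
    fn get(&self, path: &str) -> Option<Vec<u8>>;
}

/// In-memory backend that counts how often it is actually read.
pub struct MemoryStore {
    objects: HashMap<String, Vec<u8>>,
    reads: Mutex<usize>,
}

impl MemoryStore {
    pub fn new(objects: HashMap<String, Vec<u8>>) -> Self {
        Self { objects, reads: Mutex::new(0) }
    }

    pub fn reads(&self) -> usize {
        *self.reads.lock().unwrap()
    }
}

impl SimpleStore for MemoryStore {
    fn get(&self, path: &str) -> Option<Vec<u8>> {
        *self.reads.lock().unwrap() += 1;
        self.objects.get(path).cloned()
    }
}

/// Read-through cache wrapped around any store (assumed design, kept simple).
pub struct CachedStore {
    inner: Arc<dyn SimpleStore>,
    cache: Mutex<HashMap<String, Vec<u8>>>,
}

impl CachedStore {
    pub fn new(inner: Arc<dyn SimpleStore>) -> Self {
        Self { inner, cache: Mutex::new(HashMap::new()) }
    }
}

impl SimpleStore for CachedStore {
    fn get(&self, path: &str) -> Option<Vec<u8>> {
        let mut cache = self.cache.lock().unwrap();
        if let Some(hit) = cache.get(path) {
            return Some(hit.clone());
        }
        // Cache miss: fall through to the wrapped store and remember the result.
        let bytes = self.inner.get(path)?;
        cache.insert(path.to_string(), bytes.clone());
        Some(bytes)
    }
}

/// Hypothetical log store that owns the (wrapped) store and hands it out.
pub struct SimpleLogStore {
    store: Arc<dyn SimpleStore>,
}

impl SimpleLogStore {
    /// The single place where wrappers such as the cache are applied.
    pub fn new(backend: Arc<dyn SimpleStore>) -> Self {
        Self { store: Arc::new(CachedStore::new(backend)) }
    }

    /// Callers obtain storage from the log store, never directly.
    pub fn object_store(&self) -> Arc<dyn SimpleStore> {
        Arc::clone(&self.store)
    }

    /// Commit files are named with a 20-digit, zero-padded version.
    pub fn read_commit(&self, version: u64) -> Option<Vec<u8>> {
        self.store.get(&format!("_delta_log/{version:020}.json"))
    }
}
```

With this shape, reading the same commit twice through the log store only hits the backend once, and anything that asks the log store for its store gets the cached one as well. Misses are not cached in this sketch, so a commit that appears later is still picked up.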