-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add pagination to query Execution #141
base: main
Are you sure you want to change the base?
Add pagination to query Execution #141
Conversation
@alamb not done yet, but thought you may be interested. |
src/app/app_execution.rs
Outdated
current_batch: Option<usize>, | ||
} | ||
|
||
impl PaginatingRecordBatchStream { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure yet if this is the final api signatures - I will know once I plug into the display code
src/app/app_execution.rs
Outdated
@@ -108,3 +110,179 @@ impl AppExecution { | |||
Ok(()) | |||
} | |||
} | |||
|
|||
/// A stream of [`RecordBatch`]es that can be paginated for display in the TUI. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One thing that isn't entirely clear to me is if an entire RecordBatch is shown at a time or just a slice (like only rows 100-200)
This might be an important distinction in terms of what "current batch" means and if you want this structure to pagninate based on batch index or logical row number
I think either could work -- but if you pagniate on batch, you'll have to implement logic somewhere else to translate that into logical row number for display
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that could depend on the UI that we expose. I have two options in mind:
- Incrementing a page completely replaces the displayed records with the next page of records
- Incrementing a page adds the records to the existing view (i think getting the UI correct for this may not be straight forward as i believe the naive expectation could be that the next page is triggered while scrolling and reaching the bottom of the table - like how it works in DBeaver. But happy to hear other views on this)
One thing to note - is that the Table
widget we use automatically provides some scrolling capabilities (it tracks the selected row automatically - so if we continue to use that, which i think is beneficial at our stage, we at least dont have to do everything from scratch).
One potentially terrible idea I had, that would at least enable a very simple v1 (at the cost of worse query performance) is default batch size for the TUI to some relatively small amount (say 200 rows) and then all rows for that batch would be added to the Table
. Next page just gets the next batch and replaces the Table
records. (I.e. option 1). We wouldnt need to track rows or figure out how to stitch records together between record batch boundaries. I believe this is the simplest approach and would at least make the app usable for larger queries, then we could add a todo for something more user friendly / that doesnt impact query performance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Incrementing a page completely replaces the displayed records with the next page of records
This is how I (naively) as a user would expect things to behave
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One potentially terrible idea I had, that would at least enable a very simple v1 (at the cost of worse query performance) is default batch size for the TUI to some relatively small amount (say 200 rows) and then all rows for that batch would be added to the Table. Next page just gets the next batch and replaces the Table records. (I.e. option 1). We wouldnt need to track rows or figure out how to stitch records together between record batch boundaries. I believe this is the simplest approach and would at least make the app usable for larger queries, then we could add a todo for something more user friendly / that doesnt impact query performance.
This seems reasonable to me.
I also think all the APIs to stitch rows together are in arrow-rs (like concat_batches
and slice) so we could also make some sort of adapter stream (another wrapper!) that took the incoming stream and reformatted it to smaller sized record batches
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice @matthewmturner
src/app/mod.rs
Outdated
@@ -61,6 +62,8 @@ pub enum AppEvent { | |||
Resize(u16, u16), | |||
ExecuteDDL(String), | |||
QueryResult(Query), | |||
// PaginatedQueryResult(PaginatingRecordBatchStream), | |||
QueryResultsNextPage, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There will be new AppEvent
's for paginating
src/app/ui/tabs/sql.rs
Outdated
.title(Title::from(" Page ").alignment(Alignment::Right)); | ||
|
||
let results = app.execution().results(); | ||
let locked = tokio::task::block_in_place(|| results.blocking_lock()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice if we didn't need to do this while rendering, but without it the app panics.
src/app/app_execution.rs
Outdated
/// A stream of [`RecordBatch`]es that can be paginated for display in the TUI. | ||
pub struct PaginatingRecordBatchStream { | ||
// currently executing stream | ||
inner: SendableRecordBatchStream, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because SendableRecordBatchStream
is not Sync
this struct doesnt work well with our AppEvent
which requires Send
and Sync
. I think we can restructure so that the batches / current idx are managed separate from the batch stream so they can be sent as AppEvent
and state can be updated in the existing pattern (i.e in our handlers
) and then we dont need to lock during rendering.
So I ended up going with a different approach on paginating but i think it aligns better with the rendering model. Now, the sql tab state will store the record batches and current index to display and those will be updated synchronously in |
This makes sense to me |
Making good progress DFT.Paginating.movEdit: Sorry its kind of blurry but im pressing the right arrow and youll see the results updating and page number changing. Still more to clean up and improve but its all coming together now |
Might end up needing those arrow utilities sooner than expected because the FlightSQL client doesnt let us set the record batch size. |
BTW I am hoping to next contribute:
|
(#141) was becoming hard for me to reason about because FlightSQL pagination has to be handled differently than SQL tab pagination (because on the SQL tab we can control batch size directly on the context but we cant do that with FlightSQL). So I would like to split pagination on each of those tabs into their own PRs to keep them more focused and easier to review / reason about / etc. So this PR makes pagination work and adds integration tests for testing pagination. There is still room for improvement in the end to end testing (right now we test some of the implementation details) but at least we have some coverage.
This PR adds pagination to query execution.
PaginatingRecordBatchStream