Improve performance for small string gather #20656
|
Benchmarking data: |
Co-authored-by: David Wendt <[email protected]>
davidwendt
left a comment
Looks good. Thanks for working on this.
|
/ok to test 95cee21 |
|
/ok to test bbc702e |
|
The failed |
|
/ok to test dcd49e6 |
|
/ok to test a4b6cd1 |
|
/ok to test c8095f3 |
bdice
left a comment
Minor requests, otherwise LGTM.
davidwendt
left a comment
Use the cudf::detail::util::div_rounding_up_safe utility from cpp/include/detail/utilities/integer_utils.hpp to compute the grid_size.
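For context, a minimal sketch of what a rounding-up division for the grid size looks like. The names `div_round_up_sketch` and `grid_size_for` are hypothetical stand-ins for illustration and are not the cudf utility itself; the point is only that the quotient is rounded up without computing `dividend + divisor - 1`, which could overflow.

```cpp
#include <cstddef>

// Hypothetical stand-in for a "safe" rounding-up division (not the cudf utility).
// Uses quotient + remainder check instead of (dividend + divisor - 1) / divisor,
// so the intermediate sum cannot overflow. Assumes divisor > 0.
constexpr std::size_t div_round_up_sketch(std::size_t dividend, std::size_t divisor)
{
  return dividend / divisor + (dividend % divisor != 0 ? 1 : 0);
}

// Hypothetical helper: number of thread blocks needed so that every row is
// covered when each block processes block_size rows.
constexpr std::size_t grid_size_for(std::size_t num_rows, std::size_t block_size)
{
  return div_round_up_sketch(num_rows, block_size);
}
```

With 1000 rows and 256 threads per block, this gives a grid of 4 blocks, the last one partially filled.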
|
/ok to test 2a2ec83 |
|
/ok to test 2fc013d |
|
/ok to test 86d6865 |
|
/merge |
Description
This PR improves the performance of the gather API for small string columns (average length <= 32 characters) by using the cub::DeviceMemcpy::Batched API to perform the gather when a string column has more than ~0.5 million rows. The row-count threshold was chosen based on benchmarking data on H100.
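The approach described above can be modeled on the host as follows. This is a sketch under stated assumptions, not the code in this change: `strings_column_sketch`, `use_batched_memcpy_path`, and `gather_batched_sketch` are hypothetical names, the row threshold of 500000 is only an approximation of the "~0.5 million" figure, and the per-string `std::memcpy` loop stands in for the single device-side `cub::DeviceMemcpy::Batched` call, which takes iterators over source pointers, destination pointers, and byte sizes.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical host-side model of a strings column: concatenated characters
// plus row offsets (offsets.size() == number of rows + 1).
struct strings_column_sketch {
  std::string chars;
  std::vector<std::size_t> offsets;
};

// Assumed dispatch rule: large row count and small average string length.
// Both constants are illustrative approximations of the values in the description.
inline bool use_batched_memcpy_path(std::size_t num_rows, std::size_t total_chars)
{
  constexpr std::size_t row_threshold  = 500000;  // "~0.5 million rows" (assumed exact value)
  constexpr std::size_t max_avg_length = 32;      // average length <= 32 characters
  return num_rows > row_threshold && total_chars <= num_rows * max_avg_length;
}

// Gather rows of `input` in the order given by `gather_map`.
inline strings_column_sketch gather_batched_sketch(strings_column_sketch const& input,
                                                   std::vector<std::size_t> const& gather_map)
{
  std::size_t const n = gather_map.size();
  strings_column_sketch out;
  out.offsets.assign(n + 1, 0);

  // Step 1: build the per-buffer arguments (source pointer, size) and the
  // output offsets, which determine each destination pointer.
  std::vector<char const*> srcs(n);
  std::vector<std::size_t> sizes(n);
  for (std::size_t i = 0; i < n; ++i) {
    std::size_t const row = gather_map[i];
    sizes[i]              = input.offsets[row + 1] - input.offsets[row];
    srcs[i]               = input.chars.data() + input.offsets[row];
    out.offsets[i + 1]    = out.offsets[i] + sizes[i];
  }
  out.chars.resize(out.offsets[n]);

  // Step 2: copy every buffer. On the GPU this loop would be one batched
  // memcpy call over all (source, destination, size) triples.
  for (std::size_t i = 0; i < n; ++i) {
    if (sizes[i] > 0) { std::memcpy(out.chars.data() + out.offsets[i], srcs[i], sizes[i]); }
  }
  return out;
}
```

Each output string is an independent small copy. That is the workload shape a batched memcpy is designed for, which is presumably why it pays off only once there are enough rows to amortize its setup cost.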
It also adds a test case that checks the string gather implementation when the kernel is launched with more than one CTA, to verify correctness, and extends the gather benchmark with additional parameters that sweep larger input row counts.
Checklist