Add CUDA 13.0 Tests for CuFile I/O Operations #1060
base: main
Thanks, Chloe! Pinging you internally...
@chloechia4 any reason cuFileDriverClose_v2 is removed? I see this symbol still exists in the cuFile header. For cuda-bindings, the Cython layer (cyxxxxx.{pxd,pyx}) is considered a stable public API.
@chloechia4: I left a comment on why I think this is happening in your cybind MR. Addressing that, putting the changes here, and removing those extra files @leofang mentioned should hopefully do it. I haven't tried testing (I'm on WSL and it looks like cuFile doesn't work there).
Addressed this
cuda_bindings/tests/test_cufile.py
Outdated
assert io_events[i].status == cufile.Status.COMPLETE, f"Write {i} failed with status {io_events[i].status}"

# Force file sync
os.fsync(fd)
This isn't needed.
✅
cuda_bindings/tests/test_cufile.py
Outdated
# Verify that statistics data was written to the buffer
# Convert buffer to bytes and check that it's not all zeros
buffer_bytes = bytes(stats_buffer)
Rather than just checking that buffer_bytes is non-zero, can you verify the actual fields of the data structure (CUfileStatsLevel1_t)?
Good point. I added field checks in get_stats_l1/get_stats_l2/get_stats_l3. It appears that the Python bindings don't expose the CUfileStatsLevel*_t structures as ctypes classes that we can use directly, so I added equivalent Python classes.
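For readers following the thread, here is a minimal sketch of the field-check idea, not the code in this PR. The struct layout, field names, and values below are hypothetical and do not match the real CUfileStatsLevel1_t; the driver call is simulated by copying a pre-filled struct into the buffer.

```python
import ctypes


class HypotheticalStatsL1(ctypes.Structure):
    # Hypothetical layout for illustration only; NOT the real CUfileStatsLevel1_t.
    _fields_ = [
        ("read_ops", ctypes.c_uint64),
        ("write_ops", ctypes.c_uint64),
        ("read_bytes", ctypes.c_uint64),
        ("write_bytes", ctypes.c_uint64),
    ]


# Raw buffer of the kind a test would hand to a stats getter.
stats_buffer = ctypes.create_string_buffer(ctypes.sizeof(HypotheticalStatsL1))

# Simulate the driver filling the buffer (a real test would call the binding here).
filled = HypotheticalStatsL1(read_ops=2, write_ops=3, read_bytes=8192, write_bytes=12288)
ctypes.memmove(stats_buffer, ctypes.byref(filled), ctypes.sizeof(filled))

# Reinterpret the raw bytes as the structure and check individual fields,
# instead of only checking that the buffer is not all zeros.
stats = HypotheticalStatsL1.from_buffer_copy(stats_buffer)
assert stats.read_ops == 2
assert stats.write_ops == 3
assert stats.read_bytes >= stats.read_ops
```

Checking named fields catches cases that a "not all zeros" check would miss, such as a buffer that contains data at the wrong offsets.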
check_status(__status__)


cpdef get_parameter_min_max_value(int param, intptr_t min_value, intptr_t max_value):
Add tests for this API as well.
✅
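A rough sketch of how a test can pass output arguments to a function that takes intptr_t addresses, as the signature above does. The function fake_get_parameter_min_max_value is a hypothetical stand-in so the example runs without cuFile; the assumption that the outputs are size_t values, and the values 1 and 1024, are illustrative only.

```python
import ctypes


def fake_get_parameter_min_max_value(param, min_value, max_value):
    # Hypothetical stand-in for the binding: writes through the two addresses,
    # assuming each points at a size_t.
    ctypes.c_size_t.from_address(min_value).value = 1
    ctypes.c_size_t.from_address(max_value).value = 1024


# Allocate the output slots and pass their addresses as plain integers.
min_val = ctypes.c_size_t(0)
max_val = ctypes.c_size_t(0)
fake_get_parameter_min_max_value(
    0,  # placeholder parameter id
    ctypes.addressof(min_val),
    ctypes.addressof(max_val),
)

# A test can then check that the reported range is sane.
assert min_val.value <= max_val.value
```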
/ok to test
@mdboom, there was an error processing your request: See the following link for more information: https://docs.gha-runners.nvidia.com/cpr/e/1/
Description
This PR includes tests for the low-level bindings and the generated low-level bindings introduced in CUDA 13.0 for CUFile.
CUDA 13.0 CuFile Operations
Note: The original test_batch_io_large_operations() did not pass once switched from CUDA 12.9 to 13.0. I realized this was because it submitted all operations (reads and writes) together in one batch, so the file reads were occurring before the writes. As a result, the test failed trivially: the reads returned 0 bytes because they happened before any write I/O occurred. I changed the test so that it is separated into two phases: writes complete first in one batch handle, and then reads are submitted in another batch handle. The new test works with CUDA 12.9 as well.
All tests passing across CUDA versions
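The ordering issue described in the note can be illustrated without cuFile. The sketch below uses plain os.pwrite and os.pread as stand-ins for the batch write and read phases; the block size, operation count, and file name are assumptions made for the example, and it is not the test code from this PR.

```python
import os
import tempfile

BLOCK = 4096   # illustrative block size
NUM_OPS = 4    # illustrative number of operations per phase


def run_two_phase(path):
    fd = os.open(path, os.O_RDWR | os.O_CREAT)
    try:
        # A read issued before any write sees an empty file and returns 0 bytes,
        # which is the failure mode described above.
        early_read = os.pread(fd, BLOCK, 0)

        # Phase 1: all writes complete before any read is issued.
        payloads = [bytes([i + 1]) * BLOCK for i in range(NUM_OPS)]
        for i, data in enumerate(payloads):
            written = os.pwrite(fd, data, i * BLOCK)
            assert written == BLOCK, f"Write {i} was short: {written} bytes"

        # Phase 2: reads are issued only after phase 1 has finished.
        results = [os.pread(fd, BLOCK, i * BLOCK) for i in range(NUM_OPS)]
    finally:
        os.close(fd)
    return early_read, payloads, results


with tempfile.TemporaryDirectory() as tmp:
    early_read, payloads, results = run_two_phase(os.path.join(tmp, "batch.bin"))
```

Separating the phases makes the test independent of how a batch orders its operations internally, which is why the same test can pass on both CUDA versions.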
