fix: correct PDL parameter handling in RopeQuantize kernel (#1982)
<!-- .github/pull_request_template.md -->
## 📌 Description
### 1. Fixed Parameter Alignment
- **Issue**: The `stream` argument was passed to the wrong position in the `RopeQuantize` function call because the `enable_pdl` parameter was missing from the signature. SGLang would hang before this PR.
- **Fix**: Added the `enable_pdl` parameter to the function signature and aligned all arguments correctly.
### 2. Fixed PDL Launch Configuration
- **Issue**: When `enable_pdl=true`, the kernel raised CUDA errors due to incorrect handling of the PDL launch attribute.
- **Fix**: Aligned the implementation with the PDL launch configuration used in `csrc/fmhaReduction.cu`.
<!-- What does this PR do? Briefly describe the changes and why they’re
needed. -->
## 🔍 Related Issues
<!-- Link any related issues here -->
## 🚀 Pull Request Checklist
Thank you for contributing to FlashInfer! Before we review your pull
request, please make sure the following items are complete.
### ✅ Pre-commit Checks
- [x] I have installed `pre-commit` by running `pip install pre-commit`
(or used your preferred method).
- [x] I have installed the hooks with `pre-commit install`.
- [x] I have run the hooks manually with `pre-commit run --all-files`
and fixed any reported issues.
> If you are unsure about how to set up `pre-commit`, see [the
pre-commit documentation](https://pre-commit.com/).
## 🧪 Tests
- [x] Tests have been added or updated as needed.
- [x] All tests are passing (`unittest`, etc.).
## Reviewer Notes
<!-- Optional: anything you'd like reviewers to focus on, concerns, etc.
-->
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **New Features**
* Added PDL (Programmatic Dependent Launch) benchmarking capability for
rope quantization operations.
* Extended configuration options to enable or disable PDL functionality.
* **Tests**
* Updated test suite to validate PDL enabled and disabled scenarios in
rope quantization workflows.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->