Commit 9ce1af7
minor fix for xqa (#1994)
<!-- .github/pull_request_template.md -->
## 📌 Description
<!-- What does this PR do? Briefly describe the changes and why they’re
needed. -->
1. Change the xqa_mla comments to be consistent with MLA instead of MHA.
2. Move cudaMemcpyFromSymbol/cudaFuncSetAttribute out of the launch
function to avoid breaking CUDA graph capture.
3. Use int32 as the page table index type.
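
Item 2 can be illustrated with a minimal host-side sketch. It assumes the usual constraint that synchronous runtime calls such as cudaMemcpyFromSymbol generally cannot be issued while a stream is being captured, so the one-time setup is hoisted into its own function that runs before capture starts. The names `KernelConfig`, `configureKernelOnce`, and `launchKernel` are hypothetical, the shared-memory size is a placeholder, and the CUDA calls are replaced by stand-ins so the sketch compiles without a GPU. It is not the code from this commit.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the per-kernel state that real code would read
// with cudaMemcpyFromSymbol and apply with cudaFuncSetAttribute.
struct KernelConfig {
  std::uint32_t smemSize = 0;
  bool attributeSet = false;
};

static int g_setupCalls = 0;  // counts how often the one-time setup ran

// One-time setup, kept OUT of the launch path. In real code this is where
// the cudaMemcpyFromSymbol / cudaFuncSetAttribute calls would live.
const KernelConfig& configureKernelOnce() {
  // Function-local static: initialized exactly once (thread-safe since C++11).
  static const KernelConfig cfg = [] {
    ++g_setupCalls;
    KernelConfig c;
    c.smemSize = 48 * 1024;  // placeholder value, not taken from the commit
    c.attributeSet = true;
    return c;
  }();
  return cfg;
}

// Launch path: only consumes the cached config, so it issues no setup calls
// as long as configureKernelOnce() already ran before capture began.
std::uint32_t launchKernel(const KernelConfig& cfg) {
  assert(cfg.attributeSet && "configureKernelOnce() must run before launch");
  return cfg.smemSize;  // real code would pass this to the kernel launch
}
```

A caller would invoke `configureKernelOnce()` during warm-up, before graph capture begins, and pass the returned config to every `launchKernel` call made inside the captured region.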
## 🔍 Related Issues
<!-- Link any related issues here -->
## 🚀 Pull Request Checklist
Thank you for contributing to FlashInfer! Before we review your pull
request, please make sure the following items are complete.
### ✅ Pre-commit Checks
- [x] I have installed `pre-commit` by running `pip install pre-commit`
(or used your preferred method).
- [x] I have installed the hooks with `pre-commit install`.
- [x] I have run the hooks manually with `pre-commit run --all-files`
and fixed any reported issues.
> If you are unsure about how to set up `pre-commit`, see [the
pre-commit documentation](https://pre-commit.com/).
## 🧪 Tests
- [x] Tests have been added or updated as needed.
- [ ] All tests are passing (`unittest`, etc.).
## Reviewer Notes
<!-- Optional: anything you'd like reviewers to focus on, concerns, etc.
-->
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **New Features**
  * Added MLA variant documentation clarifying SM120 GPU requirement and fixed head group ratio configuration.
* **Documentation**
  * Updated data type specifications for XQA operations; page table now requires int32 instead of uint32.
  * Added max sequence length derivation notes for page-table-based configurations.
  * Clarified MLA variant input/output data types (float8_e4m3fn and bfloat16).
* **Bug Fixes**
  * Corrected data type handling in page table processing to ensure compatibility.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
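
For callers that still hold page tables as uint32, the switch to int32 implies a conversion step. The following is a minimal sketch of a checked conversion, assuming the page table is available on the host as a flat vector of indices. The helper name `toInt32PageTable` is hypothetical and is not part of FlashInfer.

```cpp
#include <cstdint>
#include <limits>
#include <stdexcept>
#include <vector>

// Hypothetical helper: convert a page table stored as uint32 to int32,
// rejecting any index that would not survive the conversion unchanged.
std::vector<std::int32_t> toInt32PageTable(
    const std::vector<std::uint32_t>& pages) {
  constexpr std::uint32_t kMax =
      static_cast<std::uint32_t>(std::numeric_limits<std::int32_t>::max());
  std::vector<std::int32_t> out;
  out.reserve(pages.size());
  for (std::uint32_t p : pages) {
    if (p > kMax) {
      throw std::out_of_range("page index does not fit in int32");
    }
    out.push_back(static_cast<std::int32_t>(p));
  }
  return out;
}
```

Valid page indices in the range 0 to 2^31 - 1 keep the same value under this conversion; larger values are reported as an error rather than silently wrapping to negative numbers.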
Signed-off-by: Qidi Sang <[email protected]>
Co-authored-by: yzh119 <[email protected]>

1 parent 7d9d7af commit 9ce1af7
5 files changed (+42, -36) under:
- csrc/xqa
- flashinfer
- tests/attention