Skip to content

Conversation

@amirkl94
Copy link
Contributor

@amirkl94 amirkl94 commented Oct 28, 2025

📌 Description

Using different API after apach-tvm-ffi version bump.

Summary by CodeRabbit

  • Bug Fixes
    • Improved null-pointer validation for FP8 quantization tensors used during inference, increasing robustness and reducing risk of runtime errors.

@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello @amirkl94, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a compatibility issue arising from an apach-tvm-ffi library version update. It specifically modifies the tensor validation logic within the Cutlass FusedMoE backend, replacing deprecated get() calls with the new GetDLTensorPtr() method. This ensures that the validation checks continue to function correctly and the system remains compatible with the updated library, preventing potential runtime errors.

Highlights

  • API Update: Updated the API calls from get() to GetDLTensorPtr() in response to a version bump of the apach-tvm-ffi library.
  • Bugfix: Resolved a bug in the Cutlass FusedMoE validations by adapting to the new tensor pointer retrieval method.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@amirkl94
Copy link
Contributor Author

/bot run

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Oct 28, 2025

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Updated null-pointer checks in the Cutlass FusedMoE FP8 quantization validation to use GetDLTensorPtr() instead of get() for four tensors: fc1_dequant, fc2_quant, fc2_dequant, and fc1_input_dequant.

Changes

Cohort / File(s) Summary
Quantization validation pointer access
csrc/fused_moe/cutlass_backend/flashinfer_cutlass_fused_moe_sm100_binding.cu
Replaced null-checks that used get() with GetDLTensorPtr() for fc1_dequant, fc2_quant, fc2_dequant, and fc1_input_dequant pointers; type/rank/shape validations unchanged

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~5 minutes

  • Single-file, small repetitive change (pointer accessor swap).
  • Review focus: ensure no behavioral change and that GetDLTensorPtr() has the same lifetime/ownership assumptions as get().

Possibly related PRs

Suggested reviewers

  • yzh119
  • yongwww
  • djmmoss
  • cyx-6

Poem

🐰 Small paws tap keys in morning light,
Four pointers checked, now tidy and tight.
From get() to GetDLTensorPtr() they hop,
Consistent steps — a clean, gentle drop. ✨

Pre-merge checks and finishing touches

❌ Failed checks (2 warnings)
Check name Status Explanation Resolution
Description Check ⚠️ Warning The pull request description is largely incomplete compared to the repository's template requirements. While the Description section is present, it only contains a single brief line about an API version bump. The description is missing critical required sections including the Related Issues section, the entire Pull Request Checklist (Pre-commit Checks, Tests), and Reviewer Notes. The minimal description provides insufficient context about the changes and does not follow the established template structure. The author should expand the pull request description by filling out the missing template sections. Add a Related Issues section (linking to any issue this bugfix addresses), complete the Pull Request Checklist including confirmation that pre-commit hooks were run and tests were added/updated, and optionally add Reviewer Notes highlighting that this is an API migration due to the apach-tvm-ffi version bump. The Description section could also be expanded to provide more detail about why this API change was necessary and what problem it solves.
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (1 passed)
Check name Status Explanation
Title Check ✅ Passed The pull request title "Bugfix: Change get() -> GetDLTensorPtr() in cutlass FusedMoE validations" is directly related to the main change described in the raw summary. The title accurately reflects the primary modification—replacing get() calls with GetDLTensorPtr() in FP8 quantization tensor pointer validation—and is concise, specific, and free of vague terms or noise. A teammate scanning commit history would clearly understand the purpose of this change from the title alone.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0a5a368 and c8df0a9.

📒 Files selected for processing (1)
  • csrc/fused_moe/cutlass_backend/flashinfer_cutlass_fused_moe_sm100_binding.cu (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • csrc/fused_moe/cutlass_backend/flashinfer_cutlass_fused_moe_sm100_binding.cu

Comment @coderabbitai help to get the list of available commands and usage tips.

@flashinfer-bot
Copy link
Collaborator

@amirkl94 is not authorized to trigger this CI job. cc: @yzh119, @sricketts, @yongwww

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request correctly updates the API call from .get() to .GetDLTensorPtr() for tvm::ffi::Tensor objects, which is necessary due to a version bump in the apache-tvm-ffi library. The change is straightforward and correct. I've identified a minor typo in an error message on one of the modified lines and have provided a suggestion to fix it.

Signed-off-by: Amir Klein <[email protected]>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
csrc/fused_moe/cutlass_backend/flashinfer_cutlass_fused_moe_sm100_binding.cu (1)

829-829: Fix typo in error message.

The error message contains a typo: "fc2_dequant_dequant" should be "fc2_dequant". This is a pre-existing issue, not introduced by this PR.

Apply this diff to fix the typo:

-          << "Expecting fc2_dequant_dequant to be non null";
+          << "Expecting fc2_dequant to be non null";
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 135d7bc and 0a5a368.

📒 Files selected for processing (1)
  • csrc/fused_moe/cutlass_backend/flashinfer_cutlass_fused_moe_sm100_binding.cu (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Deploy Docs
🔇 Additional comments (1)
csrc/fused_moe/cutlass_backend/flashinfer_cutlass_fused_moe_sm100_binding.cu (1)

825-831: LGTM! API migration correctly applied.

The change from get() to GetDLTensorPtr() is consistent across all four null-pointer checks for the FP8 quantization parameters. This correctly addresses the API change after bumping the apache-tvm-ffi version.

@bkryu
Copy link
Collaborator

bkryu commented Oct 28, 2025

/bot run

@flashinfer-bot
Copy link
Collaborator

GitLab MR !93 has been created, and the CI pipeline #37457887 is currently running. I'll report back once the pipeline job completes.

@bkryu
Copy link
Collaborator

bkryu commented Oct 28, 2025

I can repro the previously failing unit tests now passing with this PR on B200. Waiting for results from CI bot's pipeline

Copy link
Collaborator

@yzh119 yzh119 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the timely fix!

@yzh119 yzh119 merged commit 8ad64e2 into flashinfer-ai:main Oct 28, 2025
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants