
[RELAX][BYOC] OpenCLML offload support for Relax #17654

Merged
11 commits merged into apache:main on Feb 19, 2025

Conversation

srkreddy1238
Contributor

This brings in OpenCLML offloading via the BYOC path for the operators available in Relax.
It adds codegen tests for the mainline CI.
It also brings in pipeline definitions for Adreno targets.


# Verify codegen
clml_mod = OpenCLMLOffLoad()(clml_mod)
verify_codegen(clml_mod, clml_codegen)
Contributor Author

@srkreddy1238 srkreddy1238 Feb 14, 2025


We do the codegen check here (a JSON comparison).

clml_mod = OpenCLMLOffLoad()(clml_mod)
verify_codegen(clml_mod, clml_codegen)

# On Mainline CI
Contributor Author


Mainline CI will not have RPC, hence we don't proceed beyond this point.
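The pattern described above can be sketched as a guard in the test: run the JSON codegen comparison unconditionally, then bail out before any on-device step when no RPC tracker is configured. This is a hypothetical illustration, not the PR's actual code; using `TVM_TRACKER_HOST` as the signal follows TVM's RPC tracker convention but is an assumption of this sketch.

```python
import os

def rpc_available():
    # Hypothetical guard for the note above: mainline CI has no RPC
    # tracker, so runtime verification stops after the codegen check.
    # Keying off TVM_TRACKER_HOST is an assumption of this sketch.
    return bool(os.environ.get("TVM_TRACKER_HOST"))

# In a test: do the codegen (JSON) comparison first, then bail out
# before any on-device compile/run when no tracker is configured.
if not rpc_available():
    print("no RPC tracker configured; skipping on-device run")
```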

@srkreddy1238
Contributor Author

@Hzfengsy for the OptimizeBatchnorm pass:

DecomposeOpsForInference couldn't help here, as it results in Conv2D plus a few elementwise ops.

OptimizeBatchnorm folds the batchnorm attributes into the Conv2D weight and bias, which we can then offload as CLML's fused Conv2D+Bias op.
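The arithmetic behind such a fold can be illustrated in numpy (this is only a sketch of the math; the actual pass operates on Relax IR, and `fold_batchnorm` is a hypothetical helper name). For each output channel, the batchnorm scale `gamma / sqrt(var + eps)` is multiplied into the conv filter, and the mean/beta terms collapse into the bias:

```python
import numpy as np

def fold_batchnorm(weight, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold BatchNorm parameters into a Conv2D weight/bias (OIHW layout).

    Illustrative sketch of the arithmetic only; the TVM pass rewrites
    Relax IR rather than numpy arrays.
    """
    scale = gamma / np.sqrt(var + eps)           # per-output-channel scale
    w = weight * scale[:, None, None, None]      # scale each output filter
    b = (bias - mean) * scale + beta             # fold shift into the bias
    return w, b

# Sanity check: conv-then-BN equals conv with the folded parameters.
# For an all-ones input patch, the conv output per channel is just the
# filter sum plus bias, which makes the identity easy to verify.
rng = rng_state = np.random.default_rng(0)
O, I = 4, 3
weight = rng.standard_normal((O, I, 3, 3))
bias = rng.standard_normal(O)
gamma, beta = rng.standard_normal(O), rng.standard_normal(O)
mean, var = rng.standard_normal(O), rng.random(O) + 0.5

w2, b2 = fold_batchnorm(weight, bias, gamma, beta, mean, var)
y = weight.sum(axis=(1, 2, 3)) + bias                      # conv output
bn_y = gamma / np.sqrt(var + 1e-5) * (y - mean) + beta     # then batchnorm
folded_y = w2.sum(axis=(1, 2, 3)) + b2                     # folded conv
assert np.allclose(bn_y, folded_y)
```

This is exactly what allows the result to be offloaded as a single fused Conv2D+Bias op.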

@Hzfengsy
Member

Thanks for pinging me, @srkreddy1238.

OptimizeBatchnorm folds the batchnorm attributes into the Conv2D weight and bias, which we can then offload as CLML's fused Conv2D+Bias op.

IIUC it works as a CLML-specific pass. If so, could you please move it under the adreno or clml folders, or at least add a comment saying that it works only for CLML.

@srkreddy1238
Contributor Author

IIUC it works as a CLML-specific pass.

This is not CLML-specific. It can be used for fusing Conv2D+BN -> Conv2D (with updated weight and bias) in the inference case.

@srkreddy1238 force-pushed the clml_relax branch 2 times, most recently from 0b220c8 to d65a89a on February 17, 2025 at 08:46
@tqchen
Member

tqchen commented Feb 17, 2025

This is not CLML-specific. It can be used for fusing Conv2D+BN -> Conv2D (with updated weight and bias) in the inference case.

A note on this pass: it is an optimization that can indeed apply more broadly. It is also a special case of the FoldScaleAxis optimization, which folds scales into weights, so we want to add a comment here that it can be replaced by the general FoldScaleAxis in the future. cc @Hzfengsy @srkreddy1238
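The relation to FoldScaleAxis can be seen from the underlying identity: a per-output-channel scale applied after a linear op can always be pre-multiplied into that op's weights. A numpy sketch (a 1x1 "conv" written as a matrix for brevity, not the pass itself):

```python
import numpy as np

# Scale-into-weights identity that FoldScaleAxis exploits in general:
#   scale[o] * sum_i(W[o, i] * x[i]) == sum_i((scale[o] * W[o, i]) * x[i])
# The batchnorm fold above is this identity with scale = gamma / sqrt(var + eps).
rng = np.random.default_rng(1)
W = rng.standard_normal((4, 3))      # weights of a linear/1x1-conv op
x = rng.standard_normal(3)           # input
scale = rng.standard_normal(4)       # per-output-channel scale

assert np.allclose(scale * (W @ x), (scale[:, None] * W) @ x)
```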

@tqchen
Member

tqchen commented Feb 17, 2025

Thanks @srkreddy1238 for the great effort; glad to see that the new target-aware pipeline helps to simplify the flow here. Also cc @MasterJH5574.

@tqchen tqchen merged commit cc2f079 into apache:main Feb 19, 2025
15 checks passed
@tqchen
Member

tqchen commented Feb 19, 2025

Thanks @srkreddy1238 this is merged!

@srkreddy1238
Contributor Author

The codegen tests are added under unity->gpu, but I see that this CI pipeline only builds the GPU image and doesn't run the tests. Is this expected to be enabled soon?

@tqchen
Member

tqchen commented Feb 19, 2025

https://github.com/apache/tvm/blob/main/ci/jenkins/unity_jenkinsfile.groovy#L356 — the build stage of the unity pipeline does run the relax tests.

We can also follow up to fold the unity pipeline into the normal GPU pipeline.
