
hipblaslt flow support fp8 swizzle #1648

Merged: 1 commit merged from swizzle_client_f8 into ROCm:develop on Feb 14, 2025
Conversation

mengzcai (Contributor) commented on Feb 11, 2025:

[----------] Global test environment tear-down
[==========] 53119 tests from 13 test suites ran. (6789418 ms total)
[ PASSED ] 53119 tests.

./build/release/clients/staging/hipblaslt-bench -r f16_r --swizzleA -m 128 -n 128 -k 128 -v --transA T --transB N --initialization rand_int

T,N,0,1,128,128,128,1,128,16384,0,128,16384,128,16384,128,16384,f16_r,f16_r,f16_r,f16_r,f32_r,0,0,0,0,0,none,0,f16_r,391.991,8.55633,10.7,0.0161795,259235,0,1e-05,1e-05


./build/release/clients/staging/hipblaslt-bench -r f16_r --swizzleA -m 129 -n 129 -k 129 -v --transA T --transB N --initialization rand_int

T,N,0,1,129,129,129,1,129,16641,0,129,16641,129,16641,129,16641,f16_r,f16_r,f16_r,f16_r,f32_r,0,0,0,0,0,none,0,f16_r,41.3222,1.0097,103.9,0.0174304,246316,0,1e-05,1e-05


./build/release/clients/staging/hipblaslt-bench -r f8_fnuz_r --c_type f16_r --d_type f16_r --swizzleA -m 128 -n 128 -k 128 -v --transA T --transB N --initialization rand_int

T,N,0,1,128,128,128,1,128,16384,0,128,16384,128,16384,128,16384,f8_fnuz_r,f8_fnuz_r,f16_r,f16_r,f32_r,0,0,0,0,0,none,0,f16_r,364.722,5.3074,11.5,0.0155591,269572,0,1e-05,1e-05


./build/release/clients/staging/hipblaslt-bench -r f8_fnuz_r --c_type f16_r --d_type f16_r --swizzleA -m 129 -n 129 -k 129 -v --transA T --transB N --initialization rand_int

T,N,0,1,129,129,129,1,129,16641,0,129,16641,129,16641,129,16641,f8_fnuz_r,f8_fnuz_r,f16_r,f16_r,f32_r,0,0,0,0,0,none,0,f16_r,248.172,4.17593,17.3,0.0113768,377380,0,1e-05,1e-05
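The four runs above share one command template; as a convenience, a dry-run sketch that prints the equivalent hipblaslt-bench invocations (the binary path is taken from the runs above and may differ in your build tree):

```shell
# Dry run: print the hipblaslt-bench invocations used in the runs above.
print_bench_cmds() {
  bench=./build/release/clients/staging/hipblaslt-bench
  # f16 runs use -r f16_r; fp8 runs add explicit C/D output types.
  for prec in "f16_r" "f8_fnuz_r --c_type f16_r --d_type f16_r"; do
    for size in 128 129; do   # 129 exercises the odd-size path
      echo "$bench -r $prec --swizzleA -m $size -n $size -k $size -v --transA T --transB N --initialization rand_int"
    done
  done
}
print_bench_cmds
```

Piping the output to sh would execute the four benchmarks in sequence on a machine with the built clients.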

UseInitialStridesAB: false
UseInitialStridesCD: false
UseScaleAB: Scalar
UseScaleAlphaVec: 0
Collaborator:

Should enable SAV (UseScaleAlphaVec) for gridbase.

Contributor (author):

Fixed, thanks.

transB: N
alpha: 1
beta: [ 0.0, 2.0 ]
scaleA: [2]
Collaborator:

scaleA: [2] is a scale vector, which should not be used.

Contributor (author):

Both scaleA: [2] and scaleB: [2] have been deleted, thanks.
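The resulting change to the test YAML is just the removal of the two entries; a sketch based on the review comments above (the surrounding keys are as in the snippet being reviewed):

```yaml
# removed per review: these select vector scaling, which should not be used here
# scaleA: [2]
# scaleB: [2]
```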

bias_vector: [0, 1]
bias_type: f16_r
unit_check: 1
gpu_arch: '94[0-2]'
Collaborator:

Should be 942 only (not the full '94[0-2]' range).

Contributor (author):

Fixed, thanks.
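The corresponding fix is to narrow the gpu_arch pattern in the test YAML; a sketch, with the key name taken from the snippet above:

```yaml
# before: '94[0-2]' matched gfx940, gfx941, and gfx942
# after: gfx942 only, per review
gpu_arch: '942'
```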

UseE: false
UseInitialStridesAB: false
UseInitialStridesCD: false
UseScaleAB: Vector
Collaborator:

We need to support Scalar, not Vector.

Contributor (author):

Fixed, thanks.
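In the solution properties this corresponds to the value already shown in the earlier fp8 snippet; a sketch of the corrected line:

```yaml
# swizzle path supports scalar AB scaling only, per review
UseScaleAB: Scalar   # not Vector
```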

M: [128]
N: [128]
K: [128]
swizzle_a: true
Collaborator:

"swizzle_a: true" is already set in real_precisions_swizzleA_support, isn't it?

Contributor (author):

Yes, the redundant ones have been removed.
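Why the per-test line is redundant: YAML aliases expand the anchored mapping in place, so any test referencing the anchor already inherits swizzle_a. The anchor name comes from this PR; the entry shape below is hypothetical, not the exact schema:

```yaml
Definitions:
  - &real_precisions_swizzleA_support
    swizzle_a: true        # set once here (hypothetical placement)

Tests:
  - name: pre_checkin_swizzleA
    matmul: *real_precisions_swizzleA_support   # inherits swizzle_a: true
    # swizzle_a: true   <- redundant, removed per review
```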

@mengzcai force-pushed the swizzle_client_f8 branch 2 times, most recently from 638b95c to dc3f395 on February 12, 2025 18:36
Serge45 previously approved these changes Feb 13, 2025
category: pre_checkin
function:
matmul: *real_precisions_swizzleA_support
M: [128]
Collaborator:

Add some odd sizes.

Contributor (author):

Fixed, thanks.
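A minimal sketch of the requested change, reusing the odd size (129) already exercised by the bench runs above; the exact size list in the final PR is not shown here:

```yaml
M: [128, 129]
N: [128, 129]
K: [128, 129]
```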

jichangjichang previously approved these changes Feb 13, 2025
add logic yamls, sample

add gtest for fp16,fp8 swizzle

assert1
@boringmorning self-requested a review on February 14, 2025 14:58
@mengzcai merged commit 07110b7 into ROCm:develop on Feb 14, 2025
11 of 12 checks passed
@mengzcai deleted the swizzle_client_f8 branch on February 24, 2025 17:41
4 participants