Cache swizzled tensor for tuning #1686
base: develop
Conversation
```cpp
//TODO: Support more swizzling types, such as 32x32x8; currently we have 16x16x8 only.
if(needSwizzle)
...
//if no validation, skip the swizzle
```
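The TODO above refers to the 16x16x8 tile swizzle. As a generic illustration of what such a block-swizzled index mapping can look like (the function name and the exact layout here are assumptions for this sketch, not the library's actual implementation), consider:

```cpp
#include <cstddef>

// Generic sketch (not the library's actual layout): map element (row, col)
// of a row-major matrix with `cols` columns into a 16x16 block-swizzled
// layout whose inner dimension packs 8 contiguous elements, so each 16x16
// tile is stored as 16x2 groups of 8.
std::size_t swizzledOffset(std::size_t row, std::size_t col, std::size_t cols) {
    const std::size_t TILE = 16, VEC = 8;
    std::size_t tileRow = row / TILE, tileCol = col / TILE;
    std::size_t inRow  = row % TILE, inCol  = col % TILE;
    // Assumes cols is a multiple of 16; otherwise the matrix is padded first.
    std::size_t tilesPerRow = cols / TILE;
    std::size_t tileBase = (tileRow * tilesPerRow + tileCol) * TILE * TILE;
    // Within a tile: group columns into 8-wide vectors, interleave rows.
    std::size_t group = inCol / VEC; // 0 or 1 for a 16-wide tile
    return tileBase + (group * TILE + inRow) * VEC + inCol % VEC;
}
```

The padding assumption in the comment is exactly why the review question below matters: even when no permutation is needed, dimensions that are not tile multiples would still require padding.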
I understand we don't need to permute the tensor when validation is disabled, but is it correct to skip the padding as well?
After running some experiments comparing swizzled and non-swizzled layouts on an STA problem, I observed a performance disparity between the two. The latest commit uses an LRU cache to manage the cached tensors, balancing memory usage against runtime performance, and it now always performs the swizzle for STA/STB problems.
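A minimal sketch of how such an LRU cache might look — `SwizzledTensorCache`, the byte budget, and the string key are hypothetical names for this illustration, not the PR's actual code, and a host-side `std::vector<float>` stands in for the device buffer:

```cpp
#include <cstddef>
#include <list>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical LRU cache for swizzled tensors: evicts the
// least-recently-used entry once the byte budget is exceeded.
class SwizzledTensorCache {
public:
    explicit SwizzledTensorCache(std::size_t maxBytes) : maxBytes_(maxBytes) {}

    // Returns the cached tensor if present (marking it most-recently-used),
    // or nullptr on a miss.
    const std::vector<float>* get(const std::string& key) {
        auto it = index_.find(key);
        if (it == index_.end())
            return nullptr;
        // Move the entry to the front of the recency list.
        entries_.splice(entries_.begin(), entries_, it->second);
        return &it->second->tensor;
    }

    // Inserts a swizzled tensor, evicting LRU entries as needed.
    void put(const std::string& key, std::vector<float> tensor) {
        auto it = index_.find(key);
        if (it != index_.end()) {
            usedBytes_ -= it->second->tensor.size() * sizeof(float);
            entries_.erase(it->second);
            index_.erase(it);
        }
        usedBytes_ += tensor.size() * sizeof(float);
        entries_.push_front(Entry{key, std::move(tensor)});
        index_[key] = entries_.begin();
        // Evict from the back (least recently used) until within budget,
        // always keeping the entry just inserted.
        while (usedBytes_ > maxBytes_ && entries_.size() > 1) {
            usedBytes_ -= entries_.back().tensor.size() * sizeof(float);
            index_.erase(entries_.back().key);
            entries_.pop_back();
        }
    }

private:
    struct Entry {
        std::string key;
        std::vector<float> tensor;
    };
    std::size_t maxBytes_;
    std::size_t usedBytes_ = 0;
    std::list<Entry> entries_;
    std::unordered_map<std::string, std::list<Entry>::iterator> index_;
};
```

During tuning, the key would presumably encode the problem shape and layout, so repeated benchmark runs of the same problem reuse the already-swizzled copy instead of re-permuting on every iteration.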
Force-pushed from 0120521 to 4948354.
Re-layout will be skipped if validation is disabled.