[Feature](mluOpExecFFT): add bluestein fft #1213
Open: DanieeelLiu wants to merge 4 commits into Cambricon:master from DanieeelLiu:bluestein_fft
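1. How much space do the coefficients produced by generate() occupy, and is it certain that they fit in SRAM? Is their length that of the FFT twiddle factors, or the full N?
2. All cores use the same coefficients, right? In that case, does only one IPU core run generate, or does each core generate 1/4 of them? The latter should perform better.
3. Why not store them directly in NRAM? Is there not enough space?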
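The computation needs coefficients in three places, and all three sets are different. The idea is to compute them at call time; they only need to be generated once per call, so the time cost should be small. The earlier plan was for a single IPU core per cluster to compute them and store them in SRAM. SRAM is fairly large (2M), which should be enough for N. Because this is a composite operator stitched together from existing ones, the twiddle factors needed by the original FFT are not involved for now. Ming pointed out that having each core generate 1/4 is more appropriate. I had not previously seen an instruction function that can start from a non-zero index, but I have just found one.
As for NRAM, I am worried there would not be enough space: a larger N would not fit. For example, N = 4098 has to be padded to a length larger than 8192, which is fairly large, so the coefficients go in SRAM.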
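The "each core generates 1/4, starting from a non-zero index" idea can be sketched as follows. This is a minimal host-side illustration, not the code in this PR: it assumes the coefficients are the standard Bluestein chirp w[k] = exp(-i*pi*k^2/N) and that a cluster has 4 cores, and the names `chirp_slice`, `split_ranges`, and `generate_chirp_per_core` are hypothetical.

```python
import cmath


def chirp_slice(n_fft, start, count):
    """Return w[k] = exp(-i*pi*k^2/N) for k in [start, start + count).

    Assumption: this is the textbook Bluestein chirp; the three coefficient
    sets used by the PR are not shown in the thread.
    """
    # k*k is reduced modulo 2N first; the chirp is periodic in k^2 with
    # period 2N, so this keeps the angle small without changing the value.
    return [
        cmath.exp(-1j * cmath.pi * ((k * k) % (2 * n_fft)) / n_fft)
        for k in range(start, start + count)
    ]


def split_ranges(total, num_cores=4):
    """Split [0, total) into contiguous (start, count) pairs, one per core."""
    base, rem = divmod(total, num_cores)
    ranges = []
    start = 0
    for core in range(num_cores):
        # The first `rem` cores take one extra element.
        count = base + (1 if core < rem else 0)
        ranges.append((start, count))
        start += count
    return ranges


def generate_chirp_per_core(n_fft, num_cores=4):
    """Emulate every core generating its own slice into a shared buffer."""
    shared = []
    for start, count in split_ranges(n_fft, num_cores):
        shared.extend(chirp_slice(n_fft, start, count))
    return shared
```

For N = 4098 the split is uneven, so two of the four cores handle 1025 elements and the other two handle 1024. Each slice depends only on its own start index, which is why a generate instruction that can begin at a non-zero offset is what makes the per-core split possible.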
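For the generate step, if N exceeds 2M, is there a fallback plan that stores the coefficients in GDRAM? Customer workloads may not reach that size, but when supplementary functional test cases are added there is no guarantee this case will not be hit.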
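A rough way to reason about when such a fallback would be needed is sketched below. It is only an estimate under stated assumptions, not the operator's actual logic: "2M" is read as 2 MiB of SRAM fully available to the coefficients, elements are complex64 (8 bytes), the padded length is the next power of two at or above 2N - 1, and all three coefficient sets are as long as the padded length. The names `bluestein_padded_length` and `coeff_placement` are hypothetical.

```python
SRAM_BYTES = 2 * 1024 * 1024  # assumption: "2M" means 2 MiB, all usable
BYTES_PER_COMPLEX = 8         # assumption: complex64 (two float32 values)


def next_pow2(n):
    """Smallest power of two that is >= n."""
    p = 1
    while p < n:
        p *= 2
    return p


def bluestein_padded_length(n_fft):
    """Padded convolution length: Bluestein needs M >= 2N - 1.

    Rounding up to a power of two is a common choice, assumed here.
    """
    return next_pow2(2 * n_fft - 1)


def coeff_placement(n_fft, num_tables=3):
    """Return (memory, padded_length, footprint_bytes) for the coefficients.

    Assumption: `num_tables` coefficient sets, each of the padded length.
    """
    m = bluestein_padded_length(n_fft)
    footprint = num_tables * m * BYTES_PER_COMPLEX
    memory = "SRAM" if footprint <= SRAM_BYTES else "GDRAM"
    return memory, m, footprint
```

Under these assumptions N = 4098 pads to 16384 and needs 384 KiB, which fits in SRAM, while N = 65536 pads to 131072 and needs 3 MiB, which would require the GDRAM path. A functional test near that boundary would exercise both branches.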