Regenerate binaries on ISPC 1.27 #60

Draft: wants to merge 3 commits into base: main

MarijnS95
Member

https://github.com/ispc/ispc/releases/tag/v1.27.0
https://github.com/Traverse-Research/ispc-downsampler/actions/runs/15157304944

TODO: Still need to compare performance, but perhaps this helps on newer architectures. We might also have to evaluate whether we're simply missing some `TargetISA` flags relevant for newer SoCs.

MarijnS95 requested a review from Jasper-Bekkers on May 21, 2025 08:36
@MarijnS95
Member Author

Turns out there are a bunch of new generic target ISAs to streamline which vector sizes/widths to select, as well as Apple-specific CPU targets :)

@MarijnS95
Member Author

On the MacBook Air M4

Main @ 3556673

Downsample `square_test.png` using ispc_downsampler
                        time:   [38.827 ms 38.848 ms 38.884 ms]
Found 5 outliers among 100 measurements (5.00%)
  3 (3.00%) high mild
  2 (2.00%) high severe

This ispc-1.27 PR @ 3556673

Downsample `square_test.png` using ispc_downsampler
                        time:   [48.220 ms 48.253 ms 48.287 ms]
                        change: [+24.103% +24.190% +24.278%] (p = 0.00 < 0.05)
                        Performance has regressed.

Recompiling locally on the M4 Air (ispc 1.27.0 from brew, built with `cargo b -rF ispc`):

Downsample `square_test.png` using ispc_downsampler
                        time:   [46.576 ms 46.586 ms 46.596 ms]
                        change: [-3.5237% -3.4550% -3.3855%] (p = 0.00 < 0.05)
                        Performance has improved.

That's a significant performance deficit (about 24% slower than main with the prebuilt binaries, and still about 20% slower than main after recompiling locally), which we should investigate before merging. Playing around with the new CPU flags from Twinklebear/ispc-rs#42, switching to the generic ISAs, or removing `.target_isas()` altogether to compile natively for the host yields no improvement.
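To make the comparison between the three runs explicit, here is a minimal sketch that recomputes the relative changes from the point estimates (the middle value of each `time:` triple) quoted above. The helper name `percent_change` is made up for illustration, and criterion derives its own `change:` figures statistically, so its numbers can differ slightly from this plain ratio.

```rust
/// Relative change from `baseline_ms` to `candidate_ms`, in percent.
/// Positive means the candidate is slower than the baseline.
/// (Hypothetical helper, not part of criterion or ispc-downsampler.)
fn percent_change(baseline_ms: f64, candidate_ms: f64) -> f64 {
    (candidate_ms - baseline_ms) / baseline_ms * 100.0
}

fn main() {
    // Point estimates copied from the benchmark output in this comment.
    let main_ms = 38.848; // main branch
    let pr_prebuilt_ms = 48.253; // this PR, prebuilt ISPC 1.27 binaries
    let pr_local_ms = 46.586; // this PR, recompiled locally on the M4 Air

    println!(
        "prebuilt PR vs main:      {:+.2}%",
        percent_change(main_ms, pr_prebuilt_ms)
    );
    println!(
        "local build vs prebuilt:  {:+.2}%",
        percent_change(pr_prebuilt_ms, pr_local_ms)
    );
    println!(
        "local build vs main:      {:+.2}%",
        percent_change(main_ms, pr_local_ms)
    );
}
```

The last line is the figure that matters for the merge decision: the local rebuild recovers a few percent over the prebuilt binaries but remains roughly 20% behind main.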

Funny thing is, with the ISPC test this M4 Air whines a little, but it doesn't during resize 😓

MarijnS95 marked this pull request as draft on May 26, 2025 10:31