Add test & fix for unusual pitch in surface.premul_alpha()
#2882
Conversation
IMHO, we don't need to make the SSE2 version handle the unusual pitches, and should probably just fall back to the non-SIMD version.
As long as the AVX2 and SSE2 code can handle the pitch-is-multiple-of-bpp case, we should be good to go (and cover all real use cases?).
I guess the same approach should probably be taken for all SIMD surface manipulations, not just premul.
LGTM!
@@ -2841,7 +2841,8 @@ premul_surf_color_by_alpha(SDL_Surface *src, SDL_Surface *dst)
 // since we know dst is a copy of src we can simplify the normal checks
 #if !defined(__EMSCRIPTEN__)
 #if SDL_BYTEORDER == SDL_LIL_ENDIAN
-if ((PG_SURF_BytesPerPixel(src) == 4) && pg_has_avx2()) {
+if ((PG_SURF_BytesPerPixel(src) == 4) &&
+(src->pitch % PG_SURF_BytesPerPixel(src) == 0) && pg_has_avx2()) {
I think this test should be more robust. It should check for src->pitch == (src->w * PG_SURF_BytesPerPixel(src)),
because I can imagine a pitch that is still a multiple of bpp but not equal to the minimal possible pitch.
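To make the difference between the two checks concrete, here is a minimal sketch in plain C. The `surf_info` struct and both helper names are hypothetical stand-ins for illustration, not pygame or SDL API; they only mirror the `w`, `pitch` and bytes-per-pixel values used in the condition above.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the SDL_Surface fields used in the check. */
typedef struct {
    int w;               /* width in pixels */
    int pitch;           /* bytes per row, including any padding */
    int bytes_per_pixel; /* 4 for the 32-bit surfaces discussed here */
} surf_info;

/* The check in the diff: every row starts on a pixel boundary,
 * but rows may still carry whole pixels of padding. */
bool pitch_is_pixel_aligned(const surf_info *s)
{
    return s->pitch % s->bytes_per_pixel == 0;
}

/* The stricter check suggested in review: rows are tightly packed,
 * so the whole surface is one contiguous run of pixels. */
bool pitch_is_minimal(const surf_info *s)
{
    return s->pitch == s->w * s->bytes_per_pixel;
}
```

For a 5-pixel-wide, 4-byte-per-pixel surface, a pitch of 20 passes both checks, a pitch of 32 passes only the modulo check, and a pitch of 22 passes neither. Which check is sufficient depends on whether the SIMD loop walks the surface row by row or as one flat array.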
fixes #2750
From what I understand, these unusual, non-pixel-aligned surface pitches won't come up in practice on any modern desktop system (possibly on no modern system at all), so I prioritised fixing the least performance-sensitive versions of this function: SSE2 and the non-SIMD fallback. These are also the versions I originally wrote, so I had a better idea of how they were supposed to work.
The basic fix is to add the standard 'skip' value that all the blitters use to handle pitch differences between two surfaces; usually these are pixel aligned. In the SSE2 case, to deal with the half-pixel overlap in the pitch, we have to cast the skip value down to Uint8 to get 'channel' (single-byte) pointer increments, because we only want to skip two channels' worth of a pixel (2 bytes) rather than a whole pixel (4 bytes).
I think that makes sense anyway.
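As a rough illustration of the 'skip' idea described above, here is a scalar sketch that advances a byte pointer by the per-row padding. It is not the code from this PR: the function name `premul_rows`, the assumption that alpha is the fourth byte of each pixel, and the `(c * a + 127) / 255` rounding rule are all choices made for the example.

```c
#include <stdint.h>

/* Premultiply 4-byte pixels in place, honouring the row pitch.
 * Assumptions (not taken from the PR): alpha is byte 3 of each pixel,
 * and colour channels are scaled with (c * a + 127) / 255. */
void premul_rows(uint8_t *pixels, int w, int h, int pitch)
{
    /* Padding between the end of one row's pixel data and the start of
     * the next row. Counted in bytes, so it can be a fraction of a
     * pixel (e.g. 2 bytes when pitch is w * 4 + 2). */
    const int skip = pitch - w * 4;
    uint8_t *p = pixels;

    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++) {
            const unsigned a = p[3];
            p[0] = (uint8_t)((p[0] * a + 127) / 255);
            p[1] = (uint8_t)((p[1] * a + 127) / 255);
            p[2] = (uint8_t)((p[2] * a + 127) / 255);
            p += 4; /* next pixel; alpha itself is left unchanged */
        }
        p += skip; /* byte-level step over the row padding */
    }
}
```

Because `p` is a `uint8_t *`, adding `skip` moves by bytes rather than whole pixels. With a pixel-typed pointer such as `Uint32 *`, a 2-byte padding could not be expressed as an increment at all, which is why the skip has to be applied at single-byte granularity.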
This probably needs feedback from @itzpr3d4t0r and @Starbuck5 to see if they think what I've changed makes sense and if we need to do anything else here.