Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove copy from hsMatrix2SIMD in Metal renderer #1647

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

colincornaby
Copy link
Contributor

hsMatrix2SIMD used to do a transpose of the input hsMatrix - but shaders and code were rewritten to make this unnecessary. hsMatrix2SIMD still does a vestigial memory copy. This commit removes that memory copy.

This changes the meaning of hsMatrix2SIMD - so functions that need to make a copy now do so manually. Otherwise a direct cast is made. hsMatrix2SIMD has also been made constexpr.

This should result in a small (~1%) performance improvement in the Metal renderer main loop.

Comment on lines +68 to +72
constexpr const matrix_float4x4* hsMatrix2SIMD(const hsMatrix44& src)
{
//reinterperate_cast not allowed in constexpr
return (simd_float4x4*)(src.fMap);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This probably shouldn't be constexpr because there's not really anything that can be done at compile time. Instead, it should be either static inline or just a macro and use reinterpret_cast.

Copy link
Member

@dpogue dpogue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if it makes sense to return a matrix_float4x4& rather than a pointer, since we always seem to be immediately dereferencing it

hsMatrix2SIMD used to do a transpose of the input hsMatrix - but shaders and code were rewritten to make this unnecessary. hsMatrix2SIMD still does a vestigial memory copy. This commit removes that memory copy.

This changes the meaning of hsMatrix2SIMD - so functions that need to make a copy now do so manually. Otherwise a direct cast is made. hsMatrix2SIMD has also been made constexpr.

This should result in a small (~1%) performance improvement in the Metal renderer main loop.
@colincornaby colincornaby force-pushed the hsMatrix2SIMD-without-implicit-copy branch from b11d7cf to 14d89ad Compare January 1, 2025 22:51
@colincornaby
Copy link
Contributor Author

I've rebased this branch to include the Linux runner changes.

@dpogue - I'm still considering your question. No real thoughts yet - I think the underlying fMap in the matrix is a pointer type which is why it was a pointer here. Shame the matrix type includes extra data beyond the matrix, but using the fMap as the raw underlying matrix data is a small inconvenience.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants