Skip to content

Enable VPCLMULQDQ support on Rust 1.89+ #10

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

onethumb
Copy link
Contributor

The Problem

We're currently gating VPCLMULQDQ support (for a ~2X throughput improvement on modern x86_64 platforms) behind the vpclmulqdq feature flag since AVX512 hasn't been stable.

Stable AVX512 support just landed for Rust 1.89.0, so we'd like to start using it.

The Solution

Replace the vpclmulqdq feature flag with rustversion for Rust 1.89+ so we can safely use it.

Changes

  • Deprecate the vpclmulqdq feature flag and remove all of its usage
  • Add rustversion support and gate all AVX512 usage for Rust 1.89+

Planned version bump

  • Which: MINOR
  • Why: non-breaking performance improvement

Links

Notes

I'll wait to merge and ship this until 1.89 becomes stable on August 7, 2025.

Removes the “vpclmulqdq” feature flag in favor of Rust 1.89+ since
VPCLMULQDQ support is stabilized on 1.89.0.

https://releases.rs/docs/1.89.0/
@onethumb onethumb requested a review from Copilot June 10, 2025 20:28
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR enables AVX512 support with stable Rust 1.89+ by replacing the legacy vpclmulqdq feature flag with rustversion conditionals. Key changes include:

  • Updating attribute directives across CRC32 and architecture modules to use rustversion.
  • Removing the vpclmulqdq cfg-gates from multiple files.
  • Adjusting Cargo.toml to reflect the deprecation of the vpclmulqdq feature.

Reviewed Changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
src/crc32/fusion/x86.rs Replaced commented rustversion attributes with active ones, removing vpclmulqdq cfgs.
src/arch/x86.rs Updated attribute directives by removing vpclmulqdq cfgs.
src/arch/vpclmulqdq.rs Removed vpclmulqdq cfg-gate; now fully gated by rustversion.
src/arch/mod.rs Adjusted conditional definitions for update functions based on rustversion.
Cargo.toml Added rustversion dependency and updated the vpclmulqdq feature comment.
Comments suppressed due to low confidence (1)

src/arch/mod.rs:49

  • [nitpick] Multiple definitions of the 'update' function are conditionally compiled with similar logic; consider refactoring to reduce duplication and improve readability.
pub(crate) unsafe fn update(state: u64, bytes: &[u8], params: CrcParams) -> u64 {

# the features below aren't in use, are deprecated, and will be removed in the next MAJOR version
vpclmulqdq = [] # depreated, VPCLMULQDQ stabilized in Rust 1.89.0
Copy link
Preview

Copilot AI Jun 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct the spelling 'depreated' to 'deprecated' in the feature flag comment.

Suggested change
vpclmulqdq = [] # depreated, VPCLMULQDQ stabilized in Rust 1.89.0
vpclmulqdq = [] # deprecated, VPCLMULQDQ stabilized in Rust 1.89.0

Copilot uses AI. Check for mistakes.

@jg15
Copy link

jg15 commented Jun 13, 2025

What will happen if this program is run on a x86_64 platform that does not support vpclmulqdq (or other platform-specific instructions)?
I assume we should gracefully fall back to the most efficient set of instructions that the given processor supports.
Do we need to query the CPU features before choosing which algorithm to attempt?
Perhaps using something like this macro https://doc.rust-lang.org/std/macro.is_x86_feature_detected.html

@onethumb
Copy link
Contributor Author

onethumb commented Jun 13, 2025

@jg15

What will happen if this program is run on a x86_64 platform that does not support vpclmulqdq (or other platform-specific instructions)? I assume we should gracefully fall back to the most efficient set of instructions that the given processor supports. Do we need to query the CPU features before choosing which algorithm to attempt? Perhaps using something like this macro https://doc.rust-lang.org/std/macro.is_x86_feature_detected.html

It does currently gracefully degrade: VPCLMULQDQ -> AVX2 -> SSE -> software-based table lookup. It does so using both conditional compilation attributes and runtime functions like is_x86_feature_detected() and is_aarch64_feature_detected(), and I've tested it across a wide range of systems which have a wide variety of features. The intended degradation works as designed across all the systems I've tested to date.

If you have a system which doesn't work properly, or doesn't degrade to the correct target, I'd love to know what CPU it is so I can test it. 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants