-
Notifications
You must be signed in to change notification settings - Fork 499
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Import ML-KEM from mlkem-native/PQ code package #2041
base: main
Are you sure you want to change the base?
Conversation
7f66f23
to
274d30c
Compare
959c697
to
1eebf31
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR, @bhess. I surely didn't check all 540 files but focused on the integration logic: Please see the single comments. In general, the patch is way too large in my opinion: Isn't it possible that the upstream uses fewer hard-coded include paths and also provides a YML documentation of their implementation? "copy_from_upstream" ideally should be easy to run to regularly follow the upstream without the need to always create new patches: the latter only creates unnecessary work for OQS and consequently reduces the motivation for keeping the code up-to-date. Of course, if there is no further development expected in PQCP (is it?) this point is moot.
cb07260
to
d199bb3
Compare
Thanks for the review @baentsch. The patch size is now much reduced, basically only to adapt a few things to be able to use our fips202/sha3 implementation. For the upstream implementation it seems not straight-forward to move away from relative import paths. However, this is no longer an issue because I’ve added an option to copy_from_upstream that preserves the upstream folder structure. As a result, no further patching is required. |
Comments addressed. Discussion ongoing. Don't want to hinder other approvals moving things forward.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't attempted to review the code imported from PQCP (and I wouldn't have the expertise to do so anyhow), but the integration-related code looks good to me.
src/common/pqclean_shims/fips202.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it's time to rename this file to "upstream_shims" or something similar to reflect the fact that it's no longer exclusive to PQClean?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@SWilson4 In my "final" review I came across this unresolved comment: As you opened it, please close it before merge -- I guess by/before adding a separate issue so that your proposal above doesn't get forgotten.
The following measurements are on an Intel Xeon Gold 6338 CPU @ 2.00GHz, Turbo Boost turned off for consistent results:
1.1. Old implementation from main
1.2 mlkem-native implementation
-> We see a nice speedup in the generic code
2.1 Old implementation from main:
2.2 mlkem-native implementation
-> The key generation performance is very similar, but there's some performance degradation in encapsulation/decapsulation. This can likely be attributed to the additional key checks implemented in mlkem-native to meet FIPS203 requirements, which are more noticeable in the otherwise optimized code. Feedback from @mkannwischer would be appreciated to confirm if this aligns with your expectations. |
Thanks for the benchmarks. |
Apologies for the delay, we did some analysis and experiments in the background to understand the performance numbers reported by @bhess better. The impact of input validation is surprisingly large: We see up to 5% for encapsulation and 30% for decapsulation. Still, it does not explain all of the performance drop. mlkem-native is not adopting all AVX2 code from the pqcrystals repo, a conscious decision to limit the verification burden. In light of the above numbers, however, we revisited what merits implementing in intrinsics, and imported AVX2 intrinsics code for polynomial (de)compression routines from pqcrystals (with some robustness improvements). Below are the current performance numbers of OQS The performance is now close to the pqcrystals performance when discounting input validation. @bhess Could you also re-run the benchmarks on your machine?
|
Signed-off-by: Basil Hess <[email protected]> Pull update [full tests] Signed-off-by: Basil Hess <[email protected]> Update CT files [full tests] [extended tests] Signed-off-by: Basil Hess <[email protected]> Update filter_algs [full tests] Signed-off-by: Basil Hess <[email protected]> Switch to upstream repo with patches [full tests] Signed-off-by: Basil Hess <[email protected]> Update README.md [skip ci] Signed-off-by: Basil Hess <[email protected]> New import Copy-from-upstream option to preserve folder stucture Smaller patch: no include paths fixing & meta-ymls available upstream Documenting ct-passes file Update dependencies for CBOM [full tests] [extended tests] Signed-off-by: Basil Hess <[email protected]> Correct upstream branch [full tests] [extended tests] Signed-off-by: Basil Hess <[email protected]> Pull new update with fips202 context initialization [full tests] [extended tests] Signed-off-by: Basil Hess <[email protected]> pull from upstream [full tests] [extended tests] Signed-off-by: Basil Hess <[email protected]>
Thank you @hanno-becker for the updates in mlkem-native and the thorough analysis. I've updated the PR to sync with commit 84398e7230fa31ba4241f5eb36bdc3c1dbbd5bcd from upstream. The performance numbers on my machine are consistent with those on your Ice Lake instance.
Regarding the performance drop due to input validation, I believe this is a tradeoff we should accept in order to align with the ML-KEM/FIPS 203 requirements. If we discount the overhead from input validation, the performance drop compared to pqcrystals is well under 10% for encapsulation/decapsulation, which is within the 15% tolerance defined in the liboqs release process. Also, this seems not to be a regression of the liboqs integration as the same characteristics is visible in standalone mlkem-native. Considering that the new implementation also benefits from formal verification, I think this tradeoff is well worth making. Any thoughts or objections on this? |
Signed-off-by: Basil Hess <[email protected]>
e9f23a9
to
e7fc57b
Compare
Thank you, @bhess! There are CI failures, but they seem unrelated to this PR? @praveksharma @dstebila @baentsch @SWilson4 Anything else needed, or are we good to go? |
I rebased to main. It looks like the CUDA build is also failing on the main branch. @praveksharma, do you have any insights on why that might be? |
This is likely happening because of a misconfigured CI image becuase of which CMake cannot find the CUDA compiler; here is the relevant issue: #2056. The issue should not effect this PR. |
Hmm -- when looking at the current PR, I still see a patch file: How does this fit with the above statement @bhess? |
Basically LGTM. Some (I hope only) housekeeping comments I'd suggest addressing before merge, tagging @bhess. Thanks @hanno-becker @mkannwischer for helping us streamline the OQS "upstream zoo" a bit! |
Signed-off-by: Basil Hess <[email protected]>
Thank you, @baentsch, for the careful review -- good calls regarding the patch size and benchmarks! |
Thank you for the review @baentsch !
The context of my statement was the following:
The patch mainly adjusts the include paths from @hanno-becker: Once the PR receives a second approval, would it be possible to create a tag or release upstream? This way, we could reference the tag/release instead of pointing to a specific commit in the main branch, as we're doing currently. |
@bhess Yes, that will be fine. Once we have the second approval here, let's sync once more on what upstream commit to update this PR to, and then tag that. |
@bhess Could you update to the latest @baentsch @praveksharma @dstebila If you're happy, could one of you provide the 2nd approval? |
Signed-off-by: Basil Hess <[email protected]>
@bhess We created a tag v1.0.0-alpha.2. Could you update, hopefully a final time? |
Signed-off-by: Basil Hess <[email protected]>
Thank you @hanno-becker! The PR is updated. |
@praveksharma @baentsch @dstebila Are we good to go? |
This PR tracks the integration of ML-KEM from the mlkem-native upstream repository.
It replaces the current ML-KEM implementation in liboqs, which was previously imported from pq-crystals, with the mlkem-native implementation from PQCP.
Some features of mlkem-native:
The upstream code recently had a v1.0.0-alpha release and is actively maintained. The goal is to synchronize the PR with an upcoming tagged release of mlkem-native.
Additionally, the upstream code includes enhanced key validation as defined by FIPS 203 by default, which resolves issue #1951.
Closes #1951.
TODOs: