SLOTHY: Superoptimize AArch64 INTT #748
b99e917 to 308107d
Mac Mini (M1, 2020) benchmarks (opt)
| Benchmark suite | Current: 308107d | Previous: 2948ece | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 46398 cycles | 46414 cycles | 1.00 |
| ML-DSA-44 sign | 131854 cycles | 132001 cycles | 1.00 |
| ML-DSA-44 verify | 47791 cycles | 47801 cycles | 1.00 |
| ML-DSA-65 keypair | 81320 cycles | 81338 cycles | 1.00 |
| ML-DSA-65 sign | 218021 cycles | 218252 cycles | 1.00 |
| ML-DSA-65 verify | 80057 cycles | 80072 cycles | 1.00 |
| ML-DSA-87 keypair | 132433 cycles | 132477 cycles | 1.00 |
| ML-DSA-87 sign | 279539 cycles | 279832 cycles | 1.00 |
| ML-DSA-87 verify | 130404 cycles | 130427 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
Mac Mini (M1, 2020) benchmarks (no-opt)
| Benchmark suite | Current: 308107d | Previous: 2948ece | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 114369 cycles | 114382 cycles | 1.00 |
| ML-DSA-44 sign | 428771 cycles | 428727 cycles | 1.00 |
| ML-DSA-44 verify | 121521 cycles | 121527 cycles | 1.00 |
| ML-DSA-65 keypair | 196266 cycles | 196236 cycles | 1.00 |
| ML-DSA-65 sign | 697624 cycles | 697597 cycles | 1.00 |
| ML-DSA-65 verify | 196490 cycles | 196451 cycles | 1.00 |
| ML-DSA-87 keypair | 323099 cycles | 323068 cycles | 1.00 |
| ML-DSA-87 sign | 880192 cycles | 880149 cycles | 1.00 |
| ML-DSA-87 verify | 327024 cycles | 326950 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
oqs-bot
left a comment
Intel Xeon 4th gen (c7i)
| Benchmark suite | Current: 308107d | Previous: 2948ece | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 35497 cycles | 35143 cycles | 1.01 |
| ML-DSA-44 sign | 121073 cycles | 121147 cycles | 1.00 |
| ML-DSA-44 verify | 38211 cycles | 38343 cycles | 1.00 |
| ML-DSA-65 keypair | 63058 cycles | 62033 cycles | 1.02 |
| ML-DSA-65 sign | 201564 cycles | 200225 cycles | 1.01 |
| ML-DSA-65 verify | 63214 cycles | 62992 cycles | 1.00 |
| ML-DSA-87 keypair | 95083 cycles | 94071 cycles | 1.01 |
| ML-DSA-87 sign | 234938 cycles | 230071 cycles | 1.02 |
| ML-DSA-87 verify | 94549 cycles | 95065 cycles | 0.99 |
This comment was automatically generated by workflow using github-action-benchmark.
oqs-bot
left a comment
Intel Xeon 4th gen (c7i) (no-opt)
| Benchmark suite | Current: 308107d | Previous: 2948ece | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 95910 cycles | 95749 cycles | 1.00 |
| ML-DSA-44 sign | 349210 cycles | 348923 cycles | 1.00 |
| ML-DSA-44 verify | 101723 cycles | 101599 cycles | 1.00 |
| ML-DSA-65 keypair | 163623 cycles | 163400 cycles | 1.00 |
| ML-DSA-65 sign | 565612 cycles | 564714 cycles | 1.00 |
| ML-DSA-65 verify | 166016 cycles | 165902 cycles | 1.00 |
| ML-DSA-87 keypair | 267621 cycles | 267773 cycles | 1.00 |
| ML-DSA-87 sign | 723411 cycles | 723169 cycles | 1.00 |
| ML-DSA-87 verify | 273113 cycles | 272914 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
Arm Cortex-A55 (Snapdragon 888) benchmarks (opt)
| Benchmark suite | Current: 308107d | Previous: 2948ece | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 276121 cycles | 285745 cycles | 0.97 |
| ML-DSA-44 sign | 825067 cycles | 894122 cycles | 0.92 |
| ML-DSA-44 verify | 273925 cycles | 280893 cycles | 0.98 |
| ML-DSA-65 keypair | 475121 cycles | 486508 cycles | 0.98 |
| ML-DSA-65 sign | 1365693 cycles | 1463974 cycles | 0.93 |
| ML-DSA-65 verify | 451767 cycles | 465274 cycles | 0.97 |
| ML-DSA-87 keypair | 805587 cycles | 832477 cycles | 0.97 |
| ML-DSA-87 sign | 1835576 cycles | 2000183 cycles | 0.92 |
| ML-DSA-87 verify | 773596 cycles | 798546 cycles | 0.97 |
This comment was automatically generated by workflow using github-action-benchmark.
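For readers interpreting these tables: the Ratio column is consistent with Current divided by Previous, rounded to two decimals, so values below 1.00 mean the new commit needs fewer cycles. The following is a minimal sketch of that arithmetic using the three signing rows from the Cortex-A55 (opt) table above. The function and variable names are illustrative and are not part of github-action-benchmark.

```python
# Rows copied from the Cortex-A55 (opt) table: (benchmark, current, previous).
ROWS = [
    ("ML-DSA-44 sign", 825067, 894122),
    ("ML-DSA-65 sign", 1365693, 1463974),
    ("ML-DSA-87 sign", 1835576, 2000183),
]


def ratio(current: int, previous: int) -> float:
    """Cycle-count ratio; below 1.0 means the current commit is faster."""
    return current / previous


def summarize(rows):
    """Return (name, ratio formatted to two decimals) for each row."""
    return [(name, f"{ratio(cur, prev):.2f}") for name, cur, prev in rows]


for name, r in summarize(ROWS):
    print(f"{name}: ratio {r}")
```

Under this reading, a ratio of 0.92 corresponds to roughly 8% fewer cycles for signing on this core.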
oqs-bot
left a comment
AMD EPYC 3rd gen (c6a)
| Benchmark suite | Current: 308107d | Previous: 2948ece | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 69206 cycles | 69206 cycles | 1 |
| ML-DSA-44 sign | 184245 cycles | 184387 cycles | 1.00 |
| ML-DSA-44 verify | 69151 cycles | 69106 cycles | 1.00 |
| ML-DSA-65 keypair | 119169 cycles | 119372 cycles | 1.00 |
| ML-DSA-65 sign | 294909 cycles | 295603 cycles | 1.00 |
| ML-DSA-65 verify | 115188 cycles | 115375 cycles | 1.00 |
| ML-DSA-87 keypair | 203578 cycles | 203802 cycles | 1.00 |
| ML-DSA-87 sign | 388167 cycles | 387905 cycles | 1.00 |
| ML-DSA-87 verify | 195857 cycles | 195698 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
oqs-bot
left a comment
Graviton4
| Benchmark suite | Current: 308107d | Previous: 2948ece | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 68739 cycles | 69152 cycles | 0.99 |
| ML-DSA-44 sign | 202889 cycles | 208483 cycles | 0.97 |
| ML-DSA-44 verify | 70765 cycles | 71242 cycles | 0.99 |
| ML-DSA-65 keypair | 121544 cycles | 122178 cycles | 0.99 |
| ML-DSA-65 sign | 331812 cycles | 342020 cycles | 0.97 |
| ML-DSA-65 verify | 117588 cycles | 118474 cycles | 0.99 |
| ML-DSA-87 keypair | 198973 cycles | 199864 cycles | 1.00 |
| ML-DSA-87 sign | 428789 cycles | 439835 cycles | 0.97 |
| ML-DSA-87 verify | 194463 cycles | 195400 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
oqs-bot
left a comment
Intel Xeon 3rd gen (c6i)
| Benchmark suite | Current: 308107d | Previous: 2948ece | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 57596 cycles | 57148 cycles | 1.01 |
| ML-DSA-44 sign | 180276 cycles | 179819 cycles | 1.00 |
| ML-DSA-44 verify | 61330 cycles | 61125 cycles | 1.00 |
| ML-DSA-65 keypair | 99552 cycles | 99980 cycles | 1.00 |
| ML-DSA-65 sign | 295981 cycles | 297406 cycles | 1.00 |
| ML-DSA-65 verify | 101064 cycles | 101235 cycles | 1.00 |
| ML-DSA-87 keypair | 154022 cycles | 154320 cycles | 1.00 |
| ML-DSA-87 sign | 353635 cycles | 354598 cycles | 1.00 |
| ML-DSA-87 verify | 153339 cycles | 153733 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
oqs-bot
left a comment
Graviton2
| Benchmark suite | Current: 308107d | Previous: 2948ece | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 115425 cycles | 115728 cycles | 1.00 |
| ML-DSA-44 sign | 364333 cycles | 373939 cycles | 0.97 |
| ML-DSA-44 verify | 119521 cycles | 119966 cycles | 1.00 |
| ML-DSA-65 keypair | 198063 cycles | 199386 cycles | 0.99 |
| ML-DSA-65 sign | 597460 cycles | 615426 cycles | 0.97 |
| ML-DSA-65 verify | 194953 cycles | 196625 cycles | 0.99 |
| ML-DSA-87 keypair | 324508 cycles | 326647 cycles | 0.99 |
| ML-DSA-87 sign | 761477 cycles | 784067 cycles | 0.97 |
| ML-DSA-87 verify | 320619 cycles | 322593 cycles | 0.99 |
This comment was automatically generated by workflow using github-action-benchmark.
oqs-bot
left a comment
Graviton4 (no-opt)
| Benchmark suite | Current: 308107d | Previous: 2948ece | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 128248 cycles | 128410 cycles | 1.00 |
| ML-DSA-44 sign | 457074 cycles | 456811 cycles | 1.00 |
| ML-DSA-44 verify | 136266 cycles | 136364 cycles | 1.00 |
| ML-DSA-65 keypair | 220925 cycles | 220811 cycles | 1.00 |
| ML-DSA-65 sign | 745590 cycles | 746754 cycles | 1.00 |
| ML-DSA-65 verify | 220422 cycles | 220734 cycles | 1.00 |
| ML-DSA-87 keypair | 365098 cycles | 365323 cycles | 1.00 |
| ML-DSA-87 sign | 944236 cycles | 943162 cycles | 1.00 |
| ML-DSA-87 verify | 369008 cycles | 369319 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
oqs-bot
left a comment
AMD EPYC 3rd gen (c6a) (no-opt)
| Benchmark suite | Current: 308107d | Previous: 2948ece | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 135820 cycles | 135683 cycles | 1.00 |
| ML-DSA-44 sign | 540804 cycles | 540540 cycles | 1.00 |
| ML-DSA-44 verify | 148978 cycles | 148890 cycles | 1.00 |
| ML-DSA-65 keypair | 229064 cycles | 228278 cycles | 1.00 |
| ML-DSA-65 sign | 891984 cycles | 889005 cycles | 1.00 |
| ML-DSA-65 verify | 238549 cycles | 237556 cycles | 1.00 |
| ML-DSA-87 keypair | 373012 cycles | 374889 cycles | 0.99 |
| ML-DSA-87 sign | 1104955 cycles | 1108915 cycles | 1.00 |
| ML-DSA-87 verify | 387383 cycles | 388708 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
oqs-bot
left a comment
Graviton3
| Benchmark suite | Current: 308107d | Previous: 2948ece | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 72610 cycles | 73277 cycles | 0.99 |
| ML-DSA-44 sign | 213227 cycles | 221350 cycles | 0.96 |
| ML-DSA-44 verify | 75647 cycles | 76289 cycles | 0.99 |
| ML-DSA-65 keypair | 128411 cycles | 129514 cycles | 0.99 |
| ML-DSA-65 sign | 353239 cycles | 367951 cycles | 0.96 |
| ML-DSA-65 verify | 125610 cycles | 126662 cycles | 0.99 |
| ML-DSA-87 keypair | 206984 cycles | 210841 cycles | 0.98 |
| ML-DSA-87 sign | 445951 cycles | 467821 cycles | 0.95 |
| ML-DSA-87 verify | 205858 cycles | 206335 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
oqs-bot
left a comment
Intel Xeon 3rd gen (c6i) (no-opt)
| Benchmark suite | Current: 308107d | Previous: 2948ece | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 158807 cycles | 159036 cycles | 1.00 |
| ML-DSA-44 sign | 565120 cycles | 565513 cycles | 1.00 |
| ML-DSA-44 verify | 170104 cycles | 170189 cycles | 1.00 |
| ML-DSA-65 keypair | 270096 cycles | 270337 cycles | 1.00 |
| ML-DSA-65 sign | 925021 cycles | 926558 cycles | 1.00 |
| ML-DSA-65 verify | 276337 cycles | 276745 cycles | 1.00 |
| ML-DSA-87 keypair | 451464 cycles | 451390 cycles | 1.00 |
| ML-DSA-87 sign | 1182648 cycles | 1184163 cycles | 1.00 |
| ML-DSA-87 verify | 461290 cycles | 461812 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
oqs-bot
left a comment
Graviton2 (no-opt)
| Benchmark suite | Current: 308107d | Previous: 2948ece | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 214615 cycles | 214703 cycles | 1.00 |
| ML-DSA-44 sign | 782904 cycles | 782961 cycles | 1.00 |
| ML-DSA-44 verify | 230602 cycles | 230723 cycles | 1.00 |
| ML-DSA-65 keypair | 385499 cycles | 385351 cycles | 1.00 |
| ML-DSA-65 sign | 1309947 cycles | 1310117 cycles | 1.00 |
| ML-DSA-65 verify | 376028 cycles | 376231 cycles | 1.00 |
| ML-DSA-87 keypair | 607490 cycles | 607560 cycles | 1.00 |
| ML-DSA-87 sign | 1655444 cycles | 1656700 cycles | 1.00 |
| ML-DSA-87 verify | 618074 cycles | 618424 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
oqs-bot
left a comment
AMD EPYC 4th gen (c7a)
| Benchmark suite | Current: 308107d | Previous: 2948ece | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 40979 cycles | 41813 cycles | 0.98 |
| ML-DSA-44 sign | 129104 cycles | 129629 cycles | 1.00 |
| ML-DSA-44 verify | 43279 cycles | 43610 cycles | 0.99 |
| ML-DSA-65 keypair | 72204 cycles | 72392 cycles | 1.00 |
| ML-DSA-65 sign | 211025 cycles | 212110 cycles | 0.99 |
| ML-DSA-65 verify | 73573 cycles | 73873 cycles | 1.00 |
| ML-DSA-87 keypair | 109362 cycles | 109524 cycles | 1.00 |
| ML-DSA-87 sign | 247835 cycles | 249633 cycles | 0.99 |
| ML-DSA-87 verify | 112117 cycles | 110170 cycles | 1.02 |
This comment was automatically generated by workflow using github-action-benchmark.
oqs-bot
left a comment
Graviton3 (no-opt)
| Benchmark suite | Current: 308107d | Previous: 2948ece | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 138756 cycles | 138845 cycles | 1.00 |
| ML-DSA-44 sign | 493651 cycles | 493534 cycles | 1.00 |
| ML-DSA-44 verify | 148309 cycles | 148510 cycles | 1.00 |
| ML-DSA-65 keypair | 242702 cycles | 242325 cycles | 1.00 |
| ML-DSA-65 sign | 808721 cycles | 808829 cycles | 1.00 |
| ML-DSA-65 verify | 240717 cycles | 240967 cycles | 1.00 |
| ML-DSA-87 keypair | 396445 cycles | 396798 cycles | 1.00 |
| ML-DSA-87 sign | 1027156 cycles | 1026810 cycles | 1.00 |
| ML-DSA-87 verify | 401766 cycles | 402035 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
oqs-bot
left a comment
AMD EPYC 4th gen (c7a) (no-opt)
| Benchmark suite | Current: 308107d | Previous: 2948ece | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 120131 cycles | 121135 cycles | 0.99 |
| ML-DSA-44 sign | 455762 cycles | 455691 cycles | 1.00 |
| ML-DSA-44 verify | 130384 cycles | 130171 cycles | 1.00 |
| ML-DSA-65 keypair | 204342 cycles | 206078 cycles | 0.99 |
| ML-DSA-65 sign | 734962 cycles | 734493 cycles | 1.00 |
| ML-DSA-65 verify | 208973 cycles | 210555 cycles | 0.99 |
| ML-DSA-87 keypair | 337202 cycles | 337989 cycles | 1.00 |
| ML-DSA-87 sign | 924298 cycles | 922942 cycles | 1.00 |
| ML-DSA-87 verify | 344714 cycles | 345245 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
SpacemiT K1 8 (Banana Pi F3) benchmarks (no-opt)
| Benchmark suite | Current: 308107d | Previous: 2948ece | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 827587 cycles | 827337 cycles | 1.00 |
| ML-DSA-44 sign | 3333337 cycles | 3331425 cycles | 1.00 |
| ML-DSA-44 verify | 920188 cycles | 919913 cycles | 1.00 |
| ML-DSA-65 keypair | 1402437 cycles | 1404508 cycles | 1.00 |
| ML-DSA-65 sign | 5443872 cycles | 5442876 cycles | 1.00 |
| ML-DSA-65 verify | 1470631 cycles | 1469680 cycles | 1.00 |
| ML-DSA-87 keypair | 2304223 cycles | 2306569 cycles | 1.00 |
| ML-DSA-87 sign | 6818211 cycles | 6817332 cycles | 1.00 |
| ML-DSA-87 verify | 2407130 cycles | 2402780 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
Arm Cortex-A55 (Snapdragon 888) benchmarks (no-opt)
| Benchmark suite | Current: 308107d | Previous: 2948ece | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 464921 cycles | 464556 cycles | 1.00 |
| ML-DSA-44 sign | 2212617 cycles | 2208657 cycles | 1.00 |
| ML-DSA-44 verify | 546747 cycles | 545962 cycles | 1.00 |
| ML-DSA-65 keypair | 779753 cycles | 777602 cycles | 1.00 |
| ML-DSA-65 sign | 3629690 cycles | 3608012 cycles | 1.01 |
| ML-DSA-65 verify | 850546 cycles | 847136 cycles | 1.00 |
| ML-DSA-87 keypair | 1255823 cycles | 1253990 cycles | 1.00 |
| ML-DSA-87 sign | 4472890 cycles | 4443324 cycles | 1.01 |
| ML-DSA-87 verify | 1361597 cycles | 1361371 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
Arm Cortex-A76 (Raspberry Pi 5) benchmarks (opt)
| Benchmark suite | Current: 308107d | Previous: 2948ece | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 114069 cycles | 114842 cycles | 0.99 |
| ML-DSA-44 sign | 361125 cycles | 371794 cycles | 0.97 |
| ML-DSA-44 verify | 118214 cycles | 119310 cycles | 0.99 |
| ML-DSA-65 keypair | 197806 cycles | 199034 cycles | 0.99 |
| ML-DSA-65 sign | 597002 cycles | 614739 cycles | 0.97 |
| ML-DSA-65 verify | 194692 cycles | 196379 cycles | 0.99 |
| ML-DSA-87 keypair | 323963 cycles | 326202 cycles | 0.99 |
| ML-DSA-87 sign | 760578 cycles | 783241 cycles | 0.97 |
| ML-DSA-87 verify | 320320 cycles | 322445 cycles | 0.99 |
This comment was automatically generated by workflow using github-action-benchmark.
Arm Cortex-A76 (Raspberry Pi 5) benchmarks (no-opt)
| Benchmark suite | Current: 308107d | Previous: 2948ece | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 214154 cycles | 213868 cycles | 1.00 |
| ML-DSA-44 sign | 782106 cycles | 784182 cycles | 1.00 |
| ML-DSA-44 verify | 230079 cycles | 230054 cycles | 1.00 |
| ML-DSA-65 keypair | 384873 cycles | 384979 cycles | 1.00 |
| ML-DSA-65 sign | 1326370 cycles | 1314553 cycles | 1.01 |
| ML-DSA-65 verify | 375442 cycles | 375792 cycles | 1.00 |
| ML-DSA-87 keypair | 606660 cycles | 606983 cycles | 1.00 |
| ML-DSA-87 sign | 1652718 cycles | 1654364 cycles | 1.00 |
| ML-DSA-87 verify | 617702 cycles | 618211 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
Arm Cortex-A72 (Raspberry Pi 4) benchmarks (opt)
| Benchmark suite | Current: 308107d | Previous: 2948ece | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 227884 cycles | 229441 cycles | 0.99 |
| ML-DSA-44 sign | 640944 cycles | 675710 cycles | 0.95 |
| ML-DSA-44 verify | 231954 cycles | 238273 cycles | 0.97 |
| ML-DSA-65 keypair | 389453 cycles | 405551 cycles | 0.96 |
| ML-DSA-65 sign | 1045279 cycles | 1085789 cycles | 0.96 |
| ML-DSA-65 verify | 379960 cycles | 383505 cycles | 0.99 |
| ML-DSA-87 keypair | 655079 cycles | 677320 cycles | 0.97 |
| ML-DSA-87 sign | 1347200 cycles | 1446952 cycles | 0.93 |
| ML-DSA-87 verify | 633744 cycles | 643185 cycles | 0.99 |
This comment was automatically generated by workflow using github-action-benchmark.
Arm Cortex-A72 (Raspberry Pi 4) benchmarks (no-opt)
| Benchmark suite | Current: 308107d | Previous: 2948ece | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 308894 cycles | 313340 cycles | 0.99 |
| ML-DSA-44 sign | 1203258 cycles | 1210106 cycles | 0.99 |
| ML-DSA-44 verify | 332723 cycles | 345251 cycles | 0.96 |
| ML-DSA-65 keypair | 581830 cycles | 573758 cycles | 1.01 |
| ML-DSA-65 sign | 1985838 cycles | 2022886 cycles | 0.98 |
| ML-DSA-65 verify | 547594 cycles | 546429 cycles | 1.00 |
| ML-DSA-87 keypair | 891983 cycles | 880785 cycles | 1.01 |
| ML-DSA-87 sign | 2521635 cycles | 2523515 cycles | 1.00 |
| ML-DSA-87 verify | 903880 cycles | 908652 cycles | 0.99 |
This comment was automatically generated by workflow using github-action-benchmark.
308107d to 770940b
We first need to merge slothy-optimizer/slothy#363 and then update the SLOTHY commit.
770940b to 1001650
It has been merged and I updated the SLOTHY commit.
Resolves #206 Signed-off-by: Matthias J. Kannwischer <[email protected]>
1001650 to 3efff95
Resolves #206 (Run Neon NTT/iNTT through SLOTHY)
Depends on slothy-optimizer/slothy#363 (AArch64: Add d-form ldr)
Depends on #749 (CI: Switch from pqcp-arm64 to Github Arm runners)
iNTT speed-ups: