The pfx format has a huge performance loss with clang #5604

claudioandre-br · 2024-12-02T21:48:10Z

I can easily reproduce:

gcc from Ubuntu 24

$ john | head -1; sleep 3; john --test --format=pfx
John the Ripper 1.9.0-jumbo-1+bleeding-b3bd5ea707 2024-12-01 03:06:08 +0100 OMP [linux-gnu 64-bit x86_64 AVX2 AC]
Will run 8 OpenMP threads
Benchmarking: pfx, (.pfx, .p12) [PKCS#12 PBE (SHA1/SHA2) 256/256 AVX2 8x]... (8xOMP) DONE
Speed for cost 1 (iteration count) of 2048, cost 2 (mac-type [1:SHA1 224:SHA224 256:SHA256 384:SHA384 512:SHA512]) of 1
Raw:	26842 c/s real, 3696 c/s virtual

clang from Ubuntu 24.

$ run/john | head -1; sleep 3; run/john --test --format=pfx
John the Ripper 1.9.0-jumbo-1+bleeding-364b1ca435 2024-12-02 06:20:23 +0100 [linux-gnu 64-bit x86_64 AVX2 AC]
Benchmarking: pfx, (.pfx, .p12) [PKCS#12 PBE (SHA1/SHA2) 256/256 AVX2 8x2]... DONE
Speed for cost 1 (iteration count) of 2048, cost 2 (mac-type [1:SHA1 224:SHA224 256:SHA256 384:SHA384 512:SHA512]) of 1
Raw:	9792 c/s real, 9792 c/s virtual

$ run/john | head -1; sleep 3; run/john --test --format=pfx
John the Ripper 1.9.0-jumbo-1+bleeding-0251a0f0f8 2024-12-02 18:56:33 -0300 [linux-gnu 64-bit x86_64 AVX2 AC]
Benchmarking: pfx, (.pfx, .p12) [PKCS#12 PBE (SHA1/SHA2) 256/256 AVX2 8x]... DONE
Speed for cost 1 (iteration count) of 2048, cost 2 (mac-type [1:SHA1 224:SHA224 256:SHA256 384:SHA384 512:SHA512]) of 1
Raw:	11328 c/s real, 11328 c/s virtual


solardiz · 2024-12-02T21:56:27Z

We probably need to re-tune these:

#ifndef SIMD_PARA_SHA1
#if defined(__INTEL_COMPILER)
#define SIMD_PARA_SHA1                  1
#elif defined(__clang__)
#define SIMD_PARA_SHA1                  2
#elif defined(__llvm__)
#define SIMD_PARA_SHA1                  2
#elif defined(__XOP__)
#define SIMD_PARA_SHA1                  2
#else
#define SIMD_PARA_SHA1                  1
#endif
#endif

Can you try making it 1 for clang?
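
For reference, the experiment asked for above could look like the sketch below. It is the same block with only the `__clang__` branch changed from 2 to 1; it has not been benchmarked here and is not a proposed final tuning. The helper `simd_para_sha1()` is hypothetical and exists only so the selected value can be inspected.

```c
/* Sketch only: original tuning block with the clang branch set to 1. */
#ifndef SIMD_PARA_SHA1
#if defined(__INTEL_COMPILER)
#define SIMD_PARA_SHA1                  1
#elif defined(__clang__)
#define SIMD_PARA_SHA1                  1	/* was 2; value under test */
#elif defined(__llvm__)
#define SIMD_PARA_SHA1                  2
#elif defined(__XOP__)
#define SIMD_PARA_SHA1                  2
#else
#define SIMD_PARA_SHA1                  1
#endif
#endif

/* Hypothetical helper (not part of John the Ripper): returns the
 * interleave factor the preprocessor selected for this compiler. */
static int simd_para_sha1(void)
{
	return SIMD_PARA_SHA1;
}
```

Because the whole block sits inside `#ifndef SIMD_PARA_SHA1`, defining the macro on the compiler command line (for example `-DSIMD_PARA_SHA1=1`) should select the same value without editing the file, assuming the build passes that flag through to this translation unit.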

Also, and not limited to SHA-1: I think we never considered increasing these PARA values for AVX-512, where we have twice as many SIMD registers (now up to 32). That is unrelated to this issue, which was seen on AVX2 (so still 16 registers).
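
To make the register argument concrete, here is a rough back-of-envelope sketch. It assumes that each interleaved SHA-1 instance keeps its five state words (a, b, c, d, e) in one SIMD register each, and it ignores how a particular compiler actually allocates or spills registers. All names are hypothetical and it is not code from the project.

```c
/* Back-of-envelope sketch; all names here are hypothetical. */
enum { SHA1_STATE_WORDS = 5 };			/* a, b, c, d, e */
enum { REGS_AVX2 = 16, REGS_AVX512 = 32 };	/* SIMD registers on x86-64 */

/* SIMD registers needed just to hold the SHA-1 state of
 * `para` interleaved instances. */
static int sha1_state_regs(int para)
{
	return SHA1_STATE_WORDS * para;
}

/* Registers left over for message words, round constants and
 * temporaries once the state is resident. */
static int regs_left(int total_regs, int para)
{
	return total_regs - sha1_state_regs(para);
}
```

Under these assumptions, `regs_left(REGS_AVX2, 2)` is 6 while `regs_left(REGS_AVX512, 2)` is 22, which is why a PARA of 2 can be tight on AVX2 yet leave headroom on AVX-512. Only benchmarking can tell which value is actually faster for a given compiler.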

solardiz added the enhancement label Dec 2, 2024
