Description
I came across this article and was wondering if an approach like this was tried for the following tree building code with repeated SIMD min lookups instead of maintaining a min-heap.
https://www.intel.com/content/www/us/en/developer/articles/technical/fast-computation-of-huffman-codes.html
fdeflate/src/compress/bitstream.rs, lines 68 to 70 in cbaf854:

```rust
let mut lengths = [0u8; 286];
let mut codes = [0u16; 286];
build_huffman_tree(&frequencies, &mut lengths, &mut codes, 15);
```
If I understand correctly, they found that above ~140 elements the vector method was preferred over the heap. However, they're using AVX2 with u16 frequencies, whereas I think the current code uses u32, and I'm not sure whether it's possible to efficiently get the min position with implicit autovectorization.
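One way the min position can sometimes be recovered without explicit intrinsics is the classic pack-value-and-index trick: fold each 16-bit frequency and its index into a single u32 and reduce with a plain `min`, which is a more autovectorization-friendly shape than tracking a separate best-index variable. This is only a hedged sketch of that idea (the function name and tie-breaking behavior are my own, not from fdeflate or the Intel article):

```rust
/// Hypothetical sketch: find the minimum frequency and its position by
/// packing (frequency << 16) | index into a u32 and taking a plain min.
/// The smallest packed word carries both the min value (high half) and
/// its index (low half); ties resolve to the lowest index.
fn min_value_and_pos(freqs: &[u16]) -> (u16, usize) {
    let packed_min = freqs
        .iter()
        .enumerate()
        .map(|(i, &f)| ((f as u32) << 16) | (i as u32))
        .min()
        .expect("non-empty slice");
    ((packed_min >> 16) as u16, (packed_min & 0xFFFF) as usize)
}

fn main() {
    // Minimum frequency is 1, at index 3.
    let freqs = [5u16, 3, 9, 1, 7];
    assert_eq!(min_value_and_pos(&freqs), (1, 3));
    println!("{:?}", min_value_and_pos(&freqs));
}
```

Whether the compiler actually vectorizes this reduction depends on the target features, and the packing limits it to 16-bit frequencies and at most 65536 symbols, so it wouldn't directly apply if the current code needs full u32 counts.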
I don't know what impact this would have on overall runtime, since the numbers they report are just for the tree building.