In the current implementation, `BZ3OmpCompressor` does not compress anything until its internal buffer holds enough data for every thread to compress one block. In other words, it waits until `num_threads` blocks have been passed to the `compress` method; until then, `compress` just returns `b""`. This is called lazy compression, and it guarantees maximum performance. That is fine for the compressor, but the decompressor is a different story. The decompressor has no way of knowing when the stream ends. If it kept buffering input, the caller might conclude that the input was not enough to decompress a block and drop it. So the decompressor must assume that the input could end at any time, which means it cannot buffer input in order to decompress with multiple threads. Instead, it decompresses as soon as its buffer holds enough data for one thread (although it is better if more blocks arrive at the same time), so it degenerates into a single-threaded decompressor. An effective way to avoid this is to feed it at least `num_threads` blocks at a time. For `BZIP3File` usage, you can
from bz3 import compression
compression.BUFFER_SIZE = 300*10**6
to increase the buffer size, so that more blocks are received at once.
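The difference between the two buffering policies can be illustrated with a toy model. This is only a sketch, not the `bz3` implementation: the class `LazyBlockBuffer`, its `feed` method, and the fixed `block_size` are hypothetical names and simplifications introduced here (real compressed blocks vary in size). Setting `min_blocks` to `num_threads` mimics the lazy compressor, and setting it to 1 mimics the decompressor.

```python
from typing import List


class LazyBlockBuffer:
    """Toy model of the buffering policy (hypothetical, not part of bz3)."""

    def __init__(self, block_size: int, min_blocks: int) -> None:
        self.block_size = block_size
        # min_blocks = num_threads models the lazy compressor,
        # min_blocks = 1 models the decompressor.
        self.min_blocks = min_blocks
        self._buf = bytearray()

    def feed(self, data: bytes) -> List[bytes]:
        """Return the blocks that would be processed in parallel by this call."""
        self._buf += data
        ready = len(self._buf) // self.block_size
        if ready < self.min_blocks:
            return []  # analogous to compress() returning b""
        size = self.block_size
        blocks = [bytes(self._buf[i * size:(i + 1) * size]) for i in range(ready)]
        del self._buf[:ready * size]  # keep any partial block for the next call
        return blocks


# Three threads, one block arriving per call.
compressor = LazyBlockBuffer(block_size=4, min_blocks=3)
decompressor = LazyBlockBuffer(block_size=4, min_blocks=1)

comp_batches = [len(compressor.feed(b"abcd")) for _ in range(3)]
dec_batches = [len(decompressor.feed(b"abcd")) for _ in range(3)]

# Feeding three blocks in one call lets the decompressor use all threads.
big_batch = len(decompressor.feed(b"abcd" * 3))
```

In this model the compressor processes nothing until the third call and then handles three blocks together, while the decompressor handles one block per call unless a single call delivers several blocks. That is why a larger read buffer helps.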