Description
Continuing from #274:
Arithmetic coding in JPEG is defined in CCITT Recommendation T.81 (identical to ISO/IEC 10918-1), together with T.81 Corrigendum 1, as part of the extended sequential and progressive processes. Although it is not part of the baseline process, it is a fully standardized feature of JPEG. In addition, ITU-T Recommendation T.851, together with T.851 Corrigendum 1, later defined a baseline arithmetic coding process, further standardizing this mode.
Historically, arithmetic coding saw little adoption due to patent concerns, and many implementations focused exclusively on baseline Huffman coding. With the relevant patents long expired, the original barrier to adoption no longer applies.
Published estimates of the expected size savings vary. Some sources mention typical reductions of around 5–7% compared to optimized Huffman coding, while other reports suggest higher gains. For example, the Chromium issue "Arithmetic coded JPEG support" reports savings of up to 16% compared to optimized Huffman coding and 19% compared to non-optimized Huffman coding.
In a recent local test, the savings for one input were far larger than these estimates. A 1200-dpi scan of an A4 page with black text on a light-green textured background was encoded using cjpeg -q 1 in three configurations:
- Arithmetic coding
- Optimized Huffman coding
- Default (non-optimized) Huffman coding
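For anyone who wants to repeat the comparison, here is a minimal sketch of how the three invocations could be built. It assumes a cjpeg build with arithmetic coding enabled and uses the standard `-arithmetic` and `-optimize` switches; the exact command lines used in the test above are not recorded here, and the input name `scan.pnm` and the output file names are hypothetical.

```python
import os
import shutil
import subprocess


def cjpeg_commands(src, quality=1):
    """Return {label: argv} for the three entropy-coding configurations."""
    base = ["cjpeg", "-quality", str(quality)]
    variants = {
        "arithmetic": ["-arithmetic"],       # arithmetic entropy coding
        "huffman-optimized": ["-optimize"],  # optimized Huffman tables
        "huffman-default": [],               # default Huffman tables
    }
    return {
        label: base + flags + ["-outfile", f"{label}.jpg", src]
        for label, flags in variants.items()
    }


if __name__ == "__main__":
    src = "scan.pnm"  # hypothetical input file name
    # Only run the encoder if both cjpeg and the input are actually present.
    if shutil.which("cjpeg") and os.path.exists(src):
        for label, argv in cjpeg_commands(src).items():
            subprocess.run(argv, check=True)
            print(label, os.path.getsize(f"{label}.jpg"), "bytes")
```

Building the argument lists separately from running them makes it easy to inspect the commands first, or to swap in a different quality setting.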
File sizes:
- input PNM file: 339,278,652 bytes
- output arithmetic-coded JPEG: 239,224 bytes
- output optimized Huffman-coded JPEG: 807,963 bytes
- output non-optimized Huffman-coded JPEG: 1,928,235 bytes
All three JPEG outputs were equally readable, and subjectively readable enough. In this case, arithmetic coding reduced the file size by a factor of about 3.4 compared to optimized Huffman coding and by a factor of about 8.1 compared to non-optimized Huffman coding.
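The factors quoted above follow directly from the listed file sizes. As a quick check, the arithmetic can be sketched as follows; the byte counts are copied from the list above, while the dictionary keys and function names are only illustrative.

```python
# Output sizes in bytes, copied from the file-size list above.
sizes = {
    "arithmetic": 239_224,
    "huffman-optimized": 807_963,
    "huffman-default": 1_928_235,
}


def factor_vs(label):
    """How many times larger the given output is than the arithmetic-coded one."""
    return sizes[label] / sizes["arithmetic"]


def saving_vs(label):
    """Fractional size reduction of arithmetic coding relative to the given output."""
    return 1 - sizes["arithmetic"] / sizes[label]


for label in ("huffman-optimized", "huffman-default"):
    print(f"{label}: factor {factor_vs(label):.2f}, saving {saving_vs(label):.1%}")
```

Expressed as percentages, these correspond to savings of roughly 70% and 88%, far above the 5–19% range cited from published estimates.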
While such gains may not be typical for all images, this example shows that arithmetic coding can provide very substantial size reductions in at least some cases.
Given that:
- arithmetic coding is standardized in T.81 and T.851,
- the patent situation is no longer a concern, and
- the potential size savings can be significant,
support for arithmetic-coded JPEGs would improve standards coverage and could provide meaningful benefits in some use cases. It would also help address downstream issues such as https://bugs.debian.org/1127449.
Thank you for considering this feature.