The fastest 128-bit and 256-bit non-crypto hash, passes all tests, and under 140 source lines of code. Nice! — mrbluecoat
Tip
Example usage (hash everything):
find . -type f -print0 | xargs -0 -I{} rainsum {}
Note
NEWS DEC'24:
rainsum
v1.2 now includes improvements adapted from Frank J. T. Wojcik's SMHasher3, the gold standard for evaluating non-cryptographic hash functions. SMHasher3 provides extensive speedups, bug fixes, and enhancements over SMHasher and SMHasher2. From their repo: "SMHasher3 is a test suite for evaluating non-cryptographic hash functions."
This repository features the Rainbow and Rainstorm hash functions (collectively, Rain Hashes), created by Cris at DOSAYGO and licensed under Apache-2.0. All size variants of both hashes pass all tests in SMHasher3. Relevant results are available in the results/
directory, or at the SMHasher3 GitLab repository. The CLI tool API is similar to standard tools like sha256sum
, but with more switches to select algorithm and digest length. The hashes produce digests ranging from 64 through to 512 bits wide. See the table below for details.
The codebase includes:
- A C++ reference implementation
- A WASM port (for Node.js and browser environments)
- A Makefile for building all targets
Algorithm | Speed | Hash Size | Purpose | Core Mixing Function | Security |
---|---|---|---|---|---|
Rainbow | ~5.79 GiB/s | 64, 128, 256 bits | General-purpose, non-crypto hash | Multiplication, addition/subtraction, rotation, XOR | Not designed as cryptographic |
Rainstorm | ~1.91 GiB/s (4 rds) | 64 to 512 bits | Proposed as a crypto-hash but requires analyses | Addition/subtraction, rotation, XOR | Unvetted, no formal analysis |
- Usage Options
- Assets
- Building
- Benchmark
- Repository Structure
- Rainbow
- Rainstorm - Unvetted for Security
- Note on Cryptographic Intent
- Genesis
- License
- Rainsum Field Manual
- Developer Information
- Story
-
Node.js Library Install:
npm i @dosyago/rainsum@latest
Usage:
import { rainbowHash, rainstormHash } from '@dosyago/rainsum'; const seed = 0x0; const hash = await rainbowHash(256, seed, 'Hello there!'); const intendedCryptoHash = await rainstormHash(512, seed, Buffer.from('Hello there!'));
-
Global Binary via NPM
npm i -g @dosyago/rainsum@latest jsrsum --help
-
Build everything (For WASM ensure Emscripten is installed):
make clean && make sudo make install # optional, installs 'rainsum' globally
If you encounter troubles on macOS use the build script:
./scripts/build.sh
-
Use as a library or CLI:
- Link
rainstorm.o
orrainbow.o
into your project, or includetool.h
and source files from./src/
. - Use the
rainsum
CLI directly:rainsum file.txt # By default: Rainbow, 256-bit hash rainsum --help
- Link
jsrsum
(via NPM + WASM) mirrors the CLI/API of rainsum
but runs slower.
rainsum
CLI tool and associated.o
object files (C++)- WASM binary
- JS CLI tool (
jsrsum
) with an ES Module API exportingrainbowHash
andrainstormHash
- Scripts:
./scripts/bench.mjs
: Benchmark JS WASM vs C++./scripts/verify.mjs
: Verify correctness- Other utility scripts
make clean && make
Optional installation:
sudo make install
C++ vs WASM (Node.js)
These benchmarks highlight the performance gap between native C++ and the WASM-based implementation under Node.js:
Test Input & Size Run C++ Version WASM Version Speedup
10 bytes (input1) 1 6,964,250 ns 129,892,625 ns 18x (C++ wins!)
100 bytes (input2) 1 6,876,208 ns 127,928,542 ns 18x (C++ wins!)
1,000 bytes (input3) 1 6,583,125 ns 127,668,333 ns 19x (C++ wins!)
10,000 bytes (input4) 1 5,920,167 ns 127,438,666 ns 21x (C++ wins!)
100,000 bytes (input5) 1 6,214,916 ns 127,551,791 ns 20x (C++ wins!)
1,000,000 bytes (input6) 1 5,321,500 ns 126,986,167 ns 23x (C++ wins!)
10,000,000 bytes (input7) 1 9,879,042 ns 132,929,875 ns 13x (C++ wins!)
100,000,000 bytes (input8) 1 49,761,375 ns 207,500,917 ns 4x (C++ wins!)
Runs 2 and 3 show similar results, consistently demonstrating that the native C++ version outperforms the WASM variant by a factor of ~3x to ~25x, depending on input size.
.
├─ LICENSE.txt
├─ Makefile
├─ README.md
├─ docs/
├─ js/
├─ rain/
│ ├─ bin/
│ └─ obj/
├─ results/
│ ├─ dieharder/
│ └─ smhasher3/
├─ scripts/
└─ src/
├─ common.h
├─ rainbow.cpp
├─ rainstorm.cpp
├─ rainsum.cpp
├─ tool.h
└─ ...
Rainbow is a fast, general-purpose non-cryptographic hash that can produce 64-bit, 128-bit, or 256-bit hashes. It is simple, compact (~140 LOC), and passes all SMHasher3 tests for its 64-bit variant. Its prime-based mixing function yields strong avalanche properties.
Rainstorm is a slower, more complex hash function inspired by cryptographic designs. It supports output sizes of 64 to 512 bits and adjustable rounds. While it uses operations reminiscent of cryptographic hashes, it is not formally analyzed by any 3rd-party. Consider it experimental, but designed to be secure.
Rainstorm's design includes concepts from cryptographic hashing, but without formal analysis, it should not be considered secure. Feedback and analysis are welcome.
The mixing functions were inspired by Discohash and refined iteratively using SMHasher3 over a few weeks, guided by experience and intuition. Careful prime selection ensured broad avalanche properties.
Apache-2.0 © 2023 - 2024 Cris & DOSAYGO Corporation. With improvements © Frank J. T. Wojcik 2023
Rainsum is a CLI tool for hashing input data using Rainbow or Rainstorm. It supports two primary modes:
- Digest Mode: Produces a fixed-length hexadecimal output.
- Stream Mode: Iteratively feeds the hash back into itself to produce a variable-length stream of binary output.
A JavaScript WASM version (jsrsum
) offers similar functionality at reduced speed.
rainsum [OPTIONS] [INFILE]
If no INFILE
is provided, input is read from standard input.
-m, --mode [digest|stream]
: Defaultdigest
.-a, --algorithm [bow|storm]
: Chooserainbow
(bow
) orrainstorm
(storm
). Defaultstorm
.-s, --size [64-256|64-512]
: Hash size in bits. Default256
. Rainbow supports 64,128,256. Rainstorm supports 64,128,256,512.-o, --output-file FILE
: Write output toFILE
.-t, --test-vectors
: Run test vectors.-l, --output-length HASHES
: For stream mode, number of iterations.--seed VALUE
: Sets the seed (64-bit number or string). A string seed is hashed by Rainstorm to produce a 64-bit seed.-h, --help
: Show help.-v, --version
: Print version.
Produces a single, fixed-size hash in hex. For example:
rainsum -m digest -a storm -s 256 -o output.txt input.txt
Generates a stream of hashes by repeatedly feeding the previous hash into the function. Specify iterations with -l
. Example:
rainsum -m stream -a storm -s 512 -l 1000000 -o output.txt input.txt
bow
(Rainbow): 64, 128, 256 bitsstorm
(Rainstorm): 64, 128, 256, 512 bits
Use -t, --test-vectors
to verify correctness against known inputs.
Use --seed
to set a custom seed. String seeds are hashed to a 64-bit value.
rainsum --help
rainsum --version
A modern C++17 compiler is required. Run make
to build. Ensure necessary build tools (like xcode-select
on macOS) are installed. On macOS you should use ./scripts/build.sh
to ensure that path to the correct compiler and SDK is accurate, modify that script if needed for your system.
Rainsum provides flexible hashing through Rainbow or Rainstorm. Experiment freely and enjoy the convenience of multiple modes and sizes.
The hash definitions may evolve. Changes that affect outputs will be indicated by version increments.
Refer to --test-vectors
and results/
for known outputs and verification.
make
sudo make install
rainsum --help
Improvements, analyses, and faster implementations are welcomed.
This is my hash included in the best hash testing library out there "SMHasher3" maintained by Frank T Wojcik: https://gitlab.com/fwojcik/smhasher3 This test lib has many improvements over SMHasher 1 & 2, listed on the previous link. Results are: https://gitlab.com/fwojcik/smhasher3/-/blob/main/results/README.md
Rainbow is the fastest 128-bit hash, and the fastest 256-bit hash (non-crypto). The 8th fastest 64-bit hash (by family, or 9th fastest overall, and 13-th fatest overall if you include 32-bit hashes). The advantage of rainbow is its easily scalable output size (64, 128 and 256), its high quality (passes all the tests), and its utter simplicity: it is under 140 source lines of code, easily readable
The random constants are primes that were selected based on their avalanche quality under a multiply modulo operation. This part of the development was interesting different primes had very distinctly different avalanche qualities. The highest quality primes caused good avalanche (~ 50% bit flip probability) across the widest possible set of bits, on average. These are quite rare. A lot of large primes only avalanche across a narrow range of bits, even in a 128-bit space. The search program took a couple of days to discover all these primes running on a 2020s era MBP. Primes are chosen because they give a complete residue set under the modulus, ensuring a long cycle length at least regarding the nominal test operation.
The rest of the hash was developed by trial and error using my intuition on developing hashes arising from long experience of doing so, and using SMHasher3 to evaluate the results, by iterating to improve and re-testing, over a period of a couple of weeks in the holidays a few years ago. I started making hash functions in my teens as a fun hobby.