Skip to content

The fastest 128-bit and 256-bit hash, passes all tests, and under 140 source lines of code. API library and CLI tool in C++ and NodeJS/Wasm

License

Notifications You must be signed in to change notification settings

DOSAYGO-Research/rain

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Rain

The fastest 128-bit and 256-bit non-crypto hash, passes all tests, and under 140 source lines of code. Nice!mrbluecoat

Tip

Example usage (hash everything):

 find . -type f -print0 | xargs -0 -I{} rainsum {}

Note

NEWS DEC'24:

rainsum v1.2 now includes improvements adapted from Frank J. T. Wojcik's SMHasher3, the gold standard for evaluating non-cryptographic hash functions. SMHasher3 provides extensive speedups, bug fixes, and enhancements over SMHasher and SMHasher2. From their repo: "SMHasher3 is a test suite for evaluating non-cryptographic hash functions."

This repository features the Rainbow and Rainstorm hash functions (collectively, Rain Hashes), created by Cris at DOSAYGO and licensed under Apache-2.0. All size variants of both hashes pass all tests in SMHasher3. Relevant results are available in the results/ directory, or at the SMHasher3 GitLab repository. The CLI tool API is similar to standard tools like sha256sum, but with more switches to select algorithm and digest length. The hashes produce digests ranging from 64 through to 512 bits wide. See the table below for details.

The codebase includes:

  • A C++ reference implementation
  • A WASM port (for Node.js and browser environments)
  • A Makefile for building all targets
  • An experimental cryptosystem based on plaintext mining ciphers
Algorithm Speed Hash Size Purpose Core Mixing Function Security
Rainbow ~5.79 GiB/s 64, 128, 256 bits General-purpose, non-crypto hash Multiplication, addition/subtraction, rotation, XOR Not designed as cryptographic
Rainstorm ~1.91 GiB/s (4 rds) 64 to 512 bits Proposed as a crypto-hash but requires analyses Addition/subtraction, rotation, XOR Unvetted, no formal analysis

Table of Contents


Usage Options

JavaScript (WASM) Channel

  1. Node.js Library Install:

    npm i @dosyago/rainsum@latest

    Usage:

    import { rainbowHash, rainstormHash } from '@dosyago/rainsum';
    
    const seed = 0x0;
    const hash = await rainbowHash(256, seed, 'Hello there!');
    const intendedCryptoHash = await rainstormHash(512, seed, Buffer.from('Hello there!'));
  2. Global Binary via NPM

    npm i -g @dosyago/rainsum@latest
    jsrsum --help

Native Channel

  1. Build everything (For WASM ensure Emscripten is installed):

    make clean && make
    sudo make install  # optional, installs 'rainsum' globally

    If you encounter troubles on macOS use the build script:

    ./scripts/build.sh
  2. Use as a library or CLI:

    • Link rainstorm.o or rainbow.o into your project, or include tool.h and source files from ./src/.
    • Use the rainsum CLI directly:
      rainsum file.txt
      # By default: Rainbow, 256-bit hash
      rainsum --help

jsrsum (via NPM + WASM) mirrors the CLI/API of rainsum but runs slower.


Assets

  • rainsum CLI tool and associated .o object files (C++)
  • WASM binary
  • JS CLI tool (jsrsum) with an ES Module API exporting rainbowHash and rainstormHash
  • Scripts:
    • ./scripts/bench.mjs: Benchmark JS WASM vs C++
    • ./scripts/verify.mjs: Verify correctness
    • Other utility scripts

Building

make clean && make

Optional installation:

sudo make install

Benchmark

C++ vs WASM (Node.js)

These benchmarks highlight the performance gap between native C++ and the WASM-based implementation under Node.js:

Test Input & Size          Run   C++ Version        WASM Version       Speedup
10 bytes (input1)          1     6,964,250 ns     129,892,625 ns     18x (C++ wins!)
100 bytes (input2)         1     6,876,208 ns     127,928,542 ns     18x (C++ wins!)
1,000 bytes (input3)       1     6,583,125 ns     127,668,333 ns     19x (C++ wins!)
10,000 bytes (input4)      1     5,920,167 ns     127,438,666 ns     21x (C++ wins!)
100,000 bytes (input5)     1     6,214,916 ns     127,551,791 ns     20x (C++ wins!)
1,000,000 bytes (input6)   1     5,321,500 ns     126,986,167 ns     23x (C++ wins!)
10,000,000 bytes (input7)  1     9,879,042 ns     132,929,875 ns     13x (C++ wins!)
100,000,000 bytes (input8) 1    49,761,375 ns     207,500,917 ns      4x (C++ wins!)

Runs 2 and 3 show similar results, consistently demonstrating that the native C++ version outperforms the WASM variant by a factor of ~3x to ~25x, depending on input size.


Repository Structure

.
├─ LICENSE.txt
├─ Makefile
├─ README.md
├─ docs/
├─ js/
├─ rain/
│  ├─ bin/
│  └─ obj/
├─ results/
│  ├─ dieharder/
│  └─ smhasher3/
├─ scripts/
└─ src/
   ├─ common.h
   ├─ rainbow.cpp
   ├─ rainstorm.cpp
   ├─ rainsum.cpp
   ├─ tool.h
   └─ ...

Rainbow

Rainbow is a fast, general-purpose non-cryptographic hash that can produce 64-bit, 128-bit, or 256-bit hashes. It is simple, compact (~140 LOC), and passes all SMHasher3 tests for its 64-bit variant. Its prime-based mixing function yields strong avalanche properties.


Rainstorm - Unvetted for Security

Rainstorm is a slower, more complex hash function inspired by cryptographic designs. It supports output sizes of 64 to 512 bits and adjustable rounds. While it uses operations reminiscent of cryptographic hashes, it is not formally analyzed by any 3rd-party. Consider it experimental, but designed to be secure.

Documents

  • Rainstorm Formal Spec (Draft): PDF
  • Rainstorm Crypto Note: PDF

Note on Cryptographic Intent

Rainstorm's design includes concepts from cryptographic hashing, but without formal analysis, it should not be considered secure. Feedback and analysis are welcome.


Genesis

The mixing functions were inspired by Discohash and refined iteratively using SMHasher3 over a few weeks, guided by experience and intuition. Careful prime selection ensured broad avalanche properties.


License

Apache-2.0 © 2023 - 2024 Cris & DOSAYGO Corporation. With improvements © Frank J. T. Wojcik 2023


Rainsum Field Manual

1. Introduction

Rainsum is a CLI tool for hashing input data using Rainbow or Rainstorm. It supports two primary modes:

  • Digest Mode: Produces a fixed-length hexadecimal output.
  • Stream Mode: Iteratively feeds the hash back into itself to produce a variable-length stream of binary output.

A JavaScript WASM version (jsrsum) offers similar functionality at reduced speed.

2. Basic Usage

2.1 Command Structure

rainsum [OPTIONS] [INFILE]

If no INFILE is provided, input is read from standard input.

2.2 Options

  • -m, --mode [digest|stream]: Default digest.
  • -a, --algorithm [bow|storm]: Choose rainbow (bow) or rainstorm (storm). Default storm.
  • -s, --size [64-256|64-512]: Hash size in bits. Default 256. Rainbow supports 64,128,256. Rainstorm supports 64,128,256,512.
  • -o, --output-file FILE: Write output to FILE.
  • -t, --test-vectors: Run test vectors.
  • -l, --output-length HASHES: For stream mode, number of iterations.
  • --seed VALUE: Sets the seed (64-bit number or string). A string seed is hashed by Rainstorm to produce a 64-bit seed.
  • -h, --help: Show help.
  • -v, --version: Print version.

3. Modes of Operation

3.1 Digest Mode

Produces a single, fixed-size hash in hex. For example:

rainsum -m digest -a storm -s 256 -o output.txt input.txt

3.2 Stream Mode

Generates a stream of hashes by repeatedly feeding the previous hash into the function. Specify iterations with -l. Example:

rainsum -m stream -a storm -s 512 -l 1000000 -o output.txt input.txt

4. Hash Algorithms and Sizes

  • bow (Rainbow): 64, 128, 256 bits
  • storm (Rainstorm): 64, 128, 256, 512 bits

5. Test Vectors

Use -t, --test-vectors to verify correctness against known inputs.

6. Seed Values

Use --seed to set a custom seed. String seeds are hashed to a 64-bit value.

7. Help and Version Information

  • rainsum --help
  • rainsum --version

8. Compilation

A modern C++17 compiler is required. Run make to build. Ensure necessary build tools (like xcode-select on macOS) are installed. On macOS you should use ./scripts/build.sh to ensure that path to the correct compiler and SDK is accurate, modify that script if needed for your system.

9. Conclusion

Rainsum provides flexible hashing through Rainbow or Rainstorm. Experiment freely and enjoy the convenience of multiple modes and sizes.


Developer Information

Stability

The hash definitions may evolve. Changes that affect outputs will be indicated by version increments.

Test Vectors

Refer to --test-vectors and results/ for known outputs and verification.

Building and Installing

make
sudo make install
rainsum --help

Contributions

Improvements, analyses, and faster implementations are welcomed.


Story

This is my hash included in the best hash testing library out there "SMHasher3" maintained by Frank T Wojcik: https://gitlab.com/fwojcik/smhasher3 This test lib has many improvements over SMHasher 1 & 2, listed on the previous link. Results are: https://gitlab.com/fwojcik/smhasher3/-/blob/main/results/README.md

Rainbow is the fastest 128-bit hash, and the fastest 256-bit hash (non-crypto). The 8th fastest 64-bit hash (by family, or 9th fastest overall, and 13-th fatest overall if you include 32-bit hashes). The advantage of rainbow is its easily scalable output size (64, 128 and 256), its high quality (passes all the tests), and its utter simplicity: it is under 140 source lines of code, easily readable

The random constants are primes that were selected based on their avalanche quality under a multiply modulo operation. This part of the development was interesting different primes had very distinctly different avalanche qualities. The highest quality primes caused good avalanche (~ 50% bit flip probability) across the widest possible set of bits, on average. These are quite rare. A lot of large primes only avalanche across a narrow range of bits, even in a 128-bit space. The search program took a couple of days to discover all these primes running on a 2020s era MBP. Primes are chosen because they give a complete residue set under the modulus, ensuring a long cycle length at least regarding the nominal test operation.

The rest of the hash was developed by trial and error using my intuition on developing hashes arising from long experience of doing so, and using SMHasher3 to evaluate the results, by iterating to improve and re-testing, over a period of a couple of weeks in the holidays a few years ago. I started making hash functions in my teens as a fun hobby.