Authors: Takeki Sudo, Kazushi Ahara (Meiji University), Shizuo Kaji (Kyushu University / Kyoto University)
CubicalRipser is an adaptation of Ulrich Bauer's Ripser, specialized for fast computation of persistent homology of cubical complexes.
- High performance for cubical complexes up to 4D
- C++ command-line binaries and Python modules
- PyTorch integration for differentiable workflows
- Utility helpers for loading, plotting, and vectorization
- Flexible filtrations with both V- and T-constructions
- Binary coefficients (field F2)
- Optional creator/destroyer locations in outputs
If you use this software in research, please cite:
```bibtex
@misc{2005.12692,
  author = {Shizuo Kaji and Takeki Sudo and Kazushi Ahara},
  title = {Cubical Ripser: Software for computing persistent homology of image and volume data},
  year = {2020},
  eprint = {arXiv:2005.12692}
}
```

- Getting Started
- Installation
- Python Usage
- Command-Line Usage
- Input Formats
- V and T Constructions
- Creator and Destroyer Cells
- Deep Learning Integration
- Timing Comparisons
- Testing and Regression Checks
- Other Software for Cubical Complex PH
- Release Notes
- License
- Google Colab Demo: CubicalRipser in Action
- Topological Data Analysis Tutorial: Hands-On Guide
- Applications in Deep Learning

```bash
pip install -U cripser
```

```python
import numpy as np
import cripser

arr = np.load("sample/rand2d.npy")
ph = cripser.compute_ph(arr, filtration="V", maxdim=2)
print(ph[:5])
```

Build binaries first (see Installation), then:

```bash
./build/cubicalripser --maxdim 2 --output out.csv sample/3dimsample.txt
./build/tcubicalripser --maxdim 2 --output out_t.csv sample/3dimsample.txt
```

```bash
pip install -U cripser
```

If wheel compatibility is an issue on your platform:

```bash
pip uninstall -y cripser
pip install --no-binary cripser cripser
```

Requirements:

- Python >= 3.8
- CMake >= 3.15
- C++14+ compiler (GCC, Clang, MSVC; `src/Makefile` defaults to C++20)
Clone and initialize submodules:

```bash
git clone https://github.com/shizuo-kaji/CubicalRipser.git
cd CubicalRipser
git submodule update --init --recursive
```

Build CLI binaries:

```bash
cmake -S . -B build -DCMAKE_BUILD_TYPE=Release
cmake --build build -j
```

Outputs:

- `build/cubicalripser` (V-construction)
- `build/tcubicalripser` (T-construction)

Install the Python package from source:

```bash
pip install .
```

Legacy alternative (from `src/`):

```bash
cd src
make all
```

CubicalRipser works on 1D/2D/3D/4D NumPy arrays (dtype convertible to `float64`).
- `cripser.computePH(...)`: low-level binding
- `cripser.compute_ph(...)`: convenience wrapper (converts essential deaths to `np.inf`)

Both support:

- `filtration="V"` or `filtration="T"`
- `maxdim`, `top_dim`, `embedded`, `location`
Example (V-construction):

```python
import numpy as np
import cripser

arr = np.load("sample/rand4d.npy")
ph = cripser.compute_ph(arr, filtration="V", maxdim=3)
```

For 1D-3D input, each row is typically:
dim, birth, death, x1, y1, z1, x2, y2, z2
For 4D input:
dim, birth, death, x1, y1, z1, w1, x2, y2, z2, w2
See Creator and Destroyer Cells for interpretation of coordinates.
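For instance, unpacking one 9-column row of the 1D-3D layout above (the row values here are made up for illustration):

```python
import numpy as np

# one hypothetical 3D-layout row: dim, birth, death, x1, y1, z1, x2, y2, z2
row = np.array([0.0, 0.2, 0.8, 1, 2, 0, 3, 2, 0])
dim, birth, death = int(row[0]), row[1], row[2]
creator, destroyer = row[3:6].astype(int), row[6:9].astype(int)
print(dim, birth, death, creator, destroyer)
```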
Notes:
- `computePH(...)` and the CLI may represent essential deaths as `DBL_MAX`.
- `compute_ph(...)` converts essential deaths to `np.inf`.
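If you work with raw `computePH(...)` output or CLI CSVs, you can normalize essential deaths yourself; a minimal NumPy sketch (column 2 holds the death value, and `np.finfo(np.float64).max` equals `DBL_MAX`):

```python
import numpy as np

def normalize_essential_deaths(ph):
    """Replace DBL_MAX death values (column 2) with np.inf."""
    ph = np.asarray(ph, dtype=np.float64).copy()
    ph[ph[:, 2] >= np.finfo(np.float64).max, 2] = np.inf
    return ph

# toy diagram: one finite pair and one essential class
ph = np.array([[0, 0.1, 0.5, 0, 0, 0, 1, 0, 0],
               [0, 0.0, np.finfo(np.float64).max, 2, 0, 0, 0, 0, 0]])
out = normalize_essential_deaths(ph)
print(out[:, 2])  # -> [0.5 inf]
```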
```python
dgms = cripser.to_gudhi_diagrams(ph)
persistence = cripser.to_gudhi_persistence(ph)
```

GUDHI plotting example:

```python
import gudhi as gd

gd.plot_persistence_diagram(diagrams=dgms)
```

```python
import torch
import cripser

x = torch.rand(32, 32, requires_grad=True)
ph = cripser.compute_ph_torch(x, maxdim=1, filtration="V")
loss = cripser.finite_lifetimes(ph, dim=0).sum()
loss.backward()
```

The gradient is propagated through birth/death values to the creator/destroyer voxel locations. Pairing changes are discrete, so the gradient is piecewise-defined.
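To see why the gradient is piecewise-defined, consider a toy analogue in plain PyTorch where a lifetime is read off at fixed, hypothetical creator/destroyer indices: the gradient is ±1 at exactly those two entries and zero elsewhere, and it only jumps when the pairing itself changes.

```python
import torch

# toy analogue: lifetime = x[destroyer] - x[creator]; the gradient is +-1
# at those two entries and zero elsewhere (the pairing is locally constant)
x = torch.tensor([2.0, 0.0, 1.5, 0.5, 3.0], requires_grad=True)
creator, destroyer = 3, 2          # indices of a hypothetical 0-dim pair
lifetime = x[destroyer] - x[creator]
lifetime.backward()
print(x.grad)  # -> tensor([ 0.,  0.,  1., -1.,  0.])
```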
- plotting: `cripser.plot_diagrams(...)` (requires `matplotlib`)
- vectorization: `cripser.persistence_image(...)`, `cripser.create_PH_histogram_volume(...)`
- OT distance: `cripser.wasserstein_distance(...)` (requires `torch` and `POT`/`ot`)
A convenience wrapper for quick experiments without writing Python code.
Typical capabilities:
- Accepts a single file (`.npy`, image, DICOM, etc.) or a directory of slices
- Builds 1D-4D arrays from files
- Chooses V- or T-construction
- Computes PH up to a chosen max dimension
- Writes CSV or `.npy` outputs
- Optional sorting for DICOM/sequence inputs
Help:

```bash
python demo/cr.py -h
```

Examples:

```bash
# single NumPy array
python demo/cr.py sample/rand2d.npy -o ph.csv

# increase max dimension, use T-construction
python demo/cr.py sample/rand128_3d.npy -o ph.csv --maxdim 3 --filtration T

# directory of DICOM files (sorted)
python demo/cr.py dicom/ --sort -it dcm -o ph.csv

# directory of PNG slices
python demo/cr.py slices/ -it png -o ph.csv

# save PH as NumPy for reuse
python demo/cr.py sample/rand128_3d.npy -o ph.npy

# invert intensity sign
python demo/cr.py sample/rand128_3d.npy -o ph.csv --negative
```

Selected options (see `-h` for the full list):

- `--maxdim k`
- `--filtration V|T`
- `--sort`
- `-it EXT`
- `-o FILE`
- `--embedded`
- `--negative`
- `--transform ...`, `--threshold ...`, `--threshold_upper_limit ...`
Perseus-style text input:

```bash
./build/cubicalripser --print --maxdim 2 --output out.csv sample/3dimsample.txt
```

NumPy input (1D-4D):

```bash
./build/cubicalripser --maxdim 3 --output result.csv sample/rand128_3d.npy
```

T-construction:

```bash
./build/tcubicalripser --maxdim 3 --output volume_ph.csv sample/rand128_3d.npy
```

Options:

- `--maxdim, -m <k>`: compute up to dimension `k` (default `3`)
- `--threshold, -t <value>`: threshold for births
- `--print, -p`: print pairs to stdout
- `--embedded, -e`: Alexander dual interpretation
- `--top_dim`: top-dimensional computation using Alexander duality
- `--location, -l yes|none`: include or omit creator/destroyer coordinates
- `--algorithm, -a link_find|compute_pairs`: 0-dimensional PH method
- `--cache_size, -c <n>`: cache limit
- `--min_recursion_to_cache, -mc <n>`: recursion threshold for caching
- `--output, -o <FILE>`: write `.csv`, `.npy`, or a DIPHA-style persistence binary
- `--verbose, -v`

Notes:

- The CLI does not use `--filtration`; V and T are separate binaries.
- Use `--output none` to suppress output file creation.
- NumPy (`.npy`)
- Perseus text (`.txt`): Specification
- CSV (`.csv`)
- DIPHA complex (`.complex`): Specification
A helper utility converts images/volumes between multiple formats.
```bash
# image -> .npy
python demo/img2npy.py demo/img.jpg output.npy

# image series glob -> volume .npy (shell expansion)
python demo/img2npy.py input*.jpg volume.npy

# explicit files -> volume .npy
python demo/img2npy.py input00.dcm input01.dcm input02.dcm volume.npy
```

DICOM volume conversion:

```bash
python demo/img2npy.py dicom/*.dcm output.npy
```

Direct DICOM PH computation:

```bash
python demo/cr.py dicom/ --sort -it dcm -o output.csv
```

DIPHA conversions:

```bash
# NumPy -> DIPHA complex
python demo/img2npy.py img.npy img.complex

# DIPHA complex -> NumPy
python demo/img2npy.py img.complex img.npy

# DIPHA persistence output (.output/.diagram) -> NumPy
python demo/img2npy.py result.output result.npy
```

A scalar time series can be treated as a 1D image, so CubicalRipser can compute its persistent homology.
For this special case, other software may be more efficient.
A related frequency-regression example is demonstrated in the TutorialTopologicalDataAnalysis repository.
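For example, a time series can be saved as a 1D `.npy` array and passed to either the CLI or the Python API (the file name here is arbitrary):

```python
import numpy as np

# toy scalar time series, stored as a 1D "image"
t = np.linspace(0, 4 * np.pi, 200)
series = np.sin(t) + 0.1 * np.cos(7 * t)
np.save("series.npy", series.astype(np.float64))
# then, e.g.: ./build/cubicalripser --maxdim 0 --output series_ph.csv series.npy
```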
- V-construction: pixels/voxels represent 0-cells (4-neighborhood in 2D)
- T-construction: pixels/voxels represent top-cells (8-neighborhood in 2D)
Use:

- Python: `filtration="V"` or `filtration="T"`
- CLI: `cubicalripser` (V), `tcubicalripser` (T)
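The connectivity difference is already visible at the level of connected components of a binary image; a from-scratch 2D sketch (an illustration, not CubicalRipser code):

```python
import numpy as np
from collections import deque

def n_components(mask, connectivity):
    """Count connected components of True pixels; connectivity is 4 or 8."""
    if connectivity == 4:
        steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    else:
        steps = [(di, dj) for di in (-1, 0, 1) for dj in (-1, 0, 1)
                 if (di, dj) != (0, 0)]
    seen = np.zeros_like(mask, dtype=bool)
    count = 0
    for i, j in zip(*np.nonzero(mask)):
        if seen[i, j]:
            continue
        count += 1
        q = deque([(i, j)])          # BFS flood fill from an unseen pixel
        seen[i, j] = True
        while q:
            a, b = q.popleft()
            for di, dj in steps:
                x, y = a + di, b + dj
                if (0 <= x < mask.shape[0] and 0 <= y < mask.shape[1]
                        and mask[x, y] and not seen[x, y]):
                    seen[x, y] = True
                    q.append((x, y))
    return count

# two diagonal pixels: separate under 4-connectivity (V), joined under 8 (T)
mask = np.array([[1, 0], [0, 1]], dtype=bool)
print(n_components(mask, 4), n_components(mask, 8))  # -> 2 1
```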
By Alexander duality, the following are closely related:

```bash
./build/cubicalripser input.npy
./build/tcubicalripser --embedded input.npy
```

The difference lies in the sign of the filtration and the treatment of permanent cycles. Here, `--embedded` converts the input I to -I^infty in the paper's notation.
For details, see Duality in Persistent Homology of Images by Adelie Garin et al.
The creator of a cycle is the cell that gives birth to the cycle. For example, in 0-dimensional homology, the voxel with the lowest filtration value in a connected component creates that class, and a voxel that merges two components destroys the class with the later birth.
Creator and destroyer cells are not unique, but they are useful for localizing cycles.
For a finite lifetime in the default convention:

```
arr[x2, y2, z2] - arr[x1, y1, z1] = death - birth = lifetime
```

where `(x1, y1, z1)` is the creator and `(x2, y2, z2)` is the destroyer.

With `--embedded`, the creator and destroyer roles are swapped:

```
arr[x1, y1, z1] - arr[x2, y2, z2] = death - birth = lifetime
```
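The lifetime identity can be checked against a from-scratch 1D sketch of 0-dimensional sublevel-set persistence (an illustration of the elder rule with union-find, not CubicalRipser's algorithm):

```python
import numpy as np

def ph0_1d(arr):
    """0-dim persistence of a 1D array under the sublevel filtration.
    Returns (birth, death, creator_idx, destroyer_idx) tuples;
    the essential class (global minimum) is omitted."""
    n = len(arr)
    parent = list(range(n))
    creator = list(range(n))  # index of each component's minimum

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    pairs = []
    for i in sorted(range(n), key=lambda k: arr[k]):
        for j in (i - 1, i + 1):
            if 0 <= j < n and arr[j] <= arr[i]:
                ri, rj = find(i), find(j)
                if ri == rj:
                    continue
                # elder rule: the younger component (later birth) dies here
                old, young = ((ri, rj) if arr[creator[ri]] <= arr[creator[rj]]
                              else (rj, ri))
                if arr[creator[young]] < arr[i]:  # skip zero-persistence pairs
                    pairs.append((arr[creator[young]], arr[i],
                                  creator[young], i))
                parent[young] = old
    return pairs

arr = np.array([2.0, 0.0, 1.5, 0.5, 3.0])
(birth, death, c, d), = ph0_1d(arr)
assert arr[d] - arr[c] == death - birth  # the lifetime identity above
print(birth, death, c, d)  # -> 0.5 1.5 3 2
```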
Thanks to Nicholas Byrne for suggesting this convention and providing test code.
The original project examples include lifetime-enhanced and histogram-style topological channels for CNNs.
Historical note: older documentation referenced demo/stackPH.py; equivalent functionality is now available in the Python vectorization APIs.
Example using current APIs:

```python
import numpy as np
import cripser

arr = np.load("sample/rand2d.npy")
ph = cripser.compute_ph(arr, maxdim=1)

# Persistence image (channels = selected homology dimensions)
pi = cripser.persistence_image(
    ph,
    homology_dims=(0, 1),
    n_birth_bins=32,
    n_life_bins=32,
)

# PH histogram volume (channels encode dim x life-bin x birth-bin)
hist = cripser.create_PH_histogram_volume(
    ph,
    image_shape=arr.shape,
    homology_dims=(0, 1),
    n_birth_bins=4,
    n_life_bins=4,
)
```

For practical CNN examples, see HomologyCNN.
Timing scripts are under demo/check/.
CLI timing example:
```bash
python3 demo/check/timing_cr.py sample/bonsai128.npy \
    --mode cli \
    --cubicalripser-bin build/cubicalripser \
    --tcubicalripser-bin build/tcubicalripser \
    --runs 5 --warmup 1 \
    -o timing_bonsai128.csv
```

This writes:

- `run` rows: one per timed iteration (`elapsed_seconds`)
- `summary` rows: aggregate metrics per `(binary, dataset)`
- statistics: `mean_seconds`, `std_seconds`, `min_seconds`, `max_seconds`
- metadata: `timestamp_utc`, `git_commit`, `maxdim`, `binary_path`, `input_path`
Reference comparison example:
```bash
python3 demo/check/timing_cr.py \
    --mode cli \
    --cubicalripser-bin build/cubicalripser \
    --tcubicalripser-bin build/tcubicalripser \
    --reference-csv demo/check/performance/reference_timing.csv \
    --max-slowdown 1.10 \
    --fail-on-regression \
    -o timing_current_vs_reference.csv
```

Run Python tests:

```bash
pytest
```

Reference-based checks:

```bash
# computation regression checks (CLI + Python)
python3 demo/check/check_computation.py --mode all \
    --cubicalripser-bin build/cubicalripser \
    --tcubicalripser-bin build/tcubicalripser

# benchmark and compare against reference timing
python3 demo/check/timing_cr.py \
    --mode cli \
    --cubicalripser-bin build/cubicalripser \
    --tcubicalripser-bin build/tcubicalripser \
    --reference-csv demo/check/performance/reference_timing.csv \
    --runs 3 --warmup 1 \
    -o timing_current_vs_reference.csv
```

More details: `demo/check/README.md`.
The following notes are based on limited understanding and tests and may be incomplete.

- Cubicle by Hubert Wagner
  - V-construction
  - parallelized algorithms can be faster on multicore machines
  - chunked input handling can reduce memory usage
- HomcCube by Ippei Obayashi
  - V-construction
  - integrated into HomCloud
- DIPHA by Ulrich Bauer and Michael Kerber
  - V-construction
  - MPI-parallelized for cluster use
  - commonly used; memory footprint can be relatively large
- GUDHI (INRIA)
  - V- and T-construction in arbitrary dimensions
  - strong documentation and usability
  - generally emphasizes usability over raw performance
-
  - V-construction
- Perseus by Vidit Nanda
  - V-construction
- v0.0.31: Changed module structure (hopefully backward compatible)
- v0.0.30: Improved cache, resulting in a large speedup
- v0.0.24: Repository renamed from `CubicalRipser_3dim` to `CubicalRipser`
  - update an old remote if needed: `git remote set-url origin https://github.com/shizuo-kaji/CubicalRipser.git`
- v0.0.23: Added torch integration
- v0.0.22: Changed birth coordinates for the T-construction to better match GUDHI for permanent cycles
- v0.0.19: Added support for 4D cubical complexes
- v0.0.8: Fixed a memory leak in the Python bindings (pointed out by Nicholas Byrne)
- v0.0.7: Speed improvements
- v0.0.6: Changed the birth/death location definition
- Up to v0.0.5, differences from the original version:
- optimized implementation (lower memory footprint and faster on some data)
- improved Python usability
- much larger practical input sizes
- cache control
- Alexander duality option for highest-degree PH
- both V and T constructions
- birth/death location output
Distributed under GNU Lesser General Public License v3.0 or later. See LICENSE.