REDICAT is a high-throughput Rust toolkit for exploring RNA editing events in both bulk and single-cell sequencing experiments. The command-line interface ships three production pipelines—bulk, bam2mtx, and call—all built on a shared library that emphasizes lock-free parallelism, sparse data structures, and deterministic output.
If REDICAT supports your research, please cite:
Wei, T., Li, J., Lei, X., Lin, R., Wu, Q., Zhang, Z., Shuai, S., & Tian, R. (2025). Multimodal CRISPR screens uncover DDX39B as a global repressor of A-to-I RNA editing. Cell Reports, 44(7). https://doi.org/10.1016/j.celrep.2025.116009
- Canonical-contig smart defaults: By default every command operates on the primary human chromosomes (chr1–chr22, chrX, chrY, chrM). Add `--allcontigs` to opt into decoys, patches, or spike-ins when you need full genome coverage.
- On-the-fly site vetting: Depth, N-content, and editing-support thresholds mirror the `bulk` heuristics so only confident loci advance to matrix assembly.
- Fearless concurrency: Rayon thread pools, per-thread chunk reducers, and reader pooling keep pileup traversal and sparse matrix assembly saturated on modern CPUs.
- Dynamic Rayon work queue: `ParGranges` now flattens genomic tiles into a global work queue so `bulk` keeps every core busy even when only a handful of contigs remain in flight.
- Parallel AnnData assembly: `bam2mtx` now collects per-thread chunk results, merges them with a parallel sparse assembler, and writes `.h5ad` outputs without a single-threaded bottleneck or extra global channels.
- Run-aware sparse merging (New): The AnnData converter now persists sorted triplet runs to disk and performs a streaming multi-way merge when finalising CSR layers, keeping peak RSS close to the sparse output size and preventing the single-threaded 30+ minute stall that previously occurred after "Processed 100%" log events.
- Depth-aware chunking: `bam2mtx` streams TSV manifests with Polars, accumulates position weights (DEPTH + REF_SKIP + FAIL) until the `--chunksize` budget is spent, and automatically falls back to the hotspot-friendly `--chunk-size-max-depth` batches (default 2) whenever `NEAR_MAX_DEPTH=true`.
- Local-time observability: Console logs now emit system time zone timestamps and guaranteed 30-minute status beacons that report remaining chunks alongside outstanding `NEAR_MAX_DEPTH` loci for long-running jobs.
- Depth-cap skip audit: Any site whose pileup depth meets or exceeds `--max-depth` is skipped before UMI aggregation, recorded to a sibling `<name>_skiped_sites.txt` manifest, and summarised in the final log output for easy triage.
- Layered architecture: The library is split into `core` (shared utilities & sparse ops), `engine` (parallel schedulers and position primitives), and `pipeline` (bam2mtx & call workflows), keeping reusable pieces lightweight and testable.
- Sparse-first analytics: All matrix work is written against CSR matrices with adaptive density hints so memory usage tracks the number of edited positions, not the theoretical genome size.
- Strand-collapsed editing counts (New): The `call` pipeline now merges complementary `*0`/`*1` strands before assigning counts to the `ref`, `alt`, and `others` matrices, guaranteeing strand-invariant allele totals.
- Optimized sparse operations: Uses two-pointer algorithms for sparse matrix operations (O(nnz) complexity) instead of HashMap-based approaches, with SmallVec for temporary allocations and FxHashMap for faster non-cryptographic hashing (sketched after this list).
- Minimal memory overhead: Arc clones are kept to a minimum, performance-critical paths avoid unnecessary allocations, and streaming converters maintain bounded memory usage.
- AnnData-native exports: The toolkit produces `.h5ad` files that slot directly into Python-based workflows (Scanpy, scVI, Seurat via conversion).
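The two-pointer pattern behind those sparse operations can be illustrated on a pair of CSR-style rows. This is a minimal, self-contained sketch of the idea, not REDICAT's actual implementation; the function name and types are placeholders.

```rust
/// Add two sorted sparse rows in O(nnz) with a two-pointer merge instead of a HashMap.
fn add_sparse_rows(
    a_idx: &[usize], a_val: &[u32],
    b_idx: &[usize], b_val: &[u32],
) -> (Vec<usize>, Vec<u32>) {
    let (mut i, mut j) = (0, 0);
    let mut idx = Vec::with_capacity(a_idx.len() + b_idx.len());
    let mut val = Vec::with_capacity(a_val.len() + b_val.len());
    while i < a_idx.len() && j < b_idx.len() {
        match a_idx[i].cmp(&b_idx[j]) {
            std::cmp::Ordering::Less => { idx.push(a_idx[i]); val.push(a_val[i]); i += 1; }
            std::cmp::Ordering::Greater => { idx.push(b_idx[j]); val.push(b_val[j]); j += 1; }
            std::cmp::Ordering::Equal => {
                // Same column in both rows: sum the counts once.
                idx.push(a_idx[i]);
                val.push(a_val[i] + b_val[j]);
                i += 1; j += 1;
            }
        }
    }
    // Drain whichever row still has entries left.
    idx.extend_from_slice(&a_idx[i..]);
    val.extend_from_slice(&a_val[i..]);
    idx.extend_from_slice(&b_idx[j..]);
    val.extend_from_slice(&b_val[j..]);
    (idx, val)
}

fn main() {
    let (idx, val) = add_sparse_rows(&[0, 3, 7], &[2, 1, 5], &[3, 8], &[4, 6]);
    assert_eq!(idx, vec![0, 3, 7, 8]);
    assert_eq!(val, vec![2, 5, 5, 6]);
}
```

Because both index slices are already sorted, the merge never rehashes or sorts, so the cost stays proportional to the stored non-zeros.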
src/
main.rs # CLI entry point and subcommand wiring
commands/
bulk/ # Bulk depth CLI (args, processor, orchestration)
bam2mtx/ # Matrix converter CLI (args, workflow, engine)
call/ # RNA editing pipeline CLI (args + pipeline)
common.rs # Shared CLI/runtime helpers (threads, contigs)
lib/
core/ # Concurrency, IO, errors, read filters, sparse utilities
engine/ # Parallel genomic range scheduler + position structs
pipeline/
bam2mtx/ # BAM ➜ sparse matrix processing pipeline
call/ # RNA editing analytics pipeline
# Clone and build
git clone https://github.com/aStudyingTurtle/redicat.git
cd redicat
cargo build --release
# Optional: install into Cargo's bin directory
cargo install --path .

Binary artifacts live at `target/release/redicat` after compilation.
# Discover high-confidence editing sites on canonical chromosomes
redicat bulk sample.bam -o sites.tsv.gz --threads 16
# Generate single-cell mismatch matrices using a two-pass workflow
redicat bam2mtx --bam sample.bam --barcodes barcodes.tsv \
--output counts.h5ad --two-pass --threads 16
# Tune coverage heuristics with `--min-depth`, `--max-n-fraction`, and `--editing-threshold` to mirror bulk filtering when necessary.
# Run the RNA editing analysis pipeline on the produced AnnData matrix
redicat call --input counts.h5ad --output results.h5ad \
--fa GRCh38.fa --site-white-list sites.tsv.gz

Add `--allcontigs` to any command that should consider every contig in the BAM header instead of the canonical subset.
Calculates per-base depth and nucleotide counts across a BAM/CRAM. Filtering options allow you to tune MAPQ, base quality, minimum depth, and an editing-specific heuristic. The revamped scheduler flattens genomic tiles into a unified Rayon work queue and forces small stealable tasks via `with_min_len(1)` + adaptive `with_max_len`, keeping CPU utilisation high for long-running bulk traversals. Results stream to a bgzip-compressed TSV with near-max-depth flags computed from the larger of the filtered depth or the full pileup support (DEPTH + REF_SKIP + FAIL).
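To make the splitting behaviour concrete, here is a minimal Rayon sketch (assuming the `rayon` crate is available; the tile tuples and the per-tile work are placeholders for the real pileup traversal) showing how `with_min_len(1)` plus a modest `with_max_len` keeps every tile individually stealable:

```rust
use rayon::prelude::*;

fn main() {
    // Flattened genomic tiles (contig index, start, end); values are illustrative.
    let tiles: Vec<(usize, u64, u64)> = (0..24usize)
        .flat_map(|c| (0..10u64).map(move |i| (c, i * 500_000, (i + 1) * 500_000)))
        .collect();

    // with_min_len(1) keeps every tile individually stealable; a modest
    // with_max_len stops Rayon from gluing long runs of tiles into one task.
    let covered: u64 = tiles
        .par_iter()
        .with_min_len(1)
        .with_max_len(4)
        .map(|&(_contig, start, end)| end - start) // stand-in for a pileup traversal
        .sum();

    println!("covered {covered} bp across {} tiles", tiles.len());
}
```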
`NEAR_MAX_DEPTH` flags key off the larger of those two depth figures, ensuring the 1% ceiling tracks whichever of the filtered or raw pileup coverage is higher.
Positions whose filtered depth reaches the --max-depth ceiling are explicitly marked as NEAR_MAX_DEPTH, providing an audit trail for loci clipped by the depth guard.
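Taken together, the flagging rule above amounts to a small predicate. The following is an illustrative paraphrase of that heuristic, not the pipeline's actual code:

```rust
/// Flag a site when the larger of the filtered depth and the full pileup
/// support (DEPTH + REF_SKIP + FAIL) lands within 1% of the --max-depth cap
/// (which also covers sites at or above the cap).
fn near_max_depth(filtered_depth: u64, depth: u64, ref_skip: u64, fail: u64, max_depth: u64) -> bool {
    let support = depth + ref_skip + fail;
    let effective = filtered_depth.max(support);
    effective as f64 >= 0.99 * max_depth as f64
}

fn main() {
    assert!(near_max_depth(7_950, 7_950, 0, 0, 8_000));    // within 1% of the cap
    assert!(near_max_depth(6_000, 7_900, 150, 10, 8_000)); // raw pileup support crosses the line
    assert!(!near_max_depth(4_000, 4_000, 10, 0, 8_000));
}
```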
Parts of the `bulk` implementation were adapted from the perbase project.
Option reference:
| Short | Long | Description | Default |
|---|---|---|---|
| — | `<reads>` | Input indexed BAM/CRAM to analyze. | required |
| `-o` | `--output` | Output path (bgzip added when missing). | required |
| `-t` | `--threads` | Worker threads for pileup traversal. | `10` |
| `-c` | `--chunksize` | Ideal basepairs per worker (`ParGranges` chunk). | `500000` |
| `-Q` | `--min-baseq` | Base-quality floor; lower bases are counted as N. | `30` (implicit) |
| `-q` | `--mapquality` | Minimum mapping quality accepted. | `255` |
| `-z` | `--zero-base` | Emit 0-based coordinates instead of 1-based. | `false` |
| `-D` | `--max-depth` | Depth ceiling; positions within 1% of the cap are flagged as `NEAR_MAX_DEPTH`. | `8000` |
| `-d` | `--min-depth` | Minimum coverage required to report a site. | `10` |
| `-n` | `--max-n-fraction` | Maximum tolerated N fraction (depth / value). | `20` |
| `-a` | `--all` | Report all sites rather than the editing-enriched subset. | `false` |
| `-et` | `--editing-threshold` | Depth divisor when checking for editing support. | `1000` |
| `-A` | `--allcontigs` | Traverse every contig instead of the canonical set. | `false` |
Converts barcoded BAMs into sparse AnnData matrices (per base, per barcode, strand-aware). Accepts precomputed site lists or can bootstrap one via `--two-pass`, which internally runs `bulk` with the same filtering heuristics. Parallel chunk workers now reduce their results locally and a multi-threaded sparse assembler materialises the final CSR layers before the `.h5ad` file is written, avoiding the previous channel-based bottleneck and keeping memory predictable.
During long conversions the console prints system-local timestamps and, in addition to 5% progress beacons, guarantees a status snapshot every 30 minutes summarising remaining chunks and unprocessed `NEAR_MAX_DEPTH` loci so operators can watch backlog decay in real time. At completion the command emits a `<output>_skiped_sites.txt` sibling file listing the contig, 1-based position, and observed depth of every locus that was skipped because its pileup depth met or exceeded `--max-depth`.
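The per-thread reduce-then-merge shape can be sketched with Rayon's `fold`/`reduce` combinators. This is a simplified illustration (assuming the `rayon` crate; the chunk type, counting stub, and triplet layout are placeholders, not REDICAT's internal types):

```rust
use rayon::prelude::*;

type Triplet = (u32, u32, u32); // (cell index, site index, count) — illustrative layout

/// Stand-in for a real pileup over one genomic chunk.
fn count_chunk(chunk_id: u32) -> Vec<Triplet> {
    vec![(chunk_id % 8, chunk_id, 1)]
}

fn main() {
    let chunks: Vec<u32> = (0..1_000).collect();
    let triplets: Vec<Triplet> = chunks
        .par_iter()
        .fold(Vec::new, |mut acc, &chunk| {
            acc.extend(count_chunk(chunk)); // thread-local accumulation, no shared channel
            acc
        })
        .reduce(Vec::new, |mut left, mut right| {
            left.append(&mut right); // pairwise merges happen on worker threads
            left
        });
    println!("collected {} triplets", triplets.len());
}
```

The key point is that no single consumer thread ever owns the merge: local batches are combined pairwise until one result remains.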
Recent Memory Optimizations (v0.3.1):
- UMI String Interning: Identical UMI sequences are now deduplicated in memory using `Arc<str>`, reducing memory usage by 50-70% for high-depth positions with redundant UMIs (see the sketch after this list).
- Adaptive HashMap Sizing: Hash maps for counts and UMI consensus now use dynamic capacity estimation based on whether a chunk contains high-depth positions, preventing over-allocation.
- Aggressive Triplet Spilling: The in-memory triplet threshold has been reduced from 500K to 100K (with a chunk-size multiplier of 2× instead of 32×), ensuring more frequent disk flushes and lower peak memory usage.
- Enhanced Monitoring: Each chunk now logs detailed statistics including position count, high-depth site count, and total weight, making it easier to identify memory-intensive workloads.
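A minimal sketch of the `Arc<str>` interning idea, assuming a plain `HashSet` pool; the free function and pool structure are illustrative, not REDICAT's internals:

```rust
use std::collections::HashSet;
use std::sync::Arc;

/// Return a shared handle for a UMI, storing each distinct sequence only once.
fn intern(pool: &mut HashSet<Arc<str>>, umi: &str) -> Arc<str> {
    if let Some(existing) = pool.get(umi) {
        return Arc::clone(existing); // reuse the existing allocation
    }
    let shared: Arc<str> = Arc::from(umi);
    pool.insert(Arc::clone(&shared));
    shared
}

fn main() {
    let mut pool = HashSet::new();
    let a = intern(&mut pool, "ACGTACGTAC");
    let b = intern(&mut pool, "ACGTACGTAC"); // same UMI observed again
    assert!(Arc::ptr_eq(&a, &b));            // one allocation, two handles
    assert_eq!(pool.len(), 1);
}
```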
Highlights:
- TSV positions are filtered to canonical contigs unless `--allcontigs` is specified.
- Depth/N/editing thresholds drop low-quality loci before matrix assembly, matching the first-pass `bulk` defaults.
- `--two-pass` scouts sites with `bulk --max-depth 8000`, trimming runaway pileups before the heavy second pass.
- Reader pools and chunk-aware parallelism keep pileups cache-friendly while chunk staging bounds memory.
- Output matrices respect the density hint supplied via `--matrix-density`, and the streaming converter remaps cell/position indices on the fly before materialising CSR layers (see the sizing sketch after this list).
- Info-level progress logs report chunk-level completion percentage, detailed chunk statistics, and ETA so long runs stay observable.
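The density hint can be read as a pre-sizing rule for the sparse buffers. A rough illustration, with assumed cell/site counts and a hypothetical helper rather than the converter's exact sizing logic:

```rust
/// Estimate the number of stored entries from a --matrix-density style hint.
fn estimated_nnz(n_cells: usize, n_sites: usize, density: f64) -> usize {
    ((n_cells as f64) * (n_sites as f64) * density).ceil() as usize
}

fn main() {
    // Assumed shapes: 20,000 barcodes by 150,000 candidate sites at the default 0.005 hint.
    let capacity = estimated_nnz(20_000, 150_000, 0.005);
    let values: Vec<u32> = Vec::with_capacity(capacity);      // one CSR data buffer
    let col_indices: Vec<u32> = Vec::with_capacity(capacity); // matching column index buffer
    println!("reserved {} + {} slots", values.capacity(), col_indices.capacity());
}
```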
Option reference:
| Short | Long | Description | Default |
|---|---|---|---|
| `-b` | `--bam` | Input indexed BAM/CRAM file. | required |
| — | `--tsv` | Optional site list TSV (columns `CHR` and `POS`). | optional |
| — | `--barcodes` | Cell barcode whitelist (one per line). | required |
| — | `--two-pass` | Run `bulk` first to build a fresh site list. | `false` |
| `-o` | `--output` | Output AnnData (`.h5ad`) path. | required |
| `-t` | `--threads` | Worker threads for pileup + AnnData writing. | `10` |
| `-q` | `--min-mapq` | Minimum mapping quality per read. | `255` |
| `-Q` | `--min-baseq` | Minimum base quality; lower bases count as N. | `30` |
| `-d` | `--min-depth` | Minimum non-N coverage to keep a site. | `10` |
| `-n` | `--max-n-fraction` | Maximum tolerated ambiguous fraction (depth / value). | `20` |
| `-et` | `--editing-threshold` | Ensures at least two bases exceed depth / editing-threshold. | `1000` |
| `-D` | `--max-depth` | Cap on pileup traversal depth during matrix assembly. Sites with depth ≥ cap are skipped, logged, and mirrored to `<output>_skiped_sites.txt`. | `65536` |
| `-S` | `--stranded` | Treat UMIs as strand-aware. | `false` |
| — | `--umi-tag` | BAM tag containing the UMI sequence. | `UB` |
| — | `--cb-tag` | BAM tag containing the cell barcode. | `CB` |
| `-r` | `--reference` | Reference FASTA (required for CRAM). | optional |
| `-c` | `--chunksize` | Genomic position weight budget per processing chunk, applied to standard loci (see the chunking sketch after this table). | `100000` |
| — | `--chunk-size-max-depth` | Maximum hotspot chunk length when `NEAR_MAX_DEPTH` is true in the TSV; default lowered to 2 for faster hotspot turnover. | `2` |
| — | `--matrix-density` | Hint for CSR buffer sizing. | `0.005` |
| `-A` | `--allcontigs` | Include decoys and noncanonical contigs. | `false` |
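To show how `--chunksize` and `--chunk-size-max-depth` interact, here is a simplified sketch of the depth-weighted chunking rule described above. The `Site` struct, field names, and grouping details are illustrative, not the manifest parser's actual types:

```rust
/// A site from the first-pass manifest, reduced to what the chunker needs.
struct Site {
    weight: u64,          // DEPTH + REF_SKIP + FAIL from the bulk manifest
    near_max_depth: bool, // NEAR_MAX_DEPTH flag from the manifest
}

/// Group site indices into chunks: normal sites accumulate weight until the
/// --chunksize budget is spent; NEAR_MAX_DEPTH sites go into small hotspot
/// batches capped by --chunk-size-max-depth.
fn chunk_sites(sites: &[Site], chunksize: u64, chunk_size_max_depth: usize) -> Vec<Vec<usize>> {
    let mut chunks = Vec::new();
    let (mut normal, mut hotspot): (Vec<usize>, Vec<usize>) = (Vec::new(), Vec::new());
    let mut budget = 0u64;
    for (i, site) in sites.iter().enumerate() {
        if site.near_max_depth {
            hotspot.push(i);
            if hotspot.len() >= chunk_size_max_depth {
                chunks.push(std::mem::take(&mut hotspot));
            }
        } else {
            normal.push(i);
            budget += site.weight;
            if budget >= chunksize {
                chunks.push(std::mem::take(&mut normal));
                budget = 0;
            }
        }
    }
    if !normal.is_empty() { chunks.push(normal); }
    if !hotspot.is_empty() { chunks.push(hotspot); }
    chunks
}

fn main() {
    let sites = vec![
        Site { weight: 60_000, near_max_depth: false },
        Site { weight: 50_000, near_max_depth: false },
        Site { weight: 8_000, near_max_depth: true },
        Site { weight: 8_000, near_max_depth: true },
    ];
    let chunks = chunk_sites(&sites, 100_000, 2);
    println!("{} chunks", chunks.len()); // 2: one weighted chunk + one hotspot batch
}
```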
Executes the full RNA editing pipeline on .h5ad inputs: variant annotation, strand-aware ref/alt counting, CEI computation, and mismatch statistics. Validation steps ensure reference FASTA, white-list TSV, and input AnnData all exist and are readable before work begins.
Recent updates collapse opposing strand layers (A0 + A1, etc.) before counts are distributed to the ref, alt, and others matrices. This guarantees that allelic support is identical regardless of BAM strand orientation and avoids spuriously split coverage in downstream metrics.
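The collapse itself is an element-wise sum of the complementary strand layers. A tiny sketch with dense vectors standing in for the real sparse layers (the map types and helper are illustrative):

```rust
use std::collections::BTreeMap;

/// Sum the forward (*0) and reverse (*1) layer of each base so downstream
/// ref/alt/others counts do not depend on read orientation.
fn collapse_strands(layers: &BTreeMap<String, Vec<u32>>) -> BTreeMap<char, Vec<u32>> {
    let mut collapsed: BTreeMap<char, Vec<u32>> = BTreeMap::new();
    for (name, counts) in layers {
        let base = name.chars().next().expect("layer name like A0, A1, ...");
        let entry = collapsed.entry(base).or_insert_with(|| vec![0; counts.len()]);
        for (total, value) in entry.iter_mut().zip(counts) {
            *total += *value; // A = A0 + A1, C = C0 + C1, ...
        }
    }
    collapsed
}

fn main() {
    let mut layers = BTreeMap::new();
    layers.insert("A0".to_string(), vec![3, 0, 1]);
    layers.insert("A1".to_string(), vec![1, 2, 0]);
    let collapsed = collapse_strands(&layers);
    assert_eq!(collapsed[&'A'], vec![4, 2, 1]);
}
```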
Editing Logic (Updated):
- `redicat call --editingtype` now enforces complementary substitutions for all six editing signatures. For example, `--editingtype AG` keeps A→G events alongside the strand-complementary T→C observations, while discarding loci whose reference base is neither A nor T. Equivalent complementary rules exist for AC, AT, CA, CG, and CT.
- Optimized ref/alt/obs matrix calculation (v0.3.2+): The matrices use an efficient mask-based sparse operation approach inspired by the Python implementation, achieving O(nnz) complexity (see the sketch after this list). For AG editing:
  - When ref_base=A: ref layer = A counts, alt layer = G counts, obs layer = T+C counts
  - When ref_base=T: ref layer = T counts, alt layer = C counts, obs layer = A+G counts
  - Sites with ref_base=G or C are excluded
- One-hot encoding masks are created for ref/alt/others positions, then applied via element-wise sparse multiplication to each base matrix, maintaining full sparsity and dramatically improving performance for large datasets.
- Similar logic applies to all other editing types (AC, AT, CA, CG, CT), ensuring strand-invariant and biologically accurate allele counting with optimal efficiency.
- The site white-list contributes an `is_editing_site` flag inside `var`, while the mismatch thresholds (`--max-other-threshold`, `--min-coverage`, `--min-edited-threshold`, `--min-ref-threshold`) produce a `filter_pass` flag. Only loci marked with both flags feed the observation-level `ref`, `alt`, and `others` totals as well as CEI.
- The generated AnnData therefore includes a `var` table with `is_editing_site`, `filter_pass`, and `Mismatch` columns, and an `obs` table whose `ref`/`alt`/`others` columns and `CEI` metric reflect exclusively the confidently filtered editing sites.
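A simplified sketch of the mask-based split for `--editingtype AG`, with triplets standing in for the real CSR layers; the names and structure are illustrative, not the pipeline's code:

```rust
type Triplet = (usize, usize, u32); // (cell, site, count) — illustrative layout

/// Element-wise multiplication by a 0/1 column mask == keep entries whose
/// site is selected; everything else drops out, preserving sparsity.
fn mask_layer(layer: &[Triplet], site_mask: &[bool]) -> Vec<Triplet> {
    layer.iter().copied().filter(|&(_, site, _)| site_mask[site]).collect()
}

fn main() {
    let ref_bases = ['A', 'T', 'G', 'A'];
    // One-hot masks for AG editing: the ref layer reads A counts where ref==A
    // and T counts where ref==T; sites with ref G/C are excluded entirely.
    let a_sites: Vec<bool> = ref_bases.iter().map(|&b| b == 'A').collect();
    let t_sites: Vec<bool> = ref_bases.iter().map(|&b| b == 'T').collect();

    let a_layer = vec![(0, 0, 5), (1, 2, 3), (0, 3, 2)]; // collapsed A counts
    let t_layer = vec![(0, 1, 7), (1, 2, 1)];            // collapsed T counts

    let mut ref_layer = mask_layer(&a_layer, &a_sites); // ref = A counts at A sites
    ref_layer.extend(mask_layer(&t_layer, &t_sites));   //     + T counts at T sites
    // The alt and others layers follow the same pattern with the G/C and T+C / A+G layers.
    println!("ref layer keeps {} entries", ref_layer.len());
}
```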
Important options:
- `--editingtype` · choose among AG, AC, AT, CA, CG, CT editing signatures.
- `--min-coverage` · per-site coverage filter before CEI calculation.
- `--threads` · configure Rayon pool size (defaults to 2 for stability).
- Sparse matrix column sums for site statistics (New v0.3.2): Site-level mismatch statistics (`XX_ref`, `XX_alt`, `XX_others`) are now computed using efficient sparse matrix column sum operations directly on the ref/alt/others layers, achieving O(nnz) complexity and eliminating the need for DataFrame row iteration.
- Parallel tree reduction for sparse matrix sums (New v0.3.2): Coverage matrix calculation now uses a divide-and-conquer parallel tree reduction strategy, achieving O(log n) parallel depth instead of O(n) sequential operations. For 8 base layers, this reduces seven sequential additions to three parallel levels with 4-way parallelism at the first level, dramatically improving performance while maintaining full sparsity (see the sketch at the end of these notes).
- Mask-based sparse matrix operations (New v0.3.2): The `call` pipeline now uses one-hot encoding masks combined with element-wise sparse multiplication for ref/alt/obs matrix calculation, achieving O(nnz) complexity instead of O(n_obs × n_vars). This dramatically improves performance for large sparse datasets while maintaining full correctness and sparsity.
- UMI String Interning (New): Identical UMI sequences are deduplicated in memory using `Arc<str>`, reducing memory footprint by 50-70% for datasets with repetitive UMIs at high-depth positions.
- Adaptive HashMap Capacity (New): Hash maps dynamically adjust their initial capacity based on chunk characteristics (high-depth vs normal), preventing over-allocation while maintaining performance.
- Parallel sparse reduction (New): `bam2mtx` merges per-thread chunk outputs with a lock-free sparse assembler, so CSR materialisation scales with core count without staging everything in a single consumer thread.
- Aggressive Triplet Spilling (New): The streaming matrix builder now uses a lower spill threshold (100K triplets, minimum 50K cap) to flush data to disk more frequently, dramatically reducing peak memory usage during large conversions.
- Depth-cap awareness (New): `bulk` now marks any site whose filtered depth reaches the configured maximum as `NEAR_MAX_DEPTH`, complementing the existing 1% proximity heuristic and making depth clipping visible downstream.
- Streaming CSR materialisation (New): Triplet runs are merged with a heap-based iterator that deduplicates row/column pairs on the fly, so final layer construction stays parallel-friendly and never loads all triplets into RAM at once.
- Enhanced Chunk Monitoring (New): Each chunk logs position count, high-depth site count, and total weight, providing visibility into workload characteristics and helping diagnose memory-intensive regions.
- Reader pooling: Each worker thread checks out an `IndexedReader` from a pool, avoiding reopen/seek penalties when iterating across genomic tiles.
- Triplet batching: Sparse matrices are assembled from thread-local batches of `(row, col, value)` triplets, eliminating cross-thread contention, and the final merge operates fully in parallel without an intermediate channel stage.
- Adaptive chunking: Dual CLI knobs (`--chunksize` for the depth-weighted standard-loci budget and `--chunk-size-max-depth` for hotspot batch caps, now defaulting to 2) govern the parallel granularity for `bam2mtx` (and the batch size for AnnData writes), while the refactored `ParGranges` scheduler flattens tiles into a single Rayon queue so `bulk` honors `--chunksize` without serial bottlenecks. Normal chunk weights now match the manifest's `DEPTH + REF_SKIP + FAIL` metric for tighter alignment with bulk heuristics.
- Chunk-level fetch & depth guards: `bam2mtx` batches contiguous sites per contig, records observed depth, and, in `--two-pass` mode, relies on a first-pass `bulk` run capped at `--max-depth 8000` plus in-run pileup guards to prune oversaturated loci before dense pileups reach the matrix stage.
- Progress-aware logging: Chunk processors emit periodic % complete + ETA updates plus detailed per-chunk statistics so multi-hour conversions remain easy to follow from the console.
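The tree reduction mentioned in the parallel-tree-reduction bullet above can be sketched with `rayon::join` (assuming the `rayon` crate). Dense vectors stand in for the real sparse layers, and `add_layers` would be the sparse two-pointer add in practice:

```rust
/// Element-wise sum of two layers; a stand-in for the sparse O(nnz) addition.
fn add_layers(a: &[u32], b: &[u32]) -> Vec<u32> {
    a.iter().zip(b).map(|(x, y)| x + y).collect()
}

/// Divide-and-conquer reduction: both halves reduce concurrently, giving
/// O(log n) parallel depth instead of n - 1 sequential additions.
fn tree_sum(layers: &[Vec<u32>]) -> Vec<u32> {
    match layers {
        [] => Vec::new(),
        [single] => single.clone(),
        _ => {
            let (left, right) = layers.split_at(layers.len() / 2);
            let (l, r) = rayon::join(|| tree_sum(left), || tree_sum(right));
            add_layers(&l, &r)
        }
    }
}

fn main() {
    // Stand-ins for the A0, A1, C0, C1, G0, G1, T0, T1 layers.
    let layers: Vec<Vec<u32>> = (0..8).map(|i| vec![i, 1, 0]).collect();
    let coverage = tree_sum(&layers);
    assert_eq!(coverage, vec![28, 8, 0]); // 0 + 1 + ... + 7 = 28
}
```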
# REDIportal3 white-list, for use with `call`
wget https://comics.med.sustech.edu.cn/static_data/redicat/data/rediportal3.tsv.gz
# 10X 3' 20K_PBMC data
# chr22
wget https://comics.med.sustech.edu.cn/static_data/redicat/data/chr22.bam
wget https://comics.med.sustech.edu.cn/static_data/redicat/data/chr22.bam.bai
# cell barcodes
wget https://comics.med.sustech.edu.cn/static_data/redicat/data/barcodes.tsv.gz

A minimal BAM (`test/chr22.bam`) and index are included for smoke tests. Run `cargo test` to execute unit tests that exercise pileup traversal and CLI argument parsing.

