tsac-ng — Neural Audio Codec (Multi-Backend)

tsac-ng v0.1.4 — Reverse-engineered, AI-augmented reimplementation of the TSAC neural audio codec. Compatible with the .txc container format and .bin model files.

🤖 AI-Assisted Development: Built by a single developer working with AI coding assistants across 102 investigation rounds (R079-R180) in 4 phases. Architecture, ground-truth extraction (GDB/objdump/LD_PRELOAD), and verification were human-led; implementation was AI-augmented. See METHODOLOGY.md for the full story.

Relationship to TSAC: Like Linux to Unix — same ecosystem compatibility, zero shared code. Not a port. Not a wrapper. A from-scratch reconstruction.

Compatibility Status (Honest Assessment)

Feature	Status	%	Notes
Our own fast TXC encode/decode	✅	100%	Raw uint8 format, works correctly
Original tsac fast TXC decode	🎯	90%	10-bit indices 100%. RMS 0.2023 ≈ target 0.2029 (99.7%). AVX-512 fixed. WAV corr ~0 (BF8 weight 29% residual)
Original tsac normal TXC decode	🔧	60%	Header + CRC done. Transformer (191L) + range coder implemented. End-to-end integration pending
CRC32 validation	✅	100%	Fully reversed (polynomial 0x04C11DB7)
Verbose output parity	✅	100%	batch_size, progress %, bitrate, AVG_BITS — all match
DAC decoder architecture	✅	95%	32 conv1d/29 snake/4 convtr GDB-verified
BF8 dequantization	✅	80%	Full pipeline RE'd: 0x8990→uint16→shl16→float32, gs=32, bfloat16. Weight corr 0.71→0.82
CPU SIMD backends	✅	95%	AVX-512/AVX2/NEON/SVE/RVV. AVX-512 conv1d/convt bugs FIXED (R161-R164)
CUDA backend	✅	85%	Full decode+encode graph. LibNC driver API layer (40% tensor ops)
HIP backend	✅	65%	Compiles. Decode+encode kernels present
Vulkan backend	🔧	40%	Pipeline infra complete (4 shaders). Decode/encode not wired
LLVM JIT backend	🔧	35%	4 JIT functions working (conv1d verified). Decode graph stubbed
CPU encoder	✅	70%	Architecture correct. Strided convs fixed. CUDA encoder naming corrected
Transformer model	✅	80%	12L GPT-2 implemented (293L). Forward pass, GELU, attention, RoPE analysis done
Range coder	✅	80%	get_freq + cumulative + multi-bit decode implemented
Convt weight access	✅	100%	GDB confirmed: stride=K/2, [Co][K][Ci] pattern
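
The CRC32 row above names only the polynomial. As a minimal sketch, the routine below computes a CRC-32 with that polynomial in its reflected form (0xEDB88320). The initial value, final XOR, and bit reflection follow the common zlib/IEEE 802.3 convention, which is an assumption here and not something the table confirms. The function name `crc32_ieee` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 over polynomial 0x04C11DB7, processed LSB-first using the
 * reflected constant 0xEDB88320.
 * ASSUMPTION: init = 0xFFFFFFFF and final XOR = 0xFFFFFFFF (zlib style).
 * The original codec may use different init/xor/reflection settings. */
static uint32_t crc32_ieee(const uint8_t *data, size_t len)
{
    assert(data != NULL || len == 0);
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            /* mask is all ones when the low bit is set, otherwise zero */
            uint32_t mask = 0u - (crc & 1u);
            crc = (crc >> 1) ^ (0xEDB88320u & mask);
        }
    }
    return crc ^ 0xFFFFFFFFu;
}
```

Under these assumptions the standard check string "123456789" yields 0xCBF43926, which is a quick way to tell this variant apart from other CRC-32 flavours that share the same polynomial.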

Overall Progress

██████████████████████░░ ~90% complete
████████████████████░░░░ ~85% explored/understood
████░░░░░░░░░░░░░░░░░░░░ ~20% unexplored

102 investigation rounds (079-180) | 4 phases | 85.53 quality | v0.1.4

What We Know (102 Rounds)

Fast TXC: 10-bit fixed-width bit packing. 54/54 GDB verified. RMS 0.2023 ≈ target.
Normal TXC: FBAZ magic, 16-byte header, BE uint32 n_blocks, CRC32.
Transformer: 12L GPT-2 decoder, d512, n4, RoPE. Implemented (293L).
BF8 pipeline: Full RE — libnc 0x8990, uint16→shl16→float32, gs=32, bfloat16.
AVX-512: Conv1d/convt bugs FIXED (stride-K gather + bias 16×). Full speed.
weight_g tuning: Applied to model.6 only → RMS 0.2023 (was 0.046).
Convt: GDB confirmed stride=K/2, [Co][K][Ci].
Encoder: Strided convs fixed. CUDA naming corrected.
GPU: CUDA full. HIP compiles. Vulkan/LLVM infra ready.
Residual: WAV corr ~0 (BF8 weight 29% error despite RMS match).
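
The "uint16→shl16→float32" step in the BF8 pipeline is the usual bfloat16 widening: the 16 stored bits become the top half of an IEEE-754 float32. The sketch below shows that conversion together with a per-group scale lookup using gs=32. How the 8-bit codes combine with the scale is not stated above, so the `int8 × scale` step is a placeholder assumption and not libnc's formula. The names `bf16_to_f32` and `dequant_groups` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { GROUP_SIZE = 32 }; /* gs=32 from the notes above */

/* Widen a bfloat16 bit pattern to float32: shift left by 16, reinterpret. */
static float bf16_to_f32(uint16_t h)
{
    uint32_t bits = (uint32_t)h << 16;
    float f;
    memcpy(&f, &bits, sizeof f); /* bit copy, avoids aliasing problems */
    return f;
}

/* ASSUMPTION: each weight is a signed 8-bit code multiplied by the bfloat16
 * scale of its 32-element group. The real fused operation may differ. */
static void dequant_groups(const int8_t *codes, const uint16_t *scales_bf16,
                           size_t n, float *out)
{
    assert(codes != NULL && scales_bf16 != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        float scale = bf16_to_f32(scales_bf16[i / GROUP_SIZE]);
        out[i] = (float)codes[i] * scale;
    }
}
```

Only `bf16_to_f32` reflects the documented pipeline; swapping in a different code-to-float mapping inside `dequant_groups` is the natural place to experiment with the remaining weight error.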

Features

5 CPU SIMD levels across 3 architectures (x86-64 AVX/AVX2/AVX-512, ARM NEON/SVE, RISC-V RVV)
3 GPU backends: CUDA (NVIDIA), HIP/ROCm (AMD), Vulkan (cross-platform)
1 experimental backend: LLVM JIT
Runtime CPUID dispatch — auto-selects best SIMD with scalar fallback
Zero system() calls — fully self-contained
CLI compatible with original tsac (2024-04-08)
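
Runtime dispatch with a scalar fallback can be sketched as below. This is an illustration of the pattern only, using the GCC/Clang x86 builtins `__builtin_cpu_init` and `__builtin_cpu_supports`; the enum, table, and function names are hypothetical, and the scalar routine stands in for every ISA-specific kernel.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { SIMD_SCALAR = 0, SIMD_AVX, SIMD_AVX2, SIMD_AVX512, SIMD_COUNT } simd_level;
typedef void (*add_kernel)(const float *a, const float *b, float *out, size_t n);

/* Portable reference kernel, also used as the fallback. */
static void add_scalar(const float *a, const float *b, float *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = a[i] + b[i];
}

/* Probe the CPU once at runtime; non-x86 or unknown compilers get scalar. */
static simd_level detect_simd(void)
{
#if (defined(__x86_64__) || defined(__i386__)) && (defined(__GNUC__) || defined(__clang__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx512f")) return SIMD_AVX512;
    if (__builtin_cpu_supports("avx2"))    return SIMD_AVX2;
    if (__builtin_cpu_supports("avx"))     return SIMD_AVX;
#endif
    return SIMD_SCALAR;
}

static const char *simd_level_name(simd_level lvl)
{
    static const char *names[SIMD_COUNT] = { "scalar", "avx", "avx2", "avx512" };
    return lvl < SIMD_COUNT ? names[lvl] : "unknown";
}

/* A real build would place one kernel per ISA in this table;
 * here every slot is the scalar placeholder. */
static add_kernel select_add_kernel(void)
{
    static const add_kernel table[SIMD_COUNT] = {
        add_scalar, add_scalar, add_scalar, add_scalar
    };
    simd_level lvl = detect_simd();
    assert(lvl < SIMD_COUNT);
    return table[lvl];
}
```

Selecting the function pointer once at startup keeps the per-call cost to an indirect call, and the scalar entry guarantees a working path on CPUs without any of the probed extensions.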

Quick Start

# Build (CPU backend, x86-64)
mkdir build && cd build
cmake .. -DCMAKE_BUILD_TYPE=Release
make -j$(nproc)

# Decompress our own fast TXC files
./tsac-ng -v d input.txc output.wav

# Decompress original tsac fast TXC files (produces audio, but not bit-accurate yet)
./tsac-ng -v d original_fast.txc output.wav

# With CUDA (reconfigure, then rebuild before running)
cmake .. -DUSE_CUDA=ON -DCUDAToolkit_ROOT=/opt/cuda
make -j$(nproc)
./tsac-ng --cuda -v d input.txc output.wav

Backend Status

Backend	Build	Runtime	Notes
CPU (x86-64)	✅	✅	AVX/AVX2/AVX-512 auto-dispatch
CPU (ARM64)	✅	✅	NEON + SVE auto-detect
CPU (RISC-V)	✅	✅	RVV + scalar fallback
CUDA	✅	✅	SM 8.0+, Runtime API
HIP/ROCm	✅	✅	gfx1030+, ROCm 7.x
Vulkan	✅	⚠️	Cross-compile for ARM64 Mali
LLVM JIT	✅	⚠️	Experimental

Architecture

┌─────────────┐    ┌──────────────┐    ┌──────────────┐
│  .txc file  │───▶│  txc_format  │───▶│ codebook_idx │
└─────────────┘    └──────────────┘    └──────┬───────┘
                                               │
                    ┌─────────────────────────┘
                    ▼
┌──────────┐  RVQ lookup  ┌──────────┐  decode graph  ┌──────┐
│ .bin     │─────────────▶│  1024-d  │───────────────▶│ PCM  │
│ model    │  12 codebooks│ features │  7-layer DAC   │audio │
└──────────┘              └──────────┘                └──────┘
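
For fast-mode files, the first stage in the diagram (.txc file to codebook indices) comes down to unpacking fixed-width 10-bit values. A minimal sketch follows. The bit order (most significant bit first) is an assumption, since this README does not state it, and the function name `unpack10` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Unpack up to `count` 10-bit values from a byte stream.
 * ASSUMPTION: values are packed MSB-first with no padding between them.
 * Returns the number of values written (fewer than `count` if the input
 * runs out before a full value is available). */
static size_t unpack10(const uint8_t *src, size_t src_len,
                       uint16_t *dst, size_t count)
{
    assert(src != NULL && dst != NULL);
    uint32_t acc = 0;  /* bit accumulator */
    int nbits = 0;     /* number of valid bits currently in acc */
    size_t pos = 0, n = 0;

    while (n < count) {
        while (nbits < 10) {
            if (pos >= src_len)
                return n;              /* truncated input */
            acc = (acc << 8) | src[pos++];
            nbits += 8;
        }
        nbits -= 10;
        dst[n++] = (uint16_t)((acc >> nbits) & 0x3FFu);
        acc &= (1u << nbits) - 1u;     /* keep only the unread bits */
    }
    return n;
}
```

Each index lands in the range 0 to 1023, which matches a codebook of 1024 entries. If the real stream is LSB-first, only the accumulator shifts change.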

Decoder graph: RVQ Codebook → Conv1d(1024→1536) → 4× ResidualBlock (1536→768→384→192→96) → Snake → Conv1d(96→2) → tanh → PCM
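
The RVQ stage at the head of the decoder graph can be sketched as a residual sum: each codebook contributes the vector selected by its index, and the contributions are added to form one feature frame. This is a simplified illustration. The flat `[codebook][entry][dim]` memory layout and the function name `rvq_decode_frame` are assumptions, and any per-codebook projection the real model applies is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode one frame by summing the selected entry of every codebook.
 * ASSUMPTION: `codebooks` is a dense array laid out as
 * [n_codebooks][codebook_size][dim]; the real .bin layout may differ. */
static void rvq_decode_frame(const float *codebooks, size_t n_codebooks,
                             size_t codebook_size, size_t dim,
                             const uint16_t *indices, float *out)
{
    assert(codebooks != NULL && indices != NULL && out != NULL);
    for (size_t d = 0; d < dim; d++)
        out[d] = 0.0f;

    for (size_t q = 0; q < n_codebooks; q++) {
        size_t idx = indices[q];
        assert(idx < codebook_size);
        const float *row = codebooks + (q * codebook_size + idx) * dim;
        for (size_t d = 0; d < dim; d++)
            out[d] += row[d];          /* residual stages accumulate */
    }
}
```

With the figures quoted in this README the call would use 12 codebooks and a 1024-wide output; using fewer codebooks (the `-q` option) simply shortens the outer loop.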

Project Structure

tsac-ng/
├── src/
│   ├── cpu_decoder.c      # CPU decoder + encoder + BF8 dequant
│   ├── range_coder.c      # get_freq adaptive range coder (arith.c RE)
│   ├── txc_format.c       # .txc parser (10-bit bitpacking + CRC32)
│   ├── tsac_codec.c       # Codec API + WAV I/O + bitrate display
│   ├── model_loader.c     # .bin model loader (BF8/float32 auto-detect)
│   ├── main.c             # CLI (compatible with original tsac)
│   ├── cuda/              # CUDA backend (kernels + backend)
│   ├── llvm/              # LLVM JIT backend (experimental)
│   ├── vulkan/            # Vulkan compute backend
│   ├── arch/arm/          # ARM NEON + SVE
│   └── arch/riscv/        # RISC-V RVV
├── hip/                   # HIP/ROCm backend
├── include/               # Public headers
├── docs/evidence/         # GDB ground truth + libnc disassembly
├── cmake/                 # Toolchain files
└── experimental/          # Experimental code

CLI Reference

tsac-ng [options] c|d|t infile outfile

Options (compatible with original tsac):
  --cuda, --hip, --vulkan, --llvm   GPU/accelerator backend
  -q, --n_codebooks n    Codebooks (1-12 stereo, 1-9 mono, default=max)
  -T n                   Thread count (default=1)
  -v                     Verbose mode (batch_size, progress, bitrate, AVG_BITS)
  -h, --help             Show help
  -s, --separate_channels  Stereo as dual mono
  -c, --channels n       Force channel count
  -f, --fast             Fast mode (no transformer)
  -m, --model path       Model file path (directory or direct .bin path)
  -M, --trf_model path   Transformer model path
  --batch_size n         Batch size (default=auto)

Known Limitations

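Original fast TXC audio: 10-bit indices are 100% correct. 🎯 RMS 0.2023 ≈ target 0.2029 (99.7% match). AVX-512 fixed, weight_g tuned, 0% clipping. WAV correlation against the reference output is still ~0, attributed to the 29% residual error in the dequantized BF8 weights: the output level matches, but the waveform does not.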
Normal TXC: Transformer + range coder implemented. End-to-end integration pending.
Encoder: Strided convs fixed. CUDA naming corrected.
GPU: CUDA complete, HIP compiles, Vulkan/LLVM infra-only.
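
An RMS match with near-zero correlation is possible because RMS measures only signal energy, while correlation measures whether the samples line up. The sketch below computes both for two buffers. To stay free of libm it reports mean-square energy (RMS squared) and a signed squared correlation. It uses uncentered correlation, which assumes zero-mean audio, and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Mean of squared samples, i.e. RMS squared. */
static double mean_square(const float *x, size_t n)
{
    assert(x != NULL && n > 0);
    double acc = 0.0;
    for (size_t i = 0; i < n; i++)
        acc += (double)x[i] * (double)x[i];
    return acc / (double)n;
}

/* Signed squared correlation: sign(r) * r^2, where r is the uncentered
 * (cosine) correlation. ASSUMPTION: both signals are roughly zero-mean.
 * Returns 0 when either signal is silent. */
static double signed_corr_squared(const float *a, const float *b, size_t n)
{
    assert(a != NULL && b != NULL && n > 0);
    double dot = 0.0, ea = 0.0, eb = 0.0;
    for (size_t i = 0; i < n; i++) {
        dot += (double)a[i] * (double)b[i];
        ea  += (double)a[i] * (double)a[i];
        eb  += (double)b[i] * (double)b[i];
    }
    if (ea == 0.0 || eb == 0.0)
        return 0.0;
    double r2 = (dot * dot) / (ea * eb);
    return dot < 0.0 ? -r2 : r2;
}
```

For example, {1, -1, 1, -1} and {1, 1, -1, -1} have identical energy yet zero correlation, which is the same pattern as the fast TXC result described above.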

Roadmap

See .ai/ROADMAP.md for detailed milestone planning. Current phase: Phase 4 Complete — v0.1.4 (102 rounds, 4 phases). 🎯 RMS milestone achieved.

Development Methodology

This is an AI-augmented reverse engineering project. The workflow:

Human extracts ground truth           AI generates implementation
(GDB breakpoints, objdump,           (C code matching the spec,
 LD_PRELOAD intercepts,              SIMD intrinsics, GPU kernels,
 hex dumps, WAV comparison)          CMake build system)
        │                                      │
        └────────────┬─────────────────────────┘
                     ▼
            Compile → Test → Compare RMS
                     │
        ┌────────────┴────────────┐
        │                         │
    RMS matches?              RMS differs?
        │                         │
    Commit ✅                Read error → Craft better prompt → Loop

What this means in practice:

The 10-bit TXC parser, CRC32, range coder, and DAC graph architecture were manually reverse-engineered from the original binary using GDB and objdump
The SIMD kernels (AVX-512, AVX2, NEON, SVE, RVV), GPU backends (CUDA, HIP, Vulkan), and build system were AI-generated from architecture specifications
Every round's deliverable was verified by the human against ground truth (GDB-captured indices, libnc weight dumps, WAV RMS comparison)
Bugs like the is_ct false positive (found in Round 049) took 48 rounds to surface precisely because the AI-generated code was plausible but subtly wrong — only systematic cross-validation caught it

Why this approach? A single developer cannot simultaneously:

Reverse-engineer a closed-source binary's wire format
Implement 5 SIMD levels across 3 CPU architectures
Write 3 GPU backends from scratch
Debug numerical precision issues across a 32-layer neural network

But a developer + AI can. The developer does the irreplaceable human work (understanding the binary, designing verification strategies, judging correctness); the AI does the replaceable work (generating SIMD intrinsics, wiring up CMake, filling in boilerplate).

Honest caveats:

Some AI-generated code works on the happy path but hasn't been tested on edge cases
The residual -3.4dB RMS error exists because the AI-generated dequant formula doesn't match libnc's fused operation — and neither human nor AI has cracked this yet
Code review happened through compilation + testing, not line-by-line human review
Open an issue if you find something weird — it might be an AI hallucination

License

MIT

tsac-ng v0.1.4 — Copyright (c) 2026 Hope2333 (幽零小喵)
