diff --git a/docs/glossary.md b/docs/glossary.md
new file mode 100644
index 000000000..7c6e5501f
--- /dev/null
+++ b/docs/glossary.md
@@ -0,0 +1,575 @@
+# Glossary
+
+This page defines project terms as they are used in the pccx public
+documentation. It is intentionally conservative: planned work, throughput
+targets, and board measurements are labelled as such.
+
+## Project And Release Lines
+
+pccx
+: Parallel Compute Core eXecutor. A hardware-software co-design project for
+  NPU architectures targeting edge inference workloads.
+
+v001
+: Archived experimental pccx architecture line. It remains in the docs as
+  historical context and should not be treated as the active RTL target.
+
+v002
+: Active KV260 LLM architecture line. In this docs site, `v002` usually means
+  the public architecture, ISA, driver, RTL-reference, and verification pages
+  for the current `pccx-FPGA-NPU-LLM-kv260` line.
+
+v002.0
+: Baseline v002 integration line on KV260. This line is measured-only: docs
+  quote no throughput figures until release evidence is published.
+
+v002.1
+: Planned continuation of v002 on the same RTL repository. The roadmap scopes
+  sparsity and speculative-decoding work to this line. The 20 tok/s number is a
+  target for this line, not a reported board result.
+
+v003.x
+: Planned LLM continuation in a separate RTL repository. Public documentation
+  treats v003 as a future line until its repository and release branches are
+  stabilized.
+
+vision-v001
+: Parallel CNN inference track that reuses the KV260 substrate but targets
+  vision workloads rather than autoregressive LLM decoding.
+
+pccx-lab
+: Companion verification and profiling environment for pccx traces, reports,
+  and workflow automation. Public claims derived from lab output still need the
+  release evidence gates described in the roadmap.
+
+pccx-llm-launcher
+: Companion launcher repository for model preparation, runtime contracts, and
+  KV260-facing orchestration. Current public launcher pages describe scaffold,
+  mock, and contract surfaces unless they cite board evidence.
+
+## Hardware Target
+
+KV260
+: Xilinx Kria KV260 Starter Kit, based on the Zynq UltraScale+ ZU5EV device.
+  It is the primary board target for v002 public documentation.
+
+`kv260`
+: Lowercase slug used in repository names, branch names, build directories, or
+  scripts when a filesystem-safe target identifier is needed.
+
+Zynq UltraScale+
+: AMD/Xilinx SoC family that combines a Processing System and Programmable
+  Logic fabric. The KV260 target uses a ZU5EV part.
+
+PS
+: Processing System. The Arm-based host side of the Zynq device.
+
+PL
+: Programmable Logic. The FPGA fabric side where the pccx NPU RTL is
+  implemented.
+
+AXI
+: Arm AMBA interconnect protocol family used for host, memory, and streaming
+  interfaces in the design.
+
+AXI-HP
+: High-Performance AXI slave ports on the PS, through which PL masters reach
+  DDR. In v002 docs they carry high-bandwidth weight traffic into the NPU.
+
+ACP
+: Accelerator Coherency Port. In pccx docs, ACP refers to the coherent path
+  used for activation/result traffic between host memory and the accelerator.
+
+DSP48E2
+: Xilinx DSP slice available in UltraScale+ devices. pccx v002 uses DSP48E2
+  packing for the W4A8 GEMM datapath.
+
+BRAM
+: Block RAM in the FPGA fabric. pccx uses BRAM for smaller local buffers and
+  per-core storage structures.
+
+URAM
+: UltraRAM in the FPGA fabric. pccx v002 uses URAM for the shared L2 cache and
+  weight buffering structures described in the architecture docs.
+
+CDC
+: Clock-domain crossing. Used where data moves between the AXI/control clock
+  domain and the core compute clock domain.
+
+Vivado block design
+: Xilinx Vivado IP-integrator design graph. In the v002.1 docs, a block-design
+  scaffold is build-setup material, not proof that implementation has
+  completed or that timing has closed.
+
+bitstream
+: FPGA configuration artifact produced after synthesis and implementation.
+  Public pccx docs should call a bitstream deployable only when the matching
+  evidence page or release checklist links the build, timing, and board
+  artifacts.
+
+SD staging
+: Packaging step that prepares files for booting or testing the KV260 from SD
+  media. It is a deploy-preparation step and does not by itself establish a
+  hardware run.
+
+## Data Types And Numeric Formats
+
+W4A8
+: Weight-4, Activation-8 quantization. In pccx v002 this means INT4 weights
+  multiplied by INT8 activations on the main integer compute path.
+
+W4A8KV4
+: Shorthand used for an evidence-gated Gemma 3N E4B target configuration:
+  W4A8 compute with 4-bit KV-cache storage. Treat it as a target configuration
+  label unless a page cites measured evidence.
+
+INT4
+: Signed 4-bit integer value, used for quantized weights in the W4A8 path.
+
+INT8
+: Signed 8-bit integer value, used for quantized activations in the W4A8 path.
+
+BF16
+: 16-bit brain floating point: 1 sign bit, 8-bit exponent, 7-bit mantissa. pccx
+  docs use BF16 for activation, KV-cache, or SFU paths where integer-only
+  arithmetic is not the intended representation.
+
+FP32
+: IEEE single-precision floating point. Public docs mention FP32 only where the
+  operation needs a higher-precision software or SFU-side representation.
+
+Precision promotion
+: Conversion from the integer compute path to BF16 or FP32 for non-linear or
+  numerically sensitive operations such as softmax, RMSNorm, GELU, and RoPE.
+
+Sign recovery
+: The correction step used when signed low-bit operands are packed into a
+  wider multiply datapath. In pccx docs the term is tied to W4A8 DSP packing,
+  not to model-level accuracy claims.
+
+Activation quantization
+: Policy for converting activation values into the representation consumed by
+  the integer datapath. The v002.1 decision page names the default policy but
+  does not claim final model accuracy.
+
+`e_max`
+: Maximum-exponent summary used by the v002.1 activation-scale policy. Public
+  docs describe it as a scale-selection mechanism, not as measured accuracy or
+  throughput evidence.
+
+BFP
+: Block floating point. In the v002.1 activation policy, BFP refers to a shared
+  power-of-two activation scale for a block of values.
+
+symmetric INT8
+: Reviewed activation-scale mode that uses symmetric signed INT8 quantization.
+  The design-decision page keeps it as a mode under review rather than the
+  v002.1 default.
+
+constant-cache scale
+: Driver-provided activation-scale table or constant path. It remains a
+  reviewed mode until the hardware/software interface and tests make it the
+  chosen default.
+
+`ACT_SCALE_POLICY`
+: Public parameter handle for the v002.1 activation scaling policy.
+
+`ACT_SCALE_EMAX_BFP`
+: Default v002.1 activation-scale mode named by the design-decision page:
+  `e_max` plus BFP power-of-two scaling.
+
+## Compute Blocks
+
+GEMM
+: General Matrix-Matrix Multiply. In v002 it is the matrix core used mainly for
+  prefill and other matrix-heavy work. The architecture docs describe a 32 x 32
+  systolic array for the KV260 configuration.
+
+GEMV
+: General Matrix-Vector Multiply. In v002 it is the vector core used for
+  decode-dominant work where a new token repeatedly multiplies an activation
+  vector by streamed weights.
+
+CVO
+: Complex Vector Op. ISA opcode family for non-linear vector operations and
+  reductions that execute on the SFU path.
+
+SFU
+: Special Function Unit. The backend that executes CVO operations such as exp,
+  sqrt, GELU, sin/cos, reduce-sum, scale, and reciprocal.
+
+PE
+: Processing Element. A compute cell in the systolic array or related datapath.
+
+Systolic array
+: Regular grid of PEs that moves operands through a fixed pattern. In pccx v002
+  public docs, this term usually refers to the GEMM array.
+
+Weight Stationary
+: GEMM dataflow where a weight tile is loaded into the array and reused across
+  many activation steps.
+
+Weight Streaming
+: GEMV dataflow where weights stream through the vector datapath because each
+  weight is used once for the current token step.
+
+LUT
+: Lookup table. In the FPGA sense, LUTs are logic resources. In the algorithmic
+  sense, pccx docs also use lookup tables for some dequantization or SFU helper
+  paths; read the local context.
+
+CORDIC
+: Iterative coordinate-rotation method used for selected transcendental
+  functions. pccx docs mention CORDIC as part of the SFU implementation path.
+
+K-split
+: Division of the reduction dimension into chunks. v002.1 docs discuss it with
+  drain cadence and accumulator bounds, not as a completed scheduler claim.
+
+drain cadence
+: Frequency at which partial accumulators are drained from a K-split path.
+  The current v002.1 default is parameterized rather than hardwired into a
+  public performance claim.
+
+`K_DRAIN_LIMIT`
+: Public parameter handle for the v002.1 K-split accumulator drain limit. The
+  documented default is `1024`.
+
+DSP accounting baseline
+: Convention for reporting intended compute-core DSP usage separately from
+  implementation extras. Actual utilization still comes from synthesis reports.
+
+`DSP_BASELINE_GEMM`
+: GEMM compute-core DSP baseline parameter. The v002.1 decision page sets it
+  to `1024` for the 32 x 32 PE grid.
+
+`DSP_BASELINE_GEMV`
+: GEMV compute-core DSP baseline parameter. The v002.1 decision page sets it
+  to `64` for four 16-DSP vector lanes.
+
+`DSP_BASELINE_ALPHA`
+: Accounting bucket for implementation extras outside the GEMM/GEMV baseline.
+
+## ISA And Runtime Terms
+
+ISA
+: Instruction Set Architecture. pccx v002 uses a custom fixed-width 64-bit ISA
+  for compute, memory, and CVO instructions.
+
+VLIW
+: Very Long Instruction Word. In pccx docs this describes the fixed-width
+  instruction format and explicit fields used by the NPU dispatcher.
+
+opcode
+: Operation-code field in an instruction. The v002 ISA pages are the source of
+  truth for opcode values and instruction field layouts.
+
+GEMM instruction
+: v002 ISA compute instruction that dispatches matrix-matrix work to the GEMM
+  backend.
+
+GEMV instruction
+: v002 ISA compute instruction that dispatches matrix-vector work to the GEMV
+  backend.
+
+MEMCPY instruction
+: v002 ISA memory movement instruction. See the ISA reference for supported
+  source and destination paths.
+
+MEMSET instruction
+: v002 ISA instruction used to write shape or constant-table state rather than
+  to run arithmetic.
+
+CVO instruction
+: v002 ISA instruction that dispatches an SFU function over a vector or
+  reduction operand.
+
+HAL
+: Hardware Abstraction Layer. The C/C++ driver layer that wraps register,
+  memory, and instruction-dispatch details for host software.
+
+Sail
+: ISA-specification language used by the pccx formal model. In pccx docs, Sail
+  models are used to check instruction semantics and field widths against the
+  intended ISA structure.
+
+launcher contract
+: Data-only interface between the planned KV260 runtime path and launcher
+  software. A contract page describes shapes and guardrails; it is not board
+  execution evidence.
+
+readiness scaffold
+: Typed placeholder or adapter surface that makes a future hardware path
+  reviewable before device access is implemented.
+
+AXI command/status shapes
+: Launcher-side data structures for command and status exchange over the
+  future KV260 boundary. Shape validation is contract evidence, not a live
+  MMIO run.
+
+result streaming
+: Runtime path for returning generated tokens or accelerator results. Public
+  docs should distinguish mock streams, serial test framing, and captured board
+  streams.
+
+serial TTY
+: Character-device path used by launcher or lab tooling to exchange framed
+  records with a connected target. Tests that skip without a device are not
+  board evidence.
+
+TraceStream
+: pccx-lab iterator contract for trace records. File replay and serial TTY
+  sources can share this surface while still having different evidence status.
+
+`KVFPGA_TTY`
+: Environment or configuration path naming the serial device used by the KV260
+  trace source.
+
+newline JSON framing
+: Trace framing style where one JSON payload is carried per line between
+  begin/end markers.
+
+CRC
+: Cyclic redundancy check. In pccx-lab trace framing docs it is used to detect
+  corrupted payloads; skipped bad frames should not be counted as valid
+  hardware evidence.
+
+sequence gap
+: Missing trace-frame sequence number reported by the lab pipeline. It is a
+  diagnostic signal that the captured stream may be incomplete.
+
+## Memory And Model Terms
+
+L1
+: Local per-core memory or buffer close to a compute backend.
+
+L2
+: Shared on-chip cache in the v002 architecture. It is backed by URAM and is
+  shared by GEMM, GEMV, SFU, and memory-dispatch paths.
+
+Weight Buffer
+: On-chip FIFO/buffer path for model weights arriving from external memory.
+  GEMM uses it for weight preload/reuse; GEMV uses it for streaming.
+
+KV cache
+: Attention key/value storage retained across autoregressive decoding steps.
+  pccx docs distinguish KV-cache design targets from measured board capacity
+  or throughput claims.
+
+Attention Sink
+: KV-cache policy term for retaining the first tokens of a prompt while using a
+  sliding local window for recent tokens.
+
+Local Window
+: KV-cache policy term for the recent-token region retained during long-context
+  decoding.
+
+RoPE
+: Rotary Position Embedding. pccx maps RoPE-related sine and cosine work to CVO
+  operations in the SFU path.
+
+RMSNorm
+: Root Mean Square Layer Normalization. In pccx docs this is one of the
+  non-linear or reduction-heavy operations associated with the SFU path.
+
+Softmax
+: Normalization used in attention. pccx docs map its exponential, reduction,
+  reciprocal, and scale steps to CVO/SFU operations.
+
+GELU
+: Gaussian Error Linear Unit activation. pccx docs map GELU to the CVO/SFU
+  path.
+
+Gemma 3N E4B
+: Target LLM family named in the v002 public docs. Claims about token rate or
+  board execution remain evidence-gated unless the page cites published
+  verification data.
+
+GemmaArchSpec
+: Launcher-side configuration object for Gemma shape metadata and packed-size
+  checks. It is a spec-validation surface, not a model execution claim.
+
+W4 prep
+: Launcher-side preparation of signed W4 packed weights and related metadata.
+  Current docs treat it as a software contract until hardware handoff evidence
+  lands.
+
+manifest metadata
+: Structured metadata that records prepared weight shapes, scales, packed
+  sizes, or related handoff fields for the launcher path.
+
+tokenizer contract
+: Offline tokenizer interface used by the launcher scaffold. Placeholder
+  fixtures do not claim real Gemma tokenizer assets.
+
+token streaming
+: Movement of prompt or generated-token data across a runtime boundary. In the
+  current software-path docs, serial and mock streaming are scaffold evidence
+  until board captures are published.
+
+marker-wrapped chunks
+: Token-transport records delimited by explicit markers, sometimes with length
+  prefixes. They define framing behavior rather than hardware throughput.
+
+mock orchestration
+: End-to-end software path that joins prompt encode, W4 prep, mock command
+  polling, output receive, and decode without a real board run.
+
+AltUp
+: Gemma-specific multi-stream state item named in v002.1 FAQ material. Its
+  effect on throughput or memory pressure still needs measured evidence before
+  public claims.
+
+LAuReL
+: Gemma-specific mechanism named in model and FAQ pages. Public docs may
+  describe the mapping, but speedup or accuracy claims need evidence.
+
+PLE
+: Per-Layer Embedding mechanism referenced by Gemma model docs. Treat
+  PLE-related scheduling text as design mapping unless an evidence page links a
+  measurement.
+
+grouped-query attention
+: Attention variant where each group of query heads shares one key/value head.
+  pccx docs discuss it as part of the Gemma mapping and KV-cache traffic
+  budget.
+
+cross-layer KV sharing
+: Gemma-specific KV reuse pattern that affects cache residency and traffic.
+  Public docs should keep it separate from measured throughput claims.
+
+EAGLE-3
+: Speculative-decoding technique named in the v002.1 roadmap scope. In this
+  repo it is planned work, not a completed v002.0 feature.
+
+SSD
+: Speculative-decoding roadmap item in the v002.1 scope. Expand or redefine
+  the acronym at the point of use when adding detailed public documentation.
+
+J Tree
+: Roadmap shorthand associated with the v002.1 speculative-decoding stack.
+  Treat it as planned scope until a design page defines and verifies it.
+
+G sparsity
+: Roadmap lane for v002.1 sparsity work. It should be described as ramp scope
+  until implementation and evidence pages define and verify it.
+
+H/H+
+: Roadmap shorthand for EAGLE-3 speculative-decoding phases in the v002.1
+  ramp.
+
+I SSD
+: Roadmap shorthand for the SSD phase in the v002.1 ramp.
+
+K benchmark
+: Roadmap shorthand for benchmark/evidence work after the v002.1 mechanism
+  lanes. Benchmarks become public claims only through the evidence gates.
+
+## Metrics And Evidence
+
+tok/s
+: Tokens per second. pccx uses this as the primary user-visible decoding
+  throughput unit.
+
+TT
+: Throughput target. This is planning shorthand for a target token rate, not a
+  measurement. Public pages should prefer spelling out "throughput target" on
+  first use.
+
+measured-only
+: Documentation posture for the v002.0 release line: do not quote throughput,
+  timing closure, or board-run claims until the evidence checklist admits those
+  measurements.
+
+bring-up
+: Hardware integration phase where the bitstream, board setup, host driver,
+  and smoke tests are made to run together. Bring-up logs are evidence inputs,
+  not automatically release claims.
+
+release evidence
+: Checklist-gated artifacts used to decide whether timing, throughput, or
+  board-execution statements are allowed in public docs.
+
+evidence inventory
+: Public list of measured, reproducible artifacts and pending gates. It is the
+  place to check whether a value is measured, pending, or only a target.
+
+claim guard
+: Review rule or scan that prevents public docs from turning targets,
+  scaffolds, mocks, or pending gates into completed hardware claims.
+
+pre-flight
+: Preparatory state for build, launcher, or deploy work before the full command
+  sequence has been run and evidence has been captured.
+
+smoke capture
+: Small board or tool run used to collect initial logs. It can support bring-up
+  evidence, but it does not replace release evidence for timing or throughput.
+
+timing report
+: Vivado report used to justify timing wording. A docs page should not claim
+  timing closure without a linked report or release evidence entry.
+
+utilization report
+: Vivado report used to justify FPGA resource wording such as DSP, LUT, BRAM,
+  or URAM counts.
+
+throughput target
+: Planned token-rate goal. It must remain distinct from measured throughput in
+  public wording.
+
+board run
+: Execution against a connected KV260 or other named target board. Mock tests,
+  type checks, and local software orchestration are not board runs.
+
+trace replay
+: Analysis of an existing `.pccx` trace file through pccx-lab tooling. Replay
+  can validate analysis paths without proving new hardware execution.
+
+## Documentation And Release Terms
+
+spec resolution
+: Reader step that separates architecture intent, model mapping, ISA source of
+  truth, and measured evidence before quoting a claim.
+
+runbook
+: Step-by-step command record for a build, local docs check, deploy, or
+  hardware procedure. A runbook is procedure evidence only after the commands
+  and results are captured.
+
+deploy runbook
+: Documentation path for publishing the Sphinx site through GitHub Pages. A
+  deploy check proves publication, not hardware performance.
+
+release status
+: Label such as draft, pre-release, latest release, or archived release used by
+  release notes. It should not be overloaded with hardware readiness.
+
+pre-release
+: GitHub Release state for work that is published before being treated as a
+  final release.
+
+validation status
+: Release-note field that records which checks passed, failed, or were not run.
+  It should name commands or CI runs where useful.
+
+known limitations
+: Release-note section for caveats, missing evidence, or deferred capability.
+
+release checklist
+: Maintainer checklist for release hygiene. For pccx ISA PDF changes, the
+  checklist includes rebuilding the PDF from `main.tex`.
+
+GitHub Pages deploy
+: Publication workflow for the documentation site. Passing deploy does not
+  convert a target, mock, or pending gate into measured evidence.
+
+contributors acknowledgement
+: Public recognition of people who contribute documentation, reviews, bug
+  reports, diagrams, examples, or related code after maintainers accept the
+  entry for publication.
+
+news section
+: Placeholder area for future project updates, release announcements, and
+  community news. It should not carry release claims without the same evidence
+  gates as the rest of the docs.
diff --git a/docs/index.rst b/docs/index.rst
index e65147752..bc0285639 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -101,6 +101,8 @@ Working tracks for the next release lines:
 
 The :doc:`roadmap` summarises how the three tracks relate, and the
 ``pccx`` family-tree figure on that page links them visually.
+The :doc:`glossary` defines project and v002 architecture terms used
+across the public docs.
 
 The v001 architecture is archived at
 :doc:`archive/experimental_v001/index`.
@@ -202,3 +204,4 @@ risks, keeping the ecosystem safe for open-source hardware development.
 
    v003/index
    vision-v001/index
+   glossary
diff --git a/index.rst b/index.rst
index e09d1b39e..8febfeab8 100644
--- a/index.rst
+++ b/index.rst
@@ -110,6 +110,7 @@ Tooling & Lab
 
    docs/index
    docs/quickstart
+   docs/glossary
    docs/Evidence/index
    docs/roadmap