Merged
8 changes: 8 additions & 0 deletions .gitattributes
@@ -14,3 +14,11 @@ jinn-guard-v1.0-review.zip export-ignore
# Rust-dominant reality) and collapses it in diffs.
bpf/vmlinux.h linguist-vendored
bpf/vmlinux.h linguist-generated

+# Jinn Guard is pitched and built as a Rust project; the eBPF/BPF-LSM C objects
+# under bpf/ are the in-kernel enforcement layer, but by byte count they dominate
+# GitHub's language bar and misrepresent the repo as a C codebase. Mark the whole
+# bpf/ tree linguist-vendored so the language bar reflects the Rust-primary
+# reality. The source stays in the repo and fully visible — this only affects
+# Linguist's stats, and GitHub re-indexes the language bar on the next push.
+bpf/** linguist-vendored
2 changes: 1 addition & 1 deletion BENCHMARKS-03.md
@@ -96,7 +96,7 @@ expected-deny); safe mode ran 250 (all expected-allow, audit-only).
## 3. Kernel path resolution (Tier 3 — audit-only)

The LSM hooks loaded in safe mode and resolved **full absolute file paths**
-(the CVE-2026-002 fix) on 6.17 — audit-only, nothing blocked. PASS.
+(the JG-ADV-2026-002 fix) on 6.17 — audit-only, nothing blocked. PASS.

---

10 changes: 5 additions & 5 deletions BENCHMARKS-04.md
@@ -7,7 +7,7 @@
> Purpose: extend coverage into the **RHEL family** (a third, distinct kernel
> lineage) under **SELinux Enforcing**, and validate real eBPF-LSM allow/deny
> enforcement there. This run also **found and fixed a real fail-open bug**
-> (CVE-2026-003) — see §6 — which is itself the strongest argument that a
+> (JG-ADV-2026-004) — see §6 — which is itself the strongest argument that a
> distro-matrix is worth running.

---
@@ -70,15 +70,15 @@ SELinux denial**, enforcing allow/deny in-kernel. Each surface: 500 operations

> **2,750 enforced operations on AlmaLinux 9 / kernel 5.14 under SELinux
> Enforcing: 0 fail-open, 0 incorrect decisions, 0 timeouts** — *after* the
-> CVE-2026-003 fix (§6). The eBPF programs verified and loaded cleanly on a third
+> JG-ADV-2026-004 fix (§6). The eBPF programs verified and loaded cleanly on a third
> kernel lineage; SELinux and the BPF-LSM coexisted without interference.

---

## 3. Kernel path resolution (Tier 3 — audit-only)

LSM hooks loaded in safe mode and resolved full absolute file paths
-(CVE-2026-002 fix) on 5.14 — audit-only, nothing blocked. PASS.
+(JG-ADV-2026-002 fix) on 5.14 — audit-only, nothing blocked. PASS.

---

@@ -175,7 +175,7 @@ Two reads, deliberately kept separate:

---

-## 7. What this run found and fixed — CVE-2026-003
+## 7. What this run found and fixed — JG-ADV-2026-004

On the **first** armed run, AlmaLinux 9 / 5.14 exposed a **fail-open** in
`socket_connect`: a *variable* fraction (~30–55% under load) of denied TCP
@@ -195,5 +195,5 @@ connects were wrongly allowed, while UDP/exec/file held at 0. Investigation
tripped a fail-open type gate. Fixed by reading the correct width. (This bug
was **latent on every distro** — Debian/Ubuntu merely got zero padding.)

-Both fixes landed (CVE-2026-003) and this run is the **post-fix re-validation**:
+Both fixes landed (JG-ADV-2026-004) and this run is the **post-fix re-validation**:
`fail_open=0` on every surface. The distro-matrix did exactly its job.
139 changes: 139 additions & 0 deletions BENCHMARKS-05.md
@@ -0,0 +1,139 @@
# Jinn Guard — Benchmark Run 05

**Test #:** 05 (launch-hygiene re-validation) · **Run date:** 2026-06-19
**Branch:** `chore/launch-hygiene` · **Host:** local dev sandbox (`jinn-dev`)
**See also:** [`BENCHMARKS-01.md`](BENCHMARKS-01.md) · [`BENCHMARKS-02.md`](BENCHMARKS-02.md) · [`BENCHMARKS-03.md`](BENCHMARKS-03.md) · [`BENCHMARKS-04.md`](BENCHMARKS-04.md)

> Purpose: re-run the full **userspace** test + benchmark suite after the
> launch-hygiene pass (advisory-ID rename, README register fix, doc cleanups —
> all docs/comments, **no logic**) to confirm behavior and performance are
> unchanged. This host runs the **same CPU family and kernel** as Run 01
> (AMD Ryzen 5 7520U / Debian 13 / kernel 6.12), so the numbers are directly
> comparable to that baseline.

---

## Environment

| | |
|---|---|
| CPU | **AMD Ryzen 5 7520U** (8 threads, scaling ~83%, max 4.38 GHz) — same model as Run 01 |
| Distribution | **Debian 13** (trixie family) |
| Kernel | **Linux 6.12.90+deb13.1-amd64** |
| RAM | ~5.75 GiB |
| `/tmp` | **tmpfs** — audit log + lineage are CPU-isolated from disk-fsync latency (as in Run 03/04) |
| Toolchain | rustc/cargo **1.95.0**, release profile, clang 19 |
| Privilege | **uid 1000, no `bpftool`** → kernel-LSM Tier 4 (armed allow/deny) **not run here** |

> **Scope note.** In-kernel allow/deny enforcement (Tier 4) requires root and the
> ability to load BPF programs, so it is **not** exercised on this unprivileged sandbox. It is already
> validated on three real hosts in [`BENCHMARKS-01..04`](BENCHMARKS-04.md)
> (Debian 6.12, Ubuntu 6.17, AlmaLinux 5.14 — 2,500–2,750 ops, 0 fail-open).
> This run covers the **full automated suite + userspace performance**.

---

## 1. Full automated test suite

`cargo test --workspace --release`:

| Binary | Result |
|---|---|
| `ts_checker` (Z3 SMT) | 4 passed |
| `ts_cli` unit | 87 passed |
| `integration` | 13 passed |
| `swarm_attack` (adversarial) | 12 passed |
| `kernel_lsm` (Tier 4) | 6 **ignored** (env-gated: needs root + BPF) |

> **116 passed · 0 failed · 6 ignored** (122 defined). Identical pass profile to
> Run 04. The launch-hygiene changes did not alter any behavior.
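
The tally above can be cross-checked mechanically. The following is a minimal sketch, not code from the Jinn Guard repository: `BinaryResult`, `RUN_05`, `tally`, and `ignored_binaries` are hypothetical names, the counts are copied from the table, and zero failures are assumed, as reported.

```rust
/// Outcome counts for one test binary, as listed in the results table.
struct BinaryResult {
    name: &'static str,
    passed: u32,
    ignored: u32,
}

/// Per-binary results copied from the Run 05 table (hypothetical constant).
const RUN_05: [BinaryResult; 5] = [
    BinaryResult { name: "ts_checker", passed: 4, ignored: 0 },
    BinaryResult { name: "ts_cli unit", passed: 87, ignored: 0 },
    BinaryResult { name: "integration", passed: 13, ignored: 0 },
    BinaryResult { name: "swarm_attack", passed: 12, ignored: 0 },
    BinaryResult { name: "kernel_lsm", passed: 0, ignored: 6 },
];

/// Returns (passed, ignored, defined) summed over all binaries.
/// With zero failures, "defined" is simply passed + ignored.
fn tally(results: &[BinaryResult]) -> (u32, u32, u32) {
    let passed: u32 = results.iter().map(|r| r.passed).sum();
    let ignored: u32 = results.iter().map(|r| r.ignored).sum();
    (passed, ignored, passed + ignored)
}

/// Names of the binaries that had at least one ignored test.
fn ignored_binaries(results: &[BinaryResult]) -> Vec<&'static str> {
    results
        .iter()
        .filter(|r| r.ignored > 0)
        .map(|r| r.name)
        .collect()
}
```

Calling `tally(&RUN_05)` yields the three headline figures, and `ignored_binaries(&RUN_05)` shows that every ignored test belongs to the env-gated `kernel_lsm` binary.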

## 2. Attack resistance (adversarial suite)

`swarm_attack`: **12/12 passed, 0 fail-open**. Scenarios include replay storm, signature forgery,
intent injection, quota abuse, anonymous flood, impersonation, path traversal,
forged delegation, bad-protocol, and the all-at-once mixed assault.

---

## 3. Userspace latency & throughput (`cargo bench --bench stress_bench`)

### Single-client latency (10,000 sequential, full decision pipeline)

| Percentile | Run 05 (Ryzen 5 7520U) | Run 01 baseline (same CPU) |
|---|---|---|
| P50 | **259 µs** | 257 µs |
| P75 | 304 µs | — |
| P90 | 435 µs | — |
| P95 | **533 µs** | 366 µs |
| P99 | **782 µs** | 463 µs |
| P99.9 | 1,243 µs | — |
| Max | 2,962 µs | 1,900 µs |
| Single-client RPS | **~3,219** | ~3,640 |

> P50 matches Run 01 to within noise (259 vs 257 µs). Tail percentiles are higher
> here (P95 533 vs 366 µs, P99 782 vs 463 µs) because this is a **shared,
> non-CPU-isolated sandbox** at ~83% scaling, not a dedicated host, so tail latency
> is noisier. Single-client RPS is also about 12% lower (~3,219 vs ~3,640). The
> median (the pipeline's real cost) is unchanged.
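
For readers comparing the percentile rows across runs, here is a minimal sketch of one common way to derive them from raw latency samples, the nearest-rank method. Whether `stress_bench` uses this exact definition is an assumption; `percentile`, `summarize`, and the per-mille argument are hypothetical names chosen for illustration. Per-mille integers are used so that P99.9 can be expressed without floating-point rounding.

```rust
/// Nearest-rank percentile over an ascending-sorted slice.
/// `per_mille` is the percentile in tenths of a percent (P50 = 500, P99.9 = 999).
/// Returns `None` for an empty slice or an out-of-range percentile.
fn percentile(sorted: &[u64], per_mille: u64) -> Option<u64> {
    if sorted.is_empty() || per_mille > 1000 {
        return None;
    }
    let n = sorted.len() as u64;
    // Integer ceiling of (per_mille / 1000) * n, which is the 1-based rank.
    let rank = (per_mille * n + 999) / 1000;
    // Rank 0 (only possible for P0) maps to the smallest sample.
    let index = rank.max(1) - 1;
    Some(sorted[index as usize])
}

/// Sorts raw latency samples (in microseconds) and returns (P50, P95, P99).
fn summarize(mut samples_us: Vec<u64>) -> Option<(u64, u64, u64)> {
    samples_us.sort_unstable();
    Some((
        percentile(&samples_us, 500)?,
        percentile(&samples_us, 950)?,
        percentile(&samples_us, 990)?,
    ))
}
```

With 10,000 samples, as in the single-client run, P99 is decided by the slowest 100 requests and P99.9 by the slowest 10, which is why those rows move far more than P50 on a shared host.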

### Concurrent throughput (tmpfs `/tmp`; 0 errors at every level)

| Agents | Total RPS | P50 | P95 | Errors |
|---|---:|---:|---:|---:|
| 10 | **6,208** | 1,220 µs | 1,874 µs | 0 |
| 50 | 6,055 | 1,220 µs | 2,432 µs | 0 |
| 100 | 6,159 | 1,233 µs | 2,535 µs | 0 |
| 500 | 5,741 | 1,252 µs | 37,107 µs | 0 |

> Peak **~6,208 RPS**, flat to 100 concurrent agents, **0 errors** throughout.
> At 500 agents throughput dips about 8% (to 5,741 RPS) and the P95 tail balloons
> to ~37 ms (scheduling congestion on 8 threads), consistent with Run 01 (~6,500 peak).

### Mixed allow/deny (70/30)

5,000 requests → **3,500 allow / 1,500 deny classified correctly, 0
misclassifications** (~3,517 RPS).

### Saturation sweep

| Threads | RPS | P99 |
|---|---:|---:|
| 2 | 4,556 | 1 ms |
| 4 | 4,809 | 1 ms |
| 8 | 5,111 | 2 ms |
| 16 | 4,781 | 5 ms |
| 32 | 4,888 | 9 ms |
| 64 | **saturated** (P99 > 10 ms) | — |

---

## 4. Component micro-benchmarks (criterion)

| Path | Median | Throughput |
|---|---:|---:|
| Core decision pipeline (in-process) | **73.2 µs** | ~13.6 K/s |
| UDS framed roundtrip (persistent conn) | **16.2 µs** | ~61.6 K/s |
| End-to-end serial roundtrip (new conn/request) | **151.1 µs** | ~6.6 K/s |

> The UDS transport (~16 µs) costs about 22% of the in-process decision pipeline
> (~73 µs); the pipeline, not the socket, dominates. *(The persistent-connection case in
> `socket_throughput` hit a `BrokenPipe` in the bench harness mid-run, which is a
> harness-robustness quirk, not a daemon fault; the e2e new-connection figure
> above completed cleanly.)*
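
The throughput column follows from the medians by simple arithmetic: serial operations per second is one million divided by the per-operation latency in microseconds. The sketch below shows that conversion; `ops_per_second` and `latency_ratio` are hypothetical helper names, not Criterion or Jinn Guard APIs.

```rust
/// Serial throughput (operations per second) implied by a per-operation
/// latency given in microseconds.
fn ops_per_second(median_us: f64) -> f64 {
    1_000_000.0 / median_us
}

/// Ratio of one latency to another, e.g. transport cost relative to the
/// in-process decision cost.
fn latency_ratio(part_us: f64, whole_us: f64) -> f64 {
    part_us / whole_us
}
```

Applied to the table, 73.2 µs gives roughly 13.7 K/s, 16.2 µs roughly 61.7 K/s, and 151.1 µs roughly 6.6 K/s. The small last-digit differences from the reported column are probably down to the medians being rounded for display, though that explanation is a guess. `latency_ratio(16.2, 73.2)` comes out near 0.22.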

---

## 5. Scope & honesty notes

- Userspace only; **kernel Tier 4 not run on this unprivileged sandbox** — see
Runs 01–04 for live in-kernel enforcement.
- Shared sandbox at ~83% CPU scaling: treat **P50/medians** as representative and
**tails** as noisier than a dedicated host would show.
- Still a validated research prototype / controlled-pilot MVP, not independently
audited. See [`THREAT_MODEL.md`](THREAT_MODEL.md) and
[`SECURITY/ADVISORIES.md`](SECURITY/ADVISORIES.md).

**Bottom line:** post-launch-hygiene, all **116 executed tests pass** (6 kernel-LSM
tests ignored, 0 fail-open in the adversarial suite) and median userspace
performance is in line with the Run 01 baseline on the same CPU model, which is
consistent with the docs/comment-only hygiene pass having changed nothing operational.
4 changes: 2 additions & 2 deletions CHANGELOG.md
@@ -124,10 +124,10 @@ Linux 6.12 host across all four validation tiers.
log, and disclosed residual risks.

### Fixed
-- **CVE-2026-002 (Critical) — filesystem policy bypass via relative paths.**
+- **JG-ADV-2026-002 (Critical) — filesystem policy bypass via relative paths.**
Kernel now resolves the full absolute path before the denylist check
(`jg_read_dentry_path`, depth-12 dentry walk). Verified audit-only and armed.
-- **CVE-2026-001 (High) — execve bypass via interpreter chains.** Governed agents
+- **JG-ADV-2026-001 (High) — execve bypass via interpreter chains.** Governed agents
are denied known interpreters (`DENY_INTERPRETER_NOT_ALLOWED`).
- **Fail-open regression (enterprise18).** The `system_immunity` and
"out-of-scope" ALLOW fast-paths ran *before* the gate chain, letting
43 changes: 43 additions & 0 deletions LAUNCH_CHECKLIST.md
@@ -0,0 +1,43 @@
# Launch Checklist

Hygiene pass before the first distribution push to a security-literate audience
(r/netsec, HN, AFWERX/DIU-adjacent). The repo's intellectual-honesty framing
(validation-status section, `THREAT_MODEL.md`, "what it does NOT claim") is the
core asset; everything here brings the rest of the repo up to that bar.

## Automated pass (branch `chore/launch-hygiene`)

- [x] **A — Kill self-assigned CVE identifiers.** Renamed project-assigned CVE
identifiers to internal `JG-ADV-*` advisory IDs across docs and (in the
follow-up pass) code comments + script output strings. Rename record:
- `CVE-2026-001` -> `JG-ADV-2026-001` (execve interpreter-chain bypass)
- `CVE-2026-002` -> `JG-ADV-2026-002` (filesystem relative-path bypass)
- `CVE-2026-003` -> `JG-ADV-2026-003` (agent impersonation / UID spoofing)
- `CVE-2026-003` -> `JG-ADV-2026-004` (socket-LSM fail-open — renumbered to
resolve the duplicate-`003` collision; the newer finding takes the new
number)

A one-line disclaimer was added at first use in `README.md` and
`THREAT_MODEL.md` ("`JG-ADV-*` are internal, self-identified advisory IDs,
not CVE records issued by a CNA"). Canonical registry:
[`SECURITY/ADVISORIES.md`](SECURITY/ADVISORIES.md).
- [x] **B — Fix the register collision.** Retitled the README headline from
"Enterprise Semantic Firewall" to "Kernel-enforced semantic firewall for
autonomous AI agents (validated research prototype)" to match the
validation-status section. The Fleet & Enterprise feature-tier section
(feature-gated, off by default) is legitimate and left as-is.
- [x] **C — Repo metadata.** About/description + topics command prepared (run
manually — `gh` not available in the working environment; exact command in
the PR summary).
- [x] **D — Fix the language bar.** Marked `bpf/**` `linguist-vendored` in
`.gitattributes` so the language bar reads Rust-primary. GitHub re-indexes
on the next push.
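
The rename record in item A has one subtlety: `CVE-2026-003` was attached to two different findings, so the legacy ID alone cannot determine the new one. A minimal sketch of the mapping, keyed on both the legacy ID and the finding, might look like this. The `Finding` enum and `renamed_id` function are hypothetical names for illustration and do not exist in the repository.

```rust
/// The finding a legacy identifier referred to. Needed as a second key
/// because `CVE-2026-003` was used for two distinct findings.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Finding {
    ExecveInterpreterChain,
    FilesystemRelativePath,
    AgentImpersonation,
    SocketLsmFailOpen,
}

/// Maps a legacy ID plus the finding it described to the new advisory ID.
/// Returns `None` for any pair that is not in the rename record.
fn renamed_id(legacy: &str, finding: Finding) -> Option<&'static str> {
    match (legacy, finding) {
        ("CVE-2026-001", Finding::ExecveInterpreterChain) => Some("JG-ADV-2026-001"),
        ("CVE-2026-002", Finding::FilesystemRelativePath) => Some("JG-ADV-2026-002"),
        ("CVE-2026-003", Finding::AgentImpersonation) => Some("JG-ADV-2026-003"),
        // The newer finding takes the new number, resolving the collision.
        ("CVE-2026-003", Finding::SocketLsmFailOpen) => Some("JG-ADV-2026-004"),
        _ => None,
    }
}
```

This is also why a blind search-and-replace of `CVE-2026-003` is not safe: each occurrence has to be checked against the finding it describes before choosing `-003` or `-004`.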

## Manual (human, not agent)

- [ ] **MANUAL** — Re-record demo: screen-recorder window must never overlay
terminal content (currently covers the open and the closing thesis card).
- [ ] **MANUAL** — Post demo as a **native LinkedIn video** (uploaded, not a
link). Put the repo URL in the **first comment**, not the post body.
- [ ] **MANUAL** — Warm DM to a named contact for a repost into the
AFWERX/DIU-adjacent network.
4 changes: 2 additions & 2 deletions OWASP-MAPPING.md
@@ -79,9 +79,9 @@ can cause.
An agent generates, modifies, or runs code/commands unsafely.
The kernel `bprm_check_security` LSM hook denies execution of non-allowlisted
binaries for governed agents, with interpreter-chain mitigation
-(CVE-2026-001, `DENY_INTERPRETER_NOT_ALLOWED`). Enforcement is in the kernel and
+(JG-ADV-2026-001, `DENY_INTERPRETER_NOT_ALLOWED`). Enforcement is in the kernel and
cannot be bypassed by the agent process.
-*Evidence:* `bpf/lsm/jg_bprm_check_security.c`; `CHANGELOG.md` (CVE-2026-001);
+*Evidence:* `bpf/lsm/jg_bprm_check_security.c`; `CHANGELOG.md` (JG-ADV-2026-001);
Tier-4 armed validation.

### ASI06 — Memory and Context Poisoning · **Out of scope**
4 changes: 2 additions & 2 deletions PROFESSOR_VALIDATION.md
@@ -43,7 +43,7 @@ then prints a PASS/SKIP/FAIL summary. Skipped tiers tell you what they need.
|------|----------------|--------------|------------------|
| **1. Build + tests** | The full automated suite passes (≈117 tests: Z3 engine, governance pipeline, 13 integration, 12 swarm-attack). | Rust (`cargo`) | No |
| **2. Mandatory mediation** | A maximally-locked agent container (no network, read-only FS, all capabilities dropped, seccomp, socket-only) **cannot** act directly; only broker-mediated actions through Jinn Guard succeed. | Docker | No (containers) |
-| **3. Kernel path resolution** | The eBPF-LSM hooks load and resolve **full file paths** in the kernel (the CVE-2026-002 fix), in **audit-only** mode. | root + BPF-LSM + clang | **No** (audit-only) |
+| **3. Kernel path resolution** | The eBPF-LSM hooks load and resolve **full file paths** in the kernel (the JG-ADV-2026-002 fix), in **audit-only** mode. | root + BPF-LSM + clang | **No** (audit-only) |
| **4. Kernel enforcement** | Real allow/deny across execve, TCP, UDP, file create, and file unlink. | root + `--arm` + cgroup v2 | **Only inside a dedicated test cgroup** — see below |

---
@@ -117,7 +117,7 @@ denied operation was actually denied, and every allowed operation succeeded.
security-critical cases — resolve to full absolute paths.
- **Interpreter chains.** An agent explicitly allowed to run an interpreter can
invoke other tools through it; Jinn Guard denies interpreters by policy for
-governed agents (CVE-2026-001 mitigation), but per-binary execve limits are
+governed agents (JG-ADV-2026-001 mitigation), but per-binary execve limits are
only as strong as the allowlist.
- **Not independently audited; single-distribution validated (Debian).**

18 changes: 14 additions & 4 deletions README.md
@@ -1,10 +1,10 @@
-# 🛡️ Jinn Guard — Enterprise Semantic Firewall
+# 🛡️ Jinn Guard — Kernel-enforced semantic firewall for autonomous AI agents (validated research prototype)

[![CI](https://github.com/AlphaReasoning/The-Jinn-Guard/actions/workflows/ci.yml/badge.svg)](https://github.com/AlphaReasoning/The-Jinn-Guard/actions/workflows/ci.yml)

**Jinn Guard** is an asynchronous, kernel-aware semantic firewall designed to enforce mathematical safety constraints on autonomous AI agents before any tool execution is permitted. It intercepts high-level natural language intents and processes them through a lifetime-anchored **Z3 SMT solver pipeline** — verifying state transitions and risk ceilings against formalized compliance models before granting or denying execution authority.

-Operating locally over high-throughput **UNIX domain sockets** on AlphaOS, the platform binds user-space proxy validation with low-level **eBPF kernel telemetry** and namespace tracking to guarantee absolute zero-trust process isolation and immutable anti-replay protection across the entire host subsystem.
+Operating locally over high-throughput **UNIX domain sockets**, the platform binds user-space proxy validation with low-level **eBPF kernel telemetry** and namespace tracking to enforce zero-trust process isolation and anti-replay protection for governed cgroups.

> ### ▶️ See it live in 5 minutes
> ```bash
@@ -170,6 +170,12 @@ Kernel Layer (eBPF)
└─→ governance loop (telemetry feed)
```

+> **A note on languages.** The probes in `bpf/` are C — small, separately-compiled
+> eBPF programs loaded into the kernel. The governance core (the daemon, the Z3
+> verification pipeline, the policy engine, and the CLI) is **Rust**, under `ts_cli/`.
+> `bpf/**` is marked `linguist-vendored`, so GitHub's language bar reflects the Rust
+> core rather than the volume of low-level kernel C.
+
---

## 📦 Components
@@ -246,7 +252,11 @@ Validated on three distributions / three kernel generations: **Debian 13 / kerne

## Known Limitations

-### Filesystem path resolution — mount boundaries (was CVE-2026-002, now fixed)
+> **Advisory registry:** the canonical list of `JG-ADV-*` IDs, status, and fix commits lives in [`SECURITY/ADVISORIES.md`](SECURITY/ADVISORIES.md).
+>
+> **Note on identifiers:** `JG-ADV-*` are internal, self-identified advisory IDs, not CVE records issued by a CNA.
+
+### Filesystem path resolution — mount boundaries (was JG-ADV-2026-002, now fixed)

The BPF `inode_create`/`inode_unlink` hooks now resolve the **full absolute
path** of a file operation in the kernel (a bounded `d_parent` walk), closing the
@@ -260,7 +270,7 @@ on a single-root install) — the security-critical cases — resolve to full
absolute paths. Crossing mount boundaries requires path-family LSM hooks or
`bpf_d_path` and is tracked for a future release.
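
To illustrate the idea of a bounded parent walk, here is a userspace sketch in Rust. It is not the project's eBPF C code (`jg_read_dentry_path`): the `Dentry` struct, `resolve_path` function, and index-based parent links are hypothetical stand-ins, the 12-component budget mirrors the "depth-12 dentry walk" mentioned in the changelog, and returning `None` when the budget runs out is an assumption about how a caller might treat an unresolved path. Like the real walk, it stays within one tree and does not model mount boundaries.

```rust
/// Maximum number of path components the walk will collect (assumed to
/// correspond to the changelog's "depth-12 dentry walk").
const MAX_DEPTH: usize = 12;

/// Simplified stand-in for a kernel dentry: a name plus the index of its
/// parent. The root points at itself, as a root dentry's `d_parent` does.
struct Dentry {
    name: &'static str,
    parent: usize,
}

/// Walks parent links from `start` and returns the absolute path, or `None`
/// if an index is invalid or the root is not reached within `MAX_DEPTH`
/// components.
fn resolve_path(dentries: &[Dentry], start: usize) -> Option<String> {
    let mut components: Vec<&str> = Vec::new();
    let mut current = start;
    // One extra iteration so that a path with exactly MAX_DEPTH components
    // can still observe the root on its final step.
    for _ in 0..=MAX_DEPTH {
        let node = dentries.get(current)?;
        if node.parent == current {
            // Reached the root: names were collected leaf-first, so reverse.
            components.reverse();
            return Some(format!("/{}", components.join("/")));
        }
        components.push(node.name);
        current = node.parent;
    }
    None // walk budget exhausted before reaching the root
}
```

The fixed loop bound is the important part: eBPF programs must have provably bounded loops to pass the verifier, so a path walk has to cap its depth rather than loop until the root is found.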

-### Interpreter chains (CVE-2026-001, mitigated)
+### Interpreter chains (JG-ADV-2026-001, mitigated)

An agent explicitly allowed to run an interpreter can invoke other tools through
it. Jinn Guard denies known interpreters by policy for governed agents (any