Nutanix technical questionnaire reviews — completed by Sam (AI assistant) using battlecard-anchored RAG search.
The reviews in this repo are the output of a controlled comparison between two approaches to AI-assisted technical document review: RAG-grounded (live retrieval from Nutanix KB and docs) vs direct LLM (no retrieval, training-cutoff knowledge only).
The comparison below is from actual production queries during the same session. Same question, same model, same context — only the retrieval pipeline differs.
Query: "What are the differences between Red Hat OpenShift and Nutanix Native Hyperconverged Infrastructure?"
| Metric | Non-RAG (Direct LLM) | RAG-Grounded |
|---|---|---|
| Answer quality | Hallucinated | Battlecard-sourced |
| Latency | ~1–2s | ~6–8s |
| Input tokens | 14,394 | ~1,700 |
| Output tokens | 1,621 | ~1,600 |
| Total tokens | 75,631 | ~3,300 |
| Knowledge source | Training data, frozen at cutoff | Live from current KB |
- ~23× fewer total tokens (75,631 vs ~3,300): retrieval grounds the answer, so the model doesn't need to "guess" its context into existence
- Battlecard sourcing: answers cite specific KB numbers, product names, and version facts rather than just sounding plausible
- Domain accuracy: Nutanix-specific claims (AOS versions, RF2/RF3 behaviour, NCC health checks) are verified against actual docs, not hallucinated
- ~6–8s latency (vs ~1–2s direct): the extra time goes to vector search, reranking, and light LLM synthesis; acceptable for quality-sensitive work
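
The headline figures in the list above can be re-derived from the table. A quick check, using only the numbers reported there (variable names are illustrative, and the RAG values are the approximate ones from the table):

```python
# Figures copied from the comparison table; RAG values are approximate (~).
direct_total_tokens = 75_631
rag_total_tokens = 3_300

token_ratio = direct_total_tokens / rag_total_tokens
print(f"Direct LLM uses about {token_ratio:.1f}x the tokens of RAG")  # 22.9x

# Latency ranges in seconds, as (low, high).
direct_latency = (1, 2)
rag_latency = (6, 8)

# Smallest and largest possible gap between the two reported ranges.
min_overhead = rag_latency[0] - direct_latency[1]
max_overhead = rag_latency[1] - direct_latency[0]
print(f"RAG adds roughly {min_overhead}-{max_overhead}s per query")  # 4-7s
```

The ratio rounds to the ~23× quoted above. Because both latency figures are ranges, the overhead is best read as a range as well.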
| Approach | Best for |
|---|---|
| Direct LLM | Speed-first tasks where training knowledge is sufficient: drafting generic content, brainstorming, language polishing |
| RAG | Domain-specific questions where accuracy matters: Nutanix configs, KB references, version lifecycle, compatibility matrices |
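
The decision table above could be applied automatically with a simple router. The following is a minimal sketch, not something this repo ships: `choose_approach` and the `DOMAIN_TERMS` keyword list are hypothetical names, and the keywords are assumptions drawn from the examples in the table.

```python
# Hypothetical keyword list; tune it to the vocabulary of your own KB.
DOMAIN_TERMS = {"nutanix", "aos", "ncc", "rf2", "rf3", "kb", "compatibility", "lifecycle"}

def choose_approach(question: str) -> str:
    """Return "rag" for domain-specific questions, "direct" otherwise."""
    # Lowercase each word and strip surrounding punctuation before matching.
    words = {w.strip(".,?!:;()\"'").lower() for w in question.split()}
    return "rag" if words & DOMAIN_TERMS else "direct"

print(choose_approach("Which AOS versions support RF3?"))             # rag
print(choose_approach("Polish the wording of this intro paragraph"))  # direct
```

Keyword matching is deliberately crude. It defaults to the faster direct path and only pays the retrieval cost when a question visibly touches domain terms.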
For the Nutanix questionnaire work in this repo, all reviews use the RAG-grounded approach — because accuracy against specific product versions, hardware compatibility lists, and lifecycle dates is what determines whether a submission is marked responsive or non-responsive.
doc-reviews/
├── README.md ← this file
└── review.md ← completed questionnaire review (2026-04-21)
- Sam (this agent): RAG search via `nutanix_rag_search.py` → LanceDB `nutanix_rag_v3` → Gemma 4 31B reranking → synthesised answer
- NX_Shield (parallel agent, external engineers): Direct LLM (no RAG), for contrast baseline
- Vector DB: LanceDB (`nutanix_rag_v3`, ~170k rows) with Jina AI embeddings
- Search: SearXNG (self-hosted, key-free) for live web fallback
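
The retrieve → rerank → synthesise flow described above can be sketched in plain Python. This is an illustration under stated assumptions, not the contents of `nutanix_rag_search.py`: all function names are hypothetical, an in-memory cosine search stands in for the LanceDB query, a word-overlap score stands in for the LLM reranker, and the final step only builds the prompt instead of calling a model.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def vector_search(query_vec, rows, k=3):
    """Stage 1: return the k rows whose embeddings are closest to the query."""
    return sorted(rows, key=lambda r: cosine(query_vec, r["vector"]), reverse=True)[:k]

def rerank(question, candidates, k=2):
    """Stage 2: stand-in reranker that scores by word overlap with the question."""
    q_words = set(question.lower().split())
    def overlap(row):
        return len(q_words & set(row["text"].lower().split()))
    return sorted(candidates, key=overlap, reverse=True)[:k]

def build_prompt(question, passages):
    """Stage 3: assemble the grounded prompt handed to the synthesis model."""
    context = "\n".join(f"[{i}] {p['text']}" for i, p in enumerate(passages, 1))
    return f"Answer using only the sources below.\n{context}\n\nQuestion: {question}"

# Toy data: 2-d vectors stand in for real embeddings.
rows = [
    {"text": "RF2 keeps two copies of data", "vector": [1.0, 0.0]},
    {"text": "RF3 keeps three copies of data", "vector": [0.9, 0.1]},
    {"text": "SearXNG is a metasearch engine", "vector": [0.0, 1.0]},
]
question = "How many copies does RF3 keep"
hits = vector_search([1.0, 0.05], rows, k=2)  # broad recall
top = rerank(question, hits, k=1)             # precision pass
prompt = build_prompt(question, top)
print(prompt)
```

The two-stage design is the point of the sketch: the vector search casts a wide net cheaply, and the reranker spends more effort on a short candidate list before anything reaches the synthesis model.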