```python
status = await client.health_check(ping_llm=True)
```
- With `ping_llm=True`, Engram performs a minimal LiteLLM call — use it for staging or post-deploy verification. You may use `ping_llm=False` in environments where outbound LLM checks are restricted, as long as you monitor ingestion separately.
- Interpret the flags on `HealthStatus` (Neo4j connectivity, embedder loaded, vector index, schema version, LLM reachability).
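To act on these flags, a small triage helper can list which components are unhealthy. This is a minimal sketch; the `HealthStatus` field names below are assumptions, so check your Engram version for the real attribute names:

```python
from dataclasses import dataclass

# Hypothetical field names, mirroring the flags described above.
# Consult the real HealthStatus model for the actual attributes.
@dataclass
class HealthStatus:
    neo4j_connected: bool
    embedder_loaded: bool
    vector_index_ready: bool
    schema_version_ok: bool
    llm_reachable: bool

def unhealthy_components(status: HealthStatus) -> list[str]:
    """Return the names of all flags that are False."""
    return [name for name, ok in vars(status).items() if not ok]

print(unhealthy_components(HealthStatus(True, True, False, True, True)))
# -> ['vector_index_ready']
```

A helper like this lets a readiness probe fail with a specific reason instead of a bare non-200 response.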
Configure `.env` / `engram_memory/.env`, then install from the repo (`pip install -e .`) or from PyPI when available.
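As a starting point, here is a `.env` sketch using variable names that appear elsewhere in these docs; the values are placeholders, not defaults:

```shell
# Example .env sketch; values are placeholders, not real defaults.
NEO4J_PASSWORD=change-me      # secret: keep out of git
LLM_API_KEY=change-me         # secret: keep out of git
LOG_FORMAT=json               # structured logs for aggregation
LLM_RATE_LIMIT_RPM=60         # illustrative rate-limit value
```

Treat the first two entries as secrets and inject them from a secret manager in production.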
| Use case | Command |
|---|---|
| Recommended | `python -m engram_memory.cli.e2e_validate` |
| Windows clone helper | `scripts\engram_memory-e2e.cmd` |
| Pip script (if on PATH) | `engram_memory-e2e` |
| No package install | `python scripts/e2e_validate.py` |
On Windows, prefer `python -m …` if `engram_memory-e2e` is not found (the Scripts directory is not on PATH).
| Flag / env | Purpose |
|---|---|
| `--skip-seed` + `--user-id` or `E2E_USER_ID` | Retrieval-only smoke test |
| `--batch-seed` | One LLM call for bundled seed content |
| `E2E_LLM_TIMEOUT_SEC`, `E2E_INGEST_TIMEOUT_SEC` | Wall-clock guardrails |
Run in CI against a dedicated Neo4j instance. Full options: `--help`.
Bolt only (no LLM): `python scripts/neo4j_verify_connectivity.py` with the Neo4j env vars set.
- Set `LOG_FORMAT=json` for centralized log aggregation.
- Correlate logs with your `user_id` and `reference_id` in application-level fields where possible.
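For example, `user_id` and `reference_id` can travel on each log record via the standard library's `extra` mechanism. The formatter below is a hedged sketch of one way to emit JSON lines, not Engram's built-in `LOG_FORMAT=json` implementation:

```python
import json
import logging

# Sketch: JSON log lines carrying user_id / reference_id as
# application-level fields (not Engram's built-in formatter).
class JsonFormatter(logging.Formatter):
    def format(self, record):
        payload = {"level": record.levelname, "msg": record.getMessage()}
        # Copy correlation fields onto the payload when present.
        for key in ("user_id", "reference_id"):
            if hasattr(record, key):
                payload[key] = getattr(record, key)
        return json.dumps(payload)

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
log = logging.getLogger("app")
log.addHandler(handler)
log.setLevel(logging.INFO)

# Correlation fields are passed per-call via `extra`.
log.info("ingest complete", extra={"user_id": "u-1", "reference_id": "doc-9"})
```

Any JSON-capable aggregator can then index the `user_id` and `reference_id` keys directly.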
Every `IngestResult` includes `tokens_prompt`, `tokens_completion`, and `tokens_total`, enabling per-call cost monitoring in production. Use these fields to:
- Track LLM spend per user or per document
- Set alerts when token usage exceeds thresholds
- Compare models for cost-efficiency
The benchmark suite (`tests/test_live_benchmark.py`) includes configurable per-model pricing and generates cost estimates in `benchmarks/benchmark_report.json`.
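As an illustration, the token fields can feed a simple per-call cost estimator. The prices below are made-up placeholders, not any model's real pricing:

```python
# Sketch: per-call cost from IngestResult token counts.
# Prices per 1M tokens are illustrative placeholders, not real pricing.
PRICE_PROMPT = 0.50      # USD per 1M prompt tokens (assumption)
PRICE_COMPLETION = 1.50  # USD per 1M completion tokens (assumption)

def estimate_cost(tokens_prompt: int, tokens_completion: int) -> float:
    """Estimate USD cost of one call from its token counts."""
    return (tokens_prompt * PRICE_PROMPT
            + tokens_completion * PRICE_COMPLETION) / 1_000_000

# e.g. a call with 12k prompt and 3k completion tokens:
print(round(estimate_cost(12_000, 3_000), 6))
# -> 0.0105
```

Summing these per `user_id` gives the per-user spend tracking mentioned above.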
- LLM: token-bucket rate limiting (`LLM_RATE_LIMIT_RPM`, `LLM_RATE_LIMIT_BURST`) applies to ingestion.
- Retries and circuit breaker: configured via `LLM_MAX_RETRIES` and adapter behaviour — protect your budget when the provider is failing.
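For intuition, a token bucket refills at a steady rate (RPM / 60 tokens per second) up to a burst capacity; calls consume one token each and are rejected when the bucket is empty. The sketch below illustrates the idea and is not Engram's implementation:

```python
import time

# Minimal token-bucket sketch (illustrative, not Engram's implementation).
class TokenBucket:
    def __init__(self, rpm: int, burst: int):
        self.rate = rpm / 60.0        # tokens replenished per second
        self.capacity = burst         # maximum burst size
        self.tokens = float(burst)    # start full
        self.last = time.monotonic()

    def try_acquire(self) -> bool:
        """Consume one token if available; refill based on elapsed time."""
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rpm=60, burst=2)
print(bucket.try_acquire(), bucket.try_acquire(), bucket.try_acquire())
# -> True True False  (two burst tokens, third call rate-limited)
```

A larger burst absorbs short spikes, while the RPM setting bounds sustained spend.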
Local SentenceTransformers models download on first use. Bake models into container images or mount cached model directories to manage cold-start latency in Kubernetes or serverless environments.
- Treat `NEO4J_PASSWORD` and `LLM_API_KEY` as secrets (use a secret manager, not git).
- `user_id` should be a stable application-level identifier; avoid embedding sensitive personal data in graph keys if your threat model requires data minimization.
- Issues: github.com/hackdavid/engram-memory/issues
- Contributing: README — Contributing