LiteGraph Improvements Roadmap

Status Key: [ ] Not started | [~] In progress | [x] Complete | [!] Blocked | [?] Needs decision

Last updated: 2026-03-26

Phase 1: Developer Ergonomics (Target: Q2 2026)

Goal: Make the first 5 minutes magical for every developer.

1. OpenAPI 3.0 Specification + Swagger UI

Priority: P0 (Critical) | Effort: Small (Watson Webserver has built-in support since v6.5.x) | Impact: Unlocks auto-generated SDK clients in any language, provides interactive API explorer

Why This Matters

Developers expect /swagger or /openapi.json on any REST API in 2026
Enables tools like openapi-generator to produce typed clients in Go, Rust, Java, Swift, etc. without manual SDK work
Interactive Swagger UI lets developers explore the API without reading docs
Contract-first development: downstream teams can build against the spec before implementation is complete

Implementation Steps

Acceptance Criteria

GET /openapi.json returns valid OpenAPI 3.0.3 JSON with all 169 endpoints
GET /swagger renders interactive Swagger UI with routes grouped by tag
Every route has a human-readable summary and is assigned to exactly one tag
Security schemes are correctly defined and referenced
Path parameters (tenantGuid, graphGuid, nodeGuid, etc.) are auto-documented with correct types
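
The summary and tag criteria above can be checked mechanically once the spec is served. The following is a minimal sketch, assuming the document at /openapi.json has already been loaded into a Python dict; the function name `find_route_problems` and the sample spec are illustrative and not part of LiteGraph.

```python
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options"}


def find_route_problems(spec: dict) -> list[str]:
    """Return one message per route that lacks a summary or a single tag."""
    problems = []
    for path, item in spec.get("paths", {}).items():
        for method, operation in item.items():
            if method not in HTTP_METHODS:
                continue  # skip path-level keys such as "parameters"
            label = f"{method.upper()} {path}"
            if not operation.get("summary"):
                problems.append(f"{label}: missing summary")
            if len(operation.get("tags", [])) != 1:
                problems.append(f"{label}: expected exactly one tag")
    return problems


# Hypothetical two-route spec used only to exercise the check.
sample_spec = {
    "openapi": "3.0.3",
    "paths": {
        "/v1.0/tenants/{tenantGuid}/graphs": {
            "get": {"summary": "List graphs", "tags": ["Graphs"]},
            "put": {"tags": ["Graphs", "Admin"]},
        }
    },
}

for problem in find_route_problems(sample_spec):
    print(problem)
```

A check like this could run in CI against the generated spec so that new routes cannot ship without metadata.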

Files Changed

src/LiteGraph.Server/LiteGraph.Server.csproj (Watson version bump)
src/LiteGraph.Server/API/REST/RestServiceHandler.cs (OpenAPI config + route metadata)

2. Query Language (LiteQL or Cypher Subset)

Priority: P1 (High) | Effort: Large (new parser + query executor) | Impact: Eliminates "20 API calls for one traversal" problem

Why This Matters

Graph traversals often require chaining multiple REST calls (get node, get edges, get neighbors, filter)
A query language collapses this into one call: MATCH (p:Person)-[:KNOWS]->(f) WHERE p.data.age > 30 RETURN f
Competing graph databases (Neo4j, ArangoDB, Dgraph) all offer query languages
AI agents benefit from issuing one structured query instead of chaining many separate API calls

Implementation Steps

Acceptance Criteria

Single-call graph pattern matching with filtering and projection
Performance within 2x of equivalent hand-coded repository calls
Syntax errors produce helpful messages with line/column positions
Query timeout prevents runaway queries
Works with vector-indexed graphs (vector similarity in WHERE clause)
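
To make the "line/column positions" criterion concrete, here is a minimal lexer sketch for a small Cypher-like subset. The token set, the `QuerySyntaxError` class, and the `tokenize` function are assumptions made for illustration; the roadmap does not define the LiteQL grammar.

```python
import re

# Token patterns for an illustrative Cypher-like subset (not a LiteGraph grammar).
TOKEN_RE = re.compile(
    r"""
    (?P<WS>\s+)
  | (?P<ARROW>->)
  | (?P<NUMBER>\d+(?:\.\d+)?)
  | (?P<IDENT>[A-Za-z_][A-Za-z0-9_]*)
  | (?P<SYMBOL>[()\[\]:.\-<>=,])
    """,
    re.VERBOSE,
)


class QuerySyntaxError(Exception):
    """Syntax error carrying the 1-based line and column of the offending text."""

    def __init__(self, message, line, column):
        super().__init__(f"{message} at line {line}, column {column}")
        self.line = line
        self.column = column


def tokenize(query):
    """Split a query into (kind, text, line, column) tuples."""
    tokens = []
    pos, line, line_start = 0, 1, 0
    while pos < len(query):
        match = TOKEN_RE.match(query, pos)
        if match is None:
            column = pos - line_start + 1
            raise QuerySyntaxError(f"unexpected character {query[pos]!r}", line, column)
        kind, text = match.lastgroup, match.group()
        if kind == "WS":
            # Track newlines so later tokens report the right line and column.
            newlines = text.count("\n")
            if newlines:
                line += newlines
                line_start = pos + text.rfind("\n") + 1
        else:
            tokens.append((kind, text, line, pos - line_start + 1))
        pos = match.end()
    return tokens


for token in tokenize("MATCH (p:Person)-[:KNOWS]->(f)"):
    print(token)
```

A parser built on top of these tuples can reuse the recorded positions when reporting errors in later stages.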

Files Created

src/LiteGraph/Query/Lexer.cs
src/LiteGraph/Query/Parser.cs
src/LiteGraph/Query/Planner.cs
src/LiteGraph/Query/Executor.cs
src/LiteGraph/Query/Ast/*.cs (AST node types)
src/Test.Query/ (test project)

3. SDK Resilience Layer

Priority: P1 (High) | Effort: Medium | Impact: Production-grade reliability for REST SDK consumers

Why This Matters

Network failures, transient errors, and server restarts are inevitable in production
Without built-in retry logic, every SDK consumer must implement their own
Circuit breakers prevent cascading failures when the server is overloaded
Connection pooling reduces latency for high-throughput workloads

Implementation Steps

Acceptance Criteria

Transient 500/503 errors are automatically retried without caller intervention
Circuit breaker prevents request storms against a failing server
All retry/timeout/circuit-breaker settings are configurable
Default behavior works well for 95% of use cases without configuration
Retry attempts are logged for observability
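
The retry and circuit-breaker behavior described above could look roughly like the sketch below. It assumes transient server errors surface as a `TransientError` exception; that class, `CircuitBreaker`, `call_with_retry`, and all default values are hypothetical and do not reflect the existing SDK code.

```python
import random
import time


class TransientError(Exception):
    """Stand-in for an HTTP 500/503 response from the server."""


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without attempting it."""


class CircuitBreaker:
    def __init__(self, failure_threshold=5, reset_after=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def before_call(self):
        if self.opened_at is None:
            return
        if self.clock() - self.opened_at < self.reset_after:
            raise CircuitOpenError("circuit open; skipping request")
        # Cool-down elapsed: allow one trial call; one more failure reopens.
        self.opened_at = None
        self.failures = self.failure_threshold - 1

    def record_success(self):
        self.failures = 0

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = self.clock()


def call_with_retry(func, breaker, max_attempts=3, base_delay=0.5,
                    sleep=time.sleep, log=print):
    """Call func, retrying transient failures with jittered exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        breaker.before_call()
        try:
            result = func()
        except TransientError as exc:
            breaker.record_failure()
            if attempt == max_attempts:
                raise
            delay = base_delay * 2 ** (attempt - 1) * random.uniform(0.5, 1.5)
            log(f"attempt {attempt} failed ({exc}); retrying in {delay:.2f}s")
            sleep(delay)
        else:
            breaker.record_success()
            return result
```

Passing `sleep` and `log` as parameters keeps the helper testable and lets each SDK route retry messages to its own logger.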

Files Changed

sdk/csharp/src/LiteGraph.Sdk/ (resilience layer classes)
sdk/python/litegraph_sdk/base.py (retry logic)
sdk/js/src/base/SdkBase.js (retry logic)

4. Transactions / Batch Atomics

Priority: P2 (Medium) | Effort: Medium | Impact: Data consistency for multi-step graph mutations

Why This Matters

Creating a node with edges and vectors requires 3+ API calls
If any call fails, the graph is left in an inconsistent state
Transactions allow all-or-nothing semantics for complex mutations
Critical for import/migration workflows and AI agent operations

Implementation Steps

Acceptance Criteria

Multi-operation mutations succeed or fail atomically
Rollback restores graph to pre-transaction state on any failure
Performance overhead < 10% compared to individual calls
Transaction timeout prevents long-running locks
Clear error messages indicating which operation failed and why
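
Since SQLite is the current backend, the all-or-nothing semantics can be illustrated with Python's standard `sqlite3` module. This is a sketch under assumptions: the two-table schema, `BatchError`, and `apply_batch` are invented for the example and are not LiteGraph's schema or API.

```python
import sqlite3


class BatchError(Exception):
    """Reports which operation in a batch failed and why."""


def open_demo_db():
    """Create an in-memory database with a hypothetical node/edge schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE nodes (guid TEXT PRIMARY KEY, name TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE edges (guid TEXT PRIMARY KEY, "
        "from_guid TEXT NOT NULL, to_guid TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def apply_batch(conn, operations):
    """Run every (sql, params) pair atomically; roll back all on any failure."""
    with conn:  # commits on success, rolls back if an exception escapes
        for index, (sql, params) in enumerate(operations):
            try:
                conn.execute(sql, params)
            except sqlite3.Error as exc:
                raise BatchError(f"operation {index} failed: {exc}") from exc


conn = open_demo_db()
batch = [
    ("INSERT INTO nodes (guid, name) VALUES (?, ?)", ("n1", "Alice")),
    ("INSERT INTO edges (guid, from_guid, to_guid) VALUES (?, ?, ?)", ("e1", "n1", "n1")),
    ("INSERT INTO nodes (guid, name) VALUES (?, ?)", ("n1", "Duplicate")),  # violates PK
]
try:
    apply_batch(conn, batch)
except BatchError as exc:
    print(exc)
print(conn.execute("SELECT COUNT(*) FROM nodes").fetchone()[0])
```

The third operation fails on the primary key, so the first two inserts are rolled back along with it and the error names the failing operation by index.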

Phase 2: Ecosystem & Reach (Target: Q3 2026)

Goal: Meet developers where they are.

5. ASP.NET Core Migration (or Dual-Mode)

Priority: P2 (Medium) | Effort: Large | Impact: Unlocks standard .NET ecosystem middleware, hosting, and tooling

Why This Matters

ASP.NET Core is the dominant .NET web framework, with a massive ecosystem
Standard middleware: OpenTelemetry, rate limiting, response compression, health checks
Dependency injection enables testability and modularity
dotnet watch for hot reload during development
Standard deployment: Azure App Service, AWS ECS, Google Cloud Run with zero custom config
WatsonWebserver is capable but requires learning a non-standard API

Implementation Steps

Acceptance Criteria

100% API compatibility with existing Watson-based server
Standard ASP.NET middleware works out of the box
Health check endpoints available for container orchestration
Performance within 10% of Watson-based server
Both servers can coexist in the solution

Files Created

src/LiteGraph.Server.AspNet/ (new project)

6. Python & JS SDK Parity

Priority: P0 (Critical) | Effort: Large | Impact: Developers in the two most popular languages for AI/ML get first-class LiteGraph support

Why This Matters

Python is THE language for AI/ML — the primary audience for a vector-capable graph DB
JavaScript/Node.js dominates backend web development and serverless functions
SDK gaps force these developers to fall back to raw HTTP calls, losing type safety and convenience
Vector operations (the key differentiator) are completely missing from the Python SDK

Current Coverage Gap Analysis

| Feature Category | C# (reference) | Python | JavaScript |
| --- | --- | --- | --- |
| Admin (backup/restore/flush) | 6 methods | 0 | 0 |
| Vectors (CRUD + search) | 21 methods | 0 | 7 |
| Vector Index management | 5 methods | 0 | 0 |
| Graph subgraph/statistics | 4 methods | 0 | 1 |
| Enumerate (pagination v2) | ~15 methods | 0 | 0 |
| Scoped labels/tags/vectors | ~33 methods | 0 | 0 |
| Node routing/connectivity | 4 methods | 0 | 0 |
| Credential advanced | 3 methods | 0 | 0 |
| Total missing | - | ~90 methods | ~50 methods |

Implementation Steps — Python SDK

Implementation Steps — JavaScript SDK

Acceptance Criteria

Python SDK covers 100% of vector CRUD + search operations
Python SDK covers admin operations (backup/restore/flush)
JavaScript SDK covers admin, vector index, and scoped operations
All new methods follow existing SDK patterns and naming conventions
New methods have proper error handling using existing exception classes
TypeScript definitions updated for JS SDK (if applicable)

Files Changed/Created — Python

sdk/python/litegraph_sdk/resources/vectors.py (new)
sdk/python/litegraph_sdk/resources/admin.py (new)
sdk/python/litegraph_sdk/models/vector_search_request.py (new)
sdk/python/litegraph_sdk/models/vector_search_result.py (new)
sdk/python/litegraph_sdk/resources/graphs.py (extended)
sdk/python/litegraph_sdk/resources/nodes.py (extended)
sdk/python/litegraph_sdk/resources/labels.py (extended)
sdk/python/litegraph_sdk/resources/tags.py (extended)
sdk/python/litegraph_sdk/__init__.py (updated exports)

Files Changed/Created — JavaScript

sdk/js/src/base/LiteGraphSdk.js (extended with ~30 new methods)
sdk/js/src/models/VectorSearchRequest.js (new, if needed)

7. Change Feeds / Event Streaming

Priority: P2 (Medium) | Effort: Large | Impact: Enables real-time dashboards, cache invalidation, event-driven architectures

Why This Matters

Graph mutations currently require polling to detect changes
Real-time dashboards need instant notification of node/edge changes
Cache invalidation in distributed systems needs event streams
Event sourcing patterns enable audit trails and temporal queries

Implementation Steps

Acceptance Criteria

Real-time event delivery < 100ms from mutation to subscriber notification
Event ordering is guaranteed per-tenant
SSE endpoint supports reconnection without event loss
Webhook delivery retries failed calls up to configurable limit
Events are retained for configurable duration (default 24h)
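
Reconnection without event loss is usually built on the Server-Sent Events `id:` field and the `Last-Event-ID` request header that browsers send when they reconnect. The sketch below shows one way a per-tenant retained log could support that replay; `TenantEventLog`, `format_sse`, and the payload shape are assumptions, not a committed design.

```python
import json
import time
from collections import deque


class TenantEventLog:
    """Ordered, time-bounded event log for a single tenant (in-memory sketch)."""

    def __init__(self, retention_seconds=24 * 3600, clock=time.time):
        self.retention_seconds = retention_seconds
        self.clock = clock
        self.next_id = 1
        self.events = deque()  # (event_id, timestamp, payload), oldest first

    def append(self, payload):
        """Record a mutation event and return its monotonically increasing ID."""
        self._expire()
        event_id = self.next_id
        self.next_id += 1
        self.events.append((event_id, self.clock(), payload))
        return event_id

    def replay_after(self, last_event_id):
        """Return retained events newer than the subscriber's last seen ID."""
        self._expire()
        return [(eid, payload) for eid, _, payload in self.events if eid > last_event_id]

    def _expire(self):
        cutoff = self.clock() - self.retention_seconds
        while self.events and self.events[0][1] < cutoff:
            self.events.popleft()


def format_sse(event_id, payload):
    """Serialize one event in the text/event-stream wire format."""
    return f"id: {event_id}\ndata: {json.dumps(payload)}\n\n"


log = TenantEventLog()
log.append({"type": "node.created", "guid": "n1"})
log.append({"type": "edge.created", "guid": "e1"})
# A client reconnecting with Last-Event-ID: 1 receives only what it missed.
for event_id, payload in log.replay_after(1):
    print(format_sse(event_id, payload), end="")
```

Assigning IDs from a single per-tenant counter is also what gives subscribers the per-tenant ordering guarantee.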

8. Schema Constraints (Optional Validation)

Priority: P3 (Low) | Effort: Medium | Impact: Data quality enforcement without rigid schema requirements

Why This Matters

Property graphs are schema-free by design, but real applications need data quality guardrails
"Every Person node must have an email tag" should be enforceable without application code
Schema validation catches data errors at write time instead of read time
Optional schemas maintain flexibility while adding safety

Implementation Steps

Acceptance Criteria

Schema validation is optional and disabled by default
Validation errors include clear descriptions of what failed and why
Existing data is not affected when a schema is applied (only new writes are validated)
Warn mode logs violations without rejecting writes
Schema can be exported/imported for reuse across graphs
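
The off, warn, and enforce behaviors in the criteria above can be sketched as a single validation function. The schema shape (`required_tags` keyed by label), the node dict layout, and the names `validate_node` and `SchemaViolation` are hypothetical and chosen only to mirror the "every Person node must have an email tag" example.

```python
import logging

logger = logging.getLogger("litegraph.schema")


class SchemaViolation(Exception):
    """Raised in enforce mode when a write breaks a schema rule."""


def validate_node(node, schema, mode="off"):
    """Check a node against the schema.

    mode 'off' (the default) skips validation, 'warn' logs violations and
    accepts the write, 'enforce' rejects the write by raising.
    """
    if mode == "off":
        return []
    label = node.get("label")
    required = schema.get(label, {}).get("required_tags", [])
    tags = node.get("tags", {})
    violations = [
        f"{label} node {node.get('guid')} is missing required tag '{tag}'"
        for tag in required
        if tag not in tags
    ]
    if violations and mode == "enforce":
        raise SchemaViolation("; ".join(violations))
    for violation in violations:
        logger.warning(violation)
    return violations


schema = {"Person": {"required_tags": ["email"]}}
node = {"guid": "n1", "label": "Person", "tags": {"name": "Alice"}}
print(validate_node(node, schema, mode="warn"))
```

Because validation runs only on the write path, nodes that already exist are untouched when a schema is applied.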

Phase 3: Scale & Enterprise (Target: Q4 2026 - Q1 2027)

Goal: Production confidence at scale.

9. Pluggable Storage Backends

Priority: P1 (High) | Effort: Very Large | Impact: Removes SQLite single-writer bottleneck, enables enterprise deployment

Why This Matters

SQLite is excellent for single-server, moderate-load deployments
Enterprise customers need PostgreSQL (or similar) for write concurrency, replication, and operational tooling
A pluggable backend lets users choose based on their needs without code changes
SQLite remains the zero-config default; PostgreSQL is the production recommendation

Implementation Steps

Acceptance Criteria

PostgreSQL backend passes 100% of existing test suite
Backend selection is configuration-only (no code changes required)
SQLite performance is not degraded by the abstraction layer
PostgreSQL supports concurrent writes from multiple server instances
Migration tool can transfer 1M+ entities without data loss
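
Configuration-only backend selection typically comes down to an interface plus a registry keyed by a settings value. The sketch below assumes a settings dict with `type` and `filename` keys; `GraphBackend`, `SqliteBackend`, `create_backend`, and the table layout are illustrative names, and a PostgreSQL class would register under its own key in the same way.

```python
import sqlite3
from abc import ABC, abstractmethod


class GraphBackend(ABC):
    """Minimal storage contract every backend would implement."""

    @abstractmethod
    def create_node(self, guid, name): ...

    @abstractmethod
    def read_node(self, guid): ...


class SqliteBackend(GraphBackend):
    def __init__(self, settings):
        self.conn = sqlite3.connect(settings.get("filename", ":memory:"))
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS nodes (guid TEXT PRIMARY KEY, name TEXT NOT NULL)"
        )

    def create_node(self, guid, name):
        with self.conn:  # commit or roll back this single write
            self.conn.execute("INSERT INTO nodes (guid, name) VALUES (?, ?)", (guid, name))

    def read_node(self, guid):
        row = self.conn.execute(
            "SELECT guid, name FROM nodes WHERE guid = ?", (guid,)
        ).fetchone()
        return None if row is None else {"guid": row[0], "name": row[1]}


# Registry of available backends; SQLite stays the zero-config default.
BACKENDS = {"sqlite": SqliteBackend}


def create_backend(settings):
    """Pick a backend purely from configuration."""
    kind = settings.get("type", "sqlite")
    try:
        factory = BACKENDS[kind]
    except KeyError:
        known = ", ".join(sorted(BACKENDS))
        raise ValueError(f"unknown storage backend '{kind}'; known: {known}") from None
    return factory(settings)


backend = create_backend({})  # no settings: falls back to in-memory SQLite
backend.create_node("n1", "Alice")
print(backend.read_node("n1"))
```

Running the existing test suite against each registered backend through this one interface is what would make the "passes 100% of existing test suite" criterion checkable.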

10. RBAC & Fine-Grained Permissions

Priority: P2 (Medium) | Effort: Large | Impact: Enterprise-grade access control

Why This Matters

Current auth is tenant-scoped only: all authenticated users can access everything in their tenant
Enterprises need role-based access: some users read-only, some admin, some restricted to specific graphs
API key scoping prevents over-privileged service accounts
OIDC/OAuth2 integration enables SSO with existing identity providers

Implementation Steps

Acceptance Criteria

At minimum, Viewer (read-only) and Editor roles work out of the box
Graph-level permissions allow isolating sensitive graphs within a tenant
API keys can be scoped to specific operations for least-privilege service accounts
Existing deployments continue to work (default: all users get Editor role)
Performance overhead < 5ms per request for permission checks (cached)
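
A cached permission check with an Editor default might be sketched as follows. The `GRANTS` dict stands in for whatever store would hold role assignments, and the role names beyond Viewer and Editor, the action names, and the function names are all assumptions.

```python
from functools import lru_cache

# Actions each built-in role may perform (illustrative action names).
ROLE_PERMISSIONS = {
    "Viewer": frozenset({"read"}),
    "Editor": frozenset({"read", "write"}),
}

# Existing deployments keep working: users without a grant act as Editor.
DEFAULT_ROLE = "Editor"

# Hypothetical grant store keyed by (user GUID, graph GUID).
GRANTS = {("user-1", "graph-a"): "Viewer"}


@lru_cache(maxsize=4096)
def role_for(user_guid, graph_guid):
    """Resolve the role once and cache it so repeated requests skip the lookup."""
    return GRANTS.get((user_guid, graph_guid), DEFAULT_ROLE)


def is_allowed(user_guid, graph_guid, action):
    return action in ROLE_PERMISSIONS[role_for(user_guid, graph_guid)]


def set_grant(user_guid, graph_guid, role):
    """Change a grant and drop cached decisions so the change takes effect."""
    GRANTS[(user_guid, graph_guid)] = role
    role_for.cache_clear()


print(is_allowed("user-1", "graph-a", "write"))
print(is_allowed("user-2", "graph-a", "write"))
```

Clearing the cache on every grant change is the simplest correct policy; a finer-grained invalidation would be needed if grants change often.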

11. Distributed Vector Indexing

Priority: P3 (Low) | Effort: Very Large | Impact: Support for vector collections larger than single-server memory

Why This Matters

Current HNSW index is single-machine, bounded by available RAM
Large-scale RAG applications may have 10M+ vectors
Sharded indexes distribute memory and compute across nodes
GPU-accelerated distance computation enables real-time search at scale

Implementation Steps

Acceptance Criteria

Support 10M+ vectors across multiple nodes
Search latency < 500ms at 95th percentile for 10M vectors
Recall > 95% compared to brute-force search
Automatic rebalancing when shards are added/removed
Graceful degradation when shards are unavailable
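
Sharded search is commonly a scatter-gather: each shard returns its local top-k and the coordinator merges them. The sketch below uses brute-force cosine similarity per shard in place of HNSW to stay short, and represents an unavailable shard as `None`; both choices, and all names, are simplifications for illustration.

```python
import heapq
import math


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search_shard(shard, query, k):
    """Local top-k on one shard (brute force here, an ANN index in practice)."""
    scored = ((cosine(query, vector), guid) for guid, vector in shard.items())
    return heapq.nlargest(k, scored)


def search_all(shards, query, k):
    """Merge per-shard results; skip unavailable shards instead of failing."""
    partials, unavailable = [], 0
    for shard in shards:
        if shard is None:  # stand-in for a shard that timed out or is down
            unavailable += 1
            continue
        partials.extend(search_shard(shard, query, k))
    return heapq.nlargest(k, partials), unavailable


shards = [
    {"a": [1.0, 0.0], "b": [0.0, 1.0]},
    None,
    {"c": [1.0, 1.0]},
]
results, missing = search_all(shards, [1.0, 0.0], k=2)
print([guid for _, guid in results], missing)
```

Returning the count of skipped shards lets the caller flag a response as partial rather than silently dropping recall.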

12. Observability

Priority: P1 (High) | Effort: Medium | Impact: Production debugging and performance monitoring

Why This Matters

Current logging is syslog-based with limited structure
OpenTelemetry is the industry standard for distributed tracing
Prometheus metrics enable alerting on latency, error rates, and resource usage
Query profiling helps developers optimize slow operations

Implementation Steps

Acceptance Criteria

OpenTelemetry traces flow through the full request lifecycle
Prometheus metrics cover the top 10 operational signals
Structured JSON logs include correlation IDs for request tracing
Profiling header has < 5% performance overhead when enabled
Grafana dashboard template works out of the box
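
The structured-log criterion can be illustrated with Python's standard `logging` and `contextvars` modules. This is a sketch of the idea only: the field names, the `correlation_id` variable, and the logger name are assumptions rather than a defined LiteGraph log format.

```python
import contextvars
import json
import logging
import sys

# Holds the ID of the request currently being handled on this task or thread.
correlation_id = contextvars.ContextVar("correlation_id", default="-")


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record):
        return json.dumps(
            {
                "timestamp": self.formatTime(record),
                "level": record.levelname,
                "logger": record.name,
                "message": record.getMessage(),
                "correlation_id": correlation_id.get(),
            }
        )


def build_logger(stream, name="litegraph.demo"):
    """Create a logger that writes JSON lines to the given stream."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


logger = build_logger(sys.stdout)
correlation_id.set("req-1234")  # would be set once per incoming request
logger.info("node created")
```

Every log line emitted while handling the request then carries the same ID, which is what makes a single request traceable across components.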

Phase 4: Platform (Target: 2027+)

Goal: From database to platform.

13. Graph Analytics Engine

Priority: P3 (Low) | Effort: Very Large | Impact: Built-in graph algorithms eliminate need for external tools

Implementation Steps

13.1 PageRank algorithm
13.2 Community detection (Louvain method)
13.3 Shortest path (Dijkstra, A*)
13.4 Centrality measures (betweenness, closeness, degree)
13.5 Connected components
13.6 REST endpoints: POST /v1.0/tenants/{tenantGuid}/graphs/{graphGuid}/analytics/{algorithm}
13.7 Results stored as node/edge tags for subsequent queries
13.8 Async execution for large graphs with progress reporting
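
As a reference point for step 13.1, PageRank can be computed by power iteration. The sketch below works on a plain list of directed (source, target) pairs and spreads the rank of dangling nodes evenly; the damping factor and iteration count are conventional defaults, not values chosen by the roadmap.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Return a dict of node -> rank for a list of directed (source, target) edges."""
    nodes = sorted({node for edge in edges for node in edge})
    if not nodes:
        return {}
    count = len(nodes)
    outgoing = {node: [] for node in nodes}
    for source, target in edges:
        outgoing[source].append(target)

    rank = {node: 1.0 / count for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is shared by everyone.
        dangling = sum(rank[node] for node in nodes if not outgoing[node])
        base = (1.0 - damping) / count + damping * dangling / count
        updated = {node: base for node in nodes}
        for node in nodes:
            for target in outgoing[node]:
                updated[target] += damping * rank[node] / len(outgoing[node])
        rank = updated
    return rank


ranks = pagerank([("a", "c"), ("b", "c"), ("c", "a")])
for node, score in sorted(ranks.items(), key=lambda item: item[1], reverse=True):
    print(node, round(score, 3))
```

Per step 13.7, scores like these could then be written back as node tags so later queries can filter or sort on them.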

14. Hybrid Search (Vector + Graph Traversal)

Priority: P0 (Critical - THIS IS THE KILLER FEATURE) | Effort: Large | Impact: No other lightweight graph DB combines semantic similarity with relationship traversal

Why This Matters

This is the feature that makes LiteGraph irreplaceable for RAG applications:

"Find nodes semantically similar to this query that are within 2 hops of this context node"
Combines the best of vector databases (semantic search) with graph databases (relationship traversal)
Current workflow requires: (1) vector search, (2) for each result, traverse graph, (3) filter by proximity — 3 separate operations
Hybrid search does this in one call with query-time optimization

Implementation Steps

Acceptance Criteria

Single API call combines vector similarity with graph proximity
Results include both similarity scores and graph paths
Performance < 500ms for graphs with 100K nodes and 10K vectors (with HNSW index)
At least 2 execution strategies with automatic selection heuristic
Quality: 90%+ recall compared to exhaustive search
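
One of the possible execution strategies, graph-first, can be sketched in a few lines: restrict candidates to nodes within N hops of the context node, then rank them by vector similarity. The adjacency dict, the brute-force cosine scoring, and the function names are assumptions for illustration; a vector-first strategy would instead query the HNSW index and filter by reachability.

```python
import math
from collections import deque


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def within_hops(adjacency, start, max_hops):
    """Breadth-first search returning node -> shortest path from start."""
    paths = {start: [start]}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if len(paths[node]) - 1 == max_hops:
            continue  # do not expand past the hop limit
        for neighbor in adjacency.get(node, ()):
            if neighbor not in paths:
                paths[neighbor] = paths[node] + [neighbor]
                queue.append(neighbor)
    return paths


def hybrid_search(adjacency, vectors, query, context, max_hops=2, top_k=3):
    """Rank nodes near the context node by similarity to the query vector."""
    paths = within_hops(adjacency, context, max_hops)
    scored = [
        (cosine(query, vectors[node]), node, paths[node])
        for node in paths
        if node != context and node in vectors
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top_k]


adjacency = {"ctx": ["a"], "a": ["b"], "b": ["c"]}
vectors = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 0.0]}
for score, node, path in hybrid_search(adjacency, vectors, [1.0, 0.0], "ctx"):
    print(node, score, path)
```

Node `c` matches the query perfectly but sits three hops away, so it is excluded; each result carries both its similarity score and its path, as the criteria require.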

15. LiteGraph Cloud (Managed Service)

Priority: Depends on traction | Effort: Very Large | Impact: Removes all operational burden for developers

Implementation Steps

15.1 Multi-tenant hosting infrastructure
15.2 Usage-based billing (API calls, storage, vector dimensions)
15.3 Auto-scaling based on query load
15.4 Automated backups and point-in-time recovery
15.5 Dashboard: usage metrics, query analytics, billing
15.6 Free tier: 1 tenant, 10K nodes, 1K vectors
15.7 SOC 2 / GDPR compliance
15.8 Global regions (US, EU, APAC)

16. Plugin System

Priority: P3 (Low) | Effort: Large | Impact: Community-driven extensibility

Implementation Steps

Quick Wins (Implement This Week)

These items provide outsized impact with minimal effort:

QW-1 Add /openapi.json and /swagger endpoints (item #1 above — Watson has built-in support, just enable it)
QW-2 Add POST /v1.0/tenants/{tenantGuid}/graphs/{graphGuid}/query endpoint accepting structured multi-hop traversal JSON (not a full query language, but structured multi-operation in one call)
QW-3 Add retry logic to C# SDK (wrap RestWrapper calls with 3 retries + exponential backoff)
QW-4 Python SDK: Add Vector resource (the most critical gap for the AI/ML audience)
QW-5 JS SDK: Add Admin methods (backup/restore is table stakes for production use)

Appendix: Competitive Positioning

| Feature | LiteGraph | Neo4j | Weaviate | Pinecone | Redis |
| --- | --- | --- | --- | --- | --- |
| Embeddable (in-process) | Yes | No | No | No | Yes (limited) |
| REST API | Yes | Yes | Yes | Yes | Yes |
| Multi-tenant | Built-in | Enterprise only | Yes | Yes | No |
| Graph traversal | Yes | Yes | No | No | Limited |
| Vector search (HNSW) | Yes | Add-on | Yes | Yes | Yes |
| Hybrid search | Planned | Limited | Yes | No | No |
| Query language | Planned | Cypher | GraphQL | No | Redis commands |
| MCP integration | 145+ tools | No | No | No | No |
| SQLite backend | Yes | No | No | N/A | No |
| Open source | MIT | GPL/Commercial | BSD | No | AGPLv3 (also RSALv2/SSPLv1) |
| Package size | < 5MB | > 100MB (JVM) | > 50MB (Go) | N/A (SaaS) | > 10MB |

Strategic differentiation: LiteGraph is the only database that combines embeddable deployment, graph traversal, vector search, multi-tenancy, and AI agent integration (MCP) in a single lightweight package under MIT license.
