Skip to content

Latest commit

 

History

History
1878 lines (1585 loc) · 117 KB

File metadata and controls

1878 lines (1585 loc) · 117 KB

[0.6.6-dev] - 2026-04-03

🚀 Features

  • cg CLI tool (clickgraph-tool crate): Agent/script-oriented CLI for Cypher translation and execution without a running server. Commands: cg sql (Cypher→SQL), cg validate (parse + plan check), cg query (execute via remote ClickHouse), cg nl (NL→Cypher via LLM), cg schema show/validate/discover/diff. Config via ~/.config/cg/config.toml. Supports Anthropic (default) and any OpenAI-compatible API.

  • embedded feature now opt-in in clickgraph-embedded: chdb is no longer compiled by default. New Database::new_remote(schema, RemoteConfig) constructor executes Cypher against external ClickHouse with no chdb dependency — the backend used by cg query. Database::sql_only(schema) and Connection::query_to_sql() are always available for translation-only use.

  • Agent skills (skills/): Three publishable agent skills for Claude Code, LangChain, AutoGen, CrewAI, and OpenAI function calling — /cypher (NL→Cypher→SQL→execute), /graph-schema (show + validate schema), /schema-discover (generate schema YAML from ClickHouse via LLM). See skills/README.md for installation across frameworks.

  • openCypher TCK runner (clickgraph-tck/): Cucumber-based compatibility test suite running 402 openCypher TCK scenarios in embedded (chdb) mode. Results: 383/402 passed (95.3%), 0 failures, 19 skipped. The 19 skipped scenarios cover Cypher write clauses (CREATE, SET, DELETE, MERGE) — not yet supported as Cypher syntax; programmatic write API (create_node(), create_edge(), upsert_node()) is already available in embedded mode. Enabled with CLICKGRAPH_CHDB_TESTS=1 cargo test -p clickgraph-tck --test tck.

🐛 Bug Fixes

  • Debug println removed: Eliminated leftover println!("DEBUG TryFrom RenderExpr: ...") in render_plan/render_expr.rs that was polluting stdout during query translation.

[0.6.5-dev] - 2026-03-29

🚀 Features

  • Hybrid remote query + local storage (PR #240): Execute Cypher queries against a remote ClickHouse cluster from embedded mode, then store results locally in chdb as a subgraph for fast re-querying. New RemoteConfig for SystemConfig, plus Connection methods: query_remote(), query_remote_graph(), query_graph(), store_subgraph(). New GraphResult structured output and StoreStats return type. Available in Rust, Python (UniFFI), and Go (UniFFI) bindings.

  • Embedded write API (PR #236): create_node(), create_edge(), upsert_node(), upsert_edge() with batch variants (create_nodes(), create_edges()). delete_nodes(), delete_edges() for cleanup. import_json() and import_json_file() for bulk JSON import. Schema entries without source: get auto-created as ReplacingMergeTree tables. property_types field for type-aware DDL (PR #238).

  • Multi-format file import (PR #243): import_csv_file(), import_parquet_file(), import_file() (auto-detect from extension). Supports CSV, Parquet, TSV, JSON/NDJSON/JSONL formats.

  • Richer Value types (PR #244): Value::Date("YYYY-MM-DD"), Value::Timestamp("YYYY-MM-DD HH:MM:SS"), Value::UUID("8-4-4-4-12") auto-detected from ClickHouse JSON output. to_sql_literal() generates toDate()/toDateTime()/toUUID() wrappers. Value::string() constructor bypasses detection.

  • Kuzu API parity (PR #242): Value::as_bool(), query timing (get_compiling_time()/get_execution_time()), Database::in_memory(), Connection::set_query_timeout(), QueryResult::get_column_data_types().

  • DataFrame output (PR #245): Python QueryResult.get_as_df() (Pandas), get_as_arrow() (PyArrow), get_as_pl() (Polars) with lazy imports.

  • Python wrapper improvements (PR #246): result.compiling_time/execution_time/column_data_types properties. conn.create_node()/create_edge()/create_nodes()/import_file()/execute_sql() accept plain Python dicts with auto-conversion to FFI Value types.

🐛 Bug Fixes (from TCK work)

  • Cypher three-valued equality: Added cypher_literal_eq() in SQL generator implementing Cypher's null-propagating equality — null = anything → null, cross-type comparisons → false, list element-wise null propagation. Fixes 8 comparison test failures. (to_sql_query.rs)

  • VLP chained-pattern start labels: Multi-hop patterns like MATCH (n)-->(a)-->(b) RETURN b now correctly derive start labels for the second hop by recursing into the chained inner GraphRel. Supplements __Unlabeled start labels with schema from_node types for chained patterns. Fixes empty results on 2-hop traversals with labeled data. (cte_extraction.rs)

  • List-of-lists comparison: Extended is_literal_like() to recognise pure-literal nested lists, enabling native ClickHouse Array(Array(T)) comparison (element-by-element, matching Cypher's [2,1] > [2] semantics). Removed unnecessary has_type_mismatch helpers; all-literal arrays now render as-is. (render_expr.rs)

  • Type inference performance regression: Reverted max_combos from MAX_RAW_COMBINATIONS (200,000) to get_max_combinations() (500) — the raw-cap constant was accidentally used where the post-filter limit should be, causing 400× overhead in pattern combination generation. (type_inference.rs)

📚 Documentation

  • Tutorials and examples (PR #246): 5 runnable Python scripts (examples/embedded/) covering quick start, DataFrames, write API, GraphRAG hybrid workflow, and export formats. Wiki tutorial page (docs/wiki/Embedded-Tutorials.md) with Python + Rust code, architecture diagrams, and API quick reference.

🐛 Other Bug Fixes

  • Edge extraction fallback (PR #241): extract_edge_from_row falls back to from_id/to_id aliases when schema FK column names don't match SQL-generated column names.
  • Security dep updates: lz4_flex 0.11.5→0.11.6 (RUSTSEC-2026-0041), rustls-webpki 0.103.8→0.103.10 (RUSTSEC-2026-0049).

🧹 Infrastructure

  • CI: cargo audit ignores unmaintained rustls-pemfile warning (transitive dep via chdb-rust).

[0.6.4-dev] - 2026-03-14

🚀 Features

  • Denormalized & coupled schema support: Full query support for schemas where node properties are embedded in edge tables via from_node_properties/to_node_properties. Includes property mapping, ORDER BY resolution, UNION aggregate column rewriting, and id() on virtual nodes (PRs #224-#228).

  • OPTIONAL MATCH on denormalized schemas: New CTE + LEFT JOIN architecture for correct LEFT JOIN semantics when MATCH produces a UNION standalone node scan. Includes UnionDistribution skip for optional patterns, column reference rewriting, and join preservation through the optimizer (PRs #229-#230).

  • VLP on denormalized/polymorphic schemas: Fixed exact-length VLP cycle prevention for virtual nodes (no separate table), enabling *2, *3 patterns. Range VLP (*1..3), path variables, and shortestPath all work on denormalized schemas (PR #231).

  • Cross-schema pattern matrix tests: Comprehensive test suite covering 15 query patterns across 5 schema types (standard, FK-edge, denormalized, polymorphic, coupled). 151 tests passing, 0 xfails (PRs #226-#232).

🐛 Bug Fixes

  • Denormalized property mapping: get_properties_with_table_alias() resolves node properties through edge table's from_node_properties/to_node_properties with direction awareness (PR #225).
  • id(node) on denormalized nodes: SelectBuilder Case 5 now resolves through edge alias and mapped column instead of using the virtual node alias directly (PR #227).
  • UNION branch Column qualification: Bare Column("OriginCityName") expressions from denormalized ViewScans converted to PropertyAccessExp with correct alias in GraphNode handler (PR #228).
  • VLP cycle prevention: Moved extract_table_name calls inside non-denormalized branch — denormalized patterns use from_id/to_id directly (PR #231).
  • UnionDistribution: Skip distributing optional GraphRel over denormalized Union to preserve LEFT JOIN semantics (PR #229).
  • is_node_denormalized: Now handles Union of denormalized GraphNodes (PR #229).

🧹 Infrastructure

  • jemalloc memory allocator: Reduces memory fragmentation for long-running server workloads (PR #213).
  • Plan explosion guard: Prevents combinatorial blowup in multi-type VLP expansion (PR #212).
  • Test cleanup: ~103 stale xfail markers removed, 25 invalid test queries converted to skips (PRs #211, #218-#223, #227, #232).

[0.6.3-dev] - 2026-03-05

🚀 Features

  • APOC Export Procedures: Neo4j-compatible CALL apoc.export.{csv|json|parquet}.query(cypher, destination, config) for exporting query results. Supports local files, S3, GCS, Azure, and HTTP destinations. Works in HTTP server, Bolt protocol, and embedded mode.

    • Destination resolver: Maps URI schemes to ClickHouse INSERT INTO FUNCTION table functions (file(), s3(), url(), azureBlobStorage())
    • Parser fix: Standalone CALL with positional args now correctly parsed even when inner Cypher contains RETURN/UNION keywords
    • Config: Parquet compression codecs (snappy, gzip, lz4, zstd, brotli)
  • Embedded mode (PR #179): Run Cypher graph queries entirely in-process via chdb — no external ClickHouse server required. Supports Parquet, CSV, Iceberg, Delta Lake, and S3-compatible storage.

    • QueryExecutor trait: Abstracts SQL execution; RemoteClickHouseExecutor (existing) and ChdbExecutor (new) are the two backends. Default behaviour is unchanged.
    • clickgraph-embedded crate: Kuzu-compatible Rust library API — Database::new(schema, config), Connection::new(&db), conn.query(cypher), result.next()Row.
    • source: schema field: Optional per-node/relationship URI pointing to the data file. At startup, ClickGraph creates chdb VIEWs named after the schema table: field so existing SQL generation requires no changes.
    • URI schemes: file://, s3://, gs://, iceberg+s3://, iceberg+local://, delta+s3://, table_function:<raw>.
    • StorageCredentials: S3/GCS/Azure credentials applied as chdb SET commands at session init; falls back to environment variables and instance-profile credentials automatically.
    • Server embedded flag: --embedded CLI flag / CLICKGRAPH_EMBEDDED=true env var; HTTP and Bolt endpoints work as normal.
    • Tests: 9 source_resolver tests, 8 credential tests, 17 embedded unit tests, 10 e2e integration tests.
    • Docs: Embedded Mode wiki page

🚀 Features

  • LDBC SNB benchmark: 14/37 → 36/37 (97%) — 22 queries promoted from adapted to official Cypher. The only remaining gap is bi-16 (CALL subquery, a known language feature gap).

    • Official queries promoted: complex-3, complex-5, complex-7, complex-10, complex-12, complex-13, bi-3, bi-8, bi-14, and others
    • Adapted queries remaining: bi-17 (multi-VLP), complex-14 (weighted shortest path via cost(path))
  • GraphRAG structured output (format: "Graph") (PR #165): Query results returned as graph-structured JSON with nodes, edges, and properties — enables direct consumption by graph visualization and RAG pipelines.

  • ClickHouse cluster load balancing (CLICKHOUSE_CLUSTER env var) (PR #164): Distributes queries across ClickHouse cluster nodes for horizontal read scaling.

  • apoc.meta.schema() for MCP server compatibility (PR #163): Implements the Neo4j APOC procedure that MCP servers and graph tools use for schema introspection.

  • LLM-powered schema discovery (:discover command) (PR #146): Server formats a discovery prompt (POST /schemas/discover-prompt), client calls LLM (Anthropic or OpenAI-compatible) to generate YAML schema from ClickHouse table metadata. Replaced the GLiNER/gline-rs approach.

  • Weighted shortest path (cost(path) function) (PR #160): Supports Dijkstra-style weighted VLP traversal for queries like complex-14. WeightCteConfig carries weight info through the VLP pipeline; auto-creates bidirectional weight CTEs for undirected traversal.

  • List comprehension → arrayCount() optimization (PR #153): Parses [x IN list WHERE cond | expr] syntax, maps size(ListComprehension) to ClickHouse arrayCount() — avoids correlated subqueries that fail with UNION ALL ("Cannot clone Union plan step").

  • Pattern comprehension → pre-aggregated CTE approach (PR #159): Replaces correlated subqueries from size(PatternComprehension) with pre-aggregated CTEs + LEFT JOINs. Includes arrayConcat() for list concatenation (list1 + list2).

  • Official complex-7 — chained map access + NOT EXISTS (PR #152): Greedy chained property parsing (a.b.c), map literal node flattening (head(collect({key: node}))), split NOT EXISTS for undirected edges.

  • Official complex-3 — supertype inference + IN→OR expansion (PR #151): Supertype collapse (Post+Comment → Message), IN [col1, col2]OR expansion for ClickHouse compatibility, 5-WITH chain support.

  • Map property access (collect({score: x})[0].score → ClickHouse map subscript) (PR #147): Tracks map_keys through CTE pipeline, generates ArraySubscript for map property access with 0-based → 1-based index conversion.

  • UNWIND support (ARRAY JOIN) (PR #133): Translates Cypher UNWIND to ClickHouse ARRAY JOIN.

  • --log-level CLI flag for runtime log level configuration.

🐛 Bug Fixes

  • Undirected edge fixes: Removed has_nested_undirected_edge guard that prevented UNION split for mid-chain undirected edges (PR #147). Fixed BidirectionalUnion for multi-pattern MATCH with bound endpoints — collapses redundant Union to single Outgoing branch (PR #148).

  • VLP (variable-length path) fixes: Fixed path rewriting for reverse UNION branches (PR #135), composite ID support (PR #134, #136), *N..N exact-hop guard (PR #137), duplicate WITH RECURSIVE removal (PR #131), multi-VLP query support (PR #132), DISTINCT deduplication (PR #130), zero-lower-bound *0.. for single-type and multi-type VLPs (PR #142), CROSS JOIN removal for VLP CTEs in downstream queries (PR #145).

  • OPTIONAL MATCH fixes: INNER→LEFT JOIN conversion for CTE-backed JOINs in OPTIONAL MATCH context, spurious duplicate JOIN removal, orphan JOIN removal guards, collect(node) expansion to ID-only for has() compatibility (PR #143).

  • CTE/scope fixes: Bare variable resolution after WITH barrier (PR #120, #121), cte_references preservation in UNION branches (PR #122), composite alias augmentation (PR #128), buried WithClause preservation in DuplicateScansRemoving (PR #138).

  • shortestPath fixes: CASE path IS NULLifNull(minOrNull(hop_count), -1) rewriting, spurious non-VLP JOIN cleanup, endpoint inline filter preservation (PR #157).

  • Parser whitespace fix: MATCH/OPTIONAL MATCH now handle leading whitespace after $param syntax (PR #145).

  • Browser click-to-expand regressions: Fixed 5 bugs from scope resolution redesign — filter_tagging crash, VLP multi-type inference, type mismatch, polymorphic label extraction, pruned MATCH detection (PR #156).

  • Determinism fixes: HashSet→BTreeSet in anchor node selection, HashMap→BTreeMap in GraphSchema, sorted conversions in CTE extraction (PR #137, #139).

⚙️ Infrastructure

  • Integration test cleanup: 3,068 tests passing, 57 stale xfails removed (PR #169).
  • Scoping-only WITH collapse + benchmark infrastructure (PR #168): Optimizes scoping-only WITH clauses that don't need CTE materialization.
  • Schema-parameterized SQL generation tests: 76 tests across 6 schema variants (PR #162).
  • Browser interaction tests with full schema variant coverage (PR #161).
  • Version bump to v0.6.3-dev with README cleanup (PR #167).
  • Roadmap and guide updates (PR #166).

[0.6.2-dev] - 2026-02-20

⚙️ Architecture

  • Scope-aware variable resolution for CTE/UNION rendering (Feb 20, 2026, PR #120): Infrastructure for correct variable resolution across WITH barriers during SQL rendering.

    • Extended VariableSource::Cte with property_mapping (Cypher property → CTE column name) for runtime column resolution
    • Added resolve() to VariableRegistry for property lookup during SQL generation
    • Populated property mappings in build_chained_with_match_cte_plan loop from scope CTE variables
    • Wired VariableRegistry into SQL rendering via task-local QueryContext
    • Scope fixes: UNION branch recursion in rewrite_render_plan_with_scope; WITH barrier scope clearing between WITH clauses; per-CTE registry save/restore in Cte::to_sql()
    • Evidence: 2-WITH chain with bidirectional KNOWS now generates correct CTE alias references (a_b.p1_b_id instead of b.p1_b_id)
    • Files: 10 files, +486/-28 lines
    • Tests: 1,111 unit tests passing, LDBC 13/37 (35%) — no regression
  • Clean join generation architecture with anchor-aware algorithm (Feb 19, 2026, PR #117): Major refactoring of JOIN generation and ordering.

    • Core insight: Traditional node-edge-node is the base case (2 JOINs); all other JoinStrategy variants are optimizations that skip some JOINs
    • New generic algorithm: per-pattern loop → generate_pattern_joins() → VLP rewrites → optional marking → dedup → anchor selection → topological sort
    • Anchor-aware generation: Handles 4 cases (neither/left/right/both available) — critical for OPTIONAL MATCH shared-node patterns
    • Replaced ~1200 lines of per-strategy handler code with 64-line generic loop + clean 810-line module
    • Files: 5 files, +1002/-1296 lines (net -374 lines)
    • Tests: 1,040 unit tests passing, LDBC 13/37 (35%) — no regression

🐛 Bug Fixes

  • Neo4j Browser click-to-expand regression fixes (Feb 19, 2026, PR #116): Fixed 5 bugs introduced by the scope resolution redesign (PR #115) that completely broke click-to-expand in Neo4j Browser.
    • Bug 1 — filter_tagging crash: When TypeInference prunes all relationship types, filter_tagging crashed with no table context. Fixed by propagating Empty plan on error.
    • Bug 2a — VLP multi-type inference: Phase 1 computed the right GraphNode before plan_ctx was updated with inferred labels, causing Phase 2 to generate empty WHERE 0=1 UNION branches. Fixed by re-running infer_labels_recursive on the right node after multi-type detection.
    • Bug 2b — VLP+WITH type mismatch: JOIN between WITH CTEs and VLP CTEs failed (UInt64 vs String). Fixed by wrapping node id columns in toString().
    • Bug 2c — extract_node_labels not polymorphic: Returned only primary label when multiple node types were present. Fixed to return all types.
    • Bug 3 — empty SQL for pruned MATCH: is_return_only_query() misidentified pruned MATCH as pure RETURN. Fixed by checking Projection items for TableAlias (MATCH) vs Literal (RETURN).
    • Noise fix: HTTP OPTIONS/GET probes from Neo4j Browser on the Bolt port logged as ERROR. Downgraded to DEBUG.
    • Verification: User node expansion returns exactly 11 rows (3 FOLLOWS-out, 3 FOLLOWS-in, 2 AUTHORED, 3 LIKED) matching raw ClickHouse counts.

⚙️ Infrastructure

  • Neo4j Browser demo improvements (Feb 19, 2026, PR #116):
    • All 5 ClickHouse tables migrated from Memory to MergeTree ENGINE — data now persists across container restarts.
    • Removed duplicate data loading from setup.sh; init-db.sql is the single data entrypoint.
    • clickgraph service updated to official image genezhang/clickgraph:v0.6.2-dev.

🚀 Features

  • Foundational Variable Scope Resolution Redesign (Feb 2026): 🎉 MAJOR ARCHITECTURE FIX
    • Problem: The rendering pipeline resolved variables without scope context. Cypher's WITH creates scope barriers — only exported variables survive — but the SQL generator was unaware of this, causing leaked JOINs, wrong column references, and broken ORDER BY/GROUP BY/HAVING for post-WITH variables.
    • Root Cause: 13 separate resolution paths scattered across the codebase, a reverse_mapping hack (~88 usages) patching wrong results post-hoc.
    • Solution: VariableScope struct as a single, forward-only resolution source, built iteratively with each WITH iteration and threaded into every resolution site.
    • Architecture:
      VariableScope (new):
      ├─ Resolve alias.property → CteColumn | DbColumn | Unresolved
      ├─ Built per WITH iteration: scope.advance_with(alias, cte_name, mapping, labels)
      ├─ Covers: SELECT, WHERE, ORDER BY, GROUP BY, HAVING, JOIN conditions
      └─ Eliminates need for post-render reverse_mapping rewrites
      
    • Key Changes (22 commits):
      • src/render_plan/variable_scope.rs: New VariableScope, CteVariableInfo, rewrite_render_plan_with_scope() — expands bare CTE node vars into individual columns
      • src/render_plan/plan_builder_utils.rs: Scope built in build_chained_with_match_cte_plan() loop; alias rename mapping (WITH u AS person → maps person→u for property lookup)
      • src/render_plan/plan_builder.rs: Scope threaded into rendering pipeline
      • Removed ~1,362 net lines: intermediate_reverse_mapping, final reverse_mapping block, 6 helper functions for reverse-mapping rewrites
      • Fixed UNION CTE SELECT * → project needed columns per branch
      • Fixed aggregate UNION rendering (inner branches project raw columns, outer aggregates)
      • Fixed deterministic join ordering (HashMap+Vec preserves insertion order)
      • Fixed VLP+WITH JOIN type mismatch (toString() wrapping on UInt64 removed)
      • Fixed CTE node variable expansion in SELECT (bare a after WITH → individual columns)
      • Fixed alias renaming through WITH (WITH u AS person → resolves person.name)
    • Results:
      • ✅ 1,032/1,032 unit tests passing
      • ✅ Integration tests at parity with main branch (13/13 same pre-existing failures)
      • ✅ LDBC mini benchmark: 14/37 (38%), up from 10/37 (27%) baseline (+4 queries)
      • ✅ Zero new regressions
      • 🎯 Net: -1,362 lines (architecture cleaned, reverse_mapping eliminated)

🐛 Bug Fixes

  • ORDER BY, HAVING, LIMIT, SKIP clause extraction (Feb 17, 2026): Fixed critical bug where clauses were omitted in multiple code paths
    • Problem: Four code paths calling trait methods instead of utility functions → clauses dropped
    • Root Cause: self.extract_order_by() returns empty (trait default), should use plan_builder_utils::extract_order_by(self) (handles wrapper nodes)
    • Impact: ~50 ORDER BY integration tests failing, queries returning wrong order
    • Fixed Paths:
      1. GraphJoins path (commit 4a9ff13) - lines 2929-2938
      2. ViewScan path (commit 0acfd74) - lines 837, 845-847
      3. Union branch path (commit 0acfd74) - lines 1059, 1061, 1063-1065
      4. Pattern comprehension path (commit 0acfd74) - lines 1148, 1154, 1160-1161
    • Key Discovery: Cypher HAVING uses WITH...WHERE syntax (not direct HAVING keyword), already working correctly
    • Files Modified:
      • src/render_plan/plan_builder.rs: 4 code paths fixed to use utility functions
      • src/query_planner/analyzer/type_inference.rs: Fixed clippy warning
    • Testing: All 1,022 unit tests passing, ORDER BY verified in all query patterns
    • Expected Impact: ~50 failing integration tests → passing (585/960 → ~635/960, 61% → 66%)

🚀 Features

  • Schema/Type Inference Consolidation (Feb 16, 2026): 🎉 ARCHITECTURE CLEANUP - 668 LINES REMOVED

    • Mission: Merge overlapping SchemaInference + TypeInference into single unified pass
    • Problem: Two passes with duplicate logic (label inference, ViewScan resolution) + planning phase creating UNIONs without type knowledge → architectural debt
    • Solution: 6-phase incremental consolidation (Phases 0-E) with comprehensive testing
    • Implementation:
      • Phase 0: Added 79 gap coverage tests (multi-table, FK-edge, label inference, denormalized)
      • Phase A: Created function mapping document (8 cases analyzed)
      • Phase B: Extended TypeInference with Phase 0 (relationship inference) + Phase 3 placeholder
      • Phase C: Modified planning to return Empty for unlabeled nodes (removed 125 lines of premature UNION creation)
      • Phase D: Fixed SchemaInference to read labels from GraphNode.label (set by TypeInference Phase 2)
      • Phase E: Implemented full Phase 3 ViewScan resolution, removed SchemaInference completely
    • Architecture After:
      UnifiedTypeInference (4 phases):
      ├─ Phase 0: Relationship-based label inference (from SchemaInference)
      ├─ Phase 1: Filter→GraphRel UNION (existing, working)
      ├─ Phase 2: Untyped node UNION with direction validation (browser bug fix)
      └─ Phase 3: ViewScan resolution (from SchemaInference)
      
    • Key Changes:
      • src/query_planner/analyzer/type_inference.rs: +755 lines (Phase 0 + Phase 3 implementation)
      • src/query_planner/logical_plan/match_clause/helpers.rs: -125 lines (UNION creation removed)
      • src/query_planner/analyzer/schema_inference.rs: DELETED (-1308 lines)
      • src/query_planner/analyzer/mod.rs: Removed SchemaInference pass
    • Results:
      • ✅ Single source of truth for type resolution
      • ✅ Cleaner architecture (one pass instead of two overlapping passes)
      • ✅ Direction validation works everywhere (Phase C fix)
      • ✅ Better performance (one less analyzer pass)
      • ✅ All 1022 unit + 36 integration tests passing
      • 🎯 Net: -668 lines (removed 1445, added 777)
    • Testing: Comprehensive gap coverage tests, baseline capture with rollback tags, incremental validation at each phase
    • Documentation: Updated STATUS.md, type-inference architecture notes
    • Impact: 🎉 Major architectural improvement with zero behavior changes
  • Unified Type Inference with Direction Validation (Feb 16, 2026): 🎯 NEO4J BROWSER FIX

    • Problem: Neo4j Browser expand feature showed relationships in wrong direction (Post→User instead of schema-defined User→Post)
    • Root Cause: Browser queries like MATCH (a)--(b) WHERE id(a) IN [Post.1] had labels extracted from WHERE constraints, but no pass validated direction against schema. Invalid branches like (Post)-[AUTHORED]->(User) passed through despite schema defining User→Post.
    • Solution: Extended TypeInference to merge PatternResolver functionality, extract WHERE constraints, validate direction, and optimize undirected patterns
    • Key Improvements:
      • WHERE constraint extraction: extract_labels_from_where() decodes id() IN [...] patterns from LogicalExpr
      • Direction validation: check_relationship_exists_with_direction() enforces schema direction constraints
      • Undirected optimization: optimize_undirected_pattern() converts Direction::Either to unidirectional when all valid combinations go same direction
      • UNION generation: try_generate_union_with_constraints() creates Union with only schema-valid branches
    • Architecture:
      Filter(WHERE id(a) IN [...])
        └─ GraphRel(a, r, b, direction=Either)
      
      ↓ UnifiedTypeInference
      
      1. Extract labels from WHERE: a ∈ {Post}, b ∈ {User}
      2. Check schema: User→Post (AUTHORED, LIKED), User→User (FOLLOWS)
      3. Optimize: All Post combinations go backward → Convert Either to Incoming
      4. Generate Union with valid branches only
      
    • Algorithm (src/query_planner/analyzer/type_inference.rs):
      1. Intercepts Filter→GraphRel patterns
      2. Extracts WHERE constraints (labels from id() calls)
      3. Computes possible types (explicit labels + WHERE + schema)
      4. Optimizes undirected patterns (Either→Outgoing/Incoming when unidirectional)
      5. Validates each (left, rel, right) combination with direction check
      6. Generates Union if multiple branches, single branch if one, skips if zero
    • Results:
      • ✅ UNION generation: 3 branches for valid User→{User,Post} patterns
      • ✅ Direction filtering: MATCH (p:Post)--(u:User) correctly uses schema direction (User→Post)
      • ✅ Invalid branches excluded: MATCH (p:Post)-[r]->(u:User) returns 0 (correct!)
      • ✅ Undirected optimization: (Post)--(User) with Direction::Either converts to Incoming
    • PatternResolver Deprecated: Functionality merged into TypeInference
    • Testing: Manual verification with Neo4j Browser patterns, direction validation tests
    • Impact: 🎉 Neo4j Browser expand feature now shows correct relationship directions

🐛 Bug Fixes

  • OPTIONAL MATCH Schema Lookup Fix (Feb 3, 2026): ✅ ALL SMOKE TESTS PASSING
    • Problem: OPTIONAL MATCH queries failed with "Relationship with type FOLLOWS not found" due to incomplete node label inference
    • Root Cause: Relationship schemas stored only with composite keys (TYPE::FROM::TO), but OPTIONAL MATCH used simple keys (TYPE)
    • Solution: Enhanced schema storage and lookup to support both composite and simple key access patterns
    • Changes:
      • src/graph_catalog/config.rs: Store relationships with both composite and simple keys for backward compatibility
      • src/graph_catalog/graph_schema.rs: Added fallback logic in get_rel_schema_with_nodes() to try composite keys when simple key lookup fails
    • Result: All 10 smoke tests now passing (previously 7/10), including OPTIONAL MATCH with aggregation
    • Impact: Robust relationship resolution for all query types (regular MATCH, OPTIONAL MATCH, multi-type patterns)

�🚀 Features

  • PatternResolver - Automatic Type Enumeration (Feb 8, 2026): 🧠 SCHEMA INTELLIGENCE

    • Problem: Untyped graph patterns (MATCH (n)) fail or behave unpredictably without explicit type labels
    • Solution: Systematic type resolution that automatically enumerates all valid type combinations from schema
    • What Works:
      • Automatic discovery: Recursively finds all untyped variables in logical plan
      • Schema querying: Collects all valid node types for each untyped variable
      • Combination generation: Creates cartesian product of type assignments (limited to 38 by default)
      • Relationship validation: Filters combinations based on schema relationship constraints
      • Query cloning: Creates separate typed query for each valid combination
      • UNION ALL: Combines all typed queries into single result
      • Graceful fallback: Continues with original plan if errors occur
    • Example:
      -- Input: Exploratory query without type labels
      MATCH (o) RETURN o.name LIMIT 10
      
      -- PatternResolver transforms to:
      MATCH (o:User) RETURN o.name LIMIT 10
      UNION ALL
      MATCH (o:Post) RETURN o.name LIMIT 10
    • Architecture (7 phases, ~1100 lines):
      • Phase 0: Infrastructure (status message system, configuration)
      • Phase 1: Discovery (recursive traversal to find untyped GraphNode variables)
      • Phase 2: Schema Query (collect type candidates for each variable)
      • Phase 3: Combination Generation (iterative cartesian product with early termination)
      • Phase 4: Validation (extract relationships, filter invalid combinations)
      • Phase 5: Query Cloning (recursive cloning with label insertion)
      • Phase 6: UNION ALL (combine typed queries into Union plan)
      • Phase 7: Integration (Step 2.1 in analyzer pipeline, after TypeInference)
    • Configuration:
      • CLICKGRAPH_MAX_TYPE_COMBINATIONS=38 (default, max 1000)
      • Prevents combination explosion in large schemas
    • Performance: <10ms overhead for typical queries (1-2 untyped variables)
    • Integration Strategy:
      • TypeInference (Step 2): Handles deterministic type inference (e.g., from relationship type)
      • PatternResolver (Step 2.1): Handles non-deterministic cases (creates UNION ALL)
      • Complementary, not redundant - PatternResolver only activates on remaining untyped nodes
    • Use Cases:
      • Exploratory analysis: MATCH (n) RETURN count(n) - count all nodes across types
      • Multi-type patterns: MATCH (a)-[r]->(b) RETURN * - all relationships
      • Schema discovery: MATCH (n) RETURN distinct labels(n) - find node types
    • Impact: ✨ Enables true exploratory graph queries without manual type annotations
    • Testing:
      • 16 dedicated unit tests (100% passing)
      • 995/995 total tests passing (zero regressions)
      • Covers all phases: discovery, combinations, validation, cloning
    • Files:
      • New: src/query_planner/analyzer/pattern_resolver.rs (1033 lines)
      • New: src/query_planner/analyzer/pattern_resolver_config.rs (58 lines)
      • Modified: src/query_planner/analyzer/mod.rs (pipeline integration)
      • Modified: src/query_planner/plan_ctx/mod.rs (status message system)
    • Branch: feature/pattern-resolver (10 commits, +1202/-24 lines)
    • Documentation: See notes/pattern-resolver.md for implementation details
  • Property-Based UNION Pruning (Track C) (Feb 3, 2026): ⚡ PERFORMANCE OPTIMIZATION

    • Problem: Untyped graph patterns (MATCH (n) WHERE n.property...) generated UNION across ALL types, wasting resources
    • Solution: Automatic schema-based filtering - only query types that have the required properties
    • Performance: 10x-50x faster for queries on schemas with many node/relationship types
    • What Works:
      • Node patterns: MATCH (n) WHERE n.user_id = 1 → Only queries User type (not all 10+ types)
      • Relationship patterns: MATCH ()-[r]->() WHERE r.follow_date... → Only queries FOLLOWS type
      • UNION ALL queries: Each branch filters independently (automatic)
      • Single-branch optimization: Skips UNION wrapper when only 1 type matches
      • Empty result optimization: Returns 0 rows immediately when no types match
    • Property Extraction: ANY property reference implies property must exist
      • n.property > value → requires property
      • n.x = 1 AND n.y = 2 → requires both x and y
      • Works in functions: length(n.name) → requires name
    • Architecture (5 phases, ~800 lines):
      • Phase 1: WherePropertyExtractor - Recursively extracts ALL property references from WHERE clauses
      • Phase 2: SchemaPropertyFilter - Filters node/relationship schemas using HashSet::is_subset()
      • Phase 3: Single-branch optimization in generate_scan() (0 types → Empty, 1 type → ViewScan, N types → filtered UNION)
      • Phase 4: Relationship filtering in traversal.rs (stores filtered types in GraphRel.labels)
      • Phase 5: UNION ALL auto-supported (each branch gets independent PlanCtx)
    • Example:
      -- Before: UNION across ALL node types
      MATCH (n) WHERE n.user_id = 1 RETURN n
      -- Generated SQL scanned: users, posts, connections, orders, etc. (10+ tables)
      
      -- After: Only User type
      -- Generated SQL scanned: users (1 table)
      -- Result: 10x-50x faster
    • Impact: ✨ Neo4j Browser exploration queries now performant on large schemas
    • Testing:
      • 949/949 unit tests passing (100%, zero regressions)
      • 2/3 integration tests passing (schema loading setup pending)
    • Files:
      • New: src/query_planner/analyzer/where_property_extractor.rs (339 lines)
      • New: src/query_planner/logical_plan/match_clause/schema_filter.rs (130 lines)
      • New: tests/integration/test_track_c_property_filtering.py (155 lines)
      • Modified: helpers.rs, traversal.rs, view_scan.rs, filter_tagging.rs, schema_inference.rs, plan_ctx/mod.rs
    • Branch: feature/track-c-property-optimization (8 commits)
  • Top-Level UNION ALL Support (Feb 2, 2026): Combine multiple independent queries with UNION/UNION ALL

    • Syntax: query1 UNION ALL query2 for combining results from different queries
    • Features:
      • Per-branch clauses: DISTINCT, LIMIT, WHERE, ORDER BY supported in each branch
      • Mixed entity types: Nodes and relationships can be combined in same result set
      • Both UNION (removes duplicates) and UNION ALL (keeps duplicates) supported
    • Requirements:
      • Column count and names must match across branches
      • Types should be compatible (ClickHouse requirement)
    • Known Limitations:
      • Requires explicit labels (:User, :Post); untyped patterns (MATCH (n)) require Track C
      • Type casting may be needed for incompatible types across branches
    • Testing: 3 integration tests covering simple unions, DISTINCT/LIMIT, and mixed node/relationship queries
    • Examples:
      -- Multi-type aggregation
      MATCH (u:User) RETURN "users" AS type, count(*) AS count
      UNION ALL
      MATCH ()-[r:FOLLOWS]->() RETURN "follows" AS type, count(*) AS count
      
      -- Schema merging
      MATCH (u:User) RETURN u.name, u.email, "user" AS source
      UNION ALL
      MATCH (a:Admin) RETURN a.name, a.email, "admin" AS source
    • Files: server/handlers.rs, server/sql_generation_handler.rs, tests/integration/test_union_all.py
    • Branch: feature/top-level-union-all
    • Documentation: Added comprehensive section in Cypher Language Reference
  • Path UNION Queries for Neo4j Browser "Dot" Feature (Feb 2, 2026): ⭐ NEO4J COMPATIBILITY

    • Problem: Neo4j Browser's dot query explorer sends MATCH p=()-->() RETURN p but ClickGraph couldn't handle untyped paths with properties
    • Solution: Reused Union infrastructure to generate UNION ALL across all relationship types with JSON property format
    • How It Works:
      • plan_builder.rs detects path UNION patterns (GraphJoins with path tuples)
      • convert_path_branches_to_json() transforms each branch to consistent 4-column JSON schema
      • build_format_row_json() uses prefixed aliases (_s_city, _e_city, _r_follow_date) to avoid ClickHouse alias collision
      • select_builder.rs expands denormalized relationship properties via schema lookup
      • Bolt transformer strips prefixes for clean Neo4j Browser display
    • Generated SQL Pattern:
      SELECT tuple('fixed_path', 't1_0', 't2_0', 't3') as p,
             formatRowNoNewline('JSONEachRow', t1_0.user_id AS _s_user_id, ...) as _start_properties,
             formatRowNoNewline('JSONEachRow', t2_0.post_id AS _e_post_id, ...) as _end_properties,
             formatRowNoNewline('JSONEachRow', t3.post_date AS _r_post_date) as _rel_properties
      FROM users_bench t1_0 JOIN posts_bench t2_0 ... JOIN posts_bench t3
      UNION ALL ...
    • Impact: ✨ Neo4j Browser dot query now shows all connected edges with properties!
    • Key Features:
      • All relationship types included (denormalized + explicit edge tables)
      • Type preservation: numbers stay numbers, dates stay dates
      • Automatic property expansion for denormalized relationships (e.g., AUTHORED)
      • Clean property names in browser (prefixes internal only)
    • Files: src/render_plan/plan_builder.rs, src/render_plan/plan_builder_helpers.rs, src/render_plan/select_builder.rs, src/server/bolt_protocol/result_transformer.rs
  • Label-less Node Queries for Neo4j Browser "Dot" Feature (Feb 1, 2026): ⭐ NEO4J COMPATIBILITY

    • Problem: Neo4j Browser's exploration feature sends MATCH (n) RETURN n LIMIT 25 but ClickGraph required explicit labels
    • Solution: Reused existing Union infrastructure to generate UNION ALL across all node types when no label specified
    • How It Works:
      • generate_scan() detects label-less patterns and creates Union of ViewScans for all node types in schema
      • Multi-label scan detection recursively unwraps GraphJoins→Projection→GraphNode→ViewScan layers
      • json_builder::generate_multi_type_union_sql() generates uniform columns: _label, _id, _properties
      • is_multi_label_scan flag preserves special columns through Projection pass
    • Generated SQL Pattern:
      WITH __multi_label_union AS (
        SELECT 'User' as _label, toString(user_id) as _id, formatRowNoNewline('JSONEachRow', ...) as _properties FROM users
        UNION ALL
        SELECT 'Post' as _label, toString(post_id) as _id, formatRowNoNewline('JSONEachRow', ...) as _properties FROM posts
      )
      SELECT n._label, n._id, n._properties FROM __multi_label_union AS n LIMIT 25
    • Impact: ✨ Neo4j Browser "dot" exploration now works - click any node to see all connected nodes!
    • Files: src/query_planner/logical_plan/match_clause/helpers.rs, src/render_plan/plan_builder.rs, src/render_plan/mod.rs
  • RETURN Clause Evaluation for Procedures (Feb 1, 2026): ⭐ CRITICAL FEATURE - Full RETURN clause support for procedure-only queries

    • Problem: Neo4j Browser schema sidebar was empty because Browser sends complex UNION queries with RETURN clauses that aggregate procedure results
    • Solution: Implemented complete RETURN clause evaluator in src/procedures/return_evaluator.rs with:
      • Expression evaluation: variables, literals, map literals, list construction, property access
      • Aggregation functions: COLLECT (array aggregation), COUNT (with distinct support)
      • Array slicing: [..1000], [5..], [2..10] operations
      • Proper aggregation semantics: processes all records to produce single aggregated result
    • Architecture: Async-safe execution flow with ExecutionPlan enum to cross async boundaries
    • Example Query: CALL db.labels() YIELD label RETURN {name:'labels', data:COLLECT(label)[..1000]} AS result
    • Result Format: Returns aggregated structure Browser expects: {result: {name: 'labels', data: [...]}}
    • Impact: ✨ Neo4j Browser schema sidebar now auto-populates with labels, relationships, and properties!
    • Testing: 3/3 unit tests + E2E validation with Python neo4j-driver (3-branch UNION query works perfectly)
    • Files: New: src/procedures/return_evaluator.rs; Modified: src/server/bolt_protocol/handler.rs, src/procedures/executor.rs
  • Neo4j Schema Metadata Procedures (Feb 2026): Implemented 4 essential procedures for Neo4j tool compatibility

    • New Procedures:
      • CALL db.labels() - Returns all node labels in current schema
      • CALL db.relationshipTypes() - Returns all relationship types
      • CALL db.propertyKeys() - Returns all unique property keys from nodes and relationships
      • CALL dbms.components() - Returns ClickGraph version, name, and edition
    • Architecture: New top-level src/procedures/ module for future extensibility; CypherStatement changed from struct to enum (Query | ProcedureCall)
    • Execution Flow: Procedures bypass query planner and execute directly against GLOBAL_SCHEMAS for fast response (<5ms)
    • Multi-Schema Support: Works with schema_name request parameter to query different schemas
    • Response Format: Neo4j-compatible JSON with count and records fields
    • Impact: Enables Neo4j Browser and Neodash visualization tools to introspect ClickGraph schemas and show autocomplete
    • Testing: 922 unit tests passing + E2E validation with scripts/test/test_procedures.sh
    • Files:
      • New: src/procedures/*.rs (mod, executor, db_labels, db_relationship_types, dbms_components, db_property_keys)
      • New: src/open_cypher_parser/standalone_procedure_call.rs (parser for CALL statements)
      • Modified: src/server/handlers.rs (procedure detection and execution), src/open_cypher_parser/ast.rs (CypherStatement enum)
      • Test: scripts/test/test_procedures.sh
    • Branch: feature/neo4j-schema-procedures

🔒 Security

  • Parser Recursion Depth Limits (Jan 26, 2026): Added MAX_RELATIONSHIP_CHAIN_DEPTH = 1000 to prevent DoS attacks
    • Problem: Unbounded recursion in parse_consecutive_relationships() vulnerable to stack overflow on malicious inputs like ()-[]->()-[]->... (1000+ hops)
    • Solution: Created depth-tracking wrapper parse_consecutive_relationships_with_depth(input, depth) that returns ErrorKind::TooLarge when depth > 1000
    • Test Coverage: 4 comprehensive tests for reasonable depth (100), max depth (1000), exceeds limit (1001), error clarity (1050)
    • Impact: Parser now protected against DoS via deep recursion; all 184 parser tests passing
    • Files: src/open_cypher_parser/path_pattern.rs

🐛 Bug Fixes

  • Denormalized Single-Hop Property Access (Jan 30, 2026): ⭐ CRITICAL BUG FIX - Fixed denormalized schemas generating SQL with wrong table alias

    • Problem: Single-hop queries like MATCH (a:User)-[r:FOLLOWS]->(b:User) RETURN a.name, b.city on denormalized schemas generated SELECT t.name, t.city FROM user_follows AS r with wrong alias 't' instead of 'r', causing "Unknown expression identifier" errors
    • Root Cause: PlanCtx stored denormalized node→edge mappings during query planning, but rendering phase used task-local storage - the transfer between these phases was missing!
    • Solution: Added transfer loop in to_render_plan_with_ctx() to copy denormalized aliases from PlanCtx to task-local storage before rendering
    • Architecture: Three-phase lifecycle documented in docs/architecture/denormalized-alias-lifecycle.md (Planning → Transfer → Rendering)
    • Test Coverage: Added 19 comprehensive tests for single-hop property selection patterns across all schema types
    • Impact: All denormalized single-hop queries now work correctly; bug blocked alpha release
    • Files: src/render_plan/plan_builder.rs, src/query_planner/plan_ctx/mod.rs
    • Tests: tests/integration/matrix/test_single_hop_properties.py (19 passing tests)
  • Nested WITH Filtered Exports (Jan 26, 2026): Fixed infinite iteration loop in nested WITH clauses with filtered exports

    • Problem: Queries like MATCH (u:User) WITH u AS person WITH person.name AS name RETURN name hit 10-iteration safety limit and failed
    • Root Cause: collapse_passthrough_with() required both key and CTE name match (key == target_alias && this_cte_name == target_cte_name) instead of just key match
    • Solution: Changed condition to key == target_alias to allow passthrough WITH collapse when key matches target alias
    • Impact: Nested WITH with filtered exports now work correctly (3/4 test scenarios passing, aggregation remains separate issue)
    • Files: src/render_plan/plan_builder_utils.rs
  • EXISTS Subquery Schema Context (Jan 25, 2026): Fixed EXISTS subqueries using wrong schema/table

    • Problem: EXISTS subqueries like WHERE EXISTS { MATCH (a)-[:FOLLOWS]->(b) } were generating SQL with wrong tables
    • Root Cause: tokio::task_local! for query schema context requires .scope() wrapper; without it, try_with() returns None and fallback schema search picks wrong schema when multiple schemas have same relationship type
    • Solution: Changed from tokio::task_local! to thread_local! which is accessible without scope wrapping
    • Impact: All EXISTS subquery tests now passing (3/3)
    • Files: src/render_plan/render_expr.rs
  • WITH+Aggregation Scalar Export (Jan 25, 2026): Fixed WITH clauses with aggregations not generating CTE references

    • Problem: Queries like MATCH (a)-[r]->(b) WITH count(r) AS total RETURN total failed with "CTE not found" errors
    • Root Cause: export_single_with_item_to_cte() didn't handle TableAlias and PropertyAccessExp expression types for scalar exports
    • Solution: Added explicit handling for TableAlias (direct alias reference) and PropertyAccessExp (property.name pattern) in WITH item export logic
    • Impact: WITH clauses with aggregated scalars now work correctly
    • Files: src/render_plan/plan_builder_utils.rs
  • Denormalized VLP Property Access: Fixed incorrect table alias usage in VLP queries with denormalized relationships

    • Problem: Queries like MATCH path = (origin:Airport)-[f:FLIGHT*1..2]->(dest:Airport) RETURN origin.city generated SELECT f.OriginCityName instead of t.OriginCityName
    • Root Cause: SelectBuilder was using relationship table alias instead of CTE table alias for denormalized node properties in VLP contexts
    • Solution: Added hack in SelectBuilder to detect denormalized VLP property access (column names containing "Origin" or "Dest") and use CTE table alias "t"
    • Impact: All denormalized edge tests now passing (16/18, 2 expected failures), VLP property access working correctly
    • Files: src/render_plan/select_builder.rs
    • Tests: All denormalized edge integration tests passing
  • OPTIONAL MATCH + Inline Property Filters: Fixed invalid SQL generation when inline properties appear on nodes in OPTIONAL MATCH clauses

    • Problem: Inline property filters like (b:TestUser {name: 'Bob'}) in OPTIONAL MATCH were incorrectly injected as WHERE conditions instead of LEFT JOIN conditions
    • Root Cause: FilterIntoGraphRel optimizer was injecting filters into ViewScan.view_filter for all GraphNode patterns, including optional ones
    • Solution: Modified FilterIntoGraphRel to skip filter injection for optional aliases (identified via plan_ctx.get_optional_aliases())
    • Impact: LDBC IS-7 query and similar patterns with inline properties in OPTIONAL MATCH now generate correct LEFT JOIN SQL
    • Files: src/query_planner/optimizer/filter_into_graph_rel.rs
    • Tests: Added test_optional_match_inline_properties test case, all OPTIONAL MATCH tests now 26/27 passing (96%)

�🚀 Features

  • Multi-Table Label Union (MULTI_TABLE_LABEL): Complete support for aggregation queries on nodes that appear in multiple tables
    • Feature: Nodes with the same label appearing in multiple contexts (e.g., IP appearing in dns_log FROM, dns_log TO, and conn_log) now generate proper UNION queries with aggregation
    • Example: MATCH (n:IP) RETURN count(DISTINCT n.ip) now correctly generates UNION across all IP tables with aggregation wrapping
    • Implementation:
      1. get_all_node_schemas_for_label() method in src/graph_catalog/graph_schema.rs finds all tables with same label
      2. Logical plan generates UNION with branches for each context
      3. SQL generation wraps UNION in subquery and applies aggregation on top
    • Impact: Denormalized graph schemas with multi-context node labels now fully supported for analytical queries
    • Files: src/graph_catalog/graph_schema.rs, src/query_planner/logical_plan/match_clause.rs, src/render_plan/plan_builder.rs, src/clickhouse_query_generator/to_sql_query.rs
    • Tests: All 784 unit tests passing, no regressions

🧪 Testing

  • Comprehensive Integration Testing Validation: Successfully ran full 3489-test integration suite after critical bug fixes
    • Setup: Loaded test_integration database tables (fs_objects, groups, memberships, etc.) using scripts/test/load_test_integration_data.sh
    • Results: 128 passed, 3 failed, 17 skipped, 5 xfailed, 3 xpassed (97% success rate on executed tests)
    • Critical Validations:
      • ✅ Variable-length paths (VLP) all working (28/28 tests passing)
      • ✅ OPTIONAL MATCH functionality validated (3/3 tests passing)
      • ✅ WITH clause chaining working (6/6 tests passing)
      • ✅ All core query patterns functional
    • Remaining Issues: 3 undirected relationship test failures (non-critical, SQL generation scoping issues)
    • Impact: Confirms codebase stability after major refactoring, validates all critical bug fixes are working in production scenarios

🐛 Bug Fixes

  • Denormalized Node UNION Duplication: Fixed duplicate UNION branches and incorrect property mappings in denormalized graph queries

    • Issue: Denormalized queries generating 4 UNION branches instead of 2, with some branches using wrong property column names (Origin vs Destination)
    • Root Cause: Composite keys (e.g., "dns_log::TO::IP") were creating duplicate metadata entries, and aggregation SQL was using plan.select instead of branch-specific select items
    • Fix 1: Filter out composite keys in build_denormalized_metadata() to eliminate duplicate entries
    • Fix 2: Use union_branch.select.to_sql() instead of plan.select.to_sql() in aggregation rendering to respect branch-specific property mappings
    • Impact: Denormalized queries now generate correct UNION with proper column mappings
    • Files: src/graph_catalog/graph_schema.rs, src/clickhouse_query_generator/to_sql_query.rs
    • Tests: Denormalized aggregation tests now pass, 784/784 unit tests passing
  • GraphJoins UNION Extraction for Nested Unions: Fixed missing FROM clause in aggregation queries on UNION results

    • Issue: Queries like MATCH (n:IP) RETURN count(DISTINCT n.ip) generating SELECT without FROM clause, causing "Unknown identifier" errors
    • Root Cause: Union nested inside GraphNode → Projection → GroupBy → GraphJoins was never extracted because extract_union() only checked immediate input, not recursively through wrapper nodes
    • Fix: Implemented recursive unwrapping in extract_union() to detect Union at any depth (GraphNode, Projection, GroupBy), then properly convert to RenderPlan with union branches set
    • Impact: Multi-table aggregations and MULTI_TABLE_LABEL queries now work end-to-end with proper SQL generation
    • Files: src/render_plan/plan_builder.rs (lines 706-729, extract_union method)
    • Tests: All 784 unit tests passing, no regressions, aggregation queries now generate valid SQL
  • OPTIONAL MATCH with variable-length paths (VLP): Fixed SQL generation for OPTIONAL MATCH containing variable-length path patterns

    • Issue: Queries like MATCH (a:User) WHERE a.name = 'Eve' OPTIONAL MATCH (a)-[:FOLLOWS*1..3]->(b:User) RETURN a.name, COUNT(b) returned 0 rows instead of 1 row with count=0 when no paths exist
    • Root Cause: VLP CTE was incorrectly used as FROM clause instead of being LEFT JOINed to the anchor node from required MATCH, causing rows with no paths to be filtered out
    • Fix: Added graph_rel field to Join struct to track graph relationship information needed for proper LEFT JOIN generation in VLP cases. Updated all Join struct initializers across codebase to include graph_rel: None for non-VLP joins and graph_rel: Some(Arc::new(graph_rel)) for VLP-specific joins
    • Impact: OPTIONAL MATCH tests improved from 24/27 to 25/27 passing (93%). Users with no outgoing paths now correctly appear in results with count=0
    • Files:
      • src/logical_plan/mod.rs (Join struct definition with new graph_rel field)
      • src/render_plan/mod.rs (Join struct definition with new graph_rel field)
      • 40+ Join initializers updated across src/render_plan/ and src/query_planner/analyzer/ modules
    • Tests: test_optional_variable_length_no_path, test_optional_unbounded_path now passing
    • Generated SQL: Now correctly generates FROM users AS a LEFT JOIN vlp_a_b AS t ON t.start_id = a.user_id instead of FROM vlp_a_b AS t
  • OPTIONAL MATCH first pattern with disconnected patterns: Fixed SQL generation for queries where OPTIONAL MATCH comes before required MATCH with no shared nodes

    • Issue: Queries like OPTIONAL MATCH (a)-[:FOLLOWS]->(b) WHERE a.name='Eve' MATCH (x) WHERE x.name='Alice' generated SQL with undefined aliases or incorrect FROM clause selection
    • Root Cause: Three-layer problem:
      1. GraphJoinInference: connect_left_first logic excluded optional patterns from LEFT-first connection
      2. GraphJoinInference: FROM marker selection preferred first marker (optional) instead of required patterns
      3. Join rendering: Joins with empty joining_on were skipped entirely, missing required CROSS JOINs
    • Fix:
      1. Changed connect_left_first to always return true for is_first_relationship (regardless of optionality)
      2. Modified FROM marker creation to include all is_first_relationship patterns with appropriate join_type
      3. Added FROM marker selection logic preferring Inner (required) over Left (optional) joins
      4. Implemented CROSS JOIN rendering (ON 1=1) for joins with empty joining_on, distinguishing Left vs Inner
    • Impact: OPTIONAL MATCH tests improved from 17/27 to 24/27 passing (89%)
    • Files:
      • src/query_planner/analyzer/graph_join_inference.rs (59 lines: connect_left_first, FROM marker logic)
      • src/render_plan/plan_builder.rs (110 lines: CartesianProduct swap logic)
      • src/render_plan/join_builder.rs (53 lines: CROSS JOIN rendering)
    • Tests: test_optional_then_required, test_interleaved_required_optional now passing
    • Generated SQL: FROM x LEFT JOIN a ON 1=1 LEFT JOIN t1 ON t1.follower_id=a.user_id LEFT JOIN b ON b.user_id=t1.followed_id
  • VLP + WITH aggregation GROUP BY alias fix: Fixed incorrect GROUP BY alias in variable-length path queries with aggregation

    • Issue: Queries like MATCH (a)-[*1..2]->(b) WITH b, COUNT(*) AS cnt RETURN ... generated GROUP BY b.end_id which fails because b doesn't exist as a SQL table alias (the FROM clause uses vlp_a_b AS t)
    • Root Cause: expand_table_alias_to_group_by_id_only() in plan_builder_utils.rs wasn't detecting VLP endpoint aliases and was returning the Cypher alias instead of the VLP CTE alias
    • Fix: Added VLP endpoint detection at the start of the function using get_graph_rel_from_plan(). When alias matches VLP left/right connection, returns t.start_id or t.end_id using the VLP_CTE_DEFAULT_ALIAS constant
    • Impact: VLP + WITH aggregation queries now execute successfully with correct GROUP BY t.end_id
    • Files: src/render_plan/plan_builder_utils.rs (lines 4476-4530, expand_table_alias_to_group_by_id_only function)
    • Tests: All 784 unit tests passing, verified with social_benchmark schema
  • ArraySlicing property mapping fix: Property mappings now correctly applied inside ArraySlicing expressions like collect(n.name)[0..10]

    • Issue: ArraySlicing handler in apply_property_mapping wasn't recursively mapping the inner array expression
    • Fix: Added recursive property mapping for array, from, and to components of ArraySlicing expressions
    • Impact: All 10 test_collect tests now pass, expressions like collect(u.name)[0..2] correctly generate full_name in SQL
    • Files: src/query_planner/analyzer/filter_tagging.rs (lines 1057-1088)
  • CTE column aliasing underscore convention fix: WITH clauses now correctly use underscore aliases (a_name) in CTE columns instead of dot notation (a.name)

    • Issue: TableAlias expansion in WITH clauses was using dot notation for column aliases, causing inconsistent naming between CTE and final SELECT
    • Fix: Modified CTE extraction to expand TableAlias to individual PropertyAccessExp with underscore aliases using get_properties_with_table_alias()
    • Impact: CTE columns now use underscore convention (a_name, a_user_id) while final SELECT uses AS for dot notation (a_name AS "a.name")
    • Files: src/render_plan/cte_extraction.rs (TableAlias expansion logic, lines 2881-2896; LogicalColumnAlias import and usage)
    • Tests: cte_column_aliasing_underscore_convention test now passes, all integration tests passing (17/17)
  • Shortest path FROM clause fix (single-type VLP): Single-type variable-length paths now correctly use CTE in FROM clause instead of start node table

    • Issue: GraphJoins.extract_from() for empty joins checked variable-length paths AFTER denormalized/polymorphic checks
    • Fix: Moved single-type variable-length check to top priority (A.1) before other pattern checks
    • Impact: All 5 shortest path filter tests for single-type variable-length paths now pass with correct SQL: FROM vlp_a_b AS p instead of FROM test_db.users AS a
    • Limitation: Multi-type variable-length paths (e.g., [:TYPE1|TYPE2*1..3]) use CTE names like vlp_multi_type_a_b and are handled separately in plan_builder_utils.rs
    • Files: src/render_plan/plan_builder.rs (extract_from method, lines 1283-1299; single-type VLP handling)

⚙️ Refactoring

  • plan_builder.rs Phase 2 COMPLETE: All 4 domain builders extracted, performance validated, modular architecture achieved

    • Complete module extraction: 4 specialized builders extracted (join_builder.rs: 1,790 lines, select_builder.rs: 130 lines, from_builder.rs: 849 lines, group_by_builder.rs: 364 lines)
    • plan_builder.rs reduced: From 9,504 to 1,516 lines (84% reduction in main file, 3,133 lines extracted)
    • Trait-based delegation: Clean RenderPlanBuilder trait with delegation to all 4 builder modules
    • Performance validated: Cypher-to-SQL translation <14ms for all benchmark queries, <5% regression requirement met
    • Architecture complete: Modular design with excellent performance and maintainability
    • Compilation successful: All ambiguities resolved with explicit <LogicalPlan as GroupByBuilder> syntax
    • All tests passing: 770/770 unit tests (100%), 12/17 integration tests (71%, same as before)
    • Code quality maintained: Comprehensive documentation, helper functions for node property resolution
    • plan_builder.rs reduced: From 1,749 to 1,526 lines (223 lines extracted, 13% reduction this week, 39% total)
    • Ready for Week 7: Safe to proceed with order_by_builder.rs extraction
  • plan_builder.rs Phase 2 Week 5 Complete: from_builder.rs extraction finished, modular architecture expanded further

    • from_builder.rs fully implemented: Complete extraction of extract_from() function with all FROM resolution logic (864 lines)
    • Trait-based delegation: FromBuilder trait with extract_from() method for clean separation
    • Complex FROM logic extracted: Handles ViewScan, GraphNode, GraphRel (denormalized/VLP/optional/anonymous edges), GraphJoins (FROM markers/anchor resolution/CTEs), CartesianProduct (WITH...MATCH patterns)
    • Helper function integration: Imports from plan_builder_helpers for extract_table_name, is_node_denormalized, find_anchor_node, extract_rel_and_node_tables, find_table_name_for_alias, get_all_relationship_connections
    • Modular architecture expanded: Clean separation between plan_builder.rs and from_builder.rs with proper trait imports
    • Compilation successful: All imports resolved, no compilation errors, functionality preserved through trait delegation
    • All tests passing: 770/770 unit tests (100%), 12/17 integration tests (71%, same as before)
    • Code quality maintained: Comprehensive documentation, error handling, and performance characteristics
    • plan_builder.rs reduced: From 2,490 to 1,749 lines (741 lines extracted, 30% reduction)
    • Ready for Week 6: Safe to proceed with group_by_builder.rs extraction
  • plan_builder.rs Phase 2 Week 4 Complete: select_builder.rs extraction finished, modular architecture expanded

    • select_builder.rs fully implemented: Complete extraction of extract_select_items() function and all helper functions (950 lines)
    • Trait-based delegation: SelectBuilder trait with extract_select_items method for clean separation
    • Modular architecture expanded: Clean separation between plan_builder.rs and select_builder.rs with proper imports
    • Compilation successful: All imports resolved, no compilation errors, functionality preserved through trait delegation
    • Code quality maintained: Comprehensive documentation, error handling, and performance characteristics
    • plan_builder.rs reduced: From ~8,300 to ~7,350 lines (950 lines extracted)
    • Ready for Week 5: Safe to proceed with from_builder.rs extraction
  • plan_builder.rs Phase 2 Week 3 Complete: join_builder.rs extraction finished, modular architecture achieved

    • join_builder.rs fully implemented: Complete extraction of extract_joins() function and all helper functions (1,200 lines)
    • Trait-based delegation: JoinBuilder trait with extract_joins and extract_array_join methods for clean separation
    • Modular architecture achieved: Clean separation between plan_builder.rs and join_builder.rs with proper imports
    • Compilation successful: All imports resolved, no compilation errors, functionality preserved through trait delegation
    • Code quality maintained: Comprehensive documentation, error handling, and performance characteristics
    • plan_builder.rs reduced: From 9,504 to ~8,300 lines (1,200 lines extracted)
    • Ready for Week 4: Safe to proceed with select_builder.rs extraction
  • plan_builder.rs Phase 2 Week 2.5 Setup Complete: Infrastructure ready for 7-week module extraction process

    • Performance baselines established: 5 query types benchmarked with results saved to benchmarks/plan_builder_baseline.json
    • Feature flags integrated: PlanBuilderFeatureFlags struct with 8 flags for controlling extraction phases
    • Test matrix documented: Comprehensive validation criteria in docs/development/phase2-test-matrix.md
    • Schema loading verified: Test environment working with corrected test_integration.yaml (fixed id_column vs node_id issue)
    • Rollback procedures validated: Feature flags allow graceful fallback when extraction phases are disabled
    • Ready for Week 3: Safe to proceed with join_builder.rs extraction (1,200 lines planned)
  • plan_builder_utils.rs Consolidation Complete: Eliminated duplicate alias utility functions across codebase

    • 8 duplicate functions removed from plan_builder_utils.rs (202 lines saved)
    • Single source of truth established in utils/alias_utils.rs
    • Functions consolidated: collect_aliases_from_plan, collect_inner_scope_aliases, cond_references_alias, find_cte_reference_alias, find_label_for_alias, get_anchor_alias_from_plan, operator_references_alias, strip_database_prefix
    • Critical bug fix: Resolved stack overflow in complex WITH+aggregation queries by fixing has_with_clause_in_graph_rel to handle unknown plan types (Discriminant(7))
    • Codebase impact: Reduced from 18,121 to 17,919 lines (-202 lines, -1.1%)
    • Testing verified: 770/780 Rust unit tests pass (98.7%), integration tests pass for core functionality
    • No functional regressions: WITH clause processing, aggregations, basic queries, and OPTIONAL MATCH all working correctly
  • Expression Utilities Consolidation Complete: Eliminated duplicate string processing functions across render_plan modules

    • New shared module created: src/render_plan/expression_utils.rs with common string literal and operand processing utilities
    • 3 duplicate functions removed from plan_builder_utils.rs, cte_generation.rs, and cte_extraction.rs (eliminated ~60 lines of duplication)
    • Functions consolidated: contains_string_literal, has_string_operand, flatten_addition_operands now in shared location
    • Public API established: Made extract_node_label_from_viewscan public in cte_extraction.rs for shared use by cte_generation.rs
    • Code quality improved: Single source of truth for expression processing utilities, reduced maintenance burden
    • Testing verified: All 770/770 unit tests passing (100%), no functional regressions
    • Architecture maintained: Clean separation of concerns while eliminating duplication

🚀 Features

  • CTE Unification Phase 3 Complete: Unified recursive CTE generation across all schema patterns with comprehensive test coverage
    • TraditionalCteStrategy: Standard node/edge table patterns
    • DenormalizedCteStrategy: Single-table denormalized schemas
    • FkEdgeCteStrategy: Hierarchical FK relationships
    • MixedAccessCteStrategy: Hybrid embedded/JOIN access patterns
    • EdgeToEdgeCteStrategy: Multi-hop denormalized edge-to-edge patterns
    • CoupledCteStrategy: Coupled edges in same physical row
  • Parameter Extraction Complete: All CTE strategies now properly extract parameters from WHERE clause filters for SQL parameterization

[0.6.1] - 2026-01-13

🚀 Features

  • Neo4j-compatible field aliases: RETURN clause now preserves exact expression text as field names when AS alias not specified (matches Neo4j behavior)

  • Integrate data_security schema, remove benchmark schemas from unified tests

  • Auto-load all test schemas at session start

  • Add PatternGraphMetadata POC for cleaner join inference evolution

  • Phase 1 - Use cached node references from PatternGraphMetadata

  • (graph_join_inference) Phase 2 - Simplified cross-branch detection using metadata

  • (graph_join_inference) Phase 4 - Add relationship uniqueness constraints

  • Complete fixed-length path inline JOIN optimization

  • Property pruning optimization with unified test infrastructure

  • Edge constraints for cross-node validation (8/8 tests passing)

  • Pattern Comprehensions and Multiple UNWIND support

  • Add multi-schema YAML support for loading multiple graph schemas

  • Add multi-schema database setup and test scripts

  • Add array subscript syntax support and complete multi-type VLP path functions

  • Make MAX_INFERRED_TYPES configurable via query parameter

🐛 Bug Fixes

  • Support anonymous nodes in graph patterns
  • Use node ID columns for VLP CTE generation
  • Optimize JOIN generation based on property usage, not node naming
  • Optimize JOIN generation based on property usage, not node naming
  • Permanently fix test infrastructure issues
  • Add filesystem and group membership test data to setup script
  • Add small-scale benchmark test data and cleanup obsolete scripts
  • Migrate from schema_name='default' to USE clause convention
  • Add missing matrix test schemas and USE clause support
  • Add USE clause to multi-hop pattern tests
  • Update social_polymorphic schema to use actual table names
  • Resolve ontime schema name conflict, add benchmark schemas back for matrix tests
  • Add flights to default db for ontime_benchmark - Copy flights to default database - Comprehensive matrix: +256 tests - Overall: +186 tests to 2947 - Session total: +1047 tests (+55 percent)
  • Restore ontime_flights schema name for pattern matrix tests - Revert ontime_denormalized back to ontime_flights - Remove ontime_benchmark from unified test loading - Update matrix conftest to use ontime_flights - Pattern schema matrix: 0/51 to 9/51 recovery - Overall: 2758 to 2958 (+200 tests) - Session: 1900 to 2958 (+1058 tests, +55.7 percent, 85.2 percent pass rate)
  • Add property_expressions schema to test loading - Fix database to default where tables actually exist - Replace CASE WHEN with if() for parsing compatibility - Add to load_test_schemas.py - Property expressions tests: 0/28 to 13/28 recovery - Overall: 2958 to 2976 (+18 tests) - Session: 1900 to 2976 (+1076 tests, +56.6 percent, 85.7 percent pass rate)
  • Add schema_name to role-based query tests - Role tests now use unified_test_schema - All 5 role-based tests now pass
  • Add missing property aliases to property_expressions schema
  • VLP cross-branch JOIN uses node alias instead of relationship alias
  • VLP transitivity check handles polymorphic relationships
  • All integration tests now passing or properly marked xfail
  • Add relationship labels to edge list test GraphRel structures
  • Update edge list test assertions for SingleTableScan optimization
  • Add proper GraphSchema to failing tests
  • Thread schema through single-hop query pipeline for edge constraints
  • (vlp) Fix denormalized VLP node ID selection (Dec 22 regression)
  • (vlp) Complete denormalized VLP with comprehensive fixes
  • VLP path functions in WITH clauses + CTE body rewriting
  • Remove escaped quotes and multi_schema loader entry from conftest
  • Load denormalized_flights_test schema with proper data
  • VLP WHERE clause alias resolution for denormalized schemas
  • Correct AUTHORED relationship schema in unified_test_multi_schema.yaml
  • Multi-type VLP architectural fix - FROM alias solves all mapping issues
  • Multi-type VLP JSON extraction - skip alias mapping for multi-type CTEs
  • FK-edge zero-length VLP edge tuple generation
  • Unify MAX_INFERRED_TYPES default to 5 for consistency
  • Parameterized views apply to both node and edge tables in VLP queries
  • Add anyLast() wrapping for CTE references in GROUP BY aggregations
  • Rewrite CTE column references in JOINs
  • VLP+WITH+MATCH pattern (ic9) - delegate to input.extract_joins() for CTE references
  • Add VLP endpoint detection in find_id_column_for_alias
  • Correct ontime_denormalized schema to use default database
  • Skip JOINs for fully denormalized VLP patterns
  • Map denormalized VLP endpoint aliases to CTE alias for rewriting
  • Consecutive MATCH with per-MATCH WHERE, comment support, scalar aggregate investigation
  • WITH expression scope - rewrite CASE expressions to use CTE columns

💼 Other

  • Comprehensive test failure categorization (507 failures)
  • V0.6.1 - WITH clause fixes, GraphRAG enhancements, LDBC progress
  • Update Cargo.lock for v0.6.1 release

🚜 Refactor

  • (graph_join_inference) Phase 3 - Break up infer_graph_join() god method
  • [breaking] Migrate all integration tests to multi-schema format
  • [breaking] Remove obsolete unified_test_schema and cleanup
  • Consolidate denormalized_flights schema references

📚 Documentation

  • Update README.md with v0.6.0 and accumulated features
  • Update KNOWN_ISSUES.md with v0.6.0 fixes
  • Archive wiki for v0.6.0 release
  • Add release notes for v0.6.0
  • Fix ClickHouse function prefix (ch./chagg. not clickhouse.)
  • Fix composite node ID example (use nodes not edges)
  • Update STATUS and investigation plan with anonymous node fix
  • Update STATUS with property usage optimization and current test status
  • Complete test infrastructure documentation
  • Update STATUS with schema loading fix
  • Update STATUS - ALL INTEGRATION TESTS PASSING! 🎉
  • Add comprehensive architecture analysis for Scan/ViewScan/GraphNode relationships
  • Update gap analysis - Gap #2 already implemented
  • Add schema testing requirements (VLP multi-schema mandate)
  • Add VLP denormalized property handling TODO
  • Add session findings and feature analysis
  • Clean up KNOWN_ISSUES.md and add path function limitation
  • Update CHANGELOG and test infrastructure for VLP fixes
  • Add multi-schema configuration documentation
  • Add multi-schema setup guide
  • Update TESTING.md for multi-schema architecture
  • Update STATUS.md - remove load_test_schemas.py reference
  • Add VS Code terminal freeze prevention to TESTING.md
  • Document VLP WHERE clause bug discovery
  • Update Cypher-Subgraph-Extraction.md with verified pattern support matrix
  • Document max_inferred_types feature and update default to 5
  • Update STATUS with LDBC progress and IC-9 CTE naming issue
  • Systematic documentation cleanup and reorganization
  • Streamline STATUS.md to focus on current state (2822 → 322 lines)
  • LDBC benchmark baseline testing and analysis
  • Update README test coverage to 3000+ tests and reorganize features
  • Archive wiki documentation for v0.6.1 release

🧪 Testing

  • Update test expectations for known limitations
  • Add error message verification for known limitations
  • (graph_join_inference) Add comprehensive unit tests for Phase 4 uniqueness constraints
  • Add comprehensive VLP cross-functional testing
  • Add comprehensive GraphRAG schema variation tests
  • Add zero-length VLP tests for [*0..] and [*0..N] patterns

⚙️ Miscellaneous Tasks

  • Update CHANGELOG.md [skip ci]
  • Add lineage test schema and cleanup temporary files
  • Move SCHEMA_THREADING_ARCHITECTURE.md to docs/development/
  • Ignore docs1 directory in gitignore
  • Clean up docs
  • More doc cleanup
  • More docs clean up, README
  • Remove unused Flight node from unified_test_schema.yaml
  • Update CHANGELOG.md [skip ci]

[0.6.0] - 2025-12-22

🚀 Features

  • (functions) Add 18 new Neo4j function mappings for v0.5.5
  • (functions) Add 30 more Neo4j function mappings for v0.5.5
  • (functions) Add ClickHouse function pass-through via ch:: prefix
  • (functions) Add ClickHouse aggregate function pass-through via ch. prefix
  • (functions) Add chagg. prefix for explicit aggregates, expand aggregate registry to ~150 functions
  • (benchmark) Add LDBC SNB Interactive v1 benchmark
  • (benchmark) Add ClickGraph schema matching datagen format
  • (benchmark) Add LDBC query test script
  • (ldbc) Achieve 100% LDBC BI benchmark (26/26 queries)
  • Implement chained WITH clause support with CTE generation
  • Support ORDER BY, SKIP, LIMIT after WITH clause
  • Implement size() on patterns with schema-aware ID lookup
  • Add composite node ID infrastructure for multi-column primary keys
  • Add CTE reference validation
  • CTE-aware variable resolution for WITH clauses
  • Fix CTE column filtering and JOIN condition rewriting for WITH clauses
  • CTE-aware variable resolution + WITH validation + documentation improvements
  • Add lambda expression support for ClickHouse passthrough functions
  • Add comprehensive LDBC benchmark suite with loading, query, and concurrency tests
  • Implement scope-based variable resolution in analyzer (Phase 1)
  • Remove dead CTE validation functions
  • Implement CTE column resolution across all join strategies
  • Remove obsolete JOIN rewriting code from renderer (Phase 3D-A)
  • Move CTE column resolution to analyzer (Phase 3D-B)
  • Pre-compute projected columns in analyzer (Phase 3E)
  • Add CTE schema registry for analyzer (Phase 3F)
  • Use pre-computed projected_columns in renderer (Phase 3E-B)
  • Implement cross-branch shared node JOIN detection
  • Allow disconnected comma patterns with WHERE clause predicates
  • Support multiple sequential MATCH clauses
  • Implement generic CTE JOIN generation using correlation predicates
  • Complete LDBC SNB schema and data loading infrastructure
  • Improve relationship validation error messages
  • Clarify node_id semantics as property names with auto-identity mappings
  • Complete composite node_id support (Phase 2)
  • Add polymorphic relationship resolution architecture
  • Complete polymorphic relationship resolution data flow
  • Fix polymorphic relationship resolution in CTE generation
  • Add Comment REPLY_OF Message schema definition
  • Add schema entity collection in VariableResolver for Projection scope
  • Add dedicated LabelInference analyzer pass
  • Enhance TypeInference to infer both node labels and edge types
  • Reduce MAX_INFERRED_TYPES from 20 to 5
  • (parser) Add clear error messages for unsupported pattern comprehensions
  • (parser) Add clear error messages for bidirectional relationship patterns
  • (parser) Convert temporal property accessors to function calls
  • (analyzer) Add UNWIND variable scope handling to variable_resolver
  • (analyzer) Add type inference for UNWIND elements from collect() expressions
  • Support path variables in comma-separated MATCH patterns
  • Add polymorphic relationship resolution with node types
  • Complete collect(node) + UNWIND tuple mapping & metadata preservation architecture
  • Make CLICKHOUSE_DATABASE optional with 'default' fallback
  • Add parser support for != (NotEqual) operator
  • Add unified test schema for streamlined testing
  • Add unified test data setup and fix matrix test schema issues
  • Complete multi-tenant parameterized view support
  • Add denormalized flights schema to unified test schema
  • Add VLP transitivity check to prevent invalid recursive patterns

🐛 Bug Fixes

  • (benchmark) Use Docker-based LDBC data generation
  • (benchmark) Align DDL with actual datagen output format
  • (benchmark) Add ClickHouse credentials support
  • (benchmark) Align DDL and schema with actual datagen output
  • (ldbc) Fix CTE pattern for WITH + table alias pass-through
  • (ldbc) Fix ic3 relationship name POST_IS_LOCATED_IN -> POST_LOCATED_IN
  • WITH+MATCH CTE generation for correct SQL context
  • Replace all silent defaults with explicit errors in render_expr.rs
  • Eliminate ViewScan silent defaults - require explicit relationship columns
  • Expand WITH TableAlias to all columns for aggregation queries
  • Track CTE schemas to build proper property_mapping for references
  • Remove CTE validation to enable nested WITH clauses
  • Prevent duplicate CTE generation in multi-level WITH queries
  • Three-level WITH nesting with correct CTE scope resolution
  • Add proper schemas to WITH/HAVING tests
  • Correct CTE naming convention to use all exported aliases
  • Coupled edge alias resolution for multiple edges in same table
  • Rewrite expressions in intermediate CTEs to fix 4-level WITH queries
  • Add GROUP BY and ORDER BY expression rewriting for final queries
  • Issue #6 - Fix Comma Pattern and NOT operator bugs
  • Resolve 3 critical LDBC query blocking issues
  • (ldbc) Inline property matching & semantic relationship expansion
  • (ldbc) Handle IS NULL checks on relationship wildcards (IS7)
  • (ldbc) Fix size() pattern comprehensions - handle internal variables correctly (BI8)
  • (ldbc) Rewrite path functions in WITH clause (IC1)
  • Strip database prefixes from CTE names for ClickHouse compatibility
  • Cartesian Product WITH clause missing JOIN ON
  • Operator precedence in expression parser
  • VLP endpoint JOINs with alias rewriting for chained patterns
  • Correct NOT operator precedence and remove hardcoded table fallbacks
  • Three critical shortestPath and query execution bugs
  • Extend VLP alias rewriting to WHERE clauses for IC1 support
  • Use correct CTE names for multi-variant relationship JOINs
  • Remove database prefix from CTE table names in cross-branch JOINs
  • Hoist trailing non-recursive CTEs to prevent nesting scope issues
  • VLP + WITH label corruption bug - use node labels in RelationshipSchema
  • Resolve compilation errors from AST and GraphRel changes
  • Add fallback to lookup table names from relationship schema
  • Complete RelationshipSchema refactoring - all 646 tests passing
  • Add database prefixes to base table JOINs
  • Use underscore convention for CTE column aliases
  • Thread node labels through relationship lookup pipeline for polymorphic relationships
  • Support filtered node views in relationship validation
  • Add JOIN dependency sorting to CTE generation path
  • Use existing TableCtx labels in multi-pattern MATCH label inference
  • TypeInference creates ViewScan for inferred node labels
  • QueryValidation respects parser normalization
  • Populate from_id/to_id columns during JOIN creation for correct NULL checks
  • (ldbc) Align BI queries with LDBC schema definitions
  • Prevent RefCell panic in populate_relationship_columns_from_plan
  • UNWIND after WITH now uses CTE as FROM table instead of system.one
  • Replace all panic!() with log::error!() - PREVENT SERVER CRASHES
  • Clean up unit tests - fix 21 compilation errors
  • Complete unit test cleanup - fix assertions and mark unimplemented features
  • Replace non-standard LIKE syntax with proper OpenCypher string predicates
  • Add != operator support to comparison expression parser
  • Preserve database prefix in ViewTableRef SQL generation
  • Relationship variable expansion + consolidate property helpers
  • Use relationship alias for denormalized edge FROM clause
  • Re-enable selective cross-branch JOIN for comma-separated patterns
  • Rel_type_index to prefer composite keys over simple keys
  • WITH...MATCH pattern using wrong table for FROM clause
  • Update test labels to match unified_test_schema
  • Test_multi_database.py - use schema_name instead of database for USE clause
  • Unify aggregation logic and fix multi-schema support
  • Multi-table label bug fixes and error handling improvements

💼 Other

  • Fix dependency vulnerabilities for v0.5.5
  • Partial fix for nested WITH clauses - add recursive handling
  • Multi-variant CTE column name resolution in JOIN conditions
  • SchemaInference using table names instead of node labels

🚜 Refactor

  • Fix compiler warnings and clean up unused variables
  • (functions) Change ch:: to ch. prefix for Neo4j ecosystem compatibility
  • Extract TableAlias expansion into helper functions
  • Replace wildcard expansion in build_with_aggregation_match_cte_plan with helper
  • Remove deprecated v1 graph pattern handler (1,568 lines)
  • Extract CTE hoisting helper function
  • Remove unused ProjectionKind::With enum variant
  • Remove 676 lines of dead WITH clause handling code
  • Remove 47 lines of dead GraphNode branch with empty property_mapping
  • Remove redundant variable resolution from renderer (Phase 3A)
  • Remove unused bidirectional and FK-edge functions
  • Remove dead code function find_cte_in_plan
  • Consolidate duplicate property extraction code (-23 lines)
  • Remove dead extract_ctes() function (-301 lines)
  • Separate graph labels from table names in RelationshipSchema
  • Remove redundant WithScopeSplitter analyzer pass
  • Remove old parsing-time label inference
  • Consolidate inference logic into TypeInference with polymorphic support
  • Replace hardcoded fallbacks with descriptive errors
  • Add strict validation for system.one usage in UNWIND
  • ELIMINATE ALL HARDCODED FALLBACKS - fail fast instead
  • Consolidate test data setup - use MergeTree, remove duplicates

📚 Documentation

  • Update wiki documentation for v0.5.4 release
  • Archive wiki for v0.5.4 release
  • Add UNWIND clause documentation to wiki
  • Update v0.5.4 wiki snapshot with UNWIND documentation
  • Update Known-Limitations with recently implemented features
  • Update v0.5.4 wiki snapshot with corrected feature status
  • Add 30 new functions to Cypher-Functions.md reference
  • Expand vector similarity section with RAG usage
  • Clarify scalar vs aggregate function categories in ch.* docs
  • Add lambda expression limitation to ch.* pass-through documentation
  • Split ClickHouse pass-through into dedicated doc for better discoverability
  • Add comparison with PuppyGraph, TigerGraph, NebulaGraph
  • Fix PuppyGraph architecture description
  • Fix license - Apache 2.0, not MIT
  • (benchmark) Update README with correct workflow and files
  • Update KNOWN_ISSUES with accurate LDBC benchmark status
  • Update STATUS.md and KNOWN_ISSUES.md for WITH clause improvements
  • Add size() documentation and replace silent defaults with errors
  • Document composite node ID feature
  • Update STATUS.md with IC-1 fix and 100% LDBC benchmark
  • Document WITH handler refactoring (120 lines eliminated)
  • Identify remaining code quality hotspots after WITH refactoring
  • Update STATUS and code quality analysis with v1 removal
  • Add quality improvement plan and clarify parameter limitation
  • Add comprehensive lambda expression documentation to Cypher Language Reference
  • Reorganize lambda expressions as subsection of ClickHouse Function Passthrough
  • Move lambda expressions details to ClickHouse-Functions.md
  • Update LDBC benchmark analysis with accurate coverage (94% actionable)
  • Add comprehensive LDBC data loading and persistence guide
  • Add benchmark infrastructure completion summary
  • Add benchmark quick reference card
  • Update STATUS and CHANGELOG with predicate correlation
  • Update STATUS and CHANGELOG for sequential MATCH support
  • Update CHANGELOG and KNOWN_ISSUES for Issue #2 fix
  • Update KNOWN_ISSUES - mark Issues #1, #3, #4 as FIXED
  • Verify and update KNOWN_ISSUES - mark #5, #7 FIXED, detail #6 bugs
  • Update KNOWN_ISSUES.md - Mark Issue #6 as FIXED
  • Add LDBC benchmark audit tools and issue tracking
  • Update STATUS.md with WHERE clause rewriting completion
  • Document CTE database prefix fix in STATUS.md
  • Add AI Assistant Integration via MCP Protocol
  • Update STATUS.md with RelationshipSchema refactoring progress
  • Update STATUS.md - RelationshipSchema refactoring complete (646/646 tests)
  • Update STATUS and planning docs for node_id semantic clarification
  • Update STATUS.md and KNOWN_ISSUES.md for database prefix fix
  • Add database prefix fix to CHANGELOG.md
  • Update QUERY_FIX_TRACKER with Dec 19 fixes
  • Update STATUS, CHANGELOG, KNOWN_ISSUES for polymorphic relationship fix
  • Update STATUS with polymorphic resolution progress
  • Update STATUS.md with session summary
  • Update STATUS with TypeInference ViewScan fix
  • Update STATUS with QueryValidation fix - 70% LDBC passing
  • Update CHANGELOG with Dec 19 achievements and cleanup root directory
  • Analyze LDBC failures - 70% pass rate, identify 3 root causes
  • Add LDBC benchmark configuration guide
  • Correct bi-8/bi-14 root cause - pattern comprehensions not implemented
  • Update KNOWN_ISSUES with parser improvements for pattern comprehensions
  • Clarify CASE expression status - fully implemented
  • Update all documentation with correct schema paths
  • Add systematic test failure investigation plan
  • Update STATUS and CHANGELOG with test infrastructure progress
  • Mark relationship variable return bug as fixed
  • Update STATUS and CHANGELOG for 24/24 zeek tests
  • Update STATUS and CHANGELOG with test label fixes
  • Document path function VLP alias bug in KNOWN_ISSUES

⚡ Performance

  • Replace UUID-based CTE names with sequential counters

🎨 Styling

  • Apply rustfmt formatting to entire codebase

🧪 Testing

  • Update standalone relationship test for v2 behavior
  • Add comprehensive WITH + advanced features test suite
  • Add parameter tests for WITH clause combinations
  • Add LDBC benchmark test scripts
  • Add missing LDBC query parameters to audit script

⚙️ Miscellaneous Tasks

  • Update CHANGELOG.md [skip ci]
  • Remove dead code and fix all compiler warnings
  • Hide internal documentation from public repo
  • Keep wiki, images, and features subdirs external
  • Remove internal documentation from repo
  • Remove copilot instructions from public repo
  • Remove debug output after nested CTE fix
  • Add *.log to gitignore to prevent log file commits
  • Comprehensive cleanup - standardize schemas and reorganize tests
  • Remove duplicate setup_all_test_data.sh in scripts/setup/
  • Release v0.6.0 - VLP transitivity check and bug fixes

[0.5.4] - 2025-12-08

🚀 Features

  • Add native support for self-referencing FK pattern
  • Add relationship uniqueness enforcement for undirected patterns
  • (schema) Add fixed-endpoint polymorphic edge support
  • (union) Add UNION and UNION ALL query support
  • Multi-table label support and denormalized schema improvements
  • (pattern_schema) Add unified PatternSchemaContext abstraction - Phase 1
  • (graph_join_inference) Integrate PatternSchemaContext - Phase 2
  • (graph_join_inference) Add handle_graph_pattern_v2 - Phase 3
  • (pattern_schema) Add FkEdgeJoin strategy for FK-edge patterns
  • (graph_join) Wire up handle_graph_pattern_v2 with USE_PATTERN_SCHEMA_V2 env toggle

🐛 Bug Fixes

  • GROUP BY expansion and count(DISTINCT r) for denormalized schemas
  • Undirected multi-hop patterns generate correct SQL
  • Support fixed-endpoint polymorphic edges without type_column
  • Correct polymorphic filter condition in graph_join_inference
  • Normalize GraphRel left/right semantics for consistent JOIN generation
  • Recurse into nested GraphRels for VLP detection
  • (render_plan) Add WHERE filters for VLP chained pattern endpoints (Issue #5)
  • (parser) Reject binary operators (AND/OR/XOR) as variable names
  • Multi-hop anonymous patterns, OPTIONAL MATCH polymorphic, string operators
  • Aggregation and UNWIND bugs
  • Denormalized schema query pattern fixes (TODO-1, TODO-2, TODO-4)
  • Cross-table WITH correlation now generates proper JOINs (TODO-3)
  • WITH clause alias propagation through GraphJoins wrapper (TODO-8)
  • Multi-hop denormalized edge JOIN generation
  • Update schema files to match test data columns
  • (pattern_schema) Pass prev_edge_info for multi-hop detection in v2 path
  • (filter_tagging) Correct owning edge detection for multi-hop intermediate nodes
  • FK-edge JOIN direction bug - use join_side instead of fk_on_right
  • Add polymorphic label filter generation for edges

🚜 Refactor

  • Unify FK-edge pattern for self-ref and non-self-ref cases
  • Minor code cleanup in bidirectional_union and plan_builder_helpers
  • Make PatternSchemaContext (v2) the default join inference path
  • Reorganize benchmarks into individual directories
  • Replace NodeIdSchema.column with Identifier-based id field
  • Change YAML field id_column to node_id for consistency
  • Extract predicate analysis helpers to plan_builder_helpers.rs
  • Extract JOIN and filter helpers to plan_builder_helpers.rs

📚 Documentation

  • Update README for v0.5.3 release
  • Add fixed-endpoint polymorphic edge documentation
  • Add VLP+chained patterns docs and private security tests
  • Document Issue #5 (WHERE filter on VLP chained endpoints)
  • (readme) Minor wording improvements
  • Update PLANNING_v0.5.3 and CHANGELOG with bug fix status
  • Add unified schema abstraction proposal and test scripts
  • Add unified schema abstraction Phase 4 completion to STATUS
  • Update unified schema abstraction progress - Phase 4 fully complete
  • (benchmarks) Add ClickHouse env vars and fix paths in README
  • (benchmarks) Streamline README to be a concise index
  • Archive PLANNING_v0.5.3.md - all bugs resolved

🧪 Testing

  • Add multi-hop pattern integration tests
  • Fix Zeek integration tests - response format and skip cross-table tests
  • Add v1 vs v2 comparison test script
  • Add unit tests for predicate analysis helpers

⚙️ Miscellaneous Tasks

  • Update CHANGELOG.md [skip ci]
  • Make test files use CLICKGRAPH_URL env var for port flexibility
  • (benchmarks) Move social_network-specific files to subdirectory

[0.5.3] - 2025-12-02

🚀 Features

  • Add regex match (=~) operator and fix collect() function
  • Add EXISTS subquery and WITH+MATCH chaining support
  • Add label() function for scalar label return

🐛 Bug Fixes

  • Remove unused schemas volume from docker-compose
  • Parser now rejects invalid syntax with unparsed input
  • Column alias for type(), id(), labels() graph introspection functions
  • Update release workflow to use clickgraph binary name
  • Update release workflow to use clickgraph-client binary name
  • Build entire workspace in release workflow

📚 Documentation

  • Archive wiki for v0.5.2 release
  • Fix schema documentation and shorten README
  • Fix Quick Start to include required GRAPH_CONFIG_PATH
  • Add 3 new known issues from ontime schema testing
  • Update KNOWN_ISSUES.md - WHERE AND now caught
  • Clean up KNOWN_ISSUES.md - remove resolved issues
  • Remove false known limitations - all verified working

⚙️ Miscellaneous Tasks

  • Update CHANGELOG.md [skip ci]
  • Release v0.5.3
  • Update CHANGELOG.md [skip ci]
  • Update Cargo.lock for v0.5.3
  • Update CHANGELOG.md [skip ci]
  • Update CHANGELOG.md [skip ci]
  • Update CHANGELOG.md [skip ci]

[0.5.2] - 2025-11-30

🚀 Features

  • Add docker-compose.dev.yaml for development
  • [breaking] Phase 1 - Fixed-length paths use inline JOINs instead of CTEs
  • Add cycle prevention for fixed-length paths
  • Restore PropertyValue and denormalized support from stash, integrate with anchor_table
  • Complete denormalized query support with alias remapping and WHERE clause filtering
  • Implement denormalized node-only queries with UNION ALL
  • Support RETURN DISTINCT for denormalized node-only queries
  • Support ORDER BY for denormalized UNION queries
  • Fix UNION ALL aggregation semantics for denormalized node queries
  • Variable-length paths for denormalized edge tables
  • Add schema-level filter field with SQL predicate parsing
  • Schema-level filters and OPTIONAL MATCH LEFT JOIN fix
  • Add VLP + UNWIND support with ARRAY JOIN generation
  • Implement coupled edge alias unification for denormalized patterns
  • Implement polymorphic edge query support
  • (polymorphic) Add VLP polymorphic edge filter support
  • (polymorphic) Add IN clause support for multiple relationship types in single-hop
  • Complete polymorphic edge support for wildcard relationship patterns
  • Add edge inline property filter tests and update documentation
  • Implement bidirectional pattern UNION ALL transformation

🐛 Bug Fixes

  • ORDER BY rewrite bug for chained JOIN CTEs
  • Zero-hop variable-length path support
  • Remove ChainedJoinGenerator CTE for fixed-length paths
  • Complete PropertyValue type conversions in plan_builder.rs
  • Revert table alias remapping in filter_tagging to preserve filter context
  • Eliminate duplicate WHERE filters by optimizing FilterIntoGraphRel
  • Correct JOIN order and FROM table selection for mixed property expressions
  • Ensure variable-length and shortest path queries use CTE path
  • Destination node properties now map to correct columns in denormalized edge tables
  • Multi-hop denormalized edge patterns and duplicate WHERE filters
  • Variable-length path schema resolution for denormalized edges
  • Add edge_id support to RelationshipDefinition for cycle prevention
  • Fixed-length VLP (*1, *2, *3) now generates inline JOINs
  • Fixed-length VLP (*2, *3) now works correctly
  • Denormalized schema VLP property alias resolution
  • VLP recursive CTE min_hops filtering and aggregation handling
  • OPTIONAL MATCH + VLP returns anchor when no path exists
  • RETURN r and graph functions (type, id, labels)
  • Support inline property filters with numeric literals
  • Push projections into Union branches for bidirectional patterns
  • Polymorphic multi-type JOIN filter now uses IN clause

💼 Other

  • Manual addition of denormalized fields (incomplete)

🚜 Refactor

  • Simplify ORDER BY logic for inline JOINs
  • Simplify GraphJoins FROM clause logic - use relationship table when no joins exist
  • Store anchor table in GraphJoins, eliminate redundant find_anchor_node() calls
  • Set is_denormalized flag directly in analyzer, remove redundant optimizer pass
  • Move helper functions from plan_builder.rs to plan_builder_helpers.rs
  • Rename co-located → coupled edges terminology
  • Consolidate schema loading with shared helpers
  • Consolidated VLP handling with VlpSchemaType

📚 Documentation

  • Prioritize Docker Hub image in getting-started guide
  • Update README with v0.5.1 Docker Hub release
  • Add v0.5.2 planning document
  • Update wiki Quick Start to use Docker Hub image with credentials
  • Add Zeek network log examples and denormalized edge table guide
  • Update STATUS.md with denormalized single-hop fix
  • Update denormalized blocker notes with current status
  • Update denormalized edge status to COMPLETE
  • Add graph algorithm support to denormalized edge docs
  • Add 0-hop pattern support to denormalized edge docs
  • (wiki) Update denormalized properties with all supported patterns
  • Add coupled edges documentation
  • (wiki) Add Coupled Edges section to denormalized properties
  • Add v0.5.2 TODO list for polymorphic edges and code consolidation
  • Mark schema loading consolidation complete in TODO
  • Update STATUS.md with polymorphic edge filter completion
  • Add Schema-Basics.md and wiki versioning workflow
  • Update documentation for v0.5.2 schema variations
  • Update KNOWN_ISSUES.md with v0.5.2 status
  • Update KNOWN_ISSUES.md with fixed-length VLP resolution
  • Update KNOWN_ISSUES with VLP fixes and *0 pattern limitation
  • Add Cypher Subgraph Extraction wiki with Nebula GET SUBGRAPH comparison
  • Update README with v0.5.2 features

🎨 Styling

  • Use UNION instead of UNION DISTINCT

🧪 Testing

  • Add comprehensive Docker image validation suite
  • Add comprehensive schema variation test suite (73 tests)

⚙️ Miscellaneous Tasks

  • Update CHANGELOG.md [skip ci]
  • Update CHANGELOG.md [skip ci]
  • Clean up root directory - remove temp files and organize Python tests
  • Release v0.5.2
  • Update CHANGELOG.md [skip ci]
  • Update Cargo.lock for v0.5.2

[0.5.1] - 2025-11-21

🚀 Features

  • Add SQL Generation API (v0.5.1)
  • Implement RETURN DISTINCT for de-duplication
  • Add role-based connection pool for ClickHouse RBAC

🐛 Bug Fixes

  • Eliminate flaky cache LRU eviction test with millisecond timestamps
  • Replace docker_publish.yaml with docker-publish.yml
  • Add missing distinct field to all Projection initializations

📚 Documentation

  • Fix getting-started guide issues
  • Update STATUS.md with fixed flaky test achievement (423/423 passing)
  • Add /query/sql endpoint and RETURN DISTINCT documentation
  • Add /query/sql endpoint and RETURN DISTINCT to wiki

🧪 Testing

  • Add role-based connection pool integration tests

⚙️ Miscellaneous Tasks

  • Update CHANGELOG.md [skip ci]
  • Release v0.5.1
  • Update CHANGELOG.md [skip ci]

[0.5.0] - 2025-11-19

🚀 Features

  • (phase2) Add tenant_id and view_parameters to request context
  • (phase2) Thread tenant_id through HTTP/Bolt to query planner
  • Implement SET ROLE RBAC support for single-tenant deployments
  • (multi-tenancy) Add view_parameters field to schema config
  • (multi-tenancy) Implement parameterized view SQL generation
  • (multi-tenancy) Add Bolt protocol view_parameters extraction
  • (phase2) Add engine detection for FINAL keyword support
  • (phase2) Add use_final field to schema configuration
  • (phase2) Add FINAL keyword support to SQL generation
  • (phase2) Auto-schema discovery with column auto-detection
  • (auto-discovery) Add camelCase naming convention support
  • Add PowerShell scripts for wiki validation workflow
  • Add Helm chart for Kubernetes deployment

🐛 Bug Fixes

  • (phase2) Correct FINAL keyword placement - after alias
  • (tests) Add missing engine and use_final fields to test schemas
  • Implement property expansion for RETURN whole node queries
  • Update clickgraph-client and add documentation

🚜 Refactor

  • Minor code improvements in parser and planner

📚 Documentation

  • Phase 2 minimal RBAC - parameterized views with multi-parameter support
  • Fix Pattern 2 RBAC examples to use SET ROLE approach
  • Add Phase 2 progress to STATUS.md
  • Add comprehensive Phase 2 multi-tenancy status report
  • (multi-tenancy) Complete parameterized views documentation + cleanup
  • Update parameterized views note with cache optimization details
  • (phase2) Complete Phase 2 multi-tenancy documentation and tests
  • Correct Phase 2 status - 2/5 complete, not fully done
  • Update ROADMAP.md Phase 2 progress - 2/5 complete
  • (phase2) Update STATUS and CHANGELOG for FINAL syntax fix
  • (phase2) Update STATUS and CHANGELOG for auto-schema discovery
  • Align wiki examples with benchmark schema and add validation
  • Add session documentation and planning notes
  • Update STATUS, CHANGELOG, and KNOWN_ISSUES
  • Update ROADMAP with wiki documentation and bug fix progress
  • Mark Phase 2 complete - v0.5.0 release ready!

⚡ Performance

  • (cache) Optimize multi-tenant caching with SQL placeholders

🧪 Testing

  • Add comprehensive SET ROLE RBAC test suite
  • (multi-tenancy) Add parameterized views test infrastructure
  • (multi-tenancy) Add unit tests for view_parameters
  • Add integration test utilities and schema

⚙️ Miscellaneous Tasks

  • Update CHANGELOG.md [skip ci]
  • Clean up temporary test output and debug files

[0.4.0] - 2025-11-15

🚀 Features

  • Add parameter support via HTTP API + identity fallback for properties
  • Add production-ready query cache with LRU eviction
  • Complete Bolt 5.8 protocol implementation with E2E tests passing
  • Add Neo4j function support with 25+ function mappings
  • Complete E2E testing infrastructure + critical bug fixes
  • Unified benchmark architecture with scale factor parameter
  • Adjust post ratio to 20 and add 2 post-related benchmark queries
  • Add MergeTree engine support for large-scale benchmarks
  • (benchmark) Complete MergeTree benchmark infrastructure, discover multi-hop query bug
  • Add comprehensive regression test suite (799 tests)
  • Add pre-flight checks to test runner
  • Pre-load test_integration schema at server startup
  • Implement undirected relationship support (Direction::Either)

🐛 Bug Fixes

  • Multi-hop JOINs, SELECT aliases, SQL quoting + improve benchmark display
  • Use correct schema and database for integration tests
  • Start server without pre-loaded schema for integration tests
  • IS NULL operator in CASE expressions (22/25 tests passing)
  • Resolve compilation errors from API changes and incomplete cleanup
  • Additional GraphSchema::build() signature fixes in test files
  • Remove unused variable in view_resolver_tests.rs
  • Update error handling tests to match actual ClickGraph behavior

🚜 Refactor

  • Archive NEXT_STEPS.md in favor of ROADMAP.md
  • Remove inherited DDL generation code (~1250 LOC)
  • Remove bitmap index infrastructure (~200 LOC)
  • Remove use_edge_list flag (~50 LOC)
  • Flatten directory structure - remove brahmand/ wrapper
  • Remove expression_utils dead code - visitor pattern + utility functions
  • Convert CteGenerationContext to immutable builder pattern
  • Create plan_builder_helpers module (preparatory step)
  • Integrate plan_builder_helpers module
  • Add deprecation markers to duplicate helper functions
  • Complete deprecation markers for all helper functions (20/20)
  • Remove all deprecated helper functions (~736 LOC, 22% reduction)
  • Replace file-based debug logging with standard log::debug! macro

📚 Documentation

  • Update KNOWN_ISSUES and copilot-instructions - all major issues resolved
  • Add comprehensive ROADMAP with real-world features and prioritization
  • Architecture decision - Use string substitution for parameters (not ClickHouse .bind())
  • Update NEXT_STEPS.md roadmap with query cache completion
  • Update README and ROADMAP with query cache completion
  • Highlight parameter support in README and add usage restrictions
  • Update ROADMAP.md with Bolt 5.8 completion
  • Clarify anonymous node/edge pattern as TODO feature
  • Document flaky cache LRU eviction test
  • Document anonymous node SQL generation bug
  • Change 'production-ready' to 'development-ready' for v0.4.0

🧪 Testing

  • (benchmark) Add regression test script for CI/CD

⚙️ Miscellaneous Tasks

  • Update CHANGELOG.md [skip ci]
  • Complete v0.4.0 release preparation - Phase 1 complete

[0.3.0] - 2025-11-10

🚀 Features

  • Complete WITH clause with GROUP BY, HAVING, and CTE support
  • Enable per-request schema support for thread-safe multi-tenant architecture
  • Add schema-aware helper functions in render layer

🐛 Bug Fixes

  • Multi-hop graph query planning and join generation
  • Update path variable tests to match tuple() implementation
  • Improve anchor node selection to prefer LEFT nodes first
  • Prevent double schema prefix in CTE table names
  • Use correct node alias for FROM clause in GraphRel fallback
  • Prevent both LEFT and RIGHT nodes from being marked as anchor
  • Remove duplicate JOINs for path variable queries
  • Detect multiple relationship types in GraphJoins tree
  • Update JOINs to use UNION CTE for multiple relationship types
  • Correct release date in README (November 9, not 23)

💼 Other

  • Add schema to PlanCtx (Phases 1-3 complete)

🚜 Refactor

  • Remove BITMAP traversal code and fix relationship direction handling
  • Rename handle_edge_list_traversal to handle_graph_pattern
  • Remove redundant GLOBAL_GRAPH_SCHEMA

📚 Documentation

  • Prepare for next session and organize repository
  • Python integration test status report (36.4% passing)
  • Update STATUS and KNOWN_ISSUES for GLOBAL_GRAPH_SCHEMA removal
  • Clean up outdated KNOWN_ISSUES and update README

🧪 Testing

  • Add debugging utilities for anchor node and JOIN issues

⚙️ Miscellaneous Tasks

  • Update CHANGELOG.md [skip ci]
  • Disable automatic docker publish
  • Clean up test debris and remove deleted optimizer
  • Replace emoji characters with text equivalents in test files
  • Organize root directory for public repo
  • Bump version to 0.2.0
  • Bump version to 0.3.0

[0.2.0] - 2025-11-06

🚀 Features

  • Implement dual-key schema registration for startup-loaded schemas
  • Add COUNT(DISTINCT node) support and fix integration test infrastructure
  • Support edge-driven queries with anonymous node patterns

🐛 Bug Fixes

  • Simplify schema strategy - use only server's default schema
  • Remove ALL hardcoded property mappings - CRITICAL BUG FIX
  • Enhance column name helpers to support both prefixed and unprefixed names
  • Remove is_simple_relationship logic that skipped node joins
  • Configure Docker to use integration test schema
  • Only create node JOINs when nodes are referenced in query
  • Preserve table aliases in WHERE clause filters
  • Extract where_predicate from GraphRel during filter extraction
  • Remove direction-based logic from JOIN inference - both directions now work
  • GraphNode uses its own alias for PropertyAccessExp, not hardcoded 'u'
  • Complete OPTIONAL MATCH with clean SQL generation
  • Add user_id and product_id to schema property_mappings
  • Add schema prefix to JOIN tables in cte_extraction.rs
  • Handle fully qualified table names in table_to_id_column
  • Variable-length paths now generate recursive CTEs
  • Multiple relationship types now generate UNION CTEs
  • Correct edge list test assertions for direction semantics

💼 Other

  • Document property mapping bug investigation

🚜 Refactor

  • Remove /api/ prefix from routes for simplicity

📚 Documentation

  • Final Phase 1 summary with all 12 test suites
  • Add schema loading architecture documentation and API test
  • Update STATUS with integration test results
  • Create action plan for property mapping bug fix
  • Update STATUS and CHANGELOG with critical bug fix resolution
  • Document WHERE clause gap for simple MATCH queries
  • Add schema management endpoints and update API references
  • Update STATUS.md with WHERE clause alias fix
  • Update STATUS with WHERE predicate extraction fix
  • Update STATUS and CHANGELOG with schema fix
  • Update STATUS with complete session summary

🧪 Testing

  • Add comprehensive integration test framework
  • Add comprehensive relationship traversal tests
  • Add variable-length path and shortest path integration tests
  • Add OPTIONAL MATCH and aggregation integration tests
  • Complete Phase 1 integration test suite with CASE, paths, and multi-database
  • Add comprehensive error handling integration tests
  • Add basic performance regression tests
  • Initial integration test suite run - 272 tests collected
  • Fix schema/database naming separation in integration tests

⚙️ Miscellaneous Tasks

  • Update CHANGELOG.md [skip ci]

[0.1.0] - 2025-11-02

🚀 Features

  • (parser) Add shortest path function parsing
  • (planner) Add ShortestPathMode tracking to GraphRel
  • (planner) Detect and propagate shortest path mode
  • (sql) Implement shortest path SQL generation with depth filtering
  • Add WHERE clause filtering support for shortest path queries
  • Add path variable support to parser (Phase 2.1-2.2)
  • Track path variables in logical plan (Phase 2.3)
  • Pass path variable to SQL generator (Phase 2.4)
  • Phase 2.5 - Generate path object SQL for path variables
  • Phase 2.6 - Implement path functions (length, nodes, relationships)
  • WHERE clause filters for variable-length paths and shortestPath
  • Complete allShortestPaths implementation with WHERE filters
  • Implement alternate relationship types [:TYPE1|TYPE2] support
  • Implement multiple relationship types with UNION logic
  • Support multiple relationship types with labels vector
  • Complete Path Variables & Functions implementation
  • Complete Path Variables implementation with documentation
  • Add PageRank algorithm support with CALL statement
  • Complete Query Performance Metrics implementation
  • Complete CASE expressions implementation with full context support
  • Complete WHERE clause filtering pipeline for variable-length paths
  • Implement type-safe configuration management
  • Systematic error handling improvements - replace panic-prone unwrap() calls
  • Complete codebase health restructuring - eliminate runtime panics
  • Rebrand from Brahmand to ClickGraph
  • Update benchmark suite for ClickGraph rebrand and improved performance testing
  • Complete multiple relationship types feature with schema resolution
  • Complete WHERE clause filters with schema-driven resolution
  • Add per-table database support in multi-schema architecture
  • Complete schema-only architecture migration
  • Add medium benchmark (10K users, 50K follows) with performance metrics
  • Add large benchmark (5M users, 50M follows) - 90% success at massive scale!
  • Add Bolt protocol multi-database support
  • Add test convenience wrapper and update TESTING_GUIDE
  • Implement USE clause for multi-database selection in Cypher queries

🐛 Bug Fixes

  • (tests) Add exhaustive pattern matching for ShortestPath variants
  • (parser) Improve shortest path function parsing with case-insensitive matching
  • (parser) Consume leading whitespace in shortest path functions
  • (sql) Correct nested CTE structure for shortest path queries
  • (phase2) Phase 2.7 integration test fixes - path variables working end-to-end
  • WHERE clause handling for variable-length path queries
  • Enable stable background schema monitoring
  • Resolve critical TODO/FIXME items causing runtime panics
  • Root cause fix for duplicate JOIN generation in relationship queries
  • Three critical bug fixes for graph query execution
  • Consolidate benchmark results and add SUT information
  • Resolve path variable regressions after schema-only migration
  • Use last part of CTE name instead of second part

💼 Other

  • Prepare v0.1.0 release

🚜 Refactor

  • (sql) Wire shortest_path_mode through CTE generator
  • Extract CTE generation logic into dedicated module
  • Complete codebase health improvements - modular architecture
  • Standardize test organization with unit/integration/e2e structure
  • Extract common expression processing utilities
  • Organize benchmark suite into dedicated directory
  • Clean up and improve CTE handling for JOIN optimization
  • Remove GraphViewConfig and rename global variables
  • Complete migration from view-based to schema-only configuration
  • Organize project root directory structure

📚 Documentation

  • Add session recap and lessons learned
  • Add shortest path implementation session progress
  • Comprehensive shortest path implementation documentation
  • Add session completion summary
  • Update STATUS.md with Phase 2.7 completion - path variables fully working
  • Update STATUS.md to reflect current state of multiple relationship types
  • Add project documentation and cleanup summaries
  • Complete schema validation enhancement documentation
  • Update STATUS.md and CHANGELOG.md with completed features
  • Update NEXT_STEPS.md with recent completions and current priorities
  • Correct ViewScan relationship support - relationships DO use YAML schemas
  • Correct ViewScan relationship limitation in STATUS.md
  • Remove incorrect OPTIONAL MATCH limitation from STATUS.md and NEXT_STEPS.md
  • Document property mapping debug findings and render plan fixes
  • Update CHANGELOG with property mapping debug session
  • Update CHANGELOG with CASE expressions feature
  • Fix numbering inconsistencies and update WHERE clause filtering status
  • Update STATUS with type-safe configuration completion
  • Update STATUS.md with TODO/FIXME resolution completion
  • Clarify DDL parser TODOs are out-of-scope for read-only engine
  • Sync documentation with current project status
  • Update documentation with bug fixes and benchmark results
  • Update README with 100% benchmark success and recent bug fixes
  • Update STATUS.md with 100% benchmark success
  • Update STATUS and CHANGELOG with enterprise-scale validation
  • Add What's New section to README highlighting enterprise-scale validation
  • Complete benchmark documentation with all three scales
  • Add clear navigation to benchmark results
  • Tone down production-ready claims to development build
  • Add from_node/to_node fields to all relationship schema examples
  • Clarify node label terminology in comments and examples
  • Update STATUS.md with November 2nd achievements
  • Add multi-database support to README and API docs
  • Add PROJECT_STRUCTURE.md guide
  • Add comprehensive USE clause documentation

🧪 Testing

  • (parser) Add comprehensive shortest path parser tests
  • Add shortest path SQL generation test script
  • Add shortest path integration test files
  • Improve test infrastructure and schema configuration
  • Add end-to-end tests for USE clause functionality

⚙️ Miscellaneous Tasks

  • Update .gitignore to exclude temporary files
  • Disable CI on push to main (requires ClickHouse infrastructure)

[iewscan-complete] - 2025-10-19

🚀 Features

  • ✨ Added basic schema inferenc
  • ✨ support for multi node conditions
  • Support for multi node conditions
  • Query planner rewrite (#11)
  • Complete view-based graph infrastructure implementation
  • Comprehensive view optimization infrastructure
  • Complete ClickGraph production-ready implementation
  • Implement relationship traversal support with YAML view integration
  • Implement variable-length path traversal for Cypher queries
  • Complete end-to-end variable-length path execution
  • Add chained JOIN optimization for exact hop count queries
  • Add parser-level validation for variable-length paths
  • Make max_recursive_cte_evaluation_depth configurable with default of 100
  • Add OPTIONAL MATCH AST structures
  • Implement OPTIONAL MATCH parser
  • Implement OPTIONAL MATCH logical plan integration
  • Implement OPTIONAL MATCH with LEFT JOIN semantics
  • Implement view-based SQL translation with ViewScan for node queries
  • Add debug logging for full SQL queries
  • Add schema lookup for relationship types

🐛 Bug Fixes

  • 🐛 relation direction when same node types
  • 🐛 Property tagging to node name
  • 🐛 node name in return clause related issues
  • Count start issue (#6)
  • Schema integration bug - separate column names from node types
  • Rewrite GROUP BY and ORDER BY expressions for variable-length CTEs
  • Preserve Cypher variable aliases in plan sanitization
  • Qualify columns in IN subqueries and use schema columns
  • Prevent CTE nesting and add SELECT * default
  • Pass labels to generate_scan for ViewScan resolution

💼 Other

  • Node name in return clause related issues
  • Add RECURSIVE keyword to variable_length_demo.ipynb SQL descriptions

📚 Documentation

  • Add comprehensive changelog for October 15, 2025 session
  • Update README to use more appropriate terminology
  • Add comprehensive test coverage summary for variable-length paths
  • Simplify documentation structure for better maintainability
  • Add documentation standards to copilot-instructions.md
  • Add ViewScan completion documentation
  • Add git workflow guide and update .gitignore

🧪 Testing

  • Add comprehensive test suite for variable-length paths (30 tests)
  • Add comprehensive testing infrastructure

⚙️ Miscellaneous Tasks

  • Fixed docker pipeline mac issue
  • Fixed docker mac issue
  • Fixed docker image mac issue
  • Update CHANGELOG.md [skip ci]
  • Update CHANGELOG.md [skip ci]
  • Update CHANGELOG.md [skip ci]
  • Update CHANGELOG.md [skip ci]
  • Update CHANGELOG.md [skip ci]
  • Update Cargo.lock after axum 0.8.6 upgrade
  • Clean up debug logging and add NEXT_STEPS documentation