SQL support for isograph #968

Open
edmondop wants to merge 5 commits into main from feat/sql-datafusion-substrait
Summary

This branch adds Phase 1 SQL support to Isograph using DataFusion and Substrait. Isograph can now execute SQL queries against SQLite databases via a compile-time query pipeline. The branch also fixes the TypeScript type system so that it supports both SQL and GraphQL response types.


What Was Built

1. SQL Query Pipeline (compile-time)

Isograph's compiler now generates a Substrait binary query plan alongside the existing GraphQL artifacts. The pipeline is:

MergedSelectionMap → DataFusion LogicalPlan → Substrait binary → Base64-encoded artifact
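The last stage of the pipeline (Substrait binary to Base64 text) can be illustrated with a minimal sketch. The function names `encodeBase64` and `decodeBase64` are hypothetical and are not part of this PR; the real encoding happens in the Rust compiler. The sketch avoids platform helpers so it behaves the same in Node and in a browser.

```typescript
// Hypothetical helpers, for illustration only: standard Base64 with "=" padding.
const BASE64_ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

function encodeBase64(bytes: Uint8Array): string {
  let out = "";
  for (let i = 0; i < bytes.length; i += 3) {
    const remaining = bytes.length - i;
    // Pack up to three bytes into one 24-bit group; missing bytes count as 0.
    const triple =
      ((bytes[i] ?? 0) << 16) | ((bytes[i + 1] ?? 0) << 8) | (bytes[i + 2] ?? 0);
    out += BASE64_ALPHABET.charAt((triple >> 18) & 63);
    out += BASE64_ALPHABET.charAt((triple >> 12) & 63);
    out += remaining > 1 ? BASE64_ALPHABET.charAt((triple >> 6) & 63) : "=";
    out += remaining > 2 ? BASE64_ALPHABET.charAt(triple & 63) : "=";
  }
  return out;
}

function decodeBase64(text: string): Uint8Array {
  const clean = text.replace(/=+$/, ""); // drop trailing padding
  const out = new Uint8Array(Math.floor((clean.length * 6) / 8));
  let buffer = 0;
  let bits = 0;
  let index = 0;
  for (let i = 0; i < clean.length; i++) {
    const value = BASE64_ALPHABET.indexOf(clean.charAt(i));
    if (value < 0) {
      throw new Error("invalid Base64 character: " + clean.charAt(i));
    }
    // Keep only the low 16 bits; at most 13 are ever pending.
    buffer = ((buffer << 6) | value) & 0xffff;
    bits += 6;
    if (bits >= 8) {
      bits -= 8;
      out[index++] = (buffer >> bits) & 0xff;
    }
  }
  return out;
}
```

A client holding the Base64 artifact could call `decodeBase64` to recover the plan bytes before sending them on, assuming the server expects raw bytes rather than the Base64 text.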

New crates:

  • crates/sql_network_protocol: NetworkProtocol implementation for SQL; builds a DataFusion LogicalPlan from Isograph's MergedSelectionMap, then serialises it to Substrait binary via prost
  • crates/sql_lang_types: Type aliases and newtypes for SQL-specific language constructs
  • crates/sql_schema_parser: Parses SQL CREATE TABLE DDL into Isograph's schema model
  • crates/isograph_server: Axum HTTP server (/query endpoint) that accepts Substrait plans at runtime, executes them against DataFusion, and returns Arrow IPC data
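To make the DDL-to-schema step concrete, here is a toy sketch of turning one CREATE TABLE statement into a small schema model. It is not the sql_schema_parser crate (which is written in Rust); `parseCreateTable`, `TableSchema` and `ColumnSchema` are hypothetical names, and the parser deliberately ignores anything beyond simple `name TYPE [constraints]` column definitions (for example, types such as DECIMAL(10,2) or multi-column constraints would confuse the comma split).

```typescript
// Hypothetical schema model, for illustration only.
type ColumnSchema = {
  readonly name: string;
  readonly sqlType: string;
  readonly nullable: boolean;
};
type TableSchema = {
  readonly table: string;
  readonly columns: readonly ColumnSchema[];
};

function parseCreateTable(ddl: string): TableSchema {
  const match =
    /^\s*CREATE\s+TABLE\s+(?:IF\s+NOT\s+EXISTS\s+)?(\w+)\s*\(([\s\S]*)\)\s*;?\s*$/i.exec(ddl);
  if (match === null) {
    throw new Error("expected a single CREATE TABLE statement");
  }
  const [, table = "", body = ""] = match;

  const columns: ColumnSchema[] = [];
  for (const rawDefinition of body.split(",")) {
    const definition = rawDefinition.trim();
    // Skip empty fragments and table-level constraints.
    if (
      definition === "" ||
      /^(PRIMARY|FOREIGN|UNIQUE|CHECK|CONSTRAINT)\b/i.test(definition)
    ) {
      continue;
    }
    const [name = "", sqlType = ""] = definition.split(/\s+/);
    columns.push({
      name,
      sqlType: sqlType.toUpperCase(),
      // Simplification: NOT NULL and PRIMARY KEY columns are treated as non-null.
      nullable: !/\bNOT\s+NULL\b|\bPRIMARY\s+KEY\b/i.test(definition),
    });
  }
  return { table, columns };
}

// Column names come from the demo; the column types here are assumed.
const planets = parseCreateTable(`
  CREATE TABLE planets (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    climate TEXT
  );
`);
```

With this input, `planets.columns` holds three entries, and only `climate` is marked nullable.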

Modified:

  • crates/artifact_content/src/entrypoint_artifact.rs — generates query_plan.bin (base64 Substrait) for SQL profiles
  • crates/artifact_content/src/raw_response_type.rs — adds readonly modifiers and field-name quoting to generated TypeScript types
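As a rough illustration of what "readonly modifiers and field-name quoting" means for the generated types, the sketch below renders a TypeScript type from a list of columns. `renderRawResponseType` and `SqlColumn` are hypothetical, and the exact text emitted by raw_response_type.rs is not shown in this description, so treat the output format as an assumption.

```typescript
// Hypothetical column description, for illustration only.
type SqlColumn = {
  readonly name: string;
  readonly tsType: string;
  readonly nullable: boolean;
};

function renderRawResponseType(
  typeName: string,
  columns: readonly SqlColumn[],
): string {
  const fieldLines = columns.map((column) => {
    const valueType = column.nullable ? `${column.tsType} | null` : column.tsType;
    // JSON.stringify quotes the field name, so names that are not valid
    // identifiers (spaces, leading digits) still produce valid TypeScript.
    return `  readonly ${JSON.stringify(column.name)}: ${valueType},`;
  });
  return [`export type ${typeName} = {`, ...fieldLines, "};"].join("\n");
}

const rendered = renderRawResponseType("HomePage__raw_response_type", [
  { name: "id", tsType: "number", nullable: false },
  { name: "orbital_period", tsType: "number", nullable: true },
]);
```

Quoting every field name keeps the generator simple: it never has to decide whether a column name is a legal TypeScript identifier.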

2. SQLite Demo (demos/sqlite-demo)

A standalone Vite + React app that uses the SQL pipeline end-to-end:

  • Defines a Star Wars planets schema in schema.sql
  • Uses iso to write a HomePage entrypoint that selects id, name, climate, orbital_period, surface_water
  • The isograph compiler generates query_plan.bin, normalization_ast.ts, and raw_response_type.ts
  • At runtime, the frontend fetches from isograph-server, which runs the Substrait plan against DataFusion and returns Arrow IPC
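The runtime request could look roughly like the sketch below. Only the /query path comes from this PR; the POST method, the JSON body shape (`{ plan: ... }`), the headers and the name `buildQueryRequest` are assumptions made for illustration, so check crates/isograph_server for the actual contract.

```typescript
// Hypothetical request shape, for illustration only.
type QueryRequest = {
  readonly url: string;
  readonly init: {
    readonly method: "POST";
    readonly headers: Readonly<Record<string, string>>;
    readonly body: string;
  };
};

function buildQueryRequest(
  serverBaseUrl: string,
  planBase64: string,
): QueryRequest {
  // Avoid a double slash when the base URL already ends with "/".
  const base = serverBaseUrl.replace(/\/+$/, "");
  return {
    url: `${base}/query`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        // Media type registered for the Arrow IPC streaming format; whether
        // isograph-server honours it is an assumption.
        Accept: "application/vnd.apache.arrow.stream",
      },
      body: JSON.stringify({ plan: planBase64 }),
    },
  };
}

const request = buildQueryRequest("http://localhost:8080/", "AQID");
```

A caller would pass `request.url` and `request.init` to `fetch`, read the response with `arrayBuffer()`, and hand the bytes to an Arrow IPC reader.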

3. TypeScript Type System Unification

Problem: SQL-generated types like { id: number, name: string } didn't satisfy the existing NetworkResponseObject constraint, which used a branded-key mapped type designed only for GraphQL.

Solution: Widened NetworkResponseObject in libs/isograph-react/src/core/cache.ts to use a plain string index signature:

// Before: mapped type with branded keys (GraphQL-only)
export type NetworkResponseObject = {
  readonly [K in ScalarNetworkResponseKey | LinkedNetworkResponseKey]: ...
} & { readonly id?: DataId; };

// After: plain string index signature (GraphQL + SQL)
export type NetworkResponseObject = {
  readonly [key: string]: undefined | NetworkResponseValue;
} & {
  readonly id?: DataId | number;  // SQL uses numeric IDs
  readonly __typename?: TypeName;
};

This is intended to be a non-breaking change: existing GraphQL response types still satisfy the wider constraint because TypeScript typing is structural.
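A small self-contained check of the widened constraint, for reviewers who want to try it in isolation. `DataId`, `TypeName` and `NetworkResponseValue` are simplified stand-ins (assumptions) for the real definitions in cache.ts, and `acceptResponse`, `PlanetRow` and `GraphQLPlanet` are hypothetical names.

```typescript
// Simplified stand-ins for the real types in cache.ts (assumptions).
type DataId = string;
type TypeName = string;
type NetworkResponseValue =
  | string
  | number
  | boolean
  | null
  | readonly unknown[]
  | { readonly [key: string]: unknown };

// The widened constraint, as quoted in this PR description.
type NetworkResponseObject = {
  readonly [key: string]: undefined | NetworkResponseValue;
} & {
  readonly id?: DataId | number;
  readonly __typename?: TypeName;
};

// SQL-shaped row: numeric id, no __typename.
type PlanetRow = {
  readonly id: number;
  readonly name: string;
  readonly climate: string | null;
};

// GraphQL-shaped object: string id plus __typename.
type GraphQLPlanet = {
  readonly __typename: "Planet";
  readonly id: string;
  readonly name: string;
};

// Compiles only if T satisfies the constraint.
function acceptResponse<T extends NetworkResponseObject>(response: T): T {
  return response;
}

const sqlRow = acceptResponse<PlanetRow>({
  id: 1,
  name: "Tatooine",
  climate: "arid",
});
const graphqlNode = acceptResponse<GraphQLPlanet>({
  __typename: "Planet",
  id: "1",
  name: "Tatooine",
});
```

Both calls type-check against the same constraint, which is the property the widening is meant to provide.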

4. CI and Build Fixes

  • swc / serde incompatibility: substrait 0.62 transitively pulled in serde 1.0.220+ which broke swc_common 2.0.1. Fixed by updating swc workspace deps to versions compatible with the new serde.
  • Clippy needless_return: CI runs with RUSTFLAGS=-D warnings; removed the redundant return statement that was failing the build.
  • pnpm frozen-lockfile: @playwright/test had been added to sqlite-demo/package.json without updating the lockfile, so the frozen-lockfile install failed in CI; fixed by running pnpm install to regenerate the lockfile.
  • protoc: Substrait code generation requires protoc; added installation to the CI cargo-test and cargo-clippy jobs.
  • Prettier / oxlint: Formatted files and fixed a type-only import in App.tsx.

5. Test Fixtures and E2E Tests

  • test-fixtures/databases/ — SQLite schema and init scripts for Phase 1–3 test progression
  • demos/sqlite-demo/e2e/phase1-sql.spec.ts — Playwright E2E tests validating:
    • Page loads and displays planet data from SQL query
    • isograph-server /query endpoint is called
    • Error handling and graceful degradation

Key Files

| File | Purpose |
| --- | --- |
| crates/sql_network_protocol/src/sql_network_protocol.rs | NetworkProtocol impl for SQL |
| crates/sql_network_protocol/src/query_generation/logical_plan_builder.rs | MergedSelectionMap → DataFusion LogicalPlan |
| crates/sql_network_protocol/src/substrait/serialize.rs | LogicalPlan → Substrait binary |
| crates/isograph_server/src/main.rs | Axum HTTP server |
| crates/artifact_content/src/entrypoint_artifact.rs | Compiler: generates query_plan.bin |
| crates/artifact_content/src/raw_response_type.rs | Compiler: generates typed raw response types |
| libs/isograph-react/src/core/cache.ts | Runtime: NetworkResponseObject type widening |
| demos/sqlite-demo/ | End-to-end demo |
| test-fixtures/databases/ | SQLite test database fixtures |

What Phase 1 Does NOT Include (Future Phases)

  • WHERE clauses / filtering (Phase 2)

  • JOINs / linked fields (Phase 3)

  • Mutations (Phase 4)

  • Subscriptions / live queries (Phase 5)

  • Connection pooling (isograph-server currently opens a new connection per request)

edmondop force-pushed the feat/sql-datafusion-substrait branch 5 times, most recently from cc01b31 to cb3edb2 (February 26, 2026 17:13)
Implements end-to-end SQL query pipeline for simple SELECT queries:

Compile-time: MergedSelectionMap → DataFusion LogicalPlan → Substrait binary
Runtime: Substrait plan → isograph-server execution → Arrow IPC response

New components:
- crates/sql_network_protocol: NetworkProtocol impl; LogicalPlan builder and
  Substrait serialization; parses SQL schema into Isograph's type model
- crates/sql_lang_types, crates/sql_schema_parser: SQL-specific language types
  and DDL parser
- crates/isograph_server: Axum HTTP server with /query endpoint
- demos/sqlite-demo: end-to-end Vite+React demo using Star Wars planets SQLite
- test-fixtures/databases: SQLite schema and init scripts for Phase 1–3
- E2E Playwright tests for the full pipeline

Type system: widen NetworkResponseObject from branded-key mapped type to plain
string index signature so SQL-generated types (e.g. { id: number, name: string })
satisfy the constraint without @ts-ignore. SQL numeric IDs accepted via
id?: DataId | number. Zero breaking change for existing GraphQL code.

Compiler: add readonly modifiers and field-name quoting to SQL-generated
raw_response_type.ts files.

Build fixes: update swc workspace deps for serde 1.0.220+ compatibility,
fix clippy needless_return, add protoc to CI for substrait code generation,
update pnpm lockfile, fix prettier and oxlint issues.

Co-Authored-By: Claude <noreply@anthropic.com>
edmondop force-pushed the feat/sql-datafusion-substrait branch from cb3edb2 to fd8e190 (February 26, 2026 17:15)
edmondop and others added 4 commits February 26, 2026 14:18
- Add Cross.toml: installs protobuf-compiler inside the cross Docker
  containers used for Linux ARM64 builds (substrait/prost-build requires
  protoc at compile time, and the cross images don't include it)
- Increase build-cli timeout from 15 → 45 minutes: adding substrait +
  DataFusion to the workspace significantly increases compile time on
  Windows and macOS ARM64 runners

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Uses Swatinem/rust-cache@v2 with:
- Per-target shared-key so Linux/macOS/Windows caches don't collide
- cache-on-failure: true so timed-out or failed builds still populate
  the cache for the next run (critical for the slow Windows/macOS ARM64
  targets that were hitting the 15-minute timeout)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Two fixes:
1. Move timeout-minutes: 45 to ci.yml caller jobs — GitHub Actions
   ignores timeout-minutes in called reusable workflows; it must be
   set in the caller. The 15-min limit in build-cli.yml was never
   being overridden by our change there.

2. Install protoc 25.3 from GitHub releases in Cross.toml instead of
   apt's bundled version — the cross Docker image ships protoc < 3.15
   which rejects the `optional` keyword in proto3 syntax used by the
   substrait .proto files. Protoc 25.x supports it.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>