Epi — Epistemic Programming Interface

A type discipline for AI-augmented full-stack applications


Research status: v0.3 — active development, structural validation phase.

Author: Randerson Rebouças — PhD candidate, UFRGS. A short paper is being prepared for SBLP 2026.

What Epi is

Epi is a domain-specific language whose type system makes the epistemic boundary between deterministic computation and AI inference explicit. From a single .epi source, the transpiler generates a complete Next.js project — database schema, API routes, auth middleware, runtime validators, LLM inference calls, and UI components — such that AI-inferred values cannot bypass validation.

The thesis is narrow and honest: LLM hallucination cannot be prevented, but it can be contained at a type-system boundary, by construction, in the generated code.

What problem this solves

In a typical AI-augmented app, validation of LLM outputs, confidence thresholding, fallback handling, and audit trails are all optional: features the developer must remember to add. In practice they are often omitted.

Epi makes them structural consequences of the type. A declaration like

risco: AI.Enum(Alto, Medio, Baixo, strict: true, confidence_threshold: 0.85)

necessarily produces:

  1. A database column (deterministic, Prisma).
  2. A Zod schema constraining the LLM output at runtime.
  3. An LLM inference call with confidence reporting.
  4. A checkpoint route that pauses for human review when confidence is below threshold.

You cannot compile an Epi program with an AI field and forget to validate the output. The transpiler emits the contract.
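
The runtime side of that contract can be pictured as one small decision function. The generated project enforces it in TypeScript with Zod; the Python below only illustrates the logic. The names `Decision` and `check_enum_output`, and the assumed shape of the raw LLM result (a `value` plus a `confidence` in [0, 1]), are hypothetical and not Epi's generated API.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Decision:
    status: str            # "accepted", "needs_review", or "rejected"
    value: Optional[str]   # the validated enum member, if any
    reason: str


def check_enum_output(raw: dict, allowed: Sequence[str], threshold: float) -> Decision:
    """Check a hypothetical LLM result against a strict enum and a confidence threshold."""
    value = raw.get("value")
    confidence = raw.get("confidence")

    # Strict enum: anything outside the declared members is never persisted.
    if value not in allowed:
        return Decision("rejected", None, f"{value!r} is not one of {list(allowed)}")

    # A missing or malformed confidence counts as a validation failure, not as certainty.
    if (
        isinstance(confidence, bool)
        or not isinstance(confidence, (int, float))
        or not 0.0 <= confidence <= 1.0
    ):
        return Decision("rejected", None, "confidence missing or out of range")

    # Below the threshold: pause for human review instead of persisting.
    if confidence < threshold:
        return Decision("needs_review", value, f"confidence {confidence} < {threshold}")

    return Decision("accepted", value, "ok")
```

With the declaration above, `check_enum_output({"value": "Alto", "confidence": 0.6}, ["Alto", "Medio", "Baixo"], 0.85)` would return a `needs_review` decision, which is the case the checkpoint route exists for.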

Example

@Language: Epi v0.3
@Goal: "Contract Analysis with Human-in-the-loop"

Entity Contrato {
    id: UUID(auto),
    titulo: Text,
    documento: Text,
    valor: Decimal,
    criado_em: DateTime(auto),
    risco: AI.Enum(Alto, Medio, Baixo, strict: true)
}
//       ▲ Rigid: deterministic    ▲ Epistemic: AI-inferred, validated at the boundary

Guard SomenteAdvogados {
    Condition: Auth.Role == "Lawyer"
}

Pulse ExtrairRisco {
    Input: Contrato
    Protect: Guard.SomenteAdvogados
    Process:
        Execute: AI.scan(
            source: Input.documento,
            prompt: file("@prompts/legal_scan.md"),
            temperature: 0.1,
            on_fail: Fallback.ManualReview(Queue: "Advogados")
        )
    Output: Contrato.risco
}

Pipeline AnalisarContrato {
    Flow: ExtrairRisco -> GerarResumo -> Notificar
    On_Error: Retry(max: 3, backoff: exponential)
}

Lens Dashboard {
    Mood: "Clean, Legal-Tech"            // [experimental]
    Display:
        Table(Contrato, columns: [titulo, valor, risco]),
        Form(Contrato) -> Button("Analisar").trigger(ExtrairRisco)
}

Fewer than 80 lines. The transpiler generates the entire Next.js project — schema, middleware, routes, validators, LLM calls, UI — with the epistemic contract enforced.
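
The `On_Error: Retry(max: 3, backoff: exponential)` clause in the Pipeline above names a familiar pattern. A minimal Python sketch of it follows; this is not the generated Next.js code. The README does not say whether `max` counts attempts or retries, nor what the base delay is, so the sketch assumes three retries after the first attempt and a hypothetical `base_delay` that doubles each time.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    step: Callable[[], T],
    max_retries: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `step`, retrying up to `max_retries` times with exponentially growing delays."""
    attempt = 0
    while True:
        try:
            return step()
        except Exception:
            if attempt >= max_retries:
                raise  # retries exhausted: surface the error to the caller
            # Delay doubles on every retry: base, 2*base, 4*base, ...
            sleep(base_delay * (2 ** attempt))
            attempt += 1
```

The `sleep` parameter is injectable so the delay schedule can be checked in tests without actually waiting.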

When to use Epi

  • Domains where audit-by-construction is required: legal, healthcare, education, government.
  • Apps where AI-inferred values must be persisted and traceable, not just shown.
  • Focused LLM-augmented products, not general-purpose AI platforms.
  • Settings where human-in-the-loop is structural (Trace + Checkpoint map naturally).
  • Domains with a meaningful prior distribution (Bayesian update genuinely helps).

When NOT to use Epi

  • Conversational chatbots or customer-support assistants.
  • Apps centered on complex RAG, multi-tool agents, or fine-tuning workflows.
  • Teams with a mature in-house AI platform (custom LangGraph setups, internal orchestration).
  • Pure creative generation (confidence and checkpoint don't apply).
  • Latency-critical paths under 100 ms; Epi is for decision-grade flows.

See docs/LIMITATIONS.md for the full honest list of gaps in v0.3.

Architecture

A three-layer transpiler. The LLM is formally excluded from Layers 1 and 2.

.epi source
   ▼
[Layer 1: Parser]            Lark + EBNF              100% deterministic
   ▼
[Layer 2: Rigid Generator]   Prisma, middleware,      100% deterministic
                             routes, Zod validators
   ▼
[Layer 3: Epistemic Gen.]    LLM calls, Trace,        validated by Layer 2
                             Checkpoint, Lens

See docs/ARCHITECTURE.md for the full breakdown.
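
To make the layering concrete: everything up to the point where fields are sorted into rigid and epistemic domains can be ordinary deterministic code with no model in the loop. The sketch below is a toy stand-in, assuming one field per line and using a regular expression instead of the project's Lark grammar. `FIELD_RE`, `parse_fields`, and `split_domains` are hypothetical names, not part of the Epi codebase.

```python
import re
from typing import Dict, List, Tuple

# One field per line: `name: Type` or `name: Type(args)`, with an optional trailing comma.
FIELD_RE = re.compile(r"^\s*(\w+)\s*:\s*([\w.]+)(?:\((.*)\))?\s*,?\s*$")


def parse_fields(entity_body: str) -> List[Tuple[str, str, str]]:
    """Return (name, type, raw_args) triples; raise on any line that does not match."""
    fields = []
    for line in entity_body.strip().splitlines():
        match = FIELD_RE.match(line)
        if match is None:
            raise SyntaxError(f"cannot parse field: {line!r}")
        name, type_name, args = match.groups()
        fields.append((name, type_name, args or ""))
    return fields


def split_domains(fields: List[Tuple[str, str, str]]) -> Dict[str, List[str]]:
    """Partition field names into rigid and epistemic domains by the `AI.` prefix."""
    domains: Dict[str, List[str]] = {"rigid": [], "epistemic": []}
    for name, type_name, _ in fields:
        key = "epistemic" if type_name.startswith("AI.") else "rigid"
        domains[key].append(name)
    return domains
```

Fed the body of the `Contrato` entity, `split_domains` would place `risco` in the epistemic list and every other field in the rigid list, and the same input always yields the same partition.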

Quick start

The PyPI package installs the CLI, but the example .epi files currently live in this repo, so the quick start clones it. A pip-only workflow is planned for v0.4 via epi init.

git clone https://github.com/RandMelville/epi-lang.git
cd epi-lang
pip install epi-lang

# Validate
epi validate examples/contrato.epi

# Transpile to a Next.js project
epi transpile examples/contrato.epi --target nextjs --outdir ./generated

cd generated
npm install
cp .env.example .env
# Edit .env: DATABASE_URL=postgresql://...  and  ANTHROPIC_API_KEY=sk-ant-...
npx prisma migrate dev --name init
npm run dev

For development:

pip install -e ".[dev]"
pytest

The five primitives

| Primitive | What it does | Status |
|---|---|---|
| Entity | Data schema with typed fields, rigid + epistemic | stable |
| Guard | Auth & authorization, transpiles to middleware | stable |
| Pulse | AI execution unit with temperature, prompt, on_fail | stable |
| Pipeline | Composes Pulses with retry/backoff strategy | stable |
| Lens | Semantic UI declaration | Display / Inject stable; Mood experimental |

The epistemic type system

Two domains.

Rigid types — deterministic, no AI involvement:

UUID(auto)   Text   Int   Float   Decimal   Bool   DateTime(auto)   JSON
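
Because rigid types involve no inference, turning them into a Prisma model can be a plain lookup. The sketch below shows one plausible mapping. The Prisma scalar types and attributes on the right-hand side are real Prisma syntax, but the mapping itself and the `prisma_model` helper are assumptions, not the transpiler's actual output.

```python
from typing import List, Tuple

# Assumed mapping from Epi rigid types to Prisma scalar types and attributes.
RIGID_TO_PRISMA = {
    "UUID(auto)": "String @id @default(uuid())",
    "Text": "String",
    "Int": "Int",
    "Float": "Float",
    "Decimal": "Decimal",
    "Bool": "Boolean",
    "DateTime(auto)": "DateTime @default(now())",
    "JSON": "Json",
}


def prisma_model(name: str, fields: List[Tuple[str, str]]) -> str:
    """Render a Prisma model block from (field_name, epi_type) pairs; rigid types only."""
    lines = [f"model {name} {{"]
    for field_name, epi_type in fields:
        if epi_type not in RIGID_TO_PRISMA:
            raise ValueError(f"not a rigid type: {epi_type}")
        lines.append(f"  {field_name} {RIGID_TO_PRISMA[epi_type]}")
    lines.append("}")
    return "\n".join(lines)
```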

Epistemic types — AI-inferred, runtime-validated:

AI.Enum(values..., strict, prior, confidence_threshold)
AI.Text(max_tokens)
AI.Classification(labels)
AI.Score(min, max)
AI.Embedding(dimensions)

A single epistemic declaration generates a database column, a Zod validator, an LLM inference call, and optionally a checkpoint route. If it compiles, the runtime contract is enforced.
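
The `prior` parameter on `AI.Enum` suggests a Bayesian combination of domain knowledge with the model's output. This README does not spell out how Epi applies it, so the following is only textbook Bayes' rule over a finite label set. Treating the model's per-label scores as likelihoods is an assumption, and `bayes_update` is a hypothetical name.

```python
from typing import Dict


def bayes_update(prior: Dict[str, float], likelihood: Dict[str, float]) -> Dict[str, float]:
    """Posterior over labels: proportional to prior * likelihood, renormalised to sum to 1."""
    # Labels missing from the likelihood are treated as having zero support.
    unnormalised = {label: prior[label] * likelihood.get(label, 0.0) for label in prior}
    total = sum(unnormalised.values())
    if total == 0.0:
        raise ValueError("prior and likelihood share no support")
    return {label: weight / total for label, weight in unnormalised.items()}
```

For example, with a prior of `{"Alto": 0.1, "Medio": 0.3, "Baixo": 0.6}` and model scores of `{"Alto": 0.5, "Medio": 0.3, "Baixo": 0.2}`, the posterior still ranks `Baixo` highest even though the raw scores favor `Alto`. That kind of correction only helps when the prior reflects the domain.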

Trace + Checkpoint (v0.3 highlight)

A Pulse can be decomposed into Trace steps. Each step can expose intermediate reasoning fields (Expose:) and pause for human review (Checkpoint:) before the final output is committed.

Pulse AvaliarRespostaAluno {
    Trace CompreenderEnunciado {
        Execute: AI.reason(source: Input.enunciado, prompt: file("@prompts/..."))
        Expose: interpretacao, conceitos_chave, criterios_avaliacao
        Checkpoint: ReviewRequired(role: "Professor")
    }

    Trace AvaliarResposta {
        Execute: AI.classify(
            source: Input.resposta_aluno,
            confidence_threshold: 0.85,
            on_low_confidence: Checkpoint.ReviewRequired(role: "Professor")
        )
    }
}

This generates an in-memory TraceState store, inspect / resume HTTP routes, and an audit trail of every human approval or correction. Designed for high-stakes evaluation flows (pedagogical assessment, legal review, clinical triage).
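
The store-plus-routes arrangement described above can be sketched as a small in-memory class with pause, inspect, and resume operations. This is a Python illustration under stated assumptions, not the generated TypeScript: the `TraceState` fields, the `TraceStore` method names, and the audit record shape are all hypothetical, and timestamps are omitted for brevity.

```python
import uuid
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class TraceState:
    exposed: Dict[str, Any]      # intermediate fields shown to the reviewer
    required_role: str           # role allowed to resume this trace
    status: str = "paused"       # "paused" until a reviewer resumes it
    audit: List[Dict[str, Any]] = field(default_factory=list)


class TraceStore:
    """In-memory store; its contents are lost when the process restarts."""

    def __init__(self) -> None:
        self._traces: Dict[str, TraceState] = {}

    def pause(self, exposed: Dict[str, Any], required_role: str) -> str:
        trace_id = str(uuid.uuid4())
        self._traces[trace_id] = TraceState(dict(exposed), required_role)
        return trace_id

    def inspect(self, trace_id: str) -> TraceState:
        return self._traces[trace_id]

    def resume(self, trace_id: str, reviewer: str, role: str,
               corrections: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
        state = self._traces[trace_id]
        if state.status != "paused":
            raise RuntimeError("trace already resumed")
        if role != state.required_role:
            raise PermissionError(f"role {role!r} cannot review this checkpoint")
        # Record whether the reviewer approved as-is or corrected the exposed fields.
        action = "corrected" if corrections else "approved"
        state.exposed.update(corrections or {})
        state.audit.append(
            {"reviewer": reviewer, "action": action, "changes": dict(corrections or {})}
        )
        state.status = "resumed"
        return state.exposed
```

In an HTTP setting, `inspect` and `resume` would sit behind the corresponding routes, with the role taken from the authenticated session rather than passed by the caller.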

Documentation

| Document | Purpose |
|---|---|
| docs/SPEC.md | Formal language specification (English, canonical) |
| docs/ARCHITECTURE.md | Transpiler architecture and design decisions |
| docs/MANIFESTO.md | Why epistemic types matter |
| docs/LIMITATIONS.md | What Epi does NOT do (honest list of gaps) |
| docs/CONTRIBUTING.md | How to contribute |
| docs/PAPER.md | Paper draft (SBLP 2026 in preparation) |
| docs/translations/SPEC-PT.md | Portuguese translation (may lag) |

Status

Stable in v0.3:

  • EBNF grammar (Lark)
  • AST with epistemic type system (Pydantic)
  • Parser + Lark transformer
  • Deterministic generators: Prisma schema, middleware, routes, Zod validators
  • Epistemic generators: LLM calls (Anthropic), Trace + Checkpoint, Bayesian prior
  • CLI: validate, parse, transpile
  • 121 tests passing
  • PyPI package (pip install epi-lang)

Experimental — known to be incomplete:

  • Lens.Mood — deterministic keyword-to-Tailwind lookup of 6 hardcoded moods. Not LLM-generated UI; do not rely on it.
  • --target fastapi — blocked in the CLI; current output is inconsistent.

Planned for v0.4:

  • Multi-provider LLM adapter (Ollama, OpenAI, Gemini, Anthropic) — currently Anthropic-only.
  • epi init for project bootstrap without cloning the repo.
  • FastAPI target completion (or formal removal).
  • Empirical evaluation study (Epi-generated vs hand-written equivalents).
  • SQLite option for quickstart without local Postgres.

Related work

| System | Relationship to Epi |
|---|---|
| ProbZelus (PLDI 2020) | Separates deterministic and probabilistic reactive streams. Epi lifts the separation to application-level types. |
| SlicStan (POPL 2019) | Information-flow types for probabilistic programs. Epi adapts the discipline to AI-augmented full-stack apps. |
| Russo & Sabelfeld, Dynamic vs. Static Flow-Sensitive Security Analysis (CSF 2010) | Information-flow control as type-level separation of security levels; a structural analog to Epi's rigid/epistemic separation. |
| BAML | Typed LLM function signatures. Epi extends the idea to whole-application generation. |
| Wasp | Full-stack DSL (React + Node). Epi adds epistemic types and AI-aware code generation. |
| DSPy (Stanford) | Declarative LLM programming via signatures. Orthogonal to Epi: DSPy optimizes prompts, Epi separates type domains. |

Citation

@software{reboucas2026epi,
  author  = {Rebouças, Randerson},
  title   = {Epi: An Epistemic Programming Interface for AI-Augmented Full-Stack Transpilation},
  year    = {2026},
  url     = {https://github.com/RandMelville/epi-lang},
  version = {0.3.0}
}

Contributing

See docs/CONTRIBUTING.md. Honest critique preferred over wishful documentation.

License

Apache License 2.0 — Copyright (c) 2026 Randerson Rebouças.
