LLMNet Logo

The future is small models routed to — intelligently.


Orchestrate multiple LLMs into intelligent, layered pipelines with a single configuration file

Quick Start · Examples · Configuration · CLI


What is llmnet?

llmnet creates layered AI pipelines where requests flow through multiple models with intelligent routing at each stage. Think of it as a neural network, but each "neuron" is an LLM.

User Query → Router → [Expert A | Expert B | Expert C] → Refiner → Response

Why llmnet?

  • Cost optimization: Route simple queries to cheap models, complex ones to powerful models
  • Specialization: Use domain-specific fine-tuned models for different query types
  • Quality: Add refinement layers to polish responses before delivery
  • Flexibility: Swap models without changing code—just update the config

Quick Start

Installation

git clone https://github.com/Avarok-Cybersecurity/llmnet.git
cd llmnet
cargo build --release

Run Your First Pipeline

# Validate configuration
./target/release/llmnet validate examples/basic-chatbot.json

# Start the server
./target/release/llmnet run examples/basic-chatbot.json

Deploy to a Cluster

# Start the control plane (in one terminal)
./target/release/llmnet serve --control-plane

# Deploy a pipeline (in another terminal)
./target/release/llmnet deploy examples/basic-chatbot.json

# Check status
./target/release/llmnet get pipelines

Make a Request

curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "llmnet", "messages": [{"role": "user", "content": "Hello!"}]}'

Example Guides

Each example includes real-world use cases showing how the same topology applies to different industries.

Example Topology Description Guide
Basic Chatbot 1-0-1 Simple LLM proxy 📖 Guide
Dual Expert 1-2-1 Route to specialized handlers 📖 Guide
OpenRouter Pipeline 1-2-1 Cloud-native with free models 📖 Guide
Multi-Layer Pipeline 1-2-1-1 Add refinement layer 📖 Guide
Conditional Routing 1-2-1 Route by input characteristics 📖 Guide
Nemotron Router 1-2-2-1 Enterprise with edge cases 📖 Guide
Calculator with Hooks 1-2-1 Validation hooks demo 📖 Guide

Quick Topology Reference

1-0-1:   User → LLM → Response                     (Basic proxy)
1-2-1:   User → Router → [A|B] → Response          (Dual expert)
1-2-1-1: User → Router → [A|B] → Refiner → Response (With refinement)
1-2-2-1: User → Router → [A|B] → [C|D] → Response  (Deep pipeline)
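
As a rough sketch of how a topology maps to configuration (the model names, URLs, and use-case text are placeholders; see the Configuration Reference below for the full field list), a 1-2-1 pipeline could be declared like this:

{
  "models": {
    "router-model": { "type": "external", "interface": "openai-api", "url": "http://localhost:11434", "api-key": null },
    "code-model": { "type": "external", "interface": "openai-api", "url": "http://localhost:11434", "api-key": null },
    "chat-model": { "type": "external", "interface": "openai-api", "url": "http://localhost:11434", "api-key": null }
  },
  "architecture": [
    {
      "name": "router",
      "layer": 0,
      "model": "router-model",
      "adapter": "openai-api",
      "output-to": [1]
    },
    {
      "name": "code-expert",
      "layer": 1,
      "model": "code-model",
      "adapter": "openai-api",
      "use-case": "Programming and code questions",
      "output-to": ["output"]
    },
    {
      "name": "general-expert",
      "layer": 1,
      "model": "chat-model",
      "adapter": "openai-api",
      "use-case": "General conversation",
      "output-to": ["output"]
    },
    {
      "name": "output",
      "adapter": "output"
    }
  ]
}

The layer 0 router reads each expert's use-case description to decide which layer 1 node handles the request; the chosen expert then forwards its answer to the output node.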

How It Works

Architecture

flowchart TB
    subgraph Layer0["Layer 0: Input"]
        R[Router Model]
    end

    subgraph Layer1["Layer 1: Specialists"]
        A[Expert A]
        B[Expert B]
    end

    subgraph Layer2["Layer 2: Refinement"]
        REF[Refiner]
    end

    subgraph Output["Output Layer"]
        O[Response]
    end

    R -->|"Analyzes intent"| A
    R -->|"Routes query"| B
    A --> REF
    B --> REF
    REF --> O

    style R fill:#e1f5fe
    style REF fill:#fff3e0
    style O fill:#c8e6c9

Core Concepts

Concept Description
Layer A stage in the pipeline (0 = input, higher = deeper)
Node A model endpoint within a layer
Router Layer 0 model that selects which downstream node handles the request
Condition Rule using system variables ($WORD_COUNT > 10) to filter targets
Adapter Protocol: openai-api, output, or ws (WebSocket)
Hooks Pre/post execution logic (observe or transform mode)
Functions Reusable operations: REST, Shell, WebSocket, gRPC
Secrets Credentials from env files, system env, or Vault

System Variables

Available in conditions and hooks:

Variable Context Description
$INPUT Pre/Post hooks Current input content
$OUTPUT Post hooks only LLM output
$NODE Pre/Post hooks Current node name
$PREV_NODE All Previous node name
$WORD_COUNT All Number of words in input
$INPUT_LENGTH All Character count
$HOP_COUNT All Number of hops so far
$TIMESTAMP All ISO 8601 timestamp
$REQUEST_ID All Unique request UUID
$secrets.* Functions Secret values

See Conditional Routing Guide for full documentation.
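
As an illustrative sketch (node and model names are placeholders), a rule-based condition can use these variables to restrict which node receives a request:

{
  "name": "long-form-expert",
  "layer": 1,
  "model": "my-model",
  "adapter": "openai-api",
  "if": "$WORD_COUNT > 10",
  "use-case": "Detailed, multi-sentence questions",
  "output-to": ["output"]
}

With this condition the node is only considered when the incoming input is longer than ten words.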


Configuration Reference

Minimal Example

{
  "models": {
    "my-model": {
      "type": "external",
      "interface": "openai-api",
      "url": "http://localhost:11434",
      "api-key": null
    }
  },
  "architecture": [
    {
      "name": "chat",
      "layer": 0,
      "model": "my-model",
      "adapter": "openai-api",
      "output-to": ["output"]
    },
    {
      "name": "output",
      "adapter": "output"
    }
  ]
}

Models Section

{
  "models": {
    "<model-name>": {
      "type": "external",
      "interface": "openai-api",
      "url": "<endpoint-url>",
      "api-key": "<optional-key-or-$ENV_VAR>"
    }
  }
}

Architecture Section

{
  "architecture": [
    {
      "name": "<unique-name>",
      "layer": 0,
      "model": "<model-name>",
      "adapter": "openai-api",
      "bind-addr": "0.0.0.0",
      "bind-port": "8080",
      "output-to": [1],
      "use-case": "Description for router",
      "if": "$WORD_COUNT > 10",
      "extra-options": {
        "model_override": "specific-model-id"
      }
    }
  ]
}

Field Type Description
name string Unique node identifier
layer number Pipeline stage (0 = input)
model string? Reference to models section
adapter string openai-api, output, or ws
output-to array Layer numbers [1] or node names ["output"]
use-case string? Description for LLM-based routing
if string? Condition for rule-based routing
hooks object? Pre/post hooks for the node

Secrets Section

Load credentials from various sources:

{
  "secrets": {
    "api-creds": {
      "source": "env-file",
      "path": "~/.config/llmnet/.env",
      "variables": ["API_KEY", "API_SECRET"]
    },
    "hf-token": {
      "source": "env",
      "variable": "HF_TOKEN"
    },
    "vault-secrets": {
      "source": "vault",
      "address": "https://vault.example.com",
      "path": "secret/data/llmnet/api"
    }
  }
}

Reference secrets using $secrets.{name}.{variable}:

"api-key": "$secrets.api-creds.API_KEY"

Functions Section

Define reusable operations for hooks:

{
  "functions": {
    "log-request": {
      "type": "rest",
      "method": "POST",
      "url": "https://api.example.com/log",
      "headers": {"Authorization": "Bearer $secrets.api.TOKEN"},
      "body": {"node": "$NODE", "input": "$INPUT"}
    },
    "validate-output": {
      "type": "shell",
      "command": "python",
      "args": ["validate.py", "--input", "$OUTPUT"],
      "timeout": 10
    }
  }
}

Type Description
rest HTTP requests (GET, POST, PUT, PATCH, DELETE)
shell Execute local commands
websocket Send WebSocket messages
grpc Call gRPC services

Hooks Section

Execute logic before/after LLM calls:

{
  "architecture": [
    {
      "name": "processor",
      "hooks": {
        "pre": [
          {"function": "log-request", "mode": "observe"}
        ],
        "post": [
          {"function": "validate-output", "mode": "transform", "on_failure": "abort"}
        ]
      }
    }
  ]
}

Mode Behavior
observe Fire-and-forget, doesn't affect pipeline
transform Waits for result, can modify data

on_failure Behavior
continue Log error, proceed with original data
abort Stop pipeline, return error
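
As a variant of the example above (a sketch reusing the functions defined earlier), a transform hook can also be configured to fall back to the original data rather than aborting:

{
  "hooks": {
    "post": [
      {"function": "validate-output", "mode": "transform", "on_failure": "continue"}
    ]
  }
}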

CLI Reference

llmnet provides a kubectl-like interface for managing LLM pipelines across local and remote clusters.

Core Commands

# Run a local pipeline (legacy mode)
llmnet run config.json

# Validate a configuration
llmnet validate config.json

# Start the control plane server
llmnet serve --control-plane

# Deploy a pipeline to the current context
llmnet deploy pipeline.yaml

# List resources
llmnet get pipelines
llmnet get nodes
llmnet get namespaces

# Scale a pipeline
llmnet scale my-pipeline --replicas 3

# Delete resources
llmnet delete pipeline my-pipeline

# View cluster status
llmnet status

Context Management

Manage connections to multiple LLMNet clusters:

# List available contexts
llmnet context list

# Add a remote cluster context
llmnet context add my-cluster --url http://10.0.0.1:8181

# Switch to a context
llmnet context use my-cluster

# Show current context
llmnet context current

Global Options

  -v, --verbose...      Increase logging verbosity
      --config <PATH>   Path to config file (default: ~/.llmnet/config)
  -h, --help            Print help
  -V, --version         Print version

Legacy Mode

For backwards compatibility, you can still run pipelines directly:

# Run with dry-run
llmnet run --dry-run config.json

# Override port
llmnet run --port 9000 config.json

# Load API keys from .env
llmnet run --env-file .env.production config.json

Client Usage

Python

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")
response = client.chat.completions.create(
    model="llmnet",
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)

Node.js

import OpenAI from 'openai';

const client = new OpenAI({ baseURL: 'http://localhost:8080/v1', apiKey: 'not-needed' });
const response = await client.chat.completions.create({
  model: 'llmnet',
  messages: [{ role: 'user', content: 'Hello!' }],
});
console.log(response.choices[0].message.content);

curl

curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "llmnet", "messages": [{"role": "user", "content": "Hello!"}]}'

Advanced Topics

Using OpenRouter

Access multiple models through a single API:

{
  "models": {
    "router": {
      "url": "https://openrouter.ai/api",
      "api-key": "$OPENROUTER_API_KEY"
    }
  },
  "architecture": [
    {
      "name": "router",
      "extra-options": {
        "model_override": "google/gemma-3-27b-it:free"
      }
    }
  ]
}

See OpenRouter Pipeline Guide.

WebSocket Output

Stream responses or alerts to WebSocket endpoints:

{
  "name": "alert-ws",
  "if": "$AlertRequired",
  "adapter": "ws",
  "url": "ws://alerts:3000"
}

Nemotron Router

For intelligent routing, we recommend NVIDIA's Nemotron-Orchestrator-8B:

{
  "models": {
    "nemotron": {
      "url": "http://localhost:44443"
    }
  }
}

See Nemotron Router Guide and nemotron-router-8b.md.


License

MIT License - see LICENSE for details.


Built with Rust
