LangCore LiteLLM

Provider plugin for LangCore — access 100+ language models through a single, unified interface via LiteLLM.

Overview

langcore-litellm is a provider plugin for LangCore that adds support for 100+ language models through LiteLLM's unified API. Install it, prefix your model ID with litellm/, and every LangCore extraction call routes through LiteLLM transparently — auto-discovered via Python entry points.

Features

100+ model support — OpenAI, Anthropic, Google, Azure, Mistral, Groq, Cohere, HuggingFace, Ollama, vLLM, and more through a single provider
Native async — uses litellm.acompletion() with asyncio.Semaphore for true non-blocking concurrent I/O (no thread pool overhead)
Multi-pass cache bypass — automatic per-pass cache control ensures fresh LLM responses on repeat extraction passes while keeping the first pass cacheable
Token usage tracking — captures prompt, completion, and total token counts (UsageStats) from every inference call
Concurrency control — configurable max_workers semaphore limits parallel async requests to prevent rate-limit errors
Zero-config plugin — auto-registered via Python entry points; no manual wiring required
Full parameter passthrough — forward any LiteLLM-supported parameter (temperature, top_p, timeout, etc.) through provider_kwargs

Installation

pip install langcore-litellm

Or install from source:

git clone https://github.com/JustStas/langcore-litellm
cd langcore-litellm
pip install -e .

Quick Start

Integration with LangCore

langcore-litellm integrates with LangCore through the provider plugin system. Create a model configuration, and LangCore handles the rest:

import langcore as lx

# Configure the LiteLLM provider
config = lx.factory.ModelConfig(
    model_id="litellm/gpt-4o",
    provider="LiteLLMLanguageModel",
)
model = lx.factory.create_model(config)

# Use with any LangCore extraction
result = lx.extract(
    text_or_documents="Acme Corp agrees to pay $50,000 to Beta LLC by March 2025.",
    model=model,
    prompt_description="Extract parties, monetary amounts, and dates.",
    examples=[
        lx.data.ExampleData(
            text="Alpha Inc will pay $10,000 to Omega Ltd by January 2024.",
            extractions=[
                lx.data.Extraction("party", "Alpha Inc", attributes={"role": "payer"}),
                lx.data.Extraction("party", "Omega Ltd", attributes={"role": "payee"}),
                lx.data.Extraction("monetary_amount", "$10,000"),
                lx.data.Extraction("date", "January 2024", attributes={"type": "deadline"}),
            ],
        )
    ],
)

print(result)

Usage

Supported Models

Model IDs must be prefixed with litellm/ (or litellm-) to route through this provider:

Provider	Example Model IDs
OpenAI	`litellm/gpt-4o`, `litellm/gpt-4o-mini`, `litellm/gpt-4-turbo`
Anthropic	`litellm/claude-3-opus`, `litellm/claude-3.5-sonnet`, `litellm/claude-3-haiku`
Google	`litellm/gemini-2.5-pro`, `litellm/gemini-2.0-flash`
Azure OpenAI	`litellm/azure/your-deployment-name`
Mistral	`litellm/mistral-large-latest`
Groq	`litellm/groq/llama-3.1-70b`
Ollama	`litellm/ollama/llama3.1`
And 100+ more	See LiteLLM providers

Environment Variables

Set the appropriate API key for your provider:

# OpenAI
export OPENAI_API_KEY="sk-..."

# Anthropic
export ANTHROPIC_API_KEY="sk-ant-..."

# Google (Gemini)
export GEMINI_API_KEY="..."

# Azure OpenAI
export AZURE_API_KEY="..."
export AZURE_API_BASE="https://your-resource.openai.azure.com/"
export AZURE_API_VERSION="2024-02-01"

# Ollama (local)
export OLLAMA_API_BASE="http://localhost:11434"

See the LiteLLM documentation for the full list of provider-specific variables.

Async Extraction

The provider uses litellm.acompletion() for native async, avoiding thread overhead:

import asyncio
import langcore as lx

config = lx.factory.ModelConfig(
    model_id="litellm/gpt-4o",
    provider="LiteLLMLanguageModel",
)
model = lx.factory.create_model(config)

async def main():
    result = await lx.async_extract(
        text_or_documents="Agreement between Acme Corp and Beta LLC...",
        model=model,
        prompt_description="Extract parties and obligations.",
        examples=[...],
    )
    print(result)

asyncio.run(main())

Multi-Pass Extraction

When running multiple extraction passes (extraction_passes > 1), the provider automatically manages cache behaviour:

Pass	Cache Behaviour
1st pass	Normal — may be served from cache
2nd pass	Bypass — forces a fresh LLM response
3rd+ passes	Bypass — forces a fresh LLM response

This is fully automatic. LangCore threads a pass_num argument to the provider, which injects cache={"no-cache": True} for passes ≥ 1:

result = lx.extract(
    text_or_documents="Contract text...",
    model=model,
    prompt_description="Extract all entities.",
    examples=[...],
    extraction_passes=3,  # 3 passes, only the first is cacheable
)

Advanced Configuration

Forward any LiteLLM parameter through provider_kwargs:

config = lx.factory.ModelConfig(
    model_id="litellm/gpt-4o",
    provider="LiteLLMLanguageModel",
    provider_kwargs={
        "temperature": 0.2,
        "max_tokens": 2000,
        "top_p": 0.9,
        "timeout": 60,
        "api_key": "sk-...",  # override env var
    },
)
model = lx.factory.create_model(config)

Reserved Parameters

These parameters are consumed internally and are not forwarded to LiteLLM:

Parameter	Type	Default	Description
`max_workers`	`int`	`10`	Maximum concurrent async requests (semaphore size)
`pass_num`	`int`	`0`	Current extraction pass index — set automatically by LangCore during multi-pass extraction

Composing with Other Plugins

langcore-litellm serves as the base provider that other LangCore plugins wrap. Stack decorators to add audit logging, output validation, or hybrid rule-based extraction:

import langcore as lx
from langcore_audit import AuditLanguageModel, LoggingSink
from langcore_guardrails import GuardrailLanguageModel, SchemaValidator, OnFailAction

# Base LLM provider
config = lx.factory.ModelConfig(
    model_id="litellm/gpt-4o",
    provider="LiteLLMLanguageModel",
)
llm = lx.factory.create_model(config)

# Add output validation
guarded = GuardrailLanguageModel(
    model_id="guardrails/gpt-4o",
    inner=llm,
    validators=[SchemaValidator(MySchema, on_fail=OnFailAction.REASK)],
    max_retries=3,
)

# Add audit logging
audited = AuditLanguageModel(
    model_id="audit/gpt-4o",
    inner=guarded,
    sinks=[LoggingSink()],
)

# Use the full stack with LangCore
result = lx.extract(
    text_or_documents="Contract text...",
    model=audited,
    prompt_description="Extract entities.",
    examples=[...],
)

Output Format

Extractions return LangCore's standard AnnotatedDocument with precise character intervals:

AnnotatedDocument(
    extractions=[
        Extraction(
            extraction_class='party',
            extraction_text='Acme Corp',
            char_interval=CharInterval(start_pos=0, end_pos=9),
            alignment_status=<AlignmentStatus.MATCH_EXACT: 'match_exact'>,
            attributes={'role': 'payer'}
        ),
        Extraction(
            extraction_class='monetary_amount',
            extraction_text='$50,000',
            char_interval=CharInterval(start_pos=24, end_pos=31),
            alignment_status=<AlignmentStatus.MATCH_EXACT: 'match_exact'>,
            attributes={}
        ),
    ],
    text='Acme Corp agrees to pay $50,000 to Beta LLC by March 2025.'
)

Error Handling

API failures are captured gracefully — no unhandled exceptions:

ScoredOutput(score=0.0, output="LiteLLM API error: [error details]")

Development

pip install -e .            # Install in development mode
python test_plugin.py       # Run tests
python -m build             # Build package
twine upload dist/*         # Publish to PyPI

Requirements

Python ≥ 3.12
langcore
litellm ≥ 1.81.13

License

Apache License 2.0 — see LICENSE for details.

Name		Name	Last commit message	Last commit date
Latest commit History 18 Commits
.github/workflows		.github/workflows
langcore_litellm		langcore_litellm
tests		tests
.gitignore		.gitignore
CHANGELOG.md		CHANGELOG.md
LICENSE		LICENSE
README.md		README.md
conftest.py		conftest.py
pyproject.toml		pyproject.toml
uv.lock		uv.lock

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

LangCore LiteLLM

Overview

Features

Installation

Quick Start

Integration with LangCore

Usage

Supported Models

Environment Variables

Async Extraction

Multi-Pass Extraction

Advanced Configuration

Reserved Parameters

Composing with Other Plugins

Output Format

Error Handling

Development

Requirements

License

About

Uh oh!

Releases 8

Packages

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

LangCore LiteLLM

Overview

Features

Installation

Quick Start

Integration with LangCore

Usage

Supported Models

Environment Variables

Async Extraction

Multi-Pass Extraction

Advanced Configuration

Reserved Parameters

Composing with Other Plugins

Output Format

Error Handling

Development

Requirements

License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 8

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Packages