A minimal Python library for interacting with various LLM providers, featuring automatic API key load balancing and streaming responses.
Install from PyPI:

```bash
pip install smolllm
```

Or, for local development with uv:

```bash
uv add "smolllm @ ../smolllm"
```
```python
from dotenv import load_dotenv
import asyncio
from smolllm import ask_llm

# Load environment variables at your application startup
load_dotenv()

async def main():
    response = await ask_llm(
        "Say hello world",
        model="gemini/gemini-2.0-flash",
    )
    print(response)

if __name__ == "__main__":
    asyncio.run(main())
```
Model strings use the format `provider/model_name` (e.g., `openai/gpt-4`, `gemini/gemini-2.0-flash`).
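For illustration, a string in this format splits cleanly on the first `/`. The sketch below is hypothetical and not part of SmolLLM's API:

```python
def split_model_string(model: str) -> tuple[str, str]:
    """Split 'provider/model_name' into (provider, model_name).

    Hypothetical helper for illustration only.
    """
    provider, sep, name = model.partition("/")
    if not sep or not provider or not name:
        raise ValueError(f"expected 'provider/model_name', got {model!r}")
    return provider, name

# split_model_string("gemini/gemini-2.0-flash") -> ("gemini", "gemini-2.0-flash")
```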
The library looks for API keys in environment variables following the pattern `{PROVIDER}_API_KEY`. Example:

```bash
# .env
OPENAI_API_KEY=sk-xxx
GEMINI_API_KEY=key1,key2 # Multiple keys supported
```
Override default API endpoints using `{PROVIDER}_BASE_URL`. Example:

```bash
OPENAI_BASE_URL=https://custom.openai.com/v1
OLLAMA_BASE_URL=http://localhost:11434/v1
```
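Both variables accept comma-separated values. As a minimal sketch, such a variable could be read like this (the helper name `read_env_list` is an assumption, not SmolLLM's API):

```python
import os

def read_env_list(name: str) -> list[str]:
    """Split a comma-separated environment variable into a list.

    Hypothetical helper for illustration; not part of SmolLLM's API.
    """
    return [item for item in os.environ.get(name, "").split(",") if item]

# With GEMINI_API_KEY=key1,key2 this returns ["key1", "key2"]
```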
You can combine multiple keys and base URLs in several ways:

- One key with multiple base URLs:

  ```bash
  OLLAMA_API_KEY=ollama
  OLLAMA_BASE_URL=http://localhost:11434/v1,http://other-server:11434/v1
  ```

- Multiple keys with one base URL:

  ```bash
  GEMINI_API_KEY=key1,key2
  GEMINI_BASE_URL=https://api.gemini.com/v1
  ```

- Paired keys and base URLs. You must supply an equal number of keys and URLs; the library randomly selects a matching pair (see the sketch after this list):

  ```bash
  GEMINI_API_KEY=key1,key2
  GEMINI_BASE_URL=https://api.gemini.com/v1,https://api.gemini.com/v2
  ```
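To make the selection rules concrete, here is a minimal sketch of how a (key, URL) pair could be chosen. It assumes both variables are set and non-empty; the function name `pick_credentials` is hypothetical, not SmolLLM's actual internals:

```python
import os
import random

def pick_credentials(provider: str) -> tuple[str, str]:
    """Pick an (api_key, base_url) pair for a provider.

    Hypothetical sketch of the selection rules above, not SmolLLM's
    real implementation. Assumes both variables are set and non-empty.
    """
    keys = [k for k in os.environ[f"{provider.upper()}_API_KEY"].split(",") if k]
    urls = [u for u in os.environ[f"{provider.upper()}_BASE_URL"].split(",") if u]
    if len(keys) == len(urls):
        # Paired mode: keys[i] always goes with urls[i]
        return random.choice(list(zip(keys, urls)))
    # Otherwise choose the key and the URL independently
    return random.choice(keys), random.choice(urls)
```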
When using SmolLLM in your project, handle environment variables at the application level:

- Create a `.env` file:

  ```bash
  # .env
  OPENAI_API_KEY=sk-xxx
  GEMINI_API_KEY=xxx,xxx2
  ANTHROPIC_API_KEY=sk-xxx
  ```
- Load environment variables before using SmolLLM:

  ```python
  from dotenv import load_dotenv

  # Load at your application startup
  load_dotenv()

  # Now you can use SmolLLM
  from smolllm import ask_llm
  ```
- Keep sensitive API keys in `.env` (and add it to `.gitignore`)
- Create a `.env.example` for documentation (example below)
- For production, consider using your platform's secret management system
- When using multiple keys, separate them with commas (no spaces)
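A `.env.example` might look like this (all values are placeholders):

```bash
# .env.example: copy to .env and fill in real values
OPENAI_API_KEY=sk-your-key-here
GEMINI_API_KEY=key1,key2
ANTHROPIC_API_KEY=sk-your-key-here
```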