Cerebras Systems builds the world's largest computer chip - the Wafer Scale Engine (WSE) - designed specifically for AI workloads. This cookbook provides comprehensive examples, tutorials, and best practices for developing and deploying AI models using Cerebras infrastructure, including both training on WSE clusters and fast inference via Cerebras Cloud.
| Section | Description |
|---|---|
| Get Started | SDK setup guides, authentication, and model exploration examples |
| Agents | AI agent implementations using Cerebras with CrewAI and Agno frameworks |
| Chat with Data | RAG implementations, multilingual PDF chat, and fusion techniques |
| Integrations | Framework integrations including LiteLLM for seamless API switching |
| Starter Apps | Production-ready applications: AI book writer and mindmap generator |
Get your API key by visiting Cerebras Cloud.
The chat completions endpoint is https://api.cerebras.ai/v1/chat/completions
- A Cerebras account
- A Cerebras Inference API key
- Python 3.7+ or Node.js 14+
export CEREBRAS_API_KEY="your-api-key-here"

pip install --upgrade cerebras_cloud_sdk

import os
from cerebras.cloud.sdk import Cerebras

client = Cerebras(
    api_key=os.environ.get("CEREBRAS_API_KEY"),
)

chat_completion = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "Explain the advantages of wafer-scale computing for AI",
        }
    ],
    model="llama-4-scout-17b-16e-instruct",
)

print(chat_completion.choices[0].message.content)

curl -X POST "https://api.cerebras.ai/v1/chat/completions" \
  -H "Authorization: Bearer $CEREBRAS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-4-scout-17b-16e-instruct",
    "messages": [
      {"role": "user", "content": "What makes Cerebras WSE unique for AI training?"}
    ],
    "max_tokens": 1024,
    "temperature": 0.7
  }'

import Cerebras from '@cerebras/cerebras_cloud_sdk';
const client = new Cerebras({
  apiKey: process.env.CEREBRAS_API_KEY,
});

async function main() {
  const chatCompletion = await client.chat.completions.create({
    messages: [
      {
        role: 'user',
        content: 'How does wafer-scale architecture improve AI performance?',
      },
    ],
    model: 'llama-4-scout-17b-16e-instruct',
  });
  console.log(chatCompletion.choices[0].message.content);
}

main();

- Fast Inference: Ultra-low latency inference with Cerebras Cloud
- Model Zoo Integration: Access to optimized pre-trained models
- OpenAI-compatible API: Easy migration from OpenAI to Cerebras
- Streaming Support: Real-time response streaming capabilities
- Enterprise Ready: Production-grade infrastructure and support
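
The streaming and OpenAI-compatibility points above can be illustrated with a small sketch. It assumes the streamed response follows the OpenAI-style server-sent event format: `data: {json}` lines, each carrying a `choices[].delta.content` fragment, ending with a `data: [DONE]` sentinel. That wire format is an assumption here, not something this README states, and `iter_stream_text` is a hypothetical helper name. The sample events are hand-written, not captured from a real response.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style server-sent event lines."""
    for raw in lines:
        line = raw.strip()
        # Skip blank separator lines and anything that is not a data event.
        if not line.startswith("data:"):
            continue
        data = line[len("data:"):].strip()
        # Assumed end-of-stream sentinel used by OpenAI-style APIs.
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


# Hand-written sample events for illustration only.
sample_events = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "Wafer-scale "}}]}',
    'data: {"choices": [{"delta": {"content": "chips"}}]}',
    "data: [DONE]",
]

streamed_text = "".join(iter_stream_text(sample_events))
print(streamed_text)
```

When using the Python SDK instead of raw HTTP, passing `stream=True` to `client.chat.completions.create` should return an iterable of chunks whose `choices[0].delta.content` plays the same role; check the SDK documentation for the exact behavior.
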
| Model | Model ID | Description | Use Cases |
|---|---|---|---|
| Llama 4 Scout 17B | llama-4-scout-17b-16e-instruct | High-performance instruction-tuned model | Chat, reasoning, code generation |
| Llama 3.1 8B | llama3.1-8b | Efficient general-purpose model | Text generation, summarization |
| Llama 3.3 70B | llama-3.3-70b | Large-scale model for complex tasks | Advanced reasoning, research |
| OpenAI GPT OSS 120B | gpt-oss-120b | Open-source GPT-style model | General text generation, chat |
| Qwen 3 32B | qwen-3-32b | Multilingual model with strong reasoning | Multilingual tasks, reasoning |

| Model | Model ID | Description | Use Cases |
|---|---|---|---|
| Llama 4 Maverick 17B | llama-4-maverick-17b-128e-instruct | Extended context Llama 4 variant | Long-form content, extended reasoning |
| Qwen 3 235B Instruct | qwen-3-235b-a22b-instruct-2507 | Large instruction-tuned model | Complex reasoning, advanced tasks |
| Qwen 3 235B Thinking | qwen-3-235b-a22b-thinking-2507 | Reasoning-focused variant | Chain-of-thought, problem solving |
| Qwen 3 480B Coder | qwen-3-coder-480b | Specialized code generation model | Code generation, programming tasks |
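
Model availability can change, so it may help to check the IDs from the tables above against what an account can actually use. The sketch below assumes an OpenAI-style model listing at `https://api.cerebras.ai/v1/models` that returns `{"data": [{"id": ...}, ...]}`; both the URL and the response shape are assumptions rather than facts from this README. The helper names are hypothetical, and the sample payload is hand-written.

```python
import json
import urllib.request
from typing import Iterable, List

# Assumed endpoint, modeled on OpenAI-compatible APIs.
MODELS_URL = "https://api.cerebras.ai/v1/models"


def build_models_request(api_key: str) -> urllib.request.Request:
    """Build an authenticated GET request for the (assumed) models endpoint."""
    return urllib.request.Request(
        MODELS_URL,
        headers={"Authorization": f"Bearer {api_key}"},
    )


def fetch_models_payload(api_key: str) -> dict:
    """Perform the live call. Not invoked below; requires network and a valid key."""
    with urllib.request.urlopen(build_models_request(api_key)) as response:
        return json.load(response)


def unavailable_models(wanted: Iterable[str], payload: dict) -> List[str]:
    """Return the wanted model IDs that are absent from a listing payload."""
    available = {entry["id"] for entry in payload.get("data", [])}
    return [model_id for model_id in wanted if model_id not in available]


# Hand-written sample payload for illustration only.
sample_payload = {
    "object": "list",
    "data": [{"id": "llama3.1-8b"}, {"id": "gpt-oss-120b"}],
}
wanted_models = ["llama3.1-8b", "qwen-3-coder-480b"]
missing = unavailable_models(wanted_models, sample_payload)
print(missing)
```

To run the check against a real account, call `fetch_models_payload(os.environ["CEREBRAS_API_KEY"])` (after `import os`) and pass its result to `unavailable_models` in place of the sample payload.
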
- Documentation: Cerebras Docs
- Inference API: Inference Documentation
- Training API: Training Documentation
- Model Zoo: Cerebras Model Zoo
- SDK: Python SDK | Node.js SDK
- Playground: Try Cerebras Cloud
- Live Demo: Cerebras Inference Demo
- Issues: Report bugs or request features via GitHub Issues
- Documentation: Visit Cerebras Documentation
- Contact: Reach out to the Cerebras support team
We welcome contributions to the Cerebras-Cookbook repository! To contribute:
- Fork and clone the repository
- Create a new branch for your changes
- Make your changes following our documentation standards
- Test your examples thoroughly
- Submit a pull request
For major changes, please open an issue first to discuss your ideas.
- Include clear documentation and comments
- Provide example usage and expected outputs
- Follow the existing code style and structure
- Add appropriate tests where applicable
- Update the README if adding new sections
© 2025 Cerebras Systems | All Rights Reserved