GitHub - buildfastwithai/Cerebras-Cookbook

Overview

Cerebras Systems builds the Wafer Scale Engine (WSE), the world's largest computer chip, designed specifically for AI workloads. This cookbook provides examples, tutorials, and best practices for developing and deploying AI models on Cerebras infrastructure, covering both training on WSE clusters and fast inference via Cerebras Cloud.

What You'll Find in This Repository

Section	Description
Get Started	SDK setup guides, authentication, and model exploration examples
Agents	AI agent implementations using Cerebras with CrewAI and Agno frameworks
Chat with Data	RAG implementations, multilingual PDF chat, and fusion techniques
Integrations	Framework integrations including LiteLLM for seamless API switching
Starter Apps	Production-ready applications: AI book writer and mindmap generator

Cerebras Starter Guide

Get Your API Key

Get your API key by visiting Cerebras Cloud.

⚠️ Security Note: Keep your API key secure! Never expose it in client-side code or public repositories.

API Endpoint

https://api.cerebras.ai/v1/chat/completions

Quick Start

Prerequisites

A Cerebras account
A Cerebras Inference API key
Python 3.7+ or Node.js 14+

Set up your API key

export CEREBRAS_API_KEY="your-api-key-here"
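A script can fail early with a clear message when the variable is missing, rather than surfacing an authentication error on the first request. The sketch below is one way to do that; `load_api_key` is a hypothetical helper name and is not part of the Cerebras SDK.

```python
import os


def load_api_key(env_var: str = "CEREBRAS_API_KEY") -> str:
    """Return the API key from the environment, or raise if it is missing.

    Hypothetical helper for illustration; not part of the Cerebras SDK.
    """
    # Treat an unset variable and a blank value the same way.
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"{env_var} is not set. Export it in your shell before running."
        )
    return key
```

Reading the key from the environment keeps it out of source files, in line with the security note above.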

Install the Cerebras SDK

pip install --upgrade cerebras_cloud_sdk

Sample Code

Python

import os
from cerebras.cloud.sdk import Cerebras

client = Cerebras(
    api_key=os.environ.get("CEREBRAS_API_KEY"),
)

chat_completion = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "Explain the advantages of wafer-scale computing for AI",
        }
    ],
    model="llama-4-scout-17b-16e-instruct",
)

print(chat_completion.choices[0].message.content)

cURL

curl -X POST "https://api.cerebras.ai/v1/chat/completions" \
  -H "Authorization: Bearer $CEREBRAS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-4-scout-17b-16e-instruct",
    "messages": [
      {"role": "user", "content": "What makes Cerebras WSE unique for AI training?"}
    ],
    "max_tokens": 1024,
    "temperature": 0.7
  }'
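Where the SDK is not available, the same request can be assembled with Python's standard library. This is a minimal sketch that mirrors the cURL call above and has not been verified against the live service. `build_chat_request` and `send` are hypothetical helper names, and the response is assumed to have the same `choices[0].message.content` shape that the SDK samples read.

```python
import json
import urllib.request

API_URL = "https://api.cerebras.ai/v1/chat/completions"


def build_chat_request(api_key: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the cURL example."""
    payload = {
        "model": "llama-4-scout-17b-16e-instruct",
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 1024,
        "temperature": 0.7,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> str:
    """Send the request and return the text of the first choice."""
    with urllib.request.urlopen(request, timeout=30) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


# Usage (needs network access and a valid key in CEREBRAS_API_KEY):
#   import os
#   req = build_chat_request(os.environ["CEREBRAS_API_KEY"], "Hello")
#   print(send(req))
```

Keeping request construction separate from sending makes the payload easy to inspect or log before any network call is made.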

JavaScript/Node.js

// Install first: npm install @cerebras/cerebras_cloud_sdk
import Cerebras from '@cerebras/cerebras_cloud_sdk';

const client = new Cerebras({
    apiKey: process.env.CEREBRAS_API_KEY,
});

async function main() {
    const chatCompletion = await client.chat.completions.create({
        messages: [
            {
                role: 'user',
                content: 'How does wafer-scale architecture improve AI performance?',
            },
        ],
        model: 'llama-4-scout-17b-16e-instruct',
    });

    console.log(chatCompletion.choices[0].message.content);
}

// Log request failures instead of leaving the promise rejection unhandled.
main().catch(console.error);

Features

Fast Inference: Ultra-low latency inference with Cerebras Cloud
Model Zoo Integration: Access to optimized pre-trained models
OpenAI-compatible API: Easy migration from OpenAI to Cerebras
Streaming Support: Real-time response streaming capabilities
Enterprise Ready: Production-grade infrastructure and support

Available Models

Production Models

Model	Model ID	Description	Use Cases
Llama 4 Scout 17B	`llama-4-scout-17b-16e-instruct`	High-performance instruction-tuned model	Chat, reasoning, code generation
Llama 3.1 8B	`llama3.1-8b`	Efficient general-purpose model	Text generation, summarization
Llama 3.3 70B	`llama-3.3-70b`	Large-scale model for complex tasks	Advanced reasoning, research
OpenAI GPT OSS 120B	`gpt-oss-120b`	Open-source GPT-style model	General text generation, chat
Qwen 3 32B	`qwen-3-32b`	Multilingual model with strong reasoning	Multilingual tasks, reasoning

Preview Models

Model	Model ID	Description	Use Cases
Llama 4 Maverick 17B	`llama-4-maverick-17b-128e-instruct`	Extended context Llama 4 variant	Long-form content, extended reasoning
Qwen 3 235B Instruct	`qwen-3-235b-a22b-instruct-2507`	Large instruction-tuned model	Complex reasoning, advanced tasks
Qwen 3 235B Thinking	`qwen-3-235b-a22b-thinking-2507`	Reasoning-focused variant	Chain-of-thought, problem solving
Qwen 3 480B Coder	`qwen-3-coder-480b`	Specialized code generation model	Code generation, programming tasks
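When a script takes its model ID from configuration, it can help to check the ID against the tables above and flag preview models before sending a request. The sketch below hard-codes the IDs listed in this README. `PRODUCTION_MODELS`, `PREVIEW_MODELS`, and `model_tier` are hypothetical names, and the set of models the service offers may differ from this list at any given time.

```python
# Model IDs copied from the Production Models table.
PRODUCTION_MODELS = {
    "llama-4-scout-17b-16e-instruct",
    "llama3.1-8b",
    "llama-3.3-70b",
    "gpt-oss-120b",
    "qwen-3-32b",
}

# Model IDs copied from the Preview Models table.
PREVIEW_MODELS = {
    "llama-4-maverick-17b-128e-instruct",
    "qwen-3-235b-a22b-instruct-2507",
    "qwen-3-235b-a22b-thinking-2507",
    "qwen-3-coder-480b",
}


def model_tier(model_id: str) -> str:
    """Return "production" or "preview" for a model ID listed in this README."""
    if model_id in PRODUCTION_MODELS:
        return "production"
    if model_id in PREVIEW_MODELS:
        return "preview"
    raise ValueError(f"Unknown model ID: {model_id!r}")
```

A caller might log a warning when `model_tier(...)` returns `"preview"`.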

Resources

Documentation: Cerebras Docs
Inference API: Inference Documentation
Training API: Training Documentation
Model Zoo: Cerebras Model Zoo
SDK: Python SDK | Node.js SDK
Playground: Try Cerebras Cloud
Live Demo: Cerebras Inference Demo

Getting Support

Issues: Report bugs or request features via GitHub Issues
Documentation: Visit Cerebras Documentation
Contact: Reach out to the Cerebras support team

Contributing

We welcome contributions to the Cerebras-Cookbook repository! To contribute:

Fork and clone the repository
Create a new branch for your changes
Make your changes following our documentation standards
Test your examples thoroughly
Submit a pull request

For major changes, please open an issue first to discuss your ideas.

Contribution Guidelines

Include clear documentation and comments
Provide example usage and expected outputs
Follow the existing code style and structure
Add appropriate tests where applicable
Update the README if adding new sections
