buildfastwithai/Cerebras-Cookbook

Overview

Cerebras Systems builds the Wafer Scale Engine (WSE), the world's largest computer chip, designed specifically for AI workloads. This cookbook collects examples, tutorials, and best practices for developing and deploying AI models on Cerebras infrastructure, covering both training on WSE clusters and fast inference via Cerebras Cloud.

What You'll Find in This Repository

| Section | Description |
| --- | --- |
| Get Started | SDK setup guides, authentication, and model exploration examples |
| Agents | AI agent implementations using Cerebras with the CrewAI and Agno frameworks |
| Chat with Data | RAG implementations, multilingual PDF chat, and fusion techniques |
| Integrations | Framework integrations, including LiteLLM for seamless API switching |
| Starter Apps | Production-ready applications: AI book writer and mindmap generator |

Cerebras Starter Guide

Get Your API Key

Get your API key by visiting Cerebras Cloud.

⚠️ Security Note: Keep your API key secure! Never expose it in client-side code or public repositories.

API Endpoint

https://api.cerebras.ai/v1/chat/completions
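
Because the endpoint takes a JSON body over HTTPS with a bearer token, a request can be assembled without the SDK. The following is a minimal sketch using only Python's standard library. The helper names `build_chat_request` and `send_chat_request` are hypothetical, the payload fields mirror the cURL sample in this README rather than a full API reference, and the sending step has not been verified against the live service.

```python
import json
import urllib.request

API_URL = "https://api.cerebras.ai/v1/chat/completions"


def build_chat_request(api_key, prompt, model="llama-4-scout-17b-16e-instruct"):
    """Assemble (but do not send) a POST request for the chat completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_chat_request(request, timeout=30):
    """Send the request and return the first choice's text.

    Requires network access and a valid API key; the response shape is
    assumed to match the SDK sample (choices[0].message.content).
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

Keeping the build step separate from the send step makes it possible to inspect the URL, headers, and body before any network call is made.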

Quick Start

Prerequisites

  • A Cerebras account
  • A Cerebras Inference API key
  • Python 3.7+ or Node.js 14+

Set up your API key

export CEREBRAS_API_KEY="your-api-key-here"
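
A script can check for the variable up front so that a missing key fails with a readable message rather than an authentication error later. This is a small sketch; `load_api_key` is a hypothetical helper, not part of the Cerebras SDK.

```python
import os


def load_api_key(env=os.environ, name="CEREBRAS_API_KEY"):
    """Return the API key from the environment, or raise if it is unset or blank."""
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before running the examples.")
    return key
```

The `env` parameter defaults to the real environment; passing a plain dict instead makes the helper easy to exercise in tests.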

Install the Cerebras SDK

pip install --upgrade cerebras_cloud_sdk    # Python
npm install @cerebras/cerebras_cloud_sdk    # Node.js

Sample Code

Python

import os
from cerebras.cloud.sdk import Cerebras

client = Cerebras(
    api_key=os.environ.get("CEREBRAS_API_KEY"),
)

chat_completion = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "Explain the advantages of wafer-scale computing for AI",
        }
    ],
    model="llama-4-scout-17b-16e-instruct",
)

print(chat_completion.choices[0].message.content)

cURL

curl -X POST "https://api.cerebras.ai/v1/chat/completions" \
  -H "Authorization: Bearer $CEREBRAS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-4-scout-17b-16e-instruct",
    "messages": [
      {"role": "user", "content": "What makes Cerebras WSE unique for AI training?"}
    ],
    "max_tokens": 1024,
    "temperature": 0.7
  }'

JavaScript/Node.js

import Cerebras from '@cerebras/cerebras_cloud_sdk';

const client = new Cerebras({
    apiKey: process.env.CEREBRAS_API_KEY,
});

async function main() {
    const chatCompletion = await client.chat.completions.create({
        messages: [
            {
                role: 'user',
                content: 'How does wafer-scale architecture improve AI performance?',
            },
        ],
        model: 'llama-4-scout-17b-16e-instruct',
    });

    console.log(chatCompletion.choices[0].message.content);
}

main();

Features

  • Fast Inference: Ultra-low latency inference with Cerebras Cloud
  • Model Zoo Integration: Access to optimized pre-trained models
  • OpenAI-compatible API: Easy migration from OpenAI to Cerebras
  • Streaming Support: Real-time response streaming capabilities
  • Enterprise Ready: Production-grade infrastructure and support
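
Streaming can be sketched as follows, assuming the SDK follows the OpenAI-style convention of passing `stream=True` and reading incremental text from `chunk.choices[0].delta.content`. The helper `stream_reply` and the offline stand-in `fake_client` are hypothetical names introduced for illustration; the stand-in only imitates the chunk shape so the example runs without network access.

```python
from types import SimpleNamespace


def stream_reply(client, prompt, model="llama-4-scout-17b-16e-instruct"):
    """Yield text fragments from a streamed chat completion as they arrive."""
    stream = client.chat.completions.create(
        messages=[{"role": "user", "content": prompt}],
        model=model,
        stream=True,  # assumption: OpenAI-style streaming flag
    )
    for chunk in stream:
        if not chunk.choices:
            continue  # skip chunks that carry no choices
        text = chunk.choices[0].delta.content
        if text:  # skip empty or None deltas
            yield text


def fake_client(fragments):
    """Offline stand-in (hypothetical) that mimics the streamed chunk shape."""
    chunks = [
        SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=f))])
        for f in fragments
    ]

    def create(**kwargs):
        return iter(chunks)

    completions = SimpleNamespace(create=create)
    return SimpleNamespace(chat=SimpleNamespace(completions=completions))
```

With the real SDK, pass the `Cerebras` client from the Python sample in place of `fake_client(...)` and print each fragment with `end=""` and `flush=True` to display the reply as it is generated.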

Available Models

Production Models

| Model | Model ID | Description | Use Cases |
| --- | --- | --- | --- |
| Llama 4 Scout 17B | llama-4-scout-17b-16e-instruct | High-performance instruction-tuned model | Chat, reasoning, code generation |
| Llama 3.1 8B | llama3.1-8b | Efficient general-purpose model | Text generation, summarization |
| Llama 3.3 70B | llama-3.3-70b | Large-scale model for complex tasks | Advanced reasoning, research |
| OpenAI GPT OSS 120B | gpt-oss-120b | Open-source GPT-style model | General text generation, chat |
| Qwen 3 32B | qwen-3-32b | Multilingual model with strong reasoning | Multilingual tasks, reasoning |

Preview Models

| Model | Model ID | Description | Use Cases |
| --- | --- | --- | --- |
| Llama 4 Maverick 17B | llama-4-maverick-17b-128e-instruct | Extended context Llama 4 variant | Long-form content, extended reasoning |
| Qwen 3 235B Instruct | qwen-3-235b-a22b-instruct-2507 | Large instruction-tuned model | Complex reasoning, advanced tasks |
| Qwen 3 235B Thinking | qwen-3-235b-a22b-thinking-2507 | Reasoning-focused variant | Chain-of-thought, problem solving |
| Qwen 3 480B Coder | qwen-3-coder-480b | Specialized code generation model | Code generation, programming tasks |
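
Since model IDs are passed as plain strings, a typo only surfaces when the API rejects the request. One option is to validate the ID locally first. The sketch below hard-codes the IDs from the two tables above as a snapshot (availability may change); `MODEL_TIERS` and `resolve_model` are hypothetical names, and preview models are treated as opt-in purely as a design choice for this example.

```python
# Snapshot of the model IDs listed in this README, keyed to their tier.
MODEL_TIERS = {
    "llama-4-scout-17b-16e-instruct": "production",
    "llama3.1-8b": "production",
    "llama-3.3-70b": "production",
    "gpt-oss-120b": "production",
    "qwen-3-32b": "production",
    "llama-4-maverick-17b-128e-instruct": "preview",
    "qwen-3-235b-a22b-instruct-2507": "preview",
    "qwen-3-235b-a22b-thinking-2507": "preview",
    "qwen-3-coder-480b": "preview",
}


def resolve_model(model_id, allow_preview=False):
    """Return model_id if it is listed and permitted, otherwise raise ValueError."""
    tier = MODEL_TIERS.get(model_id)
    if tier is None:
        raise ValueError(f"Unknown model ID: {model_id!r}")
    if tier == "preview" and not allow_preview:
        raise ValueError(f"{model_id!r} is a preview model; pass allow_preview=True to use it.")
    return model_id
```

For example, `resolve_model("llama3.1-8b")` returns the ID unchanged, while `resolve_model("qwen-3-coder-480b")` raises unless `allow_preview=True` is passed.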

Resources

Getting Support

Contributing

We welcome contributions to the Cerebras-Cookbook repository! To contribute:

  1. Fork and clone the repository
  2. Create a new branch for your changes
  3. Make your changes following our documentation standards
  4. Test your examples thoroughly
  5. Submit a pull request

For major changes, please open an issue first to discuss your ideas.

Contribution Guidelines

  • Include clear documentation and comments
  • Provide example usage and expected outputs
  • Follow the existing code style and structure
  • Add appropriate tests where applicable
  • Update the README if adding new sections

Legal


© 2025 Cerebras Systems | All Rights Reserved
