109 changes: 78 additions & 31 deletions agent-os/interfaces/whatsapp/introduction.mdx
Original file line number Diff line number Diff line change
@@ -1,59 +1,65 @@
---
title: WhatsApp
description: Host agents as WhatsApp applications
description: Host agents as WhatsApp applications.
---

Use the WhatsApp interface to serve Agents or Teams via WhatsApp. It mounts webhook routes on a FastAPI app and sends responses back to WhatsApp users and threads.
Use the WhatsApp interface to serve Agents, Teams, or Workflows on WhatsApp. It mounts webhook routes on a FastAPI app and sends responses back to WhatsApp users.

## Setup

Set up your WhatsApp Business API and configure the webhook URL to point to your AgentOS instance.
Follow the WhatsApp setup guide in the [deploy overview](/deploy/interfaces/whatsapp/overview).

Required environment variables:

- `WHATSAPP_ACCESS_TOKEN`
- `WHATSAPP_PHONE_NUMBER_ID`
- `WHATSAPP_VERIFY_TOKEN`
- `WHATSAPP_ACCESS_TOKEN` (from Meta App Dashboard)
- `WHATSAPP_PHONE_NUMBER_ID` (from WhatsApp API Setup)
- `WHATSAPP_VERIFY_TOKEN` (a string you create for webhook verification)
- Optional (production): `WHATSAPP_APP_SECRET` and `APP_ENV=production`
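
A startup check along these lines can catch a missing variable before Meta starts calling the webhook. This is a minimal sketch, not part of Agno: `missing_whatsapp_env` is a hypothetical helper, and treating `WHATSAPP_APP_SECRET` as required when `APP_ENV=production` is an assumption drawn from the list above.

```python
import os
from typing import List, Mapping

# Variable names taken from the list above.
REQUIRED_VARS = (
    "WHATSAPP_ACCESS_TOKEN",
    "WHATSAPP_PHONE_NUMBER_ID",
    "WHATSAPP_VERIFY_TOKEN",
)


def missing_whatsapp_env(env: Mapping[str, str]) -> List[str]:
    """Return the names of expected variables that are unset or empty (hypothetical helper)."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    # Assumption: production mode validates signatures, so the app secret is needed too.
    if env.get("APP_ENV") == "production" and not env.get("WHATSAPP_APP_SECRET"):
        missing.append("WHATSAPP_APP_SECRET")
    return missing


if __name__ == "__main__":
    problems = missing_whatsapp_env(os.environ)
    if problems:
        print("Missing environment variables: " + ", ".join(problems))
```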

<Note>
The user's phone number is automatically used as the `user_id` for runs. This ensures that sessions and memory are appropriately scoped to the user.

The phone number is also used for the `session_id`, so a single WhatsApp conversation corresponds to a single session. This should be considered when managing session history.
The phone number is also used for the `session_id` (format: `wa:{phone_number}`), so a single WhatsApp conversation corresponds to a single session.

</Note>

## Example Usage

Create an agent, expose it with the `Whatsapp` interface, and serve via `AgentOS`:

```python
```python basic.py
from agno.agent import Agent
from agno.models.openai import OpenAIResponses
from agno.db.sqlite import SqliteDb
from agno.models.openai import OpenAIChat
from agno.os import AgentOS
from agno.os.interfaces.whatsapp import Whatsapp

image_agent = Agent(
    model=OpenAIResponses(id="gpt-5.2"), # Ensure OPENAI_API_KEY is set
    tools=[OpenAITools(image_model="gpt-image-1")],
    markdown=True,
agent_db = SqliteDb(db_file="tmp/persistent_memory.db")

basic_agent = Agent(
    name="Basic Agent",
    model=OpenAIChat(id="gpt-4o"),
    db=agent_db,
    add_history_to_context=True,
    num_history_runs=3,
    add_datetime_to_context=True,
    markdown=True,
)

agent_os = AgentOS(
    agents=[image_agent],
    interfaces=[Whatsapp(agent=image_agent)],
    agents=[basic_agent],
    interfaces=[Whatsapp(agent=basic_agent)],
)
app = agent_os.get_app()

if __name__ == "__main__":
    agent_os.serve(app="basic:app", port=8000, reload=True)
    agent_os.serve(app="basic:app", port=7777, reload=True)
```

See the [WhatsApp Examples](/agent-os/usage/interfaces/whatsapp/basic) for more usage patterns.

## Core Components

- `Whatsapp` (interface): Wraps an Agno `Agent` or `Team` for WhatsApp via FastAPI.
- `Whatsapp` (interface): Wraps an Agno `Agent`, `Team`, or `Workflow` for WhatsApp via FastAPI.
- `AgentOS.serve`: Serves the FastAPI app using Uvicorn.

## `Whatsapp` Interface
@@ -62,36 +68,77 @@

### Initialization Parameters

| Parameter | Type | Default | Description |
| --------- | ----------------- | ------- | ---------------------- |
| `agent` | `Optional[Agent]` | `None` | Agno `Agent` instance. |
| `team` | `Optional[Team]` | `None` | Agno `Team` instance. |
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `agent` | `Optional[Agent]` | `None` | Agno `Agent` instance. |
| `team` | `Optional[Team]` | `None` | Agno `Team` instance. |
| `workflow` | `Optional[Workflow]` | `None` | Agno `Workflow` instance. |
| `prefix` | `str` | `"/whatsapp"` | Custom FastAPI route prefix for the WhatsApp interface. |
| `tags` | `Optional[List[str]]` | `None` | FastAPI route tags for API documentation. Defaults to `["Whatsapp"]` if not provided. |

Provide `agent` or `team`.
Provide `agent`, `team`, or `workflow`.

### Key Method

| Method | Parameters | Return Type | Description |
| ------------ | ------------------------ | ----------- | -------------------------------------------------- |
| `get_router` | `use_async: bool = True` | `APIRouter` | Returns the FastAPI router and attaches endpoints. |
| Method | Parameters | Return Type | Description |
| --- | --- | --- | --- |
| `get_router` | None | `APIRouter` | Returns the FastAPI router and attaches endpoints. |

## Endpoints

Mounted under the `/whatsapp` prefix:

### `GET /whatsapp/status`

- Health/status of the interface.
- Health/status check for the interface.
- Returns `{"status": "available"}`.

### `GET /whatsapp/webhook`

- Verifies WhatsApp webhook (`hub.challenge`).
- Returns `hub.challenge` on success; `403` on token mismatch; `500` if `WHATSAPP_VERIFY_TOKEN` missing.
- Handles WhatsApp webhook verification (`hub.challenge`).
- Returns `hub.challenge` on success; `403` on token mismatch; `500` if `WHATSAPP_VERIFY_TOKEN` is not set.
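
The status codes described above can be sketched as a small pure function. This is an illustration under stated assumptions, not the interface's actual handler: `verify_webhook` is a hypothetical name, and the `hub.mode == "subscribe"` check reflects how Meta's verification request is commonly handled rather than anything confirmed by this page.

```python
import hmac
from typing import Optional, Tuple


def verify_webhook(
    mode: Optional[str],
    token: Optional[str],
    challenge: Optional[str],
    expected_token: Optional[str],
) -> Tuple[int, str]:
    """Return (status_code, body) for a webhook verification request (hypothetical helper)."""
    if not expected_token:
        # Mirrors the documented 500 when WHATSAPP_VERIFY_TOKEN is not set.
        return 500, "WHATSAPP_VERIFY_TOKEN is not set"
    token_matches = token is not None and hmac.compare_digest(
        token.encode("utf-8"), expected_token.encode("utf-8")
    )
    if mode == "subscribe" and token_matches and challenge is not None:
        # Echo hub.challenge back on success.
        return 200, challenge
    return 403, "Verification failed"
```

Comparing the tokens with `hmac.compare_digest` avoids leaking timing information; a plain `==` would give the same result.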

### `POST /whatsapp/webhook`

- Receives WhatsApp messages and events.
- Validates signature (`X-Hub-Signature-256`); bypassed in development mode.
- Processes text, image, video, audio, and document messages via the agent/team.
- Sends replies (splits long messages; uploads and sends generated images).
- Validates the `X-Hub-Signature-256` header; bypassed when `APP_ENV=development`.
- Processes text, image, video, audio, and document messages via the agent/team/workflow.
- Sends replies back to the user (splits long messages at WhatsApp's 4096 character limit; uploads and sends generated images).
- Responses: `200 {"status": "processing"}` or `{"status": "ignored"}`, `403` invalid signature, `500` errors.
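
For readers unfamiliar with the `X-Hub-Signature-256` check, the sketch below shows the usual scheme: an HMAC-SHA256 of the raw request body keyed with the Meta App Secret, sent as `sha256=<hex digest>`. The function name is hypothetical and the interface's own implementation may differ in detail.

```python
import hashlib
import hmac


def is_valid_signature(payload: bytes, signature_header: str, app_secret: str) -> bool:
    """Check a `sha256=<hex>` signature header against the raw request body (hypothetical helper)."""
    prefix = "sha256="
    if not signature_header or not signature_header.startswith(prefix):
        return False
    expected = hmac.new(app_secret.encode("utf-8"), payload, hashlib.sha256).hexdigest()
    received = signature_header[len(prefix):]
    # Constant-time comparison of the two hex digests.
    return hmac.compare_digest(expected.encode("utf-8"), received.encode("utf-8"))
```

The signature must be computed over the exact bytes received, so the check has to run before the body is parsed and re-serialized as JSON.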

## Session Management

Sessions are scoped by phone number:

- **Session ID format**: `wa:{phone_number}`
- The user's phone number is used as both the `user_id` and the base for the `session_id`
- Each WhatsApp conversation maps to a single session

This means all messages from the same phone number share the same conversation context.
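
The scoping rule can be written out as follows. `WhatsappScope` and `scope_for` are hypothetical names used only for illustration; the `wa:{phone_number}` format is the one stated above.

```python
from typing import NamedTuple


class WhatsappScope(NamedTuple):
    user_id: str
    session_id: str


def scope_for(phone_number: str) -> WhatsappScope:
    """Derive the run scope from the sender's phone number (hypothetical helper)."""
    # user_id is the phone number itself; session_id adds the "wa:" prefix.
    return WhatsappScope(user_id=phone_number, session_id=f"wa:{phone_number}")
```

Because both values derive from the phone number alone, two messages from the same number always resolve to the same session.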

## Media Support

**Inbound** (user sends to bot): images, video, audio, and documents. Media is downloaded via the WhatsApp Business API and passed to the agent as image, video, audio, or file inputs.

**Outbound** (agent sends to user): generated images are uploaded to the WhatsApp Media API and sent as native WhatsApp image messages. Text responses are split into batches if they exceed the 4096 character limit.
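
A minimal sketch of the text batching, assuming a plain fixed-size split. `split_message` is a hypothetical helper; the interface may choose different boundaries (for example, at line breaks), which this page does not specify.

```python
from typing import List

WHATSAPP_TEXT_LIMIT = 4096  # character limit quoted above


def split_message(text: str, limit: int = WHATSAPP_TEXT_LIMIT) -> List[str]:
    """Split text into consecutive chunks of at most `limit` characters (hypothetical helper)."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return [text[i : i + limit] for i in range(0, len(text), limit)]
```

Joining the chunks in order reproduces the original text, so nothing is dropped at the boundaries.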

## Reasoning Support

When the agent produces reasoning content (e.g., from ReasoningTools or ThinkingTools), it is sent as a separate italicized message before the main response.

## Testing the Integration

1. Run the app locally: `python <my-app>.py` (ensure ngrok is running)
2. In your Meta App's WhatsApp Setup, configure the webhook URL to `https://YOUR-NGROK-URL/whatsapp/webhook`
3. Subscribe to the `messages` webhook field
4. Send a message from your test phone number to the WhatsApp Business number

## Troubleshooting

| Symptom | Cause | Fix |
| --- | --- | --- |
| 403 errors on webhook | Invalid signature in production mode | Set `APP_ENV=development` for local testing, or set `WHATSAPP_APP_SECRET` with your Meta App Secret |
| Webhook verification fails | Token mismatch or app not running | Ensure `WHATSAPP_VERIFY_TOKEN` matches and the app is running before clicking "Verify and save" |
| No response from the bot | Missing access token or phone number ID | Verify `WHATSAPP_ACCESS_TOKEN` and `WHATSAPP_PHONE_NUMBER_ID` are set correctly |
| Images not sending | Upload failure | Check `WHATSAPP_ACCESS_TOKEN` has permission to upload media |
| `500` on webhook verify | `WHATSAPP_VERIFY_TOKEN` not set | Export `WHATSAPP_VERIFY_TOKEN` before running |
13 changes: 6 additions & 7 deletions agent-os/usage/interfaces/whatsapp/agent-with-media.mdx
@@ -1,6 +1,6 @@
---
title: "WhatsApp Agent with Media Support"
description: "WhatsApp agent that analyzes images, videos, and audio using multimodal AI"
description: "WhatsApp agent that analyzes images, videos, audio, and documents using multimodal AI"
---

## Code
@@ -9,13 +9,14 @@ description: "WhatsApp agent that analyzes images, videos, and audio using multi
from agno.agent import Agent
from agno.db.sqlite import SqliteDb
from agno.models.google import Gemini
from agno.os.app import AgentOS
from agno.os import AgentOS
from agno.os.interfaces.whatsapp import Whatsapp

agent_db = SqliteDb(db_file="tmp/persistent_memory.db")

media_agent = Agent(
    name="Media Agent",
    model=Gemini(id="gemini-2.0-flash"),
    model=Gemini(id="gemini-3-flash-preview"),
    db=agent_db,
    add_history_to_context=True,
    num_history_runs=3,
@@ -42,7 +43,6 @@ if __name__ == "__main__":
```bash
export WHATSAPP_ACCESS_TOKEN=your_whatsapp_access_token
export WHATSAPP_PHONE_NUMBER_ID=your_phone_number_id
export WHATSAPP_WEBHOOK_URL=your_webhook_url
export WHATSAPP_VERIFY_TOKEN=your_verify_token
export GOOGLE_API_KEY=your_google_api_key
export APP_ENV=development
@@ -64,9 +64,8 @@ if __name__ == "__main__":

## Key Features

- **Multimodal AI**: Gemini 2.0 Flash for image, video, and audio processing
- **Multimodal Analysis**: Gemini for image, video, audio, and document processing
- **Image Analysis**: Object recognition, scene understanding, text extraction
- **Video Processing**: Content analysis and summarization
- **Audio Support**: Voice message transcription and response
- **Context Integration**: Combines media analysis with conversation history

- **Conversation History**: Combines media analysis with context from last 3 interactions
26 changes: 12 additions & 14 deletions agent-os/usage/interfaces/whatsapp/agent-with-user-memory.mdx
@@ -7,13 +7,14 @@ description: "Personalized WhatsApp agent that remembers user information and pr

```python cookbook/os/interfaces/whatsapp/agent_with_user_memory.py
from textwrap import dedent

from agno.agent import Agent
from agno.db.sqlite import SqliteDb
from agno.memory.manager import MemoryManager
from agno.models.google import Gemini
from agno.os.app import AgentOS
from agno.os import AgentOS
from agno.os.interfaces.whatsapp import Whatsapp
from agno.tools.hackernews import HackerNewsTools
from agno.tools.websearch import WebSearchTools

agent_db = SqliteDb(db_file="tmp/persistent_memory.db")

@@ -30,7 +31,7 @@ memory_manager = MemoryManager(
personal_agent = Agent(
    name="Basic Agent",
    model=Gemini(id="gemini-2.0-flash"),
    tools=[HackerNewsTools()],
    tools=[WebSearchTools()],
    add_history_to_context=True,
    num_history_runs=3,
    add_datetime_to_context=True,
@@ -40,10 +41,9 @@ personal_agent = Agent(
    enable_agentic_memory=True,
    instructions=dedent("""
        You are a personal AI friend of the user, your purpose is to chat with the user about things and make them feel good.
        First introduce yourself and ask for their name then, ask about themeselves, their hobbies, what they like to do and what they like to talk about.
        Use the HackerNews tools to find latest information about things in the conversations
    """),
    debug_mode=True,
        First introduce yourself and ask for their name then, ask about themselves, their hobbies, what they like to do and what they like to talk about.
        Use web search to find latest information about things in the conversations
    """),
)

agent_os = AgentOS(
@@ -65,8 +65,8 @@ if __name__ == "__main__":
```bash
export WHATSAPP_ACCESS_TOKEN=your_whatsapp_access_token
export WHATSAPP_PHONE_NUMBER_ID=your_phone_number_id
export WHATSAPP_WEBHOOK_URL=your_webhook_url
export WHATSAPP_VERIFY_TOKEN=your_verify_token
export GOOGLE_API_KEY=your_google_api_key
export APP_ENV=development
```
</Step>
@@ -86,9 +86,7 @@ if __name__ == "__main__":

## Key Features

- **Memory Management**: Remembers user names, hobbies, preferences, and activities
- **HackerNews**: Access to current information during conversations
- **Personalized Responses**: Uses stored memories for contextualized replies
- **Friendly AI**: Acts as personal AI friend with engaging conversation
- **Gemini Powered**: Fast, intelligent responses with multimodal capabilities

- **Agentic Memory**: MemoryManager captures user preferences, hobbies, and personal details
- **Cross-Session Recall**: Remembers user information across conversations
- **Web Search**: Uses WebSearchTools to find up-to-date information during conversations
- **Persistent Storage**: SQLite database for both sessions and memory
9 changes: 4 additions & 5 deletions agent-os/usage/interfaces/whatsapp/basic.mdx
@@ -8,14 +8,15 @@ description: "Create a basic AI agent that integrates with WhatsApp Business API
```python cookbook/os/interfaces/whatsapp/basic.py
from agno.agent import Agent
from agno.db.sqlite import SqliteDb
from agno.models.openai import OpenAIResponses
from agno.models.openai import OpenAIChat
from agno.os import AgentOS
from agno.os.interfaces.whatsapp import Whatsapp

agent_db = SqliteDb(db_file="tmp/persistent_memory.db")

basic_agent = Agent(
    name="Basic Agent",
    model=OpenAIResponses(id="gpt-5.2"),
    model=OpenAIChat(id="gpt-4o"),
    db=agent_db,
    add_history_to_context=True,
    num_history_runs=3,
@@ -42,7 +43,6 @@ if __name__ == "__main__":
```bash
export WHATSAPP_ACCESS_TOKEN=your_whatsapp_access_token
export WHATSAPP_PHONE_NUMBER_ID=your_phone_number_id
export WHATSAPP_WEBHOOK_URL=your_webhook_url
export WHATSAPP_VERIFY_TOKEN=your_verify_token
export OPENAI_API_KEY=your_openai_api_key
export APP_ENV=development
@@ -64,9 +64,8 @@ if __name__ == "__main__":

## Key Features

- **WhatsApp Integration**: Responds to messages automatically
- **WhatsApp Integration**: Responds to messages automatically
- **Conversation History**: Maintains context with last 3 interactions
- **Persistent Memory**: SQLite database for session storage
- **DateTime Context**: Time-aware responses
- **Markdown Support**: Rich text formatting in messages

19 changes: 8 additions & 11 deletions agent-os/usage/interfaces/whatsapp/image-generation-model.mdx
@@ -1,6 +1,6 @@
---
title: "WhatsApp Image Generation Agent (Model-based)"
description: "WhatsApp agent that generates images using Gemini's built-in capabilities"
description: "WhatsApp agent that generates images using Gemini's built-in image generation"
---

## Code
@@ -9,18 +9,18 @@ description: "WhatsApp agent that generates images using Gemini's built-in capab
from agno.agent import Agent
from agno.db.sqlite import SqliteDb
from agno.models.google import Gemini
from agno.os.app import AgentOS
from agno.os import AgentOS
from agno.os.interfaces.whatsapp import Whatsapp

agent_db = SqliteDb(db_file="tmp/persistent_memory.db")

image_agent = Agent(
    id="image_generation_model",
    db=agent_db,
    model=Gemini(
        id="gemini-2.0-flash-exp-image-generation",
        id="models/gemini-2.5-flash-image",
        response_modalities=["Text", "Image"],
    ),
    debug_mode=True,
)

agent_os = AgentOS(
@@ -42,7 +42,6 @@ if __name__ == "__main__":
```bash
export WHATSAPP_ACCESS_TOKEN=your_whatsapp_access_token
export WHATSAPP_PHONE_NUMBER_ID=your_phone_number_id
export WHATSAPP_WEBHOOK_URL=your_webhook_url
export WHATSAPP_VERIFY_TOKEN=your_verify_token
export GOOGLE_API_KEY=your_google_api_key
export APP_ENV=development
@@ -64,9 +63,7 @@ if __name__ == "__main__":

## Key Features

- **Direct Image Generation**: Gemini 2.0 Flash experimental image generation
- **Text-to-Image**: Converts descriptions into visual content
- **Multimodal Responses**: Generates both text and images
- **WhatsApp Integration**: Sends images directly through WhatsApp
- **Debug Mode**: Enhanced logging for troubleshooting

- **Native Image Generation**: Gemini 2.5 Flash with built-in image generation
- **Multimodal Responses**: Generates both text and images in a single response
- **No External Tools**: Image generation is handled directly by the model
- **WhatsApp Delivery**: Images uploaded and sent as native WhatsApp photos