Commit 9b457dc

Add Firecrawl to third-party tools (#839)
* Add Firecrawl tools page
* Add absolute path
* Python code samples
* Update icon and cards
* Update Firecrawl page content
* Update nav
* Fix image link
* Other cleanup

---------

Co-authored-by: Joe Fernandez <[email protected]>
1 parent 278bd2f commit 9b457dc

7 files changed: +163 -10 lines changed


docs/assets/tools-firecrawl.png

40.1 KB

docs/tools/index.md

Lines changed: 10 additions & 0 deletions
@@ -133,6 +133,16 @@ Check out the following pre-built tools that you can use with ADK agents:
133133

134134
<div class="tool-card-grid">
135135

136+
<a href="/adk-docs/tools/third-party/firecrawl/" class="tool-card">
137+
<div class="tool-card-image-wrapper">
138+
<img src="../assets/tools-firecrawl.png" alt="Firecrawl">
139+
</div>
140+
<div class="tool-card-content">
141+
<h3>Firecrawl</h3>
142+
<p>Empower your AI apps with clean data from any website</p>
143+
</div>
144+
</a>
145+
136146
<a href="/adk-docs/tools/third-party/github/" class="tool-card">
137147
<div class="tool-card-image-wrapper">
138148
<img src="../assets/tools-github.png" alt="GitHub">

docs/tools/third-party/firecrawl.md

Lines changed: 138 additions & 0 deletions
@@ -0,0 +1,138 @@
# Firecrawl

The [Firecrawl MCP Server](https://github.com/firecrawl/firecrawl-mcp-server)
connects your ADK agent to the [Firecrawl](https://www.firecrawl.dev/) API, a
service that can crawl any website and convert its content into clean,
structured markdown. This allows your agent to ingest, search, and reason over
web data from any URL, including all its subpages.

## Features

- **Agent-based Web Research**: Deploy an agent that can take a topic, use the
  search tool to find relevant URLs, and then use the scrape tool to extract the
  full content of each page for analysis or summarization.

- **Structured Data Extraction**: Use the extract tool to pull specific,
  structured information (like product names, prices, or contact info) from a
  list of URLs, powered by LLM extraction.

- **Large-Scale Content Ingestion**: Automate the scraping of entire websites or
  large batches of URLs using the batch scrape and crawl tools. This is ideal
  for populating a vector database for a RAG (Retrieval-Augmented Generation)
  pipeline.

## Prerequisites

- [Sign up on Firecrawl](https://www.firecrawl.dev/signin) and [get an API key](https://firecrawl.dev/app/api-keys)

## Usage with ADK

=== "Local MCP Server"

```python
from google.adk.agents.llm_agent import Agent
from google.adk.tools.mcp_tool.mcp_session_manager import StdioConnectionParams
from google.adk.tools.mcp_tool.mcp_toolset import MCPToolset
from mcp import StdioServerParameters

# Replace the placeholder with your Firecrawl API key.
FIRECRAWL_API_KEY = "YOUR_FIRECRAWL_API_KEY"

root_agent = Agent(
    model="gemini-2.5-pro",
    name="firecrawl_agent",
    description="A helpful assistant for scraping websites with Firecrawl",
    instruction="Help the user search for website content",
    tools=[
        MCPToolset(
            connection_params=StdioConnectionParams(
                # Start the Firecrawl MCP server locally with npx.
                server_params=StdioServerParameters(
                    command="npx",
                    args=[
                        "-y",
                        "firecrawl-mcp",
                    ],
                    env={
                        "FIRECRAWL_API_KEY": FIRECRAWL_API_KEY,
                    },
                ),
                timeout=30,
            ),
        )
    ],
)
```

=== "Remote MCP Server"

```python
from google.adk.agents.llm_agent import Agent
from google.adk.tools.mcp_tool.mcp_session_manager import StreamableHTTPServerParams
from google.adk.tools.mcp_tool.mcp_toolset import MCPToolset

# Replace the placeholder with your Firecrawl API key.
FIRECRAWL_API_KEY = "YOUR_FIRECRAWL_API_KEY"

root_agent = Agent(
    model="gemini-2.5-pro",
    name="firecrawl_agent",
    description="A helpful assistant for scraping websites with Firecrawl",
    instruction="Help the user search for website content",
    tools=[
        MCPToolset(
            connection_params=StreamableHTTPServerParams(
                # The API key is part of the remote endpoint URL.
                url=f"https://mcp.firecrawl.dev/{FIRECRAWL_API_KEY}/v2/mcp",
            ),
        )
    ],
)
```
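
Both samples hardcode the API key as a placeholder string. As a minimal sketch, assuming you export the key in a `FIRECRAWL_API_KEY` environment variable before starting the agent, you could read it at startup and build the remote URL from it. The helper names `load_firecrawl_api_key` and `remote_mcp_url` are hypothetical and not part of ADK or Firecrawl; the URL pattern is the one used in the remote sample.

```python
import os


def load_firecrawl_api_key(environ=os.environ):
    """Return the Firecrawl API key from the environment (hypothetical helper)."""
    key = environ.get("FIRECRAWL_API_KEY", "").strip()
    if not key:
        raise RuntimeError("Set FIRECRAWL_API_KEY before starting the agent.")
    return key


def remote_mcp_url(api_key):
    """Build the remote MCP endpoint using the URL pattern from the remote sample."""
    return f"https://mcp.firecrawl.dev/{api_key}/v2/mcp"
```

You could then pass `remote_mcp_url(load_firecrawl_api_key())` as the `url` argument of `StreamableHTTPServerParams`, or pass the key itself in the `env` mapping of the local sample.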

## Available tools

This toolset provides the following tools for web crawling, scraping, and
searching:

Tool | Name | Description
---- | ---- | -----------
Scrape Tool | `firecrawl_scrape` | Scrape content from a single URL with advanced options
Batch Scrape Tool | `firecrawl_batch_scrape` | Scrape multiple URLs efficiently with built-in rate limiting and parallel processing
Check Batch Status | `firecrawl_check_batch_status` | Check the status of a batch operation
Map Tool | `firecrawl_map` | Map a website to discover all indexed URLs on the site
Search Tool | `firecrawl_search` | Search the web and optionally extract content from search results
Crawl Tool | `firecrawl_crawl` | Start an asynchronous crawl with advanced options
Check Crawl Status | `firecrawl_check_crawl_status` | Check the status of a crawl job
Extract Tool | `firecrawl_extract` | Extract structured information from web pages using LLM capabilities. Supports both cloud AI and self-hosted LLM extraction

## Configuration

The Firecrawl MCP server can be configured using environment variables:

**Required**:

- `FIRECRAWL_API_KEY`: Your Firecrawl API key
    - Required when using the cloud API (default)
    - Optional when using a self-hosted instance with `FIRECRAWL_API_URL`

**Firecrawl API URL (optional)**:

- `FIRECRAWL_API_URL`: Custom API endpoint for self-hosted instances
    - Example: `https://firecrawl.your-domain.com`
    - If not provided, the cloud API is used (requires an API key)

**Retry configuration (optional)**:

- `FIRECRAWL_RETRY_MAX_ATTEMPTS`: Maximum number of retry attempts (default: 3)
- `FIRECRAWL_RETRY_INITIAL_DELAY`: Initial delay in milliseconds before first retry (default: 1000)
- `FIRECRAWL_RETRY_MAX_DELAY`: Maximum delay in milliseconds between retries (default: 10000)
- `FIRECRAWL_RETRY_BACKOFF_FACTOR`: Exponential backoff multiplier (default: 2)
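
To see how the four retry settings interact, the sketch below computes the wait before each retry. It assumes the delay starts at the initial delay, is multiplied by the backoff factor after each attempt, and is capped at the maximum delay; the server's actual implementation may differ (for example, by adding jitter). `retry_delays_ms` is a hypothetical helper, not part of Firecrawl or ADK.

```python
def retry_delays_ms(max_attempts=3, initial_delay=1000, max_delay=10000, backoff_factor=2):
    """Return the assumed delay in milliseconds before each retry.

    The defaults mirror the documented defaults of the FIRECRAWL_RETRY_* variables.
    """
    return [
        min(initial_delay * backoff_factor ** attempt, max_delay)
        for attempt in range(max_attempts)
    ]


print(retry_delays_ms())                # [1000, 2000, 4000]
print(retry_delays_ms(max_attempts=6))  # [1000, 2000, 4000, 8000, 10000, 10000]
```

Under these assumptions the defaults never reach the 10000 ms cap; it only takes effect from the fifth retry onward.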

**Credit usage monitoring (optional)**:

- `FIRECRAWL_CREDIT_WARNING_THRESHOLD`: Credit usage warning threshold (default: 1000)
- `FIRECRAWL_CREDIT_CRITICAL_THRESHOLD`: Credit usage critical threshold (default: 100)
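
For a local server, these variables go in the `env` mapping of `StdioServerParameters`, next to the API key. A small sketch of assembling that mapping follows. `build_firecrawl_env` is a hypothetical helper, and it assumes environment values must be strings, so numeric settings are converted with `str()`.

```python
def build_firecrawl_env(api_key=None, api_url=None, **overrides):
    """Assemble the env mapping for a local Firecrawl MCP server (hypothetical helper).

    overrides: optional settings keyed by variable name,
    for example FIRECRAWL_RETRY_MAX_ATTEMPTS=5.
    """
    if not api_key and not api_url:
        # The cloud API is the default and requires a key.
        raise ValueError("Provide api_key, or api_url for a self-hosted instance.")
    env = {}
    if api_key:
        env["FIRECRAWL_API_KEY"] = api_key
    if api_url:
        env["FIRECRAWL_API_URL"] = api_url
    # Convert every override to a string before adding it to the mapping.
    env.update({name: str(value) for name, value in overrides.items()})
    return env


env = build_firecrawl_env(
    api_key="YOUR_FIRECRAWL_API_KEY",
    FIRECRAWL_RETRY_MAX_ATTEMPTS=5,
    FIRECRAWL_CREDIT_WARNING_THRESHOLD=2000,
)
```

The resulting dictionary can replace the literal `env={...}` argument in the local MCP server sample.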

## Additional resources

- [Firecrawl MCP Server Documentation](https://docs.firecrawl.dev/mcp-server)
- [Firecrawl MCP Server Repository](https://github.com/firecrawl/firecrawl-mcp-server)
- [Firecrawl Use Cases](https://docs.firecrawl.dev/use-cases/overview)
- [Firecrawl Advanced Scraping Guide](https://docs.firecrawl.dev/advanced-scraping-guide)

docs/tools/third-party/hugging-face.md

Lines changed: 3 additions & 1 deletion
@@ -29,6 +29,8 @@ your ADK agent to the Hugging Face Hub and thousands of Gradio AI Applications.
2929
from google.adk.tools.mcp_tool.mcp_toolset import MCPToolset
3030
from mcp import StdioServerParameters
3131

32+
HUGGING_FACE_TOKEN = "YOUR_HUGGING_FACE_TOKEN"
33+
3234
root_agent = Agent(
3335
model="gemini-2.5-pro",
3436
name="hugging_face_agent",
@@ -43,7 +45,7 @@ your ADK agent to the Hugging Face Hub and thousands of Gradio AI Applications.
4345
"@llmindset/hf-mcp-server",
4446
],
4547
env={
46-
"HF_TOKEN": "YOUR-HUGGING-FACE-TOKEN",
48+
"HF_TOKEN": HUGGING_FACE_TOKEN,
4749
}
4850
),
4951
timeout=30,

docs/tools/third-party/index.md

Lines changed: 10 additions & 0 deletions
@@ -4,6 +4,16 @@ Check out the following third-party tools that you can use with ADK agents:
44

55
<div class="tool-card-grid">
66

7+
<a href="/adk-docs/tools/third-party/firecrawl/" class="tool-card">
8+
<div class="tool-card-image-wrapper">
9+
<img src="../../assets/tools-firecrawl.png" alt="Firecrawl">
10+
</div>
11+
<div class="tool-card-content">
12+
<h3>Firecrawl</h3>
13+
<p>Empower your AI apps with clean data from any website</p>
14+
</div>
15+
</a>
16+
717
<a href="/adk-docs/tools/third-party/github/" class="tool-card">
818
<div class="tool-card-image-wrapper">
919
<img src="../../assets/tools-github.png" alt="GitHub">

docs/tutorials/index.md

Lines changed: 0 additions & 8 deletions
@@ -46,12 +46,4 @@ applications with ADK. Explore our collection below and happy building:
4646

4747
[:octicons-arrow-right-24: Discover adk-samples](https://github.com/google/adk-samples){:target="_blank"}
4848

49-
- :material-console-line: **Agentic UI with AG-UI**
50-
51-
---
52-
53-
Build a rich user interface for your agent using the AG-UI protocol and CopilotKit.
54-
55-
[:octicons-arrow-right-24: Build an agentic UI](ag-ui.md)
56-
5749
</div>

mkdocs.yml

Lines changed: 2 additions & 1 deletion
@@ -163,11 +163,12 @@ nav:
163163
- Code Execution with Agent Engine: tools/google-cloud/code-exec-agent-engine.md
164164
- Third-party tools:
165165
- tools/third-party/index.md
166+
- Firecrawl: tools/third-party/firecrawl.md
166167
- GitHub: tools/third-party/github.md
167168
- Hugging Face: tools/third-party/hugging-face.md
168169
- LangChain tools: tools/third-party/langchain.md
169170
- CrewAI tools: tools/third-party/crewai.md
170-
- Agentic UI (AG-UI): tools/third-party/ag-ui.md
171+
- Agentic UI (AG-UI): tools/third-party/ag-ui.md
171172
- Custom Tools:
172173
- tools-custom/index.md
173174
- Function tools:
