Commit d0c6486

docs: Add Resume feature docs (#784)
* docs: Add Resume feature
* docs: Add Resume feature docs
* remove generated update.
* final review fixes.
1 parent 340a2fb commit d0c6486

File tree

2 files changed

+242
-0
lines changed

docs/runtime/resume.md

Lines changed: 241 additions & 0 deletions
@@ -0,0 +1,241 @@
# Resume stopped agents

An ADK agent's execution can be interrupted by various factors, including
dropped network connections, power failure, or a required external system going
offline. The Resume feature of ADK allows an agent workflow to pick up where it
left off, avoiding the need to restart the entire workflow. In ADK Python 1.16
and higher, you can configure an ADK workflow to be resumable, so that it tracks
the execution of the workflow and then allows you to resume it after an
unexpected interruption.

This guide explains how to configure your ADK agent workflow to be resumable.
If you use Custom Agents, you can update them to be resumable. For more
information, see
[Add resume to custom Agents](#custom-agents).

## Add resumable configuration

Enable the Resume function for an agent workflow by applying a Resumability
configuration to the App object of your ADK workflow, as shown in the following
code example:

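```python
# Import path assumed from the ADK Python package layout.
from google.adk.apps.app import App, ResumabilityConfig

# root_agent is the agent defined elsewhere in your project.
app = App(
    name='my_resumable_agent',
    root_agent=root_agent,
    # Set the resumability config to enable resumability.
    resumability_config=ResumabilityConfig(
        is_resumable=True,
    ),
)
```
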
!!! warning "Caution: Long Running Functions, Confirmations, Authentication"
    For agents that use
    [Long Running Functions](/adk-docs/tools/function-tools/#long-run-tool),
    [Confirmations](/adk-docs/tools/confirmation/), or
    [Authentication](/adk-docs/tools/authentication/)
    requiring user input, adding a resumable configuration changes how these
    features operate. For more information, see the documentation for those
    features.

!!! info "Note: Custom Agents"
    Resume is not supported by default for Custom Agents. You must
    update the agent code for a Custom Agent to support the Resume feature. For
    information on modifying Custom Agents to support incremental resume
    functionality, see
    [Add resume to custom Agents](#custom-agents).

## Resume a stopped workflow

When an ADK workflow stops execution, you can resume it with a request that
contains the Invocation ID for the workflow instance, which can be found in the
[Event](/adk-docs/events/#understanding-and-using-events)
history of the workflow. Make sure the ADK API server is running, in case it was
interrupted or powered off, and then send the resume request, as shown in the
following API request example:

```console
# restart the API server if needed:
adk api_server my_resumable_agent/

# resume the agent:
curl -X POST http://localhost:8000/run_sse \
  -H "Content-Type: application/json" \
  -d '{
    "app_name": "my_resumable_agent",
    "user_id": "u_123",
    "session_id": "s_abc",
    "invocation_id": "invocation-123"
  }'
```

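The same resume request can also be assembled from Python. The following is a minimal sketch, assuming the API server from the console example is listening on `localhost:8000`. The helper name `build_resume_request` is hypothetical and not part of ADK. Building the body with `json.dumps` helps ensure the payload is valid JSON. The sketch constructs the request without sending it.

```python
import json
import urllib.request


def build_resume_request(app_name, user_id, session_id, invocation_id,
                         base_url="http://localhost:8000"):
    """Build (but do not send) a POST request that resumes an invocation.

    Hypothetical helper: only the endpoint and field names come from the
    console example; everything else is illustrative.
    """
    payload = {
        "app_name": app_name,
        "user_id": user_id,
        "session_id": session_id,
        "invocation_id": invocation_id,
    }
    return urllib.request.Request(
        url=f"{base_url}/run_sse",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_resume_request(
    "my_resumable_agent", "u_123", "s_abc", "invocation-123"
)
print(request.get_method(), request.full_url)
```

To send the request, pass `request` to `urllib.request.urlopen` while the API server is running.
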
You can also resume a workflow using the `run_async` method of the Runner
object, as shown below:

```python
# Call from inside a coroutine; run_async yields events as the workflow resumes.
async for event in runner.run_async(
    user_id='u_123',
    session_id='s_abc',
    invocation_id='invocation-123',
):
    print(event)

# If new_message is also set to a function response,
# the call resumes a long running function.
```

!!! info "Note"
    Resuming a workflow from the ADK Web user interface or using the ADK
    command line (CLI) tool is not currently supported.

## How it works

The Resume feature works by logging completed Agent workflow tasks,
including incremental steps, using
[Events](/adk-docs/events/) and
[Event Actions](/adk-docs/events/#detecting-actions-and-side-effects),
which track the completion of agent tasks within a resumable workflow. If a
workflow is interrupted and then later restarted, the system resumes the
workflow by setting the completion state of each agent. If an agent did not
complete, the workflow system reinstates any completed Events for that agent,
and restarts the workflow from the partially completed state. For multi-agent
workflows, the specific resume behavior varies, based on the multi-agent
classes in your workflow, as described below:

- **Sequential Agent**: Reads the `current_sub_agent` from its saved state
  to find the next sub-agent to run in the sequence.
- **Loop Agent**: Uses the `current_sub_agent` and `times_looped` values to
  continue the loop from the last completed iteration and sub-agent.
- **Parallel Agent**: Determines which sub-agents have already completed
  and only runs those that have not finished.

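The three resume behaviors above can be illustrated with a small standalone sketch. It is not ADK's implementation. The function names and the plain-list representation of sub-agents are hypothetical, and it assumes `current_sub_agent` names the sub-agent that should run next and `times_looped` counts the iterations already completed.

```python
def resume_sequential(sub_agents, current_sub_agent):
    """Return the sub-agents still to run, starting at the saved one."""
    start = sub_agents.index(current_sub_agent) if current_sub_agent else 0
    return sub_agents[start:]


def resume_loop(sub_agents, current_sub_agent, times_looped, max_iterations):
    """Return the (iteration, sub_agent) pairs still to run."""
    start = sub_agents.index(current_sub_agent) if current_sub_agent else 0
    remaining = []
    for iteration in range(times_looped, max_iterations):
        # Only the interrupted iteration starts partway through the list.
        first = start if iteration == times_looped else 0
        remaining.extend((iteration, name) for name in sub_agents[first:])
    return remaining


def resume_parallel(sub_agents, completed):
    """Return only the sub-agents that have not finished."""
    return [name for name in sub_agents if name not in completed]


agents = ["writer", "critic", "editor"]
sequential_rest = resume_sequential(agents, "critic")
loop_rest = resume_loop(agents, "editor", times_looped=1, max_iterations=3)
parallel_rest = resume_parallel(agents, completed={"writer"})
print(sequential_rest)
print(loop_rest)
print(parallel_rest)
```

In each case the saved state is the only input needed to decide what still has to run, which is why the state must be recorded as each sub-agent completes.
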
Event logging includes results from Tools that successfully returned a result.
So if an agent successfully executed Function Tools A and B, and then failed
during execution of Tool C, the system reinstates the results from
Tools A and B, and resumes the workflow by re-running the Tool C request.

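The Tool A, B, and C scenario can be simulated in a few lines. This is a sketch of the replay idea only, using a plain dictionary as a stand-in for the event log. The names `run_tools`, `event_log`, and the three tool functions are hypothetical and do not reflect ADK internals.

```python
def run_tools(requests, tools, event_log):
    """Run tool requests in order, reusing results already in event_log."""
    results = {}
    for name in requests:
        if name in event_log:
            results[name] = event_log[name]  # reinstated, not re-run
        else:
            results[name] = tools[name]()
            event_log[name] = results[name]  # logged only after success
    return results


calls = []          # records every actual tool execution
attempts = {"C": 0}


def tool_a():
    calls.append("A")
    return "a-result"


def tool_b():
    calls.append("B")
    return "b-result"


def tool_c():
    calls.append("C")
    attempts["C"] += 1
    if attempts["C"] == 1:
        raise ConnectionError("external system offline")
    return "c-result"


tools = {"A": tool_a, "B": tool_b, "C": tool_c}
event_log = {}

try:
    run_tools(["A", "B", "C"], tools, event_log)
except ConnectionError:
    pass  # the workflow is interrupted while Tool C is running

# Resume: A and B come from the log, only C executes again.
results = run_tools(["A", "B", "C"], tools, event_log)
print(calls)
```

Tools A and B each execute once, while Tool C executes twice: once in the interrupted run and once on resume.
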
!!! warning "Caution: Tool execution behavior"
    When resuming a workflow with Tools, the Resume feature ensures
    that the Tools in an agent are run ***at least once***, and may run more than
    once when resuming a workflow. If your agent uses Tools where duplicate runs
    would have a negative impact, such as purchases, you should modify the Tool to
    check for and prevent duplicate runs.

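One common way to make a Tool safe under at-least-once execution is an idempotency key. The following is a minimal sketch under stated assumptions: `PurchaseTool` and its key format are hypothetical, and the in-memory dictionary stands in for a durable store, which a real Tool would need in order to survive a restart.

```python
class PurchaseTool:
    """Hypothetical purchase tool that ignores duplicate requests."""

    def __init__(self):
        self._completed = {}  # idempotency key -> receipt (durable in practice)
        self.charges = 0      # number of times the customer was charged

    def purchase(self, idempotency_key, item, amount):
        # A repeated key means this request already succeeded once,
        # so return the stored receipt instead of charging again.
        if idempotency_key in self._completed:
            return self._completed[idempotency_key]
        self.charges += 1
        receipt = {"item": item, "amount": amount, "charge_number": self.charges}
        self._completed[idempotency_key] = receipt
        return receipt


tool = PurchaseTool()
first = tool.purchase("invocation-123:order-1", "book", 20)
second = tool.purchase("invocation-123:order-1", "book", 20)  # re-run on resume
print(tool.charges)
```

Deriving the key from values that stay the same across a resume, such as the invocation ID plus an order identifier, lets the re-run be recognized as a duplicate.
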
!!! note "Note: Workflow modification with Resume not supported"
    Do not modify a stopped agent workflow before resuming it.
    For example, adding or removing agents from a workflow that has stopped
    and then resuming that workflow is not supported.

## Add resume to custom Agents {#custom-agents}

Custom agents have specific implementation requirements in order to support
resumability. You must decide on and define workflow steps within your custom
agent that each produce a result which can be preserved before handing off to
the next step of processing. The following steps outline how to modify a Custom
Agent to support a workflow Resume.

- **Create CustomAgentState class**: Extend the `BaseAgentState` to create
  an object that preserves the state of your agent.
- **Optionally, create WorkflowStep class**: If your custom agent
  has sequential steps, consider creating a `WorkflowStep` enumeration that
  defines the discrete, savable steps of the agent.
- **Add initial agent state**: Modify your agent's async run function to
  set the initial state of your agent.
- **Add agent state checkpoints**: Modify your agent's async run function
  to generate and save the agent state for each completed step of the agent's
  overall task.
- **Add end of agent status to track agent state**: Modify your agent's
  async run function to include an `end_of_agent=True` status upon successful
  completion of the agent's full task.

The following example shows the required code modifications to the example
StoryFlowAgent class shown in the
[Custom Agents](/adk-docs/agents/custom-agents/#full-code-example)
guide:

```python
from enum import Enum
from typing import AsyncGenerator

# InvocationContext, Event, override, and logger are imported as in the
# Custom Agents guide. BaseAgentState is provided by ADK and must be imported
# as well.


class WorkflowStep(int, Enum):
    INITIAL_STORY_GENERATION = 1
    CRITIC_REVISER_LOOP = 2
    POST_PROCESSING = 3
    CONDITIONAL_REGENERATION = 4


# Extend BaseAgentState to record the next step to run.
class StoryFlowAgentState(BaseAgentState):
    step: WorkflowStep


# The following method belongs to the StoryFlowAgent class.
@override
async def _run_async_impl(
    self, ctx: InvocationContext
) -> AsyncGenerator[Event, None]:
    """
    Implements the custom orchestration logic for the story workflow.
    Uses the instance attributes assigned by Pydantic (e.g., self.story_generator).
    """
    # Load the saved state, if any, so a resumed run skips completed steps.
    agent_state = self._load_agent_state(ctx, StoryFlowAgentState)

    if agent_state is None:
        # Record the start of the agent
        agent_state = StoryFlowAgentState(step=WorkflowStep.INITIAL_STORY_GENERATION)
        yield self._create_agent_state_event(ctx, agent_state)

    next_step = agent_state.step
    logger.info(f"[{self.name}] Starting story generation workflow.")

    # Step 1. Initial Story Generation
    if next_step <= WorkflowStep.INITIAL_STORY_GENERATION:
        logger.info(f"[{self.name}] Running StoryGenerator...")
        async for event in self.story_generator.run_async(ctx):
            yield event

        # Check if story was generated before proceeding
        if "current_story" not in ctx.session.state or not ctx.session.state[
            "current_story"
        ]:
            return  # Stop processing if initial story failed

        # Checkpoint: step 1 is done, step 2 is next.
        agent_state = StoryFlowAgentState(step=WorkflowStep.CRITIC_REVISER_LOOP)
        yield self._create_agent_state_event(ctx, agent_state)

    # Step 2. Critic-Reviser Loop
    if next_step <= WorkflowStep.CRITIC_REVISER_LOOP:
        logger.info(f"[{self.name}] Running CriticReviserLoop...")
        async for event in self.loop_agent.run_async(ctx):
            logger.info(
                f"[{self.name}] Event from CriticReviserLoop: "
                f"{event.model_dump_json(indent=2, exclude_none=True)}"
            )
            yield event

        # Checkpoint: step 2 is done, step 3 is next.
        agent_state = StoryFlowAgentState(step=WorkflowStep.POST_PROCESSING)
        yield self._create_agent_state_event(ctx, agent_state)

    # Step 3. Sequential Post-Processing (Grammar and Tone Check)
    if next_step <= WorkflowStep.POST_PROCESSING:
        logger.info(f"[{self.name}] Running PostProcessing...")
        async for event in self.sequential_agent.run_async(ctx):
            logger.info(
                f"[{self.name}] Event from PostProcessing: "
                f"{event.model_dump_json(indent=2, exclude_none=True)}"
            )
            yield event

        # Checkpoint: step 3 is done, step 4 is next.
        agent_state = StoryFlowAgentState(step=WorkflowStep.CONDITIONAL_REGENERATION)
        yield self._create_agent_state_event(ctx, agent_state)

    # Step 4. Tone-Based Conditional Logic
    if next_step <= WorkflowStep.CONDITIONAL_REGENERATION:
        tone_check_result = ctx.session.state.get("tone_check_result")
        if tone_check_result == "negative":
            logger.info(f"[{self.name}] Tone is negative. Regenerating story...")
            async for event in self.story_generator.run_async(ctx):
                logger.info(
                    f"[{self.name}] Event from StoryGenerator (Regen): "
                    f"{event.model_dump_json(indent=2, exclude_none=True)}"
                )
                yield event
        else:
            logger.info(f"[{self.name}] Tone is not negative. Keeping current story.")

    logger.info(f"[{self.name}] Workflow finished.")
    # Mark the agent as complete so a later resume does not re-run it.
    yield self._create_agent_state_event(ctx, end_of_agent=True)
```
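
The `next_step <= WorkflowStep...` checks in the example decide which steps run after a resume. The standalone sketch below isolates that comparison. It repeats the `WorkflowStep` enumeration so that it runs on its own, and `steps_to_run` is a hypothetical helper written for illustration only.

```python
from enum import Enum


class WorkflowStep(int, Enum):
    INITIAL_STORY_GENERATION = 1
    CRITIC_REVISER_LOOP = 2
    POST_PROCESSING = 3
    CONDITIONAL_REGENERATION = 4


def steps_to_run(next_step):
    """Return the steps whose `next_step <= step` guard passes."""
    return [step for step in WorkflowStep if next_step <= step]


# A fresh run starts at the first step and executes everything.
fresh = [step.name for step in steps_to_run(WorkflowStep.INITIAL_STORY_GENERATION)]

# A run resumed from a POST_PROCESSING checkpoint skips steps 1 and 2.
resumed = [step.name for step in steps_to_run(WorkflowStep.POST_PROCESSING)]
print(resumed)
```

Because the enumeration mixes in `int`, the saved step compares directly against each guard, so every step at or after the checkpoint runs and every earlier step is skipped.
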

mkdocs.yml

Lines changed: 1 addition & 0 deletions
@@ -164,6 +164,7 @@ nav:
   - Running Agents:
     - Agent Runtime: runtime/index.md
     - Runtime Config: runtime/runconfig.md
+    - Resume Agents: runtime/resume.md
   - Deploy:
     - deploy/index.md
     - Agent Engine: deploy/agent-engine.md
