- Context Management - Automatically summarize history when approaching token limits while preserving recent messages
- Session Persistence - Continue multi-turn conversations across runs using threads without losing context
- Execution Control - Limit steps, tokens, cost, or time to prevent runaway executions
Context Window Management
Automatic Summarization
Thesummarize_when_long hook automatically compresses conversation history when it grows too large:
-
Proactive: Before each step, checks if the last generation exceeded
max_tokens. If so, summarizes older messages before continuing. - Reactive: If a context length error occurs, summarizes the history and retries the step.
Summarization Options
guidance parameter helps the summarizer preserve information relevant to your task.
Limiting Agent Execution
Usemax_steps to limit think-act cycles, or stop conditions for finer control:
TaskAgent which continues until the agent explicitly calls finish_task or give_up_on_task:
Session Persistence
Continuing Conversations
Use explicit threads to continue conversations across runs:model_dump_json() may fail due to circular references. For session persistence, consider storing conversation history separately or using database-backed state management.
Breaking Work Into Sessions
For very long tasks, break work into resumable sessions:Handling Agent Stalling
When an agent stops calling tools without completing, it “stalls”. Handle this with theretry_with_feedback hook:
never() stop condition ensures that if the agent stops calling tools, it triggers AgentStalled rather than finishing successfully.
TaskAgent for Goal-Oriented Work
TaskAgent is pre-configured for tasks that require explicit completion:
Combining Strategies
For production workloads, combine multiple strategies:Monitoring Long Runs
Track progress during long-running agents:Best Practices
For Long-Running Analysis Tasks
UseTaskAgent with context management and safety limits:
For Iterative Workflows
Break work into explicit sessions with thread persistence:Thread Management
DO:- Use
Thread()objects to maintain conversation history across multipleagent.run()calls - Monitor thread size with
len(thread.messages) - Use summarization hooks to manage context window
- Try to serialize threads with
model_dump_json()(causes circular reference errors) - Reuse the same thread across different agents or tasks
- Ignore context window limits without summarization

