The Agent class is stateless: it is a reusable orchestrator. All conversation history (messages and events) is stored in a separate Thread object, enabling flexible state management across runs.
Why use threads:
- Persist Conversations - Continue multi-turn conversations across runs by passing the same thread to multiple agent calls
- Multi-Agent Collaboration - Share context between specialized agents by passing a common thread
- Fork and Explore - Create independent branches from a thread to test different approaches in parallel
Thread Properties
| Property | Type | Description |
|---|---|---|
| messages | list[Message] | All messages in the conversation. |
| events | list[AgentEvent] | All events that occurred during runs. |
| total_usage | Usage | Aggregated token usage across all generations. |
| last_usage | Usage \| None | Token usage from the most recent generation. |
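To make the relationship between these properties concrete, here is a minimal sketch of how they might evolve over two generations. `ThreadSketch`, the fields of `Usage`, and `record_generation` are hypothetical stand-ins written for this illustration; they are not the library's own types.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Usage:
    """Hypothetical stand-in for the library's Usage type."""
    input_tokens: int = 0
    output_tokens: int = 0

    def __add__(self, other: "Usage") -> "Usage":
        return Usage(
            self.input_tokens + other.input_tokens,
            self.output_tokens + other.output_tokens,
        )


@dataclass
class ThreadSketch:
    """Illustrative model of the four properties in the table (not the real Thread)."""
    messages: List[str] = field(default_factory=list)
    events: List[str] = field(default_factory=list)
    total_usage: Usage = field(default_factory=Usage)
    last_usage: Optional[Usage] = None  # None until the first generation

    def record_generation(self, reply: str, usage: Usage) -> None:
        # Assumed behavior: each generation appends a message and an event,
        # adds to the running total, and overwrites last_usage.
        self.messages.append(reply)
        self.events.append("generation")
        self.total_usage = self.total_usage + usage
        self.last_usage = usage


thread = ThreadSketch()
thread.record_generation("first reply", Usage(100, 20))
thread.record_generation("second reply", Usage(150, 30))

print(thread.total_usage)  # aggregated across both generations
print(thread.last_usage)   # only the most recent generation
```

In this sketch `total_usage` only ever grows, while `last_usage` reflects a single generation, which is why it can be `None` on a fresh thread.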
Default Thread
For convenience, every agent has an internal thread. When you call run() or stream() without specifying a thread, the agent uses this internal one:
To clear the internal thread, call agent.reset():
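The default-thread behavior can be sketched with small stand-in classes. The `Agent` and `Thread` below are hypothetical and only mimic the shape described here; in particular, the assumption that `reset()` swaps in a fresh, empty thread is this sketch's own and should be checked against the real library.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Thread:
    """Hypothetical stand-in holding only a message list."""
    messages: List[str] = field(default_factory=list)


class Agent:
    """Hypothetical stand-in: owns an internal thread used when none is passed."""

    def __init__(self) -> None:
        self.thread = Thread()

    def run(self, prompt: str, thread: Optional[Thread] = None) -> str:
        # Fall back to the internal thread when the caller passes none.
        target = thread if thread is not None else self.thread
        target.messages.append(f"user: {prompt}")
        reply = f"echo: {prompt}"
        target.messages.append(f"assistant: {reply}")
        return reply

    def reset(self) -> None:
        # Assumption: reset discards the internal history.
        self.thread = Thread()


agent = Agent()
agent.run("hello")
agent.run("and again")
count_before = len(agent.thread.messages)  # both turns landed on the internal thread

agent.reset()
count_after = len(agent.thread.messages)   # history is gone after reset
```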
Explicit Thread Management
Pass a Thread to run() or stream() for explicit control over conversation state.
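As a rough illustration of what explicit control buys you, the sketch below passes two separate threads to a stand-in `run` function. Both `Thread` and `run` are hypothetical placeholders for the library's `Thread` class and `agent.run(..., thread=...)`; only the pattern of passing the thread in is taken from the text.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Thread:
    """Hypothetical stand-in holding only a message list."""
    messages: List[str] = field(default_factory=list)


def run(prompt: str, thread: Thread) -> str:
    """Hypothetical stand-in for agent.run(prompt, thread=thread)."""
    turn = len(thread.messages) // 2 + 1  # each turn adds two messages
    thread.messages.append(f"user: {prompt}")
    reply = f"reply {turn} to {prompt!r}"
    thread.messages.append(f"assistant: {reply}")
    return reply


support = Thread()
billing = Thread()

first = run("My build fails", support)
second = run("Here is the log", support)  # continues the same conversation
other = run("Update my card", billing)    # isolated from the support thread
```

Because the state lives in the thread rather than the agent, reusing `support` continues that conversation, while `billing` starts from an empty history.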
Persisting Conversations
Resume conversations by reusing a thread:
Multi-Agent Collaboration
Share a thread between specialized agents:
Thread Forking
Create an independent copy of a thread with fork():
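The key property of a fork is independence: changes to a branch do not leak back into the original or into sibling branches. The sketch below models this with a deep copy. The `Thread` class here is a hypothetical stand-in, and implementing `fork()` via `copy.deepcopy` is an assumption of this illustration, not a statement about how the library does it.

```python
import copy
from dataclasses import dataclass, field
from typing import List


@dataclass
class Thread:
    """Hypothetical stand-in holding only a message list."""
    messages: List[str] = field(default_factory=list)

    def fork(self) -> "Thread":
        # Assumption: a fork behaves like a deep copy of the thread's state.
        return copy.deepcopy(self)


base = Thread(messages=["user: Plan a database migration"])

cautious = base.fork()
aggressive = base.fork()

# Each branch diverges without touching the others.
cautious.messages.append("assistant: Migrate one table at a time")
aggressive.messages.append("assistant: Migrate everything in one window")
```

After this runs, `base` still holds only the original message, and each branch holds the original plus its own reply.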
Parallel Exploration
Fork threads to explore multiple approaches concurrently:
Accessing Thread State After a Run
After a run completes, access the thread’s state directly and use the result for run-specific metrics:
Usage Tracking
Track token usage across runs:
Best Practices
When to Use Explicit Threads
Use the default thread (agent.thread) when:
- Single agent, single task execution
- No need to preserve state between separate operations
- Quick prototyping or simple scripts
Use explicit threads when:
- Continuing conversations across multiple runs
- Multiple agents need to share context
- You need to inspect or fork conversation state
- Building production applications with session management
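For the session-management case in the list above, one possible arrangement is to keep one thread per session ID and hand the matching thread to each run. `SessionStore` and the minimal `Thread` below are hypothetical names invented for this sketch; a production version would also need persistence and eviction.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Thread:
    """Hypothetical stand-in holding only a message list."""
    messages: List[str] = field(default_factory=list)


class SessionStore:
    """Hypothetical helper mapping session IDs to their own threads."""

    def __init__(self) -> None:
        self._threads: Dict[str, Thread] = {}

    def get(self, session_id: str) -> Thread:
        # Create a thread the first time a session is seen, then reuse it.
        if session_id not in self._threads:
            self._threads[session_id] = Thread()
        return self._threads[session_id]

    def end(self, session_id: str) -> None:
        # Drop the thread when the session is over.
        self._threads.pop(session_id, None)


store = SessionStore()
store.get("alice").messages.append("user: hi")
store.get("alice").messages.append("user: still me")
store.get("bob").messages.append("user: hello")

alice_count = len(store.get("alice").messages)
bob_count = len(store.get("bob").messages)
```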
Thread Lifecycle Management
Create a new thread for each independent conversation or session:
Threads with Streaming
Threads work seamlessly with streaming - events are added to the thread as they occur:
Error Handling
When a run fails, the thread retains all messages and events up to the failure:
Memory and Context Window
Threads grow without bound. For long-running conversations, use the summarize_when_long hook:
- Proactive: Summarizes before each step if tokens exceed threshold
- Reactive: If a context length error occurs, summarizes and retries
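The two modes above can be sketched as a single step function. This is not the `summarize_when_long` hook itself: `count_tokens`, `summarize`, `ContextLengthError`, `fake_generate`, and `step_with_summarization` are all hypothetical names for this illustration, token counting is approximated by word count, and the summary is a placeholder string rather than a model-generated one.

```python
from typing import Callable, List, Tuple


class ContextLengthError(Exception):
    """Hypothetical error raised when the model's context window is exceeded."""


def count_tokens(messages: List[str]) -> int:
    # Crude approximation: one token per whitespace-separated word.
    return sum(len(m.split()) for m in messages)


def summarize(messages: List[str], keep_last: int = 2) -> List[str]:
    # Replace everything except the most recent messages with one summary line.
    if len(messages) <= keep_last:
        return list(messages)
    head, tail = messages[:-keep_last], messages[-keep_last:]
    return [f"summary of {len(head)} earlier messages"] + tail


def step_with_summarization(
    messages: List[str],
    generate: Callable[[List[str]], str],
    max_tokens: int,
) -> Tuple[List[str], str]:
    # Proactive: shrink the history before the step if it is over the threshold.
    if count_tokens(messages) > max_tokens:
        messages = summarize(messages)
    try:
        return messages, generate(messages)
    except ContextLengthError:
        # Reactive: the model rejected the context, so summarize and retry once.
        messages = summarize(messages)
        return messages, generate(messages)


def fake_generate(messages: List[str]) -> str:
    """Stand-in model call with a hard limit of 10 'tokens'."""
    if count_tokens(messages) > 10:
        raise ContextLengthError("context too long")
    return f"ok ({len(messages)} messages)"


history = ["one two three four", "five six seven eight", "nine ten", "eleven twelve"]
compacted, reply = step_with_summarization(history, fake_generate, max_tokens=10)
```

Setting `max_tokens` above the model's real limit disables the proactive path in this sketch, leaving only the reactive retry.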
Common Pitfalls
Do not share threads across unrelated tasks:
model_dump_json():

