Agent-mode trajectories
Run a capability-bound agent against a Worlds manifest with a pinned runtime, and capture a policy snapshot for reproducibility.
Agent mode replaces the built-in kali/c2 samplers with an agent you authored — your
prompts, your tools, your skills — running inside a Dreadnode runtime against the Worlds
environment. The result is a trajectory shaped exactly like the algorithmic ones
(success, termination reason, replayable steps) but driven by your own policy.
When to use agent mode
Reach for agent mode when:
- You want to measure a specific capability against an environment.
- You’re collecting training data for your own agent, not for a generic sampler.
- You need the trajectory’s action vocabulary to come from tools you wrote, not the Worlds backend’s built-in command list.
For volume data, negative examples, or quick shape-of-graph sampling, kali or c2 are
faster. See Trajectories for the algorithmic path.
What you need
- A manifest. Generation is the same as any other trajectory — only mode changes. See Manifests.
- A runtime. The runtime binds the model, environment, and tooling version the agent will use. See Runtimes.
- A capability installed on that runtime. The capability defines the agent’s prompts, tools, and skills. See Capabilities.
Submit an agent-mode trajectory
```shell
dn worlds trajectory-create \
  --manifest-id <manifest-id> \
  --goal "Domain Admins" \
  --count 1 \
  --mode agent \
  --runtime-id <runtime-id> \
  --capability-name threat-hunting \
  --agent-name triage
```
--runtime-id and --capability-name are required for mode=agent. --agent-name
picks one agent from the capability when more than one is defined; omit it to use the
capability's default.
--strategy still applies. Agent mode respects the strategy as a hint — recon-first
biases early tool calls toward enumeration, for example — but the agent can diverge.
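The "hint, not a rule" behavior can be pictured as a weighted tool choice. This is a toy sketch, not the Worlds implementation: the tool names, weights, and step cutoff are all assumptions made for illustration.

```python
import random

# Hypothetical recon tool names -- not the Worlds backend's vocabulary.
RECON_TOOLS = {"enumerate_shares", "list_users"}

def pick_tool(candidates, strategy, step, rng):
    """Weight enumeration tools more heavily early in a recon-first run.

    The bias is soft: every candidate keeps nonzero weight, so the agent
    can still diverge from the strategy hint.
    """
    def weight(tool):
        if strategy == "recon-first" and step < 5 and tool in RECON_TOOLS:
            return 3.0  # a bias, not a hard constraint
        return 1.0
    weights = [weight(t) for t in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]
```

With this weighting, an early step under recon-first picks an enumeration tool about three times as often as any other candidate, while leaving every action reachable.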
The policy snapshot
At submission time, Worlds captures a policy snapshot: an immutable record of which runtime and capability version will execute the trajectory. The snapshot is attached to the trajectory job and carries:
- runtime_id and runtime_digest — the runtime's pinned version.
- capability_name, capability_version, and capability_artifact_digest — the capability bundle's identity and content hash.
- capability_runtime_digest — how the capability is resolved on that runtime.
- agent_name — the specific agent inside the capability, if set.
The snapshot exists so trajectories stay reproducible even when the runtime or capability changes later. A trajectory you ran last month can be replayed, scored, and reasoned about against the exact policy that produced it. Updating the capability doesn’t retroactively rewrite what happened.
What gets recorded
Agent-mode trajectories capture the native agent run:
- Messages (user, assistant, tool) with the agent’s reasoning preserved.
- Tool calls with their arguments.
- Tool observations — results, errors, exit codes from the Worlds backend.
- Per-step metadata (targets, state transitions) on top of the message log.
This is the shape the training ETL reads when you turn agent-mode trajectories into SFT conversations or RL rollout data.
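A minimal sketch of that shape, and of how an ETL step might flatten it into SFT turns. The keys (`messages`, `tool_calls`, `observations`, `metadata`) and the example content are assumptions for illustration, not the actual trajectory schema.

```python
# One recorded step, with hypothetical keys mirroring the list above:
# messages, tool calls with arguments, observations, and step metadata.
step = {
    "messages": [
        {"role": "user", "content": "Find a path to Domain Admins."},
        {"role": "assistant", "content": "Starting with share enumeration."},
    ],
    "tool_calls": [{"name": "enumerate_shares", "args": {"target": "dc01"}}],
    "observations": [{"result": "ADMIN$, SYSVOL", "exit_code": 0}],
    "metadata": {"target": "dc01", "state": "recon"},
}

def to_sft_turns(step):
    """Flatten one step into role-tagged turns for an SFT conversation.

    Tool calls are paired with their observations and emitted as tool turns,
    so the agent's reasoning and the backend's results stay interleaved.
    """
    turns = list(step["messages"])
    for call, obs in zip(step["tool_calls"], step["observations"]):
        turns.append({"role": "tool",
                      "content": f"{call['name']} -> {obs['result']}"})
    return turns
```

An RL pipeline would read the same record but key rewards off the observations and per-step metadata rather than flattening everything into turns.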
Pairing with rollouts
Agent-mode trajectories land as durable records in the control plane — good for datasets and post-hoc scoring. For online RL where you want to run the agent in-process and shape rewards as steps happen, use rollouts instead. They share a runtime concept; the trade-off is durability vs. feedback latency.
What’s next
- Feed the run into training: Training integration
- See the snapshot structure: Trajectory reference
- Step-by-step inspection: Replay & artifacts