
Custom engines

Run your own Claude Agent SDK loop on Dreadnode — wrap your orchestration as an engine and keep sessions, scoring, and evaluation.

A built-in engine like claude-code runs a standard Claude Code agent that the platform configures. When you already have your own agent loop — multi-phase orchestration, custom tool dispatch, logic between turns, written against the Claude Agent SDK — you wrap that loop as a custom engine instead. Your code keeps running; sessions, scoring, and evaluation ride on the events it emits.

A custom engine is an AgentEngine whose run_loop runs your code and yields native events as it goes.

Say you have a two-phase pentest agent. The phase logic between turns is your code:

import claude_agent_sdk as sdk


async def run_pentest(target_url: str) -> str:
    recon = []
    async for msg in sdk.query(
        prompt=f"Recon {target_url}",
        options=sdk.ClaudeAgentOptions(allowed_tools=["Bash", "WebFetch"]),
    ):
        recon.append(msg)

    endpoints = parse_endpoints(recon)  # your logic
    target = pick_target(endpoints)  # your logic

    out = ""
    async for msg in sdk.query(prompt=f"Exploit {target}", options=sdk.ClaudeAgentOptions()):
        out = final_text(msg)
    return out

Subclass ClaudeCodeEngine to reuse its message translation. Your orchestration moves into run_loop; the only new lines turn each SDK message into native events and dispatch them:

import claude_agent_sdk as sdk
from dreadnode.agents.engines import (
    ClaudeCodeEngine,
    ClaudeCodeTranslationState,
    EngineContext,
    register_engine,
)
from dreadnode.agents.events import AgentEnd, AgentStart


@register_engine
class PentestEngine(ClaudeCodeEngine):
    name = "acme-pentest"

    async def run_loop(self, ctx: EngineContext):
        state = ClaudeCodeTranslationState()
        async for ev in ctx.dispatch(
            AgentStart(agent_id=ctx.agent.agent_id, agent_name=ctx.agent.name)
        ):
            yield ev

        # --- recon phase ---
        recon = []
        async for msg in sdk.query(
            prompt=f"Recon {ctx.goal}",
            options=sdk.ClaudeAgentOptions(allowed_tools=["Bash", "WebFetch"]),
        ):
            recon.append(msg)
            for ev in self.translate(ctx, msg, state):  # SDK message -> native events
                async for out in ctx.dispatch(ev):  # scorers/hooks run here
                    yield out

        endpoints = parse_endpoints(recon)  # your logic, unchanged
        target = pick_target(endpoints)  # your logic, unchanged

        # --- exploit phase ---
        async for msg in sdk.query(prompt=f"Exploit {target}", options=sdk.ClaudeAgentOptions()):
            for ev in self.translate(ctx, msg, state):
                async for out in ctx.dispatch(ev):
                    yield out

        async for ev in ctx.dispatch(
            AgentEnd(agent_id=ctx.agent.agent_id, status="finished", stop_reason="finished")
        ):
            yield ev

parse_endpoints, pick_target, and your phase structure are untouched. The new code is the run_loop shell plus self.translate(...) and ctx.dispatch(...) around each message.

  • ctx (EngineContext) carries the run: ctx.goal is the task input, ctx.agent is the agent config, and ctx.dispatch(event) runs an event through the agent’s hooks.
  • self.translate(ctx, msg, state) turns one Claude Agent SDK message into native events: assistant text and tool calls become GenerationStep, tool results become ToolStep, and reasoning and token usage are carried along. ClaudeCodeTranslationState holds per-run state (step counter, pending tool calls, token totals). It is named for the harness it parses, so an engine for another harness, such as a future codex engine, would have its own state class.
  • ctx.dispatch(event) is where scoring happens. Each event flows through the agent’s hooks, so the scorers attached to the agent — “did discovery surface the right endpoints,” “did validation over-filter” — see every step your loop emits.
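The translate-then-dispatch pattern described in the bullets above can be tried without the platform installed. The following is a minimal sketch using stand-ins only: Event, FakeContext, fake_query, translate, and pump are hypothetical names invented for illustration, not Dreadnode or Claude Agent SDK APIs. It shows the shape of the double loop and why a hook attached to the context sees every event the loop emits.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, Callable, Iterable, List


@dataclass
class Event:
    """Stand-in for a native event (hypothetical, not a Dreadnode class)."""
    kind: str
    payload: str


class FakeContext:
    """Stand-in for EngineContext: dispatch() runs each event through hooks."""

    def __init__(self, hooks: List[Callable[[Event], None]]):
        self.hooks = hooks

    async def dispatch(self, event: Event) -> AsyncIterator[Event]:
        for hook in self.hooks:
            hook(event)  # a scorer would inspect the event at this point
        yield event


async def fake_query(prompt: str) -> AsyncIterator[str]:
    """Stand-in for sdk.query: yields messages as plain strings."""
    for msg in (f"notes on {prompt}", "tool:curl /api"):
        yield msg


def translate(msg: str) -> Iterable[Event]:
    """Stand-in for self.translate: one message -> zero or more events."""
    kind = "tool" if msg.startswith("tool:") else "generation"
    return [Event(kind, msg)]


async def pump(ctx: FakeContext, messages: AsyncIterator[str]) -> AsyncIterator[Event]:
    """The repeated translate-then-dispatch double loop, factored into one place."""
    async for msg in messages:
        for ev in translate(msg):
            async for out in ctx.dispatch(ev):
                yield out


async def demo():
    seen: List[str] = []
    ctx = FakeContext(hooks=[lambda ev: seen.append(ev.kind)])
    emitted = [ev async for ev in pump(ctx, fake_query("target"))]
    return seen, [ev.payload for ev in emitted]
```

In a real engine the same factoring could live in a small helper on your subclass, so each phase becomes a single loop over the helper rather than three nested loops.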

run_loop must yield exactly one terminal AgentEnd. Everything between AgentStart and AgentEnd becomes the trajectory.
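One way to catch a missing or duplicated AgentEnd early is to check a recorded event stream in a unit test. The sketch below is an illustration under assumptions: the three dataclasses are stand-ins rather than the real classes from dreadnode.agents.events, check_trajectory is a hypothetical helper, and the requirement that AgentStart comes first is inferred from the example engine, not stated by the platform.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AgentStart:
    """Stand-in for the real AgentStart event."""
    agent_id: str


@dataclass
class Step:
    """Stand-in for any event emitted between start and end."""
    name: str


@dataclass
class AgentEnd:
    """Stand-in for the real AgentEnd event."""
    agent_id: str
    status: str


def check_trajectory(events: List[object]) -> List[object]:
    """Return the events between AgentStart and AgentEnd, or raise ValueError."""
    ends = [i for i, ev in enumerate(events) if isinstance(ev, AgentEnd)]
    if len(ends) != 1:
        raise ValueError(f"expected exactly one AgentEnd, found {len(ends)}")
    if ends[0] != len(events) - 1:
        raise ValueError("AgentEnd must be the last event")
    # Assumption: the stream opens with AgentStart, as in the example engine.
    if not isinstance(events[0], AgentStart):
        raise ValueError("first event must be AgentStart")
    return events[1:-1]
```

A test can collect everything run_loop yields into a list and pass it to a checker like this one. An early return or an exception in one phase is the usual way to end up with no AgentEnd at all.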

Reference the engine by module:Class in the agent frontmatter — the runtime imports it:

---
name: pentest-agent
model: claude-sonnet-4-5
engine: acme.engines:PentestEngine
---

No platform change is needed: the engine: field resolves both built-in names and module:Class import references.
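To make the two forms concrete, here is a minimal sketch of how a value like the engine: field could be resolved using only the standard library. It is not Dreadnode's resolver. The resolve_engine function and the builtins mapping are hypothetical, and the real runtime may validate references differently.

```python
import importlib
from typing import Dict


def resolve_engine(ref: str, builtins: Dict[str, type]) -> type:
    """Resolve a built-in name first, then fall back to a module:Class import."""
    if ref in builtins:
        return builtins[ref]
    module_name, sep, attr = ref.partition(":")
    if not sep or not module_name or not attr:
        raise ValueError(f"not a built-in name or module:Class reference: {ref!r}")
    module = importlib.import_module(module_name)  # raises ImportError if missing
    obj = getattr(module, attr)  # raises AttributeError if the class is absent
    if not isinstance(obj, type):
        raise TypeError(f"{ref!r} does not name a class")
    return obj
```

Under this sketch, acme.engines:PentestEngine would import the acme.engines module and return its PentestEngine attribute, so the module has to be importable in the environment where the runtime starts.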

A foreign loop can’t enforce everything the native loop can, so an engine declares its enforcement surface and the runtime reconciles it against the session policy (see Engines → Governance). Subclassing ClaudeCodeEngine inherits an honest default — autonomy and step budget enforced, tool approval bridged, mid-loop steering observe-only. Override describe_enforcement if your loop enforces more or less. If a policy needs something your engine can’t enforce, the runtime refuses the session rather than pretending.
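The reconciliation idea can be illustrated with a small model. Everything in this sketch is hypothetical: Level, DEFAULT_SURFACE, reconcile, and open_session are invented names, the control keys only paraphrase the defaults listed above, and the ordering observe-only < bridged < enforced is an assumption. The actual return type of describe_enforcement and the runtime's policy format are not shown on this page.

```python
from enum import Enum
from typing import Dict, List


class Level(Enum):
    """Assumed ordering of enforcement strength (hypothetical)."""
    OBSERVE_ONLY = 0
    BRIDGED = 1
    ENFORCED = 2


# Paraphrase of the default described for ClaudeCodeEngine subclasses.
DEFAULT_SURFACE: Dict[str, Level] = {
    "autonomy": Level.ENFORCED,
    "step_budget": Level.ENFORCED,
    "tool_approval": Level.BRIDGED,
    "steering": Level.OBSERVE_ONLY,
}


def reconcile(surface: Dict[str, Level], required: Dict[str, Level]) -> List[str]:
    """Return the controls the policy needs at a level the engine does not declare."""
    return sorted(
        name
        for name, need in required.items()
        if surface.get(name, Level.OBSERVE_ONLY).value < need.value
    )


def open_session(surface: Dict[str, Level], required: Dict[str, Level]) -> str:
    """Refuse the session when any required control is not covered."""
    gaps = reconcile(surface, required)
    if gaps:
        raise PermissionError(f"engine cannot enforce: {', '.join(gaps)}")
    return "session-open"
```

In this model, a policy that requires enforced mid-loop steering is refused against the default surface, which mirrors the behavior described above: the session fails up front instead of running with a control that is only observed.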

A custom engine runs your code inside the runtime. Treat it like any code you ship: an engine referenced by module:Class that isn’t an audited built-in runs out-of-process inside the runtime sandbox — the platform’s isolation boundary. Keep secrets and untrusted input on the sandbox side of that line.

  • Engines — the built-in claude-code path and governance reconciliation.
  • Agents — the engine frontmatter field.