Custom engines
Run your own Claude Agent SDK loop on Dreadnode — wrap your orchestration as an engine and keep sessions, scoring, and evaluation.
A built-in engine like claude-code runs a standard Claude Code agent that the platform configures. When you already have your own agent loop — multi-phase orchestration, custom tool dispatch, logic between turns, written against the Claude Agent SDK — you wrap that loop as a custom engine instead. Your code keeps running; sessions, scoring, and evaluation ride on the events it emits.
A custom engine is an AgentEngine whose run_loop runs your code and yields native events as it goes.
Your existing loop
Say you have a two-phase pentest agent. The phase logic between turns is your code:

    import claude_agent_sdk as sdk

    async def run_pentest(target_url: str) -> str:
        recon = []
        async for msg in sdk.query(
            prompt=f"Recon {target_url}",
            options=sdk.ClaudeAgentOptions(allowed_tools=["Bash", "WebFetch"]),
        ):
            recon.append(msg)

        endpoints = parse_endpoints(recon)  # your logic
        target = pick_target(endpoints)     # your logic

        out = ""
        async for msg in sdk.query(prompt=f"Exploit {target}", options=sdk.ClaudeAgentOptions()):
            out = final_text(msg)  # your logic
        return out

The same loop, wrapped as an engine
Subclass ClaudeCodeEngine to reuse its message translation. Your orchestration moves into run_loop; the only new lines turn each SDK message into native events and dispatch them:

    import claude_agent_sdk as sdk

    from dreadnode.agents.engines import (
        ClaudeCodeEngine,
        ClaudeCodeTranslationState,
        EngineContext,
        register_engine,
    )
    from dreadnode.agents.events import AgentEnd, AgentStart


    @register_engine
    class PentestEngine(ClaudeCodeEngine):
        name = "acme-pentest"

        async def run_loop(self, ctx: EngineContext):
            state = ClaudeCodeTranslationState()

            async for ev in ctx.dispatch(
                AgentStart(agent_id=ctx.agent.agent_id, agent_name=ctx.agent.name)
            ):
                yield ev

            # --- recon phase ---
            recon = []
            async for msg in sdk.query(
                prompt=f"Recon {ctx.goal}",
                options=sdk.ClaudeAgentOptions(allowed_tools=["Bash", "WebFetch"]),
            ):
                recon.append(msg)
                for ev in self.translate(ctx, msg, state):  # SDK message -> native events
                    async for out in ctx.dispatch(ev):      # scorers/hooks run here
                        yield out

            endpoints = parse_endpoints(recon)  # your logic, unchanged
            target = pick_target(endpoints)     # your logic, unchanged

            # --- exploit phase ---
            async for msg in sdk.query(prompt=f"Exploit {target}", options=sdk.ClaudeAgentOptions()):
                for ev in self.translate(ctx, msg, state):
                    async for out in ctx.dispatch(ev):
                        yield out

            async for ev in ctx.dispatch(
                AgentEnd(agent_id=ctx.agent.agent_id, status="finished", stop_reason="finished")
            ):
                yield ev

parse_endpoints, pick_target, and your phase structure are untouched. The new code is the run_loop shell plus self.translate(...) and ctx.dispatch(...) around each message.
What each piece does
- ctx (EngineContext) carries the run: ctx.goal is the task input, ctx.agent is the agent config, and ctx.dispatch(event) runs an event through the agent’s hooks.
- self.translate(ctx, msg, state) turns one Claude Agent SDK message into native events: assistant text and tool calls become GenerationStep, tool results become ToolStep, with reasoning and token usage carried along.
- ClaudeCodeTranslationState holds per-run state (step counter, pending tool calls, token totals); it’s named for the harness it parses, so a future engine like codex has its own.
- ctx.dispatch(event) is where scoring happens. Each event flows through the agent’s hooks, so the scorers attached to the agent (“did discovery surface the right endpoints,” “did validation over-filter”) see every step your loop emits.
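To make the translate-then-dispatch flow concrete, here is a minimal stand-in sketch that does not import Dreadnode at all. ToyStep, ToyContext, translate, and the hook signature are hypothetical names invented for illustration, and the real EngineContext.dispatch may behave differently. The sketch only shows why dispatch is consumed with async for: each hook sees the event before it is handed back to the loop, so a scorer observes every step.

```python
import asyncio
from dataclasses import dataclass, field
from typing import AsyncIterator, Callable, List


@dataclass
class ToyStep:
    """Hypothetical stand-in for a native event such as GenerationStep."""
    kind: str
    text: str


@dataclass
class ToyContext:
    """Hypothetical stand-in for EngineContext: holds hooks and dispatches events."""
    hooks: List[Callable[[ToyStep], None]] = field(default_factory=list)

    async def dispatch(self, event: ToyStep) -> AsyncIterator[ToyStep]:
        # Run every hook against the event, then hand the event back to the caller.
        for hook in self.hooks:
            hook(event)
        yield event


def translate(message: str) -> List[ToyStep]:
    """Hypothetical translator: one raw message becomes a list of events."""
    kind = "tool" if message.startswith("tool:") else "generation"
    return [ToyStep(kind=kind, text=message)]


async def run(messages: List[str], ctx: ToyContext) -> List[ToyStep]:
    emitted = []
    for msg in messages:
        for ev in translate(msg):
            async for out in ctx.dispatch(ev):
                emitted.append(out)
    return emitted


# A "scorer" hook that simply records the kind of every event it sees.
seen_by_scorer: List[str] = []
ctx = ToyContext(hooks=[lambda ev: seen_by_scorer.append(ev.kind)])
trajectory = asyncio.run(run(["found /admin", "tool: curl /admin"], ctx))
```

The nested loops mirror the shape used in run_loop: one SDK message can translate to several events, and each event can yield several outputs once hooks have run.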
run_loop must yield exactly one terminal AgentEnd. Everything between AgentStart and AgentEnd becomes the trajectory.
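One way to guard this invariant in your own tests is to drain the loop and count terminal events. The sketch below is an assumption-laden illustration, not a Dreadnode utility: Event, good_loop, bad_loop, and check_terminal are hypothetical names, and events are reduced to a type string so the example runs on its own.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, List


@dataclass
class Event:
    """Hypothetical stand-in for a native event; only the type name matters here."""
    type: str


async def good_loop() -> AsyncIterator[Event]:
    yield Event("AgentStart")
    yield Event("GenerationStep")
    yield Event("AgentEnd")


async def bad_loop() -> AsyncIterator[Event]:
    # Forgets the terminal event entirely.
    yield Event("AgentStart")
    yield Event("GenerationStep")


async def check_terminal(loop: AsyncIterator[Event]) -> List[str]:
    """Collect event types and verify there is exactly one AgentEnd, in last position."""
    types = [ev.type async for ev in loop]
    ends = types.count("AgentEnd")
    if ends != 1 or types[-1] != "AgentEnd":
        raise ValueError(f"expected one terminal AgentEnd, got {ends}")
    return types


good_types = asyncio.run(check_terminal(good_loop()))

try:
    asyncio.run(check_terminal(bad_loop()))
    bad_loop_rejected = False
except ValueError:
    bad_loop_rejected = True
```

A check like this is most useful for loops with early returns or exception paths, where it is easy to exit a phase without emitting the terminal event.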
Declare it on an agent
Reference the engine by module:Class in the agent frontmatter; the runtime imports it:

    ---
    name: pentest-agent
    model: claude-sonnet-4-5
    engine: acme.engines:PentestEngine
    ---

No platform change is needed; engine: resolves built-in names and import references.
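For intuition about what an import reference involves, a module:Class string can be resolved with the standard library alone. This is a sketch of the general technique, not the runtime's actual resolver; resolve_reference is a hypothetical helper, and the demonstration uses a standard-library class because acme.engines is not importable here.

```python
import importlib


def resolve_reference(ref: str) -> type:
    """Resolve a 'package.module:ClassName' string to the class it names."""
    module_name, sep, class_name = ref.partition(":")
    if not sep or not module_name or not class_name:
        raise ValueError(f"expected 'module:Class', got {ref!r}")
    module = importlib.import_module(module_name)
    obj = getattr(module, class_name)
    if not isinstance(obj, type):
        raise TypeError(f"{ref!r} does not name a class")
    return obj


# Stand-in for "acme.engines:PentestEngine", using a class that always exists.
resolved = resolve_reference("collections:OrderedDict")

# A bare name has no colon, so this helper rejects it; a real resolver would
# presumably look such names up among the built-in engines instead.
try:
    resolve_reference("claude-code")
    bare_name_rejected = False
except ValueError:
    bare_name_rejected = True
```

The practical consequence is that the module in the reference has to be importable from wherever the runtime executes the agent.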
Declare what it can govern
A foreign loop can’t enforce everything the native loop can, so an engine declares its enforcement surface and the runtime reconciles it against the session policy (see Engines → Governance). Subclassing ClaudeCodeEngine inherits an honest default: autonomy and step budget enforced, tool approval bridged, mid-loop steering observe-only. Override describe_enforcement if your loop enforces more or less. If a policy needs something your engine can’t enforce, the runtime refuses the session rather than pretending.
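The reconciliation idea can be sketched with plain data. Everything below is hypothetical: the Level enum, the capability names, the dictionary shape, and the rule that a bridged capability satisfies a policy are assumptions made for illustration, and the actual return type of describe_enforcement is not shown in this page.

```python
from enum import Enum
from typing import Dict, Set


class Level(Enum):
    ENFORCED = "enforced"
    BRIDGED = "bridged"
    OBSERVE_ONLY = "observe-only"


# Hypothetical declaration mirroring the inherited default described above.
DEFAULT_SURFACE: Dict[str, Level] = {
    "autonomy": Level.ENFORCED,
    "step_budget": Level.ENFORCED,
    "tool_approval": Level.BRIDGED,
    "steering": Level.OBSERVE_ONLY,
}


def reconcile(surface: Dict[str, Level], required: Set[str]) -> None:
    """Raise if the policy requires a control the engine only observes or never declared."""
    unmet = sorted(
        name
        for name in required
        if surface.get(name, Level.OBSERVE_ONLY) is Level.OBSERVE_ONLY
    )
    if unmet:
        raise PermissionError(f"engine cannot enforce: {', '.join(unmet)}")


# A policy that needs autonomy and step-budget control is satisfied.
reconcile(DEFAULT_SURFACE, {"autonomy", "step_budget"})

# A policy that needs mid-loop steering is refused rather than silently ignored.
try:
    reconcile(DEFAULT_SURFACE, {"steering"})
    session_refused = False
except PermissionError:
    session_refused = True
```

Failing closed on undeclared capabilities matches the stated behavior of refusing the session instead of pretending to enforce.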
Where it runs
A custom engine runs your code inside the runtime. Treat it like any code you ship: an engine referenced by module:Class that isn’t an audited built-in runs out-of-process inside the runtime sandbox, which is the platform’s isolation boundary. Keep secrets and untrusted input on the sandbox side of that line.