
Optimization

Submit, inspect, wait on, and retry hosted optimization jobs from the dn CLI.

dn optimize ... is the hosted optimization control-plane surface. Use it when the capability, dataset, and reward recipe already exist and you want the platform to run the job.

If you are still iterating on a local capability, use dn capability improve first. That command optimizes capability-owned local files against local datasets and leaves behind a candidate bundle plus ledger. dn optimize ... is for the published, hosted path.

Today the hosted CLI path is intentionally narrow:

  • backend: gepa
  • target kind: capability_agent
  • optimized component: agent instructions

That is useful when you want platform-managed prompt or instruction improvement, not arbitrary local search.

Hosted optimization is intentionally opinionated. The cleanest way to think about it is:

  1. pick a published capability
  2. pick the agent inside that capability whose instructions should change
  3. pick a published dataset
  4. pick a hosted reward recipe that scores the outputs

If any of those ingredients are still unstable, the SDK is usually a better place to experiment first.

That is the main boundary between the two CLI surfaces:

  • dn capability improve is local, stack-aware, and capability-scoped
  • dn capability improve can optionally use a proposer capability to suggest edits while the CLI still owns scoring and acceptance
  • dn optimize submit is hosted, published-artifact-based, and instruction-only today
dn optimize submit \
--server http://127.0.0.1:8000 \
--api-key "$DREADNODE_API_KEY" \
--organization dreadnode \
--workspace localdev \
--project default \
--model openai/gpt-4o-mini \
--capability NAME@VERSION \
--agent-name assistant \
--dataset NAME@VERSION \
--val-dataset NAME@VERSION \
--reward-recipe exact_match_v1 \
--objective "Improve instruction quality without increasing verbosity." \
--max-metric-calls 100 \
--max-trials 10 \
--max-trials-without-improvement 3 \
--max-runtime-sec 1800 \
--reflection-lm gpt-5-mini \
--wait \
--json

What that command is doing:

  • it optimizes the selected agent’s instructions, not model weights
  • --dataset is the training set used during search
  • --val-dataset is the held-out set for checking whether the improvement generalizes
  • --reward-recipe defines how each candidate is scored
  • --reflection-lm controls the model used during reflection steps, which can be different from the target model being improved
Flag                                   Description
--capability NAME@VERSION              capability artifact containing the target agent
--agent-name <name>                    required when the capability exports multiple agents
--dataset NAME@VERSION                 training dataset used during optimization
--val-dataset NAME@VERSION             optional held-out validation dataset
--reward-recipe <name>                 declarative hosted reward recipe
--reward-params <json>                 JSON params passed to the reward recipe
--seed <n>                             deterministic optimization seed
--max-metric-calls <n>                 metric-call budget
--max-trials <n>                       hard stop after this many trials
--max-trials-without-improvement <n>   stop after this many finished trials without a new best
--max-runtime-sec <n>                  outer hosted sandbox lifetime override
--reflection-lm <model>                reflection model override; defaults to --model
--no-capture-traces                    disable trajectory capture for reflection
--wait                                 poll until terminal state
--json                                 print the full job payload
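If a recipe takes parameters, --reward-params accepts a JSON object. A minimal sketch for checking that the payload is well-formed before submitting; the "case_sensitive" key is a hypothetical parameter for illustration, not a documented one:

```shell
# NOTE: "case_sensitive" is an assumed parameter name, not a documented
# value; check what your chosen reward recipe actually accepts.
PARAMS='{"case_sensitive": false}'

# json.tool exits non-zero on malformed JSON, catching quoting mistakes
# before the job is submitted.
if echo "$PARAMS" | python3 -m json.tool > /dev/null; then
  echo "params OK"
fi

# Then pass it through on submit:
# dn optimize submit ... --reward-params "$PARAMS"
```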

The three budget flags (--max-metric-calls, --max-trials, --max-trials-without-improvement) solve different problems:

  • --max-metric-calls limits scoring budget
  • --max-trials limits search length
  • --max-trials-without-improvement stops stagnant jobs that keep looping without a better result

If the job is already near-perfect but still iterating, --max-trials-without-improvement is usually the most useful brake.
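The limits compose: whichever budget is exhausted first ends the job. A sketch of a conservative combination, with illustrative values only, alongside the required flags from the full example above:

```shell
# Illustrative budgets; the job stops at whichever limit is hit first.
dn optimize submit \
  --capability NAME@VERSION \
  --dataset NAME@VERSION \
  --reward-recipe exact_match_v1 \
  --max-metric-calls 50 \
  --max-trials 5 \
  --max-trials-without-improvement 2
```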

Once the job exists, use the control-plane commands for different layers of inspection:

dn optimize list
dn optimize get <job-id>
dn optimize wait <job-id> --json
dn optimize logs <job-id>
dn optimize artifacts <job-id>
dn optimize cancel <job-id> --json
dn optimize retry <job-id>

Use them like this:

  • list finds old or in-flight jobs
  • get shows the saved config and top-level status
  • wait is the simplest way to block until a terminal outcome
  • logs tells you what the optimization loop is currently doing
  • artifacts is where to look for outputs worth reusing
  • cancel stops a job that is still running
  • retry reruns a terminal job when you want the same setup again

dn optimize wait exits non-zero if the job ends in failed or cancelled.
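That exit behavior makes wait easy to branch on in scripts. A sketch, with the job-id placeholder left as-is:

```shell
JOB_ID="<job-id>"

# wait exits non-zero when the job ends in failed or cancelled,
# so a plain if/else covers both outcomes.
if dn optimize wait "$JOB_ID" --json > job.json; then
  dn optimize artifacts "$JOB_ID"   # success: look for reusable outputs
else
  dn optimize logs "$JOB_ID"        # failure: see what the loop was doing
fi
```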

A completed job only tells you that the hosted loop finished. It does not tell you whether the result is useful.

After a successful run, check:

  • whether the best score actually improved
  • whether validation stayed strong, not just training
  • whether the artifacts contain instructions you would really want to ship

Hosted optimization runs inside real sandboxes. If the job state and the underlying compute seem out of sync, inspect the compute directly:

dn sandbox list --state running
dn sandbox get <provider-sandbox-id>
dn sandbox logs <provider-sandbox-id>
dn sandbox delete --yes <provider-sandbox-id>

See /cli/sandboxes/ for the compute view.

Use dn optimize submit only after:

  • the capability is already published
  • the dataset is already published
  • the reward recipe is already known

If you are still iterating locally on the metric or the candidate shape, the SDK is usually the better place to experiment first.