
Optimization

Submit, inspect, wait on, and retry hosted optimization jobs from the dn CLI.

dn optimize ... is the hosted optimization control-plane surface. Use it when the capability, dataset, and reward recipe already exist and you want the platform to run the job.

If you are still iterating on a local capability, use dn capability improve first. That command optimizes capability-owned local files against local datasets and leaves behind a candidate bundle plus ledger. dn optimize ... is for the published, hosted path.

Today the hosted CLI path is intentionally narrow:

  • backend: gepa
  • target kind: capability_agent
  • optimized component: agent instructions

That is useful when you want platform-managed prompt or instruction improvement, not arbitrary local search.

Hosted optimization is intentionally opinionated. The cleanest way to think about it is:

  1. pick a published capability
  2. pick the agent inside that capability whose instructions should change
  3. pick a published dataset
  4. pick a hosted reward recipe that scores the outputs

If any of those ingredients are still unstable, the SDK is usually a better place to experiment first.

That is the main boundary between the two CLI surfaces:

  • dn capability improve is local, stack-aware, and capability-scoped
  • dn capability improve can optionally use a proposer capability to suggest edits while the CLI still owns scoring and acceptance
  • dn optimize submit is hosted, published-artifact-based, and instruction-only today
dn optimize submit \
--server http://127.0.0.1:8000 \
--api-key "$DREADNODE_API_KEY" \
--organization dreadnode \
--workspace localdev \
--project default \
--model openai/gpt-4o-mini \
--capability NAME@VERSION \
--agent-name assistant \
--dataset NAME@VERSION \
--val-dataset NAME@VERSION \
--reward-recipe exact_match_v1 \
--objective "Improve instruction quality without increasing verbosity." \
--max-metric-calls 100 \
--max-trials 10 \
--max-trials-without-improvement 3 \
--max-runtime-sec 1800 \
--reflection-lm gpt-5-mini \
--wait \
--json

What that command is doing:

  • it optimizes the selected agent’s instructions, not model weights
  • --dataset is the training set used during search
  • --val-dataset is the held-out set for checking whether the improvement generalizes
  • --reward-recipe defines how each candidate is scored
  • --reflection-lm controls the model used during reflection steps, which can be different from the target model being improved
Flag                                   Description
--capability NAME@VERSION              capability artifact containing the target agent
--agent-name <name>                    required when the capability exports multiple agents
--dataset NAME@VERSION                 training dataset used during optimization
--val-dataset NAME@VERSION             optional held-out validation dataset
--reward-recipe <name>                 declarative hosted reward recipe
--reward-params <json>                 JSON params passed to the reward recipe
--seed <n>                             deterministic optimization seed
--max-metric-calls <n>                 metric-call budget
--max-trials <n>                       hard stop after this many trials
--max-trials-without-improvement <n>   stop after this many finished trials without a new best
--max-runtime-sec <n>                  outer hosted sandbox lifetime override
--reflection-lm <model>                reflection model override; defaults to --model
--no-capture-traces                    disable trajectory capture for reflection
--wait                                 poll until terminal state
--json                                 print the full job payload
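If a recipe takes parameters, --reward-params accepts a JSON object. A minimal sketch for checking that the payload is well-formed before submitting; the "case_sensitive" key is a hypothetical parameter for illustration, not a documented one:

```shell
# NOTE: "case_sensitive" is an assumed parameter name, not a documented
# value; check what your chosen reward recipe actually accepts.
PARAMS='{"case_sensitive": false}'

# json.tool exits non-zero on malformed JSON, catching quoting mistakes
# before the job is submitted.
if echo "$PARAMS" | python3 -m json.tool > /dev/null; then
  echo "params OK"
fi

# Then pass it through on submit:
# dn optimize submit ... --reward-params "$PARAMS"
```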

The three budget flags (--max-metric-calls, --max-trials, --max-trials-without-improvement) solve different problems:

  • --max-metric-calls limits scoring budget
  • --max-trials limits search length
  • --max-trials-without-improvement stops stagnant jobs that keep looping without a better result

If the job is already near-perfect but still iterating, --max-trials-without-improvement is usually the most useful brake.
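The limits compose: whichever budget is exhausted first ends the job. A sketch of a conservative combination, with illustrative values only, alongside the required flags from the full example above:

```shell
# Illustrative budgets; the job stops at whichever limit is hit first.
dn optimize submit \
  --capability NAME@VERSION \
  --dataset NAME@VERSION \
  --reward-recipe exact_match_v1 \
  --max-metric-calls 50 \
  --max-trials 5 \
  --max-trials-without-improvement 2
```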

Once the job exists, use the control-plane commands for different layers of inspection:

dn optimize list
dn optimize get <job-id>
dn optimize wait <job-id> --json
dn optimize logs <job-id>
dn optimize artifacts <job-id>
dn optimize cancel <job-id> --json
dn optimize retry <job-id>

Use them like this:

  • list finds old or in-flight jobs
  • get shows the saved config and top-level status
  • wait is the simplest way to block until a terminal outcome
  • logs tells you what the optimization loop is currently doing
  • artifacts is where to look for outputs worth reusing
  • cancel stops a job that is still running
  • retry reruns a terminal job when you want the same setup again

dn optimize wait exits non-zero if the job ends in failed or cancelled.
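That exit behavior makes wait easy to branch on in scripts. A sketch, with the job-id placeholder left as-is:

```shell
JOB_ID="<job-id>"

# wait exits non-zero when the job ends in failed or cancelled,
# so a plain if/else covers both outcomes.
if dn optimize wait "$JOB_ID" --json > job.json; then
  dn optimize artifacts "$JOB_ID"   # success: look for reusable outputs
else
  dn optimize logs "$JOB_ID"        # failure: see what the loop was doing
fi
```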

A completed job only tells you that the hosted loop finished. It does not tell you whether the result is useful.

After a successful run, check:

  • whether the best score actually improved
  • whether validation stayed strong, not just training
  • whether the artifacts contain instructions you would really want to ship

Hosted optimization runs inside real sandboxes. If the job state and the underlying compute seem out of sync, inspect the compute directly:

dn sandbox list --state running
dn sandbox get <provider-sandbox-id>
dn sandbox logs <provider-sandbox-id>
dn sandbox delete --yes <provider-sandbox-id>

See /cli/sandboxes/ for the compute view.

Use dn optimize submit only after:

  • the capability is already published
  • the dataset is already published
  • the reward recipe is already known

If you are still iterating locally on the metric or the candidate shape, the SDK is usually the better place to experiment first.