Tasks

Define, publish, and validate security tasks for agents.

Terminal window
$ dn task <command>

Tasks are environments with success conditions that agents operate in, used for evaluations, training, and optimization.

Aliases: new

Terminal window
$ dn task init <name>

Scaffold a new task directory ready for development.

The scaffolded task.yaml doubles as an entrypoint to the task contract: every spec feature appears as a commented opt-in block with a one-line hint. Pass --with-verify / --with-solution to scaffold the matching script stub and uncomment the matching block. Pass any catalog metadata flag (--description, --difficulty, --tag, etc.) to pre-fill that field.

The result passes structural validation immediately. dn task validate may still emit best-practice warnings until you fill in catalog metadata and add a reference solution.

Options

  • <name>, --name (Required)

Catalog metadata

  • --initial-version (default 0.1.0) — Initial semver version for the task.
  • --description — One-line catalog summary.
  • --difficulty — Difficulty level. [choices: easy, medium, hard]
  • --tag — Discovery tag (repeatable).
  • --source — Suite or group the task belongs to (e.g. apex, portswigger).
  • --author — Task author (free-form string).
  • --license — SPDX license identifier (e.g. MIT, Apache-2.0).
  • --repository — Source repository URL.
  • --max-agent-timeout-sec — Evaluation timeout hint in seconds (advisory).

Optional supplemental scripts

  • --with-verify (default False) — Drop a verify.sh stub and switch verification.method to script.
  • --with-solution (default False) — Drop a solution.sh stub and uncomment the solution: block.

Shape

  • --remote (default False) — Scaffold a remote/external task — no docker-compose, no Dockerfile.
  • --force (default False) — Overwrite an existing directory at the target path.
  • --path (default .) — Parent directory to create the task folder in.

Verification

  • --with-verify (default False) — Drop a verify.sh stub and switch verification.method to script.
  • --flag-value — Plaintext value for verification.value (default flag method only).
  • --flag-path — Path the agent writes for the flag (default /tmp/result.txt).
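
As a rough illustration of how these flags combine, the sketch below wraps a single dn task init call in a shell function. The function name scaffold_task, the DN override variable, and every metadata value are assumptions made for the example; only the flags themselves come from the option list above.

```shell
# DN is a hypothetical override: set DN="echo dn" to preview the command
# instead of running it.
DN="${DN:-dn}"

# scaffold_task NAME
# Scaffold a task with catalog metadata pre-filled and with verify.sh and
# solution.sh stubs. All values below are illustrative placeholders.
scaffold_task() {
  local name="$1"
  $DN task init "$name" \
    --description "Example SQL injection lab" \
    --difficulty easy \
    --tag web --tag sqli \
    --license MIT \
    --with-verify \
    --with-solution
}
```

Calling scaffold_task my-task should create ./my-task, since --path defaults to the current directory. --flag-value is left out here because it applies only to the default flag method, not to script verification.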

Aliases: upload

Terminal window
$ dn task push <path>

Publish a task to your organization’s registry.

Builds an OCI image from the task directory and pushes it. Skips the upload if the remote content already matches (idempotent). Pass --publish to make the task discoverable by other organizations.
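
One possible workflow, sketched under assumptions, is to do a local-only build first and publish only if it succeeds. The function name check_then_push and the DN override variable are hypothetical; --skip-upload and --publish are the flags described under Options below.

```shell
# DN is a hypothetical override: set DN="echo dn" to preview the commands.
DN="${DN:-dn}"

# check_then_push TASK_DIR
# First build and validate locally without publishing; if that succeeds,
# push for real and make the task publicly discoverable.
check_then_push() {
  local task_dir="$1"
  $DN task push "$task_dir" --skip-upload &&
    $DN task push "$task_dir" --publish
}
```

Because the push skips the upload when the remote content already matches, running check_then_push ./my-task repeatedly on an unchanged task should not re-upload it.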

Options

  • <path>, --path (Required) — Task directory containing task.yaml and docker-compose.yaml.
  • --name — Override the registry name.
  • --skip-upload (default False) — Build and validate locally without publishing.
  • --force (default False) — Push even if the remote content already matches.
  • --publish (default False) — Ensure the task is publicly discoverable after publishing.
Terminal window
$ dn task publish <refs>

Make one or more task families visible to other organizations.

Options

  • <refs>, --refs (Required)

Terminal window
$ dn task unpublish <refs>

Make one or more task families private.

Options

  • <refs>, --refs (Required)

Aliases: ls

Terminal window
$ dn task list

Show tasks in your organization.

Options

  • --search — Search by name or description.
  • --limit (default 50) — Maximum results to show.
  • --include-public (default False) — Include public tasks from other organizations.
  • --json (default False) — Output raw JSON instead of a summary.

Terminal window
$ dn task info <ref>

Show details and instructions for a task.

Displays metadata, visibility, difficulty, tags, and the full task instruction. Version is optional — defaults to the latest.

Options

  • <ref>, --ref (Required) — Task to inspect (e.g. my-task, or my-task@<version> to pin a version).
  • --json (default False) — Output raw JSON instead of formatted summary.

Aliases: download

Terminal window
$ dn task pull <ref>

Download a task for local development or inspection.

Pulls the task from the registry and extracts it to the local package cache. Use this to inspect how a task is built, fork it, or test it locally with docker compose.

Options

  • <ref>, --ref (Required) — Task to pull (e.g. my-task or acme/my-task).
  • --upgrade (default False) — Re-download even if already cached locally.

Aliases: rm

Terminal window
$ dn task delete <ref>

Remove a published task version from the registry.

Options

  • <ref>, --ref (Required) — Task to delete (e.g. my-task@<version>). Version is required.
  • --yes, -y (default False) — Skip the confirmation prompt.

Terminal window
$ dn task sync <directory>

Publish all tasks from a directory — ideal for CI pipelines.

Discovers subdirectories containing task.yaml, compares each against the registry by content hash, and only publishes those that changed.

Options

  • <directory>, --directory (Required) — Root directory containing task subdirectories.
  • --force (default False) — Publish all tasks even if unchanged.
  • --publish (default False) — Ensure published tasks are publicly discoverable.
  • --workers (default 8) — Number of parallel upload workers.
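
To preview which folders a sync would consider, the discovery step can be approximated in plain shell. This is only a sketch, not dn's implementation: the function name list_task_dirs is hypothetical, it assumes discovery looks at immediate subdirectories only, and it does not perform the content-hash comparison.

```shell
# list_task_dirs ROOT
# Print each immediate subdirectory of ROOT that contains a task.yaml.
# Assumption: only one level of nesting is inspected.
list_task_dirs() {
  local root="$1" dir
  for dir in "$root"/*/; do
    if [ -f "${dir}task.yaml" ]; then
      printf '%s\n' "${dir%/}"   # strip the trailing slash from the glob match
    fi
  done
  return 0
}
```

In a CI job, list_task_dirs ./tasks could be run for logging before dn task sync ./tasks, optionally with --workers to change the upload parallelism.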

Aliases: check

Terminal window
$ dn task validate <path>

Check that task definitions are well-formed before publishing.

Validates task.yaml, docker-compose.yaml, port mappings, and script references. Discovers and validates all tasks when given a parent directory. When a path does not exist locally but resolves to a published task, validation can pull the remote task into a temporary local directory and run the same validation flow.

Options

  • <path>, --path (Required) — Task directory, parent directory containing multiple tasks, or published task ref when using remote validation.
  • --strict (default False) — Treat warnings as failures (exit code 1).
  • --build (default False) — Also run docker compose build for each task.
  • --smoke (default False) — Full lifecycle test: boot containers, verify that verify.sh rejects unsolved state, and (if solution.sh exists) verify it accepts the reference solution. Implies --build.
  • --pull (default False) — Treat path as a published task ref and pull it for local validation.
  • --yes, -y (default False) — Accept remote validation without prompting when path is not local.
  • --timeout — Per-task wall-clock budget in seconds for smoke testing. When unset, falls back to the task’s max_agent_timeout_sec, or to 120 seconds if neither is declared.
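
The fallback order for the smoke-test budget can be written out as a small shell function. This is a sketch of the rule as documented for --timeout, not code taken from dn; the function name effective_timeout and its argument convention (an empty string meaning "not set") are assumptions.

```shell
# effective_timeout CLI_TIMEOUT TASK_TIMEOUT
# Resolve the per-task smoke-test budget in seconds:
#   1. the --timeout value, if given
#   2. otherwise the task's max_agent_timeout_sec, if declared
#   3. otherwise 120
# Pass an empty string for a value that is not set.
effective_timeout() {
  local cli_timeout="$1" task_timeout="$2"
  if [ -n "$cli_timeout" ]; then
    echo "$cli_timeout"
  elif [ -n "$task_timeout" ]; then
    echo "$task_timeout"
  else
    echo 120
  fi
}
```

For example, effective_timeout "" 300 prints 300, and effective_timeout "" "" prints 120.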