Tasks

Tasks are the unit of security challenge definition on Dreadnode. A task packages the instruction, environment, and verification rules for a challenge, but it is not itself a runtime session.

A task definition includes:

  • An instruction for the agent
  • An environment, usually defined by docker-compose.yaml
  • Service and port metadata used to expose the environment safely
  • Verification rules that determine whether the challenge is solved
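
The four parts listed above can be pictured as a small data model. The sketch below is illustrative only: the class names, field names, and the `missing_parts` helper are hypothetical and do not reflect the actual task.yaml schema, which is not shown here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServicePort:
    """Hypothetical metadata describing one service the environment exposes."""
    service: str
    port: int


@dataclass(frozen=True)
class TaskDefinition:
    """Illustrative model of a task; not the platform's real schema."""
    instruction: str
    compose_file: str = "docker-compose.yaml"
    services: tuple = ()            # tuple of ServicePort entries
    verification_rules: tuple = ()  # opaque rule names in this sketch


def missing_parts(task: TaskDefinition) -> list:
    """Return the names of the parts that are empty, in declaration order."""
    parts = {
        "instruction": task.instruction,
        "compose_file": task.compose_file,
        "services": task.services,
        "verification_rules": task.verification_rules,
    }
    return [name for name, value in parts.items() if not value]
```

A definition with all four parts filled in yields an empty list from `missing_parts`. Treating every part as required is an assumption of this sketch.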

Tasks are reusable. The same task definition can be run interactively or as part of a benchmark evaluation.

Task definitions are uploaded as OCI artifacts. The platform validates the bundled task.yaml and compose files, stores the archive, and records the task as pending. The provider-specific template or image is built lazily on first execution, then reused for later runs.

When a task is first executed, the platform builds a runnable environment from the task definition. A build is in one of four states: queued, building, ready, or failed. Once built, the environment is reused for subsequent runs.
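
One way to reason about the build lifecycle is as a small state machine. The sketch below uses the four state names from the text. The allowed transitions, and the `advance` and `is_reusable` helpers, are assumptions made for illustration, since the source lists only the state names.

```python
from enum import Enum


class BuildState(Enum):
    QUEUED = "queued"
    BUILDING = "building"
    READY = "ready"
    FAILED = "failed"


# Assumed transitions: queued -> building -> ready or failed.
# Whether a failed build can be retried is not stated, so this
# sketch treats both ready and failed as terminal.
ALLOWED = {
    BuildState.QUEUED: {BuildState.BUILDING},
    BuildState.BUILDING: {BuildState.READY, BuildState.FAILED},
    BuildState.READY: set(),
    BuildState.FAILED: set(),
}


def advance(current: BuildState, target: BuildState) -> BuildState:
    """Move to the target state, rejecting transitions outside ALLOWED."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot go from {current.value} to {target.value}")
    return target


def is_reusable(state: BuildState) -> bool:
    """Only a ready build can back later runs without rebuilding."""
    return state is BuildState.READY
```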

Task instructions support template variables like {{ web_url }} and {{ service_url }} that are resolved against live sandbox connection details when you start an interactive session. This lets task authors write generic instructions that automatically include the correct URLs for each execution.
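
To make the substitution concrete, here is a minimal sketch of how a `{{ name }}` placeholder could be resolved against connection details. It is not the platform's renderer: the `render_instruction` function is hypothetical, the example URL is made up, and leaving unknown placeholders untouched is an assumption of this sketch.

```python
import re

# Matches {{ name }} with optional whitespace inside the braces.
_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_instruction(template: str, connection: dict) -> str:
    """Replace each known placeholder with its value from `connection`."""

    def substitute(match):
        name = match.group(1)
        # Unknown names are left as written (assumed behavior).
        return connection.get(name, match.group(0))

    return _PLACEHOLDER.sub(substitute, template)


instruction = "Log in to the app at {{ web_url }} and retrieve the flag."
rendered = render_instruction(instruction, {"web_url": "http://127.0.0.1:8080"})
```

Because the values come from the live session, the same instruction text renders with a different URL on each execution.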

For interactive work, the platform provisions a runtime. Runtimes are exposed through the workspace-scoped runtimes API, while the underlying runtime records remain visible in the sandboxes inventory.

For judged and repeatable runs, the platform creates an evaluation. Each evaluation item combines the task environment with a runtime sandbox, runs the agent inside that runtime, and then executes the task’s verification rules. If the task has never been built for the active sandbox provider, the first run triggers that build before the environment sandbox is provisioned.
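
The order of operations for a single evaluation item can be sketched as follows. Everything here is hypothetical: `run_evaluation_item`, the step labels, and the `built` set stand in for platform internals that are not described in detail, and the agent and verifier are plain callables supplied by the caller.

```python
from typing import Callable, List, Set, Tuple


def run_evaluation_item(
    task_name: str,
    provider: str,
    built: Set[Tuple[str, str]],
    run_agent: Callable[[str], str],
    verify: Callable[[str], bool],
) -> Tuple[bool, List[str]]:
    """Run one item and return (passed, ordered list of steps taken)."""
    steps: List[str] = []
    key = (task_name, provider)

    # First run for this provider: build before provisioning anything.
    if key not in built:
        steps.append("build")
        built.add(key)

    steps.append("provision_environment")
    steps.append("provision_runtime")

    # The agent runs inside the runtime; here it just returns a string.
    steps.append("run_agent")
    output = run_agent(task_name)

    # Verification rules come from the task; the platform executes them.
    steps.append("verify")
    return verify(output), steps
```

Calling the function twice with the same `built` set shows the lazy build: the first call includes a `build` step and the second does not.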

Verification remains part of the task definition. The task defines what success means, and the platform performs the actual verification step during execution.