Tasks
Tasks are the unit of security challenge definition on Dreadnode. A task packages the instruction, environment, and verification rules for a challenge, but it is not itself a runtime session.
What a task contains
A task definition includes:
- An instruction for the agent
- An environment, usually defined by docker-compose.yaml
- Service and port metadata used to expose the environment safely
- Verification rules that determine whether the challenge is solved
Tasks are reusable. The same task definition can be run interactively or as part of a benchmark evaluation.
Task definitions are uploaded as OCI artifacts. The platform validates the bundled task.yaml and compose files, stores the archive, and records the task as pending. The provider-specific template or image is built lazily on first execution, then reused for later runs.
Task builds
When a task is first executed, the platform builds a runnable environment from the task definition. The build states are queued, building, ready, and failed. Once built, the environment is reused for subsequent runs.
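The build-once, reuse-later behavior can be sketched as a small cache. This is an illustrative model only, not the platform's implementation: `BuildCache`, `ensure_built`, and the `builder` callable are hypothetical names, and only the four state names come from the description above.

```python
from enum import Enum


class BuildState(Enum):
    # State names taken from the text; everything else here is illustrative.
    QUEUED = "queued"
    BUILDING = "building"
    READY = "ready"
    FAILED = "failed"


class BuildCache:
    """Hypothetical in-memory stand-in for the platform's build records."""

    def __init__(self, builder):
        self._builder = builder    # callable: task_name -> built environment
        self._states = {}          # (task_name, provider) -> BuildState
        self._environments = {}    # (task_name, provider) -> built environment

    def state(self, task_name, provider):
        """Return the recorded build state, or None if never built."""
        return self._states.get((task_name, provider))

    def ensure_built(self, task_name, provider):
        """Build on first use, then return the cached result on later calls."""
        key = (task_name, provider)
        if self._states.get(key) is BuildState.READY:
            return self._environments[key]  # reuse the earlier build

        # This sketch is synchronous, so "queued" is only a transient state.
        self._states[key] = BuildState.QUEUED
        self._states[key] = BuildState.BUILDING
        try:
            environment = self._builder(task_name)
        except Exception:
            self._states[key] = BuildState.FAILED
            raise
        self._environments[key] = environment
        self._states[key] = BuildState.READY
        return environment
```

Calling `ensure_built` twice with the same task and provider invokes the builder only once in this model. A failed build stays marked as failed until the next call retries it.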
Instruction rendering
Task instructions support template variables like {{ web_url }} and {{ service_url }} that are resolved against live sandbox connection details when you start an interactive session. This lets task authors write generic instructions that automatically include the correct URLs for each execution.
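As a rough illustration of how such placeholders might be resolved, the sketch below substitutes `{{ name }}` variables from a dictionary of connection details. The function name `render_instruction`, the dictionary shape, and the choice to raise on unknown variables are all assumptions; the platform's actual template engine and error handling are not described here.

```python
import re

# Matches {{ name }} with optional whitespace inside the braces.
_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_instruction(template, connection):
    """Replace each {{ name }} with the matching value from `connection`.

    `connection` is a hypothetical dict of live connection details,
    for example {"web_url": "http://localhost:8080"}.
    """
    def substitute(match):
        name = match.group(1)
        if name not in connection:
            # Assumption: fail loudly rather than leave a raw placeholder.
            raise KeyError(f"no value for template variable {name!r}")
        return str(connection[name])

    return _PLACEHOLDER.sub(substitute, template)


# Example usage with a made-up instruction and URL.
instruction = render_instruction(
    "Find the flag served at {{ web_url }}/admin",
    {"web_url": "http://localhost:8080"},
)
```

Here the same instruction text yields a different URL for each execution, depending on the connection details passed in.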
How tasks are executed
Interactive solving
For interactive work, the platform provisions a runtime. Runtimes are exposed through the workspace-scoped runtimes API, while the underlying runtime records remain visible in the sandboxes inventory.
Automated benchmarking
For judged and repeatable runs, the platform creates an evaluation. Each evaluation item combines the task environment with a runtime sandbox, runs the agent inside that runtime, and then executes the task’s verification rules. If the task has never been built for the active sandbox provider, the first run triggers that build before the environment sandbox is provisioned.
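The order of operations for one evaluation item can be sketched as follows. Every name in this block (`run_evaluation_item`, `EvaluationResult`, and the five callables passed in) is hypothetical and stands in for platform internals that this page does not specify. The sketch only mirrors the sequence described above: build if needed, provision the environment, provision the runtime, run the agent, then verify.

```python
from dataclasses import dataclass


@dataclass
class EvaluationResult:
    task_name: str
    solved: bool


def run_evaluation_item(task_name, provider, *, ensure_built,
                        provision_environment, provision_runtime,
                        run_agent, verify):
    """Run one evaluation item in the order described in the text.

    All callables are injected placeholders (assumptions), so the flow
    can be exercised without any real sandbox provider.
    """
    # 1. Build the task for this provider if it has never been built.
    build = ensure_built(task_name, provider)
    # 2. Provision the environment sandbox from that build.
    environment = provision_environment(build)
    # 3. Provision the runtime sandbox the agent will run in.
    runtime = provision_runtime()
    # 4. Run the agent inside the runtime, against the environment.
    run_agent(runtime, environment)
    # 5. Execute the task's verification rules.
    solved = bool(verify(environment))
    return EvaluationResult(task_name=task_name, solved=solved)
```

Passing the steps in as callables keeps the sketch independent of any particular sandbox provider and makes the ordering easy to check with stubs.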
Verification
Verification remains part of the task definition. The task defines what success means, and the platform performs the actual verification step during execution.