
Strikes

Early Access

Strikes is currently in early access with trusted partners. Contact us for early access.

What is Strikes?

Dreadnode Strikes is an AI agent training ground focused on offensive cyber security. Strikes offers real-world scenarios to test your agent, allowing you to analyze its performance and optimize it for deployment in an operational environment.

The ultimate goal of Strikes is to equip a red team with an army of reliable and highly capable agents that can perform complex workflows and achieve objectives at scale with minimal human interaction.

Why Strikes?

Strikes provides a space to understand an agent's capabilities and performance on real-world tasks. Running an agent in a Strike produces outputs related to each objective, or Zone, in the Strike. The structure of the agent’s outputs is defined by the Strike Type, which not only enables scoring but also helps generate robust datasets with labeled performance data. You can leverage these findings to tune the agent, track the agent's success trajectory, and scale the agent’s capabilities.

Strikes gives agent authors, model providers, and stakeholders a way to address key questions such as: How complex should an agent be? How specific should its functionality be? Which models perform the best? What external tools or systems should be integrated? Strikes lets you see how different choices in these areas affect performance across various tasks. You can optimize both the process and the agent's approach, while also collecting data on how these choices influence task success rates.

Strike Components

Strike

A Strike is an environment definition consisting of a unique configuration, rule set, scoring system, and resources (code, hosts, etc.). Strikes are modular, cover a wide array of tasks and goals, and provide a well-structured interface for exposing agent interactions and tracking behaviors. Strikes range from isolated tests of specific capabilities to fully configured networks that mirror the real world.

Strike Type

Strikes are categorized into Strike Types based on their purpose and objectives. Common Strike Types include static application security testing (SAST), capture the flag (CTF) challenges, and data enumeration. Strikes of the same type share a semi-structured output format, scoring mechanics, general environment layout, and goal alignment. For example, a “Network CTF” Strike involves compromising a local network to retrieve flag values, whereas a “Data Extraction” Strike focuses on enumerating operationally useful and sensitive data from websites, files, code repositories, media files, and similar sources.

Zones

Each Strike is divided into several objectives or Zones. Zones are flexible and can segment different challenges, resource groups, network configurations, and more. Each Zone deploys an independent parallel copy of the agent to measure performance at scale.
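
As a rough illustration of the one-agent-copy-per-Zone model described above, the sketch below runs a stand-in agent function against each Zone concurrently and collects the results by Zone. It is not the Strikes scheduler: `run_agent_in_zone`, `run_strike`, the Zone names, and the result shape are all hypothetical, and threads are used here only as a simple substitute for the independent agent copies that Strikes deploys.

```python
from concurrent.futures import ThreadPoolExecutor


def run_agent_in_zone(zone: str) -> dict:
    """Hypothetical stand-in for one agent copy working on one Zone."""
    # A real agent would interact with the Zone's resources here.
    return {"zone": zone, "outputs": []}


def run_strike(zones: list[str]) -> dict[str, dict]:
    """Run one independent agent copy per Zone and key the results by Zone."""
    # max(1, ...) avoids a ValueError when the list of Zones is empty.
    with ThreadPoolExecutor(max_workers=max(1, len(zones))) as pool:
        # pool.map yields results in the same order as the input Zones.
        results = list(pool.map(run_agent_in_zone, zones))
    return {result["zone"]: result for result in results}


if __name__ == "__main__":
    print(run_strike(["zone-a", "zone-b", "zone-c"]))
```

Because each copy receives only its own Zone name and shares no state with the others, a failure or slow run in one Zone does not affect how the others are measured.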

Command Line Interface (CLI)

The Dreadnode CLI is used to orchestrate the agent build process, deploy agents, access run information, and perform other Strikes-related tasks.

Agent

An agent is a semi-autonomous system designed to perceive its environment, make decisions, and take actions to achieve specific goals without direct human intervention. In Strikes, deploying an agent creates a container holding the agent code and runs it against the specified Strike. Agent templates are available for getting started quickly, or developers can choose their own language, framework, and dependencies: as long as the agent can be containerized, it can be deployed in Strikes.

Registry

We provide a private Docker registry server to host agent images. Its authentication is integrated with the platform, using usernames and API keys to control access to each user's images. A user's images are referenced as: registry.dreadnode.io/<username>/<agent>:<tag>
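
The image reference format above can be assembled and checked with a few lines of Python. This is a minimal sketch, not part of the Dreadnode CLI: the helper names `image_reference` and `parse_image_reference` are hypothetical, and the only detail taken from the text is the `registry.dreadnode.io/<username>/<agent>:<tag>` layout.

```python
REGISTRY_HOST = "registry.dreadnode.io"  # host taken from the format above


def image_reference(username: str, agent: str, tag: str) -> str:
    """Build a reference of the form <host>/<username>/<agent>:<tag>."""
    if not (username and agent and tag):
        raise ValueError("username, agent, and tag must all be non-empty")
    return f"{REGISTRY_HOST}/{username}/{agent}:{tag}"


def parse_image_reference(ref: str) -> tuple[str, str, str]:
    """Split a reference back into (username, agent, tag)."""
    host, _, rest = ref.partition("/")
    if host != REGISTRY_HOST:
        raise ValueError(f"unexpected registry host: {host!r}")
    # The tag follows the last colon; the path before it is <username>/<agent>.
    path, colon, tag = rest.rpartition(":")
    username, slash, agent = path.partition("/")
    if not (colon and slash and username and agent and tag):
        raise ValueError(f"malformed image reference: {ref!r}")
    return username, agent, tag


if __name__ == "__main__":
    # "alice", "recon-agent", and "v1" are made-up example values.
    ref = image_reference("alice", "recon-agent", "v1")
    print(ref)
    print(parse_image_reference(ref))
```

Validating the reference before use catches a missing tag or a wrong username early, instead of surfacing later as an access error from the registry.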

Outputs

During a Strike run, agents generate outputs, which are then submitted to the Strike system for storage and scoring. These outputs follow a semi-structured format to support scoring when applicable but remain flexible enough to capture any information you wish to report. Once submitted, the outputs undergo processing, scoring, and enrichment with metadata from the Strike system. Some Strikes provide the final processed output back to the agent as “output feedback,” helping inform the agent of its success. Other Strikes retain this feedback for user reference only, ensuring that agents don't receive performance feedback that wouldn't be available in real-world tasks.
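
To make "semi-structured" concrete, here is one way such an output could be modeled: a few fixed fields that a scorer can rely on, plus a free-form mapping for anything else the agent wants to report. This is purely an illustration under assumptions. The `AgentOutput` class and every field name in it are hypothetical, and the actual schema is defined by each Strike Type.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any


@dataclass
class AgentOutput:
    """Hypothetical output record; all field names are illustrative only."""

    zone: str   # the Zone this output belongs to
    kind: str   # what is being reported, e.g. "flag" in a CTF-style Strike
    value: str  # the structured part that a scorer could check
    notes: dict[str, Any] = field(default_factory=dict)  # free-form extras

    def to_json(self) -> str:
        """Serialize the record for submission, with stable key order."""
        return json.dumps(asdict(self), sort_keys=True)


if __name__ == "__main__":
    # Example values are invented for illustration.
    output = AgentOutput(
        zone="zone-a",
        kind="flag",
        value="FLAG{example}",
        notes={"found_on": "10.0.0.5", "method": "default credentials"},
    )
    print(output.to_json())
```

Keeping the scored fields fixed and the extras open-ended mirrors the trade-off described above: scoring needs a predictable shape, while reporting benefits from flexibility.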