How to build a CTF-solving agent in Strikes
The complete code for this agent lives in the `dreadnode/example-agents` repository. We'll reference specific components throughout, but you can also explore the full implementation to understand how everything fits together.

For this guide, we'll assume you have the `dreadnode` package installed and are familiar with the basics of Strikes. If you haven't already, check out the installation and introduction guides.

The agent takes a `--model` argument, which is passed to Rigging as a generator identifier. Usually, the model name alone works as expected, but sometimes you need to supply a provider prefix like `gemini/` or `ollama/`:
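For instance, the identifier is eventually handed to Rigging to build a generator. This is only a sketch; the model names below are placeholders and the example agent's exact wiring may differ:

```python
import rigging as rg

# The --model value is treated as a Rigging generator identifier.
# A bare name resolves to the default provider, while a prefix
# selects a specific backend.
openai_generator = rg.get_generator("gpt-4.1")
gemini_generator = rg.get_generator("gemini/gemini-2.0-flash")
local_generator = rg.get_generator("ollama/llama3")
```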
To get the agent running:

1. Clone the `dreadnode/example-agents` repository.
2. Ensure Docker is running.
3. Set your environment variables. The `dreadnode` package can use environment variables to configure the target server and token for sending run data; you can also pass the `--server` and `--token` arguments on the CLI (see the configuration sketch after this list).
4. Run the agent. We'll use `gpt-4.1` as our model (requires a valid `OPENAI_API_KEY`) and limit the challenges to `db_easy`.
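As a sketch of how the server and token might be wired into the SDK (the parameter and environment variable names below are assumptions; check the `dreadnode` SDK documentation for the authoritative API):

```python
import os

import dreadnode as dn

# Assumed wiring: prefer explicit CLI values, fall back to environment
# variables. The environment variable names here are placeholders.
dn.configure(
    server=os.environ.get("DREADNODE_SERVER"),
    token=os.environ.get("DREADNODE_API_TOKEN"),
)
```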
Once a run completes, you can check whether the agent solved the `db_easy` challenge with the verbose `"easy"` prompt in less than 10 steps.
You can also export run data and filter the results for the `db_easy` challenge using `jq`.
Here's an example of a run against the `sqli` challenge with the `gpt-4.1` model.
The available challenges are defined as `Challenge` objects when our agent starts:
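As a rough sketch (the field names here are hypothetical; see the repository for the actual `Challenge` definition), each challenge bundles the metadata the agent needs to build and target its containers:

```python
import dataclasses

# Hypothetical shape of a challenge definition; the real Challenge
# class in example-agents may differ.
@dataclasses.dataclass
class Challenge:
    name: str          # e.g. "db_easy" or "sqli"
    container: str     # directory containing the challenge Dockerfile
    flag: str          # secret embedded into the image at build time
    prompt: str        # task description given to the model

CHALLENGES = [
    Challenge(
        name="db_easy",
        container="db_easy",
        flag="FLAG{example}",
        prompt="Find the flag hidden in the database service.",
    ),
]
```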
The `FLAG` environment variable is passed at build time, allowing it to be embedded in the container's filesystem or applications. You can see how this argument is used by each challenge in its associated Dockerfile and source code.

We use `@asynccontextmanager` to wrap our container startup code. This allows us to use the `async with` syntax to ensure that our containers are cleaned up properly when we're done with them. Inside, we start both containers (`env`/`kali`) and pass back a function to the caller which can be used to execute commands inside the container as long as our context manager is active (the containers are running):
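A minimal sketch of the pattern, assuming the Docker CLI is available (the image, names, and exec wiring here are placeholders rather than the exact implementation in `example-agents`):

```python
import asyncio
import contextlib
import typing as t


@contextlib.asynccontextmanager
async def start_containers(
    image: str = "kalilinux/kali-rolling",  # placeholder attacker image
) -> t.AsyncIterator[t.Callable[[str], t.Awaitable[str]]]:
    # Start a long-running container we can exec into.
    run = await asyncio.create_subprocess_exec(
        "docker", "run", "-d", "--rm", image, "sleep", "infinity",
        stdout=asyncio.subprocess.PIPE,
    )
    stdout, _ = await run.communicate()
    container_id = stdout.decode().strip()

    async def execute(command: str) -> str:
        # Run a shell command inside the container and return its output.
        proc = await asyncio.create_subprocess_exec(
            "docker", "exec", container_id, "sh", "-c", command,
            stdout=asyncio.subprocess.PIPE,
            stderr=asyncio.subprocess.STDOUT,
        )
        output, _ = await proc.communicate()
        return output.decode()

    try:
        yield execute
    finally:
        # Cleanup always runs, even if the agent raises inside the block.
        stop = await asyncio.create_subprocess_exec("docker", "stop", container_id)
        await stop.wait()
```

The caller simply writes `async with start_containers() as execute:` and every command the model issues is routed through `execute`, with teardown guaranteed on exit.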
The main agent task enters the `start_containers` context manager and runs the agent loop inside it. Throughout that loop, we add `log_metric` calls where applicable and update our `AgentLog` structure to reflect the current state of the agent.
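The exact shape of that log object isn't critical; a hypothetical dataclass like the following (field names invented for illustration) is enough to capture the state we care about reporting:

```python
import dataclasses


@dataclasses.dataclass
class AgentLog:
    # Hypothetical fields; see the repository for the real AgentLog.
    steps: int = 0
    commands_executed: int = 0
    flag_found: bool = False
    gave_up: bool = False
    error: str | None = None
```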
The `give_up` tool is an optional addition that you can make as an agent author. Without it, agents might continue attempting the same failed approaches repeatedly after hitting a fundamental limitation; with it, they might preemptively give up on challenges that they could have solved with more time. This is a tradeoff between efficiency and thoroughness.

Finally, we check the `chat` for error states we want to track and log back to us.
We also use a `limit` on the number of coroutines running at the same time. This is useful for keeping resource usage under control when many challenge runs execute in parallel (see the sketch below).
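One common way to enforce such a limit (a sketch, not necessarily how the example agent implements it) is a shared `asyncio.Semaphore`:

```python
import asyncio
import typing as t


async def run_bounded(
    coroutines: t.Iterable[t.Awaitable],
    limit: int = 5,
) -> list:
    # Allow at most `limit` coroutines to run concurrently.
    semaphore = asyncio.Semaphore(limit)

    async def bounded(coro: t.Awaitable):
        async with semaphore:
            return await coro

    return await asyncio.gather(*(bounded(c) for c in coroutines))
```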
We use the `backoff` library to handle rate limits from LLM providers and pass it to our Rigging generator. The library retries a failing call with exponentially increasing delays until it succeeds or hits a configured limit.
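Outside of the Rigging integration, the core pattern looks like this (the OpenAI client below is just an illustration of a rate-limited call, not the agent's actual generator wiring):

```python
import backoff
import openai


# Retry with exponential backoff, for up to five minutes, whenever the
# provider rate-limits us.
@backoff.on_exception(backoff.expo, openai.RateLimitError, max_time=300)
async def call_model(client: openai.AsyncOpenAI, prompt: str) -> str:
    response = await client.chat.completions.create(
        model="gpt-4.1",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content or ""
```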
Throughout the agent, we use `dn.log_metric` to track the places we reach in code, failure modes, and success rates.
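As a small sketch (the metric names here are illustrative, and the run wrapping is assumed from the standard `dreadnode` SDK pattern rather than copied from the example agent):

```python
import dreadnode as dn

with dn.run("ctf-agent"):
    # Illustrative metrics; the example agent logs its own set.
    dn.log_metric("steps", 12)
    dn.log_metric("commands_executed", 30)
    dn.log_metric("flag_found", 1)
    dn.log_metric("gave_up", 0)
```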