Agent Summary
The following table provides a high-level overview and comparison of the agents available in this collection.

| Agent | Description | Primary Use Case | Environment | Input Method | Key Tools |
|---|---|---|---|---|---|
| Dangerous Capabilities | Automatically builds and runs Capture The Flag (CTF) challenges. | Reproducing Google’s “Dangerous Capabilities” evaluation. | Python | A selected challenge container. | Kali, Rigging, Dreadnode |
| Dotnet Reversing | Reverses and analyzes .NET binaries for vulnerabilities using an LLM. | Security analysis of .NET applications. | Python | Local .NET DLL/EXE files or NuGet package IDs. | dnlib, Rigging, Dreadnode |
| Python Agent | Executes Python code in a sandboxed Docker environment to perform general tasks. | General-purpose code execution, data analysis, automation. | Python, Docker | Natural language task, Docker image, volume mounts. | Docker, Jupyter Kernel, Rigging |
| Sast Scanning | Benchmarks LLM performance on SAST by running them against code with known vulnerabilities. | Evaluating and comparing LLMs for security code review. | Python, Docker (optional) | Pre-defined code challenges from a local directory. | Rigging, LiteLLM, Dreadnode |
| Sensitive Data | Scans various local or remote file systems (e.g., local, S3, GitHub) for sensitive data leaks. | Data governance and security auditing for exposed credentials/PII. | Python, fsspec | fsspec-compatible URI (e.g., s3://..., github://...). | fsspec, Rigging, Dreadnode |
Agents
Below are brief descriptions of each agent, with links to their detailed README files.

1. Dangerous Capabilities Agent

This agent automatically builds and runs Capture The Flag (CTF) challenges. It is designed to reproduce Google’s “Dangerous Capabilities” evaluation.

> More Details

2. Dotnet Reversing Agent

This agent is designed to perform reverse engineering of .NET binaries. It can decompile .NET assemblies and use an LLM to analyze the resulting source code based on a user-defined task, such as “Find all critical security vulnerabilities.”

> More Details

3. Python Agent

A general-purpose agent that provides a sandboxed Jupyter environment inside a Docker container. It can execute Python code to accomplish a wide range of programmatic tasks, from data analysis to file manipulation, based on a natural language prompt.

> More Details

4. Sast Scanning Agent

This agent is a specialized framework for evaluating the security analysis capabilities of LLMs. It runs “challenges” where the model must find known, predefined vulnerabilities in a codebase. The agent scores the model’s performance, providing a quantitative way to benchmark different models for SAST.

> More Details

5. Sensitive Data Extraction Agent

An autonomous agent that explores and analyzes file systems to find and report sensitive data like credentials, API keys, and personal information. Leveraging fsspec, it can operate on local files, cloud storage (AWS S3, GCS), and remote repositories (GitHub).

> More Details

General Usage
While each agent has its own specific command-line arguments, they share a common setup:

- Installation: Each agent is a Python application. Dependencies can be installed via pip.
- LLM Configuration: The agents use litellm to connect to various LLMs. You must configure the appropriate environment variables for the model you intend to use (e.g., OPENAI_API_KEY, ANTHROPIC_API_KEY).
- Observability: To enable detailed logging, tracing, and metrics, you can configure the agents to connect to a Dreadnode server by providing a server URL and token.

Setup
All examples share the same project and dependencies. Set up the virtual environment with uv.

Passing Models
For all agents, LLMs are usually specified with a --model argument, which is passed directly to our Rigging library.
You can read details about different ways to connect to providers, self-hosted servers, or even in-process local models in the Rigging docs.
Usually, the obvious model identifier works out of the box.

- You can pass API keys by setting the associated env var (OPENAI_API_KEY) or by adding ,api_key=... to your model string.
- If you need to control which endpoint the model uses, you can add ,api_base=http://<host>:<port> to the model string.
- As noted in the Rigging docs, these model strings also support properties like temperature and top_k as needed.

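As a rough illustration of the comma-separated parameters described above, the sketch below assembles a model string from an identifier plus optional settings. The helper `build_model_string` is hypothetical and not part of Rigging, and the identifier, host, and port are placeholder values. Consult the Rigging docs for the authoritative format.

```python
def build_model_string(identifier: str, **params: object) -> str:
    """Append comma-separated key=value parameters to a model identifier.

    Hypothetical helper for illustration; it only mirrors the
    ",api_base=..." / ",temperature=..." style described above.
    """
    parts = [identifier]
    for key, value in params.items():
        parts.append(f"{key}={value}")
    return ",".join(parts)


# "gpt-4o" and the localhost endpoint are placeholder values.
model = build_model_string(
    "gpt-4o",
    api_base="http://localhost:8000",
    temperature=0.5,
)
print(model)
```

The resulting string is what you would hand to --model. Keeping the API key in an environment variable rather than in the string avoids leaking it into shell history.
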
Python Agent
A basic agent with access to a dockerized Jupyter kernel to execute code safely.

- Provided a task (--task), begin a generation loop with access to the Jupyter kernel
- The work directory (--work-dir) is mounted into the container, along with any other docker-style volumes (--volumes)
- When finished, the agent marks the task as complete with a status and summary
- The work directory is logged as an artifact for the run

Dangerous Capabilities
Based on research from Google DeepMind, this agent works to solve a variety of CTF challenges given access to execute bash commands on a network-local Kali Linux container.

- For each challenge, produce P agent tasks where P = parallelism
- For all agent tasks, run them in parallel capped at your concurrency setting
- Inside each task, bring up the associated environment
- Continue requesting the next command from the inference model and execute it in the env container
- If the flag is ever observed in the output, exit
- Otherwise run until an error, give up, or max-steps is reached

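The task loop above can be sketched with asyncio, using a semaphore for the concurrency cap. This is a minimal sketch under stated assumptions, not the agent's actual implementation: `next_command` and `execute` are hypothetical stand-ins for the inference model and the env container, and the flag value is invented.

```python
import asyncio

FLAG = "FLAG{example}"  # hypothetical flag value for this sketch


async def next_command(step: int) -> str:
    """Stand-in for the inference model: returns the next bash command."""
    return "cat /flag.txt" if step == 2 else "ls"


async def execute(command: str) -> str:
    """Stand-in for running a command inside the env container."""
    return FLAG if command == "cat /flag.txt" else "bin etc home"


async def run_task(limiter: asyncio.Semaphore, max_steps: int) -> str:
    async with limiter:  # cap how many tasks run at the same time
        for step in range(max_steps):
            output = await execute(await next_command(step))
            if FLAG in output:
                return "flag_found"  # exit as soon as the flag is observed
        return "max_steps_reached"


async def run_challenge(parallelism: int, concurrency: int, max_steps: int) -> list[str]:
    limiter = asyncio.Semaphore(concurrency)
    # Produce P tasks for the challenge, where P = parallelism.
    tasks = [run_task(limiter, max_steps) for _ in range(parallelism)]
    return await asyncio.gather(*tasks)


results = asyncio.run(run_challenge(parallelism=4, concurrency=2, max_steps=5))
print(results)
```

Separating parallelism (how many attempts per challenge) from concurrency (how many run at once) lets you gather several samples per challenge without overloading the host or the model endpoint.
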
Dotnet Reversing
This agent is provided access to Cecil and ILSpy for use in reversing and analyzing .NET managed binaries for vulnerabilities.

- Search for a term in target modules to identify functions of interest
- Decompile individual methods, types, or entire modules
- Collect all call flows which lead to a target method in all supplied binaries
- Report a vulnerability finding with associated path, method, and description
- Mark a task as complete with a summary
- Give up on a task with a reason

To analyze a NuGet package instead of local binaries, pass --nuget to the agent. It will download the package, extract the binaries, and run the same analysis as above.

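To make the "collect all call flows" capability concrete, here is a small sketch that walks a call graph backwards from a target method. It assumes the call graph has already been extracted into a plain dictionary. The function `call_flows` and the method names are hypothetical, and the agent's real tooling works on .NET metadata rather than on a structure like this.

```python
def call_flows(calls: dict[str, list[str]], target: str) -> list[list[str]]:
    """Return every acyclic call path that ends at `target`.

    `calls` maps each caller to the methods it calls.
    """
    # Invert the graph so we can walk from the target back to its callers.
    callers: dict[str, list[str]] = {}
    for caller, callees in calls.items():
        for callee in callees:
            callers.setdefault(callee, []).append(caller)

    flows: list[list[str]] = []

    def walk(method: str, path: list[str]) -> None:
        # Skip callers already on the path to avoid looping on recursion.
        parents = [p for p in callers.get(method, []) if p not in path]
        if not parents:
            # Reached an entry point: record the flow in call order.
            if len(path) > 1:
                flows.append(list(reversed(path)))
            return
        for parent in parents:
            walk(parent, path + [parent])

    walk(target, [target])
    return flows


# Hypothetical method names for illustration.
graph = {
    "Controller.Upload": ["FileService.Save"],
    "FileService.Save": ["Process.Start"],
    "Cli.Main": ["Process.Start"],
}
for flow in call_flows(graph, "Process.Start"):
    print(" -> ".join(flow))
```

Each returned flow starts at an entry point and ends at the target, which is the shape a reviewer needs to judge whether attacker-controlled input can reach a sensitive method.
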
Sensitive Data Extraction
This agent is provided access to a filesystem tool based on fsspec for use in extracting sensitive data stored in files. With fsspec, the agent can operate on local files, GitHub repos, S3 buckets, and other cloud storage systems.

- https://filesystem-spec.readthedocs.io/en/latest/api.html#built-in-implementations
- https://filesystem-spec.readthedocs.io/en/latest/api.html#other-known-implementations
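
As a rough picture of what a reported finding might look like, the sketch below scans text for a few well-known secret shapes with regular expressions. This is an assumption-heavy illustration: the agent itself relies on an LLM exploring the filesystem, not on this pattern list, and `PATTERNS`, `scan_text`, and the finding fields are all hypothetical. In practice the text would come from files read through the fsspec-backed tool.

```python
import re

# Hypothetical, deliberately small pattern set for illustration.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "email_address": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "private_key_header": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


def scan_text(path: str, text: str) -> list[dict[str, object]]:
    """Return one finding per (line, pattern) match in `text`."""
    findings: list[dict[str, object]] = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for kind, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append({"path": path, "line": line_number, "kind": kind})
    return findings


# AKIAIOSFODNN7EXAMPLE is the placeholder key used in AWS documentation.
sample = "user = admin@example.com\nkey = AKIAIOSFODNN7EXAMPLE\nnothing here"
for finding in scan_text("config/settings.ini", sample):
    print(finding)
```
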
SAST Vulnerability Scanning
This agent is designed to perform static code analysis to identify security vulnerabilities in source code. It uses a combination of direct file access and container-based approaches to analyze code for common security issues.

- Execute targeted analysis commands to search through source files
- Report detailed findings with vulnerability location, type, and severity
- Support various programming languages through configurable extensions
- Operate in two modes: “direct” (filesystem access) or “container” (isolated analysis)
- Challenges and vulnerability patterns are defined in YAML configuration files, allowing for flexible targeting of specific security issues across different codebases.

Metrics and Scoring
The agent tracks several key metrics to evaluate performance:

- valid_findings: Count of correctly identified vulnerabilities matching expected issues
- raw_findings: Total number of potential vulnerabilities reported by the model
- coverage: Percentage of known vulnerabilities successfully identified
- duplicates: Count of repeatedly reported vulnerabilities
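
One plausible way to compute these four metrics is sketched below. It assumes each finding can be reduced to a comparable string key. The `score_findings` function and the "file:type" key format are hypothetical, and the agent's exact matching rules (for example, how a duplicate of a valid finding is counted) may differ.

```python
def score_findings(reported: list[str], expected: set[str]) -> dict[str, float]:
    """Score reported finding keys against the known vulnerabilities."""
    unique_reported = set(reported)
    valid = unique_reported & expected  # reported findings that match a known issue
    return {
        "valid_findings": len(valid),
        "raw_findings": len(reported),
        # Percentage of known vulnerabilities that were found at least once.
        "coverage": 100.0 * len(valid) / len(expected) if expected else 0.0,
        # Reports beyond the first for the same key.
        "duplicates": len(reported) - len(unique_reported),
    }


# Hypothetical finding keys in "file:type" form.
reported = [
    "app.py:sql_injection",
    "app.py:sql_injection",
    "auth.py:hardcoded_secret",
    "util.py:xss",
]
expected = {
    "app.py:sql_injection",
    "auth.py:hardcoded_secret",
    "api.py:ssrf",
    "db.py:path_traversal",
}
print(score_findings(reported, expected))
```

Comparing raw_findings with valid_findings shows how noisy a model is, while coverage shows how much of the known ground truth it recovered.
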

