# Prerequisites
Set up authentication, API keys, models, and compute before running AI red teaming.
Before running AI red teaming attacks, you need to configure authentication, model access, and choose where attacks will execute (local or Dreadnode-hosted compute).
## 1. Authenticate with the platform

Log in to the Dreadnode platform so results flow to your project dashboard:

```shell
dn login
```

This opens a browser for authentication and saves your credentials locally. Verify with:

```shell
dn whoami
```

You should see your organization, workspace, and profile context.
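If you script around the CLI, a small preflight can confirm the `dn` binary is available before attempting the `dn whoami` check. This is an illustrative sketch, not part of the Dreadnode tooling, and assumes only the two commands shown above:

```python
# Illustrative helper: confirm the dn CLI is on PATH, then run `dn whoami`
# to check authentication. Only `dn login` / `dn whoami` are documented
# commands; everything else here is generic plumbing.
import shutil
import subprocess

def check_auth() -> str:
    if shutil.which("dn") is None:
        return "dn CLI not found on PATH - install it first"
    result = subprocess.run(["dn", "whoami"], capture_output=True, text=True)
    if result.returncode != 0:
        return "not logged in - run `dn login`"
    return result.stdout

print(check_auth())
```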
## 2. Configure model access

AI red teaming uses up to three LLM roles. You need at minimum a target model, and optionally separate models for the attacker and judge:
| Role | What it does | CLI flag | Required? |
|---|---|---|---|
| Target model | The model you are attacking. This is the system under test. | --target-model | Yes |
| Attacker model | Generates adversarial prompts that try to jailbreak the target. A stronger attacker model produces more creative attacks. | --attacker-model | No (defaults to target) |
| Judge model | Scores whether the target’s response constitutes a jailbreak. Evaluates attack success. | --judge-model | No (defaults to attacker) |
You can use the same model for all three roles, or different models for each. The target is always the model, application, or agent you are testing. A common pattern is to use a more capable model as the attacker and judge, which produces stronger attacks and more accurate scoring:

```shell
# Same model for all three roles
dn airt run --goal "..." --target-model openai/gpt-4o-mini

# Target is the model under test, stronger attacker/judge for better attacks
dn airt run --goal "..." \
  --target-model groq/llama-3.3-70b-versatile \
  --attacker-model openai/gpt-4o \
  --judge-model openai/gpt-4o
```

In the TUI, the agent model (set via `--model` or Ctrl+K) is the LLM that powers the agent itself. The target, attacker, and judge models are specified in your attack request and can differ from the agent model.
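The defaulting rules from the table (attacker falls back to the target, judge falls back to the attacker) can be sketched in a few lines. The `AttackModels` class below is illustrative only, not part of the `dreadnode` SDK:

```python
# Sketch of the role-defaulting behavior described in the table above:
# attacker defaults to target, judge defaults to attacker.
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class AttackModels:
    target: str
    attacker: Optional[str] = None
    judge: Optional[str] = None

    def resolved(self) -> Tuple[str, str, str]:
        """Apply the documented defaults; return (target, attacker, judge)."""
        attacker = self.attacker or self.target
        judge = self.judge or attacker
        return self.target, attacker, judge

# Only a target given: all three roles use the same model.
print(AttackModels("openai/gpt-4o-mini").resolved())

# Stronger attacker given: the judge inherits the attacker model.
print(AttackModels("groq/llama-3.3-70b-versatile", attacker="openai/gpt-4o").resolved())
```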
### Option A: Use Dreadnode-hosted models

Dreadnode proxies models from multiple providers. Select them in the TUI model picker or specify with `--model`:

```shell
# TUI picks up hosted models automatically
dn --capability ai-red-teaming --model openai/gpt-4o

# Or specify a hosted model explicitly
dn --capability ai-red-teaming --model dn/sonnet-4.6
```

In the TUI, press Ctrl+K to open the model picker. Models prefixed with `dn` route through Dreadnode's proxy and don't require separate API keys. You only pay the underlying inference cost from the provider; Dreadnode does not charge extra on top of model usage.
### Option B: Use your own API keys (local compute)

If you want to use models directly from providers (OpenAI, Anthropic, Groq, etc.), export the API keys in your shell before launching:

```shell
# Set provider API keys
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."
export GROQ_API_KEY="gsk_..."

# Then launch the TUI or run CLI attacks
dn --capability ai-red-teaming --model openai/gpt-4o
dn airt run --goal "..." --attack tap --target-model openai/gpt-4o-mini
```

The TUI agent, CLI, and SDK all pick up environment variables automatically. Model names follow the `provider/model-name` format:
| Provider | Example model name |
|---|---|
| OpenAI | openai/gpt-4o-mini |
| Anthropic | anthropic/claude-sonnet-4-20250514 |
| Groq | groq/llama-3.3-70b-versatile |
| Mistral | mistral/mistral-large-latest |
| OpenRouter | openrouter/moonshotai/kimi-k2.5 |
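Note from the table that OpenRouter names carry a nested path after the provider prefix, so the provider is everything before the first `/` only. A minimal sketch of splitting these names (illustrative, not part of the Dreadnode tooling):

```python
# Split a provider/model-name string at the first "/" only, so nested
# paths like openrouter/moonshotai/kimi-k2.5 keep their full model id.
def split_model_name(name: str) -> tuple:
    provider, sep, model = name.partition("/")
    if not sep or not provider or not model:
        raise ValueError(f"expected 'provider/model-name', got {name!r}")
    return provider, model

print(split_model_name("openai/gpt-4o-mini"))               # → ('openai', 'gpt-4o-mini')
print(split_model_name("openrouter/moonshotai/kimi-k2.5"))  # → ('openrouter', 'moonshotai/kimi-k2.5')
```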
### Option C: Use Dreadnode-hosted compute with secrets

If you want attacks to execute on Dreadnode's infrastructure (remote sandboxes) with your own provider keys, add them as secrets in the platform:

- Navigate to Settings > Secrets in the Dreadnode platform
- Add your API keys (e.g., `OPENAI_API_KEY`, `GROQ_API_KEY`)
- Secrets are injected into sandbox environments automatically

See Secrets for details.
## 3. Choose compute mode

### Local compute (default)

When you run `dn --capability ai-red-teaming --model openai/gpt-4o` or `dn airt run`, attacks execute on your local machine. You need:

- API keys exported as environment variables (Option B above)
- The `dreadnode` SDK installed (`pip install dreadnode`)

Results are uploaded to the platform via OTEL traces automatically.
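Both local-compute requirements can be checked programmatically before a run. The sketch below is illustrative, not part of the `dreadnode` SDK; the provider-to-env-var mapping is an assumption based on the conventional variable names shown in Option B:

```python
# Hedged preflight sketch for local compute: confirm the provider API key
# for your chosen model is exported and the dreadnode SDK is importable.
import importlib.util
import os

# Assumed mapping from provider prefix to its conventional env var.
PROVIDER_ENV_VARS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "groq": "GROQ_API_KEY",
    "mistral": "MISTRAL_API_KEY",
}

def preflight(model_name: str) -> list:
    """Return problems blocking a local run; an empty list means ready."""
    problems = []
    provider = model_name.split("/", 1)[0]
    env_var = PROVIDER_ENV_VARS.get(provider)
    if env_var and not os.environ.get(env_var):
        problems.append(f"missing {env_var} for provider '{provider}'")
    if importlib.util.find_spec("dreadnode") is None:
        problems.append("dreadnode SDK not installed (pip install dreadnode)")
    return problems

for issue in preflight("openai/gpt-4o-mini"):
    print(issue)
```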
### Dreadnode-hosted compute (remote)

When you launch AI red teaming from the platform UI or connect to a remote runtime, attacks execute in Dreadnode sandboxes. You need:

- API keys configured as platform secrets (Option C above)
- A project and workspace set up in the platform

Connect to a remote runtime from the TUI:

```shell
dn --runtime-server <runtime-url> --capability ai-red-teaming
```

The status bar shows `remote` when connected to Dreadnode-hosted compute vs. `local` for local execution.
## 4. Set up a project

Assessments belong to projects. Create one in the platform UI or let the AI Red Teaming agent create one for you:
- In the TUI, tell the agent: “Create a project called my-safety-audit in the main workspace”
- Or create it in the platform at your-org > Workspaces > your-workspace > New Project
## Quick reference

| What you need | Local compute | Dreadnode-hosted compute |
|---|---|---|
| Platform auth | dn login | dn login |
| Model access | export OPENAI_API_KEY=... | Add to Settings > Secrets |
| Launch TUI | dn --capability ai-red-teaming --model openai/gpt-4o | dn --runtime-server <url> --capability ai-red-teaming |
| Run CLI attack | dn airt run --goal "..." --target-model openai/gpt-4o-mini | Same, routed through sandbox |
| Status bar | Shows local | Shows remote |
## Next steps

- Using the TUI Agent - run AI red teaming via natural language
- Using the CLI - repeatable attacks from the command line
- Using the SDK - programmatic attack workflows in Python