
Prerequisites

Set up authentication, API keys, models, and compute before running AI red teaming.

Before running AI red teaming attacks, you need to configure authentication, model access, and choose where attacks will execute (local or Dreadnode-hosted compute).

Log in to the Dreadnode platform so results flow to your project dashboard:

```sh
dn login
```

This opens a browser for authentication and saves your credentials locally. Verify with:

```sh
dn whoami
```

You should see your organization, workspace, and profile context.

AI red teaming uses up to three LLM roles. You need at minimum a target model, and optionally separate models for the attacker and judge:

| Role | What it does | CLI flag | Required? |
| --- | --- | --- | --- |
| Target model | The model you are attacking. This is the system under test. | `--target-model` | Yes |
| Attacker model | Generates adversarial prompts that try to jailbreak the target. A stronger attacker model produces more creative attacks. | `--attacker-model` | No (defaults to target) |
| Judge model | Scores whether the target's response constitutes a jailbreak. Evaluates attack success. | `--judge-model` | No (defaults to attacker) |

You can use the same model for all three roles, or use different models. The target is always the model, application, or agent you are testing. A common pattern is to use a more capable model as the attacker and judge to generate stronger attacks and more accurate scoring:

```sh
# Same model for all three roles
dn airt run --goal "..." --target-model openai/gpt-4o-mini

# Target is the model under test, stronger attacker/judge for better attacks
dn airt run --goal "..." \
  --target-model groq/llama-3.3-70b-versatile \
  --attacker-model openai/gpt-4o \
  --judge-model openai/gpt-4o
```

In the TUI, the agent model (set via --model or Ctrl+K) is the LLM that powers the agent itself. The target, attacker, and judge models are specified in your attack request and can be different from the agent model.

Option A: Use Dreadnode-hosted models (no API keys)

Dreadnode proxies models from multiple providers. Select them in the TUI model picker or specify with --model:

```sh
# TUI picks up hosted models automatically
dn --capability ai-red-teaming --model openai/gpt-4o

# Or specify a hosted model explicitly
dn --capability ai-red-teaming --model dn/sonnet-4.6
```

In the TUI, press Ctrl+K to open the model picker. Models prefixed with dn route through Dreadnode's proxy and don't require separate API keys. You pay only the underlying inference cost from the provider; Dreadnode does not charge extra on top of model usage.

Option B: Use your own API keys (local compute)


If you want to use models directly from providers (OpenAI, Anthropic, Groq, etc.), export the API keys in your shell before launching:

```sh
# Set provider API keys
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."
export GROQ_API_KEY="gsk_..."

# Then launch the TUI or run CLI attacks
dn --capability ai-red-teaming --model openai/gpt-4o
dn airt run --goal "..." --attack tap --target-model openai/gpt-4o-mini
```
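If you script attacks, a small preflight check can catch a missing key before a run starts. A minimal sketch, stdlib only; the MISTRAL_API_KEY and OPENROUTER_API_KEY names are assumptions made by analogy with the variables documented above:

```python
import os

# Conventional API-key variable per provider prefix. The first three names
# appear in the docs; the last two are assumed by convention.
PROVIDER_ENV_VARS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "groq": "GROQ_API_KEY",
    "mistral": "MISTRAL_API_KEY",        # assumption
    "openrouter": "OPENROUTER_API_KEY",  # assumption
}

def missing_keys(*model_names: str) -> list[str]:
    """Return env vars that are still unset for the given provider/model names."""
    missing = []
    for name in model_names:
        provider = name.split("/", 1)[0]
        var = PROVIDER_ENV_VARS.get(provider)
        if var and not os.environ.get(var) and var not in missing:
            missing.append(var)
    return missing
```

Calling `missing_keys("openai/gpt-4o-mini", "groq/llama-3.3-70b-versatile")` with only GROQ_API_KEY exported reports that OPENAI_API_KEY still needs to be set.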

The TUI agent, CLI, and SDK all pick up environment variables automatically. Model names follow the provider/model-name format:

| Provider | Example model name |
| --- | --- |
| OpenAI | `openai/gpt-4o-mini` |
| Anthropic | `anthropic/claude-sonnet-4-20250514` |
| Groq | `groq/llama-3.3-70b-versatile` |
| Mistral | `mistral/mistral-large-latest` |
| OpenRouter | `openrouter/moonshotai/kimi-k2.5` |
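Note that OpenRouter names contain a second slash, so tooling that splits model names apart should split on the first slash only. A hypothetical helper illustrating the convention:

```python
def split_model_name(name: str) -> tuple[str, str]:
    """Split 'provider/model-name' on the first slash only, so nested
    names like OpenRouter's keep their own path intact."""
    provider, _, model = name.partition("/")
    if not model:
        raise ValueError(f"expected provider/model-name, got {name!r}")
    return provider, model
```

`split_model_name("openrouter/moonshotai/kimi-k2.5")` returns `("openrouter", "moonshotai/kimi-k2.5")`, keeping the nested model path intact.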

Option C: Use Dreadnode-hosted compute with secrets


If you want attacks to execute on Dreadnode’s infrastructure (remote sandboxes) with your own provider keys, add them as secrets in the platform:

  1. Navigate to Settings > Secrets in the Dreadnode platform
  2. Add your API keys (e.g., OPENAI_API_KEY, GROQ_API_KEY)
  3. Secrets are injected into sandbox environments automatically

See Secrets for details.
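Inside a sandbox, code can then read those secrets the same way it reads local environment variables. A sketch assuming secrets are injected under the exact names you enter in Settings > Secrets (the helper name is hypothetical):

```python
import os

def require_secret(name: str) -> str:
    """Read a platform secret, assumed to be injected into the sandbox as
    an environment variable, and fail with an actionable message if absent."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(
            f"{name} is not set; add it under Settings > Secrets in the platform"
        )
    return value
```

For example, `require_secret("OPENAI_API_KEY")` either returns the injected key or stops the run with a pointer to the secrets page, instead of failing later mid-attack.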

Local compute

When you run dn --capability ai-red-teaming --model openai/gpt-4o or dn airt run, attacks execute on your local machine. You need:

  • API keys exported as environment variables (Option B above)
  • The dreadnode SDK installed (pip install dreadnode)

Results are uploaded to the platform via OTEL traces automatically.

Dreadnode-hosted compute

When you launch AI red teaming from the platform UI or connect to a remote runtime, attacks execute in Dreadnode sandboxes. You need:

  • API keys configured as platform secrets (Option C above)
  • A project and workspace set up in the platform

Connect to a remote runtime from the TUI:

```sh
dn --runtime-server <runtime-url> --capability ai-red-teaming
```

The status bar shows remote when connected to Dreadnode-hosted compute vs. local for local execution.

Assessments belong to projects. Create one in the platform UI or let the AI Red Teaming agent create one for you:

  • In the TUI, tell the agent: “Create a project called my-safety-audit in the main workspace”
  • Or create it in the platform at your-org > Workspaces > your-workspace > New Project

| What you need | Local compute | Dreadnode-hosted compute |
| --- | --- | --- |
| Platform auth | `dn login` | `dn login` |
| Model access | `export OPENAI_API_KEY=...` | Add to Settings > Secrets |
| Launch TUI | `dn --capability ai-red-teaming --model openai/gpt-4o` | `dn --runtime-server <url> --capability ai-red-teaming` |
| Run CLI attack | `dn airt run --goal "..." --target-model openai/gpt-4o-mini` | Same, routed through sandbox |
| Status bar | Shows local | Shows remote |