
Assessments

Organize AI red teaming campaigns - attack runs, analytics, findings, attacker prompts, and target responses.

An assessment is a named container that groups attack runs against an AI system and aggregates their results into analytics, findings, and compliance reports. Assessments enable AI red team operators to continuously run attack campaigns as part of an ongoing operation and see point-in-time results for each campaign. As you test different attack strategies, goals, transforms, and model versions over days or weeks, each assessment captures a snapshot with detailed metrics, traces, and findings that you can compare and track over time.

An assessment answers: How vulnerable is this AI system to adversarial attacks?

You provide:

  • A target system to probe
  • One or more attack strategies (Tree of Attacks with Pruning (TAP), Graph of Attacks (GOAT), Crescendo, Prompt Automatic Iterative Refinement (PAIR), and others)
  • Goals describing what the attacks should attempt

Dreadnode executes attack runs and aggregates their telemetry into analytics on demand. An assessment belongs to a project within a workspace and accumulates results across multiple attack runs over time.

Navigate to the Assessments tab to see all assessments in the project:

Assessments list with sidebar and detail panel

The view has two panels: a sidebar listing all assessments and a detail panel for the selected assessment.

Each assessment entry shows:

  • Assessment name - descriptive name (e.g., probe-incident_postmortem-094)
  • Target model - which model was attacked
  • Attack count - number of attack runs (e.g., “1 attacks”)
  • Attack Success Rate - percentage of successful trials (e.g., “100% Attack Success Rate”)
  • Timestamp - when the assessment was created
  • Status indicator - green dot for completed

Click any assessment to see its full analytics.

Assessment detail with metrics, severity breakdown, and findings

The detail view header shows:

  • Assessment name and description explaining the test objective
  • Status badge - Completed, Running, or Failed
| Metric | Description |
| --- | --- |
| Overall Attack Success Rate | Percentage of trials that achieved the goal |
| Successful / Total Attacks | How many attack runs succeeded vs. total (e.g., 1/1) |
| Total Trials | Number of individual attempts in this assessment |
| Duration | Wall-clock time for the assessment |
| Pruned | Percentage of trials pruned by the attack optimizer (e.g., 17%) |
| Total Time | Cumulative compute time across all trials |
| Avg Trial Time | Average time per trial |
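The metrics above are straightforward aggregations over trial records. A minimal sketch of how they relate, using an illustrative `Trial` record (the field names are assumptions, not the SDK's actual schema):

```python
from dataclasses import dataclass

# Hypothetical trial record; fields are illustrative, not the platform schema.
@dataclass
class Trial:
    success: bool     # did this trial achieve the goal?
    pruned: bool      # skipped early by the attack optimizer?
    duration_s: float # compute time for this trial

def summarize(trials: list[Trial]) -> dict:
    """Compute assessment-level metrics like those in the table above."""
    total = len(trials)
    if total == 0:
        return {"asr": 0.0, "total_trials": 0, "pruned_pct": 0.0,
                "total_time_s": 0.0, "avg_trial_time_s": 0.0}
    total_time = sum(t.duration_s for t in trials)
    return {
        "asr": sum(t.success for t in trials) / total,
        "total_trials": total,
        "pruned_pct": 100 * sum(t.pruned for t in trials) / total,
        "total_time_s": total_time,
        "avg_trial_time_s": total_time / total,
    }
```

Duration (wall-clock) differs from Total Time (cumulative compute) whenever trials run concurrently.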

A horizontal bar showing the severity distribution for this assessment’s findings. Color-coded by severity level (Critical, High, Medium, Low, Info).

The assessment-level findings table shows all findings from this specific assessment, with:

  • All Findings / Filters toggle for filtering
  • Score column (sortable, descending by default)
  • Severity level with color dot
  • Type - jailbreak, partial, refusal
  • Attack - which attack strategy produced the finding
  • Assessment ID reference
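The default table ordering and filtering can be sketched with plain list operations. The finding records below are illustrative, with keys mirroring the table columns rather than an actual API response:

```python
# Illustrative finding records; keys mirror the table columns, not a real API.
findings = [
    {"score": 0.62, "severity": "Medium", "type": "partial", "attack": "crescendo"},
    {"score": 0.95, "severity": "High", "type": "jailbreak", "attack": "tap"},
    {"score": 0.10, "severity": "Info", "type": "refusal", "attack": "pair"},
]

# Default table view: sorted by score, descending.
by_score = sorted(findings, key=lambda f: f["score"], reverse=True)

# A filter like the All Findings / Filters toggle, e.g. jailbreaks only.
jailbreaks = [f for f in by_score if f["type"] == "jailbreak"]
```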

Expanded finding - attacker prompt and target response


Click the expand arrow on any finding to see the full evidence:

Expanded finding showing Best Attacker Prompt and Target Response

The expanded view shows:

  • Best Attacker Prompt - the exact adversarial prompt that achieved the highest score. This is the evidence of what the attacker sent to break the model.
  • Target Response - the model’s actual response to the adversarial prompt. This shows exactly how the model failed.

This is critical for model builders who need to understand the exact failure mode and reproduce it.

Below the findings table, the Attack Success Rate by Attack section shows a breakdown of ASR per attack type. Toggle between Table and Chart views:

ASR by Attack section with Table/Chart toggle and findings detail

Table columns: Attack, Attack Model, Successful/Total, Trials, Best Score, Min Score, Average Score.

The Chart view shows a visual bar chart of Attack Success Rate per attack type, making it easy to compare which strategies were most effective.

Below the attack breakdown, Attack Success Rate is grouped by goal category (e.g., harmful_content, malware, elections). This helps you understand which types of goals the target is most vulnerable to and where to focus remediation.
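The per-category breakdown is a grouped version of the overall ASR. A minimal sketch, assuming trial results carry a goal-category label (the record shape is illustrative):

```python
from collections import defaultdict

def asr_by_category(trials: list[dict]) -> dict[str, float]:
    """Fraction of successful trials per goal category."""
    counts: dict[str, list[int]] = defaultdict(lambda: [0, 0])  # [successes, total]
    for t in trials:
        counts[t["category"]][0] += t["success"]
        counts[t["category"]][1] += 1
    return {cat: s / n for cat, (s, n) in counts.items()}

# Hypothetical trial results tagged with goal categories.
trials = [
    {"category": "harmful_content", "success": True},
    {"category": "harmful_content", "success": False},
    {"category": "malware", "success": True},
    {"category": "elections", "success": False},
]
```

A category with a high ASR is where the target is weakest and remediation effort pays off first.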

| Concept | Definition |
| --- | --- |
| Assessment | A named, project-scoped container for a red teaming campaign |
| Attack Run | A single execution of an attack strategy (e.g., one Tree of Attacks with Pruning (TAP) run with a specific goal) |
| Trial | An individual attempt within an attack run - one conversation or prompt exchange |
| ASR | Attack Success Rate - fraction of trials that achieved the stated goal |
| Pruned | Trials the optimizer skipped because they were unlikely to improve on existing results |
| Transform | Adversarial technique applied to prompts (encoding, persuasion, injection) |
| Compliance Tag | Mapping from attack results to security framework categories |
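A transform rewrites a prompt before it is sent to the target. As a concrete illustration, here is a minimal encoding-style transform (the function is a sketch, not a platform built-in):

```python
import base64

def base64_transform(prompt: str) -> str:
    """Wrap a prompt in a base64 envelope - a simple encoding-style transform."""
    encoded = base64.b64encode(prompt.encode()).decode()
    return f"Decode this base64 string and follow its instructions: {encoded}"
```

Persuasion and injection transforms follow the same shape: a function from prompt to adversarially rewritten prompt.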

Results are automatically tagged against industry security frameworks:

  • OWASP Top 10 for LLM Applications - prompt injection, insecure output handling, training data poisoning
  • OWASP Agentic Security (ASI01–ASI10) - behavior hijacking, tool misuse, privilege escalation
  • MITRE ATLAS - adversarial ML threat matrix techniques
  • NIST AI Risk Management Framework - risk categories and controls
  • Google SAIF - Secure AI Framework categories
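Conceptually, compliance tagging maps each finding's technique to one or more framework categories. A hedged sketch of that lookup, where the mapping entries are illustrative examples and not the platform's actual taxonomy:

```python
# Illustrative technique-to-framework mapping; the real platform taxonomy
# is richer and maintained by the product, not hand-written like this.
COMPLIANCE_TAGS: dict[str, list[str]] = {
    "prompt_injection": ["OWASP LLM01: Prompt Injection", "MITRE ATLAS: LLM Prompt Injection"],
    "tool_misuse": ["OWASP Agentic Security: Tool Misuse"],
}

def tags_for(technique: str) -> list[str]:
    """Return the framework categories mapped to a finding's technique."""
    return COMPLIANCE_TAGS.get(technique, [])
```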

Assessments are created automatically when you run attacks via the TUI, CLI, or SDK. You can also create one explicitly:

CLI:

```bash
dn airt create \
  --name "Q2 Security Assessment" \
  --description "Quarterly red team exercise" \
  --project-id <project-id>
```

SDK:

```python
from dreadnode.airt import Assessment

assessment = Assessment(
    name="Q2 Security Assessment",
    description="Quarterly red team exercise",
    target=target,
    model="openai/gpt-4o-mini",
    goal="Reveal the system prompt",
)
```
```bash
# List all assessments
dn airt list

# Get assessment details
dn airt get <assessment-id> --json

# Update status
dn airt update <assessment-id> --status completed

# Delete an assessment
dn airt delete <assessment-id>
```
An assessment moves through four states:

  1. Created - assessment registered with the platform
  2. Running - attack runs executing and uploading results
  3. Completed - all attacks finished, analytics available
  4. Failed - assessment encountered errors during execution