Traces
Inspect individual attack conversations, trial details, and scoring for AI red teaming runs.
Traces capture the full conversation history of every trial in an attack run. Use them to understand exactly what prompts were sent, what the target responded, and how the response was scored. Traces are the evidence of where the model is failing. They give model builders, and particularly post-safety-training teams, the exact data they need to build better mitigations for the risks identified: the winning adversarial prompt, the harmful response the model produced, and the judge’s reasoning for why it scored as a jailbreak.
Traces list
Section titled “Traces list”The Traces view shows all attack traces for the project, each tagged with its outcome:

Each trace entry shows:
- Study name - the attack type (e.g.,
study:tap_attack) - Duration - how long the study took to execute
- Type -
studylabel - Outcome badge - color-coded result:
- jailbreak (red) - attack succeeded
- refusal (green) - target refused
- partial (yellow) - partial success
Trace tree
Section titled “Trace tree”Click any trace to expand its trace tree. The trace tree shows the hierarchical structure of the attack:
- Trace span - top-level container for the attack
- Trial spans - individual optimization iterations
- Target call - the prompt sent and response received
- Evaluator call - the judge model’s score
- Trial spans - individual optimization iterations
Each span includes:
- Full prompt text sent to the target
- Complete target response
- Jailbreak score (0.0 to 1.0)
- Timing information
- Model configuration
View modes
Section titled “View modes”Toggle between two view modes in the top-right:
- Detail - structured view with expandable spans and formatted content
- Timeline - chronological waterfall view showing execution timing across spans
CLI trace inspection
Section titled “CLI trace inspection”Access trace data from the command line:
# Get trace statistics for an assessmentdn airt traces <assessment-id>
# Get attack-level spansdn airt attacks <assessment-id>
# Get trial-level spans with filteringdn airt trials <assessment-id> --min-score 0.8dn airt trials <assessment-id> --attack-name tap --jailbreaks-onlydn airt trials <assessment-id> --limit 10Trial filters
Section titled “Trial filters”| Filter | Description |
|---|---|
--attack-name | Filter by attack type (tap, pair, crescendo, etc.) |
--min-score | Only show trials above this score threshold |
--jailbreaks-only | Only show successful jailbreaks |
--limit | Maximum number of trials to return |
Using traces for analysis
Section titled “Using traces for analysis”Traces help you answer:
- What worked? - sort by score to find the highest-scoring trials and examine the prompts that succeeded
- Why did it work? - read the full conversation to understand the attack path
- Which transforms helped? - compare scores with and without specific transforms
- Which attack is most effective? - compare outcomes across study types for the same goal
- Is the model consistently vulnerable? - look at outcome distribution (jailbreak vs refusal ratio)
Next steps
Section titled “Next steps”- Overview Dashboard - view aggregated metrics
- Assessments - drill into individual campaigns
- Analytics Reporting & Export Reporting - generate reports from trace data